Solved

Removing unwanted HTML tags

Posted on 2005-03-13
Last Modified: 2013-12-24
I'm parsing three different web pages for specific links. I'm using regular expressions to parse the links, and they work fine. However, I then need to cfhttp each of the links I have parsed, but HTML tags surround the links, so the cfhttp fails.
For example, my regex returns the following results:
 <a href="http://www.website1/?id=3440305">
HREF='http://www.website/news.asp?ID=491'

But I cannot cfhttp these strings as they are; I need to remove the HTML tags and the HREFs. I have tried using REReplace with "<[^>]*>", "", "ALL", as is used in most CF books, but it doesn't remove the HREFs.


Question by:VHSB
LVL 21

Expert Comment

by:pinaldave
ID: 13530026
REReplace with "<[^>]*>", "", "ALL" should have removed the <a href> tag.
If you post the code, we can look into it.
---Pinal
 

Author Comment

by:VHSB
ID: 13530793
<cfhttp method="get" URL="#Trim(xmlObj.xmlRoot.site[i].xmlAttributes.index)#" ResolveURL="yes"></cfhttp>

<cfset StartPos = 1>
<cfloop condition="True">

    <!--- Parse the site index pages for job links --->
    <cfset Match = REFindNoCase(Trim(xmlObj.xmlRoot.site[i].parse.xmlAttributes.re), cfhttp.FileContent, StartPos, True)>

    <cfif Match.pos[1] EQ 0>
        <cfbreak>
    <cfelse>
        <cfset StartPos = Match.pos[1] + Match.len[1]>
        <!--- <cfset Foundlinks = Mid(cfhttp.FileContent, Match.pos[1], Match.len[1])> --->
        <cfset StripLinks = REReplace(Mid(cfhttp.FileContent, Match.pos[1], Match.len[1]), '<td><a\s*HREF[[:punct:]]', '', "all")>

        <!--- Store the list of FoundLinks into the Links array --->
        <cfset LinksArray = ListToArray(StripLinks)>
        <cfdump var="#LinksArray#">
    </cfif>

</cfloop>

I'm going wrong somewhere, but I'm not sure where. Thanks.
 

Author Comment

by:VHSB
ID: 13530800
Sorry, that was a previous attempt; my current one is as follows:
#REReplace(#Mid(cfhttp.FileContent, Match.pos[1], Match.len[1])#,"<[^>]*>", "", "ALL"))#>
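
A note on why the tag-stripping pattern does not help here: "<[^>]*>" matches the whole of <a href="...">, URL included, so the first sample is reduced to an empty string, while the second sample (HREF='...') contains no angle brackets and is left unchanged. One alternative is to capture only the quoted attribute value. The following is a minimal sketch under stated assumptions, not the custom tag from the accepted solution: the two sample strings are taken from the question, and the names Samples, Hit and LinkURL are made up for illustration.

```cfml
<!--- Sample strings copied from the question; Samples, Hit and LinkURL are illustrative names. --->
<cfset Samples = ArrayNew(1)>
<cfset ArrayAppend(Samples, '<a href="http://www.website1/?id=3440305">')>
<cfset ArrayAppend(Samples, "HREF='http://www.website/news.asp?ID=491'")>

<cfloop index="n" from="1" to="#ArrayLen(Samples)#">
    <!--- The parenthesised group captures the text between the quotes after href=.
          With returnsubexpressions set to True, pos[1]/len[1] describe the whole match
          and pos[2]/len[2] describe the first captured group. --->
    <cfset Hit = REFindNoCase("href\s*=\s*[""']([^""']+)[""']", Samples[n], 1, True)>

    <cfif Hit.pos[1] GT 0>
        <!--- Extract just the URL, without the tag or the quotes --->
        <cfset LinkURL = Mid(Samples[n], Hit.pos[2], Hit.len[2])>
        <cfoutput>#HTMLEditFormat(LinkURL)#<br></cfoutput>
    </cfif>
</cfloop>
```

Each LinkURL should then be a bare URL that can be passed to cfhttp as url="#LinkURL#". The pattern assumes the href value is quoted and contains no quote characters itself; unquoted attribute values would need a different pattern.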
LVL 35

Expert Comment

by:mrichmon
ID: 13536716
This may be a good resource. It lists many different regular expression patterns for matching HTML:
http://www.regexlib.com/DisplayPatterns.aspx?cattabindex=4&categoryId=8
 

Author Comment

by:VHSB
ID: 13538811
Mrichmon, thanks for that. I've experimented with a couple of ideas from that site, but no luck.
Still struggling, thanks.
 

Author Comment

by:VHSB
ID: 13544330
The points are going up for this one, guys. Thanks.
 
LVL 7

Expert Comment

by:black0ps
ID: 13617456
Don't paq it yet.
 
LVL 7

Accepted Solution

by:
black0ps earned 2000 total points
ID: 13618549
OK, the custom tag is done. I've tested it with a couple of sites and it looks like it's going to work. Let me know how it works out for you:
http://www.clearresults.net/tags/
It's the links tag at the bottom.

-- Ian
 

Author Comment

by:VHSB
ID: 13620878
Excellent, Ian. That's fantastic.