Solved

How to use ColdFusion to replace dynamic text when using cfhttp

Posted on 2003-10-23
Last Modified: 2013-12-24
I'm using cfhttp to pull in page content from another site and then using ReplaceList to format the output in the style that I want. This works fine for most of the content except for the date content, which is generated dynamically and which I don't want to display. I have not been able to strip out the date content because the time in it is sometimes four digits (12:30), sometimes three digits (1:30) and sometimes one digit (3). Since the date text always sits between the same specific tags (<SPAN CLASS='byttl'><br>AP Top News At 1:16 p.m. EDT</SPAN>), I'm wondering if there is some way to replace everything between these tags with nothing, or with another string? Sort of like a wildcard.


Here is an example of the code I'm running now:

<cfhttp method="get" url="http://www.SomeURL.com/index.html" resolveurl="yes">
</cfhttp>


<cfset output = #ReplaceList(cfhttp.FileContent,"<table BORDER=0>,<TR>,<TD>,</SPAN>,</TR>,</TD>,<SPAN CLASS='byttl'>,<SPAN CLASS='storylink'>,<SPAN CLASS='topheadline'>,<SPAN CLASS='firsttopheadline'>,<br>AP Top News At,EDT,....,</TABLE>,..." , ",,,,,,,,,,,,,,,,<br>")#>

<cfoutput>
<font size="-1" face="Arial, Helvetica, sans-serif">#output#</font>
</cfoutput>
Question by:McHack
7 Comments
 
LVL 1

Expert Comment

by:kjuliff
ID: 9609745
Replace the strings <SPAN CLASS='byttl'><br>  and </SPAN> with single characters that you know will not appear anywhere else in the string.

Then do a rereplace of anything between those characters with the date that you want. Then put the <SPAN CLASS='byttl'><br>  and </SPAN>, back.

You will need to use a regular expression to do the replace BETWEEN part.

I replaced all instances of #<CF variable># with the string "dynamically generated" like this -
<CFSET b = REReplace(TRIM(string), "~[^~]*~", "&laquo;dynamically generated&raquo;", "all")>
where I had a variable in the string representing a number. I would then get
 something like this
There are «dynamically generated» Australians currently registered as living in Belgium.
 
LVL 2

Expert Comment

by:jonnygo55
ID: 9609794
basically what you want to do is to search for the start span tag:
<cfset startPosTag = findnocase("<span class='byttl'>",output)>

get the first position after that tag:
<cfset startPosContent = startPosTag + 20>

get the first position of the end tag after that specific start span tag
<cfset endPosTag = findnocase("</span>",output,startPosContent)>
 
get the content
<cfset content = mid(output,startPosContent,endPosTag - startPosContent)>

then just replace the content with whatever...
<cfset newOutput = replace(output,content,'whatever')>

something like that...
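The steps above can be pulled together into one snippet. This is only a sketch: the `output` string here is a hypothetical one-line sample standing in for `cfhttp.FileContent`, and `openTag` and `newOutput` are illustrative names. It uses `RemoveChars` instead of `Replace` so the text is deleted by position rather than by value, and it checks that both tags were actually found.

```cfm
<!--- Hypothetical sample input standing in for cfhttp.FileContent --->
<cfset output = "<TR><TD><SPAN CLASS='byttl'><br>AP Top News At 1:16 p.m. EDT</SPAN></TD></TR>">
<cfset openTag = "<span class='byttl'>">

<!--- Default to the untouched input in case the tags are not found --->
<cfset newOutput = output>

<!--- FindNoCase takes the substring first, then the string to search --->
<cfset startPosTag = FindNoCase(openTag, output)>

<cfif startPosTag GT 0>
    <!--- First character after the opening tag --->
    <cfset startPosContent = startPosTag + Len(openTag)>
    <!--- First closing tag at or after that position --->
    <cfset endPosTag = FindNoCase("</span>", output, startPosContent)>
    <cfif endPosTag GT startPosContent>
        <!--- Delete by position so identical text elsewhere is left alone --->
        <cfset newOutput = RemoveChars(output, startPosContent, endPosTag - startPosContent)>
    </cfif>
</cfif>

<cfoutput>#HTMLEditFormat(newOutput)#</cfoutput>
```

With the sample input, the empty span pair is left behind and only the text between the tags is removed.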
 
LVL 1

Expert Comment

by:kjuliff
ID: 9609798
PS I left something out in the above example. This should make it clearer.

I have a string called summary. I replace all the pound signs (but it could be any string) with a tilde (~).

<cfset a = Replace(summary, "##", "~", "all")>

Then I replace everything between two tildes with the string 'dynamically generated'.
<CFSET b = REReplace(TRIM(a), "~[^~]*~", "&laquo;dynamically generated&raquo;", "all")>

b will now contain not #numaussies# but
are «dynamically generated»

as in
There are «dynamically generated» Australians currently registered as living in Belgium.


I did this for output of a Verity search query where the search string returned could have CF variables that were not defined in the page displaying the search results.






 

Author Comment

by:McHack
ID: 9610441
So far no luck,

Basically what I'm trying to do is strip out everything except for the links and the text. I've been successful with everything except for the date portion. Ideally I would just like to get rid of the date, but I have not been able to. I have not been able to remove the comments either, since they are dynamic too, but so far they don't seem to interfere with anything.

Here is an example of what the code looks like when I first bring it in before removing anything:

<TABLE BORDER=0>
<!-- BEGIN PACKAGE 150058 -->
<!-- $Id: Package.java,v 1.57 2003/04/22 21:01:40 mike Exp $ -->
<!-- BEGIN TOP HEADLINE DECORATED 55 -->
<!-- $Id: TopHeadlineDecorator.java,v 1.10 2003/04/22 21:01:41 mike Exp $ -->
<!-- BEGIN TOP HEADLINE 9009 2003/10/23 13:18:11 -->
<!-- $Id: TopHeadline.java,v 1.49 2003/04/22 21:01:41 mike Exp $ -->
<TR><TD><SPAN CLASS='byttl'><br>AP Top News At 1:16 p.m. EDT</SPAN></TD></TR>
<!-- BEGIN TOPHEADLINEITEM 7367213 -->
<!-- $Id: TopHeadlineItem.java,v 1.52 2003/04/22 21:01:41 mike Exp $ -->
<TR><TD><SPAN CLASS='storylink'><SPAN CLASS='topheadline'><SPAN CLASS='firsttopheadline'>&nbsp;<BR>
<A HREF=http://www.SomeURL.com/dynamic/stories/I/IRAQ_CONFERENCE>U.S., U.N. Seek Billions to Rebuild Iraq</A></SPAN></SPAN></SPAN><BR>
MADRID, Spain (AP) -- U.S. and Iraqi officials pleaded for billions to rebuild Iraq at a donors conference that opened Thursday with warnings that they might not get all they need right away....</TD></TR>
<!-- END TOPHEADLINEITEM 7367213 -->
<!-- BEGIN TOPHEADLINEITEM 7367214 -->
<!-- $Id: TopHeadlineItem.java,v 1.52 2003/04/22 21:01:41 mike Exp $ -->
<TR><TD><SPAN CLASS='storylink'><SPAN CLASS='topheadline'>&nbsp;<BR>
<A HREF=http://www.SomeURL.com/dynamic/stories/W/WAL_MART_ARRESTS>300 Illegal Workers Arrested at Wal-Marts</A></SPAN></SPAN><BR>
WASHINGTON (AP) -- Federal officials arrested more than 300 illegal workers at 61 Wal-Mart stores across the country early Thursday morning and searched the office of one of the retail chain's corporate executives, a federal official said....</TD></TR>
<!-- END TOPHEADLINEITEM 7367214 -->
<!-- BEGIN TOPHEADLINEITEM 7367215 -->
<!-- $Id: TopHeadlineItem.java,v 1.52 2003/04/22 21:01:41 mike Exp $ -->
<TR><TD><SPAN CLASS='storylink'><SPAN CLASS='topheadline'>&nbsp;<BR>
<A HREF=http://www.SomeURL.com/dynamic/stories/B/BUSH>Bush Heckled in Australian Parliament</A></SPAN></SPAN><BR>
CANBERRA, Australia (AP) -- Heckled inside and outside Australia's Parliament, President Bush offered a pointed answer to those who say the war with Iraq wasn't worth fighting....</TD></TR>
<!-- END TOPHEADLINEITEM 7367215 -->
<!-- END TOP HEADLINE 9009 -->
<!-- END TOP HEADLINE DECORATED 55 -->
<!-- BEGIN PACKAGE ITEM VERTICAL SPACER -->
<TR>
<TD><IMG SRC='http://www.SomeURL.com/icons/spacer.gif' HEIGHT=8 WIDTH=1 ></TD>
</TR>
<!-- END PACKAGE ITEM VERTICAL SPACER -->
<!-- END PACKAGE 150058 -->
</TABLE>
 
LVL 4

Accepted Solution

by:
procept earned 250 total points
ID: 9612837
Hi,

<cfset newText = reReplaceNoCase(cfhttp.filecontent, "<SPAN CLASS='byttl'><br>[^<]+</span>", "", "ALL")>

will remove the SPAN tags and all content between them, as long as there is always a "<br>" after the opening <span>.

HTH,

Chris
 

Author Comment

by:McHack
ID: 9614575
Thanks Chris, that's exactly what I needed. For future reference could you explain how the [^<] part works? Thanks again, here comes your points.
 
LVL 4

Expert Comment

by:procept
ID: 9619017
Hi McHack,

the regular expression means

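everything that starts with "<SPAN CLASS='byttl'><br>", is followed by one or more characters other than "<", and ends with "</span>".

[^<]+ means "one or more chars that are not '<'". Because the match cannot run past a "<", it stops at the first closing tag instead of swallowing the rest of the page.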

HTH,

Chris
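For anyone who wants to try the pattern in isolation, here is a small sketch. The `sample` string is a single line copied from the markup posted earlier in the thread; `pattern`, `cleaned`, `loosePattern` and `cleanedLoose` are illustrative names. The second pattern is an assumption about how one might relax the "<br>" requirement, not part of the accepted answer.

```cfm
<!--- One line of the fetched markup, used as a stand-in for cfhttp.FileContent --->
<cfset sample = "<TR><TD><SPAN CLASS='byttl'><br>AP Top News At 1:16 p.m. EDT</SPAN></TD></TR>">

<!--- Accepted pattern: opening tag, <br>, one or more non-"<" characters, closing tag --->
<cfset pattern = "<span class='byttl'><br>[^<]+</span>">
<cfset cleaned = REReplaceNoCase(sample, pattern, "", "all")>

<!--- Looser variant (an assumption): the <br> is optional and the text may be empty --->
<cfset loosePattern = "<span class='byttl'>(<br>)?[^<]*</span>">
<cfset cleanedLoose = REReplaceNoCase(sample, loosePattern, "", "all")>

<cfoutput>
#HTMLEditFormat(cleaned)#<br>
#HTMLEditFormat(cleanedLoose)#
</cfoutput>
```

Both calls leave only the surrounding row and cell tags for this sample, since the whole span, tags included, is matched and replaced with an empty string.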