McHack

Regular expression to replace tags

Does anyone have a regular expression that will remove a tag and all the content between its opening and closing tags, and replace it with a new tag value? For instance, I want to remove the "<head></head>" tags, along with any characters and/or nested tags between them, and replace them with "<head><title>My Title</title></head>".
mrichmon

You could do it in two steps.

Step 1) Replace <head>....</head> with <head></head>

You could use a regular expression similar to:
<head>*</head>

Step 2) Then you can replace <head></head> with what you want using a simple replace.


You could probably combine into one step, up to you.
McHack (ASKER)

OK, that didn't work. When I ran that example, the page's "<HEAD></HEAD>" tags were not removed, and neither was the content between them, of which there is a ton (JavaScript, meta tags, etc.).

Here is the code I was running:

<cftry>
    <cfhttp method="get" url="http://www.someurl.com" resolveurl="yes"></cfhttp>
    <cfcatch>
        <cfset ThrowError = true>
    </cfcatch>
</cftry>
<cfset ApScriptEdit = ReReplaceNoCase(cfhttp.FileContent, "<head>*</head>", "", "ALL")>
<cfoutput>#ApScriptEdit#</cfoutput>
Hi,

Almost there. Try this:

<cfset ApScriptEdit  = rereplacenocase(cfhttp.FileContent, "<head>.*?</head>","","ALL")>

Mause
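
A short sketch of why the two patterns behave differently, and of how the replacement text from the original question could be supplied in the same call. In "<head>*</head>" the "*" applies only to the ">" just before it, so that pattern can only match "<head", zero or more ">" characters, and then "</head>". The ".*?" form instead matches as few characters as possible up to the first closing tag. The sample string in `page`, the variable names, and the "(\s[^>]*)?" part (which tolerates attributes on the opening tag) are assumptions added for illustration and are not taken from the posts above.

```cfml
<!--- Hypothetical sample page; in the thread this would be cfhttp.FileContent --->
<cfset page = '<html><HEAD><meta name="a" content="b"><script>var x = 1;</script></HEAD><body>Hi</body></html>'>

<!--- ">*" quantifies only the ">" character, so content between the tags
      can never match and the string comes back unchanged --->
<cfset unchanged = REReplaceNoCase(page, "<head>*</head>", "", "ALL")>

<!--- ".*?" lazily consumes everything up to the first closing tag;
      "(\s[^>]*)?" optionally allows attributes on the opening tag (an added assumption) --->
<cfset newHead = "<head><title>My Title</title></head>">
<cfset replaced = REReplaceNoCase(page, "<head(\s[^>]*)?>.*?</head>", newHead, "ALL")>

<!--- HTMLEditFormat escapes the markup so both results are visible as text --->
<cfoutput>
#HTMLEditFormat(unchanged)#<br>
#HTMLEditFormat(replaced)#
</cfoutput>
```

Passing the new head block as the replacement argument folds the two steps suggested earlier into one call. The lazy quantifier matters when a page contains more than one "</head>"-like closing tag: a greedy ".*" would run to the last one.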
ASKER CERTIFIED SOLUTION
umbrae

This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Whoops. Probably want to remove the htmlCodeFormat() from that output, was doing that for debugging.

-Umbrae
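
The certified solution itself is not shown above, so the following is only a guess at the kind of debugging output being described, not umbrae's code. `HTMLCodeFormat()` escapes markup and wraps it in `<pre>` tags, so the processed source appears in the browser as text; outputting the variable directly lets the browser render it as a page. The sample string is hypothetical.

```cfml
<!--- Hypothetical result of the replacement step --->
<cfset ApScriptEdit = "<html><head><title>My Title</title></head><body>Hi</body></html>">

<!--- Debugging view: markup is escaped and shown as preformatted text --->
<cfoutput>#HTMLCodeFormat(ApScriptEdit)#</cfoutput>

<!--- Normal view: markup is sent as-is and rendered by the browser --->
<cfoutput>#ApScriptEdit#</cfoutput>
```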
McHack (ASKER)

umbrae

Works great, just what I was looking for.

It looks so simple when I see the solution, but somehow I never seem to get the regular expressions right.

Thanks for the help!

McHack