Verdend
asked on
RegEx to strip font tags in ASP
Hi Experts,
I have a CMS that allows users to update web content stored in a database. I would like to use a regular expression to clean the content of font tags prior to updating the database. If a font tag contains a color definition, that part can stay; everything else (font name, size, weight, etc.) needs to be stripped out. If a font tag contains NO color definition, then the entire tag (opening and closing) should be removed.
I found a couple of website examples that claim to do nearly what I need, but, ahem, I don't know how to use the examples in an ASP page. My page has a variable, strNewContent. I already use Replace to fix single and double quotes before making the database update. Could someone please tell me the exact syntax to use a regular expression to clean the font tags from strNewContent prior to updating the database?
Below are the samples I've found:
One website said the following would remove all font tags when replaced with backreference \6:
<(FONT|font)([ ]([a-zA-Z]+)=("|')[^"\']+("|'))*[^>]+>([^<]+)(</FONT>|</font>)
Another website said the following would strip all but the color from a font tag:
<\*?font # Match start of Font Tag
(?(?=[^>]+color.*>) #IF/THEN lookahead color in tag
(.*?color\s*?[=|:]\s*?) # IF found THEN move ahead
('+\#*?[\w\s]*'+ # CAPTURE ColorName/Hex
|"+\#*?[\w\s]*"+ # single or double
|\#*\w*\b) # or no quotes
.*?> # & move to end of tag
|.*?> # ELSE move to end of Tag
) # Close the If/Then lookahead
# Use Multiline and IgnoreCase
# Replace the matches from RE with MatchEvaluator below:
# if m.Groups(1).Value<>"" then
# Return "<font color=" & m.Groups(1).Value & ">"
# else
# Return "<font>"
# end if
Thanks for your help!
Verdend
ASKER CERTIFIED SOLUTION
Function StripFontTags(Fontstring)
    Dim RegularExpressionObject
    Set RegularExpressionObject = New RegExp
    With RegularExpressionObject
        .Pattern = "\s+(?!color)(\w)*\s*=\s*(
        .IgnoreCase = True
        .Global = True
    End With
    If Fontstring <> "" Then
        StripFontTags = RegularExpressionObject.Re
    Else
        StripFontTags = ""
    End If
    Set RegularExpressionObject = Nothing
End Function
Response.Write(StripFontTa
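
The function above is cut off in this copy, so its full pattern cannot be recovered here. As a separate illustration of the stated requirement (keep only the color, drop font tags that have none), here is a minimal sketch in VBScript. The function name CleanFontTags and all variable names are hypothetical and do not come from the thread. It avoids the second sample's approach because the VBScript RegExp object has no `(?(...)...)` conditional syntax and no MatchEvaluator (both are .NET features). Instead it walks every font tag with Execute and remembers whether each opening tag was kept. It assumes color is given as a `color=` attribute, not inside a style attribute.

```vbscript
' Hypothetical helper, not from the thread: keeps only the color attribute of
' each <font> tag and removes font tags (opening and closing) without a color.
Function CleanFontTags(strHtml)
    Dim reTag, reColor, tags, tag, colorHit
    Dim result, pos, keepStack, colorValue

    ' Null or empty input: nothing to clean.
    If Len(strHtml & "") = 0 Then
        CleanFontTags = ""
        Exit Function
    End If

    ' Matches every opening or closing font tag.
    Set reTag = New RegExp
    reTag.Pattern = "<font\b[^>]*>|</font\s*>"
    reTag.IgnoreCase = True
    reTag.Global = True

    ' Captures a color value: double-quoted, single-quoted, or unquoted.
    ' The leading \s keeps attributes such as bgcolor from matching.
    Set reColor = New RegExp
    reColor.Pattern = "\scolor\s*=\s*(?:""([^""]*)""|'([^']*)'|([^\s""'>]+))"
    reColor.IgnoreCase = True

    result = ""
    pos = 1          ' next unread character (Mid is 1-based)
    keepStack = ""   ' one "1" (kept) or "0" (dropped) per open font tag

    Set tags = reTag.Execute(strHtml)
    For Each tag In tags
        ' Copy the text between the previous tag and this one.
        result = result & Mid(strHtml, pos, tag.FirstIndex + 1 - pos)
        pos = tag.FirstIndex + tag.Length + 1

        If Left(tag.Value, 2) = "</" Then
            ' Closing tag: emit it only if its opening tag was kept.
            ' A stray </font> with no opening tag is dropped.
            If Len(keepStack) > 0 Then
                If Right(keepStack, 1) = "1" Then result = result & "</font>"
                keepStack = Left(keepStack, Len(keepStack) - 1)
            End If
        Else
            colorValue = ""
            Set colorHit = reColor.Execute(tag.Value)
            If colorHit.Count > 0 Then
                ' Only one of the three groups participates; the others are Empty.
                With colorHit(0)
                    colorValue = .SubMatches(0) & .SubMatches(1) & .SubMatches(2)
                End With
            End If
            If colorValue <> "" Then
                result = result & "<font color=""" & colorValue & """>"
                keepStack = keepStack & "1"
            Else
                keepStack = keepStack & "0"
            End If
        End If
    Next

    ' Copy whatever follows the last font tag.
    CleanFontTags = result & Mid(strHtml, pos)

    Set reTag = Nothing
    Set reColor = Nothing
End Function

' Usage, before the quote fixes and the database update:
strNewContent = CleanFontTags(strNewContent)
```

With this sketch, input such as <font face="Arial" color="red">Hi</font> <font size="2">there</font> should come out as <font color="red">Hi</font> there. Tracking kept and dropped tags on a stack, rather than using a single Replace call, is what lets nested font tags keep the right closing tags. It also confines the changes to font tags, so attributes on other elements are left alone.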