dwe0608 asked:

Regular Expression

Hi guys,

I have a string that contains a series of characters, including HTML tags. I would like to preserve the text but remove all tags, replacing the paragraph-end </p> tag with <br>. I have found the following script, but because I don't understand regular expressions, it's hard to tell exactly what it does. What exactly does this function do, and how can I modify it to preserve the text as described above?

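Function stripTags(HTMLstring)
	' Remove every HTML tag from HTMLstring and return the remaining text.
	Dim RegularExpressionObject
	Set RegularExpressionObject = New RegExp
	With RegularExpressionObject
		.Pattern = "<[^>]+>"	' "<", then one or more characters that are not ">", then ">"
		.IgnoreCase = True	' No effect here: the pattern contains no letters
		.Global = True		' Replace every match, not just the first
	End With
	stripTags = RegularExpressionObject.Replace(HTMLstring, "")
	Set RegularExpressionObject = Nothing
End Function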


MTIA

DWE
ASKER CERTIFIED SOLUTION
matija_

(The solution text is available to members only and is not shown here.)

dwe0608 (ASKER)

Hi matija_ - OK, I can see what you've done: you take the incoming HTMLString, replace </p> with #br /#, then strip the tags, and then replace #br /# with <br />. But what does the pattern do? That character sequence means nothing to me, so what do the characters mean?
The pattern finds every occurrence of "< anything inside brackets >" in your text, and the Replace call removes it. Reading it left to right: < matches a literal opening bracket, [^>]+ matches one or more characters that are not >, and > matches the closing bracket.
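
For anyone following along, the three steps described above can be sketched roughly as follows. This is only an illustration, not the accepted solution itself: the function name stripTagsKeepBreaks and the \s* in the </p> pattern are my own assumptions, and the #br /# placeholder is taken from the description above.

```vbscript
' Hypothetical helper: strips all tags but keeps paragraph ends as <br />.
Function stripTagsKeepBreaks(HTMLstring)
	Dim re, result
	Set re = New RegExp
	re.IgnoreCase = True	' so </P> is treated the same as </p>
	re.Global = True	' replace every match, not just the first

	' Step 1: swap each closing </p> for a placeholder with no angle brackets,
	' so the tag-stripping pattern in step 2 cannot match it.
	re.Pattern = "</p\s*>"
	result = re.Replace(HTMLstring, "#br /#")

	' Step 2: strip every remaining tag.
	re.Pattern = "<[^>]+>"
	result = re.Replace(result, "")

	' Step 3: turn the placeholder back into a real line break.
	stripTagsKeepBreaks = Replace(result, "#br /#", "<br />")
	Set re = Nothing
End Function

' Usage, assuming Windows Script Host (in classic ASP, use Response.Write instead):
WScript.Echo stripTagsKeepBreaks("<p>Hello <b>world</b></p><p>Bye</p>")
' Prints: Hello world<br />Bye<br />
```

One caveat with the placeholder trick: if the original text happens to contain the literal characters #br /#, they will also be turned into <br />, so pick a placeholder that cannot occur in your data.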
dwe0608 (ASKER)

So will that delete something like <input type="text" value="test value" />, including the contents of the value attribute? I suppose it would, wouldn't it...
dwe0608 (ASKER)

thanks for the help ...
Yes it would. Glad I could help...
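
To make the point about attributes concrete, here is a small sketch using the same pattern. The sample strings are made up for illustration, and WScript.Echo assumes the script runs under Windows Script Host.

```vbscript
Dim re
Set re = New RegExp
re.Pattern = "<[^>]+>"
re.Global = True

' The whole tag goes, attributes and their values included.
WScript.Echo re.Replace("Name: <input type=""text"" value=""test value"" /> (required)", "")
' Prints: Name:  (required)

' Caveat: the match ends at the first ">", even inside a quoted attribute value,
' so the rest of this tag is left behind in the output.
WScript.Echo re.Replace("<input value=""a > b"" />", "")
' Prints:  b" />
```

So the pattern works well for ordinary markup, but it is not a full HTML parser, and input with a ">" inside an attribute value will leave fragments behind.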