Replacing all text on a page without touching HTML tags

I am trying to use regular expressions to replace all text (including spaces) within an HTML fragment without touching the HTML tags. Does anyone have a regex pattern that could accomplish this?

Sample Input:
<p>This is a line with a <b>bold</b> word.</p>

Sample Output:

Any help would be great.

I have not tested this but the logic seems to fit your query.

Dim MyFragment as XElement = "<p>This is a line with a <b>bold</b> word.</p>"
For Each x In MyFragment
  x.Value = x.Value.Replace("*", "X")
Next

string result = System.Text.RegularExpressions.Regex.Replace("<p>This is a line with a <b>bold</b> word.</p>", "(?<=>.*?)[^<>](?=.*?<)", "x");

Thanks for your help, nepaluz. That is an interesting approach. It seems to use LINQ, but I am a novice when it comes to that technology. I can't get your code to work. I keep getting an error:

BC30002: Type 'XElement' is not defined.

What am I missing?
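
On the BC30002 error: `XElement` lives in the `System.Xml.Linq` namespace (assembly System.Xml.Linq.dll, .NET 3.5 and later), so this error usually means the project lacks that reference or an `Imports System.Xml.Linq` line (`using System.Xml.Linq;` in C#). Here is a minimal sketch of the LINQ to XML idea in C#, assuming the fragment is well-formed XML with a single root element. The names `XmlTextMasker` and `MaskText` are made up for illustration and do not come from the posts above.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class XmlTextMasker
{
    // Replaces every character of every text node with 'x' and leaves the
    // element tags alone. Assumes the input parses as XML (hypothetical helper).
    public static string MaskText(string fragment)
    {
        // PreserveWhitespace keeps whitespace-only text nodes so spaces are masked too.
        XElement root = XElement.Parse(fragment, LoadOptions.PreserveWhitespace);

        // Snapshot the text nodes first, then overwrite each one in place.
        foreach (XText text in root.DescendantNodes().OfType<XText>().ToList())
        {
            text.Value = new string('x', text.Value.Length);
        }

        // DisableFormatting stops the serializer from adding indentation.
        return root.ToString(SaveOptions.DisableFormatting);
    }
}
```

For example, `XmlTextMasker.MaskText("<p>Hi <b>you</b></p>")` should give `<p>xxx<b>xxx</b></p>`. Real-world HTML is often not valid XML (unclosed `<br>`, unquoted attributes), in which case `XElement.Parse` throws and this approach does not apply.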
Thank you kaufmed, but your regex pattern also replaces all contents within the (internal) HTML tags. I get the following string as a result:

Like I said, I did not test the code, and yes, it was flawed (to be economical with the truth).

I think kaufmed's solution will work for you (I haven't tested it either, though). If not, ping back and I SHALL test and modify my suggestion.
It worked great - Thanks for your help.
NP. Glad to help  :)
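
For anyone landing here later: as far as I can tell, the lookbehind/lookahead pattern posted above matches any character that has a `>` somewhere before it and a `<` somewhere after it, so the names of tags in the middle of the fragment qualify as well. One alternative (not necessarily what was used in this thread) is to match either a whole tag or a single character outside a tag, and decide per match what to return. A minimal sketch in C#, assuming tags never contain a literal `>` inside attribute values; `RegexTextMasker` and `MaskText` are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class RegexTextMasker
{
    // First alternative: a complete tag such as <b> or </p>.
    // Second alternative: any single character that is not the start of a tag.
    private static readonly Regex TagOrChar = new Regex(@"<[^>]*>|[^<]");

    public static string MaskText(string html, string replacement = "x")
    {
        // Tags are returned unchanged; every other matched character
        // (letters, digits, spaces, punctuation) becomes the replacement.
        return TagOrChar.Replace(
            html,
            m => m.Value[0] == '<' ? m.Value : replacement);
    }
}
```

For example, `RegexTextMasker.MaskText("<p>Hi <b>you</b></p>")` should give `<p>xxx<b>xxx</b></p>`. Because the tag alternative is tried first at every position, tag names and attributes are consumed as a unit and never reach the single-character branch. Comments, scripts, and attribute values containing `>` would still trip this up, so an HTML parser is the safer route for anything beyond simple fragments.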