hankknight asked:

ASP.NET / VB / REGEX: Fix Double Entities

I use ASP.NET / VB.

How can I fix entities that have been encoded twice?

For example,
&amp;#169;
should become:
&#169;

" zzz    ©    A&B   Hello  & — World     123 


kaufmed

Try:

Dim result As String = System.Text.RegularExpressions.Regex.Replace(input, "(?<=&)amp;(?=#\w+;)", String.Empty)


hankknight (ASKER)

kaufmed, your idea works on numerical entities like:
&amp;#169;

But it does not work on named entities like:
&amp;mdash;
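
One possible way to cover both cases is to widen the lookahead so it accepts decimal references, hex references, and entity names. This is only a sketch, not the accepted answer from this thread: the module name DoubleEntityFix and the function name FixDoubleEntities are made up for illustration, and the pattern assumes that any "&amp;" followed by something shaped like an entity reference was double-encoded.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module DoubleEntityFix
    ' Hypothetical helper (not from the thread): removes one level of encoding
    ' from "&amp;" when it is directly followed by something that looks like an
    ' entity reference: decimal (#169;), hex (#x2014;), or named (mdash;).
    Function FixDoubleEntities(input As String) As String
        Dim pattern As String = "&amp;(?=(?:#\d+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);)"
        ' Replace the whole "&amp;" with a bare "&"; the lookahead is not consumed.
        Return Regex.Replace(input, pattern, "&")
    End Function

    Sub Main()
        ' "A&amp;B" is left alone because "B" is not followed by a semicolon.
        Console.WriteLine(FixDoubleEntities("A&amp;B &amp;#169; &amp;mdash; &amp;#x2014;"))
    End Sub
End Module
```

Because Regex.Replace makes a single pass, a triple-encoded value such as &amp;amp;copy; only loses one level per call. The pattern also cannot tell a double-encoded entity from text that was meant to display literally as "&copy;", so it is worth checking against real data before relying on it.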

ASKER CERTIFIED SOLUTION
kaufmed
