  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 24041

Using HtmlDecode() to convert &nbsp; to space

I'm using HtmlDecode() to convert HTML entities such as &nbsp;. It works fine, except that if an '&' appears anywhere in the string before the &nbsp;, the &nbsp; is not converted to a space. If I remove the '&' or place it after the &nbsp;, it works OK. For example:

HttpUtility.HtmlDecode("M & M &nbsp;")  << Does not convert the &nbsp;

HttpUtility.HtmlDecode("M  M &nbsp;")  << Replaces the &nbsp; with a space, as desired.

HttpUtility.HtmlDecode("M  M &nbsp; R & D")  << Replaces the &nbsp; as desired, works OK when the '&' follows the &nbsp;

Any ideas?
Asked:
acadalzo
2 Solutions
 
raterus Commented:
The point of HtmlEncode/HtmlDecode is to provide an easy way to encode and decode characters that the browser treats as special; the ampersand '&' is one of them. It's not a way to convert &nbsp; to a space; that is the browser's job. There really is no conversion between &nbsp; and a space that HtmlDecode should pick up on. HtmlDecode should always be called on something that was HtmlEncoded first.
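To illustrate the encode-then-decode pairing described above, here is a minimal sketch. The class and method names (RoundTripDemo, RoundTrip) are hypothetical and not from this thread; only HttpUtility.HtmlEncode and HttpUtility.HtmlDecode are real framework calls.

```csharp
using System;
using System.Web; // HttpUtility (System.Web.dll on .NET Framework)

// Hypothetical demo class; the name is illustrative only.
public static class RoundTripDemo
{
    public static string RoundTrip(string plainText)
    {
        // HtmlEncode escapes the special characters,
        // e.g. '&' becomes "&amp;" and '<' becomes "&lt;".
        string encoded = HttpUtility.HtmlEncode(plainText);

        // HtmlDecode reverses the encoding, so the original text comes back.
        return HttpUtility.HtmlDecode(encoded);
    }
}
```

Calling RoundTripDemo.RoundTrip("M & M <b>") should hand back the same string, because the '&' was escaped before it was decoded.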
 
acadalzo (Author) Commented:
The HTML is coming from a scrape of a web page. I use the WebRequest class to return the HTML from the page to my program. I have to transform entities such as &nbsp; and &#8217; before saving the text to our database. Can you recommend a way to do this? HtmlDecode was working well for me until I encountered the issue where there is a preceding '&'. For example:

System.Web.HttpUtility.HtmlDecode("A &nbsp;B&nbsp; C  D  &nbsp;  4C&#8217;s of Alameda County")
Results: A  B  C  D     4C’s of Alameda County  << Results are OK

but:
System.Web.HttpUtility.HtmlDecode("A &nbsp;B&nbsp; C & D  &nbsp;  4C&#8217;s of Alameda County")
Results: A  B  C & D  &nbsp;  4C’s of Alameda County  <<  One &nbsp; not transformed.
 
Bob Learned Commented:
Now that looks like a bug in HtmlDecode.

Bob
 
raterus Commented:
An ampersand '&' should never be put directly into an HTML file unless it begins a character reference, such as the &nbsp; or &#8217; in your example.

With that said, your second example, "A &nbsp;B&nbsp; C & D  &nbsp;  4C&#8217;s of Alameda County", is invalid HTML because of the stray '&' between C and D, so you really can't expect HtmlDecode to know what to do with that string. Provide it valid HTML and it will give you a valid transformation; confuse it and, well, you get what you get.
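Following the point about the stray '&', one possible workaround for scraped text is to escape any ampersand that does not start a character reference before calling HtmlDecode. This is only a sketch under assumptions: the helper name ScrapedTextCleaner and the regular expression are illustrative, not something proposed in this thread, and it has not been checked against the .NET 1.x behaviour reported above. Note also that HtmlDecode turns &nbsp; into the non-breaking space character U+00A0 rather than an ordinary space, so the sketch replaces it explicitly.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Web; // HttpUtility (System.Web.dll on .NET Framework)

// Hypothetical helper; the name and approach are illustrative only.
public static class ScrapedTextCleaner
{
    // Matches an '&' that does NOT begin a named (&nbsp;), decimal (&#8217;)
    // or hexadecimal (&#x2019;) character reference.
    private static readonly Regex StrayAmpersand = new Regex(
        @"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)");

    public static string Decode(string scrapedHtml)
    {
        // 1. Escape stray ampersands so the input is well-formed.
        string wellFormed = StrayAmpersand.Replace(scrapedHtml, "&amp;");

        // 2. Decode every entity, including the "&amp;" added in step 1.
        string decoded = HttpUtility.HtmlDecode(wellFormed);

        // 3. &nbsp; decodes to U+00A0; swap it for a plain space.
        return decoded.Replace('\u00A0', ' ');
    }
}
```

One limitation of this sketch: text such as "AT&T;" looks like an entity to the pattern and is left untouched, so unusual input may still need manual review.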
 
Bob Learned Commented:
Now, that is a very good point about ampersands :)

Bob
 
acadalzo (Author) Commented:
I've been informed that it works in the next version of the .NET Framework (v2).
Thanks to all who responded.