UTF-8 character set does not support the ¬ character (logical not sign) in ASP pages

I am moving a product over to Unicode from all-ASCII/varchar data.

I changed all pages to have

Response.Charset = "UTF-8"


<meta http-equiv="content-type" content="text/html;charset=utf-8">

But we have made extensive use throughout the product of the ¬ character as a separator. When I debug the following line under UTF-8

            sHTMLTree = dc.GetPage("HTMLTree", "1", "CURRENTLIST", 0, 0, 0, 0, "", 0, 0, "MainMenu", 2, "0¬-1¬1,")

I see this..

            sHTMLTree = dc.GetPage("HTMLTree", "1", "CURRENTLIST", 0, 0, 0, 0, "", 0, 0, "MainMenu", 2, "0-11,")

So the IIS process has removed all ¬ characters.

Is there a charset we can use that would support this character? Most other characters like dollar, hash (pound in the US) etc. have already been used, so we don't really know of an alternative.

smidgie82 Commented:
The ¬ character isn't part of 7-bit ASCII; it comes from the 8-bit range often loosely called extended ASCII (here, ISO-8859-1).  So if you only change the character set the page is served as, without re-saving the files on disk in a matching encoding, that character (and every other character with an index above 127) will cause problems.  This should solve the problem:

Back up all your code first.  Then, copy the existing code (viewed as ISO-8859-1) into Notepad.  Make sure it displays the way you want it to. Now do a "Save As," and select "UTF-8" under the "Encoding" drop-down menu.  Save it over itself.  You'll need to repeat this for every file, which could be a major task depending on the size of your site, but I don't know of any faster automated way to do it.  After you get done, the file encoding should match the display character set, and you shouldn't see any more problems.
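
For what it's worth, the Notepad steps above could probably be scripted. This is only a rough sketch in Python (not the VB the product uses), assuming every file is currently Windows-1252 and that a UTF-8 byte order mark is wanted at the start of each file. `reencode_file` and `reencode_tree` are made-up names for illustration, not part of any tool mentioned in this thread. Back up first, as above.

```python
import codecs
from pathlib import Path


def reencode_file(path, src_encoding="cp1252", dst_encoding="utf-8-sig"):
    """Rewrite one file in place; return True if it was converted."""
    raw = Path(path).read_bytes()
    if raw.startswith(codecs.BOM_UTF8):
        return False  # already carries a UTF-8 BOM, so leave it alone
    # A strict decode raises UnicodeDecodeError on bytes the source code
    # page does not define, instead of silently dropping them.
    text = raw.decode(src_encoding)
    # Work on bytes throughout so line endings are left untouched.
    Path(path).write_bytes(text.encode(dst_encoding))
    return True


def reencode_tree(root, pattern="*.asp"):
    """Convert every matching file under root; return the converted paths."""
    return [p for p in sorted(Path(root).rglob(pattern)) if reencode_file(p)]
```

Calling `reencode_tree` on the site's root folder (whatever path that is on your machine) would return the list of files it changed. Running it a second time should change nothing, because converted files are recognised by their BOM.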
plq (Author) Commented:
By the way, if I change my browser from ISO to unicode, the logical not sign (even on this question html page) changes to a question mark

The UTF-8 encoding definitely supports this character: the problem isn't in UTF-8. Are you certain that the character is really missing and hasn't just been filtered out by the debug display? The NOT sign is U+00AC, which is the two bytes 0xC2 0xAC in UTF-8. Another thing to watch out for is whether the original source has been stored correctly: perhaps the editor has stripped out the character? Or you may have included the character in its native Windows encoding (a single 0xAC byte) instead of its UTF-8 encoding, and a lone 0xAC byte is meaningless in UTF-8.

When you change your browser from ISO (which ISO? I presume you mean ISO-8859-1, i.e. Latin-1) to Unicode, the character doesn't display because it has been included as its Windows CP1252 value, not as UTF-8. In other words, this page isn't Unicode, so it can't be displayed as Unicode.
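
To make the byte-level point concrete, here is a small Python sketch (Python is used only for illustration; it is not what ASP does internally). It shows the two encodings of ¬ and what happens when CP1252 bytes are read as UTF-8. That a lenient decoder dropping the invalid byte explains the "0-11," symptom is an assumption, not something confirmed about IIS.

```python
NOT_SIGN = "\u00ac"  # the ¬ character

utf8_bytes = NOT_SIGN.encode("utf-8")      # two bytes: C2 AC
cp1252_bytes = NOT_SIGN.encode("cp1252")   # one byte: AC

# The separator string from the question, as it would sit in a CP1252 file.
stored = f"0{NOT_SIGN}-1{NOT_SIGN}1,".encode("cp1252")

# A lone 0xAC is not a valid UTF-8 sequence, so a strict decode fails...
try:
    stored.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# ...while a lenient decoder either drops the byte or substitutes U+FFFD.
dropped = stored.decode("utf-8", errors="ignore")
replaced = stored.decode("utf-8", errors="replace")

print(utf8_bytes.hex(), cp1252_bytes.hex(), strict_ok)
print(dropped)
print(replaced)
```

The `dropped` value comes out as the same "0-11," string seen in the debugger, and `replaced` corresponds to the question marks a browser shows when it is switched to Unicode.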

bpmurray Commented:
Actually, ASCII is only 7-bit (ISO-646). It NEVER has 8 bits, and strictly speaking there is no such thing as extended ASCII. The usual confusion is the belief that ANSI, ISO-8859-1 and Windows-1252 are all the same. In fact, ANSI Latin-1 and ISO 8859-1 are identical, and their C1 region (0x80-0x9F) holds no printable characters, whereas Microsoft has seen fit to put printable characters into that range in Windows-1252. Either way, the NOT sign *is* part of 8859-1 and has the value 0xAC.

The bug here is that plq is including the NOT character as is, as the single byte 0xAC, in his code. That is *NOT* valid UTF-8, so it's being ignored.
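
A quick way to see where ISO-8859-1 and Windows-1252 agree and where they differ, sketched in Python for illustration. `decode_both` is a made-up helper name. Note that Python's "latin-1" codec maps 0x80-0x9F to the C1 control code points rather than rejecting them.

```python
def decode_both(byte):
    """Decode one byte as ISO-8859-1 and as Windows-1252."""
    return byte.decode("latin-1"), byte.decode("cp1252")


for byte in (b"\xac", b"\x80", b"\x99"):
    latin1, cp1252 = decode_both(byte)
    # repr() keeps the invisible C1 control characters readable.
    print(byte.hex(), repr(latin1), repr(cp1252))
```

0xAC decodes to ¬ in both, so the NOT sign is not affected by the C1 difference. 0x80 and 0x99 are control codes in Latin-1 but the euro and trademark signs in Windows-1252.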
plq (Author) Commented:
OK go easy on me as I'm not totally clear on encoding just yet.

The dc object is written in vb.net - that dc.GetPage function parameter gets the value "0-11" instead of "0¬-1¬1" so the memory behind the scenes has definitely been modified. When I remove the meta tag the ¬ characters are preserved and get passed through to vb.

So, lets go back to basics. I create a test.asp with the following content.

<meta http-equiv="content-type" content="text/html;charset=utf-8">
<%
      Response.Write "Hello¬World"
%>

.. and that worked fine.

So now I'm going to paste this into the product to see where ¬ starts getting ignored. I will come back in a short while...
plq (Author) Commented:
Right. .. crossed comments. I think I understand this.

The 300 asp pages are being maintained in vb2005 so I think it will give me an option to upgrade the whole lot to unicode if I just paste a bit of unicode in there.. I will report back on this.
Just be careful when you refer to "Unicode". There are a number of representations of this:

   *  UTF-8: this uses 1-4 bytes per character to encode all of Unicode
   *  UTF-16: this uses 16-bit values (unsigned short) to encode the Basic Multilingual Plane (BMP), using "surrogates" to address non-BMP characters, resulting in 2 x 16-bit values per character for non-BMP chars
   *  UCS-2: like UTF-16, this uses 16-bit values, but does not support surrogates, i.e. it only supports characters in the BMP
   *  UTF-32: this uses 32-bit values as a linear space to encode all of Unicode as fixed-size characters

UTF-8 is popular because it looks like ASCII for characters < 0x80, so it encodes English in 1 byte per character. It also never produces a NULL byte (except for U+0000 itself), so it doesn't waste space and can be processed in much the same way as ordinary strings.
Oh - forgot to say: we're not being hard on you, only typing quickly so it may come across as blunt. No offence meant - only trying to help! :-)
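
The size differences between these encodings can be checked with a few lines of Python (used here only as a convenient calculator). The sample characters and the `encoded_sizes` helper are chosen for illustration. The "-le" codec variants are used so that no byte order mark is counted.

```python
SAMPLES = {
    "LATIN CAPITAL A": "\u0041",     # plain ASCII
    "NOT SIGN": "\u00ac",            # the separator from the question
    "EURO SIGN": "\u20ac",           # inside the BMP
    "GRINNING FACE": "\U0001f600",   # outside the BMP
}


def encoded_sizes(ch):
    """Return the byte length of ch in UTF-8, UTF-16 and UTF-32."""
    return tuple(len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le"))


for name, ch in SAMPLES.items():
    print(name, encoded_sizes(ch))

# Outside the BMP, UTF-16 needs a surrogate pair (two 16-bit values).
surrogates = "\U0001f600".encode("utf-16-be")
print(surrogates.hex())
```

The non-BMP character takes four bytes in all three encodings: four UTF-8 bytes, one surrogate pair in UTF-16, and one 32-bit value in UTF-32.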
plq (Author) Commented:
Yes, I saved the problematic ASP file as Unicode and now it's working.

Now I've just got to do 299 more. yyyyyuuukk actually 338 more, bigger yuuk

Maybe I'll write a vb program to do it instead !

I'll keep this Question open just until the conversion is done.

thanks for your help
plq (Author) Commented:
Excellent. With a bit of swish global replace in VS2005 I now have all the pages saved as Unicode without writing any programs to do it.

Question has a verified solution.
