Solved

Why does iso-8859-1 work with åäö but not utf-8? Isn't utf-8 supposed to support all languages?

Posted on 2010-09-24
Last Modified: 2013-11-18
Hi guys

I have posted questions about character sets here before. Here is yet another one.

I'm testing the "window.onbeforeunload" handler, which is supposed to warn the user if he/she is about to leave my page somehow. It works fine, but there is a problem.
When I enter åäö characters in the return value (the text that will be shown in the popup box),
it doesn't interpret the characters correctly but instead shows the classic black diamond with a question mark inside.

I started using utf-8 through the whole project since I thought that it was capable of handling all characters there are. I have this in the <head> of the PHP page:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Since it didn't work I tried to find a solution.

The JavaScript file which has the function declaration in it is loaded like this:

<script language="Javascript" type="text/javascript" src="js_includes/main_page_01_js_01.js"></script>

But I read that if you add a charset declaration to it, it tells the browser which character encoding to use, so I rewrote it to look like this:

<script language="Javascript" charset="utf-8" type="text/javascript" src="js_includes/main_page_01_js_01.js"></script>

The utf-8 doesn't work either. However, if I rewrite it to use ISO-8859-1 instead,
it works.

It also works if I just set the header of the PHP file to ISO-8859-1,
but not when the JavaScript file is loaded with charset=utf-8.

Which leads me to think that utf-8 can't handle åäö in a nice manner.

Or have I missed something?

I really need to get this straight...

Question by:walkman69
32 Comments
 
LVL 3

Expert Comment

by:rgeers
What character set is your browser set to use? Perhaps it is fixed to iso-8859-1.
 
LVL 2

Author Comment

by:walkman69
I haven't changed any settings. I'm using the latest version of Firefox,
and it is set to: Unicode UTF-8
 
LVL 2

Author Comment

by:walkman69
Comment Utility
Btw, here is the whole javascript function-declaration:

window.onbeforeunload = function () {
   return "Om du valt att uppdatera sidan kommer ändringarna att säkerhetskopieras."
}
0
 
LVL 82

Expert Comment

by:leakim971
What about the charset of the page itself? UTF-8 too?
 
LVL 2

Author Comment

by:walkman69
Yes
 
LVL 2

Author Comment

by:walkman69
This is the beginning of the page:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">


<head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
 
LVL 2

Author Comment

by:walkman69
I even tried to change DOCTYPE to transitionalm without any result of course.
 
LVL 2

Author Comment

by:walkman69
*transitional
 
LVL 2

Author Comment

by:walkman69
I'm using a shared web host. I know there is some way to check the PHP specs on the page, but I just can't remember now.. =/

Would be nice to know if they have somehow limited PHP to only use ISO-8859-1.
 
LVL 82

Expert Comment

by:leakim971
Please confirm the accents display correctly with the following:

(On my side there is no problem: when I leave the page by entering another URL, I get the message with the accents.)
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
<script language="javascript">
window.onbeforeunload = function () {
   return "Om du valt att uppdatera sidan kommer ändringarna att säkerhetskopieras."
}
</script>
</head>
<body>
</body>
</html>

 
LVL 3

Expert Comment

by:rgeers
Did you check that the server is actually sending UTF-8? Are you getting utf-8 from your server or iso-8859-1?
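
One way to check this, sketched in JavaScript (Node.js): the charset the server declares lives in the HTTP Content-Type response header, and when present it takes precedence over the <meta> tag in the page. The helper name charsetFromContentType is made up for illustration, and the header strings are sample values rather than output from the asker's server.

```javascript
// Hypothetical helper: pull the charset parameter out of a Content-Type
// header value such as "text/html; charset=utf-8".
// Returns the charset in lower case, or null if none is declared.
function charsetFromContentType(headerValue) {
  if (typeof headerValue !== "string") return null;
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(headerValue);
  return match ? match[1].toLowerCase() : null;
}

// Sample header values (assumptions, not taken from the thread).
console.log(charsetFromContentType("text/html; charset=UTF-8")); // utf-8
console.log(charsetFromContentType("application/javascript")); // null
```

A null result means the server declares no charset for that file, so the browser falls back on other hints such as the charset attribute on the <script> tag or the encoding of the including page.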
 
LVL 3

Expert Comment

by:rgeers
Remember to use different tools when testing. Are you familiar with wget or curl?
 
LVL 3

Expert Comment

by:rgeers
You might want to read http://www.tldp.org/HOWTO/Unicode-HOWTO.html to set everything to UTF-8
 
LVL 2

Author Comment

by:walkman69
leakim971: Yes, the accents are right with page-embedded JavaScript.

rgeers: I'm not really sure what I'm getting, and no, I'm not familiar with wget or curl.
 
LVL 3

Expert Comment

by:rgeers
Try: wget -O - <url-to-test>

Download wget: http://gnuwin32.sourceforge.net/packages/wget.htm
Wget manual: http://www.gnu.org/software/wget/manual/wget.html

See how your national characters are represented. I guess you are aware of http://www.w3schools.com/tags/ref_entities.asp and http://www.w3schools.com/TAGS/ref_charactersets.asp?
 
LVL 2

Author Comment

by:walkman69
With phpinfo.php I only found out this much:

default_charset      Local value: no value      Master value: no value

_SERVER["HTTP_ACCEPT_CHARSET"]:         ISO-8859-1,utf-8;q=0.7,*;q=0.7

_ENV["HTTP_ACCEPT_CHARSET"]:       ISO-8859-1,utf-8;q=0.7,*;q=0.7
 
LVL 2

Author Comment

by:walkman69
rgeers: Uhm.. weird. I tried to download it to a file, and it said that it saved the file. But when I use dir /s test.txt to find it, it says the file is in that directory, yet I can't see it and I can't open it. Quite disturbing..
 
LVL 3

Expert Comment

by:rgeers
That was strange indeed. Can you show me the output of the command?
 
LVL 2

Author Comment

by:walkman69
rgeers: OK, I got it to work. I had to log in as administrator.

Anyway, it downloads the PHP file with those strange characters instead of åäö.
 
LVL 2

Author Comment

by:walkman69
Which is weird, since those usually work when just loading the page. It's just the JavaScript file that is somewhat weird. But then again, I've had quite some trouble with charsets.
 
LVL 3

Expert Comment

by:rgeers
Then you know what the server is sending to your browser. Could you replace one of these characters with &aring; for å, or use another example, just to see what your server sends?
 
LVL 2

Author Comment

by:walkman69
Done, here's the result:

Before, it sent Ã¥ where it said å on the page.

Now when I change it to &aring;, it sends &aring;

This is exactly the same behaviour as with the JavaScript file: I tried rewriting åäö as &aring; &auml; &ouml; there,

and it sent those right back at me..
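
What wget printed can be reproduced in a few lines. A minimal sketch in JavaScript (Node.js), assuming the console decoded the bytes as ISO-8859-1: å is stored in UTF-8 as the two bytes C3 A5, and reading each byte as its own ISO-8859-1 character gives Ã and ¥. The variable names are illustrative, and the characters are written as \u escapes so the snippet does not depend on how it is saved.

```javascript
// UTF-8 stores "å" (U+00E5) as two bytes.
const utf8Bytes = Buffer.from("\u00e5", "utf8");
console.log(utf8Bytes.toString("hex")); // c3a5

// Decoding those bytes one by one as ISO-8859-1 produces the mojibake.
const misread = utf8Bytes.toString("latin1");
console.log(misread); // Ã¥

// An HTML entity is plain ASCII, so it is byte-identical in both encodings.
const entity = "&aring;";
const sameBytes = Buffer.from(entity, "utf8").equals(Buffer.from(entity, "latin1"));
console.log(sameBytes); // true
```

So Ã¥ in the wget output is consistent with the server sending correct UTF-8, and the unchanged &aring; says nothing about the encoding either way.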
 
LVL 3

Expert Comment

by:rgeers
The funny characters might be correct: UTF-8 shown in an iso8859-1 cmd prompt. If you look here:

http://illegalargumentexception.blogspot.com/2009/04/i18n-unicode-at-windows-command-prompt.html

you might understand why. So the next thing to do is to get the file you made displayed correctly. There are some editors on Windows that allow you to switch code page. I believe Notepad++ is one of them; let me find out.
 
LVL 2

Author Comment

by:walkman69
The difference is that the main page looks fine in the browser, but not the JavaScript function output which I mentioned earlier. Shouldn't both look the same?
 
LVL 3

Accepted Solution

by:
rgeers earned 500 total points
The &aring; is something your browser interprets, so your web server is doing the right thing by not changing anything. Try loading the file into an editor and switching code page. Let me see if I can do this here.
 
LVL 2

Author Comment

by:walkman69
I don't think the command prompt has anything to do with it..

I changed the header of the PHP file to use iso-8859-1
and downloaded the file again with wget. The result was that it returned the &aring; as just &aring;,
but it returned the åäö as åäö. So iso-8859-1 just seems to work better than utf-8. ^^
 
LVL 3

Expert Comment

by:rgeers
Well, this shows that the command prompt is showing you iso8859-1 or probably code page 437; check this with chcp in the command prompt. So now you need to make the command prompt UTF-8 or get an editor that shows you UTF-8. The funny characters are probably UTF-8.
 
LVL 2

Author Comment

by:walkman69
Wow... I'm not the brightest bulb in the light shop..

My knowledge of charsets is still quite limited. Now I have learned that the document itself has to be saved as utf-8. ^^

When I saved the documents with this charset in my editor, the page displayed both the PHP file AND the JavaScript output correctly. The reason it only displayed correctly in ISO-8859-1 mode was of course that my editor was set to save documents in European mode (ISO-8859-1).

^^ I feel quite stupid, yet relieved right now..
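
The black diamond from the original question is the reverse mistake. A small sketch in JavaScript (Node.js), under the assumption that the editor saved the file as ISO-8859-1 while the page declared utf-8: the single bytes E5, E4 and F6 are not valid UTF-8 on their own, so a UTF-8 decoder substitutes U+FFFD (the replacement character) for each of them. Variable names are illustrative.

```javascript
// Bytes an ISO-8859-1 editor writes for "åäö": one byte per character.
const latin1Bytes = Buffer.from("\u00e5\u00e4\u00f6", "latin1");
console.log(latin1Bytes.toString("hex")); // e5e4f6

// Decoded as UTF-8 (what the page declared), every byte is invalid
// and becomes U+FFFD, the "diamond with a question mark".
const asUtf8 = latin1Bytes.toString("utf8");
console.log(asUtf8 === "\ufffd\ufffd\ufffd"); // true

// Decoded with the encoding the file was really saved in, the text is intact.
const asLatin1 = latin1Bytes.toString("latin1");
console.log(asLatin1 === "\u00e5\u00e4\u00f6"); // true
```

This is why switching the declaration to ISO-8859-1 appeared to fix things: the declaration then matched the bytes on disk. UTF-8 itself was never the problem.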
 
LVL 3

Expert Comment

by:rgeers
There are some parsers that can output your text as &aring; for å etc., depending on what language you write your script in, and this is a more standards-compliant method. Because you have no guarantee about which browser your code will run in, this might be another option. It shows up in your output as code, but in the browser it renders correctly. You either need to pass each string through such a converter before you output it, or you just write your messages encoded with these &amp; symbols. But again, this is another path. We can try to solve your problem by setting the right tags and character set first.
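
For text inside a JavaScript string, as opposed to HTML markup, the comparable trick is a \uXXXX escape: HTML entities are not decoded inside a script's string literals, which fits what the asker saw when &aring; came back unchanged. A hedged sketch in JavaScript; toUnicodeEscapes is a hypothetical helper, not something from the thread, and it only rewrites characters outside ASCII.

```javascript
// Hypothetical helper: replace every non-ASCII UTF-16 code unit with a
// \uXXXX escape, so the resulting source text contains only ASCII bytes
// and reads the same whether the file is treated as ISO-8859-1 or UTF-8.
function toUnicodeEscapes(text) {
  return text.replace(/[^\x00-\x7f]/g, (ch) =>
    "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

const escaped = toUnicodeEscapes("s\u00e4kerhetskopieras");
console.log(escaped); // s\u00e4kerhetskopieras
```

Pasting the escaped text into the return statement of the onbeforeunload handler would sidestep the file-encoding question for that one string, though saving the file as UTF-8 remains the simpler fix.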
 
LVL 3

Expert Comment

by:rgeers
I'm glad I could be of help and shed some light on the situation. The same thing has happened to me too.
 
LVL 2

Author Closing Comment

by:walkman69
Since you talked about changing code pages, I remembered that I had seen a setting for this in my editor, and changed it to UTF-8. I saved the PHP file and the JavaScript file in UTF-8 and uploaded them to the server.

And voila!

Thanks rgeers!
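
The same re-save can be scripted. A minimal sketch in JavaScript (Node.js), assuming the existing files really are ISO-8859-1; the function names are invented for illustration. isValidUtf8 uses TextDecoder in fatal mode, which throws on malformed input, to tell whether a file still needs converting.

```javascript
// Returns true when the bytes form well-formed UTF-8.
function isValidUtf8(bytes) {
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false;
  }
}

// Re-encode ISO-8859-1 bytes as UTF-8 (assumes the input is ISO-8859-1).
function latin1ToUtf8(bytes) {
  return Buffer.from(Buffer.from(bytes).toString("latin1"), "utf8");
}

const before = Buffer.from([0xe5, 0xe4, 0xf6]); // "åäö" as an ISO-8859-1 editor saves it
const after = latin1ToUtf8(before);
console.log(isValidUtf8(before)); // false
console.log(after.toString("hex")); // c3a5c3a4c3b6
console.log(isValidUtf8(after)); // true
```

With real files, the bytes would come from fs.readFileSync(path) and the converted buffer would go back out with fs.writeFileSync(path, converted). A file containing only ASCII passes the check unchanged, since ASCII is valid in both encodings.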
 
LVL 82

Expert Comment

by:leakim971
Congratulations @rgeers, impressive!