Why does ISO-8859-1 work with åäö but not UTF-8? Isn't UTF-8 supposed to support all languages?

Hi guys

I have posted questions about character sets here before.. Here is yet another one.

I'm testing the "window.onbeforeunload" handler, which is supposed to warn the user if he/she is about to leave my page somehow. It works fine, but there is a problem.
When I put åäö characters in the return value (the text that will be shown in the popup box),
it doesn't interpret the characters correctly but instead shows the classic black diamond with a question mark inside.

I started using UTF-8 throughout the whole project since I thought it was capable of handling all characters there are. I have this in the <head> of the PHP page:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Since it didn't work, I tried to find a solution.

The JavaScript file which has the function declaration in it is loaded like this:

<script language="Javascript" type="text/javascript" src="js_includes/main_page_01_js_01.js"></script>

But I read that if you add a charset declaration to it, it tells the browser which character encoding to use, so I rewrote it to look like this:

<script language="Javascript" charset="utf-8" type="text/javascript" src="js_includes/main_page_01_js_01.js"></script>

UTF-8 doesn't work either. However, if I rewrite it to use ISO-8859-1 instead, it works.

It also works if I just set the header of the PHP file to ISO-8859-1,
but not when the JavaScript file is loaded with charset=utf-8.

Which leads me to think that UTF-8 can't handle åäö in a nice manner.

Or have I missed something?

I really need to get this straight...

Asked by: walkman69
 
rgeers commented:
The &aring; is something your browser is set to interpret, so your web server is doing the right thing by not changing anything. Try loading the file into an editor and switching the code page. Let me see if I can do this here.
 
rgeers commented:
What character set is your browser set to use? Perhaps it is fixed to ISO-8859-1.
 
walkman69 (Author) commented:
I haven't changed any settings. I'm using the latest version of Firefox,
and it is set to: Unicode (UTF-8)
 
walkman69 (Author) commented:
Btw, here is the whole JavaScript function declaration:

window.onbeforeunload = function () {
   return "Om du valt att uppdatera sidan kommer ändringarna att säkerhetskopieras.";
};
 
leakim971 (Pluritechnician) commented:
What about the charset of the page itself? UTF-8 too?
 
walkman69 (Author) commented:
Yes
 
walkman69 (Author) commented:
This is the beginning of the page:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">


<head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
 
walkman69 (Author) commented:
I even tried to change DOCTYPE to transitionalm without any result of course.
 
walkman69 (Author) commented:
*transitional
 
walkman69 (Author) commented:
I'm using a web host. I know there is some way to check the PHP settings on the server, but I just can't remember it right now.. =/

It would be nice to know if they have somehow limited PHP to only use ISO-8859-1.
 
leakim971 (Pluritechnician) commented:
Please confirm that the accents display correctly with the following:

(On my side there is no problem: when I leave the page by entering another URL, I get the message with the accents.)
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
<script language="javascript">
window.onbeforeunload = function () {
   return "Om du valt att uppdatera sidan kommer ändringarna att säkerhetskopieras."
}
</script>
</head>
<body>
</body>
</html>

 
rgeers commented:
Did you check what the server is sending? Are you getting UTF-8 from your server, or ISO-8859-1?
 
rgeers commented:
Remember to use different tools when testing. Are you familiar with wget or curl?
 
rgeers commented:
You might want to read http://www.tldp.org/HOWTO/Unicode-HOWTO.html to set everything to UTF-8.
 
walkman69 (Author) commented:
leakim971: Yes, the accents are right with the JavaScript embedded in the page.

rgeers: I'm not really sure what I'm getting, and no, I'm not familiar with wget or curl.
 
rgeers commented:
Try: wget -O - <url-to-test>

Download wget: http://gnuwin32.sourceforge.net/packages/wget.htm
Wget manual: http://www.gnu.org/software/wget/manual/wget.html

See how your national characters are represented. You are aware of this: http://www.w3schools.com/tags/ref_entities.asp and http://www.w3schools.com/TAGS/ref_charactersets.asp I guess?
 
walkman69 (Author) commented:
With phpinfo.php I only found out this much:

default_charset      Local value: no value      Master value: no value

_SERVER["HTTP_ACCEPT_CHARSET"]:         ISO-8859-1,utf-8;q=0.7,*;q=0.7

_ENV["HTTP_ACCEPT_CHARSET"]:       ISO-8859-1,utf-8;q=0.7,*;q=0.7
 
walkman69 (Author) commented:
rgeers: Uhm.. weird. I tried to download it to a file, and it said that it saved the file. But when I use dir /s test.txt to find it, it says the file is in that directory, yet I can't see it and I can't open it.. Quite disturbing..
 
rgeers commented:
That was strange indeed. Can you show me the output of the command?
 
walkman69 (Author) commented:
rgeers: OK, I got it to work.. I had to log in as administrator..

Anyway, it downloads the PHP file with those strange characters instead of åäö.
 
walkman69 (Author) commented:
Which is weird, since those usually work when just loading the page. It's just the JavaScript file that is somewhat weird. But then again, I've had quite some trouble with charsets.
 
rgeers commented:
Then you know what the server is sending to your browser. Could you replace one of these characters with &aring; for å, or use another example, just to see what your server sends?
 
walkman69 (Author) commented:
Done, here's the result:

Before, it sent: Ã¥  where it said å on the page

Now that I have changed it to &aring; it sends &aring;

This is the exact same behaviour that occurs with the JavaScript file,
since I tried to rewrite åäö as &aring; &auml; &ouml;

And it sent those right back at me..
 
rgeers commented:
The funny characters might actually be correct: that is what UTF-8 looks like in an ISO-8859-1 cmd prompt. If you look here:

http://illegalargumentexception.blogspot.com/2009/04/i18n-unicode-at-windows-command-prompt.html

you might understand why. So the next thing to do is to get the file you made to display correctly. There are some editors on Windows that allow you to switch code page. I believe Notepad++ is one that does; let me find out.
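
To make that concrete, here is a minimal sketch of why å can show up as Ã¥. It assumes Node.js is available (Buffer is a Node global) and the variable names are only illustrative. It encodes å as UTF-8 and then reads the same two bytes back as if they were ISO-8859-1:

```javascript
// Assumption: run with Node.js, where Buffer is a global.
const original = "\u00e5"; // the character å, written as an escape so this file's own encoding doesn't matter

// UTF-8 stores å as two bytes: 0xC3 0xA5.
const utf8Bytes = Buffer.from(original, "utf8");
console.log(utf8Bytes);

// A program that assumes ISO-8859-1 shows one character per byte:
// 0xC3 is Ã and 0xA5 is ¥, which gives the "funny characters".
const misread = utf8Bytes.toString("latin1");
console.log(misread);

// The bytes themselves are intact: reading them as UTF-8 again gives å back.
const repaired = Buffer.from(misread, "latin1").toString("utf8");
console.log(repaired === original);
```

So seeing Ã¥ in a tool that is not in UTF-8 mode suggests the data is UTF-8, not that it is broken.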
 
walkman69 (Author) commented:
The difference is that the main page looks fine in the browser, but not the JavaScript function output which I mentioned earlier. Shouldn't both look the same?
 
walkman69 (Author) commented:
I don't think the command prompt has anything to do with it..

I changed the header of the PHP file to use ISO-8859-1
and downloaded the file again with wget. The result was that it returned the &aring; as just &aring;,
but it returned the åäö as åäö. So ISO-8859-1 just seems to work better than UTF-8. ^^
 
rgeers commented:
Well, this shows that the command prompt is displaying ISO-8859-1, or probably code page 437; you can check this with chcp in the command prompt. So now you need to either switch the command prompt to UTF-8 or get an editor that shows you UTF-8. The funny characters are probably UTF-8.
 
walkman69 (Author) commented:
wow... I'm not the brightest bulb in the light shop..

My knowledge about charsets is still quite limited.. Now I have learned that your document has to be formatted and saved as UTF-8.. ^^

When I saved the documents with this charset in my editor, I noticed that the page displays the PHP file AND the JavaScript output correctly. The reason it only showed correctly in ISO-8859-1 mode was of course that my editor was set to save documents in European mode (ISO-8859-1)..

^^ I feel quite stupid.. yet relieved right now..
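
For anyone who hits the same thing, here is a small sketch of what was going on. It assumes Node.js (Buffer and TextDecoder are globals there), and the variable names are made up for illustration. It takes one word from the message, "saves" it both ways, and then decodes both byte sequences as UTF-8, which is what the browser does when it is told charset=utf-8:

```javascript
// Assumption: run with Node.js, where Buffer and TextDecoder are globals.
const text = "\u00e4ndringarna"; // "ändringarna", with ä written as an escape

// What an editor in ISO-8859-1 mode writes: one byte per character (ä becomes 0xE4).
const savedAsLatin1 = Buffer.from(text, "latin1");
// What an editor in UTF-8 mode writes: two bytes for ä (0xC3 0xA4).
const savedAsUtf8 = Buffer.from(text, "utf8");

// Decode both as UTF-8, like a browser that was told charset=utf-8.
const decoder = new TextDecoder("utf-8");
const fromLatin1File = decoder.decode(savedAsLatin1);
const fromUtf8File = decoder.decode(savedAsUtf8);

// The lone byte 0xE4 is not valid UTF-8, so the decoder substitutes U+FFFD,
// the replacement character that browsers draw as a diamond with a question mark.
console.log(fromLatin1File.charCodeAt(0).toString(16));
// The file that really is UTF-8 decodes back to the original word.
console.log(fromUtf8File === text);
```

The charset declaration only tells the browser how to read the bytes; it does not change how the editor wrote them.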

 
rgeers commented:
There are some parsers that can output your code as &aring; for å etc., depending on which language you write your script in, and this is a more standards-compliant method. Because you have no guarantee about which browser your code will run in, this might be another option. It shows up in your output as code, but in your browser it will render correctly. You either need to pass each string through such a converter before you output it, or you just write your message encoded with these & entities. But again, this is another path. We can try to solve your problem by setting the right tags and character set first.
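
One caveat: HTML entities such as &aring; are only decoded in HTML markup, not inside a JavaScript string literal in a .js file, which matches what walkman69 saw when the entities came back unchanged. The comparable trick for JavaScript strings is the \uXXXX escape, which keeps the .js file pure ASCII. Below is a rough sketch of a converter. The helper name escapeNonAscii is hypothetical (not from any library), and the snippet assumes it is itself saved as UTF-8, because the sample message contains ä:

```javascript
// Hypothetical helper: rewrite every non-ASCII UTF-16 code unit as a \uXXXX escape,
// so the resulting text can be pasted into a string literal in an ASCII-only .js file.
function escapeNonAscii(text) {
  return text.replace(/[\u0080-\uffff]/g, function (ch) {
    return "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0");
  });
}

// The message from the question (this line needs the file to be saved as UTF-8).
const message = "Om du valt att uppdatera sidan kommer ändringarna att säkerhetskopieras.";
const escaped = escapeNonAscii(message);

// Prints the message with each ä replaced by \u00e4.
console.log(escaped);
```

The escaped text can then be used as the return value in the onbeforeunload handler, and it should read the same whichever charset the browser assumes for the script file.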
 
rgeers commented:
I'm glad I could be of help and shed some light on the situation. The same thing has happened to me too.
 
walkman69 (Author) commented:
Since you talked about changing code pages, I remembered that I had seen a setting for this in my editor, and changed it to UTF-8. I saved the PHP file and the JavaScript file as UTF-8 and uploaded them to the server.

And voila!..

Thanks rgeers!
 
leakim971 (Pluritechnician) commented:
Congratulations @rgeers, impressive!