Solved

Why does ISO-8859-1 work with åäö but not UTF-8? Isn't UTF-8 supposed to support all languages?

Posted on 2010-09-24
Last Modified: 2013-11-18
Hi guys

I have posted questions about character sets here before. Here is yet another one.

I'm testing the "window.onbeforeunload" handler, which is supposed to warn the user if he/she is about to leave my page. It works fine, but there is a problem.
When I put å, ä or ö characters in the return value (the text shown in the popup box),
they are not interpreted correctly; instead I get the classic black diamond with a question mark inside.

I started using UTF-8 throughout the whole project since I thought it could handle every character there is. I have this in the <head> of the PHP page:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Since it didn't work, I tried to find a solution.

The JavaScript file which contains the function declaration is loaded like this:

<script language="Javascript" type="text/javascript" src="js_includes/main_page_01_js_01.js"></script>

But I read that adding a charset declaration to the tag tells the browser which character encoding to use, so I rewrote it to look like this:

<script language="Javascript" charset="utf-8" type="text/javascript" src="js_includes/main_page_01_js_01.js"></script>

UTF-8 doesn't work either. However, if I rewrite it to use ISO-8859-1 instead,
it works.

It also works if I just set the header of the PHP file to ISO-8859-1,
but not when the JavaScript file is loaded with charset=utf-8.

Which leads me to think that UTF-8 can't handle åäö in a nice manner.

Or have I missed something??

I really need to get this straight...










Question by:walkman69
32 Comments
 
LVL 3

Expert Comment

by:rgeers
ID: 33753912
What character set is your browser set to use? Perhaps it is fixed to ISO-8859-1.
0
 
LVL 2

Author Comment

by:walkman69
ID: 33753950
I haven't changed any settings. I'm using the latest version of Firefox,
and it is set to: Unicode UTF-8
0
 
LVL 2

Author Comment

by:walkman69
ID: 33753963
Btw, here is the whole javascript function-declaration:

window.onbeforeunload = function () {
   return "Om du valt att uppdatera sidan kommer ändringarna att säkerhetskopieras."
}
0
 
LVL 82

Expert Comment

by:leakim971
ID: 33753979
What about the charset of the page itself? UTF-8 too?
0
 
LVL 2

Author Comment

by:walkman69
ID: 33754050
Yes
0
 
LVL 2

Author Comment

by:walkman69
ID: 33754058
This is the beginning of the page:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">


<head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
0
 
LVL 2

Author Comment

by:walkman69
ID: 33754069
I even tried to change DOCTYPE to transitionalm without any result of course.
0
 
LVL 2

Author Comment

by:walkman69
ID: 33754071
*transitional
0
 
LVL 2

Author Comment

by:walkman69
ID: 33754142
I'm using a shared web host. I know there is some way to check the PHP configuration on the server, but I just can't remember it now.. =/

It would be nice to know if they have somehow limited PHP to only use ISO-8859-1.
0
 
LVL 82

Expert Comment

by:leakim971
ID: 33754174
Please confirm the accents are right with the following:

(No problem on my side: when I leave the page by entering another URL, I get the message with the accents.)
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
<script language="javascript">
window.onbeforeunload = function () {
   return "Om du valt att uppdatera sidan kommer ändringarna att säkerhetskopieras."
}
</script>
</head>
<body>
</body>
</html>


0
 
LVL 3

Expert Comment

by:rgeers
ID: 33754277
Did you check that the server is actually sending UTF-8? Are you getting UTF-8 from your server, or ISO-8859-1?
0
 
LVL 3

Expert Comment

by:rgeers
ID: 33754319
Remember to use different tools when testing. Are you familiar with wget or curl?
0
 
LVL 3

Expert Comment

by:rgeers
ID: 33754343
You might want to read http://www.tldp.org/HOWTO/Unicode-HOWTO.html to set everything to UTF-8
0
 
LVL 2

Author Comment

by:walkman69
ID: 33756193
leakim971: Yes, the accents are right with JavaScript embedded in the page.

rgeers: I'm not really sure what I'm getting, and no, I'm not familiar with wget or curl.
0
 
LVL 3

Expert Comment

by:rgeers
ID: 33756626
Try: wget -O - <url-to-test>

Download wget: http://gnuwin32.sourceforge.net/packages/wget.htm
Wget manual: http://www.gnu.org/software/wget/manual/wget.html

See how your national characters are represented. You are aware of this: http://www.w3schools.com/tags/ref_entities.asp and http://www.w3schools.com/TAGS/ref_charactersets.asp I guess?
0
 
LVL 2

Author Comment

by:walkman69
ID: 33757541
With phpinfo.php I only found out this much:

default_charset      Local value: no value      Master value: no value

_SERVER["HTTP_ACCEPT_CHARSET"]:         ISO-8859-1,utf-8;q=0.7,*;q=0.7

_ENV["HTTP_ACCEPT_CHARSET"]:       ISO-8859-1,utf-8;q=0.7,*;q=0.7
0
 
LVL 2

Author Comment

by:walkman69
ID: 33757921
rgeers: Uhm.. weird. I tried to download it to a file, and it said that it had saved the file. But when I use dir /s test.txt to find it, it says the file is in that directory, yet I can't see it and I can't open it.. Quite disturbing..
0
 
LVL 3

Expert Comment

by:rgeers
ID: 33757980
That was strange indeed. Can you show me the output of the command?
0
 
LVL 2

Author Comment

by:walkman69
ID: 33757989
rgeers: Ok I got it to work.. I had to log in as administrator..

Anyway, it downloads the php-file with those strange characters instead of åäö.
0
 
LVL 2

Author Comment

by:walkman69
ID: 33758005
Which is weird, since those characters usually work when just loading the page. It's only the JavaScript file that behaves strangely. But then again, I've had quite some trouble with charsets.
0
 
LVL 3

Expert Comment

by:rgeers
ID: 33758015
Then you know what the server is sending to your browser. Could you change one of these characters to an entity, e.g. &aring; for å, or use another example? Just to see what your server sends.
0
 
LVL 2

Author Comment

by:walkman69
ID: 33758081
Done, here's the result:

Before, it sent: Ã¥  where it said å on the page.

Now that I've changed it to &aring;, it sends &aring;

This is exactly the same behaviour that occurs with the JavaScript file,
since I tried rewriting åäö as &aring; &auml; &ouml; there

and it sent those right back at me..
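
The `Ã¥` reported above is what the two UTF-8 bytes of `å` look like when they are read one byte at a time as ISO-8859-1. A minimal Node.js sketch of that effect (the variable names are illustrative, not from the thread):

```javascript
// UTF-8 stores "å" (U+00E5) as two bytes: 0xC3 0xA5.
const utf8Bytes = Buffer.from("\u00E5", "utf8");

// Reading those two bytes as ISO-8859-1 (Node's "latin1") maps each byte
// to its own character: 0xC3 -> "Ã" and 0xA5 -> "¥".
const misread = utf8Bytes.toString("latin1");

console.log(utf8Bytes); // the raw bytes c3 a5
console.log(misread);   // the two-character sequence "Ã¥"
```

So seeing `Ã¥` in a tool that assumes a single-byte encoding suggests the bytes on the wire are UTF-8, not that they are corrupt.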

0
 
LVL 3

Expert Comment

by:rgeers
ID: 33758102
The funny characters might actually be correct: UTF-8 shown in an ISO-8859-1 cmd prompt. If you look here:

http://illegalargumentexception.blogspot.com/2009/04/i18n-unicode-at-windows-command-prompt.html

You might understand why. So the next thing to do is to get the file you made displayed correctly. There are some editors on Windows that allow you to switch code page. I believe Notepad++ is one of them; let me find out.
0
 
LVL 2

Author Comment

by:walkman69
ID: 33758104
The difference is that the main page looks fine in the browser, but not the JavaScript function output which I mentioned earlier.. Shouldn't both look the same?
0
 
LVL 3

Accepted Solution

by:
rgeers earned 500 total points
ID: 33758125
The &aring; is an HTML entity that your browser interprets, so your web server is doing the right thing by not changing anything. Try loading the file into an editor and switching the code page. Let me see if I can do this here.
0
 
LVL 2

Author Comment

by:walkman69
ID: 33758149
I don't think the command prompt has anything to do with it..

I changed the header of the PHP file to use ISO-8859-1
and downloaded the file again with wget. The result was that it returned the &aring; as just &aring;
but it returned the åäö as åäö.. So ISO-8859-1 just seems to work better than UTF-8. ^^
0
 
LVL 3

Expert Comment

by:rgeers
ID: 33758181
Well, this shows that the command prompt is showing you ISO-8859-1, or probably code page 437; check this with chcp in the command prompt. So now you need to either switch the command prompt to UTF-8 or get an editor that shows you UTF-8. The funny characters are probably UTF-8.
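
Because a console code page can distort what is displayed, printing the raw bytes as hex sidesteps the question of how the terminal renders them. A small Node.js sketch, where `toHex` is a hypothetical helper name:

```javascript
// Format bytes as space-separated hex, independent of any console code page.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");
}

// The same character "ä" (U+00E4) encoded two ways:
const asUtf8 = Buffer.from("\u00E4", "utf8");     // two bytes
const asLatin1 = Buffer.from("\u00E4", "latin1"); // one byte

console.log(toHex(asUtf8));   // c3 a4
console.log(toHex(asLatin1)); // e4
```

Passing the contents of a downloaded file (for example, read with `fs.readFileSync`) to `toHex` would show which of the two forms the server really sent.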
0
 
LVL 2

Author Comment

by:walkman69
ID: 33758214
Wow... I'm not the brightest bulb in the light shop..

My knowledge of charsets is still quite limited.. Now I have learned that the document itself has to be saved as UTF-8.. ^^

When I saved the documents with this charset in my editor, the page displayed both the PHP file AND the JavaScript output correctly. The reason it only displayed correctly in ISO-8859-1 mode was of course that my editor was set to save documents in European mode (ISO-8859-1)..

^^ I feel quite stupid.. yet relieved right now..
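
The black diamond from the original question is U+FFFD, the replacement character a UTF-8 decoder emits for bytes that are not valid UTF-8. A sketch of the situation described here, text saved as ISO-8859-1 but decoded as UTF-8 (Node.js; the variable names are illustrative):

```javascript
// Simulate an editor saving the text as ISO-8859-1: "ä" becomes the single byte 0xE4.
const savedAsLatin1 = Buffer.from("\u00E4ndringarna", "latin1");

// The browser was told charset=utf-8, so it decodes the bytes as UTF-8.
// 0xE4 announces a multi-byte sequence, but the next byte ("n") is not a
// continuation byte, so the decoder substitutes U+FFFD for it.
const shown = new TextDecoder("utf-8").decode(savedAsLatin1);

// Decoding with the encoding the file was really saved in recovers the text.
const recovered = savedAsLatin1.toString("latin1");

console.log(shown);     // U+FFFD followed by "ndringarna"
console.log(recovered); // "ändringarna"
```

This is also why declaring ISO-8859-1 appeared to "work": the declaration then happened to match the bytes the editor had written.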

0
 
LVL 3

Expert Comment

by:rgeers
ID: 33758222
There are some parsers that can output your text as &aring; for å etc., depending on which language you write your script in, and this is a more standards-compliant method. Because you have no guarantee about which browser your code will run in, this might be another option. It shows up as code in your output, but the browser will render it correctly. You either need to run each string through such a converter before you output it, or you just write your message encoded with these &amp; symbols. But again, this is another path. We can try to solve your problem by setting the right tags and character set first.
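
One caveat: HTML entities such as `&aring;` are decoded only by the HTML parser, so inside a string in an external .js file they stay literal text, which matches what walkman69 saw earlier in the thread. The JavaScript counterpart is a `\uXXXX` escape, which keeps the file pure ASCII no matter how the editor saves it. A rough sketch; `toAsciiJsEscapes` is a hypothetical helper, and it does not escape quotes or backslashes:

```javascript
// Replace every non-ASCII UTF-16 code unit with a \uXXXX escape.
function toAsciiJsEscapes(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    out += code < 0x80
      ? text[i]
      : "\\u" + code.toString(16).toUpperCase().padStart(4, "0");
  }
  return out;
}

// The thread's message written with escapes; the string value still contains "ä".
const message =
  "Om du valt att uppdatera sidan kommer \u00E4ndringarna att s\u00E4kerhetskopieras.";

console.log(toAsciiJsEscapes(message));
```

The escaped form works regardless of the declared charset, at the cost of less readable source text.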
0
 
LVL 3

Expert Comment

by:rgeers
ID: 33758298
I'm glad I could be of help and shed some light on the situation. The same thing has happened to me too.
0
 
LVL 2

Author Closing Comment

by:walkman69
ID: 33758406
Since you talked about changing code pages, I remembered that I had seen a setting for this in my editor, and changed it to UTF-8. I saved the PHP file and the JavaScript file in UTF-8 and uploaded them to the server.

And voila!..

Thanks rgeers!
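
One way to catch this kind of mismatch before uploading is to check that a file's bytes are well-formed UTF-8. A minimal Node.js sketch, assuming a hypothetical helper name `isValidUtf8`; note that a pure-ASCII file passes as well, since ASCII is a subset of UTF-8:

```javascript
// Returns true when the bytes form well-formed UTF-8.
// TextDecoder with { fatal: true } throws a TypeError on malformed input
// instead of silently substituting U+FFFD.
function isValidUtf8(bytes) {
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false;
  }
}

console.log(isValidUtf8(Buffer.from("s\u00E4kerhetskopieras", "utf8")));   // true
console.log(isValidUtf8(Buffer.from("s\u00E4kerhetskopieras", "latin1"))); // false
```

In practice the argument would be the file contents, for example `fs.readFileSync("js_includes/main_page_01_js_01.js")` for the script from this thread.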
0
 
LVL 82

Expert Comment

by:leakim971
ID: 33758471
Congratulations @rgeers, impressive!
0
