Solved

Charsets and all languages

Posted on 2006-06-12
Last Modified: 2008-03-17
I need a list of all languages with the charset to use for each in HTML/PHP web pages and in the MySQL database. Also, what should the collation be for MySQL? What is the difference between charset and collation in MySQL when dealing with languages? I have seen sites describing which charsets to use for which languages, but they all differ. I need a solid set so I can hardcode these options into a script.

Specifically, I am thinking of this list:

Catalan
Portuguese
Czech
German
Danish
English
Spanish
Finnish
Faroese
French
Hungarian
Japanese
Italian
Dutch
Norwegian
Polish
Romanian
Russian
Swedish
Turkish
Chinese
Question by:killer455
8 Comments
 
LVL 18

Expert Comment

by:Eternal_Student
ID: 16892859
This may help with the PHP and MySQL side of things:

http://dev.mysql.com/doc/refman/5.0/en/charset.html
 

Author Comment

by:killer455
ID: 16895093
Yes, I have seen this, but I need a detailed answer specific to my question.

 
LVL 15

Expert Comment

by:bpmurray
ID: 16905434
If you use UTF-8 (or UTF-16) you can support all the above languages. If you want to use a platform encoding, you can use:

ISO-8859-1, latin1 or Windows 1252 for: Catalan, Portuguese, German, Danish, English, Spanish, Finnish, Faroese, French, Italian, Dutch, Norwegian, Swedish
ISO-8859-2, latin2 or Windows 1250 for: Czech, Hungarian, Polish, Romanian
CP932, sjis or Shift-JIS for: Japanese
ISO-8859-5, KOI8-R or Windows 1251 for: Russian (Cyrillic)
ISO-8859-9 for: Turkish
Big5 for: Traditional Chinese (Taiwan)
GB 18030 for: Simplified Chinese (PRC). Note that this is required by the Chinese government, so the old GB 2312 is no longer acceptable.
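
If you do go the legacy-encoding route, it is worth checking that sample text in each language really fits the charset you picked. Here is a minimal sketch in Python (used only for illustration; the thread itself is about PHP). The codec names are Python's, not MySQL's, and the grouping labels and the helper name can_encode are my own assumptions.

```python
# Python codec names chosen to mirror the charsets in the list above.
# These are Python identifiers, not MySQL charset names (an assumption
# made for this illustration).
LEGACY_CODECS = {
    "Western European": "cp1252",
    "Central European": "iso8859-2",
    "Japanese": "cp932",
    "Russian": "koi8-r",
    "Turkish": "iso8859-9",
    "Traditional Chinese": "big5",
    "Simplified Chinese": "gb18030",
}


def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    czech = "Příliš žluťoučký kůň"
    # Czech needs the Central European set; it does not fit in Latin-1.
    print(can_encode(czech, "iso8859-2"))
    print(can_encode(czech, "latin-1"))
```

A check like this catches a wrong pairing early, before mis-encoded text ends up in the database.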

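I very strongly recommend that you use UTF-8, since it is a universal solution for all of the above and for all the other languages you don't mention. MySQL refers to this as "utf8". BTW, be careful: ucs2, another flavor of Unicode, will not fully support the Chinese and Japanese requirements since it doesn't support surrogates. MySQL's "utf8" has a similar limit, since it stores at most three bytes per character; characters outside the Basic Multilingual Plane need "utf8mb4", which is available from MySQL 5.5.3 onward.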
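
To illustrate the surrogate remark: characters outside the Basic Multilingual Plane (some rarer Chinese and Japanese ideographs, for example) take four bytes in UTF-8 and a surrogate pair in UTF-16. A small Python sketch, with helper names (utf8_width, utf16_units, fits_in_bmp) that are my own and purely illustrative:

```python
def utf8_width(ch: str) -> int:
    """Number of bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def utf16_units(ch: str) -> int:
    """Number of 16-bit code units one character occupies in UTF-16."""
    return len(ch.encode("utf-16-be")) // 2


def fits_in_bmp(text: str) -> bool:
    """True if no character needs a surrogate pair (code point <= U+FFFF)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


if __name__ == "__main__":
    rare_han = "\U00020000"  # a CJK Extension B ideograph, outside the BMP
    for ch in ("a", "é", "日", rare_han):
        print(f"U+{ord(ch):04X}", utf8_width(ch), utf16_units(ch))
    print(fits_in_bmp("日本語"), fits_in_bmp(rare_han))
```

A check like fits_in_bmp tells you whether a given string would survive in a store that cannot hold surrogates.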

Be aware that character encodings and collations are not the same thing. An encoding determines how characters are stored as bytes; a collation determines how strings are compared and sorted. They are related only in that both tend to be tied to a particular language or country, and one country can have several collations just as it can have several encodings. MySQL associates a *default* collation with each encoding. This is usually OK, but isn't necessarily correct for a given language. I recommend that you use utf8 and pick a collation for each of these languages, so you end up with a list that maps each language to the required collation. You can list the available collations with SHOW COLLATION LIKE 'utf%';
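
A language-to-collation list of that kind could look roughly like the Python sketch below. Treat every collation name as an assumption: they are names I believe ship with MySQL 5.x, and the Norwegian and Finnish choices are conventions rather than official pairings, so confirm each one against the SHOW COLLATION output on your own server. The helper names collation_for and order_by_clause are hypothetical.

```python
# Fallback for languages without a dedicated collation (assumed name).
DEFAULT_COLLATION = "utf8_unicode_ci"

# Language -> utf8 collation. All names are assumptions to be verified
# with SHOW COLLATION LIKE 'utf%'; on the target server.
LANGUAGE_COLLATIONS = {
    "Czech": "utf8_czech_ci",
    "Danish": "utf8_danish_ci",
    "Norwegian": "utf8_danish_ci",   # convention, not an official pairing
    "Spanish": "utf8_spanish_ci",
    "Swedish": "utf8_swedish_ci",
    "Finnish": "utf8_swedish_ci",    # convention, not an official pairing
    "Hungarian": "utf8_hungarian_ci",
    "Polish": "utf8_polish_ci",
    "Romanian": "utf8_romanian_ci",
    "Turkish": "utf8_turkish_ci",
}


def collation_for(language: str) -> str:
    """Look up the collation for a language, falling back to the default."""
    return LANGUAGE_COLLATIONS.get(language, DEFAULT_COLLATION)


def order_by_clause(column: str, language: str) -> str:
    """Build an ORDER BY fragment that sorts with the language's collation.

    The column name must come from trusted code, never from user input.
    """
    return f"ORDER BY {column} COLLATE {collation_for(language)}"


if __name__ == "__main__":
    print(order_by_clause("name", "Czech"))
    print(order_by_clause("name", "English"))
```

Keeping the table in one place means the storage encoding stays utf8 everywhere and only the sort order changes per user.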

 
LVL 15

Expert Comment

by:bpmurray
ID: 16910937
One thing to watch for: the Latin charsets are all ISO-8859-x (not "9959"). They are often referred to as latinN, but N matches the ISO 8859 part number only for parts 1 to 4; latin5, for example, is ISO-8859-9.
 

Author Comment

by:killer455
ID: 16917618
When coding an application that users of many different languages will use, is there an easy way, like a PHP function changeLanguage(), to set everything necessary for the new language in the HTML/PHP pages and the database? Can this be changed on the fly?

 
LVL 15

Accepted Solution

by:
bpmurray earned 50 total points
ID: 16918416
There are two main areas you have to watch out for when making your app internationally enabled: the encoding and the locale settings. The encoding can be simplified by always using Unicode. UTF-8 is the most popular form, although UTF-16 is probably the easiest to manipulate. The locale info is more complex. It covers everything that varies from locale to locale (see CLDR on unicode.org) and includes things like date formats (the US uses M/D/Y, the UK uses D/M/Y; the West uses the Gregorian calendar, Japan also counts years by imperial era, Arab countries use the Hijri calendar, etc.), number formats (1,000,000 is displayed as 10,00,000 in Hindi), collation sequences (j, k, l, ll, m, n, o, p, q, r, rr, s ... in traditional Spanish), casing (in Turkish the uppercase of "i" keeps its dot, and the lowercase of "I" has no dot), and so on.
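
To make a few of those locale differences concrete, here is a minimal sketch in Python (for illustration only; the thread is about PHP). The pattern table and the helpers format_date, group_indian, turkish_upper and turkish_lower are hypothetical, hand-rolled stand-ins. A real application would take these rules from CLDR/ICU rather than hardcode them.

```python
from datetime import date

# Hypothetical per-locale date patterns (strftime syntax).
DATE_PATTERNS = {
    "en_US": "%m/%d/%Y",  # month/day/year
    "en_GB": "%d/%m/%Y",  # day/month/year
}


def format_date(d: date, locale_tag: str) -> str:
    """Format a date with the pattern registered for the locale."""
    return d.strftime(DATE_PATTERNS[locale_tag])


def group_indian(n: int) -> str:
    """Group digits Indian style: last three digits, then groups of two."""
    digits = str(abs(n))
    head, tail = digits[:-3], digits[-3:]
    parts = []
    while len(head) > 2:
        parts.insert(0, head[-2:])
        head = head[:-2]
    if head:
        parts.insert(0, head)
    sign = "-" if n < 0 else ""
    return sign + ",".join(parts + [tail])


def turkish_upper(s: str) -> str:
    """Uppercase with Turkish rules: i -> İ (dotted), ı -> I (dotless)."""
    return s.replace("i", "İ").replace("ı", "I").upper()


def turkish_lower(s: str) -> str:
    """Lowercase with Turkish rules: I -> ı (dotless), İ -> i (dotted)."""
    return s.replace("I", "ı").replace("İ", "i").lower()


if __name__ == "__main__":
    posted = date(2006, 6, 12)
    print(format_date(posted, "en_US"), format_date(posted, "en_GB"))
    print(group_indian(1000000))
    print(turkish_upper("istanbul"), "istanbul".upper())
```

The last line shows why the default, locale-blind upper() is not enough for Turkish text.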

While the basic functionality for this is available in Java and C/C++, ICU4C and ICU4J provide extended functionality (see http://icu.sourceforge.net). Until I saw your question, I wasn't aware that there was any ICU support for PHP, although it seemed logical that there should be. I did a quick check of the PHP site, and it looks like it's on its way: see http://ie2.php.net/manual/en/ref.unicode.php. This is great news, as it should make this standard across many facets of the web.



