
Charsets and all languages

Posted on 2006-06-12
Last Modified: 2008-03-17
I need a list of all languages with the respective charset to use in the HTML/PHP web pages and in the MySQL database. Also, what should the collation be for MySQL? What is the difference between charset and collation in MySQL when dealing with languages? I have seen sites describing which charsets to use for which languages, but they all differ. I need a solid set so I can hardcode these options into a script.

Specifically, I have this list in mind:

Catalan
Portuguese
Czech
German
Danish
English
Spanish
Finnish
Faroese
French
Hungarian
Japanese
Italian
Dutch
Norwegian
Polish
Romanian
Russian
Swedish
Turkish
Chinese
Question by:killer455
 
LVL 18

Expert Comment

by:Eternal_Student
ID: 16892859
This may help with the php and mysql side of things:

http://dev.mysql.com/doc/refman/5.0/en/charset.html
 

Author Comment

by:killer455
ID: 16895093
Yes, I have seen this, but I need a detailed answer specific to my question.

 
LVL 15

Expert Comment

by:bpmurray
ID: 16905434
If you use UTF-8 (or UTF-16) you can support all the above languages. If you want to use a platform encoding, you can use:

ISO-9959-1, latin1 or Windows 1252 for : Catalan, Portuguese, German, Danish, English, Spanish, Finnish, Faroese, French, Italian, Dutch, Norwegian, Swedish
ISO-8859-2, latin2 or Windows 1250 for : Czech, Hungarian, Polish, Romanian
CP932, sjis or Shift-JIS for : Japanese
ISO-8859-5, KOI8-R or Windows 1251 for : Russian (Cyrillic)
ISO-8859-9 for : Turkish
Big5 for : Traditional Chinese (Taiwanese)
GB 18030 for Simplified Chinese (PRC) - Note that this is required by the Chinese government so the old GB 2312 is no longer acceptable.

I very strongly recommend that you use UTF-8, since it is a universal solution for all of the above and for all the languages you don't mention. MySQL refers to this as "utf8". BTW, be careful: ucs2, another flavor of Unicode, will not fully support the Chinese and Japanese requirements since it doesn't support surrogates.
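
To make the coverage point concrete, here is a minimal sketch in Python (chosen only for illustration; the same check ports to PHP). The sample words and the helper names `can_encode` and `languages_covered` are my own assumptions, and the codec names are Python's, not MySQL's.

```python
# Sample words per language (illustrative picks, not an exhaustive test set).
SAMPLES = {
    "German": "Straße",
    "Polish": "Łódź",
    "Russian": "Москва",
    "Turkish": "ışık",
    "Japanese": "日本語",
}


def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def languages_covered(codec: str) -> list:
    """List the sample languages whose word survives encoding with codec."""
    return [lang for lang, word in SAMPLES.items() if can_encode(word, codec)]


if __name__ == "__main__":
    # A legacy codepage covers only its own group; UTF-8 covers all samples.
    for codec in ("latin_1", "iso8859_2", "koi8_r", "utf_8"):
        print(codec, languages_covered(codec))
```

Each legacy codepage handles only its own language group, which is why a per-language charset table becomes unnecessary once everything is stored as UTF-8.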

Be aware that character encodings and collations are not the same thing. An encoding determines how characters are stored as bytes; a collation determines how they are compared and sorted. They are related only in that both tend to be associated with a particular language or country, and you can have multiple collations for one country just as you can have multiple character encodings. MySQL associates a *default* collation with each encoding. This is usually OK, but isn't necessarily correct for every language. I recommend that you use utf8 and then pick a collation for each of these languages, so you end up with a list that maps each language to the required collation. You can list the available collations by running SHOW COLLATION LIKE 'utf%';
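
The language-to-collation list suggested above could be sketched like this in Python. The collation names are assumptions based on the utf8 collations I would expect a MySQL 5.0 server to report; check them against the SHOW COLLATION output on your own server. `collation_for` and `order_by_clause` are hypothetical helpers, and `name` is a placeholder column.

```python
# Fallback for languages without a dedicated collation (assumed to exist).
DEFAULT_COLLATION = "utf8_unicode_ci"

# Assumed names; verify with SHOW COLLATION LIKE 'utf%' before hardcoding.
LANGUAGE_COLLATIONS = {
    "Czech": "utf8_czech_ci",
    "Danish": "utf8_danish_ci",
    "Hungarian": "utf8_hungarian_ci",
    "Polish": "utf8_polish_ci",
    "Romanian": "utf8_romanian_ci",
    "Spanish": "utf8_spanish_ci",
    "Swedish": "utf8_swedish_ci",
    "Turkish": "utf8_turkish_ci",
}


def collation_for(language: str, available=None) -> str:
    """Pick a collation for a language, falling back to the default.

    available is an optional set of names the server actually reports.
    """
    name = LANGUAGE_COLLATIONS.get(language, DEFAULT_COLLATION)
    if available is not None and name not in available:
        return DEFAULT_COLLATION
    return name


def order_by_clause(column: str, language: str) -> str:
    """Build an ORDER BY fragment that sorts with the language's collation."""
    return f"ORDER BY {column} COLLATE {collation_for(language)}"


if __name__ == "__main__":
    print(order_by_clause("name", "Swedish"))
    print(order_by_clause("name", "English"))
```

Keeping the storage charset fixed at utf8 and varying only the collation means the table itself never has to be converted when a user switches language.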

 
LVL 15

Expert Comment

by:bpmurray
ID: 16910937
I just realized there's a typo in the above: the Latin charsets are all ISO-8859-x, not "9959". These are usually referenced as latinN, but N does not always match the part number: latin1 and latin2 are ISO-8859-1 and ISO-8859-2, while ISO-8859-9 (Turkish) is latin5.
 

Author Comment

by:killer455
ID: 16917618
When coding an application that users of many different languages will use, is there an easy way, like a PHP function changeLanguage(), to set everything necessary for the new language in the HTML/PHP pages and the database? Can this be changed on the fly?

 
LVL 15

Accepted Solution

by:
bpmurray earned 50 total points
ID: 16918416
There are two main areas you have to watch out for when making your app internationally-enabled: the encoding and the locale settings. The encoding can be simplified by always using Unicode. UTF-8 is the most popular form, although UTF-16 is probably the easiest to manipulate. The locale info is more complex. It contains the information that varies from locale to locale (see CLDR on unicode.org) and includes things like date formats (the US uses M/D/Y, the UK uses D/M/Y; the West uses the Gregorian calendar, Japan uses the Year of the Emperor, Arab countries use the Hijri calendar, etc.), number formats (1,000,000 is displayed as 10,00,000 in Hindi), collation sequences (j, k, l, ll, m, n, o, p, q, r, rr, s ... in traditional Spanish), casing (Turkish uppercase "i" has a dot on it, and lowercase "I" has no dot), etc.
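
Two of the locale differences mentioned above can be shown with a rough Python sketch. `group_indian`, `turkish_upper` and `turkish_lower` are hypothetical helpers written only for illustration; a real application would more likely take these rules from a locale library than hand-code them.

```python
def group_indian(n: int) -> str:
    """Group digits Indian-style: the last three digits, then pairs."""
    digits = str(abs(n))
    head, tail = digits[:-3], digits[-3:]
    groups = []
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    if head:
        groups.insert(0, head)
    sign = "-" if n < 0 else ""
    return sign + ",".join(groups + [tail])


def turkish_upper(text: str) -> str:
    """Uppercase with Turkish rules: i -> İ (dotted), ı -> I (dotless)."""
    return text.replace("i", "İ").replace("ı", "I").upper()


def turkish_lower(text: str) -> str:
    """Lowercase with Turkish rules: I -> ı (dotless), İ -> i (dotted)."""
    return text.replace("I", "ı").replace("İ", "i").lower()


if __name__ == "__main__":
    print(group_indian(1000000))      # Indian grouping of one million
    print("istanbul".upper())         # default, locale-unaware casing
    print(turkish_upper("istanbul"))  # Turkish casing keeps the dot
```

The default `upper()` call turns "i" into a dotless "I", which is wrong for Turkish text; that is the kind of per-locale rule that cannot be fixed by the encoding alone.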

While the basic functionality for this is available in Java and C/C++, ICU4C & ICU4J provide extended functionality (see http://icu.sourceforge.net). Until I saw your question, I wasn't aware that there was any ICU support for PHP, although it seemed logical that there should be. I did a quick check of the PHP site, and it looks like it's on its way - see http://ie2.php.net/manual/en/ref.unicode.php. This is great news - it should make this standard across many facets of the web.



