MySQL Collation

Hi! Just wondering what's the best collation to use. Prior to MySQL 5, I don't think I had this choice...

I would put my money on utf8_general_ci, but I also see ascii_general_ci in use, and the default seems to be latin1_swedish_ci...
Julian Matz (Joint Chairperson) asked:
mankowitz commented:
It really depends on what languages you are going to use. If your text is mostly English and other European languages, I would stick with the latin1 collation, because it is the default and everyone else is using it. If you need kanji, Chinese, or another kind of pictographic script, you should use a character set and collation that supports that script.
 
ncoo commented:
latin1_swedish and latin1_general are both good; I've not had a problem with either for Europe, the Americas (N&S), and India.

They do say that if your page is going to be served with a UTF8 content type, you should use a UTF8 collation.

But whatever you do, make sure any key/foreign key fields are all of the same type (including the same character set and collation), otherwise you will be in for some real trouble. Trust me on that one!
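
One rough way to check for this kind of mismatch is to list the character set and collation of every text column. This is only a sketch: `mydb` is a placeholder schema name, not something from this thread.

```sql
-- Assumption: `mydb` is a placeholder schema name; replace it with your own.
-- Lists every text column with its character set and collation so that
-- mismatches between key columns and the columns referencing them stand out.
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'mydb'
  AND COLLATION_NAME IS NOT NULL
ORDER BY COLLATION_NAME, TABLE_NAME, COLUMN_NAME;
```

Joining or comparing columns whose collations differ is typically where errors such as "Illegal mix of collations" show up, so any column pair used in a join should appear with the same collation in this output.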
 
Julian Matz (Joint Chairperson, author) commented:
Thanks!

I'm using mainly English, sometimes French, German, Italian, etc. for content management systems.

My websites are all UTF8 (my Apache server forces this content-type) and the MySQL data is usually inserted through HTML input fields and PHP, and I also sometimes put HTML into the database. That wouldn't make a difference, no?
 
mankowitz commented:
No, you should be fine with that.
 
ncoo commented:
Either will do perfectly OK. I would probably opt for UTF8, as it will give you more scope should you want to expand the management system away from Europe and into Asia, for example.


latin1_bin           West European (multilingual), binary
latin1_danish_ci     Danish, case-insensitive
latin1_general_ci    West European (multilingual), case-insensitive
latin1_general_cs    West European (multilingual), case-sensitive
latin1_german1_ci    German (dictionary), case-insensitive
latin1_german2_ci    German (phone book), case-insensitive
latin1_spanish_ci    Spanish, case-insensitive
latin1_swedish_ci    Swedish, case-insensitive

utf8_bin             Unicode (multilingual), binary
utf8_czech_ci        Czech, case-insensitive
utf8_danish_ci       Danish, case-insensitive
utf8_estonian_ci     Estonian, case-insensitive

>>>>>> utf8_general_ci      Unicode (multilingual), case-insensitive

utf8_icelandic_ci    Icelandic, case-insensitive
utf8_latvian_ci      Latvian, case-insensitive
utf8_lithuanian_ci   Lithuanian, case-insensitive
utf8_persian_ci      Persian, case-insensitive
utf8_polish_ci       Polish, case-insensitive
utf8_roman_ci        West European, case-insensitive
utf8_romanian_ci     Romanian, case-insensitive
utf8_slovak_ci       Slovak, case-insensitive
utf8_slovenian_ci    Slovenian, case-insensitive
utf8_spanish2_ci     Traditional Spanish, case-insensitive
utf8_spanish_ci      Spanish, case-insensitive
utf8_swedish_ci      Swedish, case-insensitive
utf8_turkish_ci      Turkish, case-insensitive
utf8_unicode_ci      Unicode (multilingual), case-insensitive
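
The list above is a snapshot; what a particular server actually offers can be queried directly. A minimal sketch, assuming the character set names used in this thread (newer MySQL releases may report utf8 under the name utf8mb3):

```sql
-- Show the collations the server provides for the two character sets
-- discussed here. The Default column marks each set's default collation.
SHOW COLLATION WHERE Charset IN ('latin1', 'utf8');
```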
 
Julian Matz (Joint Chairperson, author) commented:
Thanks for your comments!

latin1_swedish - is this what most people use because it seems to be the default or is it because it supports the most European/Latin language characters??

Are there any disadvantages to using UTF8?

 
ncoo commented:
By default, MySQL uses the latin1 (cp1252 West European) character set and the latin1_swedish_ci collation that sorts according to Swedish/Finnish rules. These defaults are suitable for the United States and most of Western Europe.
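
To see which defaults a given server and connection are really using, something like the following should work. It only reads settings and changes nothing.

```sql
-- Character sets in effect for the server, the current database,
-- and the current client connection.
SHOW VARIABLES LIKE 'character_set%';

-- The matching collations (server, database, connection).
SHOW VARIABLES LIKE 'collation%';
```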

The only problem I could find concerns creating a UTF8 database: on some hosts you will have to contact the provider to have the database created with the correct UTF8 settings.

CREATE DATABASE `name` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
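
If the database already exists with latin1 defaults, a conversion along these lines may be an option. This is a sketch under assumptions: `name` is the placeholder from the statement above, `articles` is a hypothetical table name, and a backup should be taken first, since converting columns whose stored bytes do not match their declared character set can garble the text.

```sql
-- Change the default for tables created in this database from now on.
-- Existing tables are not touched by this statement.
ALTER DATABASE `name` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Convert one existing table (hypothetical name) and its text columns.
ALTER TABLE `articles` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
```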

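You also will not be able to mix ISO-8859-1 (latin1) pages with a UTF8 database unless the connection character set is set to match what the page sends and expects.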
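
On the page encoding point: MySQL converts between the connection character set and the character set of the columns, so the connection setting is the part that has to agree with the page. A minimal sketch, to be run right after connecting (for example from the PHP code that opens the connection):

```sql
-- Tell the server that this connection sends and expects UTF-8,
-- matching pages served with a UTF-8 content type.
SET NAMES utf8;

-- For a page served as ISO-8859-1, the equivalent would be:
-- SET NAMES latin1;
```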

For simplicity the default may be best (latin1_swedish_ci).
 
Julian Matz (Joint Chairperson, author) commented:
Thanks!
Question has a verified solution.
