John
asked on
MySQL Converting from latin1 to utf8
I'm a little confused as to how to proceed. I have a number of tables that now need to be able to store their Text and VarChar fields as UTF8 so as to support the Japanese AND English character set.
I'm new to using MySQL Workbench but from what I can tell, my tables are stored with the latin1 collation and all of the Text and VarChar fields use their table's default collation.
I have been able to place Japanese characters into a field from a command line session (SSH in to the box and then ran mysql from the command prompt) using an update command:
Update Products Set ProductName = "This is ¿¿¿¿¿ test" Where ProductNo = 1011
When I issue a "Select ProductName From Products Where ProductNo = 1011", MySQL shows me the correct characters. My Putty session is using utf8 and seems to display Japanese characters with ease.
The problem is that if I look at the data using Workbench, the Japanese characters are replaced with accented European characters. I suspect that this is because the default collation is latin1 and Putty or MySQL is translating things so they seem to work.
Q: Can I use the Alter Table command (or the Workbench Alter table dialog) to change the collation on each table to "utf8 - default collation" resulting in the current English text remaining intact (since it is compatible with the first 256 characters of utf8) and then I'll be able to populate and retrieve data from now on in utf8 without having any weird conversion issues?
I'm using MySQL 5.0.77 and MySql Workbench with an SSH connection for DB admin.
I'm new to using MySQL Workbench but from what I can tell, my tables are stored with the latin1 collation and all of the Text and VarChar fields use their table's default collation.
I have been able to place Japanese characters into a field from a command line session (SSH in to the box and then ran mysql from the command prompt) using an update command:
Update Products Set ProductName = "This is ¿¿¿¿¿ test" Where ProductNo = 1011
When I issue a "Select ProductName From Products Where ProductNo = 1011", MySQL shows me the correct characters. My Putty session is using utf8 and seems to display Japanese characters with ease.
The problem is that if I look at the data using Workbench, the Japanese characters are replaced with accented European characters. I suspect that this is because the default collation is latin1 and Putty or MySQL is translating things so they seem to work.
Q: Can I use the Alter Table command (or the Workbench Alter table dialog) to change the collation on each table to "utf8 - default collation" resulting in the current English text remaining intact (since it is compatible with the first 256 characters of utf8) and then I'll be able to populate and retrieve data from now on in utf8 without having any weird conversion issues?
I'm using MySQL 5.0.77 and MySql Workbench with an SSH connection for DB admin.
ASKER CERTIFIED SOLUTION
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
I have to look at workbench to see which option you should use:
http://dev.mysql.com/doc/refman/5.0/en/charset-charsets.html describes the character sets/collation options.
You may want to first copy the current database into a new one and then try your changes on the test database to make sure the change does not have an adverse impact.
http://dev.mysql.com/doc/refman/5.0/en/charset-charsets.html describes the character sets/collation options.
You may want to first copy the current database into a new one and then try your changes on the test database to make sure the change does not have an adverse impact.
ASKER
Can you confirm that going from latin1 to utf8 will not result in any weird character conversions. All of the data in there right now is 7-bit ascii with no control characters (I guess the Text fields have have line separators)?
I want to do this with MySql Workbench but am a bit confused because it only gives me the option to set the collation. There is no option to set the character set although in the description is makes it sound like collation is the same thing as character set. I have attached a screen shot.
I'm also confused as to which utf8 to choose. As you can see in the screen shot, there is utf8 default, general, bin, unicode followed by a bunch of language/country options. I was hoping to not have to be specific since I want to be able to use other languages as well in the future.