Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
?
Solved

"Convert to UTF8" Without Converting

Posted on 2006-11-06
6
Medium Priority
?
2,567 Views
Last Modified: 2011-08-18
I have a table which CLAIMS it's using latin1 character encoding, however what's stored in each row is actually UTF-8 data.  To clarify: SHOW CREATE TABLE insists that the encoding is latin1, but SELECT HEX(column_name) reveals a UTF-8 sequence.

I want to tell MySQL that this table is really stored as UTF-8.  Doing this, however, by any method I've seen described in the manual, causes MySQL to "convert" my data.  That is, it takes my UTF-8 string, interprets it as latin1, converts that "latin1" string to UTF-8, and stores some mangled data back in my table.

I need a way to just indicate "Listen, trust me, this is UTF-8; don't do anything about it, just record for posterity's sake that I'm using a different encoding than you think I am."


It took me a while to get my head around all this, so I'll be happy to clarify if it's confusing.  I expect, though, that there's a really simple solution I just haven't encountered.

Note that statements like ALTER TABLE t CHANGE COLUMN c c VARCHAR(50) CHARACTER SET utf8;  do this undesirable conversion.  ALTER TABLE t CHARACTER SET utf8;  does nothing but change the default (and leaves the columns marked as latin1).  ALTER TABLE t CONVERT TO CHARACTER SET utf8;  does the undesirable conversion again.

Experts?
0
Comment
Question by:VoteyDisciple
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 2
6 Comments
 
LVL 19

Author Comment

by:VoteyDisciple
ID: 17884183
(And let's just not even talk about how exactly this situation came to pass or why I've now spent all morning figuring out what was going on, mmmmk?)  (-:
0
 
LVL 30

Accepted Solution

by:
todd_farmer earned 400 total points
ID: 17884408
Are you able to export the data correctly using mysqldump?  If so, you could dump the contents, truncate the table, alter it to use UTF8, then reload the data.  Of course, this won't help you provide a reproducible test case to MySQL if you are wanting to open a bug report.

What version of MySQL are you using?
0
 
LVL 19

Author Comment

by:VoteyDisciple
ID: 17885504
Well, simply dumping and then importing the data shouldn't change anything inherently (it'll just create the table described as latin1 and dump in the utf8 data from the dump), so what would I need to change?

I tried playing with altering the table definition in the dumped .sql file to SAY it's utf8 but that didn't seem to have the desired effect.  I may just have screwed that up, since it seems promising, but unless  I made a mistake I don't think that did the trick.
0
Free learning courses: Active Directory Deep Dive

Get a firm grasp on your IT environment when you learn Active Directory best practices with Veeam! Watch all, or choose any amount, of this three-part webinar series to improve your skills. From the basics to virtualization and backup, we got you covered.

 
LVL 30

Expert Comment

by:todd_farmer
ID: 17885568
That's what I had in mind.  You could also take the dump file and eliminate the CREATE TABLE statement (or use the appropriate mysqldump flags to prevent it in the first place).  Then truncate and alter the existing table so that it ius UTF8 and reload the data.
0
 
LVL 19

Author Comment

by:VoteyDisciple
ID: 17890687
I've played with this some more this morning, and here's what I've tried:

1.  Dump a table (with no special modifiers) to a file.
2.  Change, at the beginning of the file, SET NAMES latin1 to SET NAMES utf8
3.  Change the table definition to CHARACTER SET utf8
4.  Import the data again

In this case I'm feeding UTF-8 data over a connection that I've indicated uses UTF-8, into a database table that's designed for UTF-8.

The data get "converted."  I have absolutely  no idea why, but the bytes in my .sql file aren't the same bytes that end up in the database table.
0
 
LVL 19

Author Comment

by:VoteyDisciple
ID: 17901328
That did ultimately work; it just took picking the right combination of utf8 vs. latin1 throughout each step of this insanely bizarre process:

1.  mysqldump --default-character-set=latin1 > file.sql
2.  In file.sql, leave SET NAMES latin1 as-is
3.  In file.sql, change CREATE TABLE statements to use DEFAULT CHARACTER SET utf8
4.  Import using mysql < file.sql (no special indicator here)

This, it seems, preserves the correct byte sequences, but in tables that announce correctly that they use UTF-8.
0

Featured Post

[Webinar] Lessons on Recovering from Petya

Skyport is working hard to help customers recover from recent attacks, like the Petya worm. This work has brought to light some important lessons. New malware attacks like this can take down your entire environment. Learn from others mistakes on how to prevent Petya like worms.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
In this article, I’ll talk about multi-threaded slave statistics printed in MySQL error log file.
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…
In this video, Percona Solution Engineer Rick Golba discuss how (and why) you implement high availability in a database environment. To discuss how Percona Consulting can help with your design and architecture needs for your database and infrastr…
Suggested Courses

636 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question