Solved

LOAD DATA INFILE error caused by foreign characters

Posted on 2011-03-10
13
922 Views
Last Modified: 2012-06-27
I am importing from a CSV file into MySQL using LOAD DATA INFILE as follows:
LOAD DATA INFILE '#theCSVFile#' INTO TABLE tempTable FIELDS TERMINATED BY '|'

Open in new window

This has worked successfully before, but today I get an error:
Incorrect string value: '\xF6gberg' for column 'AuthorisedAssigneeFullName' at row 17
which is caused by the person's name
Högberg
containing a foreign character.

How do I fix the problem so names with foreign characters are processed normally?

For completeness, I created the CSV file by extracting data from an Excel spreadsheet under ColdFusion 9, using the following code:

<!---Read the xlsx file into a query---> 
<cfspreadsheet action="read" src="#sourcefile#" sheet="2" query="xlsxData" columns="2-48" rows="8-9999"> 
<!--- Write the contents of the query out to a new file--->
<cfset Delim = "|">
<cfset sb = createObject("java", "java.lang.StringBuffer")/><!--- creates empty StringBuffer --->
<cfset endLine = chr(10)><!--- set endline to {LF} chr(10)--->
<cfloop query="xlsxData" endrow="#xlsxData.recordCount#">
<cfset sb.append(COL_1).append(Delim).append(COL_2).append(Delim).append(COL_3).append(Delim).append(COL_4).append(Delim).append(COL_5).append(Delim).append(COL_6).append(Delim).append(COL_7).append(Delim).append(COL_8).append(Delim).append(COL_9).append(Delim).append(COL_10).append(Delim).append(COL_11).append(Delim).append(COL_12).append(Delim).append(COL_13).append(Delim).append(COL_14).append(Delim).append(COL_15).append(Delim).append(COL_16).append(Delim).append(COL_17).append(Delim).append(COL_18).append(Delim).append(COL_19).append(Delim).append(COL_20).append(Delim).append(COL_21).append(Delim).append(COL_22).append(Delim).append(COL_23).append(Delim).append(COL_24).append(Delim).append(COL_25).append(Delim).append(COL_26).append(Delim).append(COL_27).append(Delim).append(COL_28).append(Delim).append(COL_29).append(Delim).append(COL_30).append(Delim).append(COL_31).append(Delim).append(COL_32).append(Delim).append(COL_33).append(Delim).append(COL_34).append(Delim).append(COL_35).append(Delim).append(COL_36).append(Delim).append(COL_37).append(Delim).append(COL_38).append(Delim).append(COL_39).append(Delim).append(COL_40).append(Delim).append(COL_41).append(Delim).append(COL_42).append(Delim).append(COL_43).append(Delim).append(COL_44).append(Delim).append(COL_45).append(Delim).append(COL_46).append(Delim).append(COL_47)>
<!--- Add a newline unless it's the last record in the query --->
<cfif currentRow NEQ recordCount>
<cfset sb.append(endLine)>
</cfif>
</cfloop>
<cfset theCSVFile = "C:\\Documents and Settings\\All Users\\Application Data\\MySQL\\MySQL Server 5.1\\Data\\mytracker\\tempFile.csv">
<cffile action="write" file="#theCSVFile#" output="#sb.toString()#">

Open in new window

0
Comment
Question by:ShanghaiD
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 5
  • 3
13 Comments
 
LVL 92

Expert Comment

by:objects
ID: 35103415
you can specify the charset to use

LOAD DATA INFILE '#theCSVFile#' INTO TABLE tempTable  CHARACTER SET XXX  FIELDS TERMINATED BY '|'

and make sure the database is created with the appropriate charset
0
 
LVL 52

Expert Comment

by:_agx_
ID: 35104036
I don't remember if you also need to enable unicode for the dsn. Try objects's suggestions first.  IF you get an error/or the characters don't insert correctly, then also add the unicode settings to your dsn
http://www.experts-exchange.com/Database/MySQL/Q_26874251.html#a35086595

1) Select  Data & Services > Datasources > (Your MySQL Datasource)
2) Click:   Show Advanced Settings
3) In "Connection Settings" enter:
useUnicode=true&characterEncoding=UTF-8

Open in new window

0
 

Author Comment

by:ShanghaiD
ID: 35104254
Thank you both, but still getting the error message.  My table code is:
<!---Create Temporary table--->
<cfquery name="CreateTemporaryTable" datasource="mysqlTracker">
CREATE TABLE `temptable` (
`AuthorisedAssigneeFullName` varchar(499) DEFAULT NULL,
`Emailaddress` varchar(199) DEFAULT NULL,
`Nationality` varchar(499) DEFAULT NULL,
etc
etc
etc
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT; 
</cfquery>
				
<cfquery name="LoadDataInfile" datasource="mysqlTracker">
LOAD DATA INFILE '#theCSVFile#' INTO TABLE tempTable CHARACTER SET utf8 FIELDS TERMINATED BY '|' 
</cfquery>

Open in new window

and My dsn connection string is set as:
zeroDateTimeBehavior=convertToNull&useUnicode=true&characterEncoding=UTF-8

Open in new window

I'm  currently working aroudn the problem by loopign through the query object created when CF9 imported the Excel spreadhseet, but using LOAD DATA INFILE would be faster if we could get it to work.
0
Visualize your virtual and backup environments

Create well-organized and polished visualizations of your virtual and backup environments when planning VMware vSphere, Microsoft Hyper-V or Veeam deployments. It helps you to gain better visibility and valuable business insights.

 
LVL 92

Expert Comment

by:objects
ID: 35104338
is the data in the csv correct ie. is it being generated correctly?
0
 

Author Comment

by:ShanghaiD
ID: 35104458
Yes, it is (and everything works fine if there are no foreign characters in a person's name).

Here is the offending line in the CSV file  whcih is triggering the error:

J Högberg|||||||||1-Mar-10|OPEN|XY EA||||||||||||N/A|Commuter|||||N/A|N|||||||||||||||0|(

Open in new window

0
 

Author Comment

by:ShanghaiD
ID: 35104714
I'm going to have to abandon trying to get LOAD DATA INFILE to work on this and continue instead to use CFLOOP through each spreadsheet row because I have now also found another problem  with Chinese characters in some persons' names.  Unlike the Högberg character problem, the Chinese characters do not throw an error, but when I view the Mysql table, I see ?????? for the characters which have been inserted.  However, when I process the spreadsheet using the CFLOOP query, the Chinese characters (and also Högberg) come across correctly.  
So, although looping is slower than LOAD DATA INFILE,  at least I have a practical solution to my problem.  Thanks everyone for trying to help me.
0
 
LVL 52

Expert Comment

by:_agx_
ID: 35104799
If you're seeing "??????" it means it's not being inserted with the correct encoding. So you get the "??" unknown characters instead.  I'm going to test it and see what I get.  What version of MySQL are you using?  
0
 
LVL 92

Expert Comment

by:objects
ID: 35104834
whats the encoding of the CSV file?
0
 

Author Comment

by:ShanghaiD
ID: 35104836
Server Version: 5.1.53-communityand Encoding: 65001 (UTF-8).
0
 
LVL 52

Accepted Solution

by:
_agx_ earned 500 total points
ID: 35104916
Change the charset of your cffile and it should work. Example:

<cfprocessingdirective pageEncoding="utf-8">
<cfsavecontent variable="sampleData">
Högberg      
Classical Greek: ¿a¿¿¿ ¿a¿e¿¿ d¿¿aµa¿¿
Greek (monotonic): ¿p¿¿¿ ¿a f¿¿ spasµ¿¿a ¿¿a¿¿¿ ¿¿¿¿¿ ¿a p¿¿¿ t¿p¿ta.
¿ ¿ ¿ ¿
¿ ¿ ¿ ¿
</cfsavecontent>
<cffile action="write" file="c:\test.txt" output="#sampleData#" charset="utf-8">

Don't know if you need to specify the encoding on your buffer too
ie
<cffile action="write" file="#theCSVFile#" output="#sb.toString("UTF8")#"  charset="utf-8">

0
 
LVL 52

Expert Comment

by:_agx_
ID: 35104923
Darn .. it mangled my utf8 samples ;-) But the important part is the <cffile ..> code anyway.
0
 

Author Comment

by:ShanghaiD
ID: 35104943
Success!    You have found my solution (both for Högberg and the Chinese characters)!
I was not explicitly setting the CSV encoding, and when I changed this in the CFFILE tag from:
<cffile action="write" file="#theCSVFile#" output="#sb.toString()#">

Open in new window

to:
<cffile action="write" file="#theCSVFile#" charset="utf-8" output="#sb.toString()#">

Open in new window

then everything works without a problem (using LOAD DATA INFILE).
Thanks for persevering to guide me to the solution!  You deserve the points!
0
 
LVL 52

Expert Comment

by:_agx_
ID: 35104982
Great, glad it's working now :)
0

Featured Post

Secure Your Active Directory - April 20, 2017

Active Directory plays a critical role in your company’s IT infrastructure and keeping it secure in today’s hacker-infested world is a must.
Microsoft published 300+ pages of guidance, but who has the time, money, and resources to implement? Register now to find an easier way.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
collection output issue 9 64
Need help aligning JRadioButton that put in a box using GridBagLayout 2 27
DTD and JAVA versions 1 31
MySQL programmer starter 25 27
Creating and Managing Databases with phpMyAdmin in cPanel.
When table data gets too large to manage or queries take too long to execute the solution is often to buy bigger hardware or assign more CPUs and memory resources to the machine to solve the problem. However, the best, cheapest and most effective so…
Viewers learn about the scanner class in this video and are introduced to receiving user input for their programs. Additionally, objects, conditional statements, and loops are used to help reinforce the concepts. Introduce Scanner class: Importing…
This tutorial will introduce the viewer to VisualVM for the Java platform application. This video explains an example program and covers the Overview, Monitor, and Heap Dump tabs.

749 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question