We help IT Professionals succeed at work.

Bulk insert french characters (SQL 2008)

IT-CICA
IT-CICA asked
on
Hi,

I am trying to do a bulk insert form csv file. The file may contain some french characters. When I open the file in notepad, the french character shows up fine, when  Iopen the same file in excel, the french charatcer turns into something like MÉNAR001 as opposed to MÉNAR001 shown in note pad.

When I bulk insert this file using following SQL statement:
bulk insert dbo.TestBulk from 'D:\ReportTest.csv'
   WITH
      
  (      
        CODEPAGE = 'ACP',
        FIELDTERMINATOR = '|',
    ROWTERMINATOR = '\n'
         
  )
the field does not contain french charatcers.
I tried using codepage values like RAW,OEM,65001, but the field still contains strange characters.

Thanks in advance,
Gauri

Comment
Watch Question

Try like below.
Also make sure that the columns that will contain those special characters are of the type nvarchar. You might also want to check this link for further references:
http://msdn.microsoft.com/en-us/library/ms188289.aspx 
 

bulk insert dbo.TestBulk from 'D:\ReportTest.csv' 
   WITH 
      
  (      
        CODEPAGE = 'ACP',
        FIELDTERMINATOR = '|', 
    	ROWTERMINATOR = '\n',
	DATAFILETYPE = 'widechar' 
         
  )

Open in new window

Author

Commented:
I checked the datatype of the column is nvarchar where the French characters would be present.

I tried your parameters for Bulk insert. But the French characters did not get copied. I got the following meessage:

Bulk load: DataFileType was incorrectly specified as widechar. DataFileType will be assumed to be char because the data file does not have a Unicode signature.


Author

Commented:
I suspect it has something to do with the file encoding. The file is UTF 8 encoded. If I save it as ANSI, the French characters get copied fine with CodePage 1252. But the user wouldn;t want to save the file
as ANSI (This file is a Crystal Report exported as csv)
>>Bulk load: DataFileType was incorrectly specified as widechar. DataFileType will be assumed to be char because the data file does not have a Unicode signature.
I suspect it has something to do with the file encoding. <<
Yes, that's the case. As indicated in the BOL link, I've posted before:
"For a format file to work with a Unicode character data file, all the input fields must be Unicode text strings (that is, either fixed-size or character-terminated Unicode strings). "

 

Author

Commented:
I am not using a format file. The data file is a .csv file and am trying to Bulk insert the data using Bulk Insert command. Is there any workaound for this problem?
Well, indeed Bulk insert does not support UTF-8 encoding.
You might consider import wizard instead:
http://www.geodatasource.com/mssql2008-import.html
Or use a function to convert utf-8 to utf-16. Check the following links:
http://www.sqlteam.com/Forums/topic.asp?TOPIC_ID=62406
 http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=72911
 
you might find the following article interesting as well:
http://support.microsoft.com/kb/232580 

Author

Commented:
I saved the file in UTF -16 and then the French characters showed up fine. Moderator, could you please close this question?
>> saved the file in UTF -16 and then the French characters showed up fine<<
That's what I've told you to do in comment http:#a32993855 so why don't you close the question yourself accepting that comment as the answer?