?
Solved

MS SQL Bulk Import UTF8 CSV data

Posted on 2014-03-11
5
Medium Priority
?
4,510 Views
Last Modified: 2014-04-05
We have to import UTF8 formated CSV data into a MS SQL 2008R2 DB table.

MS SQL don't support the codepage 65001.

Also the datafiletypes "widechar" and "widenative" stops with the error message that the first column is too long in the first line.

For convert the data file without lost information we try convert the file from UFT8 to UTF16, but UTF16 is not a known coding in .NET.  Only UTF7, UTF8 and UTF32 are knowing codings.

What is good working solution for this problem?
0
Comment
Question by:hpnix4ever
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
5 Comments
 
LVL 143

Assisted Solution

by:Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3] earned 2000 total points
ID: 39920803
for the time being, here some MS official statement about why the UTF-8 is not supported (yet):
https://connect.microsoft.com/SQLServer/feedback/details/321839/msft-edw-tapblock-bcp-doesnt-recognize-field-terminator-on-code-page-65001

and .net does support utf-16, namely unicode encoding:
http://msdn.microsoft.com/en-us/library/system.text.unicodeencoding%28v=vs.110%29.aspx
0
 

Accepted Solution

by:
hpnix4ever earned 0 total points
ID: 39923119
Thanks for answer. I will try to convert from UTF8 to UTF16 before SQL bulk import.
------------------------------
Now i had test some convert functions, but nothing of the results can be imported in SQL.  

For read the UTF8 file i had used a streamreader with encoding UFT8. For write as UTF16 i had used a streamwriter with unicode encoding.

For verification i had convert from UTF8 to UTF16 and back to UFT8 that i can load with Excel. The UTF16 file has double size as UTF8, but i can't show with my programs. The reconvert file i can load to Excel and it looks fine. Only the SQL import don't work. Anything by the UTF16 file looks wrong.

For the string conversation i had used all known .NET functions and also some native conversations, but nothing is ready for the bulk import.

The SQL Part is like this:

BULK INSERT tbl_Import
FROM 'C:\temp\test.csv'
WITH
(
  FIELDTERMINATOR=';',
  ROWTERMINATOR='\r\n',
  -- DATAFILETYPE = 'widechar',
  -- DATAFILETYPE= 'widenative',
  CODEPAGE = 1200
 -- CODEPAGE = 1201
)

See the other versions as SQL comments
0
 

Author Comment

by:hpnix4ever
ID: 39953109
The solution was to convert the data before import from UTF-8 to Unicode-BigEndian with a .NET-Programm. Unicode-BigEndian can be imported with BULK-Import.
0
 

Author Closing Comment

by:hpnix4ever
ID: 39979837
It's no solution posible direct inside MS SQL. Only a converison to UnicodeBigEndian with a external Script is working.
0

Featured Post

Python: Series & Data Frames With Pandas

Learn the basics of Python’s pandas library of series & data frames and how we can use these tools for data manipulation.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Ever needed a SQL 2008 Database replicated/mirrored/log shipped on another server but you can't take the downtime inflicted by initial snapshot or disconnect while T-logs are restored or mirror applied? You can use SQL Server Initialize from Backup…
I have a large data set and a SSIS package. How can I load this file in multi threading?
Viewers will learn how the fundamental information of how to create a table.
The goal of the video will be to teach the user the concept of local variables and scope. An example of a locally defined variable will be given as well as an explanation of what scope is in C++. The local variable and concept of scope will be relat…

718 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question