Go Premium for a chance to win a PS4. Enter to Win

x
?
Solved

Load CSV file with Commas within its columns into SQL server using SSIS

Posted on 2011-03-18
14
Medium Priority
?
1,291 Views
Last Modified: 2012-08-14
Hello,

I have a CSV file in the following format

Name,       Tournament ,                       Code ,    Cash
"a",         "bit $1,000, express",             "f"  ,      $1000
"n",         "Tour",hold $10",                     "y" ,       $10

so here  "bit $1,000express"   and   "Tour",hold $10"  are actually in the Tournament coloumn as you can see from my illustration. This is exactly how most of the data in my file is represented.

So when I use SSIS to try to load this CSV file into my SQL 2008 database, the middle column is truncated at the commas that occur with the text of the column into separate columns .

so here "000", and "express" will be moved to next two columns and  "hold $10" will be in the next columns after its rightful column.

Does anyone know how to solve this kind of problem in SSIS .
By the way its a large CSV file with many columns .

Thanks.
0
Comment
Question by:marvo2010
  • 6
  • 3
  • 2
  • +2
14 Comments
 
LVL 34

Accepted Solution

by:
Paul MacDonald earned 2000 total points
ID: 35166723
Is this representative of all the data in the CSV file?  I ask because you could replace all the ",  (quote-comma-space) combinations with tabs, then delete the remaining quotes.  That might mess up your "Tour", hold $10" item, but maybe not.

Otherwise, if you have the original dataset, you could export it as tab-delimited and then inport it that way.
0
 

Author Comment

by:marvo2010
ID: 35166767
Yeah Thats the main issue really. There are commas within the columns , and the column delimiter is also comma, so where ever the SSIS see a comma, it basically separates the column into a separate column. I am wondering if anyone has econutered this kind of problem and how they were able to solve it.
0
 

Author Comment

by:marvo2010
ID: 35166785
If I export it as a Tab separated CSV file , will it not still insert Tabs where ever it sees commas just like SSIS is separating data into columns based on the columns.
0
Get your Conversational Ransomware Defense e‑book

This e-book gives you an insight into the ransomware threat and reviews the fundamentals of top-notch ransomware preparedness and recovery. To help you protect yourself and your organization. The initial infection may be inevitable, so the best protection is to be fully prepared.

 
LVL 7

Expert Comment

by:Alex_W
ID: 35167121
If you can replace the , separator with another character, say ; leaving the internal commas in each column intact when perhaps a small .NET app could perform the necessary formatting for you.  You could:

1) read the rows into arrays specifying the separation character (;)
2) find and replace the remaining commas (inside each column) to a more friendly character or even remove them
3) then write out the line as comma separated so its ready to import as CSV while having replaced the internal commas so they dont cause these issues.

If this sounds like a solution to you I can post some test code in VB.NET...
0
 

Author Comment

by:marvo2010
ID: 35167149
yes this certainly sounds like a solution . Lets test the VB code using a Script task in SSIS.

Thanks
0
 
LVL 34

Expert Comment

by:Paul MacDonald
ID: 35167321
Yes, that was what I was driving at.  If you can replace the actual column delimiters at the export source, that would be ideal.  I was illustrating how you could change the commas to tabs if you don't have the original source data (can't export it again).  And no, when you import the data into SSIS, you should be allowed to indicate the data is tab-delimited rather than comma-delimited and SQL Server should consume it just fine.
0
 
LVL 7

Expert Comment

by:Alex_W
ID: 35167500
If you can replace the column delimiters to a tab then it sounds as though paulmacd's solution would be easiest because you'll have no middlware and be able to import directly to SQL Server.  If this for some reason isnt possible and your after some VB code to do some re-formatting, then please post and I'll supply some.
0
 

Author Comment

by:marvo2010
ID: 35168422
yeah we have the files already and needs to load them rightly now. so if you can supply the vb code and how to apply it , that will be great
Thanks
0
 
LVL 7

Expert Comment

by:Alex_W
ID: 35171259
This is made up from some bits and peices and untested for your purpose but should get you started with the functionality, maybe best in a console app to quickly run the routine against the file:
Dim m_source As String()   ' create an array to read the rows
m_source = System.IO.File.ReadAllLines("C:\Input.csv")   ' array of source file lines
Dim Output As New System.IO.StreamWriter("C:\Output.csv")   ' file to output

For x = 1 To m_source.Length - 1   ' iterate down source file rows (starting from 1 - row after headers)
  Dim DataToWrite() As String = m_source(x).Split(";"c)   ' data array or current row split with a ;
  DataToWrite = DataToWrite.Replace(",", "_")   ' replace the internal comma with something
  DataToWrite = DataToWrite.Replace(";", ",")   ' now swap the ; separator back to a , for CSV compatibility
  Output.WriteLine(DataToWrite)   ' write the output back out
Next

Output.Close()

Open in new window

0
 
LVL 3

Expert Comment

by:seenall
ID: 35171962
The way I have preferred getting around this issue is by loading the entire record into a single column staging table and then performing the delimited extraction using T-SQL CHARINDEX and SUBSTRING functions to reach the target table

This will at least avoid maintaining an external .net module and keep the process integrated into your DTSX package.

Just a thought
0
 

Author Comment

by:marvo2010
ID: 35175666
Hello Seenall,

Pls do have a step by step solution of how to implmement your suggested solution without an external .Net module like u said using T-SQL Charindex and SUBSTRING. I will appreciate to have a look.

Thanks
0
 

Author Comment

by:marvo2010
ID: 35188177
Thanks Guys,

I have been able to potentially solve this problem.Converting the files to Tab delimited  worked perfectly. I ve  got a little solution that can actually convert the files to Tab delimited from folder to folder and then I need to Iterate through the files using SSIS and load to my staging table .
Microsoft have not got a solution for this from what I have learnt during this process but the "NEVER SAY NEVER ATTITUDE" always gets the job done no matter what. I love never to quit.

Thanks guys
0
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 36243460
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.
0

Featured Post

Fill in the form and get your FREE NFR key NOW!

Veeam is happy to provide a FREE NFR server license to certified engineers, trainers, and bloggers.  It allows for the non‑production use of Veeam Agent for Microsoft Windows. This license is valid for five workstations and two servers.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this series, we will discuss common questions received as a database Solutions Engineer at Percona. In this role, we speak with a wide array of MySQL and MongoDB users responsible for both extremely large and complex environments to smaller singl…
In this article, we’ll look at how to deploy ProxySQL.
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…
This lesson discusses how to use a Mainform + Subforms in Microsoft Access to find and enter data for payments on orders. The sample data comes from a custom shop that builds and sells movable storage structures that are delivered to your property. …

916 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question