Solved

SQL BULK INSERT - How to import a CSV file with Commas inside

Posted on 2012-04-07
Last Modified: 2012-06-27
Hi
I have to import a CSV file containing several million rows into SQL Server, in which one specific column contains a comma in some records. This comma is causing the import to split the field value into two pieces and move values into the wrong columns for those records.

When the file is opened in Notepad a record has this example form:

Value1,Value2,Value3,,Value5,Value6,"Flat1, High Street",Value8,,Value10

The problem is caused by the column containing "Flat1, High Street", because of the internal comma. I have no control over the format of the file, as it comes from an external operating system, so I have to deal with the file as it arrives. I cannot import it into Excel first to manipulate it, as the file has several million rows.

Is there a way to perform the BULK INSERT into SQL, or to process the file first with some other tool? If it helps, I can rely on knowing that it is always the same column that may cause problems. Also, the problem values are always surrounded by "" as in the example above.
Question by:qprjohn121
 
LVL 37

Expert Comment

by:Neil Russell
Remember to set the text qualifier to a double quote; the CSV will then import correctly even with commas inside the quotes.
 
LVL 17

Accepted Solution

by:
Anuroopsundd earned 500 total points
See the queries at http://sqlusa.com/bestpractices2005/notepad/
Here is one example from that link:

DECLARE  @DynamicSQL NVARCHAR(512)
DECLARE  @csvFilePath VARCHAR(64),
         @bcpFormatPath VARCHAR(64)
 
SET @csvFilePath = 'f:\data\feed\inventory20151023.txt'
SET @bcpFormatPath = 'f:\data\format\inventory.fmt'
SET @DynamicSQL = 'BULK INSERT [PartsInventory] FROM ''' + @csvFilePath + ''' WITH (formatfile = ''' + @bcpFormatPath + ''')'
 
PRINT @DynamicSQL -- test & debug
 
/*  
BULK INSERT [PartsInventory] FROM 'e:\data\inventory20120201.txt'  
WITH (formatfile = 'e:\data\inventory.fmt')
*/
EXEC sp_executesql @DynamicSQL -- lowercase name also resolves on case-sensitive collations
GO

Author Comment

by:qprjohn121
Sorry Neilsr, I'm a beginner at this. Could you clarify the specifics? Here's what I have tried so far:

BULK INSERT dbo.stagingTable
FROM 'C:\Users\user\Desktop\TestImport.CSV'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW = 2
)
 
LVL 75

Expert Comment

by:Anthony Perkins
Unfortunately, you will not be able to use BULK INSERT to import this file without first editing the file, and even then it would be a nightmare.
You need to use some other tool to do this. I suspect you may find that SSIS will also choke on it if that column does not have text delimiters on all the rows. If that is the case and you cannot use any third-party tool, then you will have to resort to rolling your own import routine.
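
As an illustration of the "other tool" route (not part of the original comment), here is a minimal Python sketch that streams the file through the standard csv module and rewrites it with a delimiter that does not occur in the data. The function name convert_delimiter and the choice of '|' are assumptions made for this example; the csv module strips the double quotes and keeps "Flat1, High Street" as a single value. Rows are processed one at a time, so the file size should not matter.

```python
import csv
import io


def convert_delimiter(src, dst, delimiter="|"):
    """Read comma-separated rows from src and write them to dst joined by
    `delimiter`, with quotes removed. Returns the number of rows written."""
    count = 0
    for row in csv.reader(src):
        for value in row:
            # Refuse to write a value that would break the new layout.
            if delimiter in value or "\n" in value or "\r" in value:
                raise ValueError(f"row {count + 1}: value {value!r} is unsafe")
        dst.write(delimiter.join(row) + "\r\n")
        count += 1
    return count


# Small in-memory demo using the sample record from the question.
sample = 'Value1,Value2,Value3,,Value5,Value6,"Flat1, High Street",Value8,,Value10\r\n'
out = io.StringIO()
rows = convert_delimiter(io.StringIO(sample), out)
print(rows, repr(out.getvalue()))
```

For a real file, both streams would be opened with open(path, newline="") and the right encoding, and the converted file could then be loaded with FIELDTERMINATOR = '|' and ROWTERMINATOR = '\r\n'.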
 

Author Comment

by:qprjohn121
Ok I have cracked this in principle now. The key learning was to create a Format File which allows you to specify a different terminator for each field, so that fields 1 to 5 in my example use ',' as the terminator, field 6 uses ',"' as the terminator, field 7 uses '",' as the terminator, and the rest use ',' as the terminator apart from the last field which uses '\r\n' to mark the end of the record.

I created an XML Format File called 'TestImportFormatFile.xml' and referenced this file in my SQL import:

BULK INSERT dbo.staging_TestImport
FROM 'C:\myFolder\TestSmallFile.CSV'
WITH
(
FIRSTROW=2,
FORMATFILE = 'C:\myFolder\TestImportFormatFile.xml'
)
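
The format file itself was not posted, so for anyone following along here is a rough Python sketch of the idea, under stated assumptions. It checks the per-field terminators described above against the sample record and then builds the text of a matching XML format file. The column names (Col1 to Col10), MAX_LENGTH of 255, and the SQLVARYCHAR column type are placeholders, not values from the original file.

```python
from xml.sax.saxutils import quoteattr

# Terminators for the 10 fields, as described in the post: fields 1-5 end at
# ',', field 6 at ',"', field 7 at '",', fields 8-9 at ',' and field 10 at the
# end of the line. The last entry is the literal text \r\n (raw string),
# which is how a format file spells a carriage return plus line feed.
TERMINATORS = [","] * 5 + [',"', '",', ",", ",", r"\r\n"]


def split_record(line, terminators):
    """Mimic terminator-based parsing: cut each field at the first
    occurrence of its own terminator."""
    fields, rest = [], line
    for term in terminators:
        term = term.replace(r"\r\n", "\r\n")  # escape text -> real characters
        value, sep, rest = rest.partition(term)
        if not sep:
            raise ValueError(f"terminator {term!r} not found")
        fields.append(value)
    return fields


def build_format_file(terminators, max_length=255):
    """Return XML format file text (placeholder column names and lengths)."""
    lines = [
        '<?xml version="1.0"?>',
        '<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"',
        '           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">',
        " <RECORD>",
    ]
    for i, term in enumerate(terminators, start=1):
        # quoteattr picks single quotes when the terminator contains '"'.
        lines.append(
            f'  <FIELD ID="{i}" xsi:type="CharTerm" '
            f'TERMINATOR={quoteattr(term)} MAX_LENGTH="{max_length}"/>'
        )
    lines += [" </RECORD>", " <ROW>"]
    for i in range(1, len(terminators) + 1):
        lines.append(f'  <COLUMN SOURCE="{i}" NAME="Col{i}" xsi:type="SQLVARYCHAR"/>')
    lines += [" </ROW>", "</BCPFORMAT>"]
    return "\n".join(lines) + "\n"


sample = 'Value1,Value2,Value3,,Value5,Value6,"Flat1, High Street",Value8,,Value10\r\n'
print(split_record(sample, TERMINATORS))
print(build_format_file(TERMINATORS))
```

Note that this layout only fits rows where field 7 is wrapped in double quotes. A row with an unquoted value in that column does not contain the ',"' and '",' terminators in the expected places, which is the concern raised in the earlier comment.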

Hope this helps others.
 

Author Closing Comment

by:qprjohn121
This didn't answer all points of my question, but it identified for me the need to use Format Files.