Solved

SQL Import Comma Delimited with double quotes in value

Posted on 2011-02-28
5
1,641 Views
Last Modified: 2013-11-10
I have some comma delimited text files I am trying to import into my tables. One text file per table. I have no control over the output of the csv's so I can't change the delimiter or the likes.

The table in this example they are being imported into:

CREATE TABLE [dbo].[CorporationName] (
      [CorporationNameID] [int] NOT NULL ,
      [CorporationID] [int] NOT NULL ,
      [Name] [varchar] (255) NOT NULL ,
      [NameTypeId] [int] NOT NULL ,
      [Title] [varchar] (100) NULL ,
      [Salutation] [varchar] (50) NULL ,
      [Prefix] [varchar] (30) NULL ,
      [LastName] [varchar] (30) NULL ,
      [MiddleName] [varchar] (30) NULL ,
      [FirstName] [varchar] (30) NULL ,
      [Suffix] [varchar] (20) NULL
) ON [PRIMARY]

A few example rows:

36,235223,"OREGON-ZEUS MINING CO., INC.",1,"","","","","","",""
123308,346608,"Lawncare LLC.",1,"","","","","","",""
123312,346640,"John C, Reilly, M.D.  ""A Professional Corporation""",1,"","","","","","",""


Using the Import and Export Wizard within SQL SMS 2005, setting the Format to Delimited and Text qualifier to " (a single double-quote) gets me pretty good results - even Corps with a comma in the value seem to be imported properly - UNTIL it runs into a row such as the last one above, where the name of the corporation itself contains double quotes.

John C, Reilly, M.D. "A Professional Corporation"

is the actual name of the Corporation.. but apparently the double quotes in the value are throwing off the import cause I get an assortment of errors and the import fails.


- Copying to [Corporations].[dbo].[CorporationName] (Error)
      Messages
      * Error 0xc0202055: Data Flow Task: The column delimiter for column "Column 10" was not found.
       (SQL Server Import and Export Wizard)
      
      * Error 0xc0202092: Data Flow Task: An error occurred while processing file "C:\Temp\CorporationName.txt" on data row 23721.
       (SQL Server Import and Export Wizard)
      
      * Error 0xc0047038: Data Flow Task: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED.  The PrimeOutput method on component "Source - CorporationName_txt" (1) returned error code 0xC0202092.  The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.  There may be error messages posted before this with more information about the failure.
       (SQL Server Import and Export Wizard)
      
      * Error 0xc0047021: Data Flow Task: SSIS Error Code DTS_E_THREADFAILED.  Thread "SourceThread0" has exited with error code 0xC0047038.  There may be error messages posted before this with more information on why the thread has exited.
       (SQL Server Import and Export Wizard)
      
      * Error 0xc0047039: Data Flow Task: SSIS Error Code DTS_E_THREADCANCELLED.  Thread "WorkThread0" received a shutdown signal and is terminating. The user requested a shutdown, or an error in another thread is causing the pipeline to shutdown.  There may be error messages posted before this with more information on why the thread was cancelled.
       (SQL Server Import and Export Wizard)
      
      * Error 0xc0047021: Data Flow Task: SSIS Error Code DTS_E_THREADFAILED.  Thread "WorkThread0" has exited with error code 0xC0047039.  There may be error messages posted before this with more information on why the thread has exited.
       (SQL Server Import and Export Wizard)


Some of these text files contain 200k+ rows so I can't convert them to Tab delimited via Excel (max of 65k rows). Any ideas? Thanks in advance!
      
0
Comment
Question by:alright
  • 2
5 Comments
 
LVL 22

Expert Comment

by:8080_Diver
ID: 35002178
Well, if I had to work with that data, I think I would import the whole line into a single VarChar(big enough) column as the first step.  Then I would probably create CLR function/stored procedure to parse the data (based upon the commas, comma-quotes, and quote-commas).

The parsing would basically be along these line:
Starting from the left most character,
Search for either a comma or a double quote (i.e. find each with a CHARINDEX() and take the lowest number that is not 0);
If neither is found (i.e. both returns are 0), then the rest of the data constitutes the last column;
If a comma is found before a double quote is found, that represents a column of data, use it;
If a double quote is found before a comma, search for a Double-Quote-Comma and, if it is found, that that defines a column of data;
If the search for a Double-Quote-Comma returns a 0 (i.e. not found) then the rest of the data constitutes the last column;
If the parsing has not reached the end of the data, loop back to the first bullt; otherwise, get the work row.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 35011786
if you have Excel 2007 (or later), the worksheet row limit has been expanded to 1M.
0
 

Accepted Solution

by:
alright earned 0 total points
ID: 35020153
Thank you for the help. Strangely enough though, I imported the files, without any transformation, via Access and it digested them perfectly. I was then able to do an Import within SQL SMS from the Access database and everything transferred over perfectly... I guess Access still has its uses!

Thanks again
0
 

Author Closing Comment

by:alright
ID: 35067713
Hopefully this can help someone else in the future
0

Featured Post

Free Tool: SSL Checker

Scans your site and returns information about your SSL implementation and certificate. Helpful for debugging and validating your SSL configuration.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article explains how to reset the password of the sa account on a Microsoft SQL Server.  The steps in this article work in SQL 2005, 2008, 2008 R2, 2012, 2014 and 2016.
When it comes to protecting Oracle Database servers and systems, there are a ton of myths out there. Here are the most common.
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.

832 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question