Solved

SQL Import Comma Delimited with double quotes in value

Posted on 2011-02-28
5
1,615 Views
Last Modified: 2013-11-10
I have some comma delimited text files I am trying to import into my tables. One text file per table. I have no control over the output of the csv's so I can't change the delimiter or the likes.

The table in this example they are being imported into:

CREATE TABLE [dbo].[CorporationName] (
      [CorporationNameID] [int] NOT NULL ,
      [CorporationID] [int] NOT NULL ,
      [Name] [varchar] (255) NOT NULL ,
      [NameTypeId] [int] NOT NULL ,
      [Title] [varchar] (100) NULL ,
      [Salutation] [varchar] (50) NULL ,
      [Prefix] [varchar] (30) NULL ,
      [LastName] [varchar] (30) NULL ,
      [MiddleName] [varchar] (30) NULL ,
      [FirstName] [varchar] (30) NULL ,
      [Suffix] [varchar] (20) NULL
) ON [PRIMARY]

A few example rows:

36,235223,"OREGON-ZEUS MINING CO., INC.",1,"","","","","","",""
123308,346608,"Lawncare LLC.",1,"","","","","","",""
123312,346640,"John C, Reilly, M.D.  ""A Professional Corporation""",1,"","","","","","",""


Using the Import and Export Wizard within SQL SMS 2005, setting the Format to Delimited and Text qualifier to " (a single double-quote) gets me pretty good results - even Corps with a comma in the value seem to be imported properly - UNTIL it runs into a row such as the last one above, where the name of the corporation itself contains double quotes.

John C, Reilly, M.D. "A Professional Corporation"

is the actual name of the Corporation.. but apparently the double quotes in the value are throwing off the import cause I get an assortment of errors and the import fails.


- Copying to [Corporations].[dbo].[CorporationName] (Error)
      Messages
      * Error 0xc0202055: Data Flow Task: The column delimiter for column "Column 10" was not found.
       (SQL Server Import and Export Wizard)
      
      * Error 0xc0202092: Data Flow Task: An error occurred while processing file "C:\Temp\CorporationName.txt" on data row 23721.
       (SQL Server Import and Export Wizard)
      
      * Error 0xc0047038: Data Flow Task: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED.  The PrimeOutput method on component "Source - CorporationName_txt" (1) returned error code 0xC0202092.  The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.  There may be error messages posted before this with more information about the failure.
       (SQL Server Import and Export Wizard)
      
      * Error 0xc0047021: Data Flow Task: SSIS Error Code DTS_E_THREADFAILED.  Thread "SourceThread0" has exited with error code 0xC0047038.  There may be error messages posted before this with more information on why the thread has exited.
       (SQL Server Import and Export Wizard)
      
      * Error 0xc0047039: Data Flow Task: SSIS Error Code DTS_E_THREADCANCELLED.  Thread "WorkThread0" received a shutdown signal and is terminating. The user requested a shutdown, or an error in another thread is causing the pipeline to shutdown.  There may be error messages posted before this with more information on why the thread was cancelled.
       (SQL Server Import and Export Wizard)
      
      * Error 0xc0047021: Data Flow Task: SSIS Error Code DTS_E_THREADFAILED.  Thread "WorkThread0" has exited with error code 0xC0047039.  There may be error messages posted before this with more information on why the thread has exited.
       (SQL Server Import and Export Wizard)


Some of these text files contain 200k+ rows so I can't convert them to Tab delimited via Excel (max of 65k rows). Any ideas? Thanks in advance!
      
0
Comment
Question by:alright
  • 2
5 Comments
 
LVL 22

Expert Comment

by:8080_Diver
ID: 35002178
Well, if I had to work with that data, I think I would import the whole line into a single VarChar(big enough) column as the first step.  Then I would probably create CLR function/stored procedure to parse the data (based upon the commas, comma-quotes, and quote-commas).

The parsing would basically be along these line:
Starting from the left most character,
Search for either a comma or a double quote (i.e. find each with a CHARINDEX() and take the lowest number that is not 0);
If neither is found (i.e. both returns are 0), then the rest of the data constitutes the last column;
If a comma is found before a double quote is found, that represents a column of data, use it;
If a double quote is found before a comma, search for a Double-Quote-Comma and, if it is found, that that defines a column of data;
If the search for a Double-Quote-Comma returns a 0 (i.e. not found) then the rest of the data constitutes the last column;
If the parsing has not reached the end of the data, loop back to the first bullt; otherwise, get the work row.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 35011786
if you have Excel 2007 (or later), the worksheet row limit has been expanded to 1M.
0
 

Accepted Solution

by:
alright earned 0 total points
ID: 35020153
Thank you for the help. Strangely enough though, I imported the files, without any transformation, via Access and it digested them perfectly. I was then able to do an Import within SQL SMS from the Access database and everything transferred over perfectly... I guess Access still has its uses!

Thanks again
0
 

Author Closing Comment

by:alright
ID: 35067713
Hopefully this can help someone else in the future
0

Featured Post

Netscaler Common Configuration How To guides

If you use NetScaler you will want to see these guides. The NetScaler How To Guides show administrators how to get NetScaler up and configured by providing instructions for common scenarios and some not so common ones.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Read about achieving the basic levels of HRIS security in the workplace.
Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
Via a live example, show how to setup several different housekeeping processes for a SQL Server.
Viewers will learn how the fundamental information of how to create a table.

863 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

25 Experts available now in Live!

Get 1:1 Help Now