DTS SQL 2000 Errors in Importing 1Gb text file

Posted on 2011-03-24
Medium Priority
Last Modified: 2013-11-30
The text file is 1 GB in size, with each record on its own row and the correct number of columns.

The fields are all enclosed in double quotes, and the columns are tab-delimited.

When I DTS import, I select the source as text file with the following settings:

Import Type: Delimited
File Type: ANSI
Row Delimiter: {CR}{LF}
Text Qualifier: Double Quote {"}

When I click Next, I choose Tab delimited and everything lines up correctly.  I am importing into a table that has the correct field names, field sizes, and alignment.  However, when I import the file, it terminates with an error that there is no column delimiter on row 123456 (for example).

I open the text file, go to row 123456, and see that the name field contains "BOB SMITH AND SANDY "JONES"".  Another example would be "Patrick O"Brian"....argh!

So they are using double quotes inside the fields.  I fix the row, save the file, and import again, only to find another error (same issue) on row 234567.

I'd like to go back to the vendor and tell them to fix their data, but unfortunately that's not an option, and we have been waiting months for this data.

I can't go through each row individually, so any ideas to get around this programmatically, with SQL?
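For reference, this failure mode is easy to reproduce with any quote-aware parser, not just DTS. A short Python sketch (Python's csv module stands in for the DTS text-file import here; the sample values are from the question) shows that a stray quote inside a qualified field breaks strict parsing, while a properly doubled quote does not:

```python
import csv
import io

# A strict tab-delimited, quote-qualified parser, approximating the
# rules the DTS import applies (text qualifier ", column delimiter tab).
def parse(line):
    return next(csv.reader(io.StringIO(line), delimiter="\t",
                           quotechar='"', strict=True))

good = '"BOB SMITH AND SANDY ""JONES"""\t"42"'   # inner quotes doubled
bad  = '"BOB SMITH AND SANDY "JONES""\t"42"'     # vendor's raw data

print(parse(good))   # the doubled quotes collapse back to one each

try:
    parse(bad)
except csv.Error as e:
    print("parse failed:", e)   # same class of error DTS reports
```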
Question by:Wedmore
LVL 10

Expert Comment

ID: 35210851
I know you have SQL 2000, but I found that SQL Server 2008 Express handles the import a little better. It will at least import the columns correctly, though it leaves the double quotes within the field. So if you downloaded and installed the free SQL Server Express and ran the attached query, you could have a clean import.

You might also want to search for a free parser program to 'clean' the CSV file before importing.
LVL 10

Expert Comment

ID: 35210866
Forgot to mention the last step: if you install SQL 2008, you can then export to CSV and use DTS to import that into SQL 2000.

Author Comment

ID: 35217407
I can't install software as it's a work PC.  Any solutions for a SQL 2000 setup?

LVL 10

Expert Comment

ID: 35218655
Not for SQL 2000. I think your best option is to clean the file before you try the DTS import. This one might work, but I haven't tried it: CSVed http://www.softpedia.com/get/System/File-Management/CSVed.shtml
LVL 75

Expert Comment

by:Anthony Perkins
ID: 35224874
Can you tell us what Service Pack you are using?
Have you considered using some other, more forgiving tool, such as MS Access, to import the data?

Author Comment

ID: 35226279
The server is on SP5.  I have tried Access, but it just outputs the erroneous rows as a list.  I'd still want to be able to "fix" them and import them somehow.
LVL 75

Expert Comment

by:Anthony Perkins
ID: 35226903
>>The server is on SP5.<<
Are you sure about that?  You may want to double-check; as far as I recall, the last Service Pack released for SQL Server 2000 was SP4.

Have you tried using BCP or BULK INSERT?

Author Comment

ID: 35227466
I tried BCP/BULK INSERT, but I don't have access to the server in the sense that I can't put the file locally on the machine, and the server itself doesn't have network shares set up that I could use.

I am wondering if there is a pattern or filter I could apply by importing each whole row, not delimited into fields, and then parsing it... just thinking aloud now.

Will double check tomorrow on the service pack.
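The whole-row idea above is workable: since every field in the vendor file is quoted, the delimiter that actually separates fields is the three-character sequence quote, tab, quote, and splitting on that sequence leaves stray quotes inside fields untouched. A minimal Python sketch of the idea (the thread's eventual fix is VBScript; Python is used here only for illustration, and the helper name `split_row` is invented for this example):

```python
def split_row(line):
    """Split one raw row on the qualifier+tab+qualifier sequence.

    Stray quotes inside a field survive untouched, because a lone
    quote never forms the three-character sequence quote-tab-quote.
    Assumes every field is quoted, as in the vendor file.
    """
    line = line.rstrip("\r\n")
    if line.startswith('"') and line.endswith('"'):
        line = line[1:-1]          # strip the outer qualifiers
    return line.split('"\t"')

print(split_row('"Patrick O"Brian"\t"123"'))
# ['Patrick O"Brian', '123']
print(split_row('"BOB SMITH AND SANDY "JONES""\t"42"\r\n'))
# ['BOB SMITH AND SANDY "JONES"', '42']
```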
LVL 75

Accepted Solution

Anthony Perkins earned 2000 total points
ID: 35228018
It seems then that your best option is to correct the file programmatically.  You can do this with an ActiveX Script task within the DTS package, executed prior to the Data Transformation task.  See below.  It could also be done from a standalone VBScript file.

What this should do is take these lines:
"Patrick O"Brian"

And convert them to a format that DTS supports, such as:
"Patrick O""Brian"

No doubt this could be done more efficiently with RegEx; however, I suspect you may be able to follow this better.  Be warned that on a 1 GB file, it may take a while!
Option Explicit

Function Main()
	Const ForReading = 1, ForWriting = 2
	Dim FSO, InStm, OutStm

	Set FSO = CreateObject("Scripting.FileSystemObject")
	' Read from a copy of the vendor file; write the repaired file to
	' the path used by the Text File connection in the DTS package.
	Set InStm = FSO.OpenTextFile("The path to a copy of your file goes here", ForReading)
	Set OutStm = FSO.OpenTextFile("The path of the file in the Connection for the Text file goes here", ForWriting, True)

	Do While Not InStm.AtEndOfStream
		OutStm.WriteLine FixQuotes(InStm.ReadLine)
	Loop

	InStm.Close
	OutStm.Close

	Main = DTSTaskExecResult_Success
End Function

Function FixQuotes(ByVal Buffer)
	' The file is tab-delimited, so a quote is a legitimate qualifier
	' only when it touches a tab, the start of the line, or the end of
	' the line.  Every other quote is data and must be doubled.
	Const DQ = """"
	Dim DQPos, PrevChar, NextChar

	DQPos = 2	' Position 1 can only be the opening qualifier
	Do While DQPos > 0
		DQPos = InStr(DQPos, Buffer, DQ, vbBinaryCompare)
		If DQPos > 0 Then
			PrevChar = Mid(Buffer, DQPos - 1, 1)
			NextChar = Mid(Buffer, DQPos + 1, 1)
			' PrevChar <> DQ also skips the quote we inserted on the
			' previous pass; NextChar = "" means end of line.
			If PrevChar <> DQ And PrevChar <> vbTab And NextChar <> vbTab And NextChar <> vbNullString Then
				' Stray quote inside a field: double it
				Buffer = Left(Buffer, DQPos) & DQ & Right(Buffer, Len(Buffer) - DQPos)
			End If
			DQPos = DQPos + 1
		End If
	Loop

	FixQuotes = Buffer
End Function

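For what it's worth, the RegEx version mentioned above does come out quite short. A Python sketch of the same rule (the thread's fix is VBScript; Python is used here only for illustration), assuming every field is quoted and the vendor never doubles quotes itself:

```python
import re

# A quote is a legitimate qualifier only at the start or end of the
# line or next to a tab; every other quote is data and must be doubled.
# The lookbehind also excludes a quote preceded by another quote, which
# mirrors the PrevChar <> DQ check in the VBScript version.
STRAY_QUOTE = re.compile(r'(?<=[^\t"])"(?=[^\t\r\n])')

def fix_quotes(line):
    return STRAY_QUOTE.sub('""', line)

print(fix_quotes('"Patrick O"Brian"\t"123"'))
# '"Patrick O""Brian"\t"123"'
```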


Author Comment

ID: 35242157
The VBScript worked.  I see now I would have to do an UPDATE with REPLACE to remove the ("") surrounding each field.

Author Closing Comment

ID: 35242164
Thanks.  You're right, it took 14 hours to import/reformat the 1 GB file.

Unfortunately, unless the vendor fixes their data, this is the only workaround.
