Solved

DTS SQL 2000 Errors in Importing 1Gb text file

Posted on 2011-03-24
Last Modified: 2013-11-30
The text file is 1Gb in size, with each record being on its own row with the correct number of columns etc.

The fields are all enclosed with double quotes and each column is tab delimited.

When I DTS import, I select the source as text file with the following settings:

Import Type: Delimited
File Type: ANSI
Row Delimiter: {CR}{LF}
Text Qualifier: Double Quote {"}

When I click Next, I choose Tab delimited and everything lines up correctly.  I am importing to a table that has all the correct field names, field sizes and correct alignment.  However, when I import the file, it terminates with an error saying that there is no column delimiter on row 123456 (for example).

I open the text file and go to row 123456 and see that the name field has "BOB SMITH AND SANDY "JONES"".  Another example would be "Patrick O"Brian"....argh!

So they are using double quotes inside the field.  I fix it, save it and import it, only to find that there's another error (same issue) on row 234567.

I'd like to go back to the vendor and tell them to fix their data, but that's unfortunately not an option, and we have been waiting months for this data.

I can't go through each row individually, so any ideas to get around this programmatically with SQL?
Question by:Wedmore
11 Comments
 
LVL 10

Expert Comment

by:Mez4343
I know you have SQL 2000, but I found that SQL 2008 Express handles the import a little better. It will at least import the column(s) correctly, but it leaves double quotes within the field. So if you downloaded and installed the free SQL Server Express and ran the attached query, you could have a clean import.

You might want to search for a free parser program to 'clean' the CSV file before importing too.
 
LVL 10

Expert Comment

by:Mez4343
Forgot to mention the last step: if you install SQL 2008, you can then run DTS to export to CSV and import to SQL 2000.
 

Author Comment

by:Wedmore
I can't install software as it's a work PC.  Any solutions for a SQL 2000 setup?
 
LVL 10

Expert Comment

by:Mez4343
Not for SQL 2000. I think your best option is to clean the file before you try the DTS import. This one might work, but I haven't tried it: CSVed http://www.softpedia.com/get/System/File-Management/CSVed.shtml
 
LVL 75

Expert Comment

by:Anthony Perkins
Can you tell us what Service Pack you are using?
Have you considered using some other tool that is more forgiving such as MS Access to import the data?
 

Author Comment

by:Wedmore
The server is on SP5.  I have tried Access but that will output the erroneous rows as a list.  I'd still want to be able to "fix" them and import them somehow.
 
LVL 75

Expert Comment

by:Anthony Perkins
>>The server is on SP5.<<
Are you sure about that?  You may want to double-check, because as far as I recall the last Service Pack was 4.  This will confirm one way or the other:
SELECT SERVERPROPERTY('ProductLevel')

Have you tried using BCP or BULK INSERT?
 

Author Comment

by:Wedmore
Tried BCP/BULK INSERT, but I don't have access to the server, in the sense that I can't put the file locally on the machine, and the server itself doesn't have network shares set up that I could use.

I am wondering if there is a pattern or filtering I could do by importing the whole row, not delimited into any fields, and then parsing it....just thinking aloud now.

Will double-check tomorrow on the service pack.
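
As a rough illustration of the whole-row idea above, the sketch below splits one raw line on the three-character sequence quote, tab, quote instead of relying on a text qualifier. It is written in Python rather than SQL, which is an assumption (nothing in this thread says Python is available), and `split_raw_line` is a hypothetical helper name. It also assumes that every field is enclosed in double quotes and that the quote-tab-quote sequence never occurs inside a field.

```python
def split_raw_line(line: str) -> list[str]:
    """Split one raw record into fields without using a text qualifier.

    Assumes every field is wrapped in double quotes and fields are
    separated by a single tab, so the field boundary is always the
    sequence: quote, tab, quote.
    """
    line = line.rstrip("\r\n")

    # Drop the opening quote of the first field and the closing quote
    # of the last field, if they are present.
    if line.startswith('"'):
        line = line[1:]
    if line.endswith('"'):
        line = line[:-1]

    # Everything between boundaries is field content, stray quotes included.
    return line.split('"\t"')
```

Because the split only looks at the boundary sequence, a value such as `"BOB SMITH AND SANDY "JONES""` comes back as a single field with its inner quotes intact. The same boundary-based approach could in principle be applied to a one-column staging table, but that is not shown here.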
 
LVL 75

Accepted Solution

by:
Anthony Perkins earned 500 total points
It seems, then, that your best option is to correct the file programmatically.  You can do this with an ActiveX Script task within the DTS package that executes prior to the Data Transformation task.  See below.  But it could also be done from a standalone VBScript script.

What this should do is take these lines:
"BOB SMITH AND SANDY "JONES""
"Patrick O"Brian"

And convert them to a format that DTS supports, such as:
"BOB SMITH AND SANDY ""JONES"""
"Patrick O""Brian"

No doubt this could be done more efficiently with RegEx; however, I suspect you may be able to follow this better.  Be warned that on a 1GB file, it may take a while!
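Option Explicit

' Entry point called by the DTS ActiveX Script task.
Function Main()
	Const ForReading = 1, ForWriting = 2
	Dim FSO, InStm, OutStm

	Set FSO = CreateObject("Scripting.FileSystemObject")
	Set InStm = FSO.OpenTextFile("The path to a copy of your file goes here", ForReading)
	' True = create the output file if it does not already exist
	Set OutStm = FSO.OpenTextFile("The path of the file in the Connection for the Text file goes here", ForWriting, True)

	' Process one line at a time so the whole file is never held in memory
	Do While Not InStm.AtEndOfStream
		OutStm.WriteLine FixQuotes(InStm.ReadLine)
	Loop
	InStm.Close
	OutStm.Close

	Main = DTSTaskExecResult_Success
End Function

' Doubles each double quote found inside a field and leaves the quotes
' that open or close a field untouched.
Function FixQuotes(ByVal Buffer)
	Const DQ = """"
	Dim Delimiter, DQPos, PrevChar, NextChar

	' The source file is tab delimited
	Delimiter = vbTab
	' Position 1 is the opening quote of the first field, so start at 2
	DQPos = 2
	Do While DQPos > 0
		DQPos = InStr(DQPos, Buffer, DQ, vbBinaryCompare)
		If DQPos > 0 Then
			PrevChar = Mid(Buffer, DQPos - 1, 1)
			NextChar = Mid(Buffer, DQPos + 1, 1)
			' Skip quotes that follow another quote (including one just inserted),
			' opening quotes (after a delimiter) and closing quotes
			' (before a delimiter or at the end of the line)
			If PrevChar <> DQ And PrevChar <> Delimiter And NextChar <> Delimiter And NextChar <> vbNullString Then
				Buffer = Left(Buffer, DQPos) & DQ & Right(Buffer, Len(Buffer) - DQPos)
			End If
			DQPos = DQPos + 1
		End If
	Loop

	FixQuotes = Buffer
End Function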
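
For comparison, the RegEx approach mentioned above might look something like the following. This is a sketch in Python, not part of the VBScript answer, and it assumes a Python runtime is available somewhere the file can be read. The names `fix_quotes` and `fix_file` and the `cp1252` encoding (a guess at what "ANSI" means for this file) are assumptions. It treats a double quote as embedded when it is not at the start or end of the line and not directly next to a tab.

```python
import re

# A quote is "embedded" when it is not the first character of the line,
# not preceded by a tab (opening quote), and not followed by a tab or
# the end of the line (closing quote).
_EMBEDDED_QUOTE = re.compile(r'(?<!^)(?<!\t)"(?!\t|$)')


def fix_quotes(line: str) -> str:
    """Double every embedded quote in one record (no trailing line ending)."""
    return _EMBEDDED_QUOTE.sub('""', line)


def fix_file(src_path: str, dst_path: str, encoding: str = "cp1252") -> None:
    """Stream src_path to dst_path line by line, fixing quotes as it goes."""
    # newline="" keeps the original line endings visible so they can be
    # stripped and rewritten explicitly as CR LF.
    with open(src_path, encoding=encoding, newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        for line in src:
            dst.write(fix_quotes(line.rstrip("\r\n")) + "\r\n")
```

With these rules, `"Patrick O"Brian"` becomes `"Patrick O""Brian"`. The pattern is not safe to run twice on the same file: quotes that are already doubled would be doubled again, and an embedded quote that happens to sit directly next to a tab cannot be told apart from a field boundary.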

 

Author Comment

by:Wedmore
The VBScript worked. I see now that I would have to do an UPDATE with REPLACE to remove the ("") surrounding each field.
 

Author Closing Comment

by:Wedmore
Thanks.  You're right, it took 14 hrs to import/reformat the 1Gb file.

Unfortunately, unless the vendor fixes their data, this is the only workaround.