VDanner
asked on
Import CSV file into SQL table
I am developing a package to import a flat CSV file into a SQL table. It does work, but I am unhappy with the performance of over an hour to import an average of 50,000 records. I first tried the Data Flow Task to map the columns and define the output colun datatypes/lengths to match the SQL destination columns. Some output columns are to be ignored and are selected as such in the transformation. I tried initially with the CSV source file location on my local drive, being imported over to the server. With the poor performance, I then tried the CSV source file location on the same server as the destination, but performance was still poor.
So I then tried a Bulk Insert Task, which I'm not very familiar with. This does not seem to work since the source columns are larger than the destinationon and/or I have some source columns that need to be ignored.
Any ideas how I can improve performance of this import?
So I then tried a Bulk Insert Task, which I'm not very familiar with. This does not seem to work since the source columns are larger than the destinationon and/or I have some source columns that need to be ignored.
Any ideas how I can improve performance of this import?
ASKER
Since I am so new to Bulk Insert, I would very much appreciate an example.
sure, gimme one sec
this is a genericised version of a proc i use to load files. i actually am loading a LOT more than this, but, i pulled it all out except for one, just to keep things simple:
CREATE PROC dbo.usp_loadFiles
AS
SET NOCOUNT ON
BULK INSERT database.dbo.table
FROM '\\servername\directorynam e\fileA.cs v'
WITH(FORMATFILE = 'E:\MSSQL\formatfile.fmt', FIELDTERMI NATOR = ',',ROWTERMINATOR = '\r',FIRSTROW=2,TABLOCK,RO WS_PER_BAT CH = 100000)
SET NOCOUNT OFF
GO
You will see, I have the format file in there,k as well as the field and row terminators, FIRSTROW and ROWS_PER_BATCH.
FIRSTROW is necessary because the files have headers (column names). i need to skip that row before starting to load the data
here is a format file:
8.0
6
1 SQLCHAR 0 50 "," 2 field1 SQL_Latin1_General_CP1_CI_ AS
2 SQLCHAR 0 1 "," 3 field2 SQL_Latin1_General_CP1_CI_ AS
3 SQLCHAR 0 1 "," 4 field3 SQL_Latin1_General_CP1_CI_ AS
4 SQLCHAR 0 4 "," 5 field4 SQL_Latin1_General_CP1_CI_ AS
5 SQLCHAR 0 4 "," 6 field5 SQL_Latin1_General_CP1_CI_ AS
6 SQLCHAR 0 4 "\n" 7 field6 SQL_Latin1_General_CP1_CI_ AS
(i've just cut/pasted... hopefully it prints out ok)
do you have BOL, and if so, you really need to read 'using format files'. it is exactly what you need to know - and it explains the format file in detail
CREATE PROC dbo.usp_loadFiles
AS
SET NOCOUNT ON
BULK INSERT database.dbo.table
FROM '\\servername\directorynam
WITH(FORMATFILE = 'E:\MSSQL\formatfile.fmt',
SET NOCOUNT OFF
GO
You will see, I have the format file in there,k as well as the field and row terminators, FIRSTROW and ROWS_PER_BATCH.
FIRSTROW is necessary because the files have headers (column names). i need to skip that row before starting to load the data
here is a format file:
8.0
6
1 SQLCHAR 0 50 "," 2 field1 SQL_Latin1_General_CP1_CI_
2 SQLCHAR 0 1 "," 3 field2 SQL_Latin1_General_CP1_CI_
3 SQLCHAR 0 1 "," 4 field3 SQL_Latin1_General_CP1_CI_
4 SQLCHAR 0 4 "," 5 field4 SQL_Latin1_General_CP1_CI_
5 SQLCHAR 0 4 "," 6 field5 SQL_Latin1_General_CP1_CI_
6 SQLCHAR 0 4 "\n" 7 field6 SQL_Latin1_General_CP1_CI_
(i've just cut/pasted... hopefully it prints out ok)
do you have BOL, and if so, you really need to read 'using format files'. it is exactly what you need to know - and it explains the format file in detail
ASKER
Thank you for this example. I will try out this morning and if all goes well close this question. I'm not sure what you mean by BOL? I am working in Business Intelligence Development Studio (BIDS) and have now opened their Help engine, searching on format files. Between your example and their narratives this should resolve my problem.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thank you. Regarding BOL (duh...should have guess that). I am downloading that version now and feel confident this and your example will do the trick.
Excellent. Good luck!
You need only to use a format file. see BOL: 'Using Format Files'
This will tell you exactly how to workaround the table and file variances. Let me know if you need an example.