Solved

Scheduled importing of CSV file using DTS or SSIS

Posted on 2009-03-30
3
1,046 Views
Last Modified: 2013-11-30
I have a business partner with FTP access to our server.  He has his system set to automatically FTP a CSV data file to our server.  I need to read this file periodically and import or update his existing records per the CSV file.  New files will be FTP'd every day for example and will have a unique name of some sort.  I don't think this is a complicated project as the application is very much just a web based (dotnet),  15 field data storage and recall application.

The most complicated part (for me anyway) may be trying to figure out whether I should be updating or adding new records based on the clients CSV.  Client will be sending their PrimaryKey with their records so I will know if their records already exist, will just need to write the code for that (is this where the DTS/SSIS part comes in?)

... so, after thinking out loud :) I need:

1) more education on implementing DTS or SSIS for use with CSV files (we have MS SQL Server 2000).
2) scheduling DTS/SSIS packages for execution as SQL Agent job.
3) a way to know which CSV files should be read next at the next scan
4) how to delete the CSVs after they are imported (or at some later date if a recommended)
5) a way to distinguish which records need to be imported and which ones need to be updated
6) how to execute the importing/updating after records are identified
** sounds like the DTS/SSIS "package" may take care of 5 & 6.

I know there are many existing questions about DTS and importing, but the information is scattered.  I wouldnt mind if part of the solution points to other quality solutions, articles, websites and best practices for solving this project.
0
Comment
Question by:newknew
3 Comments
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 24025356
>>The most complicated part (for me anyway) may be trying to figure out whether I should be updating or adding new records based on the clients CSV.<<
Actually that is the wron approach.  Do not import directly into Production table(s).  Always import into a staging table and then use that to validate the data before you insert and/or update into the final tables.
0
 
LVL 30

Accepted Solution

by:
nmcdermaid earned 500 total points
ID: 24025879
1) more education on implementing DTS or SSIS for use with CSV files (we have MS SQL Server 2000).
SQL 2000 > DTS
SQL 2005 > SSIS
So you'll need to use DTS at this point. If you are considering an upgrade to SQL 2005 in the near future then I suggest you do this in SSIS on SQL 2005 as the updgrade path from DTS > SSIS is not always simple.

2) scheduling DTS/SSIS packages for execution as SQL Agent job.
You just have to keep in mind that the package runs ON the server (all paths need to be as though you are on the server), and also remember that it runs as the SQL Agent service logon user.
3) a way to know which CSV files should be read next at the next scan
Usually once you successfully import a file you move it to an archive folder. So you never import them again
4) how to delete the CSVs after they are imported (or at some later date if a recommended)
Move them to an archive folder.
5) a way to distinguish which records need to be imported and which ones need to be updated
As you have surmised, as long as the primary key is in the source file, thats pretty easy. You should definitely make use of a staging table as acperkins has suggested.
6) how to execute the importing/updating after records are identified
As acperkins has suggested, you import them all into a staging table, then compare.
 
But it is very important to note that implementing these kind of solutions (looping through source files) in DTS requires T-SQL or ActiveX scripting, and this does not migrate very well to SSIS. If you have the opportunity then I strongly suggest you upgrade to SQL 2005 and do this in SSIS.
Upgrade path is simple (backup from SQL 2000 and restore into SQL 2005). Then of course there is regression testing if you want.
Steps 3,4,5 can be fundamentally different between DTS and SSIS. i.e. for SSIS it can in some cases be practical to not use a staging table, and looping and archiving files requires a lot less work in SSIS.
Here is a sample page for looping and importing files in DTS just to give you a taste:
http://www.sqldts.com/246.aspx
It is basically a DTS 'hack'
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Let's review the features of new SQL Server 2012 (Denali CTP3). It listed as below: PERCENT_RANK(): PERCENT_RANK() function will returns the percentage value of rank of the values among its group. PERCENT_RANK() function value always in be…
For both online and offline retail, the cross-channel business is the most recent pattern in the B2C trade space.
Via a live example, show how to extract information from SQL Server on Database, Connection and Server properties
Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.

730 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question