How to continuously load data from a CSV file into an MS SQL Server table?

Mamunbau used Ask the Experts™
I have a CSV file into which a program is continuously inserting data. I want to read that data continuously, or at a certain interval, and insert (not update) it into a certain table.

The interval may be every 5 minutes. I just want to get the latest data inserted into the table. Within any 5-minute window, a maximum of 200 new rows is expected in the CSV file. 99.5% reliability is required for this process.

My first preference is MS SQL Server 2008 Express Edition. If the solution is not possible with that, please suggest the cheapest product I can use to achieve it.

However, I also need to back up a week's worth of the CSV file and then start over with an empty CSV file. How can I avoid downtime while doing that?

I have no need to keep a transaction log. Please help.
Look at scheduling a SQL Server Job: http://msdn.microsoft.com/en-us/library/ms191439.aspx

The job should use the BULK INSERT statement: http://msdn.microsoft.com/en-us/library/ms188365.aspx

This'll work fine if the CSV filename doesn't change.

Can I ask: can you not get the program to insert into the database directly?
Hm, as far as I can suggest, you SHOULD try to split the CSV into multiple files if possible. Then use the BULK INSERT command to import them into an SQL table. It works well and fast.

There is no way for T-SQL alone to "know" which rows in a CSV are new. Another idea is to use a staging table in SQL Server: always insert all rows from the CSV into it, then run a T-SQL procedure that copies only the rows that don't already exist in the destination table. This works provided you have a way to identify rows (a unique identifier should be present in the CSV).

Hope my idea helps you.
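The staging-table idea can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module as a stand-in for SQL Server (the equivalent T-SQL would use `INSERT ... SELECT ... WHERE NOT EXISTS` in the same way); the table and column names are invented for the example.

```python
import csv
import io
import sqlite3

# In-memory database stands in for SQL Server; "readings" is the
# destination table, "staging" receives every row of each CSV load.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value TEXT)")
con.execute("CREATE TABLE staging  (id INTEGER, value TEXT)")

def load_csv(con, text):
    """Load the whole CSV into staging, then copy only the new rows."""
    con.execute("DELETE FROM staging")
    rows = [(int(r[0]), r[1]) for r in csv.reader(io.StringIO(text))]
    con.executemany("INSERT INTO staging VALUES (?, ?)", rows)
    # Copy rows whose unique id is not yet in the destination table.
    con.execute("""
        INSERT INTO readings (id, value)
        SELECT s.id, s.value FROM staging s
        WHERE NOT EXISTS (SELECT 1 FROM readings r WHERE r.id = s.id)
    """)
    con.commit()

load_csv(con, "1,a\n2,b\n")        # first batch: both rows are new
load_csv(con, "1,a\n2,b\n3,c\n")   # second batch: only id 3 is new
print(con.execute("SELECT COUNT(*) FROM readings").fetchone()[0])  # 3
```

Loading the same rows twice is harmless here, which is what gives the approach its reliability: a failed or repeated run never duplicates data.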

Commented:
>>Please suggest the cheapest product I can use to achieve solution.

If you are OK with the limitations imposed by either Oracle XE or MS SQL Server Express, then you can't go wrong with either one of them. Both limit your database size to 4 GB, so you have to decide whether that provides enough storage for your requirements.

Commented:
If you choose Oracle XE, you can use an external table that connects directly to the CSV file, so you don't need to do any additional file processing.

http://download.oracle.com/docs/cd/B25329_01/doc/admin.102/b25107/impexp.htm
Commented:
Dear all

Thank you very much for your comments.

I am going to use a VB program (Windows service) to load the data into SQL Server.

Since thousands of records will be inserted into the CSV file within minutes, inserting them all into a temp table and then updating may not be the preferred approach. For the same reason, BULK INSERT and job scheduling will not help.

Moreover, the CSV filename will change on a per-day basis.

So I have no alternative but to develop a custom service.


You may not have to develop a Windows service; you can also code a console application in .NET and have the Windows scheduler run it every 5 minutes. A Windows service might be more complex to program, and if you only need to run this procedure every 5 minutes, a scheduled console application will also do.

When using .NET you can easily scan a folder to detect new files. You can use the SQL database to remember which files were previously processed and what the last processed line was, so you can continue where you stopped when the same file is processed multiple times.

A small suggestion if you use .NET for this: don't use the File class's Open and ReadLine methods; use the StreamReader and FileStream classes in .NET, they are blazingly fast.
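The "remember where you stopped" idea can be sketched in any language. Here is a small Python illustration; the in-memory `offsets` dictionary and the filenames are invented for the example (in practice the offset would live in a SQL table keyed by filename, as suggested above):

```python
import os
import tempfile

def read_new_lines(path, offsets):
    """Return lines appended since the last run, using a saved byte offset."""
    start = offsets.get(path, 0)
    with open(path, "rb") as f:
        f.seek(start)
        data = f.read()
    # Only consume complete lines; a partially written last line is
    # left for the next run so a row is never split in half.
    complete = data.rfind(b"\n") + 1
    offsets[path] = start + complete
    return data[:complete].decode().splitlines()

# Demo: the writer appends between polls; each poll sees only new rows.
offsets = {}
path = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(path, "a") as f:
    f.write("1,a\n2,b\n")
first = read_new_lines(path, offsets)    # ['1,a', '2,b']
with open(path, "a") as f:
    f.write("3,c\n")
second = read_new_lines(path, offsets)   # ['3,c']
```

Seeking to a saved offset instead of re-reading the whole file keeps each 5-minute poll cheap even as the daily CSV grows, and ignoring a trailing partial line protects against reading a row while the producer is still writing it.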
