Insert command inside transaction takes too long
Posted on 2006-05-26
I am developing a utility in C# (.NET 2.0) that performs a large number of inserts into a SQL Server 2000 database. The utility started out using SqlBulkCopy and has since been switched to SQLXMLBulkLoad to keep RAM usage down. It moves a LARGE amount of data to the specified SQL server very quickly: it can load small amounts of data from the XML file in seconds, and even large datasets in mere minutes. The change to SQLXMLBulkLoad requires me to load the data into temporary tables and then perform inserts and updates to move this data where I need it, which works fine. Recently, I switched the utility to use the SqlTransaction object in the .NET 2.0 Framework, and at first I did not notice any problems with the change.
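The bulk-load step looks roughly like this (a minimal sketch, assuming the SQLXML COM interop assembly SQLXMLBULKLOADLib is referenced; the connection string, file names, and error-log path are made up for illustration):

```csharp
using SQLXMLBULKLOADLib;

class BulkLoadSketch
{
    static void Main()
    {
        var loader = new SQLXMLBulkLoadClass();
        loader.ConnectionString =
            "Provider=SQLOLEDB;Server=(local);Database=MyDb;Integrated Security=SSPI";
        loader.ErrorLogFile = "bulkload_errors.xml";

        // Stream the XML data file into the temporary/staging table
        // described by the annotated XSD mapping schema. The loader
        // streams the file, so memory usage stays low.
        loader.Execute("schema.xml", "data.xml");
    }
}
```

This step runs on its own connection, outside the transaction, which is why moving it out of the SqlTransaction was possible in the first place.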
After testing some small and medium-sized datasets, I noticed the update and insert commands were a little slower inside the transaction than without it. I moved the bulk loads outside of the transaction (that step now runs before the transaction begins, and the temp tables are dropped after the transaction is committed), which gave a good enough speed boost to make the transfer fairly fast again.

Recently, I tried loading a dataset of 200,000 rows whose fields are inserted into 2 tables. To be more specific, 2 of the fields from each of the 200,000 rows go into one table (Table1) and 40 of the fields go into another table (Table2). I bulk-load all of the fields for the entire 200,000 rows into the temporary table. I then start a transaction, turn off all triggers on the tables, and perform the insert commands. The insert of the 2 fields into Table1 is very fast. Then I perform the insert into Table2 (which also uses a field from Table1 through a join). This insert never ends; I've let it sit there for over 8 hours and it does nothing. I have run a few benchmarks and found that even a few thousand rows take a very long time to complete this way. If I do all of this without a transaction, it runs fine and takes less than 5 minutes, and it runs just as fast in Query Analyzer. Also, I am about 98% sure this is not a deadlocking issue, since it works on smaller datasets (fewer than 200 rows), and on medium-sized datasets (a couple thousand rows) it does eventually finish.
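For reference, the two inserts look roughly like this (all table and column names here are invented for illustration; my real schema differs):

```sql
-- Fast insert: 2 fields from the staging table into Table1
INSERT INTO Table1 (FieldA, FieldB)
SELECT s.FieldA, s.FieldB
FROM TempStaging s

-- Slow insert: the other 40 fields into Table2, joining back to
-- Table1 to pick up its key for the new rows
INSERT INTO Table2 (Table1Key, Field1, /* ... */ Field40)
SELECT t1.KeyField, s.Field1, /* ... */ s.Field40
FROM TempStaging s
JOIN Table1 t1 WITH (NOLOCK) ON t1.FieldA = s.FieldA
```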
Not using a transaction is out of the question: these are large inserts, and if any error occurs, I need none of the data to be committed. I've even gone through and added "WITH (NOLOCK)" to all my select statements and put "SET NOCOUNT ON" at the start of my batches, just to keep the work on the SQL server as light as possible. One more bit of information: I issue all the insert commands through the SqlCommand object, and each one is an "INSERT INTO" with a "SELECT" statement from the temporary table. Does anyone have any recommendations on how I should go about fixing this?
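The transactional part of the code is structured roughly like this (a simplified sketch with hypothetical names, not my actual code; the SQL text stands in for the real INSERT INTO ... SELECT):

```csharp
using System.Data.SqlClient;

class TransactionalInsertSketch
{
    static void Main()
    {
        using (var conn = new SqlConnection(
            "Server=(local);Database=MyDb;Integrated Security=SSPI"))
        {
            conn.Open();
            SqlTransaction tx = conn.BeginTransaction();
            try
            {
                var cmd = new SqlCommand(
                    "INSERT INTO Table2 (Table1Key, Field1 /* ... */) " +
                    "SELECT t1.KeyField, s.Field1 /* ... */ " +
                    "FROM TempStaging s " +
                    "JOIN Table1 t1 WITH (NOLOCK) ON t1.FieldA = s.FieldA",
                    conn, tx);          // the command is enlisted in the transaction
                cmd.CommandTimeout = 0; // no client-side timeout on the long insert
                cmd.ExecuteNonQuery();

                tx.Commit();
            }
            catch
            {
                tx.Rollback();
                throw;
            }
        }
    }
}
```

CommandTimeout is set to 0 only so the client never cancels the statement; it does not explain why the same insert is fast outside the transaction.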
For an added bonus, after this process finishes, it will then perform an insert of 500,000 rows for another table... I haven't even gotten there yet!
Thanks in advance!