Optimize large delete and insert operations with large rows
Posted on 2004-09-21
I have a table with around 4.8 million records, and about 65 columns each record. On a daily basis, I need to upload data from another source to this table. Data in the original source can change up to 13 months back, so I bring only the latest 13 months from the source into a holding table (replacing the data in it every day). I need to add/replace data in the "production" table with the new data from the holding table. Because there are so many columns and any of them can change, there is no easy way to determine which rows have changed and only update those; so I am simply deleting all rows in the production table that fall within the date range existing in the holding table, and then inserting everything from the holding table into it. Here's the problem.
Deleting the data from the production table takes a long time (over 20 minutes). On average, about 760,000 records need to be deleted from this 4.8 million record table, which has several indexes. Is there a way to delete records faster or without loggin? Once deleted, I don't need to get them back. A partial truncate would be ideal, but that does not exist. I tried copying the remaining records to another table, then truncating it, then inserting from the other table and from the holding table to complete the set, but that was not any faster. The delete looks something like this:
DECLARE @firstDate datetime, @lastDate datetime,
SET @firstDate = (SELECT min(the_date) FROM Holding_Table)
SET @lastDate = (SELECT max(the_date) FROM Holding_Table)
--delete records in Production table that exist in holding table
WHERE the_date BETWEEN @firstDate AND @lastDate --takes about 26 mins
--insert records from holding table into production table, which has additional fields
INSERT INTO Production_Table
SELECT *, null, null, null, null, null, null, null, dbo.getQuarter(the_date),null
FROM Holding_Table --takes about 12 mins
Any suggestions? I have indexes on the date fields for both tables (among others that are used for queries).
After this, I have update queries that update the null values inserted above, based on joins with other tables, which also are taking a long time.