• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1585

Delete 29 million rows from a 1-billion-row table (SQL Server 2008)

I need suggestions from experts.
I need to delete 29 million rows from a 1-billion-row table. This is SQL Server 2008.
I am sure the following query will work. I know that it will take a couple of hours or more to run.
Does anybody have any suggestions/improvements for the following query? My main aim is to have it consume minimal resources and avoid a 'transaction log full' error.


      DECLARE @rows int;
      SET @rows = 1;

      WHILE @rows > 0
      BEGIN
             -- delete in batches of 100,000 rows until none match
             DELETE TOP (100000) FROM HugeTable
             WHERE [Key] < 29660822;   -- KEY is a reserved word, so bracket it

             SET @rows = @@ROWCOUNT;

             -- note: CHECKPOINT only frees log space in SIMPLE recovery mode
             CHECKPOINT;
      END
Asked by: chinawal
5 Solutions
 
Guy Hengel [angelIII / a3], Billing Engineer, commented:
It will only avoid that error if you put a WAITFOR DELAY into the loop and ensure you have transaction log backups running during that "time"... otherwise, you could just as well delete everything at once.
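A minimal sketch of that idea, reusing the loop from the question (the 10-second delay is an arbitrary placeholder; size it so the scheduled log backups can keep up):

      DECLARE @rows int;
      SET @rows = 1;

      WHILE @rows > 0
      BEGIN
             DELETE TOP (100000) FROM HugeTable
             WHERE [Key] < 29660822;

             SET @rows = @@ROWCOUNT;

             -- pause between batches so scheduled log backups can run
             -- and free up log space (delay length is a placeholder)
             WAITFOR DELAY '00:00:10';
      END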
 
Lara FEA commented:
If you don't need to maintain the transaction log for this DB, you can add this after each delete batch to avoid the 'transaction log full' error:

      BACKUP LOG databaseName WITH TRUNCATE_ONLY


 
Guy Hengel [angelIII / a3], Billing Engineer, commented:
trofimoval:
The WITH TRUNCATE_ONLY option is discontinued as of SQL Server 2008.
You cannot suggest that here, since the question is about SQL 2008, and it should not be suggested anyhow under normal conditions...
 
Lara FEA commented:
angelIII,
I didn't know that it had been discontinued. Thank you for the information.
BTW, I would appreciate it if you would look at this open question when you have time:
http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/Q_26433435.html
 
dwkor commented:
I believe you have a clustered index on the Key column?

I would suggest putting the delete statement in an explicit transaction:

      DECLARE @rows int;
      SET @rows = 1;

      WHILE @rows > 0
      BEGIN
             BEGIN TRAN;
                DELETE TOP (100000) FROM HugeTable
                WHERE [Key] < 29660822;
                SELECT @rows = @@ROWCOUNT;
             COMMIT;
      END

If you have the FULL recovery model, you can perform a transaction log backup after each transaction, as angelIII suggested. I would use separate transactions even if you don't back up the log, in order to minimize locking.
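A hedged sketch of that combination (the database name and backup path are placeholders to replace with your own):

      DECLARE @rows int;
      SET @rows = 1;

      WHILE @rows > 0
      BEGIN
             BEGIN TRAN;
                DELETE TOP (100000) FROM HugeTable
                WHERE [Key] < 29660822;
                SELECT @rows = @@ROWCOUNT;
             COMMIT;

             -- back up the log after each batch so the space can be reused;
             -- database name and path are placeholders
             BACKUP LOG MyDatabase TO DISK = N'D:\Backups\MyDatabase_log.trn';
      END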

If you need to perform this action regularly, you could think about table partitioning.
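For illustration only, a rough sketch of what that could look like (the function, scheme, and staging-table names are hypothetical, and the staging table must match HugeTable's structure and filegroup); the appeal is that SWITCH is a metadata-only operation, so purging a partition avoids logging every deleted row:

      -- hypothetical partition function/scheme, splitting at the purge boundary
      CREATE PARTITION FUNCTION pfHugeTable (bigint)
          AS RANGE RIGHT FOR VALUES (29660822);

      CREATE PARTITION SCHEME psHugeTable
          AS PARTITION pfHugeTable ALL TO ([PRIMARY]);

      -- once HugeTable is rebuilt on psHugeTable, the old rows can be
      -- switched out to an identically structured staging table and dropped
      ALTER TABLE HugeTable SWITCH PARTITION 1 TO HugeTable_Staging;
      DROP TABLE HugeTable_Staging;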
 
Scott Pletcher, Senior DBA, commented:
The easiest way would be to put the db in SIMPLE recovery mode [... you almost certainly can't do that since it's a prod db, but that is the easiest way :-) ].
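If you can do it, that is just the following (MyDatabase is a placeholder; note that switching back to FULL requires a full or differential backup to restart the log backup chain):

      ALTER DATABASE MyDatabase SET RECOVERY SIMPLE;
      -- ... run the batched delete; CHECKPOINT now actually frees log space ...
      ALTER DATABASE MyDatabase SET RECOVERY FULL;
      -- then take a full/differential backup to restart the log backup chain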

In FULL mode, you might try reducing to 50K at a time.  However, as noted earlier, the only way to clear that log space is to back up the log.

[Btw, you don't need explicit transactions here, each statement will implicitly be its own transaction.]

If the WHERE value is the first/only clustered key column, verify that SQL is using the index to do the DELETEs and, if it's not, "force" it to with a "hint" ("INDEX (...)").
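For example (the index name IX_HugeTable_Key is hypothetical; substitute the actual clustered index name):

      DELETE TOP (50000)
      FROM HugeTable WITH (INDEX (IX_HugeTable_Key))
      WHERE [Key] < 29660822;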


How many indexes are on the table? You also need to consider that when sizing the DELETEs, since every DELETE will naturally have to be applied to all indexes on the table as well.
 
jwarero commented:
Hi. Is this a one-time task or a recurring one? Does the table have any relationships set up? What I would do is the following (sketched in code after the list):
1) prepare a create table statement script based on the structure for HugeTable
2) rename the table HugeTable to HugeTable_1 from SSMS
3) run the create table statement to recreate the HugeTable (which would now contain no records)
4) do an insert statement to retrieve only the records that you would like to retain from HugeTable_1
5) Remove any relationships inherited by HugeTable_1
6) Drop table HugeTable_1

This all depends on whether you have referential integrity relationships set up with other tables, which would require that you drop the relationships and then recreate them. I believe this would take less time than running a delete on the table.
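A rough sketch of those steps in T-SQL (the recreate script and keep condition are stand-ins; script the real table definition in step 1 first):

      -- 2) rename the existing table out of the way
      EXEC sp_rename 'HugeTable', 'HugeTable_1';

      -- 3) recreate HugeTable from the scripted definition
      -- CREATE TABLE HugeTable ( ... );

      -- 4) copy back only the rows to retain
      -- (assumes no IDENTITY column; otherwise use SET IDENTITY_INSERT)
      INSERT INTO HugeTable
      SELECT *
      FROM HugeTable_1
      WHERE [Key] >= 29660822;

      -- 5)/6) drop any constraints referencing HugeTable_1, then drop it
      DROP TABLE HugeTable_1;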

jwarero
