• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 169
  • Last Modified:

How do I Delete millions of data points and Store the deleted records in SQL 2005?

Hi guys,

I am trying to delete approximately 13.5 million data points while simultaneously moving the deleted data to a storage folder.  When we try to look up production trends for the past two weeks the graph takes too long to load the 13.5 million points.  The objective is to keep only the current data for the past fourteen days in the original table while storing the rest in old records.  I have made a query to delete up to fourteen days ago by removing 100,000 rows at a time and would like to stick to a similar number because it allows me to see the progress of the query and does not overload the query.  

Here is part of the table I am using:
id                source_id         datetime                  data
11720123      27      9/29/2014 12:33:31 PM      422
11720124      17      9/29/2014 12:33:31 PM      1969
11720125      25      9/29/2014 12:33:31 PM      771
11720126      16      9/29/2014 12:33:31 PM      518
11720127      6      9/29/2014 12:33:31 PM      1079
11720128      12      9/29/2014 12:33:31 PM      854

Here is the query I have used to delete the data thus far:
USE [master]

DECLARE @total_rows_in_table int
DECLARE @rows_deleted int
DECLARE @total_rows_deleted int
SELECT @total_rows_in_table = COUNT(*)
FROM dbo.table1 WITH (NOLOCK)
SET @rows_deleted = 0
SET @total_rows_deleted = 0

WHILE 1 = 1
BEGIN
      DELETE top (100000)
      FROM dbo.table1
      WHERE Datetime<= DATEADD(d, -14, Getdate())      
      IF @rows_deleted=0
BEGIN
            CHECKPOINT
      END --IF
      WAITFOR DELAY '00:00:00.15'
END --WHILE

PRINT 'Total Rows Deleted=' + CAST (@total_rows_deleted AS varchar (10))
0
Tom_Hickerson
Asked:
Tom_Hickerson
2 Solutions
 
Phillip BurtonCommented:
Why don't you do it the other way round? Instead of deleting, then copying, why not copy, then delete? Of the top of my head, something like:

INSERT INTO dbo.table2
SELECT top (100000) *
FROM dbo.table1
WHERE Datetime<= DATEADD(d, -14, Getdate()) 

DELETE dbo.table1
FROM dbo.table1 T1
JOIN dbo.table2 T2
ON T1.id = t2.id

Open in new window

0
 
Christopher GordonSenior Developer AnalystCommented:
You might also want to take a look at the OUTPUT clause:

http://msdn.microsoft.com/en-us/library/ms177564.aspx
0
 
Scott PletcherSenior DBACommented:
Is the table clustered on datetime?  You should delete using the clustering key only.

Why count the records in the table first?  That's a waste of time, since you're ultimately deleting solely by datetime and not by record count.

You need a condition in the DELETE to prevent you from deleting the last 14 days' worth.

Also, when a DELETE finds no rows to delete, you should BREAK out of the loop.  Btw, you test variable @rows_deleted but you don't have a statement:
SET @row_deleted = @@ROWCOUNT
after the delete, so the variable will always contain NULL.
0
 
Tom_HickersonAuthor Commented:
The row count was left over from a query I previously used on another set of tables and was not intended for this query.
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now