Avatar of niceoneishere
niceoneishere
 asked on

SQL statement to delete records from a table based on date range.

I my website is based on sql server 2005 and ASP.NET 2.0. (C#) I track all the website errors thru a table called Exceptions and this table has now more than 200000 records.

I trying to clean it up a little. basically remove all the records from the years 2007 till 2011.

how can i do this without impacting the site as well as the database.

my table is as follows

USE [MsiteDB]
GO
/****** Object:  Table [dbo].[ExceptionLog]  *****/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[ExceptionLog](
	[Id] [int] IDENTITY(1,1) NOT NULL,
	[ExceptionMessage] [varchar](max) NOT NULL,
	[ExceptionTime] [datetime] NOT NULL,
 CONSTRAINT [PK_ExceptionLog] PRIMARY KEY CLUSTERED 
(
	[Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO
SET ANSI_PADDING OFF

Open in new window


Thanks and appreciate it
ASP.NETMicrosoft SQL Server 2005SQL

Avatar of undefined
Last Comment
Scott Pletcher

8/22/2022 - Mon
Scott Pletcher

>> basically remove all the records from the years 2007 till 2011. <<

Do you want to remove through 2011 or through 2010?


1) Convert the date range to an Id range, since id is the clustering key
2) delete in batches, to reduce locks

DECLARE @ending_id int
DECLARE @max_rows_to_delete_at_one_time int
DECLARE @rows_deleted int

SELECT @ending_id = MAX(id)
FROM dbo.ExceptionLog WITH (NOLOCK)
WHERE
    ExceptionTime < '20110101'  --<< remove all thru 2010; change to '20120101' to remove all of 2011 also
YZlat

What's wrong with running DELETE statement on all records in the table where Year is between 2007 and 2011?
Scott Pletcher

DECLARE @max_rows_to_delete_at_one_time int
DECLARE @rows_deleted int

SET @max_rows_to_delete_at_one_time = 5000
SET @rows_deleted = @max_rows_to_delete_at_one_time

WHILE @rows_deleted > 0
BEGIN
    DELETE TOP (@max_rows_to_delete_at_one_time)
    FROM dbo.ExceptionLog
    WHERE id < @ending_id
    SET @rows_deleted = @@ROWCOUNT
END --WHILE
Experts Exchange has (a) saved my job multiple times, (b) saved me hours, days, and even weeks of work, and often (c) makes me look like a superhero! This place is MAGIC!
Walt Forbes
YZlat

DELETE FROM ExceptionLog WHERE YEAR(ExceptionTime) BETWEEN '2007' AND '2011'
niceoneishere

ASKER
I just worried that If i straight out do the delete command it might hang up the site. I was wondering If I can delete between the date exception time range and say 10000  records at a time

Thanks
Scott Pletcher

RE-WORDING:

When I said:
>> 2) delete in batches, to reduce locks <<

I mean reduce locks that must be *held at the same time*.

If you just delete all rows in one statement, you'll likely lock the table.  Also, the transaction log for that db will have to grow to hold ALL the deleted data.

By doing it in batches, you can lessen the overall impact on that db and the system.

If the log on that db is not very large, expand the log the full amount needed yourself BEFORE running the DELETEs.  Automatic log expansion is a lot of overhead and causes more delays in the db than doing it yourself.
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
niceoneishere

ASKER
Scott, thanks for replying sir, where would i run the code you have posted I can use sql management studio express

thanks and sorry I am little new to this
YZlat

Put Scott's  code inside a procedure and run it from Management Studio using EXEC statement
Zberteoc

If you really want to make sure you don't impact the production data during the business hours then better create a job that will be executed daily at 3:00AM and do this:

DELETE TOP 10000 FROM ExceptionLog WHERE YEAR(ExceptionTime) BETWEEN '2007' AND '2011';

Before that makes sure you create an index on the ExceptionTime column otherwise the delete will scan the table every time executed:


CREATE INDEX IX_ExceptionLog_ExceptionTime ON ExceptionLog (ExceptionTime)

You could use top 1,000 instead of 10,000 but you'd have to make sure the fill up rate of that table is lower than the number you delete daily. You could also schedule the job to execute every 6 hr or so and then for sure you can make the number to delete only 1,000, which will practically have no impact on the table.
I started with Experts Exchange in 2004 and it's been a mainstay of my professional computing life since. It helped me launch a career as a programmer / Oracle data analyst
William Peck
Zberteoc

Actually in order to make the query use the index you will have to change it to:

DELETE TOP 10000 FROM ExceptionLog WHERE ExceptionTime BETWEEN '2007-01-01' AND '2011-12-31 23:59:59.999';

as BETWEEN includes the values given.

Also if you want to keep this job permanently so it will keep your table clean all the time then you should change the delete to remove dates that are older than a year:

DELETE TOP 1000 FROM ExceptionLog WHERE ExceptionTime < DATEADD(mm, -12, GETDATE());

Again the number has to be grater then the rate you fill the table daily if the job is scheduled daily.
ASKER CERTIFIED SOLUTION
Scott Pletcher

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
Zberteoc

I still insist on 2 things:

1. You need an index on ExceptionTime even if you use Scott's method.

2. The job solution is simpler and can be scheduled. If you still prefer Scott's method it can be put in a job as well.

A job will take care of the problem permanently.
niceoneishere

ASKER
thanks guys I really appreciate it one final question how would I make Scott solution to have both the 1 and 2 of what Zberteoc suggested.

Thanks
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
Zberteoc

Assuming you have the Management Studio:

In order cu create a job, under the server node you expand the SQL agent node and then right click on the Job > New Job.

- General section: you give it a name
- Steps section: click New and give it a name, choose your database and then copy and paste what code you want to execute in the big empty panel. Click OK.
- Schedule section - define how often you need the job to execute, you can choose daily and ans specify a time. You have to give it a name as well before you save by clicking OK.
Zberteoc

For SQL agent with SQL express you need this apparently:

http://standalonesqlagent.codeplex.com/
Scott Pletcher

A) Express does not have a job scheduler.

B) You don't necessarily need this to automatically run multiple times.

C) You DON'T need a separate index on ExceptionTime -- yes, the table will then have to be scanned ONCE to complete the DELETEs, but that's WAYYYY less overhead than constantly maintaining that index for all rows as they are added.

D) Almost certainly the table would best be *clustered* on ExceptionTime, rather than id, but I won't get into that in any more detail (unless the Author would like me to).  ALL queries against that table would improve and continue to run well; you may not even need to purge the table then.
All of life is about relationships, and EE has made a viirtual community a real community. It lifts everyone's boat
William Peck
niceoneishere

ASKER
Thanks worked out perfectly, I just removed the 2  lines

DECLARE @max_rows_to_delete_at_one_time int
DECLARE @rows_deleted int

as it was declared twice.

and the query worked fast I reduced the table from 330000 records to 35000 in less than 15 mins.

Thanks
Scott Pletcher

I guarantee you, it wouldn't have been nearly as fast DELETEing all rows at one time! :-)