Solved

Trying to delete millions of records in a database

Posted on 2013-12-19
395 Views
Last Modified: 2013-12-20
I'm trying to delete millions of records; they are all useless log entries being recorded. When I delete them, my transaction log fills up even though my database is set to simple recovery, and then I get error messages. Why does the log fill up even with simple recovery?
Now, from searching online, it seems I have to change to FULL recovery, then change back to SIMPLE WITH NO_WAIT so that log_reuse_wait is cleared and the log file doesn't fill up. Is this safe to run on a production server?

USE [master]
GO  
ALTER DATABASE <db name> SET RECOVERY full  
GO
ALTER DATABASE <db name> SET RECOVERY SIMPLE WITH NO_WAIT;
GO
USE [db name]
GO
DBCC SHRINKFILE('<log file name>', 0, TRUNCATEONLY)
Question by:crcsupport
18 Comments
 
LVL 13

Assisted Solution

by:magarity
magarity earned 32 total points
Deleting millions of rows is always a pain!  I can't give a definitive answer to the safety of those ALTER statements but they do look OK.

Once you get that figured out, you can use this handy script to delete a few thousand at a time to keep it manageable: http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/SQL_Server_2008/Q_28292477.html
(instead of the WHERE datefield in that one, you just put your own criteria)
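In case that link goes stale, the general idea is a loop like this (the table, column, and batch size are placeholders of my own, not taken from the linked script):

-- Placeholder names throughout; substitute your own table and WHERE criteria
DECLARE @batch int
SET @batch = 5000

WHILE 1 = 1
BEGIN
    DELETE TOP (@batch)
    FROM dbo.app_log
    WHERE log_date < DATEADD(DAY, -30, GETDATE())   -- your own criteria goes here

    IF @@ROWCOUNT < @batch
        BREAK   -- the last (partial) batch has been deleted
END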
 
LVL 52

Assisted Solution

by:Carl Tawn
Carl Tawn earned 32 total points
You should delete in batches rather than trying to do the whole lot in one go. The reason the log fills up is because deleting 1 million rows in one go is a single transaction so there is no opportunity to trim the log part way through.

Take a look at this for an example of batch deleting: http://stackoverflow.com/questions/896810/sql-batched-delete
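If you want to watch how much of the log each batch consumes while you tune the batch size, DBCC SQLPERF(LOGSPACE) is a quick way to check:

-- Reports log size (MB) and log space used (%) for every database on the instance
DBCC SQLPERF(LOGSPACE);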
 
LVL 10

Assisted Solution

by:Banthor
Banthor earned 156 total points
Changing to Simple recovery will break mirroring and log shipping; I don't know how your production environment is configured.

You are going to need some kind of loop that works on subsets. If this has to be completed during some kind of downtime window, just create a script that deletes a batch, executes a log backup, then shrinks:
set  rowcount 10 

declare @rc int 
delete from admintools.dbo.bkp_files where BackupStartDate < Dateadd(Day,-7,getdate())

SELECT @rc = @@ROWCOUNT 
While @rc > 0 
BEGIN
    Begin Try
		Begin Transaction
			delete from admintools.dbo.bkp_files where BackupStartDate < Dateadd(Day,-7,getdate())
		Commit Transaction 
	end try 
	Begin Catch 
			Rollback Transaction
			-- Add Backup TRN Log 
			
	End Catch

END 

	-- Add Backup TRN Log 
	-- Execute Shrink File 
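
For reference, those two placeholder comments could be filled in with something like this (the database, path, and file names are examples of mine, not from your environment; also note a log backup only works in FULL or BULK_LOGGED recovery, while under SIMPLE recovery a CHECKPOINT is what frees log space):

-- Hypothetical fill-in for the "Add Backup TRN Log" / "Execute Shrink File" placeholders
BACKUP LOG [your_db] TO DISK = N'D:\Backups\your_db_log.trn';
DBCC SHRINKFILE (N'your_db_log', 100);   -- shrink the log file to roughly 100 MB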

 
LVL 10

Assisted Solution

by:Banthor
Banthor earned 156 total points
you may want to try a value bigger than 10
set  rowcount 10

 
LVL 10

Assisted Solution

by:Banthor
Banthor earned 156 total points
Update: forgot to reset @rc.
set  rowcount 100

declare @rc int 
delete from admintools.dbo.bkp_files where BackupStartDate < Dateadd(Day,-7,getdate())

SELECT @rc = @@ROWCOUNT 
While @rc > 0 
BEGIN

    Begin Try
		Begin Transaction
			delete from admintools.dbo.bkp_files where BackupStartDate < Dateadd(Day,-7,getdate())
			SELECT @rc = @@ROWCOUNT
		Commit Transaction 
	end try 
	Begin Catch 
			Rollback Transaction
			-- Add Backup TRN Log 
			
	End Catch

END 

	-- Add Backup TRN Log 
	-- Execute Shrink File 

 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 156 total points
NO -- don't run that code.

If your db is already in Simple mode, you should be OK as is.

You do need to DELETE the rows in batches, but at least 1000 at a time.

Run this code to verify that nothing is holding log records:

SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = '<db_name>'

If the second column is "NOTHING", then you're good.


DECLARE @rows_deleted int
DECLARE @total_rows_deleted int
SET @rows_deleted = 0
SET @total_rows_deleted = 0

WHILE 1 = 1
BEGIN
    DELETE TOP (1000)
    FROM [dbo].[table_name]
    SET @rows_deleted = @@ROWCOUNT
    SET @total_rows_deleted = @total_rows_deleted + @rows_deleted
    IF @rows_deleted < 1000
        BREAK
    WAITFOR DELAY '00:00:00.150'
END --WHILE

PRINT 'Total Rows Deleted = ' + CAST(@total_rows_deleted AS varchar(10))
 
LVL 68

Assisted Solution

by:Qlemo
Qlemo earned 62 total points
Scott's approach is what I would use myself. But I definitely would print out the (running) 'total rows deleted' count in the loop (prior to the WAITFOR), so you have something to watch.
 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 156 total points
Not a bad thought, but the PRINTs will be buffered anyway, so you won't see the output until the end, or the buffer fills up (which it won't unless you print many more bytes than just row counts).

If you want to check the progress initially, just cancel the query. Then you should see the "1000 row(s) affected" (or similar) messages repeated over and over.

Then re-start the code.
 
LVL 10

Assisted Solution

by:Banthor
Banthor earned 156 total points
Rather than PRINT, use RAISERROR with NOWAIT and the message is printed at execution time.

Declare @erm nvarchar(max) 
SELECT @erm = 'Deleted ' + cast(@RC as varchar(10)) + ' Rows' 
Raiserror(@ERM,0,1) with nowait 

 
LVL 10

Assisted Solution

by:Banthor
Banthor earned 156 total points
I do not see any reason for the delay, and without a transaction wrapper the entire execution will wait for 1 to no longer equal 1. So you will be locking the table for the entire time and still filling the transaction log with a transaction that cannot be mitigated by a log backup.
 
LVL 8

Assisted Solution

by:Surrano
Surrano earned 62 total points
I agree with Banthor: a commit in the loop is a must; otherwise the loop makes no sense.

I'm not familiar with cursor usage in MS SQL, but if you can manage it, it's much better (performance-wise) than deleting the first 1000 rows over and over.

If you need to delete the majority of the records, say 9M out of 10M total, it's much better to (see the sketch below):
- create a new table from all the rows remaining (still with the cursor approach),
- drop the old table,
- rename the new table to the old name.
This probably needs an application outage, though.
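A rough T-SQL sketch of that copy/drop/rename idea, with made-up names (dbo.app_log and log_date are placeholders; you would also need to recreate indexes and permissions on the new table):

-- Placeholder names throughout; adjust table, columns, and the cutoff to your case
SELECT *
INTO dbo.app_log_keep
FROM dbo.app_log
WHERE log_date >= DATEADD(DAY, -7, GETDATE());   -- copy only the rows you want to keep

DROP TABLE dbo.app_log;

EXEC sp_rename 'dbo.app_log_keep', 'app_log';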

As a final solution I'd recommend reorganising this logging table to use partitions that rotate e.g. every day or every week. Partitions can be truncated (see the sketch after this list):
- truncate is DDL
- truncate is O(1)
- no tedious transaction log
- no outage

Cheers,
 
LVL 68

Assisted Solution

by:Qlemo
Qlemo earned 62 total points
Nope, cursor is not better, because a commit destroys it. The delete should run with autocommit or explicit commit, of course, that much is true.
 
LVL 8

Assisted Solution

by:Surrano
Surrano earned 62 total points
Sad to learn that. It works like a charm in Oracle. I mean, we use cursors to select the rows, but the rest of the code deletes rows based on data returned by the cursor, not directly through the cursor. I don't know if that makes sense in MS SQL, though.
 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 156 total points
>> without a transaction wrapper the entire execution will wait for 1 to no longer equal 1. So you will be locking the table for the entire time and still filling the transaction log with a transaction that cannot be mitigated by a log backup. <<


100% wrong.  Outside of an explicit transaction, SQL Server will treat every statement as a separate transaction and automatically commit it, or roll it back if an error occurred.
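Easy to see for yourself (the table name below is just a placeholder): run a standalone DELETE, then check @@TRANCOUNT; it comes back 0 because the statement has already been committed on its own.

-- Placeholder table name; each statement is its own autocommitted transaction
DELETE TOP (1000) FROM dbo.app_log;
SELECT @@TRANCOUNT AS open_transactions;   -- 0: nothing left open, the DELETE already committed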
 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 156 total points
>> Rather than print , use raiserror with nowait and the message is printed at execution. <<

That is also not true (in the sense that you won't see the message in your output screen yet).  Easy enough to demonstrate:


DECLARE @counter smallint
SET @counter = 1

WHILE @counter <= 50
BEGIN
    RAISERROR('Test message #%d.', 16, 1, @counter) WITH NOWAIT
    WAITFOR DELAY '00:00:00.250'
    SET @counter = @counter + 1
END
 
LVL 1

Author Comment

by:crcsupport
The server is on a virtual machine with limited space. When I run this, should I keep shrinking the log file between batches? Or is there any way to stop the log file from growing?

The server is in simple recovery mode. I thought that with simple recovery it doesn't write anything to the log file, but it does. I tried setting 'NO_WAIT' on our testing server, which has a copy of the database that is moved over daily, but it doesn't change the column; it still shows 'NOTHING':

 USE [master]
GO  
ALTER DATABASE nextgen SET RECOVERY full  
GO
ALTER DATABASE nextgen SET RECOVERY SIMPLE WITH NO_WAIT
GO

SELECT [name], recovery_model_desc, log_reuse_wait_desc  
FROM sys.databases;

The server is in the simple recovery model and doesn't have any mirroring or log shipping set up.
 
LVL 69

Accepted Solution

by:ScottPletcher
ScottPletcher earned 156 total points
You want it to show as "nothing".

Again, do NOT put the db into FULL mode for dealing with the DELETEs -- that will just cause you issues when you do certain other tasks.  I'm not sure what web page you saw that on, but just IGNORE IT.

If you don't have enough log space to store 1,000 deleted rows, then increase the log space yourself ahead of time -- don't let it autogrow, as that is much less efficient.


-- increase log space to handle 1000 (or however many) deleted rows
ALTER DATABASE <db name> MODIFY FILE ( NAME = <logical_not_physical_log_file_name>, SIZE = 40MB )
-- if you get an error stating that the log is already larger than that, then you're good, and you can ignore the msg.


If you're that restricted on disk space, add a CHECKPOINT to the DELETE loop, to ensure log truncation occurs in a very timely manner:


DECLARE @rows_deleted int
DECLARE @total_rows_deleted int
SET @rows_deleted = 0
SET @total_rows_deleted = 0

WHILE 1 = 1
BEGIN
    DELETE TOP (1000)
    FROM [dbo].[table_name]
    SET @rows_deleted = @@ROWCOUNT
    SET @total_rows_deleted = @total_rows_deleted + @rows_deleted
    IF @rows_deleted < 1000
        BREAK
    CHECKPOINT
    WAITFOR DELAY '00:00:00.150'
END --WHILE

PRINT 'Total Rows Deleted = ' + CAST(@total_rows_deleted AS varchar(10))
 
LVL 8

Expert Comment

by:Surrano
Wow, I believe this CHECKPOINT trick is worth more than my comment :)
