Solved

Copy data without a large amount of logging

Posted on 2006-11-13
11
281 Views
Last Modified: 2008-02-01
I have a table with 3.2 million rows of data in it.  I would like to transfer about 2 million rows out of it into an archive table.  Then, once a month, I plan on transferring all records that are over a year old.  I would like to know how to accomplish this with the least amount of growth in the log files.
I am using SQL 2000; the database is set to full recovery mode and the log files are backed up every hour.
0
Comment
Question by:schlepuetz
11 Comments
 
LVL 29

Expert Comment

by:Nightman
ID: 17932077
Honestly, you can't. These transactions will be logged, and there is nothing you can do about it. Your best option would be to increase the backup frequency: do the transfer in small batches, issuing a CHECKPOINT and backing up the log in between (the transactions will still be logged, but individual log file growth can be kept to a minimum, so you don't have to worry about the size as much).

Alternatively, perhaps you should consider SQL 2005 and table partitioning.
0
 
LVL 11

Expert Comment

by:regbes
ID: 17932148
Hi schlepuetz,

Here is a nice example of what can be done by switching between user modes and recovery models to accomplish what you want:

/*============================================================================
  File:     DBAlterForBatchOperation.sql

  Summary:  This script is functional yet really just pseudo code to give you an
            idea of how batch processing can be automated to switch recovery model.

  Date:     April 12, 2005

  SQL Server Version: 9.00.1116.08 (s/b IDW14)
------------------------------------------------------------------------------
  Copyright (C) 2005 Kimberly L. Tripp, SYSolutions, Inc.
  All rights reserved.

  For more scripts and sample code, check out
    http://www.SQLskills.com

  This script is intended only as a supplement to demos and lectures
  given by Kimberly L. Tripp.  
 
  THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF
  ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED
  TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
  PARTICULAR PURPOSE.
============================================================================*/

-- This is a sample script based on the idea that Bulk_Logged Recovery Model should
-- only be used for the length of time that the batch operation is running.

-- A quick overview of the steps is as follows:
-- 1) Perform a Log Backup
-- 2) Change the Recovery Model to Bulk_Logged
-- 3) Perform the Batch Operation
-- 4) Change the Recovery Model back to Full
-- 5) Perform a Log Backup (remember this might be VERY large)

-- OK - now for the automation through TSQL script!

-- You might want to create a JunkDB to test this on?
CREATE DATABASE JunkDB
-- Back it up first...
BACKUP DATABASE JunkDB TO DISK = 'c:\Program Files\Microsoft SQL Server\mssql\backup\JunkDBBackup.bak'
    WITH NOINIT, STATS = 10
---------------------------------------------------------------------------------------------------------------------
-- 1) Perform a Log Backup
----------------------------------------------------------------------------------------------------------------------
BACKUP LOG JunkDB TO DISK = 'c:\Program Files\Microsoft SQL Server\mssql\backup\JunkDBBackup.bak'
    WITH NOINIT, STATS = 10

----------------------------------------------------------------------------------------------------------------------
-- 2) Change the Recovery Model to Bulk_Logged
----------------------------------------------------------------------------------------------------------------------

-- May want to programmatically check the status before setting.
IF DATABASEPROPERTYEX('JunkDB', 'Recovery') = 'FULL'
    ALTER DATABASE JunkDB
        SET RECOVERY BULK_LOGGED
        -- Users CAN be actively processing in the DB when the Recovery Model is changed;
        -- this may or may not be desired. Adding the "terminate" clause DOES NOT affect
        -- current users if ONLY the recovery model is changed. In order to use the terminate
        -- option you must also supply a database "state" change. In this case you might
        -- want to set RESTRICTED_USER:
        --     SET RECOVERY BULK_LOGGED, RESTRICTED_USER WITH ROLLBACK AFTER 10
        -- Or you may want to change the state from READ_ONLY to READ_WRITE
        -- (use the following WITHOUT the RESTRICTED_USER setting):
        --     SET RECOVERY BULK_LOGGED, READ_WRITE WITH ROLLBACK AFTER 10
        -- OR if both settings are desired:
        --     SET RECOVERY BULK_LOGGED, RESTRICTED_USER, READ_WRITE WITH ROLLBACK AFTER 10
        -- Changing a state option WITH ROLLBACK AFTER will terminate active users and
        -- roll back their transactions after n (10) seconds.
        -- NOTE: In order to roll back the active transactions, this command may take MORE than
        -- 10 seconds.
go

----------------------------------------------------------------------------------------------------------------------
-- 3) Perform the Batch Operation
----------------------------------------------------------------------------------------------------------------------

-- The batch operation should be broken into smaller components if possible, and
-- you should consider performing Log backups at various intervals throughout the
-- batch operation so as to further minimize the potential work loss exposure.
-- NOTE: You can only back up the log after a batch TRANSACTION has completed.

-- Run through your processes to update all of the data via
-- automated mass modifications, scripts, batches, etc.
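-- As a hedged illustration only (ArchiveTable, SourceTable, Col1, Col2 and RecordDate are
-- placeholder names, not objects that exist in JunkDB), one chunk of the kind of batch
-- operation described above might look like this, with a log backup after the commit:
--
--     DECLARE @Cutoff datetime
--     SET @Cutoff = DATEADD(yy, -1, GETDATE())
--
--     BEGIN TRAN
--         INSERT INTO ArchiveTable (Col1, Col2, RecordDate)
--             SELECT Col1, Col2, RecordDate
--             FROM SourceTable
--             WHERE RecordDate < @Cutoff
--
--         DELETE SourceTable
--             WHERE RecordDate < @Cutoff
--     COMMIT TRAN
--
--     BACKUP LOG JunkDB TO DISK = 'c:\Program Files\Microsoft SQL Server\mssql\backup\JunkDBBackup.bak'
--         WITH NOINIT, STATS = 10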

----------------------------------------------------------------------------------------------------------------------
-- 4) Change the Recovery Model back to Full
----------------------------------------------------------------------------------------------------------------------

-- Again, may want to programmatically check the status before setting.
IF DATABASEPROPERTYEX('JunkDB', 'Recovery') = 'BULK_LOGGED'
    ALTER DATABASE JunkDB
        SET RECOVERY FULL, MULTI_USER
        -- Or, if you also changed the state, use the following
        -- WITHOUT the MULTI_USER setting:
        --     SET RECOVERY FULL, READ_ONLY
        -- OR if both settings are needed:
        --     SET RECOVERY FULL, MULTI_USER, READ_ONLY

go

---------------------------------------------------------------------------------------------------------------------
-- 5) Perform a Log Backup (remember this might be VERY large)
----------------------------------------------------------------------------------------------------------------------
BACKUP LOG JunkDB TO DISK = 'c:\Program Files\Microsoft SQL Server\mssql\backup\JunkDBBackup.bak'
    WITH NOINIT, STATS = 10
0
 
LVL 11

Expert Comment

by:regbes
ID: 17932162
schlepuetz,

This example uses bulk-logged; for your purposes you may want to use simple recovery instead.
0
 
LVL 29

Expert Comment

by:Nightman
ID: 17932217
I was actually at a seminar in Cape Town with Kimberly Tripp 2 weeks ago, and we discussed this in detail. Operationally, bulk logged isn't actually recommended due to the risks of:
a. FORGETTING TO CHANGE IT BACK - you have no idea how often this happens (and once you're here, there is no tail of the log to prevent data loss in a recovery scenario!)
b. Notice the WITH ROLLBACK AFTER 10 statement? Imagine you have a long running transaction that gets rolled back because of this. Not pretty.

Unless you are absolutely 100% sure of your operating patterns and can pick a time when you are sure there is NOBODY accessing the database, AND you are 100% confident that you will not leave it in a bulk logged state, don't do it.
0
 
LVL 11

Expert Comment

by:regbes
ID: 17932390
Nightman,

I think I was at the same one in JHB :)

I agree with your concerns, but if the above is implemented correctly, with attention paid to those concerns, the objective of minimal logging can be achieved.

The other option I use is to set up some sort of iterative statement / cursor that breaks the multi-million-row delete down into 500k chunks, with a transaction log dump/backup between each one.

This will keep the log size under control, but has the side effect of many more tran backups.
0
 
LVL 29

Expert Comment

by:Nightman
ID: 17932477
Hey, a fellow South African! Howzit!

Like I said, the alternative would be smaller transactions, CHECKPOINT, backup the log. Repeat until done. Yes, at the cost of more tran backups, but no operational risk. Disk space is cheap (in context), data loss and/or downtime isn't.
0
 
LVL 11

Expert Comment

by:regbes
ID: 17932594
It's all lekker here!

Oops, I forgot what you said up front and then rehashed it.

Seems we are saying the same thing!
0
 
LVL 29

Expert Comment

by:Nightman
ID: 17932644
schlepuetz, the bottom line is that there is a way, but we don't recommend it due to the operational risks involved. But it's your call.
0
 
LVL 3

Author Comment

by:schlepuetz
ID: 17932734
So I would need to do the transfer in smaller chunks, and since the log files are backed up every hour, I could create a job that transfers a chunk once an hour. That would keep my log file from getting too large, correct?
0
 
LVL 29

Accepted Solution

by:
Nightman earned 250 total points
ID: 17932766
Yes. Alternatively you could do it in small chunks and explicitly CHECKPOINT and then backup the log between chunks. This would probably be more efficient.
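To make that concrete, here is a rough sketch of the kind of loop I mean. The object names (SourceTable, ArchiveTable, RecordID, Col1, RecordDate) and the backup path are placeholders rather than your actual schema, so treat it as pseudo code to adapt. SQL 2000 has no DELETE TOP, so SET ROWCOUNT is used to cap the rows touched per pass:

-- Archive rows older than a year in 50,000-row chunks,
-- backing up the log between chunks to keep log growth small.
SET ROWCOUNT 50000          -- caps INSERT/DELETE to 50,000 rows per statement (SQL 2000)

DECLARE @Cutoff datetime
SET @Cutoff = DATEADD(yy, -1, GETDATE())

WHILE 1 = 1
BEGIN
    BEGIN TRAN

    -- Copy the next chunk of old rows into the archive
    -- (RecordID is assumed to be the primary key, indexed in both tables)
    INSERT INTO ArchiveTable (RecordID, Col1, RecordDate)
        SELECT s.RecordID, s.Col1, s.RecordDate
        FROM SourceTable s
        WHERE s.RecordDate < @Cutoff
          AND NOT EXISTS (SELECT 1 FROM ArchiveTable a WHERE a.RecordID = s.RecordID)

    -- Remove only rows that are now safely in the archive
    DELETE s
    FROM SourceTable s
        JOIN ArchiveTable a ON a.RecordID = s.RecordID
    WHERE s.RecordDate < @Cutoff

    IF @@ROWCOUNT = 0
    BEGIN
        COMMIT TRAN
        BREAK               -- nothing left to move
    END

    COMMIT TRAN

    CHECKPOINT
    BACKUP LOG MyDatabase TO DISK = 'c:\backup\MyDatabase_log.bak'
        WITH NOINIT, STATS = 10
END

SET ROWCOUNT 0              -- always reset, or it affects later statements on this connection

For your monthly/hourly job the same loop works as-is; on most runs it would simply find fewer rows (or none) to move.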
0
 
LVL 3

Author Comment

by:schlepuetz
ID: 17932783
Thank you both very much!
0
