Optimizing record movement between linked servers

I'm in the process of implementing an archiving strategy, where data in certain tables that is older than 3 months is moved from the Primary server to the Archive server. In the code section you'll see the stored proc code for one of these tables.

I've been reading that with linked SQL servers, it is much faster to 'pull' data than to 'push' it. Obviously my first crack below was pushing data to the Archive server. I intend on re-writing it as an SP in the Archive server, and issuing a remote procedure call from the Primary.

Any suggestions on the following:
1. Each week there will be about 4M records to move... does it really matter if the loop is 10K, 100K, or 1M records? (Or do I need a loop at all once I've reached the steady-state.)
2. See anywhere I can make the code more efficient for using a linked server?
3. After this SP runs, I have another that has to pull back (unarchive) certain records that must remain on the primary. I should leave that SP as a call from the Primary, correct? (To pull).

Thanks!
SET XACT_ABORT ON 
SET NOCOUNT ON

declare @ArchiveDataOlderThan DateTime   
set @ArchiveDataOlderThan = Dateadd(m, -3, getdate())

create table #ids ([id] int not null) --Temp table to hold ID's for insert/delete in loops of 1M records

while (1=1) 
begin
    truncate table #ids
    
    insert into #ids
        select top 1000000 lOEEEventId as [id] from [BizWare].[BizWareUser].[oeeeventflexdata]
        where lOEEEventId in 
			(select [lOEEEventId] from [BizWare].[BizWareUser].[oeeevent] where tstart < @ArchiveDataOlderThan)
        order by lOEEEventId

    if (select count(*) from #ids) = 0 break --Exit the loop

	begin distributed tran
		insert into [Archive_Server].[BizWare_Archive].[BizWareUser].[oeeeventflexdata]
            select * from [BizWare].[BizWareUser].[oeeeventflexdata]
            where loeeeventid in (select [id] from #ids)
            order by loeeeventid

        delete from [BizWare].[BizWareUser].[oeeeventflexdata]
            where loeeeventid in (select [id] from #ids)
    commit tran
end
drop table #ids

Open in new window

jdallen75Asked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

jdallen75Author Commented:
Update: I've re-written the SP to run from the Archive server. The inserts into Bizware_Archive don't seem that bad, but the deletes on Primary_Server are painfully slow (3K/min). I'm wondering how I can change this without losing the transactional functionality - to ensure proper sync in case of any failures?
SET XACT_ABORT ON 
SET NOCOUNT ON

declare @ArchiveDataOlderThan DateTime   
set @ArchiveDataOlderThan = Dateadd(m, -3, getdate())   --older than 3 months

create table #ids ([id] int not null) --Temp table to hold ID's for insert/delete in loops of 1M records

while (1=1) 
begin
    truncate table #ids
    
    insert into #ids
        select top 1000000 lOEEEventId as [id] from [10.190.14.14].[BizWare].[BizWareUser].[oeeeventflexdata]
        where lOEEEventId in 
			(select [lOEEEventId] 
			from [10.190.14.14].[BizWare].[BizWareUser].[oeeevent] 
			where tstart < @ArchiveDataOlderThan)
        order by lOEEEventId

    if (select count(*) from #ids) = 0 break --Exit the loop

	begin distributed tran
		insert into [BizWare_Archive].[BizWareUser].[oeeeventflexdata]
            select * from [10.190.14.14].[BizWare].[BizWareUser].[oeeeventflexdata]
            where loeeeventid in (select [id] from #ids)
            order by loeeeventid

        delete from [Primary_Server].[BizWare].[BizWareUser].[oeeeventflexdata]
            where loeeeventid in (select [id] from #ids)
    commit tran
end
drop table #ids

Open in new window

David ToddSenior Database AdministratorCommented:
Hi,

Does the delete need to be physical?

That is, why not update a flag value - maybe this will be quicker than a distributed delete.

Then a separate process can physically delete these rows so marked.

HTH
  David
Raja Jegan RSQL Server DBA & Architect, EE Solution GuideCommented:
>> Each week there will be about 4M records to move... does it really matter if the loop is 10K, 100K, or 1M records? (Or do I need a loop at all once I've reached the steady-state.)

Copying 4M records via Linked Server is not definitely recommended as well as best practice too.
And use one of the below options:

1. Use SSIS packages to copy / move records from Source to Destination Server.
2. Use Transactional Replication to move records on a real time basis to your archive server and then delete records from the Source server alone. Make sure your DELETE action is not triggered while setting up replication.

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Angular Fundamentals

Learn the fundamentals of Angular 2, a JavaScript framework for developing dynamic single page applications.

jdallen75Author Commented:
Dtodd, the flag value isn't a bad idea, but the database is part of a commercial application, and I don't think I could safely change the existing tables. I question whether "update" has the same cost across a linked server as insert & delete.
jdallen75Author Commented:
rrjegan17, transactional replication is not necessary in our case - our database server has numerous functions (database, application server, web server, etc)... yes I know, terrible practise.

SSIS did not occur to me, but do you think it has much advantage over linked server transactions? Great idea though.
Raja Jegan RSQL Server DBA & Architect, EE Solution GuideCommented:
>> SSIS did not occur to me, but do you think it has much advantage over linked server transactions?

Check out Few comparisons listed here:

http://consultingblogs.emc.com/jamiethomson/archive/2006/03/14/SSIS_3A00_-Data-flows-or-T_2D00_SQL.aspx

Besides when dealing with huge data, SSIS can perform far more better than Stored procedures.
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server 2005

From novice to tech pro — start learning today.