Time to merge 42K records into 300M records recently increased significantly. Alternatives?

SQL Server 2008 R2

Migrating data from a production database into a reporting database every morning.  Because of the way the data is collected and updated in the production database, I must look back as much as 30 days to pull values for a SourceID and ProductionDate into the reporting database.  This 30 day recordset returns about 42000 records, almost instantaneously (this is the "source").

The "target" table contains several decades worth of data (over 300M records) indexed by the same SourceID and ProductionDate.

What I have been doing is merging the "source" data into the reporting database table, updating the records which already exist with data from the "source" and appending records which don't already exist.  I assumed this would be quicker than performing a delete operation and an insert, but lately the time it takes to run this simple merge operation has grown from about 40 minutes to over an 1:45:00 (give or take).  And I cannot figure out what caused this huge increase in time.

1.  Would it be quicker to perform a delete and than an append, than the merge
2.  What steps should I take to determine the cause of the increase in time?
3.  Instead of using the entire table in the reporting database as the target, can I somehow use a subset of the table as the target?  Something like:
MERGE SELECT * from tbl_Reports WHERE ProductionDate >= @FromDate as Target

Open in new window

or maybe
MERGE vw_Reports_Table_Last35days as Target

Open in new window

which would use an updateable view as the target?
LVL 51
Dale FyeOwner, Developing Solutions LLCAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Koen Van WielinkBusiness Intelligence SpecialistCommented:
Hi Dale,

What kind of method are you using to merge the data? Are you using the Merge function in MSSQL server, or are you using separate insert, (logical) delete and update statements? I guess from your question that you are not "stacking" your historical records? For example, when the attributes that belong to a certain SourceID have changed, do you then simply overwrite the data, or do you mark the existing records as deleted and do you insert a new, current record with the same sourceID? Do you have your tables properly indexed so that the records can easily be retrieved?
You have a few different options but it depends on your setup. Hence all the questions :p.
0
Dale FyeOwner, Developing Solutions LLCAuthor Commented:
Yes, I am using the MERGE function.

No, I'm not stacking the historical records, I'm simply overwriting the old values with new ones.

Yes, the tables are indexed on the fields being used in the MERGE statements JOIN
0
Koen Van WielinkBusiness Intelligence SpecialistCommented:
Hi Dale,

The merge function is relatively slow performance wise. If you're not keeping history, you might want to consider using a double Except statement which compares sets instead of records. The idea is as follows:
1. Store your source data in a temp table or CTE (let's call that A)
2. Store your target data in a temp table or CTE (which we'll call B). Columns must be the same as for table A
3. In a third CTE or temp table, store the result of "Select * from A Except Select * From B". Call this AminB.
4. Do the same in reverse: "Select * from B Except Select * from A". Call this BminA.
The table AminB now gives you all the records that exist in your source, but not yet in your target (i.e. the records you need to insert).
The table BminA contains the records that exist in your target, but not in your source (which I assume you wish to delete as you don't keep history).
It's then roughly:

delete from target where exists (select 1 from BminA where target.key = BminA.key)
And
insert into target select * from AminB

Because the Except operator uses the whole set to compare rather than per record it should perform much faster.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Determine the Perfect Price for Your IT Services

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden with our free interactive tool and use it to determine the right price for your IT services. Download your free eBook now!

Dale FyeOwner, Developing Solutions LLCAuthor Commented:
Getting an error:
Msg 208, Level 16, State 1, Line ##
Invalid object name 'Recent_Production'

with syntax that looks like:
WITH Recent_Production (FieldList)
AS 
(
SELECT FieldList
FROM ProductionTable
)

DELETE FROM tbl_Reports
WHERE Exists (SELECT 1 FROM Recent_Production  
WHERE Entity_ID = tbl_Reports.Entity_ID AND ProductionDate = tbl_Reports.docDate)

INSERT INTO tbl_Reports
SELECT * FROM Recent_Production

Open in new window

The error is pointing to the INSERT INTO tbl_Reports line

I've also tried:
INSERT INTO tbl_Reports(FieldList)
SELECT FieldList FROM Recent_Production

Open in new window


The CTE and the DELETE processes worked right but the insert process is not working properly.  Seems like it is dropping the CTE.  
I event remarked out the INSERT section and simply tried:
SELECT * FROM Recent_Production

Open in new window

and that gave me the same error
0
Dale FyeOwner, Developing Solutions LLCAuthor Commented:
Figured it out.

CTE's can only be used in one operation, so I replaced the CTE with a temp table.

Appreciate the recommendation as this is significantly faster than the MERGE operation!
0
Koen Van WielinkBusiness Intelligence SpecialistCommented:
Hi Dale, great to hear it worked. Apologies for the late reply. Your comment on the CTE being used in 1 operation is indeed correct.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.