Link to home
Start Free TrialLog in
Avatar of Morpheus7
Morpheus7

asked on

Track data changes in SQL Server 2005 for incemental load into a Data warehouse.

Hi,
I am currently developing a data warehouse. Unfortunately some of the source data is located on SQL server 2005 instances.  For the smaller tables I am using the lookup and conditional split tasks in SSIS. Larger tables have in excess of 10 million rows, so this process is not suitable. Is there a way I can track changes in 2005 so as to only upload the new and updated records.

Any guidance would be appreciated. Thanks.
Avatar of Scott Pletcher
Scott Pletcher
Flag of United States of America image

You can use timestamp column in each table.  The timestamp value is an incrementing number, so only rows with a higher-value timestamp will have changed since the last time  you pulled data.

The only issue might be if the instance restarts.  I'm not sure how SQL handles that with regard to timestamp value.
ASKER CERTIFIED SOLUTION
Avatar of Pawan Kumar
Pawan Kumar
Flag of India image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Morpheus7
Morpheus7

ASKER

Hi,
Thank you for the replies. I think it will have to be option 1. The warehouse and staging dbs are on SQL Server 2016, but some of the source databases are on SQL Server 2005. So the CDC and Change Tracking options are not available. Any further help would be appreciated.
Thanks
So are you willing to add a new column called ETLLoadDateTime in all your tables? You need an alter statement?
Hi,

Yes, no problem. It seems that my options are limited with 2005.

Thanks
Hi,

Any guidance on how to implement this would be greatly appreciated.

Many thanks
Hi, how would the updates to existing records be detected as opposed to inserts?

Thanks
How many tables do you have in MSSQL 2005 that need to be imported into the DW?
Does some of them with a date column? If so, can you relate the others that doesn't have a date with the ones that have it?
Don't forget that creating a timestamp column will make you need to update the tables and depending on the size of the tables you may create locks for a long period of time, so you need to perform this when the database is not being used.