jcneil4
asked on
Daily import using DTS- Only want new records.
I have a DTS package that imports several tables from an ODBC compilant DB into my SQL data warehouse. The rows are deleted in the destination tables before import.
A few of the tables are large (over 1,000,000 rows). These are history tables such as INVOICE_HISTORY_DETAILS. I want to speed up the daily import by only brining in data that doesen't exist in the destination table or at a minimum use a query to minimize the date range (I don't want to import the last 6 years of history every day).
Of course if I do import every row where inv_date > "some recent date" how do I make sure that only data that doesn't exist is appended to the table?
Thanks
A few of the tables are large (over 1,000,000 rows). These are history tables such as INVOICE_HISTORY_DETAILS. I want to speed up the daily import by only brining in data that doesen't exist in the destination table or at a minimum use a query to minimize the date range (I don't want to import the last 6 years of history every day).
Of course if I do import every row where inv_date > "some recent date" how do I make sure that only data that doesn't exist is appended to the table?
Thanks
ASKER
I'm sorry, I should have mentioned that I'm a SQL newbie....
I created a table with one column called start_date, data type is smalldatetime, then I entered a recent date as the first row. Is this what you meant by
"Create a 'control table' that contains a single field holding the beginning date of the range that I want to import"?
Assuming that is what you meant then on to the next step...
I have a DTS that deletes data from destination (SQL) table then imports entire table from source- it does this for about 12 tables. I only want to use the date filter on 3 of these tables- can you roughly describe how to can I modify my existing DTS to:
1. Loads this start date into a global variable
2. Deletes all records in the destination table that exist after this date
3. Transfers data with a data pump which contains a select statement which only selects records after this date.
Thanks a lot..
I created a table with one column called start_date, data type is smalldatetime, then I entered a recent date as the first row. Is this what you meant by
"Create a 'control table' that contains a single field holding the beginning date of the range that I want to import"?
Assuming that is what you meant then on to the next step...
I have a DTS that deletes data from destination (SQL) table then imports entire table from source- it does this for about 12 tables. I only want to use the date filter on 3 of these tables- can you roughly describe how to can I modify my existing DTS to:
1. Loads this start date into a global variable
2. Deletes all records in the destination table that exist after this date
3. Transfers data with a data pump which contains a select statement which only selects records after this date.
Thanks a lot..
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Everything worked great- Now my problem is that I am trying to use the GV in the Transform Data Task to only retrieve data after the date from the source DB with something like:
SELECT ARN_InvHistoryHeader.*
FROM ARN_InvHistoryHeader
WHERE ARN_InvHistoryHeader.invoi ceDate > ?
And when I click on Parameters to select the GV I am getting the error "the SQL statement does not contain any parameters." I've tried "?", '?', (?), etc... any ideas?
SELECT ARN_InvHistoryHeader.*
FROM ARN_InvHistoryHeader
WHERE ARN_InvHistoryHeader.invoi
And when I click on Parameters to select the GV I am getting the error "the SQL statement does not contain any parameters." I've tried "?", '?', (?), etc... any ideas?
hmmm this has worked for me but only in Sybase and SQL Server databases. What type of source database do you have?
What happens if you:
A. Remove the where clause and hit 'Parse'
B. Add the where clause with a constant and hit parse?
C. Add the where clause with your parameter placeholder and hit parse?
Here is a link to a workaround method if you can't get it to work. This is a good site generally.
http://www.sqldts.com/default.aspx?205
If you don't want to use a global variable you can also use native date functions for example:
SELECT ARN_InvHistoryHeader.*
FROM ARN_InvHistoryHeader
WHERE ARN_InvHistoryHeader.invoi ceDate > DATEADD(dd,-10,GETDATE())
Will get all records from ten days ago. The only thing wrong with that is its stuck on ten days - its difficult to reload history if you need to.
What happens if you:
A. Remove the where clause and hit 'Parse'
B. Add the where clause with a constant and hit parse?
C. Add the where clause with your parameter placeholder and hit parse?
Here is a link to a workaround method if you can't get it to work. This is a good site generally.
http://www.sqldts.com/default.aspx?205
If you don't want to use a global variable you can also use native date functions for example:
SELECT ARN_InvHistoryHeader.*
FROM ARN_InvHistoryHeader
WHERE ARN_InvHistoryHeader.invoi
Will get all records from ten days ago. The only thing wrong with that is its stuck on ten days - its difficult to reload history if you need to.
ASKER
The source database is Sage MAS200- providex 32bit ODBC driver
What happens if you:
A. Remove the where clause and hit 'Parse'-
it is successful.
B. Add the where clause with a constant and hit parse?
It works only in this format: WHERE ARN_InvHistoryHeader."Invo iceDate" > {d '2005-07-26'}
C. Add the where clause with your parameter placeholder and hit parse?
I'm not sure what you mean by this...
What happens if you:
A. Remove the where clause and hit 'Parse'-
it is successful.
B. Add the where clause with a constant and hit parse?
It works only in this format: WHERE ARN_InvHistoryHeader."Invo
C. Add the where clause with your parameter placeholder and hit parse?
I'm not sure what you mean by this...
Try this
SELECT ARN_InvHistoryHeader.*
FROM ARN_InvHistoryHeader
WHERE ARN_InvHistoryHeader.invoi ceDate > {d ? }
Its possible that the ODBC driver doesn't support parameters. Do you have any doco with it?
SELECT ARN_InvHistoryHeader.*
FROM ARN_InvHistoryHeader
WHERE ARN_InvHistoryHeader.invoi
Its possible that the ODBC driver doesn't support parameters. Do you have any doco with it?
ASKER
Failed.
It doesn't have any documentation.. the ODBC driver is a really old version but we are forced to use it because of our version of the database. I'm sure that's what is wrong (no support of parameters). If I ever get to figure out how to support parameters, I'll be able to get it working based on what you've told me so far.
Thanks for the help.. I'll close and award you the points.
It doesn't have any documentation.. the ODBC driver is a really old version but we are forced to use it because of our version of the database. I'm sure that's what is wrong (no support of parameters). If I ever get to figure out how to support parameters, I'll be able to get it working based on what you've told me so far.
Thanks for the help.. I'll close and award you the points.
Can you create a view in the source database? That view could have a hard coded two month window or something?
You still won't be able to use GV's but at least you could reduce the data set
You still won't be able to use GV's but at least you could reduce the data set
ASKER
No- I cant. I'm just going to have to import the whole table.. I'll live. Thanks again.
Actually there is one more thing you could do.... you could dynamically assign the SQL in the source query everytime.
This link:
http://www.sqldts.com/default.aspx?213
Shows roughly how to do it, except you want to use the SourceSQLStatement property instead of the SourceObjectName property.
This link:
http://www.sqldts.com/default.aspx?213
Shows roughly how to do it, except you want to use the SourceSQLStatement property instead of the SourceObjectName property.
1. Create a 'control table' that contains a single field holding the beginning date of the range that I want to import
2. Create a DTS which
1. Loads this start date into a global variable
2. Deletes all records in the destination table that exist after this date
3. Transfers data with a data pump which contains a select statement which only selects records after this date.
Every day the date in the control table gets shuffled along one day. You can also manually change it to reload data if required.
>> how do I make sure that only data that doesn't exist is appended to the table
Unless you have a unique key which you can use to match then then you can't.
You need to investigate your source system and see if you can rely on this date method to correctly load data.
i.e. do people back date their transactions?
You can reduce this risk by making the load date always a couple of months behind.