• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 498
  • Last Modified:

Daily import using DTS- Only want new records.

I have a DTS package that imports several tables from an ODBC compilant DB into my SQL data warehouse.  The rows are deleted in the destination tables before import.
 
A few of the tables are large (over 1,000,000 rows).  These are history tables such as INVOICE_HISTORY_DETAILS.  I want to speed up the daily import by only brining in data that doesen't exist in the destination table or at a minimum use a query to minimize the date range (I don't want to import the last 6 years of history every day).  

Of course if I do import every row where inv_date > "some recent date" how do I make sure that only data that doesn't exist is appended to the table?

Thanks
0
jcneil4
Asked:
jcneil4
  • 6
  • 5
1 Solution
 
nmcdermaidCommented:
This is what I do:

1. Create a 'control table' that contains a single field holding the beginning date of the range that I want to import
2. Create a DTS which
   1. Loads this start date into a global variable
   2. Deletes all records in the destination table that exist after this date
   3. Transfers data with a data pump which contains a select statement which only selects records after this date.

Every day the date in the control table gets shuffled along one day. You can also manually change it to reload data if required.


>> how do I make sure that only data that doesn't exist is appended to the table

Unless you have a unique key which you can use to match then then you can't.

You need to investigate your source system and see if you can rely on this date method to correctly load data.

i.e. do people back date their transactions?

You can reduce this risk by making the load date always a couple of months behind.
0
 
jcneil4Author Commented:
I'm sorry, I should have mentioned that I'm a SQL newbie....
 
I created a table with one column called start_date, data type is smalldatetime, then I entered a recent date as the first row. Is this what you meant by
"Create a 'control table' that contains a single field holding the beginning date of the range that I want to import"?

Assuming that is what you meant then on to the next step...

I have a DTS that deletes data from destination (SQL) table then imports entire table from source- it does this for about 12 tables.  I only want to use the date filter on 3 of these tables- can you roughly describe how to can I modify my existing DTS to:
 
   1. Loads this start date into a global variable
   2. Deletes all records in the destination table that exist after this date
   3. Transfers data with a data pump which contains a select statement which only selects records after this date.

Thanks a lot..
0
 
nmcdermaidCommented:
Yep thats what I meant by a control table.

To create your DTS:

1. Go to Data Transformation Sevices, Local Packages, Right click and create a new package
2. Press Connection then MS OLE DB connection (the first one). Enter detail to connect to the destination database, give it a decent name and press OK
3. Create your globl variable: Ensure the connection you just created is NOT selected then press Package/Properties. Press the global variables tab. In the first row here type your globl variable name i.e. gStartDate, pick a data type of Date, and type 1900-01-01 in the value column and press OK.
4. Now load the value from the table into the global variable: Press Task then Execute SQL Task. In here type SELECT YourField FROM YourControlTable. Press Parse query to ensure its valid. Press the parameters button. Press the Output Parameters tab. Select Row Value. You'll see your table field name below. In the output global variable column select your global variable. Press OK

Now test it:

1. Press Packages/Properties go to the global variables tab and note that the value of your GV is 1900-01-01
2. Right click in the Execute SQL task (yellow cylinder) and press Execute
3. Go back the GV viewer and note that it has loaded in your date value. Hooray!!


Unfortunately I have to run now but I will post back in a few hours. In the meantime see if you can work out how to use an execute SQL task to delete data from your destination table based on the GV.

Basically you want to drop a new Execute SQL task on the page then put some SQL like this in there:

DELETE YourTable WHERE StartDate > ?


the ? is a placeholder for the GV and this time should allow you to fill in details on the input parameters tab.

You may want to do this on a test system in case it deletes the wrong stuff!!


0
Prepare for your VMware VCP6-DCV exam.

Josh Coen and Jason Langer have prepared the latest edition of VCP study guide. Both authors have been working in the IT field for more than a decade, and both hold VMware certifications. This 163-page guide covers all 10 of the exam blueprint sections.

 
jcneil4Author Commented:
Everything worked great- Now my problem is that I am trying to use the GV in the Transform Data Task to only retrieve data after the date from the source DB with something like:

SELECT ARN_InvHistoryHeader.*
FROM ARN_InvHistoryHeader
WHERE ARN_InvHistoryHeader.invoiceDate > ?

And when I click on Parameters to select the GV I am getting the error "the SQL statement does not contain any parameters."  I've tried "?", '?', (?), etc... any ideas?
0
 
nmcdermaidCommented:
hmmm this has worked for me but only in Sybase and SQL Server databases. What type of source database do you have?

What happens if you:

A. Remove the where clause and hit 'Parse'
B. Add the where clause with a constant and hit parse?
C. Add the where clause with your parameter placeholder and hit parse?

Here is a link to a workaround method if you can't get it to work. This is a good site generally.

http://www.sqldts.com/default.aspx?205


If you don't want to use a global variable you can also use native date functions for example:

SELECT ARN_InvHistoryHeader.*
FROM ARN_InvHistoryHeader
WHERE ARN_InvHistoryHeader.invoiceDate > DATEADD(dd,-10,GETDATE())

Will get all records from ten days ago. The only thing wrong with that is its stuck on ten days - its difficult to reload history if you need to.


0
 
jcneil4Author Commented:
The source database is Sage MAS200- providex 32bit ODBC driver

What happens if you:

A. Remove the where clause and hit 'Parse'-
it is successful.

B. Add the where clause with a constant and hit parse?
It works only in this format: WHERE ARN_InvHistoryHeader."InvoiceDate" > {d '2005-07-26'}

C. Add the where clause with your parameter placeholder and hit parse?
I'm not sure what you mean by this...
0
 
nmcdermaidCommented:
Try this

SELECT ARN_InvHistoryHeader.*
FROM ARN_InvHistoryHeader
WHERE ARN_InvHistoryHeader.invoiceDate > {d ? }


Its possible that the ODBC driver doesn't support parameters. Do you have any doco with it?




0
 
jcneil4Author Commented:
Failed.  

It doesn't have any documentation.. the ODBC driver is a really old version but we are forced to use it because of our version of the database.  I'm sure that's what is wrong (no support of parameters).  If I ever get to figure out how to support parameters, I'll be able to get it working based on what you've told me so far.  

Thanks for the help.. I'll close and award you the points.

0
 
nmcdermaidCommented:
Can you create a view in the source database? That view could have a hard coded two month window or something?

You still won't be able to use GV's but at least you could reduce the data set
0
 
jcneil4Author Commented:
No- I cant.  I'm just going to have to import the whole table.. I'll live.  Thanks again.
0
 
nmcdermaidCommented:
Actually there is one more thing you could do.... you could dynamically assign the SQL in the source query everytime.

This link:

http://www.sqldts.com/default.aspx?213


Shows roughly how to do it, except you want to use the SourceSQLStatement property instead of the SourceObjectName property.
0

Featured Post

What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

  • 6
  • 5
Tackle projects and never again get stuck behind a technical roadblock.
Join Now