Link to home
Start Free TrialLog in
Avatar of ssebring
ssebring

asked on

SSIS Dynamically load tables with different schemas (DB2 to SQL Server)

I've set up a SSIS control flow that selects the following columns from a table into a system.object:  db2SrcQuery, destTblName

A loop then maps variables via the Foreach ADO Enumerator:
User::db2SrcQuery / User::destTblName

Inside the loop a data flow task connects to DB2 and executes the User::db2SrcQuery's select statement's result into the destination table (User::destTblName).

All works great except for the big showstopper of table schema changes.

Found a blog with a nice solution for this except that it uses OpenQuery -- something the client stands firm on not using.  I need to stay with connection managers for the db2 extraction - also, they don't want files so bulk insert is not an option.

So...

Are there any great suggestions out there on how to do this in SSIS?  

1. Foreach ADO Enumerator collection / variable mapping of sql query & dest table name
    a. connect to db2 via connection manager and exec sql variable command
    b. insert result into destination table (via table name variable) -- table schemas differ

Thank you!
Avatar of 8080_Diver
8080_Diver
Flag of United States of America image

If you have3 a known set of DB2 tables to process (which I would definitely hope you do ;-), then you could create a DB2ImportCotrols table that has, among other pertinent pieces of controling information, the connection information (especially if you have to hit various LIBs in teh DB2 world), the DB2 source file name, and the target table name (including Schema).  Then, you could cycle through this control table to do the processing.
By having the connection information, you could compare the current connection string to the one from the table each time you read a new control row and only change the connection if the strings are different.  The control table would also provide the full schame.filename information for the SQL Server side.  In fact, you could even go so far as to include the SQL Server connection manager's connection string so that, should the need arise, you could handle multiple SQL Server database connections. ;-)
I do this with a process I set up and I include the last datetime that a particular control row's processing was successful.
Avatar of ssebring
ssebring

ASKER

Sorry, but I understood little of what you were describing.

1. Yes, the DB2 tables are known.

2. Regarding creating a db2 control table: at this time, the dev team does not have db2 create table rights but it's possible to request it if the solution can be proven worthy enough.

3. Not sure what you mean about hitting various LIBS in the DB2 world.  What we need to do is a db2 select query from several joined tables and output the result straight into sql server.  The client doesn't want to use files & I'm not sure what you mean about providing a full schema.filename -- do you mean a table definition?


Please clarify and thank you for your reply!
1. Yes, the DB2 tables are known.
Great!

2. Regarding creating a db2 control table: at this time, the dev team does not have db2 create table rights but it's possible to request it if the solution can be proven worthy enough.
I was not talking about creating a table in the DB2 world but, rather, in the SQL Server world.  Since you are using SSIS, I rather assumed that you were working from the SQL Server side of teh equation.

3. Not sure what you mean about hitting various LIBS in the DB2 world.  
Ask your DB2 expert what  LIB is and that pperson doesn't know, then you had best find someone who does! ;-)  A LIB is a means by which databases are more or less partitioned in DB2.

What we need to do is a db2 select query from several joined tables and output the result straight into sql server.  The client doesn't want to use files & I'm not sure what you mean about providing a full schema.filename -- do you mean a table definition?
DB2 databases are not physically constructed in the same way that a SQL Server database is.  DB2 Databases are actually files that contain PK and FK logical links . . . and that is where my reference to files comes in.  
Instead of "full schema.filename" I probably wshould have said "full schema.tablename"  (when I am dealing with DB2 related questions, I tend to blur the distinction between a table and a file because DB2 developers treat them as the same thing. ;-).
So, what I am saying is that I would create a SQL SERVER table that would contain:
  • TableProcessingID INT Identity (1-1);
  • TargetTableName VarChar(200) -- the 4-part name for the SQL Server table to receive the data;
  • SQLServerConnectionString VarChar(400) -- whatever connection string is required to access the SQL Server database that contains the table;
  • DB2ConnectionString VarChar(400) -- whatever connection string is required to access the DB2 database that contains the table;
  • SQLStatement VarChar(8000) -- the DB2 SQL statement to extract whatever data from the DB2 world and insert it into the SQL Server table;
  • LastProcessed DateTime -- the datetime of the last successful processing of this data.
Given that information, your SSIS package should be able to use a query to pull the data from that control table, perform a FOR EACH that uses the data from that table to set all the connections and everything else that you need to perform the ETL process.
thank you for your time in another reply.

Your last description seems to be almost exact to what I already have setup and described in my initial posted question.  

I've got a loop iterating on each var containing the db2 query string -- the problem I face is that when a table def changes, ssis doesn't dynamically reset the column mappings in the data flow task, so it fails.  

What am I missing in your description that would make this work?  

My current task flow is:

1. select target table name and db2 query from sql server control table into vars
2. inside loop:
         data flow task
            - ole db db2 source - Data access mode: sql command from variable
            - ole db sql server destination - Data access mode Table name or view name variable
                 (here is where the column mappings happen on the initial setup, on next iteration the
                   column mapping does not dynamically get reset by ssis, hence the fail if the table columns
                   are different)

Thanks for yet another round of this question.
Well, essentially, I was figuring on using a single Control Flow component (Execute SQL Task) to handle the execution of the INSERT INTO ( . . . ) SELECT . . . .  FROM table WHERE constraints SQL statement.  That effectively by-passes the use of the components that yoyu are using in the dataflow.  
An Execute SQL Task can use a variable (the SQLStatement from my control table), among other options, for providing the SQL statement to execute.
that would have been great, something I thought more than once about in fact, but the statement would have to use an openquery/linkedserver which the client is against.  Unless I'm missing something here?
ASKER CERTIFIED SOLUTION
Avatar of 8080_Diver
8080_Diver
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
thank you for your comments!