Link to home
Start Free TrialLog in
Avatar of Marcus Aurelius
Marcus AureliusFlag for United States of America

asked on

SSIS Standard Template for Reuse by Business Units

Experts,

I have a requirement to create a "Standard SSIS Template" that will contain data work-flows from 3 different datasource servers.

The idea is to create a single package that can be configured by use "Variables" and/or "Parameters" that can be used to manipulate the package Source and Target Table Fields so that it can be used across multiple Business units.

Again, I'm really trying to create a process template SSIS package that can be used across different business areas within our company and that can be dynamically modified using PAREMETER and/or VARIABLE input.

Can someone provide me with some PROS/CONS of attempting to do this?

Can this even be handled within SSIS?
Avatar of Jim Horn
Jim Horn
Flag of United States of America image

Your requirements are too vague to be actionable, as in my experience SSIS packages are highly customized based on the task at hand, and stating "Variables" and "Parameters" in double quotes, and manipulate source and target fields without stating what that manipulation is, just doesn't provide enough details.

Check out Top 10 Ways to Ask Better Questions for some tips on how to tailor your questions to encourage faster answers.
Avatar of Marcus Aurelius

ASKER

That is the requirement:

Create a package that contains:

Work-flows to move data from Source DB to Target DB Table1
4 Work-flows due to 4 Sources.
1 Target Table with up-to 50 fields that contain ALL fields from the 4 sources.
All Source and Target fields are to be "controlled" by PARAMETERS / VARIABLES.

Once complete the SSIS Package could be used for any NEW business unit data that requires to be integrated into this same TARGET TABLE1.
The change in sources and target fields are to be handled by PARAMETER / VARIABLE inputs and not modifications to the SSIS package other than
Parameter/Variable input should be needed....

MAKE SENSE>>>>?
>All Source and Target fields are to be "controlled" by PARAMETERS / VARIABLES.
Explain this in greater detail, and exactly who (what role) gave you these requirements.
When interacting with a database, SSIS needs to know the columns and data types of the columns it will be interacting with.  As such, you can't use variables or parameters as placeholders for dynamic columns or tables.
At best, you can use a script task that makes use of parameters / variables to build a dynamic query.

However...

SSIS is more than a tool to get / put data in databases.  There's usually some kind of transformation or manipulation that's done to the data.  THAT part can't reasonably be done in a templated manner.  You have no idea what data you have, so there's no way to predict what needs to be done with it.
Honestly, that is the question that I need "expert" input on.

Can SSIS handle the manipulation of "....ALL Source and Target fields to be controlled by PARAMETERS / VARIABLES..."...??

There are larger reasons behind why this is being requested that you may or may not understand and so i'm trying to offer minimal information to try and keep from confusing the base question.

The reason for the controls being the PARAMETERS / VARIABLES is that this is Phase 1 of a larger Data Warehouse ETL process. This process will entail me creating metadata from the SSIS package to track performance analytics and data threshold analytics on how our data is moved around across the data architecture. Phase 1 will focus on Business Unit 1, phase 2 will focus on perhaps Business Unit 2 and 3,...etc..etc..

We want the core SSIS development to focus on this type of heavy lifting of all configuration so that moving forward we should be able to manipulate the movement of "somewhat like data table/fields" merely by modifying input PARAMETERS / VARIABLES.

Any input / feedback no matter how small is greatly appreciated!
thanks
Still waiting on answers from you.

>Can SSIS handle the manipulation of "....ALL Source and Target fields
wtf is 'manipulation'?  Add new columns, change the data type of a column, exclude certain columns, write an expression to calculate the value of a column .. ?  SSIS can do all of these.  

>controlled by PARAMETERS / VARIABLES..."...??
wtf is 'controlled'?  Lookup value(s) from another table based on a variable, add a new column and make all values a variable, loop through a process with a BusinessUnit variable where the process behaves differently based on the variable, conditionally do any of these based on the variable value, you get the idea.

So, the knee-jerk, no details answer is 'Yes, SSIS can do those things'.   But to give a practical answer you need to define this better.

>so that it can be used across multiple Business units.
>Phase 1 will focus on Business Unit 1, phase 2 will focus on perhaps Business Unit 2 and 3,...etc..etc..
You'll have to figure out if the business unit processes are similar enough where this can be done in a loop executing the same process but with different variables in the loop for the different business units, or if separate SSIS packages need to be built for the different business units, or somewhere in between.
btw Give this a good read, as it'll answer a lot of basic ETL requirements questions and is good for a few laughs ...  Requirements Document Template for an ETL Project
>"somewhat like data table/fields"
To expand on Megan's point, SSIS requires a defined schema 'contract' between source and target in order to perform data flow tasks, and anything less then that such as 'somewhat like' will often throw mapping errors.   So if this is a large part of this requirement you'll need to define what 'somewhat like' is, and be prepared for an answer of 'You can't do that in SSIS or most ETL tools.'

I know a guy who managed to pull off dynamic table loads in SSIS, but it was a heavily scripted process that took months and is well beyond a single EE question.
So NOONE has used a UNION Query to pull in different fields from different sources INTO a single TABLE?

One idea is the use a STORED PROCEDURE that contains the SQL Script that uses a series of UNION ALL statements to pull in different numbers of columns per UNION...that results into a SINGLE TABLE.

If this is used, then PARAMETERS could potentially be used to CONTROL the input columns for the SOURCE to TARGET mappings.

THIS IS THE KIND OF THINKING THAT I WAS HOPING TO ILLICIT FROM YOU EXPERTS.

Any comments or additional input......????

:-/
ASKER CERTIFIED SOLUTION
Avatar of Jim Horn
Jim Horn
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
So, in essence, SSIS is more limited than SQL when it comes to stored procedures.  SSIS demands to know the schema of a result set returned by a stored procedure.  SSIS will not accept a result set from a stored procedure when the schema is not known prior to run time.

I'm sure someone, somewhere has worked around that constraint by creating the same schema in every possible result set -- but IMO, that's working against the notion of highly re-useable, maintainable code.
You could also build one giant table in SQL that can hold any of the possible outcomes from the stored procedure... but again... WHY do such a thing?  It smells of bad design.

A more plausible option would be to use a script task that builds and executes your dynamic query.  C# doesn't require a known result set schema.  You could then hold your result set in an SSIS object variable.  But then what?  In order to do any kind of SSIS transformation with that result set, you need to know what you have... so in that regard, it seems kinda pointless to use SSIS.  Just use a SQL stored procedure if your task does not involve any of the transformation features in SSIS.  Or build a C# app that has classes for each of your possible objects, exec your stored procedure and then instantiate the appropriate class for your result set.  Then move on with whatever else needs to be done to the data.

I guess my point is that it seems like you're trying to use the wrong tool to achieve your goal.
THANKS! I appreciate the feedback!