RAFUser
asked on
Using the same data in multiple Data Flow Tasks
I have a situation where we need to read data from multiple source files and merge it all together to form one destination file. Since the existence of any of those flat files can be flaky, we want to have several Data Flow Tasks - one for each file to load it to memory, and one to actually do the merging. This will allow us to throw an error and determine which specific file wasn't available. However, we can't find the best way to "transfer" data from one data flow task to another.
Our first attempt was to load the source file to a Recordset destination, assign it to a variable, and use it in the final Data Flow Task, but we found out that there aren't Recordset sources. After doing research, we found that we may want to load the source file to a Raw File Destination, but this appears to actually create a file on disk. I'm not so sure we would gain much from using this approach.
This doesn't seem like a request that is out of the ordinary. How do other people use the same data in multiple Data Flow Tasks?
Our first attempt was to load the source file to a Recordset destination, assign it to a variable, and use it in the final Data Flow Task, but we found out that there aren't Recordset sources. After doing research, we found that we may want to load the source file to a Raw File Destination, but this appears to actually create a file on disk. I'm not so sure we would gain much from using this approach.
This doesn't seem like a request that is out of the ordinary. How do other people use the same data in multiple Data Flow Tasks?
ASKER
I am not using a temp table in the database because we aren't using the database at all for this. We are importing from several files and exporting to another file.
We set up our Recordset to save to an object variable... now, how can I access that from a different Data Flow Task? There is no Recordset Source.
We set up our Recordset to save to an object variable... now, how can I access that from a different Data Flow Task? There is no Recordset Source.
You dont have recordset source... you are correct... but you can add a for each loop container to get the values from recordset and save each time execute the loop in a SSIS Variable...
But you need to have a source inside dataflow... and you need for several dataflows...
Soo... I suggest to create a temp table to do your job as HoggZilla also suggest.
regards!
Pedro
But you need to have a source inside dataflow... and you need for several dataflows...
Soo... I suggest to create a temp table to do your job as HoggZilla also suggest.
regards!
Pedro
ok... the schema is the same in all source files?!
ASKER
No, there are 5 different source files with 5 different structures. I wanted to have 5 different Data Flow Tasks load each individual file to an in-memory recordset, and then have a 6th Data Flow Task perform several Merge Joins on the in-memory recordsets.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
and why you dont have the 5 different sources in the same dataflow and make all the transformations to insert in the destination?!
ASKER
Pedro, read the first paragraph in the original message:
"Since the existence of any of those flat files can be flaky, we want to have several Data Flow Tasks - one for each file to load it to memory, and one to actually do the merging. This will allow us to throw an error and determine which specific file wasn't available. However, we can't find the best way to "transfer" data from one data flow task to another."
"Since the existence of any of those flat files can be flaky, we want to have several Data Flow Tasks - one for each file to load it to memory, and one to actually do the merging. This will allow us to throw an error and determine which specific file wasn't available. However, we can't find the best way to "transfer" data from one data flow task to another."
ASKER
Hogg, I will probably just end up using temp tables.
"This will allow us to throw an error and determine which specific file wasn't available"
You can control it inside one dataflow... using the red line output (error)...
I'm asking it, because you have several advantages to use it in the same dataflow... and I'm here for free to help you try to achive your goals!! So, let me know if you need my help and an SSIS package example!
Regards!
pedro
www.pedrocgd.blogspot.com
Is there a valid reason not to use a temp table in the database? Sure would make it easy. Load it, Write from it, Truncate it?