Solved

Dynamic split of data in SSIS 2008

Posted on 2011-03-17
10
1,103 Views
Last Modified: 2012-05-11
Hello experts,

Looking for suggestions on the best way to handle this situation. I have multiple SSIS packages for processing financial transactions. The first process has a file watcher on an FTP folder that sees a file hit and imports the transactions into a staging table for later processing. this process is event driven so it can happen multiple times in a day as files get deposited.

The second process is manually triggered by a user once conditions in our system have been met and transactions are ready to be processed. When the user triggers this process I need to grab all the transactions that have not been processed from the staging table and transform them and insert into other tables for a third party application to work with.

My problem lies here, when I grab the transactions to be processed, I need to divide them into like batches with like dates. So all buys for March 3rd in one batch, buys for March 4 in another, sells in another, etc. It doesn't appear that a conditional split will work because of the expressions needing to be hard coded. We have over 460 different transaction codes and manually creating a task process for each is not something I look forward to.

So, how would you suggest I use SSIS to take this single recordset from the staging table and divide it into other datasets each representing a group with a common date and transaction code for the entire set to be processed? I am thinking the only way I may be able to do this is with a script task, but can I dynamically create variables to hold the different datasets? Would I have a static Object variable to hold an array of dataset names and iterate through that list to call all the dynamic datasets I need to process?

Thanks...
0
Comment
Question by:Hooznext
  • 6
  • 3
10 Comments
 
LVL 37

Expert Comment

by:ValentinoV
ID: 35163948
Could you explain why you need that data to get processed in those different batches as you've described? (seems a bit weird)
0
 
LVL 1

Author Comment

by:Hooznext
ID: 35164427
The third party application that will ultimately process these requires the transactions to be grouped into batches. Our operations department requires like transactions for each date to be grouped, so for each day there is a batch of buy transactions, a batch of sells, etc.

The third party app is a bit difficult to work with as there is no supported API, I am on my own. It would be easier for me to have a single dataset that would represent what goes into the batch so I can create the batch in the third party app, get its ID and write the transactions into the table with that batch ID, then move on to the next dataset and repeat.
0
 
LVL 22

Expert Comment

by:8080_Diver
ID: 35165718
First pull the transaction codes/dates/whatever that identify the batches into an SSIS (object type) variable so that you can determine the sets to be processed.
 
Then use that information to drive a For Each Loop Container that processes a batch;
0
 
LVL 1

Author Comment

by:Hooznext
ID: 35166005
8080 Diver,

That is sort of what I was thinking, use an Execute SQL Task to create an array of file dates, a for each loop through the dates that would pull the transaction codes for that date, then iterate through the codes and process the transactions. So it would be a nested loop processing each transaction.

I was wondering if there is a more efficient way to do this than to hit the database several times and use nested loops. If I could grab all the transactions at once, then get it all split into the needed datasets and dump them back. I guess I am looking for someone to help me look outside the box I am accustomed to and show me a more efficient way to do this.
0
 
LVL 1

Assisted Solution

by:Hooznext
Hooznext earned 0 total points
ID: 35201902
OK guys, here's my solution...let me know if anyone knows a better way to do it.

I created a stored proceedure that returns all the transactions that are not processed yet, I added two columns to the dataset 'TranGroup' and 'TranGroupTotal'. The first tells me what group the transaction belongs to, the second is the total number of groups in the recordset.

In control flow I have an Execute SQL Task to call the proc and assign the results to an Object variable. Then I go into a script task that assigns the TranGroupTotal to a Int variable in the package. That leads to a For Loop Container that loops from 1 to TranGroupTotal, inside the loop is a script task that takes the raw transactions and selects the transactions for group X of the loop and dumps them into another Object variable. Once I have the group of transactions into a dataset I go to a Data Flow and have a Script Component that uses the dataset as a source.

From there I do a normal transformation and have my transactions inserted into the database and back to the loop for the next group of transactions. Not exactly the way I thought I was going to do it, but it does prevent a bunch of calls to the database and endless nested loops as well.

I'm sure there is a better way to do this, but I don't know of it, if anyone has advice I am open to it as projects like this make me learn new things which is why I have a love/hate relationship with this stuff.
0
Backup Your Microsoft Windows Server®

Backup all your Microsoft Windows Server – on-premises, in remote locations, in private and hybrid clouds. Your entire Windows Server will be backed up in one easy step with patented, block-level disk imaging. We achieve RTOs (recovery time objectives) as low as 15 seconds.

 
LVL 22

Expert Comment

by:8080_Diver
ID: 35208441
Hooznet has basically taken my suggestion, replaced the initial SQL Query to pull the list to process with a call to a Stored Procedure and then used my suggested For Each Loop Container approach to drive the process that, because of HoozNet's additional knowledge about the data, he has then fleshed out in more detail.

Seems to me that I should at least get some pints. ;-)
0
 
LVL 1

Author Comment

by:Hooznext
ID: 35208634
8080,

I understand your objection, let me ask this. If I asked how to fly an airplane from New York to LA and you said "First you start the engine, take off and fly to LA then you land and shut off the engine." would you call that a helpful response?

In your suggestion, once you get the data into an Object and run a for each loop, how do you suggest you divide that single recordset into the individual sets needed to be processed. Then for that matter how do you make the Object variable available to the data flow task as a source for processing? I know now because I spent the last couple days figuring it out, I really don't see how your response benefitted me.

I had already posed splitting the data and iterating through the records to process them, how is that any different than a for each loop as you suggested?
0
 
LVL 22

Accepted Solution

by:
8080_Diver earned 250 total points
ID: 35208879
Hooznet,

Tell you what, I will withdraw my objection.  My assumptions were in error when I thought that you were trying to figure out how to set up the "iterate through the data" portion of your task.  I was actually providing a bit more detail addressed at your final question:
Would I have a static Object variable to hold an array of dataset names and iterate through that list to call all the dynamic datasets I need to process?

Maybe I should have quoted that question and just answered with, "Yes."

Whoever is moderating this question should feel free to grant HoozNet's request to close it as originally indicated.

As for the crack about the flying of an airplane, what I would have answered to that is that you would iniially need to file a flight plan indicating whether you intended to use one of the standard airways or to ply point-to-point and whether you would be using VFR or IFR flight rules.  Once you did that, then you should follow your flight plan.  I would have then expected some feedback that might indicate more details about what you were really trying to find out about flying from BNYC to LAX.  If I got no feedback, then I would assume that you had wondered about what you needed to do to register the fact that you were going to LAX or what your choices were about flight rules.
0
 
LVL 1

Author Comment

by:Hooznext
ID: 35208908
8080 Diver,

I like you...keep the dirty side down.
0
 
LVL 1

Author Closing Comment

by:Hooznext
ID: 35239115
You have a good sense of humor and I may not have been clear in my post. I can see where you may have thought I didn't understand how to use a For Each Loop in SSIS.
0

Featured Post

Optimizing Cloud Backup for Low Bandwidth

With cloud storage prices going down a growing number of SMBs start to use it for backup storage. Unfortunately, business data volume rarely fits the average Internet speed. This article provides an overview of main Internet speed challenges and reveals backup best practices.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Occasionally there is a need to clean table columns, especially if you have inherited legacy data. There are obviously many ways to accomplish that, including elaborate UPDATE queries with anywhere from one to numerous REPLACE functions (even within…
Ever needed a SQL 2008 Database replicated/mirrored/log shipped on another server but you can't take the downtime inflicted by initial snapshot or disconnect while T-logs are restored or mirror applied? You can use SQL Server Initialize from Backup…
This demo shows you how to set up the containerized NetScaler CPX with NetScaler Management and Analytics System in a non-routable Mesos/Marathon environment for use with Micro-Services applications.
This video demonstrates how to create an example email signature rule for a department in a company using CodeTwo Exchange Rules. The signature will be inserted beneath users' latest emails in conversations and will be displayed in users' Sent Items…

947 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

24 Experts available now in Live!

Get 1:1 Help Now