Solved

Dynamic split of data in SSIS 2008

Posted on 2011-03-17
10
1,098 Views
Last Modified: 2012-05-11
Hello experts,

Looking for suggestions on the best way to handle this situation. I have multiple SSIS packages for processing financial transactions. The first process has a file watcher on an FTP folder that sees a file hit and imports the transactions into a staging table for later processing. this process is event driven so it can happen multiple times in a day as files get deposited.

The second process is manually triggered by a user once conditions in our system have been met and transactions are ready to be processed. When the user triggers this process I need to grab all the transactions that have not been processed from the staging table and transform them and insert into other tables for a third party application to work with.

My problem lies here, when I grab the transactions to be processed, I need to divide them into like batches with like dates. So all buys for March 3rd in one batch, buys for March 4 in another, sells in another, etc. It doesn't appear that a conditional split will work because of the expressions needing to be hard coded. We have over 460 different transaction codes and manually creating a task process for each is not something I look forward to.

So, how would you suggest I use SSIS to take this single recordset from the staging table and divide it into other datasets each representing a group with a common date and transaction code for the entire set to be processed? I am thinking the only way I may be able to do this is with a script task, but can I dynamically create variables to hold the different datasets? Would I have a static Object variable to hold an array of dataset names and iterate through that list to call all the dynamic datasets I need to process?

Thanks...
0
Comment
Question by:Hooznext
  • 6
  • 3
10 Comments
 
LVL 37

Expert Comment

by:ValentinoV
ID: 35163948
Could you explain why you need that data to get processed in those different batches as you've described? (seems a bit weird)
0
 
LVL 1

Author Comment

by:Hooznext
ID: 35164427
The third party application that will ultimately process these requires the transactions to be grouped into batches. Our operations department requires like transactions for each date to be grouped, so for each day there is a batch of buy transactions, a batch of sells, etc.

The third party app is a bit difficult to work with as there is no supported API, I am on my own. It would be easier for me to have a single dataset that would represent what goes into the batch so I can create the batch in the third party app, get its ID and write the transactions into the table with that batch ID, then move on to the next dataset and repeat.
0
 
LVL 22

Expert Comment

by:8080_Diver
ID: 35165718
First pull the transaction codes/dates/whatever that identify the batches into an SSIS (object type) variable so that you can determine the sets to be processed.
 
Then use that information to drive a For Each Loop Container that processes a batch;
0
 
LVL 1

Author Comment

by:Hooznext
ID: 35166005
8080 Diver,

That is sort of what I was thinking, use an Execute SQL Task to create an array of file dates, a for each loop through the dates that would pull the transaction codes for that date, then iterate through the codes and process the transactions. So it would be a nested loop processing each transaction.

I was wondering if there is a more efficient way to do this than to hit the database several times and use nested loops. If I could grab all the transactions at once, then get it all split into the needed datasets and dump them back. I guess I am looking for someone to help me look outside the box I am accustomed to and show me a more efficient way to do this.
0
 
LVL 1

Assisted Solution

by:Hooznext
Hooznext earned 0 total points
ID: 35201902
OK guys, here's my solution...let me know if anyone knows a better way to do it.

I created a stored proceedure that returns all the transactions that are not processed yet, I added two columns to the dataset 'TranGroup' and 'TranGroupTotal'. The first tells me what group the transaction belongs to, the second is the total number of groups in the recordset.

In control flow I have an Execute SQL Task to call the proc and assign the results to an Object variable. Then I go into a script task that assigns the TranGroupTotal to a Int variable in the package. That leads to a For Loop Container that loops from 1 to TranGroupTotal, inside the loop is a script task that takes the raw transactions and selects the transactions for group X of the loop and dumps them into another Object variable. Once I have the group of transactions into a dataset I go to a Data Flow and have a Script Component that uses the dataset as a source.

From there I do a normal transformation and have my transactions inserted into the database and back to the loop for the next group of transactions. Not exactly the way I thought I was going to do it, but it does prevent a bunch of calls to the database and endless nested loops as well.

I'm sure there is a better way to do this, but I don't know of it, if anyone has advice I am open to it as projects like this make me learn new things which is why I have a love/hate relationship with this stuff.
0
Complete VMware vSphere® ESX(i) & Hyper-V Backup

Capture your entire system, including the host, with patented disk imaging integrated with VMware VADP / Microsoft VSS and RCT. RTOs is as low as 15 seconds with Acronis Active Restore™. You can enjoy unlimited P2V/V2V migrations from any source (even from a different hypervisor)

 
LVL 22

Expert Comment

by:8080_Diver
ID: 35208441
Hooznet has basically taken my suggestion, replaced the initial SQL Query to pull the list to process with a call to a Stored Procedure and then used my suggested For Each Loop Container approach to drive the process that, because of HoozNet's additional knowledge about the data, he has then fleshed out in more detail.

Seems to me that I should at least get some pints. ;-)
0
 
LVL 1

Author Comment

by:Hooznext
ID: 35208634
8080,

I understand your objection, let me ask this. If I asked how to fly an airplane from New York to LA and you said "First you start the engine, take off and fly to LA then you land and shut off the engine." would you call that a helpful response?

In your suggestion, once you get the data into an Object and run a for each loop, how do you suggest you divide that single recordset into the individual sets needed to be processed. Then for that matter how do you make the Object variable available to the data flow task as a source for processing? I know now because I spent the last couple days figuring it out, I really don't see how your response benefitted me.

I had already posed splitting the data and iterating through the records to process them, how is that any different than a for each loop as you suggested?
0
 
LVL 22

Accepted Solution

by:
8080_Diver earned 250 total points
ID: 35208879
Hooznet,

Tell you what, I will withdraw my objection.  My assumptions were in error when I thought that you were trying to figure out how to set up the "iterate through the data" portion of your task.  I was actually providing a bit more detail addressed at your final question:
Would I have a static Object variable to hold an array of dataset names and iterate through that list to call all the dynamic datasets I need to process?

Maybe I should have quoted that question and just answered with, "Yes."

Whoever is moderating this question should feel free to grant HoozNet's request to close it as originally indicated.

As for the crack about the flying of an airplane, what I would have answered to that is that you would iniially need to file a flight plan indicating whether you intended to use one of the standard airways or to ply point-to-point and whether you would be using VFR or IFR flight rules.  Once you did that, then you should follow your flight plan.  I would have then expected some feedback that might indicate more details about what you were really trying to find out about flying from BNYC to LAX.  If I got no feedback, then I would assume that you had wondered about what you needed to do to register the fact that you were going to LAX or what your choices were about flight rules.
0
 
LVL 1

Author Comment

by:Hooznext
ID: 35208908
8080 Diver,

I like you...keep the dirty side down.
0
 
LVL 1

Author Closing Comment

by:Hooznext
ID: 35239115
You have a good sense of humor and I may not have been clear in my post. I can see where you may have thought I didn't understand how to use a For Each Loop in SSIS.
0

Featured Post

Free Gift Card with Acronis Backup Purchase!

Backup any data in any location: local and remote systems, physical and virtual servers, private and public clouds, Macs and PCs, tablets and mobile devices, & more! For limited time only, buy any Acronis backup products and get a FREE Amazon/Best Buy gift card worth up to $200!

Join & Write a Comment

Suggested Solutions

Audit has been really one of the more interesting, most useful, yet difficult to maintain topics in the history of SQL Server. In earlier versions of SQL people had very few options for auditing in SQL Server. It typically meant using SQL Trace …
This article explains how to reset the password of the sa account on a Microsoft SQL Server.  The steps in this article work in SQL 2005, 2008, 2008 R2, 2012, 2014 and 2016.
This demo shows you how to set up the containerized NetScaler CPX with NetScaler Management and Analytics System in a non-routable Mesos/Marathon environment for use with Micro-Services applications.
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

746 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now