dbaSQL asked:
SSIS package - very long runtime -- how to diag

I have a 'parent' package which calls 55 different packages within -- it's actually an upgrade from SQL/SSIS 2005 to SQL/SSIS 2012.  It is still in development, but has been running presumably fine for about a month.  The average runtime for the parent package is about 45 minutes, and again, it fires 55 underlying packages.  We found recently that the results from at least one of the underlying packages are not as expected.  Let's call it the SDA package.  We then found that the SDA package I used in the migration from 2005 to 2012 was actually not the one that was intended, so I backed up the current one and replaced it with another SDA package.  Everything compiled, and I ran the parent package to test things out -- but it ran for more than 12 hours.  Remember, the normal runtime for the parent package is about 45 minutes.  I disabled the SDA package and ran the parent package again -- it completed just fine, so I know the problem is the new SDA package.

I am trying to analyze where the time is being spent, and why.  The SDA package is basically one SQL task, one DFT, and then two larger sequence containers of multiple DFTs.  At a glance, both packages appear identical, but they certainly are not performing that way.  I typically use the OnError, OnWarning and OnTaskFailed events for my package logging; now, however, I'm looking at the Diagnostic and/or OnProgress events.  I also wonder if I can enable a data viewer to observe the flow while the package is running.  This is time critical, and I need to identify the reason the package is running so much longer.

Are there any SSIS experts who can point me down the right path on this troubleshooting?  Is there any further information that I can provide?
Jim Horn:

There's not enough information here to answer your question.

>The SDA package is basically one sql task, one dft and then two larger sequence containers of multiple dfts.
Are you doing any metrics logging on these tasks, such as, for any given SP / data pump, the number of rows inserted/updated/deleted and how long it took?  Could be that the number of rows being pumped has increased, could be someone added/altered an index which is causing the data pump to crawl, maybe someone added a trigger to the destination table that is firing.

Do the target tables have any primary keys / foreign keys, which are being validated in the data pump, when perhaps these are not needed?

Are you inserting all columns to the destination table, when not all of them are needed?

Did someone alter the schema(s) of the destination table?

DcpKing:

>I also wonder if I can enable a data viewer that I may use to observe while the package is running.  
Data viewers let you look at row sets, but they don't help with performance tuning.
If you deployed the package to the SSIS catalog using the Project Deployment model and then run it from a job, you should be able to use the catalog's reporting to see just which part is being so very slow. Where I work we had a job that took about 6 hours every night, and it turned out that it was just because of one humungous view. Rewriting things to make that small piece of code more efficient worked wonders.
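The same per-task timings behind those catalog reports can also be queried directly from SSISDB. A sketch, not a definitive script: it assumes the project was deployed to the catalog and the parent package was run as a catalog execution, and the execution id is a placeholder you'd look up in `catalog.executions`.

```sql
-- Find where the time went in one catalog execution, slowest tasks first.
-- @execution_id is hypothetical -- look yours up in SSISDB.catalog.executions.
DECLARE @execution_id BIGINT = 12345;

SELECT   e.package_name,
         e.executable_name,
         s.execution_duration / 1000.0 AS duration_sec,  -- column is in milliseconds
         s.execution_result                              -- 0 = success, 1 = failure
FROM     SSISDB.catalog.executables           AS e
JOIN     SSISDB.catalog.executable_statistics AS s
      ON s.executable_id = e.executable_id
     AND s.execution_id  = e.execution_id
WHERE    e.execution_id = @execution_id
ORDER BY duration_sec DESC;
```

Sorting by duration should surface the one container or DFT that accounts for the extra hours without any extra logging configuration.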

Mike
dbaSQL (Asker):
Thank you, Jim.  The bigger project is an upgrade from 2005 to 2012; the new instance is already in place, but it is not in use.  Nobody is touching it until I can validate this ETL piece.  For that reason, I know it is not a matter of anyone altering or adding indexes, schemas or tables.  Nobody is on this new box except for me.

It is also not a matter of an increase in records because the source dataset has not grown.  In short, the ETL begins with a TRUNCATE of a bunch of tables, and then the parent package is fired.  It calls the 55 internal packages to each perform varied manipulations, and ultimately reload all the truncated tables with new/current data.  Because it is still in development, the source dataset is not growing at all.  It is reprocessing the same dataset on every execution.

I did not author the original packages.  I migrated/upgraded a copy of the packages that I was told were the ones currently running in production.  It turns out I may have been given the wrong copy.  Three of the packages are in question - this is one of them - and after loading them into the parent package, the parent package's runtime went from 45-48 minutes to complete, to now running in excess of 12 hours; it has not completed successfully since I made this change.

As I was reviewing my logging events, I did see that some indexes were being created if they did not already exist.  I did not, however, see a prior drop of those indexes in the process flow.  I don't yet know if this factors in.  The same action, however, is also in the older SDA package, which completes successfully.
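For reference, the "create if not exists" pattern seen in the log events probably looks something like the sketch below; the table and index names here are hypothetical placeholders, since the actual statements are inside the package.

```sql
-- Hypothetical example of the conditional index creation seen in the logs.
IF NOT EXISTS (SELECT 1
               FROM sys.indexes
               WHERE name      = N'IX_SDA_Stage_Key'
                 AND object_id = OBJECT_ID(N'dbo.SDA_Stage'))
BEGIN
    CREATE NONCLUSTERED INDEX IX_SDA_Stage_Key
        ON dbo.SDA_Stage (BusinessKey);
END;
```

With no matching DROP in the flow, such an index simply persists across runs; that is harmless for this pattern, but it is worth confirming the same indexes exist on both the old and new paths, since a lookup query missing a supporting index could run very differently.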

What I do know is that it is the SDA package that is the source of this increased time.  My question is what is the best way to move forward now with that package, to determine where the time is being spent.  The SDA package has actually been running now for nearly an hour -- if I were logging the events correctly, I think I would be able to open the log and see where the time is being spent.  No?

Jim, is that enough information to give you a clearer picture of what may be occurring?  Or at least which direction you would recommend from here?
dbaSQL (Asker):
Well, I can see the container that is taking all of the time, as well as the DFT, but it still is not clear why one package is running so much longer than the other.  Do either of you have any suggestions for moving forward?  Or any other SSIS experts?
DcpKing:

So just what is the DFT in question doing? In my case it was a view with a huge CTE and about 10 parts to a UNION'd query afterwards!

Mike
dbaSQL (Asker):
It's a lookup.  I've run both SDA packages side by side.  The old one completes in 3 minutes; the new one, which I tried to replace the old one with, does not complete.  It runs for 46 minutes before I kill it.  The logging shows me it is the 'Lookup Consecutive Years' task within the DFT.  I've looked at the logic from both of them, and there is a definite difference, though I do not understand why it is causing the increased runtime that it is.  The logic is a CTE followed by a UNION.  It is the second half of the union that is different between the two statements.

I have posted two pieces of code.  'old' is the query for the lookup in the package that completes, 'new' is the one that runs forever and must be killed.

To be clear, I have run the package multiple times by itself, and it is always the 'Lookup Consecutive Years' task that runs for an incredibly long time.  The attachments are the query from that task.

What do you think, dcp?

If the package log output would be helpful, please let me know.  I'd just need to dummy a few server/domain names.
LookupConsecutiveYears---old.sql
LookupConsecutiveYears---new.sql
ASKER CERTIFIED SOLUTION
DcpKing
dbaSQL (Asker):
If you look at the two code posts I made, you can see that the only variance between the two statements is the bottom half of the union in the CTE.  Interestingly enough, they both return the same data/same record count when I run them myself manually.  One of them is slightly longer in runtime, but only slightly.  I replaced the piece in the slow-running package with the piece from the other one -- now it runs fine, and the package completes very quickly.  I don't see the reason for such a dramatic difference, but it's working now, and that's what I care about.  Thank you for the input, Mike.  Definitely appreciated.
DcpKing:

Given that the two complete on their own in very similar times, I would guess that the new one is actually being re-run for every record in the CTE, whereas the other runs once and the results are reused many times.  Hence my suggestion of temporarily making a table from the CTE, making the query plans easier to understand.
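That temp-table experiment might look something like the sketch below (all names are hypothetical, since the real queries are in the attached .sql files): materialize the CTE once, then point the lookup-style query at the static rowset.

```sql
-- Materialize the (hypothetical) CTE body once into a temp table so it is
-- evaluated a single time rather than potentially per input row.
SELECT AccountId, FirstYear, LastYear
INTO   #ConsecutiveYears
FROM   ( /* body of the original CTE + UNION goes here */
         SELECT AccountId, FirstYear, LastYear FROM dbo.YearSpans ) AS cte;

CREATE CLUSTERED INDEX IX_CY ON #ConsecutiveYears (AccountId);

-- The lookup query then reduces to a plain SELECT, whose plan is trivial
-- to read and easy to compare between the old and new versions.
SELECT AccountId, FirstYear, LastYear
FROM   #ConsecutiveYears;
```

It is also worth checking the Lookup transformation's cache mode: in full cache mode the reference query runs once up front, while partial or no cache can re-execute it per input row, which would match the "fast when run manually, endless inside the package" symptom.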

Anyhow, good to hear that you're happy with the results.

Mike