Link to home
Start Free TrialLog in
Avatar of 25112
25112

asked on

SSIS - how to skip a step if it is spinning indefinitely in a step?

is there a way to quit a step if it takes more than __ hours in a step?
Avatar of Jim Horn
Jim Horn
Flag of United States of America image

Nope.  

The only meaningful way I know to handle this situation is to break your 'step' into some kind of a loop, say instead of processing 10 million rows process them 50,000 rows at a time, and somewhere in that loop is an INSERT statement to a log table.  

This doesn't solve the problem of it taking a long time, but at least it gives you a log table to look at to see how fast it is running.

There is also the more obvious answers such as eliminate columns not needed to be processed, process fewer rows, use ELT to load your data into an all-varchar table, which goes fast, and THEN do all of your expression work within SQL Server, etc.
What do you mean by "step?" SSIS has control flow tasks and data flow components. Do you mean one or the other of those? Then there are SQL Server Agent jobs, which have 'steps.' Is that what you are referring to?
Avatar of 25112
25112

ASKER

i apologize for the miscommunication..
it is a SQLAgentjob with multiple steps, with a SSIS package referred in each step.
what is a SSIS package is spinning in step 4 indefinitely? is there a way recommended to handle it?
If this is a production job, you should preferably have logging set up. WIth SSIS 2012 and beyond you can use project deployment and logging will be set up semi-automatically. With older versions you have to jump through more hoops. Are you using SSIS 2008?
Here is what looks like a good post from MSSQLTips about setting up logging: Integrated Logging with the Integration Services Package Log Providers

I'd point you to the Microsoft docs, but they make my head spin sometimes.

Another alternative would be to reproduce the problem in a dev environment, and to run the package using the SSIS or SSDT designer.
>it is a SQLAgentjob with multiple steps, with a SSIS package referred in each step.
>what is a SSIS package is spinning in step 4 indefinitely?
Okay, I'll bite.  What's a 'Step 4'?  Inserting data, looping through an entire folder, FTP connection, execute stored procedure spThisTakesForever, data pump to every man, woman, and child on the planet, wait for the Zombie Apocolype, close your eyes and use 'The Force'.  What??

Keep in mind that experts here cannot connect to your data source(s) and run queries, or run things like SSIS packages in Debug mode, so we are entirely dependent on what you tell us to answer your questions.

Please give my article ​Top 10 Ways to Ask Better Questions a read for some helpful advice on how to tailor your questions to get better and faster answers.
By the way, have you reviewed the SQL agent log for the job step? Sometimes it contains useful information; sometimes not. The SSIS log (see my comments above) can contain whatever you choose to write to it to figure out what's going on.
Avatar of 25112

ASKER

thank you for your input. For SQL 2012+ I will keep set the log option.
Yes, this is 2008.

Jim, thanks for that input. I will attempt to be specific and helpful and concise.

rephrasing the question below:

On any SQL Step, if for whatever technical reason (deadlock, data issue, stored procedure code logic loops endlessly which was not diagnosed earlier etc etc) the step does not complete, and even if you come after 8 hours or 8 days, the job is still running in that step.

I agree that technical reason has to be diagnosed.

but in general, is there anything you can augment to a SQL JOB Step to make sure that the job will fail instead of spin indefinitely, if that should occur. Could any kind of timeout factor be possibly introduced (for any unknown or unfound reason for which a step may indefinitely spin for a long time)? Any input on this would be appreciated.
SSIS is capable of running concurrent tasks and data flow streams. Tasks normally run sequentially, but as far as I can remember that is only due to precedence constraints that force serial execution, although it is subject to a package property that can limit parallelism (but the default is to run in parallel). Likewise, within data flows the processing sequence is determined by the dependency of downstream consumers on upstream sources.

I would have to try this, and I can do that if you would like, but it seems like it should be possible to run a 'timer' task in parallel with your data flow task, one that could potentially stop the package if it timed out.
Avatar of 25112

ASKER

>>I would have to try this, and I can do that if you would like, but it seems like it should be possible to run a 'timer' task in parallel with your data flow task, one that could potentially stop the package if it timed out.

wow. thanks for thinking through this and a possible idea of solution.. I was not able to find any other idea yet..

>>run a 'timer' task in parallel with your data flow task
so this applies only to steps that have SSIS package execution, right?

with your solution proposal, if there are 10 steps in the job (regardless of  if the step involved a package or not), and if I prefer to introduce the timeout feature, can it be introduced even in steps that are void of SSIS package; if this is workable, I would be glad to introduce the feature in each step, regardless of 10 or 20. (just as a safety feature- to crash the execution rather than indefinite execution loop regardless of underlying reason(s))

thanks again for possible idea.
The steps in a SQL Server Agent job are more or less serial. You can do things with the step order, but only one step runs at a time. Now I suppose you could run two jobs together, one to do the ETL and another to time the first one. The ETL job could start the timer job, and then the timer job could stop the ETL job if it ran too long.

You might have some interesting edge cases to deal with, although if you use the database to communicate between the two jobs using job unique IDs then it could be fairly clean. The SPs for doing this are sp_start_job and sp_stop_job. WAITFOR could be useful in the timer job, which probably ought to be a T-SQL step that executes a stored procedure. Both job steps would need appropriate permissions in order to start or stop a job.

I've certainly never tried this, but I don't see why it wouldn't work.
Avatar of 25112

ASKER

that's reasonable.. I'd try

just to clarify: when you said ''timer' task' were you thinking of WAITFOR command specifically
ASKER CERTIFIED SOLUTION
Avatar of Megan Brooks
Megan Brooks
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of 25112

ASKER

thank you kindly.. we will discuss this proposal and move as a team.
regards,