SQL 2005 64 Bit job failure

Hello everyone and thank you for your help:
Background:  Have a SQL 2005 64bit with one instance with three databases (A,B,C).  This is connected to a MS cluster using Windows 2003R2 64 bit SP3.  The cluster was at SP and we were getting timeout errors running jobs from database A to B.  It was recommended to upgrade to SP3 which we did.  Now we get timeout errors from A to B and B to C when running jobs.  
Error meassage:
Date  1/8/2010 10:13:04 AM
Log  Job History (BPW_DAILY_00 - stream 00 hard delete extracts [00:01])

Step ID  3
Server  servername\PWp
Job Name  BPW_DAILY_00 - stream 00 hard delete extracts [00:01]
Step Name  Delete Records in Staging Files
Duration  00:12:05
Sql Severity  16
Sql Message ID  16389
Operator Emailed  
Operator Net sent  
Operator Paged  
Retries Attempted  0

Message
Executed as user: server01\APP-PWp. TCP Provider: The semaphore timeout period has expired. [SQLSTATE 08S01] (Error 121)  Communication link failure [SQLSTATE 08S01] (Error 121).  The step failed.

Lou PereiraAsked:
Who is Participating?
 
Lou PereiraAuthor Commented:
The issue was fixed with my earlier post fix.  Thank you
0
 
Raja Jegan RSQL Server DBA & Architect, EE Solution GuideCommented:
>> Now we get timeout errors from A to B and B to C when running jobs.  
>> Step Name  Delete Records in Staging Files

Ok.. Kindly let me know what exactly( logic) is done in the job and how..
Also it fails while Deleting records ( Guessing as per your job step description).
If you have huge no. of records to delete, then it might create Transactional logs more and spanning for a long time requires more server resources thus causing these kind of issues..

Timeouts can be reduced by tuning the existing logic or approach and by placing appropriate indexing if not present in the existing one..

Some more inputs on the aspects mentioned should help me to guide you in the correct direction..
0
 
Lou PereiraAuthor Commented:
The jobs are copying records from one DB to another in the same instance.  The DB is very large I believe 8TB or >.    I also saw posts regarding implementing registry keys as the following:
HKLM\SYSTEM\CurrentControlSet\Services\lanmanserver\parameters
"MaxFreeConnections"=dword: 00000064
"MinFreeConnections"=dword: 00000020
"MaxRawWorkItems"=dword: 00000200
"MaxWorkItems"=dword: 00002000

1. Set TcpMaxDataRetransmissions to 30 (decimal);
2. Set KeepAliveInterval to 25000 (decimal).


This does not seem to work!! Thank you for your help and try to be detailed as I'm not a DBA.
0
Improve Your Query Performance Tuning

In this FREE six-day email course, you'll learn from Janis Griffin, Database Performance Evangelist. She'll teach 12 steps that you can use to optimize your queries as much as possible and see measurable results in your work. Get started today!

 
Raja Jegan RSQL Server DBA & Architect, EE Solution GuideCommented:
Haven't tried out the above Registry key approached on my own and hence not able to comment on it..

Kindly post out the queries which causes timeouts so that we can try optimizing it or tuning it out along with Execution plans of them..

Hope you are aware of how to check execution plans in SSMS...
0
 
Lou PereiraAuthor Commented:
I'm confused with your statements.  The error I posted is all over the SQL community with various fixes.   Have you ever seem this error message before, try googling it, this will take you to the SQL msg boards.  I think I fixed with one of the fixes I got from the SQL msg board.  
0
 
Raja Jegan RSQL Server DBA & Architect, EE Solution GuideCommented:
Ok.. If that is the case, then request you to post the link / solution which worked for your case and close this question accordingly..
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.