07.14.2008 at 09:27PM PDT, ID: 23564996
SQL 2005 resource group fails over unexpectedly in MS Cluster

Tags:

Microsoft, SQL Server, 2005, In a 2 node cluster - Windows 2003 EE x64, HP, Proliant, BL685c G1

We are seeing issues with our SQL cluster: the SQL 2005 cluster group unexpectedly fails over to the other node. The cluster log reports that the network name is no longer available, followed by communication link failures. Below are the relevant cluster.log entries:

0000108c.0000a108::2008/07/15-02:48:27.812 ERR  SQL Server <SQL Server>: [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
0000108c.0000a108::2008/07/15-02:48:27.812 ERR  SQL Server <SQL Server>: [sqsrvres] printODBCError: sqlstate = 08S01; native error = 40; message = [Microsoft][SQL Native Client]Named Pipes Provider: The specified network name is no longer available.

0000108c.0000a108::2008/07/15-02:48:27.812 ERR  SQL Server <SQL Server>: [sqsrvres] printODBCError: sqlstate = 08S01; native error = 40; message = [Microsoft][SQL Native Client]Communication link failure
0000108c.0000a108::2008/07/15-02:48:27.812 ERR  SQL Server <SQL Server>: [sqsrvres] OnlineThread: QP is not online.
0000108c.0000a108::2008/07/15-02:48:27.812 ERR  SQL Server <SQL Server>: [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
0000108c.0000a108::2008/07/15-02:48:27.812 ERR  SQL Server <SQL Server>: [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
0000108c.0000a108::2008/07/15-02:48:27.812 ERR  SQL Server <SQL Server>: [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
0000108c.0000a108::2008/07/15-02:48:27.812 ERR  SQL Server <SQL Server>: [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
0000108c.0000a108::2008/07/15-02:48:27.812 ERR  SQL Server <SQL Server>: [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
0000108c.0000a108::2008/07/15-02:48:27.812 ERR  SQL Server <SQL Server>: [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
0000108c.0000a108::2008/07/15-02:48:27.812 ERR  SQL Server <SQL Server>: [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
0000108c.0000a108::2008/07/15-02:48:27.812 ERR  SQL Server <SQL Server>: [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
0000108c.0000a108::2008/07/15-02:48:27.812 ERR  SQL Server <SQL Server>: [sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
0000108c.0000a108::2008/07/15-02:48:27.812 ERR  SQL Server <SQL Server>: [sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
000007c8.000010a4::2008/07/15-02:48:30.672 INFO [FM] NotifyCallBackRoutine: enqueuing event
000007c8.000010a4::2008/07/15-02:48:30.672 INFO [FM] Calling RmNotifyChanges in monitor 108c.
000007c8.00000984::2008/07/15-02:48:30.672 INFO [CP] CppResourceNotify for resource SQL Server
000007c8.00000984::2008/07/15-02:48:30.672 INFO [CP] CppRundownCheckpoints removing RNB for Software\Microsoft\Microsoft SQL Server\MSSQL.1\SQLServerSCP
000007c8.00000984::2008/07/15-02:48:30.687 INFO [CP] CppRundownCheckpoints removing RNB for Software\Microsoft\Microsoft SQL Server\MSSQL.1\PROVIDERS
000007c8.00000984::2008/07/15-02:48:30.703 INFO [CP] CppRundownCheckpoints removing RNB for Software\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLSERVER
000007c8.00000984::2008/07/15-02:48:30.703 INFO [CP] CppRundownCheckpoints removing RNB for Software\Microsoft\Microsoft SQL Server\MSSQL.1\Cluster
000007c8.00000984::2008/07/15-02:48:30.718 INFO [CP] CppRundownCheckpoints removing RNB for Software\Microsoft\Microsoft SQL Server\MSSQL.1\SQLserverAgent
000007c8.00000984::2008/07/15-02:48:30.718 INFO [CP] CppRundownCheckpoints removing RNB for Software\Microsoft\Microsoft SQL Server\MSSQL.1\Replication
000007c8.00000984::2008/07/15-02:48:30.734 INFO [CP] CppRundownCheckpoints removing RNB for Software\Microsoft\Microsoft SQL Server\MSSQL.1\CPE
000007c8.000009a8::2008/07/15-02:48:30.750 WARN [FM] FmpHandleResourceTransition: Resource Name = 1d3ab5eb-38e6-44da-8341-347b2b4aad30 [SQL Server] old state=2 new state=4
000007c8.000009a8::2008/07/15-02:48:30.750 INFO [GUM] GumSendUpdate:  Locker waiting            type 0 context 8
000007c8.000009a8::2008/07/15-02:48:30.750 INFO [GUM] Thread 0x9a8 UpdateLock wait on Type 0
000007c8.000009a8::2008/07/15-02:48:30.750 INFO [GUM] GumpDoLockingUpdate: lock was free, granted to 1
000007c8.000009a8::2008/07/15-02:48:30.750 INFO [GUM] GumpDoLockingUpdate successful, Sequence=16695 Generation=0
000007c8.000009a8::2008/07/15-02:48:30.750 INFO [GUM] GumSendUpdate: Locker dispatching seq 16695      type 0 context 8
000007c8.000009a8::2008/07/15-02:48:30.750 INFO [GUM] GumSendUpdate: Dispatching seq 16695      type 0 context 8 to node 2
000007c8.000009a8::2008/07/15-02:48:30.750 INFO [GUM] GumSendUpdate: Locker updating seq 16695      type 0 context 8
000007c8.000009a8::2008/07/15-02:48:30.750 INFO [GUM] GumpDoUnlockingUpdate releasing lock ownership
000007c8.000009a8::2008/07/15-02:48:30.750 INFO [GUM] GumSendUpdate: completed update seq 16695      type 0 context 8
000007c8.000009a8::2008/07/15-02:48:30.750 INFO [FM] FmpPropagateResourceState: resource 1d3ab5eb-38e6-44da-8341-347b2b4aad30 failed event.
000007c8.000009a8::2008/07/15-02:48:30.750 INFO [FM] FmpHandleResourceFailure: taking resource 1d3ab5eb-38e6-44da-8341-347b2b4aad30 and dependents offline
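
When a log like the one above runs to thousands of lines, it can help to tally the ODBC errors reported by the SQL Server resource before reading the surrounding entries by hand. The following is a minimal Python sketch, not a supported tool: the line layout (process.thread IDs, timestamp, level, message) is inferred only from the entries quoted above, and the names `parse_line` and `summarize_odbc_errors` are hypothetical.

```python
import re
from collections import Counter

# Layout assumed from the quoted entries:
#   <pid>.<tid>::<yyyy/mm/dd>-<hh:mm:ss.mmm> <LEVEL> <message>
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-fA-F]{8})\.(?P<tid>[0-9a-fA-F]{8})::"
    r"(?P<date>\d{4}/\d{2}/\d{2})-(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>ERR|WARN|INFO)\s+(?P<message>.*)$"
)

# Matches the printODBCError payload seen in the quoted ERR lines.
ODBC_RE = re.compile(
    r"sqlstate = (?P<sqlstate>\w+); native error = (?P<native>-?\d+); "
    r"message = (?P<text>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def summarize_odbc_errors(lines):
    """Count ERR entries by (sqlstate, native error, message text)."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry["level"] != "ERR":
            continue
        odbc = ODBC_RE.search(entry["message"])
        if odbc:
            key = (odbc["sqlstate"], int(odbc["native"]), odbc["text"].strip())
            counts[key] += 1
    return counts
```

To use it, open the cluster log as text and pass the file object (or any iterable of lines) to `summarize_odbc_errors`. Lines that do not fit the assumed layout, such as blank lines, are skipped rather than treated as errors.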

Some details about the cluster setup:

* Two-node cluster of HP BL685c G1 systems
* Windows 2003 x64 EE R2 SP1
* Heartbeat is a single NIC on each node
* Public interface is teamed, with each NIC in the team set to auto
* SQL 2000 and SQL 2005 each run in their own cluster group (only the 2005 instance has issues)

What could be causing this? I am leaning towards possible NIC issues, but I need help troubleshooting this further.

Thank you.
 
 
 