Hi,
We are currently using SQL Server 2005 to mirror 16 databases with synchronous mirroring and automatic failover. Every few months the mirror server and the witness server time out against the primary server, causing some, but not all, of the databases to fail over.
I've had Performance Monitor running when this issue has occurred and have not seen any spikes or unusual values that would suggest a problem on the primary server (including the network interface counters). I also have Idera SQL diagnostic Manager monitoring the primary server, and it has not shown any issues with the databases or any sudden increase in requests.
The first error code seen in the SQL Server logs on the primary server is
Error: 1479, Severity: 16, State: 2.
The first error code seen in the SQL Server logs on the mirror server is
Error: 1479, Severity: 16, State: 1.
This suggests a network problem; however, an engineer was connected to the primary server over RDP at the time and didn't experience any connection issues. We also haven't seen any problems or errors on the switches connecting the servers.
Is anyone else having this kind of issue with mirroring? Any suggestions on what it could be and how to fix it?
We are using SQL Server 2005 Enterprise build 9.0.3054 on the primary and mirror servers, and SQL Server 2005 Express build 9.0.3042 on the witness server.
I have a similar setup.
Are you using any virtualisation?
I had to install this hotfix on both of my servers:
http://support.microsoft.com/kb/937745
Maybe you can search your logs to see if you got any of these errors.
Are these heavy transaction systems with a lot of throughput?
I would also recommend patching to at least this service pack level:
Microsoft SQL Server 2005 - 9.00.4035.00 (X64) Nov 24 2008 16:17:31 Copyright (c) 1988-2005 Microsoft Corporation Enterprise Edition (64-bit) on Windows NT 5.2 (Build 3790: Service Pack 2)
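To compare each instance against that build, something like the following could be run on the primary, mirror and witness. This is only a sketch; the column aliases are my own, and the values returned will depend on your servers.

```sql
-- Full version banner, in the same format as the line quoted above.
SELECT @@VERSION AS version_banner;

-- The same information broken out into separate values.
SELECT
    SERVERPROPERTY('ProductVersion') AS product_version, -- build number, e.g. 9.00.xxxx.xx
    SERVERPROPERTY('ProductLevel')   AS product_level,   -- RTM or service pack level
    SERVERPROPERTY('Edition')        AS edition;
```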
Also, please see this:
http://support.microsoft.com/kb/947462
You can increase the mirroring partner timeout; the default is 10 seconds.
I wouldn't recommend this, but you could do it for test purposes to rule out the network.
If you do raise it above 10 seconds, keep in mind that it defaults back to 10 seconds any time the SQL Server service is restarted.
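As a rough sketch of how that test could look: the first query lists the current timeout for every mirrored database, and the ALTER DATABASE statement raises it for one of them. The database name YourMirroredDb is a placeholder, and 30 seconds is just an arbitrary test value, not a recommendation. The ALTER DATABASE statement has to be run on the principal.

```sql
-- Show role, state and current timeout (in seconds) for each mirrored database.
SELECT
    DB_NAME(database_id)        AS database_name,
    mirroring_role_desc,
    mirroring_state_desc,
    mirroring_safety_level_desc,
    mirroring_witness_state_desc,
    mirroring_connection_timeout
FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL;  -- only databases that take part in mirroring

-- Raise the partner timeout for one database (placeholder name, test value).
ALTER DATABASE [YourMirroredDb] SET PARTNER TIMEOUT 30;
```

Re-running the first query after a service restart would show whether the value has changed back.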