Gene Moody
asked on
VMWare + SQL Server = Random Dropped SQL Server Connections
Hi there! I have an ESX 2.5.3 (build 22981) VMWare installation running, among other things, a SQL Server 2005 under Windows Server 2003 Enterprise. Periodically, at what seems to be random (un-logged) intervals, the SQL Server "loses" various client connections. Just disconnects them, regardless of KeepAlive settings (which I moved from 30,000 down to 10,000).
Things could be complicated by the fact that the Client software is a hokey Accounting package written using MS Access 2K3 as a front end, but connecting to SQL Server for data. But I'm not too sure that Access is to blame, this time - it appears that the common problem is the (Virtualized) SQL Server.
Before we virtualized everything, and the SQL Server/2K3 Server were "metal", this did not occur. Searching VMWare's KBs has shed literally no light here, and Microsoft's documentation seems to think that setting the KeepAlive to a shorter interval will resolve the problem - which it hasn't.
I am aware of and have studied <http://blogs.msdn.com/sql_protocols/archive/2006/03/09/546852.aspx> and <http://support.microsoft.com/kb/137983/?sd=RMVP&fr=1>, neither of which seems to help. And my users, while patient, are curious to see if we might de-virtualize this server - if that will help their connectivity issues.
Any ideas what to check? I'm admittedly a noob when it comes to VMWare - I love it, but I'm not sure I know enough to get the most out of it.
Thanks (in advance) for your time and trouble,
- The Lurking LongFist
How many VM's are being hosted on the host? I've had issues where SQL is robbed of memory as VMWare was starved and trying to divvy it up as it saw fit.
ASKER
Thank you for your prompt response!
There are three of 'em "in there": the DNS Box, the Intra-Net Server Box, and the SQL Server Box. Intra-Net is currently idling, as its application(s) was/were lost during the big (ugly) crash of May, wherein we learned that my predecessor had no clue at all what a disaster recovery system was, nor how to implement one (if he knew). So, we've got basically a DNS/File Server and a SQL Server, apparently in a shootout for resources.
How could I tell? The machine they're on is never over 33% utilized at any one time:
Virtual Machines: 25 %
System Services: 6 %
System Total: 31 %
...and the only place it gets even close is RAM utilization - but it's not close to "the red" yet:
Total Memory (3.8 G)
Virtual Machines: 1.8 G
System Services: 601.6 M
System Total: 2.4 G
...is there some sort of logger or other tool (beyond the web interface) that might allow me to watch this?
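For reference, the RAM figures quoted above can be turned into a utilization percentage with a few lines of arithmetic. This is only a rough sketch: it assumes the web interface's "G" means 1024 "M", and the helper names are made up for illustration. It describes the host as a whole and says nothing about how much RAM any single VM has been assigned.

```python
def to_mb(value: float, unit: str) -> float:
    """Convert a figure such as (1.8, "G") or (601.6, "M") to megabytes.

    Assumption: 1 G = 1024 M, which the thread does not confirm.
    """
    factors = {"M": 1.0, "G": 1024.0}
    return value * factors[unit]


def utilization_pct(used_mb: float, total_mb: float) -> float:
    """Return used memory as a percentage of total memory."""
    return 100.0 * used_mb / total_mb


# Figures copied from the post above.
total = to_mb(3.8, "G")
vms = to_mb(1.8, "G")
services = to_mb(601.6, "M")
used = vms + services

print(f"{used:.1f} MB used of {total:.1f} MB "
      f"({utilization_pct(used, total):.0f}% of host RAM)")
```

The sum of the two components comes out close to the reported "System Total: 2.4 G", so the numbers in the post are at least self-consistent.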
ASKER CERTIFIED SOLUTION
ASKER
Will do - unfortunately I can only perform these changes with the VM down (I know, tell you something we all don't already know - sorry!) so I'll need to wait until after 17:00 to implement the changes - I think you may have hit it right on the head.
Last night, after all the excitement died down, I found that someone had re-started a now-defunct server, which (among other things) was responsible for "pushing into the orange" on RAM. So I killed it, and watched the numbers drop:
Virtual Machines: 14 %
System Services: 5 %
System Total: 19 %
Memory (3.8 G)
Virtual Machines: 1.3 G
System Services: 535.0 M
System Total: 1.9 G
...and for some strange reason, nobody had any complaints this morning!
So, I need to "upgrade" the SQL Server box with more RAM; starting out at 1GB and (possibly) moving up. Sounds like a plan.
I will return and report results soonest - but it'll be about 7 or so hours (from this post) before I can effect change, and approximately 14 hours after that before any real results can be noted. However, I will monitor this channel for the duration, just in case.
Thanks for your timely help and advice!!!
ASKER
Changes complete: VM now has 1024 MB (1 GB) assigned to it - and the first thing it did was take up 769.0 MB of it! In less than ten minutes it had claimed all 1024 MB available to it - so I guess it's just hide-n-watch time, right?
I'll return and report soonest. Thanks again for your time and timely attention!
No Worries.
By nature, SQL Server, IIS and Exchange all try to grab as much memory as possible. Don't forget that the OS will be taking a portion of that 1 GB. Depending on how brave you are, and if you can replicate this in a test environment, you can control how much memory SQL Server grabs by using the Max Memory settings (SQL Server Management Studio: right-click the server name, Properties -> Memory).
HTH
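The same cap can also be set from a query window with `sp_configure` and the `max server memory (MB)` option. Below is a minimal sketch that just builds the T-SQL text for a given VM size. The function name and the 256 MB OS reserve are assumptions picked for illustration, not a recommendation from this thread. Try the resulting statements in a test environment first.

```python
def max_server_memory_tsql(vm_ram_mb: int, os_reserve_mb: int = 256) -> str:
    """Build T-SQL that caps SQL Server's memory below the VM's assigned RAM.

    os_reserve_mb is a hypothetical allowance for Windows and other
    processes; the right value depends on the workload.
    """
    if os_reserve_mb < 0 or vm_ram_mb <= os_reserve_mb:
        raise ValueError("VM RAM must be larger than the OS reserve")
    cap_mb = vm_ram_mb - os_reserve_mb
    return "\n".join([
        # 'max server memory (MB)' is an advanced option, so expose it first.
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sp_configure 'max server memory (MB)', {cap_mb};",
        "RECONFIGURE;",
    ])


# Example: the 1024 MB VM from this thread, with the assumed 256 MB reserve.
print(max_server_memory_tsql(1024))
```

Generating the text rather than running it keeps the sketch free of any database driver; the output can be pasted into a query window and reviewed before it is executed.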
ASKER
Doubling the available RAM has resolved the connection issue - even though the Microsoft systems are RAM-hungry. I'll have to keep an eye on my other VM servers - the "butterfly effect" may cause other unintended consequences - but we're up and running - thank you very much!