Avatar of Dustin Saunders
Dustin Saunders
Flag for United States of America asked on

SQL Instances spike CPU, start being slow/refusing connections

I've had an issue with some 2014 SP2 SQL servers, where multiple instances stop responding and have to be restarted before the applications can reconnect to them.

It's standalone instances, all on the same server.  Databases are on local store.

It seems like the problem comes from SQL server spiking up CPU usage (each instance is using 3-5% CPU non-stop).  Here's what is odd...  if I reboot the server it's <3% CPU usage average.  But if I restore a database into any one of those instances the server starts going CPU crazy after about an hour.  If I never restore a database, never a problem.

After the CPU starts going crazy, if I look at execution plans on all instances, the statistics are WAY off.  (expected 82 rows actual 22500)  Statistics look fine before issue starts.
Microsoft SQL ServerVMwareVirtualization

Avatar of undefined
Last Comment
Dustin Saunders

8/22/2022 - Mon
Vitor Montalvão

What's the VM configuration (vCPU, vRAM and Storage)?
How many SQL Server instances do you have installed? Can you list their version and edition?
Did you configure for each of those SQL Server instances the min and max memory? If so, post here their memory configurations as well.
Dustin Saunders

ASKER
XenServer - (1s4c CPU, 6GB RAM, 10k RPM SAS)  Paging file usage peak 1.42% (6GB pagefile)

19 instances, all 2014 SP2.  3 of them are SQL standard the rest are express.

None have max or min memory.  After I restored a database to MSSQL$RWK668UZSK every instance went from 0 CPU usage to bouncing around 5-6% non stop over the course of about an hour.  It'll stay like that until I reboot (attached).
wtfSQL.png
Scott Pletcher

Maybe you need to turn off automatic statistics updates on restored dbs(?).  I usually either explicitly rebuild stats after an attach/restore or turn off automated stats (if not already off).  For large dbs, this can have a big impact on the system.
Your help has saved me hundreds of hours of internet surfing.
fblack61
Dustin Saunders

ASKER
Do you think that would cause the CPU to go nuts for all instances?  If I leave it untouched for a week, it will continue to use CPU like that until it's rebooted.  The biggest DB here is 1GB, the rest are approx 100 MB.
ASKER CERTIFIED SOLUTION
Vitor Montalvão

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
GET A PERSONALIZED SOLUTION
Ask your own question & get feedback from real experts
Find out why thousands trust the EE community with their toughest problems.
SOLUTION
Scott Pletcher

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.
Vitor Montalvão

19 instances, all 2014 SP2.  3 of them are SQL standard the rest are express
So 16 Express instances. Express Edition is limited only to use 1GB RAM and 1 CPU.
Dustin Saunders

ASKER
@Vitor - The total size of all SQL instance mdf are 3.2GB, but I can't see that it needs more than that?  It's not paging at all.

But what's weird is that if I don't restore a database, no problem.  It can go a month with no problem.  If a restore happens, the CPU does this every time.  I can reproduce this result consistently.

@Scott -  I can try that and see if it makes a difference.
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.
Vitor Montalvão

The total size of all SQL instance mdf are 3.2GB, but I can't see that it needs more than that?  It's not paging at all.
Problem is that as the minute the SQL Server instance is up and running it will grab and reserve the memory so it will let few or none to others depending on the order the instances are starting.
Btw, with so small databases whatfor do you need 19 instances? Those 3 standards editions aren't enough? Or if you want to get rid of the licenses then use 3 Express editions instead.
Dustin Saunders

ASKER
I correlated this with items in the events history.  In May the server was spun up, then the databases put on it.  Worked fine until June when we did the first restore.  Then every time after that, when we'd do a restore, it would go crazy.

Initially, the topology was 2 socket 2 core (an engineer mistakenly set it to this) so the CPU usage is lopsided (express was stuck with 2 cores).  But the average was lower so we didn't get alerts until a customer complained.  It's now 1s4c to balance the CPU usage.
sqlCPUHistory.png
Dustin Saunders

ASKER
Btw, with so small databases whatfor do you need 19 instances? Those 3 standards editions aren't enough? Or if you want to get rid of the licenses then use 3 Express editions instead.

We're a hosting company, so each company has a separate instance.  In this case, they need to be able to access their instance to restore in test databases, etc.  We license through SPLA so some have express, some standard, depending on what the customer wants (and wants to pay).
All of life is about relationships, and EE has made a viirtual community a real community. It lifts everyone's boat
William Peck
Vitor Montalvão

In May the server was spun up, then the databases put on it.  Worked fine until June when we did the first restore.  
But before the first restore how many databases existed in the SQL Server instances? What makes this DB being restored special? Or isn't only with this DB in particular but any restore that you perform?
Dustin Saunders

ASKER
All of the current databases except one existed starting in May, I restored them then had to run updates and it got rebooted that day.  No restores happened from May - mid June.

Then for testing, this software company will upload test databases and restore them.  Each one of the customers have a different database, so it could be any companies.  In each case, it was a different database restored.

The most recent (8-18) was a new customer, who uploaded and restored the database, but that is the only net new database since May.  Yesterday I rebooted to confirm this suspicion and restored a generic database that is 20MB to one of the instances to cause the problem.
Dustin Saunders

ASKER
I can try putting 30 GB of RAM on it and see if the CPU issue still occurs, but I can't try that until maintenance window tonight.
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.
Vitor Montalvão

It's not only adding RAM. You need to configure your SQL Server instances to limit their memory usage so they don't fight for memory between them.
Since SQL Express can only address 1GB RAM, then you don't need to configure their memory. So having 16GB for all of them leaves 14GB RAM to be shared between 3 Standard Editions. Try to give each one 4GB so it will leave 2GB for Windows.
Here's the MSDN article to show how to configure SQL Server memory.
Scott Pletcher

You should definitely set 'max memory' for the standard editions of SQL Server.  But be aware that SQL does not automatically immediately grab the max memory available to it.  And it does not permanently reserve memory unless you set 'min memory'.  However, freeing memory from SQL is a complex and expensive process, so you would want it to occur only rarely, if ever.

Also, typically adding RAM is by far the easiest and cheapest way to get better performance for SQL Server.  So if you can add RAM without much difficulty, you should do so, even if you're not 100% sure it's the cause of the problem, it will still allow you to allocate more memory to the SQL standard instances, which can't hurt.
Dustin Saunders

ASKER
I've given the server 30 GB RAM, set a max memory on the Standard database of 3 GB.  The server is using ~3.2 GB RAM over the weekend (Friday - Today).  No CPU spike.  Let it run all day yesterday as people return to work, no CPU.

Today, I restored a database to an express instance.  One hour later, CPU averaging 76%.
I started with Experts Exchange in 2004 and it's been a mainstay of my professional computing life since. It helped me launch a career as a programmer / Oracle data analyst
William Peck
Vitor Montalvão

set a max memory on the Standard database of 3 GB.
On all 3?

Today, I restored a database to an express instance.  One hour later, CPU averaging 76%.
Better than 100% but memory still short for all 19 instances. Or you give more memory (let's say 64GB) or split the SQL Server instances in more VMs. After all you're providing a service and if it's not good the customers will quit and won't return.
Dustin Saunders

ASKER
Yes, on all three.

But the server is still using less than 4GB RAM...?  I have 26 GB extra memory on the server not yet used.
Vitor Montalvão

Yes but the SQL Server instance can't address more than 3GB now. Don't forget that you set the max server memory for the instance.
You can play with the max server memory configuration to find the ideal value for you.
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.
Dustin Saunders

ASKER
What I mean is, all of the instances all together total are using less than 4GB RAM, not just the SQL standard instance.  Each SQL instance is using about 150MB RAM average.
Vitor Montalvão

Each SQL instance is using about 150MB RAM average.
Are you using Task Manager to see that?
Dustin Saunders

ASKER
Total Server Memory in perfmon for each instance.
This is the best money I have ever spent. I cannot not tell you how many times these folks have saved my bacon. I learn so much from the contributors.
rwheeler23
Vitor Montalvão

Remembered to return to this question. You can also use sys.dm_os_ring_buffers to check the SQL Server resources.
Dustin Saunders

ASKER
The issues don't occur in the same configuration when moved to SQL2012.  I've got a case open with MS to figure out why the issue only exists on SQL2014.