Solved

SQL Instances spike CPU, start being slow/refusing connections

Posted on 2016-09-02
26
63 Views
Last Modified: 2016-09-22
I've had an issue with some 2014 SP2 SQL servers, where multiple instances stop responding and have to be restarted before the applications can reconnect to them.

It's standalone instances, all on the same server.  Databases are on local store.

It seems like the problem comes from SQL server spiking up CPU usage (each instance is using 3-5% CPU non-stop).  Here's what is odd...  if I reboot the server it's <3% CPU usage average.  But if I restore a database into any one of those instances the server starts going CPU crazy after about an hour.  If I never restore a database, never a problem.

After the CPU starts going crazy, if I look at execution plans on all instances, the statistics are WAY off.  (expected 82 rows actual 22500)  Statistics look fine before issue starts.
0
Comment
Question by:Dustin Saunders
  • 12
  • 10
  • 3
26 Comments
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 41781614
What's the VM configuration (vCPU, vRAM and Storage)?
How many SQL Server instances do you have installed? Can you list their version and edition?
Did you configure for each of those SQL Server instances the min and max memory? If so, post here their memory configurations as well.
0
 
LVL 12

Author Comment

by:Dustin Saunders
ID: 41781628
XenServer - (1s4c CPU, 6GB RAM, 10k RPM SAS)  Paging file usage peak 1.42% (6GB pagefile)

19 instances, all 2014 SP2.  3 of them are SQL standard the rest are express.

None have max or min memory.  After I restored a database to MSSQL$RWK668UZSK every instance went from 0 CPU usage to bouncing around 5-6% non stop over the course of about an hour.  It'll stay like that until I reboot (attached).
wtfSQL.png
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 41781645
Maybe you need to turn off automatic statistics updates on restored dbs(?).  I usually either explicitly rebuild stats after an attach/restore or turn off automated stats (if not already off).  For large dbs, this can have a big impact on the system.
0
 
LVL 12

Author Comment

by:Dustin Saunders
ID: 41781653
Do you think that would cause the CPU to go nuts for all instances?  If I leave it untouched for a week, it will continue to use CPU like that until it's rebooted.  The biggest DB here is 1GB, the rest are approx 100 MB.
0
 
LVL 45

Accepted Solution

by:
Vitor Montalvão earned 350 total points
ID: 41781655
6GB for 19 MSSQL instances? This not even gives 1GB per instance!!!
When running out of memory the SQL Server engine will "attack" the CPU so you might want to increase drastically the memory for that server or even better, built another VM and migrate some SQL Server instances to there.
0
 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 150 total points
ID: 41781656
Honestly, I wouldn't think so, but it won't hurt to turn off auto stats, especially just briefly.  I was thinking maybe somehow the stats recompute caused a performance issue which cascaded.  Bit of a long shot probably.
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 41781664
19 instances, all 2014 SP2.  3 of them are SQL standard the rest are express
So 16 Express instances. Express Edition is limited only to use 1GB RAM and 1 CPU.
0
 
LVL 12

Author Comment

by:Dustin Saunders
ID: 41781667
@Vitor - The total size of all SQL instance mdf are 3.2GB, but I can't see that it needs more than that?  It's not paging at all.

But what's weird is that if I don't restore a database, no problem.  It can go a month with no problem.  If a restore happens, the CPU does this every time.  I can reproduce this result consistently.

@Scott -  I can try that and see if it makes a difference.
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 41781685
The total size of all SQL instance mdf are 3.2GB, but I can't see that it needs more than that?  It's not paging at all.
Problem is that as the minute the SQL Server instance is up and running it will grab and reserve the memory so it will let few or none to others depending on the order the instances are starting.
Btw, with so small databases whatfor do you need 19 instances? Those 3 standards editions aren't enough? Or if you want to get rid of the licenses then use 3 Express editions instead.
0
 
LVL 12

Author Comment

by:Dustin Saunders
ID: 41781689
I correlated this with items in the events history.  In May the server was spun up, then the databases put on it.  Worked fine until June when we did the first restore.  Then every time after that, when we'd do a restore, it would go crazy.

Initially, the topology was 2 socket 2 core (an engineer mistakenly set it to this) so the CPU usage is lopsided (express was stuck with 2 cores).  But the average was lower so we didn't get alerts until a customer complained.  It's now 1s4c to balance the CPU usage.
sqlCPUHistory.png
0
 
LVL 12

Author Comment

by:Dustin Saunders
ID: 41781693
Btw, with so small databases whatfor do you need 19 instances? Those 3 standards editions aren't enough? Or if you want to get rid of the licenses then use 3 Express editions instead.

We're a hosting company, so each company has a separate instance.  In this case, they need to be able to access their instance to restore in test databases, etc.  We license through SPLA so some have express, some standard, depending on what the customer wants (and wants to pay).
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 41781714
In May the server was spun up, then the databases put on it.  Worked fine until June when we did the first restore.  
But before the first restore how many databases existed in the SQL Server instances? What makes this DB being restored special? Or isn't only with this DB in particular but any restore that you perform?
0
How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

 
LVL 12

Author Comment

by:Dustin Saunders
ID: 41781733
All of the current databases except one existed starting in May, I restored them then had to run updates and it got rebooted that day.  No restores happened from May - mid June.

Then for testing, this software company will upload test databases and restore them.  Each one of the customers have a different database, so it could be any companies.  In each case, it was a different database restored.

The most recent (8-18) was a new customer, who uploaded and restored the database, but that is the only net new database since May.  Yesterday I rebooted to confirm this suspicion and restored a generic database that is 20MB to one of the instances to cause the problem.
0
 
LVL 12

Author Comment

by:Dustin Saunders
ID: 41781747
I can try putting 30 GB of RAM on it and see if the CPU issue still occurs, but I can't try that until maintenance window tonight.
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 41781772
It's not only adding RAM. You need to configure your SQL Server instances to limit their memory usage so they don't fight for memory between them.
Since SQL Express can only address 1GB RAM, then you don't need to configure their memory. So having 16GB for all of them leaves 14GB RAM to be shared between 3 Standard Editions. Try to give each one 4GB so it will leave 2GB for Windows.
Here's the MSDN article to show how to configure SQL Server memory.
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 41781842
You should definitely set 'max memory' for the standard editions of SQL Server.  But be aware that SQL does not automatically immediately grab the max memory available to it.  And it does not permanently reserve memory unless you set 'min memory'.  However, freeing memory from SQL is a complex and expensive process, so you would want it to occur only rarely, if ever.

Also, typically adding RAM is by far the easiest and cheapest way to get better performance for SQL Server.  So if you can add RAM without much difficulty, you should do so, even if you're not 100% sure it's the cause of the problem, it will still allow you to allocate more memory to the SQL standard instances, which can't hurt.
0
 
LVL 12

Author Comment

by:Dustin Saunders
ID: 41788032
I've given the server 30 GB RAM, set a max memory on the Standard database of 3 GB.  The server is using ~3.2 GB RAM over the weekend (Friday - Today).  No CPU spike.  Let it run all day yesterday as people return to work, no CPU.

Today, I restored a database to an express instance.  One hour later, CPU averaging 76%.
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 41788055
set a max memory on the Standard database of 3 GB.
On all 3?

Today, I restored a database to an express instance.  One hour later, CPU averaging 76%.
Better than 100% but memory still short for all 19 instances. Or you give more memory (let's say 64GB) or split the SQL Server instances in more VMs. After all you're providing a service and if it's not good the customers will quit and won't return.
0
 
LVL 12

Author Comment

by:Dustin Saunders
ID: 41788061
Yes, on all three.

But the server is still using less than 4GB RAM...?  I have 26 GB extra memory on the server not yet used.
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 41788067
Yes but the SQL Server instance can't address more than 3GB now. Don't forget that you set the max server memory for the instance.
You can play with the max server memory configuration to find the ideal value for you.
0
 
LVL 12

Author Comment

by:Dustin Saunders
ID: 41788068
What I mean is, all of the instances all together total are using less than 4GB RAM, not just the SQL standard instance.  Each SQL instance is using about 150MB RAM average.
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 41788073
Each SQL instance is using about 150MB RAM average.
Are you using Task Manager to see that?
0
 
LVL 12

Author Comment

by:Dustin Saunders
ID: 41788076
Total Server Memory in perfmon for each instance.
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 41799447
Remembered to return to this question. You can also use sys.dm_os_ring_buffers to check the SQL Server resources.
0
 
LVL 12

Author Closing Comment

by:Dustin Saunders
ID: 41810976
The issues don't occur in the same configuration when moved to SQL2012.  I've got a case open with MS to figure out why the issue only exists on SQL2014.
0

Featured Post

Get up to 2TB FREE CLOUD per backup license!

An exclusive Black Friday offer just for Expert Exchange audience! Buy any of our top-rated backup solutions & get up to 2TB free cloud per system! Perform local & cloud backup in the same step, and restore instantly—anytime, anywhere. Grab this deal now before it disappears!

Join & Write a Comment

It Is not possible to enable LLDP in vSwitch(at least is not supported by VMware), so in this article we will enable this, and also go trough how to enabled CDP and how to get this information in vSwitches and also in vDS.
For both online and offline retail, the cross-channel business is the most recent pattern in the B2C trade space.
Using examples as well as descriptions, and references to Books Online, show the different Recovery Models available in SQL Server and explain, as well as show how full, differential and transaction log backups are performed
Via a live example, show how to setup several different housekeeping processes for a SQL Server.

705 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

20 Experts available now in Live!

Get 1:1 Help Now