BIG problem with upgrade from SQL Server 2014 physical to 2016 VM

We upgraded a physical SQL Server from SQL Server 2014 to a 2016 VM. A lot changed beyond the SQL Server upgrade itself: OS, application, SQL Server edition, and the move to a VM, to name a few.

It is SQL Server 2016 SP2 CU8. It was Standard Edition before, and still is now. The server is maxed at 128 GB RAM for max server memory (MB).

From the point of go-live on the new server, ALL performance has been dreadfully slow: response times are 20-30 times slower than we saw on the old server for the calls that complete, and a lot of calls are timing out. Basically, it is unusable for production requirements.

I have checked everything: the disk, the CPU, the memory usage. I ran sp_Blitz and sp_BlitzFirst, trying to find what is dragging the server down, and absolutely none of my checks came back with anything questionable. I don't see any memory or CPU pressure, and I see nothing but expected read and write latencies on the disk and tempdb. I cannot find anything in SQL Server to explain this remarkable performance degradation.

I believe it could be the VM, but I do not know how to confirm the VM configuration is good or lacking.
I also fear it could be something new in SQL Server 2016 that I am unaware of -- but, I've performed the 2014 to 2016 upgrade before, and never seen anything like this.

Any Expert help? A script I can run to point to something as the cause of this problem? Ok, I know that is too easy, but I need some insight. Any ideas? This is very urgent.

2 Notes:

1. I have used sp_WhoIsActive and we're seeing much more locking/blocking than expected, certainly more than was occurring on the old server. Is there something in 2016 that would open the door to more locking?

2. The ONLY things that I do see in the log are the two items below. The infinite recompile message appears 10-12 times, possibly more, and I find the same objects repeatedly in the SQLHANDLE. I question this, as I have not seen it before, but I doubt it is the main problem, as it is not happening at all times. It does not occur on the old server, so I wonder whether this may be a new 'feature' in the 2016 error/information handling. The second one happens only at startup.

Message
A possible infinite recompile was detected for SQLHANDLE 0x030007003509380099931501DBA8000001000000000000000000000000000000000000000000000000000000, PlanHandle 0x050007003509380070FE9905F401000001000000000000000000000000000000000000000000000000000000, starting offset 4946, ending offset 7262. The last recompile reason was 2.

Message
Unsafe assembly 'microsoft.sqlserver.integrationservices.server, version=12.0.0.0, culture=neutral, publickeytoken=89845dcd8080cc91, processorarchitecture=msil' loaded into appdomain 2 (SSISDB.dbo[runtime].1).
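
For the recompile message, the handle and offsets from the log can be fed back into the plan cache DMVs to see which statement keeps recompiling. This is only a minimal sketch: it assumes the statement text is still resolvable from the handle and that the login has VIEW SERVER STATE. The handle and offsets are copied from the message above; the variable names are just illustrative. As far as I know, recompile reason 2 corresponds to "statistics changed" in the recompile event subclass list, but that is worth confirming against the documentation.

```sql
-- Handle and offsets copied from the "possible infinite recompile" log message.
DECLARE @sql_handle   varbinary(64) = 0x030007003509380099931501DBA8000001000000000000000000000000000000000000000000000000000000;
DECLARE @start_offset int = 4946;  -- byte offset where the statement starts
DECLARE @end_offset   int = 7262;  -- byte offset where the statement ends

SELECT
    DB_NAME(st.dbid)                   AS database_name,
    OBJECT_NAME(st.objectid, st.dbid)  AS module_name,
    -- Offsets are in bytes and the text is nvarchar, hence the division by 2.
    SUBSTRING(st.text,
              (@start_offset / 2) + 1,
              ((@end_offset - @start_offset) / 2) + 1) AS recompiling_statement
FROM sys.dm_exec_sql_text(@sql_handle) AS st;
```

If this returns no row, the plan has probably been evicted; rerunning it shortly after the next occurrence of the message should catch it.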

John Tsioumpris

Well, start with the basics: first the specs of the old server, the specs of the host, and the specs of the VM, and of course which virtualization platform/hypervisor you use (e.g. VMware, Hyper-V, Citrix, to name a few).
Then the simplest possible check: just run a benchmark to see whether the VM is really on its knees compared to the old server. For me the very first, almost naive, test is to run the WinRAR or 7-Zip benchmark. Nothing fancy, just a little number that shows how fast one machine is compared to another.
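
On the SQL Server side of the specs, it may also help to look at what the instance itself sees of the virtual hardware. The queries below are a sketch, not a diagnosis. They assume SQL Server 2016 SP2 or later (for the `socket_count` and `cores_per_socket` columns) and VIEW SERVER STATE permission. The reason to look: Standard Edition is limited to the lesser of 4 sockets or 24 cores, so a VM presented as many single-core sockets could leave schedulers offline.

```sql
-- What SQL Server sees of the (virtual) hardware.
-- socket_count and cores_per_socket are assumed available (2016 SP2 or later).
SELECT
    cpu_count,
    hyperthread_ratio,
    socket_count,
    cores_per_socket,
    physical_memory_kb / 1024 AS physical_memory_mb,
    virtual_machine_type_desc
FROM sys.dm_os_sys_info;

-- How many schedulers the instance is actually using.
-- Rows with status 'VISIBLE OFFLINE' are CPUs SQL Server cannot use.
SELECT
    status,
    COUNT(*) AS scheduler_count
FROM sys.dm_os_schedulers
WHERE status LIKE N'VISIBLE%'
GROUP BY status;
```

If the number of `VISIBLE ONLINE` schedulers is well below `cpu_count`, the socket/core layout of the VM would be the first thing to raise with whoever manages the hypervisor.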

Scott Pletcher

Agreed, and definitely check the disk subsystem as well: channels, paths, etc.

Also, make sure the statistics on all tables were updated (if you can afford it, use FULLSCAN on all the larger tables). When going from 2014 to 2016, that really shouldn't be an issue, but, based on what you're seeing, I'd update the stats just to be sure.
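
A rough sketch of that statistics check and update, run in the affected database. `dbo.BigTable` is a hypothetical table name standing in for one of the larger tables; the first query only reports, so it is safe to run before deciding what to update.

```sql
-- List user-table statistics with last update time and pending modifications,
-- most-modified first, for the current database.
SELECT
    OBJECT_SCHEMA_NAME(s.object_id) AS table_schema,
    OBJECT_NAME(s.object_id)        AS table_name,
    s.name                          AS stats_name,
    sp.last_updated,
    sp.rows,
    sp.rows_sampled,
    sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
ORDER BY sp.modification_counter DESC;

-- Full-scan update for one large table (dbo.BigTable is a hypothetical name).
UPDATE STATISTICS dbo.BigTable WITH FULLSCAN;

-- Default-sampled update of every statistic that has had modifications.
EXEC sys.sp_updatestats;
```

FULLSCAN reads every row, so on the largest tables it is best scheduled for a quiet window; `sp_updatestats` is the cheaper pass for everything else.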