Application upgrade on SQL 2014 is crawling.  Processes frequently suspend with PAGEIOLATCH_SH wait type.

Posted on 2016-09-10
Medium Priority
Last Modified: 2016-09-14
SQL 2014 Enterprise installed on Server 2012 R2
8 cpu core's & 24 gb mem
SQL instance is dedicated to ONE application database.  DB is 210 gb in size.
6 tempdb’s spread across C:\, D:\, E:\, each 12.5 gb for a total of 75 gb combined.  Trace flag 1117 applied.
System logs and application log stored on D:\
System DB’s (master, model, msdb) and application DB stored on E:\

We have a third party enterprise web application connected to a SQL database.  The objective is to upgrade the application to a newer version – this process executes a series of SQL scripts against the database.  

We made a copy of the production DB in our development environment.  Dev architecture & resources are identical to Prod.  The dev upgrade completed successfully in 6.5 hours.

The application upgrade is underway in production.  It appears to be moving much slower than what we encountered in dev/test.  Using SQL Activity Monitor we see our commands routinely become SUSPENDED with a wait type PAGEIOLATCH_SH.  
 Activity Monitor
Using Resource Monitor we see E:\ has a high amount of disk activity, specifically the .mdf data file of our application.
Resource Monitor
The prod upgrade is moving much slower than dev/test, at this rate it will take 10-12 hours to complete – double the amount of time in dev/test.  We did not observe so many SUSPENDED processes during dev/test.

I’ve been researching PAGEIOLATCH_SH on the www but so far nothing has led us to the culprit.  As a test we increased memory in the prod SQL server from 24 gb to 32 gb but did not receive noticeable performance gains.

What’s the best way to track down the ‘wait type’ we’re experiencing and how to best identify a solution?

Changing (or optimizing) the SQL queries executed during upgrade IS NOT an option.  We expect the same performance in prod as observed in dev.  

Thanks in advance for your expertise!
Question by:bmsande
LVL 16

Assisted Solution

by:Megan Brooks
Megan Brooks earned 500 total points
ID: 41793102
PAGEIOLATCH_SH would seem to indicate that you are doing a bunch of I/O. And you are, most likely, depending on just what those upgrade scripts actually do. If they rebuild 200 Gb worth of indexes, for example, that is a lot of I/O.

The resource monitor throughput numbers seem small: 5.5 Mb/sec total reads & writes on the application mdf? You must have quite a queue. That's something like 20 Gb per hour. If the upgrade needs to move ~200 Gb, that could take 10 hours. But it could very well need to move more than just the size of the database, and your estimate of 10-12 hours isn't surprising. So why is the I/O so slow?

My first thought was as to whether all the upgrade scripts were being run, and in the right order, but while a problem there might increase the amount of data to be moved, it shouldn't cut into bare throughput this way. Are you sure dev and prod are identical from an I/O throughput perspective?

The amount of memory and number of cores isn't going to matter so much when I/O is the bottleneck. Are both systems on the same SAN? How busy is the SAN? Is it shared with other servers? Are the CPUs virtualized? Is there contention there?
LVL 14

Accepted Solution

Nakul Vachhrajani earned 1000 total points
ID: 41793271
What is the value of "max degree of parallelism" on both the environments (Dev v/s Prod)?

It is possible that the production environments has threads running in parallel and causing a load on the I/O.

Also, are you seeing bottlenecks on the SAN? When we were prototyping a large upgrade in our development environment, we had seen PAGEIOLATCH_SH when the disks on our VM hosts were busy serving requests from lot of guests simultaneously. We shutdown VMs that were not critical and that helped improve the underlying disk performance by freeing up the queues.

--Turn on advanced options
sp_configure 'show advanced options',1

--Check the MAXDOP value 
sp_configure 'max degree of parallelism'

--Turn off advanced options
sp_configure 'show advanced options',0

Open in new window

LVL 43

Expert Comment

by:Eugene Z
ID: 41793295
is it VM or physical box ?
Check Vendor's doc for their "best practice" sql DB configuration..

for start: you must review and adjust your server setup

#0 did you install sp1 for sql 2014? install if vendors are ok
#1 "8 cpu core's & 24 gb mem" -- what is the Sql server min- Max memory? POS  needs (~8GB RAM)
#2 after you restored the DB on this  new server did you run DB maints:  CheckDB; reindex: update stats, etc?
if you did not -  try to run ..
#3 check with your Sysadmins if the drives for used DB are fast
#4 why do you spread  "6 tempdb’s across C:\, D:\, E:\, each 12.5 gb for a total of 75 gb combined. ?

try to get dedicated 2 drivers 1 for data 1 for log and set them there instead -- make 4 data files same size and add up to 8 if needed..
#5  did you have issues o n this box with tempdb? if not keep  Trace flag 1117  out of the picture (usually t-sql code optimization -extra indexes helps more than this flag)

#6 (if you can) try move User DB -data log files to some fast dedicated drives instead of E: drive where your system DBs ..
Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

LVL 14

Expert Comment

by:Nakul Vachhrajani
ID: 41793299
Oh, and what is the initial size and growth increment for your log files and tempdb (Dev v/s production)?

If the file has to grow frequently, it would need to frequently request the OS for space. Better to have a large log file with larger growth increments so that you don't keep waiting too frequently.
LVL 53

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 500 total points
ID: 41793830
8 cpu core's & 24 gb mem
 SQL instance is dedicated to ONE application database.  DB is 210 gb in size.
24GB to support a 210GB DB looks short to me. What is the size of the larger table?
LVL 14

Expert Comment

by:Nakul Vachhrajani
ID: 41799217
Thank-you! If  you do not mind sharing, can you please share what worked? Was it the MAXDOP configuration or an issue with the SAN?

Featured Post

Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

When trying to connect from SSMS v17.x to a SQL Server Integration Services 2016 instance or previous version, you get the error “Connecting to the Integration Services service on the computer failed with the following error: 'The specified service …
How much do you know about the future of data centers? If you're like 50% of organizations, then it's probably not enough. Read on to get up to speed on this emerging field.
Viewers will learn how the fundamental information of how to create a table.
In this video, Percona Solution Engineer Rick Golba discuss how (and why) you implement high availability in a database environment. To discuss how Percona Consulting can help with your design and architecture needs for your database and infrastr…
Suggested Courses

578 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question