Veereshvnashi asked:

SQL Server has encountered 17 occurrence(s) of IO requests taking longer than 15 seconds to complete on file

Hi,
I received the following errors on one of our SQL Servers this morning.
1. 2008-06-17 21:52:55.42 server    Insufficient memory available..
2. 2008-06-20 07:30:30.40 spid55    SQL Server has encountered 17 occurrence(s) of IO requests taking longer than 15 seconds to complete on file [E:\Program Files\Microsoft SQL Server\MSSQL\data\MYDATABASE3.mdf] in database [MYDATABASE2] (9).  The OS file handle is 0x0000 00430.  The offset of the latest long IO is: 0x000000b4bcc000
3. 2008-06-17 21:54:19.90 server    Process 12:0 (fe8) UMS Context 0x002A8DD0 appears to be non-yielding on Scheduler 3.
4. 2008-06-17 21:47:19.91 server    Potential deadlocks exist on all the schedulers.

This morning the customer reported timeouts in their application. Please help me with this.
ASKER CERTIFIED SOLUTION
MikeWalsh
Also check out www.sql-server-performance.com; it is a good all-around performance tuning website. It discusses in detail many of the Performance Monitor counters you can look at (http://www.sql-server-performance.com/tips/sql_server_performance_monitor_coutners_p1.aspx), as well as various tuning and troubleshooting techniques. It is an easy site to search.
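As a side note on the long-IO message itself, the offset it reports can be turned into a position inside the data file, which sometimes helps when correlating with disk-level tools. This is only a rough sketch: the 8 KB page size is a SQL Server fact, but the helper name `describe_offset` is made up for illustration.

```python
PAGE_SIZE = 8192  # SQL Server data pages are 8 KB


def describe_offset(hex_offset: str) -> dict:
    """Convert the hex byte offset from the error log into friendlier units.

    describe_offset is a hypothetical helper, not part of any SQL Server tool.
    """
    offset = int(hex_offset, 16)  # base 16 accepts the leading "0x"
    return {
        "bytes": offset,
        "gib": round(offset / 2**30, 2),
        "page": offset // PAGE_SIZE,  # page number within the file
    }


# Offset copied from the "latest long IO" message in the question.
info = describe_offset("0x000000b4bcc000")
print(info)
```

For the offset quoted in the question this points roughly 2.8 GiB into the .mdf file, so the slow request was against the data file well past its start rather than at the header.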
Veereshvnashi

ASKER

On the server I confirmed that there is just 2 GB of physical memory (very unusual for a production server; I recently joined here), and SQL Server has been assigned just 300 MB of fixed memory. I also observed that in one of the databases dbcc loginfo returns 1120 rows, which means a great many VLFs have been created. I also checked Avg. Disk Queue Length, and it is consistently above 3. The cache hit ratio is good, i.e. >95%, and page life expectancy is also very good, i.e. 1200. So I am confused whether it is memory, disk I/O, or a combination of both.
Some of these could be considered new questions, but I'll try:

1.) How big are your databases, and how much activity occurs on the server? Limiting SQL Server to 300 MB sounds awfully small. A high cache hit ratio means you are rarely going outside the cache to get data, which is good, so 300 MB may be enough, but not necessarily. Generally, the more RAM the better, but size it intelligently; there is no need for complete overkill that wastes money.

2.) Yes, that dbcc loginfo output tells me you might be doing a lot of shrinks and grows (or at least a lot of log growths). Those are not cheap operations. They are better in 2005, with the right OS and with instant file initialization if the service account has the right permissions (http://msdn.microsoft.com/en-us/library/ms175935.aspx). Note that instant file initialization applies to data files only; log growths are still zeroed out. Even with instant file initialization you should still "right size" your data and log files to avoid excess growths, and avoid shrinks if the file is just going to grow again.

3.) A "high" disk queue length can spell a problem. With SANs and multiple disk arrays the queue number is not always 100% accurate, but look at your current disk queue length as well and see what it tends to be the majority of the time. I like to see it at 0 most of the time, with some spikes (say, during a checkpoint). What are your disk wait times and reads/writes per second? Is your I/O subsystem keeping up with you? Are you on a SAN, internal drives, an external array, etc.? How are the disks set up (RAID level, spindle size, speed, connection method, etc.)?

4.) Do you have data and log files on separate physical drives? What are the most common waits you are seeing? How well written are the queries that are being hung up? Are you doing a lot of excess reads from table scans, poorly indexed joins, a bad data model, etc.?
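To put the 1120 rows from point 2 in perspective, here is a rough back-of-the-envelope sketch of how small log growths add up to many VLFs. It assumes the commonly cited rule of thumb for SQL Server versions of this era (before the algorithm changed in SQL Server 2014): a chunk under 64 MB creates 4 VLFs, 64 MB up to 1 GB creates 8, and 1 GB or more creates 16. The function names and the 1 MB growth scenario are hypothetical, not taken from the asker's server.

```python
MB = 1024 * 1024


def vlfs_for_growth(chunk_bytes: int) -> int:
    """VLFs created when the log is created or grown by one chunk.

    Assumption: the pre-2014 rule of thumb (4 / 8 / 16 VLFs by chunk size).
    """
    if chunk_bytes < 64 * MB:
        return 4
    if chunk_bytes < 1024 * MB:
        return 8
    return 16


def total_vlfs(initial_bytes: int, growth_bytes: int, growths: int) -> int:
    """VLF count after creating the log and growing it `growths` times."""
    return vlfs_for_growth(initial_bytes) + growths * vlfs_for_growth(growth_bytes)


# Hypothetical scenario: a 1 MB log that autogrows 1 MB at a time, 279 times.
many_small_growths = total_vlfs(1 * MB, 1 * MB, 279)

# Same final size (280 MB) allocated in a single step, with no growths.
right_sized = total_vlfs(280 * MB, 64 * MB, 0)

print(many_small_growths, right_sized)
```

Under these assumptions, a log that reached 280 MB through tiny increments ends up with over a thousand VLFs, while the same size allocated up front has only a handful. That is the practical argument for right-sizing the log instead of relying on autogrow.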
Hi Walsh,
Thanks for clearing up my doubts. It looks like a disk problem. But I am curious about RAID. How can I tell whether each disk is in RAID 5, 10, 1, or 0, and find the other details you mentioned (RAID level, spindle size, speed, connection method, etc.)? Please let me know a way of getting these details (commands, queries). Thanks a lot for your help.
It all depends on your system and whether you are doing hardware RAID or software RAID (if you are even doing RAID). Check with a system admin on the OS/hardware side.