Solved

SQL Server has encountered 17 occurrence(s) of IO requests taking longer than 15 seconds to complete on file

Posted on 2008-06-20
6
1,980 Views
Last Modified: 2011-10-03
Hi,
I recieved following errors in one of the sql server today morning.
1. 2008-06-17 21:52:55.42 server    Insufficient memory available..  
2.2008-06-20 07:30:30.40 spid55    SQL Server has encountered 17 occurrence(s) of IO requests taking longer than 15 seconds to complete on file [E:\Program Files\Microsoft SQL Server\MSSQL\data\MYDATABASE3.mdf] in database [MYDATABASE2] (9).  The OS file handle is 0x0000 0
0430.  The offset of the latest long IO is: 0x000000b4bcc000        
3. 2008-06-17 21:54:19.90 server    Process 12:0 (fe8) UMS Context 0x002A8DD0 appears to be non-yielding on Scheduler 3.
4.2008-06-17 21:47:19.91 server    Potential deadlocks exist on all the schedulers.  

Today morning customer reported the timeouts for thier application. PLease help me on this.
                   
3.                                                                                                  
0
Comment
Question by:Veereshvnashi
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 2
6 Comments
 
LVL 13

Accepted Solution

by:
MikeWalsh earned 500 total points
ID: 21831356
This blog post starts some discussion about this note you are seeing:

http://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/21/642314.aspx

Seeing that message combined with slow queries tells me you are probably experiencing something that is causing I/O stalls or delays in getting data off of the disk.

There is no one answer. I would look at your disk queues on your data/log drive are they high? Is your I/O subsystem having issues (look at your system event log), have you changed or increased I/O load (are you seeing a lot of paging?), Have you changed anything at the physical end? Is your buffer cache hit ratio low indicating a lot more reads than normal (may indicate a memory bound system). What other activity is happening on this server? Virus Scan doing real time scan could be slowing you down, etc.

Look at SQL, are your indexes defragmented, are your queries well written and taking advantage of indexes? The list goes on but I would get a general idea of some SQL Server/System performance monitor counters to look at some of the above, find where the issues are and then methodically rule causes in or out.
0
 
LVL 13

Expert Comment

by:MikeWalsh
ID: 21831383
Also check out www.sql-server-performance.com  this is a good all around performance tuning website. It discusses in detail many of the performance monitor counters you can look at (http://www.sql-server-performance.com/tips/sql_server_performance_monitor_coutners_p1.aspx) and also discusses various tuning techniques and troubleshooting techniques. Easy site to search.
0
 

Author Comment

by:Veereshvnashi
ID: 21834708
In the server i confirmed that there is just 2gb physical Memory (very unusual for production servers) I recently joined here. and sql server has been assigned just 300mb fixed memory. I also observed in oneo the database the dbcc logfnfo is generating 1120 rows. Means so many vlfs created, But I checked that avg, disk que length is consistantly showing above 3. I have checked the cache hit ratio it is good ie >95%. and page life time expectancy is also very good ie 1200.  So i was just confused wether its a memory or the disk i/o or the combination of both.
0
What Is Transaction Monitoring and who needs it?

Synthetic Transaction Monitoring that you need for the day to day, which ensures your business website keeps running optimally, and that there is no downtime to impact your customer experience.

 
LVL 13

Expert Comment

by:MikeWalsh
ID: 21835387
So some of these could be considered new questions but i'll try:

1.) How big are your databases, how much activity occurs on the server? Limiting SQL Server to 300mb sounds awfully small. Cache Hit ratio being higher means you are not going out of the cache much to get data which is good so that may be enough but not necessarily. The more RAM the better generally but with intelligence. No need in going complete overkill wasting money.

2.)Yeah that dbcc loginfo tells me you might be doing a lot of shrinks and grows (or at least a lot of log growths). Those are not a cheap operation. They are made better in 2005 (with the right OS and instant file initialization if the right permissions exist on the service account (http://msdn.microsoft.com/en-us/library/ms175935.aspx) Even with instant file initization you still should "right size" your data and log files to avoid excess growths and try and avoid shrinks if the file is just going to grow.

3.) A Disk Queue length being "high" can spell a problem. WIth SANs and multiple disk arrays the queue number is not always 100% accurate but look at your current disk queue length also and see what it tends to be the majority of the time. I like to see that at a 0 the majority of the time with some spikes (say during a checkpoint, for example). What are your disk wait times and reads/writes/sec. Is your I/O subsystem keeping up with you? Are you on a SAN, internal drives, external array, etc? What kind of setup are the disks (RAID level, spindle size, speed, connection method, etc)

4.) Do you have data and log files on separate physical drives? What are the most common wait times you are seeing? How well written are the queries that are being hung up? Are you doing a lot of excess reads from table scans, poorly indexed joins, bad data model, etc.
0
 

Author Comment

by:Veereshvnashi
ID: 21836417
HI Walsh,
Thanks for clearing my doubts, It looks like a disk problem. But i am curious about RAIDS. How will i know that each disk is in RAID 5 OR 10 OR 1 OR 0. and other details as you told RAID level, spindle size, speed, connection method, etc. Please let me know a way of getting these details. (Commands,queries). Thanks a lot for your Help
0
 
LVL 13

Expert Comment

by:MikeWalsh
ID: 21837136
It all depends on your system and if you are doing hw raid vs software raid (if you are even doing raid).. Check with a system admin on the O/S/Hardware side.
0

Featured Post

Is Your DevOps Pipeline Leaking?

Is your CI/CD pipeline a hodge-podge of randomly connected tools? You’ve likely got a tool to fix one problem & then a different tool to fix another, resulting in a cluster of tools with overlapping functionality. Learn how to optimize your pipeline with Gartner's recommendations

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

How to leverage one TLS certificate to encrypt Microsoft SQL traffic and Remote Desktop Services, versus creating multiple tickets for the same server.
Ever needed a SQL 2008 Database replicated/mirrored/log shipped on another server but you can't take the downtime inflicted by initial snapshot or disconnect while T-logs are restored or mirror applied? You can use SQL Server Initialize from Backup…
Familiarize people with the process of utilizing SQL Server functions from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Microsoft Ac…
Via a live example, show how to setup several different housekeeping processes for a SQL Server.

734 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question