Solved

SQL Server Database - One Table has read issues - A mystery to solve!

Posted on 2014-04-01
10
164 Views
Last Modified: 2014-04-03
We have a 2008 R2 SQL Server with a number of minor databases on them.  One database, called ProdData had a situation last night that I struggle to find the root cause (or even a theory).

On our production floor, we have about 120 stations running a VB App which communicates with the SQL Database to update production, downtime, etc.

Within the ProdData database, there is a table called downtime, which simply writes a single record everytime downtime is recorded on one of the production stations.  It records the ID, Down Start, Down End, StationName, and the duration.  There are probably no more than 250-300 appends to this table every 8 hours.

Around 10PM last night during a shift change, the stations were filing their production numbers.  To do the final filing, they have to access 4 separate tables to assemble the data needed to file into our main system.

What we noticed was, when the stations attempted to collect the data from the downtime table, they were timing out and shutting down, thus not completing the filing process. When we linked into the network to troubleshoot, we found that we could not access the downtime table.  All other tables responded as they usually do.  The Downtime table is actually the smallest table, at only 190,000 records.  We purged 70,000 records, and there was no immediate improvement.  

5 hours later, everything is back to normal.  Like nothing was ever wrong.

Observations:

- All stations could continue to write to the downtime table without issue all night. We could not find a noticable pattern to suggest which station may have triggered the issue, or even what time the issue may have started.
- No stations could query the table in a reasonable amount of time, thus the timeouts.
- When we attempted to access the table, a select * would take 9 minutes
- On neither the workstations, nor the server did we observe:
    - Any event log items both Windows or SQL
    - No CPU, Memory or Network abnormalities.  No spikes at all.
- Now reviewing the data in the downtime table, we can find no data abnormalities in the last 2 weeks of data.  No special characters, missing fields, incomplete dates, nothing unexpected at all.

We have already made code changes to the VB App to accomodate a timeout and just omit downtime information if necessary, however the root cause of the issue is a complete and unexplainable mystery here.

So, anyone have any ideas?

It is  the worst kind of problem to me that has no explaination or theory to fall back on to fix.
0
Comment
Question by:TheMetalicOne
10 Comments
 
LVL 39

Accepted Solution

by:
lcohan earned 200 total points
ID: 39970268
WITH (NOLOCK) SQL hint is your answer in my opinion and I know there are a lot of PROs and CONs against it but I'm using it with success since SQL re-write to 2000 version.
Assuming you use default READ COMMITTED isolation level I would add it to ALL tables participating in select and joins regardless to avoid locking/blocking.
I would also install and use SQL own Performance Dashboard Reports to see what missing indexes you can add to help your long running queries.

Performance Dashboard

http://blogs.technet.com/b/rob/archive/2009/02/18/performance-dashboard-reports-for-sql-server-2008.aspx
http://blogs.msdn.com/b/sqlserverfaq/archive/2010/05/27/sql-server-performance-dashboard-reports-in-ssms-introduction-install-datediff-error-amp-modified-for-sql-2008.aspx
http://blogs.msdn.com/b/arvindsh/archive/2010/06/25/performance-dashboard-reports-in-sql-server-2008.aspx
0
 
LVL 5

Author Comment

by:TheMetalicOne
ID: 39970342
Thank you very much!  I will take your advice and see if it helps track down the issue.  I will close out the question tomorrow just in case there are some more ideas that may float by.

Regards!
0
 
LVL 29

Assisted Solution

by:Rich Weissler
Rich Weissler earned 200 total points
ID: 39970418
Mostly questions.

You didn't have a grow operation occur against the database file (not log file) at the same time as your issue, did you?

Is that downtime table, really a table or is it just a *shiver* heap?
0
 
LVL 5

Author Comment

by:TheMetalicOne
ID: 39970463
I will check for a grow op.

Sadly (I think) I have no idea what a shiver heap is. It is definitely a table in the db

Thanks
0
 
LVL 29

Expert Comment

by:Rich Weissler
ID: 39970522
Sorry... I was trying to be funny with the *shiver* part.  (It was a mental shake.)

A 'table' without a clustered index is properly referred to as a heap.  One of the problems with a heap, as I recall, is that every read against the table has to perform a complete table scan... it has to read ALL the records in the table.  Writes can be faster, because there shouldn't be page splits.
0
How to improve team productivity

Quip adds documents, spreadsheets, and tasklists to your Slack experience
- Elevate ideas to Quip docs
- Share Quip docs in Slack
- Get notified of changes to your docs
- Available on iOS/Android/Desktop/Web
- Online/Offline

 
LVL 5

Author Comment

by:TheMetalicOne
ID: 39970576
Lol. OK, that makes sense. I would suspect the table is a heap.   We are installing the performance dashboard as suggested to see where we can improve. Any suggestions on resources for best practice for indexing?

Thank you both for your contributions.
0
 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 50 total points
ID: 39970672
I too think that the [downtime] table being a heap could be your problem.

You can run this command to check, and to fix the problem if it is at the same time:

--Note: If you get an error about online rebuild not being available,
--remove "ONLINE = ON," from the command and re-run.
CREATE UNIQUE CLUSTERED INDEX downtime__CL ON downtime ( [Down Start], StationName ) WITH ( FILLFACTOR = 98, ONLINE = ON, SORT_IN_TEMPDB = ON ) ON [PRIMARY]

If the table is not a heap, i.e. it already has a clustered index, you will get a message stating that.
0
 
LVL 5

Author Comment

by:TheMetalicOne
ID: 39972099
The Table is indeed a clustered index already.  I suppose perhaps that is a default?

Thank you for all your contributions.  I am going to hold onto this post for the remainder of the day so I can post any actions.
0
 
LVL 48

Assisted Solution

by:PortletPaul
PortletPaul earned 50 total points
ID: 39974019
Perhaps the existing clustered index is less efficient than it could be.

It may be useful to review all the indexes on this table.
0
 
LVL 5

Author Closing Comment

by:TheMetalicOne
ID: 39976370
Thank you all for your feedback.  We are monitoring the system with the performance dashboard, the table is a cluster, but we will be reviewing all tables in the databases for index improvement opportunities.  

The app has been coded to detect a table problem like this moving forward, and we revised the code to cut the number of reads down by 60% just by improving code efficiency.

Problem has not re-occured, hopefully it will stay away.  

Thank you again!
0

Featured Post

Complete VMware vSphere® ESX(i) & Hyper-V Backup

Capture your entire system, including the host, with patented disk imaging integrated with VMware VADP / Microsoft VSS and RCT. RTOs is as low as 15 seconds with Acronis Active Restore™. You can enjoy unlimited P2V/V2V migrations from any source (even from a different hypervisor)

Join & Write a Comment

Suggested Solutions

I wrote this interesting script that really help me find jobs or procedures when working in a huge environment. I could I have written it as a Procedure but then I would have to have it on each machine or have a link to a server-related search that …
International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.
Via a live example, show how to set up a backup for SQL Server using a Maintenance Plan and how to schedule the job into SQL Server Agent.

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

18 Experts available now in Live!

Get 1:1 Help Now