Solved

Pitfalls of running DBCC CHECKDB on a database server where it has not been run in 1000+ days

Posted on 2014-08-05
Medium Priority
153 Views
Last Modified: 2014-08-11
I have a database on which I have run every optimization I can think of. We have tuned the server about as well as we are going to get it, and I have tuned the queries as well; response times are reasonable.

The only thing left to do is to get caught up on maintenance. CHECKDB has not been run in over 1000 days; I actually do not think it has ever been run. I am afraid to run it because of what it might find. If the data is corrupt, the backups will be corrupt too. There is potential loss of data if fixes are needed, and certainly some work if anything else is found.
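One hedge against that risk (a sketch with a placeholder database name and path, not from this thread) is to take a checksummed, copy-only backup before the first check and confirm the file is readable:

```sql
-- Capture the current state with page checksums before the first CHECKDB;
-- COPY_ONLY keeps this backup out of the regular backup chain.
BACKUP DATABASE [YourDb]
    TO DISK = N'X:\Backups\YourDb_prechk.bak'
    WITH CHECKSUM, COPY_ONLY, STATS = 10;

-- Confirm the backup file itself is complete and readable.
RESTORE VERIFYONLY
    FROM DISK = N'X:\Backups\YourDb_prechk.bak'
    WITH CHECKSUM;
```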

They were on RTM and I had them patch to SP3. Does this help?

What issues could I face? Would anyone advise a course of action? PHYSICAL_ONLY at first? NOINDEX? I just created about the only indexes that are there.

Thoughts?
Question by:mike1142
12 Comments
 
LVL 70

Expert Comment

by:Scott Pletcher
ID: 40241644
How long it's been won't affect the time it takes to do the checks.  You just have to consider db size.

You could certainly start with a "NOINDEX" and "PHYSICAL_ONLY" check.  If you have off-hours time, you can do a more thorough check then.
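A sketch of that staged approach (the database name is a placeholder):

```sql
-- Fast first pass: page checksums and allocation consistency only.
DBCC CHECKDB (N'YourDb') WITH PHYSICAL_ONLY, NO_INFOMSGS;

-- Deeper logical checks, but skipping the intensive nonclustered-index checks.
DBCC CHECKDB (N'YourDb', NOINDEX) WITH NO_INFOMSGS;

-- Off-hours: the complete check, reporting every error found.
DBCC CHECKDB (N'YourDb') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```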


Btw, as to general performance tuning, did you review "missing index" and "index usage" stats?  The single biggest factor in performance is getting the best clustered index on every table.  Has that been done already?
 

Author Comment

by:mike1142
ID: 40241682
Yes, that is what I have been doing. All the tables come from the vendor with clustered indexes. I have been adding some nonclustered indexes here and there, and query performance improved drastically. I have rebuilt indexes, and for the ones I did not, I made sure the stats were up to date.

The question, though, is what 'bad' things could it find after all these years? I have never seen one go this long. I am remote from the server, and the admins are not really DBAs. It is a guest on a VMware host, and I don't know VMware well.

Thanks
 

Author Comment

by:mike1142
ID: 40241688
I do have RDP access to the actual server but have been connecting through a VPN tunnel instead of working directly on the server.
 
LVL 70

Expert Comment

by:Scott Pletcher
ID: 40241694
Just because a table has a clustered index doesn't mean it's the right clustered index.  Indeed, from a vendor it's a virtual guarantee it's not the best.

Again, have you actually looked at the stats from SQL Server itself?  Have you reviewed the top I/O queries?

Certainly you need to get the databases checked, but that won't affect performance; it just ensures that the data is usable. Vital to do, of course, but not sufficient by itself.
 

Author Comment

by:mike1142
ID: 40241771
Yes, but I have no choice: nonclustered is OK, but touch the clustered indexes and there's no support. I have done quite a few DMV queries. What are you suggesting in particular? I have looked at the top queries by CPU, duration, logical reads, physical reads, and logical writes, plus wait stats and I/O stats. What should I be looking for in particular?
 
LVL 70

Expert Comment

by:Scott Pletcher
ID: 40241848
Indeed, I do have a specific script I'll provide.  But the best clustering key is critical for performance, so your tuning chances outside of it will be limited.

Biggest trouble signs to look for are:
1) a significant number of user scans of clustered indexes, especially on large tables
2) a lot of nonclustered indexes with the same key column(s)
3) a lot of included columns in nonclustered indexes


USE [<your_db_name_here>]

SET DEADLOCK_PRIORITY LOW --probably irrelevant, but just in case

DECLARE @list_missing_indexes bit
DECLARE @table_name_pattern sysname

--NOTE: showing missing indexes can take some time; set to 0 if you don't want to wait.
SET @list_missing_indexes = 1
SET @table_name_pattern = '%' --'%'=all tables.

PRINT 'Started @ ' + CONVERT(varchar(30), GETDATE(), 120)

IF @list_missing_indexes = 1
BEGIN
    SELECT
        GETDATE() AS capture_date,
        DB_NAME(mid.database_id) AS Db_Name,
        dps.row_count,
        OBJECT_NAME(mid.object_id /*, mid.database_id*/) AS Table_Name,
        mid.equality_columns, mid.inequality_columns, mid.included_columns,
        user_seeks, user_scans, ca1.max_days_active, unique_compiles,
        last_user_seek, last_user_scan, avg_total_user_cost, avg_user_impact,
        system_seeks, system_scans, last_system_seek, last_system_scan, avg_total_system_cost, avg_system_impact,
        mid.statement, mid.object_id, mid.index_handle
    FROM sys.dm_db_missing_index_details mid WITH (NOLOCK)
    CROSS APPLY (
        SELECT DATEDIFF(DAY, create_date, GETDATE()) AS max_days_active FROM sys.databases WHERE name = 'tempdb'
    ) AS ca1
    LEFT OUTER JOIN sys.dm_db_missing_index_groups mig WITH (NOLOCK) ON
        mig.index_handle = mid.index_handle
    LEFT OUTER JOIN sys.dm_db_missing_index_group_stats migs WITH (NOLOCK) ON
        migs.group_handle = mig.index_group_handle
    LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
        dps.object_id = mid.object_id AND
        dps.index_id IN (0, 1)
    --order by
        --DB_NAME, Table_Name, equality_columns
    WHERE
        1 = 1
        AND mid.database_id = DB_ID() --only current db
        AND OBJECT_NAME(mid.object_id) LIKE @table_name_pattern
        --AND mid.object_id IN (OBJECT_ID('<table_name_1>'), OBJECT_ID('<table_name_2>'))
    ORDER BY
        --avg_total_user_cost * (user_seeks + user_scans) DESC,
        Db_Name, Table_Name, equality_columns, inequality_columns
END --IF

PRINT 'Midpoint @ ' + CONVERT(varchar(30), GETDATE(), 120)
-- list index usage stats (seeks, scans, etc.)
SELECT
    ius2.row_num, DB_NAME() AS db_name,
    CASE WHEN i.name LIKE ca2.table_name + '%'
         THEN '~' + SUBSTRING(i.name, LEN(ca2.table_name) + 1 +
                  CASE WHEN SUBSTRING(i.name, LEN(ca2.table_name) + 1, 1) = '_' THEN
                      CASE WHEN SUBSTRING(i.name, LEN(ca2.table_name) + 2, 1) = '_' THEN 2 ELSE 1 END
                  ELSE 0 END, 200)
         ELSE i.name END AS index_name,
    CASE WHEN i.is_unique = 0 THEN 'N' ELSE 'Y' END + '.' +
    CASE WHEN i.is_primary_key = 0 AND i.is_unique_constraint = 0 THEN 'N' ELSE 'Y' END AS [uniq?],
    ca2.table_name,
    i.index_id, --ius.user_seeks + ius.user_scans AS total_reads,
    dps.row_count,
    SUBSTRING(key_cols, 3, 8000) AS key_cols, SUBSTRING(nonkey_cols, 3, 8000) AS nonkey_cols,
    ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates,
    ius.last_user_seek, ius.last_user_scan, ius.last_user_lookup, ius.last_user_update,
    fk.Reference_Count AS fk_ref_count,
    DATEDIFF(DAY, CASE WHEN o.create_date > ca1.sql_startup_date THEN o.create_date
        ELSE ca1.sql_startup_date END, GETDATE()) AS max_days_active,
    FILEGROUP_NAME(i.data_space_id) AS filegroup_name,
    ius.system_seeks, ius.system_scans, ius.system_lookups, ius.system_updates,
    ius.last_system_seek, ius.last_system_scan, ius.last_system_lookup, ius.last_system_update
FROM sys.indexes i WITH (NOLOCK)
INNER JOIN sys.objects o WITH (NOLOCK) ON
    o.object_id = i.object_id
CROSS JOIN (
    SELECT create_date AS sql_startup_date FROM sys.databases WHERE name = 'tempdb'
) AS ca1
CROSS APPLY (
    SELECT OBJECT_NAME(i.object_id/*, DB_ID()*/) AS table_name
) AS ca2
OUTER APPLY (
    SELECT
        ', ' + COL_NAME(object_id, ic.column_id)
    FROM sys.index_columns ic
    WHERE
        ic.key_ordinal > 0 AND
        ic.object_id = i.object_id AND
        ic.index_id = i.index_id
    ORDER BY
        ic.key_ordinal
    FOR XML PATH('')
) AS key_cols (key_cols)
OUTER APPLY (
    SELECT
        ', ' + COL_NAME(object_id, ic.column_id)
    FROM sys.index_columns ic
    WHERE
        ic.key_ordinal = 0 AND
        ic.object_id = i.object_id AND
        ic.index_id = i.index_id
    ORDER BY
        COL_NAME(object_id, ic.column_id)
    FOR XML PATH('')
) AS nonkey_cols (nonkey_cols)
LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
    dps.object_id = i.object_id AND
    dps.index_id = i.index_id
LEFT OUTER JOIN sys.dm_db_index_usage_stats ius WITH (NOLOCK) ON
    ius.database_id = DB_ID() AND
    ius.object_id = i.object_id AND
    ius.index_id = i.index_id
LEFT OUTER JOIN (
    SELECT
        database_id, object_id, MAX(user_scans) AS user_scans,
        ROW_NUMBER() OVER (ORDER BY MAX(user_scans) DESC) AS row_num --user_scans|user_seeks+user_scans
    FROM sys.dm_db_index_usage_stats WITH (NOLOCK)
    WHERE
        database_id = DB_ID()
        --AND index_id > 0
    GROUP BY
        database_id, object_id
) AS ius2 ON
    ius2.database_id = DB_ID() AND
    ius2.object_id = i.object_id
LEFT OUTER JOIN (
    SELECT
        referenced_object_id, COUNT(*) AS Reference_Count
    FROM sys.foreign_keys
    WHERE
        is_disabled = 0
    GROUP BY
        referenced_object_id
) AS fk ON
    fk.referenced_object_id = i.object_id
WHERE
    i.object_id > 100 AND
    i.is_hypothetical = 0 AND
    i.type IN (0, 1, 2) AND
    o.type NOT IN ( 'IF', 'IT', 'TF', 'TT' ) AND
    (
     o.name LIKE @table_name_pattern AND
     o.name NOT LIKE 'dtprop%' AND
     o.name NOT LIKE 'filestream[_]' AND
     o.name NOT LIKE 'MSpeer%' AND
     o.name NOT LIKE 'MSpub%' AND
     --o.name NOT LIKE 'queue[_]%' AND
     o.name NOT LIKE 'sys%'
    )
    --AND OBJECT_NAME(i.object_id /*, DB_ID()*/) IN ('tbl1', 'tbl2', 'tbl3')
ORDER BY
    --row_count DESC,
    --ius.user_scans DESC,
    --ius2.row_num, --user_scans&|user_seeks
    db_name, table_name,
    -- list clustered index first, if any, then other index(es)
    CASE WHEN i.index_id IN (0, 1) THEN 1 ELSE 2 END,
    key_cols

PRINT 'Ended @ ' + CONVERT(varchar(30), GETDATE(), 120)
 

Author Comment

by:mike1142
ID: 40241990
Biggest trouble signs to look for are:
1) a significant number of user scans of clustered indexes, especially on large tables
2) a lot of nonclustered indexes with the same key column(s)
3) a lot of included columns in nonclustered indexes

I need a little more specificity here. I have a 50-row index that was scanned 4,000 times and a 250-million-row index that was never scanned. Large number of includes; I'd say the biggest has about a dozen or so.
 

Author Comment

by:mike1142
ID: 40242033
Also, for the missing indexes: of the 120 it is suggesting, I am only interested in 38, and 25 of those are on the same table. This table is very interesting to me. Finally, there is one equality column that I intuitively know, from working with the application, would probably have a big impact. That column is listed as the only key on 15 of them. An additional 8 have a combination of keys, most including the first column of the composite primary key.

What is the tie breaker?
 
LVL 70

Accepted Solution

by:
Scott Pletcher earned 2000 total points
ID: 40242052
>> I need a little more specificity here. <<
Yeah, it's tough to give absolute numbers on something like this.  If I had the actual missing index and index usage results, preferably in a spreadsheet, I could go through, step by step, the way I analyze it.

But it's just not a process that a simple cookbook list can duplicate.  (Hurray for that, btw, since that's my job!)

Also, you have to look at how many days' activity the stats show.  One day likely won't be a full representation, whereas 30+ days' worth has a pretty good chance of being very representative of typical table use.


>> 250 million row index that was never scanned <<
Were there a significant number of seeks to it?  If so, that's great, that's how indexes should typically be used.


>> Large number of includes. I'd say the biggest has about a dozen or so. <<
Could be a sign of wrong clustering key.


>> Of those the column is listed as the only key on 15. An additional 8 have a combination of keys most include the first column of the composite primary key. <<
Almost certain that column should be the lead clustering key on that table, although, again, I'd have to see specific results to be sure.
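If the vendor ever approves such a change, re-keying a clustered index can be done in one step. A sketch with hypothetical table and column names (note that an index backing a PRIMARY KEY constraint cannot be re-keyed this way; the constraint itself would have to be dropped and recreated):

```sql
-- Hypothetical names. DROP_EXISTING rebuilds the clustered index on the new
-- key in a single operation, so nonclustered indexes are rebuilt once, not twice.
CREATE UNIQUE CLUSTERED INDEX [CIX_OrderDetail]
    ON dbo.OrderDetail (CustomerId, OrderId)
    WITH (DROP_EXISTING = ON, SORT_IN_TEMPDB = ON);
```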
 

Author Comment

by:mike1142
ID: 40251419
Thanks for your feedback. More information on indexing is especially useful for me due to the constraints I usually encounter. I like the idea of providing a list of changes and additions with the rationale and stats to back it up. I have also seen that providing a rollback/roll-forward script is helpful.

I am being admonished about how I neglected the question, so let me return to it. I am concerned about running CHECKDB on this database, and a colleague gave me a similar non-answer (no offense). This is/was a badly neglected database; there have been constant problems with memory/CPU pressure and poor performance. It's tough asking questions like this in a direct way, but I asked several times about a memory limit on the host side, since this is a VM. Having used MS Hyper-V and not VMware, I guess I did not ask the question the right way, because it was later "discovered" that this had indeed occurred.
So: better performance, happier people. Now suppose I go and run CHECKDB, and it is discovered that data integrity has been compromised. Given the amount of time that has elapsed, the general state of maintenance, and the tool's intended purpose, what are the risks? Ignoring it seems irresponsible; causing more problems is not good either. That's the dilemma.
 

Author Closing Comment

by:mike1142
ID: 40253393
ScottPletcher,

I think I have put you in a situation where it would be difficult to give a recommendation. So thank you for your input; as always, great job!
 
LVL 70

Expert Comment

by:Scott Pletcher
ID: 40253575
Thanks.


But I believe I did directly address the question in the first sentence of my first post: the length of time doesn't affect the process of running CHECKDB.


If you have data integrity issues, review all copies of the SQL Server error logs you have available. The default retention is only 7, unfortunately (the current log plus six prior ones, named "ERRORLOG.1", "ERRORLOG.2", etc.). The first time -- and only the first time -- corruption is detected by a SQL process, detailed messages describing the condition(s) found will be written to the log.
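A sketch of that search (the filter string is an assumption; the exact wording of corruption messages varies by error type):

```sql
-- Search the current error log (0) for corruption-related messages;
-- repeat with 1 through 6 for the retained prior logs.
EXEC sys.sp_readerrorlog 0, 1, N'corrupt';

-- Pages SQL Server has already flagged as damaged are also recorded here.
SELECT database_id, file_id, page_id, event_type, error_count, last_update_date
FROM msdb.dbo.suspect_pages;
```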
