SQL Server torn pages

Posted on 2005-04-12
Last Modified: 2008-01-09

I’m receiving an 823 error message indicating I’m having a torn page.  What I’m, supposed to do is recover the database.  I’ve done that and I continue to get the error.  If my understanding of a torn page is correct, the error occurred on the write of the data and would not be detected until the page is actually read again. If the page is not read for a while the error could have indeed occurred a while, more than the last backup, ago.  I’ve gone back a month of restores and the problem still persists.  

This seems suspicious to me.  If I understand how the users of the database actually use the data, I don’t think the problem could have festered that long without being read. But then again the user may not be using the data how I think they are using it.  Anyway, after a restore I do a DBCC CHECKDB  and CHECKTABLE WITH PHYSICAL_ONLY  and nothing is found.  The doc say torn pages should be detected.  

Question 1.  How accurate is it to say that DBCC CHECKDB will detect torn pages.  Is the online doc correct?

The database seems to run fine until I attempt to do the following from a nightly maintenance job.

delete from incoming where date_time < (getdate() - 45)

I receive the 823 error when the job executes.  If I issue

Select * from incoming where date_time < (getdate() - 45)

It returns rows and makes me scratch my head in wonderment because it appears to there is no torn page on a select * but there is on a delete.

Question 2.  Could someone explain why this may be the case?

Question 3.  If this is an active physical intermittent device problem where would I go to detect an actual device issue?

Question 4. Since I’ve exhausted my month of nightly backups and the problem still occurs, could a way around this problem be to

1 turn torn page detection off,
2 create a temp table
3 copy into the temp table all the data that is more recent then 45 days.
4 drop and recreate the old table
5 move the data from the temp table to the new table.

I realize some data may be lost, but I think I will have been able to retain some of the most recent data which will be used in users report generation.

Thanks for your time and effort.

Dan H.
Question by:dhully
    LVL 15

    Expert Comment

    Hi dhully,

    Please can you post the full error message?
    LVL 15

    Assisted Solution


    Author Comment

    error messages is as follows.

    I/O error (torn page) detected during read at offset 0x000009162f2000 in file 'D:\Websense Reporter\wslogdb50.mdf'..

    followed by

    Error while redoing logged operation in database 'wslogdb50'. Error at log record ID (295960:94705:141)..


    Author Comment

    We do have compaq equipment and cacheing is turned on SP3a applied.  Will work on getting cahcing turned off.

     Currently I had the strange notion that the torn page may have been on an index built on the date time column of the incoming table and not the table itself.  Just a strange though I had in a reduced sleep state.  Running the job with the index dropped. Takes a while with over 135 million rows.  
    LVL 27

    Expert Comment

    A torn page indicates an incomplete I/O operation, maybe caused by power failures or bit operations. If torn page detection is ON, then a bit is flipped for each 512-byte sector of an 8KB page when the page is written to disk.

    (Maybe the reason why the SELECT does not cause the error and the DELETE does - a SELECT does not write to the disk)

    If the bit is in the wrong state when the page is later read by SQL Server, the page was written incorrectly. Disks perform I/0 operations in 512-byte sectors (16 sectors to a page). If the system crashes or a defective disk driver can cause a glitch between the time the operating system writes the first 512-byte sector to disk and the completion of the 8KB I/0 operation. This corrupts the data page, and makes your entire database corrupt. There is no way to fix a torn page except by restoring to a known good backup (because it is physically inconsistent).

    LVL 25

    Expert Comment

    If the problem is an index, though, you should be able to drop the index and rebuild.
    LVL 25

    Accepted Solution

    I also read this:

    Torn page in non-clustered indexes will be deleted when the index is
    rebuilt under any of the repair options. Any other kind of torn page can
    only be fixed with REPAIR_ALLOW_DATA_LOSS.

    You should move to new hardware and restore from a backup before considering
    running repair (as your last resort).
    LVL 27

    Assisted Solution

    Before running DBCC CHECKTABLE with the REPAIR_ALLOW_DATA_LOSS, I agree with jrb1, move the data.

    Run DBCC CHECKDB to determine all inconsistencies.
    RUN DBCC CHECKCATALOG to see if there are any larger issues related to the inconsistency.
    RUN DBCC CHECKTABLE to determine the inconsistency on the specific table
    Create another table with the same structure and add _NEW
    Run a SELECT statement with the WHERE clause to determine what can be read.
    INSERT SELECT the data into the _New table
    RUN DBCC CHECKTABLE to verify the table is error free
    Attempt to bring the remainder from a previous backup
    Backup the db
    Rename the original table _OLD
    Rename the copy to drop the _NEW suffix
    Drop _OLD
    Test the App
    Run DBCC CHECKDB to ensure consistency errors have been taken care of
    If this does not solve the problem, then run DBCC CHECKDB with REPAIR_ALLOW_DATA_LOSS, but expect to lose the data originally indicated in the DBCC CHECKDB statement plus the potential for any linked data.

    Author Comment

    As of this moment we are back and running.  

    1.      I used a backup of the previous night before the torn page message.  
    2.      Drop and rebuilt the index on the date_column of incoming table.
    3.      Ran the delete job multiple times deleting fewer rows of data at the each time as opposed to all the rows that needed to be deleted at the same time.

    I realize its bad practice to make two changes at the same time but the users and management were getting restless.  (Can’t blame them.)

    So it would appear there was a bad index or the controller cache was getting flooded.

    Thanks for your time and effort


    Write Comment

    Please enter a first name

    Please enter a last name

    We will never share this with anyone.

    Featured Post

    Maximize Your Threat Intelligence Reporting

    Reporting is one of the most important and least talked about aspects of a world-class threat intelligence program. Here’s how to do it right.

    APEX (Application Express) is used to develop a web application from Oracle. SQL Workshop is one of the tools that comes with Oracle APEX to query or modify the database objects or to make any changes to the structure.
    This article explains all about SQL Server Piecemeal Restore with examples in step by step manner.
    Video by: Steve
    Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
    Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

    761 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    8 Experts available now in Live!

    Get 1:1 Help Now