Render Failed

SharePoint 2007 Std SP2 - 32 bit
SQL 2005 SP3 - 32 bit

Issue found on the 28th November.

Navigated to a document library and found the error message:

<!-- #RENDER FAILED -->

Windows Error shows 5586

Tried the following:
- Modified existing views (Failed)
- Created new views (Failed)
- Created same webpart on same page (Failed - Render Failed)
- Created a new page with added library web part (Failed - Render Failed)


We rebooted all servers the following weekend, and when SharePoint came back up it showed the following error on the home page:

The "CalendarListView" Web Part appears to be causing a problem.  Warning: Fatal error 824 occurred at Dec 5 2011 8:12AM.  Note the error and time, and contact your system administrator.

Removed the web part, which allowed users to access the homepage.  If we re-add the web part, it shows Render Failed, and accessing the Calendar list directly shows the same error message.

In the content database, the results from dbcc checktable (alluserdata) are:

DBCC results for 'AllUserData'.
Msg 8928, Level 16, State 1, Line 1
Object ID 1365579903, index ID 1, partition ID 72057597721772032, alloc unit ID 72057597818306560 (type In-row data): Page (1:113913) could not be processed.  See other errors for details.
Msg 8939, Level 16, State 98, Line 1
Table error: Object ID 1365579903, index ID 1, partition ID 72057597721772032, alloc unit ID 72057597818306560 (type In-row data), page (1:113913). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 12716041 and -4.
Msg 8976, Level 16, State 1, Line 1
Table error: Object ID 1365579903, index ID 1, partition ID 72057597721772032, alloc unit ID 72057597818306560 (type In-row data). Page (1:113913) was not seen in the scan although its parent (1:623388) and previous (1:113912) refer to it. Check any previous errors.
Msg 8978, Level 16, State 1, Line 1
Table error: Object ID 1365579903, index ID 1, partition ID 72057597721772032, alloc unit ID 72057597818306560 (type In-row data). Page (1:113914) is missing a reference from previous page (1:113913). Possible chain linkage problem.
There are 18869 rows in 2153 pages for object "AllUserData".
CHECKTABLE found 0 allocation errors and 4 consistency errors in table 'AllUserData' (object ID 1365579903


Ran a checkdisk. The event log also shows:

Event Type:        Error
Event Source:    MSSQL$EIDE
Event Category:                (2)
Event ID:              824
Date:                     12/12/2011
Time:                     12:26:16 AM
User:                     DomainName\SystemAccountName
Computer:          ServerName
Description:
SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0xe8f1950b; actual: 0xb7c2b1c9). It occurred during a read of page (1:106841) in database ID 31 at offset 0x000000342b2000 in file 'E:\SQLServer\Data\WSS_Content_SharePoint.mdf'.  Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.


Can you please let me know what we should do for this scenario?  I know that repairing the database directly is not supported.

Thanks, Ezs





 
ezskol (Author) commented:
Thanks Marten.

We actually got it working in the test environment.  So we will do another trial run to ensure we took the right steps and apply it to the production server.

This is the approach that Microsoft provided us:

1.  Take backup of the main content database wss_content_intranetname

2.  Restore the backup into test environment SQL server with any name eg. wss_content_production

3.  On the test environment, attach the restored database, i.e. wss_content_production, to the website http://intranetname

4.  Run the DBCC check against the database

5.  Set the database to single_user

6.  Run dbcc checktable (alluserdata, repair_allow_data_loss)

7.  Set the database to multi_user

8.  Confirm everything is working on the test environment.

BEFORE THE NEXT STEP, ENSURE THAT YOU HAVE PERFORMED A GOOD BACKUP OF THE DATABASE.

9.  ONLY if the above is successful, run the query on the production environment.
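
For anyone following along, steps 1 to 7 look roughly like this in T-SQL. Treat it as a sketch under assumptions, not the exact script Microsoft gave us: the backup path, the logical file names and the target file paths are placeholders, so check the real logical names with RESTORE FILELISTONLY first. Step 3 (attaching the database to the web application) happens in SharePoint and is not shown.

```sql
-- Step 1 (production server): back up the main content database.
-- The backup path is a placeholder.
BACKUP DATABASE [wss_content_intranetname]
TO DISK = N'D:\Backup\wss_content_intranetname.bak'
WITH INIT;
GO

-- Step 2 (test SQL Server): list the logical file names in the backup,
-- then restore it under a new database name.
RESTORE FILELISTONLY
FROM DISK = N'D:\Backup\wss_content_intranetname.bak';
GO

-- 'LogicalDataName' and 'LogicalLogName' are placeholders for the names
-- returned by RESTORE FILELISTONLY; the target paths are placeholders too.
RESTORE DATABASE [wss_content_production]
FROM DISK = N'D:\Backup\wss_content_intranetname.bak'
WITH MOVE N'LogicalDataName' TO N'E:\SQLServer\Data\wss_content_production.mdf',
     MOVE N'LogicalLogName'  TO N'E:\SQLServer\Data\wss_content_production_log.ldf',
     RECOVERY;
GO

-- Step 4: confirm the errors on the restored copy.
USE [wss_content_production];
GO
DBCC CHECKTABLE (N'AllUserData') WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO

-- Step 5: repair options require single-user mode.
ALTER DATABASE [wss_content_production] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO

-- Step 6: run the repair. This can delete damaged rows and pages.
DBCC CHECKTABLE (N'AllUserData', REPAIR_ALLOW_DATA_LOSS);
GO

-- Step 7: let everyone back in.
ALTER DATABASE [wss_content_production] SET MULTI_USER;
GO
```

Running the plain CHECKTABLE once more after step 7 is a cheap way to confirm that the table now comes back clean.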



There were other solutions in their KB, but the above worked well for us.

Ezs.



Marten Rune (SQL Expert/Infrastructure Architect) commented:
What kind of backups are available?

ezskol (Author) commented:
Full and incremental backups are available.

Marten Rune (SQL Expert/Infrastructure Architect) commented:
Well, quote:
"DBCC results for 'AllUserData'.
Msg 8928, Level 16, State 1, Line 1
Object ID 1365579903, index ID 1, partition ID 72057597721772032, alloc unit ID 72057597818306560 (type In-row data): Page (1:113913) could not be processed.  See other errors for details.
Msg 8939, Level 16, State 98, Line 1
Table error: Object ID 1365579903, index ID 1, partition ID 72057597721772032, alloc unit ID 72057597818306560 (type In-row data), page (1:113913). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed."

This means something is wrong with index ID 1, the clustered index, which holds the table's data rows. See link:
http://www.sql-server-pro.com/dbcc-checkdb.html
Quote from link:
"If n is 0 or 1 you have data corruption and need to perform one of the options described below."

Another quote that's important for your case:
"Restoring from a backup
If the recovery model is FULL (or BULK_LOGGED, with some limitations), you can backup the tail of the log, perform a restore (with norecovery) from the last clean full backup, followed by subsequent log backups and finally the tail of the log."
Followed by:
"If only a few pages are affected you have the option of selectively restoring only the bad pages, as follows:
RESTORE DATABASE yourdb PAGE = '1:94299'
FROM DISK = 'C:\yourdb.bak'
WITH NORECOVERY"

So in your case it seems you can remedy your data by using this restore. I would first use DBCC PAGE to view the corruption if possible, though it might error out on you.
See and read thoroughly (especially the part about trace flag 3604, if it's new to you):
http://blogs.msdn.com/b/sqlserverstorageengine/archive/2006/06/10/625659.aspx
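
A minimal sketch of that inspection, assuming the database is named wss_content_intranetname (use your real content database name). DBCC PAGE is undocumented, so take the argument order here (database, file ID, page number, print option) as the commonly described usage rather than an official reference:

```sql
-- Send DBCC output to the client session instead of the error log.
DBCC TRACEON (3604);
GO

-- Page 1:113913 is the page reported by CHECKTABLE above:
-- file ID 1, page number 113913. Print option 3 asks for the
-- page header plus per-row detail.
DBCC PAGE (N'wss_content_intranetname', 1, 113913, 3);
GO

-- Switch the trace flag off again for this session.
DBCC TRACEOFF (3604);
GO
```

If the page is damaged badly enough, the command may simply fail with the same 824 error, which is itself a useful data point.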

Now a summary:
You have corruption (and it's not in the nonclustered indexes), which gives two options:
1. Do a restore from the last KNOWN GOOD backup. Just because a backup didn't fail, it's not certain that it doesn't contain corruption. Restore it to a test server and run a full DBCC CHECKDB (with NO_INFOMSGS, of course) on the restored database; if that comes back clean, you have a KNOWN GOOD backup.
2. Restore only the corrupt pages. This should do the trick in your case, IF, and that is a big IF, you have checksum and not torn-page detection on the database. It's practically impossible for a faulty page to have a correct checksum, least of all exactly your corrupt page. No, I would say this is safe.
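
Option 2 could look roughly like the sketch below. It assumes the full (or bulk-logged) recovery model, an unbroken chain of log backups since the full backup you restore from, and an offline page restore; the database name and all file paths are placeholders, and the number of log backups will differ in your environment.

```sql
USE [master];
GO

-- Check the preconditions first: recovery model and page verification.
SELECT name, recovery_model_desc, page_verify_option_desc
FROM sys.databases
WHERE name = N'wss_content_intranetname';
GO

-- 1. Back up the tail of the log. NORECOVERY leaves the database
--    in the RESTORING state, so users are offline from here on.
BACKUP LOG [wss_content_intranetname]
TO DISK = N'D:\Backup\wss_content_tail.trn'
WITH NORECOVERY;
GO

-- 2. Restore only the damaged page from the last KNOWN GOOD full backup.
--    Several pages can be listed, e.g. PAGE = '1:113913, 1:106841'.
RESTORE DATABASE [wss_content_intranetname]
PAGE = '1:113913'
FROM DISK = N'D:\Backup\wss_content_full_known_good.bak'
WITH NORECOVERY;
GO

-- 3. Restore every log backup taken since that full backup, in order.
RESTORE LOG [wss_content_intranetname]
FROM DISK = N'D:\Backup\wss_content_log_01.trn'
WITH NORECOVERY;
GO

-- 4. Restore the tail-log backup and bring the database online.
RESTORE LOG [wss_content_intranetname]
FROM DISK = N'D:\Backup\wss_content_tail.trn'
WITH RECOVERY;
GO
```

Try the whole sequence on a restored copy on the test server before touching production.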
Now you're good to go!

NO, you're not good to go.
First take a full backup once the database is back in a good state, then run DBCC CHECKDB again (now you have a new starting point). But here's the important stuff.
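
As a sketch of that new starting point (the database name and backup path are placeholders):

```sql
-- Full backup with page checksums verified as the pages are read.
-- This fails if any page still has a bad checksum.
BACKUP DATABASE [wss_content_intranetname]
TO DISK = N'D:\Backup\wss_content_intranetname_clean.bak'
WITH INIT, CHECKSUM;
GO

-- Check that the backup file is readable and its checksums are intact.
RESTORE VERIFYONLY
FROM DISK = N'D:\Backup\wss_content_intranetname_clean.bak'
WITH CHECKSUM;
GO

-- Full consistency check; NO_INFOMSGS keeps the output down to real errors.
DBCC CHECKDB (N'wss_content_intranetname') WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO
```

An empty result from CHECKDB is what you are hoping for here.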

CORRUPTION OCCURRED, and you NEED to figure out why, to prevent it from happening again. The worst scenario is that you do this page restore and all is well, and tomorrow your hard drive/SAN fails in a catastrophic, unrecoverable way, and you saw the signs but didn't pursue them. So you need to find the root cause of this corruption. This is the most overlooked and the most important step; don't skip it.
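
One place to start looking, as a sketch: SQL Server records pages that hit 823/824 errors in msdb.dbo.suspect_pages, so you can see whether more pages, files or databases are affected than the ones you already know about. The query itself assumes nothing about your environment; what you find should then be compared against the Windows system event log and your storage diagnostics.

```sql
-- Pages SQL Server has flagged as suspect, newest first.
-- event_type: 1 = 823 or unspecified 824 error, 2 = bad checksum,
-- 3 = torn page, 4 = restored, 5 = repaired by DBCC, 7 = deallocated by DBCC.
SELECT DB_NAME(database_id) AS database_name,
       file_id,
       page_id,
       event_type,
       error_count,
       last_update_date
FROM msdb.dbo.suspect_pages
ORDER BY last_update_date DESC;
```

Rows that keep appearing over time, or on different files, point towards the I/O path rather than a one-off event.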

Good luck,  Marten

ezskol (Author) commented:
Forgot to mention that it wasn't a hardware issue, as the corruption also showed up in the test environment.

Our test environment is a virtual machine that is a direct clone of our physical environment, with the same instances and names.

Detaching and re-attaching the database did not help during our troubleshooting either.  We also had to clear the Web Config logs, stop the WSS timer, and reboot the server before applying the above fixed the issue.

Hope that helps.

ezskol (Author) commented:
I've requested that this question be closed as follows:

Accepted answer: 0 points for ezskol's comment http:/Q_27490324.html#37289002

for the following reason:

Thanks Marten and Microsoft for the help. :)

Cheers, Ezs

ezskol (Author) commented:
Sorry, I want to share the points with Marten.

ezskol (Author) commented:
I've requested that this question be closed as follows:

Accepted answer: 0 points for ezskol's comment http:/Q_27490324.html#37289002
Assisted answer: 250 points for martenrune's comment http:/Q_27490324.html#37283877

for the following reason:

Solution above worked.

Marten Rune (SQL Expert/Infrastructure Architect) commented:
Corruption did occur, and then it got replicated to your test environment. You should never have to accept data loss, so don't take this lightly!

For your sake:
CORRUPTION OCCURRED, and you NEED to figure out why, to prevent it from happening again. The worst scenario is that you do this page restore and all is well, and tomorrow your hard drive/SAN fails in a catastrophic, unrecoverable way, and you saw the signs but didn't pursue them. So you need to find the root cause of this corruption. This is the most overlooked and the most important step; don't skip it.

You do understand that repair_allow_data_loss does what it says: you have lost data. With the restore of pages you would have minimized the data loss, although the only way to be really sure is to restore from a healthy backup and then let users redo their work from that point forward.

I do understand this is not possible in all scenarios.

Glad to see you're happy, and up and running.

Regards Marten
PS: Did I understand correctly that you want me to get all the points? If so, I can object to the closing and you'll be able to fix the points. Otherwise it's no biggie!

ezskol (Author) commented:
If you can, that would be great!  I tried to object to the points, and tried again, but it didn't give me the option to assign you points.

Marten Rune (SQL Expert/Infrastructure Architect) commented:
Objecting on the author's behalf, see quotes:

Quote 1:  "PS did I understand correctly, you want me to gain all points. If so I can object to closing and you'll be able to remedy the points. Otherwise it's no biggie!"

Quote 2:  "If you can, that would be great!  I tried to object the points and tried again, but didn't give me the option to assign you points."

//Marten

South Mod (Moderator) commented:
Starting the auto-close procedure on behalf of the question asker. Please see the referenced question for more details.

SouthMod
Community Support Moderator