Solved

One of the databases reported SUSPECT after reboot - why at reboot?

Posted on 2014-09-10
19
147 Views
Last Modified: 2014-09-28
What is the connection between reboot and Status of database. How can reboot reset or alert a new status for a database?
0
Comment
Question by:25112
  • 9
  • 7
  • 3
19 Comments
 
LVL 5

Author Comment

by:25112
ID: 40314410
This is SQL 2005, if that should make a different...

server reboots take place every 90 days. is that too much distance that can cause issues for sql server?
0
 
LVL 32

Accepted Solution

by:
ste5an earned 167 total points
ID: 40314494
The reboot per se is not a problem. The only direct cause: the drives have not enough time to settle down (write cache).

One the other hand, are you sure that the db wasn't already in suspect state?
0
 
LVL 5

Author Comment

by:25112
ID: 40314501
>>One the other hand, are you sure that the db wasn't already in suspect state?
the database was in use and hence I don't believe it could have been in suspect state. Also when I ran checkdb before the reboot, it did not give errors.. but now after the reboot it mentioned several consistency and allocation errros.

>>is that too much distance that can cause issues for sql server?
do you mean the first few minutes after the reboot?
0
 
LVL 45

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 333 total points
ID: 40314508
The server could has some issue that only a reboot will make it "alive" like for example a previous installation of a driver, hotfix or even an application.
Also can be hardware issue like storage (you can check if you can reach the database files) or even a security issue.
0
 
LVL 32

Assisted Solution

by:ste5an
ste5an earned 167 total points
ID: 40314509
Check your disks. Sounds like an problem with the IO system.
0
 
LVL 45

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 333 total points
ID: 40314526
Even the first thing to do is to check SQL Server log and OS event log.
Usually you can find important information there.
0
 
LVL 5

Author Comment

by:25112
ID: 40314610
when I ran checkdb now, I got this in errorlog.. what does it tell you?
---

2014-09-10 09:39:19.090      spid54      DBCC CHECKDB (ROSTER_TEST) executed by RSBSV\agors found 539 errors and repaired 0 errors. Elapsed time: 0 hours 2 minutes 12 seconds.
2014-09-10 09:39:19.180      spid54      Using 'dbghelp.dll' version '4.0.5'
2014-09-10 09:39:19.180      spid54      **Dump thread - spid = 54, PSS = 0x179F5AC0, EC = 0x179F5AC8
2014-09-10 09:39:19.180      spid54      ***Stack Dump being sent to C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\LOG\SQLDump0127.txt
2014-09-10 09:39:19.180      spid54      * *******************************************************************************
2014-09-10 09:39:19.180      spid54      *
2014-09-10 09:39:19.180      spid54      * BEGIN STACK DUMP:
2014-09-10 09:39:19.180      spid54      *   09/10/14 09:39:19 spid 54
2014-09-10 09:39:19.180      spid54      *
2014-09-10 09:39:19.180      spid54      * DBCC database corruption
2014-09-10 09:39:19.180      spid54      *
2014-09-10 09:39:19.180      spid54      * Input Buffer 88 bytes -
2014-09-10 09:39:19.180      spid54      *               dbcc checkdb (ROSTER_TEST)
2014-09-10 09:39:19.180      spid54      *  
2014-09-10 09:39:19.180      spid54      * *******************************************************************************
2014-09-10 09:39:19.180      spid54      * -------------------------------------------------------------------------------
2014-09-10 09:39:19.180      spid54      * Short Stack Dump
2014-09-10 09:39:19.210      spid54      Stack Signature for the dump is 0x00000092
2014-09-10 09:39:21.250      spid54      External dump process return code 0x20002001.  The error information has been submitted to Watson error reporting.
2014-09-10 09:39:21.250      spid54      DoMiniDump () encountered error (0x800703E6) - Invalid access to memory location.
0
 
LVL 5

Author Comment

by:25112
ID: 40314618
this error is happening 100s of times:

A time-out occurred while waiting for buffer latch -- type 3, bp 0359F010, page 1:652536, stat 0xc0010b, database id: 9, allocation unit Id: 72057609715712000, task 0x008D86B8 : 0, waittime 300, flags 0x4003a, owning task 0x008D92E8. Not continuing to wait.
Error: 845, Severity: 17, State: 1.
0
 
LVL 45

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 333 total points
ID: 40314623
Check the error log for the period just before and after the reboot.
0
Highfive + Dolby Voice = No More Audio Complaints!

Poor audio quality is one of the top reasons people don’t use video conferencing. Get the crispest, clearest audio powered by Dolby Voice in every meeting. Highfive and Dolby Voice deliver the best video conferencing and audio experience for every meeting and every room.

 
LVL 32

Assisted Solution

by:ste5an
ste5an earned 167 total points
ID: 40314642
A time-out occurred while waiting for buffer latch [..]
The request IO process timed out, sounds like an IO problem. Maybe the load on the IO system is too high when rebooting.
0
 
LVL 5

Author Comment

by:25112
ID: 40314649
last error before reboot:
A time-out occurred while waiting for buffer latch -- type 3, bp 0359F010, page 1:652536, stat 0xc0010b, database id: 9, allocation unit Id: 72057609715712000, task 0x008D86B8 : 0, waittime 300, flags 0x4003a, owning task 0x008D92E8. Not continuing to wait.
 Error: 845, Severity: 17, State: 1.

After reboot:
2014-09-09 15:37:58.670      spid17s      Analysis of database 'ROSTER_TEST' (9) is 100% complete (approximately 0 seconds remain). This is an informational message only. No user action is required.
2014-09-09 15:37:59.200      spid17s      Error: 3456, Severity: 21, State: 1.
2014-09-09 15:37:59.200      spid17s      Could not redo log record (259903:17022:4), for transaction ID (0:75894198), on page (1:652523), database 'ROSTER_TEST' (database ID 9). Page: LSN = (259886:6386:12), type = 2. Log: OpCode = 3, context 19, PrevPageLSN: (259827:5755:1). Restore from a backup of the database, or repair the database.
2014-09-09 15:37:59.320      spid17s      Using 'dbghelp.dll' version '4.0.5'
2014-09-09 15:37:59.590      spid17s      **Dump thread - spid = 17, PSS = 0x040947D8, EC = 0x040947E0
2014-09-09 15:37:59.590      spid17s      ***Stack Dump being sent to C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\LOG\SQLDump0123.txt
2014-09-09 15:37:59.590      spid17s      * *******************************************************************************
2014-09-09 15:37:59.590      spid17s      *
2014-09-09 15:37:59.590      spid17s      * BEGIN STACK DUMP:
2014-09-09 15:37:59.590      spid17s      *   09/09/14 15:37:59 spid 17
2014-09-09 15:37:59.590      spid17s      *
2014-09-09 15:37:59.590      spid17s      * HandleAndNoteToErrorlog: Exception raised, major=34, minor=56, severity=21
2014-09-09 15:37:59.590      spid17s      *

Stack Signature for the dump is 0xA30963D5
External dump process return code 0x20002001.  The error information has been submitted to Watson error reporting.
0
 
LVL 5

Author Comment

by:25112
ID: 40314664
>>The request IO process timed out, sounds like an IO problem. Maybe the load on the IO system is too high when rebooting.

this error is seen on the error log all day yesterday.. after the reboot, this error has not happened (perhaps because the database is on EMERGENCY (suspect) mode)
0
 
LVL 45

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 333 total points
ID: 40314681
I think you got transaction log full. Can you check if you find this message in the error logs?
If it's that you should backup more often the transaction log.
0
 
LVL 5

Author Comment

by:25112
ID: 40314699
just before the reboot, I see one error related to log being full.. but that is for the windows events log.. there is 200GB space left on drive, so I don't TL could have got full....

the error is
"
The current event was not reported to the Windows Events log. Operating system error = 1717(The interface is unknown.). You may need to clear the Windows Events log if it is full.
"
0
 
LVL 45

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 333 total points
ID: 40314703
That's not Transaction Log full error.
Maybe the engine lost connection to the transaction log files.
0
 
LVL 5

Author Comment

by:25112
ID: 40314762
right now in EMERGENCY mode, Tlog is 0kb. (according to  sp_helpfile) but the actual file is still there and is over a GB)
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 40314799
I think SQL Server created a new Tlog file.
0
 
LVL 5

Author Comment

by:25112
ID: 40315221
Thanks..

the network team says the disks are OK without errors.. that is what makes this more tricky to resolve.

so back to the original question.. is it possible to pinpoint why the reboot could have kicked off a suspect database, or can we can only theorize possibilities?

~~~
The only direct cause: the drives have not enough time to settle down (write cache).

The server could has some issue that only a reboot will make it "alive" like for example a previous installation of a driver, hotfix or even an application.
~~~

what would be constructive things to do/watch in general to avoid this in future (other than making sure to reboot more often?)
0
 
LVL 45

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 333 total points
ID: 40316331
I recommend you to have a good monitor tool, like for example SCOM, so you can be warned immediately when a problem occurs.
If your company have a support contract with Microsoft then send them the dumps originated by SQL Server and Windows Server so they can analyze it. For sure there's information about the problem.
0

Featured Post

Maximize Your Threat Intelligence Reporting

Reporting is one of the most important and least talked about aspects of a world-class threat intelligence program. Here’s how to do it right.

Join & Write a Comment

Suggested Solutions

Title # Comments Views Activity
MS SQL Backup 24 72
Server 2008 Cluster Fail-over Errors 5 53
Database Containment - Benefits 6 26
BULK INSERT most recent CSV 19 21
Occasionally there is a need to clean table columns, especially if you have inherited legacy data. There are obviously many ways to accomplish that, including elaborate UPDATE queries with anywhere from one to numerous REPLACE functions (even within…
Use this article to create a batch file to backup a Microsoft SQL Server database to a Windows folder.  The folder can be on the local hard drive or on a network share.  This batch file will query the SQL server to get the current date & time and wi…
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
Via a live example, show how to setup several different housekeeping processes for a SQL Server.

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now