Solved

One of the databases reported SUSPECT after reboot - why at reboot?

Posted on 2014-09-10
19
165 Views
Last Modified: 2014-09-28
What is the connection between reboot and Status of database. How can reboot reset or alert a new status for a database?
0
Comment
Question by:25112
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 9
  • 7
  • 3
19 Comments
 
LVL 5

Author Comment

by:25112
ID: 40314410
This is SQL 2005, if that should make a different...

server reboots take place every 90 days. is that too much distance that can cause issues for sql server?
0
 
LVL 33

Accepted Solution

by:
ste5an earned 167 total points
ID: 40314494
The reboot per se is not a problem. The only direct cause: the drives have not enough time to settle down (write cache).

One the other hand, are you sure that the db wasn't already in suspect state?
0
 
LVL 5

Author Comment

by:25112
ID: 40314501
>>One the other hand, are you sure that the db wasn't already in suspect state?
the database was in use and hence I don't believe it could have been in suspect state. Also when I ran checkdb before the reboot, it did not give errors.. but now after the reboot it mentioned several consistency and allocation errros.

>>is that too much distance that can cause issues for sql server?
do you mean the first few minutes after the reboot?
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 49

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 333 total points
ID: 40314508
The server could has some issue that only a reboot will make it "alive" like for example a previous installation of a driver, hotfix or even an application.
Also can be hardware issue like storage (you can check if you can reach the database files) or even a security issue.
0
 
LVL 33

Assisted Solution

by:ste5an
ste5an earned 167 total points
ID: 40314509
Check your disks. Sounds like an problem with the IO system.
0
 
LVL 49

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 333 total points
ID: 40314526
Even the first thing to do is to check SQL Server log and OS event log.
Usually you can find important information there.
0
 
LVL 5

Author Comment

by:25112
ID: 40314610
when I ran checkdb now, I got this in errorlog.. what does it tell you?
---

2014-09-10 09:39:19.090      spid54      DBCC CHECKDB (ROSTER_TEST) executed by RSBSV\agors found 539 errors and repaired 0 errors. Elapsed time: 0 hours 2 minutes 12 seconds.
2014-09-10 09:39:19.180      spid54      Using 'dbghelp.dll' version '4.0.5'
2014-09-10 09:39:19.180      spid54      **Dump thread - spid = 54, PSS = 0x179F5AC0, EC = 0x179F5AC8
2014-09-10 09:39:19.180      spid54      ***Stack Dump being sent to C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\LOG\SQLDump0127.txt
2014-09-10 09:39:19.180      spid54      * *******************************************************************************
2014-09-10 09:39:19.180      spid54      *
2014-09-10 09:39:19.180      spid54      * BEGIN STACK DUMP:
2014-09-10 09:39:19.180      spid54      *   09/10/14 09:39:19 spid 54
2014-09-10 09:39:19.180      spid54      *
2014-09-10 09:39:19.180      spid54      * DBCC database corruption
2014-09-10 09:39:19.180      spid54      *
2014-09-10 09:39:19.180      spid54      * Input Buffer 88 bytes -
2014-09-10 09:39:19.180      spid54      *               dbcc checkdb (ROSTER_TEST)
2014-09-10 09:39:19.180      spid54      *  
2014-09-10 09:39:19.180      spid54      * *******************************************************************************
2014-09-10 09:39:19.180      spid54      * -------------------------------------------------------------------------------
2014-09-10 09:39:19.180      spid54      * Short Stack Dump
2014-09-10 09:39:19.210      spid54      Stack Signature for the dump is 0x00000092
2014-09-10 09:39:21.250      spid54      External dump process return code 0x20002001.  The error information has been submitted to Watson error reporting.
2014-09-10 09:39:21.250      spid54      DoMiniDump () encountered error (0x800703E6) - Invalid access to memory location.
0
 
LVL 5

Author Comment

by:25112
ID: 40314618
this error is happening 100s of times:

A time-out occurred while waiting for buffer latch -- type 3, bp 0359F010, page 1:652536, stat 0xc0010b, database id: 9, allocation unit Id: 72057609715712000, task 0x008D86B8 : 0, waittime 300, flags 0x4003a, owning task 0x008D92E8. Not continuing to wait.
Error: 845, Severity: 17, State: 1.
0
 
LVL 49

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 333 total points
ID: 40314623
Check the error log for the period just before and after the reboot.
0
 
LVL 33

Assisted Solution

by:ste5an
ste5an earned 167 total points
ID: 40314642
A time-out occurred while waiting for buffer latch [..]
The request IO process timed out, sounds like an IO problem. Maybe the load on the IO system is too high when rebooting.
0
 
LVL 5

Author Comment

by:25112
ID: 40314649
last error before reboot:
A time-out occurred while waiting for buffer latch -- type 3, bp 0359F010, page 1:652536, stat 0xc0010b, database id: 9, allocation unit Id: 72057609715712000, task 0x008D86B8 : 0, waittime 300, flags 0x4003a, owning task 0x008D92E8. Not continuing to wait.
 Error: 845, Severity: 17, State: 1.

After reboot:
2014-09-09 15:37:58.670      spid17s      Analysis of database 'ROSTER_TEST' (9) is 100% complete (approximately 0 seconds remain). This is an informational message only. No user action is required.
2014-09-09 15:37:59.200      spid17s      Error: 3456, Severity: 21, State: 1.
2014-09-09 15:37:59.200      spid17s      Could not redo log record (259903:17022:4), for transaction ID (0:75894198), on page (1:652523), database 'ROSTER_TEST' (database ID 9). Page: LSN = (259886:6386:12), type = 2. Log: OpCode = 3, context 19, PrevPageLSN: (259827:5755:1). Restore from a backup of the database, or repair the database.
2014-09-09 15:37:59.320      spid17s      Using 'dbghelp.dll' version '4.0.5'
2014-09-09 15:37:59.590      spid17s      **Dump thread - spid = 17, PSS = 0x040947D8, EC = 0x040947E0
2014-09-09 15:37:59.590      spid17s      ***Stack Dump being sent to C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\LOG\SQLDump0123.txt
2014-09-09 15:37:59.590      spid17s      * *******************************************************************************
2014-09-09 15:37:59.590      spid17s      *
2014-09-09 15:37:59.590      spid17s      * BEGIN STACK DUMP:
2014-09-09 15:37:59.590      spid17s      *   09/09/14 15:37:59 spid 17
2014-09-09 15:37:59.590      spid17s      *
2014-09-09 15:37:59.590      spid17s      * HandleAndNoteToErrorlog: Exception raised, major=34, minor=56, severity=21
2014-09-09 15:37:59.590      spid17s      *

Stack Signature for the dump is 0xA30963D5
External dump process return code 0x20002001.  The error information has been submitted to Watson error reporting.
0
 
LVL 5

Author Comment

by:25112
ID: 40314664
>>The request IO process timed out, sounds like an IO problem. Maybe the load on the IO system is too high when rebooting.

this error is seen on the error log all day yesterday.. after the reboot, this error has not happened (perhaps because the database is on EMERGENCY (suspect) mode)
0
 
LVL 49

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 333 total points
ID: 40314681
I think you got transaction log full. Can you check if you find this message in the error logs?
If it's that you should backup more often the transaction log.
0
 
LVL 5

Author Comment

by:25112
ID: 40314699
just before the reboot, I see one error related to log being full.. but that is for the windows events log.. there is 200GB space left on drive, so I don't TL could have got full....

the error is
"
The current event was not reported to the Windows Events log. Operating system error = 1717(The interface is unknown.). You may need to clear the Windows Events log if it is full.
"
0
 
LVL 49

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 333 total points
ID: 40314703
That's not Transaction Log full error.
Maybe the engine lost connection to the transaction log files.
0
 
LVL 5

Author Comment

by:25112
ID: 40314762
right now in EMERGENCY mode, Tlog is 0kb. (according to  sp_helpfile) but the actual file is still there and is over a GB)
0
 
LVL 49

Expert Comment

by:Vitor Montalvão
ID: 40314799
I think SQL Server created a new Tlog file.
0
 
LVL 5

Author Comment

by:25112
ID: 40315221
Thanks..

the network team says the disks are OK without errors.. that is what makes this more tricky to resolve.

so back to the original question.. is it possible to pinpoint why the reboot could have kicked off a suspect database, or can we can only theorize possibilities?

~~~
The only direct cause: the drives have not enough time to settle down (write cache).

The server could has some issue that only a reboot will make it "alive" like for example a previous installation of a driver, hotfix or even an application.
~~~

what would be constructive things to do/watch in general to avoid this in future (other than making sure to reboot more often?)
0
 
LVL 49

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 333 total points
ID: 40316331
I recommend you to have a good monitor tool, like for example SCOM, so you can be warned immediately when a problem occurs.
If your company have a support contract with Microsoft then send them the dumps originated by SQL Server and Windows Server so they can analyze it. For sure there's information about the problem.
0

Featured Post

Simplifying Server Workload Migrations

This use case outlines the migration challenges that organizations face and how the Acronis AnyData Engine supports physical-to-physical (P2P), physical-to-virtual (P2V), virtual to physical (V2P), and cross-virtual (V2V) migration scenarios to address these challenges.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Occasionally there is a need to clean table columns, especially if you have inherited legacy data. There are obviously many ways to accomplish that, including elaborate UPDATE queries with anywhere from one to numerous REPLACE functions (even within…
Ever needed a SQL 2008 Database replicated/mirrored/log shipped on another server but you can't take the downtime inflicted by initial snapshot or disconnect while T-logs are restored or mirror applied? You can use SQL Server Initialize from Backup…
Via a live example, show how to backup a database, simulate a failure backup the tail of the database transaction log and perform the restore.
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.

726 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question