Tonyfai
asked on
SBS 2003 frozen each morning
SBS 2003 system with Windows 2003 server SP2 and Exchange 2003 SP2.
When someone comes into the office first thing, they find they cannot log on to the domain, and on checking the server they find that it has a light grey or black screen, with a mouse cursor showing, but the server appears to be unresponsive, and the mouse doesn't move. To get things happening again the server is manually rebooted around 8 am.
The last application log before the freeze is informational from WBLOGSVC, at 4:30:10 am and says
"The description for Event ID ( 2004 ) in Source ( WBLOGSVC ) cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part of the event: ." (and there's no following information)
The last few system log entries show:
3:59:00 am "The WinHTTP Web Proxy Auto-Discovery Service service entered the stopped state" (after being idle for 15 minutes and being suspended)
4:51:34 am "The WinHTTP Web Proxy Auto-Discovery Service service was successfully sent a start control."
4:51:34 am "The WinHTTP Web Proxy Auto-Discovery Service service entered the running state."
-these seem innocent enough.
The last security log entry is at 5:05 am and there are no errors or warnings, but looking back to 4:30 am there's quite a bit of Account Management activity related to sbsmonacct.
Server status and usage reports are set to run at 6 am and 6:30 am respectively.
The collect usage data task is set to start at 4:30 am.
Exchange server database management is set to run from 1 am to 5:00 am
After the server is rebooted there is a problem with the Exchange E00.log file which prevents the Exchange databases from mounting.
(Application log error from ESE, event 465)
"Information Store (2828) First Storage Group: Corruption was detected during soft recovery in logfile C:\Program Files\Exchsrvr\mdbdata\E00.log. The failing checksum record is located at position END. Data not matching the log-file fill pattern first appeared in sector 5128 (0x00001408). This logfile has been damaged and is unusable."
CA Etrust Threat Management with antivirus signature 31.6.6497.0 dated 11 May 09
APC Back-UPS ES550 running APC Powerchute personal edition 2.0. It shows no blackout, overvoltage, undervoltage or electrical noise events in the last 4 weeks.
Adaptec card with RAID1 showing healthy.
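To make the overlap between the logged events and the scheduled jobs easier to see, the times listed above can be put into a small timeline. This is only a sketch in Python: the times are copied from this post, while the labels, the variable names, and the two helper functions are made up for illustration. It finds the last logged entry and lists which scheduled jobs were active during a given window.

```python
from datetime import time

# Log entries quoted in the post (labels are shorthand, not real event text).
LOGGED = [
    (time(3, 59, 0), "System: WinHTTP Web Proxy Auto-Discovery stopped"),
    (time(4, 30, 10), "Application: WBLOGSVC event 2004"),
    (time(4, 51, 34), "System: WinHTTP Web Proxy Auto-Discovery running"),
    (time(5, 5, 0), "Security: last entry"),
]

# Scheduled jobs from the post as (start, end or None, name).
SCHEDULED = [
    (time(1, 0), time(5, 0), "Exchange database maintenance"),
    (time(4, 30), None, "Collect usage data"),
    (time(6, 0), None, "Server status report"),
    (time(6, 30), None, "Usage report"),
]


def last_logged(events):
    """Return the (time, label) pair with the latest timestamp."""
    return max(events, key=lambda entry: entry[0])


def tasks_active_between(tasks, start, end):
    """Return names of tasks whose start time or window overlaps [start, end]."""
    hits = []
    for begin, finish, name in tasks:
        finish = finish or begin  # a job with no end time is treated as a point
        if begin <= end and finish >= start:
            hits.append(name)
    return hits


when, label = last_logged(LOGGED)
print("Last logged entry:", when, label)
print("Jobs active 4:30 to 5:05:", tasks_active_between(SCHEDULED, time(4, 30), when))
```

With these times, the jobs overlapping the 4:30 to 5:05 am window are the Exchange database maintenance and the collect usage data task; the two reports are scheduled after the last log entry.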
How much space do you have on the drive that houses the information store? How do you clear the Exchange logs, and what backup method are you using?
Are you running any tasks at night, like a backup or antivirus scan? If so, try disabling those to see if that is causing the conflict.
Do you have all the latest drivers for the server? When did the freeze begin happening? Was anything added to the server at that time?
ASKER
Paulsolov:
232 GB free space on c:, where the information store is housed. It's a small business with only 5 clients.
Exchange logs are normally cleared by a weekly normal backup using ntbackup. The exception was the E00.log error: replaying the log files was interrupted by the corrupt E00.log, but since the priv1.edb and pub1.edb files were in clean shutdown, I just deleted all the non-edb files in the c:\program files\mdbdata directory (after backing them up). That seems to have preserved recent emails.
Automationstation:
Tasks at night are as mentioned in my original post, plus shadow copy at 7 am and 12 pm every day,
and backup at 11 pm every Sunday as above.
I think the server is up to date, but I haven't checked every single driver. The freeze began happening shortly after configuration of this machine was complete (its predecessor got hit by lightning), so plenty of things got added at that time... do you have any specific suggestions?
asethi19:
I did a backup of the Exchange server on Friday, and the latest freeze happened sometime between 4:30 am on Saturday and, well, 8 am on Monday. I'm pretty sure the problem originated between 4:30 am and 5 am, because the logs peter out at 5 as I described above.
While I've been waiting for a response I have expanded the exclusions in ETrust Threat Management to include many more directories as per http://www.tech-archive.net/Archive/Windows/microsoft.public.windows.server.general/2007-12/msg01131.html, and sought advice from CA as to what the exclusions should be.
- Tony
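For anyone checking their own exclusion list, a quick way to confirm that a given file is actually covered by an excluded directory is sketched below in Python. This is an illustration only: the exclusion list here contains just the MDBDATA path named in this thread, not CA's recommended list, and `is_excluded` is a made-up helper, not part of any antivirus product. The second test path is hypothetical.

```python
from pathlib import PureWindowsPath


def is_excluded(path, exclusions):
    """Return True if path is an excluded directory or lies beneath one.

    PureWindowsPath compares case-insensitively, matching Windows behaviour.
    """
    target = PureWindowsPath(path)
    for entry in exclusions:
        excluded = PureWindowsPath(entry)
        if target == excluded or excluded in target.parents:
            return True
    return False


# Example list: only the directory named in this thread (an assumption,
# not a complete or vendor-recommended set of exclusions).
exclusions = [r"C:\Program Files\Exchsrvr\MDBDATA"]

print(is_excluded(r"c:\program files\exchsrvr\mdbdata\E00.log", exclusions))
print(is_excluded(r"C:\Temp\example.tmp", exclusions))  # hypothetical path
```

The check is purely path-based, so it says nothing about whether the antivirus product honours the exclusion for real-time scanning, scheduled scanning, or both.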
ASKER
Update 8:15 am my time:
Server is up and healthy. I think the antivirus exemptions did it. But I'll wait till tomorrow morning to be sure.
ASKER
Server ran fine up until this weekend. It was frozen on Monday morning. I still haven't figured out why, but I'm betting it's not the same issue. So I think asethi19 was closest: I think Exchange was being interfered with by E-Trust threat management, so I'm giving him the points.
Tony.
I am experiencing a similar issue where the server locks up every day at the same time. I too see the same last 3 entries every time before the failure. I am disabling the service today and will let you know how it goes.
Apologies... the service I am referring to is:
WINHTTP WEB PROXY AUTO DISCOVERY SERVICE
ASKER
remedina8, are you using E-Trust threat management or another antivirus on the server, and have you excluded the directory (I think it's c:\program files\exchsrvr\MDBDATA) from all scanning?
ASKER
Also, in the c:\program files\exchsrvr\MDBDATA folder how many .log files do you see?
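If counting by hand is tedious, a few lines of Python can do it. This is a minimal sketch; `count_log_files` is a made-up helper name and the folder is whatever path you pass in.

```python
from pathlib import Path


def count_log_files(folder):
    """Count files directly inside folder whose extension is .log (any case)."""
    return sum(
        1
        for item in Path(folder).iterdir()
        if item.is_file() and item.suffix.lower() == ".log"
    )
```

For example, `count_log_files(r"c:\program files\exchsrvr\MDBDATA")` would return the number of .log files in that folder, assuming the path exists on the machine where the script runs. It does not look into subfolders.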
I am not using E-Trust TM. Also, I should have been a bit clearer: this is not occurring on an Exchange Server, but on a standard F&P WIN2K3 R2 Server. It looks like this thread is mostly about the Exchange issue. What I share is what seems to be network congestion or something to that effect. I was interested in this thread because it described a loss of network connectivity at approximately the same time every day, with the same few event log entries that I have before the failure. I have started a separate thread to discuss this issue but wanted to query those on this one to see if a resolution was discovered... I am desperate! ;)
ASKER
Well, the short of it was that the antivirus wasn't installed with the necessary exclusions, and was interfering with Windows to the extent that the server crashed every night. I'm not sure exactly which part of Windows, though.