TAB_Systems asked:

Server 2003 AD seems to fail every morning

Hi all,

My first post on here but here goes.

I have an issue that I've never come across before. My server has just started failing every morning around 4:45am.

After this time I am no longer able to log on to the server remotely, access files, resolve FQDNs, etc.
Basically all services drop out.

My server is SBS 2003 32-bit. It is the only DC and also runs AD, primary DNS and Exchange.

I will attach the event logs once I have prepared them.

My thinking is that a service must be causing this, since it drops out at the same time each morning. Once I restart, the system is fine again until the next morning. I have a feeling BlackBerry might be the cause, as it's the only thing I have installed on this server within the last 3 months.

Any ideas would be great. I will be disabling BlackBerry Express tonight to see if the problem persists.


James:

This certainly sounds like a program may be causing this. The other options, of course, would be driver issues or viruses. Is your system fully up to date, and does your server hardware have the latest support pack?
TAB_Systems (ASKER):

Couldn't add the event logs, so I will screen dump instead. If you need them, just ask.
I have checked around for viruses/spyware and found no signs of any. I used AVG 2011 and Malwarebytes to check. The system has SP2 installed with the latest updates and no reports of driver issues.
Have you checked that the programs are fully updated and that you are on the latest versions?
Yes, the server only has a few programs installed: BlackBerry Server and FileMaker Pro, which are on the latest versions and auto-update.

I'm thinking of demoting and then promoting the server.
Since this is SBS, it will not be as easy as running dcpromo to demote. You will need to rebuild.

Take a look at all tasks running around that time: antivirus, backups, etc.

One thing you can try is to create a memory dump using the CrashOnCtrlScroll registry value, then analyze the dump to see if you can determine what is causing your issue.

http://support.microsoft.com/kb/972110
http://blogs.msdn.com/b/johan/archive/2007/01/11/how-to-install-windbg-and-get-your-first-memory-dump.aspx
http://support.microsoft.com/kb/315263 
I'm not 100% sure a memory dump will work, as the system doesn't blue screen or turn itself off. One by one the services fail, although the server remains on and functioning as a workstation. The server dropout time is also early in the morning, so it would be difficult to time it right, although I will give it a try.
That is what the CrashOnCtrlScroll registry value does. It allows you to force a blue screen so you can see what is in memory at the time.

Here is a link from the page I sent before. You hold down the right Ctrl key and press Scroll Lock twice.
http://support.microsoft.com/kb/244139/
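
For anyone following along, the registry change from the KB article above can be sketched as a .reg file. This is only a minimal sketch that builds the file text; it assumes a PS/2 keyboard (the `i8042prt` driver service), and the function name `crash_on_ctrl_scroll_reg` is made up for illustration. USB keyboards use a different service key, so check the KB article for your hardware before importing anything.

```python
def crash_on_ctrl_scroll_reg(service: str = "i8042prt") -> str:
    """Return .reg file text that enables CrashOnCtrlScroll.

    `service` is the keyboard driver service name. "i8042prt" (PS/2)
    is an assumption here; it is not something stated in this thread.
    """
    key = (
        "HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Services\\"
        f"{service}\\Parameters"
    )
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        # REG_DWORD value 1 turns the keyboard-initiated crash on.
        '"CrashOnCtrlScroll"=dword:00000001',
        "",
    ]
    # .reg files conventionally use Windows (CRLF) line endings.
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(crash_on_ctrl_scroll_reg())
```

Save the printed text as a .reg file, import it on the server and reboot for the setting to take effect. The dump type configured under Startup and Recovery decides how much memory is captured.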
Hello,

Please check for Windows updates. Sometimes updates cause problems.

Regards,
Tushar Kaskhedikar
Hi all, thanks for the posts. They didn't really have much to do with the issue on this occasion, but they were much appreciated. Here are my findings:

I have been periodically disabling services and found once I disable the BES and SQL services the server now works flawlessly.

So, as I first suspected, it's most likely BES that has caused this issue.

Now that we have found the issue, here comes the difficult part: getting the thing to work properly.

I will keep this updated for anyone else who is having the same issue.
Your next port of call may be to get in contact with your BES partner for support on the problem.
TAB,

What version of BBE are you running on your SBS server?  If you disable the BB services but leave the SQL services running, do you still have that problem?  BBE uses the SQL Express rather than full version, normally.  Is this the case with you?  Can you go into SQL manager and see if there are any scheduled SQL tasks for your server at the time you lose functionality?

Justin
Hi Justin,

After all the issues I left BB disabled for 2 days, but now it seems the server has decided to error again. I will list the first few errors.

Sys log ----------


Event Type:      Error
Event Source:      Kerberos
Event Category:      None
Event ID:      5
Date:            18/11/2010
Time:            15:45:33
User:            N/A
Computer:      CS-SERVER
Description:
The kerberos client received a KRB_AP_ERR_TKT_NYV error from the server MRING$.  This indicates that the ticket used against that server is not yet valid (in relationship to that server time).  Contact your system administrator  to make sure the client and server times are in sync, and that the KDC in realm CLEARSOLUTIONS.LOCAL is  in sync with the KDC in the client realm.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

--------------------------
Event Type:      Error
Event Source:      Srv
Event Category:      None
Event ID:      2019
Date:            18/11/2010
Time:            17:39:48
User:            N/A
Computer:      CS-SERVER
Description:
The server was unable to allocate from the system nonpaged pool because the pool was empty.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 00 00 04 00 01 00 54 00   ......T.
0008: 00 00 00 00 e3 07 00 c0   ....ã..À
0010: 00 00 00 00 9a 00 00 c0   ....¿..À
0018: 00 00 00 00 00 00 00 00   ........
0020: 00 00 00 00 00 00 00 00   ........
0028: 02 00 00 00               ....    

-------------------



app --------------------


Event Type:      Error
Event Source:      Application Error
Event Category:      (100)
Event ID:      1000
Date:            18/11/2010
Time:            08:38:53
User:            N/A
Computer:      CS-SERVER
Description:
Faulting application Start.exe, version 10.0.22.87, faulting module Start.exe, version 10.0.22.87, fault address 0x001587be.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 41 70 70 6c 69 63 61 74   Applicat
0008: 69 6f 6e 20 46 61 69 6c   ion Fail
0010: 75 72 65 20 20 53 74 61   ure  Sta
0018: 72 74 2e 65 78 65 20 31   rt.exe 1
0020: 30 2e 30 2e 32 32 2e 38   0.0.22.8
0028: 37 20 69 6e 20 53 74 61   7 in Sta
0030: 72 74 2e 65 78 65 20 31   rt.exe 1
0038: 30 2e 30 2e 32 32 2e 38   0.0.22.8
0040: 37 20 61 74 20 6f 66 66   7 at off
0048: 73 65 74 20 30 30 31 35   set 0015
0050: 38 37 62 65               87be    

------------------


Event Type:      Error
Event Source:      MSExchangeDSAccess
Event Category:      Topology
Event ID:      2102
Date:            18/11/2010
Time:            17:39:44
User:            N/A
Computer:      CS-SERVER
Description:
Process MAD.EXE (PID=4228). All Domain Controller Servers in use are not responding:
cs-server.ClearSolutions.local
 

For more information, click http://www.microsoft.com/contentredirect.asp.

--------------------------


These seem to be at the forefront of the crash.

I also noticed lsass was running at 100% CPU this morning, but I am not sure whether that was a result of the crash or the cause. Will investigate now.
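
Since the entries above come from two different logs, it can help to put them on one timeline to see what failed first. Below is a rough sketch that parses the header fields of events pasted in the format shown above and sorts them by time. The function name `parse_events` and the `SAMPLE` text are made up for illustration (the sample copies only the header lines from this post), and the dd/mm/yyyy date format is an assumption based on those entries.

```python
import re
from datetime import datetime

# Header fields as they appear in the pasted Event Viewer text.
FIELD = re.compile(r"^(Event Type|Event Source|Event ID|Date|Time):\s*(.+?)\s*$")

SAMPLE = """\
Event Type:      Error
Event Source:      Kerberos
Event ID:      5
Date:            18/11/2010
Time:            15:45:33
Event Type:      Error
Event Source:      Srv
Event ID:      2019
Date:            18/11/2010
Time:            17:39:48
Event Type:      Error
Event Source:      Application Error
Event ID:      1000
Date:            18/11/2010
Time:            08:38:53
Event Type:      Error
Event Source:      MSExchangeDSAccess
Event ID:      2102
Date:            18/11/2010
Time:            17:39:44
"""


def parse_events(text):
    """Return events (dicts of header fields) sorted by timestamp."""
    events, current = [], {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if not match:
            continue  # description text, hex data, separators
        name, value = match.groups()
        # "Event Type" is the first field, so it starts a new event.
        if name == "Event Type" and current:
            events.append(current)
            current = {}
        current[name] = value
    if current:
        events.append(current)

    # Keep only events that carry both a date and a time.
    dated = [ev for ev in events if "Date" in ev and "Time" in ev]
    for ev in dated:
        ev["When"] = datetime.strptime(
            ev["Date"] + " " + ev["Time"], "%d/%m/%Y %H:%M:%S"
        )
    return sorted(dated, key=lambda ev: ev["When"])


if __name__ == "__main__":
    for ev in parse_events(SAMPLE):
        print(ev["When"], ev["Event Source"], ev["Event ID"])
```

Run against the four entries above, this puts the MSExchangeDSAccess 2102 error (17:39:44) a few seconds before the Srv 2019 nonpaged pool error (17:39:48). With a fuller export, the same ordering might help show which service starts failing first.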


ASKER CERTIFIED SOLUTION
Justin Owens

This solution is only available to members of Experts Exchange.
Thanks for your info, Justin. I will be making some changes now. I have never made any manual changes to the registry, although I did notice the lsass process running at 100%.

I will keep you posted.
None of the above worked. We did a fresh install, which seemed to fix the problems.
Hey,

If you feel it more appropriate to delete this question, would you mind replying and using the Object option?

Thanks!

Chris
This question has been classified as abandoned and is being closed as part of the Cleanup Program.  See my comment at the end of the question for more details.