BitsBytesandMore
asked on
Server 2003 Standard Goes Offline Daily
Hello Experts, I need help.... I am so saturated with this issue I cannot think anymore.
My customer has a small domain: one Windows Server 2003 Standard server and 7 workstations. The only application running on it is QuickBooks Enterprise 2009. The server roles are domain controller, file server, and DNS. It DOES NOT have an Exchange Server.
About a week ago the server started going offline. Users can still connect to the shares and work, but I cannot log on to the server. Eventually it locks up and users start getting errors that they have not written down. There are no errors in the event log except for win32time, which self-corrects, and some QuickBooks errors that have always been there (apparently QuickBooks does not provide enough information for the event log to register the reason) and have never caused any issue in the past 2 years.
Once it goes offline I have no choice but to hard boot it. I have tried remotely accessing it unsuccessfully.
I ran a dcdiag and this is the result:
Domain Controller Diagnosis
Performing initial setup:
Done gathering initial info.
Doing initial required tests
Testing server: Default-First-Site-Name\SERVER
Starting test: Connectivity
......................... SERVER passed test Connectivity
Doing primary tests
Testing server: Default-First-Site-Name\SERVER
Starting test: Replications
......................... SERVER passed test Replications
Starting test: NCSecDesc
......................... SERVER passed test NCSecDesc
Starting test: NetLogons
......................... SERVER passed test NetLogons
Starting test: Advertising
......................... SERVER passed test Advertising
Starting test: KnowsOfRoleHolders
......................... SERVER passed test KnowsOfRoleHolders
Starting test: RidManager
......................... SERVER passed test RidManager
Starting test: MachineAccount
......................... SERVER passed test MachineAccount
Starting test: Services
......................... SERVER passed test Services
Starting test: ObjectsReplicated
......................... SERVER passed test ObjectsReplicated
Starting test: frssysvol
......................... SERVER passed test frssysvol
Starting test: frsevent
......................... SERVER passed test frsevent
Starting test: kccevent
......................... SERVER passed test kccevent
Starting test: systemlog
An Error Event occured. EventID: 0x80001778
Time Generated: 11/02/2009 16:14:26
Event String: The previous system shutdown at 4:09:21 PM on
An Error Event occured. EventID: 0xC1010020
Time Generated: 11/02/2009 16:16:06
Event String: Dependent Assembly Microsoft.VC80.MFCLOC could
An Error Event occured. EventID: 0xC101003B
Time Generated: 11/02/2009 16:16:06
Event String: Resolve Partial Assembly failed for
An Error Event occured. EventID: 0xC101003B
Time Generated: 11/02/2009 16:16:06
Event String: Generate Activation Context failed for
An Error Event occured. EventID: 0xC1010020
Time Generated: 11/02/2009 16:16:59
Event String: Dependent Assembly Microsoft.VC80.MFCLOC could
An Error Event occured. EventID: 0xC101003B
Time Generated: 11/02/2009 16:16:59
Event String: Resolve Partial Assembly failed for
An Error Event occured. EventID: 0xC101003B
Time Generated: 11/02/2009 16:16:59
Event String: Generate Activation Context failed for
An Error Event occured. EventID: 0xC1010020
Time Generated: 11/02/2009 16:16:59
Event String: Dependent Assembly Microsoft.VC80.MFCLOC could
An Error Event occured. EventID: 0xC101003B
Time Generated: 11/02/2009 16:16:59
Event String: Resolve Partial Assembly failed for
An Error Event occured. EventID: 0xC101003B
Time Generated: 11/02/2009 16:16:59
Event String: Generate Activation Context failed for
......................... SERVER failed test systemlog
Starting test: VerifyReferences
......................... SERVER passed test VerifyReferences
Running partition tests on : ForestDnsZones
Starting test: CrossRefValidation
......................... ForestDnsZones passed test CrossRefValidatio
Starting test: CheckSDRefDom
......................... ForestDnsZones passed test CheckSDRefDom
Running partition tests on : DomainDnsZones
Starting test: CrossRefValidation
......................... DomainDnsZones passed test CrossRefValidatio
Starting test: CheckSDRefDom
......................... DomainDnsZones passed test CheckSDRefDom
Running partition tests on : Schema
Starting test: CrossRefValidation
......................... Schema passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... Schema passed test CheckSDRefDom
Running partition tests on : Configuration
Starting test: CrossRefValidation
......................... Configuration passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... Configuration passed test CheckSDRefDom
Running partition tests on : mydomain
Starting test: CrossRefValidation
......................... mydomain passed test CrossRefValidatio
Starting test: CheckSDRefDom
......................... mydomain passed test CheckSDRefDom
Running enterprise tests on : mydomain.local
Starting test: Intersite
......................... mydomain.local passed test Intersite
Starting test: FsmoCheck
......................... mydomain.local passed test FsmoCheck
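For anyone skimming a long dcdiag run like the one above, a small script can pull out just the pass/fail lines. This is only a sketch in Python: the function names `summarize_dcdiag` and `failed_tests` and the regular expression are assumptions based on the line layout seen in this output, not part of dcdiag itself.

```python
import re

# Matches result lines such as:
#   ......................... SERVER failed test systemlog
# The pattern is an assumption based on the output pasted in this thread.
RESULT_LINE = re.compile(r"^\.{3,}\s+(\S+)\s+(passed|failed) test (\S+)\s*$")


def summarize_dcdiag(text):
    """Return a list of (target, test_name, passed) tuples from dcdiag output."""
    results = []
    for line in text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:
            target, verdict, test_name = match.groups()
            results.append((target, test_name, verdict == "passed"))
    return results


def failed_tests(text):
    """Return only the (target, test_name) pairs that dcdiag reported as failed."""
    return [(t, name) for t, name, ok in summarize_dcdiag(text) if not ok]


if __name__ == "__main__":
    sample = """\
Starting test: kccevent
......................... SERVER passed test kccevent
Starting test: systemlog
......................... SERVER failed test systemlog
"""
    for target, name in failed_tests(sample):
        print(f"{target}: {name} FAILED")
```

Run against the output above, the only failure it would list is systemlog, which fails because dcdiag found error events in the System log during its check window.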
What antivirus are you using? You could have a memory leak, especially if it is McAfee.
ASKER
No McAfee... that is taboo for me. Two days ago I uninstalled Trend Micro and left the server with the free ClamWin just while I test. It keeps going offline.
Are you getting any application errors in the event log?
ASKER
nealerocks: no errors ... but notice above in the dcdiag "........ SERVER failed test systemlog...." so it could be part of the reason why I am not getting errors...
Hey Neilsr :-) All updates are current ..... I'll take a look at the article... the strange thing...it has never needed it before....so why all of a sudden .... now it needs it just out of the blue or it hangs?
No errors on the .net framework.... all updates are current with .net..
ASKER
nealerocks: yes.... actually it is one of the cleanest system logs I've ever seen (it's a Dell PowerEdge something... can't remember). They bought it 3 years ago and it's been a sweetheart to support: never any issues. The application logs are totally clean, and so are the system logs, except for the QuickBooks SideBySide errors, which have been there from day 1 and which Intuit explains are normal. They have never been a problem before, but maybe... this is the error:
Event Type: Error
Event Source: SideBySide
Event Category: None
Event ID: 59
Date: 10/30/2009
Time: 5:08:02 PM
User: N/A
Computer: SERVER
Description:
Generate Activation Context failed for C:\WINDOWS\WinSxS\x86_Microsoft.VC80.MFC_1fc8b3b9a1e18e3b_8.0.50727.42_x-ww_DEC6DDD2\MFC80.DLL. Reference error message: The referenced assembly is not installed on your system.
.
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
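The hexadecimal EventID values in the dcdiag systemlog section can be matched against Event Viewer entries like the one above. In the standard 32-bit event identifier layout, the top two bits carry the severity and the low 16 bits are the number Event Viewer shows as "Event ID". The sketch below (Python, with a hypothetical helper name `decode_event_id`) just does that arithmetic for the three codes from this thread.

```python
def decode_event_id(raw):
    """Split a 32-bit event identifier into (severity, code).

    Layout assumed here: bits 31-30 are the severity, bits 15-0 are the
    code that Event Viewer displays as "Event ID".
    """
    severity_names = {0: "Success", 1: "Informational", 2: "Warning", 3: "Error"}
    severity = severity_names[(raw >> 30) & 0x3]
    code = raw & 0xFFFF
    return severity, code


if __name__ == "__main__":
    # The three EventID values from the dcdiag output in this thread.
    for raw in (0x80001778, 0xC1010020, 0xC101003B):
        severity, code = decode_event_id(raw)
        print(f"0x{raw:08X} -> Event ID {code} ({severity})")
```

0xC101003B decodes to 59, which lines up with the SideBySide Event ID 59 entry above, and 0x80001778 decodes to 6008, which fits the "previous system shutdown" string that dcdiag printed after the hard boot.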
Did you click on the link? Did it give you an idea of what the error is about?
I don't know a lot about that error, but if the server is going offline then you may have some hardware issues. I thought a memory leak was a possibility, but I think you would get more errors logged.
ASKER
I'm trying to avoid introducing new variables.... I could update to the latest network card drivers, but I don't understand why they would start misbehaving out of the blue.... they've been working flawlessly for 3 years....
ASKER
Thanks lrbarrios, I'm connecting remotely right now and I'm planning on testing memory and hard drives tomorrow, but they will have a fit because they can't work while I'm testing and they work in shifts around the clock.
Yes, it's an inconvenience to the users, but when they understand that it's necessary to isolate the problem (and that you're doing your best to help them) they'll probably be less hostile. The alternative is to continue to have daily problems. :) You might get lucky when you get into the BIOS on the RAID controller (if you've got one) and find that it has reported the faulty drive(s) in its log. I would still test all of the drives anyway. I'm interested to see what you find.
ASKER
Ok guys... I'm back .... I was doing some cleanup of the events log to eliminate variables.... I only have the Quickbooks Enterprise error that everyone seems to be fighting with at the Quickbooks forums (but I'm not really worried about it at this time since Intuit support says to ignore it):
Event Type: Error
Event Source: QuickBooks
Event Category: Error
Event ID: 4
Date: 11/4/2009
Time: 8:31:28 AM
User: N/A
Computer: Server
Description:
An unexpected error has occured in "QuickBooks":
Got unexpected error 5 in call to NetShareGetInfo for path \\server\Quickbooks\MyQuickbooksCompanyName.QBW
The server keeps going offline. This is the latest dcdiag:
Domain Controller Diagnosis
Performing initial setup:
Done gathering initial info.
Doing initial required tests
Testing server: Default-First-Site-Name\SERVER
Starting test: Connectivity
......................... SERVER passed test Connectivity
Doing primary tests
Testing server: Default-First-Site-Name\SERVER
Starting test: Replications
......................... SERVER passed test Replications
Starting test: NCSecDesc
......................... SERVER passed test NCSecDesc
Starting test: NetLogons
......................... SERVER passed test NetLogons
Starting test: Advertising
......................... SERVER passed test Advertising
Starting test: KnowsOfRoleHolders
......................... SERVER passed test KnowsOfRoleHolders
Starting test: RidManager
......................... SERVER passed test RidManager
Starting test: MachineAccount
......................... SERVER passed test MachineAccount
Starting test: Services
......................... SERVER passed test Services
Starting test: ObjectsReplicated
......................... SERVER passed test ObjectsReplicated
Starting test: frssysvol
......................... SERVER passed test frssysvol
Starting test: frsevent
......................... SERVER passed test frsevent
Starting test: kccevent
......................... SERVER passed test kccevent
Starting test: systemlog
......................... SERVER passed test systemlog
Starting test: VerifyReferences
......................... SERVER passed test VerifyReferences
Running partition tests on : ForestDnsZones
Starting test: CrossRefValidation
......................... ForestDnsZones passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... ForestDnsZones passed test CheckSDRefDom
Running partition tests on : DomainDnsZones
Starting test: CrossRefValidation
......................... DomainDnsZones passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... DomainDnsZones passed test CheckSDRefDom
Running partition tests on : Schema
Starting test: CrossRefValidation
......................... Schema passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... Schema passed test CheckSDRefDom
Running partition tests on : Configuration
Starting test: CrossRefValidation
......................... Configuration passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... Configuration passed test CheckSDRefDom
Running partition tests on : mydomain
Starting test: CrossRefValidation
......................... mydomain passed test CrossRefValidation
Starting test: CheckSDRefDom
......................... mydomain passed test CheckSDRefDom
Running enterprise tests on : mydomain.local
Starting test: Intersite
......................... mydomain.local passed test Intersite
Starting test: FsmoCheck
......................... mydomain.local passed test FsmoCheck
All tests are passing.... I am not getting any other event error.... It just goes offline and you cannot log into it unless you do a hard boot but it does keep managing the users and providing the shares without any other problems.
Any ideas?
ASKER
The problem turned out to be a bad router. It would hang the server when it was trying to run ntbackup to a remote computer (in retrospect, I realize it was probably not really hanging but just too busy to let me log in. If I had waited, maybe several hours, lol, at some point I would have got a login screen).
What gave it away was that, while I was cleaning up the event logs, I saw ntbackup pop up and never progress. The strange part is that it did not log the ntbackup event as failed (my guess is that we never gave it a chance to finally fail: it was set to retry for 72 hours and then fail).
I am awarding points because of the moral support. When you are "in the box" .... advice from "out of the box" ... helps you think more clearly and this is what I feel I got from you: by forcing me to address and tackle the issues that Intuit had told me to "ignore", I was able to be working on something else that allowed me to "witness" the issue....
Thanks guys.
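A job that retries for 72 hours without ever logging a failure is hard to spot, so one option is to wrap the backup in a script with a hard deadline. This is a minimal sketch in Python, assuming the backup can be started from a command line; the function name `run_with_deadline` and the example command are placeholders, not the actual ntbackup job from this server.

```python
import subprocess
import sys


def run_with_deadline(command, timeout_seconds):
    """Run a command and report 'ok', 'failed', or 'timed out'.

    subprocess.run kills the child process if the timeout expires, so a
    stalled job cannot sit there retrying for days without a result.
    """
    try:
        completed = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return "timed out"
    return "ok" if completed.returncode == 0 else "failed"


if __name__ == "__main__":
    # Placeholder command: a child process that sleeps longer than the deadline,
    # standing in for a backup that is stuck on a bad network path.
    stuck_job = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_deadline(stuck_job, timeout_seconds=2))
```

The result string could then be written to a log or emailed, so a stalled backup shows up as a failure instead of a server that looks hung.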