SBS 2008 server unresponsive and requiring a power restart to reboot, sometimes hourly, sometimes weekly; need help

Hi. We have taken over a new site that has a new SBS 2008 SP2 server (IBM 3650, dual quad-core 2.4 GHz, 16 GB memory), fully patched with all MS updates installed, including Rollup 5. The server keeps becoming unresponsive. This can happen several times in a day or not for several days. We have run the SBS 2008 Best Practices Analyzer and applied all fixes, including "Fix My Network", which solved some IPv6 issues even though IPv6 was enabled. There is no BSOD, so there is no MS dump to analyze. The MS logs only show some DCOM 1009 events, nothing to do with the server going down (in fact there is usually a gap in the logs until the server is restarted via power-off or an IBM IMM console power-off). We have tried scheduling server restarts, but this had no effect. IBM have checked all the IBM DSA log files and state that there appears to be no hardware failure.
I have been fighting this problem for several weeks now and hate to say this, but I'm stumped. Any and all help appreciated.
simohsAsked:

LingerLongerCommented:
First thing I would try is uninstalling all non-essential software. Start with the anti-virus, then backup software (if other than built-in), then other monitoring software/utilities like spam software, etc., if any. Line of business applications and core SBS stuff can stay of course.
Do these one at a time, with reboots after the removals to ensure they are completely removed. Double check with the various vendors on their "complete" uninstall procedure, as some anti-virus products have pages of steps to accomplish this.
Just disabling these items is not sufficient, as their hooks are still in various places. And yes, running without virus software isn't a good idea long term, but if you're fully patched, have client computers that are running anti-virus software, behind a firewall, and not using the server as a workstation or P2P host, you should be OK for a day or two or even a week.
My guess is anti-virus.
simohsAuthor Commented:
Hi Guru. AV is NOD32 V4 Exchange Edition. I have just uninstalled it but can't restart the server until the customer finishes for the day (or, of course, the server goes unresponsive). Backup is ShadowProtect SBS Edition V4.
LingerLongerCommented:
OK, I had a very similar issue, the server didn't go completely off the table like yours is doing, but with ESET V4 Exchange, Exchange for us was basically frozen.
Turns out there are log files created by ESET. They're in my C:\WINDOWS\TEMP directory. The filenames are "NOD****.TMP". The *'s are filled with Hexadecimal values, so up to 65,536 files can be created. If for some reason 65,536 files are created, ESET doesn't have any more room to process anything, and the Exchange Server (for us) went into limbo.
I guess these are work files for ESET to process data. Not every file, email, or attachment goes into a temp file, it's got some voodoo logic for when it needs to create one.
Uninstalling ESET does not purge these files. We ran great with ESET uninstalled, and it fell over immediately after the reinstallation. After getting on the phone with ESET Support, they had me manually delete these files, and all was well.
As of version 4.20.10020.0, they don't have an automated process for this. They are working on a fix, not sure if any newer versions are available or if they automate this cleanup. For now, I just check this every couple of months, and delete whatever NOD****.TMP files are in there.
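The periodic check described above could be scripted. What follows is only a rough sketch, not an ESET-supplied tool: it assumes the "NOD****.TMP" names are "NOD" plus exactly four hexadecimal digits, and the function names (`find_nod_temp_files`, `purge_nod_temp_files`) are made up for illustration. It defaults to a dry run, so nothing is deleted unless you ask for it.

```python
import re
from pathlib import Path

# Assumption: "NOD" + four hex digits + ".TMP", per the NOD****.TMP description.
NOD_TEMP_PATTERN = re.compile(r"^NOD[0-9A-F]{4}\.TMP$", re.IGNORECASE)

# Four hex digits give 16**4 = 65,536 possible file names.
MAX_NAMES = 16 ** 4


def find_nod_temp_files(temp_dir):
    """Return the files in temp_dir whose names match the NOD****.TMP pattern."""
    return sorted(
        p for p in Path(temp_dir).iterdir()
        if p.is_file() and NOD_TEMP_PATTERN.match(p.name)
    )


def purge_nod_temp_files(temp_dir, dry_run=True):
    """Delete matching files; with dry_run=True only report what would go."""
    matched = find_nod_temp_files(temp_dir)
    if not dry_run:
        for path in matched:
            path.unlink()
    return matched


if __name__ == "__main__":
    temp_dir = Path(r"C:\WINDOWS\TEMP")  # directory mentioned in this thread
    if temp_dir.is_dir():
        files = find_nod_temp_files(temp_dir)
        print(f"{len(files)} of {MAX_NAMES} possible NOD****.TMP names in use")
```

Run it as is to see how close the directory is to the 65,536 limit, then call `purge_nod_temp_files(temp_dir, dry_run=False)` once you are happy with what it matched. Check with ESET Support first whether the product needs to be stopped before the files are removed.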
simohsAuthor Commented:
Just checked the Windows\Temp folder and there are no nod*.tmp files present at all. There are plenty of Tmp*.tmp files, though, all 0 KB; I have deleted these and will restart the server later today with NOD uninstalled.
Cliff GaliherCommented:
A system going unresponsive with *no* corresponding events in the logs makes me suspect hardware stability. While you gave the specs (and almost *because* you gave the specs) I have to ask: is this a well-built server (aka OEM or certified server parts), or is it a white-box with unknown or uncertain parts? In particular, if this is a white-box, what'd concern me most immediately is the motherboard and whether it is running with ECC RAM. The second tier of suspects would be the NIC and drive controller (backplane, SAS, RAID), and finally I would even look at the processor, thermal metrics, etc.

-Cliff
simohsAuthor Commented:
Hi cqaliher. All IBM-branded and certified hardware, including memory and SAS drives. IBM have had me run diagnostics and log-gathering software and can find no problem with the hardware. I have checked the IMM logs and all thermals are well within operating range.
LingerLongerCommented:
Did this issue resolve itself with ESET uninstalled?
simohsAuthor Commented:
Yes, the system has now been stable for over three weeks. I will now have to reinstall and find an exclusion to stop this from happening again. Thanks for all of the help.
boat_ankerCommented:
Hi Simohs, did you find an exclusion to stop it from happening again?