Disk Corruption - Event ID 55 and KB 932578?

We have a server running Windows Server 2003 with SP2 applied.  Our environment is completely Terminal Server oriented.  (i.e., all of our staff connect to the Terminal Server to do their work; nothing is saved locally.)

Periodically we get Event ID 55 errors on one of the two physical disks.  This has been happening for about 16 months now.  Originally it was on a "real" server.  Then we imaged it and set it up on a VMware server.  Still had the issue.  We purchased a new server and completely reinstalled the OS and applications.  The only thing copied from the old system was the data.  The problem has persisted.

I have been researching this issue, and the only thing that comes close to what we are seeing is addressed in MS KB 932578.  http://support.microsoft.com/kb/932578

However, that is for systems with allocation units smaller than 4096.  Unless Windows is reporting it wrong, our server has a size of 4096.
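
One way to double-check the reported allocation unit size is to read it from `fsutil fsinfo ntfsinfo`. The following is a minimal Python sketch, not part of the original post: it assumes the output contains a "Bytes Per Cluster" line, and the helper names `bytes_per_cluster` and `hotfix_condition_met` are made up for illustration.

```python
import re
import subprocess
import sys


def bytes_per_cluster(fsutil_output):
    """Return the cluster size in bytes from fsutil text, or None if not found."""
    # Assumes a line such as "Bytes Per Cluster :               4096".
    match = re.search(r"Bytes Per Cluster\s*:\s*([\d,]+)", fsutil_output)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def hotfix_condition_met(cluster_size):
    """True when the size is below 4096, the condition quoted above for KB 932578."""
    return cluster_size < 4096


if __name__ == "__main__" and sys.platform == "win32":
    # Drive letter is taken from the command line; defaults to C:.
    drive = sys.argv[1] if len(sys.argv) > 1 else "C:"
    result = subprocess.run(
        ["fsutil", "fsinfo", "ntfsinfo", drive],
        capture_output=True,
        text=True,
    )
    size = bytes_per_cluster(result.stdout)
    if size is None:
        print("Could not find a cluster size for", drive)
    else:
        print(drive, "cluster size:", size,
              "| smaller than 4096:", hotfix_condition_met(size))
```

Run from an administrator prompt, once per drive letter (for example C: and D:), this would confirm whether both volumes really report 4096.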

Everything else I have found is older than what we already have installed.

We don't do anything with double-byte characters either.  

It always seems to happen around specific times/days of the month, though nothing is consistent about the programs running at the time.  All that seems consistent is that disk I/O is heavier.

So.... my question is (finally)... Is it possible that the fix in KB 932578 could fix this issue, even though my unit size is 4096?  And is there any downside from trying this?


Shreedhar Ette Commented:

First run chkdsk on all the drives, and if it reports errors, back up the entire data and system state of the server.

After that, run chkdsk /f to fix the errors.

Check with your hardware vendor for any updates for the BIOS, the RAID controller (if any), and the NIC. Also check the server's hard drives for any issues.

I hope this helps,

JerryK440 (Author) Commented:
Thanks for the suggestions, Shreedhar.  I have already done that several times in the past 16 months.

In fact I am running yet another chkdsk /r as I type this, since ironically the issue happened again shortly after I posted my question.

Yesterday, we had switched over to our backup server when the issue cropped up.  So the hardware is different, yet the issue stays the same.  Though in this case it was corruption on drive C, rather than on drive D as it was yesterday.

CHKDSK /R usually fixes a couple items, then everything is fine for a time.

Like I said, the only thing that seems consistent is that the issue happens when there appears to be high disk I/O, which points me to the NTFS.SYS file.  I am assuming that all disk I/O goes through that driver, regardless of the physical drive being written to.
Shreedhar Ette Commented:

Update the RAID controller drivers.


JerryK440 (Author) Commented:
Thank you again for the suggestions, Shreedhar.  As I stated before though, I have already done that.
Drivers and firmware have been updated for everything in the PC.  (Even the CD drive, which is never used on the server.)

The problem has persisted, and has manifested itself on four separate sets of hardware.  Always under "heavy" load.  Heavy in quotes because the current server is not even working up a sweat.  Utilization is rarely over 10% for most items.  (Aside from one CPU core now and then when one user "compiles" financial statements for a client.)

Which is why I am looking at MS KB 932578.  http://support.microsoft.com/kb/932578  That patch seems to describe the issue precisely, but is for systems with a cluster size smaller than 4096.  Our server has a size of exactly 4096.

Would there be any adverse effects in installing this patch on a server with a cluster size of 4096?

Shreedhar Ette Commented:

There should not be any adverse effect. However, first compare the version of the file in the hotfix with the version currently installed.

I would also ask you to take a system state backup of the server before installing the hotfix. If possible, take a full backup of the server.
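
The version check suggested above can be sketched in Python. This is only an illustration: the version strings in the usage note are hypothetical placeholders, not the real versions of ntfs.sys or of the KB 932578 hotfix. The installed version would be read from the file's Properties dialog (Version tab), and the hotfix version from the file list in the KB article.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.2.3.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def hotfix_is_newer(installed, hotfix):
    """True if the hotfix file version is strictly newer than the installed one."""
    # Tuples compare element by element, so 1.2.10.0 correctly sorts after 1.2.9.0.
    return parse_version(hotfix) > parse_version(installed)
```

For example, `hotfix_is_newer("1.2.9.0", "1.2.10.0")` is true, while comparing the same strings as plain text would give the wrong answer. If the installed file is already at the same or a newer version, installing the hotfix would not be expected to change that file.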

I hope this resolves your issue,

JerryK440 (Author) Commented:
Thanks Shreedhar.  I will be trying that tonight.  

Hopefully all will go well, and our users will finally have some peace to work in!
Jerry, did this solve your issue?  We are experiencing the same thing on our terminal servers!