Please help decipher this chkdsk log, it fixed our Ghost problem.

Hi Everyone,

We were getting an odd error in our backup software (Norton Ghost). The backup was completing, but with an error that took several hours with support to figure out. They claimed they had never seen it before. They ended up escalating to a software engineer, who recommended running chkdsk /f /r. Sure enough, it fixed the problem!

Can anyone help decipher the log and let us know if it looks like anything we should be concerned about?  We do believe the staff has powered it down before without a proper shutdown for what it's worth.  

If anyone is interested, the odd error Norton Ghost was throwing that was finally fixed by this chkdsk was Error "E98F0004: RPAM initialisation failed."  Hopefully if anyone gets it again this post will help.


The log is attached.  Thanks!
A disk check has been scheduled.
Windows will now check the disk.                         
Cleaning up minor inconsistencies on the drive.
Cleaning up 218 unused index entries from index $SII of file 0x9.
Cleaning up 218 unused index entries from index $SDH of file 0x9.
Cleaning up 218 unused security descriptors.
CHKDSK is verifying Usn Journal...
Usn Journal verification completed.
CHKDSK is verifying file data (stage 4 of 5)...
File data verification completed.
CHKDSK is verifying free space (stage 5 of 5)...
Free space verification is complete.

 488335837 KB total disk space.
  34918776 KB in 74611 files.
     23016 KB in 6601 indexes.
         0 KB in bad sectors.
    210505 KB in use by the system.
     65536 KB occupied by the log file.
 453183540 KB available on disk.

      4096 bytes in each allocation unit.
 122083959 total allocation units on disk.
 113295885 allocation units available on disk.

Internal Info:
90 6b 01 00 46 3d 01 00 76 a6 01 00 00 00 00 00  .k..F=..v.......
7c 01 00 00 06 00 00 00 e7 02 00 00 00 00 00 00  |...............
4a 12 ca 06 00 00 00 00 3a 57 4c 18 00 00 00 00  J.......:WL.....
76 cc 5e 0e 00 00 00 00 f4 94 07 0e 02 00 00 00  v.^.............
d6 ad ca 33 0a 00 00 00 04 e0 e1 7d 0c 00 00 00  ...3.......}....
99 9e 36 00 00 00 00 00 90 38 07 00 73 23 01 00  ..6......8..s#..
00 00 00 00 00 e0 45 53 08 00 00 00 c9 19 00 00  ......ES........
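
For anyone trying to read the summary block of a log like this, a quick sanity check is to confirm that the numbers add up. The following is only a minimal sketch in Python using the figures copied from the log above. The variable names and the helper `clusters_to_kb` are made up for illustration. It checks that files + indexes + bad sectors + system use + free space equals the total, and that the allocation-unit counts agree with the KB figures.

```python
# Figures copied from the chkdsk summary above.
total_kb = 488335837
files_kb = 34918776
indexes_kb = 23016
bad_kb = 0
system_kb = 210505        # the 65536 KB log file is counted inside this figure
available_kb = 453183540

cluster_bytes = 4096      # "bytes in each allocation unit"
total_clusters = 122083959
free_clusters = 113295885


def clusters_to_kb(clusters, cluster_size=cluster_bytes):
    """Convert a count of allocation units to KB (hypothetical helper)."""
    return clusters * cluster_size // 1024


# The categories in the summary should account for the whole volume.
accounted_kb = files_kb + indexes_kb + bad_kb + system_kb + available_kb
print("categories sum to total:", accounted_kb == total_kb)

# Free allocation units should match the "available on disk" line.
print("free clusters match:", clusters_to_kb(free_clusters) == available_kb)

# Total allocation units vs. total KB; a small remainder is just rounding.
print("rounding difference (KB):", total_kb - clusters_to_kb(total_clusters))
```

The sum only works out if the log file is not added a second time, so the 65536 KB appears to be part of the "in use by the system" line. Consistent totals and 0 KB in bad sectors say nothing about what the RAID or drive firmware may be handling underneath.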


Asked by Jsmply

noxcho (Global Support Coordinator) commented:
The answer is here: http://support.microsoft.com/kb/255008/EN-US/
As for the problem you had, I suspect that errors in the file system kept Norton from performing the online backup.
Checking volumes for file system errors is good practice in general, as MS file systems resemble shaky towers made of bad concrete. =)
Did Symantec support tell you what RPAM means?

edbedb commented:
The log shows that errors were found on the drive and they were repaired.
Improper shutdowns are a great way to mess up the file system on the hard drive.
Several hours to suggest you check the drive for errors? That is funny.
Jsmply (Author) commented:
Well, I guess we spoke too soon. The error came back on last night's backup and Norton has no idea why it is occurring. Noxcho, no, they didn't know exactly what it means. All he could say was that it refers to the "boot drive". At this point we don't know what else to try, as Norton is stumped.

noxcho (Global Support Coordinator) commented:
Shame on Symantec. Did you check the drive with the HDD vendor's own tools? The problem could be a failing drive.
Mark Kiser (IT Manager) commented:
Download Ultimate Boot CD here:
http://ubcd.sourceforge.net/download.html
It has a variety of tools, including hard drive test software from most manufacturers. It has helped me with diagnosing several drive problems.
Jsmply (Author) commented:
Okay, well, we heard back from Symantec. They supposedly researched the error and they say it's a registry issue or a corrupted DLL, and that running sfc /scannow will fix the problem. We can't test that till after the weekend. Any thoughts? Regarding the UBCD, has anyone ever tried it with a RAID 1 array? Will the hard drive tests see both drives?
ocanada_techguy commented:
Well, at least you got one recent gHost backup to complete error-free, and that's important (or didn't you?)

What version of gHost?

noxcho's KB link seems right on point, except it applies to Win2000 and you've posted in XP. Are you on XP?

Which service packs and hotfixes? You know what sfc /scannow does: it's the Windows "system file checker". It's going to try to determine if any DLLs don't match what it thinks you should have, while also updating the sfc list of what it thinks you have with the DLL versions you actually have. It might replace one or two from a restore point.

You have RAID: on the mobo or a dedicated card? Maybe that firmware or those drivers have an issue? Could Service Pack 3 have caused a new inconsistency?
The log from your chkdsk /f /r says it found 0 bad sectors, but then of course the RAID probably handled them at a hardware level so they never show up to the OS.
It's also possible the chkdsk bug was there all along, and only now that you have this "security descriptor" hiccup has the problem revealed itself.

If there's a golden rule, it's a tie between "first do no harm" and "if it ain't broke, don't fix it"

No, the HD diagnostics more or less don't deal very well with a RAID middle-man situation. It would depend on which manufacturer's tool. Theoretically you could attach each drive directly and have the manufacturer's diagnostics check the drive as if it had a partition of UNKNOWN type on it. If it remapped some bad sectors, you'd have to hope that the RAID card would recognize the spared sector chains when the drive is returned to it, which is not an entirely safe bet. Are you prepared to restore from backup if it goes kablooey? Realistically, your RAID controller has a RAID BIOS which you can enter at boot time, and it SHOULD have utilities there for testing the drives and reporting the state and health of the arrays (along with re-establishing mirrors or replacing stripe or parity drives).

Both drives? It's a simple mirror? If so, then in theory you could detach one and you'd have two copies of the same drive, so one could now be a backup. You could test the other, put it back in, and then re-establish the mirror with a "new" (used), emptied replacement third drive in place of the second one, which you removed for backup reasons and put in safekeeping in case you need it.
Jsmply (Author) commented:
So if the chkdsk found zero bad sectors, would that lean towards the hard drive not being bad? The chkdsk log didn't seem that bad to me. The RAID array info on boot says both drives are healthy. We ran the Dell diagnostic utility and it took almost 9 hours to get to 96 percent, and we had to cancel the test as the staff needed the machine. It did not find any errors but took forever doing the read tests on the two drives. This was the Dell pre-boot assessment diagnostic that then goes to the utility partition. We had never tried it on a RAID array though. Any idea if it can accurately see both drives that way?

Oh, and yes, it's Windows XP and it came with SP3 integrated. Norton wants us to either run sfc /scannow or do a system restore to a point before that error began (about a week ago). They say it's a system issue and it should NOT impact the database backups on this machine. They assured us those are fine. It is Ghost 15.
ocanada_techguy commented:
Yeah I commented on your other post too.

No, kinda the opposite. If the RAID or the drive's automatic bad-sector handling built into the hardware handled the bad sectors automatically, then to chkdsk in the Windows OS it will always "appear" as though there are 0 bad sectors, because the hardware handles it first. The mere fact that you ran chkdsk with /r makes it read the drive thoroughly, and THAT "triggers" the RAID / HD logic board to "handle" the bad sectors first, so chkdsk doesn't end up handling them itself. It's under the blankets.
Drives with S.M.A.R.T. technology record the fact that these events occurred, and IF certain thresholds are exceeded then a red flag is raised. That happens, for example, if the drive is almost out of spare area to swap bad sectors into, or if too many bad sectors happened all at once, which is not the normal wear-and-tear where bad sectors are bound to happen but an indication of something more serious. Most computers completely ignore SMART errors except during the POST (power-on self-test) at boot, at which time it'll say whoa, SMART errors, hard drive may imminently fail. Ideally Windows would warn right away, but it doesn't. Ideally the RAID disk controller would warn, but it doesn't necessarily. Some well-written RAID drivers record events in the Event Logs; some do NOT and instead keep their own logs at a hardware level only. So your problem is that all this is happening "UNDER THE BLANKETS", so you don't know that's what's gone wrong; you only see that the resulting file is bad or whatever.
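
To make the threshold idea concrete, here is a rough Python sketch of the comparison: each attribute has a normalized value and a vendor-set threshold, and an attribute counts as failing when its value drops to or below that threshold. The class, the function name and the sample readings are all made up for illustration; they are not readings from the drives in this thread, and behind a RAID controller the real values may not be visible to the OS at all.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    name: str
    value: int      # normalized current value reported by the drive
    threshold: int  # vendor-defined failure threshold (0 = informational only)


def failing_attributes(attrs):
    """Return names of attributes whose normalized value is at or below threshold."""
    return [a.name for a in attrs if a.threshold > 0 and a.value <= a.threshold]


# Hypothetical readings, for illustration only.
sample = [
    SmartAttribute("Reallocated_Sector_Ct", 100, 36),  # healthy: well above threshold
    SmartAttribute("Power_On_Hours", 80, 0),            # informational, never "fails"
    SmartAttribute("Example_Failing_Attr", 5, 10),      # at or below threshold
]

print(failing_attributes(sample))
```

A drive can be remapping sectors for a long time before any attribute crosses its threshold, which is why a "healthy" status at boot does not rule the drive out.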

If it started happening about a week ago, I'd be tempted to use system restore to put it back the way it was just before that.
When you first bring up System Restore and say you want to restore to a previous point and start looking at the calendar, take note of what events triggered restore points on the dates around that time.
There's a slight chance a hotfix has created a new issue.  If so, the next subsequent automatic updates may recreate the problem after you solve it with a restore point.  If that happens, then you'd need to check if there are new issues with the disk raid controller drivers that everyone else with this platform may be having.  You'd be checking your manufacturer and RAID maker (Dell I guess) support and forums in particular.
Was any software installed or updated/upgraded around that time? If so, a) it'll likely have to be reinstalled, since you're putting the operating system back the way it was before and those OS changes are being undone, b) that may be the culprit, c) or you can try operating without it for a while. In particular, try doing the gHost backup before the rest of that stuff has a chance to mess it up again.

sfc is all about the case where some rogue or badly written program replaced a file it should not have. Ah, well, System Restore can help with that too, since it just blindly makes a system checkpoint every day (as long as there's enough free space to do so, System Restore keeps about 90 days' worth of restore points, whether automatic or manually made). SFC might help but is not without a few hiccups of its own; you may want to read all about that here: http://www.updatexp.com/scannow-sfc.html

HOWEVER, having said all that, there is the possibility you'll do all that and the problem does NOT go away. Think about this... the link noxcho provided says that in some RARE cases there could be some minor inconsistencies with the file and security descriptors that Win2000 chkdsk would seem to fix, but they would come back again. MS fixed that bug, and besides, you're on XP, not 2000. But suppose for a moment your disk has some inconsistencies that the RAID tried to handle automatically, BUT did NOT handle so well. Well, then the inconsistencies may not be fixed properly. You got one gHost backup done OK and then the inconsistencies came back. Maybe some more bad sectors happened at around the same spot, and the error correction was again not able to correct them, causing the problem again?

What one person suggested is that if the system seems inordinately slow, it could very well be spending tons of time under the blankets re-reading and re-reading until it finally succeeds. I've seen plenty of people whose system is slow; we look in the event log, and WOW, that's a TON of disk errors, warnings and retries, no wonder. But RAID... could be "hiding" that.

Unfortunately you're in no position to know whether the Dell diagnostics were inordinately slow or if that was normal diagnostic speed. Stupid thing should've given an ETA, but it didn't.

It's so complicated.

Listen, if it were me, I'd be really PUSHING Symantec support HARD for a thorough explanation of what the heck an Error "E98F0004: RPAM initialisation failed." is, and wtf do you mean you don't know?!? Are they just guessing or do they have good reason to believe it's an incompatibility with a system DLL? Why exactly?

I'd be tempted to BOOT with a gHost bootable disk and see if you can back up with that A-OK. If you cannot, then some corruption in the file descriptors or allocation indexes or something is suspected, and sfc has NOTHING to do with it. If you CAN, then yeah, the issue is probably running gHost under Windows and some incompatibility or bug, so try a System Restore point. Maybe also try booting and running the good ol' gHost 2001/03.

I'd be tempted to scrub disks and restore backup, but I'd sure want to be sure I had a good backup I can restore first eh!!!

Say, maybe try Acronis True Image. If that works better, maybe switch to that. At that point I'd be very indignant and insistent that Symantec refund your money (not that they'd be inclined to). Acronis might be very interested to hear about a competitor problem they don't have. Truth is the perfect defense to libel.

Good luck. Let us know.
Jsmply (Author) commented:
Thanks. Actually yes, Norton had to reinstall Ghost due to an update to the program. The uninstall froze and needed the Norton removal tool. The RPAM error started after that. Other than that, the system works, is peppy, and shares a database with several machines.
noxcho (Global Support Coordinator) commented:
Looks like Norton was not removed completely and thus left some traces in your registry. Try to uninstall Norton and then erase all the traces from the registry via regedit (search for symantec, ghost or norton), then install Norton again and see if you still get these errors.
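
Before deleting anything by hand in regedit, it may help to get a read-only list of where such leftovers sit. Below is a minimal Python sketch using the standard `winreg` module (Windows only). The function names, the choice of starting under HKEY_LOCAL_MACHINE\SOFTWARE and the depth limit are assumptions made for illustration. It only matches key names, not values, and it changes nothing.

```python
import sys

# Search terms taken from the suggestion above (vendor and product names).
TERMS = ("symantec", "norton", "ghost")


def name_matches(name, terms=TERMS):
    """Return True if a registry key name contains any term, ignoring case."""
    lowered = name.lower()
    return any(term in lowered for term in terms)


def find_keys(root, path, max_depth=3):
    """Yield subkey paths under root\\path whose names match TERMS (read-only)."""
    import winreg  # Windows-only standard library module

    try:
        key = winreg.OpenKey(root, path)
    except OSError:
        return  # key missing or access denied: skip it
    with key:
        index = 0
        while True:
            try:
                child = winreg.EnumKey(key, index)
            except OSError:
                break  # no more subkeys
            index += 1
            child_path = path + "\\" + child
            if name_matches(child):
                yield child_path
            if max_depth > 1:
                yield from find_keys(root, child_path, max_depth - 1)


if __name__ == "__main__" and sys.platform == "win32":
    import winreg

    for hit in find_keys(winreg.HKEY_LOCAL_MACHINE, "SOFTWARE"):
        print(hit)
```

A plain substring match will also flag unrelated software (anything with "ghost" in its name, for example), so treat the output as a list to review rather than a list to delete, and export a registry backup first.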
Jsmply (Author) commented:
Will do.  We are going to take the machine offline this weekend so all of this can be done thoroughly and without rushing.  Will post again then.  Thanks again!
Jsmply (Author) commented:
Thanks everyone.  The machine is offline for the weekend and the diagnostic is being run.  After that (assuming it checks out) we are going to take this opportunity to start with a fresh image and install Windows 7, as it's been on the list of things to do for a while.  That should take care of the issue.  Thanks for the help!