corptech asked:

Verifying directory file structure

ServeRAID 3L card running RAID 5 across 6 drives. We had some bad hard drives, which have been replaced. Now we keep losing the SYS volume and have to run NSS /poolrebuild to recover it. The hard drives test OK, so it appears to be file corruption. It has been a while since I have worked on Novell. What is the best way to verify directory file integrity?
Ghost96

Can you tell us what you mean by "keep losing SYS volume"?

Are you getting messages about pool data integrity, after which the pool goes through a dismount process, basically rendering your server useless?

Because if I had to guess (and I know I'm waaay premature on this guess), I would say it's possible you've got a RAID controller problem and not a pool problem, especially if you are constantly going through pool rebuilds.

Your pool rebuilds are treating the symptom - treat the disease.

Any information that shows on the server console or logger screen would be great. I just had to deal with this on 2 servers with the same type of RAID controller card. The pools would dismount and upset everyone, because any unsaved data was lost. I replaced the card on each and the servers were fine afterwards.

So it turned out to be 2 bad RAID controllers, not drives or pool/volume issues. Here's the kind of message I was getting on both of them (screenshot at the link below):

http://www.bndservices.com/ee/badraid1.jpg
elf_bin

You've not made it very clear what you mean, but a quick and dirty answer is to look at nss /?; there are many verification and rebuild options.
corptech

ASKER

Sorry about not being clear on this issue. When the server boots, it fails to mount SYS and VOL1. The volumes command only shows the _Admin volume. By running NSS /poolrebuild on SYS, I can get the SYS volume back and mount it. I can then also mount VOL1. After the server has been running for a short period of time (30 to 180 minutes), I lose both SYS and VOL1, with this error message on screen:
NSS-3.00-5001:  Pool  ServerName/SYS is being deactivated.
An I/O error (20204(Zio.c[2279])) at block 3845073 (file block 28099)(ZID 3)
 has compromised pool integrity.
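
For anyone comparing several occurrences of this message, here is a minimal Python sketch that pulls the fields out of the console text so they can be logged side by side. It is based only on the single message quoted above; the function name and the field labels are my own (hypothetical), the sample string leaves off the trailing ID field, and other NSS versions may word the message differently.

```python
import re

# Pattern built from the one console message quoted in this thread.
# Group names are hypothetical labels, not official NSS terminology.
NSS_IO_ERROR = re.compile(
    r"Pool\s+(?P<pool>\S+)\s+is being deactiv\w+\.\s*"
    r"An I/O error \((?P<code>\d+)\((?P<source>[\w.]+)\[(?P<line>\d+)\]\)\)\s*"
    r"at block\s+(?P<block>\d+)\s*\(\s*file block\s+(?P<file_block>-?\d+)\s*\)",
    re.IGNORECASE,
)


def parse_nss_io_error(text):
    """Return the fields of a pool-deactivation message as a dict, or None."""
    match = NSS_IO_ERROR.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields so they can be compared or sorted.
    for key in ("code", "line", "block", "file_block"):
        fields[key] = int(fields[key])
    return fields


# Sample text copied from the message in this thread (trailing ID field omitted).
message = (
    "NSS-3.00-5001:  Pool  ServerName/SYS is being deactivated.\n"
    "An I/O error (20204(Zio.c[2279])) at block 3845073 ( file block 28099)\n"
    " has compromised pool integrity."
)
print(parse_nss_io_error(message))
```

Keeping the pool name, error code and block number from each occurrence gives you something concrete to post along with the console or logger output.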

ASKER CERTIFIED SOLUTION
Ghost96

This solution is only available to members.
Thanks for the feedback.  Am looking at getting a replacement RAID controller card.  Thanks.
OP, did you see my screenshot?  Sounds exactly like what you are experiencing. Let me also add this:

Pool rebuilds have you roll the dice on whether or not you're going to lose data as part of the process. It's a fact, so if anyone reading cares to disagree, bring on the supporting arguments please. People run this operation a bit too casually. I've had people do it and damage files in their volumes. This is most commonly a result of bad information being passed by the controller to the volumes you are fixing.

PLEASE make sure you have current volume backups before doing rebuild operations.
I'd like to echo Ghost96's warning about pool rebuilds. NetWare people are far too used to VREPAIR (I think it was called). Pool rebuilds in NSS are not the same thing and can be destructive. Do a search on Novell's support website and you'll find lots of TIDs warning you about this very thing.

Not to be taken lightly.
Thanks for the info. It was a combo of bad hard drives and a bad controller.