Mattia Minervini

HP ProLiant with Smart Array P400 controller: two hot-swap disks (RAID 1) predicted to fail soon

Hello everybody,
I have a big problem. As you can tell from the title, I have a server running Open Enterprise Server 2 (SLES 10) with two disks in hardware RAID 1. The logical drive has two partitions: the first holds the OS and the second (an NSS, NetWare partition) holds 100 GB of data, with NetWare user rights set.
After a problem with one disk, HP had me upgrade the server firmware, in particular for the Smart Array controller and the older disk.
The server then began a rebuild.
Afterwards both disks started to blink amber, and the HP BIOS reported two errors about the disks: (1) something about bad sectors, an error that should be resolved once the sectors are rewritten; (2) both disks are predicted to fail soon.
I have two new disks, ready to replace both.
I know that backing up the data is my own problem, but my questions are:

a) If both disks are having trouble but the system boots and works correctly, can the data on that logical drive (mirrored across the two disks) be considered SAFE and CONSISTENT?

b) If I now replace the first disk, wait for the rebuild (monitoring it from the ACU), then replace the second disk and wait for a second rebuild, is that the RIGHT PROCEDURE?

I cannot reinstall the OS. I hope the Smart Array has an exact copy of the data on both disks (even though they are "predicted to fail soon").
Please help, and ask me for any details you need! I'm waiting...

jakethecatuk

a) You can never guarantee that data will be safe and consistent.  If there are bad blocks on one drive, these can be mirrored over to the other drive.  The only way to know would be to run a drive integrity check from the OS to see what it makes of it.  If it comes back clean then you're ok.

b) It shouldn't matter which order you rebuild the RAID in.  However, you are better off doing this when there is no load on the server (no users accessing the data, as changes being made by users will slow down the rebuild).

Do you have a USB drive you can copy the data to before you do anything else - just in case the raid fails during the rebuild?
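
Once the copy has finished, one way to check it before pulling any disk is to compare checksums of the original files against the copies. The following is only a minimal sketch, not an HP or Novell tool: the function names `file_digest` and `compare_trees` are made up for illustration, it assumes Python 3.9 or later on a machine that can read both directory trees, and it compares file contents only, so it says nothing about NetWare rights on the NSS volume.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(source: Path, backup: Path) -> list[str]:
    """List files under source that are missing from, or differ in, backup."""
    problems = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue  # directories are only walked, not compared
        rel = src.relative_to(source)
        dst = backup / rel
        if not dst.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif file_digest(src) != file_digest(dst):
            problems.append(f"differs: {rel.as_posix()}")
    return problems
```

Calling `compare_trees(Path("/media/nss/DATA"), Path("/media/usb/DATA"))` (both paths are hypothetical) returns an empty list when every source file has an identical copy. A read error on the source side would itself be a sign that the mirror is handing back bad blocks.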
Mattia Minervini (ASKER)

Thanks for the reply.
I'm copying all the important data to a USB disk right now from the running OS, with no load (network cable unplugged).
It will finish in about half an hour.

There is nothing I can do for the OS: to make an image of it I would need a program that supports imaging a NetWare partition while it is running.
The data is more important to us anyway.

Then I'll unplug one hot-swap disk and replace it with a new one.
After that I'll check the rebuild in the ACU and wait.
When it finishes, I'll replace the second disk and wait for the second rebuild.
Is that OK in your opinion?

In HP hardware RAID 1, is one disk the "master" and the other the "slave" (or "source" and "destination")?
If so, I would try removing the slave first.

Thanks again
I would try rebuilding the disk with the lowest disk ID first - in your case, what you are calling the master.  Make sure you put the defective drive somewhere safe.  If the whole array crashes during the rebuild, you still have what was the master to reboot off again.

Good luck - hope it works out for you.
Another question:
I think there may be a problem with the MBR and/or boot flag between the 1st and 2nd disk of the RAID 1.
The MBR should be written on the 1st disk, so if I remove the 1st disk I'm not sure what will happen at the next reboot.
What do you think about it?

MBR should be the same on both disks or the RAID is useless.  If the 1st drive failed and there was no MBR on the second drive, the system wouldn't boot.
Member_2_231077

It doesn't matter which order you replace the disks in; there's no master/slave relationship.

Since you have two dodgy disks, I would split the mirror using the ACU rather than just pop one out; then you have two disks with the OS and data on them rather than one that's been hot-plugged and therefore hasn't been shut down cleanly. To do this, shut down, boot the SmartStart CD and use the ACU from there under "Maintain Server". The split mirror procedure using the ACU also makes sure the data is good on both disks before it indicates that the split is complete.

Good idea to split the mirror before replacing the disks.
But I can't do the split, because the ACU on the SmartStart CD says the array is REBUILDING, and it is stuck at 10%.
The ACU on the running OS shows the same message.

Now, incredibly, the amber lights on both disks have gone out,
and the ACU reports only the 1st disk as predicted to fail.

What's happening?
ASKER CERTIFIED SOLUTION
Mattia Minervini
