Data Corruption after RAID Rebuild on HP ML150 G5
Posted on 2010-11-19
Hey, I "upgraded" the hard drives on one of my clients' servers, which were in RAID 1. This is how I did it (I've done it this way many times and never had problems):
He had 2x 160 GB drives and I upgraded them to 2x 500 GB WD RE3 drives.
1- Shut down the server and plug in the 2x 500 GB drives
2- Start the server and create the array using the utility (pressing F8 in this case)
3- Shut down the server, unplug one 160 GB drive and one 500 GB drive, and plug them into another PC
4- Boot the other PC with Acronis True Image Echo and clone the 160 GB drive to the 500 GB drive
5- Unplug the "empty" 500 GB drive from the server and plug the freshly cloned 500 GB drive into it
6- Start the server with only the cloned 500 GB drive and check that everything works = yes, everything is OK
7- Shut down the server and plug the other 500 GB drive into it
8- Start the server and, in Windows Server, start a rebuild using the HP Storage Manager
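For the check in step 6, "it boots and the shares open" doesn't prove the files themselves are intact. Here is a minimal sketch of one way to check that, not something I actually ran on this server: hash every file once after the clone, hash again after the rebuild, and compare. The function names are made up for illustration, and it assumes Python is available on a machine that can read the volume.

```python
import hashlib
import os


def hash_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            rel_path = os.path.relpath(full_path, root)
            try:
                manifest[rel_path] = hash_file(full_path)
            except OSError:
                # Locked or unreadable files are recorded, not silently skipped.
                manifest[rel_path] = "UNREADABLE"
    return manifest


def compare_manifests(before, after):
    """Return sorted lists of (missing, added, changed) relative paths."""
    missing = sorted(set(before) - set(after))
    added = sorted(set(after) - set(before))
    changed = sorted(
        path for path in before
        if path in after and before[path] != after[path]
    )
    return missing, added, changed
```

Run build_manifest on a data folder right after the clone, keep the result, run it again once the rebuild finishes, and pass both to compare_manifests. Files that are legitimately modified in between (logs, databases) will show up as changed, so this is most useful on folders that should be static.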
Up to this point everything was fine: they had access to everything and nothing was lost. The reason I did the cloning on another PC is that the estimated time for the copy on the server was more than 14 hours, while on the other PC it only took 30 minutes (I think Acronis wasn't able to communicate properly with the HP RAID controller).
So, on to the problem:
After 2 days, the rebuild was completed, BUT the server cold-restarted by itself, and after that EVERYTHING on the OS partition was corrupt. I checked the logs I could access and saw that they had power outages all weekend long, which led me to believe that this messed up the rebuild or the data (they are on an APC Smart-UPS 1000).
So on Monday morning I went there, recloned the hard drive, restored their data from backup, and everything was fine once again. BUT once again, the RAID finished rebuilding and BAM, data corruption again! That made me suspect a defective hard drive... I ran tests on both of them and they came back OK.
So now I'm at that point with 2 new RE3 500 GB drives, about to do the same thing, but I'm really afraid of what will happen... What can it be? The server ran perfectly for over 2 years right up until I swapped the drives.
Any ideas on this?