RAID 5 Portion on Hot Spare contains half of data

I have an Adaptec 5805Z Controller Card with 5 drives. 4 drives are supposed to be member of  raid 5 array with a system partition and a data partition. The other drive is supposed to be a dedicated hot spare.

At present, the hot spare is showing that it is a member of the system partition, and drive 1 is showing that it is a member of the data partition. I have tried failing the hot spare in hopes of moving the system partition to drive 1. However, when I failed the hot spare, nothing happened. Reinserting the hot spare forced it to rebuild and then copy the system partition back onto itself. Failing Device 1 and reinserting it only causes it to rebuild the 2 TB data partition.

My goal is to get the system partition off the hot spare and onto Device 1. I am thinking of marking Device 1 as a hot spare and then unmarking drive 5 (the current hot spare), in hopes that the controller will copy the data portion of Device 1 onto drive 5. Then I should be able to swap drive 5 into the Device 1 slot and everything should be listed correctly.

Am I correct in the procedure for getting Device 1 to appear similar to the other drives?


Duncan Meyers Commented:
OK - that makes more sense. Adaptec RAID controllers used to be infamous for a very similar bug about 6 or 7 years ago with the on-board Dell PERC 2 and 3 cards. You'd have the controller report a failure on one drive on only one logical device and on a different drive on the other logical partition, so you couldn't pull out a drive to fix one partition without double-faulting the other. Ugh.

The quickest and simplest fix is bite the bullet and perform a full back up, then blow away the RAID configuration and build a new one. I'd recommend checking for RAID controller firmware upgrades while you're about it.
You are just wasting your time. This technique won't work, because only the BEGINNING of the partition is on the hot spare. In a 4+1 RAID 5 set, any one disk holds roughly a quarter of the total data; you are just seeing the first quarter of the first stripe.

You can't drill inside the RAID and make assumptions about the data. Once you go to RAID 5, no individual disk contains any more than X KB worth of data in one "chunk", where X is a function of how you set the stripe size.
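To see why no single disk carries a whole partition, here is a rough sketch (illustrative Python only, not tied to the Adaptec controller's actual on-disk format) of how chunks of a volume spread across a 4-disk RAID 5 set with rotating parity. The chunk labels and rotation order are assumptions for demonstration:

```python
# Illustrative sketch: map logical chunks of a volume onto a 4-disk
# RAID 5 set with rotating parity. Labels and rotation order are
# assumptions for demonstration, not any vendor's real layout.

def raid5_layout(num_chunks, num_disks=4):
    """Return a list of stripes; each stripe maps disk index -> content."""
    stripes = []
    chunk = 0
    stripe_no = 0
    while chunk < num_chunks:
        # Parity rotates backwards one disk per stripe ("left-symmetric" style)
        parity_disk = (num_disks - 1 - stripe_no) % num_disks
        stripe = {}
        for disk in range(num_disks):
            if disk == parity_disk:
                stripe[disk] = "P"          # parity chunk
            else:
                stripe[disk] = f"D{chunk}"  # data chunk
                chunk += 1
        stripes.append(stripe)
        stripe_no += 1
    return stripes

for i, s in enumerate(raid5_layout(9)):
    print(f"stripe {i}: " + "  ".join(s[d] for d in sorted(s)))
```

Running this shows every disk ends up holding a mix of data chunks and parity; consecutive chunks of "one partition" land on different spindles, which is why pulling a single disk can never isolate the system partition.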
Duncan MeyersCommented:
I think you may have a fundamental misunderstanding of how the RAID array works...

Both the system and data partitions are striped across all four drives. The image you've posted shows four drives on Connector 0, with two logical drives across the four drives. The drive that's second from the top is currently rebuilding.

Connector 1 has a single drive that is rebuilding. Once the rebuild completes, you should see the hot spare status change. The blue marker on the left-hand side of the hot spare appears to be a progress marker, not a partition marker. Allow the rebuild to fully complete, then check the status again. It would be wise to restart the Adaptec Storage Manager console.

invotecAuthor Commented:
I started the rebuild effort in an attempt to readjust the locations of the partitions. I had attempted the rebuild effort before. I have highlighted the allocation in red to illustrate my situation. The Logical Device "System" is currently residing on Device 0, 2, 3 and the Hot Spare. The "System" portion currently on the Hot Spare should be on Device 1, but it is not. I cannot seem to figure out how to move the system portion off the hot spare back to device 1.

Look, the hot spare is no longer a hot spare. It became part of the array once the rebuild started. If the array is still functional, you need to let it finish the rebuild. If not, you are going down the path of 100% data loss if you continue doing what you are doing.

The logic of doing what you want to do escapes me ... the "system" partition is located on all the disks in the RAID set.  Perhaps you are trying to do something differently from what you are asking for.

What are you trying to do, really, big picture? Do you want to just get the "hot spare" drive out of the system entirely and free it up to use elsewhere? If so, let the system rebuild, then do a full backup just in case, then make disk #1 the hot spare (since the rebuild frees it up), then create a failure by unplugging the original hot spare. This will force a rebuild onto that older disk #1.
invotecAuthor Commented:
If you look at the screen shots all of the rebuilding is complete. I am trying to make the Hot Spare a True Hot Spare. At present the one marked hot spare has a part of the system raid on it. The system raid part should reside on the Device 1 drive and not the drive labeled hot spare.

I recently discovered that the sector count on Device 1 is slightly lower, which could be why I can get the ServerName RAID onto Device 1 but can't fit the remaining 33.333 GB of the System RAID section.
Gerald ConnollyCommented:
invotec, you need to read up on how RAID systems work as the others have said.

But basically what you have is a 5 disk (aka spindle) setup as a 4 Disk RAID-5 container (3D+P), plus a Hot spare.

The fact that the hot spare is currently in use implies that at some point you either logically "failed" Drive 1, or it had a problem and the controller failed over to the hot spare as intended.

Now, your issue with the system and data partitions is that you do not understand how things are laid out. Firstly, your system partition is 100GB and is striped across all 4 spindles, with approx 30GB on each spindle. Your data is chopped into chunks, and for every 3 data chunks there is a parity chunk; but as RAID-5 uses rotating parity, each spindle has both data and parity on it.

Your data partition is striped in the same way across the same spindles (but is 2630GB).

Think of 4 bottles, each with 30 ml of water and 900 ml of oil. After things have settled, you will have a layer of water at the bottom of each bottle (that's your system partition) and a layer of oil above it (your data partition).
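The per-spindle shares can be worked out directly from the figures in the thread (100 GB system, 2630 GB data, 3 data disks + 1 rotating parity). A back-of-the-envelope sketch, arithmetic only:

```python
# Back-of-the-envelope sketch: how much of each partition (data plus
# its share of rotating parity) lands on each spindle of a 3D+P RAID 5
# set. Figures come from the thread; this is arithmetic, not a tool.

def per_spindle_gb(usable_gb, data_disks=3):
    """With N data disks' worth of usable capacity spread over N+1
    spindles (rotating parity), each spindle stores usable/N GB of
    raw blocks belonging to that partition."""
    return usable_gb / data_disks

system = per_spindle_gb(100)   # system partition share per spindle
data = per_spindle_gb(2630)    # data partition share per spindle
print(f"system partition per spindle: {system:.3f} GB")
print(f"data partition per spindle:   {data:.3f} GB")
```

Note the system share comes out to 33.333 GB per spindle, which matches the 33.333 GB figure mentioned earlier as not fitting on the smaller Device 1.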
When you make an HDD a hot spare, you tell the controller to make it a hot spare. It is that simple. The controller only writes the metadata; it doesn't reformat the contents. Ignore what is on the disk. The controller will just write over it in the event an HDD fails and the controller brings the spare into action.

If this bugs you, reformat the HDD on another computer, but only after you tell the controller you are getting rid of the hot spare. You don't want to confuse it.

As for a USABLE hot spare ... you never mentioned that before ;)
First, a controller will gladly make ANY sized HDD a hot spare; that by itself doesn't help you. You want a hot spare that can take over if one of the 4 disks dies. The rule is simple: look at the specs of the disk. As long as its capacity is >= the SMALLEST disk in the RAID set, it will be suitable. (The other caveat is that the disk needs to be supported by the controller.)

Many controllers build in a "fudge factor" to get around small differences in block counts. The catch is that the rules are vendor/product specific. If all the HDDs are the same make/model/firmware revision, it is a non-issue. If the disks are different but have a similar published capacity, you must contact the controller vendor to find out whether the disk is suitable.
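The sizing rule above can be stated in a few lines. A hedged sketch (the function name and the sector counts are made up for illustration; real controllers apply their own vendor-specific fudge factors on top of this):

```python
# Hedged sketch of the sizing rule described above: a hot spare is
# suitable if its capacity is at least that of the SMALLEST member of
# the RAID set. Names and sector counts are illustrative only; real
# controllers may add a vendor-specific tolerance.

def spare_is_suitable(spare_blocks, member_blocks):
    """True if the candidate spare can stand in for any member disk."""
    return spare_blocks >= min(member_blocks)

# Four members, one slightly smaller (e.g. a different model "2TB" drive)
members = [3907029168, 3907029168, 3907029168, 3907024000]  # 512-byte sectors

print(spare_is_suitable(3907029168, members))  # >= smallest member
print(spare_is_suitable(3907000000, members))  # below smallest member
```

This also explains the asker's symptom: Device 1 reports slightly fewer sectors than the others, so logical drives sized to the larger disks cannot be fully rebuilt onto it.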

invotecAuthor Commented:
I spoke at length with Adaptec and had stumped them pretty good when they reviewed my situation. According to Adaptec to get things back to the state that I desire, there are several options....

Option 1:
I can perform a Skip Init 0 in an attempt to reassign the SYSTEM RAID member that is on Connector 1 Device 1 back to Connector 0 Device 1, since, as you noticed, the drive on Connector 1 Device 1 is no longer functioning as a hot spare.

Option 2:
I can break the RAID and everything and start over. At this point I think I will go with creating 1 RAID with 2 partitions, as opposed to 2 RAIDs with 2 partitions on the same four drives. I was informed that it is not best practice to store 2 RAIDs on the same drive set, which can create the issues I am having, as the Adaptec RAID controller is not smart enough to let me take the remaining space on the drives and make it a hot spare, etc.

Option 3:
I can purchase two smaller drives, create a RAID on them for the SYSTEM RAID, copy the SYSTEM partition over to that RAID, and hopefully boot the box with no problems.

Option 4:
Purchase another server and move everything over, and then use this server for something else.

Option 5:
Do nothing and remain protected by RAID, just without a hot spare. This also wastes resources, as the drive is not being fully utilized.

In most of these options it sounds as though taking a snapshot of the server using Acronis would be good practice. Has anyone had any luck using the Universal Restore option for a server, especially if that server is a domain controller? I am also protected with backups using Backup Exec.
Duncan MeyersCommented:
If it's a DC, save yourself some grief and build another DC (even as a VM on VMware Server) and transfer the roles across, then rebuild at leisure, promote the rebuilt server to DC and transfer the roles back. Saves you having to perform authoritative restores and all that palaver.
Excellent idea, meyersd about building another DC as a virtual machine .. I'll have to remember that.  
Duncan MeyersCommented:
Thanks! Glad I could help.
Question has a verified solution.