Solved

DELL 2970 RAID 5

Posted on 2014-04-23
Last Modified: 2016-11-23
Hi,

I have a Dell PowerEdge 2970 server with 6 x 600GB drives (approx. 3TB usable) in a RAID 5 configuration. One of the drives in the array is reporting a failure (with an amber light flashing). The base OS of this server is XenServer, with several VMs running on it.

I have bought a new drive to replace the failed one, so my questions are:

(1) How do I find out if my Dell 2970 server is hot swappable and if I can literally pull out the failed drive whilst the server is on and running and replace it with the new drive?

(2) If it is hot swappable, will it rebuild the new drive automatically from the other 5 good drives or will I need to do anything else?

(3) If it is NOT hot swappable and the server needs to be shut down to replace the drive, what will I need to do to get the new drive rebuilt?

(4) Based on the size and number of drives, roughly how long will it take for the new drive to be rebuilt?

(5) I'm assuming that rebuilding the new drive will not affect the data and base OS already stored on the other 5 drives. Or will it?


Many thanks in advance.
Question by:markbenham
18 Comments
 
LVL 1

Expert Comment

by:Carl Gray
Hi Mark,

I'd start by looking on Dell's website and confirming the RAID controller and disks you have. If you enter your Service Tag or Express Service Code on the support page, it will list the system components and give you links to the documentation for the RAID controller.

I mention this as there are several different RAID controllers that can be in a 2970.

Hope this helps,
Carl
 

Author Comment

by:markbenham
Hi Carl,

Thanks for your response. It will be a PERC 5/i, if I'm not mistaken.

Many thanks
 
LVL 34

Expert Comment

by:Seth Simmons
1) just to confirm, it's constant flashing amber and not alternating from amber to green; if constant amber then yes, you can pull the drive while running and replace

2) yes after a minute or so you should see a lot of solid green lights on the drive indicating it is rebuilding

3) if it is flashing between amber and green it implies predictive failure which means you would need to manually fail the drive either with OMSA or through the controller BIOS (requires reboot).  here is a link to the OMSA download for XenServer 5.6 (not sure what version you have)

http://ftp.dell.com/FOLDER00982718M/1/OM_Xen56.tar.gz

4) that depends on how much load is on the system.  if i/o is light then perhaps an hour though there are several variables that factor into that

5) no.  as long as only 1 drive has failed, the data is fine.  the new drive is rebuilt from the data and parity on the remaining drives

this should explain it a bit better

RAID 5 "Stripe with Parity"
http://eshop.macsales.com/shop/hard-drives/sata/RAID_Guide/Learn_About_RAID_5
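
To make point 5 a bit more concrete, here is a minimal Python sketch of how XOR parity lets an array reconstruct one missing drive. It is a toy illustration only, not how the PERC firmware is implemented: the `xor_blocks` helper and the tiny byte strings are made up for the example, and a real RAID 5 array rotates the parity block across the drives from stripe to stripe.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a 6-drive RAID 5: five data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD", b"EEEE"]
parity = xor_blocks(data)

# Pretend the drive holding block 2 has failed, then rebuild that block
# from the four surviving data blocks plus the parity block.
lost_index = 2
survivors = [blk for i, blk in enumerate(data) if i != lost_index] + [parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[lost_index])
```

The same property is why a second failure during the rebuild is fatal: with two blocks missing from a stripe, a single parity block is no longer enough to recover either one.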
 

Author Comment

by:markbenham
Hi Seth

(1) It's not a constant amber exactly, but it is flashing amber at intervals of milliseconds. Will it still be OK to hot swap it?

(2) Thanks for the confirmation

(3) It's not flashing between amber and green, so I guess there's no need to manually fail the drive? I'm running XenServer 6.0; is there an OMSA for this version?

(4) An hour or so? That's really promising.

(5) There is one drive that is flashing just amber (at millisecond intervals) and the rest of the drives are flashing green.

I will be doing the hot swap in the morning and will let you know how it pans out. Fingers crossed, and really nervous.

Many Thanks
 
LVL 34

Expert Comment

by:Seth Simmons
1) yes as long as it isn't flashing between amber and green

3) if it's only flashing amber that indicates a failure (amber and green is predictive failure) so you can just pull the drive. i did not see an OMSA for version 6.0 on the download page

Drivers for PowerEdge 2970
http://ftp.dell.com/Pages/Drivers/poweredge-2970.html
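
As a quick reference, the LED rules discussed so far can be written down as a small lookup table. This Python sketch only encodes what has been said in this thread; the pattern labels and the `advise` function are names invented for the example, not Dell terminology, and any pattern not covered here should be checked against the controller documentation.

```python
# Drive LED patterns and what to do about them, as described in this thread.
# The keys are labels chosen for this sketch, not official Dell names.
LED_GUIDE = {
    "flashing_amber": (
        "drive failed",
        "pull the drive and hot-swap it; the rebuild should start on its own",
    ),
    "alternating_amber_green": (
        "predictive failure",
        "manually fail the drive with OMSA or the controller BIOS before swapping",
    ),
}

def advise(pattern):
    """Return the state and suggested action for an LED pattern label."""
    state, action = LED_GUIDE.get(
        pattern, ("unknown", "check the controller documentation")
    )
    return f"{state}: {action}"

print(advise("flashing_amber"))
print(advise("alternating_amber_green"))
```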
 
LVL 13

Expert Comment

by:Greg Hejl
It's also probably time to point out why RAID 5 is dangerous.

When you replace a disk in a RAID 5 array, the rebuild process has to read ALL the data and parity on the remaining drives to reconstruct the new one, which is very high disk activity.  This puts stress on all the drives.

All your drives were most likely made around the same time, and they all have the same MTBF ratings.

This rebuild time is exactly what another drive needs to come to a screeching halt and it happens more frequently than you might think.

the other downside to RAID5 is the rebuild time - Your available I/O seriously tanks during this time - and it can take a long time to rebuild,  this is not good for a production machine.  the last remaining bottleneck for servers is Disk I/O.

I've had 3 arrays go bad on me over my career - they were all raid 5 arrays and something had gone wrong during a rebuild.

RAID 10 is my go-to drive configuration - fastest I/O, all the time.

my two cents

best regards,
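
For anyone weighing up the same switch, the capacity cost is easy to work out. The sketch below uses the standard formulas (RAID 5 gives up one drive's worth of space to parity, RAID 10 gives up half the drives to mirroring) with the 6 x 600GB drives from the question; the function names are just for illustration.

```python
def raid5_usable_gb(drives, size_gb):
    """RAID 5 keeps one drive's worth of capacity for parity."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (drives - 1) * size_gb

def raid10_usable_gb(drives, size_gb):
    """RAID 10 mirrors every drive, so half the raw capacity is usable."""
    if drives < 4 or drives % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    return (drives // 2) * size_gb

print("RAID 5 :", raid5_usable_gb(6, 600), "GB usable")
print("RAID 10:", raid10_usable_gb(6, 600), "GB usable")
```

So with the same six drives, going to RAID 10 means dropping from about 3TB usable to about 1.8TB.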
 

Author Comment

by:markbenham
Hi Greg,

Thanks for the heads up. I will take that into account for future rollouts.
In your experience, how long will the rebuild take for:

"6 x 600GB drives (total approx 3Tb) in a RAID 5"

Many Thanks
 
LVL 13

Expert Comment

by:Greg Hejl
3 TB?  18-24 hrs. your mileage may vary.  that was my last experience with a 5TB backup array rebuild with 7.2k disks.
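
For a rough feel of why the estimates in this thread vary so much, here is a back-of-the-envelope sketch in Python. It assumes the rebuild has to write the whole 600GB replacement drive at some sustained rate; the three rates used are purely hypothetical and were not measured on a PERC 5/i. The real rate depends on the controller's rebuild priority setting and on how much I/O the VMs are generating.

```python
def rebuild_hours(drive_gb, rate_mb_per_s):
    """Hours needed to write one whole replacement drive at a sustained rate."""
    seconds = (drive_gb * 1000) / rate_mb_per_s  # GB -> MB, decimal units
    return seconds / 3600

# Hypothetical sustained rebuild rates: busy server, moderate load, idle server.
for rate in (10, 50, 150):
    print(f"{rate:>3} MB/s -> {rebuild_hours(600, rate):.1f} h")
```

On those assumed numbers the same 600GB drive takes anywhere from a little over an hour to the better part of a day, which is roughly the spread between the answers above.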
 

Author Comment

by:markbenham
Hi,

Thanks for your response. The disks are 15k disks. So hopefully, with double the spin rate and a smaller capacity, the rebuild will take less time?

Many Thanks
 

Author Comment

by:markbenham
Hi Guys,

I've now hot swapped the drive and so far so good. XenServer 6.0 (the base OS) is still up and from what I can tell the VMs on it are unaffected too.

Now that the drive is swapped, how would I know if the rebuild is taking place on my Dell PowerEdge 2970 PERC 5/i?

The activity light on the newly replaced disk is flashing about twice a second whilst the status light is flashing vigorously.

My questions for this part are:

(1) By looking at lights, how would I know if the drive is rebuilding or just being identified?

(2) Will it be safe for me to do a controlled shutdown of the server and view the RAID and the status of the drives WHEN (IF) the rebuild is taking place in step (1)?

(3) Now that the drive has been replaced (although I have no indication apart from the activity LED flashing 2 times a sec) do you think it is safe to reboot the XenServer and more importantly will it come back up after the reboot?

Many thanks
 
LVL 13

Expert Comment

by:Greg Hejl
http://goo.gl/gi08Xv

Server administrator software.

have you tried accessing the DRAC?  I think I recall seeing disk status in there
 

Author Comment

by:markbenham
Hi Greg,

DRAC is optional and unfortunately we don't have it. I think even if we did, it is not configured and hence would require a reboot to get it configured. (I know - we should have)

I'm just trying to avoid that at present, as I'm not sure if the new disk I put in is in the process of rebuilding...

By rights, if I've hot swapped it, it should rebuild automatically, shouldn't it?

Many Thanks
 
LVL 13

Expert Comment

by:Greg Hejl
yes the PERC Controller will do the rebuild automagically
 
LVL 34

Assisted Solution

by:Seth Simmons
Seth Simmons earned 250 total points
"I think I recall seeing disk status in there"

disk status and management through the DRAC only became available in the 12G servers (620/720, etc.)

"By looking at lights, how would I know if the drive is rebuilding or just being identified?"

if there is constant disk activity about a min or so after inserting the drive, most likely it has started the rebuild

"Will it be safe for me to do a controlled shutdown of the server and view the RAID and the status of the drives WHEN (IF) the rebuild is taking place in step (1)"

i'd say give it 24 hours or so and see if there is still constant disk activity.  if the lights are not too active and are at normal levels i would reboot it then

"Now that the drive has been replaced...do you think it is safe to reboot the XenServer and more importantly will it come back up after the reboot?"

i would wait until it finishes before rebooting.  i don't see any reason why it wouldn't come up again
 
LVL 13

Expert Comment

by:Greg Hejl
Has it finished yet?
 

Author Comment

by:markbenham
Hi,

Apologies for the delay in reply.

I have no indication as to whether it has rebuilt or not, but the appropriate activity lights on the disk are solid green. It's now been 6 days, so I can only assume at this stage that it has rebuilt.

The only problem I have is that there are various VMs on it and they are either not booting or running very slowly. So I suspect that the rebuild may not have completed successfully and that it requires a reboot.

I am currently trying to re-create the VMs (virtual machines) on a separate piece of hardware, which is taking a while, and until I do, I won't be able to reboot to find out the status.

I will however get back to this post as soon as I'm in the position to fill you in.

Many thanks thus far
 
LVL 13

Accepted Solution

by:
Greg Hejl earned 250 total points
This might help.

http://support.citrix.com/article/CTX127065

Since you are moving the VMs, consider going to RAID 10 for your array
 

Author Comment

by:markbenham
Hi Greg and Seth,

Apologies for the delay in replying to this, but it's been so busy that I've only just managed to take this server offline.

In essence, after replacing the 2 drives, and as mentioned by Greg, the stress that the rebuild placed on another drive has taken a 3rd drive down.

I've now replaced all 3 drives and am now going to a RAID 10 array. I've also installed a DRAC card and am configuring the DRAC 5 controller (as mentioned by Seth, DRAC on Dell PE2970 does not show drive activity).

I have also found an OMSA supplemental pack for XenServer 6.2 and will be installing it for better monitoring:

http://en.community.dell.com/techcenter/systems-management/w/wiki/1760.openmanage-server-administrator-omsa.aspx
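
With OMSA in place, the state of each physical disk can be read from the command line instead of being guessed from the LEDs. OMSA's `omreport storage pdisk controller=0` lists the physical disks on a controller. The Python sketch below shows one way such a report could be scanned for a rebuild in progress. Please treat it as an assumption-laden example: the controller ID 0, the field labels ("State", "Progress") and the sample text are illustrative and were not captured from a real server, so check them against the actual output first.

```python
# Made-up sample in a "Key : Value" layout, for illustration only.
SAMPLE = """\
ID       : 0:0:3
Status   : Non-Critical
State    : Rebuilding
Progress : 42% complete

ID       : 0:0:4
Status   : Ok
State    : Online
Progress : Not Applicable
"""

def parse_disks(report):
    """Split a 'Key : Value' report into one dict per blank-line-separated disk."""
    disks = []
    for block in report.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(" : ")
            if sep:
                fields[key.strip()] = value.strip()
        disks.append(fields)
    return disks

def rebuilding(disks):
    """Return only the disks whose State field says they are rebuilding."""
    return [d for d in disks if d.get("State") == "Rebuilding"]

for disk in rebuilding(parse_disks(SAMPLE)):
    print(disk["ID"], disk.get("Progress", "progress unknown"))
```

On a live server the report text would come from running the command (for example with `subprocess.run(..., capture_output=True, text=True)`) rather than from the `SAMPLE` string.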

Once again guys, sorry for the delay and thanks for your help.