Avatar of Jason
Jason

asked on

HP Proliant DL380e Gen 8 RAID configuration

Hello,
I have an HP Proliant DL380e Gen 8 that I inherited. Since I didn't set it up, I'm not quite sure I understand the configuration, and I'm hoping somebody can clarify some things for me. Currently it has 8 500GB drives in it. The array controller is showing that bays 5, 6, 7, and 8 are one logical drive, RAID 5. So does this mean that 1, 2, 3, and 4 are not RAIDed, or how exactly would you describe the redundancy aspect of this machine? It's not showing any unassigned drives, so why aren't drives 1, 2, 3, or 4 listed as anything, neither unassigned nor part of any array? It appears that there is around 3TB of storage altogether but I can't picture how this works. My understanding is that RAID 5 will give me 1/3 of the space available, so out of the 8 drives physically showing 4TB, we'd see 3TB after RAID.

I have VMware running on it and the VMware logs are showing that drives 1, 2, 3, and 4 are exceeding the defined thermal threshold. I guess my other question would be, how can I replace drives 1, 2, 3, and 4 if they are, in fact, overheating or if they ever fail? It doesn't appear (to me) that I can just pop a new drive in and have it rebuild.

Please help! The VMs intermittently go down, the whole server becomes unresponsive, and I can't even reboot the VMware box remotely via CLI. I have to have the guy walk upstairs and power it off manually. It's a nightmare!
Avatar of Zephyr ICT
Zephyr ICT

Strange config apparently, maybe it's configured as two RAID5, since that would amount to the 3TB you're seeing? How many (local) datastores can you see in VMware?
Avatar of Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
Have you checked using the Array Configuration Utility (which means booting the server from the SmartStart DVD), or checked in the array controller POST by pressing F8 at boot?

If the array controller is only showing disks 5, 6, 7, and 8 in the array, is the size of the datastore currently approx. 1.5TB?

8 disks in a single RAID 5 would give roughly 3.5TB; two 4-disk RAID 5 arrays would give 3TB.
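A quick way to sanity-check the capacity figures being discussed is to work out the RAID 5 arithmetic directly. The sketch below assumes identical 500GB drives (as stated in the question) and the usual RAID 5 rule that one disk's worth of space in each array goes to parity. The function name and the layout labels are illustrative only, and formatted datastore sizes will come out somewhat lower than these raw figures.

```python
def raid5_usable_gb(disks: int, disk_gb: int) -> int:
    """Usable capacity of one RAID 5 array: one disk's worth of space holds parity."""
    if disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (disks - 1) * disk_gb


DISK_GB = 500  # drive size quoted in the question

# Candidate layouts for the 8 bays (labels are illustrative)
layouts = {
    "one 4-disk RAID 5": raid5_usable_gb(4, DISK_GB),
    "two 4-disk RAID 5 arrays": 2 * raid5_usable_gb(4, DISK_GB),
    "one 8-disk RAID 5": raid5_usable_gb(8, DISK_GB),
    "7-disk RAID 5 plus one hot spare": raid5_usable_gb(7, DISK_GB),
}

for name, usable in layouts.items():
    print(f"{name}: {usable / 1000:.1f} TB usable")
```

Under these assumptions, a single 4-disk RAID 5 accounts for one 1.5TB datastore, so a second 1.5TB datastore has to be coming from somewhere other than the one logical drive the controller reports.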
Avatar of Jason
Jason

ASKER

Thanks for the quick response!  There are two datastores in vmware.  1.5TB each.
Avatar of Jason

ASKER

..and yes, I've looked at the array config util.  Did you mean the F5 at boot?
Avatar of Jason

ASKER

So yeah, it's showing only 1 logical drive in the array config using 5, 6, 7, and 8, but I see two 1.5TB datastores.
Then that would mean there are two logical drives of 1.5TB each ... So in theory you should be able to change the hard disks out (hot or cold swap).
Avatar of Jason

ASKER

Also, I was checking hard drive serial numbers to verify which ones were showing the heat issue, and removing drive 1 crashes the server.  When powered off, if I replace the drive 1 with a brand new drive, the server will not boot.  Can't find the OS.
Avatar of Jason

ASKER

@spravtek:  Why doesn't it show 2 logical drives in the array config?  It specifically says 1 logical drive and lists 5, 6, 7, and 8 underneath it.  No mention of 1,2,3, or 4 anywhere.
That's a good question. The only reason I can think of is that there's something wrong, either hardware (since you're also seeing overheating issues) or firmware ... So the first thing to check is whether you have the latest firmware for the controller and maybe for the disks as well.
Not sure whether the issue is that there are two separate controllers, each serving a single channel: one for disks 1-4 and the second for 5-8.
The internal/boot controller possibly has 5-8.

Does the utility that shows the 5-8 logical volume also list disks 1-4?

Look at the lspci output to see what controllers there are.
Yes, check in VMware if the datastores are using the same controller, would be strange if they weren't though.
Avatar of Jason

ASKER

It shows only one controller and 1 logical drive consisting of disks 5, 6, 7, and 8.  Do you think the other array was somehow broken yet is still maintaining a 3TB capacity?  That makes zero sense to me but I'm just throwing anything out there at this point.
Well, something is definitely happening that doesn't make sense: first the temperature issue, second, you only see 1 logical drive. There aren't many options to test these things... So I'd first try the firmware route. If that doesn't help ... Back up everything and then do some tests with the logical drive(s)?
Avatar of Jason

ASKER

Yeah, I just started this job so I'm trying to figure everything out and how it was set up.  The guy that's been working on the intermittent shutdowns recently was even working with VMware and they ran some kind of diags and were seeing I/O errors on the disks within VMware.  I was thinking the heat and the I/O errors might be cause and effect of one another.  The guy that was in charge of this server before me wasn't doing his job correctly so I replaced him, but supposedly he ran all the HP hardware diagnostics and saw no problems.  This has been going on for a month or more.
Hmmm ... The only other thing that could have happened is that disks have been moved around, the so-called disk roaming feature ... That would be a cause for not seeing the second logical drive so it appears (page 63).

But still, disks getting too hot really sounds like a hardware issue no?
Avatar of Jason

ASKER

So the fact that pulling out drive 1 has a catastrophic effect, doesn't that prove that drives 1,2,3, and 4 aren't part of any raid array?  But then how the hell am I still seeing 3TB?  I'm not at the server right now and am working on getting in remotely so I can't check any of the suggestions above just yet but I want to make sure I've got all my shit together before I actually get in there.
If the datastore fails when you pull out 1 drive, then that would mean either there's no RAID 5 or there's another drive in the array that is not working like it should ... Or, there are 3 disks configured as RAID 0?? All kinds of possibilities.

There were HP advisories for overheating issues with older BIOS ROMs... But you're probably up to date if you ran all those tests, or your predecessor did ... Best to double-check though.
Avatar of Jason

ASKER

So if drives 1234 are not on the raid, shouldn't they show up as unassigned in the array config util?
If it is a single controller, the drives should be listed there.  They might be presented as individual drives to VMware and managed there, while the other 4 are combined as a single volume repository.

............
If possible and if not already, transitioning to a newer more stable system/setup should  be considered.
Are there multiple hosts serving the vmware infrastructures? I.e. Other hosts are part of .......
Yes they should, but that will not be the case in your situation because then you wouldn't see the 1.5TB drive in VMware... Or, they're available but also still in a "ghost" config or something??
Avatar of Jason

ASKER

There is just this one server, hosting two VMs.  Both VMs are their DCs and one is the DNS and DHCP.  Which is ridiculous, because now when this server goes down, the whole company is screwed.
The only solutions I'm seeing are either a second server ... Or, boot VMware from a USB drive if that's not the case yet, which would maybe free up the disks showing the issues and allow you to take them out.
Set up a temporary/intermediary DC outside that host, either physical or a VM, with DHCP/DNS (split-scope DHCP for all scopes). The DHCP configuration can be exported from the current server using netsh dhcp server dump, then adjusted and loaded on the new one.
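To make the split-scope idea concrete: each DHCP server hands out a disjoint slice of the same address range, so clients can still get a lease when one server is down. The sketch below only does the range arithmetic. The 80/20 ratio is a common convention rather than something stated in this thread, the address range is hypothetical, and `split_scope` is an illustrative helper, not part of any DHCP tooling.

```python
import ipaddress


def split_scope(first: str, last: str, primary_share: float = 0.8):
    """Split an inclusive DHCP range into two disjoint ranges (primary, secondary)."""
    start = ipaddress.IPv4Address(first)
    end = ipaddress.IPv4Address(last)
    if end < start:
        raise ValueError("last address must not be below first address")

    total = int(end) - int(start) + 1
    primary_count = round(total * primary_share)
    if not 0 < primary_count < total:
        raise ValueError("share leaves one server with no addresses")

    boundary = start + primary_count  # first address served by the secondary
    return (start, boundary - 1), (boundary, end)


# Hypothetical scope; substitute the real range from the exported DHCP config
primary, secondary = split_scope("192.168.10.100", "192.168.10.199")
print("existing server:", primary[0], "-", primary[1])
print("temporary server:", secondary[0], "-", secondary[1])
```

The resulting ranges would then be applied as exclusions on each server so that neither leases an address from the other's slice.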
Okay, if you have two datastores 1.5TB each, then you have two RAID Arrays, Logical Drives of 1.5TB RAID 5 each.

Possibly....

but if the RAID controller is not showing two logical disks.....

they've not done something weird here... like span three disks?
Avatar of Jason

ASKER

I'm not a vmware guru but is it similar to hyper-v then where we can just copy the vmx, vmd, or wmwhatever files and spin them up on some junk server we have lying around temporarily?
Avatar of Jason

ASKER

@andrew that's why I'm at a loss.  Where could this other 1.5TB logical be originating from if the controller (the only one listed in the array config) isn't seeing the drives?
Avatar of Jason

ASKER

...and if 1234 are raid 5, why can't I pop a drive out without catastrophe?
SOLUTION
Avatar of Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

This solution is only available to members.
Avatar of Jason

ASKER

Thanks Andrew!  So the question still remains, where are these hard drives 1234 and why wouldn't they be attached to that raid controller?  I've got an HP tech going there to give the hardware a once-over to see any hardware errors and update firmware if need be but I'm at a loss.
Check the following:

Select the host > Configuration > Storage Adapters > Devices

Check there, or post a screenshot and we can inspect...
ASKER CERTIFIED SOLUTION
This solution is only available to members.
Avatar of Jason

ASKER

That's exactly what it is, Arnold.  After review with HP, drives 1-4 are just like you described with no raid enabled onboard.  Upgrading the firmware on the drives to see if the heat issue remains but gonna have to rebuild this sooner than later.  Thank you!
Migrate the VMs off the datastore which is not RAIDed to the other datastore which is!

BUT, if ESXi is installed on a single disk, or JBODs, I would recommend re-installing ESXi onto a USB flash drive or SD card.

Also, you want to ensure you are using the OEM HP version of ESXi from the HP site.
Avatar of Jason

ASKER

Only one VM is on the unraided drives and datastore 2 is almost full.  We're gonna move it off to a NAS before rebuild.  VMware recommended doing a clone to a NAS. Thanks for the heads up on the OEM.  Any other issues you foresee?
Avatar of Jason

ASKER

How big of a flash drive would you use for the esxi?  And you'd recommend that over putting it on the HDD??
I think a 2, 4, 8, 16, or 32GB SD card or USB drive will do.
How big is the VM that is on this "datastore" and is it resource heavy, i.e. on drive I/O?


The reason: booting from USB/SD gives you the option, once drives 1, 2, 3, and 4 are no longer in use and their VM has been migrated to another datastore, to repurpose them into a new, additional RAIDed datastore while the host stays booted from USB/SD.
Avatar of Jason

ASKER

The whole vm is currently about 35GB short of its 1.5TB capacity.  It is a dc and houses a handful of sql dbs.  I would like to do the ESXi stuff  on a usb, enable RAID on the controller, raid5 that thing with 4 500GB drives, and call it a day.  Any link to the HP ESXi oem before I gotta google it?
Make sure to locate and copy the ESXi serial/key. I think it is part of the config on the boot drive, and it might be visible in the GUI.

The other issue deals with which version is installed?
ESXi - I would recommend an 8GB branded (e.g. HP, SanDisk, Kingston, Lexar) flash drive or SD card.

If you want top performance, put all the spindles in the same array.

e.g. all disks, and make one a hot spare.

more spindles = more disks = more performance = MORE IOPS!
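As a rough illustration of the "more spindles, more IOPS" point, here is a back-of-the-envelope estimate using the common RAID 5 write-penalty rule of thumb (each front-end write costs about four back-end operations). The per-disk figure of 75 IOPS and the 70% read mix are assumptions for a 7.2k drive and a generic workload, not measurements from this server, and controller cache will change real-world numbers.

```python
RAID5_WRITE_PENALTY = 4  # read data, read parity, write data, write parity


def raid5_effective_iops(disks: int, iops_per_disk: float, read_fraction: float) -> float:
    """Estimate front-end IOPS an array can sustain for a given read/write mix.

    Each front-end read costs one back-end I/O and each front-end write costs
    RAID5_WRITE_PENALTY back-end I/Os, so:
        front_end = raw / (read_fraction + write_fraction * penalty)
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    raw = disks * iops_per_disk
    write_fraction = 1.0 - read_fraction
    return raw / (read_fraction + write_fraction * RAID5_WRITE_PENALTY)


ASSUMED_IOPS_PER_DISK = 75   # assumption, not measured
ASSUMED_READ_FRACTION = 0.7  # assumption, not measured

# A hot spare sits idle, so only active array members are counted
for members in (4, 7, 8):
    est = raid5_effective_iops(members, ASSUMED_IOPS_PER_DISK, ASSUMED_READ_FRACTION)
    print(f"{members}-disk RAID 5: ~{est:.0f} IOPS")
```

The absolute numbers matter less than the scaling: under these assumptions the estimate grows linearly with the number of active members.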
Avatar of Jason

ASKER

Thanks guys!

I'll wait to close this until I finish this week.

I appreciate all your insight.
Avatar of Jason

ASKER

Andrew, so in your last statement, were you saying put all 8 drives on one array and just put the two different datastores on the same logical volume?
Well, you would have a single datastore of about 3TB over 8 disks (RAID 5 across the disks with one as a hot spare, and one logical volume of 3TB)

OR....

Put all eight disks, in the array, and then carve out logical volumes of 2 x 1.5TB.
Not sure you want to do that, given the performance impact when a single drive fails, the duration of the rebuild, and the consequences of a second failure during that time.

With two separate arrays, two 500GB drives' worth of space is spent on parity, but a failure of one drive only impacts a single datastore, and the setup can still tolerate a drive failure in the other logical volume.

I have not researched it, so maybe another expert with RAID expertise can answer:
Would the rebuild of the same 500GB drive in a 1.5TB space take a shorter amount of time compared to the same 500GB in a 3TB space (4 spindles versus 7)?
Also, assuming each datastore has an equal baseline of disk I/O, I'm not sure whether the loads are simply summed when combined or whether there is a weighted adjustment .......
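One part of the rebuild question can be put into numbers without any benchmarking: a RAID 5 rebuild has to read every surviving member to recompute the missing disk, while the amount written is always one disk's worth. The sketch below compares the 4-spindle and 7-spindle cases under that assumption. It says nothing about wall-clock time, which depends on the controller, rebuild priority, and live load, and the function name is illustrative.

```python
def raid5_rebuild_profile(disks: int, disk_gb: int) -> dict:
    """Work and exposure involved in rebuilding one failed member of a RAID 5 array."""
    if disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    survivors = disks - 1
    return {
        "data_read_gb": survivors * disk_gb,   # every survivor is read in full
        "data_written_gb": disk_gb,            # only the replacement is written
        "disks_that_must_survive": survivors,  # a second failure here loses the array
    }


for members in (4, 7):
    profile = raid5_rebuild_profile(members, 500)
    print(f"{members}-disk RAID 5: {profile}")
```

So the wider array reads twice as much data during a rebuild and has twice as many drives whose failure in that window would be fatal, which is the trade-off against the extra capacity and spindle count.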
Note the machine is DL380e, not DL380p.

Probably came with HP Dynamic Smart Array B120i Controller which only supports up to 6 drives. Sounds like someone's put an additional Smart Array controller in it (or a 3rd party JBOD controller - can't tell without taking the lid off it) and cabled the single backplane to two different controllers.

While that does work as far as the data paths are concerned it makes a disaster for SES which looks after the backplane LEDs and temperature sensors on the disks because they can't both talk to the little chip on the disk backplane. You need to either get another backplane or more sensibly cable both plugs on this backplane to a single PCIe Smart Array controller (and not a dynamic/fakeraid one).