Solved

RAID5 Question

Posted on 2011-03-17
Last Modified: 2012-08-13
Hello, I have an LSI Sata300-8x RAID controller with a failed drive attached to it.  I am 95% certain I know which drive it is, but that's not 100 percent. I am getting no amber drive indicator lights, but the RAID utility is telling me port 1 #7, and I am not sure which one that is for certain. Anyway, if I pull the drive and it ends up being the wrong one, will I lose the entire RAID 5 or will I be OK when I put the drive back in?
Thanks in advance.
Question by:webclickusa
21 Comments
 
LVL 32

Accepted Solution

by:
PowerEdgeTech earned 500 total points
ID: 35162525
The array is already degraded, so if you pull the wrong one (a second, healthy drive), your RAID 5 will go offline and you risk at least some data loss. You want to be sure.  Does your server have the LSI management software installed, or some other server management software?  If so, it may give you the ability to "blink" a drive: you can blink drive 7 so you know for sure which one you're pulling.  Are the drive slots labeled on the server itself?
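To illustrate why a RAID 5 that has already lost one member cannot survive losing a second, here is a minimal Python sketch of XOR parity. It is a toy model with made-up four-byte blocks, not a description of what the LSI firmware actually does; `xor_blocks` and `array_state` are hypothetical helper names used only for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte column at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def array_state(missing_members):
    """Simplified RAID 5 status as a function of how many members are missing."""
    if missing_members == 0:
        return "optimal"
    if missing_members == 1:
        return "degraded"
    return "offline"


# One stripe: three made-up data blocks plus the parity block computed from them.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# One member missing (the degraded case): XOR the survivors with the parity
# block to recover the lost data block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])

# With two members missing there is one parity equation and two unknowns,
# so the stripe cannot be reconstructed and the array drops offline.
for missing in range(3):
    print(missing, array_state(missing))
```

The point of the sketch is only that single parity covers exactly one missing member, which is why identifying the right drive matters before pulling anything.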
 

Author Comment

by:webclickusa
ID: 35162580
No, unfortunately no ability to blink the drive.  The drives are all green (no amber), which is driving me nuts. The only odd thing is that one of the drives seems to stay green much longer than the others, and this is the one that appears to be noisy, but you know how hard it is to pinpoint just which drive is grinding.

I got into the RAID utility and did a "spin up" on the failed drive.  I could hear an obviously damaged drive (clicking & grinding). Prior to the spin up I tried to rebuild the drive, which failed.  The drive shows 2 media errors according to the RAID utility, and now the rebuild option is unavailable to me on that drive.

Next question . . . how can I tell if the drives are hot-swap?

I would like to pull the drive I think is the problem while the machine is up and running, find a replacement, and pop it in, all without rebooting the machine.

The machine runs 24/7 with users connected at all times and is mission critical.

Lastly, the drive is a 500GB Western Digital 7200rpm SATA2.  If I put in a Maxtor 500GB 7200rpm SATA2 which was sitting on the shelf at this customer (bought by the old IT guy), am I going to have a potential problem?  In other words, do I need the exact same make/model/firmware, etc.?

Thanks so much.
 

Author Comment

by:webclickusa
ID: 35162585
And the drive slots are not labeled.  Sorry for not answering that in my last post.
 

Author Comment

by:webclickusa
ID: 35162629
This is the layout of the card (attached).  The odd thing is I would think the management software would say port 1, port 2 (look at the layout) and so forth.  It does not.  It all says "port 1" and then lists physical drives zero through seven under it.

There are 2 arrays on the controller.  One is a RAID 1 which houses the logical volume for the OS (drive C).  It is fine.  The other is a RAID 5 (drive E) which houses the data (degraded).

Thanks again.
 

Author Comment

by:webclickusa
ID: 35162635
If my raid 5 goes offline I assume it will get back online once the disk is reinserted?  RAID info is written to the controller and the drives, correct?  Thanks again.
 
LVL 32

Assisted Solution

by:PowerEdgeTech
PowerEdgeTech earned 500 total points
ID: 35162746
First, if your RAID 5 goes offline, you can get it back online, but it will not be automatic.  RAID info is written to both the controller and the drives, but you will likely need to "import" or "force online" that drive to put the array back together ... this is where you risk at least a small amount of data loss.

Generally, if the drives are accessible from the outside of the machine and are attached to a backplane, they are hot-swappable ... so you're probably fine there.

As far as the replacement drive goes ... the replacement doesn't have to be the same make/model, but should be an Enterprise-class drive (not Desktop/Consumer-class).
 

Author Comment

by:webclickusa
ID: 35162798
Gotcha, the drives in it seem to be cheap-o consumer-grade drives.  I do not know why people like to save money on drives of all things.  Again, the drive seems to be grinding.  The indicator light on it is behaving differently from the other drives, so I am reasonably confident I have the right drive.  I would feel a lot more warm and fuzzy if I had an amber indicator light like I should have in a perfect world.

OK, let's say I pull the drive with the machine running and it ends up being the wrong drive.
I then push the drive back in.  Launch the raid utility and force the drive online?

Sorry for all the questions.  I am a bit of a nervous wreck on this one.  It has fought me the whole way.  The old IT guy left unhappy and took lots of password info along with him.   He says he "lost that info".

When this drive failed I spent a couple of hours just trying to crack my way past the RAID controller "full access" password.

If the guy was standing in the room today I would have had to fight myself not to punch him out. :)
Serenity now. :)
 

Author Comment

by:webclickusa
ID: 35162799
The drives do have to be the same size, speed, & cache, correct?
 
LVL 32

Assisted Solution

by:PowerEdgeTech
PowerEdgeTech earned 500 total points
ID: 35162830
"I do not know why people like to save money on drives of all things."  Indeed, a bad idea, particularly on a mission-critical machine.

"Launch the raid utility and force the drive online?"  Or import the foreign config in the RAID utility ... a feature on some LSI-based controllers.

"The drives do have to be the same size, speed, & cache, correct?"
Cache:  Chances are the controller has some onboard battery-backed cache, which will make on-drive cache pointless.  Size:  Must be at least as large ... bigger is OK, smaller won't work.  Speed:  Doesn't matter ... generally consumer-class drives are all 7200 RPM, but even if you had a faster or slower drive, chances are it would only affect the speed of reading from and writing to that drive, probably with some overall effect on the array.  Interface:  Avoid mixing SATA 1/2/3 ... not usually a problem, but best not to.
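The size rule above ("at least as large") can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sector counts are hypothetical, `can_replace` is a made-up helper, and the real check is done by the controller. It is included because drives sold under the same nominal size can report slightly different sector counts, so it is worth comparing the reported capacity of the shelf spare against the failed member before relying on it.

```python
def can_replace(failed_sectors, candidate_sectors):
    """A replacement is usable only if it is at least as large as the failed member."""
    return candidate_sectors >= failed_sectors


# Hypothetical sector counts (512-byte sectors assumed), not read from real drives.
failed = 976_773_168             # the failed "500GB" member
same_size = 976_773_168          # a spare reporting the identical capacity
slightly_smaller = failed - 2048  # a "500GB" spare that is 1 MiB short
larger = 1_953_525_168           # a nominal 1TB spare

for label, sectors in [
    ("same size", same_size),
    ("slightly smaller", slightly_smaller),
    ("larger", larger),
]:
    verdict = "usable" if can_replace(failed, sectors) else "too small"
    print(f"{label}: {verdict}")
```

A larger spare works, but the extra space beyond the size of the original member simply goes unused by the array.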
 
LVL 10

Expert Comment

by:wmeerza
ID: 35162875
Just a quick question, have you been able to or do you have the ability to take a backup of your problem raid 5 before things get pulled out?

 
LVL 32

Assisted Solution

by:PowerEdgeTech
PowerEdgeTech earned 500 total points
ID: 35162886
If it helps make a case for "real" server hardware for a 24x7 mission-critical server ... failed drives, failed arrays, and data loss are all common when non-Enterprise-class drives are used with Enterprise-class controllers.  Consumer-class drives are not designed for a 24x7 load, have radically different timings and fault tolerances, and are not programmed to respond to many of the Enterprise-class controller commands they may receive during the course of normal array operation.
 

Author Comment

by:webclickusa
ID: 35162887
I cannot thank you enough.  Again, tomorrow I will be popping out the (hopefully) correct drive and replacing it with the new one I found on the shelf.  Then I just choose "rebuild" and I should be good to go.

This is the info the RAID management software is giving me for the drive:
WD5000YS-01MPB1

which definitely looks like a consumer-grade drive to me.
 

Author Comment

by:webclickusa
ID: 35162899
Agreed, I am pushing to upgrade to a new server with enterprise-class drives.  I am still running into all kinds of crap (BIOS passwords, management software password, RAID controller passwords, local admin passwords for machines not on the domain, etc.) from the disgruntled IT guy.  I would love to revamp so at least I know what is what.  The kicker is he bought 8 500GB drives to store a total of 150GB of data.  He would have been better off getting 3 enterprise-class drives + a spare, which would have cost more than the 8 crappy drives but . . . .
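For reference, the arithmetic behind the "3 drives + a spare" suggestion can be sketched as follows. This assumes the standard RAID 5 rule that one member's worth of space goes to parity, and it assumes the three drives would also be 500GB each, which the post does not state; `raid5_usable_gb` is a hypothetical helper name. A hot spare adds no capacity, so it is left out of the calculation.

```python
def raid5_usable_gb(drive_count, drive_gb):
    """Usable RAID 5 capacity: one member's worth of space is consumed by parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least 3 member drives")
    return (drive_count - 1) * drive_gb


data_gb = 150  # amount of data quoted in the thread

# Assumed layout: three 500GB members, hot spare not counted.
usable = raid5_usable_gb(3, 500)
print(usable)             # 1000
print(usable >= data_gb)  # True
```

Under those assumptions a three-drive set leaves ample headroom for 150GB of data.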
 
LVL 32

Assisted Solution

by:PowerEdgeTech
PowerEdgeTech earned 500 total points
ID: 35162912
I don't see anything at http://wdc.com about "01MPB1", but the WD5000YS appears to be one of the RE2 drives, which are Enterprise-class drives:
http://www.wdc.com/en/library/sata/2178-001045.pdf

That should make you feel a little better about the drives in it.
 
LVL 32

Expert Comment

by:PowerEdgeTech
ID: 35162917
Been there, done that :)  Ophcrack is my friend for local passwords.
 

Author Comment

by:webclickusa
ID: 35162937
Again, I cannot thank you enough for the assistance.  I will feel a lot better tomorrow when I am back in action.  I am a little stir crazy from listening to the beeping for hours, as I was unable to turn off the alarm until I got past the RAID controller password problem.  OK, it is late here on the East Coast, so I am winding down and hitting this early tomorrow.  Thanks again.  I will update this tomorrow.
 

Author Comment

by:webclickusa
ID: 35162951
I generally have had good luck with "winkey" for local passwords.
Thanks for the info.  I will check it out.
 
LVL 32

Expert Comment

by:PowerEdgeTech
ID: 35162953
Good luck :)
 

Expert Comment

by:prasad2925
ID: 35180362
Hi, why don't you work out the hard disk ID numbers from the order in which the drives are detected in the SCSI RAID configuration? That way you can identify the failed hard disk with certainty and replace it with a working one.
 

Author Comment

by:webclickusa
ID: 35180511
I am good.  I ended up finding a schematic of the RAID controller online.  I was then able to figure out which port was which on the controller.  I then followed the wires back to the bad drive.
Thanks so much for all the input,
Tim
 

Author Closing Comment

by:webclickusa
ID: 35180534
Cannot thank everyone enough.