Solved

IBM ServerRaid blocked logical drive with Linux 3.

Posted on 2008-10-13
Last Modified: 2013-12-15
I have an IBM Eserver running Red Hat Linux ES v3 i686.  This server is primarily a web server and was running fine for years until recently, when a user called to ask if there was something wrong with the server.  When I checked the console the server appeared to be completely hung. After restarting I received this message from the IBM ServerRaid manager - blocked logical drive, hit F4 to continue or F5 to make no changes.  I hit F4 and the system comes up with no problems.  I contacted IBM and downloaded all the latest and greatest drivers for the hard drives, the raid controller, Linux drivers, even the system bios.  This seemed to solve the problem, at least for a while, but it has come back again.  I have used fsck to check all the partitions for bad blocks, but this does not help either.

A recent example: I used Linux update to install some updated rpms on the system - they downloaded and installed correctly, but when I went to click finish all the drives in the raid 5 array pegged on a solid green and the system hung - upon reboot, the blocked logical drive message again.

Any help - this one is driving me insane.
Question by:interstate
7 Comments
 
LVL 29

Expert Comment

by:Michael W
ID: 22712382
It could quite possibly be that the latest drivers/firmware are no longer compatible with RHEL 3. I recommend checking with IBM and RedHat to see if there is a fix available for the problem you are having, since it seems to be software related.

Also, are there any messages appearing under 'dmesg' in relation to the raid environment?
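
If the dmesg output is long, a small filter can pull out the lines worth reading first. This is only a sketch: the keyword list is an assumption (it presumes the RAID controller is driven by the Linux ips module and that disk trouble shows up as "I/O error" or "EXT3-fs" messages), so adjust it to whatever your own log contains.

```python
import re

# Hypothetical keyword list, not taken from the poster's log.
# "ips" assumes the RAID controller uses the Linux ips driver.
PATTERNS = [r"\bips\b", r"I/O error", r"EXT3-fs", r"journal has aborted"]

def suspicious_lines(dmesg_text, patterns=PATTERNS):
    """Return the lines of dmesg_text that match any pattern (case-insensitive)."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return [line for line in dmesg_text.splitlines()
            if any(c.search(line) for c in compiled)]

# Example use on a saved dump (dmesg > dmesg.txt):
#     with open("dmesg.txt") as f:
#         for line in suspicious_lines(f.read()):
#             print(line)
```

Matching is deliberately loose, so expect some harmless lines (for example the driver's normal startup banner) alongside any real errors.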
 

Author Comment

by:interstate
ID: 22712623
We were running relatively old drivers when the problem suddenly developed - moving to the latest drivers has not cured the problem completely as of the moment.  I have attached the dmesg file for those who may be able to see something ominous in that file - which could help solve the problem

Thanks

dmesg.txt
 

Author Comment

by:interstate
ID: 22731641
Here's some more information which may help.  This morning I noticed that when I went to open a web page on the server (this is a nuke site) I got the "session has failed to initialize" error, which is where it starts. Wait a little while longer and the other nuke site on the server stops working.  If I ssh into the server a command like df -v may work or may just return an input/output error. Wait a little longer and I will no longer be able to connect to the server at all.

At the console when I attempt to do a restart I get the following:
EXT3-FS Error (device sd(8,2) in start_transaction: Journal has aborted

I have to manually restart the system and at that point after the ServerRaid has initialized I get that blocked drive message.

hope this helps.

thanks
 
LVL 87

Expert Comment

by:rindi
ID: 22737679
Check your HDs; it looks like there is a problem with at least one (sd(8,2)). Run fsck on the partitions.
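
For reference, sd(8,2) in the EXT3 error is the kernel printing the device by major and minor number. The sketch below decodes that pair, assuming the standard Linux numbering for SCSI disks (major 8, sixteen minors per disk); it only covers the sixteen disks that live on major 8, and the function name is made up for illustration.

```python
import string

def sd_name(major, minor):
    """Map a (major, minor) pair on SCSI disk major 8 to a /dev/sdXN name."""
    if major != 8:
        raise ValueError("this sketch only handles SCSI disk major 8")
    if not 0 <= minor <= 255:
        raise ValueError("minor must be in 0..255")
    disk, part = divmod(minor, 16)  # 16 minors per disk: 0 = whole disk, 1-15 = partitions
    name = "/dev/sd" + string.ascii_lowercase[disk]
    return name if part == 0 else name + str(part)

print(sd_name(8, 2))  # /dev/sda2
```

Behind a hardware RAID controller the operating system normally sees logical drives rather than physical disks, so this most likely points at the second partition of the first logical drive, not at one particular disk in the array.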
 
LVL 20

Accepted Solution

by:
Gns earned 250 total points
ID: 22738699
(tagging on to rindi's advice:-) And do the fsck from rescue mode, to ensure that it can do a thorough job on a quiescent filesystem.
It could be a marginal block in the journal itself... What you describe just tells us that the HW raid or the IO system is having problems. More or less what you can expect when one starts talking about "years of uptime":-). The "blocked HDD" thing is just that... It is "dirty", so you need to handle that after pulling the plug on it... More of a consequence than the reason, so to speak:-).
If I were you, I'd start looking at replacing the server altogether... a "total update" to newer distro on new HW.

Cheers
-- Glenn
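
As a sanity check before running fsck, something like the following can confirm the partition really is unmounted. It is a minimal sketch that reads text in the /proc/mounts format; the helper names and the /dev/sda2 device name are assumptions for illustration, and on some systems the root filesystem is listed under another name (such as /dev/root), which this simple match would miss.

```python
def mount_state(device, mounts_text):
    """Return 'unmounted', 'ro' or 'rw' for device, given /proc/mounts-style text."""
    state = "unmounted"
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[0] == device:
            options = fields[3].split(",")
            state = "rw" if "rw" in options else "ro"  # the last matching entry wins
    return state

def safe_to_fsck(device, mounts_text):
    """Treat a device as safe to check only when it is not mounted at all."""
    return mount_state(device, mounts_text) == "unmounted"

# Example use (device name is hypothetical):
#     with open("/proc/mounts") as f:
#         print(safe_to_fsck("/dev/sda2", f.read()))
```

Booting into rescue mode and skipping the automatic mount of the installed system is what gets the partition into the unmounted state in the first place; the check above only confirms it.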
 

Author Closing Comment

by:interstate
ID: 31406256
Got tired of battling IBM over the hardware vs software issue - moving the apps to another server and shutting this thing down for now. Perhaps on a rainy day I will fire it up and take another look.