Solved

IBM ServeRAID blocked logical drive with Linux 3.

Posted on 2008-10-13
Last Modified: 2013-12-15
I have an IBM eServer running Red Hat Linux ES v3 (i686). This server is primarily a web server and ran fine for years until recently, when a user called to ask if there was something wrong with the server. When I checked the console, the server appeared to be completely hung. After restarting, I received this message from the IBM ServeRAID manager: blocked logical drive, hit F4 to continue or F5 to make no changes. I hit F4 and the system came up with no problems. I contacted IBM and downloaded the latest drivers for the hard drives, the RAID controller, and Linux, and even the system BIOS. This seemed to solve the problem, at least for a while, but it has come back again. I have used fsck to check all the partitions for bad blocks, but this does not help either.

A recent example: I used Linux update to install some updated RPMs on the system. They downloaded and installed correctly, but when I went to click Finish, all the drives in the RAID 5 array pegged on solid green and the system hung. Upon reboot, the blocked logical drive message appeared again.

Any help would be appreciated; this one is driving me insane.
Question by:interstate
7 Comments
 
LVL 29

Expert Comment

by:Michael W
ID: 22712382
It could quite possibly be that the latest drivers/firmware are no longer compatible with RHEL 3. I recommend checking with IBM and Red Hat to see if there is a fix available for the problem you are having, since it seems to be software related.

Also, are there any messages appearing under 'dmesg' in relation to the raid environment?
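To narrow the dmesg output down to RAID-related lines, something like the following Python sketch might help. It assumes the ServeRAID adapter is handled by the `ips` kernel driver and that the keyword list is a reasonable starting point; the function name and the pattern are illustrative, not part of any IBM or Red Hat tool.

```python
import re

# Assumed keyword list: "ips" is the Linux driver name for ServeRAID adapters;
# the other terms are generic storage and filesystem markers.
STORAGE_PATTERN = re.compile(
    r"\bips\b|serveraid|scsi|i/o error|ext3-fs|journal", re.IGNORECASE
)


def filter_storage_messages(dmesg_text):
    """Return the lines of dmesg-style text that match a storage keyword."""
    return [
        line for line in dmesg_text.splitlines() if STORAGE_PATTERN.search(line)
    ]
```

The input can be a saved copy of the dmesg output read from a file; anything the filter returns is only a starting point for a closer look, not a diagnosis.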
 

Author Comment

by:interstate
ID: 22712623
We were running relatively old drivers when the problem suddenly developed; moving to the latest drivers has not cured the problem completely so far. I have attached the dmesg file in case anyone can see something ominous in it that could help solve the problem.

Thanks

dmesg.txt
 

Author Comment

by:interstate
ID: 22731641
Here's some more information which may help. This morning I noticed that when I went to open a web page on the server (this is a nuke site), I got the "session has failed to initialize" error, which is where it starts. Wait a little while longer and the other nuke site on the server stops working. If I ssh into the server, a command like df -v may work or may just return an input/output error. Wait a little longer and I can no longer connect to the server at all.

At the console when I attempt to do a restart I get the following:
EXT3-FS Error (device sd(8,2) in start_transaction: Journal has aborted

I have to manually restart the system, and at that point, after the ServeRAID has initialized, I get that blocked drive message.

Hope this helps.

Thanks
 
LVL 88

Expert Comment

by:rindi
ID: 22737679
Check your HDs; it looks like there is a problem with at least one (sd(8,2)). Run fsck on the partitions.
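For reference, the `sd(8,2)` in that error is a major/minor device number pair as older kernels print it, not a physical disk number. A small Python sketch of the decoding, assuming the standard Linux numbering in which major 8 is the SCSI disk driver with 16 minors per disk (the function name is made up for illustration):

```python
import re


def decode_sd_device(token):
    """Map an 'sd(major,minor)' token to a /dev/sdXN style name.

    Sketch only: handles SCSI disk major 8, where minors 0-15 belong to sda,
    16-31 to sdb, and so on; minor 0 within each group is the whole disk.
    """
    match = re.fullmatch(r"sd\((\d+),(\d+)\)", token)
    if match is None:
        raise ValueError("not an sd(major,minor) token: %r" % token)
    major, minor = int(match.group(1)), int(match.group(2))
    if major != 8 or minor > 255:
        raise ValueError("only major 8 with minors 0-255 is handled here")
    disk_index, partition = divmod(minor, 16)
    name = "/dev/sd" + chr(ord("a") + disk_index)
    return name if partition == 0 else name + str(partition)
```

Under those assumptions, `decode_sd_device("sd(8,2)")` gives `/dev/sda2`, which on a ServeRAID system would most likely be a partition on the array's logical drive rather than on one particular physical disk.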
 
LVL 20

Accepted Solution

by:
Gns earned 250 total points
ID: 22738699
(tagging on to rindi's advice:-) And do the fsck from rescue mode, to ensure that it can do a thorough job on a quiescent filesystem.
It could be a marginal block in the journal itself... What you describe just tells us that the HW RAID or the IO system is having problems. More or less what you can expect when one starts talking about "years of uptime":-). The "blocked HDD" thing is just that... It is "dirty", so you need to handle that after pulling the plug on it... More of a consequence than the reason, so to speak:-).
If I were you, I'd start looking at replacing the server altogether... a "total update" to a newer distro on new HW.

Cheers
-- Glenn
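Before running fsck by hand, it may be worth confirming that the partition is not mounted read-write. A rough Python sketch of that check, assuming the usual `/proc/mounts` layout (device, mount point, type, options) and using `/dev/sda2` only as a guess at what `sd(8,2)` maps to; the function name is hypothetical:

```python
def mounted_read_write(device, mounts_text):
    """Return True if `device` is listed in /proc/mounts-style text with 'rw'."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[0] == device:
            if "rw" in fields[3].split(","):
                return True
    return False


# Example use on a live system (device name is an assumption):
#   with open("/proc/mounts") as handle:
#       if mounted_read_write("/dev/sda2", handle.read()):
#           print("still mounted read-write; run fsck from rescue mode instead")
```

The kernel may list the root filesystem under a different device name in `/proc/mounts`, so a negative result is a hint rather than proof; booting into rescue mode as suggested above remains the safer route.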
 

Author Closing Comment

by:interstate
ID: 31406256
Got tired of battling IBM over the hardware vs. software issue. I am moving the apps to another server and shutting this thing down for now; perhaps on a rainy day I will fire it up and take another look.