Solved

RH EL4 RAID BOOT ISSUE

Posted on 2007-04-01
Last Modified: 2013-12-15
System has two drives (250GB) set as RAID-0 mirror.

Partitions are formed as:
/dev/hda1 Boot Linux
/dev/hda2 Linux swap
/dev/hda3 Linux raid autodetect

/dev/hdb1 Linux swap
/dev/hdb2 Linux raid autodetect

This is the console screen at boot time:
Decompressing Linux...done.
Booting the kernel.
Red Hat nash version 4.2.1.6 starting
EXT3-fs error (device md0): ext3_find_entry: reading directory #2 offset 0
mount: error 2 mounting none
EXT3-fs error (device md0): ext3_find_entry: reading directory #2 offset 0
EXT3-fs error (device md0): ext3_find_entry: reading directory #2 offset 0
EXT3-fs error (device md0): ext3_find_entry: reading directory #2 offset 0
WARNING: can't access (null)
exec of init ((null)) failed!!!: 14
EXT3-fs error (device md0): ext3_find_entry: reading directory #2 offset 0
unmount /initrd/dev failed: 2
Kernel panic - not syncing: Attempted to kill init!

Here the cursor just blinks indefinitely and the Caps Lock and Scroll Lock lights on the keyboard flash.

I need serious help... Thanks in advance...
-greg
Question by:Technodweeb
 
LVL 27

Accepted Solution

by:
Nopius earned 500 total points
ID: 18833855
Was everything working until recently and then broke, or are you still in the middle of installing Linux?
It would be helpful if you copy and paste the entire boot screen here.
That is easily done with a serial console connection to your server. Just connect a terminal cable to COM1 and add the kernel flags 'console=tty0 console=ttyS0,38400n8' in GRUB (edit the kernel line before booting). A full screen dump might be more informative.
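As a rough sketch of that GRUB edit, the function below appends the two console flags to a kernel line. The function name and the kernel line passed to it (`/vmlinuz`, `root=/dev/md0`) are hypothetical placeholders, not values taken from this system; use whatever kernel line GRUB shows when you edit the boot entry.

```shell
# Append the serial-console flags from the post to a GRUB "kernel" line.
with_serial_console() {
  printf '%s console=tty0 console=ttyS0,38400n8\n' "$1"
}

# Hypothetical kernel line, for illustration only.
with_serial_console 'kernel /vmlinuz ro root=/dev/md0'
```

`38400n8` means 38400 baud, no parity, 8 data bits, so the terminal program on the capturing machine needs the same settings.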
As for your problem: RAID-0 is NOT a mirror. It is a striped device that, once corrupted, can't be restored. Since you use 'md0', I guess it's software RAID, because 'md0' is a device provided by the Linux 'md' driver.
If the problem 'just happened' and everything was OK before, I can guess that you either have dead mount labels on your devices (EL4 uses labels instead of device names to find the appropriate partitions), or corrupted 'md' superblocks at the end of each RAID partition, or a corrupted filesystem in a working RAID0, or a changed 'partition type' on /dev/hda3 or /dev/hdb2.
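Each of those guesses can be checked from a rescue shell with read-only commands. Here is a minimal sketch, assuming the member partitions listed in the question (/dev/hda3 and /dev/hdb2). The function name is made up for this example, and it only prints the commands so they can be reviewed before anything is run.

```shell
# Print read-only checks for an md array and its member partitions.
# Nothing is executed; the output is a checklist to run by hand.
print_raid_checks() {
  array=$1
  shift
  for part in "$@"; do
    disk=${part%%[0-9]*}            # /dev/hda3 -> /dev/hda
    echo "fdisk -l $disk"           # partition type should be fd (Linux raid autodetect)
    echo "mdadm --examine $part"    # dumps the md superblock, if one is readable
  done
  echo "cat /proc/mdstat"           # shows whether the kernel assembled the array
  echo "e2label $array"             # prints the ext2/ext3 label (array must be assembled)
}

print_raid_checks /dev/md0 /dev/hda3 /dev/hdb2
```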

You may read about more kernel flags in 'md' manual here: http://www.squarebox.co.uk/cgi-squarebox/manServer/md.4

Then you may try kernel flags to define the RAID manually: 'ro raid=noautodetect md=0,/dev/hda3,/dev/hdb2'
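To show how that flag string is put together, here is a small sketch that builds it from the md number and the member partitions. The function name is an assumption of this example. The `md=<number>,<dev>,<dev>` form is the one used for arrays with persistent superblocks.

```shell
# Build the kernel flags for assembling an md array by hand at boot.
# Usage: md_boot_args <md number> <member device>...
md_boot_args() {
  md_num=$1
  shift
  members=$(IFS=,; printf '%s' "$*")   # join the members with commas
  printf 'ro raid=noautodetect md=%s,%s\n' "$md_num" "$members"
}

md_boot_args 0 /dev/hda3 /dev/hdb2
# prints: ro raid=noautodetect md=0,/dev/hda3,/dev/hdb2
```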

Your problem is really serious, and it would be nice to have a backup copy of all your data...

 
LVL 11

Author Comment

by:Technodweeb
ID: 18835646
I have gotten it solved. As you mentioned, the problem was VERY serious, and I ended up hiring a fellow out of Michigan (not found through EE) to help resolve it over the phone. Unfortunately, the entire file structure of the RAID was sent to lost+found due to some corruption in the directory, and I now get the job of picking through the scraps to find the specific data that is important.

The resolution was to boot to rescue mode and manually assign the RAID and then fsck the RAID.

boot from RH install CD
at prompt: "linux rescue"

This command reconnected the two drives into the RAID. Luckily, the partition definitions were still intact...
mdadm -Ac partitions -m 0 /dev/md0

This command fixed the issue:
fsck /dev/md0   (or it may have been fsck /dev/hda3; sorry, it was really late)

About an hour later, voila... All done!
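Put together, the rescue-mode sequence above might look like the sketch below. It is only an illustration, not an exact record of what was typed. The `run` and `recover_md0` names, the `DRY_RUN` switch, the `/proc/mdstat` check, and the `fsck -n` pass are all additions for this example. It targets /dev/md0, since the boot errors show the ext3 filesystem living on the array rather than on a single member partition.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 in the rescue shell to actually run them.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

recover_md0() {
  # Reassemble the array by scanning partitions for superblocks with minor 0.
  run mdadm -Ac partitions -m 0 /dev/md0 || return 1
  # Confirm the kernel now lists md0 before touching the filesystem.
  run cat /proc/mdstat || return 1
  # Assumption (not in the original post): a no-changes check first.
  run fsck -n /dev/md0
  # The repair pass described in the post.
  run fsck /dev/md0
}
```

With the default dry run, calling `recover_md0` just lists the four commands in order, which is a cheap way to double-check them before running with `DRY_RUN=0`.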
 
LVL 11

Author Comment

by:Technodweeb
ID: 18835664
Nopius,

I saw in your post some detail that the fellow I paid was covering as well. Even though the question was answered yesterday, I will give you the points because of the similarities and I think between us, it would have gotten figured out.

thanks,
-greg
 
LVL 27

Expert Comment

by:Nopius
ID: 18840333
Technodweeb, thank you.
Really, live assistance via phone is much more helpful (an onsite visit is even better).
I am glad that the problem is partially resolved (you still need to find the useful data in lost+found). Now you may use the 'file' utility to quickly check the data type of an unknown file.
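Since fsck names the entries in lost+found after inode numbers, a per-type summary can help decide where to look first. This is a minimal sketch, assuming `find` and `file` are available where the filesystem is mounted; the function name is made up for this example.

```shell
# Count the files under a directory by the type that file(1) reports.
# Usage: summarize_types <directory>
summarize_types() {
  find "$1" -type f -exec file -b {} + | sort | uniq -c | sort -rn
}

# Example call (the mount point is an assumption, adjust to your system):
#   summarize_types /mnt/sysimage/lost+found
```

Dropping the `-b` option makes `file` print each file name next to its type, which is handy once you know which types you are after.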