Recovering crashed Linux server

Posted on 2008-11-18
Medium Priority
Last Modified: 2013-12-16

I tried to clone a CentOS server using 'Ghost for Linux', but it showed a SCSI error. Later, when I started the machine normally, it wouldn't come up. It displays the following error:

hde: Invalid capacity for disk in drive
Volume group "VolGroup00" not found
Unable to access resume device (/dev/VolGroup00/LogVol01)
hde: Invalid capacity for disk in drive
hde: Invalid capacity for disk in drive
mount: could not find filesystem '/dev/root'
setuproot: moving /dev failed: No such file or directory
setuproot: error mounting /proc: No such file or directory
switchroot: error mount failed: No such file or directory
Kernel panic - not syncing: Attempted to kill init !

Looks like the filesystem in the LVM got corrupted. The server is configured with a mirrored volume across two disks (sda & sdb). I tried to recover the system by booting from a rescue CD; once it came up, I could see the root partition mounted on /mnt/sysimage. Now how can I fix the filesystem issue? I tried running fsck on the two disks that show up in 'fdisk -l', but it reports the following error:
"Couldn't find ext2 superblock, trying backup blocks.....
fsck.ext2: Bad magic number in super-block while trying to open /dev/sda2..........
<finally it says>
you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device> "

I tried running fsck with an alternate superblock, but it gives the same error.

Can you suggest how I can fix this?

Question by:rdashokraj
LVL 81

Assisted Solution

arnold earned 225 total points
ID: 22991287
Run mkfs.ext2 -n /dev/sda2 (the -n flag does nothing except display what mkfs would have done, including the locations of the backup superblocks).
This will display the backup superblocks, which you can then use to recover the data:
e2fsck -b backup_superblock /dev/sda2

Depending on the damage you might need to try a few of the alternate blocks; try the ones in the middle and near the end.
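Since pointing mkfs at the wrong device would destroy data, the sequence is worth rehearsing first. A minimal sketch on a scratch image file (disk.img is made up here; on the real box the device would be /dev/sda2, and the backup-block numbers may differ):

```shell
# Rehearse safely on a scratch image file (no root needed, e2fsprogs only);
# substitute the real device for disk.img once you trust the steps.
dd if=/dev/zero of=disk.img bs=1M count=32 2>/dev/null
mkfs.ext2 -q -F -b 1024 disk.img        # throwaway ext2 filesystem

# -n prints what mke2fs WOULD do -- including where the backup
# superblocks live -- without writing anything:
mkfs.ext2 -n -F -b 1024 disk.img | grep -A1 'Superblock backups'

# Feed one of the listed blocks to e2fsck (-b alternate superblock,
# -B its block size). Exit status 1 only means errors were corrected.
e2fsck -f -y -b 8193 -B 1024 disk.img || true
```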
LVL 14

Assisted Solution

cjl7 earned 225 total points
ID: 22991943

Since you are running LVM you can't use the /dev/sda2 device directly. You need to use the logical volume instead.

When you are running a rescue CD, how is /mnt/sysimage mounted? ('mount' will show you)

Can you see the filesystem and files as expected? If you can, I'd say your LVM config is broken. If it works when running a rescue CD, you can restore the metadata with vgcfgrestore (it reads the backup that vgcfgbackup keeps under /etc/lvm/backup/).
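A sketch of that check-and-restore sequence from the rescue environment (VolGroup00 and the /mnt/sysimage path are taken from the question; these need root and a real VG, so treat them as an outline rather than a recipe):

```shell
# How was the root filesystem mounted, and does LVM see the VG at all?
mount | grep sysimage
vgscan                     # scan all disks for volume groups
vgdisplay VolGroup00       # does the VG exist?

# If the metadata is damaged, restore it from the automatic backup that
# vgcfgbackup keeps under /etc/lvm/backup/ (path via the mounted root):
vgcfgrestore -f /mnt/sysimage/etc/lvm/backup/VolGroup00 VolGroup00
vgchange -ay VolGroup00    # reactivate the logical volumes
```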

Also make sure your grub.conf is correct.

Is your boot device OK? That is the one you are bypassing when using a rescue CD.

LVL 19

Accepted Solution

jools earned 900 total points
ID: 22993656
I think the SCSI error may not be the whole problem:

> hde: Invalid capacity for disk in drive
This is probably a bad disk drive. I suspect your volume group information points to IDE (hd*) disks and you're now expecting it to work on a SCSI-based system.

You will need to boot into rescue mode (do not mount the system) and import the VG information you restored, so that the pointers in the LVM config reference the correct disks.

If you can supply more information from the old system and the new one it would help.
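One way to compare what the volume-group metadata expects against what the kernel actually found, from the rescue shell before mounting anything (VolGroup00 is the VG name from the panic messages; --partial is only for inspection when a physical volume is missing):

```shell
# Which physical volumes does LVM find on disk, and which VG claims them?
pvscan
pvs -o pv_name,vg_name,pv_uuid

# If the metadata references a device that no longer exists (e.g. an old
# /dev/hde), try activating in partial mode so the LVs can be inspected:
vgchange -ay --partial VolGroup00
lvs -o lv_name,vg_name,devices
```

Everything here is read-only apart from the activation step; nothing rewrites metadata.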


Assisted Solution

macker- earned 150 total points
ID: 23001015
Are you trying to mirror from a system running two SCSI disks, to a new system running two SCSI disks?

Is 'hde' a CD-ROM drive, a hard drive, ... ?

It sounds like Ghost may not have understood LVM, and didn't remap the partitions properly. I think the other Experts are on the right track, but we need more information about the old and new system at the hardware level; SCSI controller(s) and disks installed (size, model) would be useful, and what device hde is.

Author Comment

ID: 23001457
Thanks for all your inputs. All my attempts to recover the disks (LVM) to their original form failed. So finally I booted into rescue mode with a USB hard disk connected, backed up all the required files, and rebuilt the system. Anyway, this crash has raised many questions about the purpose of having LVM when we aren't sure how to restore the system in case of failure.

Kindly answer this question. Say I have LVM configured across two disks for the / partition. If one disk fails, what is the procedure to bring up the OS using the other disk?
In Solaris, if one disk in a mirrored volume fails, we can boot the system by changing the boot-device parameter to the alternative disk, and we can even break the mirror and convert a metadevice (in SVM) back to a normal disk. Is that possible in LVM?
LVL 19

Assisted Solution

jools earned 900 total points
ID: 23001897
You can run software RAID (like Sun's DiskSuite product you mentioned) underneath LVM if you really need to, but it adds another layer of complication; I have a system configured like this.

It is possible to boot the system if a mirror is broken, but you have to make sure the system is configured properly first (think grub).
I have never tried to convert a metadevice back to a normal disk myself. It may be possible if LVM is not used, but with LVM involved it may be overcomplicated (if I understand your question).

For instance;
I have an md device for /boot; if one disk fails, the other will still be available and I can still boot. If I want to convert back, it may be possible by just editing /etc/fstab and changing the fdisk partition type ID back to 83.
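As a concrete (hypothetical) illustration of surviving a /boot mirror failure with md, assuming /dev/md0 was built from sda1 and sdb1 and GRUB was installed on both disks:

```shell
# Check array health; a failed member shows up with an (F) flag.
cat /proc/mdstat

# Mark the dead member failed and pull it from the array; the system
# keeps running degraded on the surviving disk.
mdadm --manage /dev/md0 --fail /dev/sda1
mdadm --manage /dev/md0 --remove /dev/sda1

# After physically replacing the disk: clone the partition table from
# the good disk, then let md resync the mirror.
sfdisk -d /dev/sdb | sfdisk /dev/sda
mdadm --manage /dev/md0 --add /dev/sda1
```

Note that grub would still need to be reinstalled on the new disk for it to be bootable on its own.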

I also have an md device for my root VG under LVM. If one disk fails I should still have access to the data, but converting it back to a normally partitioned disk would be complicated because the LVM partitions are in effect virtual. It depends on your original setup, but I would imagine most LVM configurations use one large partition and split their logical volumes within it, rather than having a partitioned disk in the volume group as you would with Veritas VM encapsulating the root disk.

If the server were a production server I would probably shy away from LVM for the O/S (data areas are fine if you like), but I wouldn't mind using software RAID, from a disaster-recovery point of view anyway.

Author Closing Comment

ID: 31518117
Thanks for all your inputs
