Recovering crashed Linux server

Posted on 2008-11-18
Last Modified: 2013-12-16

I tried to clone a CentOS server using 'Ghost for Linux', but it showed a SCSI error. Later, when I started the machine normally, it would not come up. It displays the following errors:

hde: Invalid capacity for disk in drive
Volume group "VolGroup00" not found
Unable to access resume device (/dev/VolGroup00/LogVol01)
hde: Invalid capacity for disk in drive
hde: Invalid capacity for disk in drive
mount: could not find filesystem '/dev/root'
setuproot: moving /dev failed: No such file or directory
setuproot: error mounting /proc: No such file or directory
switchroot: error mount failed: No such file or directory
Kernel panic - not syncing: Attempted to kill init !

It looks like the filesystem in the LVM got corrupted. The server is configured with a mirrored volume across two disks (sda and sdb). I tried to recover the system by booting the server from the rescue CD; once it came up, I could see that the root partition was mounted on /mnt/sysimage. Now how can I fix the filesystem issue? I tried to run fsck on the two disks that show up in 'fdisk -l', but it gives the following error:
"Couldn't find ext2 superblock, trying backup blocks.....
fsck.ext2: Bad magic number in super-block while trying to open /dev/sda2..........
<finally it says>
you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device> "

I tried running fsck with an alternate superblock, but it gives the same error.

Can you suggest how I can fix this?

Question by:rdashokraj
    LVL 76

    Assisted Solution

    Run mkfs.ext2 -n /dev/sda2. (First confirm that the -n flag does nothing other than display what mkfs would have done, including the superblock locations.)
    This will display the backup superblocks,
    which you can then use to recover the data:
    e2fsck -b backup_superblock /dev/sda2

    You might need to try a few of the alternate blocks; try the ones in the middle and near the end of the list.
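A minimal sketch of that procedure in shell, assuming `awk` and e2fsprogs are available in the rescue environment. The function names are illustrative and not from this thread. `try_backup_superblocks` runs `e2fsck` with `-n`, so it only reports and never modifies the disk. Note that `mkfs.ext2 -n` lists accurate locations only when it is run with the same parameters (such as block size) that were used when the filesystem was created.

```shell
#!/bin/sh
# Hypothetical helpers; the function names are assumptions for illustration.

# Read `mkfs.ext2 -n` output on stdin and print one backup superblock per line.
list_backup_superblocks() {
    awk '
        /Superblock backups stored on blocks/ { found = 1; next }
        found && NF == 0 { exit }
        found { gsub(/,/, " "); for (i = 1; i <= NF; i++) print $i }
    '
}

# Try each backup superblock read-only (-n) until e2fsck can open the filesystem.
# e2fsck exit codes below 8 mean it ran without an operational error.
try_backup_superblocks() {
    dev="$1"
    for sb in $(mkfs.ext2 -n "$dev" | list_backup_superblocks); do
        echo "Trying backup superblock $sb on $dev"
        e2fsck -n -b "$sb" "$dev"
        rc=$?
        if [ "$rc" -lt 8 ]; then
            echo "Superblock $sb is readable (e2fsck exit code $rc)"
            return 0
        fi
    done
    return 1
}
```

Usage would look like `try_backup_superblocks <device>`, where the device holds the ext2/ext3 filesystem itself. Once a readable superblock is found, a repair run would be `e2fsck -b <that block> <device>` without `-n`.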
    LVL 14

    Assisted Solution


    Since you are running LVM you can't use the /dev/sda2 device directly. You need to use the logical volume instead.

    When you are running a rescue cd, how is /mnt/sysimage mounted? ('mount' will show you)

    Can you see the filesystem and files as expected? If you can, I'd say that your LVM configuration is broken. If it works when running from a rescue CD, you can save the working settings by running vgcfgbackup (it writes a copy of the volume group metadata under /etc/lvm/backup/, which vgcfgrestore can later restore).
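As a rough sketch of working with the logical volume rather than /dev/sda2, the following shell helper activates the volume group and checks a logical volume read-only. The volume group name VolGroup00 comes from the boot messages in the question; the logical volume name LogVol00 is an assumption (the boot log only names LogVol01, as the resume device), and the `run` and `activate_and_check` names are hypothetical. The script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch only: VG is taken from the boot log, LV is an assumed name for the root volume.
VG="${VG:-VolGroup00}"
LV="${LV:-LogVol00}"

# Print the command in dry-run mode (the default); execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

activate_and_check() {
    run lvm vgscan                     # look for volume groups on all disks
    run lvm vgchange -ay "$VG"         # activate the logical volumes in the group
    run lvm lvs "$VG"                  # list the logical volumes that were found
    run e2fsck -f -n "/dev/$VG/$LV"    # forced, read-only check of the filesystem
}
```

Calling `activate_and_check` prints the four commands; `DRY_RUN=0 activate_and_check` would run them. The filesystem should be unmounted before any e2fsck run that is allowed to make repairs.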

    Also make sure your grub.conf is correct.

    Is your boot device OK? That is the one you are bypassing when using a rescue CD.

    LVL 19

    Accepted Solution

    I think the SCSI error may not be the whole problem:

    > hde: Invalid capacity for disk in drive
    This probably indicates a bad disk drive. My guess is that your volume group information points to IDE (hd) disks and you are now expecting it to work on a SCSI-based system.

    You will need to boot into rescue mode (do not mount the system) and import the VG information you restored, so that the pointers in the LVM config point to the correct disks.
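One way this might look from the rescue shell is sketched below. It assumes a copy of the volume group metadata backup is reachable from the rescue environment; the path passed as the second argument is a placeholder, and the `run` and `inspect_and_restore` names are hypothetical. The default is a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = volume group name, $2 = path to a metadata backup file (assumed to exist)
inspect_and_restore() {
    vg="$1"
    backup="$2"
    run lvm pvs -o pv_name,vg_name,pv_uuid   # which disks does LVM see as physical volumes?
    run lvm vgcfgrestore --list "$vg"        # list known metadata backups for the group
    run lvm vgcfgrestore -f "$backup" "$vg"  # restore the metadata from the chosen file
    run lvm vgchange -ay "$vg"               # activate the restored volume group
}
```

For example, `inspect_and_restore VolGroup00 /path/to/backup-file` prints the sequence for review. Comparing the `pvs` output against the device names recorded in the backup file would show whether the metadata still points at the old disks.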

    If you can supply more information from the old system and the new one it would help.

    LVL 7

    Assisted Solution

    Are you trying to mirror from a system running two SCSI disks, to a new system running two SCSI disks?

    Is 'hde' a CD-ROM drive, a hard drive, ... ?

    It sounds like Ghost may not have understood LVM, and so may not have remapped the partitions properly. I think the other Experts are on the right track, but we need more information about the old and new systems at the hardware level: the SCSI controller(s) and disks installed (size, model) would be useful, as well as what device hde is.

    Author Comment

    Thanks for all your input. All my attempts to recover the disks (LVM) to their original state failed. Finally I booted into rescue mode with a USB hard disk connected, backed up all the required files, and rebuilt the system. This crash has raised many questions about the purpose of having LVM when we aren't sure how to restore the system in case of failure.

    Kindly answer this question. Let's say I have LVM configured across two disks for the / partition. If one disk fails, what is the procedure to bring up the OS using the other disk?
    In Solaris, if one of the disks in a mirrored volume fails, we can boot the system by changing the boot-device parameter to the alternate disk, and we can even break the mirror and convert a metadevice (in SVM) to a normal disk. Is that possible in LVM?
    LVL 19

    Assisted Solution

    You can use software RAID (like Sun's DiskSuite product that you mentioned) underneath LVM if you really need to, but it adds another layer of complication. I have a system that is configured like this.

    It is possible to boot the system if a mirror is broken, but you have to make sure the system is configured properly first (think grub).
    I have never tried to convert the metadevice back to a normal disk myself. It may be possible if LVM is not used, but if LVM is used then it may be a little overcomplicated (if I understand your question).
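On the "think grub" point: with GRUB legacy, a common approach is to install the boot loader on the second disk as well, mapping that disk as (hd0) so it can boot on its own if the first disk is gone. The sketch below only generates the GRUB shell commands. The function name is hypothetical, and the assumption that /boot lives on the first partition of the disk is mine, not from this thread.

```shell
#!/bin/sh
# Sketch only: prints GRUB legacy commands; it does not touch any disk by itself.
# $1 = disk to make bootable (e.g. the second mirror member)
# $2 = zero-based index of the /boot partition on that disk (assumed 0 by default)
grub_mirror_commands() {
    disk="$1"
    part="${2:-0}"
    printf 'device (hd0) %s\n' "$disk"   # treat this disk as the first BIOS disk
    printf 'root (hd0,%s)\n' "$part"     # partition that holds /boot
    printf 'setup (hd0)\n'               # write GRUB to the MBR of that disk
    printf 'quit\n'
}
```

After reviewing the output of `grub_mirror_commands /dev/sdb`, it could be piped into `grub --batch` to apply it.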

    For instance:
    I have an md device for /boot. If one disk fails, the other disk will still be available and I can still boot. If I want to convert back, it may be possible by just editing /etc/fstab and changing the fdisk partition type ID back to 83.

    I also have an md device for my rootvg under LVM. If one disk fails, I should still have access to the data, but converting it back to a normal partitioned disk would be complicated because the LVM partitions are in effect virtual. I guess it depends on your original setup, but I would imagine most LVM configurations use one large partition and then split up their logical volumes within it, rather than having a partitioned disk in the volume group as you would if you were using Veritas VM and encapsulating the root disk.
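To see whether an md mirror like the ones described above is running on one disk, the status line in /proc/mdstat can be checked: a healthy two-disk mirror shows [UU], and a missing member shows up as an underscore, for example [U_]. A small sketch (the function name is hypothetical):

```shell
#!/bin/sh
# Print the names of md arrays that have a missing member.
# $1 = file to read (defaults to /proc/mdstat; "-" reads standard input)
degraded_md_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }          # remember the array on its header line
        /\[[U_]*_[U_]*\]/ { print name }     # status such as [U_] or [_U] means degraded
    ' "${1:-/proc/mdstat}"
}
```

Running `degraded_md_arrays` with no arguments reads the live /proc/mdstat. For any array it reports, `mdadm --detail` on that device gives more information about which member is missing.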

    If the server was a production server then I would probably shy away from LVM for my O/S (data areas are fine if you like) but I wouldn't mind using software RAID, from a disaster recovery POV anyway.

    Author Closing Comment

    Thanks for all your inputs
