Solved

ESXi - PANIC: Failed to find HD boot partition

Posted on 2009-12-18
Last Modified: 2012-05-08
Hello,

After a failed upgrade attempt from ESXi 3.5 to 4.0 (using vSphere Update Utility), I can no longer boot the ESXi 3.5 server - I get "Failed to find HD boot partition."

The server is an HP Proliant DL380 G5 w/ SAS disks.

When I boot from a Linux Live CD, I can see the partitions:
root@ubuntu:~# fdisk -l

Disk /dev/cciss/c0d0: 440.3 GB, 440346238976 bytes
64 heads, 32 sectors/track, 419946 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Disk identifier: 0x00000000

           Device Boot      Start         End      Blocks   Id  System
/dev/cciss/c0d0p1               5         900      917504    5  Extended
/dev/cciss/c0d0p2             901        4845     4039680    6  FAT16
/dev/cciss/c0d0p3            4846      419947   425064344   fb  VMware VMFS
/dev/cciss/c0d0p4   *           1           4        4080    4  FAT16 <32M
/dev/cciss/c0d0p5             255         504      256000    6  FAT16
/dev/cciss/c0d0p6              53         100       49136    6  FAT16
/dev/cciss/c0d0p7             505         614      112624   fc  VMware VMKCORE
/dev/cciss/c0d0p8             615         900      292848    6  FAT16


I can mount the active partition (/dev/cciss/c0d0p4), and it contains this:
root@ubuntu:/mnt/dsk# ls -ltr
-rwxr-xr-x 1 root root    21 2008-08-12 19:44 syslinux.cfg
-rwxr-xr-x 1 root root 21492 2008-08-12 19:44 safeboot.c32
-rwxr-xr-x 1 root root 99000 2008-08-12 19:44 mboot.c32
-r-xr-xr-x 1 root root 10617 2008-08-12 19:44 ldlinux.sys

But no other partitions can be mounted:

root@ubuntu:~# mount /dev/cciss/c0d0p2 /mnt/dsk
mount: you must specify the filesystem type

root@ubuntu:~# mount -t vfat /dev/cciss/c0d0p2 /mnt/dsk
mount: wrong fs type, bad option, bad superblock on /dev/cciss/c0d0p2,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try dmesg | tail  or so
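
One quick check before deciding a partition is unrecoverable is to see whether its first sector still ends with the 0x55 0xAA boot signature that a FAT boot sector carries at byte offset 510. The sketch below is only an illustration: it builds a scratch file as a stand-in for the partition device (on the real server that would be something like /dev/cciss/c0d0p2), and the variable names are made up for the example. A missing signature would suggest the partition was never formatted or has been overwritten; a present one does not prove the filesystem is intact.

```shell
#!/bin/sh
# Scratch stand-in for a partition device such as /dev/cciss/c0d0p2 (assumption).
WORK="$(mktemp -d)"
PART="$WORK/part.img"

# Build a fake 512-byte boot sector: 510 zero bytes followed by 0x55 0xAA.
dd if=/dev/zero of="$PART" bs=510 count=1 2>/dev/null
printf '\125\252' >> "$PART"

# Read the two bytes at offset 510 and render them as lowercase hex.
SIG=$(dd if="$PART" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')

if [ "$SIG" = "55aa" ]; then
    echo "boot signature present"
else
    echo "no boot signature (found: $SIG)"
fi
```

Reading from the real device only needs PART pointed at it; the read is non-destructive.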
       
Any help on how to handle this would be great - I am not very familiar with VMWare.

Thanks,
Lars      

Ps. For more details on the failed upgrade attempt: http://www.experts-exchange.com/Software/VMWare/Q_24985452.html
Question by:Lars007
 
LVL 24

Expert Comment

by:ryder0707
ID: 26086169
Reinstall ESXi using the Repair option, and remember to keep the existing VMFS datastore!
 

Author Comment

by:Lars007
ID: 26087461
Ok.  Can I install ESXi4 (with repair option?) and keep the vmfs datastore, or do I have to reinstall the old ESXi 3.5 build and then re-attempt the upgrade to 4.0?  

Thanks,
Lars
 
LVL 24

Expert Comment

by:ryder0707
ID: 26087501
No, I don't think so, as the source files are different. Repair with the same ESXi 3.5 version, then upgrade to ESXi 4 if needed.
By the way, do you have a backup of all the VMs? If you make a mistake, all data will be lost.

 

Author Comment

by:Lars007
ID: 26087643
Ok, I'll try to find the exact build of ESXi 3.5.  It would have been nice to be able to do a clean install of ESXi 4 and just keep the datastores.  Or at least do an install / repair using the latest build of 3.5 (so as not to be back at square one with regard to upgrading to 4)...

No, I have no recent backup of the VMs - the server has not yet been put into production.  But there are a couple of VMs on there that I really would prefer not to lose.  I guess I could use dd to do a raw backup of the entire disk, but with 450GB, that would be a pain.  

Any idea why I cannot mount any of the partitions?  I mean, it appears that ESXi mounts more than just the active partition before it panics (see attached screenshot).  Is there an ESX-specific rescue CD or similar that can be used?  If I could just mount the partitions, I assume I could copy off whichever VMs in the datastore I care about?

Thanks,
Lars


PIC-0040.jpg
 
LVL 24

Expert Comment

by:ryder0707
ID: 26087710
No idea what happened, but I'm currently testing on one of my lab servers to see what happens if I try to repair ESXi 3.5 with the ESXi 4 CD.
By the way, at the moment I'm not aware of any third-party tool that can read VMFS directly; only ESX/ESXi can read the VMFS filesystem.

Be back very soon..
 
LVL 24

Expert Comment

by:ryder0707
ID: 26087779
As suspected earlier, ESXi 4 can't repair ESXi 3.5: it can't find the existing installation and told me the partition was corrupted! And as expected, I can repair it with the ESXi 3.5 CD with no problem, and my test data in the VMFS is safe.
By the way, you said you were trying to upgrade ESXi 3.5 to ESXi 4 and it failed after boot, right? So which version do you see when the server boots?
 

Author Comment

by:Lars007
ID: 26087886
Thanks for trying that.  

When ESXi starts booting, it is version 3.5.0 build-199239 (see attachment).  The progress bar gets all the way over to the right before it panics.

I looked for my exact build for download, but cannot find it on VMWare's site - it jumps from build 153875 (3.5 U4) to build 207095 (3.5 U5).  If I can't find the exact build, I guess 3.5 U5 would be my best bet?

Thanks,
Lars
ESXiBoot.jpg
 
LVL 24

Accepted Solution

by:
ryder0707 earned 2000 total points
ID: 26087922
Yes, try it. Read each prompt carefully after you select Repair; it will tell you whether the existing VMFS can be kept or not. In my case with ESXi 4, it warned me that all data would be lost.
Basically, if you don't see anything that says something like "existing vmfs will be preserved", I strongly suggest you stop what you are doing and pay VMware per incident to recover your data.
Or perhaps image the entire disk first to cover your @$$, using a disk imaging tool or another method you think suitable.
Good luck!
 

Author Comment

by:Lars007
ID: 26087986
I'll probably just hook up an external USB HD, boot from a Live CD and dd the whole physical disk over to the USB HD before doing anything.  After that, I'll just try the repair.
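
A minimal sketch of that kind of raw copy, for anyone following along. It is not the exact command used in this thread: the paths are scratch files standing in for the real source device (here it would be /dev/cciss/c0d0) and for a file on the mounted USB disk, and the variable names are made up for the example.

```shell
#!/bin/sh
# Scratch stand-ins (assumptions): on the real server SRC would be the whole
# disk device and DEST a file on the mounted external USB disk.
WORK="$(mktemp -d)"
SRC="$WORK/disk.img"
DEST="$WORK/backup.img"

# Create a 1 MiB pretend disk filled with random data.
dd if=/dev/urandom of="$SRC" bs=1024 count=1024 2>/dev/null

# Raw, sector-for-sector copy of the whole source in 1 MiB chunks.
dd if="$SRC" of="$DEST" bs=1048576 2>/dev/null

# Compare the copy with the source byte for byte before touching the original.
if cmp -s "$SRC" "$DEST"; then
    RESULT="backup verified"
else
    RESULT="backup differs from source"
fi
echo "$RESULT"
```

The verification pass roughly doubles the read time on a disk this size, but it is the only way to know the image is usable before running a repair that may rewrite the partition table.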

By the way, thank you very much for all the help you have given me.  I will probably not be able to do the DD until Monday, so, if you don't mind, I'll keep this posting open until I have tried the repair (in case I have any more questions when doing this - I have never done it before).

Lars
 
LVL 24

Expert Comment

by:ryder0707
ID: 26088001
No worries mate, glad to help
It's better to have a good backup than nothing.
 

Author Comment

by:Lars007
ID: 26102480
An update:
After dd:ing the whole disk, I tried the repair.  It gave me a "Disk Geometry Warning", telling me that it had detected an invalid or corrupt partition.  At this point, I opened up a support ticket with VMWare before proceeding.  The tech had me proceed with the repair.  After this, the server was put in "audit mode," and the VMFS partitions were nowhere in sight.  An escalation engineer then used the partition data I had recorded from before the attempted repair to manually recreate the missing partition.  And it worked - the VMs are now accessible again.  

I am currently in the process of backing all VMs up to a local disk (will probably take all night).  Once this is done, I will re-attempt the upgrade to 4 - hopefully w/o a trashed partition table this time (at least I have an open incident to fall back on if it happens again).

I much appreciate your help on this.

Thanks,
Lars

 
LVL 24

Expert Comment

by:ryder0707
ID: 26102497
Phew... lucky you had a good backup, dude.
I wonder how the VMware tech recreated the missing partition.
 

Author Comment

by:Lars007
ID: 26102605
Actually, the backup was not needed.  I believe he just used fdisk to create a partition with the same start and end as the VMFS partition that was there before the last repair attempt.

Here is how the partition table looked before the last repair attempt:

root@ubuntu:~# fdisk -l
Disk /dev/cciss/c0d0: 440.3 GB, 440346238976 bytes
64 heads, 32 sectors/track, 419946 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Disk identifier: 0x00000000

           Device Boot      Start         End      Blocks   Id  System
/dev/cciss/c0d0p1               5         900      917504    5  Extended
/dev/cciss/c0d0p2             901        4845     4039680    6  FAT16
/dev/cciss/c0d0p3            4846      419947   425064344   fb  VMware VMFS
/dev/cciss/c0d0p4   *           1           4        4080    4  FAT16 <32M
/dev/cciss/c0d0p5             255         504      256000    6  FAT16
/dev/cciss/c0d0p6              53         100       49136    6  FAT16
/dev/cciss/c0d0p7             505         614      112624   fc  VMware VMKCORE
/dev/cciss/c0d0p8             615         900      292848    6  FAT16
Partition table entries are not in disk order

Here is the partition table from an ssh shell to ESXi after the repair:
/dev/disks/vmhba1:0:0:1             5       750    763904    5  Extended
/dev/disks/vmhba1:0:0:4   *         1         4      4080    4  FAT16 <32M
/dev/disks/vmhba1:0:0:5             5        52     49136    6  FAT16
/dev/disks/vmhba1:0:0:6            53       100     49136    6  FAT16
/dev/disks/vmhba1:0:0:7           101       210    112624   fc  VMKcore
/dev/disks/vmhba1:0:0:8           211       750    552944    6  FAT16

As you can see, the big VMware VMFS partition is no longer there.  One wrinkle, according to the escalation tech, was that I used "fdisk -l", which reports Start and End in sectors, while ESXi partitions are created using blocks (so he had to do the math to figure out the proper Start and End for the partition).
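
The tech's exact calculation is not recorded here, so the following is only a sketch of the kind of unit conversion involved, using the values from the pre-repair listing above. Note that the listing's own header gives the units as cylinders of 2048 * 512 bytes, and that the Blocks column is in 1 KiB units. The sketch assumes the VMFS partition begins on the first sector of its start cylinder, which the listing does not prove.

```shell
#!/bin/sh
# Values copied from the pre-repair "fdisk -l" listing for /dev/cciss/c0d0p3.
SECTORS_PER_CYL=2048      # "Units = cylinders of 2048 * 512"
START_CYL=4846            # Start column
BLOCKS_1K=425064344       # Blocks column, 1 KiB each (two 512-byte sectors)

# Assumption: the partition starts on the first sector of its start cylinder.
# fdisk numbers cylinders from 1, hence the "- 1".
START_SECTOR=$(( (START_CYL - 1) * SECTORS_PER_CYL ))
SECTOR_COUNT=$(( BLOCKS_1K * 2 ))
END_SECTOR=$(( START_SECTOR + SECTOR_COUNT - 1 ))

echo "start=$START_SECTOR count=$SECTOR_COUNT end=$END_SECTOR"
# prints: start=9922560 count=850128688 end=860051247
```

As a sanity check, (end + 1) * 512 comes to 440346238976 bytes, which matches the disk size in the listing, consistent with the VMFS partition running to the very end of the disk.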

My lesson here is to never ever run a VMWare upgrade program without first making a fresh backup of all VMs.  Frightening how the upgrade program would just trash my partition tables that way without any warning.

Lars
 
LVL 24

Expert Comment

by:ryder0707
ID: 26102628
Yes, you could also back up the MBR/disk partition table using dd :)
But I guess you already knew this.
thanks for sharing
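
A rough sketch of what that could look like. The file names are made up for the example, and a scratch image stands in for the real device (which in this thread would be /dev/cciss/c0d0).

```shell
#!/bin/sh
# Scratch stand-in for the real disk device (assumption).
WORK="$(mktemp -d)"
DISK="$WORK/disk.img"
MBR_BACKUP="$WORK/mbr.bin"

# Create a 32 KiB pretend disk.
dd if=/dev/urandom of="$DISK" bs=512 count=64 2>/dev/null

# Save the first sector: boot code plus the four primary partition entries.
dd if="$DISK" of="$MBR_BACKUP" bs=512 count=1 2>/dev/null
MBR_SIZE=$(wc -c < "$MBR_BACKUP" | tr -d ' ')
echo "saved $MBR_SIZE bytes"

# Restoring writes that one sector back. conv=notrunc matters when the
# target is a regular file, so the rest of it is left untouched.
dd if="$MBR_BACKUP" of="$DISK" bs=512 count=1 conv=notrunc 2>/dev/null
DISK_SIZE=$(wc -c < "$DISK" | tr -d ' ')
```

One caveat: the first sector only holds the primary entries. Logical partitions such as p5 to p8 in the listing above are described by records inside the extended partition, so keeping a text copy of "fdisk -l" (as Lars did) or an "sfdisk -d" dump alongside the dd copy is the safer habit.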


Question has a verified solution.
