Lost Partition on Citrix XenServer

We have two servers running Citrix XenServer 6.2. Today both of them suddenly ran into trouble.
Both servers are standalone, with a RAID 1 disk for the system and a RAID 5 disk for the VMs.

It seems that both machines dropped the connection to the RAID 5 drive.

The RAID 5 disk is unplugged. When we try pbd-plug [UUID] we get the following error:

Error parameters: , Logical Volume mount/activate error [opterr=Unable to activate LV. Errno is 5]

fdisk -l tells us that there is no partition on the /dev/sdb device:

[root@hskxen02 ~]# fdisk -l

WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sda: 146.2 GB, 146263769088 bytes
256 heads, 63 sectors/track, 17712 cylinders
Units = cylinders of 16128 * 512 = 8257536 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1       17713   142835711+  ee  EFI GPT

Disk /dev/sdb: 1316.3 GB, 1316373921792 bytes
255 heads, 63 sectors/track, 160040 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdb doesn't contain a valid partition table

Please Help!!

Did you set up the local storage as ext3 ("optimized for XenDesktop") or lvm?
With lvm there won't be any partition entries, because XenServer (at least 5.5 did) turns the whole disk into an LVM2 physical volume. Some commands that may help (comments after #):
pvscan # rescan all disks for physical volumes, this should show sdb and list a VG_XenStorage-*
lvscan # scan for logical volumes. You can ignore the active/inactive-state. xenserver manages this itself
xe sr-list type=lvm
xe sr-scan uuid=xxxx-xxxx-xxxx-xxxx
xe vdi-list sr-uuid=xxxx-xxxx-xxxx-xxxx
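
To make the pvscan check repeatable, here is a minimal sketch of a shell helper that reads pvscan output and prints the volume group found on a given disk. The function name vg_on_disk is made up for illustration. It assumes the "PV <device> VG <name>" line layout and plain /dev/sdX device names; other LVM versions or device naming schemes may need a different pattern.

```shell
# Hypothetical helper: print the VG name(s) whose physical volume lives on
# the given disk (either the whole disk or one of its partitions).
# Reads pvscan output on stdin; prints nothing if no PV with a VG is found.
vg_on_disk() {
    awk -v disk="$1" '
        $1 == "PV" && $3 == "VG" {
            dev = $2
            sub(/[0-9]+$/, "", dev)   # strip partition number: /dev/sda3 -> /dev/sda
            if (dev == disk) print $4
        }
    '
}

# Usage on the host (not run here):
#   pvscan | vg_on_disk /dev/sdb
```

If this prints nothing for /dev/sdb, LVM does not see a volume group on that disk, and XenServer will not be able to plug the SR either.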
WaibelIT (Author) commented:
Hello acbxyz,

Please see the results below.
I have already tried to destroy and recreate the PBD of the RAID 5 disk; I had tried that on a test server first. When I try to attach the PBD with pbd-plug I get the error described earlier, no matter whether I use /dev/sdb as the device or /dev/disk/by-id/scsi..........

xe pbd-list
uuid ( RO)                  : db3590c3-c709-c9d0-ed8b-9d2c0984aeb6
             host-uuid ( RO): 967eadc3-1cce-43b7-a13c-b32bd9f422ea
               sr-uuid ( RO): 6f900dbc-ff24-f1a3-b6b5-4721171bb868
         device-config (MRO): type: nfs_iso; location:
    currently-attached ( RO): true

uuid ( RO)                  : 0eea2a16-554a-6e6f-5f65-f8305a6454d9
             host-uuid ( RO): 967eadc3-1cce-43b7-a13c-b32bd9f422ea
               sr-uuid ( RO): 283656ca-c589-8036-4d71-618891e22cf0
         device-config (MRO): location: /dev/xapi/cd
    currently-attached ( RO): true

uuid ( RO)                  : d9d1faae-cc74-a625-e1e2-0e96097f5c2c
             host-uuid ( RO): 967eadc3-1cce-43b7-a13c-b32bd9f422ea
               sr-uuid ( RO): 3bb63be3-adb8-fc7b-ae58-da2ef9361d9a
         device-config (MRO): device: /dev/sdb
    currently-attached ( RO): false

uuid ( RO)                  : 95021b09-98be-2d62-e75a-19291dae9de7
             host-uuid ( RO): 967eadc3-1cce-43b7-a13c-b32bd9f422ea
               sr-uuid ( RO): ab352e51-0239-c682-3b32-d619a18b8d2c
         device-config (MRO): serverpath: /NFS_VHD; server: hskna01.hsk.local; options:
    currently-attached ( RO): true

uuid ( RO)                  : 134d9204-aa65-0280-c10d-f8d81fb3a33f
             host-uuid ( RO): 967eadc3-1cce-43b7-a13c-b32bd9f422ea
               sr-uuid ( RO): 012c1056-022f-ee72-a90e-017ff94c8fe5
         device-config (MRO): location: /opt/xensource/packages/iso; legacy_mode: true
    currently-attached ( RO): true

uuid ( RO)                  : e561720c-3f44-6a71-1c75-27f6ada7284e
             host-uuid ( RO): 967eadc3-1cce-43b7-a13c-b32bd9f422ea
               sr-uuid ( RO): aef18868-9f83-7746-9002-4436921ea563
         device-config (MRO): location: /dev/xapi/block
    currently-attached ( RO): true

uuid ( RO)                  : f631d083-3e2a-a695-dd0e-6b750a66e76b
             host-uuid ( RO): 967eadc3-1cce-43b7-a13c-b32bd9f422ea
               sr-uuid ( RO): f0f4b7c9-58c0-cf53-2270-e5038a6472f6
         device-config (MRO): device: /dev/disk/by-id/scsi-36003005701020f9017b41cbb0e07864f-part3
    currently-attached ( RO): true

xe sr-list
uuid ( RO)                : 3bb63be3-adb8-fc7b-ae58-da2ef9361d9a
          name-label ( RW): RAID5
    name-description ( RW): High Speed - SAS XenServer Store
                host ( RO): hskxen02.hsk.local
                type ( RO): ext
        content-type ( RO): user

uuid ( RO)                : 012c1056-022f-ee72-a90e-017ff94c8fe5
          name-label ( RW): XenServer Tools
    name-description ( RW): XenServer Tools ISOs
                host ( RO): hskxen02.hsk.local
                type ( RO): iso
        content-type ( RO): iso

uuid ( RO)                : f0f4b7c9-58c0-cf53-2270-e5038a6472f6
          name-label ( RW): RAID1
    name-description ( RW):
                host ( RO): hskxen02.hsk.local
                type ( RO): ext
        content-type ( RO): user

  PV /dev/sda3   VG XSLocalEXT-f0f4b7c9-58c0-cf53-2270-e5038a6472f6   lvm2 [128.21 GB / 0    free]
  Total: 1 [128.21 GB] / in use: 1 [128.21 GB] / in no VG: 0 [0   ]

  ACTIVE            '/dev/XSLocalEXT-f0f4b7c9-58c0-cf53-2270-e5038a6472f6/f0f4b7c9-58c0-cf53-2270-e5038a6472f6' [128.21 GB] inherit
If even pvscan doesn't recognize the disk, XenServer can't either.
The entries below /dev/disk/by-id are just symbolic links to /dev/sdX, so using them always gives the same result.

To check the content of disks and partitions you can use the command "blkid /dev/sd*" (without the quotes, of course).
Another way is to boot the machine from a live Linux system (CD or USB stick) and try to check and repair it from there. I prefer grml nowadays.
A live system has more tools and a newer kernel than XenServer itself.
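
With this many PBDs it is easy to overlook the one that is detached. As a rough sketch, the shell function below filters "xe pbd-list" output and prints the PBD UUID and SR UUID of every record whose currently-attached field is false. The name detached_pbds is hypothetical, and the parsing assumes the field layout shown in the output posted above.

```shell
# Hypothetical helper: list "pbd-uuid sr-uuid" for every detached PBD.
# Reads `xe pbd-list` style output on stdin.
detached_pbds() {
    awk '
        # A line starting with "uuid" opens a new PBD record
        $1 == "uuid"    { pbd = $NF; sr = "" }
        # Remember the SR this PBD belongs to
        $1 == "sr-uuid" { sr = $NF }
        # Report the record if it is not attached
        $1 == "currently-attached" && $NF == "false" { print pbd, sr }
    '
}

# Usage on the host (not run here):
#   xe pbd-list | detached_pbds
```

For the listing above this should single out the PBD with device /dev/sdb, whose SR UUID matches the RAID5 entry in xe sr-list.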

WaibelIT (Author) commented:
The disk could not be recovered. I had a Citrix engineer working on both servers for about 6 hours. On one of the servers he was able to make the storage accessible again, but sadly the disk was empty. We still cannot explain how the filesystem could be destroyed on two servers at the same time. The hardware logs show nothing unusual: no power loss and no RAID error.

In case someone runs into a similar problem, here is the summary from the Citrix engineer:

Issue Description: Cannot re-attach the SR
Environmental Details: 6.2

Troubleshooting Steps followed:
- The error message found in the logs:

EXT3-fs error (device dm-1): ext3_valid_block_bitmap: Invalid block bitmap - block_group = 27, block = 885761
Apr 28 12:00:35 hskxen01 kernel: [139021.831759] journal_bmap: journal block not found at offset 31756 on dm-1
Apr 28 12:00:35 hskxen01 kernel: [139021.831776] Aborting journal on device dm-1.
Apr 28 12:00:35 hskxen01 kernel: [139021.838097] ext3_abort called.
Apr 28 12:00:35 hskxen01 kernel: [139021.838112] EXT3-fs error (device dm-1): ext3_journal_start_sb: Detected aborted journal

- The PBD plug failed with the error message "Volume Group XS-Local-**** is not available".
- Recreated the PV and restored the VG on top of it.
- The PBD plug then failed with an fsck error. The steps below fixed the issue:

dumpe2fs /dev/XSLocal-**** | grep -i "block size"   -> note down the block size.

Block size:               4096

mke2fs -S -b 4096 -v /dev/XSLocal--***

e2fsck -y -f -v -C 0 /dev/XSLocal--***

tune2fs -j /dev/XSLOCAL--***   -> to rebuild the journal.

Since we have been able to connect the storage repository, I would now proceed with closing the case.
However, if you have any questions, please feel free to let me know.
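
For anyone repeating these steps: the block size has to be passed to mke2fs exactly as dumpe2fs reports it. Here is a small sketch of pulling that number out of the dumpe2fs output. The function name fs_block_size is hypothetical, and it assumes the "Block size:" line format shown in the summary. Keep in mind that, according to the mke2fs man page, -S only rewrites the superblock and group descriptors and is meant as a last resort, so working on an image copy of the volume is the safer route where that is possible.

```shell
# Hypothetical helper: extract the numeric block size from dumpe2fs output
# read on stdin (matches the "Block size:" line case-insensitively).
fs_block_size() {
    awk -F: 'tolower($1) == "block size" {
        gsub(/[ \t]/, "", $2)   # strip the padding before the number
        print $2
        exit
    }'
}

# Usage on the host (not run here; LV path elided as in the summary):
#   bs=$(dumpe2fs -h /dev/XSLocal-**** 2>/dev/null | fs_block_size)
#   echo "block size: $bs"
```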