SeeDk

asked on

RHEL 5.5 - Creating an ext4 filesystem: how to confirm it is stable/correctly configured?

This is on a VM running RHEL 5.5.
I added a disk to this VM with 4TB of allocated space. Rebooted the OS and it shows as sdc.
Installed ext4 tools with: yum install e4fsprogs

Then configured partitions with parted:
parted /dev/sdc
mklabel gpt
mkpart primary 0GB 4398GB

'Print' shows the partition configured as GPT with the desired size.
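
A side note on the size passed to mkpart: parted reads GB as decimal gigabytes (10^9 bytes), so 4398GB ends slightly short of a full 4 TiB disk. The arithmetic below is only a sketch. The helper names are made up for illustration, and it assumes the virtual disk is exactly 4 TiB, which the post does not confirm.

```shell
#!/bin/sh
# Hypothetical helpers (not part of parted): convert a size to bytes.
gb_to_bytes()  { echo $(( $1 * 1000 * 1000 * 1000 )); }         # decimal GB
tib_to_bytes() { echo $(( $1 * 1024 * 1024 * 1024 * 1024 )); }  # binary TiB

requested=$(gb_to_bytes 4398)   # end position given to mkpart
disk=$(tib_to_bytes 4)          # assumed disk size, see note above

echo "requested end: $requested bytes"
echo "4 TiB disk:    $disk bytes"
echo "left unused:   $(( disk - requested )) bytes"
```

If the installed parted accepts percentage units, mkpart primary 0% 100% should avoid the arithmetic altogether.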

Then configure the filesystem:
mke4fs -t ext4 /dev/sdc1

Finished with no problem but then if I check with fdisk -l:
"Disk /dev/sdc doesn't contain a valid partition table"

Check with parted /dev/sdc:
"Unable to open /dev/sdc - unrecognized disk label"
So I redo the first steps to set up the partition, but then parted shows the filesystem as ext3.

Wanting to test, I move forward and mount the device.
mkdir /data
mount /dev/sdc1 /data

df -h shows it mounted with a total size of 4T - looks good.
df -T shows the filesystem as ext4 and also the total size of 4T - good.

BUT I am still seeing the same errors with fdisk -l and parted:
fdisk -l: "Disk /dev/sdc doesn't contain a valid partition table"
parted /dev/sdc: "Unable to open /dev/sdc - unrecognized disk label"

I am able to read/write data on the disk but am worried something is not correct about this configuration because these error messages keep appearing.
SeeDk

ASKER

I was testing by creating dummy data on the drive and the server froze up. Rebooted and the drive was not mounted. I thought it might only be because it wasn't added to /etc/fstab.
Tried mounting and got this error:
special device /dev/sdc1 does not exist
fdisk and parted showing same errors as before.
In parted I did again:
mklabel gpt
mkpart primary 0GB 4398GB

After that:
print shows the filesystem as ext3
fdisk -l shows System as EFI GPT

Then tried mounting:
"wrong fs type, bad option, bad superblock on /dev/sdc1"

Tried mounting again and the server froze again...needed another reboot.

Will probably need to reformat sdc but not sure what is the right way to do it.
SOLUTION
Scott Silva

ASKER

Made a new VM with 5.5 to test - same issue. Upgraded it to 5.11 - seems to work.
It's still odd that the filesystem is shown as ext3 under parted -> print.

Tested writing a large file to the disk by running

yes this is a test > test.txt

seemed to be running fine until it got over 150GB at which point I noticed the partition had been lost in the OS. parted -> print returned the error of "Disk /dev/sdc doesn't contain a valid partition table"

A reboot and then I saw the same issue as before - "wrong fs type, bad option, bad superblock on /dev/sdc1"

At least the server didn't freeze up - not that it matters since this is a test server.

Is there some sort of limit on how big a file can be on an ext4 filesystem and/or how much I/O is permitted?

Either way, it seems an update is needed, so this is a good time to move to the latest RHEL. Hoping it is more stable there.

I'm going to assume they have an older version of parted?
Red Hat lists RHEL 5 as certified up to 16TB on ext4.
Does your VM system have any filesystem limits? Your storage?

ASKER

I saw that certification as well which is why I thought this would work.
The Storage is 8TB. VMs and Storage are fine going higher than that as I have done it on Windows machines.
Parted version installed is 1.8.1.
Running yum -y install parted or yum update parted says this is the latest version. Maybe this is the latest version supported on RHEL 5.11? This system is subscribed via RHN Classic, though.
On a RHEL 7 server I have updated it to 3.1.

You probably know this already, I'm just repeating the steps :)

Make sure it's supported:
https://access.redhat.com/articles/rhel-limits

Make sure the disk was healthy (during boot):
dmesg | less
fsck -n /dev/sdc1   (with the filesystem unmounted; -n only reports problems and changes nothing)

Make sure the disk is mapped/handled correctly
ls -lart /dev/disk/by-path
cat /proc/diskstats
cat /sys/block/sd*/*

Make sure there are no other failures causing the freeze
view /var/log/messages




Make sure the disk is mounted correctly / handled correctly

1. Find the desired partition: ls -lart /dev/disk/by-path/
[root@sux12 home]# ls -lart /dev/disk/by-path/
total 0
drwxr-xr-x. 5 root root 100 Jul  4  2016 ..
drwxr-xr-x. 2 root root 280 Jul  4  2016 .
lrwxrwxrwx. 1 root root  10 Jul  4  2016 xen-vbd-51760 -> ../../xvdd
lrwxrwxrwx. 1 root root  10 Jul  4  2016 xen-vbd-51776 -> ../../xvde
lrwxrwxrwx. 1 root root  10 Jul  4  2016 xen-vbd-51744 -> ../../xvdc
lrwxrwxrwx. 1 root root  10 Jul  4  2016 xen-vbd-51728 -> ../../xvdb
lrwxrwxrwx. 1 root root  11 Jul  4  2016 xen-vbd-51776-part1 -> ../../xvde1
lrwxrwxrwx. 1 root root  11 Jul  4  2016 xen-vbd-51760-part1 -> ../../xvdd1
lrwxrwxrwx. 1 root root  10 Jul  4  2016 xen-vbd-51712 -> ../../xvda
lrwxrwxrwx. 1 root root  11 Jul  4  2016 xen-vbd-51712-part3 -> ../../xvda3
lrwxrwxrwx. 1 root root  11 Jul  4  2016 xen-vbd-51712-part2 -> ../../xvda2
lrwxrwxrwx. 1 root root  11 Jul  4  2016 xen-vbd-51728-part1 -> ../../xvdb1
lrwxrwxrwx. 1 root root  11 Jul  4  2016 xen-vbd-51744-part1 -> ../../xvdc1
lrwxrwxrwx. 1 root root  11 Jul  4  2016 xen-vbd-51712-part1 -> ../../xvda1

If you created the partition correctly, udev should have created a device node called sdc1, which is the first partition on SCSI disk c (the third disk detected during boot).

You can then use tune2fs -l /dev/sdc1 to (re)view the filesystem configuration.

[root@sux12 home]# tune2fs -l /dev/xvdb1
tune2fs 1.41.12 (17-May-2010)
Filesystem volume name:   <none>
Last mounted on:          <not available>
Filesystem UUID:          1ac7d608-ceda-4a32-90b7-0a411f3e8599
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags:         signed_directory_hash
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              3276800
Block count:              13107024
Reserved block count:     655351
Free blocks:              7616815
Free inodes:              3276760
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      1020
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Filesystem created:       Tue Aug 13 13:19:59 2013
Last mount time:          Mon Jul  4 11:38:07 2016
Last write time:          Mon Jul  4 11:38:07 2016
Mount count:              2
Maximum mount count:      36
Last checked:             Fri Mar  4 09:37:50 2016
Check interval:           15552000 (6 months)
Next check after:         Wed Aug 31 10:37:50 2016
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      4eaa0419-f453-4560-bac7-3f4e018fcb63
Journal backup:           inode blocks
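
One thing worth checking in that output is the "Filesystem features:" line. As far as I know, a filesystem created with the default ext4 profile normally lists extent there, while the sample listing above (taken from a different machine) shows an ext3-style feature set. A minimal sketch of that check follows; has_feature is a made-up helper name and the sample line is copied from the listing.

```shell
#!/bin/sh
# Hypothetical helper: reads `tune2fs -l` output on stdin and succeeds
# when the feature named in $1 is on the "Filesystem features:" line.
has_feature() {
    awk -v want="$1" '
        /^Filesystem features:/ {
            for (i = 3; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Sample input copied from the listing above (no real device needed).
sample='Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file'

if printf '%s\n' "$sample" | has_feature extent; then
    echo "extent feature present"
else
    echo "extent feature missing"
fi
```

On a real system the input would come from the tool itself, for example: tune2fs -l /dev/sdc1 | has_feature extent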

Next you need to make sure the disk is mounted during boot.
To do this you need to configure the file /etc/fstab

In this configuration it's wise to use either a volume label or the filesystem UUID to mount the desired disk, because the kernel names devices (sda, sdb, ...) in the order the disks are presented to the OS. If that order changes for any reason (easily tested with two USB disks), the device names change too, with all kinds of undesired effects.

To assign a volume label (more readable in relation to the UUID) use the command
[root@sux12 home]# tune2fs -L /mnt/example /dev/xvdb1
tune2fs 1.41.12 (17-May-2010)

You can review the configuration by repeating the tune2fs -l /dev/sdc1 command.
tune2fs 1.41.12 (17-May-2010)
Filesystem volume name:   /mnt/example
Last mounted on:          <not available>
Filesystem UUID:          1ac7d608-ceda-4a32-90b7-0a411f3e8599

Then add the following line to /etc/fstab. It mounts the device with the label /mnt/example on the directory /mnt/example, expects an ext4 filesystem, uses the default mount options, excludes the filesystem from dump, and skips the boot-time filesystem check. Replace the final 0 with 2 to have fsck check this disk during boot; 2 is the value for disks that do not hold the root partition.
LABEL=/mnt/example              /mnt/example                    ext4    defaults        0 0
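
The UUID variant mentioned earlier can be written the same way. This is only a sketch: fstab_entry is a made-up helper, and the UUID is the one from the sample tune2fs listing, not from the asker's disk.

```shell
#!/bin/sh
# Hypothetical helper: print an fstab entry that mounts by UUID.
# Arguments: UUID, mount point, filesystem type, fsck pass number.
fstab_entry() {
    printf 'UUID=%s\t%s\t%s\tdefaults\t0 %s\n' "$1" "$2" "$3" "$4"
}

# UUID copied from the tune2fs listing in this thread. On a real system
# it could be read with: blkid -s UUID -o value /dev/sdc1
fstab_entry 1ac7d608-ceda-4a32-90b7-0a411f3e8599 /mnt/example ext4 2
```

The printed line is meant to be reviewed and then appended to /etc/fstab by hand; the sketch itself never writes to that file.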

Reapply the fstab using:
mount -a

Check that the drive is mounted as expected using:
[root@sux12 home]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/xvda2            9.8G  3.7G  5.7G  39% /
tmpfs                 5.9G     0  5.9G   0% /dev/shm
/dev/xvda1             99M   50M   45M  53% /boot
/dev/xvde1            148G   53G   88G  38% /u01
/dev/xvdb1             50G   22G   25G  47% /mnt/example
/dev/xvdc1             74G   56G   15G  80% /u02


ASKER

I made a VM with the newest version of RHEL (7.3) and am still having this problem.
The only difference is that now parted does report the filesystem as ext4.
I have to format it as ext4 twice to be able to mount it - the first time I get that superblock error.
But when I write any data at all, the disk gets corrupted and parted shows the partition is lost.

This led me to suspect maybe there was a physical problem with the disks or VM configuration but I tested the same disks on a Windows host and it worked perfectly. Mounted with no problems and wrote data to almost full capacity.

Everything I check online says this should be straightforward, and I don't see any mention of this specific issue. Maybe it is something with the VM config? The VM machine version is 8 and ESXi is 5.5, which should support this. The total disk size I am trying to mount is 4TB.

Did you try creating a partition using fdisk instead of parted?

fdisk /dev/sdc
p (print partition tables)
d (if empty delete partitions)
n (create new partition)
p (primary partition)
.. consume whole disk
w (write partition table)
exit (force resync)

mkfs.ext4 /dev/sdc1 -L [your label]

tune2fs -l /dev/sdc1

?

ASKER

fdisk doesn't support disks above 2TB. This is a 4TB disk, and all the guides I saw online say to use parted.
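
The 2TB figure follows from the MBR layout that classic fdisk writes: each partition entry stores its start and length as 32-bit sector counts. A quick arithmetic sketch, assuming 512-byte logical sectors (which matches the kernel log later in this thread):

```shell
#!/bin/sh
# An MBR partition entry stores start and length as 32-bit sector counts.
sector_size=512                     # logical sector size assumed here
max_sectors=$(( (1 << 32) - 1 ))    # largest value a 32-bit field can hold
max_bytes=$(( max_sectors * sector_size ))

echo "max sectors per MBR partition: $max_sectors"
echo "max bytes per MBR partition:   $max_bytes"
echo "that is $(( max_bytes / (1 << 30) )) GiB, rounded down"
```

GPT uses 64-bit sector addresses, which is why the guides point to parted for a disk of this size.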

Oh right, GPT. Forgot about that.

can you:

1. just create a partition with parted;
2. then : dd if=/dev/sdc of=/home/[user]/ptable bs=512 count=1
3. then : file /home/[user]/ptable
4. then share the result?

[root@support ~]# dd if=/dev/xvdb of=./ptable bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000243239 s, 2.1 MB/s
[root@support ~]# file ./ptable
./ptable: x86 boot sector; partition 1: ID=0x83, starthead 1, startsector 63, 104856192 sectors, extended partition table (last)\011, code offset 0x0


ASKER

/home/support/ptable: x86 boot sector; partition 1: ID=0xee, starthead 0, startsector 1, 4294967295 sectors, extended partition table (last)\011, code offset 0x0
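
For reference, ID=0xee is the type used by the protective MBR that sits in front of a GPT, and 4294967295 is the largest value the 32-bit sector field can hold, so this output is what a healthy GPT disk larger than 2 TiB would be expected to show. The same two facts can be read straight from the dumped sector without file. The helper names below are made up, and the demo runs on a synthetic sector instead of a real disk.

```shell
#!/bin/sh
# Print the type byte of the first MBR partition entry as two hex digits.
# The entry starts at byte 446; the type is its fifth byte (offset 450).
mbr_part_type() {
    od -An -tx1 -j450 -N1 "$1" | tr -d ' \n'
}

# Print the two-byte boot signature at offset 510 (a valid MBR has 55aa).
mbr_signature() {
    od -An -tx1 -j510 -N2 "$1" | tr -d ' \n'
}

# Demo on a synthetic 512-byte sector, so no real disk is touched.
tmp=$(mktemp)
dd if=/dev/zero of="$tmp" bs=512 count=1 2>/dev/null
printf '\356' | dd of="$tmp" bs=1 seek=450 conv=notrunc 2>/dev/null      # 0xee
printf '\125\252' | dd of="$tmp" bs=1 seek=510 conv=notrunc 2>/dev/null  # 0x55 0xaa
echo "type=$(mbr_part_type "$tmp") signature=$(mbr_signature "$tmp")"
rm -f "$tmp"
```

Pointed at the ptable file from the dd step, mbr_part_type should print ee while the protective MBR is intact.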

FYI, device is sdb now because I am testing on another server where it is the second disk.
Also, I tested using gdisk to format. It worked, I was able to write data - in fact was able to fill disk with data to almost full capacity with fallocate.
Rebooted, and the partition and data were gone again.

Can you repeat the dd step after a reboot on the same device and share the result?

ASKER

home/support/ptable: x86 boot sector; partition 1: ID=0xee, starthead 0, startsector 1, 4294967295 sectors, extended partition table (last)\011, code offset 0x0

Looks the same.

Well, then the partition isn't lost (phew).

Can you share the result of the command ls -lart /dev/disk/by-path and tell me which disk we are troubleshooting?

ASKER

Yeah, it seems the partition is not lost unless I try writing data to it first.
I just formatted it as ext4 again and wrote data to it. Then rebooted and partition is lost:

/home/support/ptable: data

This is the result of ls -lart now:
brw-rw----. 1 root disk 8, 16 May 10 14:12 /dev/sdb

Can you perform a hexdump on the data?

hexdump -C /home/support/ptable

ASKER

Do you mean the data I wrote to the disk? It was just files created with fallocate and some small test.txt files (this is a test, hello, etc).

And I can't access it now anyway since the partition is gone. I don't know how I would go about recovering it. Not that it matters, they were just test files.

Lol no :)

From what I understand:

1. You create a GPT partition, then format it as ext4.
2. Then you write some data to the disk (assuming it's mounted).
3. Then you reboot the machine.
4. Then the first sector (assuming you are using the dd instruction) is reported by file as: data

Please perform a hexdump on the file that file reports as "data" ;-)

(By the way, how are you using fallocate, and if you call it from code, what offset are you using?)

ASKER

Oh ok, this is the result:

00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200
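
That dump means all 512 bytes of the first sector are zero, so the protective MBR itself has been overwritten. The same check can be scripted to rerun after each test step. A small sketch, where first_sector_is_zero is a made-up name and the demo uses a scratch file:

```shell
#!/bin/sh
# Succeeds when the first 512 bytes of the given file or device are zero.
first_sector_is_zero() {
    # od -v prints every byte; after deleting spaces, newlines and the
    # digit 0, nothing is left only if every byte was 00.
    [ -z "$(od -An -v -tx1 -N512 "$1" | tr -d ' \n0')" ]
}

# Demo on a scratch file instead of a real disk.
tmp=$(mktemp)
dd if=/dev/zero of="$tmp" bs=512 count=1 2>/dev/null
if first_sector_is_zero "$tmp"; then
    echo "first sector is blank"
fi
rm -f "$tmp"
```

Running it against the disk right after partitioning, after writing data, and after a reboot would show at which step the sector gets wiped.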

My fallocate usage is nothing unusual, I think:

fallocate -l 1000G test.txt
fallocate -l 600G test2.txt
fallocate -l 500G test2.txt

and so on
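
As far as I know, fallocate reserves the space without writing any file contents, so a test with real data plus checksums gives a stronger signal about whether written data survives a reboot. A rough sketch follows, assuming GNU coreutils with sha256sum is available; the function names, file names and sizes are made up for illustration.

```shell
#!/bin/sh
# Write test files with real content and record their checksums.
# Arguments: target directory, number of files, size of each file in MiB.
write_test_files() {
    dir=$1; count=$2; mib=$3
    i=1
    while [ "$i" -le "$count" ]; do
        dd if=/dev/urandom of="$dir/test$i.bin" bs=1048576 count="$mib" 2>/dev/null
        i=$(( i + 1 ))
    done
    ( cd "$dir" && sha256sum test*.bin > MANIFEST.sha256 )
}

# Re-check the files against the manifest, for example after a reboot.
verify_test_files() {
    ( cd "$1" && sha256sum -c MANIFEST.sha256 > /dev/null )
}

# Demo in a scratch directory; on the real system the target would be
# the mount point of the new filesystem, such as /data.
scratch=$(mktemp -d)
write_test_files "$scratch" 3 1
verify_test_files "$scratch" && echo "all files intact"
rm -rf "$scratch"
```

After a reboot only verify_test_files needs to be run against the mount point; a non-zero exit status means at least one file changed or went missing.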

ASKER

I ran cat /var/log/messages | grep sdb* and got these errors. Am I overlooking something when partitioning?

May 10 14:00:11 localhost udisksd[2638]: Error probing device: Error sending ATA command IDENTIFY PACKET DEVICE to /dev/sr0: ATA command failed: error=0x01 count=0x02 status=0x50 (g-io-error-quark, 0)
May 10 14:00:11 localhost udisksd[2638]: Acquired the name org.freedesktop.UDisks2 on the system message bus
May 10 14:01:40 localhost kernel: sd 0:0:1:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
May 10 14:01:40 localhost kernel: sdb: sdb1
May 10 14:02:42 localhost kernel: sd 0:0:1:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
May 10 14:02:42 localhost kernel: sdb: sdb1
May 10 14:02:49 localhost kernel: EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)
May 10 14:02:51 localhost kernel: sdb1: WRITE SAME failed. Manually zeroing.
May 10 14:06:54 localhost kernel: EXT4-fs error (device sdb1): ext4_mb_generate_buddy:757: group 16, block bitmap and bg descriptor inconsistent: 32768 vs 24544 free clusters
May 10 14:06:54 localhost kernel: EXT4-fs error (device sdb1): ext4_mb_generate_buddy:757: group 25, block bitmap and bg descriptor inconsistent: 32768 vs 31359 free clusters
May 10 14:06:54 localhost kernel: EXT4-fs error (device sdb1): ext4_mb_generate_buddy:757: group 27, block bitmap and bg descriptor inconsistent: 32768 vs 31359 free clusters
May 10 14:06:54 localhost kernel: EXT4-fs error (device sdb1): ext4_mb_generate_buddy:757: group 32, block bitmap and bg descriptor inconsistent: 32768 vs 24544 free clusters
May 10 14:06:54 localhost kernel: EXT4-fs error (device sdb1): ext4_mb_generate_buddy:757: group 48, block bitmap and bg descriptor inconsistent: 32768 vs 24544 free clusters
May 10 14:06:54 localhost kernel: EXT4-fs error (device sdb1): ext4_mb_generate_buddy:757: group 49, block bitmap and bg descriptor inconsistent: 32768 vs 31359 free clusters
May 10 14:06:54 localhost kernel: EXT4-fs error (device sdb1): ext4_mb_generate_buddy:757: group 64, block bitmap and bg descriptor inconsistent: 32768 vs 24544 free clusters
May 10 14:06:54 localhost kernel: EXT4-fs error (device sdb1): ext4_mb_generate_buddy:757: group 80, block bitmap and bg descriptor inconsistent: 32768 vs 24544 free clusters
May 10 14:06:54 localhost kernel: EXT4-fs error (device sdb1): ext4_mb_generate_buddy:757: group 81, block bitmap and bg descriptor inconsistent: 32768 vs 31359 free clusters
May 10 14:06:54 localhost kernel: EXT4-fs error (device sdb1): ext4_mb_generate_buddy:757: group 96, block bitmap and bg descriptor inconsistent: 32768 vs 24544 free clusters
May 10 14:09:04 localhost kernel: Command line: BOOT_IMAGE=/vmlinuz-3.10.0-514.16.1.el7.x86_64 root=/dev/mapper/rhel-root ro crashkernel=auto rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap rhgb quiet net.ifnames=0 biosdevname=0 LANG=en_US.UTF-8
May 10 14:09:04 localhost kernel: Kernel command line: BOOT_IMAGE=/vmlinuz-3.10.0-514.16.1.el7.x86_64 root=/dev/mapper/rhel-root ro crashkernel=auto rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap rhgb quiet net.ifnames=0 biosdevname=0 LANG=en_US.UTF-8
May 10 14:09:04 localhost kernel: sd 0:0:0:0: [sda] 31457280 512-byte logical blocks: (16.1 GB/15.0 GiB)
May 10 14:09:04 localhost kernel: sd 0:0:0:0: [sda] Write Protect is off
May 10 14:09:04 localhost kernel: sd 0:0:1:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
May 10 14:09:04 localhost kernel: sd 0:0:0:0: [sda] Cache data unavailable
May 10 14:09:04 localhost kernel: sd 0:0:0:0: [sda] Assuming drive cache: write through
May 10 14:09:04 localhost kernel: sd 0:0:1:0: [sdb] 6442450944 512-byte logical blocks: (3.29 TB/3.00 TiB)
May 10 14:09:04 localhost kernel: sd 0:0:1:0: [sdb] Write Protect is off
May 10 14:09:04 localhost kernel: sd 0:0:1:0: [sdb] Cache data unavailable
May 10 14:09:04 localhost kernel: sd 0:0:1:0: [sdb] Assuming drive cache: write through
May 10 14:09:04 localhost kernel: sd 0:0:1:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
May 10 14:09:04 localhost kernel: sda: sda1 sda2
May 10 14:09:04 localhost kernel: sd 0:0:0:0: [sda] Attached SCSI disk
May 10 14:09:04 localhost kernel: sd 0:0:1:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
May 10 14:09:04 localhost kernel: sd 0:0:1:0: [sdb] Attached SCSI disk
May 10 14:09:06 localhost kernel: Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
May 10 14:09:06 localhost kernel: sd 0:0:0:0: Attached scsi generic sg0 type 0
May 10 14:09:06 localhost kernel: sd 0:0:1:0: Attached scsi generic sg1 type 0
May 10 14:09:06 localhost kernel: XFS (sda1): Mounting V4 Filesystem
May 10 14:09:07 localhost kernel: XFS (sda1): Ending clean mount
May 10 14:09:07 localhost smartd[751]: Device: /dev/sda, opened
May 10 14:09:07 localhost smartd[751]: Device: /dev/sda, [VMware   Virtual disk     1.0 ], 16.1 GB
May 10 14:09:07 localhost smartd[751]: Device: /dev/sda, IE (SMART) not enabled, skip device
May 10 14:09:07 localhost smartd[751]: Try 'smartctl -s on /dev/sda' to turn on SMART features
May 10 14:09:07 localhost smartd[751]: Device: /dev/sdb, opened
May 10 14:09:07 localhost smartd[751]: Device: /dev/sdb, [VMware   Virtual disk     1.0 ], 3.29 TB
May 10 14:09:07 localhost smartd[751]: Device: /dev/sdb, IE (SMART) not enabled, skip device
May 10 14:09:07 localhost smartd[751]: Try 'smartctl -s on /dev/sdb' to turn on SMART features
May 10 14:09:17 localhost rhnsd[1111]: Spacewalk Services Daemon starting up, check in interval 240 minutes.
May 10 14:09:17 localhost rhnsd: Starting Spacewalk Daemon: [  OK  ]
May 10 14:09:26 localhost udisksd[2634]: udisks daemon version 2.1.2 starting
May 10 14:09:26 localhost udisksd[2634]: Error probing device: Error sending ATA command IDENTIFY PACKET DEVICE to /dev/sr0: ATA command failed: error=0x01 count=0x02 status=0x50 (g-io-error-quark, 0)
May 10 14:09:26 localhost udisksd[2634]: Acquired the name org.freedesktop.UDisks2 on the system message bus
May 10 14:12:30 localhost kernel: sd 0:0:1:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
May 10 14:32:17 localhost kernel: sd 0:0:1:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
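
A note on the filter used for this log: as a regular expression, sdb* means "sd followed by any number of b", which is why lines for sda, smartd, rhnsd and udisksd show up as well. A tighter filter might look like the sketch below. The function names are made up, and the sample lines are shortened copies from the log above.

```shell
#!/bin/sh
# Print only the ext4 error lines for one partition from a syslog stream.
# The pattern is quoted, so the shell cannot expand it as a glob, and
# grep -F matches it as a fixed string instead of a regular expression.
ext4_errors_for() {
    grep -F "EXT4-fs error (device $1)"
}

# Count the distinct block groups named in those error lines.
count_bad_groups() {
    sed -n 's/.*: group \([0-9][0-9]*\),.*/\1/p' | sort -un | wc -l | tr -d ' '
}

# Three sample lines shortened from the log above.
sample='kernel: EXT4-fs error (device sdb1): ext4_mb_generate_buddy:757: group 16, block bitmap and bg descriptor inconsistent: 32768 vs 24544 free clusters
kernel: EXT4-fs error (device sdb1): ext4_mb_generate_buddy:757: group 25, block bitmap and bg descriptor inconsistent: 32768 vs 31359 free clusters
smartd[751]: Device: /dev/sda, opened'

printf '%s\n' "$sample" | ext4_errors_for sdb1 | count_bad_groups
```

Against the real log the call would be: ext4_errors_for sdb1 < /var/log/messages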

According to Red Hat support, regarding the "WRITE SAME failed" message:

These messages are essentially informational.
If a device supports WRITE SAME, then the kernel will use that SCSI command to optimize zeroing of blocks.
If the device does not support WRITE SAME or if the operation fails, the kernel falls back to writing zeroes using WRITE commands.
It is this latter case, the WRITE SAME command failing, that causes the event to be logged. The kernel then repeats the request using normal write commands.


On the buddy messages:

While the meaning of ext4_mb_generate_buddy is understood, the root cause for the block count mismatch is not. Often this was caused by an issue in the storage hardware.

There is a known issue on RHEL 6:

On Red Hat Enterprise Linux 6, performing an offline resize after an online resize has consumed all reserved GDT blocks results in corruption of the ext4 filesystem. See the Red Hat solution "Consumption of reserved GDT blocks during an online resize results in corruption following the offline resize to an ext4 filesystem" for more information and steps to correct the issue. In that case you need to contact Red Hat support for a patch.
ASKER CERTIFIED SOLUTION

ASKER

Great, so much for an easy fix. These are new disks, but the vendor sent one that was DOA originally. Maybe there is something wrong with these disks.
I'll try to test on another VM host and see if that works better.

Sorry I can't help you out further :(

How about trying a newer VM hardware version? 5.5 should do version 10. Also, try a different SCSI emulator.

It looks like VM version 8 only supports 2TB drives reliably.

ASKER

Only VM version 8 is visible when creating new VMs on these hosts. Maybe they are missing an update or licensing for version 10.
How do I try a different scsi emulator?
This is the first time I've tried adding 1TB+ drives to Linux but have done so on Windows VMs here with no issues.

I was able to add 3TB from this same drive to a Windows VM with no problem so it is probably Linux ext4 having an issue with either the VM software or physical disks.
I was able to reproduce the same error with only 450GB of allocated space.

Are you sure you are on ESXi 5.5?
You might have to use the web client to get all the newer features, but that might only apply above 5.5.

I can't check since I am on 6.0.

I usually pick "Custom" and not "Typical" for the initial configuration.
There should be at least BusLogic and LSI Logic controller choices.

ASKER

Yeah, it is 5.5. I know it supports machine version 10 because I once converted a physical machine to a VM as version 10. But it is not among the options for new VMs.
I see what you're saying. I haven't been paying attention to this screen, honestly, and the default is "VMware Paravirtual", so I think that's what I have been using. The other three options are:

BusLogic Parallel (not recommended for this guest OS)
LSI Logic Parallel
LSI Logic SAS

Which do you recommend using?
SOLUTION

ASKER

I can live with 2TB - the problem is this one is losing partitioning with even 450GB disks!

Anyway, I created another RHEL 7.3 VM (still machine v8) on a separate host, using the same paravirtual driver. Unfortunately I only have 2TB worth of spare space to test with there.
I tested creating two 1TB drives, one at a time and rebooting. No issues.
Filled them with 100% data, added to fstab rebooted and still looks good. Rebooted a few more times for good measure - no data loss.

So it looks like something outside the VM is causing this issue. I'll have to check the firmware on the RAID controllers as well as the disks, but my suspicion is on the disks, since they are not Dell-branded (we use all Dell) and this vendor has been giving me lots of grief with disks lately.

ASKER

Turns out I had a RAID0 2TB datastore configured on the same VM host using the same PERC controller. It was unused since, of course, RAID0 is no good for anything permanent.
So I used this as a test on the same RHEL7.3 VM having these partition loss issues.
Set it up with 2 disks allocated 1TB each from that RAID 0 datastore. Filled them 100% with data using fallocate.
Rebooted, shutdown...multiple times and didn't see this strange partition loss!
Think it's safe to say the problem is the hard disks...I am getting replacements ASAP.
Thank you everyone for your help.

You're welcome, good luck replacing the DOAs.