emjay180
asked on
RHEL 5.6 VMs on vSphere 4 going read-only and unable to login
Hello, we have several vSphere VMs running RHEL 5.6 that run fine, then all of a sudden the file system goes read-only and/or the VM won't allow logins of any type until it is rebooted. We have upgraded the BIOS so far with no effect, and are wondering if anyone has heard of this happening and has any suggestion as to the root cause. Could it be a RHEL/VMware combination issue, or something on either side? Thank you for your help!
Usually, when RHEL falls into read-only mode, it is because it lost access to its storage for a certain amount of time. If you use shared storage (iSCSI or NFS), try using two NICs to access your storage (multipathing with iSCSI); with NFS, try two NICs active/active with IP hash load balancing. If you have local storage, try moving the VM to local storage to rule out a network timeout.
ASKER
Yes, it's EMC VNX storage, and they may be vSphere 4.1 U2 boxes; I just put 4 because I know it's not 5.
Does the degraded storage connection cause the login lockout for all users too? It always comes back when the server is rebooted, but it happens every day.
Is there any way to adjust the threshold so it doesn't trigger "read-only" mode so easily?
Thank you for your help!
Please see this KB from VMware:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=51306
If your RHEL guest lost its storage, the KB won't help you. CentOS and RHEL go read-only as a means of preserving the integrity of the OS. As per the KB, you can try the following command:
mount -o remount /
Look at /var/log/vmkernel from the CLI of the ESX host for issues.
Do you have a failover path?
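On the "threshold" question: a Linux guest exposes a per-disk SCSI command timeout, in seconds, at /sys/block/sdX/device/timeout. The following is only a minimal sketch for reading it, assuming Python 3 is available on the guest (RHEL 5 ships an older interpreter, so it may need adapting). The function names and the 180-second comparison value are illustrative assumptions, not something stated in this thread or taken from the KB.

```python
import glob
import os


def read_scsi_timeouts(sys_block="/sys/block"):
    """Return {device: timeout_seconds} for every sd* disk under sys_block."""
    timeouts = {}
    pattern = os.path.join(sys_block, "sd*", "device", "timeout")
    for path in sorted(glob.glob(pattern)):
        # path looks like /sys/block/sda/device/timeout
        device = path.split(os.sep)[-3]
        with open(path) as handle:
            timeouts[device] = int(handle.read().strip())
    return timeouts


def disks_below(timeouts, minimum):
    """List the devices whose timeout is shorter than `minimum` seconds."""
    return sorted(dev for dev, secs in timeouts.items() if secs < minimum)


if __name__ == "__main__":
    # 180 is an illustrative target only (assumption, not from this thread).
    current = read_scsi_timeouts()
    for dev, secs in sorted(current.items()):
        print("%s: %d s" % (dev, secs))
    print("below 180 s:", disks_below(current, 180))
```

Writing a new number to the same sysfs file as root changes the timeout until the next reboot. A longer timeout only gives the guest more patience during a storage stall; it does not fix the stall itself.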
Are you using NFS or iSCSI?
If using iSCSI, check the errors in your vmkernel log and follow the suggestions here:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1030381
You may also be suffering from this because of poor storage performance. If there is a long wait for storage, it will look like a loss of or disconnect from storage to the Linux machine, which will then mark the drive(s) read-only to prevent corruption. Look at your storage performance monitors.
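To catch the moment a guest flips to read-only, one option is to poll /proc/mounts, which reflects the kernel's current view (the output of `mount` can be stale once the root file system can no longer be written). This is a rough sketch, assuming Python 3 on the guest; the function name and the default list of file system types are illustrative assumptions.

```python
import os


def read_only_mounts(mounts_text, fs_types=("ext2", "ext3", "ext4")):
    """Return mount points of the given file system types mounted read-only.

    mounts_text is the content of /proc/mounts: one mount per line, with
    whitespace-separated fields (device, mount point, type, options, ...).
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        # Match the exact "ro" option, not substrings like errors=remount-ro.
        if fs_type in fs_types and "ro" in options.split(","):
            found.append(mount_point)
    return found


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as handle:
        for mount_point in read_only_mounts(handle.read()):
            print("read-only:", mount_point)
```

Run from cron or a monitoring agent, the timestamp of the first hit can then be lined up against the vmkernel log and the array's performance counters.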
ASKER
Thank you for all the comprehensive ideas! I'll get to work trying to shake something out from the processes you've suggested.
Hi
Besides adding a second NIC to the iSCSI network (and using port binding) if it is iSCSI, or a second FC card in the host, to prevent these issues, most of the time a simple restart of the VM fixes these problems in Linux.
But you need to describe your VMware and storage environment in more detail so we can understand it and help find the best way to prevent this.
Between VMware and storage it is always a good option to have everything highly available: double cards, double switches (FC or LAN), and connections always balanced between cards and switches (card 1 port 1 to switch 1, card 1 port 2 to switch 2).
Also try checking the logs from the EMC; you should see some disconnections and reconnections in them.
Some screenshots of the vSwitch, networking, and storage adapter configurations would help.
Hope this helps
Jail
ASKER
Hello, the answer seems to have been upgrading VMware Tools to the latest version (vmwaretools 8.6.5 build-731933). It has been operating well without dropping into read-only mode for several days now.
ASKER
The final actual solution was to upgrade VMware Tools. But this was the best troubleshooting overview.
Why aren't you running 4.1 U2?