emjay180

RHEL 5.6 VMs on vSphere 4 going read-only and unable to login

Hello, we have several vSphere VMs running RHEL 5.6 that run fine, then all of a sudden the file system goes read-only and/or the VM won't allow logins of any type until it is rebooted. We have upgraded the BIOS so far with no effect, and are wondering if anyone has heard of this happening and has any suggestions as to the root cause. Could it be a RHEL/VMware combination issue, or something specific to either side? Thank you for your help!
IanTh

Is the file system local, or on iSCSI/NFS?
Why aren't you running 4.1 U2?
Usually, when RHEL falls into read-only mode, it is because it lost access to its storage for a certain amount of time. If you use shared storage (iSCSI or NFS), try using two NICs to access it: multipathing for iSCSI, or two NICs active/active with IP hash load balancing for NFS. If you have local storage, try moving that VM to local storage to rule out a network timeout.
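To confirm from inside the guest which file systems have actually dropped to read-only, something like the following could be used. This is only a sketch: the function name `list_ro_mounts` is made up for illustration, and it assumes the standard `/proc/mounts` layout (device, mount point, type, options).

```shell
# Print the mount point of every file system currently mounted read-only.
# Reads a file in /proc/mounts format; defaults to /proc/mounts itself.
# Note: an option such as "errors=remount-ro" is NOT treated as read-only,
# only the exact option "ro" counts.
list_ro_mounts() {
    mounts_file="${1:-/proc/mounts}"
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }' "$mounts_file"
}
```

Calling `list_ro_mounts` with no argument checks the running system. If `/` shows up in the output, the root file system has been remounted read-only.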
emjay180

ASKER

Yes, it's EMC VNX storage, and they may be vSphere 4.1 U2 boxes; I just put 4 because I know it's not 5.

Does the degraded storage connection cause the login lockout for all users too? It always comes back when the server is rebooted, but it happens every day.

Is there any way to adjust the threshold so it doesn't trigger "read-only" mode so easily?

Thank you for your help!
Please see this KB article from VMware:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=51306

If your RHEL guest has actually lost its storage, the KB won't help you. CentOS and RHEL remount the file system read-only as a means of preserving its integrity. As per the KB, you can try the following command:
mount -o remount /
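
Before remounting, it may be worth checking what the guest kernel logged just before the file system went read-only. A rough sketch follows; the function name `find_ro_events` is hypothetical, and the search patterns are only the ext3 and block-layer messages I would expect to see, so adjust them to whatever your logs actually contain.

```shell
# Filter kernel log lines that typically accompany a read-only remount.
# With a file argument it scans that file; with no argument it reads stdin,
# so it can be fed from dmesg.
find_ro_events() {
    grep -E -i 'remounting filesystem read-only|aborting journal|I/O error' "$@"
}
```

For example, `dmesg | find_ro_events` or `find_ro_events /var/log/messages`. Bear in mind that once the file system holding `/var/log` is read-only, syslog cannot write to it, so the `dmesg` output is likely to be the more complete of the two.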


Look at /var/log/vmkernel from the CLI of the ESX host for issues.

Do you have a failover path?
Are you using NFS or iSCSI?

If you are using iSCSI, check the errors in your vmkernel log and follow the suggestions here:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1030381
You may also be suffering from this because of poor storage performance. If the Linux machine has to wait a long time for storage, the delay looks to it like a loss of or disconnect from storage, and it will then mark the drive(s) as read-only to prevent corruption. Look at your storage performance monitors.
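On the question of adjusting the threshold: one guest-side setting that governs how long Linux waits on a stalled disk is the per-disk SCSI command timeout exposed in sysfs. The sketch below only reads the current values. It assumes the virtual disks appear as `sd*` devices, and the function name `show_disk_timeouts` is made up for illustration.

```shell
# Print "<disk> <timeout in seconds>" for each sd* disk.
# Takes an optional sysfs block directory; defaults to /sys/block.
show_disk_timeouts() {
    sys_block="${1:-/sys/block}"
    for f in "$sys_block"/sd*/device/timeout; do
        # Skip if the glob matched nothing or the file is unreadable.
        [ -r "$f" ] || continue
        disk=${f#"$sys_block"/}   # strip the leading directory
        disk=${disk%%/*}          # keep only the disk name, e.g. sda
        printf '%s %s\n' "$disk" "$(cat "$f")"
    done
}
```

A value can be raised at runtime as root with, for example, `echo 180 > /sys/block/sda/device/timeout`. The 180-second figure is the one I have seen recommended for Linux guests on VMware, but treat it as an assumption and check it against current VMware guidance. The change does not persist across reboots unless it is also set through a udev rule or a startup script.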
ASKER CERTIFIED SOLUTION
Mysidia
Thank you for all the comprehensive ideas!  I'll get to work trying to shake something out from the processes you've suggested.
Luciano Patrão
Hi,

To prevent these issues you can add a second NIC to the iSCSI network (and use port binding) if it is iSCSI, or a second FC card to the host. That said, most of the time a simple restart of the VM fixes these problems in Linux.

But you need to describe your VMware and storage environment in more detail so that we can understand it and suggest the best way to prevent this.

With VMware and storage, it is always a good option to make everything highly available: dual cards, dual switches (FC or LAN), and connections balanced between cards and switches (card 1 port 1 to switch 1, card 1 port 2 to switch 2).

Also try checking the logs on the EMC; you should see some disconnections and reconnections in them.

Some screenshots of the vSwitch, networking, and storage adapter configurations would help.

Hope this helps.

Jail
Hello, the answer seems to have been upgrading VMware Tools to the latest version (vmwaretools 8.6.5 build-731933). It has been operating well without dropping into read-only mode for several days now.
Final actual solution was to upgrade VMtools.  But, this was best troubleshooting overview.