Finding the cause of error of a virtual machine that has stopped responding

Dear Experts,

Recently I had an issue with the Exchange Server VM, which suddenly stopped responding
(I could not access OWA, and BlackBerry mobiles could not get mail via POP3).

At the time I was connected to vCenter 6.0 via the vSphere Client, and I tried to open the console for this VM.
(Please note that all the other VMs running on the same ESXi host were working flawlessly.)
There was a message at the top of the console regarding a pipe connection, and the console screen was black and unresponsive.
Luckily, after I restarted the VM, Exchange Server came back up successfully.

Q1) Where should I look to find what actually led to this crash? (Was it a Windows error or a vSphere one?)
Q2) Should I check the ESXi host or only the VM, since it was the only one that froze?
Q3) Is there any third-party tool for troubleshooting, as an alternative to the built-in options of vSphere?

Thanks in advance,
mamelas Asked:

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Q1. Check the events in the OS event log, and check Tasks and Events in the vSphere Client.

Q2. Check the VM logs in the VM folder.

Q3. You could use VMware vSphere Health advisors from https://www.opvizor.com/ or Sonar https://sonar-raas.com/.

Do you have VMware Tools installed?

You should be using the VMXNET3 interface, not the E1000 interface.

Have you allocated too many vCPUs (sockets)?

Have you allocated too much memory?

Or, for either of the above, too little?
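
For the VM-log check in Q2, a rough way to narrow a large log down to candidate lines is a plain keyword scan. The sketch below is only an illustration: the keyword list and the function name `find_suspect_lines` are hypothetical, and it assumes nothing about the log format beyond it being line-oriented text.

```python
from typing import Iterable, List, Sequence, Tuple

# Hypothetical starting keywords; adjust them to whatever you are hunting for.
DEFAULT_KEYWORDS = ("error", "warning", "fail", "timeout")


def find_suspect_lines(
    lines: Iterable[str],
    keywords: Sequence[str] = DEFAULT_KEYWORDS,
) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line containing any keyword.

    Matching is case-insensitive; line numbers start at 1.
    """
    wanted = [keyword.lower() for keyword in keywords]
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in wanted):
            hits.append((number, line.rstrip("\r\n")))
    return hits
```

An open file object can be passed directly as `lines`, for example `find_suspect_lines(open(path, encoding="utf-8", errors="replace"))`, where `path` points at a log copied out of the VM folder.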
Zacharia Kurian, Administrator - Data Center & Network, Commented:
Checklist:

1. Update VMware Tools.
2. Make sure to upgrade the Default VM Compatibility Level (VMX) to the latest.
3. Make sure to use VMXNET3.
4. Check the event logs on your Exchange server for a closer clue.
5. Make sure you have enough space in your partitions.
6. Check your storage.
7. Check your Exchange health.
8. Make sure to update to the latest RUs/CUs.
9. Check for compatibility issues with third-party software, such as AV or backup agents, used on your Exchange server (if any).
10. Make sure to patch your OS too.

NB: Before executing any of the above, make sure to take a complete, healthy backup of your Exchange server.
Zac.
mamelas (Author) Commented:
Dear Experts,

Thank you both for your replies!

Re your above questions/recommendations:

The Exchange VM has VMware Tools installed, 4 vCPUs, 16 GB RAM, the latest Windows updates, and the latest Exchange CU. The virtual machine version of the VM is 11, and the only extra application on this server is the Kaspersky Antivirus Light Agent. Lastly, there is more than enough space on the VMDKs for this server.

-      I downloaded the VM log file, but it was impossible to read because the text inside it was not formatted (all the text is on one line, with no line breaks). Is there any other way, or any software, that would help me open and read the logs?

-      Opvizor and Sonar-raas seem to be cloud based. Is there any application that I could install locally to produce a health report of my VM infrastructure?

-      The current adapter interface is E1000E. Could this cause any problems? Why should I use VMXNET3, and what is the difference between them?

Sorry for my long post.
Zacharia Kurian, Administrator - Data Center & Network, Commented:
VMXNET3 is the third-generation paravirtualized NIC from VMware. It includes new features that are not in the Enhanced VMXNET (VMXNET 2) adapter. The main features are:

VLAN off-loading
Large TX/RX ring sizes (configured from within the virtual machine)
MSI/MSI-X support (subject to guest operating system kernel support)
Receive Side Scaling (supported in Windows 2008 when explicitly enabled through the device's Advanced configuration tab)
IPv6 check-sum and TCP Segmentation Offloading (TSO) over IPv6

So you had better change it to VMXNET3. By the way, have you checked the Windows event logs for a clue?

Zac.
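
To see which adapter type a VM is configured with without opening the client, one option is to read the `ethernetN.virtualDev` entries from a copy of the VM's .vmx file. The following is a minimal sketch under that assumption; the function names `nic_types` and `non_vmxnet3` are hypothetical helpers, and an adapter with no `virtualDev` line at all is simply not reported.

```python
import re
from typing import Dict, List

# Matches .vmx lines such as: ethernet0.virtualDev = "e1000e"
_NIC_LINE = re.compile(
    r'^\s*(ethernet\d+)\.virtualDev\s*=\s*"([^"]*)"\s*$',
    re.IGNORECASE,
)


def nic_types(vmx_text: str) -> Dict[str, str]:
    """Map each adapter name (e.g. 'ethernet0') to its configured device type."""
    found = {}
    for line in vmx_text.splitlines():
        match = _NIC_LINE.match(line)
        if match:
            found[match.group(1).lower()] = match.group(2).lower()
    return found


def non_vmxnet3(vmx_text: str) -> List[str]:
    """Return the adapters whose configured type is something other than vmxnet3."""
    return sorted(
        name for name, device in nic_types(vmx_text).items() if device != "vmxnet3"
    )
```

Calling `non_vmxnet3(open(path).read())` on a downloaded .vmx file (where `path` is wherever you saved it) lists the adapters that would be candidates for the change suggested above.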
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
The E1000 network adaptor is a legacy adaptor that should only be used to install the OS; afterwards, switch to the virtualised adaptor (VMXNET3), which runs at 10GbE.

Yes, they are cloud based, but they are available as free trials. They will pull all the data and logs out of your environment, upload them securely to the cloud, and report.

These are the best tools for performing a health check, unless you want to pay for a VMware Partner to attend site.

As for viewing log files, Notepad++ or Vi for Windows will deal with the linefeeds.
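
If installing another editor is not an option, the downloaded log can also be rewritten with Windows-style line endings so that plain Notepad shows it line by line. This is a small sketch that assumes the "one long line" symptom comes from Unix-style linefeeds, as the remark above suggests; `to_crlf` and `convert_file` are hypothetical helper names.

```python
def to_crlf(data: bytes) -> bytes:
    """Convert LF (or mixed LF/CRLF) line endings to CRLF.

    Existing CRLF pairs are collapsed to LF first so they are not doubled.
    """
    normalized = data.replace(b"\r\n", b"\n")
    return normalized.replace(b"\n", b"\r\n")


def convert_file(source_path: str, target_path: str) -> None:
    """Write a CRLF copy of source_path to target_path, leaving the original untouched."""
    with open(source_path, "rb") as source:
        data = source.read()
    with open(target_path, "wb") as target:
        target.write(to_crlf(data))
```

Working on bytes rather than text avoids guessing the file's encoding, and writing to a separate target keeps the original log intact in case it needs to be sent to support.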
mamelas (Author) Commented:
Thank you both! You helped me a lot.
P.S. The only strange error in the event log came from the Kaspersky AV Light Agent, which I will have to investigate.

Question has a verified solution.