
Linux on VM errors: BUG: soft lockup cpu#[y] stuck for [x]s!

Last Modified: 2016-09-05
I have a Linux machine that is giving a lot of "BUG: soft lockup cpu#[y] stuck for [x]s!" errors.
It is Oracle Linux 7.1 running on vSphere 5.5.

Here is an example output:

helpme1.png

Andrew Hancock (VMware vExpert / EE Fellow), VMware and Virtualization Consultant
CERTIFIED EXPERT
Fellow
Expert of the Year 2017
Commented:
This could be a resource issue linked to the other question you posted, and it is to be expected.

You can read the VMware KB here:

Soft lockup messages from Linux kernel running in an SMP-enabled virtual machine (1009996)

Author

Commented:
Okay. This is actually a different virtual machine than the one I posted about in the other thread, although it resides under the same vSphere umbrella.

Author

Commented:
So even though this is a different VM (on the same host machine), it's still to be expected?
CERTIFIED EXPERT
Commented:
Is this a new installation? Have you made any changes since it started happening? What troubleshooting steps have you done?

Try updating to Oracle Linux 7.2.

Author

Commented:
No, it is not a new installation. I did not make any changes. I haven't done any troubleshooting steps.

I will not be able to update to Oracle Linux 7.2 as this is a production system.

Author

Commented:
I'm seeing more similar errors:

More errors
CERTIFIED EXPERT

Commented:
"I haven't done any troubleshooting steps."

Because you have not provided any details or described any troubleshooting you have done, it is very difficult to tell.

This soft lockup message is common in VMs when large amounts of resources are committed, or it can be caused by a hardware or kernel bug. Recommended steps:

1) Update to the latest kernel.
2) Migrate the VM to ANOTHER host and observe whether this happens under heavy load.
3) Migrate a different VM to THIS host and observe whether this happens on that VM.
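
To make the "observe" part of steps 2 and 3 easier to compare before and after a migration, a small log summary can help. The following is only a minimal sketch: the function name `summarize_soft_lockups` is made up for illustration, and the pattern assumes the usual kernel wording ("BUG: soft lockup - CPU#N stuck for Ns! [task:pid]"), which may differ slightly between kernel versions.

```python
import re
from collections import Counter

# Assumed message shape (may vary by kernel version):
#   BUG: soft lockup - CPU#1 stuck for 22s! [kswapd0:34]
# The dash and the trailing [task:pid] part are treated as optional.
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup\s*-?\s*CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s!"
    r"(?:\s*\[(?P<task>[^\]]+):(?P<pid>\d+)\])?",
    re.IGNORECASE,
)


def summarize_soft_lockups(log_text):
    """Count soft lockup reports per CPU and per task; track the longest stall."""
    per_cpu = Counter()
    per_task = Counter()
    longest = 0
    for match in SOFT_LOCKUP.finditer(log_text):
        per_cpu[int(match.group("cpu"))] += 1
        if match.group("task"):
            per_task[match.group("task")] += 1
        longest = max(longest, int(match.group("secs")))
    return {
        "per_cpu": dict(per_cpu),
        "per_task": dict(per_task),
        "longest_s": longest,
    }
```

Feed it the text of the kernel log (for example the output of dmesg saved to a file) once on the original host and once after the migration. If the counts drop on the new host, that points at the host rather than the guest.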
Top Expert 2015
Commented:
Reduce the number of virtual CPUs to be less than or equal to the number of physical cores in the host machine.
Make sure you turn (v)NUMA off for that virtual machine when running Oracle products.
Do not overcommit host memory or VM memory (kswapd in one screenshot indicates the guest is running out of memory).
Use virtual hardware v9 or later (it is mentioned in the VMware compatibility lists); older versions have timing issues with very recent Linux kernels.
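
The sizing rules above can be written down as a quick sanity check. This is a rough sketch, not a VMware tool: the function `check_vm_sizing` and all of its parameters are hypothetical, and you have to read the numbers off the vSphere client yourself. The (v)NUMA setting is not covered here.

```python
def check_vm_sizing(vcpus, host_physical_cores, total_vm_memory_gb,
                    host_memory_gb, hardware_version):
    """Return a list of warnings; an empty list means none of the rules tripped.

    total_vm_memory_gb is assumed to be the configured memory of all
    VMs on the host added together (hypothetical input, entered by hand).
    """
    warnings = []
    if vcpus > host_physical_cores:
        warnings.append(
            f"{vcpus} vCPUs exceed {host_physical_cores} physical cores"
        )
    if total_vm_memory_gb > host_memory_gb:
        warnings.append(
            f"{total_vm_memory_gb} GB configured on a {host_memory_gb} GB host "
            "(memory overcommitted)"
        )
    if hardware_version < 9:
        warnings.append(
            f"virtual hardware v{hardware_version} is older than v9"
        )
    return warnings
```

For example, `check_vm_sizing(16, 12, 96, 128, 8)` reports the vCPU count and the hardware version but not the memory.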

At least provide the following information:
Whether you run UEK or the basic kernel.
Build number of ESXi.
Hardware configuration (a snip of the place where the CPU type is shown next to the RAM size).
VM configuration (the first screen of 'edit config' should suffice).
Of course, don't post hostnames or IPs.
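
The guest-side part of that list can be collected with a few lines of Python run inside the VM. This is a sketch under one assumption: that UEK kernel release strings contain "uek", so anything else is treated as the Red Hat compatible kernel. The names `kernel_flavor` and `guest_report` are made up for illustration. The ESXi build and the host hardware still have to come from the vSphere side.

```python
import os
import platform


def kernel_flavor(release):
    """Classify a kernel release string.

    Assumption: UEK release strings contain "uek"; everything else is
    reported as RHCK (Red Hat Compatible Kernel).
    """
    return "UEK" if "uek" in release.lower() else "RHCK"


def guest_report():
    """Collect basic guest facts; contains no hostnames or IP addresses."""
    release = platform.release()
    return {
        "kernel_release": release,
        "kernel_flavor": kernel_flavor(release),
        "machine": platform.machine(),
        "logical_cpus": os.cpu_count(),
    }
```

Printing `guest_report()` gives something that can be pasted into the thread as-is.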
Top Expert 2015

Commented:
Some feedback would be nice, but since there is none, let's split the points between those who answered.