coolvds
asked on
CentOS 6 (Fedora 14) KVM sporadic failures
The described behavior occurs on both Fedora 14 and CentOS 6 installs.
I migrated to CentOS 6 in the hope that this issue would go away, but it didn't.
The situation:
On one server ~90 VMs are running. From time to time VMs go down for no apparent reason.
Part of the log from around that time:
Oct 10 06:27:52 vd002 kernel: Pid 7999(qemu-kvm) over core_pipe_limit
Oct 10 06:27:52 vd002 kernel: Skipping core dump
.
Oct 10 06:27:52 vd002 collectd[5879]: Not sleeping because the next interval is 103.306590 seconds in the past!
Oct 10 06:27:52 vd002 collectd[5879]: uc_update: Value too old: name = vd002.local/load/load; value time = 1318228072; last cache update = 1318228072;
Oct 10 06:27:52 vd002 collectd[5879]: uc_update: Value too old: name = vd002.local/memory/memory-used; value time = 1318228072; last cache update = 1318228072;
.
Oct 10 06:27:54 vd002 kernel: br0: port 70(VM165) entering disabled state
Oct 10 06:27:54 vd002 kernel: device VM165 left promiscuous mode
Oct 10 06:27:54 vd002 kernel: br0: port 70(VM165) entering disabled state
Oct 10 06:27:56 vd002 ntpd[5769]: Deleting interface #76 VM, fe80::fc54:ff:feda:5763#123, interface stats: received=0, sent=0, dropped=0, active_time=36236 secs
Oct 10 06:28:04 vd002 kernel: Pid 23541(qemu-kvm) over core_pipe_limit
Oct 10 06:28:04 vd002 kernel: Skipping core dump
and so on
20 of the 90 VMs went down.
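The "over core_pipe_limit" / "Skipping core dump" lines mean the kernel hit its limit on how many crashing processes it will pipe to the core-dump helper at once (abrt on a stock CentOS 6 install), so the qemu-kvm cores were discarded. A minimal sketch of how to check and lift that limit so the next crash actually leaves a dump, assuming the default piped core_pattern:
# cat /proc/sys/kernel/core_pattern        # a leading '|' means cores are piped to a helper such as abrt
# cat /proc/sys/kernel/core_pipe_limit     # the limit the messages above say was exceeded
# sysctl -w kernel.core_pipe_limit=0       # 0 = no limit on concurrent piped dumps; add to /etc/sysctl.conf to persist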
collectd is gathering statistics from the node server and from the VMs via collectd-libvirt.
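For reference, the collectd side is the stock libvirt plugin; the relevant collectd.conf section looks roughly like this (a sketch - option names are from the collectd libvirt plugin, the values are assumptions rather than the actual config on this host):
LoadPlugin libvirt
<Plugin libvirt>
  Connection "qemu:///system"
  RefreshInterval 60
  HostnameFormat name
</Plugin>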
VMs can be started normally after the failure.
Swappiness is 0
# sysctl -a|grep swappi
vm.swappiness = 0
There is enough free RAM on the node server. Swap is present but unused by the system.
Any ideas / help / comments are greatly appreciated.
ASKER CERTIFIED SOLUTION
ASKER
Thank you
ASKER
The version of qemu-kvm (CentOS's up-to-date package) is qemu-kvm-0.12.1.2-2.113.el
About the dumps - I can at least get an idea of how to trace the cause of the segfaults.
But anyway, I'll proceed with this issue after updating to the latest available qemu-kvm.
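Roughly, the plan for tracing the segfaults and updating (a sketch - paths assume a stock CentOS 6 box with abrt; the dump location differs between abrt versions, and the dump path below is hypothetical):
# ls /var/spool/abrt/                                                    # where abrt keeps crash dumps (older abrt releases used /var/cache/abrt)
# gdb /usr/libexec/qemu-kvm /var/spool/abrt/ccpp-<time>-<pid>/coredump   # hypothetical dump path; run 'bt' inside gdb for a backtrace
# yum update qemu-kvm && rpm -q qemu-kvm                                 # pull the latest CentOS package and confirm the installed version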