Solved

CentOS 6 (Fedora 14) KVM sporadic failures

Posted on 2011-10-10
Last Modified: 2013-11-08
The behavior described below occurs on both Fedora 14 and CentOS 6 installs.
I migrated to CentOS 6 hoping the issue would go away, but it did not.
The situation:
About 90 VMs run on one server. From time to time, VMs go down for no apparent reason.
Part of the log from that time:


Oct 10 06:27:52 vd002 kernel: Pid 7999(qemu-kvm) over core_pipe_limit
Oct 10 06:27:52 vd002 kernel: Skipping core dump
.
Oct 10 06:27:52 vd002 collectd[5879]: Not sleeping because the next interval is 103.306590 seconds in the past!
Oct 10 06:27:52 vd002 collectd[5879]: uc_update: Value too old: name = vd002.local/load/load; value time = 1318228072; last cache update = 1318228072;
Oct 10 06:27:52 vd002 collectd[5879]: uc_update: Value too old: name = vd002.local/memory/memory-used; value time = 1318228072; last cache update = 1318228072;
.
Oct 10 06:27:54 vd002 kernel: br0: port 70(VM165) entering disabled state
Oct 10 06:27:54 vd002 kernel: device VM165 left promiscuous mode
Oct 10 06:27:54 vd002 kernel: br0: port 70(VM165) entering disabled state
Oct 10 06:27:56 vd002 ntpd[5769]: Deleting interface #76 VM, fe80::fc54:ff:feda:5763#123, interface stats: received=0, sent=0, dropped=0, active_time=36236 secs
Oct 10 06:28:04 vd002 kernel: Pid 23541(qemu-kvm) over core_pipe_limit
Oct 10 06:28:04 vd002 kernel: Skipping core dump

and so on.
20 of the 90 VMs went down.
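
To see exactly which qemu-kvm processes were affected, the "over core_pipe_limit" lines can be pulled out of the syslog. This is only a rough sketch: the function name `skipped_qemu_pids` is made up for illustration, and it assumes the log lines have the same shape as the excerpt above.

```shell
# Hypothetical helper: read syslog lines on stdin and print the PID of every
# qemu-kvm process whose core dump the kernel skipped, one per line,
# de-duplicated and sorted numerically.
skipped_qemu_pids() {
    sed -n 's/.*kernel: Pid \([0-9][0-9]*\)(qemu-kvm) over core_pipe_limit.*/\1/p' \
        | sort -un
}
```

Something like `skipped_qemu_pids < /var/log/messages | wc -l` would then give the number of affected processes (the log path is an assumption and depends on the syslog setup).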

collectd is gathering statistics from the node server and from the VMs via collectd-libvirt.

VMs can be started normally after the failure.

Swappiness is 0
# sysctl -a|grep swappi
vm.swappiness = 0

There is enough free RAM on the node server. Swap is present but unused by the system.

Any ideas, help, or comments are much appreciated.

Question by:coolvds
3 Comments

Accepted Solution

by:
noci earned 2000 total points
ID: 36945910
Might this apply:
https://www.redhat.com/archives/rhsa-announce/2011-May/msg00013.html

Or have you already upgraded to the latest updates?
My guess is that you need to trace why the core dumps are triggered.
(A core dump is an attempt to save the memory contents of a process after the kernel notices a fatal error.)
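
As a starting point for that tracing, it may help to look at how the kernel is set up to handle core dumps. The "over core_pipe_limit" message is printed when `kernel.core_pattern` pipes dumps to a helper program and more processes are dumping at once than `kernel.core_pipe_limit` allows, so the dump itself is dropped. A minimal read-only sketch, where `core_pattern_kind` is a hypothetical helper name:

```shell
# Hypothetical helper: a core_pattern starting with '|' means the kernel pipes
# the dump to a program; core_pipe_limit only applies in that case.
core_pattern_kind() {
    case "$1" in
        "|"*) echo pipe ;;
        *)    echo file ;;
    esac
}

# Inspect the live settings without changing anything.
if [ -r /proc/sys/kernel/core_pattern ]; then
    pattern=$(cat /proc/sys/kernel/core_pattern)
    limit=$(cat /proc/sys/kernel/core_pipe_limit 2>/dev/null || echo unknown)
    echo "core_pattern:    $pattern ($(core_pattern_kind "$pattern"))"
    echo "core_pipe_limit: $limit"
fi
```

If the pattern is a pipe, the helper it names is where any captured dumps end up. A `core_pipe_limit` of 0 means no limit on parallel dumps; whether changing it is appropriate on a host with ~90 VMs is a judgment call, since qemu-kvm core dumps can be very large.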

Author Comment

by:coolvds
ID: 36945951
CentOS 6 is fully updated.

The version of qemu-kvm (CentOS's up-to-date package is qemu-kvm-0.12.1.2-2.113.el6_0.8.x86_64) is lower than the one you pointed to. Thank you, it may apply.

About the dumps: I have only a rough idea of how to trace the cause of the segfaults.

But anyway, I'll proceed with this issue after updating to the latest available qemu-kvm.


Author Closing Comment

by:coolvds
ID: 36945952
Thank you