I have an OpenStack Ocata release deployed via Kolla. libvirtd runs inside the Docker container nova_libvirt, with the volumes /sys/fs/cgroup and /run mounted and privileged mode enabled. Some guest VMs cannot provide cpu-stats. The symptoms are:
$ docker exec -ti nova_libvirt virsh cpu-stats instance-000004cb
error: Failed to retrieve CPU statistics for domain 'instance-000004cb'
error: Requested operation is not valid: cgroup CPUACCT controller is not mounted
To check the cgroups, I looked up all related PIDs:
$ ps fax | grep instance-000004cb
8275 ? Sl 4073:40 /usr/libexec/qemu-kvm -name guest=instance-000004cb
$ ps fax | grep 8275
8346 ? S 76:04 \_ [vhost-8275]
8367 ? S 0:00 \_ [kvm-pit/8275]
8275 ? Sl 4073:42 /usr/libexec/qemu-kvm
The cgroups for qemu-kvm:
$ cat /proc/8275/cgroup
11:blkio:/user.slice
10:devices:/user.slice
9:hugetlb:/docker/e5bef89178c1c3ae34fd2b4a9b86b299a6145c0b9f608a06e83f6f4ca4d897bd
8:cpuacct,cpu:/user.slice
7:perf_event:/machine.slice/machine-qemu\x2d25\x2dinstance\x2d000004cb.scope
6:net_prio,net_cls:/machine.slice/machine-qemu\x2d25\x2dinstance\x2d000004cb.scope
5:freezer:/machine.slice/machine-qemu\x2d25\x2dinstance\x2d000004cb.scope
4:memory:/user.slice
3:pids:/user.slice
2:cpuset:/machine.slice/machine-qemu\x2d25\x2dinstance\x2d000004cb.scope/emulator
1:name=systemd:/user.slice/user-0.slice/session-c1068.scope
For vhost-8275:
$ cat /proc/8346/cgroup
11:blkio:/user.slice
10:devices:/user.slice
9:hugetlb:/docker/e5bef89178c1c3ae34fd2b4a9b86b299a6145c0b9f608a06e83f6f4ca4d897bd
8:cpuacct,cpu:/user.slice
7:perf_event:/machine.slice/machine-qemu\x2d25\x2dinstance\x2d000004cb.scope
6:net_prio,net_cls:/machine.slice/machine-qemu\x2d25\x2dinstance\x2d000004cb.scope
5:freezer:/machine.slice/machine-qemu\x2d25\x2dinstance\x2d000004cb.scope
4:memory:/user.slice
3:pids:/user.slice
2:cpuset:/machine.slice/machine-qemu\x2d25\x2dinstance\x2d000004cb.scope/emulator
1:name=systemd:/user.slice/user-0.slice/session-c1068.scope
For kvm-pit:
$ cat /proc/8367/cgroup
11:blkio:/user.slice
10:devices:/user.slice
9:hugetlb:/
8:cpuacct,cpu:/user.slice
7:perf_event:/
6:net_prio,net_cls:/
5:freezer:/
4:memory:/user.slice
3:pids:/user.slice
2:cpuset:/
1:name=systemd:/user.slice/user-0.slice/session-c4807.scope
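The comparison above can also be done mechanically. The following is only a sketch: the function name misplaced_controllers is made up for illustration, and it assumes that a healthy VM process should have every controller under /machine.slice, which is what the listings above suggest but is not confirmed for every controller. It reads a file in /proc/<pid>/cgroup format and prints each controller whose path lies elsewhere.

```shell
# Print the controllers of a process that are NOT under /machine.slice.
# Argument: a file in /proc/<pid>/cgroup format (hierarchy-id:controllers:path).
# The function name is hypothetical, not part of libvirt or kolla.
misplaced_controllers() {
    local cgroup_file=$1
    # With IFS=: the last variable receives the rest of the line,
    # so a path that itself contains colons stays intact.
    while IFS=: read -r _id controllers path; do
        case $path in
            /machine.slice/*) ;;                          # placed as expected
            *) printf '%s %s\n' "$controllers" "$path" ;; # placed elsewhere
        esac
    done < "$cgroup_file"
}

# Usage, with the qemu-kvm PID from the ps output above:
#   misplaced_controllers /proc/8275/cgroup
```

For PID 8275 this would report blkio, devices, hugetlb, cpuacct,cpu, memory, pids and name=systemd, which matches what can be read off the listing by eye.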
I tried to fix the cgroups with this script:
get_broken_vms() {
    # List the VMs for which virsh cpu-stats fails
    docker exec nova_libvirt bash -c 'for vm in $(virsh list --name); do virsh cpu-stats $vm > /dev/null 2>&1 || echo $vm; done'
}

attach_vm_to_cgroup() {
    # Attach processes and their threads pid to correct cgroup
    local vm_pid=$1; shift
    local vm_cgname=$1; shift

    echo Fix cgroup for pid $vm_pid in cgroup $vm_cgname

    for tpid in $(find /proc/$vm_pid/task/ -maxdepth 1 -mindepth 1 -type d -printf '%f\n'); do
        echo $tpid | tee /sys/fs/cgroup/{blkio,devices,perf_event,net_prio,net_cls,freezer,memory,pids,systemd}/machine.slice/$vm_cgname/tasks 1>/dev/null &
        echo $tpid | tee /sys/fs/cgroup/{cpu,\cpuacct,cpuset}/machine.slice/$vm_cgname/emulator/tasks 1>/dev/null &
    done
}

for vm in $(get_broken_vms); do
    vm_pid=$(pgrep -f $vm)
    vm_vhost_pids=$(pgrep -x vhost-$vm_pid)
    vm_cgname=$(find /sys/fs/cgroup/systemd/machine.slice -maxdepth 1 -mindepth 1 -type d -name "machine-qemu\\\x2d*\\\x2d${vm/-/\\\\x2d}.scope" -printf '%f\n')

    echo Working on vm: $vm pid: $vm_pid vhost_pid: $vm_vhost_pids cgroup_name: $vm_cgname
    # Skip the VM if either its pid or its cgroup name was not found
    [ -z "$vm_pid" -o -z "$vm_cgname" ] || attach_vm_to_cgroup $vm_pid $vm_cgname

    # Fix vhost-NNNN kernel threads
    for vpid in $vm_vhost_pids; do
        [ -z "$vm_cgname" ] || attach_vm_to_cgroup $vpid $vm_cgname
    done
done
After the fix, all VMs successfully provided cpu-stats and other metrics, but after some hours the cgroups broke again.

Problems and symptoms:
- the cgroups do not break on all VMs, only on some of them
- I failed to find out what causes this
- if I restart a problem VM, its cgroups are fixed as expected, but after some hours they break again
- if I fix the cgroups by hand, cpu-stats works, but after some hours the cgroups break again

What I have checked so far:
- logrotate: nothing
- cron: nothing

I added an audit watch for the cgroups:
auditctl -w '/sys/fs/cgroup/cpu,cpuacct/machine.slice' -p rwxa
The audit log shows that only libvirtd processes write to these cgroups. Any suggestions?
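One way to narrow down when a VM breaks again is to record a timestamp every time the list of broken VMs changes, and then look at the audit log around that time. A minimal sketch, assuming the get_broken_vms function from the script above is available; the function name report_change and the state-file path are made up for illustration:

```shell
# Print a timestamped line when the list of broken VMs differs from the
# list seen on the previous call. The previous list is kept in state_file.
# report_change is a hypothetical helper, not an existing tool.
report_change() {
    local state_file=$1 current=$2
    local previous=""
    [ -f "$state_file" ] && previous=$(cat "$state_file")
    if [ "$current" != "$previous" ]; then
        printf '%s broken: %s\n' "$(date '+%F %T')" "${current:-none}"
        printf '%s' "$current" > "$state_file"
    fi
}

# Usage (poll once a minute; the interval and path are arbitrary choices):
#   while sleep 60; do
#       report_change /tmp/broken_vms.state "$(get_broken_vms | sort | tr '\n' ' ')"
#   done
```

The helper only reports transitions, so the output stays short even when polling often, and each line gives a time window in which to search the audit records.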