Issue
QEMU virtual machines running CloudLinux crash while a snapshot is being taken.
Environment
- CentOS 6/7/8 VM (virtual machine)
- CloudLinux OS 6/7/8 VM
- Proxmox with qemu-guest-agent
Solution
The issue is not caused by CloudLinux itself but by the QEMU guest agent, which does not freeze the file systems correctly. We currently have no fix or recommendation for it on our side.
Cause
What is actually happening:
When a VM backup is invoked, the QEMU guest agent freezes the file systems so that no changes are made during the backup. However, the agent does not take loop* devices into account when ordering the freeze (we have checked its sources), which leads to the following sequence:
- The loopback file system is frozen, which sends asynchronous requests to the loopback thread.
- The main file system is frozen.
- The loopback thread wakes up and tries to write data to the main file system, which is still frozen. This eventually leads to a hung task and a kernel crash.
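To check whether a guest has loop devices backed by files on another file system, the attached loop devices can be listed from sysfs. The following is a minimal sketch, not part of any CloudLinux or QEMU tooling: the function name list_loop_backing_files is hypothetical, and it assumes the standard Linux loop driver attribute /sys/block/loopN/loop/backing_file, which is present only while a device is attached.

```python
from pathlib import Path


def list_loop_backing_files(sysfs_block="/sys/block"):
    """Return {loop device name: backing file path} for attached loop devices.

    Hypothetical helper for illustration; sysfs_block is a parameter so the
    function can be pointed at a test directory.
    """
    result = {}
    for dev in sorted(Path(sysfs_block).glob("loop*")):
        backing = dev / "loop" / "backing_file"
        # Assumption: the loop/backing_file attribute exists only while the
        # loop device is attached to a file.
        if backing.is_file():
            result[dev.name] = backing.read_text().strip()
    return result


if __name__ == "__main__":
    for name, path in list_loop_backing_files().items():
        print(f"/dev/{name} -> {path}")
```

If the script prints nothing, no loop device is attached in the guest, and the sequence described above would not be expected to apply.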
Workaround
As a workaround, you may disable qemu-guest-agent. This allows Proxmox backups to run again without crashing the kernel.
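Disabling the agent inside the guest could look like the sketch below. The service names are assumptions and should be verified on the VM: qemu-guest-agent as the systemd unit on version 7 and 8 guests, and qemu-ga as the SysV service on version 6 guests. The function names are hypothetical, and the script only prints the commands unless dry_run is set to False.

```python
import subprocess


def disable_agent_commands(use_systemd=True):
    """Build the commands that stop the agent and keep it from starting at boot."""
    if use_systemd:
        # Assumed unit name on systemd guests (CentOS/CloudLinux 7 and 8).
        return [
            ["systemctl", "stop", "qemu-guest-agent"],
            ["systemctl", "disable", "qemu-guest-agent"],
        ]
    # Assumed SysV service name on CentOS/CloudLinux 6.
    return [
        ["service", "qemu-ga", "stop"],
        ["chkconfig", "qemu-ga", "off"],
    ]


def disable_agent(use_systemd=True, dry_run=True):
    """Print each command and, unless dry_run is set, execute it (requires root)."""
    for cmd in disable_agent_commands(use_systemd):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    disable_agent(use_systemd=True, dry_run=True)
```

Note that with the agent disabled, backups are taken without freezing the guest file systems.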
Useful links
- https://gitlab.com/qemu-project/qemu/-/issues/520
- https://cloudlinux.zendesk.com/agent/tickets/49567