qemu-img convert blocks other tasks

Bug #712392 reported by Alvin
This bug affects 2 people
Affects: qemu-kvm (Ubuntu)
Status: Expired
Importance: Low
Assigned to: Unassigned

Bug Description

Binary package hint: qemu-kvm

Steps to reproduce:
- Use qemu-img convert to convert an image, e.g. a large (40 GB) raw file to a compressed qcow2 image
- Watch the load rise
- Kernel messages like "INFO: task ... blocked for more than 120 seconds" will appear (also see below)

In some cases this will bring down the server. When libvirt is also running, all virtual machines will time out or crash.
Using ionice to lower the I/O priority of qemu-img convert does not really prevent the issue.
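
(For reference, a sketch of the kind of invocation involved, with hypothetical file names; the second line shows an ionice wrapper of the sort that did not help:)

   qemu-img convert -O qcow2 -c /srv/images/big.raw /srv/images/big.qcow2              # -c compresses the output
   ionice -c3 qemu-img convert -O qcow2 -c /srv/images/big.raw /srv/images/big.qcow2   # idle I/O scheduling class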

kvm D 0000000000000000 0 9632 1 0x00000000
 ffff8801a4269ca8 0000000000000086 0000000000015bc0 0000000000015bc0
 ffff8802004fdf38 ffff8801a4269fd8 0000000000015bc0 ffff8802004fdb80
 0000000000015bc0 ffff8801a4269fd8 0000000000015bc0 ffff8802004fdf38
Call Trace:
 [<ffffffff815596b7>] __mutex_lock_slowpath+0x107/0x190
 [<ffffffff815590b3>] mutex_lock+0x23/0x50
 [<ffffffff810f5899>] generic_file_aio_write+0x59/0xe0
 [<ffffffff811d7879>] ext4_file_write+0x39/0xb0
 [<ffffffff81143a8a>] do_sync_write+0xfa/0x140
 [<ffffffff81084380>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81252316>] ? security_file_permission+0x16/0x20
 [<ffffffff81143d88>] vfs_write+0xb8/0x1a0
 [<ffffffff81144722>] sys_pwrite64+0x82/0xa0
 [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
kdmflush D 0000000000000002 0 396 2 0x00000000
 ffff88022eeb3d10 0000000000000046 0000000000015bc0 0000000000015bc0
 ffff88022f489a98 ffff88022eeb3fd8 0000000000015bc0 ffff88022f4896e0
 0000000000015bc0 ffff88022eeb3fd8 0000000000015bc0 ffff88022f489a98

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: qemu-kvm 0.12.3+noroms-0ubuntu9.3
ProcVersionSignature: Ubuntu 2.6.32-28.55-server 2.6.32.27+drm33.12
Uname: Linux 2.6.32-28-server x86_64
Architecture: amd64
Date: Thu Feb 3 12:34:13 2011
KvmCmdLine:
 UID PID PPID C SZ RSS PSR STIME TTY TIME CMD
 root 1809 1 0 188145 517408 3 Jan31 ? 00:21:36 /usr/bin/kvm -S -M pc-0.11 -enable-kvm -m 512 -smp 1 -name jessica -uuid 76a39821-a89d-1ac7-65c2-40464dc21043 -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/jessica.monitor,server,nowait -monitor chardev:monitor -boot c -drive if=ide,media=cdrom,index=2,format=raw -drive file=/var/lib/libvirt/images/jessica.img,if=virtio,index=0,boot=on,format=qcow2 -net nic,macaddr=54:52:00:5f:d8:2c,vlan=0,model=virtio,name=virtio.0 -net tap,fd=39,vlan=0,name=tap.0 -chardev pty,id=serial0 -serial chardev:serial0 -parallel none -usb -vnc 127.0.0.1:0 -vga cirrus
 root 1909 1 0 248235 625280 3 Jan31 ? 00:21:14 /usr/bin/kvm -S -M pc-0.11 -enable-kvm -m 768 -smp 1 -name gurney -uuid 7a21a182-8349-b17d-7eff-64f9bb3e7e30 -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/gurney.monitor,server,nowait -monitor chardev:monitor -boot c -drive if=ide,media=cdrom,index=2,format=raw -drive file=/var/lib/libvirt/images/gurney.img,if=virtio,index=0,boot=on,format=qcow2 -net nic,macaddr=54:52:00:52:dc:a0,vlan=0,model=virtio,name=virtio.0 -net tap,fd=41,vlan=0,name=tap.0 -chardev pty,id=serial0 -serial chardev:serial0 -parallel none -usb -vnc 127.0.0.1:2 -vga cirrus
 root 27120 1 0 193095 552120 0 Jan31 ? 00:27:52 /usr/bin/kvm -S -M pc-0.11 -cpu qemu32 -enable-kvm -m 512 -smp 1 -name kolab -uuid 79b2a347-7841-39df-8399-c072b05e7f6f -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/kolab.monitor,server,nowait -monitor chardev:monitor -boot c -drive if=ide,media=cdrom,index=2,format=raw -drive file=/srv/libvirt/leto/kolab.img,if=virtio,index=0,boot=on,format=qcow2 -net nic,macaddr=54:52:00:63:ee:4f,vlan=0,model=virtio,name=virtio.0 -net tap,fd=40,vlan=0,name=tap.0 -chardev pty,id=serial0 -serial chardev:serial0 -parallel none -usb -vnc 127.0.0.1:1 -vga cirrus
ProcCmdLine: BOOT_IMAGE=/vmlinuz-2.6.32-28-server root=/dev/mapper/vg0-root ro quiet splash delayacct
ProcEnviron:
 PATH=(custom, user)
 LANG=C
 SHELL=/bin/bash
SourcePackage: qemu-kvm
dmi.bios.date: 02/23/2010
dmi.bios.vendor: Intel Corp.
dmi.bios.version: CBQ4510H.86A.0119.2010.0223.1522
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: DQ45CB
dmi.board.vendor: Intel Corporation
dmi.board.version: AAE30148-301
dmi.chassis.type: 3
dmi.modalias: dmi:bvnIntelCorp.:bvrCBQ4510H.86A.0119.2010.0223.1522:bd02/23/2010:svn:pn:pvr:rvnIntelCorporation:rnDQ45CB:rvrAAE30148-301:cvn:ct3:cvr:

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Thanks for taking the time to report this bug and helping to make Ubuntu better.

For my own test on a natty server, I started with a 30G qcow2 disk with a lucid install on it; about 1G was allocated. Conversion from qcow2 to raw took probably less than 10 seconds. Conversion back to qcow2 took a lot longer (perhaps a minute). So for a fully allocated 40G drive I would certainly expect it to be slow.

However you certainly do seem to have a real problem there. In your CurrentDmesg, I see

[257894.409748] Buffer I/O error on device dm-9, logical block 0
[257894.409786] Buffer I/O error on device dm-9, logical block 0

This could indicate a real problem, or just a device which you've since removed (e.g. a USB thumb drive). Could you look under /sys/block/dm-9 for more information?
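
(One way to identify dm-9, assuming standard device-mapper tooling; the exact sysfs layout may vary by kernel:)

   ls /sys/block/dm-9/                # block-device attributes for dm-9
   cat /sys/block/dm-9/dm/name        # device-mapper name behind dm-9 (e.g. an LVM LV or snapshot)
   dmsetup ls                         # lists dm names with their (major, minor) pairs; dm-9 is minor 9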

Can you look for relevant info in /var/log/syslog from this event and paste it here?

Finally, could you re-test with the qemu-kvm package from the server-edgers archive (see https://launchpad.net/~ubuntu-server-edgers/+archive/server-edgers-qemu-kvm)?

Changed in qemu-kvm (Ubuntu):
status: New → Incomplete
Revision history for this message
Alvin (alvind) wrote :

You're right about the errors. Apparently I had a full snapshot, but it doesn't make a difference.

Meanwhile, I've had the time to test this. It is easy to reproduce:

- Create a snapshot (1 is enough)
- Cause some I/O, like qemu-img convert (this does /not/ have to be the snapshotted volume)
- Watch your virtual machines crash and/or your system burn

Running the same qemu-img convert without any snapshots present causes no problem.
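
(A minimal sketch of this reproduction, with made-up VG/LV and file names:)

   lvcreate -s -L 5G -n crashsnap /dev/vg0/data       # step 1: create an LVM snapshot
   qemu-img convert -O qcow2 -c big.raw big.qcow2     # step 2: generate heavy I/O (need not touch the snapshotted LV)
   # step 3: watch for "task ... blocked for more than 120 seconds" in kern.log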

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for qemu-kvm (Ubuntu) because there has been no activity for 60 days.]

Changed in qemu-kvm (Ubuntu):
status: Incomplete → Expired
Revision history for this message
Alvin (alvind) wrote :

This bug should not be allowed to expire. LVM snapshots are an important feature on file servers, and this issue makes them a severe risk.

Changed in qemu-kvm (Ubuntu):
status: Expired → New
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Does this:

kvm D 0000000000000000 0 9632 1 0x00000000
 ffff8801a4269ca8 0000000000000086 0000000000015bc0 0000000000015bc0
 ffff8802004fdf38 ffff8801a4269fd8 0000000000015bc0 ffff8802004fdb80
 0000000000015bc0 ffff8801a4269fd8 0000000000015bc0 ffff8802004fdf38
Call Trace:
 [<ffffffff815596b7>] __mutex_lock_slowpath+0x107/0x190
 [<ffffffff815590b3>] mutex_lock+0x23/0x50
 [<ffffffff810f5899>] generic_file_aio_write+0x59/0xe0
 [<ffffffff811d7879>] ext4_file_write+0x39/0xb0
 [<ffffffff81143a8a>] do_sync_write+0xfa/0x140
 [<ffffffff81084380>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81252316>] ? security_file_permission+0x16/0x20
 [<ffffffff81143d88>] vfs_write+0xb8/0x1a0
 [<ffffffff81144722>] sys_pwrite64+0x82/0xa0
 [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
kdmflush D 0000000000000002 0 396 2 0x00000000
 ffff88022eeb3d10 0000000000000046 0000000000015bc0 0000000000015bc0
 ffff88022f489a98 ffff88022eeb3fd8 0000000000015bc0 ffff88022f4896e0
 0000000000015bc0 ffff88022eeb3fd8 0000000000015bc0 ffff88022f489a98

show up in the guest, or the host? (I'd assume and hope the guest).

Does this happen only with LVM volumes?

For LVM, does it help if you create a temporary readonly snapshot of the volume you want to copy, use qemu-img on the snapshot, and then delete the snapshot? Does that allow the VM to continue running with the original volume with no errors?
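
(Roughly along these lines, with hypothetical volume names:)

   lvcreate -s -p r -L 5G -n copysnap /dev/vg0/guestlv      # temporary read-only snapshot
   qemu-img convert -O qcow2 /dev/vg0/copysnap guest.qcow2  # convert from the snapshot, not the live LV
   lvremove -f /dev/vg0/copysnap                            # drop the snapshot when done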

Revision history for this message
Alvin (alvind) wrote :

Unfortunately, this shows up in both host and guests.

I haven't tried to use qemu-img on a snapshot, but I will. I fear it will not make much difference. Just having snapshots (without accessing them) is enough to bring the system down on moments with heavy I/O.

Sometime this week, I will warn the users of their impending doom and try out whether it makes a difference.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote : Re: [Bug 712392] Re: qemu-img convert blocks other tasks

Quoting Alvin (<email address hidden>):
> Unfortunately, this shows up in both host and guests.
>
> I haven't tried to use qemu-img on a snapshot, but I will. I fear it
> will not make much difference. Just having snapshots (without accessing
> them) is enough to bring the system down on moments with heavy I/O.
>
> Sometime this week, I will warn the users of their impending doom and
> try out whether it makes a difference.

I assume you have very fast underlying storage and that's the trigger.
Rather than running qemu-img under 'nice', I wonder whether you would
be better off running qemu-img in a cgroup with tighter controls over
its blkio. I've not experimented with this myself, but
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/cgroups/blkio-controller.txt;h=465351d4cf853e8a308c9c84abef789b3dcfa42c;hb=HEAD
should give some good info.
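
(A minimal sketch of what such a cgroup setup might look like, assuming a kernel with the blkio controller, the CFQ scheduler for proportional weights, and illustrative mount point and group names:)

   mkdir -p /cgroup/blkio
   mount -t cgroup -o blkio none /cgroup/blkio
   mkdir /cgroup/blkio/slowcopy
   echo 100 > /cgroup/blkio/slowcopy/blkio.weight     # minimum proportional weight (range 100-1000)
   echo $$ > /cgroup/blkio/slowcopy/tasks             # move the current shell into the group
   qemu-img convert -O qcow2 -c big.raw big.qcow2     # inherits the group's reduced I/O weight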

I also must point out that the only storage failures I've seen myself
(roughly a year ago) were due to ext4 itself. I since switched to
xfs for all root and data partitions and have not seen the like since.
Unfortunately that's likely not a viable option for you as you have
users already depending on the system (plus 2.6.38 is rumored to bring
its own xfs instabilities, though I've not seen those here). However
if it were possible, I'd highly recommend testing with xfs.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

I've tried again to reproduce this to no avail.

Do you get different behavior if you do

   qemu-img convert -f raw -O raw /dev/yourvg/yourlv lv1.raw

and if you do

   dd if=/dev/yourvg/yourlv of=lv2.raw

?

If not, then the compression (if you're using it), or simply memory pressure, appears to be the cause, so besides ionice you may also want to try plain 'nice'.
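
(For example, something like this, with hypothetical paths; the second run drops -c to rule out compression overhead:)

   nice -n 19 ionice -c2 -n7 qemu-img convert -O qcow2 -c /srv/big.raw /srv/big.qcow2
   nice -n 19 ionice -c2 -n7 qemu-img convert -O qcow2 /srv/big.raw /srv/big-nocompress.qcow2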

Also, please let us know if using a cgroup works for you (and, if not, please provide all the settings you used).

Changed in qemu-kvm (Ubuntu):
status: New → Incomplete
importance: Undecided → Low
Revision history for this message
Alvin (alvind) wrote :

I still haven't had the chance to test this properly (the server is in production), but the problem manifested itself by accident. So here's a little bit of information.

I started downloading (with rsync) a qcow image from the file server and noticed it was a bit slow: 14 MB/s at most, dropping to 790 kB/s after a while. This was writing to a single plain SATA disk. Then the virtual machines started to become unresponsive and I had to interrupt the download, or the server would have gone down. It's not memory pressure; memory use was around 900 MB (of 8 GB). Only 2 small virtual machines were running, BUT there was a snapshot present.

It's not qemu-img that causes the panics. Any I/O will do it. Then the famous messages start appearing in kern.log:
[149164.740056] INFO: task kvm:21354 blocked for more than 120 seconds.
[149164.740294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[...]
[149164.740595] [<ffffffff8155d557>] __mutex_lock_slowpath+0x107/0x190

Restarting the download without an LVM snapshot present yields:
- 11 MB/s, but more or less constant
- Running virtual machines are as responsive as ever
- No errors in kern.log

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Thanks for the info, Alvin. So I wonder if we could reproduce this without
qemu at all: create two LVM volumes, snapshot one of them, mount both
originals, then rsync to one while trying to get something done in the
other (maybe with a slower, rate-limited rsync).
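
(A sketch of such a test; VG name, sizes and paths are made up:)

   lvcreate -L 10G -n lvA vg0 && mkfs.ext4 /dev/vg0/lvA
   lvcreate -L 10G -n lvB vg0 && mkfs.ext4 /dev/vg0/lvB
   mkdir -p /mnt/lvA /mnt/lvB && mount /dev/vg0/lvA /mnt/lvA && mount /dev/vg0/lvB /mnt/lvB
   lvcreate -s -L 2G -n lvA-snap /dev/vg0/lvA                # snapshot one of the originals
   rsync -a /path/to/large/data/ /mnt/lvA/ &                 # heavy writes to the snapshotted volume
   rsync -a --bwlimit=2000 /path/to/other/data/ /mnt/lvB/    # rate-limited work on the other; watch for stalls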

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for qemu-kvm (Ubuntu) because there has been no activity for 60 days.]

Changed in qemu-kvm (Ubuntu):
status: Incomplete → Expired