Comment 26 for bug 432154

Dustin Kirkland  (kirkland) wrote :

Here's a status update...

I can reproduce this problem as follows, without libvirt or eucalyptus in the picture...

Create some small auxiliary storage to dynamically add to a running VM.
  (host) $ dd if=/dev/zero of=/tmp/foo bs=1M count=64

Boot a vm in Karmic under qemu-kvm-0.11.0 like this:
  (host) $ kvm -drive file=karmic-server.img,if=scsi,boot=on -boot c

To reproduce the problem as Eucalyptus is experiencing it, you *must* use if=scsi. The problem manifests itself when *subsequent* scsi storage devices are added. If the device is the *first* scsi device on the system, the hot-add succeeds. It's the 2nd scsi device that causes the problem.

Once the system is booted, log in and, in the vm, load this module:
  (vm) $ sudo modprobe acpiphp

Check dmesg to confirm that the acpiphp slots are loaded, and note that there is only one /dev/sd[a-z] device.
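
A minimal sketch of that baseline check, for comparing the disk list before and after the hot-add. The helper name list_sd_disks is mine, not part of any tool; the directory argument exists only so the helper can be tried against a scratch directory.

```shell
# List whole-disk SCSI device nodes (sda, sdb, ...) found in a directory.
# Defaults to /dev. Partitions such as sda1 are not matched.
list_sd_disks() {
    dir="${1:-/dev}"
    for node in "$dir"/sd[a-z]; do
        # If nothing matches, the glob is left as a literal string; skip it.
        [ -e "$node" ] || continue
        basename "$node"
    done
}

# Inside the guest, something like:
#   lsmod | grep acpiphp      # module is loaded
#   list_sd_disks             # expect only the boot disk at this point
```

Running list_sd_disks again after the pci_add or drive_add step shows whether a new sd device appeared.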

Now, drop to the qemu console with ctrl-alt-2, and add the storage:
  (qemu) pci_add auto storage file=/tmp/foo,if=scsi
  OK domain 0, bus 0, slot 6, function 0

Switch back to the vm linux shell with ctrl-alt-1, and look at the dmesg output.
  (vm) $ dmesg | tail -n 12
[ 44.033397] pci 0000:00:06.0: reg 10 io port: [0x00-0xff]
[ 44.033443] pci 0000:00:06.0: reg 14 32bit mmio: [0x000000-0x0003ff]
[ 44.033486] pci 0000:00:06.0: reg 18 32bit mmio: [0x000000-0x001fff]
[ 44.033899] pci 0000:00:02.0: BAR 6: bogus alignment [0x0-0x0] flags 0x2
[ 44.033975] decode_hpp: Could not get hotplug parameters. Use defaults
[ 44.042277] sym53c8xx 0000:00:06.0: enabling device (0000 -> 0003)
[ 44.043230] ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 11
[ 44.043247] sym53c8xx 0000:00:06.0: PCI INT A -> Link[LNKB] -> GSI 11 (level, high) -> IRQ 11
[ 44.045237] sym1: <895a> rev 0x0 at pci 0000:00:06.0 irq 11
[ 44.047586] sym1: No NVRAM, ID 7, Fast-40, LVD, parity checking
[ 44.055399] sym1: SCSI BUS has been reset.
[ 44.063329] scsi3 : sym-2.2.3

The new controller (sym1) is detected but, more importantly, no /dev/sd[b-z] device shows up.

This is Eucalyptus' use case (though they use libvirt to do the above).

According to upstream qemu and kvm, the official, supported way of doing this is to use virtio instead of scsi. They have acknowledged that scsi is buggy and can in fact lose data when the buffer is saturated.

Indeed, if you now drop to the qemu console with ctrl-alt-2 and do this:
  (qemu) pci_add auto storage file=/tmp/foo,if=virtio
  OK domain 0, bus 0, slot 7, function 0

Going back to the vm with ctrl-alt-1, you can now see a new /dev/vda device registered. Eucalyptus notes that this is not ideal because the device is called /dev/vda instead of /dev/sda or /dev/sdb. They are concerned that this breaks compatibility with EC2 images which expect disks in the /dev/sd[a-z] namespace, particularly because some of these images hardcode such device names in /etc/fstab. Fortunately, in Ubuntu we use UUIDs, so our guests don't really suffer from this.
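
To illustrate the UUID point, here is a rough sketch of building an /etc/fstab entry that does not care whether the disk shows up as sdb or vda. The helper name fstab_line_by_uuid, the mount point /mnt/aux, the ext3 type and the mount options are all illustrative assumptions, and it presumes a filesystem has already been created on the new disk.

```shell
# Print an /etc/fstab entry that mounts by filesystem UUID instead of
# by device name. All three arguments are supplied by the caller.
fstab_line_by_uuid() {
    uuid="$1"
    mountpoint="$2"
    fstype="$3"
    printf 'UUID=%s %s %s defaults 0 2\n' "$uuid" "$mountpoint" "$fstype"
}

# In the guest, blkid can report the UUID of the filesystem, e.g.:
#   sudo blkid -s UUID -o value /dev/vda
# and the result can be passed in (hypothetical mount point and type):
#   fstab_line_by_uuid "$(sudo blkid -s UUID -o value /dev/vda)" /mnt/aux ext3
```

An image that hardcodes /dev/sdb in /etc/fstab has no such indirection, which is the compatibility concern Eucalyptus raises.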

Now, all of that said, it is actually possible to hot-add a second scsi device (though upstream recommends using virtio instead). However, this method is not yet supported by libvirt. The key is that with modern qemu, you have to add a drive to the existing scsi bus, rather than adding a whole new pci device with pci_add. Here's how:

Drop to a qemu console with ctrl-alt-2. Get the address of the current scsi bus:
  (qemu) info pci
  Look for "SCSI Controller". In my case, it's on Bus 0, device 4, function 0

Now, instead of pci_add, use drive_add:
  (qemu) drive_add 0:4 file=/tmp/foo,if=scsi
  OK bus 0, unit 1

This is not ideal, however. I tried to re-scan the scsi bus with rescan-scsi-bus.sh (from scsitools) without luck; it did not pick up the new sdb device. After rebooting the vm, though, voila, I now have /dev/sdb.
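
For reference, the kernel also exposes a per-host scan trigger in sysfs. The sketch below is untested in this scenario, and since rescan-scsi-bus.sh did not help it may well fail the same way; the function name rescan_scsi_hosts is mine, and the base-directory argument exists only so it can be exercised outside a real guest.

```shell
# Ask every SCSI host to re-scan all channels, targets and LUNs.
# Must run as root in the guest when pointed at the real sysfs tree.
rescan_scsi_hosts() {
    base="${1:-/sys/class/scsi_host}"
    for host in "$base"/host*; do
        # Skip hosts without a scan attribute (or an unmatched glob).
        [ -e "$host/scan" ] || continue
        # "- - -" is the wildcard for channel, target and LUN.
        echo "- - -" > "$host/scan"
        echo "requested scan on $(basename "$host")"
    done
}
```

If this does make /dev/sdb appear without a reboot, it would be a candidate for whatever hook ends up triggering the re-scan after drive_add.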

I'm not sure how to proceed with this bug...

 1) Upstream qemu/kvm recommends using virtio instead of scsi; Eucalyptus does not want to, as it introduces inconsistency with EC2.
 2) Libvirt could (perhaps) be taught to use drive_add; but I have not yet figured out how to trigger a re-scan of the scsi bus in the guest automatically and without performing a reboot.
 3) Arguably, this is a regression in qemu-kvm (since kvm-84). I've been grokking the git commit history for several hours now, and have attempted to build some random snapshots at various points in time in the interest of bisecting the regression. This has not been productive so far.

I could probably use some assistance at this point.

:-Dustin