Comment 5 for bug 731340

Grondr (grondr) wrote :

It's not just AMD64, or this is two different bugs---I think it's
actually 10.10 in both AMD64 and i386.

I just installed 10.10 on a WD 500GB IDE disk inside a -non- encrypted
LVM using the alternate installer last week. I spent some time last
night benchmarking I/O using (a) a 2TB SATA 4K-sector Samsung with
an ext4 on it, and (b) the same model disk with aes-xts-plain64 (with
512-bit keys) under an ext4 (no LVM on either of those two disks).
ext4 created with defaults except "-i 1048576" so it would initialize
much faster---I wasn't planning on filling 2TB to run a test.

Sequential I/O to both 2TB disks was very fast---dd'ing /dev/zero
either directly to the partition or to LUKS ran around 130-140MB/s,
which is probably the limit for the disk. (This was before I created
filesystems on those partitions.) I then did "cp -a /usr /mnt/a1"
and similar for the LUKS-based FS and timed things.

"time find /usr | wc -l" (-before- the copy) took about 16 seconds,
and of course repeating this immediately took about 200ms. Since
this was /usr for the running OS, I'll bet that lots of it was already
in the caches. (Also, that IDE disk may seek faster than a 2TB green
drive.)

After the copy, I unmounted & remounted each of the SATA filesystems
to drop their caches, and timed. The non-LUKS one took 47 seconds.
The LUKS one took 3 minutes and 38 seconds! This was repeatable; I
could dismount & remount and always get the same figures. Rerunning
w/o a dismount/remount of course gave me subsecond times. This is
on a six-core AMD64 2.8GHz CPU and none of the cores had appreciable
runtime; the CPU stayed at 800MHz the whole time, and user+sys for
the finds were about 3 seconds for LUKS and half a second without.
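For anyone wanting to repeat the cold-cache measurement, here is a rough Python sketch of the same idea as "time find ... | wc -l". It is not the command used above; the function names are made up for illustration. It assumes the filesystem was unmounted and remounted just before the run so the caches are cold, and it reports wall-clock time separately from the process's user+sys CPU time.

```python
import os
import sys
import time


def count_entries(root):
    """Count root plus everything under it, like `find root | wc -l`."""
    total = 1  # find prints the starting directory itself
    for _dirpath, dirnames, filenames in os.walk(root):
        # Symlinks to directories show up in dirnames but are not descended into.
        total += len(dirnames) + len(filenames)
    return total


def timed_walk(root):
    """Return (entries, wall_seconds, cpu_seconds) for one traversal."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # user+sys for this process only
    entries = count_entries(root)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return entries, wall, cpu


if __name__ == "__main__":
    # Hypothetical usage: pass the mount point of the freshly remounted FS.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    entries, wall, cpu = timed_walk(root)
    print(f"{entries} entries, wall {wall:.2f}s, cpu {cpu:.2f}s")
```

A large gap between wall and CPU time, as in the figures above, suggests the run is waiting on I/O rather than burning cycles on crypto.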

Note that this is only a 4.6 to 1 ratio and not the 30:1 in the OP's
report, so either this is a different bug or something else is going
on.

I wondered if ext4 atime vs 4K-sectors was somehow screwing me,
so I tried remounting both SATA filesystems noatime. No change
in results. (I didn't try mounting them ro; I could, but...)

I then booted 10.10 i386 from a LiveCD---this is the same release,
but a few kernels back, since the installed system is up-to-date
(2.6.35-28-generic) and the LiveCD is I think at 2.6.35-22.

THINGS DID NOT IMPROVE MUCH, THOUGH THEY DID IMPROVE---i386 took
2 minutes and 15 seconds, so it's 1m23s faster, or only 2.9 times
as bad instead of 4.6.

I then booted a Natty LiveCD from the March 30th build.

BIG IMPROVEMENT---time was down to 57 seconds, which is only 10
seconds slower than the non-LUKS case. I can live with that. (Though
I'm still curious why it's slower at all---LUKS in any crypto mode
I've tried [aes-cbc-essiv:sha256, or aes-xts-plain64 with either 256-
or 512-bit keys] runs at least 140 MB/s to my disks when dd'ing a few
gig, and I think the disks are the limiting factor, because they don't
run any faster just dd'ing plaintext to a raw partition).
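The slowdown figures can be rechecked with a few lines of Python. This is only arithmetic on the wall-clock times quoted above; the labels and the slowdown() helper are made up for illustration, and it assumes the 47-second non-LUKS run (measured on the installed 10.10 AMD64 system) is a fair baseline for the LiveCD runs too.

```python
# Wall-clock times from the runs above, in seconds.
BASELINE = 47  # non-LUKS ext4 on installed 10.10 AMD64 (assumed baseline for all runs)

luks_runs = {
    "10.10 AMD64 installed": 3 * 60 + 38,
    "10.10 i386 LiveCD": 2 * 60 + 15,
    "Natty LiveCD": 57,
}


def slowdown(seconds, baseline=BASELINE):
    """Ratio of a LUKS run to the non-LUKS baseline."""
    return seconds / baseline


for label, seconds in luks_runs.items():
    extra = seconds - BASELINE
    print(f"{label}: {seconds}s, {slowdown(seconds):.1f}x baseline, +{extra}s")
```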

I am -quite- reluctant to run Natty on an otherwise 10.10 system.
I'm -very very- reluctant to run Natty overall on this machine (I
discovered that high network load copying to normal plaintext disk
seemed to burn 10x the CPU it should; different story, and one I
need to replicate). So I'd really like to see this fixed in 10.10
because I suspect a lot of people are going to hold off on Natty,
and it's clearly a regression. (I have not had a chance to test
10.04 and may not; I'd probably have to switch back to aes-cbc
to do so, which I can do now because I have no data in the FS
yet, but which will be impossible later.)