Comment 56 for bug 204133

Colin Ian King (colin-king) wrote :

Ago, is Wubi using the normal loopback or dm-loop in this configuration? And are we using the default vm sysctls here, or the Wubi vm sysctl hacks too?

I'm surprised we are seeing a factor of 10 difference in data speeds. Here are some benchmarks I got when I first investigated this on Hardy back in July. They do indicate a performance penalty, but not to such a high degree:

Test 1. Raw data I/O tests
---------------------------

Comparison of loopback and dm-loop block I/O. Block write/read tests of
a 6 GB data file on an ext3 file system on the loopback.

Results are averages of 3 tests.

           sync?   write        read
loop       no      46.2 MB/s    58.1 MB/s
loop       yes     40.7 MB/s    40.0 MB/s

dm-loop    no      44.1 MB/s    57.0 MB/s
dm-loop    yes     49.8 MB/s    55.5 MB/s

Asynchronous writes of large files seem to be roughly the same with
loopback and dm-loop, probably because we are bottlenecked on the disk
I/O performance underneath. Curiously, dm-loop synchronous writes
perform best, which is totally unexpected.

Now for some real-world file transfer tests.

Test 2. Simple loopback tests
------------------------------

Loopback copy test. The test involved timing the copying of a pre-cached
intrepid kernel source git tree to a loop mounted ext3 file system. The
test copied 26,438 files and 647MB of data.

(times in seconds)

           loop
           sync?   time1     time2     time3     average   Datarate

loop       no      47.472    44.478    38.372    43.441    14.9 MB/s
loop       yes     41.995    39.841    53.996    45.227    13.3 MB/s

dm-loop    no      49.957    42.594    34.525    42.359    15.3 MB/s
dm-loop    yes     35.715    37.292    47.320    40.109    16.1 MB/s

In real-world file copying tests, it appears that dm-loop performs
better than the normal loopback. Again, dm-loop in sync mode performs
better than in async mode, which is most unexpected.
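For anyone wanting to redo the arithmetic, here is a minimal sketch of how the average and Datarate columns can be derived from the three timings. It assumes the data rate is simply the 647MB payload divided by the average time, which reproduces the loop async row; the function name summarize is only illustrative.

```python
def summarize(times_s, size_mb=647.0):
    """Return (average time in seconds, data rate in MB/s) for a set of runs.

    Assumes data rate = payload size / average copy time.
    """
    avg = sum(times_s) / len(times_s)
    return avg, size_mb / avg


# Timings from the "loop, sync? no" row of the Test 2 table.
avg, rate = summarize([47.472, 44.478, 38.372])
print(f"{avg:.3f} s, {rate:.1f} MB/s")  # prints "43.441 s, 14.9 MB/s"
```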

Test 3. Wubi style configuration
--------------------------------

The final tests try to simulate a Wubi configuration, with an NTFS
file system mounted using ntfs-3g via fuse, and inside this is a file
containing an ext3 filesystem that is mounted using the loopback. In
these tests, we test with sync mode turned on and off on ntfs-3g and
also on and off on the loopback and dm-loop loop devices. As in test 2,
we copy a pre-cached intrepid kernel source git tree of 26,438 files
and 647MB of data.

           loop    ntfs-3g
           sync?   sync?     time1     time2     time3     average   Datarate

loop       no      no         68.492    87.544    95.719    83.918   7.71 MB/s
loop       yes     no        183.243   225.088   n/a       204.166   3.17 MB/s

dm-loop    no      no         66.602    65.576    70.238    67.472   9.57 MB/s
dm-loop    yes     no         84.470    91.677    89.652    88.597   7.30 MB/s

loop       no      yes        87.237    94.472    91.491    91.06    7.10 MB/s
loop       yes     yes       193.071   192.259   192.041   192.456   3.36 MB/s

dm-loop    no      yes        78.685    80.738    89.807    83.076   7.79 MB/s
dm-loop    yes     yes       194.754   190.230   194.610   193.198   3.36 MB/s

With ntfs-3g mounted asynchronously, we see that dm-loop outperforms the
normal loopback by quite a margin. In fact, a dm-loop mounted
synchronously is almost as fast as the loopback mounted asynchronously.

With ntfs-3g mounted synchronously, we see that dm-loop outperforms
the loopback when mounted asynchronously, and performs the same as
the loopback when mounted synchronously.

The performance of asynchronous vs synchronous ntfs-3g from the above
data is shown below:

                ntfs-3g MB/s
                async   sync
loop            7.71    7.10
loop sync       3.17    3.36
dm-loop         9.57    7.79
dm-loop sync    7.30    3.36
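To put a number on the sync penalty, the following rough sketch computes the fraction of throughput lost when the loop device is mounted sync, using only the figures from the table above. The names rates and sync_penalty are illustrative, not part of any tool used in the tests.

```python
# Data rates in MB/s copied from the summary table:
# (loop device, ntfs-3g mode) -> (loop device async, loop device sync)
rates = {
    ("loop", "async"): (7.71, 3.17),
    ("loop", "sync"): (7.10, 3.36),
    ("dm-loop", "async"): (9.57, 7.30),
    ("dm-loop", "sync"): (7.79, 3.36),
}


def sync_penalty(async_rate, sync_rate):
    """Fraction of throughput lost when the loop device is mounted sync."""
    return 1.0 - sync_rate / async_rate


penalties = {key: sync_penalty(*pair) for key, pair in rates.items()}
for (device, ntfs_mode), loss in penalties.items():
    print(f"{device:8s} ntfs-3g {ntfs_mode:5s}: {loss:.0%} lower throughput with sync")
```

On these figures, dm-loop with ntfs-3g mounted async loses roughly a quarter of its throughput when switched to sync, while the other three combinations lose more than half.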

So, some questions:

1. Is dm-loop or the normal loopback being used?
2. Is this a problem only when copying the ISO? If so, can syncio be disabled during ISO copying and enabled for normal user usage?
3. In a normal syncio + dm-loop configuration, my tests do show an obvious performance penalty. Is this a reasonable penalty (with the benefit of guaranteed data consistency) for normal use?

The fundamental choice to face is this: one can either have syncio, with slow but guaranteed data writeback and hence little chance of problems on a power outage, or one can turn off syncio, add in the vm sysctl hacks, and risk some data inconsistencies with the nested dm-loop/loopback fuse ntfs-3g layering.

Colin