Comment 17 for bug 204133

Colin Ian King (colin-king) wrote :

Hi,

I've dug into this and come across several unexpected issues with the tweaks to the vm settings which could be causing the corruption.

1. Setting dirty_expire_centisecs and dirty_writeback_centisecs to 1 may not actually set the values you expect, due to conversions between USER_HZ and HZ in the kernel. For example:

setting: sysctl -w vm.dirty_expire_centisecs=1

and reading it back: sysctl vm.dirty_expire_centisecs returns 0 and not 1. This is because the value is converted from USER_HZ to HZ to store it in terms of the jiffy clock, and then from HZ back to USER_HZ when it is read out. I therefore suggest that anything below 2 may be problematic because it may not be set correctly. See the next item.
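
The round trip can be approximated with plain integer arithmetic. This is only a simplified sketch, assuming USER_HZ=100, HZ=250 and truncating division; the kernel's own conversion helpers differ in detail, and the function names below are made up for illustration.

```shell
#!/bin/sh
# Simplified model of the centisecs -> jiffies -> centisecs round trip.
# Assumptions (not taken from the kernel source): USER_HZ=100, HZ=250,
# and plain truncating integer division in both directions.
USER_HZ=100
HZ=250

# Hypothetical helper: convert centiseconds (USER_HZ ticks) to jiffies.
to_jiffies() {
    echo $(( $1 * HZ / USER_HZ ))
}

# Hypothetical helper: convert jiffies back to centiseconds.
to_centisecs() {
    echo $(( $1 * USER_HZ / HZ ))
}

# Value that would be read back after writing $1.
roundtrip() {
    to_centisecs "$(to_jiffies "$1")"
}

for v in 1 2 3 4; do
    echo "written=$v read_back=$(roundtrip "$v")"
done
```

Under these assumptions, writing 1 reads back as 0, while 2 and 4 survive the round trip unchanged, which matches the behaviour described above.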

2. Setting dirty_writeback_centisecs to zero actually turns off the periodic write back (pdflush is stopped) - definitely not what we want to do.

3. There is a currently outstanding Linux kernel bug that is rare and difficult to trigger, even intentionally, on most kernel versions. However, it is easier to encounter when the dirty_ratio setting is reduced below its default. An introduction to the issue starts at http://lkml.org/lkml/2006/12/28/171 and comments about it not being specific to the current kernel release are at http://lkml.org/lkml/2006/12/28/131. So don't set dirty_ratio too low. I suggest keeping it at the default value and seeing if this helps.

4. It may be worth modifying the mount options of the file-based loopback filesystem to force data and directory syncing, for example:

    mount -o loop,sync,dirsync,commit=1

    (see man 8 mount for how these work)

5. I suggest trying:

sysctl $quiet -w vm.dirty_background_ratio=0
sysctl $quiet -w vm.dirty_ratio=40
sysctl $quiet -w vm.dirty_expire_centisecs=2 (or 4)
sysctl $quiet -w vm.dirty_writeback_centisecs=2 (or 4)

(This is for a system where USER_HZ=100 and HZ=250.) After setting the centisecs times, it is worth checking that dirty_writeback_centisecs is not zero, otherwise the pdflush daemon may be turned off!
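
The read-back check could be scripted along these lines. This is a minimal sketch: check_writeback is a hypothetical helper name, and it assumes the setting is exposed at /proc/sys/vm/dirty_writeback_centisecs (the procfs view of the same sysctl).

```shell
#!/bin/sh
# Hypothetical helper: warn if the periodic writeback interval reads back as 0.
# $1 is the file to read; it defaults to the procfs path for the sysctl.
check_writeback() {
    file=${1:-/proc/sys/vm/dirty_writeback_centisecs}
    value=$(cat "$file") || return 2
    if [ "$value" -eq 0 ]; then
        echo "WARNING: dirty_writeback_centisecs is 0, periodic writeback is off"
        return 1
    fi
    echo "dirty_writeback_centisecs=$value"
}
```

Calling check_writeback with no arguments straight after the sysctl -w lines would flag the case where the value was silently stored as zero.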

6. When the PC shuts down or reboots, is it possible to force a sync before umounting the loopback device?
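
As a rough illustration of what I mean, something like the following could run from the shutdown script. It is only a sketch: flush_and_umount is a hypothetical helper, the mount point is passed in by the caller, and the RUN variable exists purely so the commands can be printed instead of executed while testing.

```shell
#!/bin/sh
# Hypothetical shutdown hook: flush dirty data, then unmount the loopback mount.
# $1 is the mount point of the loopback filesystem (supplied by the caller).
# Set RUN=echo to print the commands instead of running them.
flush_and_umount() {
    mnt=$1
    ${RUN:-} sync || return 1            # write out all dirty pages first
    ${RUN:-} umount "$mnt" || return 1   # then detach the loopback filesystem
}
```

For a dry run, RUN=echo flush_and_umount /some/mountpoint prints the two commands in order without touching the system.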

References:

http://www.westnet.com/~gsmith/content/linux-pdflush.htm
http://lkml.org/lkml/2006/12/28/171
http://lkml.org/lkml/2006/12/28/131

Let me know if this helps.

Colin