Comment 136 for bug 317781

Theodore Ts'o (tytso) wrote :

@Kai,

>While that may be true (and I suppose it is ;-)) what happens
>to all those users sticking to ext3 or similar fs' when "suddenly"
>all apps do fsync() on every occasion?
>
>It may not hurt ext4 performance that much but probably other
>fs' performance.

Actually, the problem with fsync() being expensive was pretty much exclusive to ext3's data=ordered mode. No other filesystem had anything like that, and all modern filesystems are using delayed allocation. So in some sense, this is a case of "get used to it, this is the wave of the future". (And in many ways, this is also "back to the future", since historically Unix systems sync'ed metadata every 5 seconds and data every 30 seconds.) Basically, modern application writers (at least under Linux) have gotten lazy. Older programs (like emacs and vi), and programs that need to work on other legacy Unix systems, will tend to use fsync(), because that is the only safe thing to do.

>Correct me if I am wrong but I read, currently Ext4
>does (for yet unknown reasons) out-of-order flushing
>between data and meta data which hopefully can be
>fixed without affecting performance too much while
>improving integrity on crashes.

Well, that's not how I would describe it, although I admit in practice it has that effect. What's happening is that the journal is still being committed every 5 seconds, but dirty pages in the page cache do not get flushed out if they don't have a block allocation assigned to them. I can implement an "allocate on commit" mode, but make no mistake --- it ***will*** have a negative performance impact, because fundamentally it's the equivalent of calling fsync() on dirty files every five seconds. If you are copying a large file, such as a DVD image file, which takes longer than five seconds to write, forcing an allocation in the middle of the write could very well result in a more fragmented file. On the other hand, it won't be any worse than ext3, since that's what is happening under ext3.

>I'm using XFS so Ext4 isn't my preference but this
>is still interesting for me as it looks like XFS and Ext4
>share the same oddities that lead to truncated configs

Yes, and as I've said multiple times already, "get used to it"; all modern filesystems are going to be doing this, because delayed allocation is a very powerful technique that provides better performance, prevents file fragmentation, and so on. It's not just "oddities" in XFS and ext4; it's also in btrfs, tux3, reiser4, and ZFS.

>in e.g. KDE4. I lost my kmailrc due to this several
>times, including all my filters, account settings, folder
>settings etc... (btw: a situation which could be improved
> if KMail wouldn't work with a single monolithic config)

Yeah, that's a good example of what I mean by a broken application design --- why is it that KMail is constantly updating its config? Is it doing something stupid such as storing the last location and size of the window in a monolithic config, which it is then constantly rewriting out each time you drag the window around on the screen? If you are going to be using a single monolithic config, then you really want to fsync() it each time you write it out. If some of the changes are bullsh*t ones where it really doesn't matter if you lose the last location and size of the window, then write that to a separate dot file and don't bother to fsync() it.
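To make that split concrete, here is a minimal sketch in Python. It is not taken from KMail or any other real application; write_durable() and write_volatile() are made-up names, and it assumes a POSIX system where renaming over an existing file is atomic. The important settings go through a temporary file that is fsync()'ed before it replaces the old config, and the throwaway state is written with no fsync() at all.

```python
import os
import tempfile


def write_durable(path, text):
    """Replace path with text, forcing the data to disk before the rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Hypothetical naming scheme: a hidden temp file in the same directory,
    # so the final rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(text)
            tmp.flush()             # push Python's buffer to the kernel
            os.fsync(tmp.fileno())  # ask the kernel to push it to the disk
        os.replace(tmp_path, path)  # rename over the old file
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def write_volatile(path, text):
    """Write state we can afford to lose (e.g. window geometry): no fsync()."""
    with open(path, "w") as f:
        f.write(text)
```

An application would call write_durable() only when the user actually changes a setting, and write_volatile() for things like window position, so the expensive call happens rarely.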

In some of the cases where people report losing "hundreds" of dot files, I suspect one of the things that is going on is that in addition to using separate files for each configuration registry variable, the application writer isn't keeping track of which registry variables have changed, and so it is rewriting *all* of the small files each time, resulting in a large number of needless writes. This is bad for battery usage, and bad for SSD endurance, and in general it's just bad design.
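Keeping track of what changed does not take much. The following Python sketch is purely illustrative (the SettingsDir class and its one-file-per-key layout are assumptions, not any real desktop's registry code): it remembers which keys are dirty and rewrites only those files on save.

```python
import os


class SettingsDir:
    """One small file per setting; only changed settings are rewritten."""

    def __init__(self, directory):
        self.directory = directory
        self.values = {}
        self.dirty = set()
        os.makedirs(directory, exist_ok=True)

    def set(self, key, value):
        # Mark a key dirty only when it is new or its value really changed.
        if key not in self.values or self.values[key] != value:
            self.values[key] = value
            self.dirty.add(key)

    def save(self):
        """Write out the dirty keys and return how many files were written."""
        written = 0
        for key in sorted(self.dirty):
            path = os.path.join(self.directory, key)
            with open(path, "w") as f:
                f.write(str(self.values[key]))
            written += 1
        self.dirty.clear()
        return written
```

Setting a key to the value it already has costs nothing, and a save with no changes touches no files. Whether each individual write also needs an fsync() is a separate decision that depends on how much the setting matters.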

The real answer here is we probably need some better libraries to help application writers keep track of a large number of state variables; something like SQLite, done right, would be a big win --- which is why I keep suggesting it. In the short term, yes, we'll put in some kludges that will allow people who are using crappy binary drivers to degrade the performance of ext4 so that it behaves like ext3's data=ordered mode. But it's really not the right long-term answer.
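As a rough idea of what the SQLite route could look like, here is a small sketch using Python's standard sqlite3 module. The settings table, the file layout, and the helper names open_store(), set_many(), and get() are all assumptions made up for illustration. The point is that a batch of changes goes into one transaction, and the library, rather than each application, takes care of getting a committed transaction safely onto disk.

```python
import sqlite3


def open_store(path):
    """Open (or create) a key/value settings database at path."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settings (key TEXT PRIMARY KEY, value TEXT)"
    )
    conn.commit()
    return conn


def set_many(conn, changes):
    """Apply a dict of changes in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)",
            list(changes.items()),
        )


def get(conn, key, default=None):
    """Return the stored value for key, or default if it is not set."""
    row = conn.execute(
        "SELECT value FROM settings WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else default
```

With this shape, hundreds of state variables live in one file, and changing three of them results in one transaction instead of three (or three hundred) file rewrites.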

>Also an open-source driver could have made the
>system freeze - that's not a single fault of closed-source
>drivers. So some of the arguments here are just irrelevant.

In practice, though, that's not most people's experience, if for no other reason than that with an open source driver, you can have more people debugging the problem; with a closed source driver from Nvidia, only Nvidia can debug the problem, and generally the size of the manufacturer's Linux support team (or more accurately, Linux support person) is a fraction of the size of the team tasked to support the Windows driver. So most people tend to find that the open source drivers are **far** more stable.