NFS performance issue while clearing the file access cache upon login

Bug #2015827 reported by Chengen Du
This bug affects 8 people
Affects: linux (Ubuntu)
Status: In Progress
Importance: Undecided
Assigned to: Chengen Du
Milestone: (none listed)

Bug Description

The observed performance issue may be attributed to an increase in NFS ACCESS operations, possibly caused by a new mechanism introduced in the NFS client in Linux 6.2-rc3.
This mechanism clears the access cache as soon as the cache timestamp becomes older than the user's login time.
Its primary objective is to prevent the NFS client's access cache from becoming stale when the user's group membership is changed on the server after the user has already logged in on the client.

It's worth noting that POSIX only refreshes the user's supplementary group information upon login.
Upstream considered that users may reasonably expect the access cache to be cleared when they log out and log back in again, with behavior returning to normal afterwards.

The performance overhead can be particularly noticeable when applications or users switch to other privileged users via commands such as "su" to operate on NFS-mounted folders.
In such cases, the privileged user's login time will be renewed, and NFS ACCESS operations will need to be re-sent, potentially leading to performance degradation.
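
As an illustration only, the effect described above can be sketched with a simplified shell model. This is not the kernel implementation: the function names and timestamps are hypothetical, and attribute-cache timeouts are ignored. It assumes a cached ACCESS result is reused only if it was stored at or after the user's most recent login time.

```shell
#!/bin/bash
# Simplified model, NOT kernel code. All names here are hypothetical.

# access_cache_usable CACHE_TIMESTAMP LOGIN_TIME
# Returns 0 if the cached entry is at least as new as the login time,
# 1 if it is older (a fresh NFS ACCESS call would be needed).
access_cache_usable() {
    local cache_ts=$1 login_ts=$2
    [ "$cache_ts" -ge "$login_ts" ]
}

# simulate_access_calls CHECK_LOGIN LOGIN_TIME...
# Counts the ACCESS calls needed for one file that is accessed once just
# after each login. CHECK_LOGIN=Y models the new behavior, N the old one.
simulate_access_calls() {
    local check_login=$1; shift
    local cache_ts=-1 calls=0 login now
    for login in "$@"; do
        now=$((login + 1))          # the file is accessed just after each login
        if [ "$cache_ts" -lt 0 ] ||
           { [ "$check_login" = "Y" ] && ! access_cache_usable "$cache_ts" "$login"; }; then
            calls=$((calls + 1))    # cache miss: ask the server again
            cache_ts=$now
        fi
    done
    echo "$calls"
}

simulate_access_calls N 100 200 300   # prints 1: one ACCESS call in total
simulate_access_calls Y 100 200 300   # prints 3: one ACCESS call per login
```

In this model every renewed login time (for example, each "su" invocation) costs one additional ACCESS call per cached file, which is where the overhead comes from.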

Tags: patch
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 2015827

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Chengen Du (chengendu)
Changed in linux (Ubuntu):
assignee: nobody → ChengEn, Du (chengendu)
status: Incomplete → In Progress
Jan Ingvoldstad (jan-launchpad-xud) wrote :

Please note that this is a bug that, for unknown reasons, has been backported from 6.2-rc3 to released LTS kernels in Ubuntu Server LTS.

Upstream is not responsible for making the decision of whether this backported change should be part of older kernels in Ubuntu Server LTS.

Please revert the changes to LTS released kernels, so that server hosting environments can use Ubuntu Server as a server platform.

Allan G Soeby (soeby) wrote :

Du ChengEn, I would second Jan's opinion.

This whole chain of fixes that went in to fix LP: #2003053 should be rolled back. There were no strong arguments for cherry-picking those changes in the first place. (They are not in upstream LTS either.)

Once it was discovered what kind of impact it had, it should have been rolled back.

As it is also now clear that LP: #2003053 cannot be solved without impacting other use cases, and can only be addressed by introducing an extra mount option, this is just another argument for reverting this set of patches.

Jan Ingvoldstad (jan-launchpad-xud) wrote :

Judging from the utter lack of response to the core issue, namely backporting untested patches from what was at the time an unstable upstream release candidate to what are supposed to be *three* long term support "enterprise grade" Ubuntu editions, it seems that Ubuntu's policy for Linux kernels has become "move fast and break things".

Basically, for stability with Ubuntu, there is now only one option:

Roll your own Linux LTS kernels.

This experience has completely undermined my trust in Ubuntu as a stable platform for servers.

Kleber Sacilotto de Souza (kleber-souza) wrote :

Hello @jan-launchpad-xud and @soeby.

The patches we introduced to fix bug 2003053 unfortunately introduced a regression that was not caught by our tests and the reviews done internally and by upstream. The regression was fixed as soon as we could and we apologize for the inconvenience. We do have extensive quality control processes but unfortunately sometimes issues are discovered after a kernel is released.

The Ubuntu LTS kernels are not necessarily a 1-to-1 match with the upstream LTS releases. We pick up every patch applied to the upstream stable kernels, but we also apply other patches to provide extra fixes for our users. The reason we backported a patchset from an upstream -rc release was neither unknown nor random; it was based on a real issue that was affecting our users.

If you could kindly provide more information about the issues that you are currently having with the Ubuntu kernels that were caused by the changes to fix bug 2003053 and bug 2009325 we would be happy to investigate and provide a solution if possible.

Jan Ingvoldstad (jan-launchpad-xud) wrote :

Hello @kleber-souza.

The regression was not fixed. There have only been mitigations.

Please see our comments in the other bug report.

All information required is available in the previous bug report, but I have attached a patchset that actually fixes the regression.

Chengen Du (chengendu) wrote :

The NFS patchset did resolve the issue our user encountered, but unfortunately introduced some performance overhead that may have significant impacts in certain scenarios.
We wanted to let you know that we have submitted a patch (https://patchwork.kernel<email address hidden>/) that we propose to address the issue.
We are currently awaiting a response from upstream.

tags: added: patch
Allan G Soeby (soeby) wrote :

Hi @kleber-souza, @chengendu

Thanks for your attention.

Allow me to give my perception of the impact of fixing "bug" LP: #2003053.

The original patchset introduced *two* regressions. One (the NFS deadlock) hit everybody and was fixed by #2009325. The remaining one is now hitting those of us who spawn new user processes frequently, which causes new "login times" to be created and the access cache to be zapped. As a result we are looking at a 300-400% increase in *overall* NFS operations, making the current kernels unusable for production. We do not have that kind of headroom on our NFS servers.

The result is that we are simply stuck with kernels prior to the #2003053 fixes. With recent CVE fixes in current kernels, we have now also resorted to building our own kernels. This is very counter-productive.

I understand the use case for the changes that went into "bug" #2003053. The reason I call this a "bug" (in quotes) is that the behaviour has been around for more than 15 years. While age alone is not a qualifier, I am just saying that this has been accepted behaviour for that long. Furthermore, #2003053 only applies in environments where the NFS server has knowledge of users and their secondary groups and validates them for ACCESS calls (ours do not).

From the original upstream commit message 0eb43812c0270ee3d005ff32f91f7d0a6c4943af : "While it is reasonable to expect that such group membership changes are rare, and that we do not want to optimise the cache to accommodate them, it is also not unreasonable for the user to expect that if they log out and log back in again, that the staleness would clear up".

It is clear that a trade-off was considered; however, the use case in mind was a "user" (a physical, interactive person), not a service of any kind. I am quite certain that, had a regression of a 3-4x increase in NFS ops been known, this would not have gone in the way it did.

I understand why there are sometimes strong reasons to cherry-pick changes from upstream, or to make your own changes. IMHO, the use case for #2003053 was not strong enough to justify that.

The regression risk for #2003053 was assessed as low because the changes came from upstream. We now know this was not the case.

With that knowledge, and comparing it to the weak use case the changes were trying to address, the right decision would have been to revert the changes.

The suggested upstream change to introduce a mount option to address this should be turned around. The option should be added for those wanting to zap/re-validate their access caches on re-login, while the default behaviour is left as it was.

IHME SA (ihmesa) wrote :

According to https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2022098, a patched gke kernel was released to revert the default behavior and add the nfs_fasc module parameter. linux-5.4.0-154.171 was proposed, but not released. Can that be moved forward to fix this for impacted users?

Kleber Sacilotto de Souza (kleber-souza) wrote :

Hello @ihmesa,

We cannot release linux-5.4.0-154.171 anymore; it has been replaced in -proposed by linux-5.4.0-156.173, which contains the same fix. We would appreciate it if you could help us test the fix from this version in -proposed.

Thank you.

IHME SA (ihmesa) wrote :

Sorry for the confusion. Thank you for clarifying. Here is our test:

We installed linux-5.4.0-156.173 (focal) from -proposed and tested it out with good results. Ran test script on a mounted nfs share. Silly little test (test.sh seen below):

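#!/bin/bash
# Compare the NFS ACCESS counter before and after reading 1000 files,
# first in the current session and then in a new session created by sudo.

nfsstat -l | grep access                                   # counter before
touch myfiles.{1..1000}
md5sum myfiles.{1..1000} > /dev/null                       # read in the current session
sudo -u "$(whoami)" md5sum myfiles.{1..1000} > /dev/null   # read again from a new session
nfsstat -l | grep access                                   # counter after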

Kernel info:

# dpkg -l linux-image-generic | tail -1
ii linux-image-generic 5.4.0.156.152 amd64 Generic Linux kernel image

$ uname -rv
5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 2023

Test with the change in place:

$ cat /sys/module/nfs/parameters/nfs_fasc
N
$ ./test.sh
nfs v3 client access: 45
nfs v3 client access: 1045
$ ./test.sh
nfs v3 client access: 1045
nfs v3 client access: 1045
$ ./test.sh
nfs v3 client access: 1045
nfs v3 client access: 1045
$ ./test.sh
nfs v3 client access: 1045
nfs v3 client access: 1045

Unmount, reload the nfs module with nfs_fasc=Y, and re-test:

# umount /mnt/nfs-share
# rmmod nfsv3
# rmmod nfs
# modprobe nfs nfs_fasc=Y
$ cat /sys/module/nfs/parameters/nfs_fasc
Y
$ ./test.sh
nfs v3 client access: 28
nfs v3 client access: 2029
$ ./test.sh
nfs v3 client access: 2029
nfs v3 client access: 3030
$ ./test.sh
nfs v3 client access: 3030
nfs v3 client access: 4031
$ ./test.sh
nfs v3 client access: 4031
nfs v3 client access: 5032

Conclusion: with the nfs_fasc parameter change defaulting to off, linux-5.4.0-156.173 returns to the old behavior and NFS ACCESS calls are not made on every new session access. After loading the nfs module with the nfs_fasc parameter on, the new session NFS ACCESS call behavior resumes.

The nfs_fasc parameter functions as expected. This patch appears to revert to the old level of NFS ACCESS calls that we expected.
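
For anyone repeating this test, the per-run increase can be computed rather than read off by eye. A minimal sketch, assuming nfsstat prints lines in the format shown above; access_count and access_delta are hypothetical helper names, not part of nfs-utils.

```shell
#!/bin/bash
# Hypothetical helpers for comparing the ACCESS counters printed by test.sh.

# access_count: read lines such as "nfs v3 client access: 45" on stdin and
# print the number from the last matching line.
access_count() {
    awk -F': *' '/access:/ { n = $2 } END { print n + 0 }'
}

# access_delta BEFORE AFTER: print how many ACCESS calls were added.
access_delta() {
    echo $(( $2 - $1 ))
}

# Values taken from the first run with nfs_fasc=N above.
before=$(echo "nfs v3 client access: 45" | access_count)
after=$(echo "nfs v3 client access: 1045" | access_count)
access_delta "$before" "$after"    # prints 1000
```

Applied to the numbers above, the later runs with nfs_fasc=N add 0 calls each, while the runs with nfs_fasc=Y add 2001 on the first run and 1001 on each run after that.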
