NFS4 acl ops do not work with standard kernel

Bug #562913 reported by Robert Sander
This bug affects 9 people
Affects         Status         Importance   Assigned to   Milestone
linux (Ubuntu)  Fix Released   Undecided    Unassigned
Nominated for Lucid by cotillion
Nominated for Maverick by cotillion

Bug Description

Binary package hint: nfs4-acl-tools

There is a minor bug in the recent kernel that causes nfs4_getfacl and nfs4_setfacl to fail:

http://linux-nfs.org/pipermail/nfsv4/2009-November/011643.html

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: nfs4-acl-tools 0.3.3-0ubuntu1
ProcVersionSignature: Ubuntu 2.6.32-21.31-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32-21-generic i686
Architecture: i386
Date: Wed Apr 14 12:01:43 2010
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: nfs4-acl-tools

Revision history for this message
Robert Sander (gurubert) wrote :

With kernel 2.6.34 from http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.34-rc3-lucid/linux-image-2.6.34-020634rc3-generic_2.6.34-020634rc3_i386.deb the issue is resolved:

root@bloch:/mnt/nfs4# mount -t nfs4 -o rw,acl netapp-sim02:/vol/test test
root@bloch:/mnt/nfs4# cd test/
root@bloch:/mnt/nfs4/test# l
total 16
drwxrwxrwx 4 root nogroup 4096 Apr 7 14:46 ./
drwxr-xr-x 5 root root 4096 Apr 8 14:48 ../
drwxrwxrwx 10 root nogroup 4096 Apr 15 10:00 .snapshot/
drwxr-xr-x 2 root nogroup 4096 Apr 7 14:46 test/
root@bloch:/mnt/nfs4/test# nfs4_getfacl test/
A::OWNER@:rwaDxtTnNcCy
D::OWNER@:
A:g:GROUP@:rxtncy
D:g:GROUP@:waDTC
A::EVERYONE@:rxtncy
D::EVERYONE@:waDTC
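
Each line that nfs4_getfacl prints above is one ACE in the form type:flags:principal:permissions, where A is an allow entry and D is a deny entry. A minimal sketch of splitting such a line into its four fields follows. The struct, the function names and the buffer sizes are hypothetical and are not taken from nfs4-acl-tools. The sketch also assumes the principal itself contains no colon.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the four colon-separated ACE fields. */
struct ace_fields {
    char type[8];
    char flags[16];
    char principal[128];
    char perms[32];
};

/* Copy [start, end) into dst; return 0 on success, -1 if it does not fit. */
static int copy_field(char *dst, size_t dst_size,
                      const char *start, const char *end)
{
    size_t len = (size_t)(end - start);

    if (len >= dst_size)
        return -1;
    memcpy(dst, start, len);
    dst[len] = '\0';
    return 0;
}

/*
 * Split "type:flags:principal:permissions" at its first three colons.
 * Empty fields are preserved, so "D::OWNER@:" parses with empty flags
 * and empty permissions. Returns 0 on success, -1 on malformed input.
 */
int parse_ace(const char *line, struct ace_fields *out)
{
    const char *c1, *c2, *c3;

    assert(line != NULL && out != NULL);

    c1 = strchr(line, ':');
    if (c1 == NULL)
        return -1;
    c2 = strchr(c1 + 1, ':');
    if (c2 == NULL)
        return -1;
    c3 = strchr(c2 + 1, ':');
    if (c3 == NULL)
        return -1;

    if (copy_field(out->type, sizeof out->type, line, c1) != 0)
        return -1;
    if (copy_field(out->flags, sizeof out->flags, c1 + 1, c2) != 0)
        return -1;
    if (copy_field(out->principal, sizeof out->principal, c2 + 1, c3) != 0)
        return -1;
    return copy_field(out->perms, sizeof out->perms,
                      c3 + 1, line + strlen(line));
}
```

Calling parse_ace("A:g:GROUP@:rxtncy", &f) fills f with type "A", flags "g", principal "GROUP@" and permissions "rxtncy". strtok() is avoided on purpose because it would skip the empty fields that appear in the deny entries.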

Revision history for this message
cotillion (tobias-schwan) wrote :

I can confirm this bug.

Is there a chance the fix in the kernel package will become a StableReleaseUpdate? In my opinion it should. Otherwise this whole package is useless.

Revision history for this message
Alain St-Denis (alain-st-denis) wrote :

I'd like to second that. Deploying a custom kernel is doable, but it's a pain.

William Grant (wgrant)
affects: nfs4-acl-tools (Ubuntu) → linux (Ubuntu)
Revision history for this message
Jeremy Foshee (jeremyfoshee) wrote :

Hi Robert,

If you could also please test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-upstream-testing
tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu):
status: Incomplete → Triaged
Revision history for this message
cotillion (tobias-schwan) wrote :

I can also confirm that after installing the latest upstream kernel (linux-image-2.6.35-999-generic_2.6.35-999.201006281005_amd64), the problem is solved.

Revision history for this message
loonatic (albert-friendly) wrote :

I've been struggling with this issue for a couple of months as well.
I used to solve it by patching and recompiling the kernel.
See here for more details:
  http://linux-nfs.org/pipermail/nfsv4/2009-November/011643.html

The issue has been fixed in the vanilla kernel since 2.6.33.2. Also see:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=patch;\
h=462d60577a997aa87c935ae4521bd303733a9f2b

All it takes is to apply the patch below. Strangely enough, Debian Lenny has fixed it,
as have a lot of other distributions, especially server-oriented ones. After all, it was diagnosed and fixed in Nov 2009.

diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 83ad47c..32b11c0 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -2096,7 +2096,7 @@ nfs4_xdr_enc_getacl(struct rpc_rqst *req, __be32 *p,
  encode_compound_hdr(&xdr, req, &hdr);
  encode_sequence(&xdr, &args->seq_args, &hdr);
  encode_putfh(&xdr, args->fh, &hdr);
- replen = hdr.replen + nfs4_fattr_bitmap_maxsz + 1;
+ replen = hdr.replen + op_decode_hdr_maxsz + nfs4_fattr_bitmap_maxsz + 1;
  encode_getattr_two(&xdr, FATTR4_WORD0_ACL, 0, &hdr);

  xdr_inline_pages(&req->rq_rcv_buf, replen << 2,

I can confirm that the PPA kernels 2.6.33 and up fix the problem, as they should, since the vanilla kernel has this fix.
However, running a stock Ubuntu Lucid kernel with the fix would be much preferable to running a PPA kernel.

Revision history for this message
Toni Harbaugh-Blackford (harbaugh) wrote :

Since lucid is an LTS version, couldn't this be fixed in the standard kernel?

tags: removed: needs-upstream-testing
Revision history for this message
Jeremy Kerr (jk-ozlabs) wrote :

Looks like 462d60577a997aa87c935ae4521bd303733a9f2 hasn't gone into the linux-2.6.32.y tree; not sure why. Potential for an SRU here..

Revision history for this message
Jeremy Kerr (jk-ozlabs) wrote :

Sorry, that should be d327cf7449e6fd5cbac784c641770e9366faa386.

Revision history for this message
Jeremy Kerr (jk-ozlabs) wrote : [stable] nfs: fix acl decoding

Hi Greg,

Looks like this fixes an issue on 2.6.32.17 -
https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/562913 . The
patch has hit mainline (as of 2.6.33, in commit
d327cf7449e6fd5cbac784c641770e9366faa386), but has missed stable.

Bruce - let me know if there's any reason this shouldn't go in.

Please consider for inclusion in the 2.6.32.y stable series.

Regards,

Jeremy

From 394cc62815fdac2b3effe952588630c8c3e0629f Mon Sep 17 00:00:00 2001
From: J. Bruce Fields <email address hidden>
Date: Thu, 3 Dec 2009 08:10:17 -0500
Subject: [PATCH] Re: acl trouble after upgrading ubuntu

Subject: [PATCH] nfs: fix acl decoding

Commit 28f566942c6b1d929f5e240e69e7081b77b238d3 "NFS: use dynamically
computed compound_hdr.replen for xdr_inline_pages offset" accidentally
changed the amount of space to allow for the acl reply, resulting in an
IO error on attempts to get an acl.

Reported-by: Paul Rudin <email address hidden>
Cc: Benny Halevy <email address hidden>
Signed-off-by: J. Bruce Fields <email address hidden>
Signed-off-by: Trond Myklebust <email address hidden>
---
 fs/nfs/nfs4xdr.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 5ec74cd..e81b2bf 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -2096,7 +2096,7 @@ nfs4_xdr_enc_getacl(struct rpc_rqst *req, __be32 *p,
  encode_compound_hdr(&xdr, req, &hdr);
  encode_sequence(&xdr, &args->seq_args, &hdr);
  encode_putfh(&xdr, args->fh, &hdr);
- replen = hdr.replen + nfs4_fattr_bitmap_maxsz + 1;
+ replen = hdr.replen + op_decode_hdr_maxsz + nfs4_fattr_bitmap_maxsz + 1;
  encode_getattr_two(&xdr, FATTR4_WORD0_ACL, 0, &hdr);

  xdr_inline_pages(&req->rq_rcv_buf, replen << 2,
--
1.7.0.4
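
The one-line change in the patch above only alters how many 32-bit words are counted before the ACL data in the reply. The following sketch models that arithmetic outside the kernel. The two EXAMPLE_ constants are placeholder values chosen for illustration, not values copied from the kernel source, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes in 32-bit XDR words. Placeholders for illustration only. */
enum {
    EXAMPLE_OP_DECODE_HDR_WORDS = 2, /* stands in for op_decode_hdr_maxsz */
    EXAMPLE_FATTR_BITMAP_WORDS  = 3  /* stands in for nfs4_fattr_bitmap_maxsz */
};

/* The removed line: replen = hdr.replen + nfs4_fattr_bitmap_maxsz + 1 */
size_t acl_offset_words_before(size_t hdr_replen)
{
    return hdr_replen + EXAMPLE_FATTR_BITMAP_WORDS + 1;
}

/* The added line: the operation header size is counted as well. */
size_t acl_offset_words_after(size_t hdr_replen)
{
    return hdr_replen + EXAMPLE_OP_DECODE_HDR_WORDS
           + EXAMPLE_FATTR_BITMAP_WORDS + 1;
}

/* The patch context passes replen << 2, i.e. words converted to bytes. */
size_t words_to_bytes(size_t words)
{
    return words << 2;
}

/* How many bytes too early the unpatched offset falls. */
size_t acl_offset_shortfall_bytes(size_t hdr_replen)
{
    size_t before = acl_offset_words_before(hdr_replen);
    size_t after = acl_offset_words_after(hdr_replen);

    assert(after >= before);
    return words_to_bytes(after - before);
}
```

With these placeholder sizes the unpatched offset is 8 bytes short whatever hdr.replen is. The real shortfall depends on the actual value of op_decode_hdr_maxsz in the kernel tree being built.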

Revision history for this message
Greg KH (greg-kroah) wrote : Re: [stable] nfs: fix acl decoding

On Tue, Aug 03, 2010 at 02:37:18PM +0800, Jeremy Kerr wrote:
> Hi Greg,
>
> Looks like this fixes an issue on 2.6.32.17 -
> https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/562913 . The
> patch has hit mainline (as of 2.6.33, in commit
> d327cf7449e6fd5cbac784c641770e9366faa386), but has missed stable.

Now queued up.

thanks,

greg k-h

Revision history for this message
J. Bruce Fields (bfields-fieldses) wrote : Re: [stable] nfs: fix acl decoding

On Tue, Aug 03, 2010 at 02:37:18PM +0800, Jeremy Kerr wrote:
> Hi Greg,
>
> Looks like this fixes an issue on 2.6.32.17 -
> https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/562913 . The
> patch has hit mainline (as of 2.6.33, in commit
> d327cf7449e6fd5cbac784c641770e9366faa386), but has missed stable.
>
> Bruce - let me know if there's any reason this shouldn't go in.

Apologies, yes, I think it should have.

--b.

>
> Please consider for inclusion in the 2.6.32.y stable series.
>
> Regards,
>
>
> Jeremy
>
>
> >From 394cc62815fdac2b3effe952588630c8c3e0629f Mon Sep 17 00:00:00 2001
> From: J. Bruce Fields <email address hidden>
> Date: Thu, 3 Dec 2009 08:10:17 -0500
> Subject: [PATCH] Re: acl trouble after upgrading ubuntu
>
> Subject: [PATCH] nfs: fix acl decoding
>
> Commit 28f566942c6b1d929f5e240e69e7081b77b238d3 "NFS: use dynamically
> computed compound_hdr.replen for xdr_inline_pages offset" accidentally
> changed the amount of space to allow for the acl reply, resulting in an
> IO error on attempts to get an acl.
>
> Reported-by: Paul Rudin <email address hidden>
> Cc: Benny Halevy <email address hidden>
> Signed-off-by: J. Bruce Fields <email address hidden>
> Signed-off-by: Trond Myklebust <email address hidden>
> ---
> fs/nfs/nfs4xdr.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index 5ec74cd..e81b2bf 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -2096,7 +2096,7 @@ nfs4_xdr_enc_getacl(struct rpc_rqst *req, __be32 *p,
> encode_compound_hdr(&xdr, req, &hdr);
> encode_sequence(&xdr, &args->seq_args, &hdr);
> encode_putfh(&xdr, args->fh, &hdr);
> - replen = hdr.replen + nfs4_fattr_bitmap_maxsz + 1;
> + replen = hdr.replen + op_decode_hdr_maxsz + nfs4_fattr_bitmap_maxsz + 1;
> encode_getattr_two(&xdr, FATTR4_WORD0_ACL, 0, &hdr);
>
> xdr_inline_pages(&req->rq_rcv_buf, replen << 2,
> --
> 1.7.0.4
>
>
>

Revision history for this message
Johan Ramm-Ericson (johanre) wrote :

Since this bug is a showstopper for us (we are in the process of setting up an Ubuntu 10.04 NFS v4 server with ~1500 Ubuntu 10.04 client machines), I would very much like to know (1) whether the mentioned patch will make it into the kernel for lucid at all and (2) if a decision has been made to include it, whether there is a time estimate for when it will make it into the mainstream.

Thanks for your time and great work!

Revision history for this message
Stephane Miller (stephaneeee) wrote :

This is now fixed in lucid's linux-image-2.6.32-25-server package.

Revision history for this message
Alexander Brinkman (abrinkman) wrote :

We have been trying to get NFS4 ACLs working and this issue greatly affects us (showstopper for workstation deployment). I can confirm the bug is fixed in the server kernel image (tried with 2.6.32-27.49), but it seems to be still unpatched in the generic image (tried version 2.6.32-27.29). When will this be patched in the generic kernel?

Revision history for this message
Stefan Bader (smb) wrote :

The patch mentioned in comment #10 has been in Lucid since 2.6.32-25.43 (all kernels with that number are built from the same source, so server and generic are the same). I did a quick test on the current Lucid kernel, and getting ACLs seems to work. Marking as fixed. If this issue does not seem to be fixed, feel free to re-open or create a new bug.
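
To check an installed Lucid kernel against the 2.6.32-25.43 threshold named above, the package version string can be compared field by field. This is a minimal sketch that assumes version strings of the form 2.6.32-27.49, optionally followed by a flavour suffix such as -generic. The struct and function names are hypothetical, and the result is only meaningful for Ubuntu kernel package versions.

```c
#include <assert.h>
#include <stdio.h>

/* The five numeric fields of a version such as "2.6.32-25.43". */
struct ubuntu_kver {
    int v[5];
};

/* Parse the leading "a.b.c-d.e"; anything after it is ignored.
 * Returns 0 on success, -1 if fewer than five numbers were found. */
int parse_ubuntu_kver(const char *s, struct ubuntu_kver *out)
{
    int n;

    assert(s != NULL && out != NULL);
    n = sscanf(s, "%d.%d.%d-%d.%d",
               &out->v[0], &out->v[1], &out->v[2], &out->v[3], &out->v[4]);
    return n == 5 ? 0 : -1;
}

/* Compare field by field; returns -1, 0 or 1 like strcmp(). */
int compare_ubuntu_kver(const struct ubuntu_kver *a,
                        const struct ubuntu_kver *b)
{
    int i;

    for (i = 0; i < 5; i++) {
        if (a->v[i] != b->v[i])
            return a->v[i] < b->v[i] ? -1 : 1;
    }
    return 0;
}

/* 1 if version is at least 2.6.32-25.43, 0 if older, -1 if unparsable. */
int has_acl_fix(const char *version)
{
    struct ubuntu_kver have, need;

    if (parse_ubuntu_kver(version, &have) != 0)
        return -1;
    if (parse_ubuntu_kver("2.6.32-25.43", &need) != 0)
        return -1;
    return compare_ubuntu_kver(&have, &need) >= 0;
}
```

For the kernel in the original report, has_acl_fix("2.6.32-21.31-generic") returns 0. For the server image tested later in the thread, has_acl_fix("2.6.32-27.49") returns 1.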

Changed in linux (Ubuntu):
status: Triaged → Fix Released