run-init crashes when root is unionfs on nfs

Bug #85145 reported by Jonas Bonn
Affects                        Status         Importance   Assigned to   Milestone
initramfs-tools (Ubuntu)       Invalid        Undecided    Unassigned
linux-source-2.6.20 (Ubuntu)   Fix Released   Medium       Unassigned

Bug Description

Binary package hint: initramfs-tools

In initramfs-tools/scripts/nfs-bottom we use a custom script to set up a unionfs over an NFS mount in order to get a writeable, shared base system. When this script is included in our initramfs, then run-init (at the end of the initramfs init script) causes a kernel oops...

1) We used this method with Dapper and it worked perfectly.

2) With Feisty, we get the oops.

The script to mount the unionfs over NFS root looks like this:

---------------------------
#!/bin/sh

PREREQ=""
DESCRIPTION="Setting up volatile unionfs over read-only root..."

. /scripts/functions

prereqs()
{
       echo "$PREREQ"
}

case $1 in
# get pre-requisites
prereqs)
       prereqs
       exit 0
       ;;
esac

log_begin_msg "$DESCRIPTION"

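# Create a RAM-backed scratch area to hold all writes.
mkdir -p /cow
mount -n -t tmpfs tmpfs /cow
# Stack the writable tmpfs branch over the read-only NFS root.
mount -n -t unionfs -o "dirs=/cow=rw:${rootmnt}=nfsro" unionfs "${rootmnt}"

# Generate resolv.conf from the settings in /tmp/net-eth0.conf.
. /tmp/net-eth0.conf
cat > "${rootmnt}/etc/resolv.conf" << EOF
search $DNSDOMAIN
nameserver $IPV4DNS0
EOF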

log_end_msg
-----------------------

If I comment out the line with the unionfs mount, then everything works fine.
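
To confirm that the unionfs mount is the trigger without rebuilding the initramfs between tests, the mount could be made conditional on a boot parameter. The following is only a sketch of that idea: the parameter name nounionfs and the helper cmdline_has_flag are hypothetical and are not part of initramfs-tools.

```shell
#!/bin/sh
# Return 0 if the word $2 appears as a separate parameter in the
# kernel command line string $1, and 1 otherwise.
cmdline_has_flag()
{
       for param in $1; do
              if [ "$param" = "$2" ]; then
                     return 0
              fi
       done
       return 1
}

# Intended use inside the nfs-bottom script, before the unionfs mount
# (shown as a comment because it only makes sense inside the initramfs):
#
# if cmdline_has_flag "$(cat /proc/cmdline)" nounionfs; then
#        log_end_msg
#        exit 0
# fi
```

Booting once with nounionfs appended to the kernel command line and once without it would then show whether the crash follows the unionfs mount.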

Steps to reproduce:

1) Add the above script to /usr/share/initramfs-tools/scripts/nfs-bottom
2) Update your initramfs (update-initramfs -u)
3) Boot with the initramfs over NFS.

Oops when run-init is called...
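
Steps 1 and 2 can be sketched as a small helper. This is an illustration under assumptions, not a tested procedure: the function name install_hook, the file name unionfs-root and the variable HOOK_DIR are hypothetical, and the helper assumes the script has to be executable to be picked up at boot.

```shell
#!/bin/sh
# Check a hook script for syntax errors, then copy it into the given
# scripts directory with the executable bit set.
#   $1  path to the hook script
#   $2  destination directory (the nfs-bottom directory from step 1)
install_hook()
{
       src=$1
       dest_dir=$2
       if [ ! -f "$src" ]; then
              echo "no such script: $src" >&2
              return 1
       fi
       # sh -n parses the script without running it.
       sh -n "$src" || return 1
       mkdir -p "$dest_dir" || return 1
       install -m 0755 "$src" "$dest_dir/"
}

# Typical use, as root:
#
#   install_hook ./unionfs-root "$HOOK_DIR"
#   update-initramfs -u
```

The syntax check is there because a broken script in the initramfs is awkward to debug once the machine no longer boots.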

Phillip Lougher (phillip-lougher) wrote :

Confirmed. The unionfs mount doesn't actually need to be performed as part of the initramfs to cause a kernel hang. Issuing the unionfs mount on an NFS filesystem in the shell caused my system to hang...
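
A manual reproduction along these lines might look like the sketch below. The exact commands used for the test above are not given in this report, so the sequence simply mirrors the mounts from the nfs-bottom script; the export name and mount points are hypothetical. The function only prints the commands, since running them on an affected kernel is expected to hang the machine.

```shell
#!/bin/sh
# Print (do not run) the commands for a manual reproduction.
#   $1  NFS export, for example server:/export (hypothetical)
#   $2  mount point for the NFS export
#   $3  mount point for the tmpfs copy-on-write branch
print_repro_cmds()
{
       nfs_export=$1
       nfs_mnt=$2
       cow_mnt=$3
       cat << EOF
mkdir -p $nfs_mnt $cow_mnt
mount -t nfs -o ro,nolock $nfs_export $nfs_mnt
mount -t tmpfs tmpfs $cow_mnt
mount -t unionfs -o dirs=$cow_mnt=rw:$nfs_mnt=nfsro unionfs $nfs_mnt
EOF
}

print_repro_cmds server:/export /mnt/nfsroot /mnt/cow
```

The printed commands need root and are best tried only on a disposable test system.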

A couple of things to clarify:

1. Are both your Dapper and Feisty kernels using the vanilla Unionfs code merged by Ubuntu?

2. Did your kernel oops or hang? If it oopsed, can you attach the oops message (or take a picture of the screen if that's not possible)?

Dapper has Unionfs version 1.1.2. Feisty has a snapshot from 20060916. It is quite possible there is a regression there. Will investigate further.

Changed in linux-source-2.6.20:
status: Unconfirmed → In Progress
Jonas Bonn (jonas.bonn) wrote :

1) Yes, I am using the vanilla unionfs code from Ubuntu in both cases... I have done nothing special.

2) The kernel panics... unfortunately I cannot get a picture of the screen for you, as I have no camera available here. Is the oops message saved somewhere so that I can get at it some other way?

Changed in initramfs-tools:
status: Unconfirmed → Rejected
Phillip Lougher (phillip-lougher) wrote :

No, there isn't any way to obtain an oops from the initramfs. However, I have obtained an oops myself.

The bug is caused by Unionfs calling lookup_one_len() on an NFS mounted filesystem, see

http://www.fsl.cs.sunysb.edu/pipermail/unionfs/2006-August/004773.html

The fix is to ensure that Unionfs correctly passes intents to the underlying NFS filesystem. This is currently done with a new lookup_one_len_nd() call which passes the necessary nameidata argument.

The upshot of this is that the bug is fixed, but only in Unionfs 2.0. We will have to evaluate whether to incorporate Unionfs 2.0 into the current Feisty kernel.

Changed in linux-source-2.6.20:
assignee: nobody → phillip-lougher
Changed in linux-source-2.6.20:
importance: Undecided → Medium
Phillip Lougher (phillip-lougher) wrote :

Unionfs 2.0 fix backported to the Unionfs 1.2 version in our Feisty kernel.

Changed in linux-source-2.6.20:
status: In Progress → Fix Committed
Changed in linux-source-2.6.20:
status: Fix Committed → Fix Released
cybaix (cybaix) wrote :

I just tried this out under Gutsy and get the same problem. Did the backported fix not get carried over to the Gutsy kernel? I am running 2.6.22-14-generic. Does anyone have a workaround?
