lucid: Failure to bring up cryptsetup devices by key files (when not using "splash")

Bug #532898 reported by Daniel Hahler
This bug affects 1 person
Affects: cryptsetup (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Unassigned

Bug Description

Binary package hint: mountall

The boot process waits forever for a cryptsetup device, and I have to work around it as follows:
1. Wait until hard disk activity quiets down (about half a minute)
2. Press Alt-SysRq-i to kill the process
3. Enter root password (root account has to be activated!)
4. Start mountall (again)

Drawbacks: I'm not sure if it's related to this workaround, but when the system is up, I have to "sudo chmod 777 /dev/shm" - otherwise various things (like Chromium) fail in a strange way.
---
Architecture: i386
DistroRelease: Ubuntu 10.04
NonfreeKernelModules: nvidia
Package: mountall 2.7
PackageArchitecture: i386
ProcEnviron:
 SHELL=/bin/bash
 PATH=(custom, user)
 LANG=de_DE.UTF-8
 LANGUAGE=
ProcVersionSignature: Ubuntu 2.6.32-15.22-generic
Tags: lucid
Uname: Linux 2.6.32-15-generic i686
UserGroups: adm admin audio cdrom dialout dip floppy fuse libvirtd lpadmin plugdev pulse sambashare sbuild scanner src video

Revision history for this message
Daniel Hahler (blueyed) wrote :
Changed in mountall (Ubuntu):
importance: Undecided → High
summary: - Failure to bring up cryptsetup devices by key files
+ lucid: Failure to bring up cryptsetup devices by key files
Revision history for this message
Daniel Hahler (blueyed) wrote : Dependencies.txt

apport information

tags: added: apport-collected
description: updated
Revision history for this message
Daniel Hahler (blueyed) wrote : Re: lucid: Failure to bring up cryptsetup devices by key files

I've just tried using "start mountall" in the (mountall?) shell and it worked, too - but the output looked different (it was cleared too fast to be sure, though).

Revision history for this message
Daniel Hahler (blueyed) wrote :

I have to manually do "sudo initctl emit filesystems" btw so that tasks like munin-node get started at all.

Revision history for this message
Daniel Hahler (blueyed) wrote :

Assigning to plymouth: removing "splash" from the kernel cmdline makes the system boot without hassle.

affects: mountall (Ubuntu) → plymouth (Ubuntu)
Revision history for this message
Steve Langasek (vorlon) wrote :

What are the contents of your /etc/crypttab?

Note that when the message about 'waiting for device' shows up, you can just hit 'S' to skip that filesystem and continue booting.

affects: plymouth (Ubuntu) → cryptsetup (Ubuntu)
Changed in cryptsetup (Ubuntu):
status: New → Incomplete
Revision history for this message
Daniel Hahler (blueyed) wrote :

$ cat /etc/crypttab
# <target name> <source device> <key file> <options>
# /dev/md1:
cfast UUID=61fc7fe4-10b1-4fa3-a648-772f658fa5d4 none luks,retry=1,cipher=twofish-cbc-plain,loud
# /dev/md2:
cdata UUID=07a90580-44a4-44cf-b7b6-10502257b7df /etc/path/to/keyfile.luks luks,retry=1,cipher=twofish-cbc-essiv:sha256,loud
# /dev/sdc8:
cdata2 UUID=0f80239b-8243-4a4e-a271-7a1e96293d81 /etc/path/to/keyfile2.luks luks,retry=1,cipher=twofish-cbc-essiv:sha256,loud
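
For a layout like the one above, it can help to list which targets depend on a key file, since those are the entries this report is about. The following is a minimal sketch, not part of cryptsetup: `crypttab_keyfiles` is a hypothetical helper name, and it assumes the four-column crypttab format shown in the header comment.

```shell
# Sketch only: list targets in a crypttab-style file that use a key file.
# Usage: crypttab_keyfiles [FILE]   (FILE defaults to /etc/crypttab)
# Prints "<target> <key file>" for every entry whose third column is not "none".
crypttab_keyfiles() {
    awk '
        /^[[:space:]]*#/ { next }          # skip comment lines
        NF < 3           { next }          # skip blank or incomplete lines
        $3 != "none"     { print $1, $3 }  # entries unlocked by a key file
    ' "${1:-/etc/crypttab}"
}
```

Run against the crypttab above, this would print the cdata and cdata2 entries and leave out cfast, which is unlocked by passphrase.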

Changed in cryptsetup (Ubuntu):
status: Incomplete → Triaged
Revision history for this message
Steve Langasek (vorlon) wrote :

Thanks, will try to reproduce this here.

Which device is mounted at /mnt/datalv?

Revision history for this message
Daniel Hahler (blueyed) wrote :

/dev/mapper/datarcvg-datalv on /mnt/datalv type ext3 (rw,relatime,user_xattr)
(and two bind-mounts of /mnt/datalv itself)

datarcvg-datalv is a LVM LV, built from /dev/md2 (cdata).

(Please note: I'm unable to skip anything (as you suggested above) because the key presses are not recognized, see bug 540569)

Revision history for this message
Daniel Hahler (blueyed) wrote :

It now works for me when "splash" is used and plymouth runs properly (via the "nouveau" driver rather than nvidia-current).

So this appears to fail when plymouth is not fully used.

summary: - lucid: Failure to bring up cryptsetup devices by key files
+ lucid: Failure to bring up cryptsetup devices by key files (when not
+ using "splash")
Revision history for this message
Steve Langasek (vorlon) wrote :

I haven't been able to reproduce this, with or without splash.

/etc/fstab:
/dev/mapper/cdata /mnt/datalv ext4 defaults 0 2

/etc/crypttab:
cdata /dev/sda6 /etc/keyfile-test.luks luks

/proc/cmdline:
root=/dev/mapper/dario-root ro quiet

And plymouth is certainly used even when not passing 'splash', so I'm not sure why that would have any effect.

Your original screenshot shows that there was a mismatch between the versions of plymouth and mountall you had installed (older plymouth, newer mountall) - what plymouth version do you have installed now? Is it possible this has gone away with a plymouth upgrade?

Revision history for this message
iblue (iblue) wrote :

I can confirm this behaviour. When I add entries to /etc/crypttab and /etc/fstab, I get "Waiting for <name of the mount> [SM]". If I comment them out, everything works fine.
Seems like the cryptdisks are just not started.

/etc/crypttab:
bytemachine_wurzelzwerg /dev/mappper/bytemachine-wurzelzwerg none luks
bytemachine_backup /dev/mapper/bytemachine-backup /etc/keys/bytemachine/backup luks
[...]

/etc/fstab:
[...]
/dev/mapper/bytemachine_wurzelzwerg / xfs errors=remount-ro 0 1
/dev/mapper/bytemachine_backup /srv/export/bytemachine/backup auto defaults 0 0
[...]

Revision history for this message
Steve Langasek (vorlon) wrote :

iblue,

Same question - what version of plymouth do you have installed? Also, what kernel version are you booting? (If somehow you're booting an older kernel, that would also mean you were using the older plymouth from a previously-built initramfs.)

Revision history for this message
iblue (iblue) wrote :

apt-cache says I'm using plymouth version 0.8.0~-17.
Kernel is 2.6.32-17-generic x86_64.

I did some further debugging. In /etc/init/cryptdisks-enable.conf, when I replace "start on stopped udevtrigger" with "start on started mountall", you can see that the /dev/mapper devices are created after some time, but even though the devices exist, mountall does not mount them. Maybe a bug in mountall?

When I go to the console (press M when the message is displayed), wait for the devices to show up and then press Ctrl+D, everything boots fine.
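
The manual wait described above could be scripted roughly as follows. This is only a sketch: `wait_for_path` is a hypothetical helper name, the 30-second default is an arbitrary assumption, and polling like this is not how mountall or upstart track devices.

```shell
# Sketch only: wait until a path exists, checking once per second.
# Usage: wait_for_path PATH [TIMEOUT_SECONDS]
# Returns 0 as soon as PATH exists, 1 if it is still missing after the timeout.
wait_for_path() {
    path=$1
    remaining=${2:-30}
    while [ ! -e "$path" ]; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

From the maintenance shell, something like `wait_for_path /dev/mapper/bytemachine_backup 60` would block until the mapping appears or a minute has passed.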

Revision history for this message
Daniel Hahler (blueyed) wrote :

For me, it has been working for a few days: first when plymouth was used fully, but currently also with plymouth using the text plugin (although I have the framebuffer activated after booting?!)
It has already happened that I could not type in the password at the text and/or graphical prompt, but a reboot "fixed" it then.
This suggests that timing issues might be involved.
plymouth 0.8.0~-17 and linux 2.6.32-17-generic.

Revision history for this message
segler (segler-alex) wrote :

i have the same problem, but without using keyfiles. i am using password input, and after i typed and pressed enter it waits indefinitely. i have a desktop computer and a notebook, and both show the same behaviour. they are perfectly updated (latest kernel, mountall, plymouth, cryptsetup,...), since the question occurred before.
if i press "s" after some waiting, it works. that means all gets mounted and i can use the machine. so it takes one keypress more than it should, but this has to be a bug: you should only have to press "s" if you want to skip, and it does not skip anything at all.

Revision history for this message
Steve Langasek (vorlon) wrote :

> they are perfectly updated (latest kernel, mountall, plymouth,
> cryptsetup,...)

Please list exact package versions; packages are not instantaneously at the same version everywhere in the mirror network, so we need you to confirm what versions you're running rather than guessing.

> if i press "s" after some waiting, it works.

Are you not being shown the prompt with the '[SM]' options?

Revision history for this message
Carlos Hernando (carlos-hernando) wrote :

I have the same problem as segler.

2.6.32-18-generic #27-Ubuntu SMP Fri Mar 26 19:51:10 UTC 2010 i686 GNU/Linux
cryptsetup 2:1.1.0~rc2-1ubuntu13
plymouth: 0.8.1-1ubuntu3
mountall: 2.8

fstab:
/dev/mapper/home /home ext4 defaults 0 2

crypttab:
home /dev/sda4 none luks

I type my passphrase, the system unlocks the partition and waits indefinitely.

As a quick fix:
  * Press M (to get the shell).
  * mount /home
  * Control + D

Revision history for this message
TJ (tj) wrote :

I'm experiencing this with ubuntu-desktop i386 with nvidia-current 195.x.y drivers installed.

In this case I cannot find any combination of kernel command-line options (inc. "nomodeset" and "nosplash") or interrupts to get past the issue, whether multi-user or single (recovery), and am currently experiencing an unusable system so I cannot at this moment report the exact package versions installed.

I had managed to boot it via 'recovery' and entering the shell then manually starting GDM about 24-hours ago. I had avoided restarting until around 23:00 UTC because of the difficulty in getting things started. At that time I'd just ensured all packages were upgraded from the GB mirror.

The configuration is:

sda1, sda2 legacy Windows partitions
sda3 ext3 /boot
sda4 LVM VG=Ubuntu

Within VG "Ubuntu" are approximately 15 Logical Volumes (LV). Some are encrypted, others are not.

The encrypted LVs are secured only using a LUKS key-file (no passwords). The key-file is on a USB key. I have a custom script (/usr/local/sbin/crypt-usb-key.sh) that has worked fine since Hardy and still works with Lucid. /etc/crypttab specifies the key-file and custom script.

Ubuntu/Lucid_enc is LUKS containing the root ext4.
Ubuntu/Lucid_var_enc is LUKS containing the /var ext4.
Ubuntu/swap is swapfs
Ubuntu/usr_local is /usr/local ext3.
Ubuntu/home is LUKS containing the /home ext4.

Obviously Ubuntu/Lucid_enc has to be unlocked from initrd using my custom script and it is. Here's a snippet of the output (after Escaping the splash screen):

Unlocking encrypted volume using a key-file
Waiting up to 30 seconds for removable devices...
Success loading key-file
Key slot 0 unlocked.

However, /var never gets unlocked.

If splash is enabled I see "Waiting for /var [SM]" but regardless of what key I press nothing appears to happen.
If I press Escape I see that pressing "M" causes:

init: plymouth main process (438) killed by SEGV signal

I also see several fsck reports for several unencrypted LVs. I don't see anything to suggest cryptsetup has tried to unlock any encrypted partitions.

If I switch away from tty7 and return to it before pressing a key the consoles freeze - no reaction to regular keys or tty switching, although SysRq keys are still responsive.

I'm wondering if this is another symptom of tty7 being 'grabbed' by plymouth?

Revision history for this message
TJ (tj) wrote :

Through more experimentation I found a manual workaround that at least gets me 'in'.

After the splash shows "Waiting for /var [SM]" I try pressing "M" and nothing happens. I then press Escape to dismiss the splash screen. I press "M" again and get

init: plymouth main process ($PID) killed by SEGV signal

I then press Alt+SysRq+i to get to a root shell.

I now manually mount the USB key and run the cryptsetup commands:

mkdir /tmp/USB
mount /dev/sdb1 /tmp/USB
cryptsetup luksOpen /dev/mapper/Ubuntu-Lucid_var_enc var --key-file /tmp/USB/path/to/key/file
cryptsetup luksOpen /dev/mapper/Ubuntu-home home --key-file /tmp/USB/path/to/key/file

Finally I restart the mountall script:

start mountall

And the system continues on to start GDM.

Now I have a way in I'll try to discover some clues on this.

Revision history for this message
Steve Langasek (vorlon) wrote :

Ok - someone who is able to get into their system following this hang, please edit /etc/init/mountall.conf and replace the line:
    exec mountall --daemon $force_fsck $fsck_fix
with:
    exec mountall --daemon $force_fsck $fsck_fix --debug > /dev/mountall.log 2>&1

then break into the shell and save this file away before restarting mountall, and send the resulting log to this bug report?
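
The edit requested above can be previewed without touching the real job file. A minimal sketch, assuming the exec line appears exactly as quoted; `enable_mountall_debug` is a hypothetical helper name, and it only prints the modified text to standard output.

```shell
# Sketch only: print a copy of a mountall job file with debug logging added
# to the exec line, leaving the file itself unchanged.
# Usage: enable_mountall_debug /etc/init/mountall.conf
enable_mountall_debug() {
    # \$ matches the literal dollar signs of $force_fsck and $fsck_fix;
    # \& in the replacement produces a literal ampersand for "2>&1".
    sed 's|^\([[:space:]]*exec mountall --daemon \$force_fsck \$fsck_fix\)[[:space:]]*$|\1 --debug > /dev/mountall.log 2>\&1|' "$1"
}
```

Comparing the output with the original (for example with diff) before copying it into place should show exactly one changed line.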

Revision history for this message
TJ (tj) wrote :

Here's the mountall.log from the affected system here, as described in comment #21 with the slight correction that the encrypted LV Lucid names end in "_encrypted" not "_enc" as previously stated. Also, don't be confused by the "Ubuntu-Lucid" and "Ubuntu-Lucid_var" LVs - they were used to test-install Lucid to unencrypted file-systems and are due to be removed.

Revision history for this message
Steve Langasek (vorlon) wrote :

TJ,

It's an important distinction that the original bug submitter's keyfiles are listed as being located on /etc, whereas yours are on an external USB device. If your keys aren't on the root device, then yes, decrypting the devices is not going to work reliably at boot because there's no way for the cryptdisks-udev job to know it's supposed to wait for this other device to be mounted.

You can customize /etc/init/cryptdisks-enable.conf for your environment by editing its start line to:

  start on stopped udevtrigger and mounted MOUNTPOINT=/mountpoint/for/my/keys

to ensure the cryptsetup upstart job doesn't run before the USB disk is mounted. However, I know of no general fix for this that's going to work with an event-driven boot because the start condition will be different for each user.

Daniel, unless your key files are actually symlinked to some other partition, this also doesn't address the problem you were seeing, so it would be helpful to have a mountall log from your system as well.

Revision history for this message
TJ (tj) wrote : Re: [Bug 532898] Re: lucid: Failure to bring up cryptsetup devices by key files (when not using "splash")

On Wed, 2010-03-31 at 02:02 +0000, Steve Langasek wrote:
> TJ,
>
> It's an important distinction that the original bug submitter's keyfiles
> are listed as being located on /etc, whereas yours are on an external
> USB device. If your keys aren't on the root device, then yes,
> decrypting the devices is not going to work reliably at boot because
> there's no way for the cryptdisks-udev job to know it's supposed to wait
> for this other device to be mounted.

Thanks for giving me a clue on where to investigate, Steve.

That doesn't sound quite right to me. cryptsetup 'knows' (or should
know) that for each device it must run the keyscript. That keyscript
then deals with ensuring the kernel drivers are loaded, the device has
settled, and then mounts the device in order to access the key-file - in
other words, it is the keyscript's responsibility to 'wait for this
other device to be mounted', not cryptdisks-udev.

Also, it has to be remembered that the encrypted root file-system was
successfully unlocked from the same key-file on the external device
using the same keyscript, so we know that the device is accessible and
therefore the keyscript will note that and immediately do the mount.

Is there a way to have the cryptsetup jobs generate a log similar to
mountall to help figure out what is actually happening?
I'm at the wrong end of a 20 hour session right now so not thinking as
incisively as I might be. I'll come at it fresh tomorrow and see what I
can discover and report back.

One thing I'll try is having the key-file on the root file-system just
to see if that is the differentiator.

Revision history for this message
TJ (tj) wrote :

On Wed, 2010-03-31 at 02:37 +0000, TJ wrote:
> One thing I'll try is having the key-file on the root file-system just
> to see if that is the differentiator.

Confirming the easy bit: having the key-file in the encrypted root and
modifying /etc/crypttab allows an uninterrupted boot:

var /dev/mapper/Ubuntu-Lucid_var_encrypted /etc/keyfile luks
home /dev/mapper/Ubuntu-home /etc/keyfile luks

Thanks for that insight Steve - the new event-driven boot is certainly
throwing out some 'interesting' issues!

I'll follow through on the cryptsetup init jobs, especially -udev.

A quick test using:

$ sudo udevadm monitor --property > udevadm-monitor--property.log &
$ sudo udevadm trigger
$ fg
^C

and comparison to /etc/init/cryptdisks-udev.conf reveals that *none* of
the LUKS LVMs has the property ID_FS_USAGE=crypto:

$ grep -i crypt udevadm-monitor--property.log
KERNEL[1270004082.078366] add /devices/virtual/misc/ecryptfs (misc)
DEVPATH=/devices/virtual/misc/ecryptfs
DEVNAME=ecryptfs
UDEV [1270004082.368070] add /devices/virtual/misc/ecryptfs (misc)
DEVPATH=/devices/virtual/misc/ecryptfs
DEVNAME=/dev/ecryptfs
DM_UUID=CRYPT-LUKS1-5a1b276fe73543ff8d09d2c96f4bee42-var_unformatted
DM_NAME=Ubuntu-Lucid_encrypted
DM_UUID=CRYPT-LUKS1-702270f27c3948fb83020b119e29cb24-root
DM_NAME=Ubuntu-Lucid_var_encrypted
DM_UUID=CRYPT-LUKS1-16cb06fad6aa419cadeaac9e1c348fce-home_unformatted

and the actual device-mapper names do not contain the "_unformatted"
suffix (which I have seen remain in /dev/mapper/ in the past):

$ ls -1 /dev/mapper
control
home
root
Ubuntu-all
Ubuntu-home
Ubuntu-Karmic
Ubuntu-Karmic_var
Ubuntu-Lucid_encrypted
Ubuntu-Lucid_var_encrypted
Ubuntu-Media
Ubuntu-SourceCode
Ubuntu-swap
Ubuntu-usr_local
Ubuntu-VideoCapture
Ubuntu-VirtualMachines
var

The DM_UUID= properties match:

$ ls -l /dev/disk/by-uuid/ | egrep 'Ubuntu-.*(encrypted|home)'
16cb06fa-d6aa-419c-adea-ac9e1c348fce -> ../../mapper/Ubuntu-home
5a1b276f-e735-43ff-8d09-d2c96f4bee42 -> ../../mapper/Ubuntu-Lucid_var_encrypted
702270f2-7c39-48fb-8302-0b119e29cb24 -> ../../mapper/Ubuntu-Lucid_encrypted
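
The match between the two listings can be checked mechanically: the 32 hex digits in each DM_UUID are the by-uuid name with the dashes removed. A minimal sketch, assuming the CRYPT-LUKS1-<hex>-<name> layout seen in the log above; `dm_uuid_to_luks_uuid` is a hypothetical helper name.

```shell
# Sketch only: turn a device-mapper DM_UUID such as
#   CRYPT-LUKS1-16cb06fad6aa419cadeaac9e1c348fce-home_unformatted
# into the dashed 8-4-4-4-12 form listed under /dev/disk/by-uuid.
# Prints nothing if the argument does not have that layout.
dm_uuid_to_luks_uuid() {
    printf '%s\n' "$1" | sed -n 's/^CRYPT-LUKS1-\([0-9a-f]\{8\}\)\([0-9a-f]\{4\}\)\([0-9a-f]\{4\}\)\([0-9a-f]\{4\}\)\([0-9a-f]\{12\}\)-.*$/\1-\2-\3-\4-\5/p'
}
```

For the home volume, the DM_UUID from the udev log converts to 16cb06fa-d6aa-419c-adea-ac9e1c348fce, the name that points at Ubuntu-home in the listing.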

The udev log only shows the USB key, the sda3 /boot partition, the two
Windows installations (sda1, sda2) and the LVM PV (sda4) having an
ID_FS_USAGE key:

$ grep -i -B 4 ID_FS_USAGE udevadm-monitor--property.log
ID_FS_UUID=D313-1D74
ID_FS_UUID_ENC=D313-1D74
ID_FS_VERSION=FAT32
ID_FS_TYPE=vfat
ID_FS_USAGE=filesystem
--
ID_FS_UUID=af296c2f-a6f5-4cdb-b74c-66310f169677
ID_FS_UUID_ENC=af296c2f-a6f5-4cdb-b74c-66310f169677
ID_FS_VERSION=1.0
ID_FS_TYPE=ext3
ID_FS_USAGE=filesystem
--
ID_FS_LABEL_ENC=Recovery
ID_FS_UUID=CCE61747E61730E6
ID_FS_UUID_ENC=CCE61747E61730E6
ID_FS_TYPE=ntfs
ID_FS_USAGE=filesystem
--
ID_FS_LABEL_ENC=Vista
ID_FS_UUID=E6E08581E0855927
ID_FS_UUID_ENC=E6E08581E0855927
ID_FS_TYPE=ntfs
ID_FS_USAGE=filesystem
--
ID_FS_UUID=ciBh6h-0yEr-7c3u-y3el-QUAc-i7tj-YyHD46
ID_FS_UUID_ENC=ciBh6h-0yEr-7c3u-y3el-QUAc-i7tj-YyHD46
ID_FS_VERSION=LVM2\x20001
ID_FS_TYPE=LVM2_member
ID_FS_USAGE=raid

Steve, do you know under what circumstances "ID_FS_USAGE=crypto" should
exist? Maybe it is only in raw disk partitions, not LVM volumes?
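
To see at a glance which ID_FS_USAGE values a captured log contains, the grep above can be reduced to a count per value. A minimal sketch, assuming a log written by "udevadm monitor --property" as in the test above; `fs_usage_summary` is a hypothetical helper name.

```shell
# Sketch only: count how often each ID_FS_USAGE value occurs in a udev
# property log. Prints one "<value> <count>" line per value, sorted by value.
# Usage: fs_usage_summary udevadm-monitor--property.log
fs_usage_summary() {
    awk -F= '
        $1 == "ID_FS_USAGE" { count[$2]++ }
        END { for (usage in count) print usage, count[usage] }
    ' "$1" | sort
}
```

If no line starting with "crypto" appears in the output, no device in the log carried ID_FS_USAGE=crypto, which is the situation described above.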

Revision history for this message
TJ (tj) wrote :

I've found an alternative method for successfully using the USB key that
doesn't require a custom keyscript, and it is probably the recommended
solution too!

1. Ensure there is an entry for the USB key device in /etc/fstab:

# USB key
LABEL=USB /media/USB auto defaults 0 2

In my case the vfat file-system on the key has the label "USB" (defined
using mkfs.vfat ... -n USB ... when the file-system was created). The
fstab entry could use UUID= instead to match the file-system containing
the key-file.

2. Modify entries in /etc/crypttab so the key-file path includes the
mount-point set in fstab (in this example, "/media/USB"):

var /dev/mapper/Ubuntu-Lucid_var_encrypted /media/USB/home/tj/keyfile luks
home /dev/mapper/Ubuntu-home /media/USB/home/tj/keyfile luks

3. The IMPORTANT bit. Modify /etc/default/cryptdisks and set the
variable CRYPTDISKS_MOUNT to the mount-point defined in fstab:

CRYPTDISKS_MOUNT="/media/USB"

Now restart. There is no 'hiccup' "Waiting for /var [SM]", the splash
screen remains in place, and X starts quickly.

The only aspect of using CRYPTDISKS_MOUNT I haven't yet tested is for
the root file-system, since that will require the initrd image to be
updated, and may or may not still need the custom keyscript. I'll test
that tomorrow and report back.
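
Steps 2 and 3 only line up if every key-file path really sits under the mount-point given in CRYPTDISKS_MOUNT. A minimal sketch of a consistency check, assuming the four-column crypttab format used above; `keyfiles_outside_mount` is a hypothetical helper name and is not part of cryptsetup.

```shell
# Sketch only: list crypttab entries whose key file is NOT below a given
# mount-point. Entries with "none" as key file are ignored.
# Usage: keyfiles_outside_mount CRYPTTAB MOUNTPOINT
keyfiles_outside_mount() {
    awk -v mnt="$2" '
        /^[[:space:]]*#/ || NF < 3 || $3 == "none" { next }
        index($3, mnt "/") != 1 { print $1, $3 }   # path does not start with MOUNTPOINT/
    ' "$1"
}
```

With the entries from step 2, `keyfiles_outside_mount /etc/crypttab /media/USB` should print nothing; any line it does print names a target whose key file lives somewhere else.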

Revision history for this message
Daniel Hahler (blueyed) wrote :

> Daniel, unless your key files are actually symlinked to some other partition,
> this also doesn't address the problem you were seeing, so it would be helpful
> to have a mountall log from your system as well.

As said in comment 17 (https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/532898/comments/17) it's actually fixed for me. The key files are not symlinked to another partition.
If a logfile is still useful, I could provide one though, of course.

Revision history for this message
Carlos Hernando (carlos-hernando) wrote :

Steve,

This is the log using no key file. The encrypted partition is /dev/sda4 (home) and it's correctly unlocked.

Revision history for this message
TJ (tj) wrote :

I found that cryptsetup still needs the custom keyscript to access the key-file on the external USB device for the encrypted root.

Without the keyscript specified in "/etc/crypttab" cryptsetup doesn't create an initrd /conf/conf.d/cryptroot file - which maybe should be the subject of another bug.

For the /var and /home encrypted volumes I also found that this bug will strike if the USB device isn't inserted very early and given a chance to settle. I find that inserting it immediately after the GRUB menu appears (which I have displayed for up to 10 seconds) works. If it is inserted only when the kernel begins running, the device isn't ready in time to be mounted according to the "/etc/fstab" entry, and this bug will strike.

Revision history for this message
Steve Langasek (vorlon) wrote :

Carlos, your bug doesn't fit the description either of TJ's issue or of the original issue reported by Daniel; please file a separate bug report.

Revision history for this message
Steve Langasek (vorlon) wrote :

Daniel,

Sorry, I had forgotten that you said earlier this was fixed for you. It was your mention of "timing issues" that led me to leave this bug report open, since it seemed possible that you might have the same problem later. If it's working reliably for you, then we should probably close this bug.

Revision history for this message
Daniel Hahler (blueyed) wrote :

Closing: working reliably for a while for myself.
Thanks.

Changed in cryptsetup (Ubuntu):
status: Triaged → Fix Released