Comment 55 for bug 525154

David Mathog (mathog) wrote : Re: mountall for /var races with rpc.statd

The changes in #8 did not work for me. /var is not a separate mount here; it is just part of /, so Upstart's "mounted" test cannot be applied.

The strangest thing about this whole mess is that, at least in my hands, rpc.statd can be in a state where to all outward appearances it is running normally ("status statd" shows "start/running" and "rpcinfo -p" shows both ports), yet until it is restarted (service statd stop; service statd start) mountall will never respond to a SIGUSR1 sent from within an init script or (sometimes) from an "at" job. Yet mountall will always respond to that signal when root sends it from a terminal! Moreover, in this strange state, if root enters "mount /mnt/safserver/u1" in a terminal, all NFS mounts are made, not just that one.
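
For reference, the restart-then-signal sequence described above could be sketched roughly as follows. This is only an illustration, not a tested fix: the function name retry_nfs_mounts and the RUN variable are made up here, the restart is assumed to be "service statd stop; service statd start", and pkill (from procps) is assumed to be available. By default the commands are only printed; set RUN to an empty string and run as root to actually execute them.

```shell
#!/bin/sh
# Hypothetical helper: restart statd, then ask mountall to retry its mounts.
# RUN is an illustrative dry-run switch: unset means "echo" (print only),
# RUN="" means really execute the commands.
retry_nfs_mounts() {
    run=${RUN-echo}
    $run service statd stop
    $run service statd start
    # Send SIGUSR1 to the process named exactly "mountall".
    $run pkill -USR1 -x mountall
}
```

Calling retry_nfs_mounts with RUN unset just lists the three commands, which makes it safe to try before wiring it into an init script or an "at" job.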

Rarely, I also see this in /var/log/boot.log:

  mount.nfs: DNS resolution failed for safserver: Name or service not known

This is just wrong because nsswitch.conf has "files" first for hosts, and safserver is in /etc/hosts. Probably yet another race condition. This one is definitely not persistent, since (even without my fix) name resolution always works by the time I log in after a failed NFS mount.
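
The two facts above (that "files" comes first for hosts, and that safserver is listed in /etc/hosts) can be checked with something like the sketch below. The function names hosts_lookup_order and host_in_file are invented for illustration, and the parsing is deliberately simple.

```shell
#!/bin/sh
# Print the lookup sources on the "hosts:" line, one per line.
# $1 = nsswitch file (defaults to /etc/nsswitch.conf).
hosts_lookup_order() {
    awk '$1 == "hosts:" { for (i = 2; i <= NF; i++) print $i; exit }' \
        "${1:-/etc/nsswitch.conf}"
}

# Succeed if host name $1 appears as a name or alias in hosts file $2
# (defaults to /etc/hosts).
host_in_file() {
    awk -v h="$1" '
        { sub(/#.*/, "") }                                    # drop comments
        { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }  # names follow the address
        END { exit !found }
    ' "${2:-/etc/hosts}"
}
```

For example, "hosts_lookup_order | head -n 1" should print "files" on this machine, and "host_in_file safserver" should succeed. "getent hosts safserver" exercises the same lookup path on a running system, but none of this says anything about what is available at the moment mount.nfs runs during boot.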

Just discovered that the order of the lines in /etc/fstab also seems to make a difference:

This one fails the most (every one of 4 boots):

proc /proc proc nodev,noexec,nosuid 0 0
LABEL=root / ext3 errors=remount-ro 0 1
LABEL=boot /boot ext3 defaults 0 2
LABEL=swap none swap sw 0 0
/dev/fd0 /media/floppy0 auto rw,user,noauto,exec,utf8 0 0
safserver:/u4/pdb /mnt/safserver/pdb nfs ro,bg,hard,intr 0 0
safserver:/u1 /mnt/safserver/u1 nfs rw,bg,hard,intr 0 0
/dev/sda1 /mnt/windows/C ntfs-3g ro 0 0
/dev/sda6 /mnt/windows/D ntfs-3g defaults 0 0

This one fails the least (none of 4 boots):

proc /proc proc nodev,noexec,nosuid 0 0
LABEL=root / ext3 errors=remount-ro 0 1
LABEL=boot /boot ext3 defaults 0 2
LABEL=swap none swap sw 0 0
/dev/fd0 /media/floppy0 auto rw,user,noauto,exec,utf8 0 0
/dev/sda1 /mnt/windows/C ntfs-3g ro 0 0
/dev/sda6 /mnt/windows/D ntfs-3g defaults 0 0
safserver:/u4/pdb /mnt/safserver/pdb nfs ro,bg,hard,intr 0 0
safserver:/u1 /mnt/safserver/u1 nfs rw,bg,hard,intr 0 0

This one is in between (fails about 75% of the time):

proc /proc proc nodev,noexec,nosuid 0 0
LABEL=root / ext3 errors=remount-ro 0 1
LABEL=boot /boot ext3 defaults 0 2
LABEL=swap none swap sw 0 0
/dev/fd0 /media/floppy0 auto rw,user,noauto,exec,utf8 0 0
/dev/sda1 /mnt/windows/C ntfs-3g ro 0 0
safserver:/u4/pdb /mnt/safserver/pdb nfs ro,bg,hard,intr 0 0
safserver:/u1 /mnt/safserver/u1 nfs rw,bg,hard,intr 0 0
/dev/sda6 /mnt/windows/D ntfs-3g defaults 0 0

This suggests that the ntfs-3g mounts provide enough of a delay to overcome the race between statd starting and the NFS mounts. (Mostly; surely over enough boots it would fail even with the best order.)
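
To see quickly which of these layouts a given fstab has, a small check along the following lines might help. It is only a sketch: the function name fstab_nfs_order and its output labels are invented here, and it reports ordering only. It does nothing about the race itself.

```shell
#!/bin/sh
# Report how the NFS entries are ordered in an fstab file ($1):
#   no-nfs    no nfs/nfs4 entries at all
#   nfs-last  every NFS entry comes after all other entries
#   mixed     at least one non-NFS entry follows an NFS entry
fstab_nfs_order() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }            # skip blank lines and comments
        $3 ~ /^nfs4?$/ { seen_nfs = 1; next }    # third field is the fs type
        seen_nfs { mixed = 1 }                   # non-NFS entry after an NFS one
        END {
            if (!seen_nfs) print "no-nfs"
            else print (mixed ? "mixed" : "nfs-last")
        }
    ' "$1"
}
```

Run as "fstab_nfs_order /etc/fstab". Of the three layouts listed above, the first and third come out as "mixed" and the second as "nfs-last".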

Upstart has far too many mysteries and hidden variables for my taste!