Comment 2 for bug 406397

Scott James Remnant (Canonical) (canonical-scott) wrote : Re: init: job stuck after wrong use of expect fork

Unfortunately what we're running into here turns out to be good old-fashioned UNIX semantics.

Becoming a daemon involves calling fork() twice to completely detach from your calling session and terminal before carrying on in the grandchild process while the parent and child both exit.
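For anyone who wants to see that sequence concretely, here is a minimal sketch in Python (whose os.fork() is a thin wrapper over the system call). The function name spawn_daemon and the pipe used to report back are my own illustration, not anything from upstart or libdaemon; the calling process stands in for whoever spawned the job.

```python
import os

def spawn_daemon():
    """Run a double fork and return (parent_pid, daemon_pid, daemon_sid).

    The caller plays the role of the spawner. All names here are
    hypothetical and exist only for illustration.
    """
    read_fd, write_fd = os.pipe()
    parent_pid = os.fork()
    if parent_pid == 0:
        # "parent": the process the spawner actually started
        try:
            os.close(read_fd)
            if os.fork() == 0:
                # "child": leave the caller's session and terminal
                os.setsid()
                if os.fork() == 0:
                    # "grandchild": this is where the daemon carries on
                    msg = f"{os.getpid()} {os.getsid(0)}"
                    os.write(write_fd, msg.encode())
        finally:
            # parent, child and (in this demo) grandchild all exit here
            os._exit(0)
    # Spawner: read the grandchild's report, then reap the parent.
    os.close(write_fd)
    data = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(parent_pid, 0)
    daemon_pid, daemon_sid = (int(x) for x in data.decode().split())
    return parent_pid, daemon_pid, daemon_sid
```

The spawner only ever learns the pid of the first process from fork(); the pid of the grandchild is something it has to find out some other way, which is the whole reason the "expect" stanza exists.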

When the parent exits, init receives SIGCHLD because init is the process that spawned it and is therefore its parent. The child process is then reparented to the init daemon.
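The reparenting can be observed from the child's side. This is a rough sketch, again with made-up names (observe_reparenting), in which the parent exits without waiting and the child polls getppid() until it changes. On a system with a subreaper or inside a container the new parent may not be pid 1, so the sketch only reports whatever the kernel picked.

```python
import os
import time

def observe_reparenting():
    """Return (parent_pid, new_ppid) where new_ppid is what the child
    sees from getppid() once its original parent has exited."""
    read_fd, write_fd = os.pipe()
    parent_pid = os.fork()
    if parent_pid == 0:
        # "parent": fork a child, then exit without waiting for it
        try:
            os.close(read_fd)
            me = os.getpid()
            if os.fork() == 0:
                # "child": wait until the kernel has given us a new parent
                while os.getppid() == me:
                    time.sleep(0.01)
                os.write(write_fd, str(os.getppid()).encode())
        finally:
            os._exit(0)
    # Spawner: reap the parent, then read what the child observed.
    os.close(write_fd)
    os.waitpid(parent_pid, 0)
    new_ppid = int(os.read(read_fd, 32).decode())
    os.close(read_fd)
    return parent_pid, new_ppid
```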

When the child exits, if the parent has *not yet* exited (remember that after a fork() things do not happen in a deterministic order), the parent receives the SIGCHLD for it because the parent spawned it; otherwise init receives the SIGCHLD because it's the new parent.
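The first of those two cases, where the parent is still alive and collects the child's exit itself, looks roughly like this. It is only a sketch under my own assumptions: the name parent_collects_child and the exit code 7 are arbitrary, and a blocking waitpid() stands in for whatever a real program would do on SIGCHLD.

```python
import os

def parent_collects_child():
    """Return (child_pid, reaped_pid, exit_code) as seen by the parent."""
    read_fd, write_fd = os.pipe()
    parent_pid = os.fork()
    if parent_pid == 0:
        # "parent": stays alive long enough to see the child exit
        try:
            os.close(read_fd)
            child_pid = os.fork()
            if child_pid == 0:
                os._exit(7)  # "child": exit with an arbitrary status
            # The parent consumes the child's exit status here, so no
            # other process will ever be told about it.
            reaped_pid, status = os.waitpid(child_pid, 0)
            code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
            os.write(write_fd, f"{child_pid} {reaped_pid} {code}".encode())
        finally:
            os._exit(0)
    # Spawner: it only ever hears about the parent, never the child.
    os.close(write_fd)
    data = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(parent_pid, 0)
    child_pid, reaped_pid, code = (int(x) for x in data.decode().split())
    return child_pid, reaped_pid, code
```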

This isn't normally a problem because daemons don't worry about handling SIGCHLD before daemonising, so the SIGCHLD is still pending when the parent exits and the child is reparented. The kernel notices this and sends SIGCHLD to the init daemon after reparenting the zombie.

But for some reason known only to himself, Lennart has chosen to explicitly and deliberately install a SIGCHLD handler during the daemonisation (inside the actual call to daemon_fork() in the libdaemon library avahi uses).

We simply can't handle this through ptrace and signals. Fixing this properly is going to need the netlink-based code. Fortunately it probably only affects Lennart's code and only when you get the "expect" line wrong.

It's somewhat unfortunate that this results in a stuck job, because init is basically still waiting for the SIGCHLD that never comes. This isn't a situation that's really possible to deal with; if you like, it's an assertion error.

The only alternative would be to have some kind of timeout after sending SIGKILL for the process to die, but then we'd hit other problems when the process is truly still running (e.g. NFS timeout in kernel deadlock).
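To make the timeout idea concrete, here is a sketch of what "SIGKILL, then wait a bounded time" could look like for a direct child. This is not upstart code; kill_and_wait, spawn_sleeper and the default timeout are invented for illustration, and the sketch does nothing about the harder case where the process is genuinely still alive in the kernel.

```python
import os
import signal
import time

def spawn_sleeper():
    """Fork a child that just sleeps; returns its pid (demo helper)."""
    pid = os.fork()
    if pid == 0:
        try:
            time.sleep(60)
        finally:
            os._exit(0)
    return pid

def kill_and_wait(pid, timeout=2.0, poll_interval=0.05):
    """Send SIGKILL to a child and poll for its exit.

    Returns True if the child was reaped before the deadline and
    False if we gave up waiting.
    """
    os.kill(pid, signal.SIGKILL)
    deadline = time.monotonic() + timeout
    while True:
        reaped, _status = os.waitpid(pid, os.WNOHANG)
        if reaped == pid:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A False return is exactly the ambiguous case: the caller cannot tell a lost notification from a process that is stuck in the kernel and will die later.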

I strongly dislike the idea of a "just forget about it" flag, but I guess we'll need one of those too.