Comment 1 for bug 434192

Jeroen T. Vermeulen (jtv) wrote :

Tim has found the problem. The first thing that happens to a job when processing begins is that it's marked as running; the transaction is committed to make this visible.

When a job fails, the transaction is aborted, a new one is implicitly started, and the job is marked as failed in that new transaction. Then the exception is re-raised. It's caught one call level up, where the error is appended to a list of failures. Then the next job is processed, which as a side effect commits the previous job's failure mark.

But what happens at the end, when there is no next job? I looked at this and stupidly discarded it as something that would have been noticed. At the end, the script registers oopses for any failures. This was failing because of missing configuration values. And at that point there's nothing to catch and handle the exception, so the script borks out. The transaction is never committed, and so that final failure is not recorded.
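To make the sequence concrete, here is a rough sketch of the pattern as I understand it. This is not the actual script: FakeTransaction, run_job, run_all and report_failures are all made-up names, the transaction is a toy in-memory stand-in, and the final commit after reporting is my assumption about where the commit would have happened.

```python
class FakeTransaction:
    """Hypothetical in-memory stand-in for the real transaction manager."""

    def __init__(self):
        self.pending = {}    # changes made in the current transaction
        self.committed = {}  # what other processes would see

    def set_status(self, job_name, status):
        self.pending[job_name] = status

    def commit(self):
        self.committed.update(self.pending)
        self.pending = {}

    def abort(self):
        self.pending = {}


def run_job(txn, job_name, action):
    txn.set_status(job_name, "running")
    txn.commit()  # makes "running" visible; also commits any earlier failure mark
    try:
        action()
    except Exception:
        txn.abort()
        # Recorded in the implicitly started new transaction, not yet committed.
        txn.set_status(job_name, "failed")
        raise
    txn.set_status(job_name, "completed")


def run_all(txn, jobs, report_failures):
    failures = []
    for job_name, action in jobs:
        try:
            run_job(txn, job_name, action)
        except Exception as error:
            failures.append((job_name, error))
    # If this raises, nothing catches it and the commit below never runs.
    report_failures(failures)
    txn.commit()


def failing_action():
    raise RuntimeError("job blew up")


def broken_report(failures):
    # Stands in for oops registration failing on a missing config value.
    raise KeyError("missing configuration value")


txn = FakeTransaction()
try:
    run_all(txn, [("first", failing_action), ("last", failing_action)], broken_report)
except KeyError:
    pass  # the real script just dies here

print(txn.committed)
```

In this toy version the first job's failure gets committed when the second job is marked as running, but the last job is left showing as running: its failure mark only ever existed in the uncommitted transaction.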