codebrowse deadlocks on logging lock

Bug #382050 reported by Michael Hudson-Doyle
Affects: Launchpad itself
Status: Fix Released
Importance: High
Assigned to: Michael Hudson-Doyle
Milestone: (not set)

Bug Description

We kill threads that take longer than a minute or so by sending them a SystemExit exception. Unfortunately, threading.RLock isn't safe against asynchronous exceptions, and it's possible for a thread to be killed while holding the lock that the logging module uses to ensure that logging output doesn't get jumbled. After that happens, any thread that tries to log (i.e., all of them) will block forever.
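To make the failure mode concrete, here is a minimal sketch of what happens once a thread dies while holding the lock. It does not reproduce the real race: instead of an asynchronous exception arriving at an unlucky moment, the hypothetical `killed_worker` simply acquires the lock and raises SystemExit itself. `log_lock` is a stand-in for the lock inside the logging module, and `can_log` is an illustrative helper that uses a timeout so the demo reports the problem instead of blocking forever.

```python
import threading

# Stand-in for the RLock that the logging module uses internally (assumption
# for illustration; this is not the logging module's actual lock object).
log_lock = threading.RLock()


def killed_worker():
    # Simulates the bad case: SystemExit lands after acquire() has succeeded
    # but before any release() runs, so the lock stays held by a dead thread.
    log_lock.acquire()
    raise SystemExit


def can_log(timeout=0.2):
    # Called from the main thread: try to take the lock the way a logging
    # call would, but give up after `timeout` seconds instead of hanging.
    acquired = log_lock.acquire(timeout=timeout)
    if acquired:
        log_lock.release()
    return acquired


before = can_log()

worker = threading.Thread(target=killed_worker)
worker.start()
worker.join()  # the worker has exited, yet it never released the lock

after = can_log()

print(before, after)
```

With a real logging call there is no timeout, so every later attempt to log would block indefinitely, which is the hang described above.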

Although the window in which an exception can leave the lock held is pretty small, the problem seems to be happening surprisingly often.

I don't really know what the fix is. I guess figuring out why we have requests taking so long that we have to kill the threads processing them would be ideal, but possibly a little ambitious. We need to do something, though.

Tags: lp-code
Michael Hudson-Doyle (mwhudson) wrote :

Fixed in r51 of ~launchpad-pqm/launchpad-loggerhead/devel.

Changed in launchpad-code:
assignee: nobody → Michael Hudson (mwhudson)
status: Triaged → Fix Committed
Michael Hudson-Doyle (mwhudson) wrote :

This was actually released to production ages ago.

Changed in launchpad-code:
status: Fix Committed → Fix Released