Some timeout oops reports seem incomplete

Bug #288064 reported by Diogo Matsubara
4
Affects Status Importance Assigned to Milestone
Launchpad itself
Fix Released
High
Gary Poster

Bug Description

Some timeout oops reports are missing the SQL statements. Those timeouts with no sql statements recorded started on 2008-10-18. Some examples: OOPS-1022C1556, OOPS-1023A1492, OOPS-1024A1719, OOPS-1025B1681, OOPS-1026A4137, OOPS-1027EC19

Conversation related to this bug on #launchpad-code 2008-10-22 around 10:45:

<stub> That first one (OOPS-1025A1069) certainly looks like a db lock could not be obtained.
<BjornT> stub: no it doesn't. the timeout happens before the query is sent to the db. there seems to b
sql queries not being recorded in the oops report. there's no way you can get to that code point without issu
ies.
<Ursinha> BjornT, but I think this is not the only problem
<Ursinha> BjornT, recalls me bug 252674
<ubuntoid> Ursinha: Bug 252674 on http://launchpad.net/bugs/252674 is private
* mwhudson stabs the hotel network
<mwhudson> with rusty barbed pointy things
<BjornT> Ursinha: right. it's timing out for some reason, but hard to tell why without seeing the sql queries.
<stub> BjornT: I don't see how - timeout exceptions are database exeptions - you can't get one without issuing a query. In this case, it looks like the first query attempted blocked until the timeout kicked in, aborting it and raising the exception.
<BjornT> stub: timeouts can happen in two places. either in connection_raw_execute or connection_raw_execute_error. if the timeout is in the latter, it's the db that is raising the error. if it's in the former, we raise it ourselves, before sending the query to the db
<stub> That would be a bug (or feature or limitation) I think - the statement that times out isn't counted in the stats.
<stub> BjornT: Ahh...
<stub> So if this exception happened while the HTTP/1.1 problem was happening, it would make sense
<BjornT> stub: not really. some sql queries have certainly been issued, but they are not showing up in the oops report for some reason
<stub> BjornT: I wonder if we are not logging statements made on the non (LPMAIN_STORE, DEFAULT_FLAVOR) store?
<kiko> stub, are we sending statements elsewhere?
<stub> kiko: The oops reporter might only be looking at one database adapter instead of all four. Just guessing here - not looking at code.
<BjornT> stub: hmm, that could be possible. but i haven't looked at the code to confirm.
<stub> So statements might be sitting somewhere that nobody looks at.
<kiko> stub, ah, we are using multiple adapters but a single database then?

Revision history for this message
Björn Tillenius (bjornt) wrote :

The issue is that we call clear_request_started() and set_request_started() before generating the oops.

Changed in launchpad-foundations:
importance: Undecided → High
status: New → Triaged
assignee: nobody → garyposter
Revision history for this message
Gary Poster (gary) wrote :

rev 7246. Preparing cherry-pick.

Changed in launchpad-foundations:
status: Triaged → Fix Committed
Changed in launchpad-foundations:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.