code.lp.net/launchpad timing out frequently

Bug #328302 reported by Martin Albisetti
Affects Status Importance Assigned to Milestone
Launchpad itself
Fix Released
High
Tim Penhey

Bug Description

I'm getting timeouts on code.lp.net/launchpad two or three times a day.
Random oops of it: https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1138EA512
Please help!

Similar timeouts on ubunet and bzr: OOPS-1180B888, OOPS-1180E705

Revision history for this message
Jonathan Lange (jml) wrote :

This should really be addressed this cycle, since it's a very important page.

Assigning to thumper for possible re-assignment.

Changed in launchpad-bazaar:
importance: Undecided → High
status: New → Triaged
assignee: nobody → thumper
milestone: none → 2.2.2
Revision history for this message
Jonathan Lange (jml) wrote :

Stuart, can you look at this query and tell us how to make the pain go away?

Changed in launchpad-bazaar:
assignee: thumper → stub
Revision history for this message
Tim Penhey (thumper) wrote :

jml and I spent some time talking about this, and much of what we do in code can be done with the following single SQL call. It is, however, a little slow (https://pastebin.canonical.com/14346/). Can this be made faster? Or should we look into denormalisation?

select count(distinct(Revision.id)) as revision_count,
 count(distinct(
    coalesce(RevisionAuthor.person, -RevisionAuthor.id))) as committer_count
from Revision
join RevisionAuthor on Revision.revision_author = RevisionAuthor.id
join BranchRevision on BranchRevision.revision = Revision.id
join Branch on BranchRevision.branch = Branch.id
where
    Revision.revision_date > '2009-02-01'
AND Revision.revision_date <= '2009-02-27'
and Branch.product = 10294
and BranchRevision.revision >= (
  select min(id) from revision where revision_date > '2009-02-01')
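
A minimal sketch of what the coalesce(RevisionAuthor.person, -RevisionAuthor.id) expression in the query above does to the committer count. It runs against an in-memory SQLite database rather than the real Launchpad schema, and the table contents are made up for illustration: authors linked to a person are counted once per person, while unlinked authors are counted by their negated author id so they cannot collide with a person id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: authors 1 and 2 are two identities of person 100,
# author 3 is not linked to any person.
conn.executescript("""
    CREATE TABLE RevisionAuthor (id INTEGER PRIMARY KEY, person INTEGER);
    CREATE TABLE Revision (id INTEGER PRIMARY KEY, revision_author INTEGER);
    INSERT INTO RevisionAuthor VALUES (1, 100), (2, 100), (3, NULL);
    INSERT INTO Revision VALUES (10, 1), (11, 2), (12, 3), (13, 3);
""")

# Same aggregate expressions as the query in the comment, minus the
# BranchRevision/Branch joins and the date filter.
revision_count, committer_count = conn.execute("""
    SELECT count(DISTINCT Revision.id),
           count(DISTINCT coalesce(RevisionAuthor.person, -RevisionAuthor.id))
    FROM Revision
    JOIN RevisionAuthor ON Revision.revision_author = RevisionAuthor.id
""").fetchone()

print(revision_count, committer_count)
```

Here the distinct committer keys are 100 (authors 1 and 2 collapse into one person) and -3 (the unlinked author), so four revisions yield two committers.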

Tim Penhey (thumper)
Changed in launchpad-bazaar:
milestone: 2.2.2 → 2.2.3
Revision history for this message
Jonathan Lange (jml) wrote :

stub and I talked a bit about this on IRC. The server where my logs are is unavailable at the moment.

stub, can you tell us what to do? :)

Revision history for this message
Stuart Bishop (stub) wrote : Re: [Bug 328302] Re: code.lp.net/launchpad timing out frequently

On Tue, Mar 3, 2009 at 12:37 AM, Jonathan Lange <email address hidden> wrote:
> stub and I talked a bit about this on IRC. The server where my logs are
> is unavailable at the moment.
>
> stub, can you tell us what to do? :)

There is an optimization on the existing query, restricting the set of
BranchRevisions that need to be joined through. This optimization no
longer works as we have Revisions from the future.

We can get the existing optimization working and back to where we
were by doing:

UPDATE Revision SET revision_date=date_created WHERE revision_date >
date_created;

(and fixing the code to make revisions with future dates saner).
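
One plausible reading of why future-dated revisions defeat the min(id) restriction, sketched against an in-memory SQLite database with made-up rows (an assumption for illustration, not the production data): the subquery relies on ids growing roughly in step with revision_date, so a single old row carrying a future date drags min(id) down and the restriction stops pruning anything. Clamping revision_date to date_created restores the bound.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Revision (
        id INTEGER PRIMARY KEY,
        revision_date TEXT,
        date_created TEXT
    );
""")
# Hypothetical rows: id 2 was created in 2008 but claims a 2030 commit date.
conn.executemany(
    "INSERT INTO Revision VALUES (?, ?, ?)",
    [
        (1, "2008-11-03", "2008-11-04"),
        (2, "2030-01-01", "2008-11-20"),
        (3, "2009-01-15", "2009-01-15"),
        (4, "2009-02-10", "2009-02-10"),
        (5, "2009-02-20", "2009-02-21"),
    ],
)

# The lower bound used by the existing query to restrict BranchRevision.
MIN_ID = "SELECT min(id) FROM Revision WHERE revision_date > '2009-02-01'"

before = conn.execute(MIN_ID).fetchone()[0]

# The data fix from the comment: clamp future dates to the creation date.
conn.execute(
    "UPDATE Revision SET revision_date = date_created "
    "WHERE revision_date > date_created"
)

after = conn.execute(MIN_ID).fetchone()[0]
print(before, after)
```

Before the update the bound is the id of the future-dated row (2); afterwards it is the first revision genuinely dated after the cutoff (4).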

If we don't do this, we have to add revision_date to BranchRevision,
mirroring the Revision.revision_date.

We also discussed not doing these queries in real time to generate the
statistics, but have the calculations cached. The cache would be
updated by the branchscanner.

--
Stuart Bishop <email address hidden>
http://www.stuartbishop.net/

Revision history for this message
Francis J. Lacoste (flacoste) wrote :

While reviewing the tables to prune regularly, don't forget about OAuthNonce.

Revision history for this message
Francis J. Lacoste (flacoste) wrote :

Of course that last comment should have gone on bug 314621.

Revision history for this message
Tim Penhey (thumper) wrote :

I spent some time with spm today looking at this. It seems more likely that our optimisation is no longer an optimisation due to the sheer size of the BranchRevision table. I also spent some time thinking about how we could do without the branch revision table. The primary use cases for it are "feeds", "unmerged revisions", "summary info".

The use-case that seems to be hurting us the most is the summary information for projects. We'd like to have it for project groups, and users, but the query is too cumbersome. As of today we have 84 million rows in the branch revision table. Too big to be doing ad-hoc queries across.

I propose that we create a table for the purpose of providing quick access to the summary information.

If we had something like:

create table RevisionSummaryCache
(
    revision_date timestamp without time zone,
    product int references Product,
    distroseries int references DistroSeries, -- can't forget source package branches
    sourcepackage int references SourcePackageName, -- or whatever it is
    author int, -- no references, use +ve numbers for person id fields, and -ve for revision_author links where there isn't a person
    revision int references Revision
);

We make some arbitrary time cutoff, like 30 days. We remove any entries in this table older than that time.

When we scan a branch we make sure there exists a row for the product or distroseries/sourcepackage for each revision that is within the last 30 days.

With good indices on this table, we should be able to have very quick counts of revisions across projects, as well as source packages, distroseries and distributions. Additionally we should be able to have quick counts of all commits within the last 30 days across all of Launchpad (which admittedly isn't hard now, but the others are). We should also be able to get quick counts of commits by an individual across different subsections.
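
A rough sketch of how the proposed cache might be pruned and queried, using an in-memory SQLite database. Everything here is an assumption for illustration: the index name, the prune and product_summary helpers, the choice of "today", and the sample rows are hypothetical, and the distroseries and sourcepackage columns are left out to keep it short.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE RevisionSummaryCache (
        revision_date TEXT,
        product INTEGER,
        author INTEGER,    -- +ve: person id, -ve: negated revision_author id
        revision INTEGER
    );
    -- Hypothetical index supporting per-product counts over a date window.
    CREATE INDEX revisionsummarycache_product_date
        ON RevisionSummaryCache (product, revision_date);
""")
conn.executemany(
    "INSERT INTO RevisionSummaryCache VALUES (?, ?, ?, ?)",
    [
        ("2009-01-05", 10294, 7, 1),   # older than the cutoff, will be pruned
        ("2009-02-20", 10294, 7, 2),
        ("2009-02-25", 10294, -3, 3),  # author with no linked person
        ("2009-03-01", 10294, 7, 4),
        ("2009-03-02", 555, 8, 5),     # a different product
    ],
)

def prune(conn, today, days=30):
    """Delete cache rows older than the cutoff (30 days by default)."""
    cutoff = (today - timedelta(days=days)).isoformat()
    conn.execute(
        "DELETE FROM RevisionSummaryCache WHERE revision_date < ?", (cutoff,)
    )

def product_summary(conn, product):
    """Return (revision_count, committer_count) for one product."""
    return conn.execute(
        "SELECT count(DISTINCT revision), count(DISTINCT author) "
        "FROM RevisionSummaryCache WHERE product = ?",
        (product,),
    ).fetchone()

prune(conn, today=date(2009, 3, 11))
summary = product_summary(conn, 10294)
remaining = conn.execute("SELECT count(*) FROM RevisionSummaryCache").fetchone()[0]
print(summary, remaining)
```

Because the table only ever holds the last 30 days, the count query needs no date filter and no join through BranchRevision. The cost is that the branch scanner has to keep the table populated and something has to run the prune regularly.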

Comments?

Revision history for this message
Robert Collins (lifeless) wrote :

Is this in addition to, or a replacement for branchrevision?

stub seems to be saying that branchrevision is ok, if we fix some data and prevent bad data in future.

Having multiple different caches will impose more of a processing (and even design) cost.

Revision history for this message
Tim Penhey (thumper) wrote :

On Wed, 11 Mar 2009 20:32:33 Robert Collins wrote:
> Is this in addition to, or a replacement for branchrevision?
>
> stub seems to be saying that branchrevision is ok, if we fix some data
> and prevent bad data in future.
>
> Having multiple different caches will impose more of a processing (and
> even design) cost.

This is in addition to branchrevision.

I'm saying that branchrevision isn't ok, even if we fix the data. That is what
I tested today.

There is a cost, but I think it is worth it to get quick queries.

Ursula Junque (ursinha)
description: updated
Revision history for this message
Tim Penhey (thumper) wrote :

I have a quick fix for this that makes the call take 3s instead of 17s but a de-normalised table would still be more optimal IMHO.

Changed in launchpad-bazaar:
assignee: stub → thumper
status: Triaged → In Progress
Tim Penhey (thumper)
Changed in launchpad-bazaar:
status: In Progress → Fix Committed
Tim Penhey (thumper)
Changed in launchpad-bazaar:
status: Fix Committed → Fix Released
Curtis Hovey (sinzui)
visibility: private → public