ScopedCollection:CollectionResource:#message-page-resource timeouts for bugs with many messages

Bug #497386 reported by Gavin Panella
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Launchpad itself
Fix Released
High
Robert Collins

Bug Description

Now the top timeout in production.
sample url
https://api.launchpad.net/1.0/bugs/1/messages?ws.start=500&ws.size=75

Recent oops:

https://lp-oops.canonical.com/oops.py/?oopsid=1718C1010

Branch: launchpad-rev-9760.2
Revno: 9760
SQL time: 4344 ms
Non-sql time: 10882 ms
Total time: 15226 ms
Statement Count: 288

Repeating query pattern (due to inefficient lookups that the storm cache can't handle / lack of eager loading:

163. 7644 30ms SQL-launchpad-main-master SELECT COUNT(*) FROM Bug, BugMessage WHERE BugMessage.message = %s AND Bug.id = BugMessage.bug
164
. 7676 4ms SQL-launchpad-main-master SELECT Bug.date_last_message, Bug.date_last_updated, Bug.date_made_private, Bug.datecreated, Bug.description, Bug.duplicateof, Bug.heat, Bug.heat_last_updated, Bug.id, Bug.latest_patch_uploaded, Bug.message_count, Bug.name, Bug.number_of_duplicates, Bug.owner, Bug.private, Bug.security_related, Bug.title, Bug.users_affected_count, Bug.users_unaffected_count, Bug.who_made_private FROM Bug, BugMessage WHERE BugMessage.message = %s AND Bug.id = BugMessage.bug ORDER BY Bug.id DESC LIMIT 1
165. 7703 17ms SQL-launchpad-main-master SELECT BugTask.assignee, BugTask.bug, BugTask.bugwatch, BugTask.date_assigned, BugTask.date_closed, BugTask.date_confirmed, BugTask.date_fix_committed, BugTask.date_fix_released, BugTask.date_incomplete, BugTask.date_inprogress, BugTask.date_left_closed, BugTask.date_left_new, BugTask.date_triaged, BugTask.datecreated, BugTask.distribution, BugTask.distroseries, BugTask.id, BugTask.importance, BugTask.milestone, BugTask.owner, BugTask.product, BugTask.productseries, BugTask.sourcepackagename, BugTask.status, BugTask.statusexplanation, BugTask.targetnamecache, "_prejoin1".account, "_prejoin1".creation_comment, "_prejoin1".creation_rationale, "_prejoin1".datecreated, "_prejoin1".defaultmembershipperiod, "_prejoin1".defaultrenewalperiod, "_prejoin1".displayname, "_prejoin1".hide_email_addresses, "_prejoin1".homepage_content, "_prejoin1".icon, "_prejoin1".id, "_prejoin1".logo, "_prejoin1".mailing_list_auto_subscribe_policy, "_prejoin1".merged, "_prejoin1".mugshot, "_prejoin1".name, "_prejoin1".personal_standing, "_prejoin1".personal_standing_reason, "_prejoin1".registrant, "_prejoin1".renewal_policy, "_prejoin1".subscriptionpolicy, "_prejoin1".teamdescription, "_prejoin1".teamowner, "_prejoin1".verbose_bugnotifications, "_prejoin1".visibility, "_prejoin2".answers_usage, "_prejoin2".blueprints_usage, "_prejoin2".owner, "_prejoin2".translations_usage, "_prejoin2".active, "_prejoin2".autoupdate, "_prejoin2".bug_reported_acknowledgement, "_prejoin2".bug_reporting_guidelines, "_prejoin2".bug_supervisor, "_prejoin2".bugtracker, "_prejoin2".date_next_suggest_packaging, "_prejoin2".datecreated, "_prejoin2".description, "_prejoin2".development_focus, "_prejoin2".displayname, "_prejoin2".downloadurl, "_prejoin2".driver, "_prejoin2".enable_bug_expiration, "_prejoin2".freshmeatproject, "_prejoin2".homepage_content, "_prejoin2".homepageurl, "_prejoin2".icon, "_prejoin2".id, "_prejoin2".lastdoap, "_prejoin2".license_approved, "_prejoin2".license_info, "_prejoin2".reviewed, "_prejoin2".logo, "_prejoin2".max_bug_heat, "_prejoin2".mugshot, "_prejoin2".name, "_prejoin2".official_answers, "_prejoin2".official_blueprints, "_prejoin2".official_malone, "_prejoin2".official_rosetta, "_prejoin2".private_bugs, "_prejoin2".programminglang, "_prejoin2".project, "_prejoin2".registrant, "_prejoin2".remote_product, "_prejoin2".reviewer_whiteboard, "_prejoin2".screenshotsurl, "_prejoin2".security_contact, "_prejoin2".sourceforgeproject, "_prejoin2".summary, "_prejoin2".title, "_prejoin2".translation_focus, "_prejoin2".translationgroup, "_prejoin2".translationpermission, "_prejoin2".wikiurl, "_prejoin3".id, "_prejoin3".name, "_prejoin4".account, "_prejoin4".creation_comment, "_prejoin4".creation_rationale, "_prejoin4".datecreated, "_prejoin4".defaultmembershipperiod, "_prejoin4".defaultrenewalperiod, "_prejoin4".displayname, "_prejoin4".hide_email_addresses, "_prejoin4".homepage_content, "_prejoin4".icon, "_prejoin4".id, "_prejoin4".logo, "_prejoin4".mailing_list_auto_subscribe_policy, "_prejoin4".merged, "_prejoin4".mugshot, "_prejoin4".name, "_prejoin4".personal_standing, "_prejoin4".personal_standing_reason, "_prejoin4".registrant, "_prejoin4".renewal_policy, "_prejoin4".subscriptionpolicy, "_prejoin4".teamdescription, "_prejoin4".teamowner, "_prejoin4".verbose_bugnotifications, "_prejoin4".visibility, "_prejoin5".bug, "_prejoin5".bugtracker, "_prejoin5".datecreated, "_prejoin5".id, "_prejoin5".last_error_type, "_prejoin5".lastchanged, "_prejoin5".lastchecked, "_prejoin5".next_check, "_prejoin5".owner, "_prejoin5".remote_importance, "_prejoin5".remotebug, "_prejoin5".remotestatus FROM BugTask LEFT JOIN Person AS "_prejoin1" ON BugTask.assignee = "_prejoin1".id LEFT JOIN Product AS "_prejoin2" ON BugTask.product = "_prejoin2".id LEFT JOIN SourcePackageName AS "_prejoin3" ON BugTask.sourcepackagename = "_prejoin3".id LEFT JOIN Person AS "_prejoin4" ON BugTask.owner = "_prejoin4".id LEFT JOIN BugWatch AS "_prejoin5" ON BugTask.bugwatch = "_prejoin5".id WHERE BugTask.bug = 136469 AND (1=1) ORDER BY BugTask.id
166. 7724 20ms SQL-launchpad-main-master SELECT Message.datecreated, Message.id, Message.owner, Message.parent, Message.raw, Message.rfc822msgid, Message.subject, "_prejoin1".account, "_prejoin1".creation_comment, "_prejoin1".creation_rationale, "_prejoin1".datecreated, "_prejoin1".defaultmembershipperiod, "_prejoin1".defaultrenewalperiod, "_prejoin1".displayname, "_prejoin1".hide_email_addresses, "_prejoin1".homepage_content, "_prejoin1".icon, "_prejoin1".id, "_prejoin1".logo, "_prejoin1".mailing_list_auto_subscribe_policy, "_prejoin1".merged, "_prejoin1".mugshot, "_prejoin1".name, "_prejoin1".personal_standing, "_prejoin1".personal_standing_reason, "_prejoin1".registrant, "_prejoin1".renewal_policy, "_prejoin1".subscriptionpolicy, "_prejoin1".teamdescription, "_prejoin1".teamowner, "_prejoin1".verbose_bugnotifications, "_prejoin1".visibility FROM BugMessage, Message LEFT JOIN Person AS "_prejoin1" ON Message.owner = "_prejoin1".id WHERE BugMessage.bug = %s AND Message.id = BugMessage.message AND (1=1) ORDER BY Message.datecreated, Message.id
167. 7814 22ms SQL-launchpad-main-master SELECT MessageChunk.blob, MessageChunk.content, MessageChunk.id, MessageChunk.message, MessageChunk.sequence FROM MessageChunk WHERE MessageChunk.message = %s ORDER BY MessageChunk.sequence

----

Currently it seems almost impossible to get *any* messages from bug 1 via the web API. In other words, with launchpadlib, root.bugs[1].messages[0] almost always times out.

For a bug with many (>50) messages, or when requesting a narrow slice of messages, getting messages via the web API is very inefficient because message list is fulfilled by Bug.indexed_messages, which pulls in every message. It can't return a result set - which would be dealt with efficiently - for two reasons: (1) it needs to wrap each message in a delegate to provide the comment index, and (2) it must check if the parent is within the set of messages.

We can address (2) by using a message_id_set:

    message_id_set = set(
        IResultSet(self.messages).values(Message.id))

Instead of the message_set as is currently used.

(1) is trickier. There may be some way to get Storm to materialize a query using a different object factory class, or it may also be possible to provide a custom marshaller for the lazr.restful.

Curtis Hovey (sinzui)
Changed in malone:
status: New → Triaged
importance: Undecided → Low
tags: added: api
Revision history for this message
Robert Collins (lifeless) wrote :

High as per zaroo-oops-polishy

tags: added: timeout
Changed in malone:
importance: Low → High
summary: - Getting bug messages via the web API is inefficient for bugs with many
- messages
+ ScopedCollection:CollectionResource:#message-page-resource timeouts for
+ bugs with many messages
description: updated
Revision history for this message
Robert Collins (lifeless) wrote :

From the bug in the OOPS:
{"parent_link": null,
"bug_attachments_collection_link": "https://api.launchpad.net/1.0/ubuntu/+source/linux-source-2.6.22/+bug/136469/comments/64/bug_attachments",
"content": "I will upgrade to 3.80 later when i get home, as far as fiesty goes, i couldnt ever get it to install on my machine but gutsy did so i am running a dual between vista and gutsy. i hope someone gets this highly annoying bug fixed :-|",
"http_etag": "\"7236e3623a9071986277b7cf78c30d25cd8ff583-00d8661f4ad54914c4b8f3ac9ffa35123b7656b0\"",
"owner_link": "https://api.launchpad.net/1.0/~numbers1thru9",
"self_link": "https://api.launchpad.net/1.0/ubuntu/+source/linux-source-2.6.22/+bug/136469/comments/64", "date_created": "2007-10-24T19:15:35.265678+00:00",
"resource_type_link": "https://api.launchpad.net/1.0/#message",
 "subject": "Re: toshiba p100 series dsdt acpi error no sound, works with acpi turned off."}

I bet one of these is triggering the odd sql pattern, because indexed_messages is whats exported via the API today.

Revision history for this message
Robert Collins (lifeless) wrote :

so content requires looking up chunks, and its done.. drum roll... per message shown.

Revision history for this message
Robert Collins (lifeless) wrote :

message.owner will need eager loading too.

Revision history for this message
Gavin Panella (allenap) wrote :

The following query might be somewhere to start. It uses window functions which are a new feature in PostgreSQL 8.4. They'll allow us to calculate the message index and slice on it straight in the database.

SELECT * FROM (
   SELECT message.*,
          row_number() OVER (
             ORDER BY message.datecreated, message.id) AS index
     FROM message, bugmessage
    WHERE bugmessage.message = message.id
      AND bugmessage.bug = 15
      AND (message.parent IS NULL OR message.parent IN (
             SELECT message FROM bugmessage WHERE bug = 15))) AS indexed_messages
 WHERE index > 2
   AND index <= 4;

Revision history for this message
Gavin Panella (allenap) wrote :

Heh, the query is pointless; please ignore.

Revision history for this message
Gavin Panella (allenap) wrote :

Actually, it's not entirely pointless; it made me think of something
else. Have a look at the following (needs PostgreSQL 8.4 again):

A view of indexed messages:

{{{
CREATE VIEW indexedmessage AS
  SELECT message.*,
         -1 + row_number() OVER (
            PARTITION BY bugtask.id
            ORDER BY message.datecreated, message.id) AS index,
         bugtask.id AS inside
    FROM message, bugmessage, bugtask
   WHERE bugmessage.message = message.id
     AND bugmessage.bug = bugtask.bug;

GRANT SELECT ON indexedmessage TO PUBLIC;
}}}

The Storm model class (well, with some SQLObject shim stuff still
around because Message hasn't changed yet):

{{{
from canonical.launchpad.database.message import Message
from canonical.launchpad.interfaces.message import IIndexedMessage

class IndexedMessage(Message):

    implements(IIndexedMessage)

    _table = "IndexedMessage"
    _defaultOrder = 'index'

    index = Int()

    insideID = Int("inside", allow_none=False)
    inside = Reference(insideID, "BugTask.id")
}}}

In use:

{{{
bug = BugSet().get(15)
bugtask = bug.default_bugtask

from storm import tracer
tracer.debug(True)

# All messages.
print [
    (im.id, im.index, im.insideID)
    for im in store.find(
        IndexedMessage, IndexedMessage.inside == bugtask)
    ]

# Slicing.
print [
    (im.id, im.index, im.insideID)
    for im in store.find(
        IndexedMessage, IndexedMessage.inside == bugtask)[3:5]
    ]

tracer.debug(False)
}}}

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 497386] Re: ScopedCollection:CollectionResource:#message-page-resource timeouts for bugs with many messages

We broadly want to get rid of views, and calculating rank every time
will require careful index analysis to ensure that the query can be
answered from indices.

If it can't be it will scale poorly; I think long term we really
shouldn't be indexing on the fly.

Revision history for this message
Launchpad QA Bot (lpqabot) wrote : Bug fixed by a commit
Changed in malone:
assignee: nobody → Robert Collins (lifeless)
milestone: none → 10.10
tags: added: qa-needstesting
Changed in malone:
status: Triaged → Fix Committed
tags: added: qa-ok
removed: qa-needstesting
Revision history for this message
Robert Collins (lifeless) wrote :

This is working reliably on edge; may still be slow - but there is a db patch landed in db-stable that will make this really fast once we do some data migration.

Revision history for this message
Launchpad QA Bot (lpqabot) wrote :
tags: added: qa-needstesting
removed: qa-ok
tags: added: qa-ok
removed: qa-needstesting
Curtis Hovey (sinzui)
Changed in malone:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.