Distribution:+search slow 50% of requests

Bug #618372 reported by Robert Collins
14
This bug affects 4 people
Affects Status Importance Assigned to Milestone
Launchpad itself
Fix Released
Critical
Unassigned

Bug Description

Distribution:+search is showing up on https://devpad.canonical.com/~stub/ppr/lpnet/latest-daily-timeout-candidates.html as taking a very long time 50% of requests. This is timing out and and showing up on our top-oopsers already.

https://lp-oops.canonical.com/oops.py/?oopsid=1695C1150

https://launchpad.net/ubuntu/+search
QUERY_STRING: text=mplayer

So I'm guessing they were on
https://launchpad.net/ubuntu/+search?text=smplayer
deleted the s, hit search again.

SQL time: 17983 ms
Non-sql time: 172 ms
Total time: 18155 ms
Statement Count: 12

11. 1347 8952ms launchpad-main-slave SELECT COUNT(*) FROM (SELECT DISTINCT DistributionSourcePackageCache.archive, DistributionSourcePackageCache.binpkgdescriptions, DistributionSourcePackageCache.binpkgnames, DistributionSourcePackageCache.binpkgsummaries, DistributionSourcePackageCache.changelog, DistributionSourcePackageCache.distribution, DistributionSourcePackageCache.id, DistributionSourcePackageCache.name, DistributionSourcePackageCache.sourcepackagename FROM BinaryPackageBuild, BinaryPackageName, BinaryPackageRelease, DistributionSourcePackageCache, DistroArchSeries, DistroSeries, SourcePackageName, SourcePackageRelease WHERE DistroSeries.distribution = %s AND DistroSeries.releasestatus != %s AND BinaryPackageRelease.binarypackagename = BinaryPackageName.id AND DistroArchSeries.distroseries = DistroSeries.id AND BinaryPackageBuild.distro_arch_series = DistroArchSeries.id AND BinaryPackageRelease.build = BinaryPackageBuild.id AND BinaryPackageBuild.source_package_release = SourcePackageRelease.id AND SourcePackageRelease.sourcepackagename = SourcePackageName.id AND DistributionSourcePackageCache.sourcepackagename = SourcePackageName.id AND DistributionSourcePackageCache.archive IN (%s, %s) AND BinaryPackageName.name = %s) AS "_tmp"

12. 10300 7795ms launchpad-main-slave SELECT DISTINCT DistributionSourcePackageCache.archive, DistributionSourcePackageCache.binpkgdescriptions, DistributionSourcePackageCache.binpkgnames, DistributionSourcePackageCache.binpkgsummaries, DistributionSourcePackageCache.changelog, DistributionSourcePackageCache.distribution, DistributionSourcePackageCache.id, DistributionSourcePackageCache.name, DistributionSourcePackageCache.sourcepackagename FROM BinaryPackageBuild, BinaryPackageName, BinaryPackageRelease, DistributionSourcePackageCache, DistroArchSeries, DistroSeries, SourcePackageName, SourcePackageRelease WHERE DistroSeries.distribution = %s AND DistroSeries.releasestatus != %s AND BinaryPackageRelease.binarypackagename = BinaryPackageName.id AND DistroArchSeries.distroseries = DistroSeries.id AND BinaryPackageBuild.distro_arch_series = DistroArchSeries.id AND BinaryPackageRelease.build = BinaryPackageBuild.id AND BinaryPackageBuild.source_package_release = SourcePackageRelease.id AND SourcePackageRelease.sourcepackagename = SourcePackageName.id AND DistributionSourcePackageCache.sourcepackagename = SourcePackageName.id AND DistributionSourcePackageCache.archive IN (%s, %s) AND BinaryPackageName.name = %s ORDER BY name

Revision history for this message
Curtis Hovey (sinzui) wrote :

The search mechanism is soyuz. Only soyuz-based distros have search.

affects: launchpad-registry → soyuz
description: updated
tags: added: dba
description: updated
description: updated
Revision history for this message
Robert Collins (lifeless) wrote :

Timing is on.
lpmain_staging=> SELECT COUNT(*) FROM (SELECT DISTINCT DistributionSourcePackageCache.archive, DistributionSourcePackageCache.binpkgdescriptions, DistributionSourcePackageCache.binpkgnames, DistributionSourcePackageCache.binpkgsummaries, DistributionSourcePackageCache.changelog, DistributionSourcePackageCache.distribution, DistributionSourcePackageCache.id, DistributionSourcePackageCache.name, DistributionSourcePackageCache.sourcepackagename FROM BinaryPackageBuild, BinaryPackageName, BinaryPackageRelease, DistributionSourcePackageCache, DistroArchSeries, DistroSeries, SourcePackageName, SourcePackageRelease WHERE DistroSeries.distribution = 1 AND DistroSeries.releasestatus != 6 AND BinaryPackageRelease.binarypackagename = BinaryPackageName.id AND DistroArchSeries.distroseries = DistroSeries.id AND BinaryPackageBuild.distro_arch_series = DistroArchSeries.id AND BinaryPackageRelease.build = BinaryPackageBuild.id AND BinaryPackageBuild.source_package_release = SourcePackageRelease.id AND SourcePackageRelease.sourcepackagename = SourcePackageName.id AND DistributionSourcePackageCache.sourcepackagename = SourcePackageName.id AND DistributionSourcePackageCache.archive IN (1, 534) AND BinaryPackageName.name = 'mplayer') AS "_tmp";
 count
-------
     1
(1 row)

Time: 23046.859 ms

Revision history for this message
Robert Collins (lifeless) wrote :
Download full text (5.2 KiB)

(thats with disk IO, its 8-8.5seconds when repeated.

plan for it:

                                                                                                                                                                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate (cost=1261.93..1261.94 rows=1 width=0) (actual time=8106.454..8106.455 rows=1 loops=1)
   -> Unique (cost=1261.85..1261.90 rows=2 width=830) (actual time=7895.168..8106.442 rows=1 loops=1)
         -> Sort (cost=1261.85..1261.85 rows=2 width=830) (actual time=7895.165..7903.721 rows=1382 loops=1)
               Sort Key: distributionsourcepackagecache.archive, distributionsourcepackagecache.binpkgdescriptions, distributionsourcepackagecache.binpkgnames, distributionsourcepackagecache.binpkgsummaries, distributionsourcepackagecache.changelog, distributionsourcepackagecache.distribution, distributionsourcepackagecache.id, distributionsourcepackagecache.name, distributionsourcepackagecache.sourcepackagename
               Sort Method: quicksort Memory: 761kB
               -> Nested Loop (cost=0.00..1261.84 rows=2 width=830) (actual time=1.103..7686.693 rows=1382 loops=1)
                     -> Nested Loop (cost=0.00..649.04 rows=1 width=8) (actual time=0.195..364.537 rows=1917 loops=1)
                           -> Nested Loop (cost=0.00..648.76 rows=1 width=4) (actual time=0.180..335.799 rows=1917 loops=1)
                                 -> Nested Loop (cost=0.00..648.21 rows=1 width=4) (actual time=0.161..299.414 rows=1917 loops=1)
                                       Join Filter: (distroarchseries.distroseries = distroseries.id)
                                       -> Seq Scan on distroseries (cost=0.00..2.45 rows=1 width=4) (actual time=0.031..0.077 rows=6 loops=1)
                                             Filter: ((releasestatus <> 6) AND (distribution = 1))
                                       -> Nested Loop (cost=0.00..645.32 rows=35 width=8) (actual time=0.058..47.401 rows=2327 loops=6)
                                             -> Nested Loop (cost=0.00..635.52 rows=35 width=8) (actual time=0.052..33.313 rows=2327 loops=6)
                                                   -> Nested Loop (cost=0.00..620.13 rows=35 width=4) (actual time=0.041..12.791 rows=2327 loops=6)
                                                         -> Index Scan using binarypackagename_name_key on binarypackagename (cost=0.00..6.29 rows=1 width=4) (actual time=0.016..0.017 rows=1 loops=6)
                                       ...

Read more...

Revision history for this message
Robert Collins (lifeless) wrote :

The inner query is 7.6 seconds, stable.

Revision history for this message
Stuart Bishop (stub) wrote :
Download full text (5.2 KiB)

There is an unnecessary join with SourcePackageName. However the big win is surprising - if I move the BinaryPackageName check to a subquery, things go much, much faster. PG 8.3 doesn't seem smart enough to know the BinaryPackageName check can only return a single row - maybe PG 8.4 will.

launchpad_prod_3=# explain analyze
launchpad_prod_3-# SELECT COUNT(DISTINCT DistributionSourcePackageCache.id)
launchpad_prod_3-# FROM
launchpad_prod_3-# BinaryPackageRelease
launchpad_prod_3-# JOIN BinaryPackageBuild
launchpad_prod_3-# ON BinaryPackageBuild.id = BinaryPackageRelease.build
launchpad_prod_3-# JOIN DistroArchSeries
launchpad_prod_3-# ON DistroArchSeries.id = BinaryPackageBuild.distro_arch_series
launchpad_prod_3-# JOIN DistroSeries ON DistroSeries.id = DistroArchSeries.distroseries
launchpad_prod_3-# JOIN SourcePackageRelease
launchpad_prod_3-# ON SourcePackageRelease.id = BinaryPackageBuild.source_package_release
launchpad_prod_3-# JOIN DistributionSourcePackageCache
launchpad_prod_3-# ON DistributionSourcePackageCache.sourcepackagename
launchpad_prod_3-# = SourcePackageRelease.sourcepackagename
launchpad_prod_3-# WHERE
launchpad_prod_3-# DistroSeries.distribution = 1
launchpad_prod_3-# AND DistroSeries.releasestatus != 6
launchpad_prod_3-# AND DistributionSourcePackageCache.archive IN (1, 534)
launchpad_prod_3-# AND BinaryPackageRelease.binarypackagename = (
launchpad_prod_3(# SELECT id FROM BinaryPackageName WHERE name='mplayer');
                                                                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate (cost=14663.84..14663.85 rows=1 width=4) (actual time=628.353..628.354 rows=1 loops=1)
   InitPlan
     -> Index Scan using binarypackagename_name_key on binarypackagename (cost=0.00..3.89 rows=1 width=4) (actual time=0.082..0.084 rows=1 loops=1)
           Index Cond: (name = 'mplayer'::text)
   -> Hash Join (cost=1437.85..14659.79 rows=65 width=4) (actual time=412.262..627.161 rows=1388 loops=1)
         Hash Cond: (distributionsourcepackagecache.sourcepackagename = sourcepackagerelease.sourcepackagename)
         -> Bitmap Heap Scan on distributionsourcepackagecache (cost=539.83..13678.72 rows=21977 width=8) (actual time=56.406..361.512 rows=21978 loops=1)
               Recheck Cond: (archive = ANY ('{1,534}'::integer[]))
               -> Bitmap Index Scan on distributionsourcepackagecache__archive__idx (cost=0.00..534.34 rows=21977 width=0) (actual time=50.053..50.053 rows=22577 loops=1)
                     Index Cond: (archive = ANY ('{1,534}'::integer[]))
         -> Hash (cost=897.45..897.45 rows=46 width=4) (actual time=242.941..242.941 rows=1936 loops=1)
               -> Nested Loop (cost=7.20..897.45 rows=46 width=4) (actual time=0.997..239.023 rows=1936 loops=1)
                  ...

Read more...

Stuart Bishop (stub)
tags: removed: dba
Revision history for this message
Robert Collins (lifeless) wrote :

On staging thats still bad.
Without the copy-paste noise:

SELECT COUNT(
DISTINCT DistributionSourcePackageCache.id)
 FROM
 BinaryPackageRelease
 JOIN BinaryPackageBuild
 ON BinaryPackageBuild.id = BinaryPackageRelease.build
 JOIN DistroArchSeries
 ON DistroArchSeries.id = BinaryPackageBuild.distro_arch_series
 JOIN DistroSeries ON DistroSeries.id = DistroArchSeries.distroseries
 JOIN SourcePackageRelease
 ON SourcePackageRelease.id = BinaryPackageBuild.source_package_release
 JOIN DistributionSourcePackageCache
 ON DistributionSourcePackageCache.sourcepackagename
 = SourcePackageRelease.sourcepackagename
 WHERE
 DistroSeries.distribution = 1
 AND DistroSeries.releasestatus != 6
 AND DistributionSourcePackageCache.archive IN (1, 534)
 AND BinaryPackageRelease.binarypackagename = (
 SELECT id FROM BinaryPackageName WHERE name='mplayer');
Time: 6831.607 ms

Revision history for this message
Robert Collins (lifeless) wrote :

Stuart - So I can rewrite this to do your query, but its really slow on staging still; I worry about this from two angles - if its a query plan glitch, it may reoccur shortly anyhow; and it will definitely interfere with QA on staging.

tags: added: dba
Revision history for this message
Robert Collins (lifeless) wrote :

Julian, the search objects this returns are different to DistroSeries and DistroArchSeries, and either of those searches appears simpler and fast; how would you feel about us making this return the same thing as DAS does - we could consolidate the code and simplify things.

Revision history for this message
Robert Collins (lifeless) wrote :

The answer was roughly 'not using DSPC would be fine if it performs ok'

Revision history for this message
Launchpad QA Bot (lpqabot) wrote : Bug fixed by a commit

Fixed in stable r11487 (http://bazaar.launchpad.net/~launchpad-pqm/launchpad/stable/revision/11487) by a commit, but not testable.

Changed in soyuz:
assignee: nobody → Robert Collins (lifeless)
milestone: none → 10.09
tags: added: qa-untestable
Changed in soyuz:
status: Triaged → Fix Committed
Changed in soyuz:
status: Fix Committed → Triaged
assignee: Robert Collins (lifeless) → nobody
milestone: 10.09 → none
tags: added: pg83
Changed in launchpad:
importance: High → Critical
Revision history for this message
Stuart Bishop (stub) wrote :

This no longer appears to be an issue with PostgreSQL 8.4

Changed in launchpad:
status: Triaged → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.