Comment 1 for bug 626680

Revision history for this message
Stuart Bishop (stub) wrote : Re: [Bug 626680] [NEW] iteration in LP API's is O(N^2) due to batching

On Mon, Aug 30, 2010 at 3:52 PM, Robert Collins
<email address hidden> wrote:
> Public bug reported:
>
> So, iteration on collections in the Launchpad API's is O(N^2) because
> they batch.
>
> We should, I think, stop batching. If folk want only a slice, give them
> a slice, otherwise stream the entire thing down and let the client
> consume as it goes.
>
> Iteration is O(N^2) because the database has to calculate the first N
> rows to return N-onwards, and thus the second batch costs the work for
> the first batch + the work for the first-batch and second-batch, and so-
> on.
>
> I'm sure this will require work in a bunch of places, so I thought I'd
> kick the issue of here and let additional tasks etc be raised as needed.

Without batching, we would need to limit queries to returning a sane
number of rows. Currently, people can retrieve all Ubuntu bugs. That
would be insane to do in a single request.

The alternative is to make batching saner.

We can get the first batch:

SELECT * FROM Foo ORDER BY id LIMIT 100;

If we remember the key from the last item in the previous batch, we
can get the next batch:

SELECT * FROM Foo WHERE id > 342 ORDER BY id LIMIT 100;

This technique is much faster and preferred. What you lose is the
ability to know how many batches there are, how to jump to an
arbitrary batch, and you must order by a unique column. This is
probably fine for the webservice. Using this technique in the
appserver we lose some batch navigation features.

--
Stuart Bishop <email address hidden>
http://www.stuartbishop.net/