> I've just found these code bits and just wanted to put a quick update
> here. As far as I can see, this mechanism kicks in in the following
> circumstances (_readv in http/__init__.py):
> - errors.ShortReadvError: short read; the range is actually incomplete
> - errors.InvalidRange
> - errors.InvalidHttpRange: when the range headers are malformed.
> However, none of these applies to reading the boundary marker
> (read_boundary in http/response.py) when it is missing.
Correct.
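For reference, a minimal sketch of what raising a dedicated exception in read_boundary-style parsing could look like. The names (InvalidBoundary, read_boundary's signature) are illustrative, not bzrlib's actual API:

```python
# Hypothetical sketch: signal a missing multipart boundary marker with
# a specific exception instead of failing obscurely later on.

class InvalidBoundary(Exception):
    """Raised when the expected multipart boundary marker is absent."""


def read_boundary(line, expected_boundary):
    # Each part of a multipart/byteranges body starts with "--<boundary>".
    marker = "--" + expected_boundary
    if not line.startswith(marker):
        raise InvalidBoundary(
            "expected boundary %r, got %r" % (marker, line))
    return line


# A proxy that rewrites the response would trip this immediately:
try:
    read_boundary("HTTP/1.1 200 OK", "fbd-3ed1")
except InvalidBoundary as e:
    print("degrade to single-range requests:", e)
```

The caller that catches InvalidBoundary can then degrade to single-range requests, mirroring what _readv already does for the three errors listed above.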
> I will first attempt to fit the necessary adaptation there.
Great.
> I'll see if I can use the existing mechanism for issues in
> Content-Range headers (_degrade_range_hint in http/__init__.py).
> According to its documentation, the degradation flag propagates
> whenever the connection object is cloned - not sure when this is
> done, though.
Hmm, yeah, it's a bit unclear and hairy (and there are edge cases where
it's bogus):
- it starts at None,
- it's set once the first multi-range request succeeds (in whatever form
including retries),
- it propagates by cloning (see __init__ in HttpTransportBase).
So if cloning occurs before the first successful request, each clone
has to rediscover it.
A better way would be to make this attribute part of the connection
parameters (which contains only the authentication data so far). Don't
worry about that too much in a first step if you feel it's too complex.
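To make the suggestion concrete, here is a sketch (hypothetical class and attribute names, not bzrlib's actual code) of how sharing the hint through a mutable connection-parameters object makes it visible to every clone, instead of each clone rediscovering it:

```python
# Sketch: the range hint lives in a dict shared *by reference* between
# a transport and its clones, so one degradation is seen by all.

class Transport:
    def __init__(self, base, _params=None):
        self.base = base
        if _params is None:
            # in bzr this dict would also carry the authentication data
            _params = {"range_hint": "multi"}
        self._params = _params

    def clone(self, relpath):
        # pass the *same* dict, not a copy
        return Transport(self.base + relpath, _params=self._params)

    def degrade_range_hint(self):
        self._params["range_hint"] = "single"


root = Transport("http://example.com/")
child = root.clone("branch/")
child.degrade_range_hint()
print(root._params["range_hint"])  # "single": the root sees it too
```

The key design point is that clone() hands over the same dictionary object rather than copying it, so the hint behaves like a per-connection property rather than a per-transport one.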
Anyway, the place where you should detect the bug and raise a specific
exception is probably http/response.py; catching it should occur
where the hint is handled.
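The overall detect-degrade-retry shape being described might look like the following. All names here are illustrative stand-ins, not the real bzrlib functions:

```python
# Sketch of the pattern: the response parser raises a specific
# exception; the _readv-level caller catches it, degrades the hint
# once, and retries with individual single-range requests.

class InvalidHttpResponse(Exception):
    pass


def readv(ranges, request, state):
    try:
        if state["range_hint"] == "multi":
            return request(ranges)              # one multi-range request
        return [request([r])[0] for r in ranges]
    except InvalidHttpResponse:
        state["range_hint"] = "single"          # degrade exactly once...
        return [request([r])[0] for r in ranges]  # ...then retry


# Fake request function standing in for the real transport; it mimics
# a proxy that mangles multi-range (multipart) replies only.
def fake_request(ranges):
    if len(ranges) > 1:
        raise InvalidHttpResponse("proxy mangled the multipart reply")
    return ["data-%d-%d" % ranges[0]]


state = {"range_hint": "multi"}
print(readv([(0, 10), (20, 10)], fake_request, state))
# -> ['data-0-10', 'data-20-10']
print(state["range_hint"])  # -> single
```

As discussed below, the cost is one failed request per client before the hint flips; after that, every further request goes out as a single range over the kept-alive connection.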
> Even if it doesn't propagate and we're not doing requests in
> parallel, we'll be using the same HTTP request object for the
> remaining requests --- and that's the one who's keeping the HTTP
> connection open.
You mean connection, not request, here, right? The connection attribute
in the request object is the one that is shared between the transports,
and this is achieved via the request objects, but we create a request
object for... each request.
>> This is also bad for the client and in the end for the user as it
>> means the connection will have to be re-established incurring
>> latency penalty.
> Regarding latency I think flag on fail is a good solution. A user
> not affected by evil proxies will not take a performance hit. For
> the rest of us this re-establishment will happen exactly once per
> HTTP client object and then the degraded mode gets activated and
> successive requests over the same connection will request
> individual ranges and the connection will be kept alive. This is
> definitely preferable over a broken checkout. :)
Yes, that's the idea.
> Regarding a config option I have a few reasons to refrain from
> doing so:
> 1. For an ordinary user it will be very difficult to determine
> whether he's affected by this issue or not. He'd have to be aware
> of a transparent proxy being in place and he'd have to be aware
> that those bazaar requests are quite uncommon. Most of the web
> works very well without HTTP multipart responses. ;)
But it's almost always used by bzr when supported by the server. And
once it's established that a server is affected, the option can be set
in such a case.
> 2. Affected are only users who fetch with bazaar through
> HTTP. That means that, amongst the people behind evil proxies, only
> people are hit that are *not* fetching sources using typical
> developer transport connections (like https or ssh).
Oh, you're right, https is immune!
> Most projects offer .tar.gz's and you'd ordinarily download
> these. I only stumbled over the issue myself because I was
> writing a live ebuild for Gentoo that checks out another project
> using bazaar.
> 3. /me would have to fiddle with two probably quite distant (in
> terms of cross references) parts of bazaar. I leave that to you
> pros. :-P
No problem with that. Correctly detecting the behavior is the most
important part, we can help you for the rest or even handle it
ourselves.
>>>>> Miriup writes:
<snip/>