Large POST fails with createComment on merge proposal

Bug #512552 reported by dobey
14
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Launchpad itself
Fix Released
High
Unassigned
Tarmac
Invalid
Undecided
Unassigned

Bug Description

We're using tarmac to land branches for the various ubuntuone client projects. Over the past few days, we've noticed a lot of weirdness with getting 502 or 503 errors, with ubuntuone-client. After more debugging today, it appears that when tests fail, the tarmac hook creates a comment with the test output, so that the developer can fix it, and know what failed. The output is now an estimated approximate 5000 lines of text. The createComment() call on the branch_merge_proposal is giving us the 50x errors, and it appears to be due to the size of the POST data. I've created a test branch on another project which has a very small test suite, and behavior is as expected. I then created a test branch of ubuntuone-client to force a test failure, and every time, the createComment() calls with a 50x error. I suspect this is probably an issue in either the squid proxy config, or a bug in squid itself perhaps.

Curtis Hovey (sinzui)
affects: launchpad → launchpad-code
Revision history for this message
Tim Penhey (thumper) wrote :

Leonard, any ideas?

Changed in launchpad-code:
assignee: nobody → Leonard Richardson (leonardr)
status: New → Incomplete
Revision history for this message
Leonard Richardson (leonardr) wrote :

I think you should look at the full body of the HTTP response, and then look into a squid bug or (more likely) misconfiguration, for three reasons.

1. When lazr.restful raises an exception the HTTP response will contain a stack trace or at least an OOPS-ID. If this is not present, the lazr.restful code is not running. If it is, you'll have some specific information about where lazr.restful choked.

2. There's no reason why a large POST body should choke lazr.restful (unless it's so big we run out of memory), but it's probably not an unusual setting for a squid proxy config.

3. lazr.restful never sends a 502 status code, and only sends a 503 in one obscure case (RequestExpired) which I don't think is what's happening here. In general, when lazr.restful raises an exception, the status code is 500.

I can run some local tests if it turns out lazr.restful is to blame, or if you want more information about how lazr.restful handles POSTs. How big a POST body are we talking about? 5000 lines of text can't be more than 40k, so memory usage shouldn't be an issue.

Revision history for this message
dobey (dobey) wrote :

Here is the traceback. I don't see any OOPS IDs, but you can see restfulclient is doing some stuff. I'm pretty certain that the blame belongs with squid here, at this point.

Traceback (most recent call last):
  File "./tarmac-lander", line 25, in <module>
    script.main()
  File "/home/dobey/Projects/canonical/tarmac/run-tree/tarmac/bin.py", line 244, in main
    trunk)
  File "/home/dobey/Projects/canonical/tarmac/run-tree/tarmac/hooks.py", line 33, in fire
    hook(*args, **kwargs)
  File "/home/dobey/Projects/canonical/tarmac/run-tree/tarmac/plugins/runtest.py", line 64, in __call__
    self.do_failed(stdout_value, stderr_value)
  File "/home/dobey/Projects/canonical/tarmac/run-tree/tarmac/plugins/runtest.py", line 89, in do_failed
    self.candidate.createComment(subject=subject, content=comment)
  File "/usr/lib/python2.6/dist-packages/lazr/restfulclient/resource.py", line 460, in __call__
    url, in_representation, http_method, extra_headers=extra_headers)
  File "/usr/lib/python2.6/dist-packages/lazr/restfulclient/_browser.py", line 204, in _request
    raise HTTPError(response, content)
lazr.restfulclient.errors.HTTPError: HTTP Error 503: Service Unavailable

Revision history for this message
Leonard Richardson (leonardr) wrote :

Can you catch the HTTPError and print out the .content? If you're using a recent launchpadlib it's likely that the content is already being printed out and it's empty, but just to make sure.

Revision history for this message
dobey (dobey) wrote :

It looks like it's a timeout, and the HTTPError.content has a bunch of HTML (the standard LP timeout error HTML page), and an OOPS id inside it. The OOPS id is: OOPS-1487EA832

Revision history for this message
Leonard Richardson (leonardr) wrote :

It looks like the POST data is causing a database timeout, or slowing down some other code so much that the request as a whole times out. Do you also get a timeout if you use the website to do the equivalent operation with a huge chunk of data?

Revision history for this message
dobey (dobey) wrote :

I haven't tried pasting the data in the web page. I have noticed some timeout errors when trying to view my list of bugs. I'll try and let you know.

Revision history for this message
dobey (dobey) wrote :

The in-line editor pops up an error dialog with the following content:

The following errors were encountered:

    * Server error, please contact an administrator.

Looks like it is indeed timing out on the web form as well with the approximately the same size data.

Revision history for this message
Tim Penhey (thumper) wrote : Re: [Bug 512552] Re: Large POST fails with createComment on merge proposal

On Wed, 27 Jan 2010 10:10:11 Rodney Dawes wrote:
> The in-line editor pops up an error dialog with the following content:
>
> The following errors were encountered:
>
> * Server error, please contact an administrator.
>
> Looks like it is indeed timing out on the web form as well with the
> approximately the same size data.
>

How big is the comment you are adding?

Revision history for this message
dobey (dobey) wrote :

5195 lines, 388K, plus a few bytes (maybe 1k) for the subject text, and the short "The attempt to merge foo into bar has failed..." message at the top of the body.

Revision history for this message
Leonard Richardson (leonardr) wrote :

The original bug says "The output is now an estimated approximate 5000 lines of text. ", which would put it at maybe 400 kilobytes.

Is it possible to make the timeout proportional to the size of the entity-body?

Revision history for this message
dobey (dobey) wrote :

Should this be changed to Confirmed or something, also? I don't think it's quite "Incomplete" at this point, is it? :) Or is there any more info you need from me?

Paul Hummer (rockstar)
Changed in launchpad-code:
status: Incomplete → Triaged
importance: Undecided → High
Revision history for this message
Leonard Richardson (leonardr) wrote :

This might not be a helpful thing to say, but should you really be putting 400k of data into a bug comment? Maybe you should be putting the test output into a bug attachment. I don't know if this would avoid the timeout, but it seems like a better design.

Revision history for this message
Elliot Murphy (statik) wrote : Re: [Bug 512552] Re: Large POST fails with createComment on merge proposal

On Thu, Jan 28, 2010 at 12:15 PM, Leonard Richardson
<email address hidden> wrote:
> This might not be a helpful thing to say, but should you really be
> putting 400k of data into a bug comment? Maybe you should be putting the
> test output into a bug attachment. I don't know if this would avoid the
> timeout, but it seems like a better design.

Do merge proposals have attachments? I thought they only had comments.
--
Elliot Murphy | https://launchpad.net/~statik/

Revision history for this message
dobey (dobey) wrote : Re: [Bug 512552] Re: Large POST fails with createComment on merge proposal

On Thu, 2010-01-28 at 18:28 +0000, Elliot Murphy wrote:
> On Thu, Jan 28, 2010 at 12:15 PM, Leonard Richardson
> <email address hidden> wrote:
> > This might not be a helpful thing to say, but should you really be
> > putting 400k of data into a bug comment? Maybe you should be putting the
> > test output into a bug attachment. I don't know if this would avoid the
> > timeout, but it seems like a better design.
>
> Do merge proposals have attachments? I thought they only had comments.

No, they only have comments. But attachments don't seem to have this
problem, presumably because the data doesn't go into the db. If bug
attachments were failing, I'm sure we'd have a LOT more people
complaining about apport not working.

As for not doing so, I would love to only put the failures in the
comment, but trial doesn't work that way, and there isn't a particularly
easy way to trim the output from stdout/stderr to only give what we
really need to know. And if the build fails, we want to know where and
why and see the build logs. Maybe a longer term solution would be to
allow attachments on merge proposals, so we can just attach the build
logs, etc...

Revision history for this message
Tim Penhey (thumper) wrote :

On Fri, 29 Jan 2010 07:28:21 Elliot Murphy wrote:
> On Thu, Jan 28, 2010 at 12:15 PM, Leonard Richardson
>
> <email address hidden> wrote:
> > This might not be a helpful thing to say, but should you really be
> > putting 400k of data into a bug comment? Maybe you should be putting the
> > test output into a bug attachment. I don't know if this would avoid the
> > timeout, but it seems like a better design.
>
> Do merge proposals have attachments? I thought they only had comments.

They don't yet, but we are considering it and would like them.

Paul Hummer (rockstar)
affects: launchpad-code → launchpad-foundations
Changed in tarmac:
status: New → Invalid
Revision history for this message
Robert Collins (lifeless) wrote :

So, attachments still have to be processed. Processing 400K of data shouldn't be an issue; and I know other tarmac jobs have added much larger errors successfully( one did 16MB recently !)

So I think the underlying performance issue is fixed, and am closing this accordingly.

Changed in launchpad:
assignee: Leonard Richardson (leonardr) → nobody
status: Triaged → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.