Comment 10 for bug 505078

Martin Pool (mbp) wrote : Re: [Bug 505078] Re: crashes with "invalid literal for int() with base 16: ''" if linefeeds are missing from the file

2010/1/11 Robert Collins <email address hidden>:
> I have the following constraints here:
>  - I want something that doesn't require arbitrary buffering per
> attachment.
>  - It should be dead simple to output. E.g. the shell bindings should be
> able to do it. (This is a preference not a hard requirement.)
>  - Parsing can be a bit harder. I feel strongly that it should be
> something that already exists, so that new parsers don't need to be
> written (or if they do need to be written they can be reused by other
> things).
>  - It shouldn't add a lot of overhead in the common case: most channels
> on the internet today (e.g. launchpad attachments) are 8 bit clean, and
> test runs with tens of thousands of entries could suffer significant
> inflation if a large overhead scheme is chosen.
>
> I appreciate that you want to be able to copy and paste segments of a
> stream and like the flexibility that that will offer: do you think the
> constraints above are reasonable?

I think those are an excellent set of constraints.

I do want to be able to copy-and-paste a stream, or chop it up using
text-based tools. But even more than that, I don't want it to _look_
like I can when in fact I can't. It would in some ways be better if it
were binary garbage than text with invisible but critical markers.

base64 of attachments seems to meet all of these: it streams; shell
programs can use base64 from GNU coreutils; and the output is only
about 33% bigger than the input (a little more with line wrapping).
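
To illustrate the streaming and overhead points, here is a minimal Python
sketch. It is not subunit code: the names encode_stream, demo and CHUNK
are made up for this example, and it assumes a buffered binary source
whose read(n) returns fewer than n bytes only at end of input. Reading
in multiples of 3 bytes means every chunk except the last encodes
without padding, so the pieces can simply be concatenated.

```python
import base64
import io

# Hypothetical chunk size; any multiple of 3 works, because 3 input
# bytes map to exactly 4 base64 characters with no padding.
CHUNK = 3 * 1024


def encode_stream(src, dst, chunk_size=CHUNK):
    """Copy src to dst as base64, holding at most one chunk in memory.

    Assumes src is a buffered binary stream, so only the final read
    can be shorter than chunk_size.
    """
    while True:
        block = src.read(chunk_size)
        if not block:
            break
        dst.write(base64.b64encode(block))


def demo():
    """Encode a sample attachment and return (raw size, encoded size)."""
    raw = bytes(range(256)) * 40  # 10240 bytes, not 8-bit-clean text
    out = io.BytesIO()
    encode_stream(io.BytesIO(raw), out)
    encoded = out.getvalue()
    # The concatenated chunks decode back to the original bytes.
    assert base64.b64decode(encoded) == raw
    return len(raw), len(encoded)


if __name__ == "__main__":
    raw_len, enc_len = demo()
    print("raw=%d encoded=%d ratio=%.3f" % (raw_len, enc_len, enc_len / raw_len))
```

The encoded size is 4 bytes for every 3 bytes of input, rounded up to a
whole group, which is where the roughly one-third inflation comes from.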

The main drawback, which is perhaps not negligible, is that it would
mean you could no longer directly read the attachments. Since the
attachments include the traceback, this would be a fairly severe
problem for e.g. "bzr selftest --subunit | subunit-filter | less". So
the question is whether there is another constraint, namely that text
attachments should be directly readable in the stream, or whether the
raw stream is only for debugging.

--
Martin <http://launchpad.net/~mbp/>