Comment 9 for bug 505078

Robert Collins (lifeless) wrote : Re: [Bug 505078] Re: crashes with "invalid literal for int() with base 16: ''" if linefeeds are missing from the file

On Mon, 2010-01-11 at 07:01 +0000, Martin Pool wrote:

> > Do you mean transfer encodings perhaps? They already do specify a
> > content encoding?
>
> really? where?

Sorry, yes: Content-Type only.

> they seem to only specify a content-type at the moment.
>
> istm that what is wanted here is a content-transfer-encoding which
> <http://en.wikipedia.org/wiki/MIME#Content-Transfer-Encoding>

Yes, something like that might work. I was doing something conceptually
similar by using chunked encoding.

> > 1. It indicates whether or not a binary-to-text encoding scheme has been used on top of the original encoding as specified within the Content-Type header, and
> > 2. If such a binary-to-text encoding method has been used it states which one.
>
> however, this doesn't mark boundaries between attachments.

Currently they are self-delimiting, so that's not an issue.

I have the following constraints here:
 - I want something that doesn't require arbitrary buffering per
attachment.
 - It should be dead simple to output, e.g. the shell bindings should be
able to do it. (This is a preference, not a hard requirement.)
 - Parsing can be a bit harder. I feel strongly that it should be
something that already exists, so that new parsers don't need to be
written (or if they do need to be written they can be reused by other
things).
 - It shouldn't add a lot of overhead in the common case: most channels
on the internet today (e.g. Launchpad attachments) are 8-bit clean, and
test runs with tens of thousands of entries could suffer significant
inflation if a high-overhead scheme is chosen.
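
To make these constraints concrete, here is a minimal sketch of a
streaming chunked writer and reader. It assumes HTTP/1.1-style framing
(hex length, CRLF, data, CRLF, zero-length terminator), which is not
necessarily the exact wire format in use here. The names write_chunked
and read_chunked are hypothetical. Neither side buffers more than one
chunk, and the overhead is compared against base64 as an example of a
binary-to-text encoding.

```python
import base64
import io


def write_chunked(source, sink, chunk_size=4096):
    """Copy source to sink using HTTP/1.1-style chunked framing.

    Only chunk_size bytes are held in memory at a time.
    """
    while True:
        block = source.read(chunk_size)
        if not block:
            break
        sink.write(b"%X\r\n" % len(block))  # chunk length in hex
        sink.write(block)
        sink.write(b"\r\n")
    sink.write(b"0\r\n\r\n")  # zero-length chunk terminates the body


def read_chunked(source):
    """Yield each chunk's payload until the zero-length terminator."""
    while True:
        size_text = source.readline().strip()
        if not size_text:
            # An empty length line would otherwise reach int('', 16).
            raise ValueError("missing chunk length")
        size = int(size_text, 16)
        if size == 0:
            source.readline()  # consume the final CRLF
            return
        data = source.read(size)
        if len(data) != size:
            raise ValueError("truncated chunk")
        source.readline()  # consume the CRLF that follows the data
        yield data


# Round-trip a 16 KiB binary payload and compare framing overhead.
payload = bytes(range(256)) * 64
framed = io.BytesIO()
write_chunked(io.BytesIO(payload), framed)
framed_bytes = framed.getvalue()
decoded = b"".join(read_chunked(io.BytesIO(framed_bytes)))

chunked_overhead = len(framed_bytes) - len(payload)
base64_overhead = len(base64.b64encode(payload)) - len(payload)
print(decoded == payload, chunked_overhead, base64_overhead)
```

With 4096-byte chunks the framing costs a few bytes per chunk on an
8-bit clean channel, whereas base64 inflates the payload by about a
third regardless of the channel.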

I appreciate that you want to be able to copy and paste segments of a
stream and like the flexibility that that will offer. Do you think the
constraints above are reasonable?

-Rob