Comment 5 for bug 368717

John A Meinel (jameinel) wrote :

I should also note that the lightweight checkout seems to have to download most, if not *all*, of the remote history anyway:

$ (grep "body bytes read" light_checkout.log | sed -e 's/.*[[:space:]]\+\([[:digit:]]\+\).*/\1/' | tr '\n' '+'; echo 0 ) | bc
56879036

$ (grep "byte part read" heavy_checkout.log | sed -e 's/.*[[:space:]]\+\([[:digit:]]\+\).*/\1/' | tr '\n' '+'; echo 0 ) | bc
53278569
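For anyone who would rather not chain grep, sed, tr and bc, here is a rough Python sketch of the same summation. It assumes, as the sed expression above does, that the byte count is the last whitespace-preceded number on each matching line. The function name and the sample log lines are made up for illustration and are not real bzr output.

```python
import re


def sum_logged_bytes(lines, marker):
    """Sum the last whitespace-preceded integer on each line containing marker."""
    total = 0
    for line in lines:
        if marker not in line:
            continue
        # Mirrors the sed expression: take the last run of digits that
        # directly follows whitespace.
        numbers = re.findall(r"\s(\d+)", line)
        if numbers:
            total += int(numbers[-1])
    return total


# Made-up sample lines, only to show the shape of the input.
sample = [
    "12.3       1024 body bytes read",
    "12.9       2048 body bytes read",
    "13.0  some unrelated log line",
]
print(sum_logged_bytes(sample, "body bytes read"))  # 3072
```

For a real log, pass an open file object instead of the sample list, e.g. `sum_logged_bytes(open("light_checkout.log"), "body bytes read")`.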

So the 'lightweight' checkout is downloading about 3.6MB *more* than the heavy checkout. (My guess is that it is reading the remote indexes, but I'm not positive on that.)

I would guess that none of the files in ~ubuntu-core-code have enough history for their delta chains to actually contain a fulltext (which can take up to 200 revisions), so getting the current text of a file means reading most of its history.

One possible optimization would be a dedicated RPC for 'iter_files_bytes()', which is the API that gets the actual file content during checkout. A checkout is 93MB, and a tar.gz of just the checkout is 21MB. So I guess a theoretically optimal fetch could download ~21MB instead of ~50MB.
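A rough sketch of one way to reproduce the "checkout size versus tar.gz size" comparison for another branch. This is not necessarily how the figures above were obtained. The function name is hypothetical, and skipping the `.bzr` control directory is an assumption about what "just the checkout" means.

```python
import io
import os
import tarfile


def checkout_sizes(root):
    """Return (raw_bytes, tar_gz_bytes) for the working files under root."""
    raw = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: the .bzr control directory is not part of the checkout.
        dirnames[:] = [d for d in dirnames if d != ".bzr"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                raw += os.path.getsize(path)

    def skip_control_dir(tarinfo):
        # Returning None drops the member (and everything beneath it).
        return None if ".bzr" in tarinfo.name.split("/") else tarinfo

    # Build the gzip-compressed tarball in memory just to measure its size.
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        tar.add(root, arcname=".", filter=skip_control_dir)
    return raw, len(buf.getvalue())
```

Calling `checkout_sizes("path/to/checkout")` gives the two numbers to compare. The compressed size is only a rough lower bound on what an ideal fetch would transfer, since a real RPC would have its own framing overhead.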

However, if the best we could possibly do is only 2x faster than getting everything, I'm not sure it is worth spending a lot of time trying to optimize for the lightweight version.

Note that other codebases would experience different results. I think for 'python' it was 20MB versus 200MB for a lightweight checkout versus whole-history, which is a more interesting case.
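To put the two cases side by side, here is the arithmetic as a tiny sketch. It uses only the approximate figures quoted in this comment and assumes transfer time scales with bytes downloaded, which ignores round trips and server-side work. The function name is made up.

```python
def transfer_ratio(larger_mb, smaller_mb):
    """How many times more data the larger transfer moves than the smaller one."""
    return larger_mb / smaller_mb


# ~50MB actually fetched versus the ~21MB theoretical optimum for this branch.
print(round(transfer_ratio(50, 21), 1))   # 2.4
# 'python': ~200MB whole-history versus ~20MB lightweight, as recalled above.
print(round(transfer_ratio(200, 20), 1))  # 10.0
```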