Bundle (bzr send) broken with --2a format

Bug #393349 reported by Roman Erzhukov
16
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Bazaar
Fix Released
Critical
John A Meinel

Bug Description

It turns out that the code that generates bundles is broken wrt --2a format repositories.

When it generates data to put into the bundle it reads the raw inventory texts from the repository, and inserts them into the bundle. But it does *not* include any chk pages. This means that the real "inventory" data is completely unavailable in the generated bundle. We don't discover this until we try to insert the bundle into another repository and then find that we can't read any of the chk pages.

Options:

1) Instead of using "inventories.get_record_stream()/get_mpdiffs()" use logical Inventory objects and generate an XML data stream for them. (all we need is some sort of 'whole tree' information, enough that we could regenerate the chk pages.)
2) Teach the bundle code that after generating an mp_diff of the inventory, it needs to go find the chk pages and include them as well.

I originally thought this was about stacking, but it isn't. The data about tree shape just isn't present in the bundles generated from a --2a repository.

I'm leaving the traceback since that may help people who run into this find the bug.

> bzr pull ../my.patch

bzr: ERROR: exceptions.AttributeError: 'AbsentContentFactory' object has no attribute 'get_bytes_as'

Traceback (most recent call last):
  File "bzrlib\commands.pyo", line 729, in exception_to_return_code
  File "bzrlib\commands.pyo", line 924, in run_bzr
  File "bzrlib\commands.pyo", line 560, in run_argv_aliases
  File "bzrlib\builtins.pyo", line 940, in run
  File "bzrlib\merge_directive.pyo", line 607, in get_merge_request
  File "bzrlib\merge_directive.pyo", line 612, in _maybe_verify
  File "bzrlib\merge_directive.pyo", line 593, in _verify_patch
  File "bzrlib\merge_directive.pyo", line 190, in _generate_diff
  File "bzrlib\diff.pyo", line 427, in show_diff_trees
  File "bzrlib\diff.pyo", line 834, in show_diff
  File "bzrlib\diff.pyo", line 857, in _show_diff
  File "bzrlib\revisiontree.pyo", line 248, in iter_changes
  File "bzrlib\inventory.pyo", line 1815, in iter_changes
  File "bzrlib\chk_map.pyo", line 245, in iter_changes
  File "bzrlib\chk_map.pyo", line 126, in _ensure_root
  File "bzrlib\chk_map.pyo", line 139, in _get_node
  File "bzrlib\chk_map.pyo", line 150, in _read_bytes
AttributeError: 'AbsentContentFactory' object has no attribute 'get_bytes_as'

bzr 1.16.1 on python 2.5.2 (win32)
arguments: ['C:\\Program Files\\Bazaar\\bzr.EXE', 'pull', '../my.patch']
encoding: 'cp1251', fsenc: 'mbcs', lang: None
plugins:
  bzrtools C:\Program Files\Bazaar\plugins\bzrtools [1.16]
  docdiff C:\Program Files\Bazaar\plugins\docdiff [1.0dev]
  launchpad C:\Program Files\Bazaar\plugins\launchpad [1.16.1]
  netrc_credential_store C:\Program Files\Bazaar\plugins\netrc_credential_store [1.16.1]
  qbzr C:\Program Files\Bazaar\plugins\qbzr [0.11]
  rebase C:\Program Files\Bazaar\plugins\rebase [0.5.1]
  svn C:\Program Files\Bazaar\plugins\svn [0.6.2]
*** Bazaar has encountered an internal error.
    Please report a bug at https://bugs.launchpad.net/bzr/+filebug
    including this traceback, and a description of what you
    were doing when the error occurred.

Related branches

Revision history for this message
Jonathan Lange (jml) wrote :

We're also getting this error on Launchpad, see OOPS-1283SMPU324 for example.

Changed in bzr:
importance: Undecided → High
status: New → Triaged
Revision history for this message
Robert Collins (lifeless) wrote :

Showstoppers like this are critical bugs, they come before improving things that work. Bumping to critical.
Jml - is the backtrace identical? If not, can you extract the tail of it and include it in the bug please.

Changed in bzr:
importance: High → Critical
Changed in bzr:
milestone: none → 2.0
Revision history for this message
Jonathan Lange (jml) wrote :
Download full text (3.4 KiB)

  File ELIDED, line NNN, in someFunction
    dest_branch.pull(source_branch, overwrite=True)
  File "<string>", line 4, in pull_write_locked
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/branch.py", line 855, in pull
    possible_transports=possible_transports, *args, **kwargs)
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/branch.py", line 3144, in pull
    _override_hook_target=_override_hook_target)
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/branch.py", line 3020, in pull
    overwrite=overwrite, graph=graph)
  File "<string>", line 4, in update_revisions_write_locked
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/branch.py", line 808, in update_revisions
    overwrite, graph)
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/branch.py", line 2963, in update_revisions
    self.target.fetch(self.source, stop_revision)
  File "<string>", line 4, in fetch_write_locked
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/branch.py", line 532, in fetch
    pb=pb)
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/repository.py", line 1553, in fetch
    find_ghosts=find_ghosts, fetch_spec=fetch_spec)
  File "<string>", line 4, in fetch_write_locked
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/repository.py", line 3139, in fetch
    pb=pb, find_ghosts=find_ghosts)
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/fetch.py", line 82, in __init__
    self.__fetch()
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/fetch.py", line 108, in __fetch
    self._fetch_everything_for_search(search)
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/fetch.py", line 135, in _fetch_everything_for_search
    resume_tokens, missing_keys = self.sink.insert_stream(
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/repository.py", line 4047, in insert_stream
    return self._locked_insert_stream(stream, src_format, is_resume)
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/repository.py", line 4089, in _locked_insert_stream
    self.target_repo.chk_bytes.insert_record_stream(substream)
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/groupcompress.py", line 1369, in insert_record_stream
    for _ in self._insert_record_stream(stream, random_id=False):
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/groupcompress.py", line 1423, in _insert_record_stream
    for record in stream:
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/repofmt/groupcompress_repo.py", line 932, in _filter_id_to_entry
    self._chk_id_roots, uninteresting_root_keys):
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/chk_map.py", line 1414, in iter_interesting_nodes
    interesting_root_keys, uninteresting_root_keys, pb)
  File "/srv/bazaar.launchpad.net/production/launchpad-rev-8204/lib/bzrlib/chk_map.py", line 1351, in _find_all_uninteresting
    pb=pb...

Read more...

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 393349] Re: exceptions.AttributeError: 'AbsentContentFactory' object has no attribute 'get_bytes_as'

the stack traces look different;, however that may be misleading; the
could be the same bug. Lets treat that as the case for now.

I think to move this forward we should make sure there is a test for the
particular set of interactions making this happen; and if its a
behavioural skew between implementations we should make it an interface
test.

-Rob

Revision history for this message
John A Meinel (jameinel) wrote : Re: exceptions.AttributeError: 'AbsentContentFactory' object has no attribute 'get_bytes_as'

I'll mention that rev-8204 looks a lot more like bug #390563 while the original posting here is different.

The root cause may be the same/similar (needs investigation) but the code triggering it is quite different. (chk_map.CHKMap.iter_changes() versus chk_map.iter_interesting_nodes())

The original posting also is going through bundle codepaths, which has its own logic about what needs to be put into the bundle, and how that gets pulled out and back into the repository. And certainly that *could* be incorrect and not inserting enough. (It is unlikely, but since 'bzr merge bundle' is triggering the traceback and 'bzr merge branch' isn't....)

Anyway, I'm guessing they are separate issues, but will be investigating.

Changed in bzr:
assignee: nobody → John A Meinel (jameinel)
Revision history for this message
John A Meinel (jameinel) wrote :

So it does, indeed, seem possible to trigger this via bundles. I'll repeat my comment from bug #390563 describing the problem.

1) A stacked branch is valid if:
   a) For the list of Revisions that are present, has the corresponding
      inventory for all of those revisions (at least a delta)
   b) for the delta of all those inventories, has the corresponding file texts
   c) for any base which an inventory is delta'd against, has the corresponding
      inventory.
      If we don't have a base, then we require the full inventory for the
      revision, and all referenced file texts.

2) With bundles, it violates (c). The insert-data-from bundle inserts all
   of (a) and (b), but it doesn't bring in the inventories from parent
   revisions, nor does it include the referenced texts.

3) Note also that *commit* does the wrong thing as well. For heavyweight
checkouts, I think it does fine, because it generates the data locally, and
then uses fetch to push it to the remote. But for lightweight checkouts, it
doesn't 'fill in' any parent inventories or referenced texts.

Here is a test case that reproduces it. Note that we may not need all 500
files, but we need something large enough to cause the CHK pages to split.

# create a 500 file tree
bzr init --2a source
cd source
for i in `seq 10`; do mkdir $i; for j in `seq 50; do touch $i/$j; done; done

bzr add -q
bzr commit -q -m "add 500 files"

echo foo > 5/5
bzr commit -m "5/5"

echo bar > 6/6
bzr commit -m "6/6"

cd ..
bzr branch source target
echo baz >target/7/7
bzr commit -m "7/7" target

(cd target; bzr send -o ../my.patch ../source)

bzr branch --stacked source stacked

cd stacked
bzr pull ../my.patch #boom

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 393349] Re: exceptions.AttributeError: 'AbsentContentFactory' object has no attribute 'get_bytes_as'

If you're looking at changing bundle representation, it may be worth
noting that we do have an inventory delta serialiser in-tree now.

-Rob

John A Meinel (jameinel)
summary: - exceptions.AttributeError: 'AbsentContentFactory' object has no
- attribute 'get_bytes_as'
+ Bundles broken with --2a format
description: updated
Samuel Bronson (naesten)
summary: - Bundles broken with --2a format
+ Bundle (bzr send) broken with --2a format
Martin Pool (mbp)
Changed in bzr:
status: Triaged → In Progress
Revision history for this message
Wouter van Heyst (larstiq) wrote :

With current bzr.dev including r4588 "(jam) Get bundles working with --2a (bug #393349)", the recipe from comment #6 doesn't go boom. Does this mean the bug is fixed?

Revision history for this message
John A Meinel (jameinel) wrote :

bundles and 'bzr send' should now work with --2a formats. We probably want to revisit a new bundle format for efficiency, but at least correctness is now handled.

Changed in bzr:
milestone: 2.0 → 1.18
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.