Comment 11 for bug 256409

Revision history for this message
Martin Pool (mbp) wrote : Re: dirstate internals question

(Again kind of stating the obvious in the hope of working up to the
actual bug.) So looking at the logfile, the user has removed the
directory but not run bzr rm on it, so presumably on entering commit
they are all still marked as present in the dirstate's current tree.

One thing I notice is that after 1.3 Robert added a check to
_apply_removals() to do with checking children are not left behind if
a parent is removed. This doesn't change the success condition but
might have caught a problem at a different place.

There are some 'noise' changes in that file with changes of assertions
to if/raise but I've checked them again and they all seem safe, and
anyhow are not closely related to this code path.

I think the files removed from the tree should be removed from
commit's in-progress inventory, and accumulated into a delta to be
later removed, by _report_and_accumulate_deletes from commit.py. In
trunk this does a set difference to find the removed files. Doing set
difference is generally something we want to avoid for performance but
not necessarily wrong. This code makes me suspicious of bugs in cases
like removing a file and then re-adding it with a different id. And
as the XXX in here says, these files are not properly sorted by
directory order, and we have had bugs related to that with dirstate in
the past.

I'm not sure if the code that later uses this list relies in it being
in directory order but we should certainly check. If this is the bug
it would account for it not being trivially reproducible unless you
have carefully constructed filenames.

    def _report_and_accumulate_deletes(self):
        # XXX: Could the list of deleted paths and ids be instead taken from
        # _populate_from_inventory?
        deleted_ids = set(self.basis_inv._byid.keys()) - \
            set(self.builder.new_inventory._byid.keys())
        if deleted_ids:
            self.any_entries_deleted = True
            deleted = [(self.basis_tree.id2path(file_id), file_id)
                for file_id in deleted_ids]
            deleted.sort()
            # XXX: this is not quite directory-order sorting
            for path, file_id in deleted:
                self._basis_delta.append((path, None, file_id, None))
                self.reporter.deleted(path)

--
Martin