Merge lp:~jameinel/bzr/2.0.1-faster-log-dir-bug374730 into lp:bzr/2.0

Proposed by John A Meinel
Status: Merged
Merged at revision: not available
Proposed branch: lp:~jameinel/bzr/2.0.1-faster-log-dir-bug374730
Merge into: lp:bzr/2.0
Diff against target: 843 lines
5 files modified
NEWS (+5/-0)
bzrlib/inventory.py (+146/-0)
bzrlib/tests/per_inventory/__init__.py (+41/-4)
bzrlib/tests/per_inventory/basics.py (+99/-147)
bzrlib/tests/test_inv.py (+218/-0)
To merge this branch: bzr merge lp:~jameinel/bzr/2.0.1-faster-log-dir-bug374730
Reviewer Review Type Date Requested Status
Robert Collins (community) Approve
Review via email: mp+12382@code.launchpad.net
To post a comment you must log in.
Revision history for this message
John A Meinel (jameinel) wrote :

This is my 'phase 1' for making 'bzr log DIR' faster.

It implements a custom 'CHKInventory.filter()' which only has to deal in the actual files that are requested, rather than always doing 'iter_entries()' and walking the whole tree.

This makes 'bzr log DIR' scale based on the number of entries in DIR. In my testing,

5m30s bzr.dev log tools
5m30s newbzr log bzrlib
0m36s newbzr log tools

So when the number of files is low, the time drops from 5m30s => 36s, which is great.

The next step is probably going to bypass 'filter()' for 'log DIR' and solely layer on top of 'iter_changes'. My initial (broken) testing showed that it stayed consistently around 36s for both bzrlib and tools, which is certainly a lot better.

I'd like to land this in the 2.0.x series, because it seems like a simple bug fix.

The size of the delta is a bit large, because it turned out that CHKInventory.filter() wasn't tested at all. (See my mailing list thread about what 'per_inventory' is actually supposed to be testing.) I went ahead and made 'per_inventory' actually care about both inventory implementations, and moved the tests around accordingly.

Revision history for this message
Robert Collins (lifeless) wrote :

On Thu, 2009-09-24 at 20:45 +0000, John A Meinel wrote:
> John A Meinel has proposed merging
> lp:~jameinel/bzr/2.0.1-faster-log-dir-bug374730 into lp:bzr/2.0.
>
> Requested reviews:
> bzr-core (bzr-core)
> Related bugs:
> #374730 log dir is slow in development-rich-root
> https://bugs.launchpad.net/bugs/374730
>
>
> This is my 'phase 1' for making 'bzr log DIR' faster.
>
> It implements a custom 'CHKInventory.filter()' which only has to deal
> in the actual files that are requested, rather than always doing
> 'iter_entries()' and walking the whole tree.
>
> This makes 'bzr log DIR' scale based on the number of entries in DIR.
> In my testing,
>
> 5m30s bzr.dev log tools
> 5m30s newbzr log bzrlib
> 0m36s newbzr log tools
>
> So when the number of files is low, the time drops from 5m30s => 36s,
> which is great.
>
> The next step is probably going to bypass 'filter()' for 'log DIR' and
> solely layer on top of 'iter_changes'. My initial (broken) testing
> showed that it stayed consistently around 36s for both bzrlib and
> tools, which is certainly a lot better.
>
> I'd like to land this in the 2.0.x series, because it seems like a
> simple bug fix.
>
> The size of the delta is a bit large, because it turned out that
> CHKInventory.filter() wasn't tested at all. (See my mailing list
> thread about what 'per_inventory' is actually supposed to be testing.)
> I went ahead and made 'per_inventory' actually care about both
> inventory implementations, and moved the tests around accordingly.

So this looks broadly fine; we didn't do much more than tune log for the
O(tree) datastore we had previously.

See my list response for thoughts on the test layout [short story -
test_inv.py is where I would have added tests].

 review +1

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'NEWS'
2--- NEWS 2009-09-17 17:39:34 +0000
3+++ NEWS 2009-09-25 19:03:36 +0000
4@@ -8,6 +8,11 @@
5 Bug Fixes
6 *********
7
8+* Improve the time for ``bzr log DIR`` for 2a format repositories.
9+ We had been using the same code path as for <2a formats, which required
10+ iterating over all objects in all revisions.
11+ (John Arbash Meinel, #374730)
12+
13 * Make sure that we unlock the tree if we fail to create a TreeTransform
14 object when doing a merge, and there is limbo, or pending-deletions
15 directory. (Gary van der Merwe, #427773)
16
17=== modified file 'bzrlib/inventory.py'
18--- bzrlib/inventory.py 2009-08-30 22:02:45 +0000
19+++ bzrlib/inventory.py 2009-09-25 19:03:36 +0000
20@@ -1197,6 +1197,14 @@
21 raise errors.InconsistentDelta("<deleted>", parent_id,
22 "The file id was deleted but its children were not deleted.")
23
24+ def create_by_apply_delta(self, inventory_delta, new_revision_id,
25+ propagate_caches=False):
26+ """See CHKInventory.create_by_apply_delta()"""
27+ new_inv = self.copy()
28+ new_inv.apply_delta(inventory_delta)
29+ new_inv.revision_id = new_revision_id
30+ return new_inv
31+
32 def _set_root(self, ie):
33 self.root = ie
34 self._byid = {self.root.file_id: self.root}
35@@ -1546,6 +1554,123 @@
36 else:
37 raise ValueError("unknown kind %r" % entry.kind)
38
39+ def _expand_fileids_to_parents_and_children(self, file_ids):
40+ """Give a more wholistic view starting with the given file_ids.
41+
42+ For any file_id which maps to a directory, we will include all children
43+ of that directory. We will also include all directories which are
44+ parents of the given file_ids, but we will not include their children.
45+
46+ eg:
47+ / # TREE_ROOT
48+ foo/ # foo-id
49+ baz # baz-id
50+ frob/ # frob-id
51+ fringle # fringle-id
52+ bar/ # bar-id
53+ bing # bing-id
54+
55+ if given [foo-id] we will include
56+ TREE_ROOT as interesting parents
57+ and
58+ foo-id, baz-id, frob-id, fringle-id
59+ As interesting ids.
60+ """
61+ interesting = set()
62+ # TODO: Pre-pass over the list of fileids to see if anything is already
63+ # deserialized in self._fileid_to_entry_cache
64+
65+ directories_to_expand = set()
66+ children_of_parent_id = {}
67+ # It is okay if some of the fileids are missing
68+ for entry in self._getitems(file_ids):
69+ if entry.kind == 'directory':
70+ directories_to_expand.add(entry.file_id)
71+ interesting.add(entry.parent_id)
72+ children_of_parent_id.setdefault(entry.parent_id, []
73+ ).append(entry.file_id)
74+
75+ # Now, interesting has all of the direct parents, but not the
76+ # parents of those parents. It also may have some duplicates with
77+ # specific_fileids
78+ remaining_parents = interesting.difference(file_ids)
79+ # When we hit the TREE_ROOT, we'll get an interesting parent of None,
80+ # but we don't actually want to recurse into that
81+ interesting.add(None) # this will auto-filter it in the loop
82+ remaining_parents.discard(None)
83+ while remaining_parents:
84+ if None in remaining_parents:
85+ import pdb; pdb.set_trace()
86+ next_parents = set()
87+ for entry in self._getitems(remaining_parents):
88+ next_parents.add(entry.parent_id)
89+ children_of_parent_id.setdefault(entry.parent_id, []
90+ ).append(entry.file_id)
91+ # Remove any search tips we've already processed
92+ remaining_parents = next_parents.difference(interesting)
93+ interesting.update(remaining_parents)
94+ # We should probably also .difference(directories_to_expand)
95+ interesting.update(file_ids)
96+ interesting.discard(None)
97+ while directories_to_expand:
98+ # Expand directories by looking in the
99+ # parent_id_basename_to_file_id map
100+ keys = [(f,) for f in directories_to_expand]
101+ directories_to_expand = set()
102+ items = self.parent_id_basename_to_file_id.iteritems(keys)
103+ next_file_ids = set([item[1] for item in items])
104+ next_file_ids = next_file_ids.difference(interesting)
105+ interesting.update(next_file_ids)
106+ for entry in self._getitems(next_file_ids):
107+ if entry.kind == 'directory':
108+ directories_to_expand.add(entry.file_id)
109+ children_of_parent_id.setdefault(entry.parent_id, []
110+ ).append(entry.file_id)
111+ return interesting, children_of_parent_id
112+
113+ def filter(self, specific_fileids):
114+ """Get an inventory view filtered against a set of file-ids.
115+
116+ Children of directories and parents are included.
117+
118+ The result may or may not reference the underlying inventory
119+ so it should be treated as immutable.
120+ """
121+ (interesting,
122+ parent_to_children) = self._expand_fileids_to_parents_and_children(
123+ specific_fileids)
124+ # There is some overlap here, but we assume that all interesting items
125+ # are in the _fileid_to_entry_cache because we had to read them to
126+ # determine if they were a dir we wanted to recurse, or just a file
127+ # This should give us all the entries we'll want to add, so start
128+ # adding
129+ other = Inventory(self.root_id)
130+ other.root.revision = self.root.revision
131+ other.revision_id = self.revision_id
132+ if not interesting or not parent_to_children:
133+ # empty filter, or filtering entrys that don't exist
134+ # (if even 1 existed, then we would have populated
135+ # parent_to_children with at least the tree root.)
136+ return other
137+ cache = self._fileid_to_entry_cache
138+ try:
139+ remaining_children = collections.deque(parent_to_children[self.root_id])
140+ except:
141+ import pdb; pdb.set_trace()
142+ raise
143+ while remaining_children:
144+ file_id = remaining_children.popleft()
145+ ie = cache[file_id]
146+ if ie.kind == 'directory':
147+ ie = ie.copy() # We create a copy to depopulate the .children attribute
148+ # TODO: depending on the uses of 'other' we should probably alwyas
149+ # '.copy()' to prevent someone from mutating other and
150+ # invaliding our internal cache
151+ other.add(ie)
152+ if file_id in parent_to_children:
153+ remaining_children.extend(parent_to_children[file_id])
154+ return other
155+
156 @staticmethod
157 def _bytes_to_utf8name_key(bytes):
158 """Get the file_id, revision_id key out of bytes."""
159@@ -1885,6 +2010,27 @@
160 # really we're passing an inventory, not a tree...
161 raise errors.NoSuchId(self, file_id)
162
163+ def _getitems(self, file_ids):
164+ """Similar to __getitem__, but lets you query for multiple.
165+
166+ The returned order is undefined. And currently if an item doesn't
167+ exist, it isn't included in the output.
168+ """
169+ result = []
170+ remaining = []
171+ for file_id in file_ids:
172+ entry = self._fileid_to_entry_cache.get(file_id, None)
173+ if entry is None:
174+ remaining.append(file_id)
175+ else:
176+ result.append(entry)
177+ file_keys = [(f,) for f in remaining]
178+ for file_key, value in self.id_to_entry.iteritems(file_keys):
179+ entry = self._bytes_to_entry(value)
180+ result.append(entry)
181+ self._fileid_to_entry_cache[entry.file_id] = entry
182+ return result
183+
184 def has_id(self, file_id):
185 # Perhaps have an explicit 'contains' method on CHKMap ?
186 if self._fileid_to_entry_cache.get(file_id, None) is not None:
187
188=== modified file 'bzrlib/tests/per_inventory/__init__.py'
189--- bzrlib/tests/per_inventory/__init__.py 2009-07-10 07:14:02 +0000
190+++ bzrlib/tests/per_inventory/__init__.py 2009-09-25 19:03:36 +0000
191@@ -16,15 +16,52 @@
192
193 """Tests for different inventory implementations"""
194
195-from bzrlib.tests import multiply_tests
196+from bzrlib import (
197+ groupcompress,
198+ tests,
199+ )
200
201 def load_tests(basic_tests, module, loader):
202 """Generate suite containing all parameterized tests"""
203 modules_to_test = [
204 'bzrlib.tests.per_inventory.basics',
205 ]
206- from bzrlib.inventory import Inventory
207- scenarios = [('Inventory', {'inventory_class':Inventory})]
208+ from bzrlib.inventory import Inventory, CHKInventory
209+
210+ def inv_to_chk_inv(test, inv):
211+ """CHKInventory needs a backing VF, so we create one."""
212+ factory = groupcompress.make_pack_factory(True, True, 1)
213+ trans = test.get_transport('chk-inv')
214+ trans.ensure_base()
215+ vf = factory(trans)
216+ # We intentionally use a non-standard maximum_size, so that we are more
217+ # likely to trigger splits, and get increased test coverage.
218+ chk_inv = CHKInventory.from_inventory(vf, inv,
219+ maximum_size=100,
220+ search_key_name='hash-255-way')
221+ return chk_inv
222+ scenarios = [('Inventory', {'_inventory_class': Inventory,
223+ '_inv_to_test_inv': lambda test, inv: inv
224+ }),
225+ ('CHKInventory', {'_inventory_class': CHKInventory,
226+ '_inv_to_test_inv': inv_to_chk_inv,
227+ })]
228 # add the tests for the sub modules
229- return multiply_tests(loader.loadTestsFromModuleNames(modules_to_test),
230+ return tests.multiply_tests(
231+ loader.loadTestsFromModuleNames(modules_to_test),
232 scenarios, basic_tests)
233+
234+
235+class TestCaseWithInventory(tests.TestCaseWithMemoryTransport):
236+
237+ _inventory_class = None # set by load_tests
238+ _inv_to_test_inv = None # set by load_tests
239+
240+ def make_test_inventory(self):
241+ """Return an instance of the Inventory class under test."""
242+ return self._inventory_class()
243+
244+ def inv_to_test_inv(self, inv):
245+ """Convert a regular Inventory object into an inventory under test."""
246+ return self._inv_to_test_inv(self, inv)
247+
248
249=== modified file 'bzrlib/tests/per_inventory/basics.py'
250--- bzrlib/tests/per_inventory/basics.py 2009-07-17 00:49:02 +0000
251+++ bzrlib/tests/per_inventory/basics.py 2009-09-25 19:03:36 +0000
252@@ -21,6 +21,8 @@
253
254 from bzrlib import (
255 errors,
256+ inventory,
257+ osutils,
258 )
259
260 from bzrlib.inventory import (
261@@ -28,22 +30,36 @@
262 InventoryEntry,
263 InventoryFile,
264 InventoryLink,
265- ROOT_ID,
266 TreeReference,
267 )
268
269-from bzrlib.tests import (
270- TestCase,
271- )
272-
273-
274-class TestInventory(TestCase):
275-
276- def make_inventory(self, root_id):
277- return self.inventory_class(root_id=root_id)
278+from bzrlib.tests.per_inventory import TestCaseWithInventory
279+
280+
281+
282+class TestInventory(TestCaseWithInventory):
283+
284+ def make_init_inventory(self):
285+ inv = inventory.Inventory('tree-root')
286+ inv.revision = 'initial-rev'
287+ inv.root.revision = 'initial-rev'
288+ return self.inv_to_test_inv(inv)
289+
290+ def make_file(self, file_id, name, parent_id, content='content\n',
291+ revision='new-test-rev'):
292+ ie = InventoryFile(file_id, name, parent_id)
293+ ie.text_sha1 = osutils.sha_string(content)
294+ ie.text_size = len(content)
295+ ie.revision = revision
296+ return ie
297+
298+ def make_link(self, file_id, name, parent_id, target='link-target\n'):
299+ ie = InventoryLink(file_id, name, parent_id)
300+ ie.symlink_target = target
301+ return ie
302
303 def prepare_inv_with_nested_dirs(self):
304- inv = self.make_inventory('tree-root')
305+ inv = inventory.Inventory('tree-root')
306 for args in [('src', 'directory', 'src-id'),
307 ('doc', 'directory', 'doc-id'),
308 ('src/hello.c', 'file', 'hello-id'),
309@@ -53,172 +69,98 @@
310 ('src/zz.c', 'file', 'zzc-id'),
311 ('src/sub/a', 'file', 'a-id'),
312 ('Makefile', 'file', 'makefile-id')]:
313- inv.add_path(*args)
314- return inv
315-
316-
317-class TestInventoryUpdates(TestInventory):
318-
319- def test_creation_from_root_id(self):
320- # iff a root id is passed to the constructor, a root directory is made
321- inv = self.make_inventory(root_id='tree-root')
322- self.assertNotEqual(None, inv.root)
323- self.assertEqual('tree-root', inv.root.file_id)
324-
325- def test_add_path_of_root(self):
326- # if no root id is given at creation time, there is no root directory
327- inv = self.make_inventory(root_id=None)
328- self.assertIs(None, inv.root)
329- # add a root entry by adding its path
330- ie = inv.add_path("", "directory", "my-root")
331- self.assertEqual("my-root", ie.file_id)
332- self.assertIs(ie, inv.root)
333-
334- def test_add_path(self):
335- inv = self.make_inventory(root_id='tree_root')
336- ie = inv.add_path('hello', 'file', 'hello-id')
337- self.assertEqual('hello-id', ie.file_id)
338- self.assertEqual('file', ie.kind)
339-
340- def test_copy(self):
341- """Make sure copy() works and creates a deep copy."""
342- inv = self.make_inventory(root_id='some-tree-root')
343- ie = inv.add_path('hello', 'file', 'hello-id')
344- inv2 = inv.copy()
345- inv.root.file_id = 'some-new-root'
346- ie.name = 'file2'
347- self.assertEqual('some-tree-root', inv2.root.file_id)
348- self.assertEqual('hello', inv2['hello-id'].name)
349-
350- def test_copy_empty(self):
351- """Make sure an empty inventory can be copied."""
352- inv = self.make_inventory(root_id=None)
353- inv2 = inv.copy()
354- self.assertIs(None, inv2.root)
355-
356- def test_copy_copies_root_revision(self):
357- """Make sure the revision of the root gets copied."""
358- inv = self.make_inventory(root_id='someroot')
359- inv.root.revision = 'therev'
360- inv2 = inv.copy()
361- self.assertEquals('someroot', inv2.root.file_id)
362- self.assertEquals('therev', inv2.root.revision)
363-
364- def test_create_tree_reference(self):
365- inv = self.make_inventory('tree-root-123')
366- inv.add(TreeReference('nested-id', 'nested', parent_id='tree-root-123',
367- revision='rev', reference_revision='rev2'))
368-
369- def test_error_encoding(self):
370- inv = self.make_inventory('tree-root')
371- inv.add(InventoryFile('a-id', u'\u1234', 'tree-root'))
372- e = self.assertRaises(errors.InconsistentDelta, inv.add,
373- InventoryFile('b-id', u'\u1234', 'tree-root'))
374- self.assertContainsRe(str(e), r'\\u1234')
375-
376- def test_add_recursive(self):
377- parent = InventoryDirectory('src-id', 'src', 'tree-root')
378- child = InventoryFile('hello-id', 'hello.c', 'src-id')
379- parent.children[child.file_id] = child
380- inv = self.make_inventory('tree-root')
381- inv.add(parent)
382- self.assertEqual('src/hello.c', inv.id2path('hello-id'))
383-
384-
385-class TestInventoryApplyDelta(TestInventory):
386+ ie = inv.add_path(*args)
387+ if args[1] == 'file':
388+ ie.text_sha1 = osutils.sha_string('content\n')
389+ ie.text_size = len('content\n')
390+ return self.inv_to_test_inv(inv)
391+
392+
393+class TestInventoryCreateByApplyDelta(TestInventory):
394 """A subset of the inventory delta application tests.
395
396 See test_inv which has comprehensive delta application tests for
397- inventories, dirstate, and repository based inventories, unlike the tests
398- here which only test in-memory implementations that can support a plain
399- 'apply_delta'.
400+ inventories, dirstate, and repository based inventories.
401 """
402-
403- def test_apply_delta_add(self):
404- inv = self.make_inventory('tree-root')
405- inv.apply_delta([
406- (None, "a", "a-id", InventoryFile('a-id', 'a', 'tree-root')),
407- ])
408- self.assertEqual('a', inv.id2path('a-id'))
409-
410- def test_apply_delta_delete(self):
411- inv = self.make_inventory('tree-root')
412- inv.apply_delta([
413- (None, "a", "a-id", InventoryFile('a-id', 'a', 'tree-root')),
414- ])
415- self.assertEqual('a', inv.id2path('a-id'))
416- a_ie = inv['a-id']
417- inv.apply_delta([("a", None, "a-id", None)])
418+ def test_add(self):
419+ inv = self.make_init_inventory()
420+ inv = inv.create_by_apply_delta([
421+ (None, "a", "a-id", self.make_file('a-id', 'a', 'tree-root')),
422+ ], 'new-test-rev')
423+ self.assertEqual('a', inv.id2path('a-id'))
424+
425+ def test_delete(self):
426+ inv = self.make_init_inventory()
427+ inv = inv.create_by_apply_delta([
428+ (None, "a", "a-id", self.make_file('a-id', 'a', 'tree-root')),
429+ ], 'new-rev-1')
430+ self.assertEqual('a', inv.id2path('a-id'))
431+ inv = inv.create_by_apply_delta([
432+ ("a", None, "a-id", None),
433+ ], 'new-rev-2')
434 self.assertRaises(errors.NoSuchId, inv.id2path, 'a-id')
435
436- def test_apply_delta_rename(self):
437- inv = self.make_inventory('tree-root')
438- inv.apply_delta([
439- (None, "a", "a-id", InventoryFile('a-id', 'a', 'tree-root')),
440- ])
441+ def test_rename(self):
442+ inv = self.make_init_inventory()
443+ inv = inv.create_by_apply_delta([
444+ (None, "a", "a-id", self.make_file('a-id', 'a', 'tree-root')),
445+ ], 'new-rev-1')
446 self.assertEqual('a', inv.id2path('a-id'))
447 a_ie = inv['a-id']
448- b_ie = InventoryFile(a_ie.file_id, "b", a_ie.parent_id)
449- inv.apply_delta([("a", "b", "a-id", b_ie)])
450+ b_ie = self.make_file(a_ie.file_id, "b", a_ie.parent_id)
451+ inv = inv.create_by_apply_delta([("a", "b", "a-id", b_ie)], 'new-rev-2')
452 self.assertEqual("b", inv.id2path('a-id'))
453
454- def test_apply_delta_illegal(self):
455+ def test_illegal(self):
456 # A file-id cannot appear in a delta more than once
457- inv = self.make_inventory('tree-root')
458- self.assertRaises(errors.InconsistentDelta, inv.apply_delta, [
459- ("a", "a", "id-1", InventoryFile('id-1', 'a', 'tree-root')),
460- ("a", "b", "id-1", InventoryFile('id-1', 'b', 'tree-root')),
461- ])
462+ inv = self.make_init_inventory()
463+ self.assertRaises(errors.InconsistentDelta, inv.create_by_apply_delta, [
464+ (None, "a", "id-1", self.make_file('id-1', 'a', 'tree-root')),
465+ (None, "b", "id-1", self.make_file('id-1', 'b', 'tree-root')),
466+ ], 'new-rev-1')
467
468
469 class TestInventoryReads(TestInventory):
470
471 def test_is_root(self):
472 """Ensure our root-checking code is accurate."""
473- inv = self.make_inventory('TREE_ROOT')
474- self.assertTrue(inv.is_root('TREE_ROOT'))
475+ inv = self.make_init_inventory()
476+ self.assertTrue(inv.is_root('tree-root'))
477 self.assertFalse(inv.is_root('booga'))
478- inv.root.file_id = 'booga'
479+ ie = inv['tree-root'].copy()
480+ ie.file_id = 'booga'
481+ inv = inv.create_by_apply_delta([("", None, "tree-root", None),
482+ (None, "", "booga", ie)], 'new-rev-2')
483 self.assertFalse(inv.is_root('TREE_ROOT'))
484 self.assertTrue(inv.is_root('booga'))
485- # works properly even if no root is set
486- inv.root = None
487- self.assertFalse(inv.is_root('TREE_ROOT'))
488- self.assertFalse(inv.is_root('booga'))
489
490 def test_ids(self):
491 """Test detection of files within selected directories."""
492- inv = self.make_inventory(ROOT_ID)
493+ inv = inventory.Inventory('TREE_ROOT')
494 for args in [('src', 'directory', 'src-id'),
495 ('doc', 'directory', 'doc-id'),
496 ('src/hello.c', 'file'),
497 ('src/bye.c', 'file', 'bye-id'),
498 ('Makefile', 'file')]:
499- inv.add_path(*args)
500+ ie = inv.add_path(*args)
501+ if args[1] == 'file':
502+ ie.text_sha1 = osutils.sha_string('content\n')
503+ ie.text_size = len('content\n')
504+ inv = self.inv_to_test_inv(inv)
505 self.assertEqual(inv.path2id('src'), 'src-id')
506 self.assertEqual(inv.path2id('src/bye.c'), 'bye-id')
507- self.assert_('src-id' in inv)
508+ self.assertTrue('src-id' in inv)
509
510 def test_non_directory_children(self):
511 """Test path2id when a parent directory has no children"""
512- inv = self.make_inventory('tree_root')
513- inv.add(InventoryFile('file-id','file',
514- parent_id='tree_root'))
515- inv.add(InventoryLink('link-id','link',
516- parent_id='tree_root'))
517+ inv = inventory.Inventory('tree-root')
518+ inv.add(self.make_file('file-id','file', 'tree-root'))
519+ inv.add(self.make_link('link-id','link', 'tree-root'))
520 self.assertIs(None, inv.path2id('file/subfile'))
521 self.assertIs(None, inv.path2id('link/subfile'))
522
523 def test_iter_entries(self):
524- inv = self.make_inventory('tree-root')
525- for args in [('src', 'directory', 'src-id'),
526- ('doc', 'directory', 'doc-id'),
527- ('src/hello.c', 'file', 'hello-id'),
528- ('src/bye.c', 'file', 'bye-id'),
529- ('src/sub', 'directory', 'sub-id'),
530- ('src/sub/a', 'file', 'a-id'),
531- ('Makefile', 'file', 'makefile-id')]:
532- inv.add_path(*args)
533+ inv = self.prepare_inv_with_nested_dirs()
534
535 # Test all entries
536 self.assertEqual([
537@@ -230,6 +172,8 @@
538 ('src/hello.c', 'hello-id'),
539 ('src/sub', 'sub-id'),
540 ('src/sub/a', 'a-id'),
541+ ('src/zz.c', 'zzc-id'),
542+ ('zz', 'zz-id'),
543 ], [(path, ie.file_id) for path, ie in inv.iter_entries()])
544
545 # Test a subdirectory
546@@ -238,6 +182,7 @@
547 ('hello.c', 'hello-id'),
548 ('sub', 'sub-id'),
549 ('sub/a', 'a-id'),
550+ ('zz.c', 'zzc-id'),
551 ], [(path, ie.file_id) for path, ie in inv.iter_entries(
552 from_dir='src-id')])
553
554@@ -247,6 +192,7 @@
555 ('Makefile', 'makefile-id'),
556 ('doc', 'doc-id'),
557 ('src', 'src-id'),
558+ ('zz', 'zz-id'),
559 ], [(path, ie.file_id) for path, ie in inv.iter_entries(
560 recursive=False)])
561
562@@ -255,24 +201,23 @@
563 ('bye.c', 'bye-id'),
564 ('hello.c', 'hello-id'),
565 ('sub', 'sub-id'),
566+ ('zz.c', 'zzc-id'),
567 ], [(path, ie.file_id) for path, ie in inv.iter_entries(
568 from_dir='src-id', recursive=False)])
569
570 def test_iter_just_entries(self):
571- inv = self.make_inventory('tree-root')
572- for args in [('src', 'directory', 'src-id'),
573- ('doc', 'directory', 'doc-id'),
574- ('src/hello.c', 'file', 'hello-id'),
575- ('src/bye.c', 'file', 'bye-id'),
576- ('Makefile', 'file', 'makefile-id')]:
577- inv.add_path(*args)
578+ inv = self.prepare_inv_with_nested_dirs()
579 self.assertEqual([
580+ 'a-id',
581 'bye-id',
582 'doc-id',
583 'hello-id',
584 'makefile-id',
585 'src-id',
586+ 'sub-id',
587 'tree-root',
588+ 'zz-id',
589+ 'zzc-id',
590 ], sorted([ie.file_id for ie in inv.iter_just_entries()]))
591
592 def test_iter_entries_by_dir(self):
593@@ -387,3 +332,10 @@
594 ('src/sub/a', 'a-id'),
595 ('src/zz.c', 'zzc-id'),
596 ], [(path, ie.file_id) for path, ie in new_inv.iter_entries()])
597+
598+ def test_inv_filter_entry_not_present(self):
599+ inv = self.prepare_inv_with_nested_dirs()
600+ new_inv = inv.filter(['not-present-id'])
601+ self.assertEqual([
602+ ('', 'tree-root'),
603+ ], [(path, ie.file_id) for path, ie in new_inv.iter_entries()])
604
605=== modified file 'bzrlib/tests/test_inv.py'
606--- bzrlib/tests/test_inv.py 2009-09-23 20:48:52 +0000
607+++ bzrlib/tests/test_inv.py 2009-09-25 19:03:36 +0000
608@@ -259,6 +259,76 @@
609 return repo.get_inventory('result')
610
611
612+class TestInventoryUpdates(TestCase):
613+
614+ def test_creation_from_root_id(self):
615+ # iff a root id is passed to the constructor, a root directory is made
616+ inv = inventory.Inventory(root_id='tree-root')
617+ self.assertNotEqual(None, inv.root)
618+ self.assertEqual('tree-root', inv.root.file_id)
619+
620+ def test_add_path_of_root(self):
621+ # if no root id is given at creation time, there is no root directory
622+ inv = inventory.Inventory(root_id=None)
623+ self.assertIs(None, inv.root)
624+ # add a root entry by adding its path
625+ ie = inv.add_path("", "directory", "my-root")
626+ ie.revision = 'test-rev'
627+ self.assertEqual("my-root", ie.file_id)
628+ self.assertIs(ie, inv.root)
629+
630+ def test_add_path(self):
631+ inv = inventory.Inventory(root_id='tree_root')
632+ ie = inv.add_path('hello', 'file', 'hello-id')
633+ self.assertEqual('hello-id', ie.file_id)
634+ self.assertEqual('file', ie.kind)
635+
636+ def test_copy(self):
637+ """Make sure copy() works and creates a deep copy."""
638+ inv = inventory.Inventory(root_id='some-tree-root')
639+ ie = inv.add_path('hello', 'file', 'hello-id')
640+ inv2 = inv.copy()
641+ inv.root.file_id = 'some-new-root'
642+ ie.name = 'file2'
643+ self.assertEqual('some-tree-root', inv2.root.file_id)
644+ self.assertEqual('hello', inv2['hello-id'].name)
645+
646+ def test_copy_empty(self):
647+ """Make sure an empty inventory can be copied."""
648+ inv = inventory.Inventory(root_id=None)
649+ inv2 = inv.copy()
650+ self.assertIs(None, inv2.root)
651+
652+ def test_copy_copies_root_revision(self):
653+ """Make sure the revision of the root gets copied."""
654+ inv = inventory.Inventory(root_id='someroot')
655+ inv.root.revision = 'therev'
656+ inv2 = inv.copy()
657+ self.assertEquals('someroot', inv2.root.file_id)
658+ self.assertEquals('therev', inv2.root.revision)
659+
660+ def test_create_tree_reference(self):
661+ inv = inventory.Inventory('tree-root-123')
662+ inv.add(TreeReference('nested-id', 'nested', parent_id='tree-root-123',
663+ revision='rev', reference_revision='rev2'))
664+
665+ def test_error_encoding(self):
666+ inv = inventory.Inventory('tree-root')
667+ inv.add(InventoryFile('a-id', u'\u1234', 'tree-root'))
668+ e = self.assertRaises(errors.InconsistentDelta, inv.add,
669+ InventoryFile('b-id', u'\u1234', 'tree-root'))
670+ self.assertContainsRe(str(e), r'\\u1234')
671+
672+ def test_add_recursive(self):
673+ parent = InventoryDirectory('src-id', 'src', 'tree-root')
674+ child = InventoryFile('hello-id', 'hello.c', 'src-id')
675+ parent.children[child.file_id] = child
676+ inv = inventory.Inventory('tree-root')
677+ inv.add(parent)
678+ self.assertEqual('src/hello.c', inv.id2path('hello-id'))
679+
680+
681+
682 class TestDeltaApplication(TestCaseWithTransport):
683
684 def get_empty_inventory(self, reference_inv=None):
685@@ -505,6 +575,22 @@
686 inv, delta)
687
688
689+class TestInventory(TestCase):
690+
691+ def test_is_root(self):
692+ """Ensure our root-checking code is accurate."""
693+ inv = inventory.Inventory('TREE_ROOT')
694+ self.assertTrue(inv.is_root('TREE_ROOT'))
695+ self.assertFalse(inv.is_root('booga'))
696+ inv.root.file_id = 'booga'
697+ self.assertFalse(inv.is_root('TREE_ROOT'))
698+ self.assertTrue(inv.is_root('booga'))
699+ # works properly even if no root is set
700+ inv.root = None
701+ self.assertFalse(inv.is_root('TREE_ROOT'))
702+ self.assertFalse(inv.is_root('booga'))
703+
704+
705 class TestInventoryEntry(TestCase):
706
707 def test_file_kind_character(self):
708@@ -1128,3 +1214,135 @@
709 self.assertIsInstance(ie2.name, unicode)
710 self.assertEqual(('tree\xce\xa9name', 'tree-root-id', 'tree-rev-id'),
711 inv._bytes_to_utf8name_key(bytes))
712+
713+
714+class TestCHKInventoryExpand(tests.TestCaseWithMemoryTransport):
715+
716+ def get_chk_bytes(self):
717+ factory = groupcompress.make_pack_factory(True, True, 1)
718+ trans = self.get_transport('')
719+ return factory(trans)
720+
721+ def make_dir(self, inv, name, parent_id):
722+ inv.add(inv.make_entry('directory', name, parent_id, name + '-id'))
723+
724+ def make_file(self, inv, name, parent_id, content='content\n'):
725+ ie = inv.make_entry('file', name, parent_id, name + '-id')
726+ ie.text_sha1 = osutils.sha_string(content)
727+ ie.text_size = len(content)
728+ inv.add(ie)
729+
730+ def make_simple_inventory(self):
731+ inv = Inventory('TREE_ROOT')
732+ inv.revision_id = "revid"
733+ inv.root.revision = "rootrev"
734+ # / TREE_ROOT
735+ # dir1/ dir1-id
736+ # sub-file1 sub-file1-id
737+ # sub-file2 sub-file2-id
738+ # sub-dir1/ sub-dir1-id
739+ # subsub-file1 subsub-file1-id
740+ # dir2/ dir2-id
741+ # sub2-file1 sub2-file1-id
742+ # top top-id
743+ self.make_dir(inv, 'dir1', 'TREE_ROOT')
744+ self.make_dir(inv, 'dir2', 'TREE_ROOT')
745+ self.make_dir(inv, 'sub-dir1', 'dir1-id')
746+ self.make_file(inv, 'top', 'TREE_ROOT')
747+ self.make_file(inv, 'sub-file1', 'dir1-id')
748+ self.make_file(inv, 'sub-file2', 'dir1-id')
749+ self.make_file(inv, 'subsub-file1', 'sub-dir1-id')
750+ self.make_file(inv, 'sub2-file1', 'dir2-id')
751+ chk_bytes = self.get_chk_bytes()
752+ # use a small maximum_size to force internal paging structures
753+ chk_inv = CHKInventory.from_inventory(chk_bytes, inv,
754+ maximum_size=100,
755+ search_key_name='hash-255-way')
756+ bytes = ''.join(chk_inv.to_lines())
757+ return CHKInventory.deserialise(chk_bytes, bytes, ("revid",))
758+
759+ def assert_Getitems(self, expected_fileids, inv, file_ids):
760+ self.assertEqual(sorted(expected_fileids),
761+ sorted([ie.file_id for ie in inv._getitems(file_ids)]))
762+
763+ def assertExpand(self, all_ids, inv, file_ids):
764+ (val_all_ids,
765+ val_children) = inv._expand_fileids_to_parents_and_children(file_ids)
766+ self.assertEqual(set(all_ids), val_all_ids)
767+ entries = inv._getitems(val_all_ids)
768+ expected_children = {}
769+ for entry in entries:
770+ s = expected_children.setdefault(entry.parent_id, [])
771+ s.append(entry.file_id)
772+ val_children = dict((k, sorted(v)) for k, v
773+ in val_children.iteritems())
774+ expected_children = dict((k, sorted(v)) for k, v
775+ in expected_children.iteritems())
776+ self.assertEqual(expected_children, val_children)
777+
778+ def test_make_simple_inventory(self):
779+ inv = self.make_simple_inventory()
780+ layout = []
781+ for path, entry in inv.iter_entries_by_dir():
782+ layout.append((path, entry.file_id))
783+ self.assertEqual([
784+ ('', 'TREE_ROOT'),
785+ ('dir1', 'dir1-id'),
786+ ('dir2', 'dir2-id'),
787+ ('top', 'top-id'),
788+ ('dir1/sub-dir1', 'sub-dir1-id'),
789+ ('dir1/sub-file1', 'sub-file1-id'),
790+ ('dir1/sub-file2', 'sub-file2-id'),
791+ ('dir1/sub-dir1/subsub-file1', 'subsub-file1-id'),
792+ ('dir2/sub2-file1', 'sub2-file1-id'),
793+ ], layout)
794+
795+ def test__getitems(self):
796+ inv = self.make_simple_inventory()
797+ # Reading from disk
798+ self.assert_Getitems(['dir1-id'], inv, ['dir1-id'])
799+ self.assertTrue('dir1-id' in inv._fileid_to_entry_cache)
800+ self.assertFalse('sub-file2-id' in inv._fileid_to_entry_cache)
801+ # From cache
802+ self.assert_Getitems(['dir1-id'], inv, ['dir1-id'])
803+ # Mixed
804+ self.assert_Getitems(['dir1-id', 'sub-file2-id'], inv,
805+ ['dir1-id', 'sub-file2-id'])
806+ self.assertTrue('dir1-id' in inv._fileid_to_entry_cache)
807+ self.assertTrue('sub-file2-id' in inv._fileid_to_entry_cache)
808+
809+ def test_single_file(self):
810+ inv = self.make_simple_inventory()
811+ self.assertExpand(['TREE_ROOT', 'top-id'], inv, ['top-id'])
812+
813+ def test_get_all_parents(self):
814+ inv = self.make_simple_inventory()
815+ self.assertExpand(['TREE_ROOT', 'dir1-id', 'sub-dir1-id',
816+ 'subsub-file1-id',
817+ ], inv, ['subsub-file1-id'])
818+
819+ def test_get_children(self):
820+ inv = self.make_simple_inventory()
821+ self.assertExpand(['TREE_ROOT', 'dir1-id', 'sub-dir1-id',
822+ 'sub-file1-id', 'sub-file2-id', 'subsub-file1-id',
823+ ], inv, ['dir1-id'])
824+
825+ def test_from_root(self):
826+ inv = self.make_simple_inventory()
827+ self.assertExpand(['TREE_ROOT', 'dir1-id', 'dir2-id', 'sub-dir1-id',
828+ 'sub-file1-id', 'sub-file2-id', 'sub2-file1-id',
829+ 'subsub-file1-id', 'top-id'], inv, ['TREE_ROOT'])
830+
831+ def test_top_level_file(self):
832+ inv = self.make_simple_inventory()
833+ self.assertExpand(['TREE_ROOT', 'top-id'], inv, ['top-id'])
834+
835+ def test_subsub_file(self):
836+ inv = self.make_simple_inventory()
837+ self.assertExpand(['TREE_ROOT', 'dir1-id', 'sub-dir1-id',
838+ 'subsub-file1-id'], inv, ['subsub-file1-id'])
839+
840+ def test_sub_and_root(self):
841+ inv = self.make_simple_inventory()
842+ self.assertExpand(['TREE_ROOT', 'dir1-id', 'sub-dir1-id', 'top-id',
843+ 'subsub-file1-id'], inv, ['top-id', 'subsub-file1-id'])

Subscribers

People subscribed via source and target branches