Comment 1 for bug 551391

John A Meinel (jameinel) wrote : Re: [Bug 551391] [NEW] on MemoryError, log/report memory usage by type

Martin Pool wrote:
> Public bug reported:
>
> If bzr aborts with a MemoryError, it might help with debugging if we log
> to bzr.log and/or to an apport report a summary of allocated objects,
> available from gc.get_objects. We could just ship a very simple version
> that prints the count of objects of each type in gc.get_objects(), but
> that may just tell us there's a lot of strings allocated, and perhaps
> we'd have to use meliae.
>
> ** Affects: bzr
> Importance: Low
> Status: Confirmed
>
>
> ** Tags: apport memory
>

If we are running on Python 2.6 or later, we could also gather some
basic stats using 'sys.getsizeof()', which lets you do stuff like:

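import gc
import sys

info = {}  # type -> [object count, total bytes]
# sys.getsizeof only exists on Python 2.6+; otherwise count 0 bytes.
getsizeof = getattr(sys, 'getsizeof', lambda x: 0)
for o in gc.get_objects():
  x = info.setdefault(type(o), [0, 0])
  x[0] += 1
  x[1] += getsizeof(o)  # size of the object itself, not of the counter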

Though I guess building this dict allocates memory itself, right after
we have run out of it. Hopefully the type dict won't get too big (large
dicts consume a lot of memory), since it only holds one entry per type.
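
To turn a per-type dict like the one above into something we could write
to bzr.log, a small formatter might be enough. This is only a sketch:
the name format_type_summary and the column layout are made up here, and
it assumes the dict maps each type to [count, total_bytes].

```python
def format_type_summary(info, limit=20):
    """Return report lines for the `limit` types using the most bytes.

    `info` is assumed to map type -> [object count, total bytes].
    """
    # Largest total size first, so the likely culprits lead the report.
    rows = sorted(info.items(), key=lambda item: item[1][1], reverse=True)
    return ['%10d bytes %8d objects  %s' % (total, count, typ.__name__)
            for typ, (count, total) in rows[:limit]]
```

Capping the output with a limit keeps the report short even when many
types are live.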

Note, however, that *strings* are not in gc.get_objects() because they
don't hold references to other objects and thus can't participate in
cycles, so the garbage collector doesn't track them. As such, Meliae
originally had some tricks to walk some refs to see if it could find
more data. In the end, it was both more efficient and more accurate to
spend memory building a set that tracks which objects have already been
found.
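
A rough sketch of that "track what has been found" approach, using only
the stdlib. This is not Meliae's actual code; the function name is made
up, and it simply follows gc.get_referents() from every tracked object
so that untracked things like strings get counted once each.

```python
import gc
import sys


def count_reachable_by_type():
    """Return {type: [count, total_bytes]} for gc-tracked objects and
    everything reachable from them (e.g. strings)."""
    # Fall back to 0 bytes where sys.getsizeof is missing (pre-2.6).
    getsizeof = getattr(sys, 'getsizeof', lambda obj, default=0: default)
    info = {}
    seen = set()  # ids of objects that were already counted
    pending = gc.get_objects()
    while pending:
        obj = pending.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        entry = info.setdefault(type(obj), [0, 0])
        entry[0] += 1
        entry[1] += getsizeof(obj, 0)
        # Queue whatever this object refers to; this is how strings
        # and other untracked objects are reached.
        pending.extend(gc.get_referents(obj))
    return info
```

The seen set and the pending list both cost memory, so this is the
accurate-but-hungry end of the trade-off.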

I think with the inclusion of StaticTuple, we broke the trick Meliae was
originally using. Namely:

 obj.foo = ST(ST('str1', 'str2'))

At this point, obj is in gc, and references an outer ST, which
references an inner one, but you don't get as far as the actual strings.

If we really wanted to be memory efficient, a bloom filter would
probably get us decent accuracy, costing say 1MB of memory.
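
For illustration, a bloom filter keyed on id() could look roughly like
this. The class name, the multiplier constants and the choice of 4
hashes are all assumptions for the sketch, not anything Meliae does. A
false positive means an object is wrongly treated as already seen, so
it would be undercounted.

```python
class IdBloomFilter(object):
    """Approximate 'already seen?' set for objects, in fixed memory."""

    _MASK64 = 0xFFFFFFFFFFFFFFFF

    def __init__(self, num_bytes=1024 * 1024, num_hashes=4):
        self._bits = bytearray(num_bytes)  # all bits start cleared
        self._num_bits = num_bytes * 8
        self._num_hashes = num_hashes

    def _positions(self, obj):
        # Derive several bit positions from id(obj) by double hashing.
        # The high bits of each product are used because the low bits
        # of an address are mostly alignment zeros.
        h1 = (id(obj) * 0x9E3779B97F4A7C15) & self._MASK64
        h2 = (id(obj) * 0xC2B2AE3D27D4EB4F) & self._MASK64
        for i in range(self._num_hashes):
            mixed = (h1 + i * h2) & self._MASK64
            yield (mixed >> 32) % self._num_bits

    def add(self, obj):
        """Mark obj as seen; return True if it was definitely new."""
        was_new = False
        for pos in self._positions(obj):
            byte, bit = divmod(pos, 8)
            mask = 1 << bit
            if not self._bits[byte] & mask:
                was_new = True
                self._bits[byte] |= mask
        return was_new
```

A walker would then use "if not seen.add(obj): continue" in place of a
real set lookup.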

So I guess I have to say... far easier to have a line:

try:
    from meliae import scanner
except ImportError:
    pass  # meliae is not installed, so skip the dump
else:
    scanner.dump_all_objects('bzr_memory_reference_dump.json')

John
=:->