on MemoryError, log/report memory usage by type

Bug #551391 reported by Martin Pool
This bug affects 1 person
Affects: Bazaar
Status: Fix Released
Importance: Low
Assigned to: Karl Bielefeldt
Milestone: 2.3b3

Bug Description

If bzr aborts with a MemoryError, it might help with debugging if we log a summary of allocated objects, available from gc.get_objects(), to bzr.log and/or to an apport report. We could just ship a very simple version that prints the count of objects of each type in gc.get_objects(), but that may just tell us there are a lot of strings allocated, and perhaps we'd have to use meliae.

Tags: apport memory

John A Meinel (jameinel) wrote : Re: [Bug 551391] [NEW] on MemoryError, log/report memory usage by type

Martin Pool wrote:
> Public bug reported:
>
> If bzr aborts with a MemoryError, it might help with debugging if we log
> to bzr.log and/or to an apport report a summary of allocated objects,
> available from gc.get_objects. We could just ship a very simple version
> that prints the count of objects of each type in gc.get_objects(), but
> that may just tell us there's a lot of strings allocated, and perhaps
> we'd have to use meliae.
>
> ** Affects: bzr
> Importance: Low
> Status: Confirmed
>
>
> ** Tags: apport memory
>

If we are running on Python 2.6 or later we could also do some basic stats
gathering using 'sys.getsizeof()', which then lets you do stuff like:

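import gc
import sys

info = {}  # type -> [object count, total shallow size in bytes]
# sys.getsizeof only exists on Python 2.6+; fall back to 0 elsewhere.
getsizeof = getattr(sys, 'getsizeof', lambda x: 0)
for o in gc.get_objects():
  x = info.setdefault(type(o), [0, 0])
  x[0] += 1
  x[1] += getsizeof(o)  # size of the object itself, not the counter list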

Though I guess you are starting to grow your memory consumption.
Hopefully the type dict won't get too big (large dicts consume a lot of
memory).
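As a rough sketch of how the per-type tally above could be turned into a few log lines, the following sorts types by total shallow size and formats the largest ones. The helper names summarize_by_type and format_summary are made up for illustration and are not part of bzr; sys.getsizeof reports only the shallow size of each object, so container contents are not included.

```python
import gc
import sys


def summarize_by_type(limit=10):
    """Return up to `limit` rows of (type_name, count, total_bytes).

    Hypothetical helper, not part of bzr. Rows are sorted by total
    shallow size, largest first.
    """
    totals = {}  # type -> [count, total shallow bytes]
    for obj in gc.get_objects():
        entry = totals.setdefault(type(obj), [0, 0])
        entry[0] += 1
        # The second argument is returned if the object reports no size.
        entry[1] += sys.getsizeof(obj, 0)
    rows = [(t.__name__, count, size) for t, (count, size) in totals.items()]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]


def format_summary(rows):
    """Format rows as plain-text lines suitable for a log file."""
    return ["%-30s %10d objects %12d bytes" % row for row in rows]
```

Keeping only the top few rows keeps the report short, though the dict itself still grows with the number of distinct types, as noted above.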

Note, however, that *strings* are not in gc.get_objects() because they
cannot hold references to other objects and thus don't participate in
cycles or the garbage collector. As such Meliae had some tricks to walk
some refs to see if it could find more data. In the end, it was both more
accurate and more efficient to spend memory building a set that tracks
which objects have been found.
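A minimal sketch of that set-based walk, assuming only the standard gc module: start from the gc-tracked objects, follow gc.get_referents(), and remember the id of everything already visited. This is an illustration of the idea, not Meliae's actual implementation, and walk_reachable is a hypothetical name.

```python
import gc


def walk_reachable(roots=None):
    """Yield each object reachable from `roots` exactly once.

    Hypothetical helper. If `roots` is None, start from everything
    the garbage collector tracks.
    """
    if roots is None:
        roots = gc.get_objects()
    seen = set()  # ids of objects already visited
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        yield obj
        # Untracked objects such as strings are only found this way,
        # as referents of some tracked container.
        pending.extend(gc.get_referents(obj))
```

For example, list(walk_reachable([['str1', ('str1', 'str2')]])) yields the list, the tuple and each string once, even though gc.is_tracked('str1') is false. The cost is the seen set itself, which holds one integer per object visited.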

I think with the inclusion of StaticTuple, we broke the trick Meliae was
originally using. Namely:

 obj.foo = ST(ST('str1', 'str2'))

At this point, obj is in gc, and references an outer ST, which
references an inner one, but you don't get as far as the actual strings.

If we really wanted to be memory efficient, a bloom filter would
probably get us decent accuracy, costing say 1MB of memory.
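To make the bloom filter idea concrete, here is a small sketch of one sized at 1MB by default. The class name, the number of hash functions and the hashing constants are all assumptions chosen for illustration; nothing here comes from bzr or Meliae.

```python
class IdBloomFilter(object):
    """Approximate set of integers (e.g. id(obj)) in a fixed-size bit array.

    Hypothetical sketch. It never gives a false negative, but it can
    give false positives.
    """

    _MASK64 = 0xFFFFFFFFFFFFFFFF

    def __init__(self, size_bytes=1024 * 1024, num_hashes=3):
        self._bits = bytearray(size_bytes)
        self._num_bits = size_bytes * 8
        self._num_hashes = num_hashes

    def _positions(self, value):
        # Double hashing: derive several bit positions from two
        # multiplicative hashes. The odd constants are arbitrary.
        h1 = (value * 0x9E3779B97F4A7C15) & self._MASK64
        h2 = ((value * 0xC2B2AE3D27D4EB4F) & self._MASK64) | 1
        for i in range(self._num_hashes):
            combined = (h1 + i * h2) & self._MASK64
            yield (combined >> 32) % self._num_bits

    def add(self, value):
        """Record value; return True if it was possibly present already."""
        present = True
        for pos in self._positions(value):
            byte, bit = divmod(pos, 8)
            mask = 1 << bit
            if not self._bits[byte] & mask:
                present = False
                self._bits[byte] |= mask
        return present

    def __contains__(self, value):
        return all(self._bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(value))
```

Used in place of an exact set of ids, a false positive would cause an object to be treated as already seen and skipped, so totals could be slightly undercounted in exchange for the fixed memory cost.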

So I guess I have to say... far easier to have a line:

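# dump_memory is a hypothetical wrapper so the early return is valid
def dump_memory(path='bzr_memory_reference_dump.json'):
  try:
    from meliae import scanner
  except ImportError:
    return  # meliae is optional; skip the dump if it is not installed
  scanner.dump_all_objects(path)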

John
=:->

Martin Pool (mbp) wrote : Re: [Bug 551391] [NEW] on MemoryError, log/report memory usage by type

On 31 March 2010 04:17, John A Meinel <email address hidden> wrote:
> try:
>  from meliae import scanner
> except:
>  return
> scanner.dump_all_objects('bzr_memory_reference_dump.json')

I wonder if loading and running meliae will be possible if we're
already out of memory, but it's probably worth a try.

--
Martin <http://launchpad.net/~mbp/>
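One way to treat the dump as best-effort, given the doubt above about whether anything can still be loaded once memory is exhausted, is sketched below. The function names and the dump parameter are assumptions for illustration and this is not the fix that was eventually committed; only scanner.dump_all_objects comes from the thread.

```python
def dump_memory_on_error(path='bzr_memory_reference_dump.json'):
    """Try to write a meliae dump; return True only if it was written."""
    try:
        from meliae import scanner
    except (ImportError, MemoryError):
        # meliae is missing, or there was not enough memory to load it.
        return False
    try:
        scanner.dump_all_objects(path)
    except MemoryError:
        return False
    return True


def run_with_memory_report(func, dump=dump_memory_on_error):
    """Call func(); on MemoryError attempt a dump, then re-raise.

    Hypothetical wrapper. `dump` is injectable so the reporting step
    can be replaced, for example in tests.
    """
    try:
        return func()
    except MemoryError:
        try:
            dump()
        except Exception:
            # Reporting must never hide the original error.
            pass
        raise
```

The original MemoryError always propagates, so the dump can only add information and cannot change how the failure is reported.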

Changed in bzr:
assignee: nobody → Karl Bielefeldt (kbielefe)
status: Confirmed → In Progress

Changed in bzr:
status: In Progress → Fix Committed

Vincent Ladeuil (vila)
Changed in bzr:
status: Fix Committed → Fix Released
milestone: none → 2.3b3