malloc_init deadlock (mozilla jemalloc)

Bug #333624 reported by km
Affects               Status   Importance  Assigned to  Milestone
Mozilla Firefox       Invalid  Medium
firefox-3.0 (Ubuntu)  Invalid  Undecided   Unassigned

Bug Description

I submitted this as a bug against Mozilla (https://bugzilla.mozilla.org/show_bug.cgi?id=474155), but there has been no feedback there. It occurs in the Firefox 3.1 builds.

malloc_init calls malloc_ncpus, which in turn calls malloc_init again and traps itself
in malloc_mutex_lock. I see this on an Ubuntu 8.04 server but not on an Ubuntu
8.10 laptop, though there may be other differences between the two machines. The traceback that follows shows that on 8.04, calloc calls malloc_init, which acquires a futex and then tries to open /proc/cpuinfo to see how many CPUs there are. However, open (by way of dlsym) calls calloc again and gets caught on the same futex.

This is the mozilla jemalloc malloc replacement, but I don't know why this occurs on the 8.04 server box and not the 8.10 laptop.

 Here is a traceback:

#0 malloc_init_hard ()
    at /var/dept/scratch/km/src/mozilla/memory/jemalloc/jemalloc.c:5383
#1 0x08056c22 in malloc_init ()
    at /var/dept/scratch/km/src/mozilla/memory/jemalloc/jemalloc.c:5362
#2 0x08057411 in calloc (num=1, size=20)
    at /var/dept/scratch/km/src/mozilla/memory/jemalloc/jemalloc.c:6098
#3 0xb7c3332c in ?? () from /lib/tls/i686/cmov/libdl.so.2
#4 0xb7c32d73 in dlsym () from /lib/tls/i686/cmov/libdl.so.2
#5 0xb7efe97a in open (path=0x805800f "/proc/cpuinfo", oflag=0, mode=0)
    at libc_ut.c:97
#6 0x08054388 in malloc_ncpus ()
    at /var/dept/scratch/km/src/mozilla/memory/jemalloc/jemalloc.c:5111
#7 0x08055edf in malloc_init_hard ()
    at /var/dept/scratch/km/src/mozilla/memory/jemalloc/jemalloc.c:5415
#8 0x08056c22 in malloc_init ()
    at /var/dept/scratch/km/src/mozilla/memory/jemalloc/jemalloc.c:5362
#9 0x08057411 in calloc (num=1, size=20)
    at /var/dept/scratch/km/src/mozilla/memory/jemalloc/jemalloc.c:6098
#10 0xb7c3332c in ?? () from /lib/tls/i686/cmov/libdl.so.2
#11 0xb7c32b51 in dlopen () from /lib/tls/i686/cmov/libdl.so.2
#12 0xb7c4b443 in pr_FindSymbolInProg (name=0x0)
    at /var/dept/scratch/km/src/mozilla/nsprpub/pr/src/malloc/prmem.c:130
#13 0xb7c4b52f in _PR_InitZones ()
    at /var/dept/scratch/km/src/mozilla/nsprpub/pr/src/malloc/prmem.c:186
#14 0xb7c510c1 in _PR_InitStuff ()
    at /var/dept/scratch/km/src/mozilla/nsprpub/pr/src/misc/prinit.c:172
#15 0xb7c62568 in PR_GetCurrentThread ()
    at /var/dept/scratch/km/src/mozilla/nsprpub/pr/src/pthreads/ptthread.c:646
#16 0xb7ca134b in nsAutoOwningThread (this=0xb7d44974)
    at ../../../../dist/include/xpcom/nsISupportsImpl.h:70
#17 0xb7d0d70f in __static_initialization_and_destruction_0 (
    __initialize_p=<value optimized out>, __priority=0)
    at /var/dept/scratch/km/src/mozilla/xpcom/base/nsTraceRefcntImpl.cpp:1287
#18 0xb7d2b685 in __do_global_ctors_aux ()
   from ./objdir-ff/dist/bin/libxpcom_core.so
#19 0xb7c9c7b0 in _init () from ./objdir-ff/dist/bin/libxpcom_core.so
#20 0xb7f11990 in ?? () from /lib/ld-linux.so.2
#21 0xb7f11ac3 in ?? () from /lib/ld-linux.so.2
#22 0xb7f0484f in ?? () from /lib/ld-linux.so.2
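
For readers following the traceback, the re-entrancy can be sketched in isolation. This is a minimal illustration and not code from jemalloc.c: every name below (demo_unguarded_relock, guarded_init, probe_cpu_count, in_init) is hypothetical. The first function uses an error-checking mutex so that the second lock attempt by the same thread reports EDEADLK instead of hanging, which is what a default mutex would do, as in frames #0 to #9 above. The second function shows one possible guard, assuming a C11 compiler with _Thread_local: a per-thread flag lets a nested call return early instead of taking the lock again.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Part 1: the same thread locks a non-recursive mutex twice.
 * An error-checking mutex returns EDEADLK on the second attempt;
 * a default mutex would block forever, as in the traceback. */
int demo_unguarded_relock(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t lock;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&lock, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    rc = pthread_mutex_lock(&lock);   /* outer initialization takes the lock */
    assert(rc == 0);
    rc = pthread_mutex_lock(&lock);   /* nested initialization, same thread */

    pthread_mutex_unlock(&lock);
    pthread_mutex_destroy(&lock);
    return rc;
}

/* Part 2: a hypothetical guard against re-entry from the same thread. */
static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static bool init_done;                 /* protected by init_lock */
static _Thread_local bool in_init;     /* true while this thread runs init */
static int reentrant_calls;            /* nested calls that were turned away */

bool guarded_init(void);

/* Stand-in for the CPU count probe: something it calls allocates memory,
 * which re-enters initialization on the same thread. */
static void probe_cpu_count(void)
{
    if (!guarded_init())
        reentrant_calls++;
}

/* Returns false when called from inside initialization on this thread;
 * such a caller must cope without a fully initialized allocator. */
bool guarded_init(void)
{
    if (in_init)
        return false;

    pthread_mutex_lock(&init_lock);
    if (!init_done) {
        in_init = true;
        probe_cpu_count();
        in_init = false;
        init_done = true;
    }
    pthread_mutex_unlock(&init_lock);
    return true;
}
```

The guard only avoids the self-deadlock; what the nested caller should do when it is turned away (for example, serve the request from a small static buffer) is a separate design question that this sketch does not address.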

John Vivirito (gnomefreak) wrote :

Which version of Ubuntu are you running, and can you please paste the output of:
apt-cache policy firefox-3.0
apt-cache policy firefox-3.1

John Vivirito (gnomefreak) wrote :

Do you see this only on firefox-3.1?

Changed in firefox:
status: New → Incomplete
Changed in firefox:
status: Unknown → New
km (km-mathcs) wrote :

I see this on the versions of firefox 3.0.x and 3.1.x downloaded from ftp.mozilla.org. I also see it on the version of 3.1.x I built from source on the target machine with the debug option, so that I could get the traceback I posted above.

This doesn't appear to happen with the apt-get version from the ubuntu repository. It also doesn't happen with the same mozilla.org binaries on my single core Intrepid laptop. I only see the futex problem on the multicore Hardy server.

I'm guessing that there must be some compile option or Firefox patch in the Ubuntu binaries, absent from the mozilla.org binaries, that gets around this issue.

John Vivirito (gnomefreak) wrote :

Per your comment above, you are not seeing this in our builds. This is something you need to track upstream, since Launchpad (LP) is only for our official builds.

Changed in firefox-3.0:
status: Incomplete → Invalid
km (km-mathcs) wrote :

Haven't seen it in newer thunderbird/Linux.

Changed in firefox:
status: New → Invalid
Changed in firefox:
importance: Unknown → Medium
Mikko Ohtamaa (mikko-red-innovation) wrote :

I see this behavior on 8.04 server and with Varnish.

Varnish uses jemalloc. After recent updates, Varnish would no longer start. It turns out to be the /proc/cpuinfo deadlock in jemalloc.

I am not sure what has changed on the server, but I assume it is related to recent stable updates.
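
For context, the kind of CPU count probe discussed in this report can be sketched as follows. This is an illustration under stated assumptions, not the code from jemalloc or Varnish: the names cpu_scan, cpu_scan_init, cpu_scan_feed and count_cpus_from_procfs are hypothetical, and it assumes the x86 layout of /proc/cpuinfo, where each CPU has one line beginning with "processor". The scanner keeps its state in a small struct and reads into a stack buffer, so the probe itself never allocates. Note that it still calls open(), which is exactly the call that re-entered the allocator in the traceback above, so this does not avoid the deadlock when open() is interposed by something that allocates.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Scanner state, kept across read() chunks so that a "processor" line
 * split between two buffers is still counted once. */
struct cpu_scan {
    size_t col;     /* prefix bytes matched so far on the current line */
    bool matching;  /* false once the current line diverged from the prefix */
    int count;      /* lines seen so far that start with the prefix */
};

static const char cpu_prefix[] = "processor";

void cpu_scan_init(struct cpu_scan *s)
{
    s->col = 0;
    s->matching = true;
    s->count = 0;
}

/* Feed one chunk of /proc/cpuinfo text to the scanner. */
void cpu_scan_feed(struct cpu_scan *s, const char *buf, size_t len)
{
    const size_t plen = sizeof(cpu_prefix) - 1;

    for (size_t i = 0; i < len; i++) {
        char c = buf[i];

        if (c == '\n') {
            /* A new line starts: begin matching the prefix again. */
            s->col = 0;
            s->matching = true;
        } else if (s->matching && s->col < plen) {
            if (c == cpu_prefix[s->col]) {
                if (++s->col == plen)
                    s->count++;
            } else {
                s->matching = false;
            }
        }
    }
}

/* Count CPUs using only open/read/close and stack storage.
 * Falls back to 1 if the file is missing or has no matching lines. */
int count_cpus_from_procfs(void)
{
    char buf[1024];
    struct cpu_scan s;
    ssize_t n;
    int fd = open("/proc/cpuinfo", O_RDONLY);

    if (fd < 0)
        return 1;

    cpu_scan_init(&s);
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        cpu_scan_feed(&s, buf, (size_t)n);
    close(fd);

    return s.count > 0 ? s.count : 1;
}
```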
