Kernel freezes running application code (google-chrome)

Bug #616745 reported by Sergio Callegari
This bug affects 6 people
Affects                     Status        Importance  Assigned to  Milestone
Chromium Browser            Unknown       Unknown
linux (Ubuntu)              Fix Released  Undecided   Unassigned
    Nominated for Lucid by Sergio Callegari
module-init-tools (Ubuntu)  Fix Released  Undecided   Unassigned
    Nominated for Lucid by Sergio Callegari

Bug Description

Keyboard LEDs start to flash. The keyboard becomes unresponsive and the mouse pointer freezes at its current position.

Please revert the latest changes until the cause of the regression is identified and a fix is applied.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-image 2.6.32.24.25
Regression: Yes
Reproducible: No
ProcVersionSignature: Ubuntu 2.6.32-23.37-generic 2.6.32.15+drm33.5
Uname: Linux 2.6.32-23-generic x86_64
NonfreeKernelModules: nvidia
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
AplayDevices:
 **** List of PLAYBACK Hardware Devices ****
 card 0: NVidia [HDA NVidia], device 0: ALC662 rev1 Analog [ALC662 rev1 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: NVidia [HDA NVidia], device 0: ALC662 rev1 Analog [ALC662 rev1 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: callegar 1946 F.... pulseaudio
                      callegar 1986 F.... kmix
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'NVidia'/'HDA NVidia at 0xfbff8000 irq 21'
   Mixer name : 'Realtek ALC662 rev1'
   Components : 'HDA:10ec0662,18493662,00100101'
   Controls : 23
   Simple ctrls : 14
Date: Thu Aug 12 13:50:00 2010
Frequency: I don't know.
HibernationDevice: RESUME=UUID=c50b4249-43ed-49ef-a3d8-ec7aa9718265
MachineType: To Be Filled By O.E.M. To Be Filled By O.E.M.
ProcCmdLine: root=/dev/mapper/DISK00-root ro quiet splash
RelatedPackageVersions: linux-firmware 1.34.1
RfKill:
 0: hci0: Bluetooth
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
WpaSupplicantLog:

dmi.bios.date: 05/13/2009
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: P1.40
dmi.board.name: N68-S
dmi.board.vendor: ASRock
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: To Be Filled By O.E.M.
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrP1.40:bd05/13/2009:svnToBeFilledByO.E.M.:pnToBeFilledByO.E.M.:pvrToBeFilledByO.E.M.:rvnASRock:rnN68-S:rvr:cvnToBeFilledByO.E.M.:ct3:cvrToBeFilledByO.E.M.:
dmi.product.name: To Be Filled By O.E.M.
dmi.product.version: To Be Filled By O.E.M.
dmi.sys.vendor: To Be Filled By O.E.M.

Revision history for this message
Sergio Callegari (callegar) wrote :

Maybe not a regression after all, but an already existing issue.

These freezes, which I had never experienced before, appear to be triggered by the newer google-chrome 6.0.472.33-r55501 (perhaps in conjunction with a very slow network: dial-up over mobile GPRS).

I have now experienced the same issue with 2.6.32-23.

Of course chrome may do something weird here, but some kernel fault seems to exist, since no user-level application should be able to freeze the machine, should it?

As soon as I have a proper environment, I'll check whether the whole machine freezes or just keyboard/mouse input, by pinging it from another host.

Sergio Callegari (callegar) wrote :

Changed title, as I verified that this is not a regression and happens even with previous 2.6.32 kernels.

I experience the freeze every time I do the following:

1) Go online (I have only tried with ppp dialup on GPRS mobile)
2) Start google-chrome
3) Browse some pages
4) Exit google-chrome

On exit I get the hard freeze.

I see the same behavior both on a desktop with an AMD Phenom II CPU and nvidia graphics (with proprietary drivers) and on a Dell laptop with an Intel Core 2 processor and Intel graphics. Both run 64-bit lucid.

In all cases, the system becomes unresponsive (even to the magic SysRq key) and unreachable from the network.

This behavior is well documented on other web sites. See:

http://www.google.com/support/forum/p/Chrome/thread?tid=7493a9fc196ea8be&hl=en
http://ubuntuforums.org/showthread.php?s=871a4db71d2354a920c8694c472e6a56&t=1470623

It has never been reported as an ubuntu bug, since those who noticed it regarded it as a google-chrome bug.

Personally I think that this is a kernel bug that google-chrome happens to trigger. The report (on some of the web sites above) that downgrading to 2.6.32-rc7 fixes the issue is an additional hint that this may be the case.

Of course there can be bugs in google-chrome, even serious ones, but these should by no means bring a system to a complete halt in this way.

summary: - latest kernel update (2.6.32-24) causes system to freeze during normal
- use
+ Kernel freezes running application code (google-chrome)
Sergio Callegari (callegar) wrote :

Some other reports suggest that this may be caused by a wifi driver (atheros).
This is not the case: none of my machines has such hardware.

Sergio Callegari (callegar) wrote :

Some news about this bug:

1) It is confirmed by other users, even on other distros. See http://code.google.com/p/chromium/issues/detail?id=54617

2) It is a kernel bug where google-chrome (as well as chromium) plays the mere role of a trigger (and only if the chrome sandbox is used)

3) It is related to using chrome while internet access is provided by a mobile phone on USB (i.e. a mobile phone attached to the computer via USB cable, with ppp). It does not happen when internet access is provided by wifi or ethernet.

Seems to be reproducible by

a) go online with a mobile access

b) start google chrome

c) navigate some pages

d) exit chrome

On d) the system locks up.

The above link to the chromium bug tracker also has a screenshot of a kernel log. The bug has so far been discussed there, but the discussion should be taken to the kernel developers, since chrome seems to be just a trigger.

Sergio Callegari (callegar) wrote :

Solved.

The lockup is caused by the phonet kernel module and as such affects only those who have the module loaded, namely those who connected a Nokia mobile phone to the PC before starting google-chrome.
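To see whether a given machine is exposed, one can look for the phonet modules in /proc/modules. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the usual /proc/modules layout in which the module name is the first whitespace-separated field of each line.

```python
from pathlib import Path

# Module names taken from this report (phonet and its USB companion cdc_phonet).
AFFECTED_MODULES = {"phonet", "cdc_phonet"}


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Return the module names listed in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # first field is the module name
    return names


def affected_modules_loaded(proc_modules_text: str) -> set[str]:
    """Return the subset of AFFECTED_MODULES present in the given text."""
    return AFFECTED_MODULES & loaded_modules(proc_modules_text)


def check_running_system(path: str = "/proc/modules") -> set[str]:
    """Check the running kernel; returns an empty set if the file is missing."""
    proc_file = Path(path)
    if not proc_file.exists():
        return set()
    return affected_modules_loaded(proc_file.read_text())
```

Calling check_running_system() on the affected machine should return a non-empty set once the Nokia phone has been plugged in and the modules have been autoloaded.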

An extremely efficient Nokia person indicated that

"Network namespace in the Phonet socket stack causes an OOPS when the
a namespace is destroyed. This occurs as the loopback exit_net handler
is called after the Phonet exit_net handler, and re-enters the Phonet
stack."

Chromium guys confirmed that the chrome sandbox uses namespaces, and hence the triggering of the bug.
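For anyone who wants to exercise the same kernel path without chrome, the sketch below creates a network namespace in a child process and lets it be destroyed when the child exits. This is not the Chromium sandbox code, just an assumption about the smallest equivalent trigger: it calls unshare(2) with CLONE_NEWNET through ctypes, which needs CAP_SYS_ADMIN (root), and the function name is made up. On an affected kernel with phonet loaded it may well lock up the machine, so try it only on a test box.

```python
import ctypes
import os

CLONE_NEWNET = 0x40000000  # value from <linux/sched.h>


def create_and_destroy_netns() -> bool:
    """Fork a child that enters a new network namespace and then exits.

    Returns True if the child managed to create the namespace, False
    otherwise (for example when not running as root). The namespace is
    torn down by the kernel once the child, its only user, has exited.
    """
    pid = os.fork()
    if pid == 0:
        # Child: try to unshare the network namespace, then exit at once.
        status = 1
        try:
            libc = ctypes.CDLL(None, use_errno=True)
            if libc.unshare(CLONE_NEWNET) == 0:
                status = 0
        except Exception:
            status = 1
        finally:
            os._exit(status)
    # Parent: collect the child's exit status.
    _, wait_status = os.waitpid(pid, 0)
    return os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0
```

Doing the unshare in a forked child keeps the calling process in its original namespace, so the only observable effect is the creation and later destruction of one namespace.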

The Nokia person has also provided a patch. It can be found as an attachment at the end of the thread http://code.google.com/p/chromium/issues/detail?id=54617, together with the appropriate credits.

Note that the author reports:

"There is no easy way to fix this in kernel <= 2.6.32. As there
is no use for Phonet namespaces yet, disable them."

Please patch the Ubuntu Lucid kernel accordingly and release a fixed version. This should by no means affect the stability of the LTS kernel since phonet is not a commonly used feature and since Phonet namespaces are not yet used.

Alternatively, please blacklist phonet by default in Lucid.

Also, please propagate the fix upstream, noting that this fix only applies to 2.6.32 and that a different fix will be provided for newer kernels.

Changed in linux (Ubuntu):
status: New → Confirmed
Rodrigo Linfati (rlinfati) wrote :

Repro:

[ 6295.349006] BUG: soft lockup - CPU#1 stuck for 61s! [netns:13]
[ 6295.349006] Modules linked in: ppp_deflate bsd_comp ppp_async crc_ccitt cdc_ether cdc_phonet phonet usbnet mii cdc_acm usb_storage binfmt_misc rfcomm ppdev sco bridge stp bnep l2cap vboxdrv deflate zlib_deflate ctr twofish twofish_common camellia serpent blowfish cast5 des_generic aes_i586 aes_generic xcbc rmd160 sha256_generic sha1_generic crypto_null af_key dm_crypt joydev snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi arc4 snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device iwl3945 iwlcore snd dell_wmi uvcvideo videodev v4l1_compat btusb bluetooth sdhci_pci sdhci ricoh_mmc mac80211 led_class dell_laptop dcdbas psmouse serio_raw soundcore snd_page_alloc cfg80211 uinput lp parport fbcon tileblit font bitblit softcursor vga16fb vgastate usbhid hid i915 drm_kms_helper ohci1394 drm ieee1394 intel_agp ahci i2c_algo_bit video output tg3 agpgart
[ 6295.349006]
[ 6295.349006] Pid: 13, comm: netns Not tainted (2.6.32-25-generic #44-Ubuntu) Inspiron 1420

[ 6295.349006] EIP: 0060:[<c012a4f5>] EFLAGS: 00000217 CPU: 1
[ 6295.349006] EIP is at __ticket_spin_lock+0x15/0x20
[ 6295.349006] EAX: f0d6f878 EBX: f0d6f878 ECX: efdc4c00 EDX: 00009400
[ 6295.349006] ESI: 00000000 EDI: efdc4c00 EBP: f70b1e88 ESP: f70b1e88
[ 6295.349006] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 6295.349006] CR0: 8005003b CR2: b361b000 CR3: 0084e000 CR4: 000006d0
[ 6295.349006] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 6295.349006] DR6: ffff0ff0 DR7: 00000400
[ 6295.349006] Call Trace:
[ 6295.349006] [<c058d437>] _spin_lock_bh+0x17/0x20
[ 6295.349006] [<fabd914c>] phonet_device_destroy+0x4c/0x160 [phonet]
[ 6295.349006] [<c05469a9>] ? addrconf_notify+0x99/0x470
[ 6295.349006] [<c0512e34>] ? arp_ifdown+0x14/0x20
[ 6295.349006] [<c056ab76>] ? packet_notifier+0x26/0x1a0
[ 6295.349006] [<fabd95d9>] phonet_device_notify+0x19/0x40 [phonet]
[ 6295.349006] [<c058fa03>] notifier_call_chain+0x43/0x60

Book 'em Dano (heymrdjd)
affects: linux → chromium-browser
affects: chromium-browser → module-init-tools
affects: module-init-tools → chromium-browser
Sergio Callegari (callegar) wrote :

Please provide a Stable Release Update (SRU) of module-init-tools including a file named

/etc/modprobe.d/blacklist-phonet.conf

with the following content

# Phonet is broken on the lucid kernel and may result in hard lockups
# so disable it
blacklist cdc_phonet
blacklist phonet

With the above, the bug can finally be closed.
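To verify that a modprobe.d file really blacklists both modules, something like the following could be used. It is a minimal sketch with a made-up function name; it only understands plain "blacklist NAME" lines and whole-line "#" comments, and it treats "-" and "_" in module names as equivalent, as modprobe does.

```python
def blacklisted_modules(conf_text: str) -> set[str]:
    """Return the module names blacklisted in modprobe.d-style text."""
    modules = set()
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if len(parts) == 2 and parts[0] == "blacklist":
            # modprobe makes no difference between "-" and "_" in names
            modules.add(parts[1].replace("-", "_"))
    return modules


def phonet_is_blacklisted(conf_text: str) -> bool:
    """True if both modules named in this report are blacklisted."""
    return {"phonet", "cdc_phonet"} <= blacklisted_modules(conf_text)
```

Feeding it the contents of /etc/modprobe.d/blacklist-phonet.conf as proposed above should make phonet_is_blacklisted return True.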

The rationale for the SRU is the following:

1) The bug in the phonet module is obviously high-impact, since it can result in a full freeze of the machine.

2) The bug may result in user data loss, like any bug that causes a freeze while people are working. The consequences of the unclean shutdown may add to the problem.

3) Since the bug can be triggered by unprivileged applications (e.g. the rather common google-chrome or chromium), any user who knows that phonet is loaded can purposely cause a denial of service with great ease.

4) The bug is easily triggered (any application using network namespaces would likely freeze the kernel if phonet is loaded). Google chrome is just an example.

5) There are no downsides, since there are no known users of the phonet functionality in lucid. The fact that the bug has only been reported by a few people is merely a consequence of the fact that only a few people got phonet autoloaded by attaching a Nokia phone to the PC via USB.

Evan Martin (Chromium) (evan-chromium) wrote :

http://code.google.com/p/chromium/issues/detail?id=54617

We were using a poorly-tested kernel feature (which is not available to userspace; we have a suid helper) for sandboxing.
I think we worked around it by turning off that part of the sandbox, though I don't see an update on that bug about it.

Sergio Callegari (callegar) wrote :

Probably still there in lucid (cannot test), but certainly fixed in oneiric.

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Changed in module-init-tools (Ubuntu):
status: New → Fix Released
status: Fix Released → Confirmed
Changed in module-init-tools (Ubuntu):
status: Confirmed → Fix Released