Kernel Oops - BUG: unable to handle kernel NULL pointer dereference at (null); EIP is at radeon_suspend_kms+0x78/0x1e0 [radeon]

Bug #756555 reported by phobos
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

XZ

ProblemType: KernelOops
DistroRelease: Ubuntu 11.04
Package: linux-image-2.6.38-8-generic 2.6.38-8.41
Regression: Yes
Reproducible: Yes
ProcVersionSignature: Ubuntu 2.6.38-8.41-generic 2.6.38.2
Uname: Linux 2.6.38-8-generic i686
NonfreeKernelModules: wl
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.23.
Annotation: Your system might become unstable now and might need to be restarted.
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: ALC272 Analog [ALC272 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: phobos 1604 F.... pulseaudio
 /dev/snd/pcmC0D0p: phobos 1604 F...m pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xf0800000 irq 47'
   Mixer name : 'Realtek ALC272'
   Components : 'HDA:10ec0272,17aac008,00100001'
   Controls : 19
   Simple ctrls : 11
Card1.Amixer.info:
 Card hw:1 'HDMI'/'HDA ATI HDMI at 0xf0410000 irq 48'
   Mixer name : 'ATI R6xx HDMI'
   Components : 'HDA:1002aa01,00aa0100,00100100'
   Controls : 4
   Simple ctrls : 1
Card1.Amixer.values:
 Simple mixer control 'IEC958',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [on]
Date: Sun Apr 10 17:04:47 2011
Failure: oops
HibernationDevice: RESUME=UUID=ae36490b-9ce5-4047-b2b5-64b93d0d108c
MachineType: LENOVO 20032
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-2.6.38-8-generic root=UUID=2d7c932d-5f02-47e5-933f-aa7770308762 ro splash quiet vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-2.6.38-8-generic N/A
 linux-backports-modules-2.6.38-8-generic N/A
 linux-firmware 1.50
SourcePackage: linux
Title: BUG: unable to handle kernel NULL pointer dereference at (null)
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 01/20/2010
dmi.bios.vendor: LENOVO
dmi.bios.version: 22CN35WW(V2.02)
dmi.board.name: NIUR1
dmi.board.vendor: LENOVO
dmi.board.version: REFERENCE
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: No Enclosure
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnLENOVO:bvr22CN35WW(V2.02):bd01/20/2010:svnLENOVO:pn20032:pvrLenovoIdeaPadU450:rvnLENOVO:rnNIUR1:rvrREFERENCE:cvnNoEnclosure:ct10:cvrN/A:
dmi.product.name: 20032
dmi.product.version: Lenovo IdeaPad U450
dmi.sys.vendor: LENOVO

Revision history for this message
phobos (b3phobos) wrote :
summary: BUG: unable to handle kernel NULL pointer dereference at (null)
+ radeon_suspend_kms+0x78/0x1e0 [radeon] from
+ radeon_switcheroo_set_state+0x4b/0xa0
Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
tags: added: radeon
tags: added: kernel-driver-radeon
Revision history for this message
a.r.karthick@gmail.com (a-r-karthick) wrote : Re: BUG: unable to handle kernel NULL pointer dereference at (null) radeon_suspend_kms+0x78/0x1e0 [radeon] from radeon_switcheroo_set_state+0x4b/0xa0

I was pointed to this by a friend of mine (Mayank Rungta), whom I helped debug a radeon driver oops on suspend (bug 820746):

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/

For some reason, he wanted me to take a look at this issue as well, which has been inactive for some time :)

Below is a detailed RCA, plus some action items for the person who reproduced this. It's evident that he had 3 connectors (displays) attached to the radeon driver at boot time (LVDS laptop display + VGA connector + HDMI connector) and may also have been suspending with them attached. I need to know how the oops was reproduced, and whether the laptop lid was closed or the machine suspended with all 3 connectors attached, so that I can ask my friend with similar hardware and the same radeon driver (the same person who reproduced 820746) to reproduce this.

Read ahead for the full story:
=====================

Again using the objdump disassembly of the radeon driver (radeon.ko.out) attached to bug 820746 (same 2.6.38 kernel), I managed to find the place that's causing the oops.

Reverse engineering the oops to the assembly and mapping the assembly to C code, the panic was triggered by this loop in radeon_suspend_kms on radeon driver suspend:

radeon_suspend_kms():

  /* turn off display hw */
  list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
          drm_helper_connector_dpms(connector, DRM_MODE_DPMS_OFF);
  }

  In the above iteration over connector_list for the radeon drm_device on suspend, dev->mode_config.connector_list.next is NULL.

In other words, the DRM device connector list is _corrupted_. It is almost certain that a connector was detached or destroyed while suspend was trying to switch off the display on all connectors.

dev->mode_config.connector_list.next is NULL:
 the faulting instruction at EIP was triggered by a NULL-derived value in register EBX
  EBX value is 0xfffffea8,
  which is nothing but NULL - 0x158 in 32-bit arithmetic,
  which is the same as ~0U - 0x157.
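The EBX arithmetic above can be checked with a few lines of C. This is only a sketch: the helper names (entry_from_list_ptr, faulting_address, check_oops_values) are made up for illustration, and the 0x158 offset is taken from the disassembly quoted in this comment rather than from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the list_head inside struct drm_connector, as implied by
 * the "sub $0x158,%ebx" instruction quoted in this comment. */
#define CONNECTOR_HEAD_OFFSET 0x158u

/* Value left in EBX after list_entry() is applied to a list pointer,
 * using 32-bit wrap-around arithmetic as on i386. */
static uint32_t entry_from_list_ptr(uint32_t list_ptr)
{
    return (uint32_t)(list_ptr - CONNECTOR_HEAD_OFFSET);
}

/* Address read by "mov 0x158(%ebx),%eax" for a given EBX. */
static uint32_t faulting_address(uint32_t ebx)
{
    return (uint32_t)(ebx + CONNECTOR_HEAD_OFFSET);
}

/* Self-check against the values reported in the oops. */
static void check_oops_values(void)
{
    /* A NULL list pointer yields the EBX value from the oops... */
    assert(entry_from_list_ptr(0) == 0xfffffea8u);
    /* ...and the mov then reads address 0, the reported NULL dereference. */
    assert(faulting_address(0xfffffea8u) == 0u);
    /* Same value written as ~0U - 0x157. */
    assert(UINT32_MAX - 0x157u == 0xfffffea8u);
}
```

Calling check_oops_values() from a test program passes silently if the numbers in the oops are consistent with a NULL list pointer.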

  The oops EIP is at:
  radeon_suspend_kms+0x78

which, in the objdump disassembly, is the instruction at address 19888:
  19888: 8b 83 58 01 00 00 mov 0x158(%ebx),%eax
  Bingo: that's the list_entry macro stepping through the "struct drm_connector" list. The list_head field of drm_connector is at offset 0x158, which has to be subtracted from the list_head pointer to arrive at the drm_connector.
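For readers unfamiliar with list_entry, here is a rough userspace sketch of the idea. It is not the kernel's code: struct fake_connector is a hypothetical stand-in whose padding only exists to put the list_head at offset 0x158, and entry_address and count_connectors are illustrative helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical stand-in for struct drm_connector: the padding only
 * exists to place 'head' at offset 0x158, as in the disassembly. */
struct fake_connector {
    char pad[0x158];
    struct list_head head;
};

/* Same idea as the kernel's list_entry(): step back from the embedded
 * list_head to the start of the containing structure. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The same subtraction on a plain integer, so that a NULL (0) list
 * pointer can be examined without dereferencing anything. */
static uintptr_t entry_address(uintptr_t list_ptr)
{
    return list_ptr - offsetof(struct fake_connector, head);
}

/* Append 'item' at the tail of the circular list headed by 'list'. */
static void list_add_tail(struct list_head *item, struct list_head *list)
{
    item->prev = list->prev;
    item->next = list;
    list->prev->next = item;
    list->prev = item;
}

/* Walk the circular list and count its entries. In a healthy list the
 * walk ends by returning to the head; 'next' is never NULL. */
static int count_connectors(struct list_head *list)
{
    int n = 0;
    struct list_head *pos;

    for (pos = list->next; pos != list; pos = pos->next) {
        assert(pos != NULL); /* a NULL here is the corruption described above */
        n++;
    }
    return n;
}
```

In a well-formed list every next pointer leads back to the head eventually, so a NULL next can only come from corruption or from an entry that was torn down while the list was being walked. Truncating entry_address(0) to 32 bits gives 0xfffffea8, the EBX value in the oops.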

So at panic time, the radeon driver oopsed while trying to turn off the display on each of the connected devices, but the connector list was corrupted.

Also, the oops hexdump exactly matches the objdump disassembly bytes at the point of the panic:
81 eb 58 01 00 00 <8b> 83 58 01 00 00 0f
<8b> (in angle brackets) marks the first byte of the faulting instruction, the "mov".

This matches the list_head walk for the drm connector in the objdump disassembly of the radeon_suspend_kms function:
  19882: 81 eb 58 01 00 00 sub $0x158,%ebx
  19888: 8b 83 58 01 00 00 mov 0x158(%ebx),%eax -> PANIC EIP is here.
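The byte matching above can also be done mechanically. The sketch below assumes the oops "Code:" line is a run of space-separated hex bytes with the faulting byte wrapped in angle brackets, as in the excerpt quoted in this comment; marked_byte is a hypothetical helper, not part of any kernel tool.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns the zero-based index of the byte marked <xx> in an oops
 * "Code:" excerpt and stores its value in *value. Returns -1 if no
 * byte is marked. Tokens are assumed to be space-separated hex bytes. */
static int marked_byte(const char *code, unsigned *value)
{
    int index = 0;
    const char *p = code;

    assert(code != NULL && value != NULL);

    while (*p != '\0') {
        while (*p == ' ')           /* skip separators */
            p++;
        if (*p == '\0')
            break;
        if (*p == '<') {            /* the faulting byte */
            *value = (unsigned)strtoul(p + 1, NULL, 16);
            return index;
        }
        while (*p != '\0' && *p != ' ')  /* skip an ordinary byte */
            p++;
        index++;
    }
    return -1;
}
```

For the excerpt in this report, the marked byte is 0x8b at index 6. The six bytes before it are exactly the "sub $0x158,%ebx" instruction at 19882, so the marked byte lands on 19888 (reading the objdump addresses as hexadecimal), the start of the "mov".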

Now that we know the C code and the reason for the oops (the NULL pointer field), we have to trace backwards in the code and see how the drm connector list can be corrupted or can have NULL as a list element or a co...


Revision history for this message
penalvch (penalvch) wrote :

phobos, thank you for reporting this bug and helping make Ubuntu better. This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux <replace-with-bug-number>

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

summary: - BUG: unable to handle kernel NULL pointer dereference at (null)
- radeon_suspend_kms+0x78/0x1e0 [radeon] from
- radeon_switcheroo_set_state+0x4b/0xa0
+ Kernel Oops - BUG: unable to handle kernel NULL pointer dereference at
+ (null); EIP is at radeon_suspend_kms+0x78/0x1e0 [radeon]
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired