Latest kernel 2.6.38-11.48 crashes the iwlagn (wifi) driver

Bug #837819 reported by Juliano Ravasi
This bug affects 3 people
Affects          Status    Importance   Assigned to   Milestone
linux (Ubuntu)   Expired   Undecided    Unassigned

Bug Description

The latest kernel, 2.6.38-11.48, pushed a few days ago, causes the iwlagn driver to crash with a "microcode software error" on this wireless network interface (in a Dell XPS L502X):

03:00.0 Network controller: Intel Corporation Centrino Wireless-N 1030 (rev 34)
        Subsystem: Intel Corporation Centrino Wireless-N 1030 BGN
        Flags: bus master, fast devsel, latency 0, IRQ 49
        Memory at f1b00000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [c8] Power Management version 3
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [e0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number bc-77-37-ff-ff-37-05-96
        Kernel driver in use: iwlagn
        Kernel modules: iwlagn

The errors occur at random but are more frequent under heavy usage. Each time, this is dumped to the system log:

iwlagn 0000:03:00.0: Microcode SW error detected. Restarting 0x2000000.
iwlagn 0000:03:00.0: Loaded firmware version: 17.168.5.2 build 35905
iwlagn 0000:03:00.0: Start IWL Error Log Dump:
iwlagn 0000:03:00.0: Status: 0x0002B2E4, count: 6
iwlagn 0000:03:00.0: Desc Time data1 data2 line
iwlagn 0000:03:00.0: ADVANCED_SYSASSERT (0x1BD8) 0024444172 0x00000258 0x000000C8 199
iwlagn 0000:03:00.0: pc blink1 blink2 ilink1 ilink2 hcmd
iwlagn 0000:03:00.0: 0x05FBC 0x05F1E 0x05F1E 0x0DC1A 0x00000 0x92D0091
iwlagn 0000:03:00.0: CSR values:
iwlagn 0000:03:00.0: (2nd byte of CSR_INT_COALESCING is CSR_INT_PERIODIC_REG)
iwlagn 0000:03:00.0: CSR_HW_IF_CONFIG_REG: 0X00c80303
iwlagn 0000:03:00.0: CSR_INT_COALESCING: 0X0000ff40
iwlagn 0000:03:00.0: CSR_INT: 0X00000000
iwlagn 0000:03:00.0: CSR_INT_MASK: 0X00000000
iwlagn 0000:03:00.0: CSR_FH_INT_STATUS: 0X00000000
iwlagn 0000:03:00.0: CSR_GPIO_IN: 0X0000003c
iwlagn 0000:03:00.0: CSR_RESET: 0X00000000
iwlagn 0000:03:00.0: CSR_GP_CNTRL: 0X080403c5
iwlagn 0000:03:00.0: CSR_HW_REV: 0X000000b0
iwlagn 0000:03:00.0: CSR_EEPROM_REG: 0X27010ffd
iwlagn 0000:03:00.0: CSR_EEPROM_GP: 0X90000801
iwlagn 0000:03:00.0: CSR_OTP_GP_REG: 0X00030001
iwlagn 0000:03:00.0: CSR_GIO_REG: 0X00080042
iwlagn 0000:03:00.0: CSR_GP_UCODE_REG: 0X0000521e
iwlagn 0000:03:00.0: CSR_GP_DRIVER_REG: 0X00000000
iwlagn 0000:03:00.0: CSR_UCODE_DRV_GP1: 0X00000000
iwlagn 0000:03:00.0: CSR_UCODE_DRV_GP2: 0X00000000
iwlagn 0000:03:00.0: CSR_LED_REG: 0X00000058
iwlagn 0000:03:00.0: CSR_DRAM_INT_TBL_REG: 0X8811bea2
iwlagn 0000:03:00.0: CSR_GIO_CHICKEN_BITS: 0X27800200
iwlagn 0000:03:00.0: CSR_ANA_PLL_CFG: 0X00000000
iwlagn 0000:03:00.0: CSR_HW_REV_WA_REG: 0X0001001a
iwlagn 0000:03:00.0: CSR_DBG_HPET_MEM_REG: 0Xffff0000
iwlagn 0000:03:00.0: FH register values:
iwlagn 0000:03:00.0: FH_RSCSR_CHNL0_STTS_WPTR_REG: 0X0a958200
iwlagn 0000:03:00.0: FH_RSCSR_CHNL0_RBDCB_BASE_REG: 0X00a97a60
iwlagn 0000:03:00.0: FH_RSCSR_CHNL0_WPTR: 0X000000f8
iwlagn 0000:03:00.0: FH_MEM_RCSR_CHNL0_CONFIG_REG: 0X80819104
iwlagn 0000:03:00.0: FH_MEM_RSSR_SHARED_CTRL_REG: 0X000000fc
iwlagn 0000:03:00.0: FH_MEM_RSSR_RX_STATUS_REG: 0X07030000
iwlagn 0000:03:00.0: FH_MEM_RSSR_RX_ENABLE_ERR_IRQ2DRV: 0X00000000
iwlagn 0000:03:00.0: FH_TSSR_TX_STATUS_REG: 0X07ff0001
iwlagn 0000:03:00.0: FH_TSSR_TX_ERROR_REG: 0X00000000
iwlagn 0000:03:00.0: Start IWL Event Log Dump: display last 20 entries
iwlagn 0000:03:00.0: EVT_LOGT:0025527212:0x00040000:1243
iwlagn 0000:03:00.0: EVT_LOGT:0025527292:0x000000fa:0106
iwlagn 0000:03:00.0: EVT_LOGT:0025527470:0x00002000:1243
iwlagn 0000:03:00.0: EVT_LOGT:0025527471:0x00004000:1243
iwlagn 0000:03:00.0: EVT_LOGT:0025527655:0x00001000:1243
iwlagn 0000:03:00.0: EVT_LOGT:0025528337:0x00000080:1243
iwlagn 0000:03:00.0: EVT_LOGT:0025528338:0x00040000:1243
iwlagn 0000:03:00.0: EVT_LOGT:0025530451:0x000000fa:0106
iwlagn 0000:03:00.0: EVT_LOGT:0025530542:0x00000080:1243
iwlagn 0000:03:00.0: EVT_LOGT:0025530543:0x00040000:1243
iwlagn 0000:03:00.0: EVT_LOGT:0025530549:0x000000fa:0106
iwlagn 0000:03:00.0: EVT_LOGT:0025530803:0x00000080:1243
iwlagn 0000:03:00.0: EVT_LOGT:0025530804:0x00040000:1243
iwlagn 0000:03:00.0: EVT_LOGT:0025531835:0x000000fa:0106
iwlagn 0000:03:00.0: EVT_LOGT:0025531927:0x00000080:1243
iwlagn 0000:03:00.0: EVT_LOGT:0025531928:0x00040000:1243
iwlagn 0000:03:00.0: EVT_LOGT:0025532245:0x000000fa:0106
iwlagn 0000:03:00.0: EVT_LOGT:0025532335:0x00000080:1243
iwlagn 0000:03:00.0: EVT_LOGT:0025532336:0x00040000:1243
iwlagn 0000:03:00.0: EVT_LOGT:0025534328:0x00000000:0125
iwlagn 0000:03:00.0: Attempting to modify non-existing station 2

This error repeats over and over until I disconnect. The WiFi LED turns off and then on again every time the driver reports this error. Going back to version 2.6.38-10.46 resolves the problem, so this is a regression in the current version.

The changelog of this version mentions:

    iwlagn: fix iwl_is_any_associated

Perhaps this fix is the culprit, and it introduced this new problem.

ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: linux-image-2.6.38-11-generic 2.6.38-11.48
ProcVersionSignature: Ubuntu 2.6.38-11.48-generic 2.6.38.8
Uname: Linux 2.6.38-11-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.23.
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: PCH [HDA Intel PCH], device 0: ALC665 Analog [ALC665 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: juliano 2135 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'PCH'/'HDA Intel PCH at 0xf1c00000 irq 50'
   Mixer name : 'Intel CougarPoint HDMI'
   Components : 'HDA:10ec0665,102804b6,00100003 HDA:80862805,80860101,00100000'
   Controls : 27
   Simple ctrls : 13
Date: Wed Aug 31 00:24:08 2011
EcryptfsInUse: Yes
HibernationDevice: RESUME=UUID=2dd8520d-df2b-4937-8655-c779de8f152a
InstallationMedia: Kubuntu 11.04 "Natty Narwhal" - Release amd64 (20110427.1)
MachineType: Dell Inc. Dell System XPS L502X
ProcEnviron:
 LANGUAGE=en_US:en
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/zsh
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.38-11-generic root=UUID=bd6ddeeb-587d-403c-b829-c6f49fbd0796 ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-2.6.38-11-generic N/A
 linux-backports-modules-2.6.38-11-generic N/A
 linux-firmware 1.52
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 03/25/2011
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A04
dmi.board.name: 0YR8NN
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.chassis.version: 0.1
dmi.modalias: dmi:bvnDellInc.:bvrA04:bd03/25/2011:svnDellInc.:pnDellSystemXPSL502X:pvr:rvnDellInc.:rn0YR8NN:rvrA00:cvnDellInc.:ct8:cvr0.1:
dmi.product.name: Dell System XPS L502X
dmi.sys.vendor: Dell Inc.

Juliano Ravasi (jravasi) wrote :
Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
Marius B. Kotsbak (mariusko) wrote :
Juliano Ravasi (jravasi) wrote :

This is the commit reported by bisect:

2b4208383d114d202d38eb3b04ff7b844de8eefa is the first bad commit
commit 2b4208383d114d202d38eb3b04ff7b844de8eefa
Author: Johannes Berg <email address hidden>
Date: Fri May 6 11:11:20 2011 -0700

    iwlagn: fix iwl_is_any_associated

    BugLink: http://bugs.launchpad.net/bugs/793702

    commit 054ec924944912413e4ee927b8cf02f476d08783 upstream.

    The function iwl_is_any_associated() was intended
    to check both contexts, but due to an oversight
    it only checks the BSS context. This leads to a
    problem with scanning since the passive dwell
    time isn't restricted appropriately and a scan
    that includes passive channels will never finish
    if only the PAN context is associated since the
    default dwell time of 120ms won't fit into the
    normal 100 TU DTIM interval.

    Fix the function by using for_each_context() and
    also reorganise the other functions a bit to take
    advantage of each other making the code easier to
    read.

    Signed-off-by: Johannes Berg <email address hidden>
    Signed-off-by: Wey-Yi Guy <email address hidden>
    Signed-off-by: John W. Linville <email address hidden>
    Signed-off-by: Greg Kroah-Hartman <email address hidden>
    Signed-off-by: Tim Gardner <email address hidden>

:040000 040000 0db112ac8e49c4741150666f49236a5386ecbeb3 7d2640cda3ac0b9b69fae1bdf35ee62e3be310cc M drivers
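
The commit message above makes two claims that can be checked with a small sketch: that a 120 ms passive dwell does not fit into a 100 TU DTIM interval, and that checking only the BSS context misses the case where only the PAN context is associated. The Python below is a toy model, not driver code. The function names, the dictionary of contexts, and the constants are assumptions made for illustration; only the 120 ms and 100 TU figures come from the commit message, and the TU length of 1024 microseconds comes from 802.11.

```python
# Toy model of the commit message's reasoning. Not taken from the iwlagn source.

TU_MICROSECONDS = 1024  # one 802.11 time unit (TU) is 1024 microseconds

# Figures quoted in the commit message.
DEFAULT_PASSIVE_DWELL_MS = 120
DTIM_INTERVAL_TU = 100


def tu_to_ms(tu):
    """Convert 802.11 time units to milliseconds."""
    return tu * TU_MICROSECONDS / 1000.0


def is_any_associated_before(contexts):
    """Hypothetical model of the oversight: only the BSS context is checked."""
    return contexts.get("BSS", False)


def is_any_associated_after(contexts):
    """Hypothetical model of the fix: every context is checked."""
    return any(contexts.values())


# The scenario from the commit message: only the PAN context is associated.
contexts = {"BSS": False, "PAN": True}

dtim_ms = tu_to_ms(DTIM_INTERVAL_TU)
dwell_fits = DEFAULT_PASSIVE_DWELL_MS <= dtim_ms

print("DTIM interval in ms:", dtim_ms)
print("default dwell fits into DTIM interval:", dwell_fits)
print("associated (before fix):", is_any_associated_before(contexts))
print("associated (after fix):", is_any_associated_after(contexts))
```

In this model the pre-fix check reports "not associated" for a PAN-only setup, so nothing would restrict the dwell time. That matches the scanning problem the commit set out to fix. It does not explain why the fix itself triggers the firmware assert reported in this bug.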

Bert JW Regeer (xistence) wrote :

I am seeing the same error messages in my log as the OP:

[ 3336.284786] iwlagn 0000:01:00.0: Microcode SW error detected. Restarting 0x2000000.
[ 3336.284810] iwlagn 0000:01:00.0: Loaded firmware version: 17.168.5.2 build 35905
[ 3336.284904] iwlagn 0000:01:00.0: Start IWL Error Log Dump:
[ 3336.284915] iwlagn 0000:01:00.0: Status: 0x0002B2E4, count: 6
[ 3336.285158] iwlagn 0000:01:00.0: Desc Time data1 data2 line
[ 3336.285170] iwlagn 0000:01:00.0: ADVANCED_SYSASSERT (0x1BD8) 0299830228 0x00000258 0x000000C8 199
[ 3336.285182] iwlagn 0000:01:00.0: pc blink1 blink2 ilink1 ilink2 hcmd
[ 3336.285195] iwlagn 0000:01:00.0: 0x05FBC 0x05F1E 0x05F1E 0x0DC1A 0x00000 0x9A90091
[ 3336.285206] iwlagn 0000:01:00.0: CSR values:
[ 3336.285216] iwlagn 0000:01:00.0: (2nd byte of CSR_INT_COALESCING is CSR_INT_PERIODIC_REG)
[ 3336.285258] iwlagn 0000:01:00.0: CSR_HW_IF_CONFIG_REG: 0X00c80303
[ 3336.285296] iwlagn 0000:01:00.0: CSR_INT_COALESCING: 0X0000ff40
[ 3336.285334] iwlagn 0000:01:00.0: CSR_INT: 0X00000000
[ 3336.285371] iwlagn 0000:01:00.0: CSR_INT_MASK: 0X00000000
[ 3336.285407] iwlagn 0000:01:00.0: CSR_FH_INT_STATUS: 0X00000000
[ 3336.285442] iwlagn 0000:01:00.0: CSR_GPIO_IN: 0X0000003c
[ 3336.285478] iwlagn 0000:01:00.0: CSR_RESET: 0X00000000
[ 3336.285515] iwlagn 0000:01:00.0: CSR_GP_CNTRL: 0X080403c5
[ 3336.285552] iwlagn 0000:01:00.0: CSR_HW_REV: 0X000000b0
[ 3336.285588] iwlagn 0000:01:00.0: CSR_EEPROM_REG: 0X88160ffd
[ 3336.285625] iwlagn 0000:01:00.0: CSR_EEPROM_GP: 0X90000801
[ 3336.285661] iwlagn 0000:01:00.0: CSR_OTP_GP_REG: 0X00030001
[ 3336.285696] iwlagn 0000:01:00.0: CSR_GIO_REG: 0X00080042
[ 3336.285732] iwlagn 0000:01:00.0: CSR_GP_UCODE_REG: 0X000063c1
[ 3336.285769] iwlagn 0000:01:00.0: CSR_GP_DRIVER_REG: 0X00000000
[ 3336.285803] iwlagn 0000:01:00.0: CSR_UCODE_DRV_GP1: 0X00000000
[ 3336.285838] iwlagn 0000:01:00.0: CSR_UCODE_DRV_GP2: 0X00000000
[ 3336.285875] iwlagn 0000:01:00.0: CSR_LED_REG: 0X00000058
[ 3336.285911] iwlagn 0000:01:00.0: CSR_DRAM_INT_TBL_REG: 0X881b20dc
[ 3336.285948] iwlagn 0000:01:00.0: CSR_GIO_CHICKEN_BITS: 0X27800200
[ 3336.285984] iwlagn 0000:01:00.0: CSR_ANA_PLL_CFG: 0X00000000
[ 3336.286023] iwlagn 0000:01:00.0: CSR_HW_REV_WA_REG: 0X0001001a
[ 3336.286058] iwlagn 0000:01:00.0: CSR_DBG_HPET_MEM_REG: 0Xffff0000
[ 3336.286067] iwlagn 0000:01:00.0: FH register values:
[ 3336.286145] iwlagn 0000:01:00.0: FH_RSCSR_CHNL0_STTS_WPTR_REG: 0X1a241f00
[ 3336.286225] iwlagn 0000:01:00.0: FH_RSCSR_CHNL0_RBDCB_BASE_REG: 0X01b36ba0
[ 3336.286316] iwlagn 0000:01:00.0: FH_RSCSR_CHNL0_WPTR: 0X00000038
[ 3336.286406] iwlagn 0000:01:00.0: FH_MEM_RCSR_CHNL0_CONFIG_REG: 0X80819104
[ 3336.286497] iwlagn 0000:01:00.0: FH_MEM_RSSR_SHARED_CTRL_REG: 0X000000fc
[ 3336.286589] iwlagn 0000:01:00.0: FH_MEM_RSSR_RX_STATUS_REG: 0X0703000...
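
Both dumps in this report show the same assert line (ADVANCED_SYSASSERT, 0x1BD8, data1 0x00000258, data2 0x000000C8, line 199) even though they come from different machines. A minimal Python sketch for checking that across saved logs could look like the following. The regular expressions and the function name summarize are assumptions written for this example and are based only on the line layout visible in this report; other iwlagn versions may format these lines differently.

```python
import re

# Optional dmesg timestamp such as "[ 3336.284786] ", then
# "iwlagn <pci address>: <message>". Layout assumed from the logs in this report.
LINE_RE = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s+)?iwlagn\s+(?P<dev>[0-9a-f:.]+):\s+(?P<msg>.*)$"
)

# The row under the "Desc Time data1 data2 line" header.
ASSERT_RE = re.compile(
    r"^(?P<desc>\w+)\s+\((?P<code>0x[0-9A-Fa-f]+)\)\s+(?P<time>\d+)\s+"
    r"(?P<data1>0x[0-9A-Fa-f]+)\s+(?P<data2>0x[0-9A-Fa-f]+)\s+(?P<line>\d+)$"
)


def summarize(log_text):
    """Return (number of firmware restarts, list of assert signatures)."""
    restarts = 0
    signatures = []
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        msg = match.group("msg")
        if msg.startswith("Microcode SW error detected"):
            restarts += 1
            continue
        found = ASSERT_RE.match(msg)
        if found:
            # The firmware timestamp is left out so that signatures from
            # different crashes can be compared directly.
            signatures.append((
                found.group("desc"),
                found.group("code"),
                found.group("data1"),
                found.group("data2"),
                int(found.group("line")),
            ))
    return restarts, signatures


# Four lines copied from the two dumps in this report.
SAMPLE = """\
iwlagn 0000:03:00.0: Microcode SW error detected. Restarting 0x2000000.
iwlagn 0000:03:00.0: ADVANCED_SYSASSERT (0x1BD8) 0024444172 0x00000258 0x000000C8 199
[ 3336.284786] iwlagn 0000:01:00.0: Microcode SW error detected. Restarting 0x2000000.
[ 3336.285170] iwlagn 0000:01:00.0: ADVANCED_SYSASSERT (0x1BD8) 0299830228 0x00000258 0x000000C8 199
"""

restarts, signatures = summarize(SAMPLE)
print("restarts:", restarts)
print("distinct signatures:", set(signatures))
```

Run against a full syslog, the restart count gives a rough measure of how often the firmware was restarted, and a single distinct signature suggests the reports describe the same failure.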

Juliano Ravasi (jravasi) wrote :

Just to reconfirm: after I found the bad commit, I ran git checkout Ubuntu-2.6.38-11.48 and then git revert 2b420838. The resulting kernel does not show the problem, as expected, so commit 2b420838 is the bad one.

And just like Bert, I also use hostap on this system to connect my phone and my other laptop. All my tests were done with the interface in "Master" mode, acting as an access point using hostap.

Bert JW Regeer (xistence) wrote :

Are you seeing the same kernel panics that I am seeing, Juliano?

Juliano Ravasi (jravasi) wrote :

No kernel panics, but perhaps only because I didn't let it run long enough in that configuration. As soon as I noticed the driver crashing, I reported the bug and reverted to the previous kernel.

penalvch (penalvch) wrote :

Juliano Ravasi, thank you for reporting this bug and helping make Ubuntu better. This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, please run the following command from a Terminal (Applications->Accessories->Terminal); it will automatically gather and attach updated debug information to this report.

apport-collect -p linux <replace-with-bug-number>

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

tags: added: needs-upstream-testing regression-release
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired