wpa_supplicant CTRL-EVENT-SCAN-RESULTS crashes ath9k wireless driver

Bug #460886 reported by Neil Wilson
This bug affects 8 people
Affects: linux (Ubuntu)
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned

Bug Description

The regular scans by Network Manager cause the ath9k driver to crash in Karmic RC. The wireless stays stable for about 8 or 9 pings on average before crashing with an 'ath9k: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x40000020' error.

ProblemType: Bug
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: neil 1640 F.... pulseaudio
CRDA:
 country 98:
  (2402 - 2482 @ 40), (N/A, 20)
Card0.Amixer.info:
 Card hw:0 'SB'/'HDA ATI SB at 0xf0000000 irq 16'
   Mixer name : 'Realtek ALC888'
   Components : 'HDA:10ec0888,10250206,00100202 HDA:14f12c06,10250093,00100000'
   Controls : 28
   Simple ctrls : 16
Card1.Amixer.info:
 Card hw:1 'HDMI'/'HDA ATI HDMI at 0xcfdec000 irq 19'
   Mixer name : 'ATI RS690/780 HDMI'
   Components : 'HDA:1002791a,00791a00,00100000'
   Controls : 4
   Simple ctrls : 1
Card1.Amixer.values:
 Simple mixer control 'IEC958',0
   Capabilities: pswitch pswitch-joined
   Playback channels: Mono
   Mono: Playback [off]
Date: Mon Oct 26 06:50:33 2009
DistroRelease: Ubuntu 9.10
HibernationDevice: RESUME=UUID=a1085570-586c-4eac-a7cb-52ca2f87881e
Lsusb:
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Acer Aspire 5536
Package: linux-image-2.6.31-14-generic 2.6.31-14.48
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.31-14-generic root=UUID=28617baa-7d84-4b4b-a9ae-ca79b57b9b16 ro quiet splash
ProcEnviron:
 LANG=en_GB.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.31-14.48-generic
RelatedPackageVersions:
 linux-backports-modules-2.6.31-14-generic N/A
 linux-firmware 1.24
SourcePackage: linux
Uname: Linux 2.6.31-14-generic x86_64
XsessionErrors:
 (gnome-settings-daemon:1653): GLib-CRITICAL **: g_propagate_error: assertion `src != NULL' failed
 (gnome-settings-daemon:1653): GLib-CRITICAL **: g_propagate_error: assertion `src != NULL' failed
 (polkit-gnome-authentication-agent-1:1757): GLib-CRITICAL **: g_once_init_leave: assertion `initialization_value != 0' failed
 (nautilus:1748): Eel-CRITICAL **: eel_preferences_get_boolean: assertion `preferences_is_initialized ()' failed
dmi.bios.date: 02/27/2009
dmi.bios.vendor: Phoenix Technologies LTD
dmi.bios.version: V1.03
dmi.board.name: JV50PU
dmi.board.vendor: Acer
dmi.board.version: Rev
dmi.chassis.type: 10
dmi.chassis.vendor: Acer
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnPhoenixTechnologiesLTD:bvrV1.03:bd02/27/2009:svnAcer:pnAspire5536:pvr0100:rvnAcer:rnJV50PU:rvrRev:cvnAcer:ct10:cvrN/A:
dmi.product.name: Aspire 5536
dmi.product.version: 0100
dmi.sys.vendor: Acer

Revision history for this message
Neil Wilson (neil-aldur) wrote :
snircher (deuteriumoxide) wrote :

I believe I have the same, or at least a very similar, issue. These constant CTRL-EVENT-SCAN-RESULTS messages from wpa_supplicant are always the last syslog entries on my system before the crash/freeze and my manual restart. I also filed a bug report about this, which can be found at https://bugs.launchpad.net/ubuntu/+source/wpasupplicant/+bug/468519

Marc Rossi (mrossi19) wrote :

Same situation here. Every 2 minutes I get the CTRL-EVENT-SCAN-RESULTS message with about a 10 sec network outage.
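
To check whether the events really arrive on a fixed schedule, the timestamps of the matching log lines can be listed. This is only a rough sketch: it assumes the wpa_supplicant messages end up in /var/log/syslog and that each line starts with the usual "Mon DD HH:MM:SS" prefix seen in the kernel lines quoted in this report. The function name is made up for illustration.

```shell
#!/bin/sh
# Print the HH:MM:SS field of every CTRL-EVENT-SCAN-RESULTS line in a log file.
# Assumption: standard syslog prefix, so the time is the third field.
scan_event_times() {
    logfile="${1:-/var/log/syslog}"   # default path is an assumption
    awk '/CTRL-EVENT-SCAN-RESULTS/ { print $3 }' "$logfile"
}

# Usage: scan_event_times /var/log/syslog
```

If the printed times are about two minutes apart and line up with the outages, that would support the periodic scan being the trigger.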

Stefan Bader (smb) wrote :

The PPA at https://launchpad.net/~stefan-bader-canonical/+archive/karmic contains the coming kernel update for Karmic. It fixes some issues with ath9k, though I am not completely sure this issue is covered, so it would be worth a try. If that does not help, the PPA also contains the latest version of linux-backports-modules-wireless, which contains more recent upstream wireless drivers. I would suggest you give that a try if the kernel update alone is not enough. Please report back here how this worked (or not).

Neil Wilson (neil-aldur) wrote :

The kernel currently in karmic-proposed stops the driver crashing on the scan events. However, the event still regularly kills upper-layer throughput while the wireless layer still thinks it is connected properly.

Removing the module at that point gives:

Nov 14 07:23:35 acer-aspire-5536 kernel: [ 749.610185] wlan0: deauthenticating by local choice (reason=3)
Nov 14 07:23:35 acer-aspire-5536 kernel: [ 749.773204] ath9k: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x40000020
Nov 14 07:23:35 acer-aspire-5536 kernel: [ 750.203835] ath9k 0000:06:00.0: PCI INT A disabled
Nov 14 07:23:35 acer-aspire-5536 kernel: [ 750.203909] ath9k: Driver unloaded

However, the module will reinsert.
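
For anyone wanting to repeat the remove/reinsert cycle, something along these lines should do it. It is a sketch only: the function name and the DRY_RUN switch are made up for illustration, and the real run needs root.

```shell
#!/bin/sh
# Unload and reload a kernel module (ath9k by default).
# With DRY_RUN=1 the commands are only printed, not executed.
reload_module() {
    module="${1:-ath9k}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "modprobe -r $module"
        echo "modprobe $module"
        return 0
    fi
    # Reinsert only if the removal succeeded.
    modprobe -r "$module" && modprobe "$module"
}

# Usage (as root): reload_module ath9k
```

Checking dmesg straight after the removal should show whether the "DMA failed to stop" line appears again.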

So we're better than we were, but with a way to go.

When the event doesn't cause the driver to crash you get a definite pause in the throughput.

64 bytes from 192.168.2.1: icmp_seq=181 ttl=64 time=1.41 ms
64 bytes from 192.168.2.1: icmp_seq=182 ttl=64 time=1048 ms
64 bytes from 192.168.2.1: icmp_seq=183 ttl=64 time=46.9 ms
64 bytes from 192.168.2.1: icmp_seq=184 ttl=64 time=2.10 ms

This is with the laptop next to the AP

linux-image-2.6.31-15-generic:
  Installed: 2.6.31-15.50
  Candidate: 2.6.31-15.50
  Version table:
 *** 2.6.31-15.50 0
        500 http://gb.archive.ubuntu.com karmic-proposed/main Packages
        100 /var/lib/dpkg/status

Neil Wilson (neil-aldur) wrote :

That is on driver version "21B680AD1946884D70F623E"

filename: /lib/modules/2.6.31-15-generic/kernel/drivers/net/wireless/ath/ath9k/ath9k.ko
license: Dual BSD/GPL
description: Support for Atheros 802.11n wireless LAN cards.
author: Atheros Communications
srcversion: 21B680AD1946884D70F623E
alias: pci:v0000168Cd0000002Bsv*sd*bc*sc*i*
alias: pci:v0000168Cd0000002Asv*sd*bc*sc*i*
alias: pci:v0000168Cd00000029sv*sd*bc*sc*i*
alias: pci:v0000168Cd00000027sv*sd*bc*sc*i*
alias: pci:v0000168Cd00000024sv*sd*bc*sc*i*
alias: pci:v0000168Cd00000023sv*sd*bc*sc*i*
depends: led-class,mac80211,ath,cfg80211
vermagic: 2.6.31-15-generic SMP mod_unload modversions
parm: debug:uint
parm: nohwcrypt:Disable hardware encryption (int)
parm: btcoex_enable:Enable Bluetooth coexistence support (bool)
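
The alias lines above encode the PCI vendor and device IDs the module claims. A small helper can turn them into the vendor:device form that `lspci -nn` prints, which makes it easier to see whether a given card is covered by a given driver build. This is a sketch; the function name is made up.

```shell
#!/bin/sh
# Convert a modinfo PCI alias such as
#   pci:v0000168Cd0000002Asv*sd*bc*sc*i*
# into the lowercase vendor:device pair, e.g. 168c:002a.
alias_to_pci_id() {
    printf '%s\n' "$1" |
        sed -n 's/^pci:v0000\([0-9A-Fa-f]\{4\}\)d0000\([0-9A-Fa-f]\{4\}\).*/\1:\2/p' |
        tr 'A-F' 'a-f'
}

# Usage:
#   modinfo -F alias ath9k | while read -r a; do alias_to_pci_id "$a"; done
# then compare with the [vendor:device] pair shown by: lspci -nn
```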

Neil Wilson (neil-aldur) wrote :

The backport modules (2.6.31.15.28), which I think are based on stable compat-wireless-2.6.32-rc4, help a lot. I've had stability for several hours on those drivers with no DMA stalls.

The only downside is that the connection speed tends to be 11 Mbit/s or lower with a 40 MHz 802.11n AP, entirely the opposite of the over-aggressive speed optimisations of previous versions. However, it does work...

neil@acer-aspire-5536:~$ iw dev wlan0 station dump
Station 00:1c:df:a1:c7:2c (on wlan0)
 inactive time: 17020 ms
 rx bytes: 1152188
 rx packets: 3393
 tx bytes: 90321
 tx packets: 582
 signal: -12 dBm
 tx bitrate: 11.0 MBit/s
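
To keep an eye on the negotiated rate without reading the whole dump each time, the relevant fields can be pulled out with awk. A minimal sketch, assuming the output layout shown above; the function name is made up.

```shell
#!/bin/sh
# Read `iw dev <iface> station dump` output on stdin and print
# "<station MAC> <rate> <unit>" for each station listed.
tx_bitrate() {
    awk '
        /^Station/     { sta = $2 }            # remember the current station
        /tx bitrate:/  { print sta, $3, $4 }   # rate and unit follow the label
    '
}

# Usage: iw dev wlan0 station dump | tx_bitrate
```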

I still get marked stalls in throughput when Network Manager initiates a scan.

64 bytes from 192.168.2.1: icmp_seq=105 ttl=64 time=2.11 ms
64 bytes from 192.168.2.1: icmp_seq=106 ttl=64 time=2.11 ms
64 bytes from 192.168.2.1: icmp_seq=107 ttl=64 time=2.00 ms
64 bytes from 192.168.2.1: icmp_seq=108 ttl=64 time=4467 ms
64 bytes from 192.168.2.1: icmp_seq=109 ttl=64 time=3459 ms
64 bytes from 192.168.2.1: icmp_seq=110 ttl=64 time=2452 ms
64 bytes from 192.168.2.1: icmp_seq=111 ttl=64 time=1452 ms
64 bytes from 192.168.2.1: icmp_seq=112 ttl=64 time=452 ms
64 bytes from 192.168.2.1: icmp_seq=113 ttl=64 time=2.12 ms
64 bytes from 192.168.2.1: icmp_seq=114 ttl=64 time=2.43 ms
64 bytes from 192.168.2.1: icmp_seq=115 ttl=64 time=2.62 ms
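
In the output above the round-trip times fall by roughly a second per reply (4467, 3459, 2452, 1452, 452 ms), which would be consistent with the replies being held up during one stall of about 4.5 seconds and then released together. To pick such stalls out of a long ping run, a filter like the following may help. It is a sketch; the function name and the 100 ms default threshold are arbitrary choices, not anything from the driver.

```shell
#!/bin/sh
# Read ping output on stdin and print "<icmp_seq> <rtt in ms>" for every
# reply slower than the given threshold (default 100 ms).
slow_pings() {
    threshold_ms="${1:-100}"
    awk -v limit="$threshold_ms" '
        /icmp_seq=/ && /time=/ {
            seq = ""; rtt = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^icmp_seq=/) { seq = substr($i, 10) }
                if ($i ~ /^time=/)     { rtt = substr($i, 6) }
            }
            if (rtt + 0 > limit) { print seq, rtt }
        }'
}

# Usage: ping 192.168.2.1 | slow_pings 100
```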

Driver details:

neil@acer-aspire-5536:~$ sudo modinfo ath9k
filename: /lib/modules/2.6.31-15-generic/updates/cw/ath9k.ko
license: Dual BSD/GPL
description: Support for Atheros 802.11n wireless LAN cards.
author: Atheros Communications
srcversion: D775B5D263382E67C9CAD8F
alias: pci:v0000168Cd0000002Esv*sd*bc*sc*i*
alias: pci:v0000168Cd0000002Dsv*sd*bc*sc*i*
alias: pci:v0000168Cd0000002Bsv*sd*bc*sc*i*
alias: pci:v0000168Cd0000002Asv*sd*bc*sc*i*
alias: pci:v0000168Cd00000029sv*sd*bc*sc*i*
alias: pci:v0000168Cd00000027sv*sd*bc*sc*i*
alias: pci:v0000168Cd00000024sv*sd*bc*sc*i*
alias: pci:v0000168Cd00000023sv*sd*bc*sc*i*
depends: mac80211,led-class,ath,cfg80211
vermagic: 2.6.31-15-generic SMP mod_unload modversions
parm: debug:uint
parm: nohwcrypt:Disable hardware encryption (int)

neil@acer-aspire-5536:~$ sudo modinfo mac80211
filename: /lib/modules/2.6.31-15-generic/updates/cw/mac80211.ko
license: GPL
description: IEEE 802.11 subsystem
srcversion: 52FBBF68ECFF38BB0E66029
depends: cfg80211
vermagic: 2.6.31-15-generic SMP mod_unload modversions
parm: ieee80211_default_rc_algo:Default rate control algorithm for mac80211 to use (charp)
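
Since the thread compares the stock driver with the backports one, it helps to confirm which copy is actually in use. Going by the two filename lines quoted in this report, the stock module lives under kernel/ and the backports one under updates/. The helper below relies on that observation alone, so treat it as an assumption rather than a rule; the function name is made up.

```shell
#!/bin/sh
# Classify a module path as "backports" (under updates/), "stock"
# (under kernel/), or "unknown". Based only on the paths seen in this report.
module_origin() {
    case "$1" in
        */updates/*) echo "backports" ;;
        */kernel/*)  echo "stock" ;;
        *)           echo "unknown" ;;
    esac
}

# Usage: module_origin "$(modinfo -F filename ath9k)"
```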

Stefan Bader (smb) wrote :

Just to clarify: the version in the PPA is ahead of the one in proposed. I am providing pre-releases of the next proposed kernels.

Neil Wilson (neil-aldur) wrote :

Sorry Stefan. Helps if I read the instructions properly...

With 2.6.31-16.51~pre2 installed I have the same problem, as it's the same driver version, "21B680AD1946884D70F623E".

Comment #5 stands for that kernel version as well.

Neil Wilson (neil-aldur) wrote :

With the backports modules installed (2.6.31-16.18~pre2) I get stability but slow speed (2.0 Mbit/s) and no Bluetooth co-existence support (not that that works anyway on this laptop yet).

filename: /lib/modules/2.6.31-16-generic/updates/cw/ath9k.ko
license: Dual BSD/GPL
description: Support for Atheros 802.11n wireless LAN cards.
author: Atheros Communications
srcversion: 918698AEA3CBE61D0E996E6
alias: pci:v0000168Cd0000002Esv*sd*bc*sc*i*
alias: pci:v0000168Cd0000002Dsv*sd*bc*sc*i*
alias: pci:v0000168Cd0000002Bsv*sd*bc*sc*i*
alias: pci:v0000168Cd0000002Asv*sd*bc*sc*i*
alias: pci:v0000168Cd00000029sv*sd*bc*sc*i*
alias: pci:v0000168Cd00000027sv*sd*bc*sc*i*
alias: pci:v0000168Cd00000024sv*sd*bc*sc*i*
alias: pci:v0000168Cd00000023sv*sd*bc*sc*i*
depends: mac80211,led-class,ath,cfg80211
vermagic: 2.6.31-16-generic SMP mod_unload modversions
parm: debug:uint
parm: nohwcrypt:Disable hardware encryption (int)

Neil Wilson (neil-aldur) wrote :

Note from Luis on the ath9k devel list about the pause during scanning.

"If you leave your home channel you cannot be present for data on it.
One way to address this is to scan on each target channel and come
back the home channel, and repeat this for every channel you have to
scan. This is only done with nl80211 though which you would have to
enable manually through a wpa_supplicant configuration file or by
editing your wpa_supplicant dbus service file and enabling new
git-version wpa_supplicant options which are documented on the
wireless wiki [1]. Let us know if that helps."

He also mentions that roaming isn't implemented in nl80211 yet.
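
For testing outside Network Manager, wpa_supplicant can be pointed at the nl80211 driver on the command line with -D. The helper below only prints the command so it can be reviewed before running. It is a sketch: the function name is made up, the default interface and config path are assumptions, and the interface must not be managed by Network Manager at the same time.

```shell
#!/bin/sh
# Print a wpa_supplicant command line that selects the nl80211 driver.
# Defaults (wlan0, /etc/wpa_supplicant.conf) are placeholders.
supplicant_cmd() {
    iface="${1:-wlan0}"
    conf="${2:-/etc/wpa_supplicant.conf}"
    printf 'wpa_supplicant -D nl80211 -i %s -c %s\n' "$iface" "$conf"
}

# Usage: supplicant_cmd wlan0 /etc/wpa_supplicant.conf
# Review the printed command, then run it as root.
```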

sadaka (referbox) wrote :

Are there any plans to fix this bug in the near future? Please share any info on this, since this bug leaves my laptop and those of about 20 other people completely useless (we have only wireless access).

sudo lshw -C network
  *-network
       description: Ethernet interface
       product: 191 Gigabit Ethernet Adapter
       vendor: Silicon Integrated Systems [SiS]
       physical id: 4
       bus info: pci@0000:00:04.0
       logical name: eth0
       version: 02
       serial: 00:23:54:43:00:12
       size: 10MB/s
       capacity: 100MB/s
       width: 32 bits
       clock: 33MHz
       capabilities: pm bus_master cap_list ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd autonegotiation
       configuration: autonegotiation=on broadcast=yes driver=sis190 driverversion=1.3 duplex=half latency=0 link=no multicast=yes port=MII speed=10MB/s
       resources: irq:19 memory:fddfcc00-fddfcc7f ioport:cc00(size=128)
  *-network
       description: Wireless interface
       product: AR928X Wireless Network Adapter (PCI-Express)
       vendor: Atheros Communications Inc.
       physical id: 0
       bus info: pci@0000:02:00.0
       logical name: wmaster0
       version: 01
       serial: 00:22:43:21:d9:b2
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress msix bus_master cap_list logical ethernet physical wireless
       configuration: broadcast=yes driver=ath9k ip=192.168.0.100 latency=0 multicast=yes wireless=IEEE 802.11bgn
       resources: irq:16 memory:fdff0000-fdffffff

Maybe someone could kindly share info on any temporary workaround for this? Thank you.

Clnanderson (clnanderson) wrote :

Is there any way of stopping the scanning, or decreasing its frequency?

sadaka (referbox) wrote :

It seems that the only user-oriented, up-to-date distro I've managed to find that doesn't have this nasty bug is Mandriva. Developers assigned to fix this one (if there are any in the coming decades) could take a look at how the Mandriva guys managed to get around it.
For a couple of months now I have been using Mandriva 2010 One with GNOME (KDE is too heavy for me), and I have never been so delighted with my laptop's performance (ATI proprietary drivers), hardware support (every single piece of HW is recognized and works fine), networking and everything else. Those who suffer from this bug could give it a try, at least until the Ubuntu guys fix this bug (if that ever happens :-/ ).

Clnanderson (clnanderson) wrote :

It looks like it's a bug in the ath5k modules. If so, maybe Mandriva has more recent kernel modules?

http://bugzilla.kernel.org/show_bug.cgi?id=12635

"Fair enough. One thing you can try in the interim is to add to your
wpa_supplicant.conf "scan_freq=2412 2437 2462" -- that will limit the channels
that are scanned to 1,6, and 11, which should reduce the problems by a lot."

Not sure how to manage that though.
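
As far as I can tell, scan_freq goes inside a network block of wpa_supplicant.conf. The sketch below writes such a block to a file of your choosing. The function name, the SSID argument and the open-network key_mgmt=NONE line are placeholders for illustration; a protected network needs its own key settings. Note also that when Network Manager drives wpa_supplicant it passes the configuration over D-Bus, so a hand-written file like this probably only takes effect when wpa_supplicant is started directly with -c.

```shell
#!/bin/sh
# Write a minimal wpa_supplicant network block that restricts scanning
# to 2412, 2437 and 2462 MHz (channels 1, 6 and 11).
write_scan_freq_conf() {
    conf="$1"   # output path, e.g. a scratch file
    ssid="$2"   # placeholder network name
    cat > "$conf" <<EOF
# Sketch only: limit scanning for this network to channels 1, 6 and 11.
network={
    ssid="$ssid"
    key_mgmt=NONE
    scan_freq=2412 2437 2462
}
EOF
}

# Usage: write_scan_freq_conf /tmp/wpa-test.conf "MyNetwork"
```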

Clnanderson (clnanderson) wrote :

Also:

Comment #3 From Bob Copeland 2009-04-02 11:29:01 -------

This patch *may* help, by reducing the total number of channels:

http://marc.info/?l=linux-wireless&m=123841474910111&w=2

Changed in linux (Ubuntu):
status: New → Incomplete
status: Incomplete → New
John McPherson (jrm+launchpadbugs) wrote :

'lspci' shows that my wireless hardware (on an Acer Aspire 5737z) is:

11:00.0 Network controller: Atheros Communications Inc. AR928X Wireless Network Adapter (PCI-Express) (rev 01)

When I use the ath9k module installed by the linux-backports-modules-2.6.31-19-generic package, the wireless connectivity "freezes" for a few seconds every time wpa_supplicant logs CTRL-EVENT-SCAN-RESULTS.

When the linux-backports-modules-2.6.31-19-generic package is purged and I use the modules provided by linux-image-2.6.31-19-generic, it is better but not 100%. It no longer freezes every 2 minutes, but it still happens regularly (several times an hour).

Hope this helps someone :)

Jeremy Foshee (jeremyfoshee) wrote :

Hi Neil,

This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux 460886

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-kernel-logs
tags: added: needs-upstream-testing
tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
James Walker (jamesjoseph-walker) wrote :

The best description of this (and a way to actually work around it) can be found here:
https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/373680

Brad Figg (brad-figg) wrote : Unsupported series, setting status to "Won't Fix".

This bug was filed against a series that is no longer supported and so is being marked as Won't Fix. If this issue still exists in a supported series, please file a new bug.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: Incomplete → Won't Fix