iwlagn (i4965AGN) continually drops and reconnects to WEP-protected access point

Bug #544254 reported by Christian Reis
This bug affects 6 people
Affects: linux (Ubuntu)
Status: Won't Fix
Importance: Undecided
Assigned to: Stefan Bader

Bug Description

I use a WEP-protected AP that is about 1 meter from my Thinkpad X61. Since upgrading to Lucid, I'm seeing this litter my syslog exactly every 2 minutes:

Mar 22 12:36:24 baratinha kernel: [ 5125.440104] No probe response from AP 00:40:10:10:00:03 after 500ms, disconnecting.
Mar 22 12:36:24 baratinha wpa_supplicant[1055]: CTRL-EVENT-DISCONNECTED - Disconnect event - remove keys
Mar 22 12:36:24 baratinha NetworkManager: <info> (wlan0): supplicant connection state: completed -> disconnected
Mar 22 12:36:24 baratinha NetworkManager: <info> (wlan0): supplicant connection state: disconnected -> scanning
Mar 22 12:36:25 baratinha wpa_supplicant[1055]: Trying to associate with 00:40:10:10:00:03 (SSID='HARIBO' freq=2412 MHz)
Mar 22 12:36:25 baratinha NetworkManager: <info> (wlan0): supplicant connection state: scanning -> associating
Mar 22 12:36:25 baratinha kernel: [ 5126.757827] wlan0: direct probe to AP 00:40:10:10:00:03 (try 1)
Mar 22 12:36:25 baratinha kernel: [ 5126.760130] wlan0: direct probe responded
Mar 22 12:36:25 baratinha kernel: [ 5126.760139] wlan0: authenticate with AP 00:40:10:10:00:03 (try 1)
Mar 22 12:36:25 baratinha kernel: [ 5126.762098] wlan0: authenticated
Mar 22 12:36:25 baratinha kernel: [ 5126.762133] wlan0: associate with AP 00:40:10:10:00:03 (try 1)
Mar 22 12:36:25 baratinha kernel: [ 5126.764390] wlan0: RX AssocResp from 00:40:10:10:00:03 (capab=0x431 status=0 aid=1)
Mar 22 12:36:25 baratinha kernel: [ 5126.764397] wlan0: associated
Mar 22 12:36:25 baratinha wpa_supplicant[1055]: Associated with 00:40:10:10:00:03
Mar 22 12:36:25 baratinha wpa_supplicant[1055]: CTRL-EVENT-CONNECTED - Connection to 00:40:10:10:00:03 completed (reauth) [id=0 id_str=]
Mar 22 12:36:25 baratinha NetworkManager: <info> (wlan0): supplicant connection state: associating -> associated
Mar 22 12:36:25 baratinha NetworkManager: <info> (wlan0): supplicant connection state: associated -> completed

This very quick drop in connection isn't reflected in the NM applet, and it doesn't cause IRC or web connections to fail; I just feel it in that SSH very slightly lags while I am typing.
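
To check the "exactly every 2 minutes" observation against a longer capture, the gaps between disconnects can be computed from the kernel timestamps. This is only a rough sketch: the function name disconnect_intervals is made up for illustration, and it assumes each syslog entry sits on a single line in the format shown above.

```python
import re
from typing import Iterable, List

# Matches the kernel uptime stamp on the "No probe response" line.
_PROBE_RE = re.compile(
    r"kernel: \[\s*(\d+\.\d+)\] No probe response from AP"
)


def disconnect_intervals(lines: Iterable[str]) -> List[float]:
    """Return the seconds elapsed between consecutive disconnect messages."""
    stamps = []
    for line in lines:
        match = _PROBE_RE.search(line)
        if match:
            stamps.append(float(match.group(1)))
    # Pair each stamp with the next one and take the difference.
    return [round(later - earlier, 3) for earlier, later in zip(stamps, stamps[1:])]
```

Fed the lines of a syslog file, this should return values close to 120 if the two-minute cadence holds.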

ProblemType: Bug
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: AD198x Analog [AD198x Analog]
   Subdevices: 2/2
   Subdevice #0: subdevice #0
   Subdevice #1: subdevice #1
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: kiko 1432 F.... pulseaudio
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xf8220000 irq 17'
   Mixer name : 'Analog Devices AD1984'
   Components : 'HDA:11d41984,17aa20d6,00100400'
   Controls : 30
   Simple ctrls : 19
Date: Mon Mar 22 12:36:11 2010
DistroRelease: Ubuntu 10.04
HibernationDevice: RESUME=UUID=b827da69-68f9-4501-81e3-6a2b02b8627a
MachineType: LENOVO 7675CTO
Package: linux-image-2.6.32-16-generic 2.6.32-16.25
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcCmdLine: root=UUID=96cb5909-59cf-4f56-8e85-f5bbf86c2549 ro quiet splash
ProcEnviron:
 LANG=en_US.utf8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-16.25-generic
Regression: Yes
RelatedPackageVersions: linux-firmware 1.33
Reproducible: Yes
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
TestedUpstream: No
Uname: Linux 2.6.32-16-generic i686
WpaSupplicantLog:

dmi.bios.date: 07/02/2007
dmi.bios.vendor: LENOVO
dmi.bios.version: 7NET25WW (1.06 )
dmi.board.name: 7675CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr7NET25WW(1.06):bd07/02/2007:svnLENOVO:pn7675CTO:pvrThinkPadX61:rvnLENOVO:rn7675CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 7675CTO
dmi.product.version: ThinkPad X61
dmi.sys.vendor: LENOVO

Revision history for this message
Christian Reis (kiko) wrote :

I've tried linux-backports-modules-wireless-2.6.32-16-generic and that hasn't made a difference.

Revision history for this message
Michael Lustfield (michaellustfield) wrote :

In dmesg, I see this after the connection drops. Prior to that I see the connection information. At the point of the drop, this is the only additional line:
[ 312.910959] iwlagn 0000:06:00.0: iwl_tx_agg_start on ra = 00:23:eb:61:d7:ce tid = 0

This is in 10.04.

Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Michael Lustfield (michaellustfield) wrote : apport information

AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
AplayDevices:
 **** List of PLAYBACK Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: STAC92xx Analog [STAC92xx Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: STAC92xx Analog [STAC92xx Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/dsp', '/dev/snd/by-path', '/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/hwC0D1', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/seq', '/dev/snd/timer', '/dev/sequencer2', '/dev/sequencer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xfc400000 irq 22'
   Mixer name : 'SigmaTel STAC9872AK'
   Components : 'HDA:83847662,104d2300,00100201 HDA:14f12c06,104d1700,00100000'
   Controls : 12
   Simple ctrls : 8
DistroRelease: Ubuntu 10.04
HibernationDevice: RESUME=UUID=464a28f0-0f11-47c0-a39f-56c9fa233abe
MachineType: Sony Corporation VGN-FZ240E
Package: linux (not installed)
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-17-generic root=UUID=1c977f6d-fc01-408b-ade6-2028df6a8885 ro quiet
ProcEnviron:
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-17.26-generic 2.6.32.10+drm33.1
Regression: Yes
RelatedPackageVersions: linux-firmware 1.33
Reproducible: Yes
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: yes
Tags: lucid regression-potential
TestedUpstream: Yes
Uname: Linux 2.6.32-17-generic x86_64
UserGroups:

dmi.bios.date: 07/04/2007
dmi.bios.vendor: Phoenix Technologies LTD
dmi.bios.version: R1120J7
dmi.board.asset.tag: N/A
dmi.board.name: VAIO
dmi.board.vendor: Sony Corporation
dmi.board.version: N/A
dmi.chassis.asset.tag: N/A
dmi.chassis.type: 10
dmi.chassis.vendor: Sony Corporation
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnPhoenixTechnologiesLTD:bvrR1120J7:bd07/04/2007:svnSonyCorporation:pnVGN-FZ240E:pvrC6001VST:rvnSonyCorporation:rnVAIO:rvrN/A:cvnSonyCorporation:ct10:cvrN/A:
dmi.product.name: VGN-FZ240E
dmi.product.version: C6001VST
dmi.sys.vendor: Sony Corporation

tags: added: apport-collected
Revision history for this message
Michael Lustfield (michaellustfield) wrote : AlsaDevices.txt

apport information

Revision history for this message
Michael Lustfield (michaellustfield) wrote : BootDmesg.txt

apport information

Revision history for this message
Michael Lustfield (michaellustfield) wrote : Card0.Amixer.values.txt

apport information

Revision history for this message
Michael Lustfield (michaellustfield) wrote : Card0.Codecs.codec.0.txt

apport information

Revision history for this message
Michael Lustfield (michaellustfield) wrote : Card0.Codecs.codec.1.txt

apport information

Revision history for this message
Michael Lustfield (michaellustfield) wrote : CurrentDmesg.txt

apport information

Revision history for this message
Michael Lustfield (michaellustfield) wrote : IwConfig.txt

apport information

Revision history for this message
Michael Lustfield (michaellustfield) wrote : Lspci.txt

apport information

Revision history for this message
Michael Lustfield (michaellustfield) wrote : Lsusb.txt

apport information

Revision history for this message
Michael Lustfield (michaellustfield) wrote : PciMultimedia.txt

apport information

Revision history for this message
Michael Lustfield (michaellustfield) wrote : ProcCpuinfo.txt

apport information

Revision history for this message
Michael Lustfield (michaellustfield) wrote : ProcInterrupts.txt

apport information

Revision history for this message
Michael Lustfield (michaellustfield) wrote : ProcModules.txt

apport information

Revision history for this message
Michael Lustfield (michaellustfield) wrote : UdevDb.txt

apport information

Revision history for this message
Michael Lustfield (michaellustfield) wrote : UdevLog.txt

apport information

Revision history for this message
Michael Lustfield (michaellustfield) wrote : WifiSyslog.txt

apport information

tags: added: amd64
Revision history for this message
Michael Lustfield (michaellustfield) wrote : Re: iwlagn (i4965AGN) continually drops and reconnects to access point

Sorry for the flood of junk; it just looked like my report had more to add.

I also wanted to mention that over a wired connection things seem to work splendidly. From searching online, this issue seems to be somewhat common; is there any way to give it a Medium priority?

Revision history for this message
Christian Reis (kiko) wrote :

I queried the default parameters registered for 2.6.32-16:

/sys/module/iwlagn/parameters/fw_restart4965:1
/sys/module/iwlagn/parameters/amsdu_size_8K:1
/sys/module/iwlagn/parameters/11n_disable:0
/sys/module/iwlagn/parameters/queues_num:0
/sys/module/iwlagn/parameters/disable_hw_scan:0
/sys/module/iwlagn/parameters/swcrypto:0
/sys/module/iwlagn/parameters/antenna:0
/sys/module/iwlagn/parameters/fw_restart50:1
/sys/module/iwlagn/parameters/amsdu_size_8K50:1
/sys/module/iwlagn/parameters/11n_disable50:0
/sys/module/iwlagn/parameters/queues_num50:0
/sys/module/iwlagn/parameters/swcrypto50:N
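
A listing like the one above can be gathered by reading every file under the module's parameters directory. The following is a minimal sketch, not the command that was actually used here; read_module_params and its sysfs_root argument are hypothetical names, the latter added only so the function can be pointed at a test directory.

```python
from pathlib import Path
from typing import Dict


def read_module_params(module: str, sysfs_root: str = "/sys/module") -> Dict[str, str]:
    """Map each parameter name of a loaded module to its current value."""
    params_dir = Path(sysfs_root) / module / "parameters"
    if not params_dir.is_dir():
        # Module not loaded, or it exposes no parameters.
        return {}
    values = {}
    for entry in sorted(params_dir.iterdir()):
        try:
            values[entry.name] = entry.read_text().strip()
        except OSError:
            # Some parameters may not be readable by the current user.
            continue
    return values
```

Calling read_module_params("iwlagn") before and after a kernel change makes it easy to compare the defaults.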

Revision history for this message
Christian Reis (kiko) wrote :

Tried now with 2.6.32-17 and have the same symptom.

Revision history for this message
Michael Lustfield (michaellustfield) wrote :

With 2.6.32-16 and the wireless backports, the wireless seemed to work
fine. Building from the git repo for Linus' 2.6 branch, 2.6.32 works great;
with 2.6.33(.*) and 2.6.34(.*) I see the issue.

Actually... in 2.6.34(.*) my wireless interface was detected as an
ethernet device.

I'm not smart enough to figure out what exactly is going on, but I can
at least tell it's a regression from after 2.6.32.

Revision history for this message
Stefan Bader (smb) wrote :

The first statement is a bit confusing to me. Whenever linux-backports-modules-wireless is installed in Lucid, the wireless drivers from it (currently at around the 2.6.33 level) are used, regardless of which kernel is installed. So I would expect that with l-b-m installed, things behave the same as with 2.6.33.

Christian's regression seems to be between 2.6.31.x (probably .9, as stable .12 only now went into proposed) and 2.6.32.9 (Ubuntu-2.6.32-16.25). Michael's regression I can't place, for the reasons above.

Maybe two things to try:

1. as root: echo "options iwlagn disable_hw_scan=1" >/etc/modprobe.d/iwlagn.conf
Then reboot and see whether this makes things better or worse.

2. Try to narrow the point of regression: http://kernel.ubuntu.com/~kernel-ppa/mainline/ contains prebuilt mainline kernels.
The v2.6.32-rc? builds come after 2.6.31 but before 2.6.32, and the 2.6.33-rc? builds come before 2.6.33 (l-b-m-w should have no impact here).
Unfortunately this will still be slightly coarse, as rc1 and rc2 usually have the biggest changes to drivers.
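
Before rebooting for the first experiment, it can be worth confirming that the file really contains the intended option. Here is a small sketch that reads "options" lines back out of a modprobe.d-style file; parse_modprobe_options is a hypothetical helper, and it only handles the simple "options <module> key=value ..." form used above.

```python
from typing import Dict


def parse_modprobe_options(conf_text: str, module: str) -> Dict[str, str]:
    """Collect key=value options set for one module in modprobe.d text."""
    options = {}
    for line in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        parts = line.split("#", 1)[0].split()
        if len(parts) >= 2 and parts[0] == "options" and parts[1] == module:
            for token in parts[2:]:
                key, _, value = token.partition("=")
                options[key] = value
    return options
```

For the file written in step 1, parse_modprobe_options(text, "iwlagn") should report disable_hw_scan set to "1".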

Revision history for this message
Christian Reis (kiko) wrote :

I've tried disabling the hardware scan, with no luck. Will try now pulling other kernels.

Revision history for this message
Christian Reis (kiko) wrote :

So with 2.6.31-02063109-generic I can't reproduce the problem (but unfortunately I'm getting a ton of ata1 errors with that).

With 2.6.33 the problem is still there.

I'll try looking through some 2.6.32-rc versions to see where the problem starts. I'll also enable the hardware scan again, since that seemed to be a non-issue.

Revision history for this message
Christian Reis (kiko) wrote :

2.6.32-020632rc5-generic doesn't work.

Revision history for this message
Christian Reis (kiko) wrote :

2.6.32-020632rc3-generic doesn't work.

Revision history for this message
Christian Reis (kiko) wrote :

2.6.32-020632rc1-generic doesn't work.

Revision history for this message
Christian Reis (kiko) wrote :

(The ATA issue mentioned above was reported by Chris Coulson as bug 539467 fwiw)

Revision history for this message
Christian Reis (kiko) wrote :

I can confirm that 2.6.31-02063112-generic works -- and I also don't see a sign of bug 539467 in it either. I hope that helps narrow the problem down somewhat -- in summary, 2.6.31.9 and 2.6.31.12 work; none of the 2.6.32-rc versions work.
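
The test results so far can be put in order mechanically to show the window the regression falls into. This is an illustrative sketch only: version_key and regression_window are made-up names, and the parser assumes plain mainline version strings such as "2.6.31.12" or "2.6.32-rc1".

```python
import re
from typing import Dict, Optional, Tuple

_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:\.(\d+)|-rc(\d+))?$")


def version_key(version: str) -> Tuple[int, int, int, int, int]:
    """Sort key that places -rcN builds before the matching release."""
    match = _VERSION_RE.match(version)
    if not match:
        raise ValueError(f"unrecognised version: {version}")
    major, minor, patch, stable, rc = match.groups()
    if rc is not None:
        return (int(major), int(minor), int(patch), 0, int(rc))
    return (int(major), int(minor), int(patch), 1, int(stable or 0))


def regression_window(results: Dict[str, bool]) -> Tuple[Optional[str], Optional[str]]:
    """Return (last good version, first bad version) in release order."""
    last_good = None
    for version in sorted(results, key=version_key):
        if results[version]:
            last_good = version
        else:
            return last_good, version
    return last_good, None


# Results reported in this thread (True = works, False = drops).
tested = {
    "2.6.31.9": True,
    "2.6.31.12": True,
    "2.6.32-rc1": False,
    "2.6.32-rc3": False,
    "2.6.32-rc5": False,
    "2.6.33": False,
}
print(regression_window(tested))
```

With the data above, the window is between 2.6.31.12 and 2.6.32-rc1, which matches the summary.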

Stefan Bader (smb)
Changed in linux (Ubuntu):
assignee: nobody → Stefan Bader (stefan-bader-canonical)
Revision history for this message
Stefan Bader (smb) wrote :

Unfortunately (as nearly expected) the time frame between release and the first rc of the next kernel version has the majority of changes. It looks a bit like the problem is spread between generic code and the driver itself.
The message and the frequency at which it occurs suggest that the code (at least) thinks the link is idle. After 30s of idle time it will check whether the AP is still there, then wait for half a second and, if there is no response, assume the connection dropped.
So one experiment would be to have a ping running in the background and see whether this stops the disconnects. That would at least prove that normal traffic is detected as activity.
For hopefully more insight, I compiled a kernel with debugging turned on for the iwlagn driver. The packages are at http://people.canonical.com/~smb/lp544254
There seem to be two debug facilities. One can be turned on with "echo 0xb80a >/sys/class/net/wlan0/device/debug_level" (as root); I would expect its output to go into dmesg. The other facility adds directories under /sys/kernel/debug. The definitions are a bit cryptic, so I am not sure about the exact naming: either net/wlan0 or iwlagn or whatever. There should be a subdir called debug, and in it rx_statistics and tx_statistics as well as rx_queue and tx_queue. Maybe the output of those helps too. But I would first try to see what dmesg shows with the debug_level set.
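
The idle check described above, and why a background ping should prevent it from firing, can be pictured with a toy model. This is not the kernel's code, just a simplified sketch under the stated assumptions (30s idle timeout, 500ms probe wait); first_disconnect and probe_answered are hypothetical names.

```python
from typing import Callable, Iterable, Optional

IDLE_TIMEOUT_S = 30.0  # idle time before the AP is probed (assumed from the comment)
PROBE_WAIT_S = 0.5     # how long to wait for a probe response ("after 500ms")


def first_disconnect(
    activity_times: Iterable[float],
    probe_answered: Callable[[float], bool],
    horizon: float,
) -> Optional[float]:
    """Return when the toy link monitor gives up, or None if it never does.

    activity_times: times (seconds) at which traffic is seen on the link.
    probe_answered: True if the AP answers a probe sent at the given time.
    horizon:        stop once the next probe would be at or after this time.
    """
    pending = sorted(activity_times)
    last_activity = 0.0
    i = 0
    while True:
        probe_at = last_activity + IDLE_TIMEOUT_S
        if i < len(pending) and pending[i] < probe_at:
            # Traffic arrived before the idle timer expired: restart the timer.
            last_activity = max(last_activity, pending[i])
            i += 1
            continue
        if probe_at >= horizon:
            return None
        if probe_answered(probe_at):
            # A probe response counts as activity too.
            last_activity = probe_at
            continue
        return probe_at + PROBE_WAIT_S
```

In this model, one packet per second (a ping) keeps the idle timer from ever expiring, so a disconnect under ping would point at traffic not being counted as activity.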

Revision history for this message
Christian Reis (kiko) wrote :

Hey Stefan, thanks for being so cool about helping out here. So I've installed the kernel and tried out the debug options; it's spit out a ton of stuff which I'm attaching here. I don't really see anything suspicious there, but I'm sure you understand the inner workings of the driver better. Thanks!

Revision history for this message
michael barany (michael-barany) wrote :

I'm not 100% sure whether this is the bug affecting me. I won't be in much of a position to test in the near future, but thought I'd add my symptoms in case they help. I'm running an up-to-date beta of the Lucid netbook edition, and network-manager has no problems whatsoever connecting to unsecured networks and WPA-Personal networks. My university's encrypted network uses WPA-Enterprise with PEAP and MSCHAPv2. I am sometimes able to connect, but am dropped within a matter of minutes with syslog messages like the following (after a successful connection):
> Apr 5 14:43:49 michael-eee kernel: [ 103.161179] ===>rt_ioctl_giwscan.
> 23(23) BSS returned, data->length = 3477
> Apr 5 14:44:49 michael-eee kernel: [ 163.163495] ===>rt_ioctl_giwscan.
> 24(24) BSS returned, data->length = 3656
> Apr 5 14:46:09 michael-eee kernel: [ 243.159635] ===>rt_ioctl_giwscan.
> 21(21) BSS returned, data->length = 3199
> Apr 5 14:47:02 michael-eee kernel: [ 296.986171] ERROR!!!
> RTMPCancelTimer failed, Timer hasn't been initialize!
> Apr 5 14:47:02 michael-eee wpa_supplicant[812]: CTRL-EVENT-DISCONNECTED
> - Disconnect event - remove keys
> Apr 5 14:47:02 michael-eee wpa_supplicant[812]: CTRL-EVENT-DISCONNECTED
> - Disconnect event - remove keys
> Apr 5 14:47:02 michael-eee NetworkManager: <info> (wlan0): supplicant
> connection state: completed -> disconnected
> Apr 5 14:47:02 michael-eee NetworkManager: <info> (wlan0): supplicant
> connection state: disconnected -> scanning
It can then no longer connect, even on repeated attempts. University tech support can't figure out whether anything is wrong from their end, but I'll post anything they send me if relevant.

Revision history for this message
Christian Reis (kiko) wrote :

I'm currently running 2.6.32-19 and, at least when associating to a WPA1 AP, it works fine. I won't be able to test with my old WEP router in the next week or two.

Revision history for this message
Michael Lustfield (michaellustfield) wrote : Re: [Bug 544254] Re: iwlagn (i4965AGN) continually drops and reconnects to access point

I haven't noticed this happening with the latest generic kernel either.

On Sun, 11 Apr 2010 02:30:23 -0000
Christian Reis <email address hidden> wrote:

> I'm currently running 2.6.32-19 and, at least when associating to a WPA1
> AP, it works fine. I won't be able to test with my old WEP router in the
> next week or two.
>

--
Michael Lustfield
Kalliki Software

Network and Systems Administrator

Revision history for this message
Christian Reis (kiko) wrote : Re: iwlagn (i4965AGN) continually drops and reconnects to access point

I'm marking this as fixed to get it off the radar; when I'm back home I'll be able to verify versus WEP.

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Revision history for this message
Christian Reis (kiko) wrote : Re: [Bug 544254] Re: iwlagn (i4965AGN) continually drops and reconnects to access point

I am back in the office and guess what? The problem is back again. So I
can confirm that this only manifests itself when using WEP; the actual
iwlist output looks like this:

wlan0 Scan completed :
          Cell 01 - Address: 00:40:10:10:00:03
                    Channel:1
                    Frequency:2.412 GHz (Channel 1)
                    Quality=70/70 Signal level=-36 dBm
                    Encryption key:on
                    ESSID:"HARIBO"
                    Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 11 Mb/s; 18 Mb/s
                              24 Mb/s; 36 Mb/s; 54 Mb/s
                    Bit Rates:6 Mb/s; 9 Mb/s; 12 Mb/s; 48 Mb/s
                    Mode:Master
                    Extra:tsf=000000001ea6e5b4
                    Extra: Last beacon: 13112ms ago
                    IE: Unknown: 000648415249424F
                    IE: Unknown: 010882848B962430486C
                    IE: Unknown: 030101
                    IE: Unknown: 2A0106
                    IE: Unknown: 2F0106
                    IE: Unknown: 32040C121860
                    IE: Unknown: DD090010180202F0000000

It's still present in the -21 kernel that went into final. I'm rebooting now
into an older version that I know works. I'm happy to try new kernels out of
course, so just point them to me.
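
The "IE: Unknown" lines in the scan output are raw 802.11 information elements: one byte of element ID, one byte of length, then the payload. As a rough sketch (parse_ie and rates_mbps are illustrative names, and only the SSID and rate elements are interpreted), they can be decoded like this:

```python
from typing import List, Tuple


def parse_ie(hex_string: str) -> Tuple[int, bytes]:
    """Split one information element into (element ID, payload)."""
    raw = bytes.fromhex(hex_string)
    if len(raw) < 2:
        raise ValueError("element too short")
    element_id, length = raw[0], raw[1]
    payload = raw[2:2 + length]
    if len(payload) != length:
        raise ValueError("payload shorter than declared length")
    return element_id, payload


def rates_mbps(payload: bytes) -> List[float]:
    """Decode a (Extended) Supported Rates payload.

    Each byte is a rate in units of 500 kb/s; the top bit only flags
    the rate as a basic rate and is masked off here.
    """
    return [(byte & 0x7F) * 0.5 for byte in payload]


ssid_id, ssid = parse_ie("000648415249424F")       # element ID 0: SSID
rates_id, rates = parse_ie("010882848B962430486C")  # element ID 1: Supported Rates
print(ssid.decode("ascii"), rates_mbps(rates))
```

The decoded SSID and rates line up with the ESSID and the first Bit Rates line in the iwlist output; element 50 (hex 32) carries the remaining rates in the same encoding.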

Revision history for this message
Michael Lustfield (michaellustfield) wrote : Re: [Bug 544254] Re: iwlagn (i4965AGN) continually drops and reconnects to access point

I don't have any WEP APs available anymore, so I'm unable to keep testing this.
Sorry.

--
Michael Lustfield
Kalliki Software, LLC

Network and Systems Administrator

Christian Reis (kiko)
summary: - iwlagn (i4965AGN) continually drops and reconnects to access point
+ iwlagn (i4965AGN) continually drops and reconnects to WEP-protected
+ access point
Changed in linux (Ubuntu):
status: Fix Released → Confirmed
Revision history for this message
Michael Lustfield (michaellustfield) wrote :

kiko: You still fighting this?

tags: removed: regression-potential
Revision history for this message
penalvch (penalvch) wrote :

Christian Reis, thank you for reporting this and helping make Ubuntu better. This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command from a Terminal (Applications->Accessories->Terminal)? It will automatically gather and attach updated debug information to this report.

apport-collect -p linux <replace-with-bug-number>

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Closing this bug with Won't fix as this kernel / release is no longer supported.
Please feel free to open a new bug report if you're still experiencing this on a newer release (Bionic 18.04.3 / Disco 19.04)
Thanks!

Changed in linux (Ubuntu):
status: Incomplete → Won't Fix