rt3090: freeze on module rt2800pci unload

Bug #662288 reported by Chris
This bug affects 23 people
Affects          Status         Importance   Assigned to   Milestone
Linux            Fix Released   Undecided    Unassigned
linux (Ubuntu)   Fix Released   Undecided    Unassigned
  Natty          Fix Released   Undecided    Unassigned

Bug Description

[Status 2011-04-12]
Fixed in linux mainline (since 2.6.39) with commit 7f6e144fb99a4a70d3c5ad5f074204c5b89a6f65 "rt2x00: Fix radio off hang issue for PCIE interface" (see comment 31).
Not (yet) fixed in kernels 2.6.38 and older (natty, maverick).
Submitted to stable on 2011-04-05 by Stanislaw Gruszka.
Committed to stable-queue: http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=tree;f=queue-2.6.38;hb=HEAD (so it should be in 2.6.38.3).

This commit should be considered for inclusion in natty.
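
A minimal sketch of how one might check whether a given kernel should already carry the fix, assuming the status note above is right that it landed in 2.6.38.3 and 2.6.39. The function names are hypothetical, the comparison relies on GNU `sort -V`, and reading /proc/version_signature assumes an Ubuntu kernel.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical. Relies on GNU sort -V.

# Succeed if dotted version $1 is greater than or equal to $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeed if upstream kernel version $1 should contain the radio off fix
# (assumed threshold: 2.6.38.3, taken from the status note above).
has_radio_off_fix() {
    version_ge "$1" 2.6.38.3
}

# On Ubuntu the upstream version is the third field of
# /proc/version_signature, e.g. "Ubuntu 2.6.35-22.33-generic 2.6.35.4".
upstream_version() {
    awk '{ print $3 }' "${1:-/proc/version_signature}"
}

# Usage:
#   has_radio_off_fix "$(upstream_version)" && echo "fix expected"
```

Note that `uname -r` on Ubuntu reports the package ABI version (for example 2.6.35-22-generic), not the upstream stable version, which is why the sketch reads the version signature instead.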

[Original Report]
When I shut down the computer it does not switch off and I have to hold down the power button to power it off completely.

Following the advice of Wolfgang Kufner I have submitted this new bug (see comment #25 on this bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/594866).

When I blacklist the modules listed in post #14 of the above bug my PC shuts down properly but I cannot connect to a WPA2 wireless router.

If I un-blacklist rt2800pci (keeping the other modules blacklisted) I can again connect to the WPA2 wireless router but I can no longer shut down.

I have tried several different combinations of blacklisting the modules in post #14 but none of the combinations I tried seemed to work.

My PC is an MSI CR700.

ProblemType: Bug
DistroRelease: Ubuntu 10.10
Package: linux-image-2.6.35-22-generic 2.6.35-22.33
Regression: Yes
Reproducible: No
ProcVersionSignature: Ubuntu 2.6.35-22.33-generic 2.6.35.4
Uname: Linux 2.6.35-22-generic i686
NonfreeKernelModules: nvidia
AcpiTables:
 Error: command ['gksu', '-D', 'Apport', '--', '/usr/share/apport/dump_acpi_tables.py'] failed with exit code 1:
 (gksu:9846): Gdk-CRITICAL **: IA__gdk_draw_pixbuf: assertion `GDK_IS_DRAWABLE (drawable)' failed
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.23.
Architecture: i386
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: marjorie 1485 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'NVidia'/'HDA NVidia at 0xfae78000 irq 21'
   Mixer name : 'Nvidia MCP79/7A HDMI'
   Components : 'HDA:10ec0888,14621019,00100202 HDA:10de0007,10de0101,00100100'
   Controls : 32
   Simple ctrls : 17
Date: Sun Oct 17 18:34:46 2010
Frequency: Once a day.
HibernationDevice: RESUME=UUID=cfb75d32-2f8e-4d89-8c5e-02f828455d9f
InstallationMedia: Ubuntu 10.04 LTS "Lucid Lynx" - Release i386 (20100429)
Lsusb:
 Bus 002 Device 002: ID 0461:4d08 Primax Electronics, Ltd
 Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 001 Device 003: ID 1307:0165 Transcend Information, Inc. 2GB/4GB Flash Drive
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Micro-Star International CR700
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.35-22-generic root=UUID=c00e545b-2e51-41b6-88d9-6e25f57fcf47 ro quiet splash
ProcEnviron:
 LANG=en_GB.UTF-8
 SHELL=/bin/bash
RelatedPackageVersions: linux-firmware 1.38
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
StagingDrivers: rt2860sta
Title: [STAGING]
dmi.bios.date: 01/20/2010
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: A1734NMS V3.1A
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: MS-1734
dmi.board.vendor: MSI
dmi.board.version: Ver 1.000
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 10
dmi.chassis.vendor: To Be Filled By O.E.M.
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrA1734NMSV3.1A:bd01/20/2010:svnMicro-StarInternational:pnCR700:pvrVer1.000:rvnMSI:rnMS-1734:rvrVer1.000:cvnToBeFilledByO.E.M.:ct10:cvrToBeFilledByO.E.M.:
dmi.product.name: CR700
dmi.product.version: Ver 1.000
dmi.sys.vendor: Micro-Star International

Chris (9-launchpad2-cmharper-com-deactivatedaccount) wrote :
mCoRN (metall-corn) wrote :

I have the same problem on a Lenovo B450 laptop with rt3090.

Wolfgang Kufner (wolfgangkufner) wrote :

Congratulations, Chris. You made a perfectly nice bug report, full of potentially important automatically gathered information :-) .

As a workaround you might try to unload the offending module by hand before shutdown:
1. Open a Terminal.
2. Type: sudo modprobe -rv rt2800pci
3. If that succeeds, shut down now. It might, however, complain that the module is in use (and can therefore not be unloaded). In this case right click the Network Manager icon in the menu bar and deselect "Enable Wireless". Then try sudo modprobe -rv rt2800pci again.
For one thing if this works (or not) this will give us more info about the bug and it might also be safer than a hard shutdown.
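
The manual unload described above could be wrapped in a small helper. This is only a sketch under assumptions: the function names are hypothetical, and the module table is read from /proc/modules (overridable through MODULES_FILE so the parsing can be tried on a saved copy).

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any Ubuntu tool.
MODULES_FILE="${MODULES_FILE:-/proc/modules}"

# Print the use count of module $1, or return 1 if it is not loaded.
# /proc/modules lines look like: name size usecount users state address
module_use_count() {
    awk -v m="$1" '$1 == m { print $3; found = 1 } END { exit !found }' "$MODULES_FILE"
}

# Attempt the unload; refuse while the module still reports users.
unload_before_shutdown() {
    count=$(module_use_count "$1") || { echo "$1 is not loaded"; return 0; }
    if [ "$count" -gt 0 ]; then
        echo "$1 is in use ($count users); disable wireless first" >&2
        return 1
    fi
    sudo modprobe -rv "$1"
}

# Usage (may freeze an affected machine, so sync the disk first):
#   sync && unload_before_shutdown rt2800pci
```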

summary: - Computer does not shut down properly
+ If rt2800pci is loaded computer does not shut down properly
Chris (9-launchpad2-cmharper-com-deactivatedaccount) wrote : Re: If rt2800pci is loaded computer does not shut down properly

I have completed what you asked for me to try next and here are the results:

Typed sudo modprobe -rv rt2800pci in Terminal : the computer froze immediately after typing in my password. I usually have the seconds displayed on my panel clock and I could see that they had stopped moving. The mouse was also unresponsive. After 10 minutes I gave up waiting and held down the power switch to turn the laptop off.

Deselected "Enable Wireless" in Network Manager : as soon as I selected this the computer again froze immediately, I didn't even get to the "sudo modprobe -rv rt2800pci" step.

I hope this helps a little.

summary: - If rt2800pci is loaded computer does not shut down properly
+ rt2800pci freeze on module unload [maverick i386]
Wolfgang Kufner (wolfgangkufner) wrote : Re: rt2800pci freeze on module unload [maverick i386]

We are certainly homing in on the source of that problem.

Were you connected to a router when trying to disable wireless in Network Manager? If so, could you try to "Disconnect" (in Network Manager) before you disable? It might make a difference. Not sure.

Also, try a safer way to shut down when frozen (Magic SysRq key+REISUO):
Press and hold AltGr
Press and hold Print/SysRq
now press R and then wait two seconds
press E and wait
press I and wait
press S (watch your harddisk LED, if it goes on wait till it goes off again)
press U and wait
press O (it should power down now)
release AltGr+SysRq

Each of the above letters should start another stage in a controlled shutdown process. The waiting is just to give those processes a bit of time to run. (Documentation is at: http://www.kernel.org/doc/Documentation/sysrq.txt . Not that you have to read that.)
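
Whether the individual SysRq keys do anything depends on the value in /proc/sys/kernel/sysrq. A minimal sketch for checking that value follows; the helper name `sysrq_allows` and the variable names are hypothetical, while the bit values are the ones listed in the kernel's sysrq.txt.

```shell
#!/bin/sh
# Sketch only: sysrq_allows and the SYSRQ_* names are hypothetical.
# /proc/sys/kernel/sysrq holds 0 (disabled), 1 (everything enabled),
# or a bitmask of the allowed function groups.
SYSRQ_KEYBOARD=4    # R: take keyboard out of raw mode
SYSRQ_SYNC=16       # S: sync disks
SYSRQ_REMOUNT=32    # U: remount read-only
SYSRQ_SIGNAL=64     # E, I: send signals to processes
SYSRQ_REBOOT=128    # O: power off

# Succeed if sysrq setting $1 permits the function group with bit value $2.
sysrq_allows() {
    [ "$1" -eq 1 ] || [ $(( $1 & $2 )) -ne 0 ]
}

# Usage:
#   setting=$(cat /proc/sys/kernel/sysrq)
#   sysrq_allows "$setting" "$SYSRQ_SYNC" || echo "SysRq+S is disabled"
```

If a step is reported as disabled, the corresponding key press in the sequence above will be ignored even on a healthy system.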

Chris (9-launchpad2-cmharper-com-deactivatedaccount) wrote :

Thanks again for your help.

I *was* connected to the router when I tried to disable wireless in Network Manager as you guessed.

Following your suggestion I disconnected from the router - it reported I had lost the connection with my router in the pop-up box that appears in the top right of the screen - and then I deselected "Enable Wireless". My computer then froze again (just like before).

I tried to do the safer way to shut down as you mentioned but nothing seemed to work (probably the keyboard had stopped working as well).

Wolfgang Kufner (wolfgangkufner) wrote :

Could you try with the current mainline kernel:
Download the two *i386.deb and the *all.deb from http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/current/
Install all 3 by double clicking

reboot (it might be a good idea to do an altgr+sysRq+S to force the disk to sync before you do something expected to freeze the computer)

boot into the newly installed kernel. You might have to hold the shift key down after switching on to get to select the kernel in grub.

Chris (9-launchpad2-cmharper-com-deactivatedaccount) wrote :

I have downloaded, installed and rebooted with the kernels you suggested but it has seemingly made little difference. When I shut down my laptop it still freezes.

The only thing I did notice is that Network Manager now takes a little longer to connect to my network (it has usually connected by the time the desktop shows and is ready to use but with the new kernel it seems to take 20-30s longer to connect).

Wolfgang Kufner (wolfgangkufner) wrote :

So it's not fixed in mainline, either.

Try this as a workaround (booting the standard kernel again):
https://launchpad.net/~markus-tisoft/+archive/rt3090/+files/rt3090-dkms_2.3.1.7-0ubuntu0~ppa2_all.deb
blacklist rt2800pci
blacklist rt2860sta

That might work, according to e.g. comments 83+85 in bug 541620.
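
The two blacklist lines need to end up in a file under /etc/modprobe.d/. A minimal sketch of that step, assuming a separate file is used rather than editing blacklist.conf; the file name blacklist-rt2800.conf and the function name are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch only: the file name blacklist-rt2800.conf is an assumption;
# modprobe reads any *.conf file under /etc/modprobe.d/.

# Write the two blacklist lines into directory $1 (default /etc/modprobe.d).
write_rt3090_blacklist() {
    dir="${1:-/etc/modprobe.d}"
    printf 'blacklist %s\n' rt2800pci rt2860sta > "$dir/blacklist-rt2800.conf"
}

# Usage (as root), after installing the rt3090-dkms package:
#   write_rt3090_blacklist
```

Keeping the lines in their own file makes the workaround easy to remove again once a fixed kernel is installed.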

Chris (9-launchpad2-cmharper-com-deactivatedaccount) wrote :

The workaround (#9) seems to have solved the problem for me, thank you very much for your time and effort in resolving this. It is much appreciated.

What will happen when a new kernel is pushed through Update Manager? Will I need to update or reinstall anything?

Wolfgang Kufner (wolfgangkufner) wrote :

I'm happy to hear that :-)

The nice thing about dkms packages is that they compile themselves whenever necessary. So a new kernel should just take a bit longer to boot the first time. No human intervention necessary.

Changed in linux (Ubuntu):
status: New → Confirmed
Greg Whiteley (greg-whiteley) wrote :

Sorry for the off-topic - but special thanks to Wolfgang for his clear and helpful responses. I wish all bugs on lp were handled so well.

Wolfgang Kufner (wolfgangkufner) wrote :

Thank you, Greg. It is really nice to hear this.
:-)

Tim Brody (tim-brody) wrote :

I can confirm that #9 allows maverick to WPA2 and unload the module.

This should be a high-priority to get into main because this is a blocker to using Ubuntu on the new Acer Revo 3700. (At least rt2860sta shouldn't be grabbing the device if it crashes)

Wolfgang Kufner (wolfgangkufner) wrote :

Hi Tim,

#9 is only a workaround till this nasty freeze bug is fixed. It is not a fix. Ralink 3090 pci devices are supposed to work with the rt2800pci driver. (See http://git.kernel.org/?p=linux/kernel/git/linville/wireless-next-2.6.git;a=blob;f=drivers/net/wireless/rt2x00/Kconfig;h=f2be1d35a5c8c8b32649c9c7e38db64d83682c37;hb=46af584d2ea86518c4cdf521903cd93ba6de2ec0 - especially the bits under ---help---)

***What we need to fix this freeze bug is a trace.***
Maybe you can try to get a trace by triggering the bug on a text console. https://help.ubuntu.com/community/DebuggingSystemCrash may be of some use. However, from what we found so far I do not expect the system to react to any SysRq commands (see earlier comments) after module unload. But maybe you get a trace by just doing modprobe -r rt2800pci on a text console (Ctrl+Alt+F1).
And again: it might be a good idea to do an altgr+sysRq+S to force the disk to sync before you do something expected to freeze the computer, and always wait till there is no more disk activity.

rt2860sta should not be in use. You can check with lspci -k.
It should say: Kernel driver in use: rt2800pci.
It will list rt2860sta as also capable of driving the device in the line "Kernel modules:".
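
The lspci -k check can be scripted. The sketch below assumes the usual `lspci -k` layout (an unindented header line per device followed by indented detail lines); `driver_in_use` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: driver_in_use is a hypothetical helper name.

# Print the "Kernel driver in use" value for the first device whose
# header line matches pattern $1; reads `lspci -k` output on stdin.
driver_in_use() {
    awk -v pat="$1" '
        /^[0-9a-f]/ { hit = ($0 ~ pat) }   # unindented device header line
        hit && /Kernel driver in use:/ {
            sub(/.*in use: */, "")
            print
            exit
        }
    '
}

# Usage:
#   lspci -k | driver_in_use RT3090
```

An empty result means no driver has claimed the device, which matches the UNCLAIMED state that lshw reports in such a case.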

Thanks

Stuart Wilkinson (stuart-p-wilkinson) wrote :

Hi,

I've tried the workaround in post #9 but I still can't get wireless working. My system is a revo 3700.
When I issue the following command - sudo lshw -C network - I get:

  *-network UNCLAIMED
       description: Network controller
       product: RT3090 Wireless 802.11n 1T/1R PCIe
       vendor: RaLink
       physical id: 0
       bus info: pci@0000:02:00.0
       version: 00
       width: 32 bits
       clock: 33MHz
       capabilities: pm msi pciexpress bus_master cap_list
       configuration: latency=0
       resources: memory:febf0000-febfffff
  *-network
       description: Ethernet interface
       product: RTL8111/8168B PCI Express Gigabit Ethernet controller
       vendor: Realtek Semiconductor Co., Ltd.
       physical id: 0
       bus info: pci@0000:03:00.0
       logical name: eth0
       version: 06
       serial: d0:27:88:0e:7c:a4
       size: 100MB/s
       capacity: 1GB/s
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress msix vpd bus_master cap_list ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd 1000bt 1000bt-fd autonegotiation
       configuration: autonegotiation=on broadcast=yes driver=r8169 driverversion=2.3LK-NAPI duplex=full ip=192.168.0.7 latency=0 link=yes multicast=yes port=MII speed=100MB/s
       resources: irq:43 ioport:e800(size=256) memory:fbffb000-fbffbfff memory:fbffc000-fbffffff

I have tried uninstalling and reinstalling the kernel mentioned in post #9 as well as editing the /etc/modprobe.d/blacklist.conf file.

Here is the latest version of my blacklist.conf file:

# This file lists those modules which we don't want to be loaded by
# alias expansion, usually so some other driver will be loaded for the
# device instead.

# evbug is a debug tool that should be loaded explicitly
blacklist evbug

# these drivers are very simple, the HID drivers are usually preferred
blacklist usbmouse
blacklist usbkbd

# replaced by e100
blacklist eepro100

# replaced by tulip
blacklist de4x5

# causes no end of confusion by creating unexpected network interfaces
blacklist eth1394

# snd_intel8x0m can interfere with snd_intel8x0, doesn't seem to support much
# hardware on its own (Ubuntu bug #2011, #6810)
blacklist snd_intel8x0m

# Conflicts with dvb driver (which is better for handling this device)
blacklist snd_aw2

# causes failure to suspend on HP compaq nc6000 (Ubuntu: #10306)
blacklist i2c_i801

# replaced by p54pci
blacklist prism54

# replaced by b43 and ssb.
blacklist bcm43xx

# most apps now use garmin usb driver directly (Ubuntu: #114565)
blacklist garmin_gps

# replaced by asus-laptop (Ubuntu: #184721)
blacklist asus_acpi

# low-quality, just noise when being used for sound playback, causes
# hangs at desktop session start (Ubuntu: #246969)
blacklist snd_pcsp

# ugly and loud noise, getting on everyone's nerves; this should be done by a
# nice pulseaudio bing (Ubuntu: #77010)
blacklist pcspkr

# EDAC driver for amd76x clashes with the agp driver preventing the aperture
# from being initialised (Ubuntu: #297750). Blacklist so that the driver
# continues to build and is installable for the few cases where its
# really needed.
blacklist amd76x_edac

blacklist rt2800pci
b...

Stuart Wilkinson (stuart-p-wilkinson) wrote :

I managed to get my wireless working by following instructions in this thread - http://ubuntuforums.org/showthread.php?p=10230708

neuromancer (neuromancer) wrote :

Hi Ubuntu users,
The solution proposed in #9 by Wolfgang worked for me on an HP 620 notebook with Ubuntu 10.10 installed.
The only problem is that when a new kernel is installed via update-manager or synaptic, my RaLink RT3090 Wireless 802.11n 1T/1R PCIe stops working and is not recognised by Ubuntu.
So every time a new kernel is installed I need to remove the rt3090-dkms package and reinstall it.
After a reboot everything works.

I had understood that a dkms package is able to recompile itself when necessary, but in this case dkms doesn't do so.
Any idea?

Bye

Greg Whiteley (greg-whiteley) wrote :

I just tried following the directions in #15 to (hopefully) see a trace on the console. No such luck: I get no output, just the hang.

What I did:
1. Blacklisted rt3090sta (the workaround module) and removed blacklist from rt2800pci
2. Disabled wireless in network-manager and removed the workaround 3090 module
3. modprobe rt2800pci
4. Re-enable wireless in network-manager and connect to network
5. Went to pseudo-terminal 1
6. Alt+SysRq+S (for sync - and to verify that my terminal keys were mapped properly)
7. sudo rmmod rt2800pci

Nothing was emitted, no prompt came back, and there was no response to Alt+SysRq key combinations. The last entry in /var/log/kern.log was my SysRq sync completion. Is there anywhere else I should look for info?

I haven't tried the kernel-ppa yet. I'll do that when I get the chance.

Wolfgang Kufner (wolfgangkufner) wrote :

Hi Greg,

It's good to see that someone who has the hardware is working on this bug.

Yes, kernel-ppa[1] would be a good next step.

If the newest daily 2.6.38 kernel from that ppa also shows this bug the next step would be to test with the newest cutting edge compat-wireless package[2] on top of that 2.6.38 kernel.
If the bug still shows there then a report should be filed upstream[3], where there is a good chance that experts in this matter will actually look at this. The ubuntu kernel team[4] seems to be a bit overbooked with over 5000 open bugs[5] :-) and anyway the experts for this driver live upstream.

I don't mean to overburden you with this list, Greg. It's just an outline of what could be done to get this bug moving.
And everyone with the hardware is invited to help :-) .

Please ask if you have any questions.

Thanks,
Wolfgang

[1] http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/current/
[2] http://wireless.kernel.org/download/compat-wireless-2.6/
     Explanations in http://wireless.kernel.org/en/users/Download#Selecting_your_driver
[3] https://bugzilla.kernel.org/
[4] https://launchpad.net/~ubuntu-kernel-team
[5] https://bugs.launchpad.net/ubuntu/+source/linux/+bugs

Greg Whiteley (greg-whiteley) wrote :

Thanks Wolfgang,

OK, I had trouble getting the kernel-ppa daily [1] to associate with my access point. Eventually it worked (after about 5 reboots and multiple attempts); I'm not sure if that problem is indicative of anything relevant to this bug. When it finally associated I couldn't contact anything on my network, so I'm not sure of the state of support in that kernel version. That being said, once I could associate I could reproduce the hang just fine, so at least it is clear that the hang is present on mainline.

I'm a little uncertain as to the relationship between kernel-ppa/mainline [1] and compat-wireless [2] with respect to the "linux-next" variant of compat-wireless. I downloaded the tarballs from [2] but got an early compile failure when building against the kernel-ppa version (making me think the underpinnings have changed significantly for this rev), which I don't see with the maverick package (I get further, but it still fails in some irrelevant drivers; I haven't spent any time getting either working yet). Should I be downloading the other (linux-next) variant of compat-wireless for use with the kernel-ppa?

Is there any reason to prefer testing compat-wireless with the kernel-ppa as the base over using the maverick kernel as the base?

I'll make sure the bug is logged upstream, although at the moment mainline is not a useful baseline for me to run day-to-day as the dkms rt3090 driver from #9 doesn't seem to work, so progress on that will be slow.

Wolfgang Kufner (wolfgangkufner) wrote :

Hi Greg,

That was quick :-) . Did you test on battery? There are two bugs that I currently have to work around if I want to run my rt2860 hardware (which has the same rt2800pci driver) on battery. Those bugs (introduced by power save measures) do not happen when plugged in (for most BIOS versions, at least). Better to test plugged in.

There is only very little reason to prefer testing compat-wireless with a daily mainline kernel to testing it with e.g. a maverick kernel. The daily compat-package brings all the newest wireless stuff with it either way. Only for the small chance that changes in the rest of the kernel affect the bug does it make a difference. So, if mainline is inconvenient, it's perfectly fine to use an older kernel. Also, one of those power save bugs is a regression introduced by 2.6.38, so an older kernel will work better in this regard.

You might find it more convenient to not use blacklisting, which is per installed operating system, but instead disable the modules you do not want per kernel:
e.g. for disabling the rt2800pci module in your "work" kernel go to:
/lib/modules/<the kernel whose module you want to disable>/kernel/drivers/net/wireless/rt2x00/
and just rename it:
sudo mv rt2800pci.ko rt2800pci.ko.disable
That way you can boot into your "work" kernel and have the "work" wireless without fiddling with blacklisting. And when testing then just boot into your "test" kernel and automatically have the "test" wireless modules.
If you are using compat-wireless for testing then you should usually not have to disable any other module. The modules from compat-wireless get installed into a subdirectory of /lib/modules/`uname -r`/updates/ and modules from this updates directory usually get preferred anyway.
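
The per-kernel rename described above can be sketched as a small helper. The function name and the MODULES_ROOT override are hypothetical; the directory layout is the one given in this comment.

```shell
#!/bin/sh
# Sketch only: disable_module_for_kernel is a hypothetical helper.
# MODULES_ROOT defaults to /lib/modules and is overridable for dry runs.
MODULES_ROOT="${MODULES_ROOT:-/lib/modules}"

# Rename module $2 (e.g. rt2800pci) under kernel release $1 so that this
# kernel can no longer load it; other installed kernels are untouched.
disable_module_for_kernel() {
    ko="$MODULES_ROOT/$1/kernel/drivers/net/wireless/rt2x00/$2.ko"
    [ -f "$ko" ] || { echo "not found: $ko" >&2; return 1; }
    mv "$ko" "$ko.disable"
}

# Usage (as root), for the currently running kernel:
#   disable_module_for_kernel "$(uname -r)" rt2800pci
```

To undo it, rename the file back to its original .ko name.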
You can use the current mainline kernel or any older, stable kernel from there, or the second, older, maverick kernel or whatever other kernel you have as "test" kernel.

I just did the compile of compat-wireless-2011-02-24.tar.bz2 on the ubuntu mainline 2.6.38-999-generic #201102240912 kernel. It went fine. Maybe you did not make use of the script to restrict compilation to just the modules you need.
You can do so by running this before make:
./scripts/driver-select rt2x00
I hope that it will then work for you. They are of course daily packages and often enough there is a bit that does not compile without fiddling. The compat-wireless project certainly has to work hard to keep it in compilable form.

The tar-balls from [2] _are_ the linux-next variant of compat-wireless. They bring the wireless as it exists in linux-next, which is newer than what mainline has. That is the point of testing those. They are the closest to what the developers are currently working on. That makes it relatively certain that the bug has not happened to be fixed without anyone noticing, and the developers also tend to know the newest code best. The other, stable, variant of compat-wireless brings the wireless of mainline and older stable kernels, the same stuff you get when installing those kernels. They are not interesting for the purpose of this testing (unless we would want to test whether it possibly is a regression).

btw: modinfo rt2...

Greg Whiteley (greg-whiteley) wrote :

I'll respond to #21 and #22 in parts for clarity:

I think I've cleared up the problems with failing to associate (or pass data after associating) described in #21 as artifacts of the way I was switching between the rt3090sta "workaround" module and rt2800pci - the module under test - by use of manual rmmod/modprobe. Ensuring only one module ever talked to the hardware per boot made things quite reliable.

I don't think I encountered the power-management problems you described with 2.6.38, which is reassuring.

Greg Whiteley (greg-whiteley) wrote :

I'll respond to #21 and #22 in parts for clarity:

Summary of my testing (once I had achieved some reliability):

Observed the failure (hang):
1. Kernel 2.6.35-25-generic (Maverick)
2. Kernel 2.6.38-999-generic (kernel-ppa version 2.6.38-999.201102240912)
3. Kernel 2.6.35-25-generic (Maverick) + compat-wireless-2.6.38-rc4-1

No failure observed ("works for me")
1. Kernel 2.6.35-25-generic (Maverick) + compat-wireless-2011-02-24
2. Kernel 2.6.38-999-generic (kernel-ppa version 2.6.38-999.201102240912) + compat-wireless-2011-02-24

So it is clear that the problem is resolved in post-2.6.38 changes.

I'm not sure where this leaves this bug:
1. I haven't logged against bugzilla.kernel.org yet as it appears to be resolved in the latest code
2. For this "ubuntu/launchpad bug" - any linux-backports-modules-compat-wireless-2.6.38 (when released) is unlikely to be a workaround, manual intervention would likely be required.

I haven't tried binary-searching compat-wireless versions to find which change contributed the fix, as I'm not sure if that would be useful for this (ubuntu) bug, and obviously not useful to kernel as it is already fixed ;)

I'll follow up to this comment with the steps I used to get things working on each version just so others finding this bug in Google will have something reliable to work from.

Greg Whiteley (greg-whiteley) wrote :

To get this working (for me) from maverick:

Note: everything here was on i386

The latest automatically installed linux-version at this time (26th Feb 2011) is:
   linux-image-2.6.35-25-generic 2.6.35-25.44
Nothing required to get this.

1. Downloaded compat-wireless-2011-02-24
http://wireless.kernel.org/download/compat-wireless-2.6/compat-wireless-2011-02-24.tar.bz2
2. Untar and enter directory
3. ./scripts/driver-select rt2x00 # This parameter is documented in driver-select script help, but not the README
4. make # completed with no problems
5. sudo rmmod rt2800pci # not included in unload below
6. sudo rmmod rt2x00pci # not included in unload below
7. sudo make unload # unloads all affected drivers (except the two above - need to update compat-wireless)
8. sudo make install
9. sudo modprobe rt2800pci

At this point everything worked ok, including removing the module when associated.

To back this change out go back into the compat-wireless directory and run steps 5-7, then
1. sudo make uninstall
2. sudo modprobe rt2800pci # reinstate vanilla (broken) driver

--
To get this working with the kernel-ppa version:

1. Install the kernel-ppa modules (linux-image-generic, linux-header-generic and linux-header) - I used version 2.6.38-999.201102240912
2. Follow steps 1-3 above
3. Edit config.mk and change COMPAT_LATEST_VERSION to 38 (was 39)
4. Follow steps 4-9 above

Back out as for the maverick version.
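
Steps 3-9 of the maverick procedure above can be collected into one function. This is a sketch, not a tested installer: `build_compat_wireless`, `run` and the DRY_RUN switch are hypothetical names, and with DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# Sketch only: build_compat_wireless, run and DRY_RUN are hypothetical.
# With DRY_RUN=1 each command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# Steps 3-9 from the list above; call it from inside the untarred
# compat-wireless directory. The chain stops at the first failing step,
# for example if rt2800pci is not currently loaded.
build_compat_wireless() {
    run ./scripts/driver-select rt2x00 &&
    run make &&
    run sudo rmmod rt2800pci &&
    run sudo rmmod rt2x00pci &&
    run sudo make unload &&
    run sudo make install &&
    run sudo modprobe rt2800pci
}

# Usage:
#   (DRY_RUN=1; build_compat_wireless)   # preview the commands
#   build_compat_wireless                # actually run them
```

For the kernel-ppa variant, the config.mk edit from step 3 of the second list still has to be made by hand before calling the function.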

Wolfgang Kufner (wolfgangkufner) wrote :

Thanks Greg,

really nice work. Good to hear it is fixed in linux-next. I also don't see a reason to file an upstream bug anymore.

I have looked a little bit at the commits that might have fixed this[1]. Could you maybe try the following two compat-wireless versions:

http://wireless.kernel.org/download/compat-wireless-2.6/compat-wireless-2011-02-10.tar.bz2
This one I expect to work, as it has the rather big patchset by Helmut Schaa converting interrupt handling to use tasklets, which I think is the most likely candidate.

http://wireless.kernel.org/download/compat-wireless-2.6/compat-wireless-2011-01-20.tar.bz2
Most likely this one will freeze. It has just the patch by Ralink's Jay Hung "rt2x00: Fix radio off hang issue for PCIE interface". Would be easy to port if this were what fixes it.

Are you using the compat-wireless module for normal work now? Any other problems with it? I am thinking of advertising the fact that it is working now with compat-wireless in other relevant launchpad bugs so that it sees more widespread testing. With your detailed explanation in comment 25 I think everyone with a minimum of command line experience can do it.

Thanks,
Wolfgang

[1] http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git&a=search&h=HEAD&st=commit&s=rt2x00

Wolfgang Kufner (wolfgangkufner) wrote :

Oh, I forgot to say that you should do those two tests on your 2.6.35 kernel as at least one of them won't compile on 2.6.38.

Greg Whiteley (greg-whiteley) wrote :

I'm using the 2.6.35-25 (maverick) + compat-wireless-2011-02-24 for my day-to-day work and it seems good. No problems with normal wifi operation and (critical for me) no problems with suspend or hibernate.

The "workaround" driver rt3090sta needed to be unloaded on suspend/hibernate calls - rt2800pci has no such problems.

Greg Whiteley (greg-whiteley) wrote :

2011-02-10 and 2011-01-20 both seem solid.

I tried 2011-01-18 to isolate the Ralink radio off change and 01-18 fails. So that pretty much fingers the "Radio off hang on PCIE" change as the fix [1].

I'll follow up with a check on 2011-01-19 for completeness (I wasn't brave with possible timezone differences between generation of tar-balls vs actual commit times).

[1] http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=7f6e144fb99a4a70d3c5ad5f074204c5b89a6f65
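
Narrowing the fix down to one snapshot is a bisection over the dated tarballs. The sketch below shows the search logic only; `first_good_snapshot` is a hypothetical name, and in practice the predicate would have to be a manual step (install the snapshot, try the unload, report the result), since a bad snapshot freezes the machine.

```shell
#!/bin/sh
# Sketch only: first_good_snapshot is a hypothetical helper.

# Print the earliest snapshot for which predicate command $1 succeeds.
# The remaining arguments are snapshot names sorted by date; the search
# assumes the last one is good and that good snapshots stay good.
first_good_snapshot() {
    is_good=$1; shift
    lo=1; hi=$#
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "candidate=\${$mid}"
        if "$is_good" "$candidate"; then hi=$mid; else lo=$((mid + 1)); fi
    done
    eval "echo \${$lo}"
}

# Usage, with a predicate that asks the tester for the outcome:
#   ask() { printf 'Does %s unload cleanly? [y/n] ' "$1"; read a; [ "$a" = y ]; }
#   first_good_snapshot ask 2011-01-18 2011-01-19 2011-01-20 2011-02-10 2011-02-24
```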

Wolfgang Kufner (wolfgangkufner) wrote :

Now that sounds very encouraging. Looks like Jay Hung didn't mean just a little hang then in his commit message.

I so far always ended up looking into the code to see if a patch was already in. AFAIR 2011-01-19 didn't have it, not completely sure though.

Greg Whiteley (greg-whiteley) wrote :

Duh, I should have checked the code - have been treating foreign code-base as black-box :-\

Today verified that's the commit [1] for the fix. 2011-01-19 fails as is. Applied patch for [1] to 2011-01-19 and module removes cleanly without hang.

[1] http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commitdiff;h=7f6e144fb99a4a70d3c5ad5f074204c5b89a6f65

Wolfgang Kufner (wolfgangkufner) wrote :

Great :-)

Since you have already verified the commit the next step is to get the attention of the ubuntu developers so that this fix reaches the users as soon as possible. I'll try to do that...

Changed in linux:
status: New → Fix Committed
summary: - rt2800pci freeze on module unload [maverick i386]
+ rt3090: freeze on module rt2800pci unload
description: updated
Changed in linux:
status: Fix Committed → Fix Released
description: updated
Wolfgang Kufner (wolfgangkufner) wrote :

@Andy Whitcroft -- Please consider this patch for inclusion into Natty.

The patch that fixes this freeze bug is now in linux mainline. It is the single commit 7f6e144fb99a4a70d3c5ad5f074204c5b89a6f65 "rt2x00: Fix radio off hang issue for PCIE interface". This has been verified by Greg Whiteley (see comment 31).

Without this patch rt3090 freezes the system on every shutdown or unload.

I checked that it applies cleanly to the current natty kernel. (Unsurprisingly - it's the first rt2x00 commit that didn't make it into 2.6.38 :-\ )

This patch has shown no regressions for rt3090 and rt2860 hardware.

Wolfgang Kufner (wolfgangkufner) wrote :
tags: added: natty
removed: needs-upstream-testing staging
description: updated
description: updated
Leann Ogasawara (leannogasawara) wrote :

Looks like we just pulled in the 2.6.38.3 stable updates which contains the patch to resolve this bug. The 2.6.38.3 stable patch set is queued for the next (and final) Natty upload before release. Setting this to Fix Committed for now.

Changed in linux (Ubuntu):
status: Confirmed → Fix Committed
Greg Whiteley (greg-whiteley) wrote :

Thanks all, for getting this into Natty. This is very much appreciated.

Leann Ogasawara (leannogasawara) wrote :

Unfortunately I spoke too soon. Looks like this will land in the first SRU (stable release update) kernel for Natty. For those looking for a patched kernel to use in the meantime, I'd suggest running the pre-proposed kernel (linux-2.6.38-9.43~pre201104180902 is still building so give it a few hours):

https://launchpad.net/~kernel-ppa/+archive/pre-proposed

Martin Pitt (pitti) wrote : Please test proposed package

Accepted linux into natty-proposed; the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in linux (Ubuntu Natty):
status: New → Fix Committed
Brad Figg (brad-figg) wrote :

The commit for this issue came in via an upstream stable release. As such, it is not subject to the standard bug verification process.

tags: added: verification-done-natty
cshong (cshong87) wrote :

I would like to ask: does the fix come with the default installation live CD of Ubuntu 11.04, or with an update? I ask because I want to install Ubuntu 11.04 safely.

Ian McMichael (ian-sigma-uk) wrote :

Tested and confirmed working on an Acer Aspire Revo 3610 from the proposed repository with kernel 2.6.38-9.43.

tags: removed: regression-potential
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.38-10.46

---------------
linux (2.6.38-10.46) natty-proposed; urgency=low

  [ Steve Conklin ]

  * Release Tracking Bug
    - LP: #802464

  [ Upstream Kernel Changes ]

  * Revert "put stricter guards on queue dead checks"
  * Revert "fix oops in scsi_run_queue()"

linux (2.6.38-10.45) natty-proposed; urgency=low

  [ Upstream Kernel Changes ]

  * Revert "af_unix: Only allow recv on connected seqpacket sockets."

linux (2.6.38-10.44) natty-proposed; urgency=low

  [ Steve Conklin ]

  * Release Tracking Bug
    - LP: #792013

  [ Robert Nelson ]

  * SAUCE: omap3: beagle: detect new xM revision B
    - LP: #770679
  * SAUCE: omap3: beagle: detect new xM revision C
    - LP: #770679
  * SAUCE: omap3: beagle: if rev unknown, assume xM revision C
    - LP: #770679

  [ Stefan Bader ]

  * Include nls_iso8859-1 for virtual images
    - LP: #732046

  [ Thomas Schlichter ]

  * SAUCE: vesafb: mtrr module parameter is uint, not bool
    - LP: #778043

  [ Tim Gardner ]

  * Revert "SAUCE: acpi battery -- move first lookup asynchronous"
    - LP: #775809
  * updateconfigs after update to v2.6.38.6

  [ Upstream Kernel Changes ]

  * Revert "ALSA: hda - Fix pin-config of Gigabyte mobo"
    - LP: #780546
  * Revert "[SCSI] Retrieve the Caching mode page"
    - LP: #788691
  * Revert "USB: xhci - fix unsafe macro definitions"
  * Revert "USB: xhci - fix math in xhci_get_endpoint_interval()"
  * Revert "USB: xhci - also free streams when resetting devices"
  * ath9k_hw: fix stopping rx DMA during resets
    - LP: #775809
  * netxen: limit skb frags for non tso packet
    - LP: #775809
  * ath: add missing regdomain pair 0x5c mapping
    - LP: #775809
  * block, blk-sysfs: Fix an err return path in blk_register_queue()
    - LP: #775809
  * p54: Initialize extra_len in p54_tx_80211
    - LP: #775809
  * qlcnic: limit skb frags for non tso packet
    - LP: #775809
  * nfsd4: fix struct file leak on delegation
    - LP: #775809
  * nfsd4: Fix filp leak
    - LP: #775809
  * virtio: Decrement avail idx on buffer detach
    - LP: #775809
  * x86, gart: Set DISTLBWALKPRB bit always
    - LP: #775809
  * x86, gart: Make sure GART does not map physmem above 1TB
    - LP: #775809
  * intel-iommu: Fix use after release during device attach
    - LP: #775809
  * intel-iommu: Unlink domain from iommu
    - LP: #775809
  * intel-iommu: Fix get_domain_for_dev() error path
    - LP: #775809
  * drm/radeon/kms: pll tweaks for r7xx
    - LP: #775809
  * drm/nouveau: fix notifier memory corruption bug
    - LP: #775809
  * drm/radeon/kms: fix bad shift in atom iio table parser
    - LP: #775809
  * drm/i915/tv: Remember the detected TV type
    - LP: #775809
  * tty/n_gsm: fix bug in CRC calculation for gsm1 mode
    - LP: #775809
  * serial/imx: read cts state only after acking cts change irq
    - LP: #775809
  * ASoC: Fix output PGA enabling in wm_hubs CODECs
    - LP: #775809
  * ASoC: codecs: JZ4740: Fix OOPS
    - LP: #775809
  * ALSA: hda - Add a fix-up for Acer dmic with ALC271x codec
    - LP: #775809
  * ahci: don't enable port irq before handler is registered
    - LP: #775809
  * libata: Implement ATA_FLAG_NO_...

Changed in linux (Ubuntu Natty):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu):
status: Fix Committed → Fix Released