[regression] 2.6.27-7 sometimes fails to boot (iwl3945 issue?)

Bug #263059 reported by Mikael Nilsson
This bug affects 16 people
Affects               Status        Importance  Assigned to  Milestone
Intel Linux Wireless  Invalid       Critical
linux (Ubuntu)        Fix Released  High        Tim Gardner
Intrepid              Fix Released  High        Tim Gardner

Bug Description

Problem: Intrepid fails to boot on a variety of laptops using the iwl3945 driver.

Affected versions: all 2.6.27 Intrepid kernels. 2.6.26 is reported not to be affected.

Symptoms: A hard freeze partway through the boot process, near the time when the iwl3945 module is loaded. No kernel panic is printed, and the system is unresponsive even to Sysrq.

Frequency: This happens frequently, but not every time (20%-80% failure rate).

Workarounds: Blacklisting iwl3945 reliably avoids the problem. Delaying the loading of iwl3945 for 5 seconds also reliably avoids the problem.

Revision history for this message
Francisco T. (leviatan1) wrote :

The boot freezes when my wifi is ON, when the kernel is loading iwl3945 driver. If wifi is OFF, it doesn't freeze.

Do you have the same problem?

Revision history for this message
Mikael Nilsson (mini) wrote : Re: [regression] 2.6.27-2 fails to boot on Dell XPS M1710 when wireless enabled

Indeed, that was it.

description: updated
Revision history for this message
Leann Ogasawara (leannogasawara) wrote :

Hi Guys,

Would either of you be able to boot with the 'quiet' and 'splash' options removed and your wireless enabled, to trigger the hang? Would you then be able to take a digital photo of any errors that appear on your screen prior to the hang and attach it to this bug report? Thanks.

Changed in linux:
assignee: nobody → ubuntu-kernel-team
importance: Undecided → High
status: New → Triaged
Revision history for this message
Mikael Nilsson (mini) wrote :

Unfortunately, the 2.6.27-2 kernel now fails to boot either way.

The symptoms seem similar to bug #102982, which affected me before (i.e. no particular error messages, and the last I see is about intel_rng).

Revision history for this message
Mikael Nilsson (mini) wrote :

Or possibly bug #106256. Will check if the boot process resumes after waiting.

Revision history for this message
Mikael Nilsson (mini) wrote :

Even with wireless OFF, the computer does not continue the boot process even after waiting for a long time (15 mins). It also does not respond to Ctrl-Alt-Delete, which to me suggests a complete hang.

After removing "quiet splash" from the kernel command line, I don't see any error messages. The boot hangs after displaying "setting system clock..."

Revision history for this message
Francisco T. (leviatan1) wrote :

Me too, I can confirm. Now it doesn't matter whether the wireless is on or off. Before, it also froze if the ethernet cable was connected.

Now it freezes while loading hardware drivers. The freeze is random, maybe 1 in every 10 boots.

I can't find the exact reason.

Revision history for this message
Juan Pablo Salazar Bertín (snifer) wrote :

When you "switch wireless off", do you see the iwl3945 driver still being loaded? (with "quiet splash" removed)
Have you tried disabling wireless in your BIOS config?

I've reported a possible duplicate (bug #267002), please let me know about your results, thanks.

Revision history for this message
Francisco T. (leviatan1) wrote :

I tried 5 boots with the wireless on (really it's radio ON/OFF):
Once it stopped in the line: [ 13.078867] iwl3945: Detected Intel Wireless WiFi Link 3945ABG
Another time it stopped in the line: [ 13.831927] Synaptics Touchpad, model: 1, fw: 6.1, id: 0xa3a0b3, caps: 0xa04713/0x10008
Other times boot was normal.

No ethernet cable was connected, and my BIOS has no option to disable wireless.

The attachment shows a complete normal boot.

Revision history for this message
Mikael Nilsson (mini) wrote :

2.6.27-3 boots normally, even with wireless ON.

Will report if this is a stable situation.

Revision history for this message
Juan Pablo Salazar Bertín (snifer) wrote :

2.6.27-3 still fails to boot sometimes for me.

Revision history for this message
Francisco T. (leviatan1) wrote :

Update 2.6.27-3.
After some normal boots (always with wireless ON), the last time again it failed in the line:
iwl3945: Detected Intel Wireless WiFi Link 3945ABG

Revision history for this message
Leann Ogasawara (leannogasawara) wrote :

Hi everyone,

Francisco, it seems you may be experiencing what Juan has reported in bug 267002. It would be great if you and Juan could continue to track the issue in that bug report. It seems Mikael, the original bug reporter, is no longer experiencing issues with the newer 2.6.27-3 kernel, so he may have had a slightly different bug from the one you both have. For now I am marking Mikael's bug (i.e. this bug) as "Fix Released". Thanks.

Changed in linux:
status: Triaged → Fix Released
Revision history for this message
Mikael Nilsson (mini) wrote :

Actually, Francisco describes my experience as well - even the -3 kernel fails to boot sometimes. It is NOT connected to wireless ON or OFF - it sometimes fails anyway.

A reboot usually works.

Please reopen.

description: updated
description: updated
Mikael Nilsson (mini)
Changed in linux:
status: Fix Released → Confirmed
Revision history for this message
Marcin Feder (marfed) wrote : Re: [regression] 2.6.27-3 fails to boot on Dell XPS M1710

I can confirm the same problem on an Asus V6J (nvidia + iwl3945). The system always hangs at "Starting Network Interfaces" when booted with 2.6.27-3. When started with the 2.6.24-19-generic kernel it works properly. When the wireless card is turned off (using BIOS settings), the boot process gets further and stops at CUPS.

Revision history for this message
Marcin Feder (marfed) wrote :

2.6.27-4 fails to boot too. In my case it _always_ hangs on iwl3945 driver activation. Maybe this bug should have a more general title, e.g. "2.6.27-3 - boot freezes when the iwl3945 is being loaded"

Revision history for this message
Matthew Wardrop (mister.wardrop) wrote :

I can confirm this too... Because of this, and the better suspend behaviour of 2.6.25, I usually end up using the older kernel.

Revision history for this message
Matthew Wardrop (mister.wardrop) wrote :

Ah... but I note it is already confirmed... sorry for the spam.

Kind Regards,
Matthew

Revision history for this message
Yotam Benshalom (benshalom) wrote :

I have the same problem. Curiously, it happens in about 66% of the boots but not in all of them.

Here is what I get when quiet and splash are turned off (copied by hand...)

iwl3945: Intel(R) PRO/Wireless 3945ABG/BG Network Connection driver for linux, 1.2.26ks
iwl3945: Copyright (C) 2003-2008 Intel Corporation
iwl3945 0000:05:00.0 PCI INT -> GSI 19 (level, low) -> IRQ 19
iwl3945: Detected Intel Wireless Wifi Link 3945ABG
cs: IO port probe 0x100-0x3af: clean
cs: IO port probe 0x3e0-0x4ff: excluding 0x4d0-0x4d7
cs: IO port probe 0x820-0x8ff: clean
cs: IO port probe 0xc00-0xcf7: clean
cs: IO port probe 0xa00-0xaff: clean
iwl3945: Tunable Channels: 13 802.11bg, 0 802.11a channels
iwl3945 0000:05:00.0 PCI INT A disabled
iwl3945 0000:05:00.0 PCI INT -> GSI 22 (level, low) -> IRQ 22
Setting the system clock ... OK

<<<ETERNAL HANG>>>

(sometimes it happens before the system clock line)

This looks like a general iwl3945 driver problem with Intrepid. Is there more data I can send to help solve it?

Revision history for this message
Yotam Benshalom (benshalom) wrote :

I forgot to mention - this happens in 32-bit system on lg-s1 laptop.

Revision history for this message
DSHR (s-heuer) wrote :

Still occurs with 2.6.27-4 - currently it takes 2 or 3 tries to boot successfully.

Revision history for this message
EdwardO (edwardooo) wrote :

Same for me on Dell XPS 1710 too after fresh install of alpha6 and update... Can confirm it happens loading the iwl3945 driver...

Revision history for this message
Mikael Nilsson (mini) wrote : Re: [regression] 2.6.27-4 fails to boot on Dell XPS M1710

Still happens on 2.6.27-4 (I'm the original reporter).

description: updated
Revision history for this message
Matthew Wardrop (mister.wardrop) wrote :

For me, it only sometimes shows the "setting system clock" item... And sometimes halts immediately after the ipw3945 output. Probably a race condition of sorts....

Kind Regards,
Matthew

Revision history for this message
Mikael Nilsson (mini) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-4 fails to boot on Dell XPS M1710

On tor, 2008-10-02 at 09:57 +0000, Matthew Wardrop wrote:
> For me, it only sometimes shows the "setting system clock" item... And
> sometimes halts immediately after the ipw3945 output. Probably a race
> condition of sorts....

This is exactly my experience.

/Mikael

Revision history for this message
Yotam Benshalom (benshalom) wrote : Re: [regression] 2.6.27-4 fails to boot on Dell XPS M1710

I get this error too on 2.6.27-4, and it gets worse. Today I had to make 6 hard boot attempts before I could log in. Is there any news about a solution? Is there perhaps an alternative driver for Intel 3945?

Mikael Nilsson (mini)
description: updated
Revision history for this message
Yotam Benshalom (benshalom) wrote : Re: [regression] 2.6.27-4 fails to boot (iwl3945 issue?)

This issue remains with the 2.6.27-5 kernel installed from the repository. I get anything between 1 and 6 hangs before a successful boot.

Revision history for this message
Jakob Petsovits (jpetso) wrote :

I get this error on an HP n6320 (iwl3945, too) and booting works if the firmware (iwlwifi-3945-1.ucode) is not present. Once I put it into /lib/firmware/2.6.27-4-generic/, I get lockups similar to those described above.

Revision history for this message
DSHR (s-heuer) wrote :

Problem is still there on Lenovo X60S with kernel 2.6.27-5.

4 good boots - 2 hangs.

I am going to check the iwlwifi-3945-ucode-15.28.1.6 firmware ...

Revision history for this message
Frederic PO (fredericp) wrote :

>> For me, it only sometimes shows the "setting system clock" item... And
>> sometimes halts immediately after the ipw3945 output. Probably a race
>> condition of sorts....
>
> This is exactly my experience.

Same here with Asus A8JS 2.6.27-4-generic.
I'm attaching dmesg output after a successful boot.
Hope it helps.

Revision history for this message
bimmerd00d (brandon-holloway) wrote :

Holding a key while booting seems to bypass this issue on my Dell Latitude D820 with the Intel 3945ABG card. It fails on setting the system clock every time if I don't hold a key.

Revision history for this message
Oliver (lobohacks) wrote :

Hi,
Same here on my father's laptop, a Samsung R55.
The boot hangs at various points.
My first thought was that it is related to the nvidia module.
Removing it reduced the number of failed boots, but did not fix it.

Can anyone confirm that removing the nvidia module reduces the number of failed boots?

please fix this one.

regards oliver

Revision history for this message
Mikael Nilsson (mini) wrote : Re: [regression] 2.6.27-5 sometimes fails to boot (iwl3945 issue?)

As noted, I (original reporter) still experience this on 2.6.27-5.

description: updated
Revision history for this message
Lex Berger (lexberger) wrote :

Confirming for linux 2.6.27-5 on a Samsung R65

I'm getting the same output as Yotam reported at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/263059/comments/19

Revision history for this message
Henry Gomersall (hgomersall) wrote :

I can confirm the same bug on a Dell Inspiron 9400 with Intel Corporation PRO/Wireless 3945ABG Network Connection [8086:4222] (rev 02) (from lspci).

The system hangs at
iwl3945: Detected Wireless WiFi Link 3945ABG
iwl3945: Tunable channels: 13 802.11bg, 23 802.11a channels
<hang>

Loading using recovery mode is a little more reliable.

Keypress workaround may work - indicative of a race hazard?

Revision history for this message
Francisco T. (leviatan1) wrote :

>Linux portatil 2.6.27-6-generic #1 SMP Tue Oct 7 04:15:04 UTC 2008 i686 GNU/Linux
It still fails.

Revision history for this message
Yotam Benshalom (benshalom) wrote :

Still fails for me too with 2.6.27-6.

Revision history for this message
johnn1949 (johnn1949) wrote :

I haven't figured out how to do this correctly but my Bug #277901 seems to be a duplicate of this one.

Revision history for this message
Ryan Davies (iownsu) wrote :

I'm getting the same issue. However, this computer doesn't have wireless, so I can't disable that to try to boot.

All previous kernels boot, except the "Last known configuration"

This is a Compaq M2000
cpu model name : Mobile AMD Sempron(tm) Processor 2800+
cpu MHz : 1591.816

Revision history for this message
Mikael Nilsson (mini) wrote : Re: [regression] 2.6.27-6 sometimes fails to boot (iwl3945 issue?)

Confirm failure on 2.6.27-6 as well.

description: updated
Revision history for this message
Henry Gomersall (hgomersall) wrote :

Turning off the wireless on my Dell Inspiron 9400 still causes the wireless driver to load and doesn't prevent the hang.

Incidentally, the point of failure on the boot log varies.
sometimes at iwl3945, sometimes at Synaptics Touchpad (which is immediately after iwl3945).

Do I need to blacklist the iwl3945 driver to make sure it doesn't load?

Revision history for this message
Mikael Nilsson (mini) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-6 sometimes fails to boot (iwl3945 issue?)

tor 2008-10-09 klockan 10:48 +0000 skrev Henry Gomersall:
> Do I need to blacklist the iwl3945 driver to make sure it doesn't load?

Yes, that should help. Alternatively, move the firmware
(/lib/firmware/iwlwifi-3945-1.ucode) out of the way. If you try either,
please report your results.
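
As a rough sketch of the second suggestion, the firmware file could be renamed rather than deleted, so that it is easy to restore. The `.disabled` suffix and the optional root-prefix argument are illustrative assumptions, not something from this thread. The file name comes from the path quoted above; an earlier comment found the file under a kernel-versioned subdirectory of /lib/firmware, so the sketch searches the whole tree.

```shell
#!/bin/sh
# Rename every copy of the 3945 firmware so the driver cannot find it.
# $1 is an optional root prefix (hypothetical, for dry runs on a scratch tree);
# with no argument the real /lib/firmware is used, which needs root.
set_aside_firmware() {
    root="${1:-}"
    find "$root/lib/firmware" -type f -name 'iwlwifi-3945-1.ucode' |
    while IFS= read -r fw; do
        # ".disabled" is an arbitrary suffix chosen for this sketch.
        mv "$fw" "$fw.disabled" && echo "set aside: $fw"
    done
}
```

Calling `set_aside_firmware` as root with no argument acts on the real tree; renaming the files back restores the firmware.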

Revision history for this message
Ibrahim Hancioglu (ibrahim-hancioglu) wrote : Re: [regression] 2.6.27-6 sometimes fails to boot (iwl3945 issue?)

Same here: 2.6.27-6 on a Dell Inspiron 6400. Out of 5 restarts in a row, 4 were good and 1 froze. I tried pressing a key as advised above, but it does not help me. Is it possible to revert from iwl3945 to ipw3945? Maybe that would help temporarily.

Revision history for this message
Ibrahim Hancioglu (ibrahim-hancioglu) wrote :

I just downloaded and installed the recent iwlwifi drivers from http://linuxwireless.org/en/users/Download . Unfortunately that does not help; the system still freezes from time to time on boot.

Revision history for this message
Henry Gomersall (hgomersall) wrote :

Blacklisting iwl3945 seems to solve the boot problem.

Mikael Nilsson (mini)
description: updated
Changed in intellinuxwireless:
status: Unknown → Confirmed
Revision history for this message
Ibrahim Hancioglu (ibrahim-hancioglu) wrote :

How can you say that blacklisting iwl3945 seems to solve the problem? It makes the problem bigger, because the wireless card can no longer be used. We need more useful solutions, I think.

Revision history for this message
Mikael Nilsson (mini) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-6 sometimes fails to boot (iwl3945 issue?)

Well, it does solve the booting problem as I said :-).

Of course, it doesn't fix the bug. But it tells us that the iwl3945 driver
is involved.

Revision history for this message
Aanjhan Ranganathan (aanjhan) wrote : Re: [regression] 2.6.27-6 sometimes fails to boot (iwl3945 issue?)

I just upgraded to Intrepid and can confirm this bug too. For me, though, the boot hangs after loading the USB Video Driver, which is after iwl3945 is loaded.

I am using a Dell XPS M1210.

Revision history for this message
Parthan SR (parth-technofreak) wrote :

I did a fresh install of Ubuntu Intrepid Beta and can confirm this bug, though it happens only about 1 time in 10. The boot hangs for me when loading the iwl3945 driver. I am able to boot on rebooting after turning off the laptop by pressing the power button.

Revision history for this message
Francisco T. (leviatan1) wrote :

>2.6.27-7-generic
It still fails.

Revision history for this message
Sebastian Breier (tomcat42) wrote :

Bug #263330 might be a duplicate.

Revision history for this message
johnn1949 (johnn1949) wrote :

I found that if I leave the alternate CD in, when the boot gets to where it stalls, it reads off the disk and continues to boot normally.

Mikael Nilsson (mini)
description: updated
Revision history for this message
Lex Berger (lexberger) wrote :

Issue persists with 2.6.27-7-generic

Revision history for this message
Pete Deremer (sportman1280-deactivatedaccount) wrote :

I am still having the issue. When i first boot it often just hangs and won't boot. However if i reboot a few times... i can eventually get it to boot up. I have a Dell Inspiron E1705

Revision history for this message
feclare (feclare) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-7 sometimes fails to boot (iwl3945 issue?)

My Dell XPS 1210 hasn't hung with 2.6.27-7 yet, while with previous
versions it hung a lot.

On Tue, Oct 14, 2008 at 3:53 AM, Pete Deremer <email address hidden>wrote:

> I am still having the issue. When i first boot it often just hangs and
> won't boot. However if i reboot a few times... i can eventually get it
> to boot up. I have a Dell Inspiron E1705

Revision history for this message
Marcin Feder (marfed) wrote :

I have a similar issue, but in my case iwl3945 _always_ crashes on load. Please look at the description of bug #275227 (marked as a duplicate of this one). Do you get the same error message in your dmesg output? I am curious whether this is really the same problem or just a related one. I have also reported it at the kernel.org bugzilla. It seems that the patch is already prepared - http://bugzilla.kernel.org/show_bug.cgi?id=11746 - but I don't know when it will be incorporated into the Intrepid kernel.

Revision history for this message
Sebastian Breier (tomcat42) wrote :

Marcin: I'm sorry, it seems I have prematurely marked bug #275227 as duplicate of this one, although it crashes always (not sometimes). I unmarked it.

It might be related though, I certainly hope so.

Revision history for this message
Sebastian Breier (tomcat42) wrote :

Anyone available for debugging/testing should definitely head over to http://www.intellinuxwireless.org/bugzilla/show_bug.cgi?id=1778. We have a developer's attention, we just need help with debugging/logs.

Revision history for this message
Marcin Feder (marfed) wrote :

Sebastian: No problem. I also originally reported my problem in this thread. I decided to file a new bug because my system has never finished the boot process unless the driver was blacklisted. However, there are many similarities to the situation reported in this thread.

I think it would be useful if one of the reporters could boot with the driver blacklisted, then try to load it using modprobe and post the dmesg output. One more observation: the command "modprobe iwl3945" does not hang by itself, but when I subsequently run a program that tries to access the driver, like iwconfig, then it hangs permanently.

I can also confirm that the issue persists in kernel 2.6.27-7-generic.
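
The test suggested above (boot with the driver blacklisted, load it by hand, post the kernel log) could be done roughly as follows. This is only a sketch: the helper name `iwl_lines` and the output file name are made up for illustration, and the full log is saved as well, because an oops would not necessarily mention the driver by name.

```shell
#!/bin/sh
# Filter a kernel log on stdin down to lines that mention the driver
# or its firmware, so the relevant part is easy to paste into a report.
iwl_lines() {
    grep -E 'iwl3945|iwlwifi'
}

# Intended use, as root, after booting with iwl3945 blacklisted:
#   /sbin/modprobe iwl3945
#   dmesg > dmesg-full.txt        # full log, in case of an oops
#   iwl_lines < dmesg-full.txt    # driver-related lines only
```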

Revision history for this message
Sebastian Breier (tomcat42) wrote :

I tried the blacklist/modprobe approach. It never crashes, and there are no errors in dmesg.
Also, using the wireless after about 30 reloads of the driver does not crash it, it works fine.

So I would say this might be related, but is not the same bug.
Thanks for helping clearing this up. :)

Revision history for this message
Mikael Nilsson (mini) wrote :

For the record, Marcin's bug seems to be fixed, see http://bugzilla.kernel.org/show_bug.cgi?id=11746

I wonder if the fix affects this bug?

Revision history for this message
Sebastian Breier (tomcat42) wrote :

I'd love to test 2.6.28, but compiling a kernel is a larger task for me, and I don't have much time this week. Maybe next. ;)

Revision history for this message
LumpyCustard (orangelumpycustard) wrote :

either way, the kernel freeze is in 2 days time...

Revision history for this message
Sebastian Breier (tomcat42) wrote :

Unfortunately, true.
Might still be interesting to find a fix though. ;)

Revision history for this message
Sander Jonkers (jonkers) wrote :

Sebastian,

If you do not blacklist, do you see errors on the boot screen (the dmesg)?
Because on my setup, the boot progress just stops after loading/activating
the iwl driver; I see no error messages at all.

FWIW: I believe (but I am not yet sure) that when I switch off the wifi
button before and during the boot, I have no halts. I will test this a bit
further

Sander

On Tue, Oct 14, 2008 at 10:43 AM, Sebastian Breier <email address hidden> wrote:

> I tried the blacklist/modprobe approach. It never crashes, and there are no
> errors in dmesg.
> Also, using the wireless after about 30 reloads of the driver does not
> crash it, it works fine.
>
> So I would say this might be related, but is not the same bug.
> Thanks for helping clearing this up. :)

Revision history for this message
Sebastian Breier (tomcat42) wrote :

Sander: No errors on the boot screen. Regular loading, then hang.

Revision history for this message
Loïc Minier (lool) wrote :

Hi folks,

I first came across bug #275227 which looks like another dup of this dup; in bug #275227, we linked to other upstream bugs, to a patch, and we decided to include the fix in the kernel tree. I also prepared a package with this fix in my ppa at:
https://launchpad.net/~lool/+archive

Would someone with this bug confirm that this package fixes the boot hang?

(I can't reproduce the bug 100% of the time.)

If we get confirmation, I'll merge the relevant bug states.

Thanks,

Revision history for this message
Martin Pitt (pitti) wrote : Re: [Bug 263059] [NEW] [regression] 2.6.27-7 sometimes fails to boot (iwl3945 issue?)

Incidentally I noticed similar hangs in previous kernels, in the
2.6.27-X range. I booted without usplash (which didn't work at that
time) and thus saw the kernel messages, which also stopped around
iwl3945 messages.

Interestingly enough I was never able to reproduce the hang when I
booted without "quiet". Typical Heisenbug :(

Just out of interest, does your computer always boot without "quiet"?

I rebooted several times now with the current 2.6.27-7, and never got
the hang again, so I just chalked it off as "fixed" so far. But you
still seem to get it with intrepid's current kernel?

Thanks, Pitti

Revision history for this message
Loïc Minier (lool) wrote :

I got a hang in one boot out of 6; not sure whether the patches improve the situation for you folks or not.

Revision history for this message
Loïc Minier (lool) wrote :

pitti, I got frequent hangs yesterday and infrequent hangs today, with or without my patched kernel.

I'm always booting without usplash and without quiet.

Revision history for this message
Loïc Minier (lool) wrote :

I often see snd-hda-intel messages aside of iwl3945; it looks like they do detection concurrently. Both drivers print stuff about INTerrupts.

Revision history for this message
DSHR (s-heuer) wrote :

I blacklisted iwl3945 and added a modprobe iwl3945 in /etc/rc.local
I had no freezes since then and this is my dmesg output:

[ 681.433407] Bluetooth: RFCOMM TTY layer initialized
[ 681.433420] Bluetooth: RFCOMM ver 1.10
[ 681.452827] Bridge firewalling registered
[ 681.454271] pan0: Dropping NETIF_F_UFO since no NETIF_F_HW_CSUM feature.
[ 681.516090] Bluetooth: SCO (Voice Link) ver 0.6
[ 681.516101] Bluetooth: SCO socket layer initialized
[ 685.223193] iwl3945: Intel(R) PRO/Wireless 3945ABG/BG Network Connection driver for Linux, 1.2.26ks
[ 685.223201] iwl3945: Copyright(c) 2003-2008 Intel Corporation
[ 685.224860] iwl3945 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 685.224883] iwl3945 0000:03:00.0: setting latency timer to 64
[ 685.224914] iwl3945: Detected Intel Wireless WiFi Link 3945ABG
[ 685.286373] iwl3945: Tunable channels: 13 802.11bg, 23 802.11a channels
[ 685.287998] phy0: Selected rate control algorithm 'iwl-3945-rs'
[ 685.378425] iwl3945 0000:03:00.0: PCI INT A disabled
[ 685.809758] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 686.088367] [drm] Initialized drm 1.1.0 20060810
[ 686.133056] pci 0000:00:02.0: power state changed by ACPI to D0
[ 686.133081] pci 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 686.133091] pci 0000:00:02.0: setting latency timer to 64
[ 686.138910] [drm] Initialized i915 1.6.0 20060119 on minor 0
[ 689.392622] iwl3945 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 689.392789] iwl3945 0000:03:00.0: restoring config space at offset 0x1 (was 0x100102, writing 0x100106)
[ 689.396877] firmware: requesting iwlwifi-3945-1.ucode
[ 689.542012] Registered led device: iwl-phy0:radio
[ 689.542104] Registered led device: iwl-phy0:assoc
[ 689.542140] Registered led device: iwl-phy0:RX
[ 689.542176] Registered led device: iwl-phy0:TX
[ 689.549639] ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 689.685114] NET: Registered protocol family 17
[ 763.680856] CPU0 attaching NULL sched-domain.

Not a solution - but a workaround for me ...

Revision history for this message
Mikael Nilsson (mini) wrote :

Loïc:

There are no signs of a crash here. An oops should show up on the console if it were the same bug, correct? It does not. The boot process just stops with no indication of error. And there is no crash if the driver is loaded after boot instead. So I doubt this bug is the same as bug #275227. It's still possible that the fix also fixes this bug though, so it's worth a try.

Martin: I get hangs whether I use "quiet usplash" or not. And still on the -7 kernel, which is current.

Revision history for this message
Sander Jonkers (jonkers) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-7 sometimes fails to boot (iwl3945 issue?)

I tested it a few times, and even with the Wifi switch turned off, the boot
process halts sometimes. :-(

Sander

Revision history for this message
Sander Jonkers (jonkers) wrote :

As described, I blacklisted iwl3945 in /etc/modprobe.d/blacklist and added
the modprobe to /etc/rc.local. Result: 3 successful boots, no hanging boots.
Does this mean we have a workaround?
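
The blacklist-and-load-late workaround described above can be sketched as a small script. The two file paths are the ones named in this thread; the function name and the optional root-prefix argument are assumptions added so the sketch can be tried on a scratch copy first. It also assumes GNU sed and an rc.local that ends with "exit 0", as the stock Ubuntu file does.

```shell
#!/bin/sh
# Blacklist iwl3945 for automatic loading and load it from rc.local instead.
# $1 is an optional root prefix (hypothetical, for trying this on a scratch
# copy of /etc); with no argument the real files are edited, which needs root.
apply_late_load_workaround() {
    root="${1:-}"
    blacklist="$root/etc/modprobe.d/blacklist"
    rclocal="$root/etc/rc.local"
    load_cmd='/sbin/modprobe iwl3945'

    # 1. Keep the module from being loaded automatically during boot.
    mkdir -p "$root/etc/modprobe.d"
    grep -qxF 'blacklist iwl3945' "$blacklist" 2>/dev/null ||
        echo 'blacklist iwl3945' >> "$blacklist"

    # 2. Load it explicitly at the end of boot. A blacklist entry does not
    #    stop an explicit modprobe by module name.
    if grep -qxF "$load_cmd" "$rclocal" 2>/dev/null; then
        :   # already present, nothing to do
    elif grep -qx 'exit 0' "$rclocal" 2>/dev/null; then
        # Insert before "exit 0" so the command is actually reached (GNU sed).
        sed -i "/^exit 0\$/i $load_cmd" "$rclocal"
    else
        echo "$load_cmd" >> "$rclocal"
    fi
}
```

The function is written to be safe to run twice. To undo it, remove the two added lines.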

If loading of iwl3945 is not a problem in the end of the boot process
(/etc/rc.local), does this mean there is some kind of timing issue or race
condition earlier in the boot process?

And FWIW: My Ubuntu 8.10 now also automatically re-connects to my WEP-AP,
which it didn't so far. It did reconnect to non-encrypted and to WPA, but
not to WEP APs. Is that solved too ... ?

Sander

Revision history for this message
Sebastian Breier (tomcat42) wrote :

Just to comment on that specific part: I get hangs with "quiet usplash" as well.
That's how I found the bug at first: It would just hang, and I wondered why, so I removed "quiet usplash".
No messages though, as has been reported before.

Revision history for this message
Hernando Torque (htorque) wrote :

I can confirm that blacklisting iwl3945 stops the hangs while blacklisting snd_hda_intel (which gets initialized at the same time) did not help.

Revision history for this message
Joe Barnett (thejoe) wrote :

For what it's worth, I've also seen the system hard freeze when trying to connect to a wireless network. I've also only gotten 2 or 3 successful boots (of course the first boot after upgrade worked!) in 15-20 tries before booting from the install CD and blacklisting iwl3945 in /etc/modprobe.d.

Revision history for this message
LumpyCustard (orangelumpycustard) wrote :

I also found that (didn't think it was worth mentioning before).
The first time I booted after upgrading, it was fine (then the auto-clean kicked in and deleted all my old kernels)... then it wasn't fine.

Was this the same with everyone else?

Revision history for this message
Jakob Petsovits (jpetso) wrote :

Nope, when I first updated I couldn't log in from the start. After a few unsuccessful attempts to boot the 2.6.27 kernel, I settled on 2.6.24 and only found out through this issue that the lockup actually only happens sometimes. So, no. Pure random race condition, it seems.

Revision history for this message
Sander Jonkers (jonkers) wrote :

No, not with me. At first I had the idea it was related to a cold versus a
warm reboot, but now my hypothesis is that the halts happen randomly, with a
50 - 70% chance.

So the workaround of blacklist & modprobe is very useful; it saves a lot of
time.

Revision history for this message
Matt Zimmerman (mdz) wrote :

The description says that this problem appeared between 2.6.27-1 and 2.6.27-2. Can anyone else experiencing this problem confirm that?

If so, we can start reviewing the changes between those versions and narrow the possibilities.

Revision history for this message
Matt Zimmerman (mdz) wrote :

It's been reported that, rather than blacklisting iwl3945, it's sufficient to do the following:

echo 'install iwl3945 { sleep 5; /sbin/modprobe --ignore-install iwl3945; }' | sudo tee /etc/modprobe.d/iwl3945

This introduces an artificial delay in the startup process before loading iwl3945. It will slow down the boot, but does seem to avoid whatever race condition is being triggered here.

Confirmation of this is appreciated.
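
To make the rule above easier to experiment with, for example with a shorter delay, here is a small sketch that prints the same kind of modprobe.d "install" line for a given module and delay. The function name is invented for illustration. The directive itself works as in the command above: "install" makes modprobe run the given command instead of loading the module directly, and --ignore-install stops the inner modprobe from recursing into the same rule.

```shell
#!/bin/sh
# Print a modprobe.d "install" rule that sleeps before loading a module.
# $1 = module name, $2 = delay in seconds (defaults to 5, as in the thread).
delayed_install_rule() {
    module="$1"
    delay="${2:-5}"
    printf 'install %s { sleep %s; /sbin/modprobe --ignore-install %s; }\n' \
        "$module" "$delay" "$module"
}

# Example: delayed_install_rule iwl3945 5 | sudo tee /etc/modprobe.d/iwl3945
```

Removing /etc/modprobe.d/iwl3945 again restores the normal load order.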

Revision history for this message
Steve Langasek (vorlon) wrote :

Can the users who can reproduce this provide the full output from lspci -nn on their systems?

Is anyone on the kernel team able to reproduce this problem?

Steve Langasek (vorlon)
Changed in linux:
milestone: none → ubuntu-8.10
Revision history for this message
David Mandala (davidm) wrote :

I can confirm the bug; below is the output from lspci -nn.

If I have quiet on the command line it happens; if I remove it, the system does not lock up.

pci=nomsi on the kernel line did not affect the bug; it still happened.

00:00.0 Host bridge [0600]: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express Memory Controller Hub [8086:27a0] (rev 03)
00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller [8086:27a2] (rev 03)
00:02.1 Display controller [0380]: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller [8086:27a6] (rev 03)
00:1b.0 Audio device [0403]: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller [8086:27d8] (rev 02)
00:1c.0 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 [8086:27d0] (rev 02)
00:1c.1 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 2 [8086:27d2] (rev 02)
00:1c.2 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 3 [8086:27d4] (rev 02)
00:1c.3 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 4 [8086:27d6] (rev 02)
00:1d.0 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 [8086:27c8] (rev 02)
00:1d.1 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 [8086:27c9] (rev 02)
00:1d.2 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 [8086:27ca] (rev 02)
00:1d.3 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 [8086:27cb] (rev 02)
00:1d.7 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller [8086:27cc] (rev 02)
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 Mobile PCI Bridge [8086:2448] (rev e2)
00:1f.0 ISA bridge [0601]: Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge [8086:27b9] (rev 02)
00:1f.1 IDE interface [0101]: Intel Corporation 82801G (ICH7 Family) IDE Controller [8086:27df] (rev 02)
00:1f.2 SATA controller [0106]: Intel Corporation 82801GBM/GHM (ICH7 Family) SATA AHCI Controller [8086:27c5] (rev 02)
00:1f.3 SMBus [0c05]: Intel Corporation 82801G (ICH7 Family) SMBus Controller [8086:27da] (rev 02)
02:00.0 Ethernet controller [0200]: Intel Corporation 82573L Gigabit Ethernet Controller [8086:109a]
03:00.0 Network controller [0280]: Intel Corporation PRO/Wireless 3945ABG Network Connection [8086:4227] (rev 02)
15:00.0 CardBus bridge [0607]: Texas Instruments PCI1510 PC card Cardbus Controller [104c:ac56]

Revision history for this message
Loïc Minier (lool) wrote :

Some notes from debug I did in another bug: I tried "modprobe -r" / "modprobe" in a loop and couldn't trigger the bug.

I also noticed that snd-hda-intel was often on the screen when it crashed, so I tried modprobing that as well at the same time, to no luck.

Finally, I also get hangs with thinkpad-acpi on boot since about the same time.

I can't tell whether snd-hda-intel and thinkpad-acpi relate to this bug for sure or not.

Revision history for this message
Lucian Adrian Grijincu (lucian.grijincu) wrote :

On Thu, Oct 16, 2008 at 12:07 AM, Steve Langasek
<email address hidden> wrote:
> Can the users who can reproduce this provide the full output from lspci
> -nn on their systems?
 Attached.

--
Lucian

Revision history for this message
Loïc Minier (lool) wrote :
Revision history for this message
Gert van Dijk (gertvdijk) wrote :

Encountered this today while running a daily-live CD on my Thinkpad T61p. I not only got a hang, but also (very) loud beeps during boot, very soon after the kernel was loaded.
I couldn't reproduce it immediately, so I wondered what it could be (I have now found this bug report).
Anyway, my lspci -nn is attached.

Revision history for this message
Aanjhan Ranganathan (aanjhan) wrote :

This is mine! I still get hangs once in 4 times. I am running the 2.6.27-7 kernel.

00:00.0 Host bridge [0600]: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express Memory Controller Hub [8086:27a0] (rev 03)
00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller [8086:27a2] (rev 03)
00:02.1 Display controller [0380]: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller [8086:27a6] (rev 03)
00:1b.0 Audio device [0403]: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller [8086:27d8] (rev 01)
00:1c.0 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 [8086:27d0] (rev 01)
00:1c.1 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 2 [8086:27d2] (rev 01)
00:1c.3 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 4 [8086:27d6] (rev 01)
00:1d.0 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 [8086:27c8] (rev 01)
00:1d.1 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 [8086:27c9] (rev 01)
00:1d.2 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 [8086:27ca] (rev 01)
00:1d.3 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 [8086:27cb] (rev 01)
00:1d.7 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller [8086:27cc] (rev 01)
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 Mobile PCI Bridge [8086:2448] (rev e1)
00:1f.0 ISA bridge [0601]: Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge [8086:27b9] (rev 01)
00:1f.2 IDE interface [0101]: Intel Corporation 82801GBM/GHM (ICH7 Family) SATA IDE Controller [8086:27c4] (rev 01)
00:1f.3 SMBus [0c05]: Intel Corporation 82801G (ICH7 Family) SMBus Controller [8086:27da] (rev 01)
03:00.0 Ethernet controller [0200]: Broadcom Corporation BCM4401-B0 100Base-TX [14e4:170c] (rev 02)
03:01.0 FireWire (IEEE 1394) [0c00]: Ricoh Co Ltd R5C832 IEEE 1394 Controller [1180:0832]
03:01.1 SD Host controller [0805]: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter [1180:0822] (rev 19)
03:01.2 System peripheral [0880]: Ricoh Co Ltd R5C843 MMC Host Controller [1180:0843] (rev 0a)
03:01.3 System peripheral [0880]: Ricoh Co Ltd R5C592 Memory Stick Bus Host Adapter [1180:0592] (rev 05)
03:01.4 System peripheral [0880]: Ricoh Co Ltd xD-Picture Card Controller [1180:0852] (rev ff)
0c:00.0 Network controller [0280]: Intel Corporation PRO/Wireless 3945ABG [Golan] Network Connection [8086:4222] (rev 02)

Revision history for this message
Matt Zimmerman (mdz) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-7 sometimes fails to boot (iwl3945 issue?)

On Wed, Oct 15, 2008 at 09:25:20PM -0000, Loïc Minier wrote:
> Some notes from debug I did in another bug: I tried "modprobe -r" /
> "modprobe" in a loop and couldn't trigger the bug.
>
> I also noticed that snd-hda-intel was often on the screen when it
> crashed, so I tried modprobing that as well at the same time, to no
> luck.

I tried the same things on davidm's machine, with the same results (no
hang).

It's only reproducible during boot so far, but he's now able to reproduce it
during boot very reliably with splash and quiet turned off, so there's a
chance to debug it.

--
 - mdz

Revision history for this message
David Mandala (davidm) wrote :

Screen picture taken at lockup

Revision history for this message
David Mandala (davidm) wrote :

Screen picture taken at the next reboot; it also locked up, at a different point.

Revision history for this message
Hernando Torque (htorque) wrote :
Revision history for this message
Matt Zimmerman (mdz) wrote :

Based on David's screenshots, and comparing to dmesg for a successful boot on the same machine, it looks like it's crashing between:

iwl3945: Tunable channels: 11 802.11bg, 13 802.11a channels
and
phy0: Selected rate control algorithm 'iwl-3945-rs'

Is that consistent with what others are seeing? Does anyone see the 'phy0' message when it hangs?

Revision history for this message
Lucian Adrian Grijincu (lucian.grijincu) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-7 sometimes fails to boot (iwl3945 issue?)

On Thu, Oct 16, 2008 at 12:07 AM, Matt Zimmerman <email address hidden> wrote:
> echo 'install iwl3945 { sleep 5; /sbin/modprobe --ignore-install
> iwl3945; }' | sudo tee /etc/modprobe.d/iwl3945
>
> Confirmation of this is appreciated.

I removed iwl3945 from the blacklist and applied your workaround.
I rebooted a few times. It seems to work fine.

--
Lucian

Revision history for this message
Hernando Torque (htorque) wrote :

Another confirmation. 15 boots without a single lockup - seems to be a good backup plan. Hard resets are not really healthy for the hard disk so I hope we will see some kind of workaround for the final?

Revision history for this message
Matt Zimmerman (mdz) wrote :

Hernando: we are working on it, and hope to fix the bug if we can get enough help from folks who are experiencing it. Feedback on https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/263059/comments/95 would be appreciated.

Changed in linux:
status: Confirmed → In Progress
Revision history for this message
Loïc Minier (lool) wrote :

I tried disabling snd-hda-intel, but I could reproduce the hang. Interestingly, the hang happened further in the boot, at "Activating swap...", not immediately in iwl3945's output.

Revision history for this message
Hernando Torque (htorque) wrote :

I did some video logging yesterday (see bugzilla): I've never seen the phy0 line when it hangs but sometimes

iwl3945 0000:05:00.0: PCI INT A disabled

which seems to usually come after the phy0 line (on the last few successful boots it did).

Revision history for this message
Loïc Minier (lool) wrote :

Matt, the linux bug has some stats with respect to boot messages from Hernando.

The last hang which I see on screen has more output after iwl3945, but the last line from iwl3945 is the "Tunable channels" line.

Revision history for this message
Loïc Minier (lool) wrote :
Revision history for this message
Aanjhan Ranganathan (aanjhan) wrote :

Matt's workaround seems to work. 6 reboots and not a single hang. This many successive "successful" reboots had only occurred when I was on Hardy :P

Revision history for this message
David Mandala (davidm) wrote :

A series of pictures captured one per hang.

Revision history for this message
David Mandala (davidm) wrote :

next hang

Revision history for this message
David Mandala (davidm) wrote :

Next hang

Revision history for this message
David Mandala (davidm) wrote :

Next hang

Revision history for this message
David Mandala (davidm) wrote :

Next

Revision history for this message
David Mandala (davidm) wrote :

Next

Revision history for this message
David Mandala (davidm) wrote :

Next

Revision history for this message
David Mandala (davidm) wrote :

Last hang in a row.

Revision history for this message
Hernando Torque (htorque) wrote :

Just produced a hang directly after

> Detected Intel Wireless WiFi Link 3945ABG

That's before the channel scanning and like the first run here http://www.intellinuxwireless.org/bugzilla/show_bug.cgi?id=1778#c15

So

> Tunable channels: 13 802.11bg, 23 802.11a channels

is not characterizing the beginning of the problematic section.

Revision history for this message
David Mandala (davidm) wrote :

Out of 8 lockups in a row (pictures above) we get:

3 got as far as Tunable Channels
5 got as far as iWL3945 PCI INT A disabled

This is with the radio hardware button off.

Revision history for this message
Hernando Torque (htorque) wrote :

Went through four days of syslog. The successful boots always followed the order:

Tunable channels: 13 802.11bg, 23 802.11a channels
phy0: Selected rate control algorithm 'iwl-3945-rs'
iwl3945 0000:05:00.0: PCI INT A disabled

As stated earlier, I've never seen the phy0 line when booting failed, but I had some hangs showing the "PCI INT A disabled" line.
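
For anyone who wants to repeat this check on their own logs, here is a minimal sketch. The function name iwl_init_lines is made up for illustration, and /var/log/syslog is assumed to be where the kernel messages end up; adjust the path if your logs live elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: print, in log order, the three iwl3945 init
# messages discussed above so their ordering can be compared per boot.
iwl_init_lines() {
    grep -E "Tunable channels|Selected rate control algorithm|PCI INT A disabled" "$1"
}

# Example: iwl_init_lines /var/log/syslog
```

Keep in mind that messages printed just before a hard freeze may never reach the disk, so a photo of the screen can show more than the log does.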

Revision history for this message
Joe Barnett (thejoe) wrote :

$ lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express Memory Controller Hub [8086:27a0] (rev 03)
00:01.0 PCI bridge [0604]: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express PCI Express Root Port [8086:27a1] (rev 03)
00:1b.0 Audio device [0403]: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller [8086:27d8] (rev 01)
00:1c.0 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 [8086:27d0] (rev 01)
00:1c.1 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 2 [8086:27d2] (rev 01)
00:1c.2 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 3 [8086:27d4] (rev 01)
00:1c.3 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 4 [8086:27d6] (rev 01)
00:1d.0 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 [8086:27c8] (rev 01)
00:1d.1 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 [8086:27c9] (rev 01)
00:1d.2 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 [8086:27ca] (rev 01)
00:1d.3 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 [8086:27cb] (rev 01)
00:1d.7 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller [8086:27cc] (rev 01)
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 Mobile PCI Bridge [8086:2448] (rev e1)
00:1f.0 ISA bridge [0601]: Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge [8086:27b9] (rev 01)
00:1f.2 IDE interface [0101]: Intel Corporation 82801GBM/GHM (ICH7 Family) SATA IDE Controller [8086:27c4] (rev 01)
00:1f.3 SMBus [0c05]: Intel Corporation 82801G (ICH7 Family) SMBus Controller [8086:27da] (rev 01)
01:00.0 VGA compatible controller [0300]: nVidia Corporation G72M [Quadro NVS 110M/GeForce Go 7300] [10de:01d7] (rev a1)
03:01.0 CardBus bridge [0607]: O2 Micro, Inc. Cardbus bridge [1217:7135] (rev 21)
03:01.4 FireWire (IEEE 1394) [0c00]: O2 Micro, Inc. Firewire (IEEE 1394) [1217:00f7] (rev 02)
09:00.0 Ethernet controller [0200]: Broadcom Corporation NetXtreme BCM5752 Gigabit Ethernet PCI Express [14e4:1600] (rev 02)
0c:00.0 Network controller [0280]: Intel Corporation PRO/Wireless 3945ABG [Golan] Network Connection [8086:4222] (rev 02)

Revision history for this message
Wilken Haase (hibbelharry) wrote :

This Bug affects me too. Happens on HP NC6400 RH485EA Notebook, newest bios and so on. Matt's Delay workaround seems to help for now.

Tim Gardner (timg-tpi)
Changed in linux:
assignee: ubuntu-kernel-team → timg-tpi
Revision history for this message
Parthan SR (parth-technofreak) wrote :

Matt,

This is what I get for your comment #95 after a successful boot immediately after the bug.

$ dmesg | grep iwl3945
[ 19.456367] iwl3945: Intel(R) PRO/Wireless 3945ABG/BG Network Connection driver for Linux, 1.2.26ks
[ 19.456375] iwl3945: Copyright(c) 2003-2008 Intel Corporation
[ 19.456516] iwl3945 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 19.456534] iwl3945 0000:03:00.0: setting latency timer to 64
[ 19.456564] iwl3945: Detected Intel Wireless WiFi Link 3945ABG
[ 19.567178] iwl3945: Tunable channels: 11 802.11bg, 13 802.11a channels
[ 20.063927] iwl3945 0000:03:00.0: PCI INT A disabled
[ 38.939442] iwl3945 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 38.939614] iwl3945 0000:03:00.0: restoring config space at offset 0x1 (was 0x100002, writing 0x100006)
[ 39.016150] iwl3945: Radio disabled by HW RF Kill switch
[ 39.016248] iwl3945 0000:03:00.0: PCI INT A disabled
[ 39.029675] iwl3945 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 39.029831] iwl3945 0000:03:00.0: restoring config space at offset 0x1 (was 0x100002, writing 0x100006)
[ 39.030551] iwl3945: Radio disabled by HW RF Kill switch
[ 39.030807] iwl3945 0000:03:00.0: PCI INT A disabled
$ dmesg | grep phy0
[ 19.584980] phy0: Selected rate control algorithm 'iwl-3945-rs'

Also, for me the bug has started repeating itself on every alternate boot. I switch between a wireless connection at home and a wired connection at work. When I do the first boot at work in the morning, it gets stuck but boots up after a restart. Similarly, when I boot it the first time after going home with wireless turned on, it hangs but boots at the second attempt.

Revision history for this message
Alexey Balmashnov (a.balmashnov) wrote :

lspci -nn attached. I posted a screenshot earlier in duplicate bug 263928.

As for https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/263059/comments/95, I do not remember seeing the "phy0:..." line when the system hangs.

Revision history for this message
Christoph Burgdorf (christoph-burgdorf) wrote :

Affects me too; it happens on an HP 6510b. When it fails to boot, my display shows a lot of graphical errors, which is why I first assumed it was a graphics-driver-related error.

Revision history for this message
Matt Zimmerman (mdz) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-7 sometimes fails to boot (iwl3945 issue?)

On Wed, Oct 15, 2008 at 10:33:55PM -0000, Loïc Minier wrote:
> I tried disabling snd-hda-intel, but I could reproduce the hang.
> Interestingly, the hang happened further in the boot, at "Activating

Yes, the hang seems to occur while the iwl3945 driver is initializing, which
continues after the module has been loaded (and while other modules are
being loaded). The ordering varies, but if we delay iwl3945 for 5 seconds
(which should cause it to be loaded last, on its own) the problem goes away.

I'm not entirely convinced that iwl3945 is actually at fault; it may just be
a trigger.

Can anyone else confirm, as stated in the bug description, that the problem
appeared between 2.6.27-1 and 2.6.27-2?

--
 - mdz

Revision history for this message
Alexey Balmashnov (a.balmashnov) wrote :

Matt Zimmerman wrote:
> Can anyone else confirm, as stated in the bug description, that the problem
> appeared between 2.6.27-1 and 2.6.27-2?

I don't remember ever using the 2.6.27-1 kernel. Is there a simple way to install it? I think many people started testing 8.10 after that kernel was introduced (e.g. from the beta release), and having such instructions in the comments would result in broader testing (I will definitely try).

Revision history for this message
Sander Jonkers (jonkers) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-7 sometimes fails to boot (iwl3945 issue?)

On Thu, Oct 16, 2008 at 10:41 AM, Alexey Balmashnov
<email address hidden> wrote:
>
> Matt Zimmerman wrote:
> > Can anyone else confirm, as stated in the bug description, that the problem
> > appeared between 2.6.27-1 and 2.6.27-2?
>
> I don't remember that I ever used 2.6.27-1 kernel. Is there a simple way
> to install it? I think that many people started to test 8.10 later, than
> the kernel was introduced (e.g. from beta release), and having such an
> instruction in the comments would result in broader testing (I will
> definitely try).

It seems linux-image-2.6.27-1-generic has existed, as Google reports:

     Ubuntu -- Details of package linux-image-2.6.27-1-generic in intrepid

     Linux kernel image for version 2.6.27 on x86/x86_64.
     packages.ubuntu.com/nl/intrepid/main/linux-image-2.6.27-1-generic - 14k -

However, http://packages.ubuntu.com/intrepid/linux-image-2.6.27-1-generic
does not exist (anymore), whereas
http://packages.ubuntu.com/intrepid/linux-image-2.6.27-7-generic does
exist.

So: where's linux-image-2.6.27-1-generic?

Revision history for this message
Matt Zimmerman (mdz) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-7 sometimes fails to boot (iwl3945 issue?)

On Thu, Oct 16, 2008 at 09:13:13AM -0000, Sander Jonkers wrote:
> On Thu, Oct 16, 2008 at 10:41 AM, Alexey Balmashnov
> <email address hidden> wrote:
> >
> > Matt Zimmerman wrote:
> > > Can anyone else confirm, as stated in the bug description, that the problem
> > > appeared between 2.6.27-1 and 2.6.27-2?
> >
> > I don't remember that I ever used 2.6.27-1 kernel. Is there a simple way
> > to install it? I think that many people started to test 8.10 later, than
> > the kernel was introduced (e.g. from beta release), and having such an
> > instruction in the comments would result in broader testing (I will
> > definitely try).
>
> It seems linux-image-2.6.27-1-generic has existed, as Google reports:
>
> Ubuntu -- Details of package linux-image-2.6.27-1-generic in
> intrepid
>
> Linux kernel image for version 2.6.27 on x86/x86_64.
> packages.ubuntu.com/nl/intrepid/main/linux-image-2.6.27-1-generic - 14k -
>
> However, http://packages.ubuntu.com/intrepid/linux-image-2.6.27-1-generic
> does not exist (anymore), whereas
> http://packages.ubuntu.com/intrepid/linux-image-2.6.27-7-generic does
> exist.
>
> So: where's linux-image-2.6.27-1-generic?

You can find old versions of the Ubuntu kernel at:
https://edge.launchpad.net/ubuntu/+source/linux

Click on a version:
https://edge.launchpad.net/ubuntu/+source/linux/2.6.27-1.2

Click on a build for the right architecture (say i386):
https://edge.launchpad.net/ubuntu/+source/linux/2.6.27-1.2/+build/699407

Then find the linux-image-xxx-generic package:
https://edge.launchpad.net/ubuntu/intrepid/i386/linux-image-2.6.27-1-generic/2.6.27-1.2

and click the link to download the .deb:
http://launchpadlibrarian.net/17040869/linux-image-2.6.27-1-generic_2.6.27-1.2_i386.deb

It would help if someone could narrow down the regression to a specific
version by thoroughly testing each one.

--
 - mdz
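
The last two steps above could be scripted roughly as follows. This is only a sketch: the URL is the i386 one quoted above, the function name and the RUN dry-run hook are made up for illustration, and dpkg -i is assumed to be an acceptable way to install a standalone kernel .deb on your system.

```shell
#!/bin/sh
# URL copied from the comment above (i386 build of 2.6.27-1.2).
DEB_URL="http://launchpadlibrarian.net/17040869/linux-image-2.6.27-1-generic_2.6.27-1.2_i386.deb"
# File name is the last path component of the URL.
DEB_FILE="${DEB_URL##*/}"

# Hypothetical dry-run hook: set RUN=echo to print the commands
# instead of executing them.
RUN="${RUN:-}"

# Download the package and install it with dpkg.
install_old_kernel() {
    $RUN wget "$DEB_URL" &&
    $RUN sudo dpkg -i "$DEB_FILE"
}

# Example: install_old_kernel
```

Running it with RUN=echo first shows exactly what would be executed before anything is changed on the system.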

Revision history for this message
Matt Zimmerman (mdz) wrote :

Another helpful thing to try would be to try the instrumented iwl3945.ko and iwlcore.ko modules I'm attaching now. These will print additional debug information, so boot the system without "quiet" or "splash" and send us a photo of what appears on the screen when it hangs.

Revision history for this message
Hernando Torque (htorque) wrote :

Definitely happening with -2. Now trying -1 to see if -2 is where it started. Will then use your modules to get more info.

Revision history for this message
Hernando Torque (htorque) wrote :

Unlike the bug reporter I also have hangs with -1.2. -1.1 not available, so I tested 2.6.26-5.17 which turned out to be fine. Can anyone confirm this?

Matt Zimmerman (mdz)
description: updated
Revision history for this message
Tim Holy (holy-wustl) wrote :

With a Thinkpad T60, 2.6.27-1 also caused a hard lockup for me. See bug 263330. I think the "working on 2.6.27-1" must have been a red herring, explainable since the problem is intermittent.

Revision history for this message
Hernando Torque (htorque) wrote :
Revision history for this message
Frederic PO (fredericp) wrote :

Here are my results with instrumented modules posted by Matt.
Different sequences each time.
Kernel 2.6.27-7-generic #1 SMP Fri Oct 10 03:55:24 UTC 2008 i686 GNU/Linux
Hope it helps.

Revision history for this message
SK (stephantom) wrote :

The kernel bug tracker which is attached to this report does list the issue as RESOLVED/CODE_FIX. See http://bugzilla.kernel.org/show_bug.cgi?id=11746 - does this help us?

Matt Zimmerman (mdz)
description: updated
Revision history for this message
Mikael Nilsson (mini) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-7 sometimes fails to boot (iwl3945 issue?)

On Thu 2008-10-16 at 08:21 +0000, Matt Zimmerman wrote:

>
> Can anyone else confirm, as stated in the bug description, that the problem
> appeared between 2.6.27-1 and 2.6.27-2?

I'm the bug reporter. I just tried 2.6.27-1.2, and it hung.

I can't say for sure whether I ran 2.6.27-1.1 successfully or whether I
was just lucky with -1.2 and never happened to have the hang before I
updated to -2.

Sorry for the misinformation.

/Mikael

Revision history for this message
Matt Zimmerman (mdz) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-7 sometimes fails to boot (iwl3945 issue?)

On Thu, Oct 16, 2008 at 12:26:43PM -0000, Stephan Klein wrote:
> The kernel bug tracker which is attached to this report does list the
> issue as RESOLVED/CODE_FIX. See
> http://bugzilla.kernel.org/show_bug.cgi?id=11746 - does this help us?

No, unfortunately that is a different issue (bug 275227) which is already
fixed in the Intrepid kernel. That one will crash it every time.

--
 - mdz

Revision history for this message
Matt Zimmerman (mdz) wrote :

The screenshots with instrumented modules seem to indicate that the hang is occurring sometimes while iwl3945 is initializing, and other times after it is finished initializing. This hints that something else happening in parallel is triggering the hang.

Revision history for this message
Hernando Torque (htorque) wrote :

After blacklisting rfkill (just tried it because the driver was added with 2.6.27) the phy0 line showed up for the first time:
http://img.xrmb2.net/images/436677.jpeg
It wasn't there one hang later, though.

Revision history for this message
Tim Gardner (timg-tpi) wrote :

I think there is some weird racy stuff going on with rf-kill. In the i3945 init code you can see that it disables the PCI device right before it registers the rf-kill handler. This doesn't prevent the handler from being called, which looks like it might access the device even though it's disabled. I imagine the side effects of this are hardware dependent.

Incidentally, the init code in Hardy is completely different.

Revision history for this message
Luke12 (luca-venturini) wrote :

Not very useful, but I'd like to add a "I am NOT affected by this bug" comment. I use a Dell Inspiron 6400, Nvidia Driver, IWL3945, and I have been using Intrepid both from live cd and as my default OS for two weeks now. I have never had any problem at boot time, and actually this driver is at its best ever. If you need any data to understand why this model is not affected, please ask.

Revision history for this message
André Ventura (afv) wrote :

> Not very useful, but I'd like to add a "I am NOT affected by this bug" comment. I use a Dell Inspiron 6400, Nvidia
> Driver, IWL3945, and I have been using Intrepid both from live cd and as my default OS for two weeks now. I have
> never had any problem at boot time, and actually this driver is at its best ever. If you need any data to understand
> why this model is not affected, please ask.

What is the kernel version you're using?

Revision history for this message
Sebastian Breier (tomcat42) wrote :

Luke12: Might be a very special model though. I myself have a Dell Inspiron 6400 (though without NVidia), and I do have the bug.

Revision history for this message
Hernando Torque (htorque) wrote :

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/263059/comments/43 also has an Inspiron 6400 and the bug. Maybe the output of lspci -nn could be helpful?

Revision history for this message
Henry Gomersall (hgomersall) wrote :

What graphics driver are people using? I had funny interactions around networking/compiz/radeon r500 using the open source radeon driver, which I'm using at the moment.

Revision history for this message
Sebastian Breier (tomcat42) wrote :

Here's my lspci -nn output.

Revision history for this message
Sebastian Breier (tomcat42) wrote :

I use the Intel 915 driver (module i915) for graphics.

Revision history for this message
Geek87 (geek87) wrote :

Hi, I'm experiencing the same bug with my ASUS F3JC laptop, which is equipped with an ipw3945 card too. I don't think the graphics card has anything to do with this bug, since I have an nVIDIA and the bug seems to happen with Intel and ATI cards as well. I don't know if this is important, but I don't remember having this bug with the live CD of the beta, which I booted many times.

Revision history for this message
LumpyCustard (orangelumpycustard) wrote :

I booted into the live cd (I think it had kernel -5 on it) today with no problem also. That was on a Lenovo machine

Revision history for this message
Luke12 (luca-venturini) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-7 sometimes fails to boot (iwl3945 issue?)
  • lspci.txt (2.4 KiB, text/plain; name="lspci.txt"; charset="UTF-8")

Neither do I actually; I have not experienced this bug either with nv
driver or the nvidia one. I am attaching here my lspci -nn. As for the
kernel version, I am using the last one provided, which is
2.6.27-7-generic. I have all the available updates repositories enabled.

On Thu, 16/10/2008 at 16.13 +0000, Geek87 wrote:
> Hi, I'm experiencing the same bug with my ASUS F3JC laptop wich is
> equiped with an ipw3945 card too. I don't think the graphic card has
> something to see with this bug since I've a nVIDIA and the bug seems to
> happen with Intel and Ati cards as well. I don't know if this is
> important but I don't remember to have had this bug witht the live cd of
> the beta which I booted a lot of times.
>

Revision history for this message
Sebastian Breier (tomcat42) wrote :

LumpyCustard: Please remember you usually need multiple boots (about 1 in 5 for me) to get a hang.
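
To put a number on that: if hangs are assumed to be independent from boot to boot with a fixed hang rate p (an assumption, the real behaviour may well differ), the chance of n clean boots in a row is (1-p)^n. A small sketch, with a made-up function name:

```shell
#!/bin/sh
# Hypothetical helper: probability of n consecutive clean boots when each
# boot hangs independently with probability p (assumed, not measured).
p_no_hang() {
    LC_ALL=C awk -v p="$1" -v n="$2" 'BEGIN { printf "%.4f\n", (1 - p) ^ n }'
}

# Example: p_no_hang 0.2 15
```

Under this assumption and a 1-in-5 hang rate, one clean boot is the most likely outcome (0.8), so a single good boot says little, while 15 clean boots in a row would happen by chance only about 3.5% of the time.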

Revision history for this message
Loïc Minier (lool) wrote :

@Tim: when I tried my modprobe -r / modprobe loops, I tried playing with the hardware wifi kill switch and didn't get the driver to catch in hundreds of modprobe/modprobe -r cycles and a dozen kill switch on/off cycles.

Revision history for this message
Tim Gardner (timg-tpi) wrote :

I've narrowed it down to a call to ieee80211_register_hw() from within the i3945 wireless driver, so I don't think it's driver specific. It's very racy, and seems to only happen at boot time.

Revision history for this message
Martin Pitt (pitti) wrote :

FWIW, I just started to experience the hang very late, more like 2.6.27-5; definitely not yet in -3. However, I hardly ever reboot, most of the time I just hibernate (which has never had this problem, BTW).

I also could never reproduce it without "quiet", and quite reliably with "quiet". Typically Heisenbugish.

Revision history for this message
Tim Gardner (timg-tpi) wrote :

I have to retract my last statement about the hang happening in ieee80211_register_hw(). I have now observed the i3945 module init process complete at least once right before a hang. That's going to make things much more difficult because now I don't really know where to look.

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

I am seeing very similar problems with ipw2200. See bug #284406 for details. All 2.6.27 kernels starting with 2.6.27-2.3-generic exhibit the problem for me (I haven't tried a 2.6.27 kernel earlier than -2.3).

Revision history for this message
Tim Gardner (timg-tpi) wrote :

I tried some kernel debug features (without any change in behavior):

CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=1
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_PI_LIST=y
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_SPINLOCK_SLEEP=y
CONFIG_LOCKDEP=y
CONFIG_PROVE_LOCKING=y
CONFIG_TRACE_IRQFLAGS=y
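
To see which of these options a given kernel was built with, something like the following could be used. It is only a sketch: check_debug_opts is a made-up name, only three of the options are checked, and the config file is assumed to be at /boot/config-$(uname -r), as on stock Ubuntu kernels.

```shell
#!/bin/sh
# Hypothetical helper: report whether selected lock-debugging options
# are set to "y" in the kernel config file given as $1.
check_debug_opts() {
    for opt in CONFIG_DEBUG_SPINLOCK CONFIG_LOCKDEP CONFIG_PROVE_LOCKING; do
        if grep -q "^${opt}=y\$" "$1"; then
            echo "$opt: enabled"
        else
            echo "$opt: not enabled"
        fi
    done
}

# Example: check_debug_opts "/boot/config-$(uname -r)"
```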

Revision history for this message
Hernando Torque (htorque) wrote :

The only thing we know: a five (probably less) seconds delay stops the hangs.

So I removed everything [up to "Loading hardware drivers..."] before the module gets loaded during a successful delayed boot by either blacklisting it, removing the driver, disabling it in the kernel, and adding "acpi=off" to the kernel line.

Result: I still get hangs, now looking like this: http://img.xrmb2.net/images/874633.png

I'm outta ideas.

Revision history for this message
Tim Gardner (timg-tpi) wrote :

The i3945 part of this bug is a red herring. It also happens with an ipw2200 on a Thinkpad T42 (bug #284406). I'm wondering if the problem is hardware related. Can anyone confirm this hang using a 64 bit kernel?

Revision history for this message
Luke12 (luca-venturini) wrote :

Again sorry for a "me not" post; using a 64 bit kernel here I have no
problems. Cannot say for a 32 bit kernel though. You can find my lspci
output in earlier comments.

On Fri, 17/10/2008 at 13.23 +0000, Tim Gardner wrote:
> The i3945 part of this bug is a red herring. It also happens with an
> ipw2200 on a Thinkpad T42 (bug #284406). I'm wondering if the problem is
> hardware related. Can anyone confirm this hang using a 64 bit kernel?
>

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

I've created an extremely minimal test case that replicates this bug.

First I compiled a custom kernel, this had almost all core parts compiled in and only true "drivers" as modules. I attach the config here.

Notably this only leaves iwl3945 and tg3 as modules for me.

The hang still occurred at udevadm trigger time - proving that it wasn't anything core being raced, just ordinary PCI drivers

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Next I eliminated userspace from the problem. I commented out the "start on startup" line from /etc/event.d/rcS and instead added the attached "sysinit" job.

This performs the absolute minimum necessary to get udev running, and sets off the trigger.

I still had the hang, so it's not a race with anything like dbus, HAL, NM, X, etc.

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

To make sure it wasn't any udev background processing, I cut out all the non-essential udev rules.

I was left with just this:

# ls /etc/udev/rules.d
20-names.rules
40-basic-permissions.rules
40-permissions.rules
80-programs.rules
85-ifupdown.rules
90-modprobe.rules

NOTE that I explicitly removed the network device renaming rules -- the hang still occurred, so this is not a problem with device renaming.

It must simply be a module loading issue.

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

I blacklisted iwl3945, and the boot goes normally with no hang (replicated about 50 times)

So I tried the following at the shell:

# while true; do modprobe iwl3945; sleep 0.1; modprobe -r iwl3945; sleep 0.1; done

Unfortunately this did not cause a hang, even after hundreds of iterations.

The only difference between that and what udev is doing is that udev may load modules in parallel.

So I blacklisted tg3 as well, then tried the attached shell script.
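The attached script is not reproduced in this thread. As a rough sketch of the idea only (not the attachment itself), a parallel load/unload loop might look something like the following. The function names, the iteration count argument and the MODPROBE override are assumptions made for illustration; set MODPROBE=echo for a dry run.

```shell
#!/bin/sh
# Hypothetical sketch; NOT the script attached to this bug.
# MODPROBE can be overridden (e.g. MODPROBE=echo) for a dry run.
MODPROBE=${MODPROBE:-modprobe}

# Load two modules in parallel, wait for both loads, then unload them.
load_pair() {
    "$MODPROBE" "$1" &
    "$MODPROBE" "$2" &
    wait
    "$MODPROBE" -r "$1"
    "$MODPROBE" -r "$2"
}

# Repeat the parallel load/unload cycle a given number of times.
stress_pair() {
    count=$1 mod_a=$2 mod_b=$3
    i=1
    while [ "$i" -le "$count" ]; do
        load_pair "$mod_a" "$mod_b"
        echo "iteration $i done"
        i=$((i + 1))
    done
}

# Example (as root, with both modules blacklisted so udev has not loaded them):
# stress_pair 1000 iwl3945 tg3
```

The only point of the sketch is that both modprobe calls are started before either is waited for, which is the one thing the sequential loop above does not do.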

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Not only does this hang, frequently; I also saw the following *wonderful* error message a few times:

Uhhuh. NMI received for unknown reason b1 on CPU 0.
You have some hardware problem, likely on the PCI bus.
Dazed and confused, but trying to continue.

Substituting another driver for tg3 (ThinkPad users have e1000 anyway) still seems to produce the hang: I had the pcmcia socket as a module and used that instead, and that caused the hang too.

Repeatedly loading tg3 and the pcmcia socket together does _not_ hang.

My hypothesis is that the iwl family of drivers may leave the PCI bus in an invalid state, which, when combined with another driver load, can cause a hang or at least leave the kernel severely unhappy.

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Note that I am able to reproduce the hang with the kill switch both on and off, but it is far more common with the kill switch on (device disabled).

Revision history for this message
Matt Zimmerman (mdz) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-7 sometimes fails to boot (iwl3945 issue?)

On Fri, Oct 17, 2008 at 04:00:42PM -0000, Scott James Remnant wrote:
> Note that I am able to reproduce the hang with the kill switch both on
> and off, but it is far more common with the kill switch on (device
> disabled)

That's consistent with my testing on David's machine: the frequency went up
when we turned the kill switch on.

--
 - mdz

Revision history for this message
taiebot65 (taiebot65) wrote :

Hello, I don't know if this is related to this bug, but I now hang for more than 15 seconds on every boot and shutdown.
My wifi sometimes loads and sometimes doesn't, and when it does load the connection is so weak that I can't do anything. On the wired connection I get more than 700kb/s, but on wifi less than 50kb/s.

taiebot@home:~$ lsusb
Bus 005 Device 002: ID 0bda:8187 Realtek Semiconductor Corp. RTL8187 Wireless Adapter
Bus 005 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub

Revision history for this message
Matt Zimmerman (mdz) wrote :

On Sat, Oct 18, 2008 at 04:59:01PM -0000, taiebot65 wrote:
> Hello i don't know if it is related to this bug but now i hang on boot and shutdown for more than 15 second at each boot.
> My wifi load or not load and when it load my connection is so weak i can not do anything. I ve got when i m connecting to the wire more than 700kb/s and with my wifi connected less than 50kb/s.

Your problem is not related to this bug report.

--
 - mdz

Revision history for this message
xinit (ubuntu-evenflow) wrote :

I'm experiencing the same thing on an HP Compaq nc6320. I haven't tried any workarounds yet, but I have about a 20% success rate at booting cleanly.

lspci:

00:00.0 Host bridge [0600]: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express Memory Controller Hub [8086:27a0] (rev 03)
00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller [8086:27a2] (rev 03)
00:02.1 Display controller [0380]: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller [8086:27a6] (rev 03)
00:1b.0 Audio device [0403]: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller [8086:27d8] (rev 01)
00:1c.0 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 [8086:27d0] (rev 01)
00:1c.2 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 3 [8086:27d4] (rev 01)
00:1c.3 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 4 [8086:27d6] (rev 01)
00:1d.0 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 [8086:27c8] (rev 01)
00:1d.1 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 [8086:27c9] (rev 01)
00:1d.2 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 [8086:27ca] (rev 01)
00:1d.3 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 [8086:27cb] (rev 01)
00:1d.7 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller [8086:27cc] (rev 01)
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 Mobile PCI Bridge [8086:2448] (rev e1)
00:1f.0 ISA bridge [0601]: Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge [8086:27b9] (rev 01)
00:1f.1 IDE interface [0101]: Intel Corporation 82801G (ICH7 Family) IDE Controller [8086:27df] (rev 01)
00:1f.2 SATA controller [0106]: Intel Corporation 82801GBM/GHM (ICH7 Family) SATA AHCI Controller [8086:27c5] (rev 01)
02:06.0 CardBus bridge [0607]: Texas Instruments PCIxx12 Cardbus Controller [104c:8039]
02:06.1 FireWire (IEEE 1394) [0c00]: Texas Instruments PCIxx12 OHCI Compliant IEEE 1394 Host Controller [104c:803a]
02:06.2 Mass storage controller [0180]: Texas Instruments 5-in-1 Multimedia Card Reader (SD/MMC/MS/MS PRO/xD) [104c:803b]
02:06.3 SD Host controller [0805]: Texas Instruments PCIxx12 SDA Standard Compliant SD Host Controller [104c:803c]
02:06.4 Communication controller [0780]: Texas Instruments PCIxx12 GemCore based SmartCard controller [104c:803d]
02:0e.0 Ethernet controller [0200]: Broadcom Corporation NetXtreme BCM5788 Gigabit Ethernet [14e4:169c] (rev 03)
08:00.0 Network controller [0280]: Intel Corporation PRO/Wireless 3945ABG [Golan] Network Connection [8086:4222] (rev 02)

Revision history for this message
travtek (bddunham) wrote :

I was having the same problem on my lenovo Z61m laptop which also uses the iwl3945 driver. It would consistently fail on every other boot attempt. Since I ran the Update Manager last night, it hasn't failed to boot once. Maybe it is fixed now.

Revision history for this message
Andres Järv (andresjarv) wrote :

The 2.6.27 kernel has survived 2 cold boots here too. Previously that never happened. I'll test some more.

Revision history for this message
xinit (ubuntu-evenflow) wrote :

Same here. 2 boots and no problems. Don't see any related fixes in the changelog though.

Revision history for this message
Hernando Torque (htorque) wrote :

15 boots without a hang. Fixed with 2.6.27-7.12?

Revision history for this message
Sander Jonkers (jonkers) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-7 sometimes fails to boot (iwl3945 issue?)

What's "2.6.27-7.12"? Is that an Ubuntu kernel, or a Linux kernel?

I'm now running "2.6.27-7-generic #1 SMP Fri Oct 17 22:24:21 UTC 2008 i686
GNU/Linux" (without the .12), and there are no further updates available. Is
that the version that has fixed the bug for you? I can't tell right away
because I'm using the blacklist & modprobe workaround. Is it safe / worthwhile
to remove that workaround?

Sander

On Mon, Oct 20, 2008 at 1:12 AM, Hernando Torque <email address hidden> wrote:

> 15 boots without a hang. Fixed with 2.6.27-7.12?

Revision history for this message
Alexey Balmashnov (a.balmashnov) wrote :

Sander, 2.6.27-7.12 is the actual version of the package; see http://packages.ubuntu.com/intrepid/linux-image-2.6.27-7-generic or the package description in your favorite package manager.
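The confusion arises because `uname` reports only the release string (2.6.27-7-generic), while the ".12" is part of the package version. A minimal sketch for printing both from a shell, assuming the image package is named linux-image-<release> as in the URL above; the helper names are made up for illustration:

```shell
#!/bin/sh
# Sketch: show the running kernel release next to the installed package version.
# Assumes the image package is called linux-image-<release>, as on Intrepid.
kernel_package_name() {
    printf 'linux-image-%s\n' "$1"
}

show_kernel_versions() {
    release=$(uname -r)
    echo "running release: $release"
    # dpkg-query prints the full package version, including the upload number.
    dpkg-query -W -f='${Version}\n' "$(kernel_package_name "$release")"
    # Ubuntu kernels also expose their own signature here, if present.
    if [ -r /proc/version_signature ]; then
        cat /proc/version_signature
    fi
}

# Example: show_kernel_versions
```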

Revision history for this message
Matt Zimmerman (mdz) wrote : Re: [Bug 263059] Re: [regression] 2.6.27-7 sometimes fails to boot (iwl3945 issue?)

On Mon, Oct 20, 2008 at 08:15:32AM -0000, Sander Jonkers wrote:
> What's "2.6.27-7.12"? Is that an Ubuntu kernel, or a Linux kernel?

cat /proc/version_signature

--
 - mdz

Revision history for this message
Matt Zimmerman (mdz) wrote :

On Sun, Oct 19, 2008 at 09:18:37PM -0000, xinit wrote:
> Same here. 2 boots and no problems. Don't see any related fixes in the
> changelog though.

Notably, this upload disabled the ftrace feature in the kernel. This is a
new feature in 2.6.27 which is suspected to have bugs related to loadable
modules. It may have been the culprit.

  * disable CONFIG_DYNAMIC_FTRACE due to possible memory corruption on
    module unload

--
 - mdz

Revision history for this message
Hernando Torque (htorque) wrote :

Another 15 boots without a hang. Tried to find a pattern in the syslogs but there is none. iwl3945 seems to usually get loaded earlier but not always.

I've suspected the ftrace change too and am currently building a kernel with this option enabled.

Revision history for this message
Matt Zimmerman (mdz) wrote :

Could someone with the relevant hardware try Scott's minimal test case under 2.6.27-7.12 and see if it still breaks?

Revision history for this message
Hernando Torque (htorque) wrote :

Will try it later. For now I can just confirm that enabling CONFIG_DYNAMIC_FTRACE caused hangs again (I didn't touch other parts of the config).

Revision history for this message
Loïc Minier (lool) wrote :

I booted the old and new kernels today. The old kernel would hang 3 times out of 3 when my wired cable wasn't plugged in, and didn't hang 3 times out of 3 when it was plugged in (probably a timing issue with e1000e when the cable is plugged in).

The new kernel booted successfully 6 times out of 6 (half of these tries with network cable plugged).

I only tried the testcase with the new kernel, and it passes just fine at least 150 times (however, I tried triggering the bug with parallel loading in the past myself, with snd-hda-intel, and didn't succeed in getting the hang).

Revision history for this message
Martin Pitt (pitti) wrote :

Scott's test script runs successfully over 100 iterations with -7.12.

I successfully booted -7.12 with "quiet splash" two times and with "quiet" two times. With either of those, previous kernels almost always hung for me. Booting without "quiet" still works fine (just as in previous kernels).

So this is fixed for me, too, thanks to everyone!

Revision history for this message
jekle (jekle) wrote :

I had the same problem with >=2.6.27-3.

For the past few days (2.6.27.7) the problem seems to be fixed.

I have a Dell Precision M90 Notebook.

Revision history for this message
Matt Zimmerman (mdz) wrote :

Marking this fixed by:

linux (2.6.27-7.12) intrepid; urgency=low
[...]
  * disable CONFIG_DYNAMIC_FTRACE due to possible memory corruption on
    module unload

Changed in linux:
status: In Progress → Fix Released
Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

I'm up over 1,000 iterations with the new kernel. I am content that disabling ftrace has fixed the problem.

From the available information, and note that I'm *way* out of my comfort zone and area of expertise here, I would hypothesise that having the __init section of another module discarded while iwl3945 is initialising causes memory corruption due to the ftrace bug - and due to the complex initialisation of the PCI device thanks to the kill switch behaviour, this can result in a hang.

Revision history for this message
Ara Pulido (ara) wrote :

Tested on a Sony VGN-SZ140P.
This machine wasn't booting with the kernel with the bug. With the 2.6.27-7.12 kernel it boots correctly every time I tried (>20 times).

Revision history for this message
DSHR (s-heuer) wrote :

I removed all modprobe workarounds and the system boots reliably now (Lenovo ThinkPad X60s).

Revision history for this message
Mikael Nilsson (mini) wrote :

-7.12 seems to work reliably for me as well. I'm the original reporter.

Changed in intellinuxwireless:
status: Confirmed → In Progress
Revision history for this message
Andrew Lentvorski (bsder) wrote :

Just wanted to confirm that this works on a Dell Inspiron Mini 9 with a Dell pulled Intel 3945ABG card. Before, my system would quite reliably hang. Now, it works fine.

Thanks for all the hard work fixing this.

Revision history for this message
Andrew Lentvorski (bsder) wrote :

Spoke too soon, my Mini 9 is hanging again.

Even worse, when it hangs the system gets *HOT*. Too hot to actually hold. And it only took about 60 seconds.

Not good.

Revision history for this message
Andrew Lentvorski (bsder) wrote :

Spoke too soon, my Mini 9 with a Dell pulled Intel 3945ABG is hanging again.

Even worse, when it hangs the system gets *HOT*. Too hot to actually hold. And it only took about 60 seconds.

Not good.

Revision history for this message
mockdeep (rtfletch81) wrote :

My 8.10 installation will only boot up at home. Anywhere else it hangs at "Configuring network interfaces". It seems it will only boot up if it is in the presence of my default network connection.

Changed in intellinuxwireless:
status: In Progress → Confirmed
Changed in intellinuxwireless:
status: Confirmed → Invalid
Revision history for this message
Jason Tackaberry (tack) wrote :

Just upgraded to Intrepid and am hit with this extraordinarily irritating problem. 2.6.27-9.19 has CONFIG_HAVE_DYNAMIC_FTRACE=y. I assume at some point between 7.12 and 9.19 it was reenabled?

Revision history for this message
Hernando Torque (htorque) wrote :

It was CONFIG_DYNAMIC_FTRACE that got (and still is) disabled. My system's working fine - are you sure the bug described here is what happens to you (look at the load of linked pictures)?
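The two option names are easy to mix up, because a plain grep for CONFIG_DYNAMIC_FTRACE does not match CONFIG_HAVE_DYNAMIC_FTRACE but a grep for DYNAMIC_FTRACE matches both. A small sketch of a check that anchors the match at the start of the line; the function name is hypothetical, and /boot/config-$(uname -r) is assumed to be where the running kernel's config lives:

```shell
#!/bin/sh
# Sketch: succeed only if CONFIG_DYNAMIC_FTRACE itself is set to y.
# The default path /boot/config-$(uname -r) is an assumption about the install.
dynamic_ftrace_enabled() {
    config=${1:-/boot/config-$(uname -r)}
    # Anchor at line start so CONFIG_HAVE_DYNAMIC_FTRACE=y does not match.
    grep -q '^CONFIG_DYNAMIC_FTRACE=y' "$config"
}

# Example:
# if dynamic_ftrace_enabled; then echo "enabled"; else echo "disabled"; fi
```

On a kernel with the fix, this should report the option as disabled even though CONFIG_HAVE_DYNAMIC_FTRACE=y is still present.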

tags: added: iso-testing
Changed in intellinuxwireless:
importance: Unknown → Critical