Linux kernel 2.6.24-12 lockup

Bug #204996 reported by Elod VALKAI
This bug affects 6 people
Affects          Status      Importance   Assigned to   Milestone
linux (Ubuntu)   Won't Fix   High         Unassigned
  (Nominated for Hardy by Julian Alarcon)
Intrepid         Invalid     High         Unassigned

Bug Description

Binary package hint: linux-image-2.6.24-12-generic

I upgraded from gutsy (with Linux 2.6.22-14) to the latest alpha last Sunday (16.03.2008), and I've got some problems with the kernel.

The 2.6.24-12-generic kernel (I think; it may be -386) causes my machine to lock up (hard) after about 5 minutes. I was generally under X, but there was no specific program I was using at the time it locked up.

The hardware is completely stable & has been for the last 3 years with ubuntu, and still is with hardy & the gutsy kernel.

Another thing I noticed is that something in the initrd keeps the machine from booting for at least 2-3 minutes. It's definitely before the scripts in the /etc/rcS.d folder run; I have not traced it in the initrd. After booting all is well until it locks up.

The hardware is a Dell C400, Intel chipset, Intel graphics (i830), 384MB of RAM and an Atheros wireless card.

lsb_release:
Description: Ubuntu hardy (development branch)
Release: 8.04

It's completely reproducible; the freeze takes out the whole kernel (no reply to ping from the network). Any suggestions on how to trace it? I have a serial port, no parallel.

Revision history for this message
Elod VALKAI (elod) wrote :

Freezes do not happen with the latest beta (as of 24.03.2008), but the initrd still keeps me waiting at startup.

Revision history for this message
BobPendleton (bob-pendleton) wrote :

I have the same hardware but with 512MB. I have the same problem. The problem occurred with alpha 4, alpha 5, and beta 1. After alpha 5 I did a complete reinstall of 7.10, which works perfectly, and then did a new install of beta 1 using update-manager -d.

It still locks up hard after 5 to 10 minutes. I'm usually in Firefox when it locks up but I have had it lock up in other applications including synaptic and xemacs. Last time it locked up was this afternoon after doing a complete apt-get update/apt-get dist-upgrade, so the software was at the latest patch level as of this afternoon (US central standard time).

Bob Pendleton

Revision history for this message
Elod VALKAI (elod) wrote :

I've not tested it for very long. I'll get back to it when I've got time. Still running 2.6.22.

Revision history for this message
Leann Ogasawara (leannogasawara) wrote :

Hi Guys,

Per the kernel team's bug policy, can you please attach the following information. Please be sure to attach each file as a separate attachment.

* cat /proc/version_signature > version.log
* dmesg > dmesg.log
* sudo lspci -vvnn > lspci-vvnn.log

For more information regarding the kernel team bug policy, please refer to https://wiki.ubuntu.com/KernelTeamBugPolicies . Also, the following wiki may help to gather additional useful logs to help debug: https://help.ubuntu.com/community/DebuggingSystemCrash . Thanks again and we appreciate your help and feedback.
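
For convenience, the three commands above can be wrapped in a small helper that writes each log to its own file, as the policy asks. This is only a sketch: the function name collect_kernel_logs and the default directory name are made up for illustration, and the fallbacks for a missing /proc/version_signature or lspci are assumptions, not part of the kernel team's instructions.

```shell
#!/bin/sh
# Sketch: gather the three logs requested above, one file per log.
# collect_kernel_logs and the default directory name are illustrative names.
collect_kernel_logs() {
    out_dir="${1:-kernel-bug-logs}"
    mkdir -p "$out_dir" || return 1

    # Ubuntu-specific kernel version string; other distributions lack this file.
    if [ -r /proc/version_signature ]; then
        cat /proc/version_signature > "$out_dir/version.log"
    else
        echo "/proc/version_signature not readable" > "$out_dir/version.log"
    fi

    # Kernel ring buffer. Some systems restrict dmesg to root.
    dmesg > "$out_dir/dmesg.log" 2>&1 ||
        echo "dmesg failed; try running as root" >> "$out_dir/dmesg.log"

    # Verbose PCI listing with numeric IDs. Run as root (e.g. via sudo)
    # to get the full capability details.
    if command -v lspci >/dev/null 2>&1; then
        lspci -vvnn > "$out_dir/lspci-vvnn.log" 2>&1 || true
    else
        echo "lspci not installed" > "$out_dir/lspci-vvnn.log"
    fi

    # List what was written so each file can be attached separately.
    ls "$out_dir"
}
```

Calling collect_kernel_logs ./bug-204996 (any directory name works) leaves three files that can then be attached one at a time.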

Changed in linux:
status: New → Incomplete
Revision history for this message
Elod VALKAI (elod) wrote :

I'm submitting the requested information about my system. It surely does lock up with the current kernel (just did, not after minutes, but rather 2-3 hours).

I've also noted the lines output by initrd.img at boot:

Begin: /scripts/local-premount
/script/local-premount/resume: /script/local-premount/resume: 57: log_begin_msg: not found
--> at this point booting hangs for approx. 2-3 minutes, and resumes normally after
/script/local-premount/resume: /script/local-premount/resume: 57: log_end_msg: not found

Haven't looked inside the initrd; is this something related to suspend? I don't have a swap partition, only a swap file that's hosted on /dev/sda3 (FAT32). Swap is not related to the lockups; they always happened with 1 application loaded or in gdm.

I'm attaching the requested files.
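
Since the hang sits inside the initramfs resume script, one thing that may be worth checking is which swap areas are active and which resume device, if any, the initramfs was told to wait for. The sketch below only reads state and changes nothing. It assumes the resume setting lives in /etc/initramfs-tools/conf.d/resume, which is where initramfs-tools normally keeps it; the function name list_swap_kinds is made up, and whether the resume setting is really behind the delay is a guess, not a diagnosis.

```shell
#!/bin/sh
# Sketch: show swap areas and the configured resume device (read-only checks).

# Reads text in /proc/swaps format on stdin (one header line, then one line
# per swap area) and prints "<path>: <type>", where type is partition or file.
list_swap_kinds() {
    awk 'NR > 1 { print $1 ": " $2 }'
}

if [ -r /proc/swaps ]; then
    list_swap_kinds < /proc/swaps
fi

# Assumed location of the initramfs-tools resume setting (RESUME=...).
resume_conf=/etc/initramfs-tools/conf.d/resume
if [ -r "$resume_conf" ]; then
    echo "resume setting in $resume_conf:"
    cat "$resume_conf"
else
    echo "no $resume_conf found"
fi
```

If the file names a device that does not exist on this machine, the resume script may be waiting for it to appear before giving up, which could look like a pause of this kind.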

Revision history for this message
Elod VALKAI (elod) wrote :
Revision history for this message
Elod VALKAI (elod) wrote :
Revision history for this message
Elod VALKAI (elod) wrote :

I'd also like to note that the machine has the latest BIOS update from Dell (A12), and is working perfectly with linux-image 2.6.22-14.52 & the rest of hardy.

Revision history for this message
Elod VALKAI (elod) wrote :

It just produced a nice lockup right after my last post, so my previous post with hangs after 2-3 hours is not entirely accurate.

I do know this is rather hard to trace, but I'm willing to take the time.

Revision history for this message
Wolfgang Glas (wglas) wrote :

Same problem here, I'm using hardy beta-1 x86_64, Hardware is a Dell Latitude D830 with 2GB of RAM. Lockup occurred while using firefox3.

Revision history for this message
Elod VALKAI (elod) wrote :

Wolfgang, please post the information requested above by Leann:

* cat /proc/version_signature > version.log
* dmesg > dmesg.log
* sudo lspci -vvnn > lspci-vvnn.log

Danke schön! :)

Revision history for this message
Wolfgang Glas (wglas) wrote :
Revision history for this message
Wolfgang Glas (wglas) wrote :
Revision history for this message
Wolfgang Glas (wglas) wrote :
Revision history for this message
Wolfgang Glas (wglas) wrote :

Setting to confirmed, because I attached all the required information and multiple users observe this bug.

Generally, we've been observing this kind of lockup in each ubuntu kernel starting with feisty's 2.6.20. In fact, this kind of kernel lockup is the reason why we are operating debian etch on our servers in favor of an ubuntu distribution.

Changed in linux:
status: Incomplete → Confirmed
Changed in linux:
assignee: nobody → ubuntu-kernel-team
importance: Undecided → High
status: Confirmed → Triaged
Revision history for this message
Jim March (1-jim-march) wrote :

Yeah, same thing here, hard lockups every two to four hours. Thing goes totally dead.

Hardware: Acer 3680 laptop, Celeron single-core 1.6GHz, 533 memory bus, Atheros WiFi, 80gig SATA hard disk, 1.5gigs memory, Intel945 video, Intel sound, Marvell Ethernet chipset. This thing ran great under Gutsy and Feisty, no signs of overheating. I briefly switched back to Gutsy and all lockups vanished. My usual Internet connection is a Verizon EVDO cellmodem, Kyocera KPC650 PCMCIA set up as a PPP device, but I'm also crashing with it pulled so that's not it.

Software: pretty basic Hardy setup, also VirtualBox 1.5.6. Most crashes have happened without running a guest OS (XP).

Things I've tried: disabling Compiz, disabling Powernowd, pulling the Atheros miniPCI card completely, switching from Network Manager to Wicd, disabling Tracker completely, various others. Nothing has helped.

Version.log file (as described above) contains:

Ubuntu 2.6.24-16.30-generic

dmesg.log file is attached to this post, the last log will go in next.

Revision history for this message
Jim March (1-jim-march) wrote :

Final log...

Revision history for this message
Anthony Chianese (achianese) wrote :

I'm having a problem that may be the same as this, also using the latest Hardy beta. I'm using network-manager, which automatically detected my Linksys WMP54G wireless PCI card as "RaLink RT2561/RT61 802.11g PCI". It works very well until the crash, which causes the system to become completely unresponsive (no keyboard or mouse response, Capslock and Scroll lock blink, nothing is added to syslog at the time of crash).

I can reproduce the crash easily by asking rsync to sync a directory across my local network. After 2-3 minutes, it crashes. I'm willing to provide any additional information someone wants, but I'm not sure how to get more information about the cause. The system has also crashed while I was surfing the web using firefox or downloading a large number of emails in thunderbird, always while the network is transferring data.

Attachments coming..

Revision history for this message
Anthony Chianese (achianese) wrote :
Revision history for this message
Anthony Chianese (achianese) wrote :
Revision history for this message
martyg (snowbird) wrote :

I think I am seeing the same issue on my little server.

Screen goes blank. No mouse response.
Can't switch VC's, CTL-ALT-BS, or CTL-ALT-DEL.
Not accessible from network either.

Next time it happens, I'll try doing a ALT-SysRq-E and dig around.

No idea how to reproduce - Just happens randomly.
No obvious artifacts in logs I can send the maintainers.

This machine has been 100% stable on Debian for past 3 years! (Running 24/7)
Hardware is AMD64 barebone desktop with cheapo ATI graphics.

I am enclosing my system description as requested in an earlier post by the kernel maintainers.
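
Before relying on ALT-SysRq after a freeze, it may help to confirm that the magic SysRq key is enabled at all. A minimal sketch, assuming the usual /proc/sys/kernel/sysrq interface; describe_sysrq is a made-up helper name.

```shell
#!/bin/sh
# Sketch: report whether the magic SysRq key is enabled.

# Interprets the value of /proc/sys/kernel/sysrq:
# 0 = disabled, 1 = all functions enabled, anything else = bitmask of
# allowed functions.
describe_sysrq() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "all functions enabled" ;;
        *) echo "partially enabled (bitmask $1)" ;;
    esac
}

if [ -r /proc/sys/kernel/sysrq ]; then
    describe_sysrq "$(cat /proc/sys/kernel/sysrq)"
fi

# To enable every SysRq function until the next reboot (needs root):
#   echo 1 | sudo tee /proc/sys/kernel/sysrq
```

If SysRq is enabled and still gets no response after a freeze, that in itself is worth reporting, since it suggests the kernel is no longer handling keyboard input.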

Revision history for this message
martyg (snowbird) wrote :

Following Anthony's guidance, I have been able to reproduce once in just a few minutes by loading up the network as much as I can.
Regrettably, haven't been able to do it again after trying for about 30 minutes.
But, at this point, I am convinced this issue is network traffic related. This is consistent with my other failures earlier today.

No response to ALT-SysRq-E after the failure. Need to power cycle to continue.
After rebooting, nothing in the logs.

Note I have observed this issue on both NICs I have installed on this system.
(After getting a couple of these, I rolled my connection over to the other NIC and have been seeing it there too)

Revision history for this message
reh4c (gene-hoffler) wrote :

I have experienced this issue (or one with the same traits) when using synaptic to perform upgrades. My laptop typically freezes when installing packages...screen doesn't go blank, but the whole system is frozen. Already filed another bug and posted the log files there.

Revision history for this message
Jim March (1-jim-march) wrote :

The common denominator is heavy CPU activity. I can't see ANY other pattern here, guys, unless it's some very obscure chipset component. Even then, some are running AMD CPUs which likely use damn few similar "guts" parts with my Intel-based lappy fr'instance...

Video chips, WiFi chips, we're all over the friggin' map here. Which makes me think this one will be a royal bear to sort out.

Revision history for this message
Anthony Chianese (achianese) wrote :

I've been playing for a day now, and I think my lockup is directly related to wireless. Switching from network-manager to wicd didn't fix it, but switching from wireless to wired (either network-manager or wicd) seems to fix it (stable for a day, constantly maxing out my local bandwidth using rsync). If I go back to wireless, I can lock it up in a few mins as before.

Skimming your logs Marty, it looks like you're only using wired ethernet, correct?

I'll see if ndiswrapper does anything differently.

Revision history for this message
Wolfgang Glas (wglas) wrote :

AFAIK the lockup occurs on server hardware, too. In my case, I used a PCMCIA UMTS card (Sierra wireless AirCard 875).

My overall impression is that it might be related to the load wireless devices put on the TCP/IP stack. Typically, wireless devices stress TCP congestion handling far more than wired connections do. And yes, in my case the lockup may be provoked by a combination of CPU load plus a fair number of open TCP/IP connections.

Revision history for this message
martyg (snowbird) wrote : Re: Linux kernel 2.6.24-16 lockup

There are no WiFi adaptors in my system. I have observed this bug on both my hardwired 100/1000 NICs.

Note all my failures have been on 2.6.24-16. (Changed title) I did not run the -14 kernel.

I tried reproducing yesterday by pounding on the CPU with infinite make -j4 kernel compiles.
Both cores in my CPU were saturated for 3+ hours. No failures during this test.
Crashed once during the 24 hours, but I still can't figure out a good way to reproduce.

My best guess is this is triggered by an unusual condition in the networking stack, e.g. segmentation, checksum, or MTU misalignment.

Has anyone tried seeing if a serial console still responds after failure?
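
On the serial console question: the kernel can be told to mirror its messages to a serial port with console= boot parameters, and a second machine connected by a null-modem cable can capture whatever is printed just before the freeze. A small sketch that builds the parameter string; the function name serial_console_args and the defaults (ttyS0 at 115200 baud) are assumptions to adjust for the actual hardware.

```shell
#!/bin/sh
# Sketch: build kernel boot parameters that send console output to both the
# local screen (tty0) and a serial port.
# $1 = serial device name (default ttyS0), $2 = baud rate (default 115200).
serial_console_args() {
    port="${1:-ttyS0}"
    baud="${2:-115200}"
    # n8 = no parity, 8 data bits. The last console= listed becomes
    # /dev/console; kernel messages go to every console listed.
    echo "console=tty0 console=${port},${baud}n8"
}

serial_console_args

# Append the printed string to the kernel line in the boot loader config,
# reboot, and on the second machine attach to its serial port, e.g.:
#   screen /dev/ttyS0 115200
```

Even if the box is completely dead afterwards, any oops or panic text printed just before the freeze should still reach the second machine.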

Revision history for this message
PeteDonnell (pete-donnell-deactivatedaccount) wrote :

I'm experiencing the same hard freeze with kernel 2.6.24-16. No wireless here, just onboard ethernet. The same system worked just fine under Gutsy; the problem only started after upgrading to Hardy. My graphics card is an old S3 Virge. Crashes usually appear when watching Youtube in Firefox, but have also occurred when watching videos from a locally attached USB drive. System appears stable under light load though - I stupidly uninstalled the Gutsy kernel not long after the upgrade, before I became aware of this problem, so I haven't been able to test the system with the old kernel. The computer is totally unresponsive, the pointer doesn't move etc, and it appears offline when scanned by other computers on the network.

The weirdest thing (to me at least) is that after the computer has been power cycled (the only way to get it to respond again) the BIOS sequence hangs. The hard drive light goes on and stays on, but the BIOS doesn't go on to detect the hard drives. The only way to get round this appears to be to hold down the power button for five seconds until the machine shuts down, then leave it for 10 minutes or so, and then turn it back on. After this it appears to behave normally, until the next freeze.

There's a post on the forums that appears to be describing the same problem: http://ubuntuforums.org/showthread.php?t=725669
As suggested on there I tried booting with 'noapic nolapic irqpoll noirqdebug', but this didn't help. I don't have a great deal of time to troubleshoot, but would like to help in sorting this out if there's any other information I can provide.
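
When trying boot options like these, it is easy to lose track of which ones were actually in effect for a given freeze. A minimal sketch that checks a kernel command line for an exact parameter; has_boot_param is a made-up helper, and the parameter list is just the one quoted above.

```shell
#!/bin/sh
# Sketch: check whether a kernel command line contains an exact parameter.
# $1 = parameter to look for, $2 = command line text (space separated).
has_boot_param() {
    # Pad both sides with spaces so only whole words match.
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on the running kernel, if /proc/cmdline is available.
if [ -r /proc/cmdline ]; then
    cmdline=$(cat /proc/cmdline)
    for param in noapic nolapic irqpoll noirqdebug; do
        if has_boot_param "$param" "$cmdline"; then
            echo "$param: active"
        else
            echo "$param: not set"
        fi
    done
fi
```

Pasting this output alongside each freeze report would make it clear which combination of options was being tested.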

Revision history for this message
PeteDonnell (pete-donnell-deactivatedaccount) wrote :
Revision history for this message
PeteDonnell (pete-donnell-deactivatedaccount) wrote :
Revision history for this message
Nicholas (drkoljan) wrote :

I would like to add my two cents here, as the Ubuntu 8.04 64-bit has locked up for me twice already. I'm not sure if it has anything to do with me overclocking my CPU, though (no problems under Windows, temperature <45C).

Revision history for this message
BobPendleton (bob-pendleton) wrote : Re: [Bug 204996] Re: Linux kernel 2.6.24-12 lockup

Folks, this lockup appears to be affecting a lot of people who are trying
prerelease versions of hardy. If this thing goes public with this bug it is
going to be a *disaster* for Ubuntu. Is there any way to get hold of the
people responsible and get the release delayed until this bug is fixed?
Ubuntu, not to mention Linux, cannot afford this kind of a black eye.

Bob Pendleton

Revision history for this message
Wolfgang Glas (wglas) wrote :

Hi Bob,

Keep your hair on ;-) We've never seen a release of a "stable" Linux distribution that actually became stable from a user's point of view with less than 2 months of post-release bug fixing. Unfortunately, both users and developers only start working seriously on bug reports when facing a release deadline. So the best thing we can do is to support the mighty kernel developers by supplying them with the information they need.

Revision history for this message
drivinghome (drivinghome) wrote :

I too get these sorts of lockups, but *only* when I use the wired connection.
I didn't notice it until recently, as I mostly use the wireless card.

#lspci|grep Ethernet
02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller

The module i use is e1000.

The lockups appear after ~2, 3 hours of wired connection.
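
Since reports here span several network chips, it may help if everyone notes which kernel driver is bound to each interface. A sketch that reads this from sysfs, assuming the usual /sys/class/net/<iface>/device/driver symlink layout; list_nic_drivers is a made-up name, and the optional argument exists only so the function can be pointed at a different directory.

```shell
#!/bin/sh
# Sketch: print "<interface>: <driver>" for every network interface that is
# backed by a real device. Virtual interfaces such as lo have no device/driver
# link and are skipped.
# $1 = directory to scan (default /sys/class/net).
list_nic_drivers() {
    sys_net="${1:-/sys/class/net}"
    for dev in "$sys_net"/*; do
        [ -e "$dev/device/driver" ] || continue
        iface=$(basename "$dev")
        # The driver entry is a symlink whose last path component is the
        # driver name (for the card above that would be e1000).
        driver=$(basename "$(readlink "$dev/device/driver")")
        printf '%s: %s\n' "$iface" "$driver"
    done
}

list_nic_drivers
```

Comparing these lines across the affected machines might show whether one driver keeps turning up or whether the problem really is independent of the NIC.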

Revision history for this message
martyg (snowbird) wrote :

1. Has anyone seen this happen on the SERVER build of kernels, or is this just happening on the GENERIC build?

2. Has anyone tried to hook up a serial console and see if they can still talk to their box after it crashes?

Revision history for this message
tjutberg (burzmon) wrote :

I too get random lockups, haven't found any special trigger so far but happens more often with heavy CPU-activity.

I'm running hardy amd64 with nvidia graphics and it worked just fine under gutsy.

Revision history for this message
tjutberg (burzmon) wrote :
Revision history for this message
tjutberg (burzmon) wrote :
Revision history for this message
Christoph Lechleitner (lech) wrote :

>1. Has anyone seen this happen on the SERVER build of kernels, or is this just happening on the GENERIC build?

I have seen this kind of crash with feisty's server kernel on my IBM x335 servers on a regular basis.
I would not be surprised if the initial cause(s) have been in since 2.6.20 (which seems to have started the deep fall of kernel stability).
My problems vanished when I stepped back to edgy, in the meantime I have switched the host system to etch, while keeping the OpenVZ guests on feisty/gutsy.

I get the impression that the problem only occurs on SMP systems, and my systems crashed when LAN (wired GBit) and CPU and HD were under some load, e.g. 2 rsyncs over SSH over GBit. HD and LAN IO produce a large number of interrupts which seem not to be trivial to handle on a multicore system.

Revision history for this message
Elod VALKAI (elod) wrote :

I'll try the serial console, but it seems to be completely frozen.

I've never seen a distro (some slackware, more redhat, debian's woody, sarge, etch & lenny) behave like this after release. I upgraded packages the night before (23rd of April); there is no new kernel.

Changed in linux:
status: Triaged → Fix Released
(264 of 344 comments are hidden at this point.)
Revision history for this message
Tom M. (wvm7fk202-deactivatedaccount) wrote :

I switched from Hardy Heron to Debian Lenny and it's running 2.6.25 and has been up weeks in X windows (with my PCI wireless card and Nvidia video and AMD Athlon64 dual core) chugging along nicely. No lockups. 2.6.25 definitely fixes it. Good luck all.

Revision history for this message
Sergio Callegari (callegar) wrote :

In response to John:

For what concerns the "fix released" indication... unfortunately it does not indicate that there is a fix released for hardy, as far as I know the part of the 2.6.24 kernel causing the issue is still to be identified. There has been some discussion on whether it was appropriate to place the "fix released" tag on this bug (see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/204996/comments/273).

In response to Tom M.

Unfortunately, this is one more indication that going with 2.6.24 for hardy has been a very unfortunate choice: it is not the kernel of fedora, it is not the kernel of opensuse, it is not the kernel of debian, and it has a userbase that is probably limited to hardy only. Furthermore, it is no longer maintained upstream. Patchsets for 2.6.26 and 2.6.25 come out regularly (2.6.25 is at .16, delivered on August 20th), while 2.6.24 stopped at .7, shipped on May 7th.

There has been in the past a strong request to make a version of the intrepid kernel installable on hardy. As a matter of fact it is, but to the best of my knowledge the headers (necessary for compilation against the kernel) are not.

Sergio

Revision history for this message
John Ward (automail) wrote :

Thanks for the response Sergio,

Is there any way that kernel 2.6.25 can be put into the repositories for usage and testing, and then possibly released as an update during Hardy's last period? This problem is serious. With the basis for a flexible update system, and a large group of people looking for this problem and reporting back the best information they can, there's no reason that something can't be released for this crippling thing.

Revision history for this message
Sergio Callegari (callegar) wrote :

This has been asked many times (including myself). Unfortunately, the answer so far has not been positive:

1) Switching hardy to the 2.6.25 kernel has been excluded as a "jump in the dark".
2) Providing two alternative kernel versions for hardy (namely both 2.6.24 and 2.6.25) has been indicated as not sustainable with the resources of the ubuntu kernel team.

The closest we got is:

a) an interview (to Mark Shuttleworth, if I remember correctly) where it is said that due to the very long support time of hardy (5 years on server) hardy might eventually switch to a more modern kernel when it becomes impossible to support 2.6.24 (cannot find the link, sorry).

b) an email on this very list, again by Mark Shuttleworth, suggesting that it would be very valuable to give Hardy users the ability to test the Intrepid kernel. Unfortunately, in applying this proposal there is an apparent need to compromise, since kernel developers do not want to decrease the motivation to test the intrepid codebase as a whole. The situation so far is that the intrepid kernel (2.6.26) can be installed on hardy, but not its kernel headers (nor the restricted modules, from what I heard).

Personally, what I have done so far on all the machines I am responsible for is to use ubuntu without the hardy kernel, having compiled a 2.6.25 and then a 2.6.26 from kernel.org with make-kpkg, which gives you nice deb packages. It is a bit of a pain to upgrade whenever a new patchset comes out for 2.6.26... but... still better than the lockups. In any case, 8.10 is not that far away now, so let's just hope this time it takes the same kernel version as fedora or opensuse.

Revision history for this message
Mark Shuttleworth (sabdfl) wrote :

I'm expecting that we will shortly have a PPA for hardy which includes:

 - the proposed Intrepid kernel
 - a daily build of kernel.org's "tip"
 - the virgin kernel.org kernel that corresponds to hardy's kernel

Between those, we should have ample opportunity to help provide testing
for upstream as well as triage issues specific to Ubuntu.

Mark

Revision history for this message
Jeremy Bar (j.b) wrote :

Mark, that would be great, because the Lenovo Thinkpad T60 and X60s I own are both affected by this issue. The only solution I am aware of now is a manual patch of the e1000 driver.

Revision history for this message
Mark Shuttleworth (sabdfl) wrote :

Ben Collins wrote:
> Upstream doesn't care about testing 2.6.24 any more.
But it is useful for us to assess if an issue was introduced in patches
we added to that stable release, or if it was there already.
> They want us to help test tip.
Sure, which is why we should make tip available for both stable and
development releases (currently Hardy and Intrepid).
> Besides, there's no good base to say "corresponds to hardy's kernel"
> because we stopped syncing at like 2.6.24.2, but we have lots of
> cherry picks for CVE's and SRU's from 2.6.24.y beyond .2. So hardy is
> currently > 2.6.24.2 but < 2.6.24.y head.
Then choose either .2 or .y, I would go with .2 personally, and I would
also try not to stop syncing, though I understand there are ABI issues.

> So it wouldn't even be beneficial to us to provide a "stock" kernel
> for hardy users. It wouldn't tell us the difference between .y fixing
> it, or stock working because we have a bad patch.
But .2 would tell us that.

> Ubuntu-next we've already started with. I'm quite reluctant to provide
> it in a PPA. Upstream constantly complains about the quality of bug
> reports from our users, and I fear that this would increase it because
> of non-technical users trying these kernels and not being able to
> properly help debug them.
I think we should DROP ubuntu-next. It's more work than any other
option, its bugs are of no interest to upstream OR US.

> IMO, if we really want this PPA stuff, we need more man-power on the
> QA and engineering end of it. Just making it available isn't useful at
> all and would probably cause the reverse with upstream than what you
> want.
Please nonetheless put these into your plan, with or without ubuntu-next.

> On a similar note, I've been considering putting out the idea of adding the
> LP bugzilla plug-in to upstream kernel to make it easier for us to
> forward good bug reports upstream.

That would rock indeed :-)

Mark

Revision history for this message
wackyiniraqi (wackyiniraqi) wrote :

Check out bug 248591. I was just informed that:

"The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the
upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would
appreciate it if you could please test this newer 2.6.27 Ubuntu kernel."

Revision history for this message
Martin Božič (martin-bozic) wrote :

Well, to add some more confusion to this bug...

The last kernel panic I experienced was on the first boot up after upgrading the kernel to 2.6.24-19. After that, no kernel panics whatsoever. Also, processor hiccups disappeared somewhere in the beginning of August. I've tested the 2.6.24-17 and *-18 kernels, each one for a day of common daily use, with no problems on any of the current kernels (although Firefox was crashing every second time I was playing Flash videos in the *-17 kernel). I have a Dell Latitude D400 laptop, no non-free drivers.

One more thing, kernel panics gradually disappeared over time, at least that's how it seemed to me in my case.

So, could it be that this bug is not directly the kernel's fault, but that of some common and crucial package that can be found on all Ubuntu variants?

Revision history for this message
John Ward (automail) wrote :

I have downloaded and installed "Ubuntu 8.10 Intrepid Ibex Alpha 5" and apart from being a little incomplete here and there (it is an Alpha build after all) I can say confidently that for the last 5 days Ubuntu has been running smoothly without any kernel panics. I have left the machine on its own and no kernel panics, I have stressed it with Azureus Vuze and multiple torrents, Firefox with flash content and multiple open tabs, The Gimp and flame rendering, Rhythmbox, Brasero disc burning, Sound Juicer .ogg ripping and System Monitor, all on at the same time and there has not been a hiccup, a single crash or a kernel panic. Well done in nailing this bug, or accidentally realising it's not in the .27 kernel - whatever you did the problems are gone.

I urge others still experiencing this problem to try Ubuntu 8.10 Intrepid and see if the panics disappear. Apart from an issue with installing proprietary nVidia drivers everything has been running very well with 8.10 and I actually like the new "Brown Human" theme.

Anyway, I'm glad to say these things.

John.

Revision history for this message
Leann Ogasawara (leannogasawara) wrote :

Hi John,

Thanks for testing and the feedback regarding Intrepid Ibex Alpha5. Would anyone else be willing to test Alpha5 as well? It does, as John pointed out, contain a newer 2.6.27 kernel. For more information regarding Alpha5 please refer to http://www.ubuntu.com/testing/intrepid/alpha5 . Please let us know your results. Thanks.

Revision history for this message
nst (nst16) wrote :

Hi,
I'm experiencing a hard freeze both with a fresh Hardy install and with an updated Intrepid (kernel 2.6.27-3). My laptop is a Targa Traveller 826 with AMD Turion64 1.8GHz and ATI Radeon Mobility X700.
In my case the freezes seem to be correlated with heavy network traffic, but not with a specific application (Firefox,synaptic,apt-get). Most times it only takes less than one minute after the start of the data transmission until it freezes. It happens both with wired and with wireless network connections.
When booting with acpi=off I experienced no freezes so far.
Let me know if you need additional information.
Nils

Revision history for this message
Mark Shuttleworth (sabdfl) wrote :

Does anyone see a clear kernel panic during these hangs? For example, if
it can be reproduced within a minute, it would be interesting to boot
into administrative mode and run a big FTP download or other network
stress test, to see if a kernel panic is displayed. If anyone sees one,
grab a camera and attach a photo of it here.

Mark

Revision history for this message
Dev (scotty-amnet) wrote :

Got the same issue here on hardy, 2.6.24-21-generic, nVidia Corporation GeForce 7100 GS (rev a1) running driver 173.14.12, RaLink RT2561/RT61 rev B 802.11g

locks up hard, no mouse, no ping... nothing at all.

If I can be of any use please let me know.

Revision history for this message
nst (nst16) wrote :

Unfortunately I don't see a clear kernel panic when the freeze occurs. I am able to reproduce it in a console, e.g. by initiating an FTP transfer. It takes only a few seconds until it freezes. The cursor stops blinking, there is no reaction to inputs (CapsLock, Magic SysRq ..) and there are no entries in the logs.
Regards
Nils

Revision history for this message
Wolfgang Glas (wglas) wrote :

Honestly, I think this bug should be chased by a core kernel developer, because

  a) This issue is very long standing
  b) The impact on the affected users is high
  c) It is well known that no logs, kernel panics, etc. are produced

Best regards,

   Wolfgang

Revision history for this message
Elod VALKAI (elod) wrote :

I've upgraded to Intrepid, and it's got other nasty problems (the intel driver in xorg does not like intel's 830 chipset).

The kernel (2.6.27) seems stable, but it's very strange to see a kernel that has not been released (rc6 at the moment) in a linux distribution to be released in a month.

Revision history for this message
Elod VALKAI (elod) wrote :

After one week with Intrepid (and having resolved the xorg intel driver quirks) I can say I'm pretty happy with it. 2.6.27 is stable. I hope it stays that way :).

I won't test 2.6.24 further, as it's simply impossible. I'm beginning to think that an LTS release should have the kernel of the previous release. At least it's been tested _widely_ for 6 months.

Revision history for this message
Stefan_Ares (twolve2146) wrote :

Hello everyone who seems to be in the same sinking boat, I guess I'm here to join you.

I've been trying Linux for about a month now, and I have the same problem as many of you. My system will lock up and I can't use the mouse or any keyboard buttons. I have run openSUSE and Ubuntu Hardy 32-bit, and have also just started running Ubuntu Hardy 64-bit.

I noticed with a GPU temperature sensor that whenever the temperature of my GPU gets up to 61C and hovers there without dropping, the computer will reliably lock up, and the best thing to do is to let it cool down. It seems to get really hot for no reason, since I'm running cooler in Vista right now (and in my opinion Vista should be running much hotter because of how taxing it is on hardware).

I would also like to post for the dev team that I have tried out Alpha5 and Alpha 6, both causing the same lock-up problem. It seems to take longer to heat up in these versions because the system seems to use less power in Intrepid, and I would love to switch but I'm trying to find an OS that won't freeze after getting too hot. Even though I realize that it may just be doing this to protect my hardware.

I have no logs of my problems, but I feel confident that my case is a GPU temperature problem, after having tried many things. I doubt it is IO in my case because it's usually when I'm browsing in Firefox (which causes my computer to run hotter).

Revision history for this message
bwana (marcusmarcus) wrote :

I've experienced frequent lockups with 8.04 32/64 (and -rt), 8.10 64 (all the way to Linux dlm1 2.6.27-1-generic #1 SMP).

I just had to reset my computer and I captured logs for a "full cycle" (from boot to crash - I got to the logon screen, then kablam).

I've attached a zip with the output of:
* cat /proc/version_signature > version.log
* dmesg > dmesg.log
* sudo lspci -vvnn > lspci-vvnn.log

I've also attached a bunch of logs from the "full cycle" mentioned earlier.

I've been hoping for this progress since migrating off 6.06 - but I've given up.
I'm off to distrowatch.com et al. to find me a replacement for Ubuntu.
I've wasted too much time believing/hoping that someone would be able to find a fix.

Revision history for this message
Chainz (chainzee) wrote :

I have just replaced my Graphics card in my PC, from ATI Rage 128 to ATI HD 3450...
And guess what? NO MORE LOCKUPS!!!

Revision history for this message
Sam!r Jadhav (jadhav333) wrote :

Sometimes my desktop just freezes, and there are some vertical blue dotted streaks across the screen. The mouse works but nothing on the desktop can be clicked.

Assuming some compatibility error due to recently installed software, I did a clean install of the OS. But the problem still persists.

Though it occurs randomly, it usually occurs while I am using the Firefox browser 3.0.3.

This is the first time since I started using Ubuntu (with version 6.06) that I have encountered an issue like this.

I am unable to give a screenshot because even the Print Screen functionality does not work during the freeze.

Can anybody suggest a solution?

My Spec
Quadcore intel cpu Q6600@2.4Ghz, 4 Gb ram, S975XBX2 motherboard, PCI Express GeForce 8600 GT (generic drivers)
Ubuntu 8.10 Intrepid Ibex 64bit
Kernel Linux 2.6.27-7-generic
Gnome 2.24.1
Partition Info:
\            Root         10GB
\Home        Home         10GB
\Media\Sda3  Data folder  400Gb

Revision history for this message
Chainz (chainzee) wrote :

Your case might be connected with flash issues.
Please try to use flash block and see if it happens again.

Revision history for this message
Sergio Callegari (callegar) wrote :

I have upgraded to intrepid the PC on which I was experiencing the hard lockups. And the lockups have disappeared.
I have also upgraded to hardy a laptop on which I was running gutsy. And the lockups have appeared although by no means as frequently as I was experiencing them on the older desktop. I am experiencing about a couple of hard lockups a week, with the laptop on for most of the day.

So the problem seems to be really due to 2.6.24.

Revision history for this message
brainiac8008 (frankfurter) wrote :

Hello all,

I wrote a while back in the comments about how my computer would lock up with Hardy installed. I had found that my wireless USB adapter, based on the zd1211 platform, would cause the system to freeze as soon as Ubuntu tried to establish a connection to the Internet. I have a feeling that I also had the more common problem of the computer locking up at seemingly random times.

Well, I installed Intrepid last week and it has been very stable. I have not gotten a single lockup! I'm glad that I am finally able to use Ubuntu again.

Thanks,
Noah

Revision history for this message
Launchpad Janitor (janitor) wrote : Kernel team bugs

Per a decision made by the Ubuntu Kernel Team, bugs will no longer be assigned to the ubuntu-kernel-team in Launchpad as part of the bug triage process. The ubuntu-kernel-team is being unassigned from this bug report. Refer to https://wiki.ubuntu.com/KernelTeamBugPolicies for more information. Thanks.

Revision history for this message
Adam Buchbinder (adam-buchbinder) wrote :

I had this problem with Hardy; I upgraded to Intrepid recently, but it's still present--mouse won't move, machine won't respond to pings, nothing in the logs. It seems to be associated with network activity--loading a lot of tabs in Firefox or Epiphany--but I can't reliably reproduce it. It seems not to happen when I don't have anything at all running on the machine. It's a Thinkpad T40p; the four logs attached in, e.g., comment 324, will be attached momentarily.

Changed in linux:
status: Fix Released → Confirmed
status: Fix Released → Confirmed
Revision history for this message
Adam Buchbinder (adam-buchbinder) wrote :

Attached find the tarballed results of:

$ uname -a > uname-a.log
$ cat /proc/version_signature > version.log
$ dmesg > dmesg.log
$ sudo lspci -vvnn > lspci-vvnn.log
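
For anyone else gathering the same four logs, the commands above can be wrapped into one small function. This is only a sketch: the function name `collect_logs` and the default directory name `lockup-logs` are made up for illustration, and the fallbacks for a missing `/proc/version_signature` or `lspci` are an assumption about what is useful, not something requested in this thread.

```shell
#!/bin/sh
# Sketch: gather the logs requested in this thread into one tarball.
# collect_logs and the default directory name are hypothetical.
collect_logs() {
    outdir=${1:-lockup-logs}
    mkdir -p "$outdir" || return 1

    uname -a > "$outdir/uname-a.log"

    # /proc/version_signature exists only on Ubuntu kernels.
    if [ -r /proc/version_signature ]; then
        cat /proc/version_signature > "$outdir/version.log"
    else
        echo "no /proc/version_signature on this kernel" > "$outdir/version.log"
    fi

    # dmesg may be restricted for normal users; keep whatever it prints.
    dmesg > "$outdir/dmesg.log" 2>&1 || true

    # lspci shows full -vv detail only as root (run the script via sudo for that).
    if command -v lspci >/dev/null 2>&1; then
        lspci -vvnn > "$outdir/lspci-vvnn.log" 2>&1 || true
    else
        echo "lspci not installed" > "$outdir/lspci-vvnn.log"
    fi

    # Pack the directory next to itself as <outdir>.tar.gz.
    tar -czf "$outdir.tar.gz" -C "$(dirname "$outdir")" "$(basename "$outdir")"
}
```

Calling `collect_logs` with no argument writes `lockup-logs/` and `lockup-logs.tar.gz` in the current directory; the tarball is what gets attached to the bug.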

Revision history for this message
Adam Buchbinder (adam-buchbinder) wrote :

As I don't have a good way of triggering the bug, I'd like to request that anyone who can do so attempt to find the regression via git-bisect. It's somewhat time-consuming, but it will definitely let us nail down the bug and submit it upstream (as it's appeared in vanilla kernels as well as Ubuntu-specific ones). It would finally make it possible for us to isolate and fix this thing after all these months.

There are good instructions for bisecting the kernel over in bug 273266; I'm willing to help out however I can. If anyone has any specific suggestions for making the bug easily reproducible on my own laptop, I'll try to help out with it as well, though I don't remember the specific revision at which this last didn't happen.

Revision history for this message
Adam Buchbinder (adam-buchbinder) wrote :

Status update: I've gotten started on finding the regression via git-bisect. I was able to reproduce the bug on 2.6.27 vanilla, and unable to do so on 2.6.22 vanilla. I'm currently down to just under seven thousand candidate commits left. The whole thing is rather complicated by the fact that configuration options change between kernel versions, which has made me a bit leery of the possibility of actually isolating the code change that introduces the bug.
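
To put "just under seven thousand candidate commits" in perspective: each bisect round roughly halves the candidate range, so the number of kernels still to build and test is about the base-2 logarithm of the count. A rough sketch (the helper name `bisect_steps` is made up, and git's own estimate may differ by a step):

```shell
# Sketch: rounds of halving needed to narrow N candidate commits to one.
# bisect_steps is a hypothetical helper, not a git command.
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))      # each tested kernel halves the range
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

bisect_steps 7000   # prints 13
```

Thirteen builds sounds manageable, but that count only holds if every good/bad verdict is correct; a single kernel wrongly marked good sends the search into the wrong half.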

Also, older kernel versions seem to cause X to die every so often, and even apart from that, testing is slow, because finding out if a kernel is bad or not is like the Halting Problem--I can only get a "yes" answer, never a definitive "no".

Revision history for this message
bwana (marcusmarcus) wrote :

I'm not experiencing the random crashes anymore (on any Ubuntu versions).

On further examination, I managed to track it down to a faulty GPU fan causing the system to overheat (duh!).

I was also seeing this problem on a laptop. Turns out that the laptop would remain on 24 hours a day. I removed the battery (as it was permanently plugged in anyway) which lowered the temperature, which caused the crashes to stop.

In other words, the data I provided earlier in order to assist in the troubleshooting will only show how a system looks while overheating - so please disregard it when trying to get to the bottom of the real bug.

Hope I didn't encourage anyone to waste time barking up the wrong tree..

Cheers,
/m

Revision history for this message
Mark Shuttleworth (sabdfl) wrote : Re: [Bug 204996] Re: Linux kernel 2.6.24-12 lockup

Marcus, thanks for the update. Adam, thanks for trying to chase this
down, I hope we can at least identify the revision which caused the
issues you are seeing.

In general, I think this bug has become a melting pot of a number of
different lockup issues, but eliminating them one by one is worthwhile
nonetheless.

Mark

Revision history for this message
seh62 (seh62) wrote :

I've been running 8.04 in Virtual Box in a Windows XP host on a Toshiba Laptop with an Intel Celeron M and Atheros wifi for 8 or 9 months now and it worked great, (except Vbox has no support for 3D acceleration).
My hard drive broke, so I installed a new one and decided to dual boot, which worked fine. I use dial-up at home, which caused me some confusion, and I finally got it working with sl-modem-daemon (after trying SLMODEMD). Shortly thereafter I experienced my first crash: just a loud noise from the hard drive, then all power off. I went to a wifi spot to download all the updates before troubleshooting. This was a major hassle; I finally got a connection but was forced to leave before finishing the updates. I returned and began updating again, had to stop for a while, and when I tried to reconnect I experienced the sudden crash again. When I rebooted, the wireless and ethernet drivers were gone (no wireless in the network manager and no drivers listed in hardware drivers), and then there was another crash.

Now Ubuntu will only run for about a minute before a hard crash (not much time to do anything). I decided to reinstall Ubuntu in the same partition, but when I boot from the CD, there's a hard crash by the time I make it to the install dialogue.

EVERY TIME before a crash when plugged in to AC power, the charge light turns to amber (indicating battery charging), the screen freezes, hard drive makes a loud noise, everything shuts completely off, and then the charge light turns back to green (indicating full charge on AC power).

This is depressing, never had a problem in Vbox maybe those guys at Sun have something going. Am I going to have to wait for the 8.10 disc to arrive? I really like Ubuntu, Windows sucks, help!

Revision history for this message
Adam Buchbinder (adam-buchbinder) wrote :

seh62, that sounds separate from this; if it's something you think you can get a developer to reproduce, file a bug; if it's not, file a question.

As for me, I've had to do some backtracking after a version which I'd previously considered good turned out to be bad; the number of revisions left to test is still rather large. I'm still working on it, though building kernels is rather slow. The best way to reproduce it seems to be downloading something with Vuze. Not just uploading, but downloading, seems to cause a lockup almost certainly within a few hours. It's kind of hard to quantify, though. More information will follow as I narrow it down. Each known-bad commit *does* help me pare back the search space, at least somewhat. I'm getting closer. I must be.

Revision history for this message
Dieter Burghardt (dieter-burghardt) wrote :

At first my machine (Intel Core2 Quad CPU) was perfectly stable since November '08. The machine was up 24/7 and also running 4 instances of folding@home.
But when I started to use folding@home with the -smp switch I got random lockups.
The machine is not overclocked and in both cases the machine is under high CPU load, so overheating or other hardware related issues shouldn't be the cause of the lockups. I guess the lockups are related to MPI (which is turned on in folding@home with -smp switch.)

Revision history for this message
Dieter Burghardt (dieter-burghardt) wrote :

Oops, forgot to mention ... it's intrepid, and the kernel is up to date

2.6.27-11-generic #1 SMP Thu Jan 29 19:28:32 UTC 2009 x86_64 GNU/Linux

Revision history for this message
hadisen (microtherm) wrote :

I experienced my first two random(!) kernel panics (blinking caps and scroll lock) today after having used my computer for 2 or 3 weeks without any problems. It is a Sony Vaio Subnotebook VGN-TX3XP running Ubuntu 8.04.2 Kernel 2.6.24-23-generic. I suspect it to be a network-related bug, since it did not occur a third time when I had deactivated my wireless network (driver iwl3945 - also see thread 944123 "The Broadcom STA wl driver is buggy"). I would like to mention that my neighbour has exactly the same router as me, which caused some problems with the roaming mode under the Gnome network-manager during the last days, so the bug might have something to do with WLAN overlapping, too. I replaced it with Wicd, activated the WLAN again and have had no further kernel panics so far. I'll post if they come back.

Revision history for this message
Adam Buchbinder (adam-buchbinder) wrote :

My attempts to compile at least a dozen kernels so far, leaving each running for either a week or until it froze, have led to varying degrees of confusion; kernels that I thought were good would freeze up or work for a week straight depending on, it seemed, the phase of the moon. Bisecting the kernel has taken a great deal of time, but has not, in the end, been particularly helpful. I've been reluctant to upgrade to Jaunty yet, since upgrading in the first place is what started all of this, and it could always get worse, but since the problem *seems* to be kernel-based, I can always just boot into the Intrepid kernel afterwards.
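
One way to see why a week without a freeze is weak evidence that a kernel is good: a rough sketch, assuming lockups arrive at a constant average rate (an exponential model, which nothing in this thread confirms). The mean times of 48 and 168 hours are hypothetical figures chosen only for illustration, and `false_good_probability` is a made-up helper name.

```shell
# Sketch: chance that a *bad* kernel survives t hours without locking up,
# assuming lockups follow an exponential distribution with the given mean.
# P(no lockup in t hours) = exp(-t / mean)
false_good_probability() {
    awk -v t="$1" -v mean="$2" 'BEGIN { printf "%.4f\n", exp(-t / mean) }'
}

false_good_probability 168 48    # one week, mean 48 h  -> 0.0302
false_good_probability 168 168   # one week, mean 168 h -> 0.3679
```

Under this model, if the bad kernels freeze every couple of days on average, about 3% of them would still pass a one-week test; if they freeze about once a week, more than a third would. With a dozen or more kernels to judge, at least one wrong "good" verdict becomes quite likely, which would fit the backtracking described here.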

Future directions that I'll be working on:

I don't recall whether or not I actually removed the 'airo' module when I was using the wired ethernet for my connectivity. Seeing if that helps may be helpful in tracking down the actual bug.

I hadn't heard of netconsole ( https://wiki.ubuntu.com/KernelTeam/Netconsole ); since the bug locks the system hard enough that no logging information is available on the next boot, this may be a helpful avenue; if I can get a stack trace, I can even bring it upstream. (I would have done so already, but despite the time I've sunk into this disaster, I essentially know little more than I did when I started.) On the other hand, the wireless on my laptop is semi-broken enough (connection quality constantly wavers, and occasionally drops entirely, even though other wireless devices work fine) that this may not help. If that's the case, I'll connect it by wire and try to trigger the crash that way.
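
For reference, netconsole is configured through a single module parameter of the form `src-port@src-ip/dev,dst-port@dst-ip/dst-mac`. The sketch below only assembles that string; the helper name `netconsole_param` and every address in it are placeholders, and a wired interface is assumed because of the flaky wireless mentioned above.

```shell
# Sketch: build the netconsole module parameter.
# Arguments: src_port src_ip device dst_port dst_ip dst_mac
# All values used below are placeholders; substitute your own.
netconsole_param() {
    printf 'netconsole=%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

netconsole_param 6665 192.168.1.10 eth0 6666 192.168.1.20 00:11:22:33:44:55
# prints netconsole=6665@192.168.1.10/eth0,6666@192.168.1.20/00:11:22:33:44:55

# On the machine that locks up (as root), the string is passed to modprobe:
#   modprobe netconsole "$(netconsole_param 6665 192.168.1.10 eth0 6666 192.168.1.20 00:11:22:33:44:55)"
# A second machine then listens for UDP on the destination port (e.g. with netcat)
# and keeps whatever the kernel printed before the freeze.
```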

Revision history for this message
Jim Lieb (lieb) wrote :

This path has been marked as invalid because the original bug applies to 2.6.24 and Hardy. If Intrepid with its 2.6.27 kernel locks up for you, please file a new bug against that release and that kernel. This close message is the 343rd comment action and your Intrepid issue will get lost in this avalanche of bug comments. This also applies to Jaunty and Karmic issues. We ask for separate bugs based on release, system/cpu type, and usage because a "hang" or "lockup" is a very generic description of what most often ends up being a very specific case for a very particular configuration. A separate bug gets noticed and the comment path to its resolution is comprehensible. Please understand that this has been marked as invalid for Intrepid because the original fault was logged against Hardy and 2.6.24. Your bug issue is still valid, but not here. Thank you.

Changed in linux (Ubuntu Intrepid):
status: Confirmed → Invalid
Revision history for this message
Jim Lieb (lieb) wrote :

This bug is closed because it originally applied to a 2.6.24-12 kernel. If you are experiencing an issue with the current Hardy kernel, please file a new bug with the details as outlined in the wiki page(s). In this way we can sort out and address issues that are still active and relevant from what was already fixed and no longer applicable. Thank you for your cooperation.

Changed in linux (Ubuntu):
status: Confirmed → Won't Fix
Displaying first 40 and last 40 comments. View all 344 comments or add a comment.