NV18 GPU lockup with Nouveau

Bug #537741 reported by djr013
This bug affects 2 people
Affects                              Status        Importance  Assigned to  Milestone
Nouveau Xorg driver                  Invalid       Medium
xserver-xorg-video-nouveau (Ubuntu)  Fix Released  High        Unassigned

Bug Description

Binary package hint: xorg

Ubuntu Lucid, amd64
Attempted with multiple Xorg/Nouveau versions and with different kernel flavors (generic, preempt, -16).
Whenever something graphically intensive happens, such as compositing (but apparently not video), a GPU lockup occurs. It freezes X and even terminal switching, but SysRq commands still work. I can sometimes switch terminals by first pressing Alt-SysRq-K. This only occurs with an NV18 card; an NV17, for example, works fine.

Note: bug was filed while using xorg-edgers version; however, it also applies to normal Lucid (pre-)release packages.

ProblemType: Bug
Architecture: amd64
Date: Thu Mar 11 15:23:39 2010
DistroRelease: Ubuntu 10.04
DkmsStatus: Error: [Errno 2] No such file or directory
Package: xorg 1:7.5+3ubuntu1
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-16-generic root=UUID=9d4febcd-17ce-4be0-8748-796131edf0af ro quiet splash
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-16.25-generic
SourcePackage: xorg
Uname: Linux 2.6.32-16-generic x86_64
dmi.bios.date: 11/12/2004
dmi.bios.vendor: Phoenix Technologies, LTD
dmi.bios.version: 6.00 PG
dmi.board.name: AV8 (VIA K8T800P-8237)
dmi.board.vendor: http://www.abit.com.tw/
dmi.board.version: 1.x
dmi.chassis.type: 3
dmi.modalias: dmi:bvnPhoenixTechnologies,LTD:bvr6.00PG:bd11/12/2004:svn:pn:pvr:rvnhttp//www.abit.com.tw/:rnAV8(VIAK8T800P-8237):rvr1.x:cvn:ct3:cvr:
system:
 distro: Ubuntu
 codename: lucid
 architecture: x86_64
 kernel: 2.6.32-16-generic

djr013 (djr013)
description: updated
Bryce Harrington (bryce)
affects: xorg (Ubuntu) → nvidia-graphics-drivers (Ubuntu)
Robert Hooker (sarvatt)
affects: nvidia-graphics-drivers (Ubuntu) → xserver-xorg-video-nouveau (Ubuntu)
djr013 (djr013) wrote :

I just ran xorg-edgers' ppa-purge and tried the card again. I have enabled compositing in Metacity. One test was selecting the Applications menu and rapidly switching between that, Places, and System, which eventually froze X. After I pressed Alt-SysRq-K, it allowed me to log back in, and I tried again with no problem; I opened some semi-transparent terminals, enlarged them, and moved them around over each other. Trying to enlarge roughly the eighth terminal froze X. Eventually I had a kernel panic. Occasionally I can't even boot fully. Sometimes X freezes but still allows cursor movement; other times it doesn't. The freeze has also occurred a couple of times in a virtual terminal (Ctrl-Alt-F1...).

Often I see colorful flashing as textures load, particularly at boot. Once, after rebooting, I saw the screen as it was when X froze for a few seconds, before it cleared and GNOME finished loading.

I suppose there is a chance the card is defective...but it has no problem with the NVIDIA binary drivers. (As an aside, I even had Compiz running in Nouveau for some durations before freezing.)

I may try booting a Live session soon to see if it happens there too.

djr013 (djr013) wrote :

To clarify, the one-reboot persistent freeze image I mentioned was with Compiz under xorg-edgers packages.

One other odd occurrence was terminal text partially overlaid on GNOME.

Bryce Harrington (bryce)
Changed in xserver-xorg-video-nouveau (Ubuntu):
status: New → Confirmed
In the upstream bug, Chris Halse Rogers (raof) wrote :

Forwarded from Launchpad: https://bugs.edge.launchpad.net/ubuntu/+source/xserver-xorg-video-nouveau/+bug/537741

Whenever something graphically intensive happens, such as compositing (but apparently not video), a GPU lockup occurs. It freezes X and even terminal switching, but SysRq commands still work. I can sometimes switch terminals by first pressing Alt-SysRq-K. This only occurs with an NV18 card; an NV17, for example, works fine.

I have enabled compositing in Metacity. One test was selecting the Applications menu and rapidly switching between that, Places, and System, which eventually froze X. After I pressed Alt-SysRq-K, it allowed me to log back in, and I tried again with no problem; I opened some semi-transparent terminals, enlarged them, and moved them around over each other. Trying to enlarge roughly the eighth terminal froze X. Eventually I had a kernel panic. Occasionally I can't even boot fully. Sometimes X freezes but still allows cursor movement; other times it doesn't. The freeze has also occurred a couple of times in a virtual terminal (Ctrl-Alt-F1...).

Often I see colorful flashing as textures load, particularly at boot. Once, after rebooting, I saw the screen as it was when X froze for a few seconds, before it cleared and GNOME finished loading.

I suppose there is a chance the card is defective...but it has no problem with the NVIDIA binary drivers. (As an aside, I even had Compiz running in Nouveau for some durations before freezing.)

Xorg log: http://launchpadlibrarian.net/40822534/XorgLogOld.txt
Boot dmesg: http://launchpadlibrarian.net/40822504/BootDmesg.txt
dmesg around the freeze & SAK recovery: http://launchpadlibrarian.net/40822506/CurrentDmesg.txt
lspci: http://launchpadlibrarian.net/40822515/Lspci.txt

Chris Halse Rogers (raof) wrote :

I've forwarded this upstream to the Nouveau developers. It would be helpful if you could subscribe to https://bugs.freedesktop.org/show_bug.cgi?id=27153 as the developers may wish to ask for more information from you.

Changed in xserver-xorg-video-nouveau (Ubuntu):
importance: Undecided → High
Changed in nouveau:
status: Unknown → Confirmed
In the upstream bug, Francisco Jerez (currojerez) wrote :

Can you reproduce without the 3D driver? E.g. after moving "/usr/lib/dri/nouveau_vieux_dri.so" somewhere the loader won't look.
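
One way to do this, as a minimal sketch that is not taken from the report: move the driver file into a stash directory and move it back once testing is done. The function names and the stash directory are hypothetical; only the driver path comes from the comment above, and moving it on a real system needs root.

```python
import shutil
from pathlib import Path


def hide_dri_driver(driver: Path, stash_dir: Path) -> Path:
    """Move the DRI driver out of the loader's search path; return its new location."""
    stash_dir.mkdir(parents=True, exist_ok=True)
    target = stash_dir / driver.name
    shutil.move(str(driver), str(target))
    return target


def restore_dri_driver(stashed: Path, original_dir: Path) -> Path:
    """Move a previously stashed driver back into its original directory."""
    target = original_dir / stashed.name
    shutil.move(str(stashed), str(target))
    return target
```

For example, calling hide_dri_driver(Path("/usr/lib/dri/nouveau_vieux_dri.so"), Path("/root/dri-stash")) before restarting X would take the 3D driver out of play, and restore_dri_driver puts it back afterwards. The stash path here is only an illustration.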

Bryce Harrington (bryce)
tags: added: edgers
Robert Hooker (sarvatt)
Changed in xserver-xorg-video-nouveau (Ubuntu):
status: Confirmed → Triaged
In the upstream bug, Chris Halse Rogers (raof) wrote :

The report on launchpad indicates that this is reproducible without the 3D component installed.

In the upstream bug, Francisco Jerez (currojerez) wrote :

Created an attachment (id=34310)
nv18_pgraph_lockup.patch

The attached patch might help. In any case, we don't have an mmiotrace for this card yet, and one would be useful if we want to fix it properly.

In short, you'd need to:
- Start tracing
- Load nvidia.ko and start X up with the nvidia proprietary driver
- Run some 3D for a few seconds (e.g. fire up glxgears)
- Stop tracing and send the generated output to mmio.dumps at gmail.com

It's explained in more depth here:
http://nouveau.freedesktop.org/wiki/MmioTrace
http://cgit.freedesktop.org/nouveau/linux-2.6/tree/Documentation/trace/mmiotrace.txt
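
The start and stop tracing steps above can be sketched roughly as follows. This is a minimal sketch, not part of the original comment, and it follows the kernel's mmiotrace.txt. It assumes debugfs is mounted at /sys/kernel/debug, the kernel was built with mmiotrace support, and the script runs as root. The helper names and the dump file name are made up for illustration.

```python
import subprocess
from pathlib import Path

# Assumption: debugfs is mounted here (mount -t debugfs debugfs /sys/kernel/debug).
TRACING_DIR = Path("/sys/kernel/debug/tracing")


def start_trace(dump_path, tracing=TRACING_DIR):
    """Enable the mmiotrace tracer and start copying trace_pipe into dump_path."""
    (tracing / "current_tracer").write_text("mmiotrace\n")
    with open(dump_path, "wb") as dump:
        # cat keeps its own copy of the file descriptor after we close ours.
        return subprocess.Popen(["cat", str(tracing / "trace_pipe")], stdout=dump)


def add_marker(text, tracing=TRACING_DIR):
    """Write a marker line into the trace, e.g. "X is up"."""
    (tracing / "trace_marker").write_text(text + "\n")


def stop_trace(reader, tracing=TRACING_DIR, timeout=10):
    """Switch back to the nop tracer and wait for the reader process to finish."""
    (tracing / "current_tracer").write_text("nop\n")
    try:
        reader.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The reader did not exit on its own; stop it explicitly.
        reader.terminate()
        reader.wait()
```

Between start_trace("mydump.txt") and stop_trace(reader), load nvidia.ko, start X with the proprietary driver, and run glxgears for a few seconds, as the steps describe. The resulting dump is what would be sent to the address given above.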

TJ (tj) wrote :

I believe I may be seeing the same symptom on a Sony Vaio VGN-FE41Z that contains an Nvidia GeForce Go 7600 (NV40) [Xorg: Chipset: "NVIDIA NV4b"].

In this case though, it is hard to determine what precisely is causing it.

The symptom is that the plymouth splash screen remains with the "Press C to cancel checking disks" message. Some disk activity is evident from occasional LED flashes, but the system doesn't respond to VT switch key-presses (Alt-F1 to Alt-F7) or any other key sequence. It responds to Alt+SysRq with S (sync disks) and B (reboot), but I didn't see any sign that K (kill processes on the tty) did anything useful, although it is possible that it had invisibly switched from tty7 without changing the display.

Examining the Lucid log files from the regular Karmic environment I couldn't see anything obvious in terms of errors, but I did notice that /var/log/Xorg.0.log had been written to but didn't seem to be complete.

At the time I looked I wasn't sure what the final entries in that log-file would usually be, but having managed a successful start it appears that Xorg gets 'stuck' after nouveau has started whilst additional modules are being loaded - I'll do another test to get a fresh log now I know what to look for.

In this case I found that a workaround was to unplug the external DVI-D monitor from the docking station (the internal LVDS output was mirrored on the external DVI-D), so on that basis I'm not sure if this bug is 'the one'.

TJ (tj) wrote :

It appears I can reproduce this bug with the NV40. When starting Lucid with the external DVI-D connected and plymouth output mirrored, it 'stops' at the plymouth splash screen as described previously.

After I use Alt+SysRq+K to kill the processes on the current terminal (tty7 at this point), disk activity increases and Xorg restarts and presents the GDM greeter/log-in screen, from where things progress.

I've not yet seen any similar freeze after logging in, but maybe djr013 can suggest an application/action that is almost guaranteed to produce one so I can test?

TJ (tj) wrote :

After more research I found bug #533135 "System fails to boot with plymouth installed (nouveau driver with >1 display)" which seems to more accurately fit the symptoms I'm experiencing.

In the upstream bug, Francisco Jerez (currojerez) wrote :

*** This bug has been marked as a duplicate of bug 25366 ***

Changed in nouveau:
importance: Unknown → Medium
status: Confirmed → Unknown
Changed in nouveau:
importance: Medium → Unknown
Changed in nouveau:
importance: Unknown → Medium
Changed in nouveau:
status: Unknown → Invalid
Timo Aaltonen (tjaalton) wrote :

This is fixed at least in natty, according to upstream.

Changed in xserver-xorg-video-nouveau (Ubuntu):
status: Triaged → Fix Released