Hidden file does not get removed when switching from nvidia-glx-new/nvidia-glx-legacy to nvidia-glx causing X not to start due to mismatch of versions

Bug #106217 reported by heckheck
Affects                                     Status         Importance  Assigned to  Milestone
linux-restricted-modules-2.6.20 (Ubuntu)    Won't Fix      Undecided   Unassigned
  Nominated for Feisty by zeddock
  Gutsy                                     Invalid        Undecided   Unassigned
linux-restricted-modules-2.6.22 (Ubuntu)    Fix Released   High        Unassigned
  Nominated for Feisty by zeddock
  Gutsy                                     Fix Released   Undecided   Unassigned

Bug Description

I found that when I switched back from the new nvidia-glx-new package to the nvidia-glx package, a hidden file, .nvidia_new_installed, was left behind. X then won't start, because /sbin/lrm-video continues to load the nvidia-new driver due to the hidden file.

root@heckmedia:/lib/linux-restricted-modules# ls -la
total 24
drwxr-xr-x 3 root root 4096 Apr 13 09:34 .
drwxr-xr-x 16 root root 12288 Apr 10 19:40 ..
-rw-r--r-- 1 root root 58 Apr 10 19:35 .nvidia_new_installed
drwxr-xr-x 13 root root 4096 Apr 10 19:43 2.6.20-14-generic
root@heckmedia:/lib/linux-restricted-modules#

I can get X to start by moving the hidden file out of the way (renaming it to .hidden) and running:

modprobe -r nvidia
startx
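
The workaround above can be sketched as a small script. This is only an illustration: the function name park_marker and the MARKER variable are made up for the example, and the script assumes it is run as root with no X server running.

```shell
#!/bin/sh
# Sketch of the workaround described in this report (names are illustrative).
# MARKER defaults to the hidden file shown in the ls output above.
MARKER="${MARKER:-/lib/linux-restricted-modules/.nvidia_new_installed}"

park_marker() {
    # Rename the marker to ".hidden" in the same directory, if it exists.
    if [ -e "$MARKER" ]; then
        mv "$MARKER" "$(dirname "$MARKER")/.hidden"
    fi
}

park_marker
# Afterwards, unload the mismatched module and start X again:
#   modprobe -r nvidia
#   startx
```

Renaming rather than deleting keeps the file around in case it needs to be restored.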

The error that X spits out when I load with the mismatched drivers is

Error: API mismatch: the NVIDIA kernel module has the version 1.0-9755, but this X module has the version 1.0-9631. Please make sure that the kernel module and all NVIDIA driver components have the same version.

Here is the last entry in my /var/log/aptitude file

Log complete.
Aptitude 0.4.4: log report
Fri, Apr 13 2007 08:22:59 -0400

IMPORTANT: this log only lists intended actions; actions which fail due to
dpkg problems may not be completed.

Will install 2 packages, and remove 2 packages.
2892kB of disk space will be freed
===============================================================================
[HOLD] linux-image-2.6.20-14-generic
[INSTALL] nvidia-glx
[INSTALL] nvidia-kernel-source
[REMOVE] nvidia-glx-new
[REMOVE] nvidia-new-kernel-source
===============================================================================

Log complete.

Capineiro Capaz (paulogotardo) wrote :

I can confirm that problem!

I've installed Feisty Fawn beta and updated to the latest packages as of 15-APR-07. My PC has an integrated GeForce4 MX and it wouldn't work with nvidia-glx-new or nvidia-glx. It did work with nvidia-glx-legacy, but I believe this old version does not support X composite and also has problems with suspend/hibernate (at least here).

So I decided to remove the Ubuntu packages and install the binary package from Nvidia's web site. I downloaded version ...9631, which is not the newest one but is the latest to support my chipset. After installing it successfully, it still didn't work.

I've found out that "modprobe nvidia" causes /sbin/lrm-video to run and check whether there is a file "/lib/linux-restricted-modules/.nvidia_new_installed". It so happens that this file existed (it was still there!) even though I had already removed the package "nvidia-glx-new" that had been installed at first.

I deleted this file and now the driver version 9631 works and I believe that installing the deb package for nvidia-glx would work too.

Cheers,

Paulo
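
For readers unfamiliar with this kind of check, here is a rough sketch of how a marker file can end up selecting the wrong driver. It is not the real /sbin/lrm-video script: the function name, the order in which the markers are tested, and the printed labels are all assumptions made for illustration. Only the directory and the two marker file names come from this report.

```shell
#!/bin/sh
# Hypothetical sketch only: this is NOT the real /sbin/lrm-video.
# It shows how a leftover marker file can make a marker-based check
# pick a driver flavour whose package is no longer installed.
LRM_DIR="${LRM_DIR:-/lib/linux-restricted-modules}"

# Print the driver flavour such a check would select for a directory.
pick_nvidia_flavour() {
    dir="${1:-$LRM_DIR}"
    if [ -e "$dir/.nvidia_new_installed" ]; then
        echo "nvidia-glx-new"
    elif [ -e "$dir/.nvidia_legacy_installed" ]; then
        echo "nvidia-glx-legacy"
    else
        echo "nvidia-glx"
    fi
}

pick_nvidia_flavour "$@"
```

With a stale .nvidia_new_installed in place, a check like this keeps choosing the -new flavour even after the package has been removed, which matches the API mismatch described above.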

kripken (kripkenstein) wrote :

I can also confirm this problem.

Paulo's fix worked for me, thanks Paulo.

Sitsofe Wheeler (sitsofe) wrote :

Confirming based on kripkenstein's comment.

Changed in linux-restricted-modules-2.6.20:
status: Unconfirmed → Confirmed
Tristan Schmelcher (tschmelcher) wrote :

I too can confirm this, and the posted fix works. But please make the fix automatic in a future version. Tracking this down ate up an entire evening of my time.

elventear (elventear) wrote :

I had a similar problem but I haven't been able to solve it. No matter which driver I have installed (nvidia-glx/-legacy), I can't start X. When I run dmesg I get a message saying that v1.0-9755 doesn't support my card and that only the legacy 1.0-96xx driver does. But I still get the same message no matter what, and I have removed nvidia-glx-new, so there must be something else staying behind.

Sitsofe Wheeler (sitsofe) wrote :

elventear:
There's not enough information in your comment to make a diagnosis of the problem. Could you create a new bug report and upload
/etc/X11/xorg.conf
/var/log/Xorg.0.log
and also add the output of
ls -al /lib/linux-restricted-modules
Can you also indicate whether you have ever tried to manually install the NVIDIA binary drivers, or used a third-party tool not provided by Ubuntu to do so? Then please post a link to your new bug report in this bug.

Nicolas Wu (nicolas.wu) wrote :

I can also confirm this bug.

Downgrading from nvidia-glx-new to nvidia-glx does not remove the file /lib/linux-restricted-modules/.nvidia_new_installed. This causes the wrong nvidia kernel module to be loaded (it loads the nvidia-glx-new module, instead of the nvidia-glx one), and causes an API mismatch.

The fix is indeed to remove the /lib/linux-restricted-modules/.nvidia_new_installed file; this ought to be done when removing the nvidia-glx-new package.
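
A removal hook along these lines would do that. This is a minimal sketch, not the actual packaging code: it assumes the usual Debian convention that a maintainer script receives the action as its first argument, and the function name and LRM_DIR variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of a removal hook for nvidia-glx-new; not the
# actual packaging code. Debian maintainer scripts get the action as "$1".
LRM_DIR="${LRM_DIR:-/lib/linux-restricted-modules}"

remove_new_marker() {
    action="$1"
    case "$action" in
        remove|purge)
            # -f: succeed even if the marker is already gone.
            rm -f "$LRM_DIR/.nvidia_new_installed"
            ;;
        *)
            # Any other action (e.g. upgrade): leave the marker alone.
            ;;
    esac
}

remove_new_marker "${1:-}"
```

Keeping the marker on upgrade matters, because the package is still installed afterwards and the marker is still accurate.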

Hugo Vincent (hugo-vincent) wrote :

I got this too, and it took me ages to work out and fix :-)

Serves me right for forgetting about the new restricted-manager and just installing the drivers with synaptic.

Dmitry Kotik (dkotik)
Changed in linux-restricted-modules-2.6.20:
assignee: nobody → dkotik
assignee: dkotik → nobody
Philip Belemezov (phible) wrote :

Well, me too... :(

Is this going to be fixed soon?

Ludi Maciel (iludi-deactivatedaccount) wrote :

I can confirm this problem too.
I had to manually remove all nvidia modules and reinstall nvidia-glx and restricted-modules again (using dpkg, since apt-get nvidia-glx requires an old module/kernel version). It's working now.

Sitsofe Wheeler (sitsofe) wrote :

Ludi:
Your comment doesn't make sense in the context of this bug. This issue is only for people switching from nvidia-glx-new/nvidia-glx-legacy to just nvidia-glx. If you have only ever used nvidia-glx then this is definitely not the bug for you. Please can you search launchpad for a bug that better fits your problem and if none is found please file a new bug report.

Ludi Maciel (iludi-deactivatedaccount) wrote :

I'm getting the same error message that was described above.
After a reboot, my system keeps refusing to load X again.

Thanks to the dubious package description, I installed the nvidia-glx-legacy. This driver won't work for my card and the nvidia-glx (which is the right one) now won't(?) load. I removed the nvidia-glx-legacy package but the message keeps showing up every time that I use the 'nvidia' driver in the device section of xorg.conf.

I'm using the 2.6.20-16-generic kernel.

Are you sure that this has nothing to do with this bug?

Ludi Maciel (iludi-deactivatedaccount) wrote :

               "It so happens that this file existed (it was still there!) even though I had already removed the package "nvidia-glx-new" that had been installed at first.
               I deleted this file and now the driver version 9631 works and I believe that installing the deb package for nvidia-glx would work too."

Thanks Capineiro. Your fix did the trick here.

Sitsofe Wheeler (sitsofe) wrote :

Ludi:
Your issue appears to be precisely this bug so I was completely wrong. I was unaware that you had nvidia-glx-legacy and had then subsequently tried to install nvidia-glx.

Richard Kleeman (kleeman) wrote :

This bug is cropping up A LOT on the nvidia linux forum. It would be good from a Ubuntu publicity angle if a fix was backported to Feisty fairly soon. Just my 2 cents......

Sitsofe Wheeler (sitsofe) wrote :

Richard:
Hmm that's worrying (I noticed that was cropping up a lot too and I've been trying to add the most frequently reported bugs to https://help.ubuntu.com/community/BinaryDriverHowto/Nvidia with the most common problems nearer the top) but this is only a symptom. People are switching from -new to nvidia-glx for a reason and I suspect that's why we're seeing this crop up more. Going from -legacy to nvidia-glx is even more worrying as the -legacy drivers don't cope well with Composite and really should not be used on cards that have the possibility of using something newer. Whatever is making people attempt these switches in greater numbers needs addressing...

Fran6co (fran6co) wrote :

I can confirm this bug, removing that file solved the problem.

Aaron Wohl (xub) wrote :

As to why people are messing with their nvidia drivers at all, here is how I fell into this:

- boot 7.04 from a live cd
- use restricted manager to install nvidia (thought I'd try it out first)
- it finishes ok but then says it needs to reboot, which I did not do, figuring the changes would go away rather than work, as it was on a CD and not a writable file system
- install to a hard disk (new format)

This installs X11 stuff thinking it can use nvidia as it is on the live cd but the restricted stuff is not installed and X11 doesn't startup.

It seems like a bug to me that changes to a live cd boot that need a reboot to start working have an effect on the install to hard disk. Eventually I gave up, booted the live cd and installed to a formatted disk, then booted the hard disk and ran the restricted driver manager there ok.

Sitsofe Wheeler (sitsofe) wrote :

Aaron:
That issue is almost certainly worth spinning off into a new bug of its own...

bodhi.zazen (bodhi.zazen) wrote :

AAARRRRGHHH ....

It took me two days and crawling the Nvidia forums to sort this one out :(

In case anyone is reading this looking for a solution, here it is :

  sudo rm /lib/linux-restricted-modules/.nvidia_new_installed

I know this is a temporary solution, still ...

Christian Kellner (gicmo) wrote :

It took me like 3 hours to find this bug and get rid of the file, this should really be fixed ASAP! ;-)

Sitsofe Wheeler (sitsofe) wrote :

Bug #130799 says that this issue is still happening in Gutsy...

Joseph Garvin (k04jg02) wrote :

I can confirm the bug is still present in Gutsy Tribe 4 (still silently messes up manual installation of the binary nvidia drivers).

Ethan Bissett (draimus-deactivatedaccount) wrote :

The problem seems to be in the prerm script of the deb file. Instead of removing /lib/linux-restricted-modules/.nvidia_new_installed it tries to remove /lib/linux-restricted-modules/.nvidia_legacy_installed.

Jos Dehaes (jos-dehaes) wrote :

Still a bug in gutsy (just ran into it). Removing the offending file worked for me.

zeddock (zeddock) wrote :

Tried to nominate for Gutsy but it was not listed.

zeddock

John Dong (jdong) wrote :

This is definitely still a bug in gutsy up to today. It would be nice if it were fixed by Gutsy's release, as it should just be another rm statement in postrm. Flipping between nvidia-glx/nvidia-glx-new is pretty common for people who are trying to troubleshoot desktop-effects glitches and this is a highly nontrivial nuance to find.

Changed in linux-restricted-modules-2.6.22:
status: New → Confirmed
Mark Carey (careym) wrote :

I hit this one as well on Gutsy 7.10rc today: the machine wouldn't start gdm using nvidia-glx after first having nvidia-glx-new installed. I had to manually insmod the correct nvidia.ko and then start gdm. It's a bit of a pain, as nvidia-glx-new is no good for nouveau reverse engineering (it causes crashes on PCIE machines), so downgrading to nvidia-glx is required.

Deleting the file /lib/linux-restricted-modules/.nvidia_legacy_installed worked for me.

P.S. Nice work on the graphical xorg fallback configuration, much better than the ugly old white text on blue background with red buttons curses messages.

Changed in linux-restricted-modules-2.6.22:
assignee: nobody → ubuntu-kernel-team
importance: Undecided → High
milestone: none → gutsy-updates
status: Confirmed → Triaged
zeddock (zeddock) wrote : Re: [Bug 106217] Re: Hidden file does not get removed when switching from nvidia-glx-new/nvidia-glx-legacy to nvidia-glx causing X not to start due to mismatch of versions

Sorry Ben. I have now been educated on the setting of triaged.

Thanx for the help.

zeddock

On 10/18/07, Ben Collins <email address hidden> wrote:
>
> ** Changed in: linux-restricted-modules-2.6.22 (Ubuntu)
> Importance: Undecided => High
> Assignee: (unassigned) => Ubuntu Kernel Team (ubuntu-kernel-team)
> Status: Confirmed => Triaged
> Target: None => gutsy-updates
>

Leann Ogasawara (leannogasawara) wrote :

I've emailed a patch to the kernel-team mailing list for their consideration.

https://lists.ubuntu.com/archives/kernel-team/2007-October/001874.html

Thanks!

bpotato (imapotato2) wrote :

Actually, I'm not sure how I got into this state, but my problem is of the same sort and even more of a pain to fix. Not only did that hidden file have to be deleted as discussed, but SOMETHING is insmod'ing an old nvidia driver from somewhere. The only way I can get X to load is to rmmod nvidia and then startx.

I guess I'll be putting that in my startup scripts somewhere. Seems something really broke things.

Peter Clifton (pcjc2) wrote :

Try:
sudo update-initramfs -u

That should copy the new driver into the initrd, which might be where it's getting loaded from.

If this helps, and the new driver was installed via an official Ubuntu package, please note back here, as that update hook probably ought to be run after being installed. I'm not familiar with the nvidia package specifically.

unggnu (unggnu) wrote :

I can confirm this. Still happens in Gutsy with VGA compatible controller: nVidia Corporation NV17 [GeForce4 440 Go] (rev a3).

Sitsofe Wheeler (sitsofe) wrote :

Peter:
Experience says that many mismatch problems arise when people manually compile the driver themselves or use a tool like Envy or Automatix to install it. (Additionally, the NVIDIA driver should not be in the initramfs because it is not needed to boot the system. It could have been added, but you'd have to make that change manually.) As such, this bug is very attractive to people searching for "driver mismatch". However, this bug will become absolutely huge if everyone who has a driver mismatch for whatever reason starts posting here. If you were going to support these cases then please speak up! I saw your sterling work on the Intel bug, so I'd rather have you solve this bug and just unsubscribe myself to reduce the mail.

bpotato:
Can you indicate whether you have ever manually installed the NVIDIA drivers from the NVIDIA website or used a third-party tool like Envy or Automatix to install the drivers?

bpotato (imapotato2) wrote :

I did indeed manually install the NVIDIA drivers... but only after the X system was completely broken. I used the built-in package update program (is it called aptitude??) to change drivers initially. Once done, I rebooted and X failed to come up. I tried to use 'dselect' to go back to the other driver, but this first off declared a lot of conflicts and eventually seemed to work. And yet X still didn't come up with a reboot. That's when I started casting about, eventually trying to install the drivers directly from nvidia (which still didn't help).

Concerning "update-initramfs -u":

I tried that and it didn't help (after a reboot, of course). So out of curiosity, I tried it again. It yielded a different file, but still didn't help (after a reboot). Out of curiosity, I tried it a third time. It yielded yet another file. After the next reboot, that kernel is now trash. Panics on the second line of startup. Luckily, I still have the -12 kernel that initially installed. I suppose this problem could be based on my nvidia problems, but I'm suspecting that there's something unwholesome about "update-initramfs" as well.

Oh! To answer the implicit question a while back, I got started on this whole "update nvidia" quest because the driver Ubuntu installed as part of the initial installation tended to crash on video mode switches. Especially going from X to a console login. It doesn't crash _every_ time, just often enough to be annoying. And when it crashes, lights flash on the keyboard and the system is dead. One time it even died when the screen blanker initiated.

Peter Clifton (pcjc2) wrote :

As I don't have nvidia hardware, I'm not likely to be able to help with any underlying problems.
I hadn't realised it was common practice for people to install non-Ubuntu versions of these packages.

If there are bugs with the Ubuntu-shipped version of the drivers, please find the existing bug which matches your symptoms exactly, or open a new one. Often part of the debugging process will be to produce a "proper" .deb package of a newer driver to test, which shouldn't suffer the kind of conflicts you'll get using tools like Automatix.

update-initramfs causing breakage may indicate that the modules you have installed have corrupted or conflicted with those already installed. update-initramfs copies various modules (including some of the agp drivers, dri modules, etc.) to the ram disk, where some modules load from, so you may not have been loading them before now.

update-initramfs -u should update the initrd corresponding to the currently running kernel. I don't know why it's updating a different file each time you booted.

If you're still having issues, I'd suggest using aptitude to remove and re-install the latest kernel (linux-image, linux-restricted-modules and linux-ubuntu-modules at least).

bpotato (imapotato2) wrote :

Thanks much for the advice. I did uninstall and reinstall the mentioned drivers. That alone does not seem to have fixed the problem, but perhaps it helped. While installing one of them, I noticed that it mentioned that I should run /usr/sbin/nvidia-glx-config.

I tried both "enable" and "disable", I don't remember in which order, and neither worked. I tried one or the other again, and X came up during boot! So I shut down and tried again, and the system locked. After a reboot, X came up. It seems to be reliably coming up now.

I'm still thinking there's something unwholesome, probably in the nvidia driver, but I'm willing to leave it alone if it'll leave me alone.

Anyway, nvidia-glx-config seems to be part of the puzzle.

Ben Collins (ben-collins) wrote :

I believe lamont has uploaded a package to gutsy-proposed that includes Leann's patch. Should make its way to gutsy-updates after the SRU process.

Changed in linux-restricted-modules-2.6.22:
assignee: ubuntu-kernel-team → lamont
status: Triaged → Fix Committed
Martin Pitt (pitti) wrote :

Accepted into gutsy-proposed, please test.

Changed in linux-restricted-modules-2.6.22:
status: New → Fix Committed
Changed in linux-restricted-modules-2.6.20:
status: New → Invalid
Leann Ogasawara (leannogasawara) wrote :

Verification completed successfully.

After installation and removal of nvidia-glx-new version 100.14.19+2.6.22.4-14.9, the file /lib/linux-restricted-modules/.nvidia.new.installed still exists.
After installation and removal of nvidia-glx-new version 100.14.19+2.6.22.4-14.10, the file /lib/linux-restricted-modules/.nvidia.new.installed is removed.
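
A check like this verification can be scripted. The sketch below is an illustration only: the function name list_nvidia_markers is made up, and the glob is deliberately loose so that it matches any hidden ".nvidia…installed" marker regardless of the exact spelling.

```shell
#!/bin/sh
# Sketch: list leftover NVIDIA marker files in a directory.
# Returns 0 if at least one marker is found, 1 if none are present.
list_nvidia_markers() {
    dir="${1:-/lib/linux-restricted-modules}"
    found=1
    for f in "$dir"/.nvidia*installed; do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -e "$f" ] || continue
        echo "$f"
        found=0
    done
    return "$found"
}

list_nvidia_markers "$@" || echo "no marker files found"
```

Running it once after installing the package and again after removing it shows whether the removal step cleaned up after itself.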

Martin Pitt (pitti) wrote :

Copied to gutsy-updates, closing gutsy task. Please do the same fix for Hardy. Thank you!

Changed in linux-restricted-modules-2.6.22:
status: Fix Committed → Fix Released
Markyb86 (mark-baylin) wrote :

nvidia-glx-config enable

DID THE TRICK

then have to

sudo nvidia-settings

:-)

feisty

Timo Aaltonen (tjaalton) wrote :

This fix is also in hardy.

Changed in linux-restricted-modules-2.6.22:
assignee: lamont → nobody
status: Fix Committed → Fix Released
Bryce Harrington (bryce) wrote : linux-restricted-modules-2.6.20 is obsolete

This package has become obsolete so we're closing out the bug report as WONTFIX.
Thanks for reporting it though!

Changed in linux-restricted-modules-2.6.20:
status: Confirmed → Won't Fix