nv, autodetection fails because of missing PCI-IDs, in nv.ids, for a large number of supported cards in Jaunty

Bug #385703 reported by Bojan Vitnik
This bug affects 1 person
Affects: xserver-xorg-video-nv (Ubuntu)
Status: Won't Fix
Importance: High
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Binary package hint: xserver-xorg-video-nv

In my case, "/usr/share/xserver-xorg/pci/nv.ids" was missing the PCI-ID for my 7600 GS AGP (10de:02e1). X server autodetection failed and -vesa was used instead. After manually adding the appropriate PCI-ID to "nv.ids" and restarting the X server, autodetection worked like a charm and I didn't have any problems with the nv driver: changing resolution and refresh rate worked.
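
To check whether a given card is affected, something like the following may help. It is only a sketch: id_listed is a made-up helper name, and it assumes "nv.ids" holds one vendor+device ID per line in the form 10DE02E1 (lspci -nn prints the same pair as 10de:02e1).

```sh
#!/bin/sh
# Hypothetical helper: succeeds when a vendor+device ID (e.g. 10DE02E1)
# appears on a line of its own in an .ids file. Matching ignores case.
id_listed() {
    grep -qix -- "$2" "$1"
}

ids_file=/usr/share/xserver-xorg/pci/nv.ids
if [ -r "$ids_file" ]; then
    if id_listed "$ids_file" 10DE02E1; then
        echo "10DE02E1 is listed"
    else
        echo "10DE02E1 is missing"
    fi
fi
```

If the ID is reported missing, the server will not pick -nv automatically for that card, which matches the fallback to -vesa described above.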

I took some time to figure out what was really going on here. It seems that a large number of supported cards are affected by this bug. The problem is the way "nv.ids" is generated. As far as I could tell, it's generated from "nv_driver.c" (NVKnownChipsets[], to be exact) by a simple awk+sed script. Not all PCI-IDs of supported cards are listed there, so "nv.ids" is missing the PCI-IDs that are not explicitly found in "nv_driver.c". Some supported cards, for example the AGP versions of cards where both AGP and PCI-E versions exist, are handled differently by the code.

Here is the code snippet from nv_driver.c:

    const CARD32 id = ((dev->device_id & 0xfff0) == 0x00F0 ||
                       (dev->device_id & 0xfff0) == 0x02E0) ?
                      NVGetPCIXpressChip(dev) : dev->vendor_id << 16 | dev->device_id;
    const char *name = xf86TokenToString(NVKnownChipsets, id);

So, cards with PCI-IDs 02Ex and 00Fx (AGP) are handled by translating their PCI-IDs into the equivalent PCI-E PCI-IDs, which are then looked up in NVKnownChipsets[]. For example, the PCI-ID of my card, 02E1, is translated to 0392, looked up in NVKnownChipsets[] and then handled properly. NVGetPCIXpressChip() is used in the process and, as far as I could tell, it's meddling with registers to do its job. Even more supported cards are handled by NVIsSupported() and NVIsG80() using bitmasks.
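
The mask test in the snippet above can be tried out in a shell. This is a rough sketch of the classification step only: is_bridge_id is a hypothetical name, and the register reads done by NVGetPCIXpressChip() are not reproduced here.

```sh
#!/bin/sh
# Hypothetical helper: succeeds when a 16-bit device ID (hex, no 0x prefix)
# falls in the 00Fx or 02Ex ranges, i.e. the IDs that nv_driver.c hands to
# NVGetPCIXpressChip() instead of looking them up directly.
is_bridge_id() {
    masked=$(( 0x$1 & 0xfff0 ))
    [ "$masked" -eq $(( 0x00F0 )) ] || [ "$masked" -eq $(( 0x02E0 )) ]
}

for id in 02E1 00F3 0392; do
    if is_bridge_id "$id"; then
        echo "$id: bridge range, real chip is resolved at runtime"
    else
        echo "$id: looked up directly"
    fi
done
```

Only the IDs in the second group can be expected to appear in a list generated from NVKnownChipsets[], which is why the first group goes missing from "nv.ids".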

All in all, "nv.ids" is incomplete because of the way it's generated at the moment.

Solutions:

1) Add the supported PCI-IDs, 02Ex and 00Fx, to "nv.ids" manually, after "nv.ids" is generated with the awk+sed script. Consult "pci.ids" if needed.
2) Add the supported PCI-IDs to NVKnownChipsets[] and then generate "nv.ids". Those entries in NVKnownChipsets[] would never be used by the code, so the question is whether they should even be there.
3) Change how the X server handles .ids files upstream, by adding support for wildcards/bitmasks. "nv.ids" would be generated differently. (ideal)

References used:

xserver-xorg-video-nv_2.1.12.orig.tar.gz - (nv_driver.c)
xserver-xorg-video-nv_2.1.12-1ubuntu5.diff.gz - (how nv.ids is generated)
/usr/share/xserver-xorg/pci/nv.ids
http://pciids.sourceforge.net/pci.ids
http://cgit.freedesktop.org/xorg/driver/xf86-video-nv/plain/src/nv_driver.c

Bojan Vitnik (bvitnik)
description: updated
Mario Limonciello (superm1) wrote :

Wildcards/bitmasks aren't necessarily a solution. There is hardware that is made by NVIDIA that doesn't work with NV, so if you just match on NVIDIA as a vendor, you are shooting yourself in the foot.

The best solution is:
 * Submit your IDs upstream, and then we can pull a patch.

Bojan Vitnik (bvitnik) wrote :

I never said that it should just "match NVIDIA as a vendor". I said that "nv.ids" should explicitly list, or match with wildcards/bitmasks, the PCI-IDs of all hardware that -nv *supports*, nothing more, nothing less, and that's not an upstream problem. "nv.ids" is not generated by upstream. So the problem is the cards with PCI-IDs 02Ex (02E0, 02E1, 02E2, 02E3 and 02E4, to be exact) and 00Fx (where x is 1-F), because of the way "nv.ids" is generated at the moment. While generating "nv.ids", the piece of code I posted above was not taken into consideration, so "nv.ids" is missing some PCI-IDs of cards that are properly supported by -nv.

It's not really a big problem; only autodetection is not working. I, myself, have no problem with that. I know how to fix it, but that's not the case for everybody.

Robert Hooker (sarvatt) wrote :

There's a slight problem with this that I had a discussion with the -nv maintainer about while attempting to submit the id's upstream. 01_gen_pci_ids.diff which creates the nv.ids only adds the things explicitly defined in the tables in src/nv_driver.c but the AGP cards report a pci id of the AGP-PCIe bridge chip initially. The driver then figures out what chip it actually is behind the bridge (if the nv driver is actually loaded) via NVGetPCIXpressChip in the code, but when nv.ids exists that internal detection logic is lost and xserver only loads the nv driver automatically for cards that exist in nv.ids. In jaunty you then get a fall back to vesa because of a patch in xserver to make vesa the default when no driver can load. In karmic currently you get a failsafe X boot because the fallback to vesa patch was dropped. If nv is explicitly defined for the device in xorg.conf it will work fine though in karmic as long as the driver supports your actual card (which it will for the AGP variants for the above mentioned reason).

The current method being used to add these cards is to add them to the tables in ubuntu's patches (102_geforce_7300_gt.patch and 104_geforce_6600gt_9100mg.patch). The maintainer suggested that was a bad idea in this specific case because the cards are already defined which could cause problems, and the problem instead is the detection logic brought in from 01_gen_pci_ids.diff.

I've attached a patch that adds support for every AGP-PCIe bridge chip I could find in a manner that will only permit the id's to be included in the nv.ids generation and not interfere with any actual card detections as suggested by the driver maintainer. Adding them behind a #if 0/#endif makes sure they are never actually used for anything, but will get included by the nv.ids generation so that the server will pick the -nv driver and let it do its detection work.

There is still the small problem of the nv.ids generation routine not accounting for the catchall cases, which are set up so that the driver will work for newer chips in a series that haven't been explicitly added to the tables. For example, look around lines 713 and 738 here:

http://cgit.freedesktop.org/xorg/driver/xf86-video-nv/tree/src/nv_driver.c?id=36eb96854b34bee6b65a2b2d4df25f53b47194e4

That will expand each supported chip range, from 10DE0390 through 10DE039F for instance, to account for unreleased future cards or things not explicitly defined in the tables. The pci id generation routine ignores all of that though, so it would probably be a good idea to expand each one hidden behind the #if 0/#endif from my patch so 01_gen_pci_ids.diff will actually create id's for them for better support on older releases. I've submitted the patch upstream to the maintainer, but since it is a debian/ubuntu specific thing and nv.ids generation is going the way of the dodo now (debian already dropped it), I'm not sure it's relevant there.
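
As an illustration of what expanding one such range would produce, here is a small sketch. It assumes the catchall amounts to a mask on the low nibble, as the 10DE0390 through 10DE039F example suggests; expand_range is a made-up name and is not part of the patch.

```sh
#!/bin/sh
# Hypothetical helper: print the 16 vendor+device IDs that share the upper
# three hex digits of the given device ID, e.g. 0390 -> 10DE0390..10DE039F.
expand_range() {
    base=$(( 0x$1 & 0xfff0 ))
    n=0
    while [ "$n" -lt 16 ]; do
        printf '10DE%04X\n' $(( base + n ))
        n=$(( n + 1 ))
    done
}

expand_range 0390
```

The output uses the same one-ID-per-line form as "nv.ids", so it could be merged into the generated list with sort -u.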

Bojan Vitnik (bvitnik) wrote :

My point exactly. Thanks for helping me clarify this bug even more. Since no one is paying attention, I was prepared to make a patch myself, though my idea was to modify the way "nv.ids" is generated in "01_gen_pci_ids.diff".

At the moment the script looks like this:

awk '/{ 0x.*/ || /case 0x.*/ {print $$2}' ${srcdir}/nv_driver.c | sed -e s/0x// -e s/,// -e s/:// -e s/^0/10DE0/ | sort -u > nv.ids

This line doesn't even work and "nv.ids" must be cleaned afterwards. "/case 0x.*/" is useless too. My idea was to change it to something like this:

awk '/{ 0x.*/ {print $$2}' nv_driver.c | sed -e 's/ { //' -e 's/0x//' -e 's/,.*//' -e 's/:.*//' > nv_tmp.ids
for i in 0 1 2 3 4 5 6 7 8 9 A B C D E F; do echo "10DE00F$i"; done >> nv_tmp.ids
for i in 0 1 2 3 4; do echo "10DE02E$i"; done >> nv_tmp.ids
sort -u nv_tmp.ids > nv.ids

There is probably a more elegant way but I'm not really good at bash :).
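
One possible tidier arrangement, offered as a sketch rather than a tested replacement for 01_gen_pci_ids.diff: the function names are invented, the two sample table lines are stand-ins rather than real nv_driver.c content, and the entry layout { 0x10DE...., "name" }, is assumed. It is written for a plain shell, so awk gets $2; the doubled $$2 in the lines above is presumably Makefile escaping.

```sh
#!/bin/sh
# Sketch only: helper names and the sample table lines are illustrative.

# Pull the hex ID out of lines shaped like:  { 0x10DE0392, "name" },
extract_table_ids() {
    awk '/[{] 0x/ { print $2 }' "$1" | sed -e 's/^0x//' -e 's/,.*$//'
}

# The AGP bridge ranges discussed in this report: 00F0-00FF and 02E0-02E4.
bridge_ids() {
    for i in 0 1 2 3 4 5 6 7 8 9 A B C D E F; do echo "10DE00F$i"; done
    for i in 0 1 2 3 4; do echo "10DE02E$i"; done
}

# Merge both sources and drop duplicates, with no temporary file needed.
generate_ids() {
    { extract_table_ids "$1"; bridge_ids; } | sort -u
}

# Demo on a two-line stand-in for the chipset table.
sample=$(mktemp)
cat > "$sample" <<'EOF'
    { 0x10DE0391, "example chip A" },
    { 0x10DE0392, "example chip B" },
EOF
generate_ids "$sample"
rm -f "$sample"
```

Grouping the two sources in braces and piping them into a single sort -u avoids the intermediate nv_tmp.ids file; redirecting the result to nv.ids would give the same kind of list as the commands above.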

In the end, it seems that it doesn't even matter any more since *.ids are going to be dropped.

Bryce Harrington (bryce) wrote :

Thanks Sarvatt, I'm pulling your patch now.

I notice this covers the 6600 GT case, so am dropping that bit from patch 104, however it does not appear to cover the 9100, 7025 and 7050, so have left those patches. Maybe you could doublecheck if we need to redo those patches as well to be included here?

Changed in xserver-xorg-video-nv (Ubuntu):
importance: Undecided → High
status: New → Triaged
Bryce Harrington (bryce) wrote :

I've packaged the changes and put a -nv package into my PPA:

   https://edge.launchpad.net/~bryceharrington/+archive/ppa

Please test and let me know this is working properly, and I'll upload for karmic.

Changed in xserver-xorg-video-nv (Ubuntu):
status: Triaged → In Progress
Bojan Vitnik (bvitnik) wrote :

9100, 7025 and 7050 are either mobile or integrated chips. It seems they are not officially supported by the driver. Their IDs are not referenced anywhere in the source code. The upstream maintainer, Aaron Plattner I believe, should be asked whether those chips are supported or not. Were those IDs left out by mistake or on purpose? A rather large number of mobile chips is affected: IDs 07xx, 08xx and some 053x. Has anyone tested those chips against -nv by adding the IDs to the source code? According to bug report 321613, 9100M works, but what about the others?

Robert Hooker (sarvatt) wrote :

7xxx IGP support was brought in by a patch that fedora uses which I have tested and does work fine. It goes above and beyond just adding pci ids. 9100M support I do not by any means believe actually works but I do not have the hardware to test it and I have not been able to get one Xorg.0.log to prove it working (in the same bug you linked). Aaron Plattner is the one I spoke with regarding the pci id change Bryce added to the drivers on the PPA, and the method I used in there is what was suggested by him actually. I have forwarded the 7xxx IGP support patch to him but have not heard any response about it.

Bryce: I just tested your package and it works correctly on a 6800 AGP that didn't work with the stock drivers. Removing pci ids install from the package also makes it work.

Bojan Vitnik (bvitnik) wrote :

Bryce, I tested your package. Autodetection and driver worked as expected with my card (7600 GS AGP). Logs attached.

Bryce Harrington (bryce)
tags: added: jaunty
Bryce Harrington (bryce) wrote :

Sarvatt, am I correct in understanding that with the latest -nv 2.1.14 code, this patch is no longer necessary?

Changed in xserver-xorg-video-nv (Ubuntu):
status: In Progress → Incomplete
Bojan Vitnik (bvitnik) wrote :

Bryce, I took some time to test the Beta version of Karmic Koala Kubuntu, LiveCD only though. I must say that things are even worse now. Not only does autodetection fail, but now -nv doesn't work at all with my card. I also took some time to look at the new patches introduced in the Ubuntu package of -nv 2.1.14, "0001-Move-the-logic-for...patch" and "0002-Run-the-parsing-script...patch". If I understand correctly, the NVKnownChipsets[] and NVPciIdMatchList[] tables, this time in separate files, "nvidia_chipset_gen.h" and "nvidia_pci_device_match_gen.h", are now generated from a file called "nv_list.cvs", using a Perl script called "parse_pci_ids.pl". I suppose "nv_list.cvs" is maintained manually. The problem is that, once again, PCI-IDs 02E0-02E4 and 00F0-00FF are missing, this time in "nv_list.cvs". That's probably because "nv_list.cvs" was based on NVKnownChipsets[]. As we already confirmed, NVKnownChipsets[] is not a complete table of all *supported* chips. Fixing this problem now is not as simple as before (just by editing nv.ids). This time "nv_list.cvs" has to be edited, NVKnownChipsets[] and NVPciIdMatchList[] have to be regenerated, and the driver has to be recompiled.
As for the problem of -nv failing to work completely, as far as I could tell, it's because of the NVPciIdMatchList[] table. In the original source code NVPciIdMatchList[] matched any chip with an nVidia vendor ID (12D2 and 10DE), but after the patches it only matches a selection of PCI-IDs generated from "nv_list.cvs". So the driver reports to the X Server that my card is not supported. In the end -vesa is used instead.
Xorg.0.log is attached. It can be clearly seen that -nv was picked by the X Server as a candidate (so autodetection did work?) but dropped later in favor of -vesa.

Solution:
------------
Add PCI-IDs 02E0-02E4 and 00F0-00FF to "nv_list.cvs", use "parse_pci_ids.pl", and recompile the driver.

Note: PCI-IDs 02E0-02E4 and 00F0-00FF are now going to be in NVKnownChipsets[], which is exactly what Sarvatt's patch was trying to avoid, so maybe some fixes to "parse_pci_ids.pl" are also needed.

Bojan Vitnik (bvitnik) wrote :

This is a patched version of "nv_list.csv".

I made a little typo in last comment - it was "nv_list.csv" not "nv_list.cvs".

Bojan Vitnik (bvitnik) wrote :

Regarding the note at the end of comment #11: after some further inspection of Mario's patches ("0001-Move-the-logic-for...patch" and "0002-Run-the-parsing-script...patch"), I now see that NVKnownChipsets[], generated by "parse_pci_ids.pl", won't contain any PCI-IDs that don't have a matching card name string, so it's pretty safe to add the missing PCI-IDs to "nv_list.csv". NVKnownChipsets[] won't be affected.

Bryce Harrington (bryce) wrote :

Thank you for reporting this issue about xserver-xorg-video-nv. Starting with Lucid, Ubuntu is transitioning to using the -nouveau video driver by default instead of -nv. The reason for this change is that upstream development for the -nv driver has been quite slow. We hope bugs will get fixed faster with -nouveau as well.

Because of this, I'm closing this bug report at this time. I'm marking it wontfix because what you describe is probably a valid issue, but we do not have intentions to work on it in Ubuntu. If you would still like to see this issue investigated, I would encourage you to file it upstream.

Changed in xserver-xorg-video-nv (Ubuntu):
status: Incomplete → Won't Fix
Bojan Vitnik (bvitnik) wrote :

Bryce, thanks for responding. I look forward to Lucid, -nouveau and KMS on nVidia cards. In my experience -nouveau worked nicely with my card, as far as proper resolution and refresh rate are concerned. I tested it in Fedora 12. KMS worked nicely too. I hope it will be the same for Lucid.
Since Lucid will be using -nouveau instead of -nv, my problem is not relevant any more and, as far as I'm concerned, this bug report can be closed.
