nV News Forums

 
 

nV News Forums (http://www.nvnews.net/vbulletin/index.php)
-   NVIDIA Linux (http://www.nvnews.net/vbulletin/forumdisplay.php?f=14)
-   -   BusID PCI on Server Boards (http://www.nvnews.net/vbulletin/showthread.php?t=142869)

Chello 12-21-09 07:58 AM

Xorg Failed to initialize the NVIDIA graphics device
 
Hello, I run Ubuntu on a server board with an NPF3600 and an NPF3050, and my problem is that the graphics driver doesn't detect my card when it's connected to the 3050 chipset.

lspci on NPF3600:
Code:

0000:18:00.0 VGA compatible controller: nVidia Corporation G96 [GeForce 9500 GT] (rev a1)
part of xorg.log
Code:

(--) PCI:*(0:24:0:0) 10de:0640:0000:0000 nVidia Corporation G96 [GeForce 9500 GT] rev 161, Mem @ 0xfa000000/16777216, 0xe0000000/268435456, 0xf8000000/33554432, I/O @ 0x00003000/128
(II) Dec 21 14:21:00 NVIDIA(0): NVIDIA GPU GeForce 9500 GT (G96) at PCI:24:0:0 (GPU-0)

When I place the card in the second PCI-E x16 slot, which is on the 3050, Xorg fails to start:

lspci on NPF3050:
Code:

0001:58:00.0 VGA compatible controller: nVidia Corporation G96 [GeForce 9500 GT] (rev a1)
parts of xorg.log
Code:

(--) PCI:*(1:88:0:0) 10de:0640:0000:0000 nVidia Corporation G96 [GeForce 9500 GT] rev 161, Mem @ 0xfa000000/16777216, 0xe0000000/268435456, 0xf8000000/33554432, I/O @ 0x00002000/128
..
..
(**) NVIDIA(0): Depth 24, (--) framebuffer bpp 32
(==) NVIDIA(0): RGB weight 888
(==) NVIDIA(0): Default visual is TrueColor
(==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
(**) Dec 20 14:39:48 NVIDIA(0): Enabling RENDER acceleration
(II) Dec 20 14:39:48 NVIDIA(0): Support for GLX with the Damage and Composite X extensions is
(II) Dec 20 14:39:48 NVIDIA(0):    enabled.
(EE) Dec 20 14:39:48 NVIDIA(0): Failed to initialize the NVIDIA graphics device!
(II) UnloadModule: "nvidia"
(II) UnloadModule: "wfb"
(II) UnloadModule: "fb"
(EE) Screen(s) found, but none have a usable configuration.

Fatal server error:
no screens found

Now my question is: how do I specify that the card is on the second chipset?

A format like BusID "PCI:18:0:0" doesn't include the chipset, does it?

I hope someone here knows a solution :)

AaronP 12-21-09 01:43 PM

Re: BusID PCI on Server Boards
 
Please see the forum sticky posts about how to report a problem, including the part about generating an nvidia-bug-report.log.gz file.

Chello 12-21-09 02:50 PM

Re: Xorg Failed to initialize the NVIDIA graphics device
 
2 Attachment(s)
OK, here they are:

nvidia-bug-report.log.gz = the non-working one (card on the NPF3050)

nvidia-bug-report.log.working-on-npf3600.gz = the working one

AaronP 12-21-09 03:53 PM

Re: BusID PCI on Server Boards
 
Hmm, that is interesting. I suspect the X server's not to blame here, but it's worth trying the BusID line. It looks like the server parses the "bus@domain:device:func" syntax, so you could try "PCI:88@1:0:0".
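
As a rough sketch of how such a BusID string could be derived from an lspci address: the helper below (`lspci_to_busid` is a hypothetical name, not part of any NVIDIA or X.Org tool) parses the hexadecimal `domain:bus:device.function` form and emits the "bus@domain:device:func" syntax mentioned above. It assumes the X server wants decimal numbers, which matches how the Xorg log in this thread prints them (bus 0x58 appears as 88).

```python
import re

# Matches lspci addresses with or without a leading PCI domain,
# e.g. "0001:58:00.0" or "18:00.0". All fields are hexadecimal.
_ADDR = re.compile(
    r"(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])"
)

def lspci_to_busid(address):
    """Convert an lspci address into an xorg.conf-style BusID string.

    Assumption: BusID fields are decimal, and a non-zero PCI domain is
    written as "bus@domain", as suggested in the post above.
    """
    m = _ADDR.fullmatch(address)
    if m is None:
        raise ValueError("unrecognised lspci address: %r" % address)
    domain, bus, dev, func = (
        int(g, 16) if g is not None else 0 for g in m.groups()
    )
    if domain:
        return "PCI:%d@%d:%d:%d" % (bus, domain, dev, func)
    return "PCI:%d:%d:%d" % (bus, dev, func)

print(lspci_to_busid("0001:58:00.0"))  # PCI:88@1:0:0
print(lspci_to_busid("0000:18:00.0"))  # PCI:24:0:0
```

Whether the server and driver actually honour the domain part on a given board is a separate question; the sketch only covers the string formatting.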

Chello 12-21-09 04:31 PM

Re: BusID PCI on Server Boards
 
I tested it with "PCI:88@1:0:0" and also with "PCI:58@1:0:0" (decimal and hex), but neither works.

But I found this inside the bug report:

/proc/driver/nvidia/cards/0
Code:

Model:                  GeForce 9500 GT
IRQ:                    41
Video BIOS:          ??.??.??.??.??
Card Type:          PCI-E
DMA Size:          32 bits
DMA Mask:          0xffffffff
Bus Location:          58.00.0

So the OS can't find the card's video BIOS in this slot. I don't know whether this is the problem, but in the other PCI-E slot it does see it:
Code:

Model:                  GeForce 9500 GT
IRQ:                    21
Video BIOS:          62.94.3c.00.00
Card Type:          PCI-E
DMA Size:          40 bits
DMA Mask:          0xffffffffff
Bus Location:          18.00.0
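
To make the differences between the two listings above easier to spot, here is a minimal sketch that parses the "Key: value" lines and prints only the fields that differ. The function names `parse_card_info` and `diff_card_info` are made up for this illustration; the input text is copied from the two /proc/driver/nvidia/cards/0 dumps.

```python
def parse_card_info(text):
    """Parse 'Key: value' lines (as in /proc/driver/nvidia/cards/N) into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            info[key.strip()] = value.strip()
    return info

def diff_card_info(a, b):
    """Return {field: (value_in_a, value_in_b)} for fields that differ."""
    return {
        k: (a.get(k), b.get(k))
        for k in sorted(a.keys() | b.keys())
        if a.get(k) != b.get(k)
    }

failing = """\
Model:                  GeForce 9500 GT
IRQ:                    41
Video BIOS:          ??.??.??.??.??
Card Type:          PCI-E
DMA Size:          32 bits
DMA Mask:          0xffffffff
Bus Location:          58.00.0
"""

working = """\
Model:                  GeForce 9500 GT
IRQ:                    21
Video BIOS:          62.94.3c.00.00
Card Type:          PCI-E
DMA Size:          40 bits
DMA Mask:          0xffffffffff
Bus Location:          18.00.0
"""

changes = diff_card_info(parse_card_info(failing), parse_card_info(working))
for field, (bad, good) in changes.items():
    print("%-13s failing=%-16s working=%s" % (field, bad, good))
```

Besides the missing Video BIOS string, the DMA size also differs between the two slots (32 bits versus 40 bits); the sketch only surfaces such differences and does not say which of them, if any, is the cause.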

I also added "vmalloc=320M" in GRUB, but that didn't help either.

Chello 12-21-09 05:44 PM

Re: BusID PCI on Server Boards
 
1 Attachment(s)
Here is one more bug report, taken while using the "vesa" driver; with it, the card works in this PCI-E slot.

AireTamStrm 01-05-10 10:23 PM

Re: BusID PCI on Server Boards
 
I am having a similar (if not the same) problem on a Tyan S7025 with a pair of GTX 260s. Xorg.0.log reports:

(II) Primary Device is: PCI 88@00:00:0
(EE) No devices detected.

This is despite trying several BusID lines in the Xorg configuration:

BusID "PCI:88@00:00:0"
BusID "PCI 88@00:00:0"
BusID "PCI:88:00:0"

/sbin/lspci :
88:00.0 VGA compatible controller: nVidia Corporation GT200 (GeForce GTX 260) (rev a1)
I'm on my phone right now, I'll post the bug report and more lspci magic here tomorrow.

AireTamStrm 01-06-10 08:28 AM

Re: BusID PCI on Server Boards
 
5 Attachment(s)
I have attached the nvidia bug report (Attachment 39092), as well as the output of lspci -t with one card (Attachment 39094) and with both cards in the system (Attachment 39095).

Additionally, there is the output of cat /proc/driver/nvidia/cards/* with both cards installed (Attachment 39097) and lspci (Attachment 39096).

This is a dual processor system with two Intel 5520 Northbridge chipsets. Perhaps the BusID addressing is different in this scenario?
http://www.vacuumtube.org.uk/images/S7025.jpg

AireTamStrm 01-06-10 11:24 AM

Re: BusID PCI on Server Boards
 
Another update: As per the Intel 5520 / 5500 Datasheet (http://www.intel.com/Assets/PDF/datasheet/321328.pdf), the above illustrated Dual IOH (I/O Hub) configuration appears as a single IOH:

Quote:

IOH supports a special mode to work with dual socket processor platform that allows
two IOHs to appear as a single IOH to the processors in the system. This mode results
in special behavior in the link and protocol layers. Each IOH will have a unique NodeID
for communication between each other, but only the legacy IOH’s NodeID will be
exposed to the CPU.
The protocol flows to the processor (DRAM or interrupts) divide behavior between the
master IOH and the slave IOH. The master is the one that is connected directly to the
home agent. The slave is the IOH connected to the non-home processor. Each IOH will
behave as both a master and a slave depending on the processor that is being
targeted. The job of the master is to act as a Intel QuickPath Interconnect proxy for the
slave. This is done to ensure that the home agent only sees a single IOH NodeID
sending requests. This also requires that the master IOH resolve any conflicts that
occur between it and the slave.
I will continue testing and searching for a resolution or workaround; however, I would like to submit a support ticket with nVIDIA regarding this.

Chello 01-06-10 11:27 AM

Re: BusID PCI on Server Boards
 
My problem is also on a motherboard with two chipsets. It looks like the NVIDIA driver doesn't check for a card on the second chipset.

AireTamStrm 01-07-10 08:25 AM

Re: BusID PCI on Server Boards
 
Further testing more or less leads me to believe that this is a driver issue with dual-chipset configurations. When the card is in one of two x16 slots, the nVIDIA driver loads correctly and X can start. When it is in one of the other two, the device cannot be addressed (even explicitly using the PCI Bus ID) and the driver fails to initialize that card.

I'm not sure that this has anything to do with the Video BIOS Version not appearing, to be perfectly honest. I still received a value of ??.??.??.??.?? when I looked at /proc/driver/nvidia/cards/*, however the driver initialized on that very card. In X11/nvidia-settings, you can see the Video BIOS Version.

This behavior is most likely a bug in the nVIDIA GeForce Linux Driver. I had a support ticket open with Tyan, my motherboard manufacturer. They have tested three x16 Tesla cards in conjunction with a GeForce card, and the Tesla cards are able to function properly under Linux. Four GeForce GTX 295 cards are also reported to work with the Windows driver.

If nVIDIA could shed some light on this, it would be great.

AireTamStrm 01-08-10 09:29 PM

Re: BusID PCI on Server Boards
 
2 Attachment(s)
Update:

I was able to get the cards working on both chipsets. I had to use the decimal equivalents of the hex bus numbers for the BusIDs, so I'm not sure if this fix is applicable to the original poster's problem. The 5520s on my motherboard do some magic to appear as a single chipset.

Code:

/sbin/lspci -t (abridged for readability)
-+-[0000:80]-+-00.0-[8b]--   
...
 |          +-07.0-[84]----00.0
...
 \-[0000:00]-+-00.0
...
            +-07.0-[08]----00.0
...

Code:

/sbin/lspci | grep VGA
08:00.0 VGA compatible controller: nVidia Corporation GT200 [GeForce GTX 260] (rev a1)
84:00.0 VGA compatible controller: nVidia Corporation GT200 [GeForce GTX 260] (rev a1)

The decimal value of hex '08' is '8', and of hex '84' it is '132'.
Code:

Section "Device"
    Identifier    "Device0"
    Driver        "nvidia"
    VendorName    "NVIDIA Corporation"
    BoardName      "GeForce GTX 260"
    BusID          "PCI:8:0:0"
EndSection

Section "Device"
    Identifier    "Device1"
    Driver        "nvidia"
    VendorName    "NVIDIA Corporation"
    BoardName      "GeForce GTX 260"
    BusID          "PCI:132:0:0"
EndSection
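
The mapping from the lspci listing to the two Device sections above can be sketched in a few lines. lspci prints bus numbers in hexadecimal, and the BusID values in the sections are their decimal equivalents (0x84 = 132). The `device_sections` helper is a made-up name for this illustration and emits only a minimal section (no VendorName or BoardName); treat it as a starting point under those assumptions, not as a replacement for a proper configuration tool.

```python
import re

# Sample input: the `lspci | grep VGA` output shown above.
LSPCI_VGA = """\
08:00.0 VGA compatible controller: nVidia Corporation GT200 [GeForce GTX 260] (rev a1)
84:00.0 VGA compatible controller: nVidia Corporation GT200 [GeForce GTX 260] (rev a1)
"""

# bus:device.function, all hexadecimal, at the start of each lspci line.
_VGA_LINE = re.compile(
    r"^([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7]) VGA compatible controller"
)

def device_sections(lspci_output):
    """Build one minimal xorg.conf Device section per VGA line.

    Bus, device and function are converted from hex to decimal.
    """
    sections = []
    for line in lspci_output.splitlines():
        m = _VGA_LINE.match(line)
        if m is None:
            continue
        bus, dev, func = (int(g, 16) for g in m.groups())
        sections.append(
            'Section "Device"\n'
            '    Identifier    "Device%d"\n'
            '    Driver        "nvidia"\n'
            '    BusID         "PCI:%d:%d:%d"\n'
            'EndSection\n' % (len(sections), bus, dev, func)
        )
    return sections

print("\n".join(device_sections(LSPCI_VGA)))
```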

@Chello: If you have a second nVIDIA card lying around, I would suggest putting one card on one chipset and the other card on the secondary chipset. Start X11 on the first card, then open nvidia-settings and verify whether or not it sees the secondary card:
http://www.nvnews.net/vbulletin/atta...1&d=1263007670

Then confirm the BusID it sees. Attempt to configure this card in nvidia-settings if possible to see if that resolves your issue.
http://www.nvnews.net/vbulletin/atta...1&d=1263007670

Additional Note:
Once both cards are activated in X11, the Video BIOS version is visible for as long as the nVIDIA driver is active on the card(s).
Code:

cat /proc/driver/nvidia/cards/*
Model:          GeForce GTX 260
IRQ:            30
Video BIOS:      62.00.1a.00.1a
Card Type:      PCI-E
DMA Size:        40 bits
DMA Mask:        0xffffffffff
Bus Location:    08.00.0
Model:          GeForce GTX 260
IRQ:            54
Video BIOS:      62.00.38.00.50
Card Type:      PCI-E
DMA Size:        40 bits
DMA Mask:        0xffffffffff
Bus Location:    84.00.0



All times are GMT -5.