nV News Forums

 
 

To NVIDIA developers, a strange bug, kernel guys blame you (http://www.nvnews.net/vbulletin/showthread.php?t=156231)

artem 10-19-10 04:02 AM

To NVIDIA developers, a strange bug, kernel guys blame you
 
Here's a bug which I'm unable to resolve and kernel developers claim it's a problem with the NVIDIA driver.

Please, look into it, I'm ready to provide any necessary information.

The error I get is:
Code:

(EE) NVIDIA(0): The NVIDIA kernel module does not appear to be receiving
(EE) NVIDIA(0):    interrupts generated by the NVIDIA graphics device
(EE) NVIDIA(0):    PCI:5:0:0.  Please see Chapter 8: Common Problems in the
(EE) NVIDIA(0):    README for additional information.
(EE) NVIDIA(0): Failed to initialize the NVIDIA graphics device!


kwizart 10-19-10 06:36 AM

Re: To NVIDIA developers, a strange bug, kernel guys blame you
 
You can start by reading the sticky "If you have a problem, PLEASE read this first".

Then try the very latest driver, 260.19.12.

Also, one option I tend to use more and more is:
options nvidia NVreg_EnableMSI=1 in /etc/modprobe.d/nvidia.conf
You might give it a try. (Couldn't MSI be activated automatically in some cases?)
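
For reference, a minimal sketch of toggling that option from a shell. The function name write_nvidia_msi_option is hypothetical, and the sketch assumes the file holds nothing but this one line, since it overwrites whatever is there. The module has to be reloaded (or the machine rebooted) before the new value takes effect.

```shell
# Hypothetical helper: write the NVreg_EnableMSI option line to a modprobe
# config file. $1 is 0 (legacy interrupts) or 1 (MSI); $2 is the target file
# and defaults to the path mentioned above.
write_nvidia_msi_option() {
    value=$1
    conf=${2:-/etc/modprobe.d/nvidia.conf}
    case "$value" in
        0|1) ;;
        *)
            echo "usage: write_nvidia_msi_option 0|1 [conf-file]" >&2
            return 1
            ;;
    esac
    # Overwrites the file; merge by hand if it already holds other options.
    printf 'options nvidia NVreg_EnableMSI=%s\n' "$value" > "$conf"
}
```

Run as root, `write_nvidia_msi_option 0` would switch to legacy interrupts and `write_nvidia_msi_option 1` back to MSI. Passing a scratch path as the second argument lets you inspect the result first.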

artem 10-19-10 07:42 AM

Re: To NVIDIA developers, a strange bug, kernel guys blame you
 
I already have NVreg_EnableMSI=1 in my modprobe configuration:

Code:

# grep -i nvidia /proc/interrupts
 42:      4021      3914      3563      2909  PCI-MSI-edge      nvidia

The newest driver also exhibits this problem.

AaronP 10-19-10 08:36 AM

Re: To NVIDIA developers, a strange bug, kernel guys blame you
 
Interrupt routing is set up by the system BIOS and the kernel, and the driver has little to do with it so it's surprising that the kernel developers blame the driver. Have you tried disabling MSI?

artem 10-19-10 08:40 AM

Re: To NVIDIA developers, a strange bug, kernel guys blame you
 
I will try and report back ASAP.

Thanks, AaronP, disabling MSI helped and now I'm able to load the nvidia.ko module at any time.

However, my GPU now shares an interrupt with a USB controller:
Code:

grep -i nvidia /proc/interrupts
 16:        25        80        16        116  IO-APIC-fasteoi  ehci_hcd:usb1, nvidia

which I don't much like but I can live with.
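
The two /proc/interrupts outputs quoted in this thread can be told apart mechanically. Below is a rough sketch, not anything from the driver or the kernel: the function name nvidia_irq_mode and its output labels (msi, intx-shared, intx, none) are made up for illustration. It assumes that an MSI line contains "PCI-MSI" and that a shared line lists several handlers separated by commas, as in the outputs above.

```shell
# Hypothetical helper: classify how the nvidia interrupt is delivered.
# Reads /proc/interrupts-style text on stdin and prints one label.
nvidia_irq_mode() {
    awk '
        /nvidia/ {
            if ($0 ~ /PCI-MSI/)  mode = "msi"          # message-signaled
            else if ($0 ~ /,/)   mode = "intx-shared"  # several handlers on one line
            else                 mode = "intx"         # legacy line, not shared
            print mode
            found = 1
            exit
        }
        END { if (!found) print "none" }               # no nvidia handler registered
    '
}
```

Usage: `nvidia_irq_mode < /proc/interrupts`. Fed the MSI line from earlier in the thread it prints msi, and fed the IRQ 16 line above it prints intx-shared.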

Alan Cox said exactly this: "It's a non free driver only they have the source to all the parts so only they can debug it." :( And as expected the bug was closed as INVALID.

dae 10-19-10 09:57 AM

Re: To NVIDIA developers, a strange bug, kernel guys blame you
 
FYI, MSI stopped working for me when I switched from 260.19.06 to 260.19.12 (same kernel version). Something must've changed in the driver that broke it.

AaronP 10-19-10 12:18 PM

Re: To NVIDIA developers, a strange bug, kernel guys blame you
 
I talked to our kernel guy and he said that MSI is notoriously problematic throughout the hardware and software stack, and he recommended that you just stick with traditional interrupts.

dae 10-19-10 01:00 PM

Re: To NVIDIA developers, a strange bug, kernel guys blame you
 
AaronP, that's an interesting statement. I'm absolutely not questioning him, but I'd love to hear more.

I wasn't aware of any problems in the kernel (which I assume is what he means by "software stack"), and I have a number of devices using PCI MSI without any issues whatsoever. Is it related to nvidia hardware only, or is it a more general problem?

I don't mind using legacy INTx interrupts, as I don't think the overhead from interrupt sharing (nvidia shares interrupt with USB controller) is anything I need to be concerned about; I'm only asking out of curiosity.

artem 10-19-10 01:13 PM

Re: To NVIDIA developers, a strange bug, kernel guys blame you
 
Aaron, thanks for the information.

However, it's kinda weird that I've been running the MSI'ed nvidia.ko module for three years straight with no problems and this issue has surfaced only recently. (I must mention that I swapped my PC half a year ago, but even with the new one I didn't have this problem at the beginning; back then it was a different kernel and nvidia driver version.)

dura91 10-24-10 11:04 AM

Re: To NVIDIA developers, a strange bug, kernel guys blame you
 
MSI stopped working for me with driver 195.36.24 (195.36.31+ are too buggy for me) when I switched from kernel 2.6.35.4 to 2.6.35.7 or 2.6.36.

So there was a change in the kernel that made MSI stop working with the nvidia driver.

MSI still works fine with other hardware.

AaronP: On my computer it's the opposite; it really works better with MSI enabled. With traditional interrupts, after a few days I get:
Code:

kernel: [144397.974550] Disabling IRQ #16
and graphics become very slow until I unload and reload the nvidia kernel module.


All times are GMT -5.
