#1
Registered User
Join Date: Sep 2006
Location: Toronto, Canada
Posts: 23
The kernel module still fails, reporting that interrupts aren't being received. This failure occurs in every 177-series driver so far; the 173-series drivers work fine with the Dell hardware.
I've tried booting the kernel with pci=noacpi, acpi=off (which doesn't work well anyway, as the SMP configuration requires ACPI), noapic, acpi=noirq and pci=biosirq. All result in the same problem with the NVIDIA module. I haven't tried irqpoll, as that really isn't a viable option. Is the Dell 8700M SLI configuration not supported?
#2
Registered User
Join Date: Aug 2008
Posts: 11
Hi NightOwl, I have the same machine and while I'm still running the older 173.14.09 drivers (from the debian packages), I have had similar problems with IRQs for any kernel in the 2.6.25/26 series. In particular, I get something like the following in my kern.log...
Code:
Sep 11 11:14:51 localhost kernel: NVRM: loading NVIDIA UNIX x86 Kernel Module 173.14.09 Wed Jun 4 23:43:17 PDT 2008
Sep 11 11:14:53 localhost kernel: irq 16: nobody cared (try booting with the "irqpoll" option)
Sep 11 11:14:53 localhost kernel: Pid: 2573, comm: g15daemon Tainted: P 2.6.26.5 #1
Sep 11 11:14:53 localhost kernel: [<c0158db4>] __report_bad_irq+0x24/0x90
Sep 11 11:14:53 localhost kernel: [<f97e7149>] nv_kern_isr+0x59/0xb0 [nvidia]
Sep 11 11:14:53 localhost kernel: [<c015908f>] note_interrupt+0x26f/0x2a0
Sep 11 11:14:53 localhost kernel: [<c01584b8>] handle_IRQ_event+0x28/0x50
Sep 11 11:14:53 localhost kernel: [<c01597eb>] handle_fasteoi_irq+0xab/0xd0
Sep 11 11:14:53 localhost kernel: [<c0159740>] handle_fasteoi_irq+0x0/0xd0
Sep 11 11:14:53 localhost kernel: [<c0106d70>] do_IRQ+0x80/0xd0
Sep 11 11:14:53 localhost kernel: [<c012b96c>] irq_exit+0x3c/0x80
Sep 11 11:14:53 localhost kernel: [<c01046f3>] common_interrupt+0x23/0x28
Sep 11 11:14:53 localhost kernel: [<c031f2b9>] lock_kernel+0x29/0x40
Sep 11 11:14:53 localhost kernel: [<c018bc45>] vfs_ioctl+0x65/0x90
Sep 11 11:14:53 localhost kernel: [<c018bcd7>] do_vfs_ioctl+0x67/0x2d0
Sep 11 11:14:53 localhost kernel: [<c012b9e8>] irq_enter+0x38/0x70
Sep 11 11:14:53 localhost kernel: [<c018bf7d>] sys_ioctl+0x3d/0x70
Sep 11 11:14:53 localhost kernel: [<c0103d01>] sysenter_past_esp+0x6a/0x91
Sep 11 11:14:53 localhost kernel: =======================
Sep 11 11:14:53 localhost kernel: handlers:
Sep 11 11:14:53 localhost kernel: [<f97e70f0>] (nv_kern_isr+0x0/0xb0 [nvidia])
Sep 11 11:14:53 localhost kernel: Disabling IRQ #16

With the 2.6.24 series however, everything works nicely without any special boot parameters. I did try the 177.13 binary drivers (before I switched to using the debian packages) a while ago and the same thing happens, if I recall correctly. So, maybe if you try a 2.6.24 kernel, the 177.70 drivers *may* work for you? (if you're lucky...) :-)
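For anyone comparing logs across kernels, here is a rough Python sketch that scans kern.log text for the "nobody cared" and "Disabling IRQ" messages shown above and reports which handler was registered on the affected line. The function name `summarize_bad_irqs` and the regular expressions are my own assumptions based only on the excerpt in this post; other kernel versions may word these messages differently.

```python
import re

# Patterns modelled on the log excerpt in this thread (an assumption:
# other kernels may format these messages differently).
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
DISABLING = re.compile(r"Disabling IRQ #(\d+)")
# Matches the "handlers:" entries, e.g. "(nv_kern_isr+0x0/0xb0 [nvidia])".
HANDLER = re.compile(r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]\)")


def summarize_bad_irqs(log_text):
    """Return {irq: {"disabled": bool, "handlers": [(symbol, module), ...]}}."""
    report = {}
    current = None  # IRQ whose report block we are currently inside
    for line in log_text.splitlines():
        m = NOBODY_CARED.search(line)
        if m:
            current = int(m.group(1))
            report.setdefault(current, {"disabled": False, "handlers": []})
            continue
        m = DISABLING.search(line)
        if m:
            irq = int(m.group(1))
            entry = report.setdefault(irq, {"disabled": False, "handlers": []})
            entry["disabled"] = True
            current = None
            continue
        m = HANDLER.search(line)
        if m and current is not None:
            report[current]["handlers"].append((m.group(1), m.group(2)))
    return report


if __name__ == "__main__":
    # Hypothetical path; adjust to wherever your distribution keeps kernel logs.
    with open("/var/log/kern.log") as fh:
        for irq, info in summarize_bad_irqs(fh.read()).items():
            print(irq, info)
```

Run against the excerpt above, this should flag IRQ 16 as disabled with `nv_kern_isr` from the `nvidia` module as its only registered handler.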
#3
Registered User
Join Date: Oct 2004
Posts: 6
I have the same laptop with the same SLI GPUs but get different results (OpenSUSE 11.0):
- 177.13 works OK but very slowly in KDE 4
- 177.68 works well (faster in KDE 4) but takes ~20 secs to start X, and hangs when X is terminated
- 177.70 never works because it does not receive interrupts from the NVIDIA device at PCI:3.0.0

This morning, a new OpenSUSE kernel came out with interrupt handling changes, but it did not change these results. For me, something changed after 177.13, and not for the better...
#4
Registered User
Join Date: Sep 2006
Location: Toronto, Canada
Posts: 23
I have found that the IRQ 16 interrupt is not much of a problem. I think it actually belongs to the PhysX componentry, which isn't supported yet (or ever) in Linux, so the IRQ gets disabled because there is nothing to service the interrupt.
The interrupt detection is certainly flakier with the NVIDIA driver in the later kernels. When the system seems unresponsive in X, you may find that the NVIDIA interrupts have gone crazy and you will need to reboot. I've only had that happen a few times. A good utility to have is powertop (an Intel power monitoring program). It tells you about interrupts and how many are being generated. That's how I could tell it was an NVIDIA problem: it was generating 97% of the interrupts.
I've tried other things as well. I've enabled MSI in the kernel, as the M1730 has some MSI-capable devices. It doesn't seem to have helped, but it does move some of the devices so they no longer share an interrupt line.
Code:
           CPU0       CPU1
  0:    2215984    2284775   IO-APIC-edge      timer
  1:          5          5   IO-APIC-edge      i8042
  8:          1          0   IO-APIC-edge      rtc0
  9:          0          2   IO-APIC-fasteoi   acpi
 12:         68         68   IO-APIC-edge      i8042
 14:      34780      33890   IO-APIC-edge      ata_piix
 15:          0          0   IO-APIC-edge      ata_piix
 16:      50098      49903   IO-APIC-fasteoi   nvidia
 17:       3454       3080   IO-APIC-fasteoi   nvidia
 18:          0          0   IO-APIC-fasteoi   mmc0
 19:          1          1   IO-APIC-fasteoi   ohci1394
 20:     347267     309814   IO-APIC-fasteoi   uhci_hcd:usb2, ehci_hcd:usb4, uhci_hcd:usb5
 21:      49550      44988   IO-APIC-fasteoi   uhci_hcd:usb3, HDA Intel, uhci_hcd:usb6
 22:          3          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb7
215:     164491     153559   PCI-MSI-edge      eth0
216:      47531      43970   PCI-MSI-edge      iwl4965
217:      66465      63122   PCI-MSI-edge      ahci
NMI:          0          0   Non-maskable interrupts
LOC:    2078656    2328864   Local timer interrupts
RES:    1004541     862813   Rescheduling interrupts
CAL:       1826       5533   function call interrupts
TLB:      42096      45506   TLB shootdowns
TRM:          0          0   Thermal event interrupts
SPU:          0          0   Spurious interrupts
ERR:          0
MIS:          0
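If powertop isn't installed, a similar "who is generating the interrupts" picture can be approximated from /proc/interrupts itself. The sketch below is only an illustration: the helper names `parse_interrupts` and `device_shares` are made up for this post, and it assumes the column layout shown above (a CPU header row, then one row per IRQ). Note that these counters are cumulative since boot, so unlike powertop this gives a lifetime share, not a current rate.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into (label, total_count, description) rows."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    rows = []
    for ln in lines[1:]:
        label, sep, rest = ln.partition(":")
        if not sep:
            continue  # not an "IRQ: counts..." row
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        rows.append((label.strip(), sum(counts), description))
    return rows


def device_shares(rows):
    """Return (description, irq, fraction) for numbered IRQs, largest share first."""
    devices = [r for r in rows if r[0].isdigit()]
    grand_total = sum(total for _, total, _ in devices)
    if grand_total == 0:
        return []
    shares = [(desc, irq, total / grand_total) for irq, total, desc in devices]
    return sorted(shares, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    with open("/proc/interrupts") as fh:
        for desc, irq, share in device_shares(parse_interrupts(fh.read())):
            print(f"{share:6.1%}  IRQ {irq:>3}  {desc}")
```

To approximate a rate instead of a lifetime total, one could read the file twice a few seconds apart and subtract the counts before computing the shares.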
#5
Registered User
Join Date: Aug 2008
Posts: 11
Quote:
Code:
0f:00.0 Class ff00: AGEIA Technologies, Inc. Device 0000

Thanks for the tip about powertop. :-) I also have CONFIG_PCI_MSI=y in my kernels, but I wasn't sure if it was doing anything useful.
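One way to see whether CONFIG_PCI_MSI is actually being used is to look for PCI-MSI rows in /proc/interrupts, as in the table posted earlier in this thread (eth0, iwl4965 and ahci there). A minimal Python sketch follows; `msi_devices` is a hypothetical helper name, and it assumes the interrupt type appears as a single field starting with "PCI-MSI" right after the per-CPU counters, which is the layout shown in this thread and may differ on other kernels.

```python
def msi_devices(text):
    """Return the device names on rows whose interrupt type starts with PCI-MSI."""
    found = []
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or not label.strip().isdigit():
            continue  # skip the CPU header and named rows such as NMI/LOC
        fields = rest.split()
        # Skip the leading per-CPU counters to reach the interrupt type field.
        idx = 0
        while idx < len(fields) and fields[idx].isdigit():
            idx += 1
        if idx < len(fields) and fields[idx].startswith("PCI-MSI"):
            found.append(" ".join(fields[idx + 1:]))
    return found


if __name__ == "__main__":
    with open("/proc/interrupts") as fh:
        print(msi_devices(fh.read()))
```

An empty list would suggest that no driver has switched to MSI, even with the option compiled in.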