Old 03-14-04, 08:53 AM   #1
bengibbs
Registered User
 
Join Date: Mar 2004
Posts: 9
Badness in pci_find_subsys at drivers/pci/search.c:167

Hello folks,

I have searched this forum and googled fairly extensively, but I have found no conclusive solution to this problem. Please redirect me if such a thing exists.

I have a (fairly old) TNT2 Ultra running on a (similarly old) MSI K7T Turbo II with an Athlon XP 2400. This has been a solid setup for some time now. However, after upgrading kernels recently (first to 2.6.1 and now 2.6.3, both using the 5336 nvidia drivers) I am getting regular crashes, usually requiring a full power cycle to get the machine back - this is really annoying.

I have disabled APIC:
ben@palantir:/usr/src/modules
$ cat /proc/interrupts
CPU0
0: 1185572 XT-PIC timer
1: 4200 XT-PIC i8042
2: 0 XT-PIC cascade
5: 0 XT-PIC Ensoniq AudioPCI
8: 3 XT-PIC rtc
9: 0 XT-PIC VIA686A
10: 73078 XT-PIC uhci_hcd, uhci_hcd, nvidia
11: 8617 XT-PIC eth0
14: 48302 XT-PIC ide0
15: 42 XT-PIC ide1
NMI: 0
LOC: 1185548
ERR: 1316
MIS: 0

FW and SBA :

ben@palantir:/usr/src/modules
$ cat /proc/driver/nvidia/agp/status
Status: Enabled
Driver: AGPGART
AGP Rate: 4x
Fast Writes: Disabled
SBA: Disabled

As suggested elsewhere on this forum, I disabled ACPI and the APIC by passing the following parameters to the kernel:

pci=noacpi acpi=off noapic
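
Typos in the boot loader config are easy to miss, so it may be worth confirming that the running kernel actually received these options. The following is only a minimal Python sketch: the function name `missing_params` and the sample command line are made up for illustration, and on a live system the text would come from `/proc/cmdline`.

```python
# Parameters quoted in the post above; the checker itself is only a sketch.
WANTED = ("pci=noacpi", "acpi=off", "noapic")


def missing_params(cmdline, wanted=WANTED):
    """Return the wanted parameters that do not appear on the command line."""
    present = set(cmdline.split())
    return [p for p in wanted if p not in present]


# Hypothetical command line for illustration; on a live system pass in
# open("/proc/cmdline").read() instead.
SAMPLE_CMDLINE = "root=/dev/hda1 ro acpi=off noapic"
print(missing_params(SAMPLE_CMDLINE))
```

An empty list means all three options reached the kernel. It says nothing about whether they help with the crash.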

Checking /var/log/messages immediately after a crash I always get the following:

Mar 14 00:25:40 palantir kernel: Badness in pci_find_subsys at drivers/pci/search.c:167
Mar 14 00:25:40 palantir kernel: Call Trace:
Mar 14 00:25:40 palantir kernel: [pci_find_subsys+233/256] pci_find_subsys+0xe9/0x100
Mar 14 00:25:40 palantir kernel: [pci_find_device+47/64] pci_find_device+0x2f/0x40
Mar 14 00:25:40 palantir kernel: [pci_find_slot+40/80] pci_find_slot+0x28/0x50
Mar 14 00:25:40 palantir kernel: [_end+545372540/1069041396] os_pci_init_handle+0x3e/0x6d [nvidia]
Mar 14 00:25:40 palantir kernel: [_end+543883091/1069041396] _nv001243rm+0x1f/0x24 [nvidia]
Mar 14 00:25:40 palantir kernel: [_end+544608593/1069041396] _nv003797rm+0xa9/0x128 [nvidia]
Mar 14 00:25:40 palantir kernel: [_end+545053589/1069041396] _nv001490rm+0x55/0xe4 [nvidia]
Mar 14 00:25:40 palantir kernel: [_end+545220680/1069041396] _nv000816rm+0x334/0x384 [nvidia]
Mar 14 00:25:40 palantir kernel: [_end+545219395/1069041396] _nv000809rm+0x2f/0x34 [nvidia]
Mar 14 00:25:40 palantir kernel: [_end+544603716/1069041396] _nv003816rm+0xf0/0x104 [nvidia]
Mar 14 00:25:40 palantir kernel: [_end+544597762/1069041396] _nv003795rm+0x6ea/0xaec [nvidia]
Mar 14 00:25:40 palantir kernel: [_end+543983963/1069041396] _nv004046rm+0x3a3/0x3b0 [nvidia]
Mar 14 00:25:40 palantir kernel: [_end+545039003/1069041396] _nv001476rm+0x277/0x45c [nvidia]
Mar 14 00:25:40 palantir kernel: [_end+543894158/1069041396] _nv000896rm+0x4a/0x64 [nvidia]
Mar 14 00:25:40 palantir kernel: [_end+543900328/1069041396] rm_isr_bh+0xc/0x10 [nvidia]
Mar 14 00:25:40 palantir kernel: [_end+545362421/1069041396] nv_kern_isr_bh+0xf/0x13 [nvidia]
Mar 14 00:25:40 palantir kernel: [tasklet_action+70/112] tasklet_action+0x46/0x70
Mar 14 00:25:40 palantir kernel: [do_softirq+147/160] do_softirq+0x93/0xa0
Mar 14 00:25:40 palantir kernel: [do_IRQ+263/320] do_IRQ+0x107/0x140
Mar 14 00:25:40 palantir kernel: [common_interrupt+24/32] common_interrupt+0x18/0x20


From what I can glean from the web (http://lkml.org/lkml/2004/3/6/37), this is a bug within the nvidia driver - it looks like some ISR is failing or something. Does anyone know if this is going to get fixed, and on what sort of timescale?
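
To make the trace above easier to read, here is a rough Python sketch that pulls out each frame's symbol and notes whether it belongs to the kernel proper or to a module such as `[nvidia]`. It assumes the log line layout pasted in this post; the regular expression and the names `FRAME_RE` and `frames` are illustrative, not part of any tool. The sample lines are copied from the log above.

```python
import re

# Matches "... ] symbol+0xOFF/0xLEN" with an optional trailing "[module]".
FRAME_RE = re.compile(
    r"\]\s+(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?\s*$"
)


def frames(log_text):
    """Return (symbol, owner) pairs in the order they appear in the trace."""
    result = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            # Frames without a "[module]" tag belong to the core kernel.
            result.append((match.group("sym"), match.group("mod") or "kernel"))
    return result


SAMPLE = """\
Mar 14 00:25:40 palantir kernel: Badness in pci_find_subsys at drivers/pci/search.c:167
Mar 14 00:25:40 palantir kernel: Call Trace:
Mar 14 00:25:40 palantir kernel: [pci_find_subsys+233/256] pci_find_subsys+0xe9/0x100
Mar 14 00:25:40 palantir kernel: [pci_find_slot+40/80] pci_find_slot+0x28/0x50
Mar 14 00:25:40 palantir kernel: [_end+545372540/1069041396] os_pci_init_handle+0x3e/0x6d [nvidia]
Mar 14 00:25:40 palantir kernel: [_end+545362421/1069041396] nv_kern_isr_bh+0xf/0x13 [nvidia]
Mar 14 00:25:40 palantir kernel: [tasklet_action+70/112] tasklet_action+0x46/0x70
"""

for symbol, owner in frames(SAMPLE):
    print(f"{owner:8} {symbol}")
```

Read top to bottom, the innermost call comes first: the PCI search functions are entered from the nvidia module's `os_pci_init_handle`, which in turn is reached from `nv_kern_isr_bh` running inside a tasklet. That ordering is all the sketch shows; it does not by itself prove where the bug lies.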

Is there other stuff I can try other than the usual ACPI /AGP stuff?

Any thoughts / ideas would be gratefully appreciated

Kind Regards

Ben
Old 03-15-04, 05:47 AM   #2
LordMorgul
Electrical Engineer
 
Join Date: Dec 2002
Location: San Luis Obispo, CA
Posts: 872
Avoid for the present?

You might try the other drivers available since the feature set of the 5336 isn't aimed at the tnt. Any of the drivers that have 2.6.x patches from www.minion.de would be a reasonable place to begin testing.
__________________
"..the triumph of evil is for good men to do nothing." (Edmond Burke)
nVIDIA video driver RPMs for Fedora :: see yum repo at livna.org.
Old 03-15-04, 01:38 PM   #3
bengibbs
Registered User
 
Join Date: Mar 2004
Posts: 9
old cards

Thanks for that, am downloading old drivers now. I thought I could live with the flaky nv driver - I'm no gamer so I don't really need any blistering 3D performance - however mplayer sucks monumentally with the nv module, so I'll give these a go.

Thanks for the advice.
Old 03-15-04, 04:55 PM   #4
tamran
Registered User
 
Join Date: Feb 2004
Location: Ft. Myers, FL
Posts: 67

Check out the following thread:

http://www.nvnews.net/vbulletin/show...threadid=24866

Tamran
Old 03-16-04, 11:24 AM   #5
bdw
Registered User
 
Join Date: May 2003
Posts: 13
IRQ sharing

I noticed that your nVidia driver is sharing an interrupt (IRQ 10) with the USB (uhci_hcd) drivers. On my box, the nVidia driver shares an interrupt with the ethernet card and I get a similar pci_find_subsys error message.
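
For anyone wanting to check this on their own box, here is a minimal Python sketch that lists the IRQ lines carrying more than one driver. It assumes the 2.6-era `/proc/interrupts` layout shown earlier in this thread (IRQ number, per-CPU counters, controller type, then a comma-separated driver list); newer kernels print extra columns and would need a different parser. The function name `shared_irqs` is made up for illustration, and the sample rows are copied from the first post.

```python
def shared_irqs(interrupts_text):
    """Return {irq: [driver, ...]} for IRQ lines that list several drivers."""
    shared = {}
    for line in interrupts_text.splitlines():
        head, sep, rest = line.partition(":")
        irq = head.strip()
        if not sep or not irq.isdigit():
            continue  # skips the CPU header and the NMI/LOC/ERR/MIS rows
        fields = rest.split()
        # Drop the per-CPU interrupt counters at the start of the row.
        while fields and fields[0].isdigit():
            fields.pop(0)
        if not fields:
            continue
        # fields[0] is the controller type (XT-PIC, IO-APIC-edge, ...);
        # everything after it is the comma-separated driver list.
        drivers = [d.strip() for d in " ".join(fields[1:]).split(",") if d.strip()]
        if len(drivers) > 1:
            shared[int(irq)] = drivers
    return shared


SAMPLE = """\
CPU0
0: 1185572 XT-PIC timer
5: 0 XT-PIC Ensoniq AudioPCI
10: 73078 XT-PIC uhci_hcd, uhci_hcd, nvidia
11: 8617 XT-PIC eth0
NMI: 0
ERR: 1316
"""

for irq, drivers in shared_irqs(SAMPLE).items():
    print(irq, drivers)
```

On a live system the input would be `open("/proc/interrupts").read()`. Sharing a line is normal for PCI devices, so a hit here is a lead to investigate rather than a diagnosis.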

www.minion.de does have patches for the 2.6 kernel, but they don't have one for the 5336 nVidia driver.
Old 03-16-04, 03:38 PM   #6
LordMorgul
Electrical Engineer
 
Join Date: Dec 2002
Location: San Luis Obispo, CA
Posts: 872

5336 incorporates patches for 2.6, at least partially taken from Zander's work at www.minion.de (due to the credit given).
Old 03-16-04, 05:02 PM   #7
bengibbs
Registered User
 
Join Date: Mar 2004
Posts: 9

Cheers for the suggestions but I've given up : (

I thought I'd got it stable... but even with AGP disabled, APIC disabled, old drivers and barely asking the card to render a pixel it still kept crashing. Luckily I remembered I had an old Rage 128 sat in another unused box - seems to work a treat. It does everything but games, and I don't own any games for linux any way.

I hope others with this problem have more luck than myself - especially those that have forked out for expensive cards.

Once again thanks for the help.
Old 03-16-04, 06:06 PM   #8
tamran
Registered User
 
Join Date: Feb 2004
Location: Ft. Myers, FL
Posts: 67
Interrupt sharing?

Hello there,

I've got the following interrupts:

# cat /proc/interrupts
CPU0 CPU1
0: 173234 0 IO-APIC-edge timer
1: 151 0 IO-APIC-edge i8042
2: 0 0 XT-PIC cascade
8: 0 0 IO-APIC-edge rtc
9: 0 0 IO-APIC-level acpi
12: 18201 0 IO-APIC-edge i8042
14: 17 0 IO-APIC-edge ide0
16: 11562 0 IO-APIC-level eth0, nvidia
19: 0 0 IO-APIC-level EMU10K1
20: 3995 0 IO-APIC-level libata
NMI: 18 12
LOC: 173091 172982
ERR: 0
MIS: 0

I found this at minion.de:
Quote:
Many ACPI and UP I/O APIC configurations seem to be unreliable; if you're experiencing startup or stability problems, it may help to disable them with acpi=off and noapic. In some cases, a combination of pci=noacpi and pci=biosirq has also been reported to help.
Is this what you're referring to? I get the same error cropping up in my logs as well as actual lockups (as many others have). Is there a way to prevent this from happening? Is this interrupt sharing the source of my problems? What kernel patch are you referring to? I only see patched nvidia kernel modules.

Also, I've just disabled USB support to see if that helps solve anything.

Tamran
Old 03-16-04, 07:18 PM   #9
bdw
Registered User
 
Join Date: May 2003
Posts: 13

Tamran:

Yes, I was referring to IRQ interrupts.

I just booted linux 2.6.4 with the following kernel parameters:

pci=noacpi acpi=off noapic pci=biosirq

And here's my /proc/interrupts:

CPU0
0: 837418 XT-PIC timer
1: 2355 XT-PIC i8042
2: 0 XT-PIC cascade
5: 18935 XT-PIC EMU10K1
9: 105560 XT-PIC bttv0
10: 745 XT-PIC uhci_hcd, uhci_hcd, advansys
11: 54596 XT-PIC eth0, nvidia
12: 15970 XT-PIC i8042
14: 73 XT-PIC ide0
15: 32749 XT-PIC ide1
NMI: 0
LOC: 837284
ERR: 1274
MIS: 0

Interesting that the nvidia driver likes to share the same IRQ as eth0.

I wish there was a way that I could put the nvidia driver on its own IRQ.
Old 03-17-04, 01:54 AM   #10
LordMorgul
Electrical Engineer
 
Join Date: Dec 2002
Location: San Luis Obispo, CA
Posts: 872

Depending on your motherboard chipset and BIOS you may be able to configure them to use separate IRQs. It may not change anything but is worth trying. If your BIOS refuses to show separate IRQs for the video then shuffle your other pci cards around to different slots and see if it changes how you can arrange IRQs.
Old 03-17-04, 10:13 AM   #11
JoseJX
GeForce Ti4200
 
Join Date: Mar 2004
Posts: 20

I still get the pci_badness / common_interrupt message even though my GeForce is on its own irq. I thought maybe you were on to something there, but it doesn't fix the problem on my computer.
Old 03-17-04, 10:14 AM   #12
bdw
Registered User
 
Join Date: May 2003
Posts: 13

LordMorgul:

I have an Award BIOS, and I think there is a section there to arrange the IRQ configuration by assigning IRQs to given slots.

Of course, I need to see if the lspci command will tell me which PCI slot holds which card.
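
As a starting point, the sketch below groups devices by IRQ from the text of `lspci -v`. It assumes each device starts with an unindented line beginning with its bus address, followed by an indented `Flags:` line that ends in `IRQ n`; the device names in the sample are invented, and the function name `devices_by_irq` is illustrative. Note that lspci reports bus addresses, not physical slot positions, so matching an address to a slot may still take the motherboard manual or moving one card at a time.

```python
import re
from collections import defaultdict

IRQ_RE = re.compile(r"\bIRQ (\d+)\b")


def devices_by_irq(lspci_text):
    """Group lspci -v device header lines by the IRQ on their Flags: line."""
    by_irq = defaultdict(list)
    current = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # Unindented line: start of a new device entry.
            current = line.strip()
        elif current is not None and "Flags:" in line:
            match = IRQ_RE.search(line)
            if match:
                by_irq[int(match.group(1))].append(current)
    return dict(by_irq)


# Invented sample in the assumed lspci -v layout, for illustration only.
SAMPLE = (
    "00:0c.0 Ethernet controller: Example NIC\n"
    "\tFlags: bus master, medium devsel, latency 32, IRQ 11\n"
    "\n"
    "01:00.0 VGA compatible controller: Example VGA card\n"
    "\tFlags: bus master, 66MHz, medium devsel, latency 32, IRQ 11\n"
)

for irq, devices in devices_by_irq(SAMPLE).items():
    print(irq, devices)
```

Running it before and after shuffling cards would show whether the BIOS actually handed out different IRQs.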

--Brian