nV News Forums

 
 


dystopianray 09-23-06 12:38 AM

geforce 6600 random system lock
 
1 Attachment(s)
I recently bought a GeForce 6600 AGP card, and my system randomly locks up when using it. The card does not get very hot and is barely warm to the touch after playing Doom 3. It also does not seem to be a power-related issue. I do not experience any graphical artifacts or other anomalies; the system just locks up at random times.

I am running a Gentoo Linux system with kernel 2.6.17, and I do not have any other operating systems to test with. My motherboard is an MSI K8T Neo2-FIR using the VIA K8T800 Pro chipset. I had no problems for a year or more with a GeForce FX 5200 card. I have tried BIOS versions 3.3 and 9.3, and both experience the random lockups.

I have done everything I can think of to rectify the issue, including: forcing 4x AGP, disabling fast writes, trying different X.Org and driver versions, booting with noapic irqpoll or acpi=off, trying NvAGP, trying different PSU connectors, removing all PCI cards and unplugging non-essential HDDs and optical drives. However, it always results in the same random system lockups.

At one point I was able to ssh into the system after it had locked up, and dmesg had reported something along the lines of "IRQ 209: nobody cared (try irqpoll)" and "disabling IRQ 209". However, every other time the system locked up I was not able to access it from the network.

I would very much appreciate any help with this issue, thanks in advance.

whig 09-23-06 12:49 AM

Re: geforce 6600 random system lock
 
Do not load the dri module with the nvidia drivers, so remove it from your X config. Also monitor GPU/CPU temps, and check your RAM with memtest86 (e.g. from a live CD).

dystopianray 09-23-06 01:32 AM

Re: geforce 6600 random system lock
 
I removed dri from my xorg.conf and it still crashed.

Right before it crashed nvidia-settings was reporting 32C for the gpu temperature.

After I rebooted I went into the BIOS, and it was reporting 28C for the CPU temperature and 35C for the system temperature.

netllama 09-24-06 12:45 AM

Re: geforce 6600 random system lock
 
The "irq nobody cared" message typically suggests a BIOS bug.
According to your bug report, the glx module failed to load, which suggests to me that the NVIDIA driver isn't correctly installed.

Have you verified that you're using the latest BIOS for the motherboard?
Have you tested with NvAGP set to 0 in xorg.conf?
Does this problem also reproduce with the 1.0-9625 beta driver?

Thanks,
Lonni

dystopianray 09-24-06 02:28 AM

Re: geforce 6600 random system lock
 
1 Attachment(s)
I reinstalled the driver and now GLX is being correctly loaded, however the behaviour is still exactly the same.

I have tried BIOS versions 3.3 and 9.3 from the following MSI url: http://www.msi.com.tw/program/suppor...UID=608&kind=1

I tried with NvAgp "0", but all it did was make it lock up sooner: once when I tried to start a Doom 3 game, and then again while glxgears was running.

I will try with the 1.0-9625 driver next and see if it has the same behaviour.

I have used the 'nv' driver for a while in the past couple of days and never noticed any lockups or any other problems apart from it being painfully slow.

I have attached the bug report logs for my 5200 card which is working perfectly fine and the 6600 card which is experiencing the random lockups. These reports are from after fixing the GLX loading problem.

dystopianray 10-01-06 05:41 AM

Re: geforce 6600 random system lock
 
1 Attachment(s)
I am now using the 1.0-9625 beta driver and the problem is still there. The machine will randomly hang with the 6600 but works perfectly fine when using the fx 5200.

The hangs don't appear to be power or temperature related and they happen at random times, such as when opening a menu or viewing a web page.

netllama 10-01-06 11:30 AM

Re: geforce 6600 random system lock
 
If you set NvAGP to 0 in xorg.conf, does that help?
If you set RenderAccel to false in xorg.conf, does that help?

Thanks,
Lonni
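
For anyone following along, here is a rough sketch of how one might double-check which modules and driver options an xorg.conf actually sets before retesting. It is only an illustration: the `summarize` helper and the sample config text are made up for this example and are not taken from the poster's attached logs. The option names NvAGP and RenderAccel are the ones mentioned in this thread.

```python
import re

# Match lines such as:  Option "NvAGP" "0"   and   Load "glx"
OPTION_RE = re.compile(r'^\s*Option\s+"([^"]+)"(?:\s+"([^"]*)")?', re.IGNORECASE)
LOAD_RE = re.compile(r'^\s*Load\s+"([^"]+)"', re.IGNORECASE)


def summarize(conf_text):
    """Return (loaded_modules, options) found in xorg.conf text.

    Names are lower-cased so that "NvAGP" and "NvAgp" compare equal.
    Text after a '#' is treated as a comment and ignored.
    """
    loads, options = [], {}
    for line in conf_text.splitlines():
        line = line.split('#', 1)[0]
        m = LOAD_RE.match(line)
        if m:
            loads.append(m.group(1).lower())
            continue
        m = OPTION_RE.match(line)
        if m:
            options[m.group(1).lower()] = m.group(2)
    return loads, options


# Hypothetical config fragment, used only to exercise the helper.
SAMPLE_CONF = '''
Section "Module"
    Load "glx"
    # Load "dri"
EndSection

Section "Device"
    Identifier "nvidia0"
    Driver "nvidia"
    Option "NvAGP" "0"
    Option "RenderAccel" "false"
EndSection
'''

if __name__ == "__main__":
    loads, options = summarize(SAMPLE_CONF)
    print("modules:", loads)
    print("dri loaded:", "dri" in loads)
    print("options:", options)
```

On a real system the text would come from the config file itself (commonly /etc/X11/xorg.conf, though the path can differ between distributions).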

dystopianray 10-01-06 12:29 PM

Re: geforce 6600 random system lock
 
Setting NvAgp 0 or RenderAccel false did not help at all. However, I was able to capture the exact dmesg output during a lockup this time. This is the end of dmesg after the system had locked up.

eth0: no IPv6 routers present
agpgart: Found an AGP 3.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V3 device at 0000:00:00.0 into 8x mode
agpgart: Putting AGP V3 device at 0000:01:00.0 into 8x mode
irq 201: nobody cared (try booting with the "irqpoll" option)

Call Trace: <IRQ> <ffffffff80244ba5>{__report_bad_irq+53}
<ffffffff80244dc8>{note_interrupt+456} <ffffffff80244629>{__do_IRQ+169}
<ffffffff8020c3d2>{do_IRQ+66} <ffffffff80209f38>{ret_from_intr+0} <EOI>
<ffffffff80209faf>{retint_careful+13}
handlers:
[<ffffffff882e15e4>] (nv_kern_isr+0x0/0x5d [nvidia])
Disabling IRQ #201
NVRM: Xid (0001:00): 16, Head 00000000 Count 0000a6ac
NVRM: Xid (0001:00): 16, Head 00000001 Count 00000002
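
As a side note, the "nobody cared" report above lists the handlers registered on the IRQ that was disabled, which shows which drivers were attached to that line. Below is a minimal sketch of pulling that information out of a saved kernel log. The function name `unhandled_irqs` is invented for this example, and the regular expressions assume the log layout shown in this thread; other kernel versions may format these lines differently.

```python
import re

IRQ_RE = re.compile(r'irq (\d+): nobody cared')
# e.g. "[<ffffffff882e15e4>] (nv_kern_isr+0x0/0x5d [nvidia])"
HANDLER_RE = re.compile(
    r'\[<[0-9a-f]+>\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]\)')


def unhandled_irqs(log_text):
    """Return [(irq, [(handler, module), ...]), ...], one entry per report."""
    reports = []
    current = None        # report currently being filled in
    in_handlers = False   # True while reading lines under "handlers:"
    for line in log_text.splitlines():
        m = IRQ_RE.search(line)
        if m:
            current = (int(m.group(1)), [])
            reports.append(current)
            in_handlers = False
            continue
        if current is None:
            continue
        if 'handlers:' in line:
            in_handlers = True
            continue
        if in_handlers:
            h = HANDLER_RE.search(line)
            if h:
                current[1].append((h.group(1), h.group(2)))
            else:
                # First non-handler line ends this report.
                in_handlers = False
                current = None
    return reports


# Excerpt of the dmesg output posted above.
SAMPLE_LOG = '''
irq 201: nobody cared (try booting with the "irqpoll" option)
handlers:
[<ffffffff882e15e4>] (nv_kern_isr+0x0/0x5d [nvidia])
Disabling IRQ #201
'''

if __name__ == "__main__":
    for irq, handlers in unhandled_irqs(SAMPLE_LOG):
        print(irq, handlers)
```

In this excerpt the only handler on IRQ 201 is the nvidia module's, so the line does not appear to be shared with another driver.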

netllama 10-01-06 02:06 PM

Re: geforce 6600 random system lock
 
This looks like the same IRQ problem as before.

-Lonni

dystopianray 10-01-06 09:35 PM

Re: geforce 6600 random system lock
 
Quote:

Originally Posted by netllama
This looks like the same IRQ problem as before.

-Lonni

What can be done about it? And why does it only affect the 6600 and not the 5200?

whig 10-01-06 11:16 PM

Re: geforce 6600 random system lock
 
As you would expect, the 6600 draws more power. In your BIOS, check the voltages on your rails, i.e. 5V and 12V. If they are low, the extra draw of the 6600 could be too much for your system.

Hannibal 10-10-06 03:15 PM

Re: geforce 6600 random system lock
 
1 Attachment(s)
dystopianray, did you solve your problem?

I have had similar "effects" since I upgraded from a Ti 4200 to a 6600. I suspect power problems, but I don't have any proof. The voltages reported by the BIOS, by lm_sensors (just before a hang) and by an external multimeter (during a hang) were OK (3.33-3.34V, 4.98-5.04V, 11.95-12.06V), but possibly the refresh rate of those readings was not fast enough to catch a problem.
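
To put those readings in context, here is a small sketch that compares them with the +/-5% tolerance that the ATX power supply specification allows on the +3.3V, +5V and +12V rails. The function name `out_of_tolerance` is made up for this example. A steady-state check like this says nothing about short dips under load, which is exactly the case a slow-updating meter would miss.

```python
ATX_TOLERANCE = 0.05  # +/-5% on the positive rails
NOMINAL = {"3.3V": 3.3, "5V": 5.0, "12V": 12.0}

# Lowest and highest values quoted in the post above.
READINGS = {
    "3.3V": (3.33, 3.34),
    "5V": (4.98, 5.04),
    "12V": (11.95, 12.06),
}


def out_of_tolerance(readings):
    """Return the rails whose (low, high) readings leave the +/-5% window."""
    bad = {}
    for rail, (low, high) in readings.items():
        nominal = NOMINAL[rail]
        lower_limit = nominal * (1 - ATX_TOLERANCE)
        upper_limit = nominal * (1 + ATX_TOLERANCE)
        if low < lower_limit or high > upper_limit:
            bad[rail] = (low, high)
    return bad


if __name__ == "__main__":
    bad = out_of_tolerance(READINGS)
    print("rails out of tolerance:", bad if bad else "none")
```

With the values quoted above, all three rails sit comfortably inside the window, so the static readings alone neither confirm nor rule out a power problem.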

Additionally, running with the Xorg driver doesn't cause a hang. At least I haven't detected one so far.

The logs are almost clean. Only twice (in approx. 20 hangs) did I see something like this:
Code:

Oct 10 16:20:02 hannibal kernel: irq 18: nobody cared (try booting with the "irqpoll" option)
Oct 10 16:20:02 hannibal kernel:  [<c013a228>] __report_bad_irq+0x24/0x7d
Oct 10 16:20:02 hannibal kernel:  [<c013a33e>] note_interrupt+0x9f/0xc9
Oct 10 16:20:02 hannibal kernel:  [<c013ab13>] handle_fasteoi_irq+0xdd/0x10e
Oct 10 16:20:02 hannibal kernel:  [<c013aa36>] handle_fasteoi_irq+0x0/0x10e
Oct 10 16:20:02 hannibal kernel:  [<c0105309>] do_IRQ+0x71/0xc6
Oct 10 16:20:02 hannibal kernel:  [<c01038d2>] common_interrupt+0x1a/0x20
Oct 10 16:20:02 hannibal kernel:  =======================
Oct 10 16:20:02 hannibal kernel: handlers:
Oct 10 16:20:02 hannibal kernel: [<e0f93117>] (usb_hcd_irq+0x0/0x59 [usbcore])
Oct 10 16:20:02 hannibal kernel: [<e0fcd662>] (snd_fm801_interrupt+0x0/0x1b7 [snd_fm801])
Oct 10 16:20:02 hannibal kernel: [<e17932ee>] (nv_kern_isr+0x0/0x72 [nvidia])
Oct 10 16:20:03 hannibal kernel: Disabling IRQ #18
Oct 10 16:20:07 hannibal kernel: NVRM: Xid (0001:00): 16, Head 00000001 Count 00000000
Oct 10 16:20:08 hannibal kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 005f9a62

Code:

Oct 10 21:27:10 hannibal kernel: irq 19: nobody cared (try booting with the "irqpoll" option)
Oct 10 21:27:10 hannibal kernel:  [<c013a228>] __report_bad_irq+0x24/0x7d
Oct 10 21:27:10 hannibal kernel:  [<c013a33e>] note_interrupt+0x9f/0xc9
Oct 10 21:27:10 hannibal kernel:  [<c013ab13>] handle_fasteoi_irq+0xdd/0x10e
Oct 10 21:27:10 hannibal kernel:  [<c013aa36>] handle_fasteoi_irq+0x0/0x10e
Oct 10 21:27:10 hannibal kernel:  [<c0105309>] do_IRQ+0x71/0xc6
Oct 10 21:27:10 hannibal kernel:  [<c01038d2>] common_interrupt+0x1a/0x20
Oct 10 21:27:10 hannibal kernel:  =======================
Oct 10 21:27:10 hannibal kernel: handlers:
Oct 10 21:27:10 hannibal kernel: [<d0fb4117>] (usb_hcd_irq+0x0/0x59 [usbcore])
Oct 10 21:27:10 hannibal kernel: [<d0fec662>] (snd_fm801_interrupt+0x0/0x1b7 [snd_fm801])
Oct 10 21:27:10 hannibal kernel: [<d17382ee>] (nv_kern_isr+0x0/0x72 [nvidia])
Oct 10 21:27:10 hannibal kernel: Disabling IRQ #19

I have attached the error log, but it probably won't help ;)
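
One thing that stands out in the two traces above is that the nvidia handler shares its IRQ with the USB and FM801 sound handlers. A quick way to see which lines are shared on a running system is to look at /proc/interrupts. The sketch below parses that file's text; the function name `shared_irqs` and the sample contents are invented for illustration, and the parsing assumes the 2.6-era layout (one count column per CPU, a single chip-type column, then a comma-separated device list), which newer kernels may not follow exactly.

```python
def shared_irqs(interrupts_text):
    """Return {irq: [device, ...]} for numbered IRQs with more than one device."""
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue  # skip NMI, LOC, ERR and similar rows
        # Per-CPU counts, then the chip type, then the device list.
        fields = rest.split(None, n_cpus + 1)
        if len(fields) < n_cpus + 2:
            continue
        devices = [d.strip() for d in fields[-1].split(",") if d.strip()]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared


# Made-up sample in the assumed layout; counts and names are not from the thread.
SAMPLE_INTERRUPTS = """\
           CPU0
  0:     123456    IO-APIC-edge  timer
  1:        789    IO-APIC-edge  i8042
 18:      54321   IO-APIC-fasteoi   uhci_hcd:usb1, FM801, nvidia
NMI:          0
ERR:          0
"""

if __name__ == "__main__":
    # On a real system, pass open("/proc/interrupts").read() instead.
    for irq, devices in shared_irqs(SAMPLE_INTERRUPTS).items():
        print(irq, devices)
```

If the card does turn out to share a line with other devices, moving the sound card to a different PCI slot is one common way to change the routing, though whether that helps here is only a guess.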


All times are GMT -5.