#1
Registered User
Join Date: Dec 2006
Posts: 7
On my new system I get random freezes.
When playing 3D games, the system hangs after about 1 to 30 minutes. I also experienced one freeze while running only a normal KDE desktop. When the system is frozen, only the SysRq keys work, which only lets me reboot. I tried the following nVidia drivers: 8776 and 9631, and the following Linux kernel versions: 2.6.17 and 2.6.18. I tried to boot the kernel with the following options
Every setup resulted in the same freeze. This is my hardware setup:
The following software was used:
The following messages were logged into the syslog after the freeze:
Code:
irq 114: nobody cared (try booting with the "irqpoll" option)
Call Trace:
<IRQ> [<ffffffff802a4d5d>] __report_bad_irq+0x30/0x7d
[<ffffffff802a4f97>] note_interrupt+0x1ed/0x22e
[<ffffffff802a44a4>] __do_IRQ+0xc7/0x105
[<ffffffff80210448>] __do_softirq+0x5e/0xd5
[<ffffffff8026415f>] do_IRQ+0x65/0x73
[<ffffffff80258b09>] ret_from_intr+0x0/0xa
<EOI>
handlers:
[<ffffffff887ce298>] (nv_kern_isr+0x0/0x5e [nvidia])
Disabling IRQ #114
Code:
irq 114: nobody cared (try booting with the "irqpoll" option)
Call Trace: <IRQ> <ffffffff8029f304>{__report_bad_irq+48}
<ffffffff8029f513>{note_interrupt+450} <ffffffff8029edc2>{__do_IRQ+188}
<ffffffff80264fd1>{do_IRQ+59} <ffffffff8025a008>{ret_from_intr+0} <EOI>
<ffffffff8025cc76>{ia32_syscall+6}
handlers:
[<ffffffff8886ab8e>] (nv_kern_isr+0x0/0x62 [nvidia])
Disabling IRQ #114
Code:
irq 114: nobody cared (try booting with the "irqpoll" option)
Call Trace: <IRQ> <ffffffff8029f304>{__report_bad_irq+48}
<ffffffff8029f513>{note_interrupt+450} <ffffffff8029edc2>{__do_IRQ+188}
<ffffffff80264fd1>{do_IRQ+59} <ffffffff8025a008>{ret_from_intr+0} <EOI>
handlers:
[<ffffffff88839512>] (nv_kern_isr+0x0/0x5e [nvidia])
Disabling IRQ #114
NVRM: Xid (0006:00): 16, Head 00000000 Count 001897bf
NVRM: Xid (0006:00): 16, Head 00000001 Count 00000001
NVRM: Xid (0006:00): 8, Channel 00000020
NVRM: Xid (0006:00): 28, L1 -> L0
NVRM: Xid (0006:00): 16, Head 00000000 Count 00189bf9
NVRM: Xid (0006:00): 8, Channel 00000020
NVRM: Xid (0006:00): 28, L0 -> L0
NVRM: Xid (0006:00): 16, Head 00000000 Count 00189bfa
NVRM: Xid (0006:00): 8, Channel 00000020
NVRM: Xid (0006:00): 16, Head 00000000 Count 00189bfb
NVRM: Xid (0006:00): 8, Channel 00000020
NVRM: Xid (0006:00): 16, Head 00000000 Count 00189bfc
NVRM: Xid (0006:00): 8, Channel 00000020
NVRM: Xid (0006:00): 16, Head 00000000 Count 00189bfd
NVRM: Xid (0006:00): 8, Channel 00000020
NVRM: Xid (0006:00): 16, Head 00000000 Count 00189bfe
Code:
irq 114: nobody cared (try booting with the "irqpoll" option)
Call Trace: <IRQ> <ffffffff8029f304>{__report_bad_irq+48}
<ffffffff8029f513>{note_interrupt+450} <ffffffff8029edc2>{__do_IRQ+188}
<ffffffff80264fd1>{do_IRQ+59} <ffffffff80263153>{default_idle+0}
<ffffffff8025a008>{ret_from_intr+0} <EOI> <ffffffff8026317e>{default_idle+43}
<ffffffff80246d3a>{cpu_idle+151} <ffffffff80521778>{start_kernel+502}
<ffffffff80521288>{_sinittext+648}
handlers:
[<ffffffff8883b512>] (nv_kern_isr+0x0/0x5e [nvidia])
Disabling IRQ #114
Code:
irq 114: nobody cared (try booting with the "irqpoll" option)
Call Trace:
<IRQ> [<ffffffff802a4d5d>] __report_bad_irq+0x30/0x7d
[<ffffffff802a4f97>] note_interrupt+0x1ed/0x22e
[<ffffffff802a44a4>] __do_IRQ+0xc7/0x105
[<ffffffff80210448>] __do_softirq+0x5e/0xd5
[<ffffffff8026415f>] do_IRQ+0x65/0x73
[<ffffffff80258b09>] ret_from_intr+0x0/0xa
<EOI> [<ffffffff80271571>] do_gettimeoffset_pm+0x8/0x23
[<ffffffff80264a66>] do_gettimeofday+0x50/0x94
[<ffffffff8023862c>] sys_gettimeofday+0x1b/0x62
[<ffffffff8025860e>] system_call+0x7e/0x83
handlers:
[<ffffffff887fb913>] (nv_kern_isr+0x0/0x5e [nvidia])
Disabling IRQ #114
Code:
irq 114: nobody cared (try booting with the "irqpoll" option)
Call Trace:
<IRQ> [<ffffffff802a4d5d>] __report_bad_irq+0x30/0x7d
[<ffffffff802a4f97>] note_interrupt+0x1ed/0x22e
[<ffffffff802a44a4>] __do_IRQ+0xc7/0x105
[<ffffffff80210448>] __do_softirq+0x5e/0xd5
[<ffffffff8026415f>] do_IRQ+0x65/0x73
[<ffffffff8026224d>] default_idle+0x0/0x50
[<ffffffff80258b09>] ret_from_intr+0x0/0xa
<EOI> [<ffffffff80262276>] default_idle+0x29/0x50
[<ffffffff8024559b>] cpu_idle+0x95/0xb8
[<ffffffff80538799>] start_kernel+0x216/0x21b
[<ffffffff80538288>] _sinittext+0x288/0x28c
handlers:
[<ffffffff887fb913>] (nv_kern_isr+0x0/0x5e [nvidia])
Disabling IRQ #114
NVRM: Xid (0006:00): 16, Head 00000000 Count 005da74c
NVRM: Xid (0006:00): 16, Head 00000001 Count 00000001
Regards, Markus
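With this many near-identical excerpts, it can help to condense a syslog into the few facts that matter: which IRQ was disabled, which handler was registered on it, and which NVRM Xid codes followed. The following is only a rough sketch in Python; the regular expressions are inferred from the excerpts posted in this thread, and `summarize` and `SAMPLE` are hypothetical names, not part of any driver or kernel tool.

```python
import re
from collections import Counter

# Patterns inferred from the log excerpts in this thread only (an assumption);
# other kernel or driver versions may format these lines differently.
IRQ_RE = re.compile(r'irq (\d+): nobody cared')
HANDLER_RE = re.compile(
    r'\[<[0-9a-f]+>\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]\)')
XID_RE = re.compile(r'NVRM: Xid \(([0-9a-f:]+)\): (\d+),')


def summarize(log_text):
    """Return disabled IRQs, (function, module) handler pairs and Xid counts."""
    irqs = sorted({int(n) for n in IRQ_RE.findall(log_text)})
    handlers = sorted(set(HANDLER_RE.findall(log_text)))
    xids = Counter(int(code) for _dev, code in XID_RE.findall(log_text))
    return {"irqs": irqs, "handlers": handlers, "xid_counts": dict(xids)}


# A few lines copied from the excerpts above, used as sample input.
SAMPLE = """\
irq 114: nobody cared (try booting with the "irqpoll" option)
handlers:
[<ffffffff88839512>] (nv_kern_isr+0x0/0x5e [nvidia])
Disabling IRQ #114
NVRM: Xid (0006:00): 16, Head 00000000 Count 001897bf
NVRM: Xid (0006:00): 16, Head 00000001 Count 00000001
NVRM: Xid (0006:00): 8, Channel 00000020
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Feeding it the whole syslog instead of `SAMPLE` shows at a glance whether nv_kern_isr is the only handler ever listed for the disabled IRQ.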
#2
NVIDIA Corporation
Join Date: Dec 2004
Posts: 8,763
It looks like your BIOS/kernel has IRQ management problems. You should test with a newer kernel and also verify that you're using the latest BIOS for the motherboard.
Thanks, Lonni
#3
Registered User
Join Date: Dec 2006
Posts: 7
The BIOS is already the latest version provided by the vendor.
I tried Linux kernel 2.6.19 with no luck, same error:
Code:
Dec 16 01:28:56 atlantis kernel: irq 16: nobody cared (try booting with the "irqpoll" option)
Dec 16 01:28:56 atlantis kernel:
Dec 16 01:28:56 atlantis kernel: Call Trace:
Dec 16 01:28:56 atlantis kernel: <IRQ> [<ffffffff802a5966>] __report_bad_irq+0x30/0x7d
Dec 16 01:28:56 atlantis kernel: [<ffffffff802a5b94>] note_interrupt+0x1e1/0x222
Dec 16 01:28:56 atlantis kernel: [<ffffffff802a63ab>] handle_fasteoi_irq+0x9e/0xc5
Dec 16 01:28:56 atlantis kernel: [<ffffffff8025824c>] call_softirq+0x1c/0x28
Dec 16 01:28:56 atlantis kernel: [<ffffffff80263221>] do_IRQ+0x7b/0xc5
Dec 16 01:28:56 atlantis kernel: [<ffffffff802612a6>] default_idle+0x0/0x47
Dec 16 01:28:56 atlantis kernel: [<ffffffff80257641>] ret_from_intr+0x0/0xa
Dec 16 01:28:56 atlantis kernel: <EOI>
Dec 16 01:28:56 atlantis kernel: handlers:
Dec 16 01:28:56 atlantis kernel: [<ffffffff887af975>] (nv_kern_isr+0x0/0x5e [nvidia])
Dec 16 01:28:56 atlantis kernel: Disabling IRQ #16
Dec 16 01:29:04 atlantis kernel: NVRM: Xid (0006:00): 16, Head 00000000 Count 0001de50
Dec 16 01:29:04 atlantis kernel: NVRM: Xid (0006:00): 16, Head 00000001 Count 00000001
Thanks, Markus
Update: I just tried the nVidia beta driver version 9742 with Linux kernels 2.6.18 and 2.6.19. The error still occurs.
Last edited by magicmarkus; 12-15-06 at 07:28 PM. Reason: update
#4
NVIDIA Corporation
Join Date: Dec 2004
Posts: 8,763
Unfortunately, this is still a kernel or (more likely) a BIOS problem. You'll need to contact MSI.
Thanks, Lonni
#5
Registered User
Join Date: Dec 2006
Posts: 7
Some updates on this problem:
I contacted MSI support, which was quite disappointing, since they don't seem to be very experienced with Linux. They even admitted that they did not test with Linux. So I will keep bothering you. I installed a new BIOS version, but with no improvement. My current setup:
I can reproduce the freeze very quickly (< 5 min) by playing Warcraft III under Wine. The SPECviewperf benchmark also triggers the freeze. However, I was only able to reproduce the problem in my 32-bit chroot, so maybe there is something wrong with the 32-bit libraries. <edit>I was also able to reproduce the problem in the normal 64-bit environment. It just takes more time...</edit>
I also tried a newer kernel, version 2.6.20-rc4, without success. In addition, I booted the kernel with various combinations of noapic, pci=noacpi and pci=biosirq, but the freezes still occur. I thought these options would work around BIOS problems?
If you have any ideas on how to debug this problem, I would appreciate it!
Regards, Markus
Last edited by magicmarkus; 01-14-07 at 06:43 PM.
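When cycling through many combinations of boot options, it is easy to lose track of which ones the running kernel actually received. The sketch below, in Python, checks a command line string (such as the contents of /proc/cmdline) for the IRQ-related options mentioned in this thread. The function name `irq_options` is hypothetical, and the comma splitting assumes that several `pci=` values may be combined in one token.

```python
# IRQ-related boot options mentioned in this thread.
WANTED = {"noapic", "irqpoll", "pci=noacpi", "pci=biosirq"}


def irq_options(cmdline):
    """Return the IRQ-related options found in a kernel command line string."""
    found = set()
    for token in cmdline.split():
        if token.startswith("pci="):
            # Assumption: pci= accepts a comma-separated list of values.
            for value in token[len("pci="):].split(","):
                found.add("pci=" + value)
        else:
            found.add(token)
    return sorted(found & WANTED)


if __name__ == "__main__":
    # On a running Linux system the command line is exposed in /proc/cmdline.
    try:
        with open("/proc/cmdline") as f:
            print(irq_options(f.read()))
    except OSError:
        print("no /proc/cmdline available on this system")
```

An empty list after a reboot means the bootloader entry that was edited is not the one that was actually booted.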
#6
Geforce 8800 GTS 512
Join Date: Nov 2002
Location: Australia
Posts: 396
Have you checked that the BIOS is in fact setting up the memory voltage correctly? I have an ASUS Crosshair motherboard with the nForce 590 chipset, and it would set the memory voltage too low (it set the memory timings correctly), so the system would randomly hang or lock up. Mine was set to 1.9 V (the default), whereas it should have been 2.2 V. Since I corrected the memory voltage, the system has been rock solid. It's worth checking out.
You will need to look up the specs for the RAM you're using and set up the BIOS manually. That should correct the lockups.
Wolf
#7
Registered User
Join Date: Dec 2006
Posts: 7
Thanks for your response, Wolf.
I am using 2x1024 MB DDR2-667 RAM from MDT. The MDT specs say it needs 1.8 V. My BIOS had also set the voltage to 1.9 V. So I tried both 1.8 V and 2.2 V, but I could still reproduce the freezes. I don't think the problem is related to the memory, though: I can do very memory-intensive tasks (like running Eclipse or compiling a kernel) without a freeze. The freezes occur when I run applications with heavy graphics/OpenGL usage.
#8
Geforce 8800 GTS 512
Join Date: Nov 2002
Location: Australia
Posts: 396
Running OpenGL programs and games would hang and lock up my system too, as would running memtest86+. Setting all the relevant memory timings and voltages fixed it for me. I'm running FC5 with all the updates (kernel 2.6.18-1.2257.fc5smp), and my system is rock solid now. I'd just make sure the timings are correct too; it can't hurt. Other than that, there's not much else I can suggest.
Edit: Oh, and the only kernel command line settings I had to add were "pci=nommconf combined_mode=libata". Other than that, everything else is stock standard.
Wolf
#9
Registered User
Join Date: Dec 2006
Posts: 7
OK, I did some research on my own.
First, I have to clarify that what I called a "freeze" is actually not a freeze. The kernel disables the interrupt of the nvidia module, which stops all graphical output and so looks like a freeze. But I am sometimes still able to ssh into my machine and restart it remotely, and the SysRq keys always still work, so the system is not completely dead.
So, on to the next step: why is the kernel disabling the IRQ? I am not a kernel hacker, so I am guessing a bit here. The kernel eventually gets an IRQ (from the BIOS?) and tries to handle it, so it calls the appropriate handler in __do_IRQ. The handler is the function nv_kern_isr, defined in nv.c. The handler must return a value (IRQ_HANDLED or IRQ_NONE) that indicates whether the IRQ was handled. If the return value is IRQ_HANDLED, everything is OK. If it is IRQ_NONE, the IRQ was not handled, i.e. it was not meant for this handler, and the kernel tries to give the IRQ to the next handler. But in my case there is no other handler on this interrupt, so nobody cared for it. If that happens more than 99900 times, the kernel reports it as a bad IRQ in note_interrupt and disables the interrupt (and the kernel is kind enough to inform me about it).
The question now is: why does the nvidia interrupt handler not handle the IRQ? The return value of the handler is determined by the return value of the function rm_isr. And this is a dead end for me, because that function is implemented in the closed-source binary library. So either nvidia releases a free, open-source version of its driver (which many Linux users would love to see) or the nvidia hackers themselves must find out why rm_isr does not handle the interrupt and freezes my system!
Oh, and thanks for your tips, Wolf. I tried different memory settings in the BIOS and also your kernel parameters, but it did not help.
I hope that someone from nVidia reads this...
Markus
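The accounting described above can be illustrated with a small simulation. This is a Python sketch of the logic as I understand it from the 2.6-era kernel source (kernel/irq/spurious.c), not the kernel's actual code: the class `IrqLine`, the two example handlers and the helper `interrupts_until_disabled` are hypothetical names, and the window of 100,000 interrupts is an assumption taken from that source rather than from this thread.

```python
IRQ_NONE, IRQ_HANDLED = 0, 1

WINDOW = 100_000     # interrupts per accounting window (assumed, see lead-in)
THRESHOLD = 99_900   # unhandled interrupts per window that trigger the disable


class IrqLine:
    """Toy model of one IRQ line with its registered handlers."""

    def __init__(self, handlers):
        self.handlers = handlers
        self.irq_count = 0
        self.irqs_unhandled = 0
        self.disabled = False

    def raise_irq(self):
        if self.disabled:
            return
        # Every registered handler is called; the interrupt counts as
        # handled if at least one of them claims it.
        result = IRQ_NONE
        for handler in self.handlers:
            if handler() == IRQ_HANDLED:
                result = IRQ_HANDLED
        self._note_interrupt(result)

    def _note_interrupt(self, result):
        if result != IRQ_HANDLED:
            self.irqs_unhandled += 1
        self.irq_count += 1
        if self.irq_count < WINDOW:
            return
        # End of a window: decide whether the line is stuck, then reset.
        self.irq_count = 0
        if self.irqs_unhandled > THRESHOLD:
            self.disabled = True  # here the kernel logs "nobody cared"
        self.irqs_unhandled = 0


def dead_card_isr():
    """Handler that never claims the interrupt, like nv_kern_isr in the logs."""
    return IRQ_NONE


def healthy_isr():
    """Handler that always claims the interrupt."""
    return IRQ_HANDLED


def interrupts_until_disabled(handlers, limit=300_000):
    """Count interrupts until the line is disabled; None if it never is."""
    line = IrqLine(handlers)
    for n in range(1, limit + 1):
        line.raise_irq()
        if line.disabled:
            return n
    return None


if __name__ == "__main__":
    print(interrupts_until_disabled([dead_card_isr]))
    print(interrupts_until_disabled([healthy_isr]))
```

In this model the decision is only taken at the end of each window, so occasional unclaimed interrupts are tolerated; the line is only disabled when nearly every interrupt in a window goes unclaimed.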
#10
Registered User
Join Date: Dec 2006
Posts: 7
Big update:
Since the MSI support team insisted on it, I did some tests under Windows XP, and it seems to have the same problem: if I run a 3D benchmark, after some time (2-5 min) the system does a soft reset and reboots. So this actually seems to be a hardware problem.
I have been running memtest+ for over an hour now without any problems, so the memory seems to be OK. The fact that the SysRq keys still work when the system freezes under Linux also indicates that the kernel is still running. So my guess is that the VGA card somehow stops all operation and freezes. This would also fit the error message "irq 114: nobody cared": interrupts keep arriving on the graphics card's IRQ line, but nothing handles them because the card is dead. If that happens too often, the kernel disables the interrupt and logs the error message.
So what could be the reason for the VGA card to freeze? Any suggestions?
Thanks for your support,
Markus
#11
Registered User
Join Date: Jan 2007
Posts: 1
I have the same problem with my 7600 GS, but I don't know if it is the same error. I am going to capture a log file of it.
#12
Registered User
Join Date: Dec 2006
Posts: 7
I just want to post my final solution:
As it turned out, it really was a hardware problem. I sent the VGA card back to my reseller and got a replacement card. Now everything works OK, no more freezes. Thanks to all for your help; problem solved!