#1
Registered User
Join Date: Jan 2007
Posts: 11
When I modprobe the nvidia module, the kernel page-faults at ffffffff6d723538, with os_get_cpu_frequency+0xb at the top of the call trace. From dmesg:
Code:
Unable to handle kernel paging request at ffffffff6d723538 RIP:
[<ffffffff6d723538>]
PGD 203027 PUD 0
Oops: 0010 [1] PREEMPT SMP
CPU 0
Modules linked in: nvidia(P) bcraid
Pid: 7028, comm: modprobe Tainted: P 2.6.19-gentoo-r4 #1
RIP: 0010:[<ffffffff6d723538>] [<ffffffff6d723538>]
RSP: 0018:ffff81003b25de40 EFLAGS: 00010296
RAX: 0000000000181b00 RBX: ffff81003b25de98 RCX: 00000000078bfbff
RDX: 0000000000181100 RSI: 0000000000000001 RDI: ffff81003ddef000
RBP: ffff81003ddef000 R08: ffff81003b25de8c R09: ffff81003b25de88
R10: 0000000000000002 R11: 0000000000000001 R12: ffffffff88848380
R13: 00002ae561bda010 R14: 00000000005080f8 R15: 00002ae561bda010
FS: 00002ae561bd8ae0(0000) GS:ffffffff80691000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffff6d723538 CR3: 000000003bdf7000 CR4: 00000000000006e0
Process modprobe (pid: 7028, threadinfo ffff81003b25c000, task ffff81003df1f880)
Stack: ffffffff88426e17 0000000000000286 ffff81003b25de98 ffff81003ddef000
 ffffffff88119bbd ffff81003b25de98 ffffffff88104316 ffff81003b25de90
 0000000000003200 0000080000000000 00000f5a078bfbff 69746e6568747541
Call Trace:
 [<ffffffff88426e17>] :nvidia:os_get_cpu_frequency+0xb/0x44
 [<ffffffff88119bbd>] :nvidia:_nv003359rm+0x9/0xe
 [<ffffffff88104316>] :nvidia:_nv002562rm+0x1f6/0x362
 [<ffffffff881079ce>] :nvidia:_nv002556rm+0x80/0xa6
 [<ffffffff88122751>] :nvidia:rm_init_rm+0x9/0xe
 [<ffffffff8884e0e3>] :nvidia:nvidia_init_module+0xe3/0x7aa
 [<ffffffff802215cf>] __up_read+0x13/0x8a
 [<ffffffff8029aa76>] sys_init_module+0xaf/0x227
 [<ffffffff8025ba1e>] system_call+0x7e/0x83
Code: Bad RIP value.
RIP [<ffffffff6d723538>] RSP <ffff81003b25de40>
CR2: ffffffff6d723538
I cannot remove the module with rmmod; it seems to be stuck initializing because of this page fault. When I let Xorg load the nvidia module instead, it crashes in the same way. (The input freezes, but I confirmed this over ssh.)
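For readers decoding the trace above: the value after `Oops:` is the x86 page-fault error code, printed in hex. Below is a minimal Python sketch of how its bits can be read. The bit meanings come from the x86 architecture; the names `PF_BITS` and `decode_pf_error` are made up for this illustration and are not part of any kernel or NVIDIA tool.

```python
# Hypothetical helper: decode the hex field that follows "Oops:" in an
# x86 page-fault report. Bit meanings follow the x86 page-fault error code.
PF_BITS = [
    # (mask, meaning when set, meaning when clear)
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", "data access"),
]


def decode_pf_error(code_hex):
    """Return a list of human-readable flags for a hex error code string."""
    code = int(code_hex, 16)
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        text = if_set if code & mask else if_clear
        if text:
            flags.append(text)
    return flags


if __name__ == "__main__":
    # 0010 is the code in the trace above; 0000 is shown for comparison.
    for code in ("0010", "0000"):
        print(code, "->", ", ".join(decode_pf_error(code)))
```

For `0010` this reports an instruction fetch from a non-present page in kernel mode, which fits RIP and CR2 holding the same address in the dump.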
#2
NVIDIA Corporation
Join Date: Dec 2004
Posts: 8,763
I have a few questions:
0) Is this problem specific to your 2.6.19-gentoo-r4 kernel? Does it reproduce with an older kernel and/or a kernel.org kernel?
1) What kind of motherboard & graphics card are you using?
2) Have you verified that you're using the latest motherboard BIOS?
Thanks,
Lonni
#3
Registered User
Join Date: Jan 2007
Posts: 11
Quote:
Quote:
Quote:
Just to reiterate: startup and most other things work fine with the older 9631 nvidia-drivers, but with those the system tends to freeze completely at random after about 20 minutes in a window manager like compiz.
#4
NVIDIA Corporation
Join Date: Dec 2004
Posts: 8,763
I'm not able to reproduce this problem with a GeForce 7600 in a Tyan 2885 motherboard with 1.0-9746. X starts up fine (which does much more than just modprobing nvidia). My only guess at this point is that your crash is something specific to the Gentoo environment that you're running.
#5
Registered User
Join Date: Jan 2007
Posts: 11
Quote:
Edit: Oh, and it might be in the kernel settings too.
#6
NVIDIA Corporation
Join Date: Dec 2004
Posts: 8,763
I was using the latest Fedora Core 6 kernel, which is based off of 2.6.18.x. You can get its source & configuration here:
http://mirrors.kernel.org/fedora/cor...69.fc6.src.rpm
#7
Registered User
Join Date: Jan 2007
Posts: 11
Quote:
Right after I compiled the 9746 drivers, it worked fine. Then I restarted and modprobed the nvidia module. Unlike with the newer kernel, this worked. However, when I started Xorg, it hung. I ssh'd into the machine and found this in dmesg:
Code:
Unable to handle kernel paging request at ffffffff6d723638 RIP:
[<ffffffff6d723638>]
PGD 203027 PUD 0
Oops: 0010 [1] PREEMPT SMP
CPU 0
Modules linked in: stir4200 usbhid parport_pc nvidia parport uhci_hcd ohci_hcd eth1394 bcraid
Pid: 9956, comm: Xorg Tainted: P 2.6.18.6 #1
RIP: 0010:[<ffffffff6d723638>] [<ffffffff6d723638>]
RSP: 0018:ffff81007f1bbba0 EFLAGS: 00010202
RAX: ffff810037868000 RBX: ffff810037868000 RCX: ffff810037489110
RDX: ffff810037489110 RSI: ffff810037489110 RDI: ffffffff8886c200
RBP: ffff8100384ca8c0 R08: ffff810037489110 R09: 0000000000000001
R10: 0000000000000000 R11: ffffffff80492204 R12: ffff810037489110
R13: ffff810037489110 R14: 0000000000000001 R15: 0000000000000000
FS: 00002b686123fae0(0000) GS:ffffffff806c6000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff6d723638 CR3: 000000007ed45000 CR4: 00000000000006e0
Process Xorg (pid: 9956, threadinfo ffff81007f1ba000, task ffff81007f331180)
Stack: ffffffff8813ae60 ffff810037489110 ffffffff8838f0a2 00000000bfef0020
 ffff810037458800 0000000000000000 00000000bfef0100 0000000000000000
 ffff81003c917400 ffff81003d05b000 ffffffff88112ece 00000000bfef0100
Call Trace:
 [<ffffffff8813ae60>] :nvidia:_nv003253rm+0x34/0x3a
 [<ffffffff8838f0a2>] :nvidia:_nv004835rm+0x74/0xd8
 [<ffffffff88112ece>] :nvidia:_nv002598rm+0x6e/0x94
 [<ffffffff88112cdb>] :nvidia:_nv002595rm+0xcd/0xee
 [<ffffffff882a1742>] :nvidia:_nv009103rm+0x8c/0xae
 [<ffffffff8811dfee>] :nvidia:_nv002597rm+0x1da/0x2d2
 [<ffffffff804922bf>] pci_conf1_read+0xbb/0xc6
 [<ffffffff8811dd2d>] :nvidia:_nv002600rm+0xef/0x1d6
 [<ffffffff8811da06>] :nvidia:_nv002603rm+0x42/0x27a
 [<ffffffff88143883>] :nvidia:rm_set_interrupts+0x11f/0x136
 [<ffffffff8844834e>] :nvidia:os_acquire_sema+0x5f/0x77
 [<ffffffff88119592>] :nvidia:_nv004373rm+0x70/0xaa
 [<ffffffff88146461>] :nvidia:_nv002552rm+0x1a9/0x63a
 [<ffffffff88143b4d>] :nvidia:rm_ioctl+0x9/0xe
 [<ffffffff884450da>] :nvidia:nv_kern_ioctl+0x35a/0x3eb
 [<ffffffff884451aa>] :nvidia:nv_kern_unlocked_ioctl+0x1c/0x23
 [<ffffffff8024152d>] do_ioctl+0x21/0x6b
 [<ffffffff8022fa49>] vfs_ioctl+0x252/0x26b
 [<ffffffff8022fa83>] __up_write+0x21/0x10d
 [<ffffffff8024bf49>] sys_ioctl+0x3c/0x5c
 [<ffffffff8025d452>] system_call+0x7e/0x83
Code: Bad RIP value.
RIP [<ffffffff6d723638>] RSP <ffff81007f1bbba0>
CR2: ffffffff6d723638
I attached the .config of the 2.6.18.6 kernel I used.
#8
NVIDIA Corporation
Join Date: Dec 2004
Posts: 8,763
You stated that you "restarted". What did you restart?
Does reinstalling 1.0-9746 have any impact? Also, please post a new bug report.
Thanks,
Lonni
#9
Registered User
Join Date: Jan 2007
Posts: 11
I rebooted the entire system.
I tried to reproduce it. First I made sure there was no leftover nvidia module that udevd could load at startup, using 'rm `find -name nvidia.ko`'. Then I rebooted into the 2.6.18.6 kernel, prevented Xorg from starting (it would only complain about the missing module), and reinstalled the nvidia-drivers from the VT. To rule out a filesystem corruption bug, I noted the md5sum:
Code:
henk ~ # md5sum /lib/modules/2.6.18.6/video/nvidia.ko
c9ce219812c25a258ee2bed93222214e /lib/modules/2.6.18.6/video/nvidia.ko
I created a bug report log. Then I shut down Linux, powered the system off completely, booted the computer again, and chose the 2.6.18.6 kernel. udevd loaded the nvidia kernel module automatically, and during init I saw a kernel page fault scroll by. Init continued properly. Again I prevented Xorg from starting, because it would just hang. I checked the nvidia module, and to my amazement:
Code:
henk ~ # md5sum /lib/modules/2.6.18.6/video/nvidia.ko
a59547309e66c1e7d98ee6de0f9e26dc /lib/modules/2.6.18.6/video/nvidia.ko
Code:
henk ~ # md5sum /lib/modules/2.6.18.6/video/*
c9ce219812c25a258ee2bed93222214e /lib/modules/2.6.18.6/video/nvidia.ko
a59547309e66c1e7d98ee6de0f9e26dc /lib/modules/2.6.18.6/video/nvidia.old.ko
Code:
henk video # objdump -D nvidia.ko > nvidia.ko.dump
henk video # objdump -D nvidia.old.ko > nvidia.old.ko.dump
henk video # diff nvidia.ko.dump nvidia.old.ko.dump
2c2
< nvidia.ko: file format elf64-x86-64
---
> nvidia.old.ko: file format elf64-x86-64
Then I tried a diff on two hexdumps:
Code:
henk video # hexdump nvidia.ko > nvidia.ko.dump
henk video # hexdump nvidia.old.ko > nvidia.old.ko.dump
henk video # diff nvidia.ko.dump nvidia.old.ko.dump
424513c424513
< 06f09f0 c77c 0001 0000 0000 0002 0000 16db 0000
---
> 06f09f0 c77c 0001 0000 0000 0002 0000 17db 0000
424528c424528
< 06f0ae0 c9d8 0001 0000 0000 0002 0000 16db 0000
---
> 06f0ae0 c9d8 0001 0000 0000 0002 0000 17db 0000
424530c424530
< 06f0b00 000b 0000 16db 0000 0000 0000 0000 0000
---
> 06f0b00 000b 0000 17db 0000 0000 0000 0000 0000
425382c425382
< 06f4040 0002 0000 16db 0000 fffb ffff ffff ffff
---
> 06f4040 0002 0000 17db 0000 fffb ffff ffff ffff
426295c426295
< 06f7950 fe5c 0002 0000 0000 0002 0000 36df 0000
---
> 06f7950 fe5c 0002 0000 0000 0002 0000 37df 0000
435540c435540
< 071bb20 0002 0000 16fd 0000 fffc ffff ffff ffff
---
> 071bb20 0002 0000 17fd 0000 fffc ffff ffff ffff
459937c459937
< 077aff0 042b 0000 0000 0000 0002 0000 36db 0000
---
> 077aff0 042b 0000 0000 0000 0002 0000 37db 0000
This is what I found in dmesg:
Code:
NVRM: loading NVIDIA UNIX x86_64 Kernel Module 1.0-9746 Fri Dec 15 10:19:35 PST 2006
Unable to handle kernel paging request at ffffffff6d723938 RIP:
[<ffffffff880f7428>] :nvidia:nvidia_init_module+0x428/0x7aa
PGD 203027 PUD 0
Oops: 0000 [1] PREEMPT SMP
CPU 0
Modules linked in: nvidia parport_pc parport ohci_hcd uhci_hcd eth1394 bcraid
Pid: 2568, comm: modprobe Tainted: P 2.6.18.6 #2
RIP: 0010:[<ffffffff880f7428>] [<ffffffff880f7428>] :nvidia:nvidia_init_module+0x428/0x7aa
RSP: 0018:ffff81003de9ff08 EFLAGS: 00010282
RAX: ffffffff88875240 RBX: 0000000000000000 RCX: ffff81003de9fe28
RDX: ffff81003dc80c80 RSI: ffffffff8854c7cf RDI: 0000033d00000000
RBP: 0000000000000000 R08: 0000000000000000 R09: ffff81003dc80a40
R10: ffff81003dc80c80 R11: ffff8100021ed000 R12: 0000000000972eed
R13: 00002b80ee39c010 R14: 00000000005080e8 R15: ffff81003d6ad740
FS: 00002b80ee39aae0(0000) GS:ffffffff806c6000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffff6d723938 CR3: 000000003d068000 CR4: 00000000000006e0
Process modprobe (pid: 2568, threadinfo ffff81003de9e000, task ffff810037de2740)
Stack: ffffffff805c6de0 ffffffff80221455 ffffffff805c6de0 ffffffff88875240
 0000000000508120 0000000000972eed 00002b80ee39c010 00000000005080e8
 00002b80ee39c010 ffffffff8029a8c4 0000000000000000 0000000000000000
Call Trace:
 [<ffffffff80221455>] __up_read+0x13/0x8a
 [<ffffffff8029a8c4>] sys_init_module+0xaf/0x228
 [<ffffffff8025d452>] system_call+0x7e/0x83
Code: 48 8b 15 09 c5 62 e5 48 89 42 48 83 3d b6 1c 78 00 00 0f 84
RIP [<ffffffff880f7428>] :nvidia:nvidia_init_module+0x428/0x7aa RSP <ffff81003de9ff08>
CR2: ffffffff6d723938
I made a second nv bug report. This will be the 'after' one.
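A side note on the hexdump diff in this post: the seven changed 16-bit words can be compared bit by bit. The Python sketch below XORs each pair, taking nvidia.ko (whose checksum matches the fresh install) as the reference and nvidia.old.ko as the changed copy. The names `PAIRS` and `flipped_bits` are invented for this illustration, and the sketch only describes the pattern of the change; it does not identify what caused it.

```python
# Word pairs copied from the hexdump diff: (nvidia.ko, nvidia.old.ko).
# hexdump's default output shows 16-bit words in host byte order, so on
# this little-endian machine the high byte of each word is the second
# byte in the file.
PAIRS = [
    (0x16DB, 0x17DB),
    (0x16DB, 0x17DB),
    (0x16DB, 0x17DB),
    (0x16DB, 0x17DB),
    (0x36DF, 0x37DF),
    (0x16FD, 0x17FD),
    (0x36DB, 0x37DB),
]


def flipped_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    x = a ^ b
    return [i for i in range(x.bit_length()) if (x >> i) & 1]


if __name__ == "__main__":
    for good, changed in PAIRS:
        bits = flipped_bits(good, changed)
        print(f"{good:04x} vs {changed:04x}: differing bit positions {bits}")
```

Every pair differs in exactly one bit, and it is the same bit each time (the lowest bit of the word's high byte), set in nvidia.old.ko and clear in nvidia.ko.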
#10
NVIDIA Corporation
Join Date: Dec 2004
Posts: 8,763
If the md5sum of the nvidia kernel module is changing during reboots, then that seems like an OS or hardware problem. There's certainly nothing in the nvidia driver itself that would cause such behavior.
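To check for this kind of change without retyping md5sum after every reboot, the checksum can be recorded once and compared later. The following is a minimal Python sketch using the standard `hashlib` module; the helper names (`md5_of_stream`, `md5_of_file`, `unchanged`) are hypothetical, and the demo uses in-memory bytes as a stand-in for the module file rather than the real /lib/modules path.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary file-like object in chunks and return the hex digest."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path):
    """Convenience wrapper for a file on disk, e.g. a kernel module."""
    with open(path, "rb") as fh:
        return md5_of_stream(fh)


def unchanged(stream, recorded_md5):
    """True if the stream still hashes to the previously recorded digest."""
    return md5_of_stream(stream) == recorded_md5.strip().lower()


if __name__ == "__main__":
    # In-memory stand-ins for the module before and after a reboot.
    before = b"\xdb\x16\x00\x00"
    after = b"\xdb\x17\x00\x00"
    recorded = md5_of_stream(io.BytesIO(before))
    print("same bytes still match:", unchanged(io.BytesIO(before), recorded))
    print("altered bytes still match:", unchanged(io.BytesIO(after), recorded))
```

On a real system one would call `md5_of_file` on the installed nvidia.ko right after installation, store the digest somewhere off the suspect disk, and compare again after each reboot.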
#11
Registered User
Join Date: Jan 2007
Posts: 11
Quote:
At the moment I'm running my Linux system from a normal IDE hard disk (I haven't even loaded the RAID drivers). Installing the latest drivers now works fine, and so does restarting. However, the random crashes I experienced (which were the original reason to upgrade to the latest drivers) still persist. Because the RAID drivers aren't loaded, they can't be the cause. After a while the whole system freezes: SysRq doesn't respond and sshd doesn't respond. My two monitors still show everything I was doing, but frozen, with no artifacts. I've attached the new bug report.
#12
Registered User
Join Date: Jan 2007
Posts: 11
Oh, maybe useful to note: there aren't any records of the crash in /var/log/messages or in any other log file after reboot. I guess everything freezes, which prevents logging.