|10-22-02, 12:42 PM||#1|
Join Date: Oct 2002
RH 7.3 (2.4.18-17.7smp) 3123 kernel oops
Ran into a problem that seems to point to the NVIDIA driver (drv 3123 from --recompile)
Upgraded RH 7.3 (2.4.18-5smp to 2.4.18-17.7.xsmp) on a Dell Precision 530. After about 15 minutes of working with the system, the following kernel oops was emitted:
Oct 22 08:59:02 host kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000020
Oct 22 08:59:02 host kernel: printing eip:
Oct 22 08:59:02 host kernel: c01e6665
Oct 22 08:59:02 host kernel: *pde = 00000000
Oct 22 08:59:02 host kernel: Oops: 0000
Oct 22 08:59:02 host kernel: ide-cd cdrom i810_audio ac97_codec soundcore agpgart NVdriver autofs nfs lockd
Oct 22 08:59:02 host kernel: CPU: 0
Oct 22 08:59:02 host kernel: EIP: 0010:[<c01e6665>] Tainted: P
Oct 22 08:59:02 host kernel: EFLAGS: 00013202
Oct 22 08:59:02 host kernel:
Oct 22 08:59:02 host kernel: EIP is at __scm_destroy [kernel] 0x15 (2.4.18-17.7.xsmp)
Oct 22 08:59:02 host kernel: eax: f6a6ff08 ebx: f6a6ff08 ecx: f64543dc edx: 00000000
Oct 22 08:59:02 host kernel: esi: 00000020 edi: f6a6ff1c ebp: f64506a4 esp: f6a6fed4
Oct 22 08:59:02 host kernel: ds: 0018 es: 0018 ss: 0018
Oct 22 08:59:02 host kernel: Process X (pid: 1073, stackpage=f6a6f000)
Oct 22 08:59:02 host kernel: Stack: f6a6ff08 00000018 f6a6ff1c c01e0799 f6a6ff08 f6a64ce0 00003246 f6483560
Oct 22 08:59:02 host kernel: 00000246 f4bad000 c1038030 c0305924 00000246 00000503 000001fd 000001fa
Oct 22 08:59:02 host kernel: 00000000 00000020 f6249008 00000001 c01531c4 f64c3940 00001000 ffffffea
Oct 22 08:59:02 host kernel: Call Trace: [<c01e0799>] sock_recvmsg [kernel] 0x69 (0xf6a6fee0))
Oct 22 08:59:02 host kernel: [<c01531c4>] poll_freewait [kernel] 0x44 (0xf6a6ff24))
Oct 22 08:59:02 host kernel: [<c01e0868>] sock_read [kernel] 0x88 (0xf6a6ff38))
Oct 22 08:59:02 host kernel: [<c0153a22>] sys_select [kernel] 0x472 (0xf6a6ff70))
Oct 22 08:59:02 host kernel: [<c0143236>] sys_read [kernel] 0x96 (0xf6a6ff7c))
Oct 22 08:59:02 host kernel: [<c0108c7b>] system_call [kernel] 0x33 (0xf6a6ffc0))
Oct 22 08:59:02 host kernel:
Oct 22 08:59:02 host kernel:
Oct 22 08:59:02 host kernel: Code: 8b 1e 4b 78 11 8d 7e 04 8d 76 00 8b 04 9f e8 b8 da f5 ff 4b
Oct 22 08:59:02 host kernel: <6>NVRM: AGPGART: freed 16 pages
Oct 22 08:59:03 host kernel: NVRM: AGPGART: allocated 16 pages
>>EIP; c01e6665 <qdisc_lookup_ops+75/c0> <=====
Trace; c01e0799 <neigh_update+3b9/3c0>
Trace; c01531c4 <lease_alloc+64/f0>
Trace; c01e0868 <neigh_hh_init+48/c0>
Trace; c0153a22 <flock_lock_file+12/160>
Trace; c0143236 <sys_fsync+66/a0>
Trace; c0108c7b <ret_from_sys_call+b/11>
Code; c01e6665 <qdisc_lookup_ops+75/c0>
Code; c01e6665 <qdisc_lookup_ops+75/c0> <=====
0: 8b 1e mov (%esi),%ebx <=====
Code; c01e6667 <qdisc_lookup_ops+77/c0>
2: 4b dec %ebx
Code; c01e6668 <qdisc_lookup_ops+78/c0>
3: 78 11 js 16 <_EIP+0x16> c01e667b <qdisc_lookup_ops+8b/c0>
Code; c01e666a <qdisc_lookup_ops+7a/c0>
5: 8d 7e 04 lea 0x4(%esi),%edi
Code; c01e666d <qdisc_lookup_ops+7d/c0>
8: 8d 76 00 lea 0x0(%esi),%esi
Code; c01e6670 <qdisc_lookup_ops+80/c0>
b: 8b 04 9f mov (%edi,%ebx,4),%eax
Code; c01e6673 <qdisc_lookup_ops+83/c0>
e: e8 b8 da f5 ff call fff5dacb <_EIP+0xfff5dacb> c0144130 <discard_buffer+20/90>
Code; c01e6678 <qdisc_lookup_ops+88/c0>
13: 4b dec %ebx
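A quick cross-check of the dump above: the faulting instruction decodes to `mov (%esi),%ebx`, and the register dump shows esi holding 00000020, the same address the kernel reports as the bad dereference. Here is a minimal Python sketch of that check. The helper names are made up for illustration, and the text is an excerpt of the log with the syslog prefix stripped.

```python
import re

# Excerpt of the oops above (syslog prefix removed for brevity).
OOPS = """\
Unable to handle kernel NULL pointer dereference at virtual address 00000020
eax: f6a6ff08 ebx: f6a6ff08 ecx: f64543dc edx: 00000000
esi: 00000020 edi: f6a6ff1c ebp: f64506a4 esp: f6a6fed4
"""

def fault_address(text):
    """Return the faulting virtual address as an int, or None if absent."""
    m = re.search(r"virtual address ([0-9a-f]{8})", text)
    return int(m.group(1), 16) if m else None

def registers(text):
    """Map each 32-bit register name (eax, esi, ...) to its dumped value."""
    pairs = re.findall(r"\b(e[a-z]{2}): ([0-9a-f]{8})", text)
    return {name: int(value, 16) for name, value in pairs}

def registers_matching_fault(text):
    """List the registers whose value equals the faulting address."""
    addr = fault_address(text)
    return sorted(n for n, v in registers(text).items() if v == addr)

print(registers_matching_fault(OOPS))  # ['esi']
```

So the code was reading through a pointer held in esi, and that pointer was 0x20 rather than a valid kernel address. This says what faulted, not which code put the bad value there.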
|10-22-02, 03:15 PM||#2|
Join Date: Sep 2002
Does this happen if you don't load your NVdriver kernel module? I.e., reboot and don't load it at all (this means you'll want to remove the alias from /etc/modules.conf, and that means you'll have to move back to the open-source "nv" or the "vesa" X driver), then run for a while.
If it happens again, then the problem isn't with the NVdriver kernel module. This seems to be the case anyway, as I can't think of any way the stack trace would have functions like "sys_select", "sock_recvmsg", and "sock_read" in it if the problem was with the nVidia drivers -- they shouldn't be touching the socket system. Although X does use sockets, so the driver might ... no, never mind. The nvidia_drv.o file runs in userspace, as does X itself.
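To make that reading of the stack trace concrete, here is a rough Python sketch that tallies which component owns each frame in the posted oops. It assumes that a frame inside a module would be tagged with the module's name in square brackets, the same way these frames are tagged `[kernel]`. The function name is hypothetical and the text is copied from the log above with the syslog prefix removed.

```python
import re
from collections import Counter

# EIP line and call trace from the oops, syslog prefix removed.
TRACE = """\
EIP is at __scm_destroy [kernel] 0x15 (2.4.18-17.7.xsmp)
Call Trace: [<c01e0799>] sock_recvmsg [kernel] 0x69 (0xf6a6fee0))
[<c01531c4>] poll_freewait [kernel] 0x44 (0xf6a6ff24))
[<c01e0868>] sock_read [kernel] 0x88 (0xf6a6ff38))
[<c0153a22>] sys_select [kernel] 0x472 (0xf6a6ff70))
[<c0143236>] sys_read [kernel] 0x96 (0xf6a6ff7c))
[<c0108c7b>] system_call [kernel] 0x33 (0xf6a6ffc0))
"""

def frame_owners(text):
    """Count frames per owner tag, e.g. 'kernel' or a module name."""
    # Matches "symbol [owner] 0xoffset" and captures the owner tag.
    return Counter(re.findall(r"\b\w+ \[([\w-]+)\] 0x[0-9a-f]+", text))

owners = frame_owners(TRACE)
print(dict(owners))            # {'kernel': 7}
print("NVdriver" in owners)    # False
```

Every frame is in the core kernel and none is in NVdriver. That only shows where the fault surfaced, so running without the module for a while is still the real test.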
When you upgraded your kernel, did you also upgrade kernel-source and recompile the NVdriver kernel module? You have to do that; you can't just copy over the version built for the old kernel.
Registered Linux User #219692
|10-22-02, 03:24 PM||#3|
Join Date: Oct 2002
Yes, we recompiled with the NVidia driver after rebooting with the kernel upgrade.
Also FYI--This box is running stock KDE from 7.3.
I tried for an hour to get Xinerama mode working with the "nvidia" driver that comes with XFree86. We need both monitors active for our in-house application(s).
I can set up TwinView fine with the NVidia rpm driver, but it crashes for us.
Will run with the "nvidia" or "vesa" driver if I can get Xinerama working for this...