09-07-11, 05:10 AM   #1
bailey.cj
startx causes kernel OOPS on Fujitsu D2618-C1 mobo, debian, many kernels & drivers

Dear List,

I am starting a new thread, but please see this one for a very similar issue (I do not have a Zotac mobo, yet I hit exactly the same kernel oops):

http://www.nvnews.net/vbulletin/showthread.php?t=162538

I have of course attached the nvidia-bug-report.log, but here are some pertinent details.

I've tried Ubuntu 10.04 and 11.04 and Debian 6.0 (squeeze), the latter with both the stable kernel (2.6.32) and the "official" backport of 2.6.39. I can compile and install both the certified 260 and 280 series of the NVIDIA drivers (both directly off nvidia.com and the Debian way using dkms) without glitches. modprobe loads the nvidia module without issues:

Code:
$ lsmod | grep nvidia
nvidia              11483630  4 
i2c_core               23766  3 nvidia,fschmd,i2c_i801
Every single time I start X, or let GDM do the same, I end up with a black/blank screen and an unresponsive keyboard (no Ctrl-Alt-F1 to get back to the console). I can ssh into the machine and, like others such as in the thread linked above, have found that the problem is a kernel oops inside the nvidia module. Judging by the call trace, it seems to happen while Xorg opens the device (nv_kern_open, then rm_init_adapter). This is the relevant portion of kern.log:

Code:
Sep  7 10:01:24 isis kernel: [70461.603348] BUG: unable to handle kernel paging request at ffffc90007d95000
Sep  7 10:01:24 isis kernel: [70461.603352] IP: [<ffffffffa042c479>] _nv022575rm+0x127/0x202 [nvidia]
Sep  7 10:01:24 isis kernel: [70461.603453] PGD c1f415067 PUD 61f875067 PMD c060e9067 PTE 0
Sep  7 10:01:24 isis kernel: [70461.603456] Oops: 0000 [#1] SMP 
Sep  7 10:01:24 isis kernel: [70461.603457] last sysfs file: /sys/bus/acpi/drivers/NVIDIA ACPI Video Driver/uevent
Sep  7 10:01:24 isis kernel: [70461.603459] CPU 12 
Sep  7 10:01:24 isis kernel: [70461.603460] Modules linked in: parport_pc ppdev bridge lp parport stp bnep rfcomm bluetooth rfkill acpi_cpufreq mperf cpufreq_conservative cpufreq_stats cpufreq_powersave cpufreq_userspace binfmt_misc fuse loop firewire_sbp2 snd_hda_codec_hdmi nvidia(P) snd_hda_codec_realtek tpm_infineon snd_hda_intel snd_hda_codec snd_hwdep fschmd i2c_i801 tpm_tis tpm tpm_bios ipmi_si i7core_edac snd_pcm edac_core psmouse snd_seq i2c_core snd_timer ipmi_msghandler snd_seq_device evdev serio_raw pcspkr snd container wmi processor soundcore snd_page_alloc thermal_sys button ext4 mbcache jbd2 crc16 dm_mod raid1 raid0 md_mod sd_mod crc_t10dif usbhid hid usb_storage uas sg sr_mod cdrom uhci_hcd ahci libahci libata mptsas firewire_ohci firewire_core crc_itu_t mptscsih mptbase scsi_transport_sas scsi_mod r8169 ehci_hcd mii usbcore [last unloaded: scsi_wait_scan]
Sep  7 10:01:24 isis kernel: [70461.603496] 
Sep  7 10:01:24 isis kernel: [70461.603498] Pid: 4383, comm: Xorg Tainted: P           O 2.6.39-bpo.2-amd64 #1 FUJITSU                          CELSIUS R670-2                /D2618-C1
Sep  7 10:01:24 isis kernel: [70461.603501] RIP: 0010:[<ffffffffa042c479>]  [<ffffffffa042c479>] _nv022575rm+0x127/0x202 [nvidia]
Sep  7 10:01:24 isis kernel: [70461.603567] RSP: 0018:ffff880604fbf978  EFLAGS: 00010206
Sep  7 10:01:24 isis kernel: [70461.603568] RAX: 00000000000000b9 RBX: ffffc90007d94f47 RCX: 000000000000001e
Sep  7 10:01:24 isis kernel: [70461.603570] RDX: 000000000000010c RSI: ffffc90007d94000 RDI: ffffffff81623fe0
Sep  7 10:01:24 isis kernel: [70461.603571] RBP: ffff88060529f108 R08: 00003ffffffff000 R09: ffff880000000000
Sep  7 10:01:24 isis kernel: [70461.603573] R10: 0000000000000296 R11: 00000000df7bb000 R12: ffff880605976800
Sep  7 10:01:24 isis kernel: [70461.603574] R13: 00000000df7baf47 R14: 00000000df7baf47 R15: 0000000000000000
Sep  7 10:01:24 isis kernel: [70461.603576] FS:  00007f495fc3c700(0000) GS:ffff88061fcc0000(0000) knlGS:0000000000000000
Sep  7 10:01:24 isis kernel: [70461.603578] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep  7 10:01:24 isis kernel: [70461.603579] CR2: ffffc90007d95000 CR3: 0000000604303000 CR4: 00000000000006e0
Sep  7 10:01:24 isis kernel: [70461.603580] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep  7 10:01:24 isis kernel: [70461.603582] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Sep  7 10:01:24 isis kernel: [70461.603583] Process Xorg (pid: 4383, threadinfo ffff880604fbe000, task ffff88060444d6a0)
Sep  7 10:01:24 isis kernel: [70461.603585] Stack:
Sep  7 10:01:24 isis kernel: [70461.603586]  ffff880605976000 ffff880605976800 ffff880605976000 ffff880606302000
Sep  7 10:01:24 isis kernel: [70461.603589]  000000000000001e ffffffffa042c7d4 ffff8806066ca000 ffff8806066ca000
Sep  7 10:01:24 isis kernel: [70461.603592]  ffff880606302000 ffffffffa04279f8 ffff8806066ca000 ffff880606302000
Sep  7 10:01:24 isis kernel: [70461.603595] Call Trace:
Sep  7 10:01:24 isis kernel: [70461.603660]  [<ffffffffa042c7d4>] ? _nv022591rm+0x76/0x9e [nvidia]
Sep  7 10:01:24 isis kernel: [70461.603724]  [<ffffffffa04279f8>] ? _nv022548rm+0xcf/0x301 [nvidia]
Sep  7 10:01:24 isis kernel: [70461.603788]  [<ffffffffa0427cd5>] ? _nv022598rm+0xab/0x174 [nvidia]
Sep  7 10:01:24 isis kernel: [70461.603851]  [<ffffffffa0427dee>] ? _nv022547rm+0x50/0x5d [nvidia]
Sep  7 10:01:24 isis kernel: [70461.603916]  [<ffffffffa042ea5d>] ? _nv022539rm+0x6e/0x78 [nvidia]
Sep  7 10:01:24 isis kernel: [70461.603997]  [<ffffffffa0a43134>] ? _nv019309rm+0x69/0x121 [nvidia]
Sep  7 10:01:24 isis kernel: [70461.604076]  [<ffffffffa0a430b2>] ? _nv019323rm+0xe8/0x101 [nvidia]
Sep  7 10:01:24 isis kernel: [70461.604150]  [<ffffffffa046b421>] ? _nv004632rm+0x68/0x1f4 [nvidia]
Sep  7 10:01:24 isis kernel: [70461.604290]  [<ffffffffa0809387>] ? _nv015073rm+0x176/0x4c3 [nvidia]
Sep  7 10:01:24 isis kernel: [70461.604430]  [<ffffffffa0807eeb>] ? _nv015366rm+0xe9/0x165 [nvidia]
Sep  7 10:01:24 isis kernel: [70461.604491]  [<ffffffffa03f2748>] ? _nv015546rm+0xd/0x12 [nvidia]
Sep  7 10:01:24 isis kernel: [70461.604570]  [<ffffffffa0a42e79>] ? _nv002297rm+0x19d/0x28a [nvidia]
Sep  7 10:01:24 isis kernel: [70461.604648]  [<ffffffffa0a43ee8>] ? _nv002291rm+0x4a5/0x684 [nvidia]
Sep  7 10:01:24 isis kernel: [70461.604727]  [<ffffffffa0a4ac84>] ? rm_init_adapter+0x9e/0x1b6 [nvidia]
Sep  7 10:01:24 isis kernel: [70461.604804]  [<ffffffffa0a6be47>] ? nv_kern_open+0x56f/0x708 [nvidia]
Sep  7 10:01:24 isis kernel: [70461.604809]  [<ffffffff810fe37c>] ? deactivate_super+0x3c/0x3c
Sep  7 10:01:24 isis kernel: [70461.604812]  [<ffffffff810fe6fe>] ? chrdev_open+0x12a/0x148
Sep  7 10:01:24 isis kernel: [70461.604814]  [<ffffffff810fe5d4>] ? cdev_put+0x1a/0x1a
Sep  7 10:01:24 isis kernel: [70461.604816]  [<ffffffff810fa46d>] ? __dentry_open+0x180/0x297
Sep  7 10:01:24 isis kernel: [70461.604818]  [<ffffffff81103537>] ? dget+0x12/0x1e
Sep  7 10:01:24 isis kernel: [70461.604821]  [<ffffffff81105bd1>] ? do_last+0x449/0x543
Sep  7 10:01:24 isis kernel: [70461.604823]  [<ffffffff811071c1>] ? path_openat+0xc6/0x317
Sep  7 10:01:24 isis kernel: [70461.604827]  [<ffffffff8110f5c1>] ? inode_change_ok+0x92/0x109
Sep  7 10:01:24 isis kernel: [70461.604829]  [<ffffffff8110f48a>] ? setattr_copy+0x98/0xd7
Sep  7 10:01:24 isis kernel: [70461.604831]  [<ffffffff811074df>] ? do_filp_open+0x2c/0x75
Sep  7 10:01:24 isis kernel: [70461.604833]  [<ffffffff8110fd94>] ? alloc_fd+0x69/0x10b
Sep  7 10:01:24 isis kernel: [70461.604835]  [<ffffffff810fa1b8>] ? do_sys_open+0x61/0xe8
Sep  7 10:01:24 isis kernel: [70461.604838]  [<ffffffff81339392>] ? system_call_fastpath+0x16/0x1b
Sep  7 10:01:24 isis kernel: [70461.604839] Code: 55 14 48 89 c6 4c 89 ef 41 ff 94 24 88 02 00 00 48 89 c3 48 85 c0 0f 84 b3 00 00 00 b8 00 00 00 00 48 83 7d 08 00 76 12 8b 55 2c <0f> b6 0c 03 00 4d 2b 48 ff c0 48 39 c2 77 f1 80 7d 2b 00 75 7c 
Sep  7 10:01:24 isis kernel: [70461.604855] RIP  [<ffffffffa042c479>] _nv022575rm+0x127/0x202 [nvidia]
Sep  7 10:01:24 isis kernel: [70461.604919]  RSP <ffff880604fbf978>
Sep  7 10:01:24 isis kernel: [70461.604920] CR2: ffffc90007d95000
Sep  7 10:01:24 isis kernel: [70461.604922] ---[ end trace fd4da73e7b31f3c8 ]---
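For anyone who wants to pull the equivalent excerpt out of their own log over ssh, here is a minimal sketch. It assumes the oops starts with a "BUG:" line and ends with the "end trace" marker, as in the excerpt above. The helper name extract_oops is my own invention, not part of any tool.

```shell
# Hypothetical helper: print every oops in a kernel log, from the
# " BUG: " line through the closing "---[ end trace" marker.
extract_oops() {
    sed -n '/ BUG: /,/---\[ end trace /p' "$1"
}
```

On Debian I would expect to call it as extract_oops /var/log/kern.log (as root, if the log is not world-readable); the path is an assumption and may differ on other setups.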
Similar posts on other forums have pointed to a possible ld bug (binutils). However, I'm certain I'm linking with the bfd variant, not gold:

Code:
$ ls -l /usr/bin/ld*
lrwxrwxrwx 1 root root       6 Sep  5 13:50 /usr/bin/ld -> ld.bfd
-rwxr-xr-x 1 root root  555184 Jan 25  2011 /usr/bin/ld.bfd
-rwxr-xr-x 1 root root 2066024 Jan 25  2011 /usr/bin/ld.gold
-rwxr-xr-x 1 root root    5271 Jan 23  2011 /usr/bin/ldd
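The listing above only covers /usr/bin. As a cross-check, the following sketch resolves whichever ld is first on PATH to its final target. It assumes GNU readlink (for the -f option), and resolve_tool is a made-up helper name.

```shell
# Hypothetical helper: follow symlinks to show which binary a tool
# name on PATH finally resolves to (e.g. ld -> ld.bfd or ld.gold).
resolve_tool() {
    tool_path=$(command -v "$1") || return 1
    readlink -f "$tool_path"
}
```

Going by the symlink shown above, resolve_tool ld should report the ld.bfd binary here, provided ld is not shadowed by an alias or shell function.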
My system details are in the log, but just to be explicit... My system is a Fujitsu R670-2 (12 x Xeon 5660, 12x4GB RAM) with a Fujitsu D2618-C1 motherboard (Intel 5520 chipset) and a single Tesla C2070 (no other GPUs installed).

I'd like to reiterate that I run into the same issue on several distros running several combinations of kernel, Xorg, and nvidia.ko.

I hope this is sufficient to help debug the problem, and that indeed a solution may soon be available... As you may have guessed from the presence of the C2070, nouveau is just not an option.

Best regards,

Chris
Attached Files
File Type: gz nvidia-bug-report.log.gz (40.4 KB, 73 views)