#25 | |
|
Registered User
Join Date: Nov 2008
Posts: 95
|
Sadly, the just-released 304.37 driver does not fix the problem (though I think the release notes say it is supposed to). The symptoms are slightly different, but X still locks up completely and requires a cold boot.
I didn't actually see the "GPU has fallen off the bus" message in the log, but the video did freeze after about a minute of gameplay in Crysis2 (the audio kept going, though). When I tried to close its window, its bumblebee X process froze, locking the main X server shortly afterwards. I *was* able to ssh into the machine, which is new, but I was unable to kill -9 any of the locked processes, and restarting lightdm failed, as did a reboot command. Only a hard reset fixed it. FWIW, I was running the 3.6-rc1 kernel.
Last edited by rockob; 08-13-12 at 09:20 PM. Reason: added info
|
|
|
|
|
|
#26 | |
|
Registered User
Join Date: Nov 2008
Posts: 95
|
I tried again with kernel 3.6-rc2, and nvidia 304.37 crashed about 30 seconds into the game. Again there were no Xid errors or "GPU has fallen off the bus" messages, but the kernel reported hung processes within the nvidia module. Eventually I had to hard reset the PC because it became completely unresponsive. Below is the kernel log for the hung processes:
Code:
Aug 18 16:45:29 sierra kernel: [69562.835272] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 304.37 Aug 18 16:46:48 sierra kernel: [69641.144875] NVRM: GPU at 0000:01:00: GPU-1b1589e9-15df-5ca5-919b-2f748fae640f ... Aug 18 16:51:45 sierra kernel: [69938.074719] INFO: task kworker/0:3:6594 blocked for more than 120 seconds. Aug 18 16:51:45 sierra kernel: [69938.074723] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Aug 18 16:51:45 sierra kernel: [69938.074725] kworker/0:3 D ffff88023e613dc0 0 6594 2 0x00000000 Aug 18 16:51:45 sierra kernel: [69938.074730] ffff8802215d7b00 0000000000000046 ffff88023117db40 ffff8802215d7fd8 Aug 18 16:51:45 sierra kernel: [69938.074735] ffff8802215d7fd8 ffff8802215d7fd8 ffff8800a7b596d0 ffff88023117db40 Aug 18 16:51:45 sierra kernel: [69938.074739] 0000000000000000 7fffffffffffffff ffff88023117db40 ffff880219675388 Aug 18 16:51:45 sierra kernel: [69938.074743] Call Trace: Aug 18 16:51:45 sierra kernel: [69938.074753] [<ffffffff81680979>] schedule+0x29/0x70 Aug 18 16:51:45 sierra kernel: [69938.074757] [<ffffffff8167ee1c>] schedule_timeout+0x1bc/0x280 Aug 18 16:51:45 sierra kernel: [69938.074761] [<ffffffff8167fc0b>] __down_common+0xa0/0xf7 Aug 18 16:51:45 sierra kernel: [69938.074765] [<ffffffff8167fcd5>] __down+0x1d/0x1f Aug 18 16:51:45 sierra kernel: [69938.074770] [<ffffffff8107fc01>] down+0x41/0x50 Aug 18 16:51:45 sierra kernel: [69938.074866] [<ffffffffa0f4f9f2>] os_acquire_mutex+0x42/0x50 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.074947] [<ffffffffa0f1fe25>] _nv014757rm+0x1c/0x21 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.075020] [<ffffffffa095c343>] ? _nv016374rm+0x6c/0x100 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.075132] [<ffffffffa0e1afb1>] ? _nv015315rm+0x211/0x358 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.075215] [<ffffffffa0f251a7>] ? _nv001080rm+0x298/0x97d [nvidia] Aug 18 16:51:45 sierra kernel: [69938.075301] [<ffffffffa0f28550>] ? 
rm_execute_work_item+0x4c/0xc2 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.075384] [<ffffffffa0f5043f>] ? os_execute_work_item+0x4f/0x90 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.075390] [<ffffffff810730f3>] ? process_one_work+0x143/0x500 Aug 18 16:51:45 sierra kernel: [69938.075474] [<ffffffffa0f503f0>] ? nv_printf+0x80/0x80 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.075480] [<ffffffff810748be>] ? worker_thread+0x16e/0x480 Aug 18 16:51:45 sierra kernel: [69938.075484] [<ffffffff81074750>] ? manage_workers.isra.21+0x2b0/0x2b0 Aug 18 16:51:45 sierra kernel: [69938.075488] [<ffffffff81079793>] ? kthread+0x93/0xa0 Aug 18 16:51:45 sierra kernel: [69938.075493] [<ffffffff8168ac04>] ? kernel_thread_helper+0x4/0x10 Aug 18 16:51:45 sierra kernel: [69938.075497] [<ffffffff81079700>] ? kthread_freezable_should_stop+0x70/0x70 Aug 18 16:51:45 sierra kernel: [69938.075501] [<ffffffff8168ac00>] ? gs_change+0x13/0x13 Aug 18 16:51:45 sierra kernel: [69938.075511] INFO: task Crysis2.exe:8829 blocked for more than 120 seconds. Aug 18 16:51:45 sierra kernel: [69938.075513] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
Aug 18 16:51:45 sierra kernel: [69938.075515] Crysis2.exe D ffff88023e793dc0 0 8829 8694 0x20020000 Aug 18 16:51:45 sierra kernel: [69938.075518] ffff88022152db28 0000000000200082 ffff880234df2da0 ffff88022152dfd8 Aug 18 16:51:45 sierra kernel: [69938.075522] ffff88022152dfd8 ffff88022152dfd8 ffff880234dc5b40 ffff880234df2da0 Aug 18 16:51:45 sierra kernel: [69938.075526] ffff88022152db18 7fffffffffffffff ffff880234df2da0 ffff880219675388 Aug 18 16:51:45 sierra kernel: [69938.075530] Call Trace: Aug 18 16:51:45 sierra kernel: [69938.075536] [<ffffffff81680979>] schedule+0x29/0x70 Aug 18 16:51:45 sierra kernel: [69938.075539] [<ffffffff8167ee1c>] schedule_timeout+0x1bc/0x280 Aug 18 16:51:45 sierra kernel: [69938.075544] [<ffffffff8167fc0b>] __down_common+0xa0/0xf7 Aug 18 16:51:45 sierra kernel: [69938.075548] [<ffffffff8167fcd5>] __down+0x1d/0x1f Aug 18 16:51:45 sierra kernel: [69938.075552] [<ffffffff8107fc01>] down+0x41/0x50 Aug 18 16:51:45 sierra kernel: [69938.075639] [<ffffffffa0f4f9f2>] os_acquire_mutex+0x42/0x50 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.075724] [<ffffffffa0f1fe25>] _nv014757rm+0x1c/0x21 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.075798] [<ffffffffa095c343>] ? _nv016374rm+0x6c/0x100 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.075868] [<ffffffffa0952bfb>] ? _nv014649rm+0x9/0x21 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.075938] [<ffffffffa0940ecd>] ? _nv001039rm+0xc5c/0xd59 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.076005] [<ffffffffa09410be>] ? _nv001073rm+0x73/0x2d09 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.076071] [<ffffffffa09395b8>] ? _nv000947rm+0x26/0x147 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.076156] [<ffffffffa0f1b0ed>] ? _nv001106rm+0x34d/0xaaf [nvidia] Aug 18 16:51:45 sierra kernel: [69938.076237] [<ffffffffa0f26ce6>] ? rm_ioctl+0x76/0x100 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.076318] [<ffffffffa0f4566d>] ? 
nv_kern_ioctl+0x14d/0x480 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.076398] [<ffffffffa0f459c1>] ? nv_kern_compat_ioctl+0x21/0x30 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.076403] [<ffffffff811d0051>] ? compat_sys_ioctl+0xd1/0x1330 Aug 18 16:51:45 sierra kernel: [69938.076407] [<ffffffff8101a2f9>] ? read_tsc+0x9/0x20 Aug 18 16:51:45 sierra kernel: [69938.076412] [<ffffffff810a57bc>] ? getnstimeofday+0x4c/0xe0 Aug 18 16:51:45 sierra kernel: [69938.076415] [<ffffffff810a58ba>] ? do_gettimeofday+0x1a/0x50 Aug 18 16:51:45 sierra kernel: [69938.076419] [<ffffffff810bf3f5>] ? compat_sys_time+0x25/0x70 Aug 18 16:51:45 sierra kernel: [69938.076424] [<ffffffff8168af26>] ? sysenter_dispatch+0x7/0x21 Aug 18 16:51:45 sierra kernel: [69938.076434] INFO: task kworker/0:0:9270 blocked for more than 120 seconds. Aug 18 16:51:45 sierra kernel: [69938.076436] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Aug 18 16:51:45 sierra kernel: [69938.076437] kworker/0:0 D ffff88023e613dc0 0 9270 2 0x00000000 Aug 18 16:51:45 sierra kernel: [69938.076441] ffff8801fdca7b00 0000000000000046 ffff8801f69016d0 ffff8801fdca7fd8 Aug 18 16:51:45 sierra kernel: [69938.076445] ffff8801fdca7fd8 ffff8801fdca7fd8 ffff88023117db40 ffff8801f69016d0 Aug 18 16:51:45 sierra kernel: [69938.076449] 00000000ffffffff 7fffffffffffffff ffff8801f69016d0 ffff880219675388 Aug 18 16:51:45 sierra kernel: [69938.076452] Call Trace: Aug 18 16:51:45 sierra kernel: [69938.076458] [<ffffffff81680979>] schedule+0x29/0x70 Aug 18 16:51:45 sierra kernel: [69938.076462] [<ffffffff8167ee1c>] schedule_timeout+0x1bc/0x280 Aug 18 16:51:45 sierra kernel: [69938.076466] [<ffffffff8109015c>] ? 
update_curr+0xfc/0x190 Aug 18 16:51:45 sierra kernel: [69938.076469] [<ffffffff8167fc0b>] __down_common+0xa0/0xf7 Aug 18 16:51:45 sierra kernel: [69938.076474] [<ffffffff8167fcd5>] __down+0x1d/0x1f Aug 18 16:51:45 sierra kernel: [69938.076478] [<ffffffff8107fc01>] down+0x41/0x50 Aug 18 16:51:45 sierra kernel: [69938.076561] [<ffffffffa0f4f9f2>] os_acquire_mutex+0x42/0x50 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.076645] [<ffffffffa0f1fe25>] _nv014757rm+0x1c/0x21 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.076718] [<ffffffffa095c343>] ? _nv016374rm+0x6c/0x100 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.076833] [<ffffffffa0e1afb1>] ? _nv015315rm+0x211/0x358 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.076917] [<ffffffffa0f251a7>] ? _nv001080rm+0x298/0x97d [nvidia] Aug 18 16:51:45 sierra kernel: [69938.077000] [<ffffffffa0f28550>] ? rm_execute_work_item+0x4c/0xc2 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.077083] [<ffffffffa0f5043f>] ? os_execute_work_item+0x4f/0x90 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.077087] [<ffffffff810730f3>] ? process_one_work+0x143/0x500 Aug 18 16:51:45 sierra kernel: [69938.077167] [<ffffffffa0f503f0>] ? nv_printf+0x80/0x80 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.077172] [<ffffffff810748be>] ? worker_thread+0x16e/0x480 Aug 18 16:51:45 sierra kernel: [69938.077175] [<ffffffff81074750>] ? manage_workers.isra.21+0x2b0/0x2b0 Aug 18 16:51:45 sierra kernel: [69938.077179] [<ffffffff81079793>] ? kthread+0x93/0xa0 Aug 18 16:51:45 sierra kernel: [69938.077184] [<ffffffff8168ac04>] ? kernel_thread_helper+0x4/0x10 Aug 18 16:51:45 sierra kernel: [69938.077189] [<ffffffff81079700>] ? kthread_freezable_should_stop+0x70/0x70 Aug 18 16:51:45 sierra kernel: [69938.077192] [<ffffffff8168ac00>] ? gs_change+0x13/0x13 Aug 18 16:51:45 sierra kernel: [69938.077195] INFO: task kworker/0:4:9686 blocked for more than 120 seconds. 
Aug 18 16:51:45 sierra kernel: [69938.077196] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Aug 18 16:51:45 sierra kernel: [69938.077198] kworker/0:4 D ffff88023e613dc0 0 9686 2 0x00000000 Aug 18 16:51:45 sierra kernel: [69938.077201] ffff880103959b00 0000000000000046 ffff8800865a16d0 ffff880103959fd8 Aug 18 16:51:45 sierra kernel: [69938.077205] ffff880103959fd8 ffff880103959fd8 ffff8800a7b596d0 ffff8800865a16d0 Aug 18 16:51:45 sierra kernel: [69938.077208] 6db8926917d28c35 7fffffffffffffff ffff8800865a16d0 ffff880219675388 Aug 18 16:51:45 sierra kernel: [69938.077212] Call Trace: Aug 18 16:51:45 sierra kernel: [69938.077217] [<ffffffff81680979>] schedule+0x29/0x70 Aug 18 16:51:45 sierra kernel: [69938.077221] [<ffffffff8167ee1c>] schedule_timeout+0x1bc/0x280 Aug 18 16:51:45 sierra kernel: [69938.077225] [<ffffffff81327383>] ? cpumask_next_and+0x23/0x40 Aug 18 16:51:45 sierra kernel: [69938.077229] [<ffffffff81091ee3>] ? update_sd_lb_stats+0x133/0x610 Aug 18 16:51:45 sierra kernel: [69938.077233] [<ffffffff8167fc0b>] __down_common+0xa0/0xf7 Aug 18 16:51:45 sierra kernel: [69938.077237] [<ffffffff8167fcd5>] __down+0x1d/0x1f Aug 18 16:51:45 sierra kernel: [69938.077241] [<ffffffff8107fc01>] down+0x41/0x50 Aug 18 16:51:45 sierra kernel: [69938.077320] [<ffffffffa0f4f9f2>] os_acquire_mutex+0x42/0x50 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.077401] [<ffffffffa0f1fe25>] _nv014757rm+0x1c/0x21 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.077473] [<ffffffffa095c343>] ? _nv016374rm+0x6c/0x100 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.077585] [<ffffffffa0e1afb1>] ? _nv015315rm+0x211/0x358 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.077668] [<ffffffffa0f251a7>] ? _nv001080rm+0x298/0x97d [nvidia] Aug 18 16:51:45 sierra kernel: [69938.077752] [<ffffffffa0f28550>] ? rm_execute_work_item+0x4c/0xc2 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.077834] [<ffffffffa0f5043f>] ? 
os_execute_work_item+0x4f/0x90 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.077838] [<ffffffff810730f3>] ? process_one_work+0x143/0x500 Aug 18 16:51:45 sierra kernel: [69938.077845] [<ffffffff8108e3af>] ? __dequeue_entity+0x2f/0x50 Aug 18 16:51:45 sierra kernel: [69938.077926] [<ffffffffa0f503f0>] ? nv_printf+0x80/0x80 [nvidia] Aug 18 16:51:45 sierra kernel: [69938.077936] [<ffffffff810748be>] ? worker_thread+0x16e/0x480 Aug 18 16:51:45 sierra kernel: [69938.077939] [<ffffffff81074750>] ? manage_workers.isra.21+0x2b0/0x2b0 Aug 18 16:51:45 sierra kernel: [69938.077944] [<ffffffff81079793>] ? kthread+0x93/0xa0 Aug 18 16:51:45 sierra kernel: [69938.077948] [<ffffffff8168ac04>] ? kernel_thread_helper+0x4/0x10 Aug 18 16:51:45 sierra kernel: [69938.077952] [<ffffffff81079700>] ? kthread_freezable_should_stop+0x70/0x70 Aug 18 16:51:45 sierra kernel: [69938.077955] [<ffffffff8168ac00>] ? gs_change+0x13/0x13 Aug 18 16:52:02 sierra kernel: [69955.149205] iwlwifi 0000:03:00.0: fail to flush all tx fifo queues Aug 18 16:52:06 sierra kernel: [69959.163935] iwlwifi 0000:03:00.0: fail to flush all tx fifo queues Aug 18 16:52:12 sierra kernel: [69963.693042] ------------[ cut here ]------------ |
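For anyone comparing logs like the one above, here is a minimal Python sketch that pulls the hung-task reports out of a kernel log. It only relies on the "INFO: task NAME:PID blocked for more than N seconds" wording visible in the log; the function name `hung_tasks` is made up for illustration and is not part of any tool.

```python
import re

# Matches the hung-task notices seen in the log above, e.g.
#   INFO: task kworker/0:3:6594 blocked for more than 120 seconds.
# The task name may itself contain colons, so the PID is taken to be
# the last colon-separated number before " blocked".
HUNG_RE = re.compile(r"INFO: task (\S+?):(\d+) blocked for more than (\d+) seconds")

def hung_tasks(log_text):
    """Return a list of (task_name, pid, seconds) tuples, in log order."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in HUNG_RE.finditer(log_text)
    ]

if __name__ == "__main__":
    import sys
    # Usage (hypothetical): python hung_tasks.py < /var/log/kern.log
    for name, pid, secs in hung_tasks(sys.stdin.read()):
        print(f"{name} (pid {pid}) blocked for more than {secs}s")
```

This only lists which tasks hung; it does not inspect the call traces, so whether a given task was waiting inside the nvidia module still has to be read from the trace itself.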
|
|
|
|
|
|
#27 |
|
Registered User
Join Date: Nov 2008
Posts: 95
|
With kernel 3.5.2, nvidia crashed with the Xid 13 error:
Code:
Aug 19 12:42:19 sierra kernel: [ 123.987843] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 304.37 Wed Aug 8 19:52:48 PDT 2012
Aug 19 12:42:21 sierra kernel: [ 125.723395] NVRM: GPU at 0000:01:00: GPU-1b1589e9-15df-5ca5-919b-2f748fae640f
Aug 19 12:42:39 sierra kernel: [ 144.518015] NET: Registered protocol family 4
Aug 19 12:46:41 sierra kernel: [ 385.798978] audit_printk_skb: 34 callbacks suppressed
Aug 19 12:46:41 sierra kernel: [ 385.798981] type=1400 audit(1345351601.264:29): apparmor="DENIED" operation="capable" parent=1 profile="/usr/sbin/cupsd" pid=1491 comm="cupsd" pid=1491 comm="cupsd" capability=36 capname="block_suspend"
Aug 19 12:51:34 sierra kernel: [ 678.458951] NVRM: Xid (0000:01:00): 13, 0006 00000000 00009197 00002390 3fb33333 00000000
Aug 19 12:51:36 sierra kernel: [ 680.462162] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Aug 19 12:51:38 sierra kernel: [ 682.461573] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
|
|
|
|
|
#28 | |
|
Registered User
Join Date: Nov 2008
Posts: 95
|
And here's nvidia 304.37 crashing CoD with an Xid 31 error after less than ten seconds of gameplay (304.37 really is dreadfully buggy). I've included another stack trace from the hung nvidia process:
Code:
Aug 19 12:30:49 sierra kernel: [13496.844061] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 304.37 Wed Aug 8 19:52:48 PDT 2012 Aug 19 12:30:49 sierra acpid: client connected from 7757[0:1005] Aug 19 12:30:49 sierra acpid: 1 client rule loaded Aug 19 12:30:50 sierra kernel: [13497.790195] NVRM: GPU at 0000:01:00: GPU-1b1589e9-15df-5ca5-919b-2f748fae640f Aug 19 12:30:50 sierra acpid: client connected from 7757[0:1005] Aug 19 12:30:50 sierra acpid: 1 client rule loaded Aug 19 12:32:57 sierra kernel: [13624.986361] NVRM: Xid (0000:01:00): 31, Ch 00000006, engmask 00000101, intr 10000000 Aug 19 12:32:59 sierra kernel: [13626.989554] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context Aug 19 12:33:01 sierra kernel: [13628.988959] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context Aug 19 12:33:30 sierra kernel: [13657.393254] BUG: soft lockup - CPU#7 stuck for 22s! [iw5sp.exe:7892] Aug 19 12:33:30 sierra kernel: [13657.393257] Modules linked in: ipx p8022 psnap llc p8023 nvidia(PO) hidp snd_hrtimer pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) bbswitch(O) bnep rfcomm parport_pc ppdev nfsd nfs_acl auth_rpcgss nfs fscache lockd sunrpc binfmt_misc dm_crypt arc4 iptable_filter iwldvm xt_owner ip_tables x_tables mac80211 snd_hda_codec_hdmi snd_hda_codec_realtek dell_laptop dcdbas psmouse snd_hda_intel iwlwifi snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer lpc_ich dell_wmi snd_seq_device mei sparse_keymap cfg80211 joydev hid_generic snd soundcore mac_hid coretemp kvm_intel kvm btusb bluetooth microcode wmi snd_page_alloc serio_raw lp parport btrfs zlib_deflate libcrc32c ses enclosure uas usb_storage usbhid hid ghash_clmulni_intel aesni_intel ablk_helper cryptd aes_x86_64 i915 r8169 drm_kms_helper drm i2c_algo_bit video [last unloaded: nvidia] Aug 19 12:33:30 sierra kernel: [13657.393302] CPU 7 Aug 19 12:33:30 sierra kernel: [13657.393304] Pid: 7892, comm: 
iw5sp.exe Tainted: P O 3.6.0-rc2-git-20120817.1001 #15 Dell Inc. Dell System XPS L502X/0NJT03 Aug 19 12:33:30 sierra kernel: [13657.393306] RIP: 0010:[<ffffffffa075ba94>] [<ffffffffa075ba94>] _nv014655rm+0x64/0x1c2 [nvidia] Aug 19 12:33:30 sierra kernel: [13657.393370] RSP: 0018:ffff880095eadb00 EFLAGS: 00000246 Aug 19 12:33:30 sierra kernel: [13657.393371] RAX: 0000000000000000 RBX: 0000000000000007 RCX: 0000000000000007 Aug 19 12:33:30 sierra kernel: [13657.393372] RDX: 0000000000000080 RSI: 00000000000054a8 RDI: ffff88008d38a008 Aug 19 12:33:30 sierra kernel: [13657.393373] RBP: ffff880036432b30 R08: 0000000000070000 R09: 0000000000000000 Aug 19 12:33:30 sierra kernel: [13657.393374] R10: ffff880036432b30 R11: ffffffffa0d58f4b R12: 0000000000070000 Aug 19 12:33:30 sierra kernel: [13657.393375] R13: 0000000000000000 R14: ffff880036432b30 R15: ffffffffa0d58f4b Aug 19 12:33:30 sierra kernel: [13657.393376] FS: 0000000081fd8000(0063) GS:ffff88023e7c0000(006b) knlGS:00000000175cfb40 Aug 19 12:33:30 sierra kernel: [13657.393377] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 Aug 19 12:33:30 sierra kernel: [13657.393378] CR2: 00007fc1c7e65bf0 CR3: 0000000036103000 CR4: 00000000000407e0 Aug 19 12:33:30 sierra kernel: [13657.393379] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Aug 19 12:33:30 sierra kernel: [13657.393380] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Aug 19 12:33:30 sierra kernel: [13657.393381] Process iw5sp.exe (pid: 7892, threadinfo ffff880095eac000, task ffff880095eb0000) Aug 19 12:33:30 sierra kernel: [13657.393382] Stack: Aug 19 12:33:30 sierra kernel: [13657.393383] ffffffffa0afc0b3 0000000000000000 ffff88008d38a008 0000000000000045 Aug 19 12:33:30 sierra kernel: [13657.393385] ffffffffa07b9c06 ffff88008d38a008 ffff88020bf65808 ffff880084e61408 Aug 19 12:33:30 sierra kernel: [13657.393387] ffffffffa07b9e21 0000000000000002 0000000000001340 ffff88020b92ff90 Aug 19 12:33:30 sierra kernel: [13657.393389] 
Call Trace: Aug 19 12:33:30 sierra kernel: [13657.393503] [<ffffffffa0afc0b3>] ? _nv009368rm+0x9bf/0xdf8 [nvidia] Aug 19 12:33:30 sierra kernel: [13657.393567] [<ffffffffa07b9c06>] ? _nv002323rm+0x2de/0x30a [nvidia] Aug 19 12:33:30 sierra kernel: [13657.393629] [<ffffffffa07b9e21>] ? _nv002027rm+0x1ef/0x205 [nvidia] Aug 19 12:33:30 sierra kernel: [13657.393724] [<ffffffffa09c711e>] ? _nv005899rm+0x6dd/0x707 [nvidia] Aug 19 12:33:30 sierra kernel: [13657.393819] [<ffffffffa09c9ef4>] ? _nv006015rm+0xc7/0x61b [nvidia] Aug 19 12:33:30 sierra kernel: [13657.393913] [<ffffffffa09c9ea2>] ? _nv006015rm+0x75/0x61b [nvidia] Aug 19 12:33:30 sierra kernel: [13657.394054] [<ffffffffa0b3b332>] ? _nv010267rm+0xad/0x256 [nvidia] Aug 19 12:33:30 sierra kernel: [13657.394180] [<ffffffffa0b47af1>] ? _nv010265rm+0x106/0x432 [nvidia] Aug 19 12:33:30 sierra kernel: [13657.394289] [<ffffffffa0ab0f65>] ? _nv008173rm+0x24e/0x4a9 [nvidia] Aug 19 12:33:30 sierra kernel: [13657.394400] [<ffffffffa0abcc0a>] ? _nv008168rm+0x17b/0x4e0 [nvidia] Aug 19 12:33:30 sierra kernel: [13657.394443] [<ffffffffa074be34>] ? _nv001073rm+0x1de9/0x2d09 [nvidia] Aug 19 12:33:30 sierra kernel: [13657.394486] [<ffffffffa0749f94>] ? _nv001039rm+0xd23/0xd59 [nvidia] Aug 19 12:33:30 sierra kernel: [13657.394529] [<ffffffffa074a0be>] ? _nv001073rm+0x73/0x2d09 [nvidia] Aug 19 12:33:30 sierra kernel: [13657.394571] [<ffffffffa07425b8>] ? _nv000947rm+0x26/0x147 [nvidia] Aug 19 12:33:30 sierra kernel: [13657.394627] [<ffffffffa0d240ed>] ? _nv001106rm+0x34d/0xaaf [nvidia] Aug 19 12:33:30 sierra kernel: [13657.394681] [<ffffffffa0d2fce6>] ? rm_ioctl+0x76/0x100 [nvidia] Aug 19 12:33:30 sierra kernel: [13657.394734] [<ffffffffa0d4e66d>] ? nv_kern_ioctl+0x14d/0x480 [nvidia] Aug 19 12:33:30 sierra kernel: [13657.394786] [<ffffffffa0d4e9c1>] ? nv_kern_compat_ioctl+0x21/0x30 [nvidia] Aug 19 12:33:30 sierra kernel: [13657.394789] [<ffffffff811d0051>] ? 
compat_sys_ioctl+0xd1/0x1330 Aug 19 12:33:30 sierra kernel: [13657.394792] [<ffffffff8101a2f9>] ? read_tsc+0x9/0x20 Aug 19 12:33:30 sierra kernel: [13657.394795] [<ffffffff810a57bc>] ? getnstimeofday+0x4c/0xe0 Aug 19 12:33:30 sierra kernel: [13657.394797] [<ffffffff810a58ba>] ? do_gettimeofday+0x1a/0x50 Aug 19 12:33:30 sierra kernel: [13657.394799] [<ffffffff810bf3f5>] ? compat_sys_time+0x25/0x70 Aug 19 12:33:30 sierra kernel: [13657.394802] [<ffffffff8168af26>] ? sysenter_dispatch+0x7/0x21 Aug 19 12:33:30 sierra kernel: [13657.394803] Code: 72 1e bf 00 00 00 00 e8 6f d7 5c 00 48 89 c2 be 01 00 00 00 bf 00 00 00 00 e8 74 f9 00 00 eb 06 89 77 6c 89 4f 70 48 83 c4 08 c3 <41> 54 53 48 83 ec 08 41 89 f4 39 77 6c 73 0f 39 77 70 77 2d 39 |
|
|
|
|
|
|
#29 | |
|
Registered User
Join Date: Feb 2008
Posts: 163
|
Quote:
The driver is stable here; perhaps it's the beta kernel causing your issues.
__________________
leigh123linux |
|
|
|
|
|
|
#30 | |
|
Registered User
Join Date: Jun 2006
Posts: 28
|
Same problem here.
Date: Sun Aug 19 09:24:57 EDT 2012
uname: Linux cheetah 3.2.0-29-generic #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
/usr/bin/nvidia-smi -q
Code:
==============NVSMI LOG==============
Timestamp : Sun Aug 19 09:25:01 2012
Driver Version : 304.37
Attached GPUs : 1
GPU 0000:05:00.0
Product Name : GeForce GTX 460
Display Mode : N/A
Persistence Mode : Disabled
Driver Model
Code:
[145338.556620] NVRM: Xid (0000:05:00): 13, 0001 00000000 0000902d 000008dc 000000c1 00000000
[145338.658939] NVRM: Xid (0000:05:00): 13, 0001 00000000 0000902d 000008dc 000000c1 00000000
[145535.026418] NVRM: Xid (0000:05:00): 13, 0001 00000000 0000902d 000008dc 00000177 00000000
[145723.827410] NVRM: Xid (0000:05:00): 13, 0001 00000000 0000902d 00000804 000000cf 00000000
[145723.890722] NVRM: Xid (0000:05:00): 31, Ch 00000001, engmask 00000101, intr 10000000
[145724.086261] NVRM: Xid (0000:05:00): 31, Ch 00000001, engmask 00000101, intr 10000000
[145725.184692] NVRM: Xid (0000:05:00): 31, Ch 00000001, engmask 00000101, intr 10000000
[145725.486887] NVRM: Xid (0000:05:00): 31, Ch 00000001, engmask 00000101, intr 10000000
[145727.254452] NVRM: Xid (0000:05:00): 31, Ch 00000001, engmask 00000101, intr 10000000
[145737.715373] NVRM: Xid (0000:05:00): 13, 0001 00000000 0000902d 000008dc 00000030 00000000
[145975.957569] NVRM: Xid (0000:05:00): 13, 0001 00000000 0000902d 0000024c 000001d5 00000000
[145979.051476] NVRM: Xid (0000:05:00): 13, 0001 00000000 0000902d 000008dc 000000a0 00000000
[146067.753928] NVRM: Xid (0000:05:00): 31, Ch 00000001, engmask 00000101, intr 10000000
[146067.830989] NVRM: Xid (0000:05:00): 31, Ch 00000001, engmask 00000101, intr 10000000
[146067.967256] NVRM: Xid (0000:05:00): 31, Ch 00000001, engmask 00000101, intr 10000000
[146068.175696] NVRM: Xid (0000:05:00): 31, Ch 00000001, engmask 00000101, intr 10000000
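Since several posters here are seeing a mix of Xid 13 and Xid 31, a quick tally per GPU can make logs easier to compare. The following is a rough Python sketch, assuming only the "NVRM: Xid (PCI address): CODE," format shown in the logs in this thread; `count_xids` is an illustrative name, not an NVIDIA tool.

```python
import re
from collections import Counter

# Matches lines such as:
#   NVRM: Xid (0000:05:00): 13, 0001 00000000 0000902d ...
#   NVRM: Xid (0000:05:00): 31, Ch 00000001, engmask 00000101, ...
XID_RE = re.compile(r"NVRM: Xid \(([0-9a-fA-F:.]+)\): (\d+),")

def count_xids(log_text):
    """Return a Counter mapping (pci_address, xid_code) to occurrences."""
    counts = Counter()
    for match in XID_RE.finditer(log_text):
        pci_address, code = match.group(1), int(match.group(2))
        counts[(pci_address, code)] += 1
    return counts

if __name__ == "__main__":
    import sys
    # Usage (hypothetical): dmesg | python count_xids.py
    for (addr, code), n in sorted(count_xids(sys.stdin.read()).items()):
        print(f"GPU {addr}: Xid {code} x {n}")
```

The tally says nothing about the cause of each Xid; it only shows how often each code appears, which is handy when comparing driver or kernel versions.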
|
|
|
|
|
|
#31 | |
|
Registered User
Join Date: Nov 2008
Posts: 95
|
Quote:
Also see http://www.nvnews.net/vbulletin/showthread.php?t=177732, where lesos reports that 304.37 still crashes. And of course flammon has just reported the same thing (see the post just before this one).
|
|
|
|
|
|
#32 |
|
Registered User
Join Date: Nov 2008
Posts: 95
|
Wow. I just managed to crash it while the game was sitting on the pause screen... and this time the syslog reports that the GPU fell off the bus:
Code:
Aug 20 15:31:59 sierra kernel: [ 3150.215111] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 304.37 Wed Aug 8 19:52:48 PDT 2012 Aug 20 15:32:00 sierra kernel: [ 3151.268955] NVRM: GPU at 0000:01:00: GPU-1b1589e9-15df-5ca5-919b-2f748fae640f Aug 20 15:43:44 sierra kernel: [ 3854.428642] NVRM: Xid (0000:01:00): 13, 0006 00000000 00009197 000002ec 00000060 00000000 Aug 20 15:43:46 sierra kernel: [ 3856.432654] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context Aug 20 15:43:48 sierra kernel: [ 3858.442633] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context Aug 20 15:43:52 sierra kernel: [ 3862.441401] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context Aug 20 15:44:09 sierra kernel: [ 3879.560214] [drm:__gen6_gt_force_wake_get] *ERROR* Force wake wait timed out Aug 20 15:44:09 sierra kernel: [ 3879.656446] [drm:__gen6_gt_force_wake_get] *ERROR* Force wake wait timed out Aug 20 15:44:09 sierra kernel: [ 3879.704560] [drm:__gen6_gt_wait_for_thread_c0] *ERROR* GT thread status wait timed out Aug 20 15:44:11 sierra kernel: [ 3881.474027] NVRM: GPU at 0000:01:00.0 has fallen off the bus. Aug 20 15:44:11 sierra kernel: [ 3881.475310] [sched_delayed] sched: RT throttling activated Aug 20 15:46:16 sierra kernel: [ 4007.020459] BUG: soft lockup - CPU#5 stuck for 22s! 
[Crysis2.exe:4615] Aug 20 15:46:16 sierra kernel: [ 4007.020463] Modules linked in: nvidia(PO) hid_generic hidp hid snd_hrtimer pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) bbswitch(O) bnep rfcomm parport_pc ppdev nfsd nfs_acl auth_rpcgss nfs fscache lockd binfmt_misc sunrpc dm_crypt iptable_filter xt_owner ip_tables x_tables snd_hda_codec_hdmi snd_hda_codec_realtek arc4 iwldvm mac80211 snd_hda_intel snd_hda_codec joydev btusb bluetooth snd_hwdep snd_pcm snd_seq_midi iwlwifi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device coretemp kvm_intel snd psmouse kvm dell_wmi cfg80211 microcode mei serio_raw sparse_keymap mac_hid soundcore dell_laptop snd_page_alloc lpc_ich dcdbas wmi lp parport btrfs zlib_deflate libcrc32c ghash_clmulni_intel aesni_intel ablk_helper cryptd aes_x86_64 i915 r8169 drm_kms_helper drm i2c_algo_bit video [last unloaded: nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.020519] CPU 5 Aug 20 15:46:16 sierra kernel: [ 4007.020522] Pid: 4615, comm: Crysis2.exe Tainted: P O 3.6.0-rc2-git-20120817.1001 #15 Dell Inc. 
Dell System XPS L502X/0NJT03 Aug 20 15:46:16 sierra kernel: [ 4007.020523] RIP: 0010:[<ffffffffa072cad5>] [<ffffffffa072cad5>] _nv014655rm+0xa5/0x1c2 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.020594] RSP: 0018:ffff88012c849768 EFLAGS: 00200202 Aug 20 15:46:16 sierra kernel: [ 4007.020595] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000007 Aug 20 15:46:16 sierra kernel: [ 4007.020596] RDX: 0000000000000080 RSI: 00000000000054a8 RDI: ffff88022133e008 Aug 20 15:46:16 sierra kernel: [ 4007.020597] RBP: ffff88021c80abc0 R08: 0000000000070000 R09: 0000000000000000 Aug 20 15:46:16 sierra kernel: [ 4007.020598] R10: ffff88021c80abc0 R11: ffffffffa0d29f4b R12: 0000000000070000 Aug 20 15:46:16 sierra kernel: [ 4007.020598] R13: 0000000000000000 R14: ffff88021c80abc0 R15: ffffffffa0d29f4b Aug 20 15:46:16 sierra kernel: [ 4007.020600] FS: 0000000000000000(0000) GS:ffff88023e740000(0000) knlGS:0000000000000000 Aug 20 15:46:16 sierra kernel: [ 4007.020601] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b Aug 20 15:46:16 sierra kernel: [ 4007.020602] CR2: 000000000256c8c0 CR3: 0000000001c0b000 CR4: 00000000000407e0 Aug 20 15:46:16 sierra kernel: [ 4007.020603] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Aug 20 15:46:16 sierra kernel: [ 4007.020603] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Aug 20 15:46:16 sierra kernel: [ 4007.020605] Process Crysis2.exe (pid: 4615, threadinfo ffff88012c848000, task ffff88012caa0000) Aug 20 15:46:16 sierra kernel: [ 4007.020605] Stack: Aug 20 15:46:16 sierra kernel: [ 4007.020606] ffff88022133e008 ffff88022133e008 ffff88021c80abc0 ffffffffa0acd0b3 Aug 20 15:46:16 sierra kernel: [ 4007.020609] 0000000000000000 ffff88022133e008 0000000000000045 ffffffffa078ac06 Aug 20 15:46:16 sierra kernel: [ 4007.020610] ffff88022133e008 ffff880220e2b808 ffff88022faf0408 ffffffffa078ae21 Aug 20 15:46:16 sierra kernel: [ 4007.020612] Call Trace: Aug 20 15:46:16 sierra kernel: [ 4007.020723] 
[<ffffffffa0acd0b3>] ? _nv009368rm+0x9bf/0xdf8 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.020784] [<ffffffffa078ac06>] ? _nv002323rm+0x2de/0x30a [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.020843] [<ffffffffa078ae21>] ? _nv002027rm+0x1ef/0x205 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.020935] [<ffffffffa099811e>] ? _nv005899rm+0x6dd/0x707 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.021027] [<ffffffffa099aef4>] ? _nv006015rm+0xc7/0x61b [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.021118] [<ffffffffa099aea2>] ? _nv006015rm+0x75/0x61b [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.021224] [<ffffffffa0a82ba0>] ? _nv008695rm+0x10e/0x1a0 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.021331] [<ffffffffa0a98fe9>] ? _nv008846rm+0x164/0x396 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.021394] [<ffffffffa0796cc7>] ? _nv002402rm+0x1b/0x20 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.021456] [<ffffffffa0799d59>] ? _nv002377rm+0x2e5/0x310 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.021517] [<ffffffffa0799b12>] ? _nv002377rm+0x9e/0x310 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.021624] [<ffffffffa0a967f5>] ? _nv008107rm+0x2e2/0x3f4 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.021729] [<ffffffffa0a96d8b>] ? _nv008109rm+0x87/0xb7 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.021835] [<ffffffffa0a8dbad>] ? _nv008168rm+0x11e/0x4e0 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.021877] [<ffffffffa071ce34>] ? _nv001073rm+0x1de9/0x2d09 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.021918] [<ffffffffa071af94>] ? _nv001039rm+0xd23/0xd59 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.021959] [<ffffffffa071b033>] ? _nv016414rm+0xe/0x26 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.022000] [<ffffffffa071b54f>] ? _nv001073rm+0x504/0x2d09 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.022041] [<ffffffffa071af94>] ? _nv001039rm+0xd23/0xd59 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.022082] [<ffffffffa071b033>] ? 
_nv016414rm+0xe/0x26 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.022122] [<ffffffffa071b2d7>] ? _nv001073rm+0x28c/0x2d09 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.022163] [<ffffffffa071af94>] ? _nv001039rm+0xd23/0xd59 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.022204] [<ffffffffa071b007>] ? _nv016416rm+0x3d/0x5b [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.022256] [<ffffffffa0cfea77>] ? _nv001082rm+0xdf/0x1c3 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.022308] [<ffffffffa0d0100c>] ? rm_free_unused_clients+0x98/0x12d [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.022312] [<ffffffff8107fc42>] ? up+0x32/0x50 Aug 20 15:46:16 sierra kernel: [ 4007.022363] [<ffffffffa0d2033d>] ? nv_kern_ctl_close+0x7d/0x130 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.022413] [<ffffffffa0d212d3>] ? nv_kern_close+0x3b3/0x440 [nvidia] Aug 20 15:46:16 sierra kernel: [ 4007.022415] [<ffffffff8118117c>] ? __fput+0xec/0x240 Aug 20 15:46:16 sierra kernel: [ 4007.022417] [<ffffffff811812de>] ? ____fput+0xe/0x10 Aug 20 15:46:16 sierra kernel: [ 4007.022420] [<ffffffff8107680a>] ? task_work_run+0x6a/0x80 Aug 20 15:46:16 sierra kernel: [ 4007.022423] [<ffffffff8105b2b1>] ? do_exit+0x861/0x8d0 Aug 20 15:46:16 sierra kernel: [ 4007.022426] [<ffffffff81189c9a>] ? pipe_read+0x38a/0x530 Aug 20 15:46:16 sierra kernel: [ 4007.022427] [<ffffffff8105b67f>] ? do_group_exit+0x3f/0xa0 Aug 20 15:46:16 sierra kernel: [ 4007.022430] [<ffffffff8106b161>] ? get_signal_to_deliver+0x1d1/0x620 Aug 20 15:46:16 sierra kernel: [ 4007.022434] [<ffffffff810132bf>] ? do_signal+0x3f/0x610 Aug 20 15:46:16 sierra kernel: [ 4007.022436] [<ffffffff81013938>] ? do_notify_resume+0x88/0xc0 Aug 20 15:46:16 sierra kernel: [ 4007.022439] [<ffffffff81689e22>] ? 
int_signal+0x12/0x17 Aug 20 15:46:16 sierra kernel: [ 4007.022440] Code: 6c 73 05 39 77 70 77 1c bf 00 00 00 00 e8 28 d7 5c 00 48 89 c2 be 01 00 00 00 bf 00 00 00 00 e8 2d f9 00 00 b8 00 00 00 00 eb 15 <8b> 5f 6c e8 4e ff ff ff 48 89 c7 89 da 44 89 e6 e8 fa 09 00 00 |
|
|
|
|
|
#33 |
|
Registered User
Join Date: Nov 2008
Posts: 95
|
Switching the driver to MSI interrupts (NVreg_EnableMSI=1 in a modprobe conf file) and setting the nice value of the wine task to -20 *might* help. I even managed to get 4 minutes of gameplay out of Crysis2 (although the norm is still 60 seconds), and CoD seems less likely to crash as well.
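One way to confirm whether the MSI setting actually took effect is to look at the nvidia row in /proc/interrupts. Below is a small Python sketch of that check. It assumes the row mentions "nvidia" and that an MSI interrupt type contains the string "MSI" (for example "PCI-MSI-edge"); the exact label varies between kernel versions, so treat the result as a hint only. `nvidia_irq_lines` is a made-up helper name.

```python
def nvidia_irq_lines(interrupts_text):
    """Return (line, uses_msi) pairs for /proc/interrupts rows mentioning nvidia.

    uses_msi is True when the row's text contains "MSI"; this is a
    heuristic based on how the kernel usually labels MSI interrupts.
    """
    rows = []
    for line in interrupts_text.splitlines():
        if "nvidia" in line:
            rows.append((line.strip(), "MSI" in line))
    return rows

if __name__ == "__main__":
    with open("/proc/interrupts") as f:
        for line, uses_msi in nvidia_irq_lines(f.read()):
            print("MSI" if uses_msi else "legacy/other", "->", line)
```

If no row is printed, the nvidia module most likely is not loaded at that moment (with bumblebee it is only loaded while the discrete GPU is in use).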
|
|
|
|
|
|
#34 |
|
Registered User
Join Date: Nov 2008
Posts: 95
|
Scratch that, I don't think it helps. It was probably just luck that I managed to get 4 minutes of gameplay.
|
|
|
|
|
|
#35 |
|
Registered User
Join Date: Apr 2003
Posts: 20
|
OK, obviously such symptoms could be the result of different underlying problems, but...
in my case, what *completely* cured the random freezes and Xid messages (after weeks of painful experimentation) was removing all kernel modules that have to do with thermal sensors. I also disabled all the relevant plugins in gkrellm, to prevent such modules from being loaded automatically. The system is rock solid now, running for 2+ days straight under KDE+composite without a single glitch. I suspect that some such program or kernel module is periodically polling or generating interrupts under the nvidia driver's nose, messing up the interface. It would be nice if someone could debug this to the end, though.
Hope this helps,
Bactrimel
__________________
CentOS 6 + KDE 4 GeForce GTS 450 |
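To try the sensor-module idea above, it helps to first see which such modules are loaded. This is a hedged Python sketch that filters /proc/modules by name. The name list is an assumption: coretemp appears in the module lists posted earlier in this thread, while k10temp and it87 are other common hardware-monitoring modules added as guesses. The post above does not say which modules were actually removed, so adjust the list for your own hardware.

```python
# Candidate sensor modules (assumption, not taken from the post above).
SENSOR_MODULES = ("coretemp", "k10temp", "it87")

def sensor_modules(proc_modules_text, candidates=SENSOR_MODULES):
    """Return loaded module names that exactly match one of the candidates.

    Each non-empty line of /proc/modules starts with the module name,
    followed by its size, use count, and other fields.
    """
    names = [
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    ]
    return [name for name in names if name in candidates]

if __name__ == "__main__":
    with open("/proc/modules") as f:
        found = sensor_modules(f.read())
    print("Loaded sensor modules:", ", ".join(found) or "none")
```

Any module it reports can then be unloaded or blacklisted by hand to test whether the freezes go away; the script itself changes nothing on the system.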
|
|
|
|
|
#36 |
|
Registered User
Join Date: Nov 2008
Posts: 95
|
Which modules did you disable, and how? E.g., did you boot with the thermal.off=1 option?