Old 09-07-09, 11:00 AM   #517
Ochi
Registered User
 
Join Date: Mar 2004
Posts: 5
Re: 180.* graphical corruption and freezes system

Hi everyone,

Xid freezes like these occur for me once every few days or even weeks, but they do happen. They seem to occur most frequently when Second Life is running (foreground or background). I'm using the 185.18.36 drivers with a GeForce 8800 GTS 640 MB on an ASUS P5KC running Arch Linux i686 (kernel 2.6.30-ARCH, Xorg 1.6.3). Temperature shouldn't be a problem, and I'm not overclocking or anything. The system was always rock solid until the last few driver releases, but I can't pin the problem on a specific version because the errors happen so rarely for me.

The last time the freeze occurred I found two things:

1. The screen powered by the GeForce went black while typing something in SL. Another screen powered by another card and driver was still on but X was unusable. And so was the keyboard (dead caps lock, etc.).

2. Interestingly, I was able to ssh into the machine this time and extract this from dmesg/kernel.log/everything.log:

Code:
Sep  7 16:19:58 cerberus kernel: NVRM: Xid (0001:00): 6, PE0003 
Sep  7 16:23:35 cerberus kernel: INFO: task Xorg:4135 blocked for more than 120 seconds.
Sep  7 16:23:35 cerberus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep  7 16:23:35 cerberus kernel: Xorg          D 00203046     0  4135   4133
Sep  7 16:23:35 cerberus kernel: f688ac00 00203086 0000000d 00203046 10fd7b89 00000000 f642f000 00000000
Sep  7 16:23:35 cerberus kernel: 00000000 f6400fc0 00001b16 c0542ac0 f688ada8 c053f1a4 c0542ac0 d5ce0d99
Sep  7 16:23:35 cerberus kernel: 00001b16 f688ada8 c0542ac0 c012b37f 00000000 f5f85194 00000001 00000001
Sep  7 16:23:35 cerberus kernel: Call Trace:
Sep  7 16:23:35 cerberus kernel: [<c012b37f>] ? __wake_up_common+0x5f/0xa0
Sep  7 16:23:35 cerberus kernel: [<c03cfc40>] ? schedule+0x20/0x50
Sep  7 16:23:35 cerberus kernel: [<c03d0095>] ? schedule_timeout+0x155/0x1c0
Sep  7 16:23:35 cerberus kernel: [<c012d1ad>] ? __wake_up_sync+0x1d/0x40
Sep  7 16:23:35 cerberus kernel: [<c03afff8>] ? unix_write_space+0x48/0x90
Sep  7 16:23:35 cerberus kernel: [<c032b848>] ? sock_wfree+0x68/0x80
Sep  7 16:23:35 cerberus kernel: [<c03cf143>] ? wait_for_common+0xa3/0x140
Sep  7 16:23:35 cerberus kernel: [<c0136670>] ? default_wake_function+0x0/0x30
Sep  7 16:23:35 cerberus kernel: [<f9a4bc2f>] ? os_acquire_sema+0x8f/0xa0 [nvidia]
Sep  7 16:23:35 cerberus kernel: [<f997c288>] ? _nv005242rm+0x9/0xd [nvidia]
Sep  7 16:23:35 cerberus kernel: [<f99c4db7>] ? _nv003839rm+0x65/0x135 [nvidia]
Sep  7 16:23:35 cerberus kernel: [<f990960c>] ? _nv007074rm+0x18/0x29 [nvidia]
Sep  7 16:23:35 cerberus kernel: [<f9734139>] ? _nv003699rm+0x4f1/0x52a [nvidia]
Sep  7 16:23:35 cerberus kernel: [<f998390a>] ? rm_ioctl+0x3e/0x6d [nvidia]
Sep  7 16:23:35 cerberus kernel: [<f9a47d2a>] ? nv_kern_ioctl+0x14a/0x4d0 [nvidia]
Sep  7 16:23:35 cerberus kernel: [<c01cff6d>] ? do_sync_read+0xed/0x140
Sep  7 16:23:35 cerberus kernel: [<f9a48115>] ? nv_kern_unlocked_ioctl+0x25/0x40 [nvidia]
Sep  7 16:23:35 cerberus kernel: [<f9a480f0>] ? nv_kern_unlocked_ioctl+0x0/0x40 [nvidia]
Sep  7 16:23:35 cerberus kernel: [<c01dfbc2>] ? vfs_ioctl+0x22/0xa0
Sep  7 16:23:35 cerberus kernel: [<c01dfcc9>] ? do_vfs_ioctl+0x89/0x5a0
Sep  7 16:23:35 cerberus kernel: [<c0158edd>] ? hrtimer_start+0x2d/0x50
Sep  7 16:23:35 cerberus kernel: [<c024017b>] ? security_file_permission+0x1b/0x40
Sep  7 16:23:35 cerberus kernel: [<c01d0032>] ? rw_verify_area+0x72/0x100
Sep  7 16:23:35 cerberus kernel: [<c01d0fc3>] ? vfs_read+0x123/0x190
Sep  7 16:23:35 cerberus kernel: [<c020462a>] ? sys_inotify_init+0x2a/0x40
Sep  7 16:23:35 cerberus kernel: [<c01e026e>] ? sys_ioctl+0x8e/0xb0
Sep  7 16:23:35 cerberus kernel: [<c0103c93>] ? sysenter_do_call+0x12/0x28
Sep  7 16:23:35 cerberus kernel: [<c020462a>] ? sys_inotify_init+0x2a/0x40
By the way: is there a definitive overview of the Xid error codes anywhere? Something aimed at end users, I mean.
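For anyone sifting through a trace like the one above, here is a minimal sketch (not an NVIDIA tool, just an illustration) that lists which frames of a hung-task call trace belong to a given kernel module. The function name `module_frames` and the regular expression are assumptions made for this example; the pattern is written against the line format shown in the log above and may need adjusting for other kernels.

```python
import re

# Matches call-trace frames such as:
#   [<f9a4bc2f>] ? os_acquire_sema+0x8f/0xa0 [nvidia]
# The trailing "[module]" part is optional (core kernel frames lack it).
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\] \??\s*([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)

def module_frames(log_text, module="nvidia"):
    """Return the symbol names of all trace frames that belong to `module`."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m and m.group(3) == module:
            frames.append(m.group(2))
    return frames

# Three lines copied from the trace above.
sample = """\
Sep  7 16:23:35 cerberus kernel: [<c03cfc40>] ? schedule+0x20/0x50
Sep  7 16:23:35 cerberus kernel: [<f9a4bc2f>] ? os_acquire_sema+0x8f/0xa0 [nvidia]
Sep  7 16:23:35 cerberus kernel: [<f998390a>] ? rm_ioctl+0x3e/0x6d [nvidia]
"""

print(module_frames(sample))  # ['os_acquire_sema', 'rm_ioctl']
```

Run against the full trace, this makes it easier to see that Xorg is blocked inside the driver's ioctl path (`nv_kern_ioctl`, `rm_ioctl`, `os_acquire_sema`) rather than in generic kernel code.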
Old 10-19-09, 07:09 PM   #518
zaskar
Registered User
 
Join Date: Oct 2009
Posts: 2
Re: 180.* graphical corruption and freezes system

I'm having exactly the same symptoms as many of you. The only difference is that in my case everything worked until 3 days ago (it had been fine for more than a year with many driver versions), and then it suddenly started doing this. I think it's a hardware problem, but as I said, I have the same symptoms as most of you.
The strange thing is that my PC works perfectly with the vesa or xorg nv driver. The problem only shows up when I install the NVIDIA drivers. But maybe it isn't the driver's fault; it could be a hardware defect in some function that the other drivers never call.
My card is a GeForce 8400 GS, in a Dell Inspiron 1420 with 4 GB RAM and an Intel Core 2 Duo T7250.
My system is a heavily tuned Slackware 12.0-based GNU/Linux. The problem shows up with kernels 2.6.24.4, 2.6.30.5 and 2.6.31.4 (all 32-bit SMP; I haven't tested any others), and with every driver version that compiles against them (stable and beta, from 173.xx to 190.40). Installing the latest Xorg server (1.5.2 and related pieces such as the libs) and Mesa (7.6) before reinstalling the NVIDIA driver made my system a little more stable: the problem takes longer to appear, but it eventually does.
Installing the driver on a live 64-bit Kubuntu 9.x also produces the same problem.
When it happens the keyboard doesn't work, but sometimes I can get into my PC from my cellphone through a Bluetooth server I wrote, or with ssh from another PC. The kernel errors vary with the driver version, but they always involve NVRM and some strange Xid error codes. Examples:
Usually: NVRM: Xid (0001:00): 6, PE0001 ... repeated 8 times
Sometimes this alternates with lines like
NVRM: Xid (0001:00): 6, PE0002
or
NVRM: Xid (0001:00): 4, Ch 0000007f SC 00000000 M 00000000 Data 00000000
and also, with some driver versions:
X[2827]: segfault at 4 ip b7f62fef sp bfd2d4d8 error 4 in ld-2.5.so[b7f58000+1b000]
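Since the messages differ from one driver version to the next, it can help to tally them before comparing runs. The following is a minimal sketch that counts Xid events per code and first detail token in a saved kernel log. The name `tally_xids` and the regular expression are assumptions made for this example, based only on the message formats quoted above; the sample text is illustrative, not a real log excerpt.

```python
import re
from collections import Counter

# Matches messages such as:
#   NVRM: Xid (0001:00): 6, PE0001
#   NVRM: Xid (0001:00): 4, Ch 0000007f SC 00000000 M 00000000 Data 00000000
# Captures the device ID, the numeric Xid code, and the first detail token.
XID_RE = re.compile(r"NVRM: Xid \(([0-9a-fA-F:.]+)\): (\d+), (\S+)")

def tally_xids(log_text):
    """Count Xid events, keyed by (xid_code, first_detail_token)."""
    counts = Counter()
    for m in XID_RE.finditer(log_text):
        counts[(int(m.group(2)), m.group(3))] += 1
    return counts

# Illustrative input built from the message formats quoted above.
sample = """\
NVRM: Xid (0001:00): 6, PE0001
NVRM: Xid (0001:00): 6, PE0001
NVRM: Xid (0001:00): 6, PE0002
NVRM: Xid (0001:00): 4, Ch 0000007f SC 00000000 M 00000000 Data 00000000
"""

for (code, detail), n in sorted(tally_xids(sample).items()):
    print(f"Xid {code} {detail}: {n}")
```

The tally only says which messages dominate; it does not explain what a given Xid code means.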
Kernel parameters such as the HPET timer options don't help. Module parameters to prevent PowerMizer clock changes don't help either. My system boots to a terminal and I launch Xorg manually with startx. Sometimes it comes up and then crashes, sometimes it crashes while starting, and sometimes it fails to start and drops back to the tty after a few minutes.
This PC model is known to have temperature issues, but mine runs a program that checks the temperature every second and slows things down when it reaches certain limits, so that may not be the cause. When the problem first showed up, the temperature was under 70 degrees.
If there's anything I can do to get more information about the problem, let me know. If there's any way to check whether this is a hardware or a software problem, I'll be glad to hear it. My PC doesn't dual boot, so I can't test in M$ Windows.
All times are GMT -5.