nV News Forums (http://www.nvnews.net/vbulletin/index.php)
-   NVIDIA Linux (http://www.nvnews.net/vbulletin/forumdisplay.php?f=14)
-   -   X crash and system freeze with 295.40 (http://www.nvnews.net/vbulletin/showthread.php?t=178362)

tv.debian 04-16-12 05:42 AM

X crash and system freeze with 295.40
 
1 Attachment(s)
Hi, on Debian amd64 with kernel 3.3.2, the 295.40 drivers and the KDE desktop, I am experiencing X freezes and/or crashes, leading to a full system freeze if the computer is left running.

In the log I get only:

NVRM: GPU at 0000:01:00.0 has fallen off the bus.

Since the upgrade I also get various segfaults, in vlc and virtualbox to name a few. System freezes happen whether those applications are used or not, so it must be a separate issue.

The computer remains reachable over ssh for a short time window. I managed to run nvidia-bug-report during a crash, report attached.

I have been suffering various crashes for the last three releases. Bug reports are dismissed and receive no apparent attention, which is very frustrating. The NVIDIA drivers are a pain and a threat to my system, and with support nearly nonexistent they are becoming unbearable.

I am seriously considering changing my graphics card for one with proper support, preferably with open drivers... It will be worth every dime to avoid suffering this instability and then having to talk to a wall for support. :mad:

welle 04-16-12 06:05 AM

Re: X crash and system freeze with 295.40
 
1 Attachment(s)
Hi, I'm having the same problem, but I'm using Gentoo Linux. I'm using an NVIDIA 8500GT card, also with kernel 3.3.2. I'm seeing random crashes, most of the time when VDPAU is in use. It seems the regression was introduced in nvidia-drivers-295.33, as the problems started with that driver version. I'm now using 295.40 and the same crashes still happen. 290.10 was working normally and I didn't have any crashes with that version. Please track down the regression and fix this bug. The attachment is the X11 log. The whole PC was unresponsive and I could only turn it off by pressing the power button (no output on screen anymore), but I also had crashes that froze the whole PC. This bug is for sure connected to the NVIDIA driver, as I don't get crashes with 290.10.

I'm also thinking about switching to another graphics card, as a lot of bugs in the NVIDIA driver don't get fixed. It seems you have to be a big company to get support. The context-switching problem of NVIDIA cards with VDPAU is still present, and I reported that error some time ago. I got one answer, and afterwards the devs just ignored the posts and didn't answer anymore. Nice to know that a company I recommended for a long time is just ignoring its customers :(

sandipt 04-20-12 06:26 AM

Re: X crash and system freeze with 295.40
 
NVIDIA internal bug to track this issue: Bug ID 973068


NVRM: GPU at 0000:01:00.0 has fallen off the bus.

Backtrace:
[ 39835.282] 0: /usr/bin/X (xorg_backtrace+0x28) [0x563b38]
[ 39835.282] 1: /usr/bin/X (0x400000+0x167619) [0x567619]
[ 39835.282] 2: /lib64/libpthread.so.0 (0x7f2b9646d000+0x10b30) [0x7f2b9647db30]
[ 39835.282] 3: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7f2b90151000+0x4a20a6) [0x7f2b905f30a6]
[ 39835.282] 4: /usr/bin/X (miPointerSetPosition+0x140) [0x54ff70]
[ 39835.282] 5: /usr/bin/X (GetPointerEvents+0x3f5) [0x449995]
[ 39835.283] 6: /usr/bin/X (QueuePointerEvents+0x1d) [0x44a19d]
[ 39835.283] 7: /usr/bin/X (xf86PostMotionEventP+0x3c) [0x48199c]
[ 39835.283] 8: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7f2b8f51d000+0x48da) [0x7f2b8f5218da]
[ 39835.283] 9: /usr/bin/X (0x400000+0x6d5d7) [0x46d5d7]
[ 39835.283] 10: /usr/bin/X (0x400000+0x91e96) [0x491e96]
[ 39835.283] 11: /lib64/libpthread.so.0 (0x7f2b9646d000+0x10b30) [0x7f2b9647db30]
[ 39835.283] 12: /usr/bin/X (0x400000+0x167aa0) [0x567aa0]
[ 39835.283] 13: /lib64/libpthread.so.0 (0x7f2b9646d000+0x10b30) [0x7f2b9647db30]
[ 39835.283] 14: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7f2b90151000+0xf661c) [0x7f2b9024761c]
[ 39835.283] 15: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7f2b90151000+0xfd61e) [0x7f2b9024e61e]
[ 39835.283] 16: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7f2b90151000+0x4d6b92) [0x7f2b90627b92]
[ 39835.283] 17: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7f2b90151000+0x4d74d5) [0x7f2b906284d5]
[ 39835.283] 18: /usr/lib64/xorg/modules/drivers/nvidia_drv.so (0x7f2b90151000+0x4d767d) [0x7f2b9062867d]
[ 39835.283] 19: /usr/bin/X (0x400000+0xf76f3) [0x4f76f3]
[ 39835.283] 20: /usr/bin/X (0x400000+0xc8bcf) [0x4c8bcf]
[ 39835.283] 21: /usr/bin/X (0x400000+0xc9dc5) [0x4c9dc5]
[ 39835.284] 22: /usr/bin/X (0x400000+0x352e1) [0x4352e1]
[ 39835.284] 23: /usr/bin/X (0x400000+0x2485a) [0x42485a]
[ 39835.284] 24: /lib64/libc.so.6 (__libc_start_main+0xfd) [0x7f2b9539009d]
[ 39835.284] 25: /usr/bin/X (0x400000+0x243f9) [0x4243f9]
[ 39835.284] Segmentation fault at address 0xfffffffffaf8a860
[ 39835.284]
Fatal server error:
[ 39835.284] Caught signal 11 (Segmentation fault). Server aborting
[ 39835.284]
[ 39835.290]
Please consult the The X.Org Foundation support
at http://wiki.x.org
for help.
[ 39835.290] Please also check the log file at "/var/log/Xorg.0.log" for additional information.
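
For anyone comparing backtraces across driver versions, here is a minimal Python sketch that splits frames like the ones above into module, base, offset and address, and checks that base + offset equals the reported address when the base is numeric. The names `parse_frame` and `frames_per_module` are made up for illustration, and the regular expression is only an assumption fitted to the frame layout quoted in this post. It is not an official X.Org or NVIDIA tool.

```python
import re
from collections import Counter

# Fitted to frames as quoted in this thread, for example:
#   [ 39835.282] 3: /path/nvidia_drv.so (0x7f2b90151000+0x4a20a6) [0x7f2b905f30a6]
#   [ 39835.282] 4: /usr/bin/X (miPointerSetPosition+0x140) [0x54ff70]
FRAME_RE = re.compile(
    r"\]\s*(?P<index>\d+):\s+(?P<module>\S+)\s+"
    r"\((?P<base>[^+()]+)\+0x(?P<offset>[0-9a-fA-F]+)\)\s+"
    r"\[0x(?P<addr>[0-9a-fA-F]+)\]"
)


def parse_frame(line):
    """Return a dict describing one backtrace frame, or None if the line is not a frame."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    base = match.group("base")
    offset = int(match.group("offset"), 16)
    addr = int(match.group("addr"), 16)
    # The sum can only be checked when the base is a load address, not a symbol name.
    consistent = None
    if base.startswith("0x"):
        consistent = int(base, 16) + offset == addr
    return {
        "index": int(match.group("index")),
        "module": match.group("module"),
        "base": base,
        "offset": offset,
        "addr": addr,
        "consistent": consistent,
    }


def frames_per_module(lines):
    """Count how many backtrace frames belong to each module."""
    counts = Counter()
    for line in lines:
        frame = parse_frame(line)
        if frame is not None:
            counts[frame["module"]] += 1
    return counts
```

Feeding the lines of an Xorg.0.log through `frames_per_module` gives a quick count of how many frames sit inside nvidia_drv.so versus the X server itself. The offsets are only comparable between logs taken with the same driver build.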

tv.debian 04-21-12 04:16 AM

Re: X crash and system freeze with 295.40
 
2 Attachment(s)
It has happened several times since the initial report, the last time after the classic:
"NVRM: GPU at 0000:01:00.0 has fallen off the bus."

The system remained functional (sort of) for a while. I got interrupt messages in the syslog and kern.log (attached), and pretty much any application would segfault (vlc, chromium ...). After a few minutes the X server froze, and I ran nvidia-debug (attached) before the system locked up completely.

I tested reverting to previous drivers. The issue appeared after 290.10, which is the last known working driver for me (and many others).

eskuai 04-21-12 05:17 AM

Re: X crash and system freeze with 295.40
 
Hello

Can you test with persistence mode?

# /usr/bin/nvidia-smi -pm 1
# /usr/bin/nvidia-smi -q | grep -i Persistence
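
If you want to check the result from a script rather than by eye, the grep above could be sketched in Python roughly as follows. This assumes that `nvidia-smi -q` prints a line of the form "Persistence Mode : Enabled" (or "Disabled") for each GPU. The helper names `read_query_output` and `persistence_enabled` are made up for this example.

```python
import re
import subprocess


def read_query_output():
    """Run `nvidia-smi -q` and return its text output (needs the NVIDIA driver and tool installed)."""
    result = subprocess.run(
        ["nvidia-smi", "-q"], capture_output=True, text=True, check=True
    )
    return result.stdout


def persistence_enabled(query_output):
    """Return True if every 'Persistence Mode' line reads Enabled, False if any does not,
    and None if no such line is found (assumed line format, see the note above)."""
    states = re.findall(
        r"^\s*Persistence Mode\s*:\s*(\S+)",
        query_output,
        flags=re.MULTILINE | re.IGNORECASE,
    )
    if not states:
        return None
    return all(state.lower() == "enabled" for state in states)
```

Calling `persistence_enabled(read_query_output())` after `nvidia-smi -pm 1` should then tell you whether the setting took effect. A result of None means the expected line was not found, so the output format assumption does not hold on that system.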



http://www.cyberciti.biz/faq/debian-...allen-off-bus/

tv.debian 04-21-12 05:41 AM

Re: X crash and system freeze with 295.40
 
Interesting link, thanks. nvidia-smi isn't part of a standard Debian NVIDIA install, so I installed it and am testing right now.
I'll let the system run for a while and will stress-test it later today.

tv.debian 04-21-12 07:24 AM

Re: X crash and system freeze with 295.40
 
No luck with persistence mode. This time just starting a video player froze the system... :(

The typical "GPU has fallen off the bus" message, then:

vlc[12830]: segfault at 8 ip 00007fd33cc9a4cb sp 00007fffa2b67b48 error 4 in libQtDBus.so.4.7.4[7fd33cc58000+76000]
NVRM: GPU at 0000:01:00.0 has fallen off the bus.
NVRM: Xid (0000:01:00): 3, C 00000007 SC 00000007 M 00001ffc Data ffffffff

So this isn't the fix for this case. The "Nouveau" driver does work, albeit with low performance, and so does 290.10. I guess I'll have to patch 290.10 for the security hole fixed in 295.40, and hope it keeps working with my X server until I get a new video card or a fixed driver.

bjordan555 05-21-12 06:07 PM

Re: X crash and system freeze with 295.40
 
I have this happening on 64-bit Ubuntu 12.04 with an NVIDIA Quadro FX 3700 using driver version 295.40. I have enabled persistence mode following the instructions at http://www.cyberciti.biz/faq/debian-...allen-off-bus/ and will report back with any results.

Lucidor 05-22-12 05:04 PM

Re: X crash and system freeze with 295.40
 
I'm experiencing this problem on openSUSE 12.1, and earlier on openSUSE 11.4. On 11.4 the system would freeze and require a hard reset. On 12.1, so far I just get thrown back to the login screen. This issue was the reason I upgraded my OS, but it didn't solve the problem.

From /var/log/messages:
May 22 23:13:26 linux-h5yp kernel: [97450.832648] NVRM: Xid (0000:02:00): 6, PE0001
May 22 23:13:28 linux-h5yp kdm[3723]: X server for display :0 terminated unexpectedly
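
To collect these events from /var/log/messages or dmesg output, a small Python sketch like the following could pull out the NVRM lines. The patterns are an assumption fitted to the "fallen off the bus" and "Xid" lines quoted in this thread, and `parse_nvrm_line` is a made-up helper name. It only extracts the fields and does not try to interpret what a given Xid number means.

```python
import re

# Fitted to lines as quoted in this thread, for example:
#   NVRM: Xid (0000:02:00): 6, PE0001
#   NVRM: GPU at 0000:01:00.0 has fallen off the bus.
XID_RE = re.compile(
    r"NVRM: Xid \((?P<device>[0-9a-fA-F:.]+)\): (?P<xid>\d+), (?P<detail>.*)"
)
OFF_BUS_RE = re.compile(
    r"NVRM: GPU at (?P<device>[0-9a-fA-F:.]+) has fallen off the bus"
)


def parse_nvrm_line(line):
    """Return a dict for an NVRM Xid or 'fallen off the bus' line, else None."""
    match = XID_RE.search(line)
    if match:
        return {
            "kind": "xid",
            "device": match.group("device"),
            "xid": int(match.group("xid")),
            "detail": match.group("detail").strip(),
        }
    match = OFF_BUS_RE.search(line)
    if match:
        return {"kind": "off_bus", "device": match.group("device")}
    return None
```

Running every line of a kernel log through `parse_nvrm_line` and keeping the results that are not None gives a compact list of events to paste into a bug report next to the driver version.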

blujay 05-26-12 05:34 PM

Re: X crash and system freeze with 295.40
 
I've been running Kubuntu on this Dell XPS M1330 laptop with an NVIDIA 8400M for four years with no problems. Suddenly, when I upgraded from Oneiric (11.10) to Precise (12.04), I started getting crashes where X freezes at 100% CPU. The screen is frozen: sometimes the cursor still moves, sometimes it disappears. I can SSH in and see X at 100% CPU. I cannot switch to another VT. Sometimes I can use SAK (SysRq+K) to kill X and restart KDM, and sometimes I can't, and I have to power off.

It seems to happen when playing YouTube videos in Flash in Firefox. I haven't had it crash at any other time so far.

It may be minutes, hours, or days between crashes.

I've tried 295.40 and 295.49, and both exhibit this freeze/crash.

This is what I see in dmesg:

Code:

[35437.091037] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
[35437.091072] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
[35438.672076] irq 16: nobody cared (try booting with the "irqpoll" option)
[35438.672082] Pid: 0, comm: BFS/0 Tainted: P        C O 3.3.6-pf-adp+ #6
[35438.672085] Call Trace:
[35438.672093]  [<c154ddf6>] ? printk+0x2d/0x2f
[35438.672099]  [<c10b4139>] __report_bad_irq+0x29/0xd0
[35438.672103]  [<c10b43ae>] note_interrupt+0x11e/0x1d0
[35438.672209]  [<f9bce853>] ? nv_kern_isr+0x33/0x70 [nvidia]
[35438.672213]  [<c10b21ee>] handle_irq_event_percpu+0x9e/0x200
[35438.672217]  [<c10269e8>] ? default_spin_lock_flags+0x8/0x10
[35438.672221]  [<c15560ad>] ? _raw_spin_lock_irqsave+0x2d/0x40
[35438.672225]  [<c10b238b>] handle_irq_event+0x3b/0x60
[35438.672228]  [<c10b4c20>] ? unmask_irq+0x30/0x30
[35438.672231]  [<c10b4c6e>] handle_fasteoi_irq+0x4e/0xd0
[35438.672233]  <IRQ>  [<c155d3b2>] ? do_IRQ+0x42/0xc0
[35438.672240]  [<c1084f2a>] ? tick_notify+0x2ca/0x3e0
[35438.672244]  [<c155d2f0>] ? common_interrupt+0x30/0x38
[35438.672247]  [<c1556035>] ? _raw_spin_unlock_irqrestore+0x15/0x20
[35438.672251]  [<c108462d>] ? clockevents_notify+0x3d/0x100
[35438.672255]  [<c13168d5>] ? lapic_timer_state_broadcast+0x36/0x39
[35438.672259]  [<c13169eb>] ? acpi_idle_enter_simple+0x113/0x133
[35438.672263]  [<c1448d9d>] ? cpuidle_idle_call+0xad/0x250
[35438.672267]  [<c100174c>] ? cpu_idle+0x9c/0xe0
[35438.672271]  [<c1531825>] ? rest_init+0x5d/0x68
[35438.672275]  [<c17f5745>] ? start_kernel+0x357/0x35d
[35438.672278]  [<c17f517f>] ? loglevel+0x2b/0x2b
[35438.672281]  [<c17f5078>] ? i386_start_kernel+0x78/0x7d
[35438.672283] handlers:
[35438.672366] [<f9bce820>] nv_kern_isr
[35438.672369] Disabling IRQ #16

I've never had a problem like this until "upgrading" to these newer drivers that come with 12.04 Precise. My laptop is now totally unreliable.

mark27q1 06-02-12 05:52 AM

Re: X crash and system freeze with 295.40
 
Can anyone comment on whether 295.53 fixes this? It has recently become available in Debian Wheezy and I am contemplating an upgrade... So far I'm stuck on 290.10, which is the last version that worked properly. I'm on Debian Wheezy with a GeForce 9800 GTX+, and I've been seeing the problem since the first of the 295 versions.

QBANIN 06-02-12 06:31 AM

Re: X crash and system freeze with 295.40
 
Quote:

Originally Posted by mark27q1 (Post 2560833)
Can anyone comment on whether 295.53 fixes this?

Nope :(


All times are GMT -5.