nV News Forums (http://www.nvnews.net/vbulletin/index.php)
-   NVIDIA Linux (http://www.nvnews.net/vbulletin/forumdisplay.php?f=14)
-   -   Various hangs requiring reboot with GTX 275 896MB (http://www.nvnews.net/vbulletin/showthread.php?t=154201)

bones_was_here 08-18-10 05:47 AM

Various hangs requiring reboot with GTX 275 896MB
 
1 Attachment(s)
There seem to be problems on my machine with pretty much every driver version. I'm currently using 256.35, but it also happens with 256.44. I have also had problems with pre-256 drivers (195, 190, 185, 180), but I can't use any of those versions now because my current display locks at an eye-burning 100% brightness with them. The display is a 120 Hz Alienware, but the problems also happened with my previous (60 Hz) display. I have tried x8 and x16 PCI-E 2.0 slots on several different mainboards (DFI T3eH6, Asus P6T, currently Gigabyte X58A-UD3R in the x8 slot).

Message signaled interrupts are enabled for the card, but disabling that doesn't help.

I am currently using KDE 4.4.5 with compiz, but the hangs can still occur even when using kwin with compositing disabled.

The hangs rarely happen shortly after a reboot; usually they happen after the system has been running for 3-7 days (7 days is pushing it, usually it has already happened by then). The most common trigger seems to be task switching between a game and some other application. This happens with both native games like Nexuiz and Quake 3, and games running under Wine, e.g. Left 4 Dead.

I believe a related symptom is that a task switch away from a game sometimes causes all applications to redraw extremely slowly. Sometimes this leads to a complete hang; other times it is possible to switch back to the game, and then things continue to run normally.

Also related (I believe): after a few days it usually becomes impossible to rmmod nvidia (for example, to change driver versions), even if there have been no problems so far and Xorg has stopped cleanly. The module is allegedly still in use, but ps shows no instances of Xorg.
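A minimal sketch of one way to see what might still be holding the module, assuming a Linux /proc filesystem and that the driver's device nodes live under /dev/nvidia*. The function names are hypothetical and this is not part of any NVIDIA tool. It reads the module's use count from /proc/modules and lists processes that still have a device node open.

```python
import os

def module_refcount(name, modules_text):
    """Return the use count of module `name` from /proc/modules text.

    Returns None if the module is not listed or the count is not numeric.
    """
    for line in modules_text.splitlines():
        fields = line.split()
        # /proc/modules columns: name, size, use count, dependents, state, address
        if len(fields) >= 3 and fields[0] == name:
            try:
                return int(fields[2])
            except ValueError:
                return None
    return None

def holders_of(prefix="/dev/nvidia", proc="/proc"):
    """Return {pid: command} for processes with an open fd whose target starts with `prefix`."""
    holders = {}
    for pid in filter(str.isdigit, os.listdir(proc)):
        base = os.path.join(proc, pid)
        try:
            fds = os.listdir(os.path.join(base, "fd"))
            if any(os.readlink(os.path.join(base, "fd", fd)).startswith(prefix)
                   for fd in fds):
                with open(os.path.join(base, "cmdline"), "rb") as f:
                    argv0 = f.read().split(b"\0")[0]
                holders[int(pid)] = argv0.decode(errors="replace")
        except OSError:
            continue  # process exited, or permission denied (run as root to see everything)
    return holders

if __name__ == "__main__":
    try:
        with open("/proc/modules") as f:
            print("nvidia use count:", module_refcount("nvidia", f.read()))
        print("holders of /dev/nvidia*:", holders_of())
    except OSError as err:
        print("could not read /proc:", err)
```

If the use count is non-zero while no process shows up, the reference may be held somewhere this sketch does not look (for example a memory mapping without an open descriptor, or inside the kernel), so treat an empty result as a hint rather than proof.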

When the hangs occur, sometimes the mouse cursor can still be moved and switching to a VT might be possible; other times there's no response to any input. However, the system is always still reachable via ssh, and I have observed a number of different strange behaviours when the system is in this state:
  • Xorg is fully utilising one CPU core; this happens pretty much every time
  • kill -9 is required to stop Xorg; again pretty much every time
  • often, it's not possible to rmmod nvidia even when ps shows there are no instances of Xorg (module is apparently still in use)
  • if the module is still in use and I attempt to rmmod -f nvidia, the kernel has an oops and startx fails until after a reboot - but often, after trying to rmmod -f nvidia, the system will hang if I try to reboot and a power cycle is required
  • sometimes it is impossible to kill Xorg, even with repeated signal 9's
  • sometimes the GPU seems to lock up completely, and it isn't even possible to do a chvt (the command just hangs until ^c)
I can't always access another machine to ssh from, so I'll try to find a working laptop so that I can get a post-crash bug report (or any other info that might be useful) next time it happens.
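A rough sketch of something that could be run over ssh while the machine is in this state, assuming a Linux /proc filesystem and that the X server's command name is "Xorg" (it may be "X" on some systems). The helper names are hypothetical. It reports the scheduler state of each matching process: a process that does not die on signal 9 is typically in state D (uninterruptible sleep inside the kernel), while one that is spinning on a CPU core would normally show R.

```python
import os

def parse_stat(stat_text):
    """Split /proc/<pid>/stat text into (pid, command, state).

    The command is wrapped in parentheses and may itself contain spaces or
    parentheses, so split on the last ')' rather than on whitespace.
    """
    head, _, tail = stat_text.rpartition(")")
    pid, _, comm = head.partition("(")
    return int(pid), comm, tail.split()[0]

def find_processes(name, proc="/proc"):
    """Return [(pid, state)] for every process whose command name equals `name`."""
    found = []
    for entry in filter(str.isdigit, os.listdir(proc)):
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while we were reading, or odd content
        if comm == name:
            found.append((pid, state))
    return found

if __name__ == "__main__":
    if os.path.isdir("/proc"):
        # "Xorg" is an assumption; adjust to the actual server binary name.
        for pid, state in find_processes("Xorg"):
            print(pid, state)
```

Recording the state alongside the other symptoms might help distinguish a server stuck waiting in the driver from one busy-looping.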

tadawson 08-18-10 04:13 PM

Re: Various hangs requiring reboot with GTX 275 896MB
 
For what it's worth, I have almost the exact same problem with the 2.6.xx kernels, pretty much any nvidia driver over the last year or so, and dual 9800GTX+ cards on an Asus M3N-HT motherboard. Oh, and X server is Xorg 1.4.2 . . .

- Tim

Deanjo 08-18-10 07:20 PM

Re: Various hangs requiring reboot with GTX 275 896MB
 
Just personal experience, but this doesn't occur here with an M3N-HT motherboard and an MSI GTX-275 (with two other 8800GTs crunching away on OpenCL tasks), running well over a few weeks without needing a reboot. It's running KDE 4.5 on openSUSE 11.3 64-bit.

bones_was_here 08-18-10 07:39 PM

Re: Various hangs requiring reboot with GTX 275 896MB
 
Quote:

Originally Posted by Deanjo (Post 2304329)
Just personal experience, but this doesn't occur here with an M3N-HT motherboard and an MSI GTX-275 (with two other 8800GTs crunching away on OpenCL tasks), running well over a few weeks without needing a reboot. It's running KDE 4.5 on openSUSE 11.3 64-bit.

Hm.. after running for a few weeks, does rmmod nvidia succeed cleanly after shutting down the applications that use the GPU?

Deanjo 08-18-10 09:41 PM

Re: Various hangs requiring reboot with GTX 275 896MB
 
Quote:

Originally Posted by bones_was_here (Post 2304334)
Hm.. after running for a few weeks, does rmmod nvidia succeed cleanly after shutting down the applications that use the GPU?

Yup, it seems to here.

bones_was_here 08-26-10 07:46 AM

Re: Various hangs requiring reboot with GTX 275 896MB
 
Currently, my uptime is a little over 8 days, which is rather unusual. However, task switching away from games is currently broken: when I try, all rendering hangs for a few seconds until the game suddenly dies (Received signal 11, exiting...) and the Xorg log shows one of these: "nvLock: client timed out, taking the lock". I suppose that's better than having to reboot?
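A small sketch for pulling those messages out of the X server log, so their timing can be lined up against the task switches. The log path /var/log/Xorg.0.log is an assumed default and the function name is hypothetical; only the marker text comes from the message quoted above.

```python
MARKER = "nvLock: client timed out"

def lock_timeouts(log_text, marker=MARKER):
    """Return (line_number, line) for every log line containing `marker`."""
    return [(n, line.rstrip())
            for n, line in enumerate(log_text.splitlines(), start=1)
            if marker in line]

if __name__ == "__main__":
    import sys
    # /var/log/Xorg.0.log is an assumed default; pass another path as argv[1].
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/Xorg.0.log"
    try:
        with open(path, errors="replace") as f:
            for n, line in lock_timeouts(f.read()):
                print(f"{path}:{n}: {line}")
    except OSError as err:
        print("could not read log:", err)
```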

tswe 09-09-10 06:27 PM

Re: Various hangs requiring reboot with GTX 275 896MB
 
Deanjo:

You are running OpenCL, which I have problems with. My question for you: are you using NVIDIA's pre-packaged openSUSE driver RPMs from ftp://download.nvidia.com/opensuse, or did you install the driver with the self-extracting NVIDIA-Linux-x86_64-256.53.run installer from NVIDIA?

I have problems with the RPM not allowing me to run OpenCL executables (while compiling them works fine).

thanks
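One possible way to narrow down a "compiles but won't run" situation is to check whether the OpenCL runtime library loads at all and reports a platform. This is a sketch under assumptions: the library name libOpenCL.so.1 is a guess at what the installed driver provides, and the function name is hypothetical. clGetPlatformIDs is the standard OpenCL entry point; a status of 0 means success.

```python
import ctypes

def opencl_platform_count(libname="libOpenCL.so.1"):
    """Return (status, platform_count), or None if the library cannot be used."""
    try:
        lib = ctypes.CDLL(libname)
        get_ids = lib.clGetPlatformIDs
    except (OSError, AttributeError):
        return None  # library missing, or it does not export the symbol
    get_ids.argtypes = [ctypes.c_uint, ctypes.c_void_p,
                        ctypes.POINTER(ctypes.c_uint)]
    get_ids.restype = ctypes.c_int
    count = ctypes.c_uint(0)
    # Ask only for the number of platforms; no platform IDs are requested.
    status = get_ids(0, None, ctypes.byref(count))
    return status, count.value

if __name__ == "__main__":
    print(opencl_platform_count())
```

A result of None would point at the runtime library itself not being installed or found, whereas a non-zero status or a count of 0 would suggest the library loads but finds no usable driver.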


All times are GMT -5.
