taniwallach 03-01-12 09:23 AM

295.20 crash - suspend/logout/shutdown failing
1 Attachment(s)
I had major problems with 295.20-1: suspend, shutdown, logout and even stopping gdm or the X server all failed. After downgrading back to 290.10-1, suspend/resume works fine again.

I recently (26 Feb) upgraded from 290.10-1 to 295.20-1 on my laptop (HP 8440p with an NVS 3100M graphics system) using the packages for 295.20-1 from Debian testing. My system is mostly Debian stable, but with some packages from Debian testing. On 27 Feb I stopped X, removed the module and then restarted X, when I recalled that I had not done so immediately after the upgrade. Between the upgrade and loading the new module, the system was able to suspend and resume properly! Since then I have had multiple problems, all of which seem related to the NVIDIA 295.20-1 driver.
  1. Suspend to RAM stopped working. The system would look like it was about to suspend, but the system logs did not show a suspend starting, and the CPU/fan would keep running. Only a hard power-off helped.
  2. On some occasions shutdown (from the Gnome menu when logged in) worked and on others it failed. When shutdown failed, only a hard power-off worked.
  3. Trying to stop gdm3 caused the same type of freeze.
  4. Killing X with Ctrl-C, after starting it using "startx", caused the same freeze.
  5. This version of the driver does not allow changing to a Virtual Terminal from the X screen. This made some debugging and testing more complex than it should be.

In all cases of the "crash" the laptop power would stay ON, the fan was pushing warm air out, and the system would not respond to keyboard at all. Network connections also failed or were dropped - even when the connection was live and working a few seconds before the change. The system would not respond to "ping" anymore either. After every crash and hard-reset, my /home partition was "unclean" and in quite a few cases there were messed up files (multiply claimed blocks, unused inodes, etc.)

On the attempt to suspend, daemon.log shows that NetworkManager received the request to sleep, but kern.log never got the messages "PM: Syncing filesystems ... done." or "PM: Preparing system for mem sleep", which normally show up as the suspend process gets under way.
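For anyone who wants to repeat this log check, here is a minimal Python sketch. It only looks for the two kernel messages quoted above; the function name and the /var/log/kern.log path in the comment are assumptions for illustration, not something taken from the driver or from Debian documentation.

```python
# The two kernel messages quoted in the post that mark the start of a suspend.
SUSPEND_MARKERS = (
    "PM: Syncing filesystems",
    "PM: Preparing system for mem sleep",
)


def missing_suspend_markers(log_lines):
    """Return the suspend markers that never appear in the given log lines."""
    lines = list(log_lines)  # allow a file object or any other iterable
    return [
        marker
        for marker in SUSPEND_MARKERS
        if not any(marker in line for line in lines)
    ]


# Possible usage (the path is an assumption; adjust to your system):
#     with open("/var/log/kern.log", errors="replace") as log:
#         print(missing_suspend_markers(log))
```

An empty list means both messages were logged; a non-empty list names the steps the kernel never reported, which is the situation described above.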

I tried running "sleep 5; nvidia-bug-report.sh" in one remote session and then "/etc/init.d/gdm3 stop" to see if I could capture debugging data while the "crash" was occurring. That was not successful. Instead, I could only attach a "nvidia-bug-report.log.gz" file, which was generated after starting X from a remote console with "startx -- -logverbose 6" and before initiating a crash.

Bottom line - at least on my machine, the 295.20-1 driver makes a total mess of things.

Ref: the files are also on the Debian BTS: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=661822

sandipt 03-02-12 01:38 AM

Re: 295.20 crash - suspend/logout/shutdown failing
Did you try reinstalling the NVIDIA driver from ftp://download.nvidia.com/XFree86/Linux-x86_64/ after the OS package upgrade?

taniwallach 03-02-12 05:32 AM

Re: 295.20 crash - suspend/logout/shutdown failing
2 Attachment(s)
Now I did - I hope the time spent testing this helps.

NOTE: I uninstalled all the Debian NVIDIA packages (necessary to allow a manual NVIDIA driver install), then downloaded and installed 295.20 from the NVIDIA site. (I needed "-e" for expert mode to bypass the pre-install script, which was causing a SIGTERM and a failure to compile/install.)

Had the same hard freeze/crash problems as before (network/keyboard all dead) which forced a hard power cycle. Each reboot needed a fsck of /home which was not properly unmounted. I did not spend any time testing stability, just that the X server started and would crash the system when stopped.

Try 1: Started "gdm3" (it came up) and then stopped it. The system crashed when I stopped gdm3.

Try 2: I ran "startx -- -logverbose 6" and then "nvidia-bug-report.sh". This time the spinning loading icon got stuck, but I still managed to run "nvidia-bug-report.sh". When I killed X, the system froze. The bug report from this try is the small attached file called nvidia-bug-report-try2.log.gz.

Try 3: I ran "startx -- -logverbose 6" and then "nvidia-bug-report.sh". This time the spinning loading icon got stuck, but I still managed to run "nvidia-bug-report.sh". When I killed X, the system froze. The bug report from this try is the larger attached file called nvidia-bug-report.log.gz.

NOTE: switching to Virtual Terminals from X screen - does NOT work. Screen is just solid black.

Once again I switched back to the Debian 290.10-1 packages, which work fine.

changyp 03-19-12 12:53 PM

Re: 295.20 crash - suspend/logout/shutdown failing
I have exactly the same problem here on Fedora 16 with kernel 3.2.10-3.i686 on my old laptop with an 8400M GS card.
When I switch back to 290.10, this problem goes away.
But this problem never shows up on my desktop computer with a GT520 graphics card.

taniwallach 07-11-12 02:21 AM

Re: 295.20 crash - suspend/logout/shutdown failing
The problems seem to be resolved in 295.59. There was still trouble with 295.40, but I did not test 295.49 or 295.53.

Both suspend/restore and switch to VT and then back to X now work and don't hard-freeze my machine.
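To keep track of which versions were reported good or bad in this thread, a small Python lookup can help. This is only a sketch: the table records what was posted here for the HP 8440p (NVS 3100M), not an official compatibility list, and the function names are made up for illustration. It also assumes a plain Debian-style "upstream-revision" version string with no epoch prefix.

```python
# Results as reported in this thread for one laptop; not an official list.
REPORTED = {
    "290.10": "works",
    "295.20": "hard freeze",
    "295.40": "still trouble",
    "295.49": "untested",
    "295.53": "untested",
    "295.59": "works",
}


def upstream_version(package_version):
    """Strip a Debian revision suffix, e.g. '295.20-1' -> '295.20'."""
    return package_version.split("-", 1)[0]


def reported_status(package_version):
    """Look up what this thread reported for the given driver version."""
    return REPORTED.get(upstream_version(package_version), "not reported")
```

For example, reported_status("295.20-1") gives "hard freeze", while a version nobody mentioned gives "not reported".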

This can be marked as resolved, in my opinion.


sandipt 07-12-12 02:27 AM

Re: 295.20 crash - suspend/logout/shutdown failing
I think the broken console on VT switch will be resolved by passing the kernel parameter vga=0.
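To confirm whether the running kernel was actually booted with that parameter, the kernel command line can be inspected. Below is a minimal Python sketch; the helper name is hypothetical, and it assumes simple space-separated parameters (quoted values containing spaces are not handled).

```python
def kernel_param(cmdline, name):
    """Return the value of 'name=value' from a kernel command line.

    Returns '' for a bare flag such as 'quiet', and None if the
    parameter is absent.
    """
    for token in cmdline.split():
        key, _sep, value = token.partition("=")
        if key == name:
            return value
    return None


# Possible usage on a running Linux system:
#     with open("/proc/cmdline") as f:
#         print(kernel_param(f.read(), "vga"))
```

A result of "0" means vga=0 is in effect; None means the parameter was not passed at boot.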

