nV News Forums

 
 


pulsor_07 03-29-12 12:13 PM

XServer hang (100% CPU)
 
We are using multi-monitor environments (3 x NVS 290, PCIe x1, each card with 2 monitors) on a Fedora Core x86_64 system with kernel 2.6.35.14 and the NVIDIA display driver 280.13.

Over the last few months, we have seen the following problem on several such systems:
The systems run 24/7. Sometimes in the morning all monitors are frozen. Moving the mouse makes the mouse pointer start to "flicker".

The mouse pointer appears and disappears a couple of times per second. The pointer does not move, and everything is frozen.

As soon as the mouse is physically moved, the X server begins to consume 100% CPU, and it never recovers.

The only known solution is to restart the X server. Nothing substantive can be found in /var/log/messages or in /var/log/Xorg.0.log.

Our Java application is gtk based and we are using OpenGL (JOGL).

The release notes for 295.20 mention some issues concerning OpenGL and hangs. Are the problems fixed there similar to the behaviour I described above?
Are there any debug options to get more information?

Regards

Rainer

Xorg log snippet:
--------------------


[105295.523]
X.Org X Server 1.9.5
Release Date: 2011-03-17
[105295.523] X Protocol Version 11, Revision 0
[105295.523] Build Operating System: x86-07 2.6.32-131.2.1.el6.x86_64
[105295.523] Current Operating System: Linux hmilab21 2.6.35.14-103.t1.fc14.x86_64 #1 SMP Tue Dec 20 15:34:51 CET 2011 x86_64
[105295.523] Kernel command line: ro root=UUID=e682877c-7a70-477d-beb6-72aef1775fe7 rd_NO_LUKS rd_NO_LVM rd_NO_MD rd_NO_DM LANG=en_GB.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=de-latin1 notsc clocksource=acpi_pm nohz=off highres=off rhgb quiet
[105295.523] Build Date: 13 October 2011 02:24:00PM
[105295.523] Build ID: xorg-x11-server 1.9.5-2.fc14
....
[105295.530] (II) Module glx: vendor="NVIDIA Corporation"
[105295.530] compiled for 4.0.2, module version = 1.0.0
[105295.530] Module class: X.Org Server Extension
[105295.530] (II) NVIDIA GLX Module 280.13 Wed Jul 27 17:12:07 PDT 2011
NVIDIA GPU Quadro NVS 290 (G86GL) at PCI:1:0:0 (GPU-0)
...
[105296.578] (--) NVIDIA(1): Memory: 262144 kBytes
[105296.578] (--) NVIDIA(1): VideoBIOS: 60.86.63.00.20

AaronP 03-30-12 12:17 PM

Re: XServer hang (100% CPU)
 
Is the mouse flickering back and forth between two adjacent screens? If so, this is a known bug in the X server's mouse event handling code.

pulsor_07 04-02-12 01:10 PM

Re: XServer hang (100% CPU)
 
I saw the bug this morning on a two-monitor system. The mouse pointer cyclically disappears and reappears exactly at the boundary between the two monitors (at the leftmost position of the right monitor).
Moving the mouse is very sticky, and the curious thing is that the pointer moves back to its old position by itself!
Logging in to the machine was still possible. The X server was consuming 100% CPU.
After a while the oom-killer killed the X server, which had an RSS of about 1.8 GB!
I am sure that in some cases (seen in the past) the X server did not consume that much memory, but still used 100% CPU and did not respond to any X client.
We see this behaviour only rarely on our (about 50) test machines. Rarely means roughly one event every 2 weeks, summed over all 50 machines.

log:
Apr 2 08:57:28 hmilab4 kernel: [1184743.650265] Out of memory: kill process 1732 (xinit) score 50701 or a child
Apr 2 08:57:28 hmilab4 kernel: [1184743.650265] Killed process 1733 (X) vsz:1986960kB, anon-rss:1874000kB, file-rss:1924kB
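
For anyone collecting these events across many machines, the two kernel lines above can be picked apart mechanically. The following is only a minimal sketch: the regular expression is written against the exact "Killed process" wording shown in this log (kernel 2.6.35), and other kernel versions word the line differently. The function name `parse_oom_kill` is made up for illustration.

```python
import re

# Matches the "Killed process" line format quoted in this thread (2.6.35-era
# kernel). This is an assumption about the format, not a general OOM parser.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"vsz:(?P<vsz>\d+)kB, anon-rss:(?P<anon>\d+)kB, file-rss:(?P<file>\d+)kB"
)

def parse_oom_kill(line):
    """Return pid, name and memory figures (in kB) as a dict, or None."""
    m = KILLED_RE.search(line)
    if m is None:
        return None
    info = {"pid": int(m.group("pid")), "name": m.group("name")}
    for key in ("vsz", "anon", "file"):
        info[key + "_kb"] = int(m.group(key))
    # Resident set size is the sum of anonymous and file-backed pages.
    info["rss_kb"] = info["anon_kb"] + info["file_kb"]
    return info

line = ("Apr 2 08:57:28 hmilab4 kernel: [1184743.650265] Killed process 1733 (X) "
        "vsz:1986960kB, anon-rss:1874000kB, file-rss:1924kB")
info = parse_oom_kill(line)
print(info["name"], info["rss_kb"], "kB resident")
```

For the line above, anon-rss plus file-rss gives 1874000 + 1924 = 1875924 kB, which is where the "about 1.8 GB" figure comes from.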

Does the described behaviour fit the X server bug?
Is there a way to switch on any debugging to get more details about this hang?

AaronP 04-02-12 04:44 PM

Re: XServer hang (100% CPU)
 
Yes, this sounds like the X server bug. It occurs when the mouse crosses screens too many times while the X server is busy processing a request. In this case, the server was probably busy for so long because your system was swapping due to high memory usage.

The X server using a lot of memory is typically a sign that some application is requesting a large number of resources. You can check to see if applications are using a lot of pixmaps using the 'xrestop' tool.
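
Since the hang shows up only after long unattended periods, it may help to record xrestop output periodically rather than looking at it interactively. The sketch below assumes that xrestop is installed and supports the batch options `-b` and `-m` described in its man page; the helper names `take_snapshot` and `append_snapshot` and the log path are hypothetical.

```python
import subprocess
import time

# Assumption: xrestop accepts -b (batch mode) and -m (max samples), so that
# "-b -m 1" prints one reading to stdout and exits.
XRESTOP_CMD = ["xrestop", "-b", "-m", "1"]

def take_snapshot(cmd=XRESTOP_CMD, timeout=30):
    """Run cmd once and return its stdout below a timestamp header."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return "=== {} (exit {}) ===\n{}".format(stamp, result.returncode, result.stdout)

def append_snapshot(path, cmd=XRESTOP_CMD):
    """Append one snapshot to the log file at path."""
    with open(path, "a") as log:
        log.write(take_snapshot(cmd))
```

Calling something like `append_snapshot("/var/tmp/xrestop.log")` from cron every few minutes, with DISPLAY pointing at the affected server, would show which client's pixmap count grows over time. Once the X server is already hung, the call will most likely run into the timeout and raise `subprocess.TimeoutExpired`, so the interesting data is whatever was logged before the hang.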

pulsor_07 04-05-12 06:59 AM

Re: XServer hang (100% CPU)
 
I just dug into the X server bug AaronP mentioned, but that bug always seems to be related to Xinerama mode. We are using separate X screens (no Xinerama, no TwinView; see xorg.conf below).
Also, the behaviour described in the X server bug is often tied to specific actions (mouse movement while an application window is starting).

Our observations are:
Often, after a long period without user activity (weekend, night), the first mouse movement triggers the problem. Only then does the X server start to consume 100% CPU (our supervision scripts show that this is not the case beforehand); the mouse pointer flashes and moves sluggishly, and applications can no longer update their windows.
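
The supervision scripts themselves are not shown in this thread. As a rough illustration of how such a check could record both CPU and memory of the X server, here is a sketch that reads the Linux /proc files directly. All function names are hypothetical, and the field positions follow the proc(5) layout of /proc/<pid>/stat.

```python
import os

def cpu_ticks(stat_text):
    """Return utime + stime (in clock ticks) from the text of /proc/<pid>/stat."""
    # The command name is in parentheses and may contain spaces, so split
    # after the last ')'. The remaining fields start at field 3 (state).
    fields = stat_text.rsplit(")", 1)[1].split()
    return int(fields[11]) + int(fields[12])  # fields 14 (utime) and 15 (stime)

def rss_kb(status_text):
    """Return VmRSS in kB from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None

def cpu_percent(ticks_before, ticks_after, interval_s, hz=None):
    """CPU usage as a percentage of one core over interval_s seconds."""
    if hz is None:
        hz = os.sysconf("SC_CLK_TCK")
    return 100.0 * (ticks_after - ticks_before) / hz / interval_s

def read_proc(pid, name):
    """Read /proc/<pid>/<name> as text."""
    with open("/proc/{}/{}".format(pid, name)) as f:
        return f.read()
```

A supervision loop would call `read_proc(pid, "stat")` twice, some seconds apart, feed both results through `cpu_ticks` and `cpu_percent`, and log the value together with `rss_kb(read_proc(pid, "status"))`. Logging RSS alongside CPU would show whether memory was already growing during the idle period, before the first mouse movement.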

Is there a chance of getting rid of the problem by updating to 295.33?



xorg.conf:
-----------------
Section "ServerLayout"
    Identifier     "Default Layout"
    Screen      0  "Screen0" 0 0
    Screen      1  "Screen1" 1680 0
    Screen      2  "Screen2" 3360 0
    InputDevice    "Mouse0" "CorePointer"
    InputDevice    "Keyboard0" "CoreKeyboard"
    Option         "DontVTSwitch" "on"
    Option         "DontZap" "on"
    Option         "DontZoom" "on"
EndSection

Section "Module"
    Load           "glx"
    Load           "extmod"
EndSection

Section "InputDevice"
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/input/mice"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    Identifier     "Keyboard0"
    Driver         "kbd"
    Option         "XkbModel" "pc105"
    Option         "XkbLayout"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "Quadro NVS 2x0 PCI"
    BusID          "PCI:1:0:0"
    Screen          0
EndSection

Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "Quadro NVS 2x0 PCI"
    BusID          "PCI:1:0:0"
    Screen          1
EndSection

Section "Device"
    Identifier     "Device2"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "Quadro NVS 2x0 PCI"
    BusID          "PCI:2:0:0"
    Screen          0
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "NoLogo" "True"
    SubSection     "Display"
        Depth       24
        Modes      "1680x1050"
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen1"
    Device         "Device1"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "NoLogo" "True"
    SubSection     "Display"
        Depth       24
        Modes      "1680x1050"
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen2"
    Device         "Device2"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "NoLogo" "True"
    SubSection     "Display"
        Depth       24
        Modes      "1680x1050"
    EndSubSection
EndSection
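
With about 50 machines, the statement that neither Xinerama nor TwinView is in use could be checked per host by scanning each xorg.conf for the corresponding options. This is a simplified sketch, not a full xorg.conf parser: it only looks at `Option` lines, treats a missing value as "enabled" (as xorg.conf does for boolean options), and cannot see Xinerama being enabled on the X server command line. The function name `enabled_options` is made up for illustration.

```python
import re

# Option "Name" with an optional quoted value on the same line.
OPTION_RE = re.compile(r'^\s*Option\s+"([^"]+)"(?:\s+"([^"]*)")?', re.IGNORECASE)

# Values xorg.conf treats as boolean true; no value at all also means true.
TRUE_VALUES = {"", "1", "on", "true", "yes"}

def enabled_options(conf_text, names=("xinerama", "twinview")):
    """Return the lower-cased names from `names` that are switched on."""
    found = set()
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0]  # drop comments (simplification)
        m = OPTION_RE.match(line)
        if not m:
            continue
        name = m.group(1).lower()
        value = (m.group(2) or "").lower()
        if name in names and value in TRUE_VALUES:
            found.add(name)
    return found
```

Run against the configuration above, this should return an empty set, since the only options present are DontVTSwitch, DontZap, DontZoom, NoLogo, DPMS and the input device options.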


All times are GMT -5.