05-05-11, 12:36 PM   #1
pjwhite
Registered User
 
Join Date: May 2009
Posts: 7
Xorg lockup under high OpenGL load with newest 270.41.06 driver. ( repeatable )

Under high OpenGL load I repeatably get Xorg lockups that can only be recovered by rebooting. This is very damaging because I am trying to support a product that is affected by this issue.

I am using Ubuntu 10.04 and have tried a few different drivers ( the 260.19.x series and the latest 270.41.06 ); the issue exists with all of them. This particular system has GTS 450s, or another Fermi-based card. I can easily reproduce this with a 2 monitor, 2 GPU setup, with one monitor plugged into each GPU and Xinerama spanning them. The problem is not exclusive to Xinerama: I have also reproduced it on a 2 monitor system using TwinView.

Using GTS 250s I have so far been unable to reproduce this, however.

I have a test program, using Qt OpenGL, that will open up and move itself from one monitor to the other every 2 seconds. If I run a few of these, the system locks up almost every time within 10 or 20 seconds of the programs starting and moving.
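
For anyone who wants to try something similar before I attach the real binary, here is a rough Python sketch of the same idea. It is not the actual test program: it assumes PyQt5 is installed, uses a bare QOpenGLWidget with no heavy rendering, and the names next_screen and run_demo are made up for illustration.

```python
import sys


def next_screen(current, screen_count):
    """Return the index of the screen the window should hop to next."""
    return (current + 1) % screen_count


def run_demo(interval_ms=2000):
    """Show an OpenGL widget and move it to the next screen every interval."""
    # PyQt5 is imported lazily so next_screen() works without it installed.
    from PyQt5.QtCore import QTimer
    from PyQt5.QtWidgets import QApplication, QOpenGLWidget

    app = QApplication(sys.argv)
    screens = app.screens()          # one QScreen per monitor
    window = QOpenGLWidget()         # assumption: a real test would draw in paintGL()
    window.resize(640, 480)
    state = {"screen": 0}

    def hop():
        # Jump to the top-left corner of the next monitor.
        state["screen"] = next_screen(state["screen"], len(screens))
        window.move(screens[state["screen"]].geometry().topLeft())

    timer = QTimer()
    timer.timeout.connect(hop)
    timer.start(interval_ms)
    window.show()
    return app.exec_()
```

Calling run_demo() from a few separate processes at once would roughly mimic the scenario described above, though without a comparable rendering load it may not trigger the lockup.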

I have tried a number of Xorg.conf parameters:
- Option "UserEvents" "1"
- Option "TripleBuffer" "1"
- Option "BackingStore" "1"

The OpenGL environment variable __GL_YIELD=USLEEP seemed to make a pretty good difference: with it set, the test scenario runs without locking up Xorg about 2/3 of the time. However, the other 1/3 of the time Xorg still locks up and requires a reboot.
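
Since __GL_YIELD is read from the environment of the OpenGL client, one way to apply it to a single program is to launch that program with a modified environment. A minimal sketch, where env_with_gl_yield, launch and the ./gltest path are placeholder names and not part of my actual setup:

```python
import os
import subprocess


def env_with_gl_yield(mode, base_env=None):
    """Return a copy of the environment with __GL_YIELD set to `mode`."""
    env = dict(os.environ if base_env is None else base_env)
    env["__GL_YIELD"] = mode
    return env


def launch(command, mode="USLEEP"):
    """Start `command` (a list of argv strings) with __GL_YIELD applied."""
    return subprocess.Popen(command, env=env_with_gl_yield(mode))
```

For example, launch(["./gltest"]) would start the hypothetical test binary with __GL_YIELD=USLEEP, leaving the rest of the session on the default yield behavior.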

I can ssh into the system and run "strace -p `pidof X`"; with the default __GL_YIELD I get the following ( repeated forever ):

Code:
rt_sigreturn(0xe)                       = 140470336774144
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe)                       = 2147483648
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe)                       = 140470336774144
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe)                       = 140470336774144
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe)                       = 2147483648

With __GL_YIELD set to USLEEP, this is the rarer strace output ( repeated forever ):

Code:
rt_sigprocmask(SIG_BLOCK, [ALRM CHLD TSTP TTIN TTOU VTALRM WINCH IO], [IO], 8) = 0
rt_sigprocmask(SIG_SETMASK, [IO], NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [IO], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [IO], [], 8)  = 0
rt_sigprocmask(SIG_UNBLOCK, [IO], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [IO], [], 8)  = 0
rt_sigprocmask(SIG_UNBLOCK, [IO], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [IO], [], 8)  = 0
rt_sigprocmask(SIG_UNBLOCK, [IO], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [IO], [], 8)  = 0

I have logs from two incidents of this occurring: the first (nvidia-bug-report-1.log.gz) is with the default __GL_YIELD and the second (nvidia-bug-report-2.log.gz) is with __GL_YIELD=USLEEP.
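
As a rough aid for comparing strace captures like the two above, the output can be reduced to a count of each syscall and signal. This is only a sketch: summarize_strace and looks_like_spin are illustrative names, and the "two or fewer distinct events" threshold is an arbitrary assumption, not a real diagnostic.

```python
from collections import Counter
import re

_SYSCALL = re.compile(r"^(\w+)\(")       # e.g. "rt_sigreturn(0xe) = ..."
_SIGNAL = re.compile(r"^--- (SIG\w+)")   # e.g. "--- SIGALRM (Alarm clock) ..."


def summarize_strace(text):
    """Count syscalls and signal deliveries in captured strace output."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        match = _SYSCALL.match(line) or _SIGNAL.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def looks_like_spin(counts, max_distinct=2):
    """Heuristic: the capture contains only a handful of distinct events."""
    return 0 < len(counts) <= max_distinct
```

Feeding it the first capture gives only rt_sigreturn and SIGALRM, and the second gives only rt_sigprocmask, which matches what the eye sees: X is cycling through the same one or two events and never getting back to its normal work.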


If anyone is interested and has a 10.04 system, I could also attach a binary and script that reproduce the lockup on any 10.04 system; just let me know.
Attached Files
File Type: gz nvidia-bug-report-1.log.gz (67.2 KB, 101 views)
File Type: gz nvidia-bug-report-2.log.gz (68.5 KB, 94 views)