I've already sent an email to linux-bugs@nvidia.com, with no response so far, but I thought the forum might have some suggestions.
It's easy to see that a lot of people who experience lockups with the NVIDIA driver that aren't chipset/agpgart/SBA/whatever related also note that disabling RenderAccel in their XF86Config/xorg.conf fixes the problem.
I've raised this issue about 4 times now. The first time it was on GeForce 2 Ti hardware. Now it's GeForce 4 Ti. Same difference. Adding:
Option "RenderAccel" "1"
to the Device section of the X config causes my machine to lock up randomly if the render extension is used heavily. Some people claim to have no problems with it and -- I'll admit -- since the 4xxx drivers I've been able to leave it enabled pretty much all the time, with only the occasional lockup.
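For anyone comparing configs, here's a rough sketch of how to list the active (uncommented) RenderAccel lines in an X config file. The function name show_renderaccel is made up for illustration, and the path in the usage line is only an assumption; point it at whichever XF86Config/xorg.conf your server actually reads.

```shell
# Print every uncommented RenderAccel option line, with its line number.
# Usage: show_renderaccel /path/to/xorg.conf   (path is an assumption)
show_renderaccel() {
    # Lines starting with '#' are comments, so they do not match the pattern.
    grep -n -i -E '^[[:space:]]*Option[[:space:]]+"RenderAccel"' "$1"
}
```

Something like show_renderaccel /etc/X11/xorg.conf should then print the line(s) in question, or nothing if the option isn't set explicitly.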
However, like many people here (I'm sure), I downloaded the latest Xorg 6.8.0 release to try out the new (admittedly experimental) Composite extension, using the crappy xcompmgr utility. I got the utility installed and the extension registered, and all seemed to be good.
If I enable drop shadows with xcompmgr -cf, moving the window around rapidly locks up the X server. 100% CPU. I have to ssh into the machine, kill X with kill -9 `pidof X` and restart X. Disabling RenderAccel allows me to enable drop shadows, but it's all done in software then and is horrendously slow.
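If you end up doing the ssh-and-kill dance as often as I do, the recovery step can be wrapped up roughly like this. It's only a sketch: stop_pid is a name I made up, the 5 second grace period is arbitrary, and it tries a plain SIGTERM before falling back to the kill -9 described above (with the server spinning, I'd expect the fallback to be what actually does the job).

```shell
# Stop a process: try SIGTERM first, fall back to SIGKILL after a grace period.
# Usage: stop_pid PID [GRACE_SECONDS]
stop_pid() {
    pid=$1
    grace=${2:-5}
    kill -TERM "$pid" 2>/dev/null || return 0   # no such process (or not ours)
    i=0
    while [ "$i" -lt "$grace" ]; do
        kill -0 "$pid" 2>/dev/null || return 0  # it exited after SIGTERM
        sleep 1
        i=$((i + 1))
    done
    kill -KILL "$pid" 2>/dev/null               # still there: force it
    return 0
}
```

From the ssh session that would be something like stop_pid "$(pidof X)", assuming pidof returns a single PID for the server.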
I did what other people here have suggested and strace'd my X binary. I did the following to test it. I started up X, with:
~$ strace -o ~/xtrace X
Then I switched back to vt7 and ran:
~$ xterm -display :0
Then I switched back to vt12 (X) and ran:
~# twm &
Then I ran xcompmgr:
~# xcompmgr -cf
Then I just randomly clicked outside of the terminal to repeatedly bring up the twm menu. It locked up in about 2 clicks. Reproduced the bug!
In ~/xtrace I see the log terminating with this, repeating indefinitely:
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
(The gettimeofday() "DoS"ing somebody was talking about elsewhere on the forum is bogus: X calls gettimeofday() constantly even when RenderAccel isn't enabled, so it seems to be the de facto way X functions. The SIGALRM loop above, however, does not happen with RenderAccel disabled.)
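If anyone wants to check their own trace for the same symptom, a rough sketch like the following tells you whether the tail of an strace log consists of nothing but SIGALRM deliveries and sigreturn() calls. The name looks_stuck and the default of 200 lines are my own choices, and it assumes the log was written without strace -f, so lines carry no PID prefix (as with the command used above).

```shell
# Succeed if the last N lines of an strace log are only SIGALRM deliveries
# and sigreturn() calls, i.e. the loop pattern shown above.
# Usage: looks_stuck LOGFILE [N]
looks_stuck() {
    log=$1
    n=${2:-200}
    [ -s "$log" ] || return 1   # missing or empty log: nothing to judge
    # Count tail lines that are neither a SIGALRM delivery nor a sigreturn.
    other=$(tail -n "$n" "$log" |
        grep -v -c -e '^--- SIGALRM' -e '^sigreturn(' || true)
    [ "$other" -eq 0 ]
}
```

For example: looks_stuck ~/xtrace && echo "X is spinning on SIGALRM". A healthy trace should fail the check because other syscalls (gettimeofday() and friends) show up in the tail.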
I simply cannot believe that a problem that has been so long-lived, that is clearly software (I can kill X and restart; no problem), and that affects acceleration which will be especially relevant to users in future Xorg releases, has been ignored for so long!
Please NVIDIA, get somebody to fix it!