Serious bug in NVidia drivers (vblank)
At our company we are working on a simulation project in which several threads compete for the CPU. The main thread draws the scene with OpenGL, and the other threads do I/O and some processing.
The bug we have found is that the main thread uses 100% of the CPU even when we synchronize with the vertical retrace. This is a very serious bug for our sort of project, because the other threads need the main thread's idle time to do their work :'( As a result, our application runs much slower than it would if the driver slept until the vertical retrace.
If fixing this bug will take a long time, is there any workaround we can use in the meantime, please?
This isn't a bug with the drivers, it's a side effect of the Linux kernel timer resolution. The kernel scheduler (at least, this is my understanding of it) runs once every 1/HZ seconds (HZ is 100 on Intel machines, and 1024 on certain others).
This means that a latency of up to 1/HZ seconds (10 ms with HZ = 100) is possible whenever a thread (or process) gives up control.
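To make that arithmetic concrete, here is a minimal sketch of the tick period implied by a given HZ value. The function name `tick_period_ms` is made up for illustration; only HZ and the values 100 and 1024 come from the post itself.

```c
#include <assert.h>

/* Hypothetical helper: length of one scheduler tick in milliseconds
 * for a kernel built with the given HZ value. */
double tick_period_ms(int hz)
{
    assert(hz > 0);   /* HZ is a positive constant */
    return 1000.0 / hz;
}
```

With HZ = 100 this gives 10 ms per tick; with HZ = 1024 it gives a little under 1 ms.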
This also means that the driver can't sync to the vertical retrace by waiting for an interrupt; the userspace code that runs in response to that interrupt might be very, very late.
So instead, the drivers busy-wait on the vertical retrace signal. When a retrace is in progress, they return (and glXSwapBuffers finishes).
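As a rough illustration of that trade-off, the sketch below compares a waiter that can only be resumed on a scheduler tick with one that polls continuously. It is a simplified model of the reasoning in this post, not how the driver or the kernel is actually implemented; the function names and the microsecond units are assumptions made for the example.

```c
#include <assert.h>

/* Simplified model: a sleeping waiter is only resumed on a tick boundary,
 * so it notices the retrace at the first tick at or after the event. */
long sleeper_wake_us(long event_us, long tick_us)
{
    assert(event_us >= 0 && tick_us > 0);
    return ((event_us + tick_us - 1) / tick_us) * tick_us;
}

/* A busy-waiting poller that stays on the CPU notices the event as it happens. */
long poller_wake_us(long event_us)
{
    return event_us;
}

/* How much later the sleeping waiter reacts compared with the poller. */
long sleeper_lateness_us(long event_us, long tick_us)
{
    return sleeper_wake_us(event_us, tick_us) - poller_wake_us(event_us);
}
```

For a retrace at 16667 us (roughly one frame at an assumed 60 Hz refresh rate) and a 10000 us tick, the modelled sleeper wakes at 20000 us, which is 3333 us late. The poller is on time, but only by keeping a CPU busy for the whole wait.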
The only way to change this is to change the kernel timer resolution. Alternatively, switch your threading package to a preemptive one (such as pthreads) rather than a cooperative one. There's no reason high CPU utilization in one thread would stop the others from running, especially if they're only doing I/O, unless you're using a userspace threading package that can only do cooperative multitasking.
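The point about preemptive threads can be tried with a small pthreads sketch: one thread spins the way a polling draw thread would, and a second thread still gets its work done. The names `spinner`, `worker` and `run_demo` are made up for this example, and the busy loop only stands in for the driver's polling. On some systems it needs to be built with `-pthread`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool stop_spinning;

/* Stands in for the draw thread: burns CPU until told to stop. */
static void *spinner(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_spinning)) {
        /* busy-wait, like polling for the retrace */
    }
    return NULL;
}

/* Stands in for an I/O or processing thread: counts up to *arg
 * and writes the number of completed steps back through the pointer. */
static void *worker(void *arg)
{
    long *steps = arg;
    long done = 0;
    for (long i = 0; i < *steps; i++)
        done++;
    *steps = done;
    return NULL;
}

/* Runs both threads and returns how many steps the worker completed. */
long run_demo(long steps)
{
    pthread_t spin_thread, work_thread;
    long result = steps;
    int rc;

    atomic_store(&stop_spinning, false);
    rc = pthread_create(&spin_thread, NULL, spinner, NULL);
    assert(rc == 0);
    rc = pthread_create(&work_thread, NULL, worker, &result);
    assert(rc == 0);

    pthread_join(work_thread, NULL);     /* worker finishes while the spinner is still busy */
    atomic_store(&stop_spinning, true);  /* only now release the spinning thread */
    pthread_join(spin_thread, NULL);
    return result;
}
```

Calling `run_demo(1000000)` returns once the worker has completed all of its steps, even though the spinner never yields voluntarily. On a multi-core machine the two threads may simply run on different cores, so this shows that the worker is not blocked, not how much CPU time it loses.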
polling also doesn't sync well
In fact we are using pthreads, in a preemptive configuration. Of course, while the first thread is polling, other threads can take control of the CPU, but a few comments on this:
If both the draw thread and another thread are using the CPU heavily, the other thread will get at best half of the CPU time it could have had while SwapBuffers is waiting. So our app could be faster than it is.
Moreover, while the draw thread is polling for the retrace, the scheduler may give the CPU to the second thread (or even to another process), and by the time the draw thread gets the CPU back, it could be too late.
So how can the driver guarantee that it will be on time even when it busy-waits? In fact, the driver on our machine is not syncing the buffer swap with the vertical retrace correctly (even though it is eating all the available CPU).
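The race described here can be expressed in terms of time intervals. The sketch below is a simplified model, not driver code: it assumes the retrace is a window of time and that the polling thread is off the CPU for a single interval. The struct, the function name and the microsecond units are all made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Half-open time interval [start_us, end_us) in microseconds. */
struct interval {
    long start_us;
    long end_us;
};

/* Returns true if the poller is on the CPU at some instant during the
 * retrace window, i.e. the window does not lie entirely inside the time
 * the polling thread was descheduled. */
bool poller_sees_retrace(struct interval retrace, struct interval off_cpu)
{
    assert(retrace.start_us < retrace.end_us);
    assert(off_cpu.start_us <= off_cpu.end_us);
    bool covered = off_cpu.start_us <= retrace.start_us &&
                   retrace.end_us <= off_cpu.end_us;
    return !covered;
}
```

With a hypothetical retrace window of 16000 to 17000 us, a poller that is descheduled from 10000 to 20000 us misses the retrace completely, while one that gets the CPU back at 16500 us still catches the end of it.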
We could change the kernel's HZ constant, or use a 2.5 kernel for our app, but I don't think either would solve any of the problems we have.
You just made me feel so un1337...