VDPAU API and implementation issues
Below are some problems I've come across while implementing a better VDPAU driver for MPlayer than the one in the svn version. General API issues are listed first, then implementation problems.
Functions to get vsync interval and time
As discussed previously in the MPlayer thread, since the VDPAU implementation refuses to switch frames more than once per display refresh there should be a method to get the current display refresh interval. Otherwise applications can't be sure they're not trying to queue frames more rapidly than VDPAU is willing to show them. Additionally, once you're doing vsync-aware timing, it would also be useful to have a function to directly get the timestamp of a recent vsync (to calculate approximate times of future vsyncs as this plus a multiple of the interval). Once some frames have been displayed you can get vsync times by querying their status, but it would be nicer to have a correct value from the start.
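To make the "timestamp plus a multiple of the interval" calculation concrete, here is a minimal sketch. It assumes the application has somehow obtained one vsync timestamp and the refresh interval, both in nanoseconds; the function and type names are made up for illustration and are not part of VDPAU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a timestamp in nanoseconds. */
typedef uint64_t time_ns;

/*
 * Predict the earliest vsync strictly after `now`, given the time of one
 * known vsync (`base_vsync`) and the display refresh interval.
 * Assumes the interval is constant and the base timestamp is accurate.
 */
static time_ns next_vsync_after(time_ns base_vsync, time_ns interval, time_ns now)
{
    assert(interval > 0);

    /* The known vsync is still in the future: it is the next one. */
    if (now < base_vsync)
        return base_vsync;

    /* Number of whole intervals that have already elapsed since the base. */
    uint64_t elapsed_intervals = (now - base_vsync) / interval;

    return base_vsync + (elapsed_intervals + 1) * interval;
}
```

With this, a player could check whether two frames it is about to queue would land on the same refresh, and drop or retime one of them instead of queuing both.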
BlockUntilSurfaceIdle: no non-blocking way to wait for event
In general it's not nice that the only way to wait for an event is blocking and can't be integrated in an event loop. This hasn't been a practical problem in my use yet as I haven't queued frames far ahead, but I think it would be more of an issue with multiple queued frames. This could be implemented as an fd that becomes readable when status changes, either with a message about the change or just a dummy byte that means you should recheck the status of the surface(s) you're interested in.
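As a rough sketch of how the fd variant would look from the application side: the notification fd below is purely hypothetical (VDPAU does not provide one), and the function name is mine. The point is only that such an fd could sit in the same poll() set as the X connection and other event sources.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for the (hypothetical) notification fd to become
 * readable. Returns 1 if a status change was signalled, 0 on timeout,
 * -1 on error (including an interrupted poll).
 * A real event loop would poll this fd together with its other fds.
 */
static int wait_for_status_change(int notify_fd, int timeout_ms)
{
    assert(notify_fd >= 0);

    struct pollfd pfd = { .fd = notify_fd, .events = POLLIN, .revents = 0 };
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready <= 0)
        return ready;           /* 0 = timeout, -1 = error */
    if (!(pfd.revents & POLLIN))
        return -1;              /* hangup or error condition on the fd */

    /* Drain pending dummy bytes; their content carries no information.
     * The caller should now recheck the surfaces it is interested in. */
    char buf[64];
    ssize_t n = read(notify_fd, buf, sizeof buf);
    return n > 0 ? 1 : -1;
}
```

After a return value of 1 the application would call QuerySurfaceStatus on the surfaces it cares about, which never blocks.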
Bad documentation of RenderOutputSurface blend_state parameter
The documentation says "The blend math is the familiar OpenGL blend math: dst.a = equation(blendFactorDstAlpha * dst.a, blendFactorSrcAlpha * src.a)". This should have more details or at least a pointer to the OpenGL documentation - having to go search for the details elsewhere when working on VDPAU is annoying. What the current documentation does say is also wrong: the MIN and MAX equations ignore the blend factors. I at least did not remember that from OpenGL and wondered why my code didn't work until I looked up the applicable OpenGL documentation and checked the details there.
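For reference, this is my understanding of the OpenGL blend math for a single channel, written out as a small sketch. The enum and function names are my own, not VDPAU identifiers, and the blend factors are assumed to be already resolved to plain numbers (in the real API they are selected by enum and may depend on the source or destination color).

```c
#include <assert.h>

/* Illustrative names only; these are not the VDPAU enum values. */
enum blend_eq {
    BLEND_ADD,
    BLEND_SUBTRACT,
    BLEND_REVERSE_SUBTRACT,
    BLEND_MIN,
    BLEND_MAX
};

/* Clamp a channel value to the [0, 1] range. */
static float clamp01(float v)
{
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* Blend one channel following the OpenGL blend equations. */
static float blend_channel(enum blend_eq eq,
                           float src, float src_factor,
                           float dst, float dst_factor)
{
    switch (eq) {
    case BLEND_ADD:
        return clamp01(src * src_factor + dst * dst_factor);
    case BLEND_SUBTRACT:
        return clamp01(src * src_factor - dst * dst_factor);
    case BLEND_REVERSE_SUBTRACT:
        return clamp01(dst * dst_factor - src * src_factor);
    case BLEND_MIN:
        /* The blend factors are ignored for MIN and MAX. */
        return src < dst ? src : dst;
    case BLEND_MAX:
        return src > dst ? src : dst;
    }
    assert(!"unknown blend equation");
    return 0.0f;
}
```

Note that with MIN or MAX, setting a factor to zero does not remove that operand from the result, which is what tripped me up.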
The following issues are more implementation-related. I used a 9500GT with 185.18.36 drivers; I haven't yet tested whether any of them have been fixed in latest drivers.
Trying to queue more than 8 frames for display seems to block. If there is meant to be a limit lower than the number of surfaces you can allocate then this should be documented.
Queuing up to 8 frames worked if doing nothing else, but trying to do other operations like upload video surfaces while having two or more unshown surfaces queued for display caused the driver to use a lot of CPU (could have been a busyloop until there was only one yet undisplayed surface). This at least looks like a clear implementation problem.
The BlockUntilSurfaceIdle documentation says it will block indefinitely if queried about the most recent surface added to the queue (presumably until a surface is added from another thread, if ever). However, in my tests it seems to return with an error instead, with the corresponding error message being "A catch-all error, used when no other error code applies.".
The VSYNC timestamps returned by QuerySurfaceStatus and BlockUntilSurfaceIdle are quite accurate when using overlay, but when overlay is disabled the accuracy drops a lot. In a test I saw the delta between timestamps returned for frames shown on consecutive display refreshes vary from 11055232 ns to 12650112 ns on an 85 Hz display (where the nominal refresh interval is about 11764706 ns). Other activity in X seems to increase the variation. This instability confused the algorithm I used to estimate VDPAU display FPS (which worked fine with overlay).
I was unable to see ANY difference between bob deinterlacing and the higher modes. The higher modes were slower though so it did look like they were at least enabled properly. I also tested this with plain svn MPlayer to check I hadn't broken anything, but didn't see differences with that either.
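Regarding the VSYNC timestamp jitter mentioned above, one way an application could work around it is to estimate the refresh interval from the whole span of a run of timestamps rather than from individual deltas, so that per-frame jitter only affects the two endpoints. The sketch below is not the algorithm I actually used; the function name is made up, and it assumes the timestamps are in nanoseconds and belong to frames shown on consecutive refreshes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Estimate the refresh interval in nanoseconds from `count` vsync
 * timestamps of frames shown on consecutive display refreshes.
 * Uses (last - first) / (count - 1), rounded to nearest, so jitter in
 * the intermediate timestamps cancels out entirely.
 */
static uint64_t estimate_refresh_interval(const uint64_t *vsync_ns, size_t count)
{
    assert(count >= 2);
    assert(vsync_ns[count - 1] > vsync_ns[0]);

    uint64_t span = vsync_ns[count - 1] - vsync_ns[0];
    uint64_t intervals = count - 1;

    /* Add half the divisor before dividing to round to nearest. */
    return (span + intervals / 2) / intervals;
}
```

Taking the two extreme deltas from my test as consecutive samples (timestamps 0, 11055232 and 23705344), this gives 11852672 ns. The longer the run of timestamps, the smaller the remaining error from the endpoints.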