Graphics utilization measurement question

I am using 9400 GT and 9500 GT cards with the 190.42 driver.

Given a synchronized multi-monitor graphical load, what is the best method of determining overall GPU load as a percentage of maximum? With other graphics cards, I have done this by measuring the time between the return of the final glFinish and the arrival of the vertical sync signal, and expressing that idle time as a fraction of the overall rendering cycle period. However, the NVIDIA driver appears to busy-wait inside glFinish and does not wait inside glXSwapBuffers at all, so it is difficult to infer GPU idle time this way.
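To make the calculation concrete, here is a minimal sketch of the arithmetic described above. It assumes the three timestamps are already available in seconds from some monotonic clock; the function name `load_percent` and its parameters are hypothetical, chosen only for illustration, and the GL calls appear only in comments so the snippet compiles without a GL context.

```c
/*
 * Hypothetical helper (not part of any driver or GL API).
 *
 * Intended capture points inside a render loop:
 *   t_prev_vsync : time the previous vertical sync arrived
 *   ... issue draw calls ...
 *   glFinish();               -> record t_finish when it returns
 *   glXSwapBuffers(dpy, win); -> record t_vsync when the swap completes
 *                                (assumes the swap blocks until vsync)
 *
 * Load is the busy part of the refresh period, as a percentage.
 * Returns -1.0 when the timestamps are inconsistent.
 */
static double load_percent(double t_prev_vsync, double t_finish, double t_vsync)
{
    double period = t_vsync - t_prev_vsync;  /* full rendering cycle  */
    double idle   = t_vsync - t_finish;      /* time spent waiting    */

    if (period <= 0.0 || idle < 0.0 || idle > period)
        return -1.0;                         /* cannot form a fraction */

    return 100.0 * (1.0 - idle / period);
}
```

For example, with a cycle running from 0.0 s to 1.0 s and glFinish returning at 0.25 s, this yields a load of 25 percent. The method only works if glFinish returns as soon as the GPU is done and the swap call then blocks until vsync, which is exactly what does not seem to hold here.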

I have seen some discussion suggesting that the best approach is to build a CUDA application. Is this correct?
