After a long night of fighting with CUDA, I managed to finish the first alpha version of my benchmarking tool.
You can get the tool here:
Unrar it, run 'run_tests.bat', and you should see something like this:
The best time of my GTX285 in the 256x256 test was 0.024s, and the best time of my 3.6 GHz Xeon 3350 Quad was 0.65s (using the old tool that comes with my br2 patch). So, my GPU is running around 27 times faster than my CPU in this test. Finally, good results.
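For anyone who wants to check the ratio against their own numbers, here is a minimal sketch of the arithmetic. The two times are the ones quoted above; the variable names are only illustrative and are not part of the tool.

```python
# Best times quoted in the post, in seconds.
gpu_time_s = 0.024  # GTX285, 256x256 test
cpu_time_s = 0.65   # Xeon, 256x256 test (old tool)

# Speedup is simply the CPU time divided by the GPU time.
speedup = cpu_time_s / gpu_time_s
print(f"GPU speedup: {speedup:.1f}x")
```

Swap in your own best times to get a comparable figure for your hardware.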
There is still a lot of room for optimization, so this is going really well.
There are some problems with the FP 'precision'. The ALUs in the GPU do not fully follow the IEEE 754 floating-point standard, so the GPU results do not always match the CPU results exactly, and there are some errors in the 1024x1024 test.
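One common way to deal with this kind of mismatch is to compare the GPU output against a CPU reference with a relative tolerance instead of expecting identical bits. The sketch below is not how the tool checks its results. The function names, the sample values, and the 1e-5 tolerance are all assumptions picked for illustration.

```python
def max_relative_error(reference, candidate):
    """Return the largest relative difference between paired values."""
    worst = 0.0
    for ref, got in zip(reference, candidate):
        # Tiny floor avoids dividing by zero when both values are 0.
        scale = max(abs(ref), abs(got), 1e-30)
        worst = max(worst, abs(ref - got) / scale)
    return worst


def results_match(reference, candidate, rel_tol=1e-5):
    """True if both lists have equal length and agree within rel_tol (assumed value)."""
    if len(reference) != len(candidate):
        return False
    return max_relative_error(reference, candidate) <= rel_tol


# Made-up sample values standing in for CPU and GPU outputs.
cpu_values = [1.0, 2.5, 1000.0]
gpu_values = [1.0000001, 2.5000002, 1000.0001]
print(results_match(cpu_values, gpu_values))
```

A check like this separates harmless rounding differences from real errors, which would show up as a relative error far above the chosen tolerance.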
I would like to see your results.