A new "beta" version:
It will benchmark your CPU vs your GPU.
It supports multi-GPU rigs too.
You can specify the number of GPUs to use from the command line. According to NVIDIA's documentation, you will need to disable SLI to use multiple GPUs with CUDA.
br2perlin 1 5 -> This will use just 1 GPU
br2perlin 2 5 -> This will use 2 GPUs
The library also supports mixing the CPU & GPU at the same time. In theory, when I designed it, I thought that CPU+GPU would be faster, but, due to the asynchronous nature of CUDA, it ends up slower than the CPU or GPU alone.
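For reference, a static CPU+GPU split like the one described would typically size each device's share by its relative throughput. A minimal sketch (hypothetical names; this is not br2perlin's actual API):

```python
def split_work(n_samples, cpu_rate, gpu_rate):
    """Split n_samples between CPU and GPU in proportion to their
    measured throughputs (samples per second).
    Hypothetical helper -- br2perlin's real scheduler is not shown here."""
    total_rate = cpu_rate + gpu_rate
    gpu_share = round(n_samples * gpu_rate / total_rate)
    cpu_share = n_samples - gpu_share  # remainder goes to the CPU
    return cpu_share, gpu_share

# Example: a GPU ~6x faster than the CPU gets ~6/7 of the work.
cpu_n, gpu_n = split_work(7000, cpu_rate=1000, gpu_rate=6000)
print(cpu_n, gpu_n)  # 1000 6000
```

Even with a well-balanced split, per-batch launch and synchronization overhead can eat the gain, which matches the observation that the hybrid mode ends up slower than either device alone.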
My BR2 Patch now uses the new CUDA code, and the Perlin effects run on the GPU.
Unfortunately, if you only have 1 gfx card, this is not a good idea, because the framerate drops due to the resources consumed by the CUDA calculations. But if you have 2 gfx cards, you won't lose any fps, and the Perlin code will run faster on the GPU (bigger & more complex effects).
Basically, I've written this to use my old 8800GTX to run the Perlin effects, and my GTX285 to render the shiny graphics at 1920x1200 with 2x SSAA.
The results of my Xeon 3350 @ 3.6 GHz + eVGA GTX 285 SSC:
CPU (SSE3, 4 threads)
Total Time: 0.660127, Min: -0.699944, Max: 0.798931, Range: 1.498875
GPU
Total Time: 0.106165
In my system, the GPU is about 6.2x faster than the CPU (0.660127 / 0.106165 ≈ 6.22).
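As a quick sanity check, the speedup and the noise range follow directly from the figures above:

```python
cpu_time = 0.660127   # CPU SSE3, 4 threads (seconds)
gpu_time = 0.106165   # GPU (seconds)

speedup = cpu_time / gpu_time
print(round(speedup, 2))  # 6.22

# The reported Range is just Max - Min of the generated noise.
noise_range = 0.798931 - (-0.699944)
print(round(noise_range, 6))  # 1.498875
```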