Understanding CineFX - MUCH more than the R300
First, let me begin by saying that nVidia is taking a VERY dangerous bet. They're hoping developers will use their tech instead of ATI's, even though ATI's tech is the official MS minimum for DX9 (As Far As We Know).
I'm basing this on nVidia's public "CineFX_1-final.pdf" presentation to compare the NV30, the R300 and DX8. However, a few errors and oddities do exist in this document:
The R300 is supposed to be able to do 128-bit color, just like the NV30.
The NV30's raytracing power isn't explained at all, so it's better to simply assume devs won't use it for a while.
Now, the real power of the NV30 lies in the vertex shaders, but first, the pixel shaders.
The Pixel Shader of the NV30, simply put, has amazing raw power but is no big advance over PS 1.4, besides supporting a lot more instructions.
But since the NV25 didn't support PS 1.4, it's a good thing this becomes a standard.
Compared to the R300, the NV30 only has more instructions available, and most likely a higher clock rate, which lets those instructions actually be useful and fast.
However, nVidia isn't betting on their Pixel Shader power at all - they simply hope developers will allow higher instruction counts depending on the hardware, making games look better on their cards.
The Vertex Shader of the NV30 - A hundred shaders for the price of one
The goal of any good programmer is batching several thousand polygons into a single DrawIndexedPrimitive call.
Before shaders, there were THREE problems here, each forcing a new call whenever it changes: textures, render states and buffers.
After shaders, there were FIVE problems here: textures, render states (which became less of a problem, but still existed), buffers, vertex shaders and pixel shaders.
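To make the batching problem concrete, here is a minimal sketch in Python. It is only a model, not real Direct3D code: the `Mesh` fields, the shader names and `count_draw_calls` are all made up for illustration. It assumes that any two meshes differing in one of the listed states need separate draw calls.

```python
from collections import namedtuple

# Hypothetical mesh description: each field (except polygons) is a piece of
# state that must stay constant within a single draw call.
Mesh = namedtuple(
    "Mesh", "texture render_state buffer vertex_shader pixel_shader polygons"
)

def count_draw_calls(meshes, keys):
    """Count batches when meshes sharing the same values for `keys` are merged."""
    return len({tuple(getattr(m, k) for k in keys) for m in meshes})

# Four meshes with identical texture, render state and buffer,
# but different (made-up) shader combinations.
scene = [
    Mesh("brick", "opaque", 0, "vs_static", "ps_diffuse", 1200),
    Mesh("brick", "opaque", 0, "vs_skinned", "ps_diffuse", 800),
    Mesh("brick", "opaque", 0, "vs_static", "ps_specular", 500),
    Mesh("brick", "opaque", 0, "vs_skinned", "ps_specular", 900),
]

pre_shader_keys = ("texture", "render_state", "buffer")
shader_keys = pre_shader_keys + ("vertex_shader", "pixel_shader")

pre_shader_calls = count_draw_calls(scene, pre_shader_keys)  # three problems
shader_calls = count_draw_calls(scene, shader_keys)          # five problems

print(pre_shader_calls, shader_calls)
```

In this toy scene the three pre-shader states alone would allow a single batch, while adding the two shader states splits it into four.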
Pixel Shaders, however, could rapidly be made less useful by Vertex Shaders, because as models get more and more polygons, Vertex Shader quality gets nearly as good as Pixel Shader quality.
So it's likely pixel shaders will mostly get used for water and very specific purposes in the future, reducing the importance of that problem (and nearly eliminating it with PS 3.0, which includes branching and will hopefully be ready in about 12-24 months).
However, the R300 makes a lame attempt to fix the Vertex Shader problem: a maximum of 4 loops, each with a maximum of 256 instructions.
nVidia's way is better: a maximum of 256 loops, each with a maximum of 65536 instructions.
This lets you actually have a LOT fewer vertex shaders than before, all because there are a lot of loops and branching power.
And ya know what that means? Yep, you guessed it - much better batching. And a lot more performance if the programmers do it right.
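Here is a rough sketch of why branching helps batching, again in plain Python rather than real shader code. Everything in it is an assumption for illustration: the `mode`, `amplitude` and `scale` vertex attributes, the three effects, and the idea that the branch selector travels with each vertex so it can vary inside one draw call.

```python
def merged_vertex_shader(vertex):
    """One shader that branches on a per-vertex mode instead of swapping shaders."""
    x, y, z = vertex["pos"]
    mode = vertex["mode"]
    if mode == "wave":
        # Would have been its own shader: offset the vertex vertically.
        return (x, y + vertex["amplitude"], z)
    if mode == "scale":
        # Would have been another shader: uniform scaling.
        s = vertex["scale"]
        return (x * s, y * s, z * s)
    # Default path: pass the position through unchanged.
    return (x, y, z)

def draw_calls_with_separate_shaders(vertices):
    """Without branching, each distinct effect needs its own shader and call."""
    return len({v["mode"] for v in vertices})

def draw_calls_with_merged_shader(vertices):
    """With branching, all vertices can go through one shader in one call."""
    return 1 if vertices else 0

vertices = [
    {"pos": (1.0, 2.0, 3.0), "mode": "static"},
    {"pos": (1.0, 2.0, 3.0), "mode": "wave", "amplitude": 0.5},
    {"pos": (1.0, 2.0, 3.0), "mode": "scale", "scale": 2.0},
]

transformed = [merged_vertex_shader(v) for v in vertices]
separate_calls = draw_calls_with_separate_shaders(vertices)
merged_calls = draw_calls_with_merged_shader(vertices)

print(transformed, separate_calls, merged_calls)
```

The trade-off is that every vertex pays for the branch, so this only wins if the hardware branches cheaply and the programmer keeps the merged shader within the instruction limits.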
nVidia's system for Vertex Shaders is excellent, and could result in great branching and a LOT less done on the CPU.
This might also give us a lot more free time for the CPU to do AI - which is, IMO, a good thing.