View Full Version : Would you like to see a benchmark demo using PS3.0?


Abba Zabba
09-01-04, 06:34 PM
I think I'm in the mood :D

Skinner
09-01-04, 07:35 PM
Especially if you managed to show the advantages over SM2.b ;)

aAv7
09-01-04, 07:45 PM
I'd love to see one, but to my understanding NV40 doesn't have full SM3.0 support, only partial.

Abba Zabba
09-01-04, 08:59 PM
You'll soon be amazed at what NV_fragment_program2 can do ;)

nvnews-reader
09-02-04, 12:15 AM
Well hurry up then. (mag)

Cruel_Logic
09-02-04, 01:17 AM
Are you writing it right now or something?

fivefeet8
09-02-04, 02:36 AM
http://download.nvidia.com/developer/SDK/Individual_Samples/samples.html

Just a few demos. Some seem to require the DX9.0c SDK though.

991060
09-02-04, 11:49 AM
Only if you implement a very complex shader.

991060
09-02-04, 12:10 PM
And a suggestion: you may want to change the font slightly. I can hardly tell "1" from "7" or "C" from "G" without seeing them both.

OWA
09-02-04, 11:01 PM
Great link. I'm enjoying the demos. :) The instancing demo could almost be the game Asteroids, and it's nice that they show the framerate.

Edit: With instancing I get around 40 fps; without it, about 12 fps.

Yeah, I'd like to see a PS 3.0 benchmark.

fivefeet8
09-03-04, 02:10 AM
It's the exact same demo they used at the NV40 presentations. It is quite interesting though. Turn the number of asteroids up to 16000, then disable instancing. ;) The FP16 floating point blending demo is also kinda cool.

OWA
09-03-04, 11:39 AM
Yeah, I liked the blending one as well. We just need Abba Zabba to combine them all into one super demo/benchmark and we'll be set. :)

Abba Zabba
09-03-04, 01:43 PM
I'm noticing an awful performance loss when switching from FP16 to FP32, even though FP16 shows no noticeable quality degradation.
I'm going to have 2-3 modes in the next demo so we can switch between full and half precision using NV40 capabilities, plus a regular ARB 2 path for the Radeons.
Coming soon :)

holmes
09-03-04, 02:20 PM
Sounds good. Keep us posted...
:)

Abba Zabba
09-04-04, 02:01 AM
Forget SM3.0 on the current generation of hardware; it simply ain't fast enough. :retard:
Let me explain before I get flamed off these forums: I wrote a Blinn shader in which light is attenuated according to the radial distance from the source.
Normally, pixels outside the light's range get an attenuation coefficient of zero, which leaves a shade equal to the ambient color of the material.
Those pixels don't need to go through the entire Blinn equation, and that's where branching could step in: identify them and discard them early in the pixel pipe.
However, according to my findings, PS 3.0 is too slow on the current GeForce 6 range of cards to bring any advantage over the conventional way of computing illumination.
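
For readers following along, here is a minimal CPU-side sketch of the two paths being compared. It is not the demo's shader: the linear falloff in `attenuation`, the scalar (grayscale) intensities, and all function names are assumptions made for illustration. It only shows that the branching path returns the same shade as the regular path while skipping the Blinn math for out-of-range pixels.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _normalize(v):
    length = math.sqrt(_dot(v, v))
    return tuple(x / length for x in v)

def attenuation(dist, radius):
    # Assumed falloff (not from the post): linear, zero at and beyond the radius.
    return max(0.0, 1.0 - dist / radius)

def blinn_term(p, n, light_pos, eye_pos, shininess):
    # Diffuse plus Blinn half-vector specular, as scalar intensities.
    l = _normalize(_sub(light_pos, p))
    v = _normalize(_sub(eye_pos, p))
    h = _normalize(tuple(a + b for a, b in zip(l, v)))
    diffuse = max(0.0, _dot(n, l))
    specular = max(0.0, _dot(n, h)) ** shininess if diffuse > 0.0 else 0.0
    return diffuse + specular

def shade_no_branch(p, n, light_pos, eye_pos, radius, ambient, shininess=32.0):
    # Conventional path: always evaluate the full equation, then attenuate.
    to_light = _sub(light_pos, p)
    dist = math.sqrt(_dot(to_light, to_light))
    return ambient + attenuation(dist, radius) * blinn_term(p, n, light_pos, eye_pos, shininess)

def shade_branch(p, n, light_pos, eye_pos, radius, ambient, shininess=32.0):
    # Branching path: pixels outside the light range return ambient early.
    to_light = _sub(light_pos, p)
    dist = math.sqrt(_dot(to_light, to_light))
    if dist >= radius:
        return ambient
    return ambient + attenuation(dist, radius) * blinn_term(p, n, light_pos, eye_pos, shininess)
```

Whether the early return actually saves time on a GPU depends on the cost of the branch itself, which is the point under discussion in this thread.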

Here are a few snapshots taken from a demo I'm working on right now. Notice that in the PS3.0 lighting model, the pixels that were discarded are colored green.

No branching
http://www.realityflux.com/abba/Pictures/regular.png
Branching
http://www.realityflux.com/abba/Pictures/branchingPS3.0.png

991060
09-04-04, 03:18 AM
You're gonna need a complex lighting model to take advantage of dynamic branching, or you can apply a stricter limit on the surviving pixels (i.e. cull more unlit pixels, for example by using a smaller radius on the light).
IIRC, NVIDIA said there's a 2-cycle penalty for each jump in the pixel shader, and according to my own earlier test, the penalty is even bigger. That is to say, if your shader isn't complex enough, the number of culled instructions may be even smaller than the penalty.
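
The trade-off above can be put into a toy cost model. This is only a sketch: the 2-cycle penalty is the figure recalled in the post, while the test cost, body costs, and lit fraction below are hypothetical numbers, and the model assumes each pixel branches independently (real hardware evaluates branches over groups of pixels, which makes things worse when neighbours disagree).

```python
def cycles_without_branch(body_cycles):
    # Every pixel runs the whole lighting body.
    return body_cycles

def cycles_with_branch(body_cycles, test_cycles, penalty_cycles, lit_fraction):
    # Every pixel pays for the range test and the jump penalty;
    # only the lit fraction of pixels goes on to run the body.
    return test_cycles + penalty_cycles + lit_fraction * body_cycles

def branch_pays_off(body_cycles, test_cycles, penalty_cycles, lit_fraction):
    return cycles_with_branch(body_cycles, test_cycles, penalty_cycles, lit_fraction) < cycles_without_branch(body_cycles)

# Hypothetical short shader: 4-cycle body, half the pixels lit.
print(branch_pays_off(4, 2, 2, 0.5))   # False: 2 + 2 + 2 = 6 cycles vs 4
# Hypothetical long shader: 40-cycle body, half the pixels lit.
print(branch_pays_off(40, 2, 2, 0.5))  # True: 2 + 2 + 20 = 24 cycles vs 40
```

Under these assumptions the break-even point moves in favour of branching as the body gets longer or as fewer pixels survive the test, which matches both suggestions in the post.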

Abba Zabba
09-04-04, 01:14 PM
I noticed that with the soft shadows demo that comes with the 8th revision of the NVIDIA SDK.
Basically, we fetch 8 samples of a shadow texture and check whether all of them are within the shadow area; if so, the current fragment is unlit.
However, if some of those samples are lit, we're on a shadow edge and need to fetch up to 64 jittered samples for the soft shadow effect.
Done in pure PS 2.0 fashion, we'd have to fetch 64 samples all the time, whereas with PS 3.0 hardware we can skip a lot of sampling by using the first 8 samples to determine whether we're on an edge.
According to the performance counter, PS 3.0 averages 45-50 FPS, whereas PS 2.0 can hardly maintain 30 FPS.
But as you can see, 45-50 ain't too good, at least in my opinion.
Peace
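
A rough sketch of the sampling scheme described in this post, written as plain Python rather than shader code. It is not the SDK demo's source: `sample_shadow` is a hypothetical callback standing in for a shadow texture lookup (1 means lit, 0 means in shadow), the 8 probes are assumed to count toward the 64 total, and treating "all 8 lit" as fully lit without further sampling is an added assumption that the post does not spell out.

```python
def soft_shadow(sample_shadow, jitter_offsets, probe_count=8):
    """Return (lit_fraction, fetch_count) for one fragment."""
    # Cheap probe: only the first few jittered samples.
    probes = [sample_shadow(o) for o in jitter_offsets[:probe_count]]
    fetches = probe_count
    lit = sum(probes)
    if lit == 0:
        # All probes in shadow: fragment is unlit, skip the rest.
        return 0.0, fetches
    if lit == probe_count:
        # Assumption (not stated in the post): all probes lit means fully lit.
        return 1.0, fetches
    # Mixed result: we are on a shadow edge, so take the remaining samples.
    rest = [sample_shadow(o) for o in jitter_offsets[probe_count:]]
    fetches += len(rest)
    return (lit + sum(rest)) / len(jitter_offsets), fetches

# 64 offsets on an 8x8 grid; the first 8 span the full x range.
offsets = [((i % 8) / 8.0, (i // 8) / 8.0) for i in range(64)]

print(soft_shadow(lambda o: 0, offsets))                          # (0.0, 8)
print(soft_shadow(lambda o: 1 if o[0] >= 0.5 else 0, offsets))    # (0.5, 64)
```

The PS 2.0 style path corresponds to always taking the last branch, so its fetch count is 64 for every fragment.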

MUYA
09-04-04, 01:22 PM
hmm ((45-30) / 30) * 100 = 50% ????
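
The same arithmetic as a small helper, applied to both ends of the quoted 45-50 FPS range. The function name is made up for illustration.

```python
def percent_faster(new_fps, old_fps):
    # Relative speedup of new_fps over old_fps, in percent.
    return (new_fps - old_fps) / old_fps * 100.0

print(percent_faster(45, 30))  # 50.0
print(percent_faster(50, 30))  # roughly 66.7
```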

991060
09-05-04, 04:35 AM
Well, I think 50% faster isn't that bad. :clap: