
View Full Version : Article on Cg profile performance


Nv40
05-11-03, 03:25 AM
Could it be possible to write a shader that performs very
fast in NVIDIA Cg and very slow using HLSL?

How about the other way around?
Very fast in HLSL and very slow in Cg.

Take a look here, in this nice topic...

http://www.cgshaders.org/forums/viewtopic.php?t=962&sid=0570167f5399e149c9ba24a74d06c149

It really matters, more than people think, *the way*
you program shaders to be rendered on the GeForce FX,
or you can end up wasting crucial performance and/or
causing unnecessary bottlenecks on the NV30.
Hint: 3DMark2003 and others...


Now this IS interesting, because while this shader is loved by Cg (FP30 fixed 90fps, PS2.x 30fps), HLSL absolutely hates it - only 9fps

So in the example above: 90fps vs 9fps! Any programmer using
those tests for PS performance of the GeForce FX vs R3xx could end up believing that the NV30 is ten times slower than it really is, if he/she doesn't know how to get the best out of the card.

Another interesting bit of info comes from one of the Time Machine demo
programmers, who explains some circumstances
where the NV30 gets virtually "free performance"
when using long instruction shaders.

Gary King of the NVIDIA Demo Team mentioned that they did this a lot in the "time machine" demo. He said that in long programs, texture lookups basically become free (if you are bottlenecked by the performance of the fragment shader, then the extra bandwidth cost is negligible). As an example, if you are doing a lot of normalization of vectors, it might be faster to use a normalization cubemap than to increase instruction and register count by normalizing in the program.
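The normalization-cubemap trick King describes can be sketched in Python: precompute, for every texel of each cube face, the unit-length direction that texel represents, so the shader replaces a DP3/RSQ/MUL normalize sequence with a single texture lookup. The face/axis convention below is simplified for illustration, not the exact D3D/OpenGL cube-map layout:

```python
import math

def face_texel_to_dir(face, u, v):
    """Map a cube face + coords in [-1,1] to an (unnormalized) direction.
    Simplified axis convention, not the exact D3D/OpenGL cube-map layout."""
    return {
        0: ( 1.0,  -v,  -u),   # +X
        1: (-1.0,  -v,   u),   # -X
        2: (  u,  1.0,   v),   # +Y
        3: (  u, -1.0,  -v),   # -Y
        4: (  u,  -v,  1.0),   # +Z
        5: ( -u,  -v, -1.0),   # -Z
    }[face]

def build_normalization_cubemap(size=32):
    """6 faces of size*size RGB texels storing normalized dirs packed to [0,255]."""
    faces = []
    for face in range(6):
        texels = []
        for ty in range(size):
            for tx in range(size):
                u = 2.0 * (tx + 0.5) / size - 1.0
                v = 2.0 * (ty + 0.5) / size - 1.0
                x, y, z = face_texel_to_dir(face, u, v)
                inv_len = 1.0 / math.sqrt(x * x + y * y + z * z)
                n = (x * inv_len, y * inv_len, z * inv_len)
                # pack [-1,1] into 8-bit channels, as the GPU texture would
                texels.append(tuple(int(round((c * 0.5 + 0.5) * 255)) for c in n))
        faces.append(texels)
    return faces

faces = build_normalization_cubemap(16)
# every stored texel decodes back to a (nearly) unit-length vector
r, g, b = faces[0][0]
x, y, z = (r / 255 * 2 - 1, g / 255 * 2 - 1, b / 255 * 2 - 1)
print(abs(math.sqrt(x * x + y * y + z * z) - 1.0) < 0.02)  # prints True
```

In the shader, sampling this cubemap with the unnormalized vector as the lookup direction returns its normalized form in one TEX instruction (at 8-bit precision); on a long, shader-bound program that lookup is what King calls effectively free.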

These are just a few examples, and there are probably millions more,
that clearly show the importance of the way you program for
the NV30. It is very flexible, but also very twitchy about the way
you code for it. ;)

So *keep in mind* those benchmarks when looking at synthetic tests
made by other IHVs or programmers like Futuremark, ShaderMark
and others as *proof* of a card's PS/VS performance.
If they don't know the ins and outs of the card, or don't have
the technical assistance of NVIDIA, then they will end up wasting,
BY FAR, a lot of its potential and resources.
It's easy to understand why NVIDIA and Futuremark still have
their differences when it comes to the way of programming.

marcocom
05-12-03, 05:31 AM
ya, simply put, pay futuremark or get fuct.

at least most game developers will seek to do the best for both GPU platforms, make a great looking game, and not give a damn about anything else.

futuremark has turned out to look really bad in this, imo. we don't need benchmarks to jerk ourselves off...we need impartial testing standards that aren't geared for alliances and partnerships. it's absurd.

(sort of a thread BUMP cuz i don't understand a lot of what you're talking about, but many here do and i want to hear about this hopefully)

Hanners
05-12-03, 06:45 AM
Originally posted by marcocom
futuremark has turned out to look really bad in this, imo. we don't need benchmarks to jerk ourselves off...we need impartial testing standards that aren't geared for alliances and partnerships. it's absurd.

You can't have an impartial test if you need to use a particular IHV's coding tool, and have to code specifically to get the best out of a particular chipset, though. ;)

It isn't FutureMark's fault that NV30 is difficult to program for - all they do is follow the standards, which in their case means using DirectX 9 HLSL.

Uttar
05-12-03, 01:03 PM
Originally posted by Hanners
It isn't FutureMark's fault that NV30 is difficult to program for - all they do is follow the standards, which in their case means using DirectX 9 HLSL.

Actually, FutureMark uses assembly language for 3DMark 2003.


Uttar

StealthHawk
05-12-03, 04:09 PM
Originally posted by Uttar
Actually, FutureMark uses assembly language for 3DMark 2003.


Uttar

Aren't some of the tests using HLSL? Either GT4 or the PS 2.0 test. I thought I read somewhere at B3D that this was true.