Nv40
05-11-03, 03:25 AM
Is it possible to write a shader that runs very fast in
NVIDIA Cg and very slow in HLSL?
How about the other way around:
very fast in HLSL and very slow in Cg?
Take a look at this nice topic:
http://www.cgshaders.org/forums/viewtopic.php?t=962&sid=0570167f5399e149c9ba24a74d06c149
*The way* you program shaders for the GeForce FX is really
important, more than people think. Otherwise you can end up
wasting crucial performance and/or causing unnecessary
bottlenecks on the NV30.
Hint: 3DMark03 and others...
Now this IS interesting because while this shader is loved by Cg (FP30fixed 90fps, PS2.x 30fps), HLSL absolutely hates it - only 9fps
So in the example above it's 90fps vs 9fps!!! Any programmer who uses
those tests to compare PS performance of the GeForce FX vs the R3xx could end up believing that the NV30 is 10 times slower than it really is, if he/she doesn't know how to get the best out of the card.
Another interesting bit of info comes from one of the "Time Machine"
demo programmers, where he explains some circumstances
in which the NV30 gets virtually "free performance"
when running long shaders.
Gary King of the NVIDIA Demo Team mentioned that they did this a lot in the "time machine" demo. He said that in long programs, texture lookups basically become free (if you are bottlenecked by the performance of the fragment shader, then the extra bandwidth cost is negligible). As an example, if you are doing a lot of normalization of vectors, it might be faster to use a normalization cubemap than to increase instruction and register count by normalizing in the program.
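To make the normalization-cubemap idea concrete, here is a rough CPU-side sketch in Python (not shader code, and not taken from the "Time Machine" demo). It precomputes a normalized direction for every texel of a cube map, then "normalizes" a vector by looking it up instead of doing the square root and multiplies. The function names, the face ordering (+x, -x, +y, -y, +z, -z) and the nearest-texel lookup are all assumptions made for illustration; real hardware picks the face and filters the texels itself, and a real cube map would typically store the result at limited precision.

```python
import math

def normalize(v):
    """Arithmetic normalization: the math a shader would otherwise run per pixel."""
    x, y, z = v
    inv_len = 1.0 / math.sqrt(x * x + y * y + z * z)
    return (x * inv_len, y * inv_len, z * inv_len)

def texel_direction(face, i, j, size):
    """Direction through the center of texel (i, j) on one cube face.

    Face layout is a simplified, hypothetical one: 0..5 = +x, -x, +y, -y, +z, -z.
    """
    s = 2.0 * (i + 0.5) / size - 1.0   # texel center mapped to [-1, 1]
    t = 2.0 * (j + 0.5) / size - 1.0
    axis, negative = divmod(face, 2)
    d = [s, t]
    d.insert(axis, -1.0 if negative else 1.0)  # put the major axis in its slot
    return tuple(d)

def build_normalization_cubemap(size):
    """Precompute one unit vector per texel; indexed as cube[face][j][i]."""
    return [[[normalize(texel_direction(face, i, j, size))
              for i in range(size)]
             for j in range(size)]
            for face in range(6)]

def lookup_normalized(cube, v):
    """Approximate normalize(v) with a nearest-texel lookup (v must be nonzero)."""
    size = len(cube[0])
    axis = max(range(3), key=lambda k: abs(v[k]))      # major axis selects the face
    major = v[axis]
    face = 2 * axis + (0 if major > 0 else 1)
    s, t = (v[k] / abs(major) for k in range(3) if k != axis)
    i = min(size - 1, int((s + 1.0) * 0.5 * size))     # clamp the s == 1 edge
    j = min(size - 1, int((t + 1.0) * 0.5 * size))
    return cube[face][j][i]

# Usage: compare the lookup against the arithmetic result.
cube = build_normalization_cubemap(64)
v = (3.0, 4.0, 12.0)
print("arithmetic:", normalize(v))
print("cubemap   :", lookup_normalized(cube, v))
```

The lookup trades a bit of accuracy (limited by the cube map resolution) for fewer arithmetic instructions and registers, which is the trade-off described above for long, shader-bound programs.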
These are just a few examples, and there are probably a million more
that clearly show the importance of the way you program for
the NV30. It is very flexible but also very twitchy about the way
you code for it. ;)
So *keep those benchmarks in mind* when looking at synthetic tests
made by other IHVs or by developers like Futuremark, ShaderMark
and others, presented as *proof* of the card's PS/VS performance.
If they don't know the ins and outs of the card, or don't have
NVIDIA's technical assistance, they will end up wasting,
BY FAR, a lot of its potential and resources.
It's easy to understand why NVIDIA and Futuremark still have
their differences when it comes to the way of programming.