Understanding CineFX - MUCH more than the R300
Hello everyone,
First, let me begin by saying that nVidia is making a VERY dangerous bet. They're hoping developers will use their tech instead of ATI's, even though ATI's tech is the official MS minimum for DX9 ( As Far As We Know )
I'm basing this on nVidia's public "CineFX_1-final.pdf" presentation to compare the NV30, the R300 and DX8. However, a few errors or oddities do exist in this document:
The R300 is supposed to be able to do 128 bit color, just like the NV30.
The NV30 raytracing power isn't explained at all, so it's better to simply suppose it won't be used by devs for a while.
Now, the real power of the NV30 lies in the vertex shaders, but first, the pixel shaders.
The Pixel Shader of the NV30, said simply, has amazing raw power but no big advance over PS 1.4 besides a lot more instructions.
But since the NV25 didn't support PS 1.4, it's a good thing this becomes a standard.
Compared to the R300, the NV30 only has more instructions available, and most likely a higher clock rate, allowing those instructions to actually be useful and fast.
However, nVidia isn't betting on their Pixel Shader power at all - they simply hope developers allow for higher instruction counts depending on the hardware, making games look better on their cards.
The Vertex Shader of the NV30 - A hundred shaders for the price of one
The goal of any good programmer is batching several thousand polygons into a single DrawIndexedPrimitive call.
Before Shaders, there were THREE problems here: textures, render states and buffers.
After shaders, there were FIVE problems here: textures, render states ( which became less of a problem, but it still existed ) , buffers, vertex shaders and pixel shaders.
Pixel Shaders, however, could rapidly become less useful than Vertex Shaders, because as models get more and more polygons, Vertex Shader quality becomes nearly as good as Pixel Shader quality.
So it's likely pixel shaders will mostly get used for water and very specific purposes in the future, reducing the importance of that problem ( and nearly eliminating it with PS 3.0 including branching, which will hopefully be ready in about 12-24 months )
However, the R300 is making a lame attempt to fix the Vertex Shader problem: maximum 4 loops and each having a maximum 256 instructions.
nVidia way is better: maximum 256 loops and each having a maximum of 65536 instructions.
This actually allows a LOT fewer vertex shaders than before, all because there is a lot of looping and branching power.
And ya know what that means? Yep, you guessed it - much better batching. And a lot more performance if the programmers do it right.
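To make the batching argument a bit more concrete, here is a minimal sketch in Python (not Direct3D code). It assumes each mesh is tagged with a texture and a vertex shader name, and that meshes sharing identical state can be merged into one draw call. All names (`count_draw_calls`, `vs_uber`, the scene itself) are made up for illustration.

```python
from itertools import groupby

def count_draw_calls(meshes):
    """Count draw calls when meshes with identical state are merged.

    Each mesh is a (texture, vertex_shader) tuple. Sorting by state puts
    identical states next to each other, so each distinct state costs
    exactly one draw call.
    """
    ordered = sorted(meshes)
    return sum(1 for _state, _group in groupby(ordered))

# Hypothetical scene: one texture atlas, three specialized vertex shaders.
scene = [
    ("atlas", "vs_1_light"),
    ("atlas", "vs_2_lights"),
    ("atlas", "vs_skinned"),
    ("atlas", "vs_1_light"),
    ("atlas", "vs_2_lights"),
]

# Same scene after folding the three shaders into a single shader that
# branches on per-object constants (hypothetical name: vs_uber).
merged = [(texture, "vs_uber") for texture, _shader in scene]

calls_before = count_draw_calls(scene)
calls_after = count_draw_calls(merged)
print(calls_before, calls_after)
```

For this made-up scene the count drops from 3 draw calls to 1. The sketch ignores the cost of updating per-object constants, which in practice can also split batches.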
Conclusion?
nVidia's system for Vertex Shaders is excellent, and could result in great branching and a LOT less work done on the CPU.
This might also give us a lot more free time for the CPU to do AI - which is, IMO, a good thing.
Uttar
nutball
09-21-02, 03:21 PM
Do you program graphics?
Originally posted by nutball
Do you program graphics?
Yeah, DX8. I don't exactly do 3D, though: i do 2D in Direct3D, but i'm not using stupid interfaces or stuff. I do everything by hand.
That means i've got a good understanding of many common DX mistakes and possibilities - and those are the same, no matter if it's 2D or 3D.
But then again, i just do it on my own. I'm not part of any project/company. Mostly trying to learn the most i can.
Uttar
Philibob
09-21-02, 04:35 PM
I don't know anything about coding so...
Will this be of any use in current games or will this only work for newer ones that are coded in this way?
(off a quick read it sounds like newer but i'll just make sure)
Sidestepping slightly, have you got any good websites to get started in this sort of stuff?
Originally posted by Philibob
I don't know anything about coding so...
Will this be of any use in current games or will this only work for newer ones that are coded in this way?
(off a quick read it sounds like newer but i'll just make sure)
Sidestepping slightly, have you got any good websites to get started in this sort of stuff?
Yep, newer ones only :(
It shouldn't be TOO hard to convert to that system with things like Cg, but don't hope for a simple patch to use NV30 power.
As i said in the beginning, nVidia is more than ever betting on developer support.
Uttar
IMO cinefx is just something for coders and experts to talk about.
As a common player I just care about performance & IQ, since I don't know when there will be a cinefx game. Features are good also (doing something the other doesn't).
If nv30 has great performance with almost free fsaa & aniso it will have a place in my mobo's warm AGP8x slot!!! :D
Originally posted by Fotis
IMO cinefx is just something for coders and experts to talk about.
As a common player I just care about performance & IQ, since I don't know when there will be a cinefx game. Features are good also (doing something the other doesn't).
If nv30 has great performance with almost free fsaa & aniso it will have a place in my mobo's warm AGP8x slot!!! :D
Well, what i'm saying is that while we've got no idea what the real PERFORMANCE of the NV30 is, we still know a LOT about nVidia's ambitions with it.
Free FSAA/Aniso? Maybe. But that's not part of the CineFX architecture, and nV only gave info about that particular part of the NV3X GPUs. So, if you want to hope for that, do so.
But what i'm trying to do is make sure we get to know some things for sure with the little official information we've got.
nVidia is actually hoping for CineFX games to come rather quickly - they're doing their best with Cg and sponsoring developers.
But no one knows if that strategy can work.
Uttar
StealthHawk
09-22-02, 06:46 AM
has any game developer announced that they will utilize Cg yet? we have heard several say it's nice and all, but they do need to take that extra step, otherwise all the merits of Cg are rather worthless. although i do realize that Cg was made with NV30 in mind, which is not out yet, so i'm willing to cut Cg some slack.
Originally posted by StealthHawk
has any game developer announced that they will utilize Cg yet? we have heard several say it's nice and all, but they do need to take that extra step, otherwise all the merits of Cg are rather worthless. although i do realize that Cg was made with NV30 in mind, which is not out yet, so i'm willing to cut Cg some slack.
Carmack is going to AFAIK. He didn't say so publicly, but since he said his next-gen work is going to be done according to what the NV30 is capable of, it's logical he's gonna use Cg since nVidia optimized Cg for the NV30.
Uttar
StealthHawk
09-22-02, 08:27 AM
Originally posted by Uttar
Carmack is going to AFAIK. He didn't say so publicly, but since he said his next-gen work is going to be done according to what the NV30 is capable of, it's logical he's gonna use Cg since nVidia optimized Cg for the NV30.
Uttar
as he said his next project will be out in 5 years(probably the clock starts after Doom 3 is finished) that is something that is quite out into the future regardless. i was really looking for more immediate titles, as you said, nvidia is hoping devs will adopt next gen features more quickly than they have been doing.
Originally posted by StealthHawk
as he said his next project will be out in 5 years(probably the clock starts after Doom 3 is finished) that is something that is quite out into the future regardless. i was really looking for more immediate titles, as you said, nvidia is hoping devs will adopt next gen features more quickly than they have been doing.
Yeah, might take several years for that next-gen Carmack stuff.
However, it might be really nice if developers start to really adopt the NV30 for games which would release about 14 months after the NV30 release. That would be a really good win for nV - it took them years for Transform and Lighting and about 2 years for shaders.
Uttar
jbirney
09-22-02, 03:02 PM
Well Derek Smart said he will not support Cg. I know he is no Carmack or Sweeney. But he is a developer...
It works. The problem I see goes back to the whole Glide vs DX debate. As far as I can see, Cg is not standardized. In fact, some of the effects which should work on an ATI board, do not. In fact, some flat out crash it.
As such, I have my doubts as to whether any dev is going to waste their time using Cg to put in effects which only work on nVidia boards. Its bad enough that we (well, me personally) have exclusion code specifically geared toward making things work on ATI boards. Why would I want to add another layer of complexity to my code base.
For that, I'm probably not going to touch Cg for anything - other than prototyping. Its cool for that.
http://www.beyond3d.com/forum/viewtopic.php?t=2463
Uttar
However, it might be really nice if developers start to really adopt the NV30 for games which would release about 14 months after the NV30 release. That would be a really good win for nV - it took them years for Transform and Lighting and about 2 years for shaders.
NV only had 53% of the graphics market. Why code for only 1/2 the market? If any developer did that they will stand a chance to lose 1/2 of their projected income. Do you think that will happen?
In all likelihood the extra vertex shader power will probably not be used as there is no need to write something that complicated. In fact there was a long thread over at B3D about this. I will see if I can find it for ya
Actually, i think Derek's forgetting something: Cg is open source. If ATI wants to optimize it for their cards, nV won't stop them. If someone wants to make Cg work with Matrox card and got a lot of time on his hands, nV won't stop him.
IMO, even if what Derek says is kinda true, when whatever thing being worked on with Cg will be released, Cg will have matured and will be much more stable.
And it really surprises me he says Cg crashes ATI hardware - i never heard of that.
And the "no need to write something that complicated" is IMHO truly incorrect. Yes, there's no need to. But the point of this whole architecture is to GROUP MULTIPLE VS INTO ONE using branching & loops!
And that enables better batching, thus resulting in higher performance.
In Other Words, if you consider you could group many VS into one, the NV30 doesn't have that much power. It might even need more.
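As a rough illustration of what "grouping multiple VS into one" means, here is a sketch in Python rather than real shader code. It assumes two specialized shaders that differ only in how many lights they handle, and one looping version driven by a per-object constant (`light_count`). The function names and the lighting model are assumptions made for this example, not taken from the CineFX papers.

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def vs_one_light(normal, lights):
    """Specialized shader: diffuse term from exactly one light."""
    return max(0.0, dot(normal, lights[0]))

def vs_two_lights(normal, lights):
    """Specialized shader: diffuse terms from exactly two lights."""
    return max(0.0, dot(normal, lights[0])) + max(0.0, dot(normal, lights[1]))

def vs_looping(normal, lights, light_count):
    """Single shader: loops over however many lights a constant asks for."""
    total = 0.0
    for i in range(light_count):
        total += max(0.0, dot(normal, lights[i]))
    return total

normal = (0.0, 0.0, 1.0)
lights = [(0.0, 0.0, 1.0), (0.0, 0.6, 0.8)]

# The looping version reproduces both specialized shaders.
same_for_one = vs_looping(normal, lights, 1) == vs_one_light(normal, lights)
same_for_two = vs_looping(normal, lights, 2) == vs_two_lights(normal, lights)
print(same_for_one, same_for_two)
```

The point is only that one program plus a constant can stand in for several programs, so objects that used to need different shaders no longer force a shader switch between them.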
Uttar
It seems to me, you don't know much of what you're talking about.
The NV30 raytracing power isn't explained at all, so it's better to simply suppose it won't be used by devs for a while.
The NV30 does not do raytracing. You can force it to do it, by encoding polygon ray intersection tests into the vertex/fragment programs, but it's still basically a scanline rasterizer.
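For readers wondering what a "polygon ray intersection test" looks like, here is a sketch of the well-known Moller-Trumbore ray/triangle test in plain Python. This is only an illustration of the kind of math that would have to be encoded into a vertex/fragment program; it is not taken from any nVidia SDK, and the helper names are made up.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_hits_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the distance t along the ray to the hit point, or None."""
    edge1 = sub(v1, v0)
    edge2 = sub(v2, v0)
    p = cross(direction, edge2)
    det = dot(edge1, p)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle's plane
    inv_det = 1.0 / det
    t_vec = sub(origin, v0)
    u = dot(t_vec, p) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(t_vec, edge1)
    v = dot(direction, q) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, q) * inv_det
    return t if t > eps else None  # ignore hits behind the ray origin

triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
hit = ray_hits_triangle((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), *triangle)
miss = ray_hits_triangle((2.0, 2.0, 1.0), (0.0, 0.0, -1.0), *triangle)
print(hit, miss)
```

Note the early-out branches: a test like this leans on exactly the kind of flow control that longer, branching shader programs are supposed to make practical.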
The Pixel Shader of the NV30, said simply, has amazing raw power but no big advance over PS 1.4 besides a lot more instructions.
Utter rubbish. The Nv30 supports pixel shader 2.0. The current batch of gf3/gf4's have very limited dependent texture read operations. Whereas with pixel shader 2.0, they are completely general and very flexible.
Pixel Shaders, however, could rapidly become less useful than Vertex Shaders, because as models get more and more polygons, Vertex Shader quality becomes nearly as good as Pixel Shader quality.
Again this is wrong. Bump-mapping, and advanced lighting effects all are done on the fragment level. This isn't gonna change simply because we have more polygons in models.
Some other stuff you say seems right tho.
nutball
09-23-02, 11:25 AM
Thank you nutty, you put into words what I couldn't calm down enough to type.
Bigus Dickus
09-23-02, 01:31 PM
Originally posted by Uttar
However, the R300 is making a lame attempt to fix the Vertex Shader problem: maximum 4 loops and each having a maximum 256 instructions.
nVidia way is better: maximum 256 loops and each having a maximum of 65536 instructions.
Uttar
Erm... what?
The CineFX papers had a mistake claiming a maximum of 1024 static instructions for the NV3x vertex shaders. This has been corrected, and is now claimed to be 256 instructions for the NV3x VS.
Each loop having 65536 instructions? LOL, someone has been hitting the crack pipe pretty hard.
In case you were confused about the pixel shaders as well...
R300 = 255 loops * 255 instructions per loop + 1 last instruction = 65026 instructions total.
NV3x = 256 loops * 256 instructions per loop = 65536 instructions total.
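The two totals above are simple arithmetic; a quick Python check, using only the loop and instruction counts quoted in this post (the function name is made up for illustration):

```python
def max_executed(loops, instructions_per_loop, trailing=0):
    """Upper bound on executed instructions: loops * body size + extras."""
    return loops * instructions_per_loop + trailing

r300_total = max_executed(255, 255, trailing=1)
nv3x_total = max_executed(256, 256)
print(r300_total, nv3x_total, nv3x_total - r300_total)
```

With these figures the two chips differ by only 510 executed instructions out of roughly 65 thousand.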
The CineFX claims about the R300's swizzling, registers, flow control, and constants were inaccurate as well.
Methinks this thread title is quite humorous.
Without looking at performance, but only at image quality :)
the biggest difference between nVidia's CineFX in the NV3x and the
Radeon 9700 is in the maximum pixel shader color precision, not in the vertex shaders.
Both cards, the Radeon 9700 and the NV30, support up to 128-bit internal color,
but only the NV30 supports true 128 bits at all times, all the way to the framebuffer,
while the Radeon 9700 only does 96 bits...
The NV30 can do much longer pixel shader programs in a single pass
without a loss in image quality. :)
The NV30 can keep the precision at 128 bits all the way! It is much more flexible, more accurate and more powerful in its pixel shaders..
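To put rough numbers on the precision argument, here is a sketch in Python. It assumes the usual reading of these figures: "128-bit" means four 32-bit floats (23 stored mantissa bits), "96-bit" means four 24-bit floats (16 mantissa bits, as commonly reported for the R300), and "64-bit" means four 16-bit floats (10 mantissa bits). It only models mantissa rounding, not exponent range, and the "shader" is just a made-up chain of 1000 additions, so treat it as an illustration rather than a benchmark.

```python
import math

def quantize(value, mantissa_bits):
    """Round value to a float with the given number of stored mantissa bits."""
    if value == 0.0:
        return 0.0
    mantissa, exponent = math.frexp(value)  # value == mantissa * 2**exponent
    scale = 2 ** (mantissa_bits + 1)        # +1 for the implicit leading bit
    return math.ldexp(round(mantissa * scale) / scale, exponent)

def run_chain(mantissa_bits, steps=1000):
    """Add a small term many times, rounding after every operation."""
    total = 0.0
    term = quantize(0.001, mantissa_bits)
    for _ in range(steps):
        total = quantize(total + term, mantissa_bits)
    return total

# The exact answer is 1.0; measure how far each format drifts from it.
errors = {bits: abs(run_chain(bits) - 1.0) for bits in (10, 16, 23)}
for bits, error in errors.items():
    print(bits, error)
```

The error shrinks as mantissa bits are added, which is the "more accurate" part of the claim. Whether the gap between 16 and 23 mantissa bits is ever visible on screen is a separate question that this sketch does not answer.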
A side note: ATI's demos used 64 bits, which is odd; maybe it was too slow when using more than 64-bit color precision.
ATI cannot claim real-time cinematic quality, at least not with
what they have shown in their tech demos; even the Lord of the Rings demo was a mile away from the real movie.
That's why i'm really interested to see NV30 CineFX in action.
Maybe this means nothing in the near future for us gamers, but
for the professional 3D artist the NV30 could mean heaven on earth.. :)
This next table shows the key differences between the ATI and nVidia
DirectX 9 cards..
http://www.tech-report.com/etc/2002q3/nextgen-gpus/index.x?pg=5
I think JC's comments in nVidia's CineFX presentation say it all..
:)
--------------------------------------------------------------
"Nvidia is the first of the consumer graphics companies to firmly understand what is going to be happening with the convergence of consumer realtime and professional offline rendering. The architectural decision in the NV30 to allow full floating point precision all the way to the framebuffer and texture fetch , instead of just in internal paths , is a good example of far sighted planning. It has been obvious to me for some time how things are going to come together, but Nvidia has made moves on both the technical and company strategic fronts that are going to accelerate my timetable over my original estimations.
My current work on Doom is designed around what was possible on the original Geforce, and reaches an optimal implementation on the NV30 . My next generation of work is designed around what is made possible on the NV30."
------------------------------------------------------------------
Wow! You never know, but i have never seen JC so enthusiastic and optimistic about any new technology since the GeForce3, when he stated that any game developer should run!! and buy one!! hehe
My all-time favorite quote ever, indeed.. :)
Bigus Dickus
09-24-02, 11:16 AM
Originally posted by Nv40
without looking at performance, but only at image quality :)
the biggest difference between Nvidia CineFX in NV3x and
Radeon 9700 is in the maximum pixel shader color precision and not in vertex shaders.
NV30 can do much longer pixel shader programs in a single pass
without a loss in image quality. :)
A picture is worth a thousand words. Show me a pair of pictures where one was rendered at 96 bit precision, and the other at 128 bit precision, and show me the visual difference. Go ahead... I'm waiting. I don't think Pixar even renders movies at 128 bit precision, but at 96, and downsamples to 64 bit for frame storage after rendering.
NV30 can keep the precision all the way at 128 bits! it is much more flexible, more accurate and more powerful in its pixel shaders..
I'll give you more accurate. Now you explain how it is more flexible or more powerful. That's right, the NV30 is only some vague paper specs right now, and you don't know. Doesn't stop you from spreading FUD though.
this next table shows the key differences between ATI and Nvidia DirectX 9 cards..
That table from tech-report is from August 9th, and is simply wrong. NVIDIA supplied their "best guess" as to the R300's capabilities, and it turned out that the R300 was much more powerful/flexible than NVIDIA had believed. Why don't you find some source from, say, the last month or so?
Originally posted by Nutty
It seems to me, you don't know much of what you're talking about.
The NV30 does not do raytracing. You can force it to do it, by encoding polygon ray intersection tests into the vertex/fragment programs, but it's still basically a scanline rasterizer.
Utter rubbish. The Nv30 supports pixel shader 2.0. The current batch of gf3/gf4's have very limited dependent texture read operations. Whereas with pixel shader 2.0, they are completely general and very flexible.
Again this is wrong. Bump-mapping, and advanced lighting effects all are done on the fragment level. This isn't gonna change simply because we have more polygons in models.
Some other stuff you say seems right tho.
NV30 Raytracing: I think you're right on that, nVidia is being very vague. It's probably only an "advanced" instruction that makes it less of a performance penalty ( 85% instead of 99% maybe, hehe? ) . I guess i should look at the NV30 PS raytracing example in nV's SDK one of these days.
Pixel Shader stuff: Really, really sorry - my mistake. I shouldn't have compared it to PS 1.4 but to VS 1.1 . It's basically VS 1.1 for pixels i think, plus a little specific stuff.
Advanced Lighting/Bump-mapping: As i said, i'm a 2D in 3D programmer, so i really didn't think about the bump-mapping part.
But i fail to understand your point with "advanced lighting".
AFAIK, that's only per-pixel lighting. And my very point is that PS will become less useful ( but still useful in specific cases such as your excellent bump-mapping example ) - not useless.
Per-pixel lighting, IMO, has a huge performance cost for a small gain on very high polygon count models compared to vertex lighting. So i think it should become an option to enable it or not.
So, yes, you do have some very interesting points and i'm sorry i made several mistakes.
Now, time for another quote:
Originally posted by Nutty
Erm... what?
The CineFX papers had a mistake claiming a maximum of 1024 static instructions for the NV3x vertex shaders. This has been corrected, and is now claimed to be 256 instructions for the NV3x VS.
Each loop having 65536 instructions? LOL, someone has been hitting the crack pipe pretty hard.
In case you were confused about the pixel shaders as well...
R300 = 255 loops * 255 instructions per loop + 1 last instruction = 65026 instructions total.
NV3x = 256 loops * 256 instructions per loop = 65536 instructions total.
The CineFX claims about the R300's swizzling, registers, flow control, and constants were inaccurate as well.
Methinks this thread title is quite humorous.
One word: AUGH. Typo.
Here's what i said:
"nVidia way is better: maximum 256 loops and each having a maximum of 65536 instructions."
Here's what it should have been:
"nVidia way is better: maximum 256 loops and each having a maximum of 256 instructions."
So yes, i made a HUGE mistake here. Have you never had a typo?
So, you're saying the R300 has a maximum of 255 loops? That really surprises me. Can you show me a document which proves what you're claiming?
Uttar
EDIT: You're saying the NV30 is nothing but vague paper specs.
Well, err, i may sound lame to link this but...
http://www.anandtech.com/video/showdoc.html?i=1711&p=8
There, you can see an NV30 running on an IKOS box at a few kHz. So it's AWFULLY slow, but it works!
Now, let's just hope the real version is faster :)
jbirney
09-24-02, 01:49 PM
That's not the NV30. That's an emulation unit made up of FPGAs, not the same thing you will have on your video card. Yes, the logic's the same, but there's no reason to get excited about it until it's in a form that we can use...
Originally posted by jbirney
That's not the NV30. That's an emulation unit made up of FPGAs, not the same thing you will have on your video card. Yes, the logic's the same, but there's no reason to get excited about it until it's in a form that we can use...
Yeah, i know it's nothing more than an emulation. But it proves it actually runs on emulation, which proves it isn't only some vague paper specs.
Sure, it's not a NV30, but if that's bug free, it's highly likely the final product will be bug free. Only big question: is it bug free?
Uttar
Bigus Dickus
09-24-02, 03:51 PM
Originally posted by Uttar
Now, time for another quote:
One word: AUGH. Typo.
lol, and another one. I'm not nutty. ;)
So, you're saying the R300 has a maximum of 255 loops? That really surprises me. Can you show me a document which proves what you're claiming?
My information came from beyond3d's review (http://www.beyond3d.com/reviews/ati/radeon9700pro/index.php?page=page2.inc) and those guys generally know their stuff, so I tend to believe them (much more so than tech-report, anand, tom, et al.).
EDIT: You're saying the NV30 is nothing but vague paper specs. Well, err, i may sound lame to link this but...
http://www.anandtech.com/video/showdoc.html?i=1711&p=8
Yes, and does that IKOS simulator tell us anything at all about the specs of the NV30? That was my point... as of right now, the only specs people have are very vague at best, being mostly rumor, hearsay, and wild imagination, combined with some more vagueness from the CineFX papers. I think vague paper specs fits quite well, don't you? I'm not saying the NV30 is only a vague paper spec (though it mostly is), but rather the NV30's specs are vague... a bit of a difference.
There are fundamental reasons why lighting at the pixel level is better than at the vertex level.
Imagine having polygons as small as a pixel. The amount of bandwidth required to transfer all those polygons will be enormous.
Imagine trying to create shadow volume silhouettes from models that have polygons the size of a pixel? Very time consuming, and you'll also get lots of cracks and errors.
You'll also get lots of depth errors and fragment flashing just through normal rendering using such small polygons.
There are many other reasons why using very small polys is not the way to go.
A better solution is something like patches etc.. but it's still nicer to light at the fragment level.
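The vertex-vs-pixel lighting disagreement can be sketched numerically. The Python below is an illustration under simplifying assumptions: a single edge between two vertices, a specular term with a made-up shininess of 32, and a half vector pointing straight along +Z. It compares lighting computed at the two vertices and then interpolated (per-vertex) against interpolating the normal and lighting every sample (per-pixel). None of the names come from a real API.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def specular(normal, half_vector, shininess=32):
    """Simple specular term: clamped N.H raised to a shininess power."""
    return max(0.0, dot(normal, half_vector)) ** shininess

def max_difference(angle_degrees, samples=101):
    """Largest per-pixel vs per-vertex gap along one edge.

    The two vertex normals are tilted apart by angle_degrees in total;
    a denser mesh of the same surface means a smaller angle per edge.
    """
    half = math.radians(angle_degrees) / 2
    n0 = (-math.sin(half), 0.0, math.cos(half))
    n1 = (math.sin(half), 0.0, math.cos(half))
    h = (0.0, 0.0, 1.0)
    i0, i1 = specular(n0, h), specular(n1, h)
    worst = 0.0
    for s in range(samples):
        t = s / (samples - 1)
        per_vertex = (1 - t) * i0 + t * i1  # interpolate lit colors
        n = normalize(tuple((1 - t) * a + t * b for a, b in zip(n0, n1)))
        per_pixel = specular(n, h)          # interpolate normal, then light
        worst = max(worst, abs(per_pixel - per_vertex))
    return worst

coarse = max_difference(60.0)  # few polygons: normals far apart
fine = max_difference(5.0)     # many polygons: normals close together
print(coarse, fine)
```

With normals 60 degrees apart the per-vertex version misses the highlight almost entirely, while at 5 degrees the two methods nearly agree. So the gap does shrink as polygon counts rise, but closing it completely takes a lot of polygons, with the costs listed above.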
Not saying lighting at the pixel level is not better.
What i'm saying is that IMO, with tomorrow's 5000+ poly models ( see SWG, AC2 ) , lighting at pixel level doesn't look *much* better and can rarely justify the cost.
But that's just my opinion - maybe some people will think it look a billion times better. That's their choice, and i respect that choice.
Uttar
Originally posted by Bigus Dickus
A picture is worth a thousand words. Show me a pair of pictures where one was rendered at 96 bit precision, and the other at 128 bit precision, and show me the visual difference. Go ahead... I'm waiting. I don't think Pixar even renders movies at 128 bit precision, but 96 and downsamples to 64 bit for frame storage after rendering.
That table from tech-report is from August 9th, and is simply wrong. NVIDIA supplied their "best guess" as to the R300's capabilities, and it turned out that the R300 was much more powerful/flexible than NVIDIA had believed. Why don't you find some source from, say, the last month or so?
The table from tech report is not wrong; the info posted there
about the Radeon 9700 and NV30 comes directly from ATI and nVidia respectively. The guy, who has done a terrific, professional and unbiased review, asked ATI and nVidia directly, and the table of specs outlines what nVidia and ATI told him :) , much better info than what is posted at Beyondfans3d, where everyone knows which company most members and reviewers are biased towards.. ;)
quote from techreport..
As I said before, I've read up on both chips and talked to folks from both NVIDIA and ATI in an attempt to understand the similarities and differences between these chip designs. Both designs look very good, and somewhat to my surprise, I've found very few weaknesses in the ATI design, despite the fact it's hitting the market well before NVIDIA's chip. There are some differences between the chips, however, and they point to different approaches taken by the two companies. Most of my attention here is focused on the pixel pipeline, and the pixel shaders in particular, because that's where the key differences seem to be
A picture is worth a thousand words. Show me a pair of pictures where one was rendered at 96 bit precision, and the other at 128 bit precision, and show me the visual difference.
i agree too, a picture is worth a million words, but you will need
to wait for NV30 demos to see what kind of quality and precision you can create with its greater pixel shader technology in CineFX :)
As i said, ATI's demos were 64 bits! Very weird, right?
Probably because they were short on time when designing the demos,
or because the Radeon 9700 only supports 64 bits and not the 96 bits they claim,
or because there was not enough power in the card to push a real-time demo at more than 64 bits. The latter seems to be
what really happened, because it makes no sense to advertise a card
with 96-bit color and then show 64-bit demos :)
But just for reference let's see
what ATI has done in their 64-bit demos..
http://www.tech-report.com/etc/2002q3/nextgen-gpus/index.x?pg=3
and now what nVidia claims the NV30 can do..
http://www.tech-report.com/etc/2002q3/nextgen-gpus/index.x?pg=6
Wow! Just say it!!! ....impressive? hehe :)
I tell you that if nVidia backs up its claims and demonstrates them on real hardware, showing a demo with that kind of quality, there will be no single human on the planet in the professional 3D industry,
like CAD engineers, animators or game developers, who will not RUN! and buy an NV30. hehe and even gamers will not resist the NV30,
just to have the most powerful video card on the planet.
If the NV30 is much, much faster than the Radeon 9700 Pro in DirectX 9 (which i believe) and has the kind of quality nVidia is claiming,
i have no doubt that it will be possible to see true cinematic
quality for the first time in the computer industry, and in real time! :)
Pixar uses 64 bits in their productions, but movies are very different from games. The fact is that cinematic quality in real time needs more precision, more colors and more accuracy than movies,
which are not real time!! See? They are prerendered shots;
that's why John Carmack asked for 64 bits of color.. see?
A still image in 32 bits can look as good as another one in
128 bits. Compare this 32-bit prerendered picture on your Radeon2
http://www.insidecg.com/feature.php?id=105&page=4
with the sports car demo made by ATI on the Radeon 9700 in 64 bits :)
64 bits are more than enough for Pixar's prerendered shots
in movies, because the pictures you see in the movie are not real time.. see? The quality of computer graphics in Hollywood movies is hand tweaked with paint programs like Photoshop or post-processing programs like Shake. You will be amazed by the poor quality and graphics errors the original computer-rendered shots sometimes have vs the final shots that you see in the movie.. see?
But for cinematic-quality games or cinematic real-time demos
you will need no less than nVidia's CineFX pixel shader/vertex shader precision. Hopefully ATI will do something in the future (R350?) to match the NV30's CineFX quality and its possibly higher performance..
I predict that the NV30 will be able to show Final Fantasy at something like 9/10 of the real thing in real time, which will be impressive,
not because of image quality, since the NV30 has no loss in image quality, but because of the huge performance needed to do that..
and surely 10/10 in the future with a more powerful GPU, like the NV35!
Like the interview with nVidia's CEO,
where the interviewer asked...
how much difference will we see between NV30 cinematic quality
and what we have today..?
->it will be the difference between what we see in movies
and what we see in games :)