View Full Version : NVIDIA GF100 Previews


shadow001
03-02-10, 01:45 PM
Tessellation performance on Fermi should be 50% faster from what I heard, so it could be possible imo :) (of course, only in games with DX11 tessellation, which aren't many atm :) )


True, it seems to be one of its strong points, but even in games where tessellation is being used, there's a lot more work happening as well, such as shading operations, texturing work and operations that require raw fillrate, as well as memory bandwidth constraints, and it's all happening at the same time... Tessellation doesn't live in its own world, basically, and it's how the overall architecture works that matters most.


So the question is, can a game be designed to the point where the tessellation feature in Fermi can really strut its stuff, beyond what Cypress can achieve?... Most definitely, if they want to, as developers are the ones doing the coding after all.


But will developers do that, considering that it's in their interest to sell as many copies as possible and make sure the game runs well on both ATI and Nvidia cards regardless, even if Fermi has more raw tessellation power?... Not a chance when money comes into the picture and profits are on the line.

Enrico_be
03-02-10, 01:52 PM
I agree Shadow001 :)

shadow001
03-02-10, 02:01 PM
I agree Shadow001 :)


Besides, I forgot to mention that the HD5970 cards have 2 GPUs in them, and each GPU has its own tessellator. Fermi looks set to beat a single HD5870 chip in tessellation power, but can it beat 2 of them working together?


The signs are there that Nvidia will have to release a dual-GPU Fermi card to knock the HD5970 off its spot at the top in convincing fashion.

Razor1
03-02-10, 02:28 PM
Besides, I forgot to mention that the HD5970 cards have 2 GPUs in them, and each GPU has its own tessellator. Fermi looks set to beat a single HD5870 chip in tessellation power, but can it beat 2 of them working together?


The signs are there that Nvidia will have to release a dual-GPU Fermi card to knock the HD5970 off its spot at the top in convincing fashion.

It's not the tessellator that is holding back the HD5870 ;), it's the triangle setup. Fermi can do up to 4 triangles per clock; the HD5970 can do 2. So yeah, there is a possibility Fermi might be able to challenge it with tessellation.

Iruwen
03-02-10, 02:31 PM
http://www.semiaccurate.com/2010/03/02/fermi-strips-secret-location/

*sigh* seems like Charlie found a worthy follower.

Iruwen
03-02-10, 02:34 PM
Besides, I forgot to mention that the HD5970 cards have 2 GPUs in them, and each GPU has its own tessellator. Fermi looks set to beat a single HD5870 chip in tessellation power, but can it beat 2 of them working together?

The signs are there that Nvidia will have to release a dual-GPU Fermi card to knock the HD5970 off its spot at the top in convincing fashion.

You just said that a single-GPU card never beat a dual-GPU one, so why should it be the case this time? It is not expected to happen, except on some very, very rare occasions with no real-life meaning, and Nvidia never said this would happen afair.

shadow001
03-02-10, 02:37 PM
It's not the tessellator that is holding back the HD5870 ;), it's the triangle setup. Fermi can do up to 4 triangles per clock; the HD5970 can do 2. So yeah, there is a possibility Fermi might be able to challenge it with tessellation.


So each Cypress chip can only do 1 triangle per clock, or is it 2 for each GPU onboard?


If it's 2 triangles per clock for each Cypress GPU, then it would come down to the actual clock speeds that Fermi operates at to see how close it gets, at least when talking theoretical numbers.

shadow001
03-02-10, 02:38 PM
You just said that a single-GPU card never beat a dual-GPU one, so why should it be the case this time? It is not expected to happen, except on some very, very rare occasions with no real-life meaning, and Nvidia never said this would happen afair.


Some here on these boards were expecting that...;)

Razor1
03-02-10, 02:41 PM
So each Cypress chip can only do 1 triangle per clock, or is it 2 for each GPU onboard?


If it's 2 triangles per clock for each Cypress GPU, then it would come down to the actual clock speeds that Fermi operates at to see how close it gets, at least when talking theoretical numbers.


No, each Cypress GPU can do 1 triangle per clock, so 2 in total. Also, Fermi does up to 4; it doesn't do 4 all the time. But Fermi's out-of-order instructions will also give it increased shader performance compared to previous-generation cards. There are a lot of factors that benefit Fermi's architecture that we don't know much about, other than that they should increase per-clock performance, until we see it in the real world.
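
For anyone who wants to plug in their own numbers, the theoretical comparison being discussed is just triangles per clock times clock speed. A rough Python sketch follows. The 850 MHz and 725 MHz values are the reference core clocks of the HD5870 and HD5970 and are not taken from this thread; the Fermi clock is a pure placeholder, since final clocks have not been announced. The function name is made up for illustration, and the dual-GPU figure assumes both chips scale perfectly, which real games will not guarantee.

```python
def peak_triangle_rate(tris_per_clock, core_clock_mhz, gpu_count=1):
    """Theoretical peak setup rate in millions of triangles per second.

    This is only triangles/clock * clock * GPU count. It ignores shading,
    bandwidth and multi-GPU scaling losses.
    """
    return tris_per_clock * core_clock_mhz * gpu_count


# Assumed clocks (not from the thread).
HD5870_CLOCK_MHZ = 850   # reference HD5870 core clock
HD5970_CLOCK_MHZ = 725   # reference HD5970 core clock, per GPU
FERMI_CLOCK_MHZ = 700    # placeholder guess, final clock unknown

hd5870 = peak_triangle_rate(1, HD5870_CLOCK_MHZ)
hd5970 = peak_triangle_rate(1, HD5970_CLOCK_MHZ, gpu_count=2)
fermi = peak_triangle_rate(4, FERMI_CLOCK_MHZ)  # "up to 4", so a best case

print(f"HD5870: {hd5870} Mtris/s")
print(f"HD5970: {hd5970} Mtris/s (perfect dual-GPU scaling assumed)")
print(f"Fermi : {fermi} Mtris/s (best case at the placeholder clock)")
```

Since Fermi only reaches 4 triangles per clock some of the time, its number here is an upper bound rather than an expected value.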

Enrico_be
03-02-10, 02:45 PM
Some here on these boards were expecting that...;)

Why wouldn't we ... there's really not much accurate info out there :p Only semi-accurate info .. :D

shadow001
03-02-10, 02:46 PM
No, each Cypress GPU can do 1 triangle per clock, so 2 in total. Also, Fermi does up to 4; it doesn't do 4 all the time.


Even so, that still is a big advantage for Fermi if the application uses a lot of geometry just the same.


It's the same issue with ATI and raw shader power... There's a ton of it, but will it ever get used in games?

shadow001
03-02-10, 02:52 PM
Why wouldn't we ... there's really not much accurate info out there :p Only semi-accurate info .. :D


True... We'll see in 3 weeks for sure.

Razor1
03-02-10, 02:53 PM
Even so, that still is a big advantage for Fermi if the application uses a lot of geometry just the same.


It's the same issue with ATI and raw shader power... There's a ton of it, but will it ever get used in games?


I'm not sure how much of an advantage Fermi will have, but yes, it should do better as geometry levels increase. Also, the benchmarks that we have seen (Unigine) were actually run on a cut-down version of the card, not the GTX 480; at least that is what I have been led to believe.

ATI's raw shader ability most likely won't be fully utilized outside of synthetics. It's very difficult to show what is going on across a whole game because of all the different stress levels from different shaders, especially with a card that does very well in synthetics overall. Packing of the shaders could be a downfall: with some shaders, the vec4+1 units might not be fully utilized.

Tessellation performance has two factors: the number of triangles the card can set up, and the shader power available to run the hull and domain shaders. So it's a combination, and it's hard to just say one is going to be better than the other (when talking about the HD5970) without really doing some tests, because of what I stated before.
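
To make the "combination" point concrete, here is a deliberately crude Python sketch of a two-stage bottleneck model. It assumes triangle setup and hull/domain shading overlap perfectly, so the frame time is set by whichever stage is slower. Every number in it is hypothetical and chosen only to show the limit flipping from one stage to the other; none of them are measured figures for Cypress or Fermi, and the function name is invented for illustration.

```python
def tessellated_fps(tris_per_frame, setup_tris_per_s,
                    shader_ops_per_tri, shader_ops_per_s):
    """Return (fps, limiting_stage) for a toy two-stage pipeline.

    Assumes setup and shading run fully in parallel, so the slower
    stage alone determines the frame time.
    """
    setup_time = tris_per_frame / setup_tris_per_s
    shader_time = tris_per_frame * shader_ops_per_tri / shader_ops_per_s
    if setup_time >= shader_time:
        return 1.0 / setup_time, "setup"
    return 1.0 / shader_time, "shader"


# Hypothetical workload: 2 million tessellated triangles per frame.
TRIS = 2_000_000
SETUP_RATE = 850e6    # hypothetical: 1 tri/clock at 850 MHz
SHADER_RATE = 2e12    # hypothetical usable shader ops per second

cheap = tessellated_fps(TRIS, SETUP_RATE, 500, SHADER_RATE)
heavy = tessellated_fps(TRIS, SETUP_RATE, 5000, SHADER_RATE)

print(f"cheap domain shader: {cheap[0]:.0f} fps, limited by {cheap[1]}")
print(f"heavy domain shader: {heavy[0]:.0f} fps, limited by {heavy[1]}")
```

With the same triangle count, the cheap shader case is setup-limited and the heavy shader case is shader-limited, which is why a single spec number says little without actual tests.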

Ninja Prime
03-02-10, 03:03 PM
The problem is, triangle-setup-limited scenarios in games are almost nonexistent.

Razor1
03-02-10, 03:09 PM
The problem is, triangle-setup-limited scenarios in games are almost nonexistent.

With tessellation that will show up quite quickly. Crysis 1 also had places where I was pretty sure it was setup limited, when you had over a million tris on screen.

shadow001
03-02-10, 03:19 PM
I'm not sure how much of an advantage Fermi will have, but yes, it should do better as geometry levels increase. Also, the benchmarks that we have seen (Unigine) were actually run on a cut-down version of the card, not the GTX 480; at least that is what I have been led to believe.

ATI's raw shader ability most likely won't be fully utilized outside of synthetics. It's very difficult to show what is going on across a whole game because of all the different stress levels from different shaders, especially with a card that does very well in synthetics overall. Packing of the shaders could be a downfall: with some shaders, the vec4+1 units might not be fully utilized.

Tessellation performance has two factors: the number of triangles the card can set up, and the shader power available to run the hull and domain shaders. So it's a combination, and it's hard to just say one is going to be better than the other (when talking about the HD5970) without really doing some tests, because of what I stated before.


I'll have to find the link where I read it, but it seems that, in overall terms, the tessellation output limit of Cypress is about 14 million polygons/frame while still sustaining 60 fps, so in a gaming environment I don't see the hardware as much of a limit, as current games are only using a small fraction of that in real terms...


So it's impressive if Fermi can actually beat that by a large margin (50%), even if it's in a theoretical/academic sort of way for a gamer... It would be killer in professional applications though (movie effects, etc.).


Here's an interesting article that should be read: http://www.tomshardware.com/reviews/future-3d-graphics,2560-8.html


It might explain some of the design decisions within Fermi, architecture-wise, including the Avatar reference and why it has so much tessellation power: billions-of-polygons-per-frame situations.

Razor1
03-02-10, 04:17 PM
14 million polygons on the screen at 60 fps? Not sure about that, unless we are talking about unlit polygons. There is a lot more to it than just pure throughput.

Just turning on tessellation on Cypress gives something like a 40% performance hit, and that is in the Unigine demo.

Polygon setup rates haven't changed since the 9800 and GF FX days ;). They have only gone up due to clock speed increases, until Fermi, of course.
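
A quick sanity check on the 14 million polygons/frame figure, sketched in Python. It assumes the 1 triangle per clock setup rate mentioned earlier in the thread and the HD5870's 850 MHz reference core clock (the clock is not from this thread); the variable names are made up for illustration.

```python
# Claimed figure from the thread: 14 million polygons per frame at 60 fps.
CLAIMED_MPOLYS_PER_FRAME = 14
TARGET_FPS = 60

# Assumptions: 1 triangle set up per clock, 850 MHz reference core clock.
TRIS_PER_CLOCK = 1
CORE_CLOCK_MHZ = 850

# Both rates are in millions of triangles per second.
required_rate = CLAIMED_MPOLYS_PER_FRAME * TARGET_FPS
peak_setup_rate = TRIS_PER_CLOCK * CORE_CLOCK_MHZ
fraction_of_peak = required_rate / peak_setup_rate

print(f"required  : {required_rate} Mtris/s")
print(f"peak setup: {peak_setup_rate} Mtris/s")
print(f"fraction of peak used: {fraction_of_peak:.1%}")
```

Under those assumptions the claim works out to 840 million triangles per second against a peak of 850 million, so it would need almost the entire theoretical setup rate. That fits the reading that it is a peak number for unlit polygons rather than something a game could sustain.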

shadow001
03-02-10, 04:20 PM
Makes you think, doesn't it?...;)


Fermi is designed for all those situations described in that article. The game-playing aspect is there, obviously, but it plays a much smaller role when the larger implications are considered, so it really gives the impression that it's above all a GP-GPU processor that can also play games.

Razor1
03-02-10, 04:28 PM
Makes you think, doesn't it?...;)


Fermi is designed for all those situations described in that article. The game-playing aspect is there, obviously, but it plays a much smaller role when the larger implications are considered, so it really gives the impression that it's above all a GP-GPU processor that can also play games.


From a design side, building more flexible hardware only makes sense if its performance and cost are viable. So if it ends up not being fast enough, or costs too much to make for the benefits, the chip will be scrapped.

shadow001
03-02-10, 04:28 PM
14 million polygons on the screen at 60 fps? Not sure about that

Just turning on tessellation on Cypress gives something like a 40% performance hit, and that is in the Unigine demo.

It's what I heard, but I can't vouch for its accuracy, to be honest... It could be a peak theoretical value when nothing else is going on, which obviously isn't the case when running games at smooth performance levels.


It's tricky to characterize hardware these days since it's getting so complex, with hardly anything in it being fixed-function anymore, and with designers trying to add all that flexibility into the overall architecture while also making it usable in other environments (GP-GPU)... It's definitely not easy, and it requires a few years of development with a large team of very qualified people, resources and truckloads of money.

shadow001
03-02-10, 04:33 PM
From a design side, building more flexible hardware only makes sense if its performance and cost are viable. So if it ends up not being fast enough, or costs too much to make for the benefits, the chip will be scrapped.


It depends a lot on what you're comparing it with though, and on whether there's actually competition in that market... Even if the design isn't as perfect as the designers would like, if there's nothing better than it from the competition (or lack thereof), it becomes the best thing out there... Ironic, but there it is.

Razor1
03-02-10, 04:37 PM
It's what I heard, but I can't vouch for its accuracy, to be honest... It could be a peak theoretical value when nothing else is going on, which obviously isn't the case when running games at smooth performance levels.


Most likely that is unlit triangle throughput.


It's tricky to characterize hardware these days since it's getting so complex, with hardly anything in it being fixed-function anymore, and with designers trying to add all that flexibility into the overall architecture while also making it usable in other environments (GP-GPU)... It's definitely not easy, and it requires a few years of development with a large team of very qualified people, resources and truckloads of money.

Of course, that's just the nature of the beast, but what makes Fermi a good GP-GPU processor also makes it a good gaming GPU; you really can't have one without the other. Larrabee would have been good if it wasn't x86-based and forced to emulate DX and OGL. That is why I'm pretty sure it was scrapped: not that it couldn't perform, but it was a pretty big design flaw not to support the only 2 graphics APIs out there. Kinda reminded me of the NV1 ;)

shadow001
03-02-10, 09:05 PM
Most likely that is unlit triangle throughput.



Of course, that's just the nature of the beast, but what makes Fermi a good GP-GPU processor also makes it a good gaming GPU; you really can't have one without the other. Larrabee would have been good if it wasn't x86-based and forced to emulate DX and OGL. That is why I'm pretty sure it was scrapped: not that it couldn't perform, but it was a pretty big design flaw not to support the only 2 graphics APIs out there. Kinda reminded me of the NV1 ;)


Well, there are things like double-precision floating point, C++ hooks, large unified caches and ECC memory support that we can call pretty redundant for gaming purposes, so there obviously are some features in Fermi aimed directly at the GP-GPU market.


Those aren't really needed for gaming... As for Larrabee, it seems performance was decent overall, but the chip was a year late to market. It would have competed well with the GT200 and the RV770 in overall performance, but once Cypress was released, and Intel also had a better idea of what Fermi was about, Intel said screw it, there's no point in releasing the current version.


An improved version could be released on the 32nm fab process however, and as far as that goes, nobody touches Intel, as they're at least 1 year ahead of everybody there... They're actually releasing 32nm products now, while companies like TSMC, UMC and Chartered might have that very late this year or, more likely, next year, so it's a huge advantage to have.


They spend colossal amounts of money to have that lead though.

lee63
03-02-10, 10:00 PM
Gigabyte boxes.

http://vr-zone.com/forums/570761/gigabyte-gtx-480-and-gtx-470-retail-boxes.html

Razor1
03-02-10, 10:16 PM
Well, there are things like double-precision floating point, C++ hooks, large unified caches and ECC memory support that we can call pretty redundant for gaming purposes, so there obviously are some features in Fermi aimed directly at the GP-GPU market.


Those aren't really needed for gaming... As for Larrabee, it seems performance was decent overall, but the chip was a year late to market. It would have competed well with the GT200 and the RV770 in overall performance, but once Cypress was released, and Intel also had a better idea of what Fermi was about, Intel said screw it, there's no point in releasing the current version.


An improved version could be released on the 32nm fab process however, and as far as that goes, nobody touches Intel, as they're at least 1 year ahead of everybody there... They're actually releasing 32nm products now, while companies like TSMC, UMC and Chartered might have that very late this year or, more likely, next year, so it's a huge advantage to have.


They spend colossal amounts of money to have that lead though.


What are C++ hooks? :) I know what they are in software, but I've never heard of them in hardware terms. Outside of ECC memory support, there is nothing in Fermi that would not be used for gaming. The same silicon that is used for DP delivers two times the single-precision throughput, and the extra caches actually help performance in many instances, such as adaptive tessellation, where data can be rapidly modified instead of being sent to RAM and back.