01-22-12, 01:56 PM   #109
shadow001
Registered User

Join Date: Jul 2003
Posts: 1,526
Re: next gen kepler to support dx 11.1, also take a year to rollout all cards

Quote:
Originally Posted by Ninja Prime
To clarify this, they are judging against Tesla supercomputing boards. The first Fermi compute board has a rating of 515 DP gigaflops and a TDP of 238 watts. This amounts to 2.16 DP gigaflops per watt. To hit their claimed 2.5 times this, they would need 5.41 DP gigaflops per watt. So, 5.41 x 238 watts = 1.29 teraflops DP, and since NV's DP rate is half their SP rate, SP would be 2.58 teraflops. That doesn't even beat the 6970 (2.7 teraflops), let alone the 7970 (3.79 teraflops), at similar power.

It also depends on what TDP they are targeting for this 2.5x number. As the lower end tends to be more efficient, maybe this 2.5x number only applies to a 200 watt product; that still puts them at 1.08 teraflops DP, still in line with the rumored "over 1 teraflop DP" number. I suspect this might be the case, and their high end will be a dual GPU board product.
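
The arithmetic in the quote can be checked with a short sketch. It only uses the numbers given there (a 515 gigaflop DP Fermi Tesla board at 238 W, a claimed 2.5x efficiency gain, and an SP rate of twice the DP rate); the function name is purely illustrative.

```python
def projected_gflops(base_dp_gflops, base_tdp_w, efficiency_gain,
                     target_tdp_w, sp_to_dp_ratio=2.0):
    """Project DP and SP throughput (GFLOPS) from a performance-per-watt claim."""
    base_per_watt = base_dp_gflops / base_tdp_w      # current DP GFLOPS per watt
    new_per_watt = base_per_watt * efficiency_gain   # claimed DP GFLOPS per watt
    dp = new_per_watt * target_tdp_w                 # DP throughput at the target TDP
    sp = dp * sp_to_dp_ratio                         # SP assumed to be twice DP
    return base_per_watt, new_per_watt, dp, sp


# Try both TDP targets discussed in the quote.
for tdp in (238, 200):
    base, new, dp, sp = projected_gflops(515, 238, 2.5, tdp)
    print(f"{tdp} W: {base:.2f} -> {new:.2f} GFLOPS/W, "
          f"DP {dp / 1000:.2f} TFLOPS, SP {sp / 1000:.2f} TFLOPS")
```

Changing the target TDP or the SP-to-DP ratio shows how sensitive the conclusion is to those two guesses.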

My main point though is that AMD achieves its double precision maximum (1 teraflop) with a GPU using "only" 4.3 billion transistors, though it does run at fairly high clocks and can overclock another 200 MHz. I'm focusing on the double precision figure since it's the one used in mission-critical and scientific computing environments where the results have to be as precise as possible, and I do keep in mind these are theoretical maximum figures that are not likely to be achievable in practical terms.


Now on Nvidia's side, the current Fermi does a little over half as much in double precision floating point and already uses 3+ billion transistors. For AMD to be pretty much twice as fast in that department with "only" an extra billion transistors, and still be 50% faster than their previous high-end HD 6970 in gaming, is quite an achievement to say the least, and Cayman and Tahiti are only 75 MHz apart in clock speed (850 vs 925 MHz). Like it or not, it's one efficient chip that makes every transistor spent on its design count for something, while keeping the overall die size as small as possible.

01-22-12, 06:21 PM   #110
shadow001
Registered User

Join Date: Jul 2003
Posts: 1,526
Re: next gen kepler to support dx 11.1, also take a year to rollout all cards

Here's another interesting one, this time regarding the back end of the GPU, which affects gaming, namely its fillrate and texturing speed:





What's wrong with this picture, when you know that Fermi has 50% more ROPs than the HD 6970 and HD 7970 (both have only 32), and that the GTX 580 and HD 7970 both have a 384-bit memory bus? Yet in measured effective fillrate, the HD 7970 beats it by 3.5 billion pixels per second, and beats the HD 6970 by even more, which also has 32 ROPs and runs 75 MHz slower than the HD 7970...


If the ROPs had the same capabilities on both Fermi and Tahiti, the fact that Fermi has 50% more of them (48 ROPs) would more than offset the clock speed difference relative to Tahiti. Both cards have a 384-bit memory bus, so that isn't it either, and the memory speed difference between the two cards isn't enough on its own. It's like Nvidia stuffed the Fermi GPU full of ROPs, but they're not very efficient and don't get used much in practical real-world terms, so they seriously need to be reworked and enhanced in Kepler, regardless of how many are used; it isn't just about shading power.


Now there's texturing, where they use FP16 textures, which aren't widely used in games, but the results are surprising:





Much faster than even a GTX 590 says everything really, and that's 128 texture units in a single Tahiti versus 128 texture units across both GPUs on the GTX 590... The same goes for tessellation performance, which was a really strong point for Fermi, and the HD 7970 is about 30% faster there...


Basically, whatever Kepler ends up being, it has to be improved or completely new in every area, for both gaming and GP-GPU computing, over the Tahiti chip to cover all possible markets, while still complying with the 300 watt PCI-e power limit for a single GPU... Dual GPU cards from both companies will blow through that limit like it wasn't even there, and that's before they're even overclocked...

01-22-12, 08:37 PM   #111
ninelven
Registered User

Join Date: Jan 2003
Posts: 132
Re: next gen kepler to support dx 11.1, also take a year to rollout all cards

Quote:
Originally Posted by shadow001
What's wrong with this picture when you know that Fermi has 50% more rops than the HD6970 and HD 7970( both have only 32)
Although Fermi has 48 ROPs, it can only put out 2 pixels per SM per clock, which equals 32 pixels per clock (16 SMs x 2 = 32 PPC). The stock clock of the GTX 580 is 772 MHz vs 925 MHz for the 7970, which makes the 7970 20% faster. The 7970 also has a 37% bandwidth advantage over the GTX 580. Looking at the numbers from your first graph, 13.33 / 9.75 = 1.37, or a 37% advantage for the 7970, indicating that bandwidth is the limiting factor, which is exactly what one would expect.
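
For anyone who wants to redo the comparison, here is a small sketch of the three ratios. The clocks and the measured fillrates come from the post; the memory bandwidth figures (192.4 GB/s for the GTX 580, 264 GB/s for the HD 7970) are assumed reference specs and do not appear in the thread.

```python
def advantage(a, b):
    """Return how much larger a is than b as a fraction (0.20 means 20%)."""
    return a / b - 1.0


# Pixels per clock: Fermi is limited by its SMs, not by its 48 ROPs.
gtx580_ppc = 16 * 2   # 16 SMs x 2 pixels per SM per clock (from the post)
hd7970_ppc = 32       # 32 ROPs, assumed one pixel each per clock

clock_adv = advantage(925, 772)          # core clocks in MHz (from the post)
bandwidth_adv = advantage(264.0, 192.4)  # GB/s, assumed reference specs
measured_adv = advantage(13.33, 9.75)    # measured fillrate from the chart

print(f"pixels per clock:    {gtx580_ppc} vs {hd7970_ppc}")
print(f"clock advantage:     {clock_adv:.1%}")
print(f"bandwidth advantage: {bandwidth_adv:.1%}")
print(f"measured advantage:  {measured_adv:.1%}")
```

If the measured gap tracks the bandwidth gap more closely than the clock gap, that supports reading the test as bandwidth-limited.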

01-22-12, 09:40 PM   #112
i SPY
007

Join Date: Apr 2007
Location: You were sayin'
Posts: 290
Re: next gen kepler to support dx 11.1, also take a year to rollout all cards

^

Are you talking about FP16? Because they doubled that in GF110 compared to GF100.



http://www.anandtech.com/show/4008/n...orce-gtx-580/2


edit: Whoops, I see you were talking about pixel fillrate.

But I still agree with shadow001: Nvidia needs more pixel fillrate and texture fillrate.
__________________
intel Q9450 @ 3.656Ghz [1.3875v, LLC off]| GA-X48-DS5 [Memory Enhance: Turbo]|MSi N570GTX TwinFrozrIII OC/PowerEdition|Kingston HyperX 4x2GB PC 8500 @ 1097Mhz [5-5-5-18, 2.25v]| Creative X-FI Pro [SB046A]| Tagan PipeRock 600w [48A]


01-22-12, 09:57 PM   #113
shadow001
Registered User

Join Date: Jul 2003
Posts: 1,526
Re: next gen kepler to support dx 11.1, also take a year to rollout all cards

Quote:
Originally Posted by ninelven
Although Fermi has 48 ROPs, it can only put out 2 pixels per SM per clock, which equals 32 pixels per clock (16 SMs x 2 = 32 PPC). The stock clock of the GTX 580 is 772 MHz vs 925 MHz for the 7970, which makes the 7970 20% faster. The 7970 also has a 37% bandwidth advantage over the GTX 580. Looking at the numbers from your first graph, 13.33 / 9.75 = 1.37, or a 37% advantage for the 7970, indicating that bandwidth is the limiting factor, which is exactly what one would expect.

The main point though is that the ROPs and texture units still need to be enhanced, not just to match the HD 7970 but to beat it, in order to have a faster card. Adding more of the same type found in Fermi takes up precious die space, where they need a fair chunk of room to also enhance its GP-GPU ability in single and double precision floating point math to be better than the HD 7970 in that area too.


Memory bandwidth-wise, given that both companies are limited to GDDR5 and it's close to its limit in terms of maximum clock speeds, the only way to do that is to add a 512-bit memory bus. That means adding 2 more memory controllers to the GPU die (8 in total, since each is usually 64 bits wide), which will also take up die space and need more pins on the GPU package and a more complex PCB. Even then, it only gives about a 33% improvement in memory bandwidth compared to the 384-bit bus on the HD 7970, with the GDDR5 memory running at the same clock speed on both cards...
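
A rough sketch of the bus-width arithmetic, assuming peak bandwidth is simply bus width in bytes times the effective data rate per pin. The 5.5 Gbps rate is an assumed HD 7970 reference figure, not something stated in the thread, and the helper name is made up for illustration.

```python
def memory_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: bytes moved per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_gbps


DATA_RATE_GBPS = 5.5        # assumed effective GDDR5 rate per pin
CONTROLLER_WIDTH_BITS = 64  # per-controller width, as stated in the post

for bus in (384, 512):
    controllers = bus // CONTROLLER_WIDTH_BITS
    bandwidth = memory_bandwidth_gbs(bus, DATA_RATE_GBPS)
    print(f"{bus}-bit bus: {controllers} controllers, {bandwidth:.0f} GB/s")

# Relative gain from widening the bus at the same memory clock.
gain = (memory_bandwidth_gbs(512, DATA_RATE_GBPS)
        / memory_bandwidth_gbs(384, DATA_RATE_GBPS) - 1.0)
print(f"gain from 384-bit to 512-bit: {gain:.0%}")
```

The gain depends only on the ratio of bus widths, so the assumed data rate drops out of the comparison.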

01-22-12, 10:51 PM   #114
ninelven
Registered User

Join Date: Jan 2003
Posts: 132
Re: next gen kepler to support dx 11.1, also take a year to rollout all cards

No offence, but it is fairly obvious you don't have a very clear technical understanding of the matter. Please answer the following questions:

Exactly, how much die space do the ROPs take on Fermi? What is the percentage of the total area of Fermi? How much die space do the ROPs take on Tahiti, and what percentage of the total area?

Exactly, how should the ROPs and texture units be "enhanced?"

Obviously, more pixel and texture fill would be great, but there is no evidence I can see that what Nvidia has now is inefficient.

01-23-12, 12:43 AM   #115
shadow001
Registered User

Join Date: Jul 2003
Posts: 1,526
Re: next gen kepler to support dx 11.1, also take a year to rollout all cards

Quote:
Originally Posted by ninelven
No offence, but it is fairly obvious you don't have a very clear technical understanding of the matter. Please answer the following questions:

Exactly, how much die space do the ROPs take on Fermi? What is the percentage of the total area of Fermi? How much die space do the ROPs take on Tahiti, and what percentage of the total area?

Exactly, how should the ROPs and texture units be "enhanced?"

Obviously, more pixel and texture fill would be great, but there is no evidence I can see that what Nvidia has now is inefficient.

Inefficient is a relative term that depends on what the competition has and what it can do in a given die size. Keep in mind that Tahiti is only a 365 mm² die at 28nm, and it's only using 32 ROPs like the previous-generation Cayman in the HD 6970, and the two GPUs aren't that far apart in clock speed (75 MHz). Yet Tahiti beats it by a mile on fillrate and texturing in those charts, so it's obvious that AMD made a lot of improvements to the back end of Tahiti and it wasn't just the shaders.


Fermi is a 530 mm² die at 40nm as we all know, and simply shrinking that core down to 28nm still yields a core of about 371 mm² (530 x 28/40) without adding anything new to it, making it roughly the same size as Tahiti on the HD 7970. Would a straight Fermi shrink to 28nm, clocked at the same speed as the Tahiti core in the HD 7970 and using the same type of memory at the same speed on the same 384-bit bus, be enough to beat it in raw fillrate and texturing speed? The short answer is no, going by what those charts suggest... especially the texturing one (ouch).
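
The 371 mm² figure matches scaling the die area by the node ratio (28/40). If both die dimensions shrank by that ratio, the area would instead scale by its square, and real shrinks typically land somewhere in between because not every structure scales equally. A sketch of both estimates, with hypothetical helper names:

```python
def shrink_by_node_ratio(area_mm2, old_nm, new_nm):
    """Estimate used in the post: area scaled by the node ratio itself."""
    return area_mm2 * new_nm / old_nm


def shrink_ideal(area_mm2, old_nm, new_nm):
    """Idealized estimate: both dimensions shrink, so area scales by the ratio squared."""
    return area_mm2 * (new_nm / old_nm) ** 2


fermi_area = 530.0   # mm² at 40nm (from the post)
tahiti_area = 365.0  # mm² at 28nm (from the post)

print(f"node-ratio estimate: {shrink_by_node_ratio(fermi_area, 40, 28):.0f} mm²")
print(f"ideal estimate:      {shrink_ideal(fermi_area, 40, 28):.0f} mm²")
print(f"Tahiti for scale:    {tahiti_area:.0f} mm²")
```

Neither number is a prediction for an actual chip; they only bracket how much room a shrunk Fermi might leave for new hardware.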


The single precision math of a Tahiti core is 3.7 teraflops and double precision is just about 1 teraflop even, while Fermi does 1.56 teraflops single precision and 650 gigaflops double precision, so simply shrinking the core to 28nm and clocking it another 200 MHz higher isn't enough to match Tahiti, never mind offer even more performance, which it has to.
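
As a sanity check on the shrink-and-overclock argument, theoretical peak throughput can be sketched as shader count times 2 operations per clock times shader clock. The shader counts, the 2x shader-to-core clock ratio on Fermi, and the 1/4 double precision rate on Tahiti are assumed reference specs rather than figures from the thread, and the results land close to, not exactly on, the numbers quoted above.

```python
def peak_gflops(shaders, shader_clock_mhz, flops_per_clock=2):
    """Theoretical peak: shaders x FLOPs per clock (2 for a multiply-add) x clock."""
    return shaders * flops_per_clock * shader_clock_mhz / 1000.0


# Assumed reference specs, not taken from the thread.
tahiti_sp = peak_gflops(2048, 925)   # HD 7970: 2048 shaders at 925 MHz
tahiti_dp = tahiti_sp / 4            # assumed 1/4-rate double precision
fermi_sp = peak_gflops(512, 2 * 772) # GTX 580: 512 shaders at twice the 772 MHz core clock

# Same Fermi configuration with the core clock raised by 200 MHz.
fermi_shrunk_sp = peak_gflops(512, 2 * (772 + 200))

print(f"Tahiti SP:         {tahiti_sp:.0f} GFLOPS")
print(f"Tahiti DP:         {tahiti_dp:.0f} GFLOPS")
print(f"Fermi SP:          {fermi_sp:.0f} GFLOPS")
print(f"Fermi +200 MHz SP: {fermi_shrunk_sp:.0f} GFLOPS")
```

Under these assumptions a higher clock alone does not close the gap, which is the point being made; it says nothing about how much real work each design gets out of its peak rate.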


Enhancing in this case simply comes down to doing more work per clock cycle, and Fermi needs that in every major area that affects both gaming and professional application performance. So whatever Kepler ends up being, it basically has to be something new from the ground up, not just an enhanced Fermi...

01-23-12, 01:04 AM   #116
ninelven
Registered User

Join Date: Jan 2003
Posts: 132
Re: next gen kepler to support dx 11.1, also take a year to rollout all cards

And yet... you didn't answer my questions. Quite frankly, it is because you can't.

The problem with Fermi relative to GCN is not texturing efficiency or ROPs efficiency, it is compute unit (shader) efficiency, but I digress...

Quote:
single precision math of a Tahiti core is 3.7 teraflops and double precision is just about 1 teraflop even, while Fermi does 1.56 teraflops single precision
So Tahiti has more than 2.3x the Flops and more than 2.3x the texture fill but only performs ~1.4x better than Fermi. That sounds pretty inefficient to me. /sarcasm
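
The ratio behind that remark can be written out directly. The teraflop figures are the ones quoted above and the ~1.4x gaming figure is the estimate from this post; the variable names are only for illustration.

```python
tahiti_sp_tflops = 3.7   # quoted single precision peak for Tahiti
fermi_sp_tflops = 1.56   # quoted single precision peak for Fermi
gaming_speedup = 1.4     # rough gaming advantage of Tahiti over Fermi

# How much more theoretical throughput Tahiti has on paper.
flops_ratio = tahiti_sp_tflops / fermi_sp_tflops

# Delivered gaming performance per theoretical FLOP, relative to Fermi (Fermi = 1.0).
relative_perf_per_flop = gaming_speedup / flops_ratio

print(f"FLOPS ratio:            {flops_ratio:.2f}x")
print(f"relative perf per FLOP: {relative_perf_per_flop:.2f}")
```

A value below 1.0 means Tahiti gets less gaming performance out of each theoretical FLOP than Fermi does, which is the sense in which the "inefficient" label cuts the other way.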

01-23-12, 01:27 AM   #117
shadow001
Registered User

Join Date: Jul 2003
Posts: 1,526
Re: next gen kepler to support dx 11.1, also take a year to rollout all cards

Quote:
Originally Posted by ninelven
And yet... you didn't answer my questions. Quite frankly, it is because you can't.

The problem with Fermi relative to GCN is not texturing efficiency or ROPs efficiency, it is compute unit (shader) efficiency, but I digress...

So Tahiti has more than 2.3x the Flops and more than 2.3x the texture fill but only performs ~1.4x better than Fermi. That sounds pretty inefficient to me. /sarcasm

We've only seen it in gaming, not computing performance though... And of course I can't answer how much die space each unit takes up, since only the engineers that designed it would know such details and the units' capabilities. Here's a picture of Fermi and just one shader block:





Look at the texture units in blue: as Nvidia adds more shader blocks they also add more texture units, since they're built in. The same goes for the tessellation hardware while I'm at it.




I wouldn't say it's the compute units exclusively, as the ROPs and texture units are decoupled from the shader blocks, which is something AMD has been doing for a while, unlike Nvidia, which automatically adds more texture units with each shader block... Here's one compute unit on Tahiti:





Here's the entire thing; the texture units are separate from the shader blocks, and so are major components of the graphics portion, such as tessellation.


01-23-12, 02:38 AM   #118
ninelven
Registered User

Join Date: Jan 2003
Posts: 132
Re: next gen kepler to support dx 11.1, also take a year to rollout all cards

I'm not the one who needs pictures. Either answer the questions or admit you have no actual evidence for your claims.


Exactly, how much die space do the ROPs take on Fermi? What is the percentage of the total area of Fermi? How much die space do the ROPs take on Tahiti, and what percentage of the total area?

Exactly, how should the ROPs and texture units be "enhanced?"

Quote:
Originally Posted by shadow001
unlike Nvidia which for each shader block you automatically add more texturing units...
No, just like nvidia. Texture units are tied to SIMD engines in GCN; it is clear you don't have a clue what you are talking about.

01-23-12, 12:48 PM   #119
Vardant

Join Date: Apr 2009
Location: EU
Posts: 1,041
Re: next gen kepler to support dx 11.1, also take a year to rollout all cards

Charlie says gk104 price is $299.

http://semiaccurate.com/2012/01/23/e...k104-price-is/

01-23-12, 01:18 PM   #120
Rollo

Join Date: Jul 2003
Posts: 1,719
Re: next gen kepler to support dx 11.1, also take a year to rollout all cards

Quote:
Originally Posted by Vardant
I can't remember the last time Charlie was right about something, but if he is about this apparently GK104 is a 560Ti replacement.

If it performs like a GTX580 or slightly better, has 2GB of VRAM, and an MSRP of $299 NVIDIA would sell TONS of them and totally devalue 7970s and 7950s.

Would be a pretty amazing turn of events if true, and great news for everyone except ATi.

But I can't remember the last time Charlie was right, so I'll take it with a 40# bag of softener salt.
__________________
Rig1:
intel 990X + 2 X EVGA 3GB GTX580 + 3 X Acer GD235Hz
3D Vision Surround

Rig 2:
intel 2500K + NVIDIA GTX590 + Dell 3007 WFPHC

NVIDIA Focus Group Member
NVIDIA Focus Group Members receive free software and/or hardware from NVIDIA from time to time to facilitate the evaluation of NVIDIA products. However, the opinions expressed are solely those of the Members.