View Full Version : NV18 and NV28 names revealed


DadGT
08-28-02, 01:30 PM
Originally posted by Bigus Dickus

As to the single-texturing efficiency of the 9700, I have wondered about that as well. You noted that the 9700 is roughly equal to the GF4 per clock in dual texturing. It could be the case that the 9700 is very efficient at multitexturing - relative to its single texturing - due to the (previously discussed) efficiency of the 8 X 1 architecture, where the GF4 isn't as efficient at dual texturing - relative to its single texturing - because it's difficult to keep both TMUs working constantly. Perhaps the GF4 suffers from some passes which use only one TMU for some reason. Just a hypothesis.


I don't really understand your point. My point is this: take the clock rate of the R300, 325 MHz, multiply it by the dual-texture rate (8 texels/clock), and you get 2600 MTex/sec, close to the published 2527 MTex/sec on Tom's. Take the GF4 clock rate, 300 MHz, multiply it by the dual-texture rate (8), and you get 2400 MTex/sec, close to the 2325 MTex/sec published on Tom's. That means both designs are very efficient in these simple dual-texture cases.

Then repeat using the single-texture rates. For the GF4, you get a calculated 1200 MTex/sec and a measured value of 1072 MTex/sec. The card does about 90% of the "theoretical" value. Good efficiency. For the R300, you get a calculated 2600 MTex/sec while the measured value is 1766 MTex/sec. That's only 68% of the "theoretical" value. That indicates that the R300 is somehow seriously underperforming in that case. That's the real discrepancy I was trying to point out.
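
For anyone who wants to redo the arithmetic above, here is a minimal Python sketch. The clock speeds and measured numbers are the ones quoted in the post; the texels-per-clock values are simply what the post's calculated figures imply (for example 1200 / 300 = 4 for the GF4 in single texturing). The function and variable names are made up for illustration.

```python
# Theoretical vs. measured texel fill rate, using the figures quoted in the post.
# Helper names are illustrative only.

def theoretical_mtexels(clock_mhz, texels_per_clock):
    """Peak texel rate in MTexels/s: core clock (MHz) times texels per clock."""
    return clock_mhz * texels_per_clock


def efficiency(measured, theoretical):
    """Measured rate as a fraction of the theoretical peak."""
    return measured / theoretical


# (card, test) -> (clock in MHz, texels per clock, measured MTexels/s)
cases = {
    ("R300", "dual"): (325, 8, 2527),
    ("GF4", "dual"): (300, 8, 2325),
    ("R300", "single"): (325, 8, 1766),
    ("GF4", "single"): (300, 4, 1072),
}

for (card, test), (clock, texels_per_clock, measured) in cases.items():
    peak = theoretical_mtexels(clock, texels_per_clock)
    print(f"{card:5s} {test:6s} peak={peak:5d} measured={measured:5d} "
          f"efficiency={efficiency(measured, peak):.0%}")
```

Both dual-texture cases land within a few percent of peak, while the R300 single-texture case comes out near 68%, which is the gap being pointed out.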

Sgt. Slaughter
08-28-02, 02:27 PM
I told my friend about AA yesterday. He has a GF2 Ultra and didn't even know it was there. Since he plays mostly flight sims, I smacked him in the head and turned it on.

-=DVS=-
08-28-02, 03:17 PM
Originally posted by Sgt. Slaughter
I told my friend about AA yesterday. He has a GF2 Ultra and didn't even know it was there. Since he plays mostly flight sims, I smacked him in the head and turned it on.

ROFL :D Does the GF2 Ultra run well with AA on in flight sims?

Nv40
08-28-02, 03:38 PM
Originally posted by DadGT


I don't really understand your point. My point is this: take the clock rate of the R300, 325 MHz, multiply it by the dual-texture rate (8 texels/clock), and you get 2600 MTex/sec, close to the published 2527 MTex/sec on Tom's. Take the GF4 clock rate, 300 MHz, multiply it by the dual-texture rate (8), and you get 2400 MTex/sec, close to the 2325 MTex/sec published on Tom's. That means both designs are very efficient in these simple dual-texture cases.

Then repeat using the single-texture rates. For the GF4, you get a calculated 1200 MTex/sec and a measured value of 1072 MTex/sec. The card does about 90% of the "theoretical" value. Good efficiency. For the R300, you get a calculated 2600 MTex/sec while the measured value is 1766 MTex/sec. That's only 68% of the "theoretical" value. That indicates that the R300 is somehow seriously underperforming in that case. That's the real discrepancy I was trying to point out.


Amen. That's a nice way to put an end to Mr. Bigus's theories... :)
The 3DMark2001 multitexturing benchmarks clearly show this difference: Radeon 9700 and GeForce4 performance is nearly the same, even though the R300 is a 110-million-transistor card.

This has been discussed many times: the single TMU of the Radeon 9700 is holding back its real performance in most of today's games, as the multitexture benchmarks show.

It is 5x faster than the GeForce4 with high AA+AF settings enabled, where huge bandwidth is needed, but only 5-10% faster :( when multitexturing power matters ;)

Bigus Dickus
08-28-02, 04:26 PM
Originally posted by DadGT


I don't really understand your point. My point is this: take the clock rate of the R300, 325 MHz, multiply it by the dual-texture rate (8 texels/clock), and you get 2600 MTex/sec, close to the published 2527 MTex/sec on Tom's. Take the GF4 clock rate, 300 MHz, multiply it by the dual-texture rate (8), and you get 2400 MTex/sec, close to the 2325 MTex/sec published on Tom's. That means both designs are very efficient in these simple dual-texture cases.

Then repeat using the single-texture rates. For the GF4, you get a calculated 1200 MTex/sec and a measured value of 1072 MTex/sec. The card does about 90% of the "theoretical" value. Good efficiency. For the R300, you get a calculated 2600 MTex/sec while the measured value is 1766 MTex/sec. That's only 68% of the "theoretical" value. That indicates that the R300 is somehow seriously underperforming in that case. That's the real discrepancy I was trying to point out.

I hadn't looked closely at the numbers for theoretical vs. measured fillrate performance. I wasn't sure if the 9700 was performing low in single texturing, or if the GF4 wasn't performing well in dual texturing. I think I'd have to agree with you.

I'm not sure if it's driver immaturity, or simply a design decision in hardware that causes this (such as... single texturing appears low, but the performance hit when enabling AA and/or AF is small because the "inefficient" architecture is actually set up for these IQ options... i.e., whatever logical units might be assisting with AF and trilinear filtering performance may also be somewhat bottlenecking the texturing performance when none of those options is used). Dunno... just another hypothesis.

Bigus Dickus
08-28-02, 04:32 PM
Originally posted by Nv40



Amen. That's a nice way to put an end to Mr. Bigus's theories... :)

:rolleyes:

Ra ra ree... I hate cheerleaders. Of course, he didn't address anything but my "hypothesis" which was more of an uninformed guess as to the cause of the behavior he mentioned. He didn't address any of the other "theories" (I'd call them explanations) of mine on single vs. multiple TMU architectures, presumably because he agrees.

I also noticed that you had nothing to say about my "theory" that the 9700 does in fact have the same number of TMUs as the GF4, again presumably because you agree. And as for my other comments about your "opinion" (I prefer to call it fantasy), I think it's clear why you didn't respond.

The 3DMark2001 multitexturing benchmarks clearly show this difference: Radeon 9700 and GeForce4 performance is nearly the same, even though the R300 is a 110-million-transistor card.

This has been discussed many times: the single TMU of the Radeon 9700 is holding back its real performance in most of today's games, as the multitexture benchmarks show.

Holding it back relative to what? Relative to some hypothetical, as yet nonexistent product? It blows away anything else on the market; what more do you want?

It is 5x faster than the GeForce4 with high AA+AF settings enabled, where huge bandwidth is needed, but only 5-10% faster :( when multitexturing power matters ;)

What about situations when multitexturing matters and AA + AF are enabled? Yes, it blows away the GF4 there. In fact, there's not a single example that I could find where the 9700 appears to be fillrate limited (as opposed to CPU limited) where the GF4 isn't also seriously (much more so than the 9700) fillrate limited.

DadGT
08-28-02, 05:09 PM
Originally posted by Bigus Dickus


I hadn't looked closely at the numbers for theoretical vs. measured fillrate performance. I wasn't sure if the 9700 was performing low in single texturing, or if the GF4 wasn't performing well in dual texturing. I think I'd have to agree with you.

I'm not sure if it's driver immaturity, or simply a design decision in hardware that causes this (such as... single texturing appears low, but the performance hit when enabling AA and/or AF is small because the "inefficient" architecture is actually set up for these IQ options... i.e., whatever logical units might be assisting with AF and trilinear filtering performance may also be somewhat bottlenecking the texturing performance when none of those options is used). Dunno... just another hypothesis.

That's pretty much exactly what I was pointing at. At any rate, if either single or multi texture has to be less efficient, I would lean toward single, as ATI appears to have done. The fact that multi-texture is as close to theoretical as it is is a good sign that it's not a key issue in performance.

As to why it is lower, my guess would be less emphasis on optimization than for multi texturing. Whether it can be changed in software or if hardware is to blame I have no idea. Again, since games generally use multi-texture, in a limited amount of time, you spend it where it will have the most payback.

StealthHawk
08-28-02, 08:35 PM
Well, very true, but how many of those people went out and paid more than $250 for the card? Why is that important? Well, because anyone who spends that amount of money usually knows the hardware and knows about its features. Those who get a lower-priced version aren't really keen on features, as they want price. High-end users want IQ/features. Different market, different needs/wants.

actually i know two people who bought the same P4 systems with GF3s last year who were both Counter-Strike nuts. i went over to the more computer nerdy guy's house, he had no FSAA on, no AF, and was actually playing CS with bilinear filtering :eek:

and these are people who had their clan matches and OGL matches and what not.

of course this was when the GF3 was $400. hell, when i got my GF3 i didn't even know what AF was, or that the GF3 even supported it until it was on the front page of 3dgpu. there were 0 options in the control panel for AF at this time, of course.

Nv40
08-28-02, 10:03 PM
Originally posted by Bigus Dickus
:rolleyes:


I also noticed that you had nothing to say about my "theory" that the 9700 does in fact have the same number of TMUs as the GF4, again presumably because you agree. And as for my other comments about your "opinion" (I prefer to call it fantasy), I think it's clear why you didn't respond.





quote...
" The multitexturing test (two textures per object, for example, when using light maps) shows that ATi's "eight pipe pixel engine" also has disadvantages, as each pipe can only process a single texture per clock cycle. The GeForce4's pipelines can process two textures each, and thus the performance of the two cards is almost identical. Unfortunately for ATi, most current games almost exclusively use multitexturing environments. "

http://www17.tomshardware.com/graphic/02q3/020819/radeon9700-15.html


The Radeon 9700 has only a single texturing unit, believe me... ;)
It will be interesting to see how differently Nvidia's NV30 does, which is rumored to have 8 pipes x 2 TMUs with over 30 GB/s of memory bandwidth... ;)

Bigus Dickus
08-28-02, 11:00 PM
Originally posted by Nv40




quote...
" The multitexturing test (two textures per object, for example, when using light maps) shows that ATi's "eight pipe pixel engine" also has disadvantages, as each pipe can only process a single texture per clock cycle. The GeForce4's pipelines can process two textures each, and thus the performance of the two cards is almost identical. Unfortunately for ATi, most current games almost exclusively use multitexturing environments. "

That's all well and good, but synthetic benchmarks are synthetic benchmarks. Too bad gaming performance doesn't reflect anything remotely like "performance of the two cards is almost identical." ;)

The Radeon 9700 has only a single texturing unit, believe me... ;)

What part of 8 x 1 = 4 x 2 do you not comprehend? Jesus, I said the two cards have the same number of TMUs, and they do. It's a fact, get over it. :rolleyes:

Nv40
08-29-02, 01:25 AM
Originally posted by Bigus Dickus
That's all well and good, but synthetic benchmarks are synthetic benchmarks. Too bad gaming performance doesn't reflect anything remotely like "performance of the two cards is almost identical." ;)
What part of 8 x 1 = 4 x 2 do you not comprehend? Jesus, I said the two cards have the same number of TMUs, and they do. It's a fact, get over it. :rolleyes:

The Radeon 9700 has only one texture unit per pipe: 8 x 1 = 8, basic math of course. The GeForce4 has two texture units per pipe, which is 4 pipes x 2 TMUs = 8. Right, both have the same power in multitexture games and should have the same performance in multitexture scenarios. In fact, that's why the Radeon 9700 and GeForce4 have very close performance in some games; the 15-30% differences come from the higher clock speed of the Radeon 9700 and its 20 GB/s of memory bandwidth.

The point is that the NV30 will have two texture units on each of its 8 pipes, which in basic math is 8 x 2 = 16 TMUs, versus 8 x 1 = 8 TMUs for the Radeon 9700... ;)

In theory, the NV30 should be able to score nearly 2x the performance of the Radeon 9700 in multitextured games at resolutions below 1600x1200, and even more than 2x with AA+AF on, if it is true that the NV30 has up to 48 GB/s of memory bandwidth, which is twice that of the R300. :)

My other point is that it is possible to see an NV28 (GeForce4 Ultra) score better than the Radeon 9700 with AA+AF off in Quake 3 engine games, if it is clocked even higher and gets a couple of new memory bandwidth-saving techniques... ;)

It is all basic 3D math theory, but I know that we will need to wait for real game benchmarks to see how close the NV30 performs to its hardware specs... but as Anand's has said, on paper the NV30 will be faster than the R300 ;)

PreservedSwine
08-29-02, 08:33 AM
In theory, the NV30 should be able to score nearly 2x the performance of the Radeon 9700 in multitextured games at resolutions below 1600x1200, and even more than 2x with AA+AF on, if it is true that the NV30 has up to 48 GB/s of memory bandwidth, which is twice that of the R300.

Yeah, I'm sure the NV30 will have 48 GB/s of readily available bandwidth - LOL

So, the NV30 will have 2 TMUs per pipe, just like the Parhelia. Big deal, you can see how that card is a turd compared to the R9700. There is much more to engineering design than simply adding a second TMU and claiming twice the performance. Any more TMUs on the R9700 wouldn't offer any significant performance advantage, only add to the cost. But we'll see.

If the bandwidth is available to take advantage, it should offer some improvement. I guess we'll see next year:)

Bigus Dickus
08-29-02, 09:01 AM
Originally posted by Nv40
It is all basic 3D math theory

Why do I get the feeling this isn't all very basic to you? 2x TMUs != 2x performance. The explanation is already in this thread if you care to ponder it.

Bah, no use arguing against you. You simply don't listen to or comprehend rational arguments. All you care about is "TeH NV30 0wnZ Radon3 Kikks its ASSZ!$!".
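
To put rough numbers on the "2x TMUs != 2x performance" point, here is a small Python sketch of a deliberately simplified per-clock model. The assumption (mine, not taken from any vendor documentation) is that a pipe needs ceil(textures / TMUs per pipe) clocks to finish a pixel and that nothing else, such as memory bandwidth, the CPU, or drivers, gets in the way. The 8 x 2 configuration is only the rumored NV30 layout mentioned in this thread, and the function name is made up.

```python
import math


def per_clock_rates(pipes, tmus_per_pipe, textures_per_pixel):
    """Return (pixels/clock, texels/clock) under a simplified model.

    Assumption: each pipe needs ceil(textures / TMUs) clocks per pixel,
    and no other bottleneck (bandwidth, CPU, drivers) applies.
    """
    clocks_per_pixel = math.ceil(textures_per_pixel / tmus_per_pipe)
    pixels = pipes / clocks_per_pixel
    return pixels, pixels * textures_per_pixel


# name -> (pipes, TMUs per pipe); the 8 x 2 entry is the rumored layout only
configs = {
    "8 x 1 (9700)": (8, 1),
    "4 x 2 (GF4)": (4, 2),
    "8 x 2 (rumored NV30)": (8, 2),
}

for name, (pipes, tmus) in configs.items():
    for textures in (1, 2, 3):
        pixels, texels = per_clock_rates(pipes, tmus, textures)
        print(f"{name:22s} textures={textures} "
              f"pixels/clk={pixels:.2f} texels/clk={texels:.2f}")
```

In this toy model the 8 x 2 layout is no faster per clock than 8 x 1 with one texture per pixel, twice as fast with two, and 1.5x with three, so the gain from the second TMU depends entirely on how many textures are applied per pixel, and that is before bandwidth enters the picture.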

Jandar
08-29-02, 09:47 AM
Originally posted by DadGT


That's pretty much exactly what I was pointing at. At any rate, if either single or multi texture has to be less efficient, I would lean toward single, as ATI appears to have done. The fact that multi-texture is as close to theoretical as it is is a good sign that it's not a key issue in performance.

As to why it is lower, my guess would be less emphasis on optimization than for multi texturing. Whether it can be changed in software or if hardware is to blame I have no idea. Again, since games generally use multi-texture, in a limited amount of time, you spend it where it will have the most payback.

early drivers...

here's my 8500, two tests, same system different drivers:

6052 drivers:
http://service.madonion.com/servlet/Index?pageid=/orb/projectdetails&projectType=6&projectId=3165470
753.1 MTexels/s

6071 drivers:
http://service.madonion.com/servlet/Index?pageid=/orb/projectdetails&projectType=6&projectId=3443583
839.3 MTexels/s

for reference, here's tom's test when the 8500 first came out:
http://www17.tomshardware.com/graphic/01q3/010814/radeon8500-12.html#3d_mark_2001
666.6 MTexels/s


Pretty similar machines, although his was a P4 1.7 with PC800 versus my 1.67 GHz @ 1.75 GHz.

using your math, the Radeon 8500 has a theoretical single-texture fillrate of 1100 MTexels/s.
makes the 6071 drivers 76% efficient.
makes the 6052 drivers 68% efficient.
and Tom's review was 60%

whereas they were pretty damn close to max on multitexturing.
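
The same percentages can be reproduced with a few lines of Python. This is just a sketch of the arithmetic above: the 1100 MTexels/s peak and the three measured scores are the ones given in the post, and the names are made up for illustration.

```python
# Single-texture fill rate of the Radeon 8500 across driver versions,
# as a percentage of the theoretical peak used in the post.
THEORETICAL_SINGLE = 1100  # MTexels/s, figure taken from the post

measured = {
    "Tom's review": 666.6,
    "6052 drivers": 753.1,
    "6071 drivers": 839.3,
}


def percent_of_peak(value, peak=THEORETICAL_SINGLE):
    """Measured fill rate as a percentage of the theoretical peak."""
    return 100 * value / peak


for label, value in measured.items():
    print(f"{label:13s} {value:6.1f} MTexels/s  "
          f"{percent_of_peak(value):.0f}% of peak")

# Relative gain from the 6052 drivers to the 6071 drivers on the same system
gain = measured["6071 drivers"] / measured["6052 drivers"] - 1
print(f"6052 -> 6071: {gain:.1%} faster")
```

On the same machine, the driver update alone accounts for roughly an 11% gain in the single-texture score, which fits the "early drivers" explanation.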

Seems like ATI has been shooting for multi-texture performance for a while. My Radeon 64DDR was about 75% efficient in single, the Radeon VE was about 55% efficient. Granted, drivers change performance.

I would go with the idea that they streamline more for multi texturing and future drivers should definitely show improvement in that area.

sebazve
08-30-02, 12:00 PM
When games become more pixel/vertex shader dependent the need for the extra TMU is not going to be important.

jbirney
08-30-02, 12:43 PM
Nv40,

The ATI R9000 also only has one TMU per pipe. In some games it's a disadvantage. But in other games like SS:SE it's within a few FPS of the 8500, which has two. And as we said before, as games shift to using pixel shaders, the need for a second TMU decreases. So yes, it's a flaw. Is it a big one? Probably not.