PDA

View Full Version : Confused about the performance of 8 and 9 series cards...


CaptNKILL
05-27-08, 01:19 PM
Ok this is driving me nuts. :p

First, here is a list of specifications for different 8 and 9 series cards.

8800 GS = 96 stream processors, 384MB 192-bit memory @ 1400-1600MHz, 550-680MHz core clock

9600 GSO = 96 stream processors, 384MB 192-bit memory @ 1900MHz, 650MHz core clock

9600 GT = 64 stream processors, 512MB 256-bit memory @ 1800MHz, 650MHz core clock

8800 GT = 112 stream processors, 512MB 256-bit memory @ 1800MHz, 600MHz core clock


My question is, how the hell can all of these cards perform within 10%-15% of each other in most benchmarks? On top of that, the 9600GT is a little faster than the GSO and 8800GS in most tests despite having a third fewer stream processors (64 vs. 96). Is the number of stream processors generally meaningless? It clearly is nothing like the number of pixel pipelines that we used to keep track of back in the day. The only thing that seems to matter is the amount of memory, since performance drops off significantly at higher resolutions on the cards with less memory.
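For what it's worth, one spec you *can* compare directly is memory bandwidth, which falls straight out of the numbers above. A rough sketch (the clocks are the effective rates listed in this post, so treat the results as approximate; real boards vary):

```python
# Memory bandwidth (GB/s) = (bus width in bits / 8 bytes) * effective memory clock (GHz).
# Bus widths and effective memory clocks are taken from the spec list above.

cards = {
    "8800 GS":  (192, 1600),   # (bus width in bits, effective memory MHz)
    "9600 GSO": (192, 1900),
    "9600 GT":  (256, 1800),
    "8800 GT":  (256, 1800),
}

for name, (bus_bits, mem_mhz) in cards.items():
    gb_per_s = (bus_bits / 8) * (mem_mhz / 1000)
    print(f"{name}: {gb_per_s:.1f} GB/s")
```

Which at least explains the high-resolution behavior: the two 192-bit cards have ~38-46 GB/s to work with, while both 256-bit cards get 57.6 GB/s.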

People have been talking about the terrible naming scheme for these cards for months, but I can't get over how convoluted the specs and real-world performance numbers are. Without looking at benchmark results there is really NO way of knowing how fast a given card is. The model numbers and specs mean absolutely nothing. :confused:

Soetdjuret
05-27-08, 01:49 PM
8800gs and 9600gso = the same

9600gt is better than gso, and 8800gt is the superior one. Simple!

CaptNKILL
05-27-08, 02:00 PM
http://www.nvnews.net/vbulletin/attachment.php?attachmentid=31618&stc=1&d=1211855280

No, it's not simple.

96 SP = slow

64 SP = fast

112 SP = faster sometimes

Why?

walterman
05-27-08, 02:03 PM
Everything depends on the game that you want to enjoy.

Generally speaking, a GPU with more SP units should help on games with complex shaders (Crysis), and a GPU with a monster memory bandwidth should help to 'sustain' all that shading power at high resolutions, with high level of AA/AF.

Also, you need to multiply the number of SP units by the shader domain frequency, then multiply that number by 2 (the G80/G92 can perform 2 ops per clock cycle: a MADD). The result is the number of FLOPS (the shading power of your GPU).
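walterman's formula, as a quick sketch. The shader clocks plugged in here are the commonly quoted stock values for these cards, not something from the spec list above, so treat the outputs as approximate:

```python
# FLOPS = number of SPs * shader domain clock * 2 (one MADD = multiply + add = 2 flops).

def gflops(sps, shader_mhz, ops_per_clock=2):
    """Peak shader throughput in GFLOPS."""
    return sps * shader_mhz * ops_per_clock / 1000.0

print(gflops(112, 1500))  # 8800 GT at its stock 1500MHz shader clock -> 336.0
print(gflops(64, 1625))   # 9600 GT at its stock 1625MHz shader clock -> 208.0
```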

Medion
05-27-08, 02:07 PM
9600GSO is a rebadged 8800GS.

9600GT (64 stream processors) compares about the same as the Radeon 3870 (320 stream processors) which is why I don't care about that number.

It's best to just look at it like this. 9600GT is current mid range, while 8800GT is last gen high end. So, you can expect somewhat similar performance (like a 7600GT and 6800Ultra).

8800GS is last gen mid range, and 9600GSO is current gen low-mid range, so that makes sense. So don't worry about the specs, so much as the placement. Top to bottom should be:

8800GT > 9600GT > 9600GSO = 8800GS

CaptNKILL
05-27-08, 02:13 PM
I know how they fit performance-wise, I'm just trying to make sense of why.

It used to be simple. More pixel pipelines = better performance. Now, we don't have pixel pipelines and they tell us how many shader processors a card has, and that only matters in certain games in certain situations, as walterman said.

It's just getting to be such a mess.

The 8800GS is actually faster than the 9600GT in Crysis because of the 96 SPs, but in any other game it's slower. Apparently it's because of the number of ROPs, TMUs etc... but those kinds of things are rarely advertised. If ever.

Wikipedia is the only place that keeps track of stuff like that.

http://en.wikipedia.org/wiki/Comparison_of_Nvidia_graphics_processing_units#GeForce_8_series

fivefeet8
05-27-08, 02:23 PM
Generally speaking, a GPU with more SP units should help on games with complex shaders (Crysis), and a GPU with a monster memory bandwidth should help to 'sustain' all that shading power at high resolutions, with high level of AA/AF.


What I've found is that my 8800gtx actually plays Crysis smoother than the 9800gtx I had. The reason is that I used very high texture and object detail settings (1680x1050/16xAF). Both options seem to fill up available memory quite fast, to the point that the 9800gtx, with less memory, started to slow down quite a bit compared to the 8800gtx, even though the 9800gtx has far more shader processing prowess. Both cards were overclocked. The 8800gtx was at 620/1450/950 and the 9800gtx was at 775/1950/1150.

An alternative example is that the 9800gtx played Assassin's Creed PC DX10 better than the 8800gtx, because the game's options don't bring a memory limitation into the equation even at their highest, and the game seems to be more shader bound. That allowed the 9800gtx to bring its shader prowess to bear over the 8800gtx.

3DBrad
05-27-08, 06:25 PM
9-series = 8-series with new stickers

Medion
05-27-08, 07:04 PM
9-series = 8-series with new stickers

Sort of. The 9600 is an all-new card. The "9900s" (GT280/260) are new cores. The rest is just rebadged 8 series.

SH64
05-28-08, 12:49 AM
http://www.nvnews.net/vbulletin/attachment.php?attachmentid=31618&stc=1&d=1211855280

No, it's not simple.

96 SP = slow

64 SP = fast

112 SP = faster sometimes

Why?
G92/9x has better texturing performance. (more texture address units)

Bman212121
05-28-08, 01:39 AM
Take the 9600GT out of the equation and it will make a lot more sense. The 9600GT is based on the G94 core, the rest are G92. All of the other cards fall in line like you would expect, but the 9600GT is different because it uses a type of texture compression to speed up the card at higher resolutions.

Thanks to design and process improvements, the stream processors on the GeForce 9600 GT operate at 1,625 MHz, a 20% improvement over the first-generation stream processors in the GeForce 8800 GTX. By using 64 stream processors at a higher clock, the GPU operates more efficiently and requires fewer transistors than a slower, wider design. To balance the improved shader performance, the texture engine is fitted with doubled addressing capabilities, allowing eight bilinear addresses to be calculated and eight texels filtered per cycle. The ROP sub-system has also been improved, with greater compression coverage at high resolutions. The combination of these architectural improvements means that the GeForce 9600 GT offers much better performance than a 64-stream processor version of the GeForce 8800 GTX. In real-world performance, a pair of GeForce 9600 GT cards often outperforms a GeForce 8800 GTX.


http://www.legitreviews.com/article/666/1/

Notice in the article that the stream processors are clocked higher on the 9600GT than they are on the 8800GT. The stream processors on the 8800GS and 9600GSO are both clocked lower than the 8800GT.

Stock:
9600GT = 1625MHz
8800GT = 1500MHz
8800GS/9600GSO = 1375MHz
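Plugging those stock shader clocks into the SP counts from the first post shows something interesting: the 9600GT actually has the *lowest* raw shader throughput of the bunch, so its strong benchmark showing has to come from elsewhere (the texture and ROP improvements quoted above). A rough calculation using only numbers from this thread:

```python
# Raw shader throughput = SPs * shader clock (MHz) * 2 ops (MADD), reported in GFLOPS.
# SP counts are from the first post; shader clocks are the stock values listed above.

lineup = [
    ("9600 GT",          64, 1625),
    ("8800 GT",         112, 1500),
    ("8800 GS/9600 GSO", 96, 1375),
]

for name, sps, shader_mhz in lineup:
    print(f"{name}: {sps * shader_mhz * 2 / 1000} GFLOPS")
```

So on paper the 9600GT trails both the GSO (264 GFLOPS) and the 8800GT (336 GFLOPS) with only 208 GFLOPS, yet it still wins most benchmarks against the GSO.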

SH64
05-28-08, 01:47 AM
Hmmm, so there are even some improvements in the ROPs! Thanks for the info Bman!

Bman212121
05-28-08, 01:53 AM
np ;)

MUYA
05-28-08, 01:58 AM
Hmmm so there is even some improvements in the ROP's! thanks for the info Bman!
Yup, there was a review which highlighted this, as there was less of a penalty when AA was turned on on G9x's than on previous G8x's... all hinting at improved ROPs.

jAkUp
05-28-08, 02:08 AM
Ok this is driving me nuts. :p

First, here is a list of specifications for different 8 and 9 series cards.

8800 GS = 96 stream processors, 384MB 192-bit memory @ 1400-1600MHz, 550-680MHz core clock

9600 GSO = 96 stream processors, 384MB 192-bit memory @ 1900MHz, 650MHz core clock

9600 GT = 64 stream processors, 512MB 256-bit memory @ 1800MHz, 650MHz core clock

8800 GT = 112 stream processors, 512MB 256-bit memory @ 1800MHz, 600MHz core clock


My question is, how the hell can all of these cards perform within 10%-15% of each other in most benchmarks? On top of that, the 9600GT is a little faster than the GSO and 8800GS in most tests despite having a third fewer stream processors (64 vs. 96). Is the number of stream processors generally meaningless? It clearly is nothing like the number of pixel pipelines that we used to keep track of back in the day. The only thing that seems to matter is the amount of memory, since performance drops off significantly at higher resolutions on the cards with less memory.

People have been talking about the terrible naming scheme for these cards for months, but I can't get over how convoluted the specs and real world performance numbers are. Without looking at benchmark results there is really NO way of knowing how fast a given card is. The model numbers and specs mean absolutely nothing. :confused:

You didn't include the shader clock in there; in most cases, the shader clock is second only to the core clock in terms of performance impact.

k0py
05-28-08, 02:16 AM
G92/9x has better texturing performance. (more texture address units)
I don't think that's right:
8800gtx=24ROP's
8800GTS640(g80)=20ROP's
8800GTS(g92)=16ROP's
8800GT=16ROP's
8800GS=12ROP's
9600GT=16ROP's
9800GTX=16ROP's
Aren't the ROPs what matter when it comes to texture performance?

SH64
05-28-08, 02:26 AM
I don't think that's right:
8800gtx=24ROP's
8800GTS640(g80)=20ROP's
8800GTS(g92)=16ROP's
8800GT=16ROP's
8800GS=12ROP's
9600GT=16ROP's
9800GTX=16ROP's
Aren't the ROPs what matter when it comes to texture performance?
Did you read the quote on this post (http://www.nvnews.net/vbulletin/showpost.php?p=1665889&postcount=11) ??
To balance the improved shader performance, the texture engine is fitted with doubled addressing capabilities, allowing eight bilinear addresses to be calculated and eight texels filtered per cycle.

Basically the G9x has double the TAUs of the G8x, which translates into better texturing performance (i.e. when it's needed)
http://www.nvnews.net/reviews/evga_geforce_9800_gtx/index.shtml

walterman
05-28-08, 03:36 AM
The ROPs (raster operators) convert the shaded and textured fragments that come out of the SP clusters into pixels.

The TA (texture addressing) units & TF (texture filtering) units read & filter texels from the textures. Putting more TA/TF units in a chip won't help when you are bandwidth bound. If you are working with a high level of AF, you need to sample a lot of texels from the mipmaps of your textures (using your TAs), and then filter them (using your TFs), so you need memory bandwidth to read all these texels. If you turn on the trilinear optimizations (brilinear), you are reducing the area in which the mipmaps are filtered trilinearly (reducing the number of texels that you need to read & filter), so you are saving memory bandwidth. Reducing the AF level also reduces the number of texels to read & filter.

Of course you can use compression for the textures, Z/color buffers (if ROPs support this), ... but the point of all this compression, is to save 'memory bandwidth', cause this is what you need to run games at high resolution with high level of AA/AF. So, do not let the gpu makers fool you.
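A back-of-the-envelope sketch of why AF is so bandwidth-hungry, under loudly stated assumptions (16 anisotropic probes per pixel at 16xAF, 2 mip levels for trilinear, 4 bilinear taps per probe, uncompressed 32-bit texels, one textured layer, and zero cache reuse; real hardware caches and compresses aggressively, so this is only a worst-case ceiling):

```python
# Worst-case texel traffic per second for a fully AF-filtered frame.
# Every figure below is an illustrative assumption, not a measured number.

pixels = 1920 * 1200               # one frame at a high resolution
texels_per_pixel = 16 * 2 * 4      # AF probes * mip levels * bilinear taps = 128
bytes_per_texel = 4                # uncompressed 32-bit texture
frames_per_second = 60

gb_per_s = pixels * texels_per_pixel * bytes_per_texel * frames_per_second / 1e9
print(f"worst-case texel traffic: {gb_per_s:.0f} GB/s")
```

That comes out to roughly 71 GB/s, more than any card in this thread actually has, which is exactly why texture caches, brilinear, and compression matter so much.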

P.S.: Do not use the R600 as an example of useless monster memory bandwidth, cause that chip had 16 TA units & the G80 had 32, plus other big flaws in the architecture that break the performance of the chip.

CaptNKILL
05-28-08, 09:28 AM
Heh, it definitely isn't simple anymore.

I wish there were some sort of ratings they could put on cards for shader and texture performance based on their specs.

It's just getting harder and harder for people to find out which cards are best without having to take a 9-month college course on graphics cards.

jAkUp
05-28-08, 12:24 PM
Heh, it definitely isn't simple anymore.

I wish there were some sort of ratings they could put on cards for shader and texture performance based on their specs.

It's just getting harder and harder for people to find out which cards are best without having to take a 9-month college course on graphics cards.

The easiest way is just to read reviews. There are plenty out there.

CaptNKILL
05-28-08, 12:38 PM
The easiest way is just to read reviews. There are plenty out there.
I know that; I'm just saying that looking at boxes or the specs listed in online stores should net some sort of valuable information when researching a graphics card.

It used to be that if the model numbers made no sense you could just look at the specs listed on the box. Now, we have to know ROPs, TA/TF units, pixel and multitexture fillrate, memory bandwidth, shader processors and shader clock speed, AND whether a given card is really 1GB or just two 512MB GPUs stuck together. And most of that is rarely shown in any store or on a retail box.

It's a mess, and it'd be nice if the card makers at least attempted to clarify it a bit for consumers rather than rely exclusively on fan sites to sell their hardware.

walterman
05-28-08, 03:04 PM
Then, read the reviews & buy the card that runs your game faster :)