View Full Version : How are polygons per second calculated?


intercede007
10-07-02, 05:58 PM
How does someone go about determining how many polygons per second a GPU can push out? Is it an industry standard thing? Basically what I'm trying to ask is, is it a pretty useless measure of performance or is it fairly indicative of real-world speed?

PenguinJim
10-08-02, 11:01 AM
I take a screenie and count them by hand. It takes some time, though

intercede007
10-08-02, 01:30 PM
Thanks, that was really informative.

Joel776
10-08-02, 01:32 PM
I don't think there is an industry standard for measuring polygon count -- marketing would have a conniption!

But, basically, you can write an app that draws a polygon, determine how long it took, and then scale that up to a second. Or, for a more real-world and perhaps more accurate test, you can draw a million polygons and scale that up to a second in the same way. It's pretty much like it sounds.
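A minimal Python sketch of that idea, assuming a hypothetical `draw_batch(n)` callable that submits n polygons and only returns once the GPU has really finished them (a real test needs the graphics API's own finish or flush call, which is not shown here). The clock is passed in so the arithmetic can be checked without any hardware.

```python
import time

def polygons_per_second(draw_batch, polygon_count, clock=time.perf_counter):
    """Time one batch of polygon_count polygons and scale the result to one second.

    draw_batch is a hypothetical callable: it must block until the GPU has
    actually finished drawing, otherwise only the submission cost is timed.
    """
    start = clock()
    draw_batch(polygon_count)
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("batch too small to time; draw more polygons")
    return polygon_count / elapsed
```

Timing a million polygons instead of one makes timer resolution and per-call overhead matter much less, which is why the larger test is likely the more accurate of the two.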

Uttar
10-08-02, 01:46 PM
The one program which is supposed to determine the number of tris a GPU can push is available at:
http://developer.nvidia.com/view.asp?IO=BenMark5

I wouldn't say it's an industry standard, but no one could say it isn't the most efficient program at doing that. And since ATI never did one themselves, I guess they're using it too, hehe.


Uttar

Chalnoth
10-08-02, 05:50 PM
You can't count the number of polygons per second a GPU can process.

A GPU's T&L speed is measured by the number of vertices per second, which is vastly different.

But, at the same time, it is also very similar.

That is, consider for a moment that pretty much any polygon used in a game will be a triangle. Sometimes quads are used, but those are always easy to break down into triangles, so it usually holds that the polys/sec number is equivalent to tris/sec.
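As a rough illustration of why quads don't change the count much, here is a small Python sketch that splits one quad into two triangles. It assumes the quad is convex and that its four corners are given in winding order; the function name is made up for this example.

```python
def quad_to_triangles(quad):
    """Split a quad (a, b, c, d), given in winding order, into two triangles.

    Both triangles keep the quad's winding order and share the diagonal a-c.
    """
    a, b, c, d = quad
    return [(a, b, c), (a, c, d)]
```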

Now, onto triangles per second. It is theoretically possible to build a mesh with about two times as many triangles as vertices, but in practice, an optimized mesh will always be built out of strips and fans. Strips and fans have the nice feature of making it very, very easy for the caches in the GPU to not discard vertex data that will be used soon after.

That is, think about a fan for a moment. A fan is a set of triangles that all share a common vertex, with the subsequent vertices counted out along the edge of the fan. This has the nice feature of each new triangle being specified by only a single new vertex, with the other two vertices of that triangle being shared with the previous triangle.
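The fan description can be sketched in a few lines of Python. This is only an illustration of the indexing, with a made-up function name; it expands a fan into the individual triangles the hardware would end up drawing.

```python
def fan_to_triangles(fan):
    """Expand a triangle fan into individual triangles.

    fan[0] is the shared hub vertex. After the first two vertices, every
    new vertex adds exactly one triangle: (hub, previous vertex, new vertex).
    """
    hub = fan[0]
    return [(hub, fan[i], fan[i + 1]) for i in range(1, len(fan) - 1)]
```

For a fan of five vertices this gives three triangles, and each triangle after the first reuses the hub and the previous edge vertex.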

What all this means is that for optimized meshes, the number of triangles per second is roughly equivalent to vertices per second.
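To put numbers on that, a short Python sketch counting how many triangles come out of a given number of submitted vertices. The layout names are just labels for this example.

```python
def triangles_from_vertices(vertex_count, layout):
    """Number of triangles produced by vertex_count submitted vertices."""
    if layout == "list":            # three fresh vertices per triangle
        return vertex_count // 3
    if layout in ("strip", "fan"):  # first triangle costs 3, each later one costs 1
        return max(vertex_count - 2, 0)
    raise ValueError(f"unknown layout: {layout}")

for n in (3, 30, 300):
    print(n, triangles_from_vertices(n, "list"), triangles_from_vertices(n, "strip"))
```

With 300 vertices a plain list gives 100 triangles while a strip or fan gives 298, so for long strips the triangle count and the vertex count are nearly the same.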

So, in the end, while polys/sec, tris/sec, and verts/sec are all very different measures, and verts/sec is the only valid one for GPU performance (the other two are more valid for measuring how optimal software is...), the three are often used interchangeably, because for most situations, they are going to be the same.

Uttar
10-09-02, 10:54 AM
I've never seen someone who uses polys/sec, and if someone does, he should go to jail ASAP :)

Back to the subject...

Yes, verts/s and tris/s are different.

You're talking about fans too. Where the heck did you see you should optimize into fans or strips?

The only optimal solution is optimizing in strips because fans disable any type of good batching, which kills performance ( it's most of the time a lot slower than triangle lists! )

So, you got to use strips to be optimal.

You're also talking about TnL performance. Really, that's an odd way to see it since we got VS and PS now and no more TnL unit.

VS performance should be measured according to the number of vertices/second, agreed. So you got to draw a LOT of small triangles.

PS performance should be measured using HUGE triangles ( preferably textured ) filling the whole screen, to make sure the VS isn't the bottleneck.
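A crude Python model of why triangle size decides which unit gets measured. Everything here is an assumption for illustration: one vertex per triangle (a well-stripped mesh), vertex and pixel work overlapping perfectly, and made-up throughput figures.

```python
def frame_time(triangles, pixels_per_triangle, verts_per_sec, pixels_per_sec):
    """Estimate the time for one frame and which stage limits it.

    Assumes one vertex per triangle (well-stripped mesh) and that vertex
    and pixel work overlap perfectly, so the slower stage sets the time.
    """
    vertex_time = triangles / verts_per_sec
    pixel_time = triangles * pixels_per_triangle / pixels_per_sec
    limit = "vertex" if vertex_time >= pixel_time else "pixel"
    return max(vertex_time, pixel_time), limit

# Hypothetical chip: 100 Mverts/s, 1 Gpixel/s.
print(frame_time(1_000_000, 1, 100e6, 1e9))   # a million one-pixel triangles
print(frame_time(2, 400_000, 100e6, 1e9))     # two screen-filling triangles
```

In this model the first case is limited by the vertex stage and the second by the pixel stage, which is the reason for testing each with the opposite kind of geometry.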


So BenMark is only useful to see the VS performance.
I know of no program measuring PS performance :(


Uttar

Chalnoth
10-09-02, 12:03 PM
Originally posted by Uttar
You're talking about fans too. Where the heck did you see you should optimize into fans or strips?

The only optimal solution is optimizing in strips because fans disable any type of good batching, which kills performance ( it's most of the time a lot slower than triangle lists! )

Where do you get this from? How do fans disable "any type of good batching?"

Anyway, I described fans because I figured they were easier to understand, though they are certainly less useful for building models than strips. Having said that, there are certainly some situations where fans would be beneficial (horns or other cone-like objects come to mind...where a fan would be good for at least part of the object).

If you used fans for all geometry, you would certainly have problems as most geometry isn't well-suited for using them. Is this what you're attempting to describe? Just looking at them fundamentally, I see absolutely nothing that should separate fans from strips, other than possible lack of hardware optimization for fans.

Anyway, I thought I remembered seeing a utility that optimized meshes into both tri strips and fans, but I can't seem to find it now (all the ones I found only optimize to tri strips), so maybe nobody bothers with fans.

You're also talking about TnL performance. Really, that's an odd way to see it since we got VS and PS now and no more TnL unit.

VS=TnL. It's not hard to see this. I was attempting to describe basic verts/sec performance as TnL because that assumes no complex vertex programs are being run. VS performance overall is much, much more complex. For example, let's say that a given piece of hardware has 4 parallel vertex shader pipelines (each capable of a 4-component dot product each clock, plus some other math...), and runs at 400MHz. That product should peak at 400Mverts/sec (though in reality lower due to other constraints). Now let's compare it to another product with the exact same specs, but one that can do some other ops (such as, say, sin/cos) in two clocks instead of the four the first product takes, while taking four clocks to the original's two for e^x calculations. Which of the two is faster then depends entirely on the vertex program being run.

More simply-stated, the more complex nature of vertex programs makes for many, many more possible combinations of the usage of the processors, meaning judging overall performance of the vertex pipelines is much more challenging.
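The two-product comparison can be worked through in Python. Only the sin/cos and e^x clock counts come from the example above; the op names, the two sample programs, and the reading that a plain transform costs four dot products (which is what makes the 400 Mverts/s figure come out) are assumptions for illustration.

```python
# Hypothetical per-op costs in clocks; only the sin/cos and e^x figures come from the post.
PRODUCT_A = {"dp4": 1, "sincos": 4, "exp": 2}
PRODUCT_B = {"dp4": 1, "sincos": 2, "exp": 4}

def program_clocks(op_counts, op_costs):
    """Clocks one vertex program takes, given how often each op runs and its cost."""
    return sum(count * op_costs[op] for op, count in op_counts.items())

def peak_verts_per_sec(pipelines, clock_hz, clocks_per_vertex):
    """Theoretical rate: each pipeline finishes one vertex every clocks_per_vertex clocks."""
    return pipelines * clock_hz / clocks_per_vertex

# Assumption: a plain transform is four 4-component dot products.
transform = {"dp4": 4}
wavy = {"dp4": 4, "sincos": 1}      # hypothetical program using sin/cos
falloff = {"dp4": 4, "exp": 1}      # hypothetical program using e^x

for name, program in (("transform", transform), ("wavy", wavy), ("falloff", falloff)):
    a = peak_verts_per_sec(4, 400e6, program_clocks(program, PRODUCT_A))
    b = peak_verts_per_sec(4, 400e6, program_clocks(program, PRODUCT_B))
    print(f"{name}: A = {a / 1e6:.0f} Mverts/s, B = {b / 1e6:.0f} Mverts/s")
```

Under these assumptions the two products tie on the plain transform, B wins on the sin/cos program and A wins on the e^x program, so a single verts/sec number cannot rank them.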

Uttar
10-09-02, 02:57 PM
http://developer.nvidia.com/docs/IO/1309/ATT/GDC2001_Basic_Mistakes.pdf

There, look for the "Failing to Batch" page.

Anyway, here's the quote:

"If you are considering using fans something has gone wrong"


Agreed TnL is VS, that's pretty much what I supposed, but it's still not the best term to use because there's also per-pixel lighting in DX7.


Uttar

intercede007
10-09-02, 05:35 PM
Hehe..dang guys, I asked for one thing and I got an entire lesson on the subject. Thanks fellas!

Chalnoth
10-09-02, 06:50 PM
Originally posted by Uttar
http://developer.nvidia.com/docs/IO/1309/ATT/GDC2001_Basic_Mistakes.pdf

There, look for the "Failing to Batch" page.

Anyway, here's the quote:

"If you are considering using fans something has gone wrong"

That really doesn't answer anything. For all I know, that's implementation-dependent (i.e. could be different in Direct3D and OpenGL...that paper was specifically written for DX8).

Do you have a more specific reason as to why fans are bad?

Note that I had, at one time, only considered strips as a decent way of optimizing meshes. I had heard that fans were also used, though, so this is why I'm asking this.

MikeC
10-09-02, 09:29 PM
Originally posted by Chalnoth
Having said that, there are certainly some situations where fans would be beneficial (horns or other cone-like objects come to mind...where a fan would be good for at least part of the object).


I just happened to be reading the OpenGL Redbook, which contains a comment in chapter 14 to use triangle fans for best performance when drawing a concave polygon. This is beyond my level of comprehension since I'm still in chapter 2 :)

http://www.ime.usp.br/~massaro/opengl/redbook/redbook-14.pdf

Chalnoth
10-09-02, 11:02 PM
I'm not sure you'd ever use anything quite like that method for any real-time 3D app, where all polygons should already be triangles or quads anyway, though it may be beneficial for 3D modelling applications (i.e. 3DSMax) or simulations (I'm not talking about the simulation genre of games here, btw...).

And, of course, it assumes that triangle fans are optimized...which is not the case in Direct3D according to the above pdf file. I suppose my question should be, then, are triangle fans optimized in nVidia hardware in OpenGL? Are they optimized for all hardware in OpenGL?

Uttar
10-10-02, 01:20 AM
Originally posted by Chalnoth
I'm not sure you'd ever use anything quite like that method for any real-time 3D app, where all polygons should already be triangles or quads anyway, though it may be beneficial for 3D modelling applications (i.e. 3DSMax) or simulations (I'm not talking about the simulation genre of games here, btw...).

And, of course, it assumes that triangle fans are optimized...which is not the case in Direct3D according to the above pdf file. I suppose my question should be, then, are triangle fans optimized in nVidia hardware in OpenGL? Are they optimized for all hardware in OpenGL?

According to ALL of nVidia's papers, fans are just as fast as strips. The only difference is that they are nearly impossible to batch.

There could be VERY specific cases where fans could be batched, but really, making special purposes things like that would make most programmers go mad :) They'd prefer to go strips for everything.

But i'm just saying that - maybe some games use fans in rare cases too. I'm just saying it's unlikely.

Chalnoth
10-10-02, 01:40 AM
Originally posted by Uttar
According to ALL of nVidia's papers, fans are just as fast as strips. The only difference is that they are nearly impossible to batch.

There could be VERY specific cases where fans could be batched, but really, making special purposes things like that would make most programmers go mad :) They'd prefer to go strips for everything.

But i'm just saying that - maybe some games use fans in rare cases too. I'm just saying it's unlikely.

And by batching I'm assuming that you mean loading one right after another in rendering? Yes, I would imagine that could be challenging, which means, to me, that it's only useful to have fans in situations where they are especially optimal, and strips are not (Assuming you have to break a batch whenever switching from fans to strips).

For example, in the cone case given before, you would either have to use single triangles, or a single fan. I would tend to think that a fan would be preferable.
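A small Python sketch of the cone case, purely as an illustration: it builds the vertex list for a single fan covering the side of a cone. The function name and default dimensions are made up.

```python
import math

def cone_fan(segments, radius=1.0, height=1.0):
    """Vertices of a triangle fan covering the side of a cone.

    The apex is the shared hub; the base ring follows, and its first
    point is repeated at the end so the last triangle closes the cone.
    """
    apex = (0.0, 0.0, height)
    ring = [
        (radius * math.cos(2 * math.pi * i / segments),
         radius * math.sin(2 * math.pi * i / segments),
         0.0)
        for i in range(segments)
    ]
    return [apex] + ring + [ring[0]]
```

A cone side with `segments` triangles needs `segments + 2` vertices as one fan, against `3 * segments` vertices as single triangles.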

As a side note, for, say, a "curved cone" like, if you'll imagine for a moment, a bull's horn, you could use triangle strips that have a beginning point at the tip of the horn and an ending point at the base, but for high tessellation, this would require far more strips than just using a single fan at the tip and a bunch of cylindrical strips.

At the same time, given the very small amount of geometry that is optimal for fans, it may not be all that great to have code that deals with it. Still, I don't see why it would be bad for auto-optimizing algorithms to put in fans, provided, of course, that no additional programming needs to go into supporting the fans.

Regardless, I still fail to realize why fans must break batching. There just doesn't seem to be any reason, to me, that fans have to be that much different from strips to cause stalls or be overall less efficient.

Uttar
10-11-02, 04:01 PM
Originally posted by Chalnoth

Regardless, I still fail to realize why fans must break batching. There just doesn't seem to be any reason, to me, that fans have to be that much different from strips to cause stalls or be overall less efficient.

Okay, let me try to explain that again.

If you use the same texture/shader/vertex buffer for a whole model ( which is very likely, in fact. Unless you want to do special effects with shaders, which again, could be done using dynamic branching with NV30 hardware ) , you can potentially do it in ONE DrawIndexedPrimitive call with strips. However, that isn't very useful, since beyond 500-1000 tris/call performance doesn't get much better.

So let's consider a goal of 500-1000 tris/DIP call and 3000 tris models.
With strips, that's easy. With fans, err, impossible.
Sure, you can do some things with it. But I couldn't imagine many cases where a fan would have 500 tris.

Really, using fans is suboptimal because you can't get to 500 tris in most cases.

But then you'll say "Yeah, and what's the difference? 1%?"
Err, actually, it's very significant.

If you only draw 100 tris in a DIP call, you only get 57% of the performance you can get with 500 tris. If you draw 200 tris, you still only get 78% of the perf of 500 tris.
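One way to make sense of those two percentages is a fixed cost per draw call. The Python sketch below assumes exactly that model, with the per-call overhead expressed as the time it takes to draw some number of triangles. The value of 116 is fitted to the two figures quoted here; it is not a measured or documented number.

```python
def relative_throughput(tris_per_call, reference_tris=500, overhead_tris=116):
    """Triangle rate at a given batch size, relative to the rate at reference_tris.

    Assumed model: each draw call costs a fixed overhead, expressed here as
    the time it takes to draw overhead_tris triangles. The value 116 is fitted
    to the two figures quoted in the post, not a measured or documented number.
    """
    def rate(n):
        return n / (overhead_tris + n)
    return rate(tris_per_call) / rate(reference_tris)

for n in (100, 200, 500, 1000):
    print(n, round(relative_throughput(n), 2))
```

With that single fitted constant the model gives 0.57 at 100 tris and 0.78 at 200 tris, so both quoted figures are at least consistent with a per-call overhead worth a bit over a hundred triangles.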


Did that explain my point? :)


Uttar

DaveW
10-11-02, 04:15 PM
How does someone go about determining how many Polygons per second a GPU can push out?

1)Write some random numbers on some scrap pieces of paper.

2)Get some guy from marketing to eat them.

3)Wait for one to come out the "other end".


Seriously, there is some math behind it, but it's very selective math. For example, Sony claimed the PS2 could do 60 million polys/sec. But in reality it didn't even have the memory to store that many vertices, never mind any textures.

Chalnoth
10-11-02, 06:44 PM
Originally posted by Uttar
So let's consider a goal of 500-1000 tris/DIP call and 3000 tris models.
With strips, that's easy. With fans, err, impossible.
Sure, you can do some things with it. But I couldn't imagine many cases where a fan would have 500 tris.

Really, using fans is suboptimal because you can't get to 500 tris in most cases.

Okay, but most models are going far in excess of 3000 tris nowadays.

I suppose what I'm saying is that while most geometry is certainly optimal for strips, some is very optimal for fans. But yes, you would need quite high triangle counts to see any possible need to use fans.

What I'm getting out of this is that there is really no good reason not to use fans, provided that the fans have enough triangles.