Can anyone with a NV40 help me out with this?


991060
06-06-04, 06:11 AM
Hi guys:
I need to run a few tests of the NV40's z-fillrate, but I don't have one on hand, so I hope you guys can help me out. I have a program here: ftp://fillrate:fillrate@219.237.118.29/FillrateBenchmark_V092_b.rar
All you need to do is run the z-fillrate test under these conditions (make sure to restart the program every time you switch AA mode):
1. disable AA
2. enable 2xAA
3. enable 4xAA

Please post the results in a reply, and thanks a lot.

MUYA
06-06-04, 06:27 AM
Mike and Jaks may be able to help.

mikechai
06-06-04, 06:31 AM
Sgt_Pitt has one too :)

991060
06-06-04, 06:50 AM
I just got this data from rage3d. It seems the R420 can achieve about twice the z-fillrate when AA is enabled.

No AA
Z Fill : 5146.411 M-Pixel/s
Color + Z Fill : 3062.681 M-Pixel/s

2xAA
Z Fill : 4648.127 M-Pixel/s
Color + Z Fill : 2504 M-Pixel/s

4xAA
Z Fill : 2471.284 M-Pixel/s
Color + Z Fill : 2290.09 M-Pixel/s

6xAA
Z Fill : 1275.907 M-Pixel/s
Color + Z Fill : 1336.305 M-Pixel/s

Note: you need to multiply the Z Fill number by the number of AA samples to get the real z-fillrate.
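The multiplication can be sketched in a few lines of Python. This is only an illustration using the R420 numbers quoted above; the function name `effective_z_rate` and treating "No AA" as one sample per pixel are my own assumptions, not anything the benchmark itself reports.

```python
# Z Fill readings quoted above (M-Pixel/s), keyed by AA sample count.
# Assumption: "No AA" is treated as 1 sample per pixel.
r420_z_fill = {1: 5146.411, 2: 4648.127, 4: 2471.284, 6: 1275.907}


def effective_z_rate(mpixels_per_s, samples):
    """Depth samples written per second, in M-samples/s (hypothetical helper)."""
    return mpixels_per_s * samples


baseline = r420_z_fill[1]
for samples, rate in r420_z_fill.items():
    eff = effective_z_rate(rate, samples)
    # Ratio against the no-AA reading shows how close the card gets to "2x".
    print(f"{samples}x: {eff:9.1f} M-samples/s ({eff / baseline:.2f}x the no-AA figure)")
```

Read this way, the 2xAA and 4xAA rows come out close to twice the no-AA figure, which is where the "2 times" remark comes from.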

Sgt_Pitt
06-06-04, 07:20 AM
Benchmark Date/Time : 6/06/2004 10:19:55 PM

System Information
-----------------------------------------------------------
CPU : Intel(R) Pentium(R) 4 CPU 1.80GHz
GFX : NVIDIA GeForce 6800 Ultra
OS : Microsoft Windows XP
Settings : 1024x768 32 bits D16 No AA

Benchmark Result
-----------------------------------------------------------
FrameBuffer Clear : 10035.2 FPS
Color Fill : 6185.76 M-Pixel/s
Z Fill : 11978.93 M-Pixel/s
Color + Z Fill : 4328.521 M-Pixel/s
Single Texture : 6049.864 M-Pixel/s
Single Texture Alpha Blend : 3092.88 M-Pixel/s
Dual Textures : 3145.728 M-Pixel/s
Triple Textures : 2106.379 M-Pixel/s
Quad Textures : 1587.963 M-Pixel/s
1 Floating Poing Texture : 3140.695 M-Pixel/s
Render to Self : 5139.281 M-Pixel/s
PS 1.1 Simple : 5604.429 M-Pixel/s
PS 1.4 Simple : 5601.913 M-Pixel/s
PS 2.0 Simple : 5609.462 M-Pixel/s
PS 2.0 PP Simple : 5599.396 M-Pixel/s
Customized Pixel Shader : 3160.827 M-Pixel/s
PS 2.0 Complex : (Unsupported)
PS 2.0 PP Complex : (Unsupported)
PS 2.0 Massive Register : (Unsupported)
PS 2.0 PP Massive Register : (Unsupported)
PS 2.0 Sincos Procedure Tex : (Unsupported)
PS 2.0 Per-Pixel Lighting : (Unsupported)
-----------------------------------------------------------
* End of FillrateBenchmark Result

Don't ask me why the PS 2.0 tests weren't supported. The program just said the test is not supported. I have DX 9.0c and ForceWare 61.12 installed.

Sgt_Pitt
06-06-04, 07:30 AM
This test is with 4xAA on

FillrateBenchmark(tm) 2004 - "easy benchmark series"

Benchmark Main Program Version: FRB_V092
Benchmark Date/Time : 6/06/2004 10:39:17 PM

System Information
-----------------------------------------------------------
CPU : Intel(R) Pentium(R) 4 CPU 1.80GHz
GFX : NVIDIA GeForce 6800 Ultra
OS : Microsoft Windows XP
Settings : 1024x768 32 bits D16 4x FSAA

Benchmark Result
-----------------------------------------------------------
FrameBuffer Clear : 3507.2 FPS
Color Fill : 3173.41 M-Pixel/s
Z Fill : 3170.894 M-Pixel/s
Color + Z Fill : 2337.905 M-Pixel/s
Single Texture : 3168.377 M-Pixel/s
Single Texture Alpha Blend : 2868.904 M-Pixel/s
Dual Textures : 3163.344 M-Pixel/s
Triple Textures : 2123.996 M-Pixel/s
Quad Textures : 1595.513 M-Pixel/s
1 Floating Poing Texture : 3160.827 M-Pixel/s
Render to Self : 5386.954 M-Pixel/s
PS 1.1 Simple : 3168.377 M-Pixel/s
PS 1.4 Simple : 3173.41 M-Pixel/s
PS 2.0 Simple : 3165.861 M-Pixel/s
PS 2.0 PP Simple : 3173.41 M-Pixel/s
Customized Pixel Shader : 3163.344 M-Pixel/s
PS 2.0 Complex : (Unsupported)
PS 2.0 PP Complex : (Unsupported)
PS 2.0 Massive Register : (Unsupported)
PS 2.0 PP Massive Register : (Unsupported)
PS 2.0 Sincos Procedure Tex : (Unsupported)
PS 2.0 Per-Pixel Lighting : (Unsupported)
-----------------------------------------------------------
* End of FillrateBenchmark Result

991060
06-06-04, 07:34 AM
Sgt_Pitt, which driver were you using?
It seems there's a bug in the program or in the driver.

Sgt_Pitt
06-06-04, 07:36 AM
I'm using ForceWare 61.12


2xAA

FillrateBenchmark(tm) 2004 - "easy benchmark series"

Benchmark Main Program Version: FRB_V092
Benchmark Date/Time : 6/06/2004 10:47:03 PM

System Information
-----------------------------------------------------------
CPU : Intel(R) Pentium(R) 4 CPU 1.80GHz
GFX : NVIDIA GeForce 6800 Ultra
OS : Microsoft Windows XP
Settings : 1024x768 32 bits D16 2x FSAA

Benchmark Result
-----------------------------------------------------------
FrameBuffer Clear : 8550.4 FPS
Color Fill : 6107.745 M-Pixel/s
Z Fill : 6107.745 M-Pixel/s
Color + Z Fill : 4426.668 M-Pixel/s
Single Texture : 5935.36 M-Pixel/s
Single Texture Alpha Blend : 3070.23 M-Pixel/s
Dual Textures : 3145.728 M-Pixel/s
Triple Textures : 2108.896 M-Pixel/s
Quad Textures : 1590.48 M-Pixel/s
1 Floating Poing Texture : 3140.695 M-Pixel/s
Render to Self : 5147.25 M-Pixel/s
PS 1.1 Simple : 5325.089 M-Pixel/s
PS 1.4 Simple : 5323.83 M-Pixel/s
PS 2.0 Simple : 5322.572 M-Pixel/s
PS 2.0 PP Simple : 5323.83 M-Pixel/s
Customized Pixel Shader : 3140.695 M-Pixel/s
PS 2.0 Complex : (Unsupported)
PS 2.0 PP Complex : (Unsupported)
PS 2.0 Massive Register : (Unsupported)
PS 2.0 PP Massive Register : (Unsupported)
PS 2.0 Sincos Procedure Tex : (Unsupported)
PS 2.0 Per-Pixel Lighting : (Unsupported)
-----------------------------------------------------------
* End of FillrateBenchmark Result

It's all alien to me, but I'd be happy to help out.

MikeC
06-06-04, 08:12 AM
Results at default and overclocked speeds.

CPU : AMD Athlon 64 Processor 3400+
GFX : NVIDIA GeForce 6800 (Ultra)
OS : Microsoft Windows XP
Settings : 1024x768 32 bits D16

Drivers : 60.72


No AA:

400MHz/1.10GHz

Z Fill : 11790.19 M-Pixel/s
Color + Z Fill : 4326.005 M-Pixel/s

442MHz/1.18GHz

Z Fill : 13383.18 M-Pixel/s
Color + Z Fill : 4693.426 M-Pixel/s


2X AA:

400MHz/1.10GHz

Z Fill : 6173.177 M-Pixel/s
Color + Z Fill : 4512.232 M-Pixel/s

442MHz/1.18GHz

Z Fill : 6709.209 M-Pixel/s
Color + Z Fill : 4836.871 M-Pixel/s


4X AA:

400MHz/1.10GHz

Z Fill : 3178.444 M-Pixel/s
Color + Z Fill : 2370.62 M-Pixel/s

442MHz/1.18GHz

Z Fill : 3510.633 M-Pixel/s
Color + Z Fill : 2551.815 M-Pixel/s


Here is another source:
http://www.xbitlabs.com/articles/video/display/nv40_18.html

991060
06-06-04, 08:18 AM
Interesting, looks like NV40's massive z-fillrate isn't cut in half even if AA is enabled.

lowdog
06-06-04, 08:32 AM
Forceware 60.85


FillrateBenchmark(tm) 2004 - "easy benchmark series"

Benchmark Main Program Version: FRB_V092
Benchmark Date/Time : 6/06/2004 11:27:57 PM

System Information
-----------------------------------------------------------
CPU : Intel(R) Pentium(R) 4 CPU 3.40GHz
GFX : NVIDIA GeForce 6800 Ultra
OS : Microsoft Windows XP
Settings : 1024x768 32 bits D16 No AA

Benchmark Result
-----------------------------------------------------------
FrameBuffer Clear : 10035.2 FPS
Color Fill : 6198.343 M-Pixel/s
Z Fill : 11981.45 M-Pixel/s
Color + Z Fill : 4381.371 M-Pixel/s
Single Texture : 6072.514 M-Pixel/s
Single Texture Alpha Blend : 3092.88 M-Pixel/s
Dual Textures : 3148.244 M-Pixel/s
Triple Textures : 2113.929 M-Pixel/s
Quad Textures : 1590.48 M-Pixel/s
1 Floating Poing Texture : 3140.695 M-Pixel/s
Render to Self : 5137.603 M-Pixel/s
PS 1.1 Simple : 5624.562 M-Pixel/s
PS 1.4 Simple : 5614.496 M-Pixel/s
PS 2.0 Simple : 5614.496 M-Pixel/s
PS 2.0 PP Simple : 5614.496 M-Pixel/s
Customized Pixel Shader : 3158.311 M-Pixel/s
PS 2.0 Complex : (Unsupported)
PS 2.0 PP Complex : (Unsupported)
PS 2.0 Massive Register : (Unsupported)
PS 2.0 PP Massive Register : (Unsupported)
PS 2.0 Sincos Procedure Tex : (Unsupported)
PS 2.0 Per-Pixel Lighting : (Unsupported)
-----------------------------------------------------------
* End of FillrateBenchmark Result


FillrateBenchmark(tm) 2004 - "easy benchmark series"

Benchmark Main Program Version: FRB_V092
Benchmark Date/Time : 6/06/2004 11:31:10 PM

System Information
-----------------------------------------------------------
CPU : Intel(R) Pentium(R) 4 CPU 3.40GHz
GFX : NVIDIA GeForce 6800 Ultra
OS : Microsoft Windows XP
Settings : 1024x768 32 bits D16 4x FSAA

Benchmark Result
-----------------------------------------------------------
FrameBuffer Clear : 3507.2 FPS
Color Fill : 3165.861 M-Pixel/s
Z Fill : 3170.894 M-Pixel/s
Color + Z Fill : 2332.872 M-Pixel/s
Single Texture : 3163.344 M-Pixel/s
Single Texture Alpha Blend : 2861.354 M-Pixel/s
Dual Textures : 3163.344 M-Pixel/s
Triple Textures : 2118.962 M-Pixel/s
Quad Textures : 1595.513 M-Pixel/s
1 Floating Poing Texture : 3163.344 M-Pixel/s
Render to Self : 5379.405 M-Pixel/s
PS 1.1 Simple : 3168.377 M-Pixel/s
PS 1.4 Simple : 3165.861 M-Pixel/s
PS 2.0 Simple : 3168.377 M-Pixel/s
PS 2.0 PP Simple : 3173.41 M-Pixel/s
Customized Pixel Shader : 3163.344 M-Pixel/s
PS 2.0 Complex : (Unsupported)
PS 2.0 PP Complex : (Unsupported)
PS 2.0 Massive Register : (Unsupported)
PS 2.0 PP Massive Register : (Unsupported)
PS 2.0 Sincos Procedure Tex : (Unsupported)
PS 2.0 Per-Pixel Lighting : (Unsupported)
-----------------------------------------------------------
* End of FillrateBenchmark Result

mikechai
06-06-04, 08:57 AM
According to Nvidia's latest GPU programming guide (http://developer.nvidia.com/object/gpu_programming_guide.html), page 24, NV3x and NV4x can render at double speed when rendering only depth or stencil values. To enable that mode, the following rules must be followed:
- Color writes are disabled
- 2x or 4x AA is not enabled
- Texkill has not been enabled to any fragments
- Depth replace has not been enabled to any fragments
- Alpha test is disabled
- No color key is used in any of the active textures
- No user clip planes are enabled
- No floating point render targets are in use
- Pixel shaders are disabled
- Render to a non-power-of-2 texture

By looking at the results, enabling AA doesn't inactivate NV40 32x0 mode.
Is it due to a program bug or what? Any idea, 991060?

We might have to wait for Doom3 to clarify things ....

991060
06-06-04, 09:23 AM
Sincerely, I have no idea how this happened. I opened a similar thread over at rage3d (http://www.rage3d.com/board/showthread.php?s=&threadid=33762783), in which it was confirmed that the R420 behaves similarly to the NV40 on this issue.
I cannot guarantee the program is bug-free since I'm not the author. I have emailed nVIDIA about my findings and hope they'll respond soon.

Bad_Boy
06-06-04, 09:31 AM
No offence or anything, but I'm kinda new to this program. But um...

"Sincerely, I have no idea how this happened"
What happened?

I'm guessing
"enabling AA doesn't inactivate NV40 32x0 mode."

and that isn't a good thing. Can somebody clear this up for me? lol

991060
06-06-04, 10:03 AM
In mikechai's post it is clearly stated that enabling AA will disable the "super z" ability, and this information is directly from nVIDIA.

Sgt_Pitt
06-06-04, 10:17 AM
"Interesting, looks like NV40's massive z-fillrate isn't cut in half even if AA is enabled."

I think that was my fault. I posted the results thinking AA was enabled, but it wasn't. I've corrected the benchmarks.

And it looks like z-fill is cut in half when AA is on, doesn't it?

991060
06-06-04, 10:43 AM
No, you need to multiply the fillrate number by the number of samples to get the real fillrate.
I think this is how NV40 works:
without AA, NV40 can do:
16 rendering color ops/clock, or
32 rendering depth ops/clock
with AA, it can do:
32 rendering color ops/clock (since the multisample unit can sample 2 sub-pixels in 1 clock), or
32 rendering depth ops/clock

Since in AA mode we get the same speed for rendering color or depth ops, there's no "double speed" anymore. We just misread nVIDIA's document.
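As a rough sanity check on this model, here is a small Python sketch that turns the ops/clock figures into peak M-Pixel/s and compares them with numbers posted earlier in the thread. It assumes the 400 MHz default core clock from MikeC's post, takes the Z readings from MikeC's 400 MHz runs and the Color Fill readings from Sgt_Pitt's logs (whose clocks were not stated, so stock speed is an assumption), and the helper `peak_mpixels` is just an illustrative name.

```python
CORE_MHZ = 400  # default core clock listed in MikeC's post


def peak_mpixels(ops_per_clock, samples=1, core_mhz=CORE_MHZ):
    """Peak M-Pixel/s if every pixel costs `samples` ops (hypothetical helper)."""
    return ops_per_clock * core_mhz / samples


# (label, ops/clock from the model above, AA samples, measured M-Pixel/s)
# Z rows: MikeC at 400 MHz. Color rows: Sgt_Pitt, ASSUMED to be at stock clocks.
cases = [
    ("Color, no AA", 16, 1, 6185.76),
    ("Z, no AA", 32, 1, 11790.19),
    ("Color, 2xAA", 32, 2, 6107.745),
    ("Z, 2xAA", 32, 2, 6173.177),
    ("Color, 4xAA", 32, 4, 3173.41),
    ("Z, 4xAA", 32, 4, 3178.444),
]

for label, ops, samples, measured in cases:
    peak = peak_mpixels(ops, samples)
    print(f"{label:13s} peak {peak:8.1f}  measured {measured:9.3f}  ({measured / peak:.0%})")
```

Under these assumptions every measured figure sits just below the modelled peak, which is at least consistent with the guess above. It does not prove that this is how the hardware works internally.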

SH64
06-06-04, 06:02 PM
I have a program here: ftp://fillrate:fillrate@219.237.118.29/FillrateBenchmark_V092_b.rar
All you need to do is run the z-fillrate test under these conditions (make sure to restart the program every time you switch AA mode):
1. disable AA
2. enable 2xAA
3. enable 4xAA

Please post the results in a reply, and thanks a lot.

Thanks for posting that link. I was looking for a program to analyze my video card's fillrate and pixel-filling performance! :)

Just a couple of questions:
1) Why does the FrameBuffer Clear result change with each test run? Sometimes there are big differences between the numbers.

2) I noticed my 5950u's Single Texture Alpha Blend performance actually increased when I used 4xFSAA. Is that possible, or might it be a glitch?

ChrisRay
06-06-04, 06:07 PM
no offence or anything but im kinda new to this program. but um...


what happened?

im guessing


and that isnt a good thing. can somebody clear this up for me lol.


Why wouldn't that be a good thing? Z-fill is colorless. There's no reason anti-aliasing should hurt this.

Nutty
06-06-04, 06:13 PM
"Why wouldn't that be a good thing? Z-fill is colorless. There's no reason anti-aliasing should hurt this."
4xMSAA requires a Z buffer of 4 times the size, hence 4x as many depth writes.
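To put numbers on that, here is a back-of-the-envelope Python sketch for the 1024x768 D16 setting used in the runs above. It assumes a plain uncompressed buffer with one 16-bit depth value per sample; real hardware may compress Z, so treat these as upper bounds. The function name is only illustrative.

```python
def depth_buffer_bytes(width, height, bytes_per_sample, samples):
    """Uncompressed depth buffer size in bytes (one depth value per sample)."""
    return width * height * bytes_per_sample * samples


# 1024x768 with D16 (2 bytes per depth value), as in the benchmark settings.
for samples in (1, 2, 4):
    size = depth_buffer_bytes(1024, 768, 2, samples)
    print(f"{samples}x: {size} bytes ({size / 2**20:.1f} MiB)")
```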

ChrisRay
06-06-04, 06:30 PM
"4xMSAA requires a Z buffer of 4 times the size, hence 4x as many depth writes."


Forgive me if I'm wrong here, but wouldn't having twice the z-fill rate be a good thing for multisampling? I don't see how having twice the z-fill could hurt AA.

That was my original conclusion. I was under the assumption he was saying More Z-Fill = Bad for AA.

MikeC
06-06-04, 09:24 PM
According to Nvidia's latest GPU programming guide (http://developer.nvidia.com/object/gpu_programming_guide.html)...

Nice find. I thought Section 4.7 - Identifying GPUs was interesting. Is this information related to the issues with Far Cry on the 6800?

mikechai
06-06-04, 10:13 PM
Actually when rendering depth and stencil values, there is no need to do AA anyway.
AA can be enabled again when rendering color+z. So no issue here.

SH64
06-09-04, 11:43 AM
"This test is with 4xAA on"

Did you run this benchmark at stock speeds or overclocked?
If overclocked, what were the frequencies?
Thanks.