We've already looked at the performance of Quake 3 Arena, which has served us well as a 3D graphics benchmark for the past two years. While we await the next incarnation of Doom from id Software, times are changing and we need a benchmark that provides performance data for the new wave of graphics accelerators; more specifically, for games that will begin to expose the new capabilities of the GeForce3.
Enter Zetha gameZ. This relatively unknown group of talented developers from Italy has created a sophisticated 3D graphics benchmark based on its upcoming game DroneZ. With the GeForce3 and the nGenius II graphics engine, developed by Zetha's 3D Technology Leader Carmine della Sala, the cyberspace action game boasts the following features:
- High polygon counts for extremely smooth characters, objects, and environment details.
- Use of all four hardware texture units for custom lighting effects, combined with multitexturing and custom glossy bumpmapping via texture shaders and special effects.
- A large number of vertex programs, used not only to speed up routine tasks such as skinning and animation but also to help implement a custom illumination system.
- Texture shaders for high-quality per-pixel, per-light bumpmapping.
On the surface, the DroneZ rolling demo appears to be just another benchmark.
But once you open the hood and uncover the details, you will see otherwise.
The DroneZ rolling demo, which is also available as a movie on Zetha's web site, contains the following graphics and benchmarking features:
- Exploits the strengths of the GeForce3 by using hundreds of vertex programs and texture shaders along with register combiners.
- Uses a custom illumination model (with per-pixel bumpmapping) implemented through vertex programs.
- Provides an optimized software-only pipeline for previous-generation graphics processors.
- Allows vertex program emulation to be compared against the optimized software pipeline on the GeForce2.
- Reports both processed polygons and GL polygons for fair comparison between software and hardware.
According to Giovanni Caturano, who is the Development Leader of Zetha gameZ, the GeForce3 nFiniteFX engine allowed them to fully utilize the advanced features of their nGenius II graphics engine:
Our custom illumination model doesn't fit the GeForce 1/2 hardware transform and lighting. This means that these cards are not powerful enough to allow for a custom illumination to be put in hardware. The GeForce3 is the only card that can handle a custom illumination like ours to be put in hardware, through the nFiniteFX engine.
Zetha has developed a custom illumination model that differs from the vertex illumination model normally used in OpenGL and DirectX. This was necessary in order to achieve a very precise and solid-looking feel that blends well with bumpmapping effects in the nGenius II graphics engine.
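For context, the standard per-vertex model that Zetha moved away from boils down to a Lambert diffuse term evaluated once per vertex and then interpolated across the triangle. A minimal sketch of that computation (our own illustration, not code from the nGenius II engine):

```python
# A minimal sketch of the standard per-vertex diffuse (Lambert) term that
# fixed-function OpenGL/DirectX lighting computes, and which Zetha's custom
# per-pixel model replaces. Vectors are plain tuples; no GPU involved.

def normalize(v):
    length = sum(c * c for c in v) ** 0.5
    return tuple(c / length for c in v)

def vertex_diffuse(normal, light_dir, light_color, material_color):
    """Per-vertex diffuse intensity: max(N.L, 0) scaled by light and material."""
    n = normalize(normal)
    l = normalize(light_dir)
    n_dot_l = max(sum(a * b for a, b in zip(n, l)), 0.0)
    return tuple(n_dot_l * lc * mc for lc, mc in zip(light_color, material_color))

# A vertex facing the light is fully lit; one facing away is black.
print(vertex_diffuse((0, 0, 1), (0, 0, 1), (1, 1, 1), (1, 0.5, 0.25)))
```

Because this term is computed only at the vertices, lighting detail is limited by mesh density, which is exactly why a per-pixel model blends better with bumpmapping.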
In cases where graphics processing in DroneZ is not hardware accelerated by the GeForce 1/2, Zetha has fine-tuned the nGenius II graphics engine to perform optimally in software mode (via the central processing unit).
A similar split is found in MadOnion's 3DMark2001 benchmark, which supports DirectX 8 pure hardware transform and lighting, in which vertex and pixel shader operations are handled by the graphics processing unit. When 3DMark2001's standard transform and lighting mode is used instead, certain graphics operations fall back to the central processing unit.
For example, take a look at the following preset benchmark configurations the DroneZ rolling demo uses to implement bumpmapping on the GeForce2 Ultra:
- GeForce2 Bump Mode - uses the optimized software of the nGenius II graphics engine.
- GeForce3 Bump Mode - uses the GeForce3 nFiniteFX graphics engine software emulator.
In these two modes, bumpmapping is processed by the central processing unit. However, the GeForce2 Bump Mode uses the optimized code of the nGenius II graphics engine while the GeForce3 Bump Mode emulates the GeForce3 nFiniteFX graphics engine.
However, the same configurations can also be tested on a GeForce3:
- GeForce2 Bump Mode - does not utilize the GeForce3 nFiniteFX engine. Tests what the GeForce3 brings to older games (like Quake 3).
- GeForce3 Bump Mode - fully utilizes the GeForce3 nFiniteFX engine.
With our testing, we are able to:
- Evaluate optimal code versus the overhead of emulation - running GeForce2 Bump Mode and GeForce3 Bump Mode on the GeForce2.
- Evaluate the advantage of the nFiniteFX engine versus optimal code - running GeForce3 Bump Mode and GeForce2 Bump Mode on the GeForce3.
Performance results at 640x480 are included because fill rate demands are minimized at that resolution. The tests are therefore geometry bound, which highlights the power of the nFiniteFX engine compared to any software implementation, even the best possible optimized solution for a 700MHz Pentium 3 processor and GeForce2 Ultra.
Note that sound was muted for these tests.
Optimal Code vs. Emulation
The results using the GeForce2 Ultra illustrate the two different methods of using the DroneZ illumination system. In GeForce2 Bump Mode the optimized code of the nGenius II graphics engine is being utilized as processing is shared between the graphics and central processing units. In the second case, which is a worst case scenario, vertex processing is emulated entirely by the central processing unit.
Optimal Code vs. nFiniteFX Engine
The GeForce3's results in the GeForce3 Bump Mode test clearly show the power of the nFiniteFX engine, especially at lower resolutions where fill rate demands are minimized. This is in contrast to the GeForce2 Bump Mode results, which show the GeForce3 performing close to the GeForce2 Ultra (first chart), as both cards are relying on the nGenius II graphics engine's optimized software path.
The DroneZ rolling demo creates a text file of benchmark results which have been furnished here:
The following benchmark report is a listing of the results of the GeForce3 running the GeForce3 Bump High Quality test at a resolution of 1600x1200 in 32-bit color. In this test, vertex shaders are used to provide even better image quality at the expense of a slight loss in performance.
Benchmark resolution: 1600 x 1200 at 32 bit
Current preset: geforce3 bump hq
Z buffer: 16 bit
Texture resolution: 16 bit
Texture size: 512 pixels
Geometry Mode: GL_Draw_Elements
Enabled OpenGL Extensions:
Rendered Frames: 9721
Minimum FPS: 8.98
Maximum FPS: 205.18
Average FPS: 52.6310
Minimum GL K-triangles: 6.72
Maximum GL K-triangles: 1785.55
Average GL K-triangles: 824.2319
Minimum T&L K-triangles: 7.43
Maximum T&L K-triangles: 1785.74
Average T&L K-triangles: 832.0350
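Since the demo writes its results as plain "Label: value" lines, they are easy to pull into a spreadsheet for side-by-side comparisons. A quick sketch of such a parser (the function name and the report excerpt are our own; this is not code shipped with the DroneZ demo):

```python
# Parse the DroneZ rolling demo's plain-text benchmark report into a dict,
# converting numeric values so ratios and deltas can be computed directly.
# REPORT is a trimmed excerpt of the results shown above.

REPORT = """Rendered Frames: 9721
Minimum FPS: 8.98
Maximum FPS: 205.18
Average FPS: 52.6310
Average GL K-triangles: 824.2319
Average T&L K-triangles: 832.0350"""

def parse_report(text):
    """Split each 'Label: value' line into a dict entry, numeric where possible."""
    results = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        label, _, value = line.partition(":")
        value = value.strip()
        try:
            results[label.strip()] = float(value)
        except ValueError:
            results[label.strip()] = value  # non-numeric fields stay as text
    return results

r = parse_report(REPORT)
print(r["Average FPS"])  # 52.631
# Share of processed (T&L) triangles actually handed to the videocard:
print(round(r["Average GL K-triangles"] / r["Average T&L K-triangles"], 3))  # 0.991
```

The GL-to-T&L ratio is the quick way to see how many processed polygons survived to be submitted to the card.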
Notice that the results contain counts for GL K-triangles and T&L K-triangles. The T&L triangle count represents all polygons (in thousands) processed by the graphics processing unit and/or the central processing unit. The GL triangle count is the number of polygons actually passed to the videocard for rendering.
When the nGenius II graphics engine defaults to software transform and lighting, polygons are processed on the host processor and the engine decides if they are to be passed to the videocard. However, when the nGenius II graphics engine senses that the GeForce3 nFiniteFX engine is available, all the polygons in the scene are passed to the videocard.
Also note that the polygon counts in a scene are not the total polygons in the game world. Clipping and culling techniques are used to eliminate objects that are hidden from view.
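To illustrate the kind of test involved, here is a sketch of bounding-sphere frustum culling, one common way an engine rejects off-screen objects before their polygons are ever counted (our own illustration, not Zetha's actual code):

```python
# Illustrative bounding-sphere frustum culling: an object is rejected when its
# bounding sphere lies entirely on the outside of any frustum plane.

def sphere_outside_plane(center, radius, plane):
    """plane = (a, b, c, d) with inward-facing normal; ax+by+cz+d < -r means fully outside."""
    a, b, c, d = plane
    x, y, z = center
    return a * x + b * y + c * z + d < -radius

def cull(objects, frustum_planes):
    """Keep only objects whose bounding spheres touch or enter the view frustum."""
    visible = []
    for center, radius in objects:
        if not any(sphere_outside_plane(center, radius, p) for p in frustum_planes):
            visible.append((center, radius))
    return visible

# One plane facing +z through the origin: the sphere behind it is culled.
planes = [(0, 0, 1, 0)]
objects = [((0, 0, 5), 1.0), ((0, 0, -5), 1.0)]
print(cull(objects, planes))  # only the sphere in front survives
```

A real engine tests against all six frustum planes (and often an occlusion pass on top), but the principle is the same: culled objects never contribute to the polygon counts in the report.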
We've finally gotten a sneak peek at the capabilities of the GeForce3 and its nFiniteFX engine under the OpenGL API. Not only does the GeForce3 crunch numbers faster than the 700MHz Pentium 3 processor I was using, but it also allows developers to raise the bar a notch by adding some nifty special effects.
Can you tell which of these two images was generated based on the GeForce3 nFiniteFX engine?
Good job. But hold your horses.
Let's see how Quincunx antialiasing performs under the GeForce3 Bump High Quality test.
Quincunx Antialiasing Performance
Not quite the 60 frames per second we need for high resolution antialiasing, but it's close enough. The entire benchmark results are as follows:
Naturally, we like to show off Quincunx antialiasing in action. The images come from the screen capture capability of the DroneZ rolling demo and were saved in their native PNG format. The price you pay for high-quality images is file size: each image weighs in at around 500-600KB.
Additional information on DroneZ, including an in-depth interview with Giovanni Caturano, can be read over at the Daily Radar.
Next Page: DirectX 8 Performance