NVIDIA GeForce 6800 Ultra Preview - Page 3 of 9

CINEFX 3.0

CineFX is the engine that powers the GeForce series of GPUs. It consists of the graphics hardware and the ForceWare drivers, both of which are closely tied to a specific version of the DirectX and OpenGL Application Programming Interfaces (APIs). The third generation of CineFX debuts on the GeForce 6 Series and embraces the technology behind DirectX 9.0 and Shader Model 3.0.

Nalu Technology Demo

Under OpenGL, graphics acceleration in hardware is achieved through vendor-specific extensions or extensions that have been adopted by the OpenGL Architecture Review Board (ARB). For example, the ARB_vertex_shader extension introduced programmable vertex shaders to OpenGL in version 1.4.

Direct3D vs. OpenGL Shader Compiler

Programmability is an exciting and powerful feature of modern GPUs. While shader programs are natively written in assembly language, Microsoft's High Level Shader Language (HLSL) and the OpenGL Shading Language (GLSL) continue to gain acceptance in the developer community and are supported by CineFX 3.0. The shader compiler plays a key role because it translates high-level shader instructions to assembler, and hardware-specific optimizations can be achieved by providing compiler hints.

FX Composer
NVIDIA's FX Composer Shader Development Environment

NVIDIA continues to support developers by releasing tools such as NVShaderPerf, which reports on DirectX and OpenGL shader performance for GeForce FX GPUs. Shader development tools like FX Composer are geared towards simplifying shader development by incorporating real-time preview options and optimization features. NVIDIA's Software Development Kit (SDK) is a valuable resource that contains sample code, demos, tools, technical papers, and tips for DirectX and OpenGL.

VERTEX SHADER 3.0

The following table summarizes the key differences between vertex shader 2.0 and vertex shader 3.0 features. A vertex shader performs tasks that include transformation, texture coordinate generation, lighting, and vertex-level texture access.

Feature              Vertex Shader 2.0    Vertex Shader 3.0
Instruction Slots    256                  ≥ 512
Max Instructions     65535                ≥ 65535
Dynamic Branching    No                   Required
Texture Lookup       No                   Up to 4
Stream Divider       No                   Yes

The GeForce 6800 Ultra contains six vertex processing units that are managed by an instruction scheduler. The vertex processing units are based on a Multiple Instruction, Multiple Data (MIMD) parallel architecture, which is characterized by each processor having its own copy of program instructions while performing different operations on different data streams.

GeForce 6 Series Vertex Processing Unit

Within each vertex processor is a scalar processing unit and a vector processing unit, which operate in parallel. A new feature of the GeForce 6 Series vertex shader is the texture fetch unit, which is capable of retrieving texture data from memory. This technology serves as the foundation for a hardware-assisted displacement mapping technique that can be accomplished using vertex shaders.

Infinite Length Vertex Programs

With previous versions of CineFX, NVIDIA modeled its vertex shader limits on those Microsoft established in the DirectX API. Under DirectX 9.0, vertex shader versions 2.0 and 2.a allowed a maximum of 256 instruction slots and 65,535 executed instructions per program, and were incorporated in the CineFX 2.0 architecture of the GeForce FX. Although only 256 slots are reserved for instructions, the number of instructions that can be executed is higher because loops run the same slots repeatedly.
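The difference between instruction slots and executed instructions can be sketched with a little arithmetic. The example below is only an illustration: the `Segment` model and the instruction counts in `program` are hypothetical, and loop-control overhead is ignored. Only the two limits come from the table above.

```python
from dataclasses import dataclass

VS_2_0_MAX_SLOTS = 256        # instruction slots (from the table above)
VS_2_0_MAX_EXECUTED = 65535   # executed-instruction cap (from the table above)


@dataclass
class Segment:
    """Hypothetical piece of a vertex program."""
    slots: int       # instructions stored in the program
    repeat: int = 1  # 1 for straight-line code, more for a loop body


def slots_used(program):
    # Each instruction is stored once, no matter how often it runs.
    return sum(seg.slots for seg in program)


def instructions_executed(program):
    # A loop body counts once per iteration.
    return sum(seg.slots * seg.repeat for seg in program)


def fits_vs_2_0(program):
    return (slots_used(program) <= VS_2_0_MAX_SLOTS
            and instructions_executed(program) <= VS_2_0_MAX_EXECUTED)


program = [
    Segment(slots=20),              # setup
    Segment(slots=12, repeat=100),  # loop body, e.g. one pass per light
    Segment(slots=8),               # write outputs
]

print(slots_used(program))             # 40 slots stored
print(instructions_executed(program))  # 1228 instructions executed
print(fits_vs_2_0(program))            # True
```

In this model a program can stay well under the slot limit and still hit the executed-instruction cap if a loop runs often enough.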

EverQuest 2

Vertex shader 3.0 requires a minimum of 512 instruction slots, while the maximum number of executed instructions is reported by the MaxVShaderInstructionsExecuted member of the D3DCAPS9 structure. Although the vertex shader 3.0 documentation suggests that this maximum be at least 2^16, CineFX 3.0 and GeForce 6 allow vertex shaders to execute an unlimited number of instructions.

Dynamic Flow Control

Dynamic flow control is available to vertex shaders, which provides greater control over program logic. The additions consist of new instructions (ifc/breakc, if/break/callnz), an eight-deep stack for return addresses and address registers (branch, call, push, pop), and condition code selection.

Dynamic branching increases the flexibility of vertex operations because a shader can test conditions that determine how a specific vertex should be processed. It can also improve performance, since unnecessary shader operations can be skipped. The flexibility is a welcome feature, but care should be taken to ensure that processing remains efficient.
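The trade-off can be illustrated with a CPU-side model of a per-vertex branch. This is a sketch, not shader code: the instruction costs (`LIGHTING_COST`, `AMBIENT_COST`, `BRANCH_COST`) are made-up numbers chosen only to show the effect, and the lighting formula is a simple clamped diffuse term.

```python
LIGHTING_COST = 30  # hypothetical cost of the full lighting path
AMBIENT_COST = 2    # hypothetical cost of the cheap path
BRANCH_COST = 2     # hypothetical cost of evaluating the branch itself


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def shade_vertex(normal, light_dir, use_branch):
    """Return (intensity, instructions_spent) for one vertex."""
    n_dot_l = dot(normal, light_dir)
    if use_branch:
        if n_dot_l <= 0.0:
            # Vertex faces away from the light: skip the lighting math.
            return 0.1, BRANCH_COST + AMBIENT_COST
        return 0.1 + n_dot_l * 0.9, BRANCH_COST + LIGHTING_COST
    # No branching: every vertex pays for full lighting, result is clamped.
    return 0.1 + max(n_dot_l, 0.0) * 0.9, LIGHTING_COST


def total_cost(normals, light_dir, use_branch):
    return sum(shade_vertex(n, light_dir, use_branch)[1] for n in normals)


light = (0.0, 0.0, 1.0)
mixed = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
all_lit = [(0.0, 0.0, 1.0)] * 4

print(total_cost(mixed, light, use_branch=True))     # 44
print(total_cost(mixed, light, use_branch=False))    # 120
print(total_cost(all_lit, light, use_branch=True))   # 128
print(total_cost(all_lit, light, use_branch=False))  # 120
```

With these assumed costs, the branch pays off when many vertices take the cheap path, and costs slightly more than the unconditional version when every vertex needs full lighting.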

Texture Lookup

Displacement mapping is a graphics technique used to increase the visual detail of surfaces by incorporating effects like bumps, cracks, and dents. Traditional displacement mapping algorithms typically subdivide the underlying geometry to achieve a desired geometric level of detail, which can be computationally intensive. For example, the images below originated from a head model. The original mesh is composed of 1,358 triangles, while the displaced mesh is composed of 48,434 triangles.

Geometry Based Displacement Mapping

With CineFX 3.0, NVIDIA designed the ability to fetch textures from memory into the vertex shader unit. This feature allows textures to be mapped onto vertices, which can be used to create an effect similar to geometry-based displacement mapping. Up to four textures can be retrieved, and mip-maps are supported, although no texture filtering occurs.
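The idea behind vertex texture displacement can be sketched on the CPU as follows. This is an illustrative model rather than NVIDIA's implementation: `fetch_point`, `displace`, the tiny height map, and the scale factor are all hypothetical. The lookup uses the nearest texel only, to mirror the lack of filtering described above, and it assumes texture coordinates in the range 0 to 1.

```python
def fetch_point(texture, u, v):
    """Nearest-texel lookup with no filtering; u and v are assumed in [0, 1]."""
    height = len(texture)
    width = len(texture[0])
    x = min(int(u * width), width - 1)    # clamp u == 1.0 to the last column
    y = min(int(v * height), height - 1)  # clamp v == 1.0 to the last row
    return texture[y][x]


def displace(position, normal, uv, height_map, scale):
    """Move a vertex along its normal by the height sampled at its UV."""
    h = fetch_point(height_map, uv[0], uv[1])
    return tuple(p + n * h * scale for p, n in zip(position, normal))


# Hypothetical 2x2 height map; rows are indexed by v, columns by u.
height_map = [
    [0.0, 0.5],
    [1.0, 0.25],
]

flat = (0.0, 0.0, 0.0)
up = (0.0, 0.0, 1.0)

print(displace(flat, up, (0.75, 0.25), height_map, scale=2.0))  # (0.0, 0.0, 1.0)
print(displace(flat, up, (0.10, 0.90), height_map, scale=2.0))  # (0.0, 0.0, 2.0)
```

Because each vertex is moved individually, the amount of visible detail in this approach depends on how densely the mesh is tessellated.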

Vertex Frequency Stream Divider

A programmable vertex shader consists of instructions that manipulate vertex element data such as color, position, and texture coordinates. During the execution of a vertex shader, data is sent to an arithmetic logic unit (ALU) to perform the requested arithmetic and Boolean operations. The rasterization process operates on vertex component streams, which are composed of vertex elements defined in the vertex shader.

Prior to vertex shader 3.0, a vertex shader was called once per vertex. Every time a vertex shader was called, its input registers, which are bound to vertex element data, were initialized with the vertex elements from the vertex streams.

Lord Of The Rings, The Battle for Middle-Earth

Vertex shader 3.0 allows an application to assign a rate at which vertex shader input registers are initialized. The rate determines the number of vertices that are processed before obtaining data from the vertex stream and loading it to the input registers.

The Vertex Frequency Stream Divider will benefit games that frequently make use of objects that are replicas of one another. In many cases, these objects are designed to perform similar actions and can therefore be controlled efficiently by the rate at which the vertex stream data is updated. The developer can also issue a "batch" update that affects all of the objects in a scene, or limit the effect to a specific group of objects, thereby giving them unique characteristics.
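A rough way to picture the stream divider is to model two input streams that advance at different rates. The sketch below is a CPU-side illustration with hypothetical names (`draw_instanced`, `mesh`, `instance_offsets`): the geometry stream is re-read for every vertex, while the per-object stream is re-read only once every `verts_per_instance` vertices.

```python
def draw_instanced(mesh, instance_offsets):
    """Return the vertices the shader would see for every replica of `mesh`."""
    verts_per_instance = len(mesh)  # acts as the divider for the second stream
    output = []
    for i in range(verts_per_instance * len(instance_offsets)):
        # Stream 0: geometry, fetched for every vertex and reused per replica.
        position = mesh[i % verts_per_instance]
        # Stream 1: per-object data, advanced once per `verts_per_instance`.
        offset = instance_offsets[i // verts_per_instance]
        output.append(tuple(p + o for p, o in zip(position, offset)))
    return output


mesh = [(0, 0), (1, 0), (0, 1)]        # one triangle shared by all replicas
instance_offsets = [(0, 0), (10, 0)]   # one entry per replica

for vertex in draw_instanced(mesh, instance_offsets):
    print(vertex)
```

Changing one entry in `instance_offsets` moves a single replica, while changing `mesh` alters every replica at once, which loosely corresponds to the per-group and batch updates described above.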

Note that information on pixel shader 3.0 and other new features of the GeForce 6 will be forthcoming. For more information on Shader Model 3.0, please visit Microsoft's WinHEC 2004 web site and read the article Shader Model 3.0 - No Limits, which was written by D. Sim Dietrich Jr. of NVIDIA.


Last Updated on May 8, 2004


nV News - Copyright © 1998-2014.