Performance difference between 24-bit and 32-bit precision


NickSpolec
09-15-03, 10:48 AM
I was wondering, is there a large performance difference between FP 24 and 32?

There isn't any hardware that supports both (Radeons are 24-bit only, GeForces do 16- and 32-bit), so there is no way to test both precision modes on the same piece of hardware. But theoretically, is there a difference? What's the estimated margin?

The reason I ask is to try to figure out Nvidia's motives for lowering precision in DX9 games through their drivers (Dets 50), other than the fact that the NV3x hardware is utter pants at anything and everything DX9.

Here is my running theory:

All DX9 games that make calls for FP 24 on Nvidia's DX9 hardware will automatically be upped to FP 32. This assumes the game has not already been altered by the developers to replace all calls for FP 24 with a lower precision for Nvidia hardware.

I think a main reason Nvidia is lowering the precision through drivers (other than NV3x sucking at DX9... really) comes down to the original decision not to include FP 24 as a feature of the NV3x hardware. Perhaps Nvidia feels it is taking an unfair performance hit (compared to ATI hardware) in certain games/apps that make general calls for FP 24. This is, of course, because when general calls for FP 24 are made on Nvidia hardware, FP 32 is automatically used, whereas on ATI hardware *only* FP 24 has to be used. Maybe Nvidia sees it as an unfair comparison (theoretically, Nvidia hardware is doing more precision work than ATI hardware).


Now of course, this all goes back to my original question (is there a large performance difference between FP 24 and 32?) and assumes that IF FP 24 just so happened to be an option on Nvidia hardware, Nvidia would *actually* let it be used, unaltered (i.e., Nvidia would not lower the precision for games using FP 24).

That, and this all assumes I am *not* completely wrong about some points.

Bopple
09-15-03, 11:00 AM
It has nothing to do with performance when it's done correctly.
It's not as if FP24 runs at 4/3 the speed of FP32.
It comes down to the transistor count in the chip.

GlowStick
09-15-03, 11:02 AM
Yes, there is a large performance difference, especially between FP16 and FP32 (though this is case specific; someone could make a natively FP32 card that only does FP32 and does it fast).

But I think you also have to look at quality, e.g. how much better does FP32 look compared to FP24, and that's hard to tell.

Mathematically there's a big difference (it's all exponential), but it all gets translated to our monitors, and then it comes down to how many colors the human eye can see, how many colors your monitor can display, and so on.
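
To put rough numbers on that "exponential" part, here's a quick C sketch. The mantissa widths are my assumption of the usual layouts (FP16 = s10e5, ATI's FP24 = s16e7, FP32 = s23e8), not something pulled from either vendor's docs:

/* Rough relative precision (epsilon) of each shader float format.
 * Mantissa widths are assumed: 10 for FP16, 16 for FP24, 23 for FP32. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    const struct { const char *name; int mantissa_bits; } fmt[] = {
        { "FP16", 10 }, { "FP24", 16 }, { "FP32", 23 },
    };

    for (int i = 0; i < 3; i++) {
        /* Gap between 1.0 and the next representable value: 2^-mantissa_bits */
        double eps = ldexp(1.0, -fmt[i].mantissa_bits);
        printf("%s: epsilon ~ %g (about %.1f decimal digits)\n",
               fmt[i].name, eps, fmt[i].mantissa_bits * log10(2.0));
    }
    return 0;
}

So FP16 gives you roughly 3 decimal digits to work with, FP24 about 5, and FP32 about 7. Each step is a couple of orders of magnitude, which is why "big difference mathematically" and "visible on a monitor" are two separate questions.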

jimbob0i0
09-15-03, 11:05 AM
Well, seeing as, as you rightly said, there is no hardware that offers a choice of 24-bit or 32-bit floating point precision, the answer is, I'm sorry to say, that there is no way to make a valid comparison of speeds.

As for the DX calls - no calls are made to "32-bit", "24-bit" or "16-bit"... What actually happens is a call is made to use either full or partial precision. In NV's case full would be 32-bit and partial 16-bit. In ATI's case either would be 24-bit. DX9 was created with 24-bit in mind, and that is why, if a DX shader is written and compiled without any _pp hints to force partial precision, it will run at full precision - and this is the way most DX9 apps will likely be written, especially since for certain things 16-bit is not enough and full precision (24 or 32 bit) is necessary (e.g. HDR effects).

But the main point is that this was a hardware choice NV made when NV3X was in its design stages, so there's zero they can do about it now until NV40 (as long as that is a completely new design). For now they can only rely on _pp hints to help them (and that probably won't help very much - I doubt many small devs will spend the time adding them, TBH), and they can also rewrite shaders in their driver to go as low as INT12 precision (which breaks the DX9 spec).
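
To make the full vs. partial thing concrete, here's a rough C sketch that just rounds a value to a given number of mantissa bits. The widths are my assumption of the usual FP16/FP24/FP32 layouts and the sample value is made up, so treat it as an illustration rather than anything out of the DX9 spec:

/* Round x to 'bits' explicit mantissa bits (plus the implicit leading 1),
 * roughly what storing it in a narrower float format would do.
 * Exponent range limits are ignored - this only models mantissa rounding. */
#include <math.h>
#include <stdio.h>

static double round_to_mantissa_bits(double x, int bits)
{
    int exp;
    double m = frexp(x, &exp);              /* x = m * 2^exp, 0.5 <= m < 1 */
    double scale = ldexp(1.0, bits + 1);    /* +1 covers the implicit leading bit */
    return ldexp(floor(m * scale + 0.5), exp - bits - 1);
}

int main(void)
{
    double hdr = 1234.5678;  /* a made-up bright HDR-style intermediate */

    printf("FP16 (10-bit mantissa): %f\n", round_to_mantissa_bits(hdr, 10));
    printf("FP24 (16-bit mantissa): %f\n", round_to_mantissa_bits(hdr, 16));
    printf("FP32 (23-bit mantissa): %f\n", round_to_mantissa_bits(hdr, 23));
    return 0;
}

At that magnitude FP16 can only step in whole units (the value comes back as 1235), while FP24 and FP32 both keep the fractional detail - which is roughly why partial precision has to be something the developer opts into rather than the default.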

NickSpolec
09-15-03, 11:09 AM
Many thanks for explaining.

Hank Lloyd
09-15-03, 11:09 AM
Originally posted by NickSpolec
All DX9 games that make calls for FP 24 on Nvidia's DX9 hardware will automatically be upped to FP 32.

I think the prevailing theory early on was that this is exactly what would happen. The way things are now, I have zero faith that nVidia would promote any data type to a greater precision. The performance is bad enough at a lower precision. I think the exact opposite is what you're likely to see from here on out, unless Microsoft has WHQL restrictions on this type of driver behavior.

The one thing that has been stated as fallout from all these driver games is that Microsoft has added more tests in order to ensure compliance, which is a good thing.

The Baron
09-15-03, 11:11 AM
Anyone know how a driver becomes WHQL-certified?

For example, does MS get the source, or do they just run the driver through a battery of tests?

Sazar
09-15-03, 11:14 AM
Originally posted by The Baron
Anyone know how a driver becomes WHQL-certified?

For example, does MS get the source, or do they just run the driver through a battery of tests?

AFAIK there are others who run the tests... I believe M$ has a suite of tests, but they don't conduct them for compliance themselves...

I may be wrong though... this is just hearsay...

Bopple
09-15-03, 11:14 AM
Clarifying...
I took your question as:
"If R300 had the exactly same architecture as the current form but replacing FP24 with FP32, would it be slower than the current form?"
My answer is "No, but you need much more transistors integrated in the chip."

jimbob0i0
09-15-03, 11:14 AM
Originally posted by The Baron
Anyone know how a driver becomes WHQL-certified?

For example, does MS get the source, or do they just run the driver through a battery of tests?

Baron, AFAIK from NV sources (can't name anything, sorry... PM me if you want to know more), it is a series of tests they go through. It can take up to a week. If all tests are okay then the driver is certified... if not, a notice goes back to the vendor (NV in my/our case of course) with where it failed... they then alter the driver as appropriate, send it through their QA lab, and then back to M$ again for WHQL tests.

Dazz
09-15-03, 11:20 AM
Well, if it's anything like back in the old days with 16-, 24- and 32-bit colour cards, there was a big performance hit going from 24-bit to 32-bit. 24-bit was only slightly slower than 16-bit but looked as good as 32-bit.

vandersl
09-15-03, 11:49 AM
As Bopple said - there is no inherent performance hit from processing 24bits vs 32bits, if the hardware is designed for it.

32-bit internal precision won't even have an impact on memory bandwidth requirements (for ATI, anyway), since all external data formats are 32-bit.

That said, if an architecture can use the same resources to process either 16bit FP at 2X speed, or 32bit FP at 1X speed, then of course there would be a performance difference. But there is no inherent rule that '32bit FP takes longer'. It is, as Bopple said, due to a hardware resource limitation.

ATI is in a fairly good position right now, IMHO, since they can update to 32bit FP with no performance hit by increasing the number of bits in their FPU and register set. I imagine the change to 0.13u will allow this, though I am not sure they would make the change for the next round.