NV40 Speculation... will it really fix the DX9 issue?
Evildeus
09-20-03, 12:33 PM
Originally posted by Hellbinder
With LoKi you are looking somewhere beyond 2x the Raw Shader performance the R350 has.
Well, isn't 2 times the raw performance of an R350 the minimum target for NV40? That's where they are weak, and that's where they need to put transistors.
Well, at least that's how I see it; otherwise, NV will lose once more.
ok hang on (someone correct me if I'm wrong): as I understand it, NV35 has 4 "pipelines" that can each do either 2 FP32 ops or 1 FP32 + 2 FP16 ops. NV40 has 8 "pipelines", so right off the bat it should have twice the shader power of NV35, plus whatever it gains from a higher clock speed. hopefully the FP ALUs are designed better too. I realize this is oversimplified, so flame away.
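A rough sketch of that arithmetic in Python. It assumes peak shader throughput is simply pipelines times FP32 ops per pipeline per clock, scaled by clock speed. The function names are made up for illustration, and the 1.25 clock ratio is a placeholder, not a real spec.

```python
def peak_fp32_ops_per_clock(pipelines, fp32_ops_per_pipeline=2):
    """Peak FP32 ops per clock under the simple 'pipelines x ops' model."""
    return pipelines * fp32_ops_per_pipeline


def relative_shader_power(new_pipelines, old_pipelines, clock_ratio=1.0):
    """Ratio of peak throughput, new part vs old part.

    clock_ratio is new clock / old clock (hypothetical value, not a known spec).
    """
    new_peak = peak_fp32_ops_per_clock(new_pipelines)
    old_peak = peak_fp32_ops_per_clock(old_pipelines)
    return (new_peak / old_peak) * clock_ratio


# 4 pipelines (NV35, as described above) vs 8 pipelines (speculated NV40)
print(relative_shader_power(8, 4))        # same clock
print(relative_shader_power(8, 4, 1.25))  # with an assumed 25% clock bump
```

This only covers theoretical peak. It says nothing about how well the ALUs are actually kept busy, which is the part people are worried about.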
Razor04
09-20-03, 09:18 PM
Ok...I just can't resist...but if the NV40 had twice the pixel shader power of the NV35, wouldn't that put it roughly on par with an R3XX? Hopefully it won't, but we will see...personally I think they could use another failure, as it would teach them some valuable lessons, like following standards for one (they still don't seem to get this, so maybe another failure is needed).
Originally posted by Razor04
Ok...I just can't resist...but if the NV40 had twice the pixel shader power of the NV35, wouldn't that put it roughly on par with an R3XX?
if NV40 handles PS the same way as NV35, then yes. hopefully things have been improved *knock on wood*
this is how I understand it: R400 was pushed way back and is now R500. R420 is basically a 12x1 R350 with PS3.0. this gives NV an opportunity to catch up with ATI by introducing a new architecture while ATI is still using its "old" (albeit superior) technology. if R420 is "only" a 12x1 R350 it will still be a DX9 monster, so if NV40 is the same disasterpiece NV35 was, it will still be left in the dust. basically this stuff is anyone's guess; there's not enough information to come to any conclusions (although I admit to being skeptical of Nvidia, as I'm sure we all are)
Originally posted by nobie
if NV40 handles PS the same way as NV35, then yes. hopefully things have been improved *knock on wood*
this is how I understand it: R400 was pushed way back and is now R500. R420 is basically a 12x1 R350 with PS3.0. this gives NV an opportunity to catch up with ATI by introducing a new architecture while ATI is still using its "old" (albeit superior) technology. if R420 is "only" a 12x1 R350 it will still be a DX9 monster, so if NV40 is the same disasterpiece NV35 was, it will still be left in the dust. basically this stuff is anyone's guess; there's not enough information to come to any conclusions (although I admit to being skeptical of Nvidia, as I'm sure we all are)
With the relatively low transistor count of the nv40, I would expect nvidia to have made major changes to the design over the entire nv3x series. At 8x2 with the traditional nv3x architecture, I believe you would need 50% more transistors to deal with the doubled pipes, but it's more like 1/3 more. This means the large sections of silicon that didn't work in the nv3x are gone and redesigned. Perhaps now, and most importantly, an instruction scheduler in hardware has been added to the PS units. With the addition of branching to PS 3, nvidia is probably now running ps and vs code through the same shader pipes, since the hardware should be roughly similar. I believe that the move to fx16 is mostly just to run legacy pixel shaders at optimal speed, as the latency of integer operations is less than that of the equivalent fp instructions.
In real world ps tests, I would imagine that the ps power of the nv4x will be somewhere between 2 and 4 times higher than the equivalent nv3x part. On paper, by contrast, the nv4x should only have twice the ps power of the nv3x due to doubled pipelines/fp units; anything beyond that would have to come from the redesigned units being used more efficiently. On the other hand, if the vertex and pixel shaders are sharing the same hardware, I am not sure how well the nv4x architecture will fare with scenes that make heavy use of both pixel and vertex shaders.
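To put numbers on that gap, here is a small Python sketch. It assumes real-world speedup is just the unit-count scaling multiplied by a per-unit efficiency gain, which is my simplification and not anything nvidia has published. The function name is hypothetical.

```python
def implied_per_unit_gain(real_speedup, unit_scaling=2.0):
    """Per-unit efficiency gain needed to reach real_speedup,
    given that raw unit count alone scales by unit_scaling.

    Assumes speedup = unit_scaling * per_unit_gain (a simple multiplicative model).
    """
    return real_speedup / unit_scaling


# Guessed real-world range of 2x to 4x against a 2x increase in pipelines/fp units
for speedup in (2.0, 3.0, 4.0):
    print(speedup, implied_per_unit_gain(speedup))
```

So under this model, the 2x to 4x guess means each pipeline would have to do anywhere from the same work as an nv3x pipeline up to twice as much.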
I understand there is a huge difference in transistor count between fp32 and fp24. If NV40 is fp32 and R420 is fp24 you may see more parallelism in the R420 because they have transistors to spare. I would imagine both companies have the same design/cost constraints. I expect we will get an early look at loki when the RV360 comes out and we see what they have done to it.
hkultala
09-26-03, 08:24 AM
Originally posted by Hellbinder
The Radeon 8500 is a good hardware design. Its problem was that it was released 6 months too late. Introduce that card around the same time as the GF3... whole different ball game. The one weak area with it is that its memory controller was not nearly as efficient as Nvidia's. Still, it would totally have outperformed the original GF3 had it been introduced in a timely manner.
Its lack of MSAA was quite a bad shortcoming.
If it had come out at the same time as NV20, with good drivers, it would probably have been faster without AA,
but with AA, NV20 would have been faster.
It still had better AA quality than NV20, however.