
NV30 Memory Optimizations - What the heck could they be?


Uttar
10-24-02, 03:44 PM
Hello everyone,

I've seen a lot of posts about the NV30 having some kind of revolutionary way to increase effective memory bandwidth, but almost none that try to explain it, so I've decided to start this thread.

First, who would need a lot of memory bandwidth outside of FSAA and 64/128 BPP rendering? The answer, quite frankly, is no one.

So since nVidia's goal is to have "better" pixels, they want to make those features run at acceptable performance.

One of the rumors actually suggested a color optimization. That makes sense when you think about it. If you can do nearly 50% lossless Z Buffer compression, couldn't you just apply the same technology to the whole screen?
My answer would be that there'd be no problem doing so.
The only decent explanation of Z Buffer compression seems to come from ATI: http://www.graphicshardware.org/previous/www_2000/presentations/ATIHot3D.pdf

Basically, you compress 8x8 pixel blocks and rely on the fact that the pixels in a block are likely to have very similar Z. The same holds for color: after all, if you've got a LOT of grass, there's very little red. That means color compression could help a lot.
But one of the best examples would be the sky. Skies are common in games, and they contain a LOT of the same color. Optimizing that would be wonderful!
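
To make the block idea concrete, here is a minimal sketch in Python of a lossless per-tile color scheme. It is purely illustrative: the encoding (a "uniform" flag plus one color, or the raw pixels otherwise) and the names `compress_tile`, `decompress_tile` and `stored_pixels` are assumptions made up for this example, not anything nVidia or ATI has described.

```python
# Hypothetical sketch of lossless per-tile color compression on 8x8 tiles.
# The encoding is an assumption for illustration, not a real GPU format.

TILE = 8  # tile edge in pixels, matching the 8x8 blocks discussed above


def compress_tile(tile):
    """Encode a flat list of 64 (r, g, b) tuples.

    Returns ("uniform", color) when every pixel is identical,
    otherwise ("raw", pixels) so the data is always recoverable.
    """
    first = tile[0]
    if all(p == first for p in tile):
        return ("uniform", first)
    return ("raw", list(tile))


def decompress_tile(encoded):
    """Rebuild the original 64 pixels from an encoded tile."""
    kind, payload = encoded
    if kind == "uniform":
        return [payload] * (TILE * TILE)
    return list(payload)


def stored_pixels(encoded):
    """Number of pixel values that would have to be written to memory."""
    return 1 if encoded[0] == "uniform" else TILE * TILE


# A flat sky tile versus a grass tile with small variations in green.
sky = [(90, 140, 255)] * (TILE * TILE)
grass = [(20, 100 + (i % 7), 30) for i in range(TILE * TILE)]
```

In this toy version only the sky tile shrinks (64 values down to 1); the grass tile falls back to raw storage. Getting a win on "similar but not identical" colors would need something like delta coding, which is left out here.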

And with higher color depth, the effect would be even bigger.
My conclusion? It's likely this rumor isn't a joke. A lossless Color Compression technology will be in the NV30. And the effect will become even bigger as BPP increases.
This could just as well be a complete failure, since in real-life applications those optimizations could potentially reduce performance. However, a recent rumor saying "Some of the systems work even better than hoped" hints that this works.


But then, what's another thing that takes a LOT of memory bandwidth?
Overdraw. ATI's solution, of course, is Early Z. This is a good solution, really, and I fail to see how nVidia could invent a better one.
My understanding of that technology is that Early Z is an extra step that happens between triangle setup and the Pixel Shader. Hard to make it more efficient, really.
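
A rough way to see why a depth test before the Pixel Shader helps with overdraw is the Python sketch below. It is a one-pixel-at-a-time software model under simplifying assumptions (no blending, no shader depth output), and `count_shaded` is a hypothetical helper, not a description of ATI's hardware.

```python
# Sketch only: a software model of early vs. late depth testing.
# It counts how often the (expensive) pixel shader would run.

def count_shaded(fragments, width, height, early_z=True):
    """Return how many fragments run the pixel shader.

    fragments: iterable of (x, y, z) in submission order; smaller z is closer.
    """
    depth = [[float("inf")] * width for _ in range(height)]
    shaded = 0
    for x, y, z in fragments:
        visible = z < depth[y][x]
        if early_z and not visible:
            continue          # rejected before the pixel shader runs
        shaded += 1           # pixel shader cost is paid here
        if visible:
            depth[y][x] = z   # depth write after passing the test
    return shaded


# Three layers covering the same pixel, in two submission orders.
front_to_back = [(0, 0, 0.2), (0, 0, 0.5), (0, 0, 0.8)]
back_to_front = list(reversed(front_to_back))
```

With `front_to_back`, the early test shades one fragment instead of three. With `back_to_front`, every fragment passes the test, so the early test saves nothing, which is why draw order still matters.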

So, what's left? FSAA first. Well, really, I fail to see what they could do here. Multisampling performance is mostly limited by Z Buffer fetch performance, so the only option would be to use a more efficient Z compression method. They could have found one, but really, I fail to see which.

One more thing: Texture size, format & compression. What could the fix be, here? Well, there's that ridiculous 64BPP super-mega-highly-efficient compression rumor. I doubt there's any truth in that. But only time will tell, I guess...


Uttar

Mono
10-24-02, 05:41 PM
I guess we'll find out soon enough. I think it's gonna take some balls in the hardware to really push features ahead though... everything has gotten pretty efficient as is.

Juntari
10-24-02, 06:21 PM
I think the features that get included should benefit the current crop of games, not features that aren't mature enough to see any use. But I think color optimization can be useful. I hope NVidia fully reveals the feature set at the upcoming Comdex.

gemini1313
10-24-02, 11:47 PM
one thing for fsaa.

z3 and i dont mean the bmw.

Bigus Dickus
10-25-02, 01:32 AM
I seriously doubt the NV30 will be using Z3 FSAA. But I wouldn't be surprised at all if it used color compression... that seems like a natural progression.

Does the R300 use color compression in any of its functions? MSAA?

Uttar
10-25-02, 02:02 AM
I heard somewhere that the R300 does color optimization for FSAA. But I have yet to find any serious site saying that... So maybe it does, but then ATI doesn't seem to care much about it.

Or maybe it'll be ATI's surprise - they've disabled it in their drivers, and once the NV30 is out, they'll have color compression too.


Uttar

StealthHawk
10-25-02, 04:27 AM
Originally posted by Uttar
I heard somewhere that the R300 does color optimization for FSAA. But I have yet to find any serious site saying that... So maybe it does, but then ATI doesn't seem to care much about it.

Or maybe it'll be ATI's surprise - they've disabled it in their drivers, and once the NV30 is out, they'll have color compression too.


Uttar

you're not getting that mixed up with gamma correct FSAA are you?

Mono
10-25-02, 05:07 AM
someone wanna give me the lowdown on this color optimization technique? Curious how it would save 50% performance in aa.. someone fill me in.

SurfMonkey
10-25-02, 05:48 AM
I go for Z3 FSAA and a hybrid DFR\IMR tech like the new Wildcat. Virtually bandwidth free FSAA and extremely low hit Aniso. There are many more operations that could be done post PS to optimise shader usage as well.

Lezmaka
10-25-02, 06:35 AM
While people seem to be asking for explanations, how about one for this z3 fsaa?

SurfMonkey
10-25-02, 10:39 AM
Originally posted by Lezmaka
While people seem to be asking for explanations, how about one for this z3 fsaa?

It's basically a very economical way of doing antialiasing and transparency. The paper can be found here (http://research.compaq.com/wrl/people/jouppi/Z3/Z3paper.pdf).

Uttar
10-25-02, 02:00 PM
No, I'm not confusing it with Gamma Correct FSAA.

From my understanding, some rumors suggested a 1.4 effective bandwidth figure for the R300 with FSAA. I'm really not sure what this is all about.
My best guess is that it's Z Compression. But I could be 100% wrong.

As for Color Compression, the effect wouldn't be limited to FSAA. However, since with FSAA you get fewer completely different pixels next to each other, the effect could be a little better than without FSAA.

But really, from my understanding, the system would mostly be efficient at higher resolutions.


BTW, I've been thinking about that David Kirk quote saying "We want better pixels, not more pixels" (not an exact quote, but that's the idea).

When I first read that, I thought it meant the NV30's power would be in pixel processing rather than vertex processing. Then, when I thought about it again, I came to a completely different conclusion.


You see, you could also increase pixel quality with Vertex Shaders, since all pixels lie within polys.

But then... What could that quote mean? Your guess is as good as mine. Personally, I'd say that they've found some type of revolutionary way of doing AA or AF. Note that I didn't say FSAA. An adaptive algorithm for AA might be very interesting.

But then again, it's possible this could all simply be about this Color Compression, which might enable higher BPP at a much lower cost.


Uttar

gemini1313
10-25-02, 07:00 PM
uttar is on to something and i know what it is. its gonna be impressive when u read about it, as soon as comdex comes around.

for all the people who dont know what it is yet. just wait another 3 weeks.

its about processing speed and capabilities which compared to the r300 are much more than just impressive. its flat out tech the r300 can't do.

dont dis cg, for it is only the start of what is actually gonna come.
just know this.

nv30 may have had 1 or 2 difficulties getting from here to there in the past couple of months, but this kinda hardware usually does.

also .13u was a difficult migration, but by Q2 2k3 there will be an abundant display of cards for the different budget sectors including a mobile chip on the horizon and the nv35 in the works.

btw the r350 wont be much, for any company can keep speed bumping the speeds and refining the cores.

ben6
10-25-02, 07:52 PM
Simplistically, think about what multisample antialiasing is. Now think about what ATI's HyperZ does. By using lossless Z compression, ATI effectively gets 2x AA for free (not quite, but close enough) and 4x for only a minimal performance loss, because the maximum compression of the Z-Buffer is 4:1 on HyperZ III, but 2:1 in most cases.
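
As a back-of-the-envelope check on that reasoning, the Python sketch below models Z-buffer traffic as "samples per pixel divided by compression ratio". This is a deliberately crude assumption: it ignores color traffic, caches and everything else, and `relative_z_traffic` is a hypothetical helper, not a published ATI formula.

```python
# Crude model (an assumption for illustration): Z traffic scales with the
# number of AA samples and shrinks by the lossless compression ratio.

def relative_z_traffic(samples, compression_ratio):
    """Z-buffer traffic relative to uncompressed rendering without AA."""
    return samples / compression_ratio


two_x_typical = relative_z_traffic(2, 2.0)   # 2x AA at the typical 2:1
four_x_typical = relative_z_traffic(4, 2.0)  # 4x AA at the typical 2:1
four_x_best = relative_z_traffic(4, 4.0)     # 4x AA at the 4:1 maximum
```

Under this model 2x AA at 2:1 costs the same Z traffic as no AA at all, while 4x AA lands somewhere between 1x and 2x depending on how close the scene gets to the 4:1 maximum.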

Mono
10-25-02, 08:03 PM
Originally posted by gemini1313
btw the r350 wont be much, for any company can keep speed bumping the speeds and refining the cores.

k.. the same can be said about nearly every nvidia card since the original geforce up till now. Bleh, quit stating things like you know everything as fact. FACT IS: You don't know what the nv30 will do. FACT IS: You don't know what the r350 will do. k, just had to get that outta my system, quit acting like it's fact until they at least do a paper launch :p

StealthHawk
10-25-02, 10:34 PM
Originally posted by Mono
k.. the same can be said about nearly every nvidia card since the original geforce up till now. Bleh, quit stating things like you know everything as fact. FACT IS: You don't know what the nv30 will do. FACT IS: You don't know what the r350 will do. k, just had to get that outta my system, quit acting like it's fact until they at least do a paper launch :p

actually, he said he DOES know: "uttar is on to something and i know what it is. its gonna be impressive when u read about it, as soon as comdex comes around."

now whether to believe him or not...

gemini1313
10-26-02, 01:40 AM
thx hawk, im confident in my resources and my nvidia products.

nv30 till 40... :)

NVDA all the way!!

Uttar
10-26-02, 10:07 AM
Sounds like there's nothing to be learnt from nVidia patents ( which can be found on the web legally at http://www.uspto.gov/patft/index.html )

Only thing i learnt was that they patented a very strange DMA system ( http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=/netahtml/PTO/search-bool.html&r=1&f=G&l=50&co1=AND&d=PG01&s1=nVidia.AS.&OS=AN/nVidia&RS=AN/nVidia )

It's a pending patent they filed June 5, 2001 - after the tape-out of the NV20 architecture, and they wouldn't have had the time to implement it in the NV25.

Or maybe it's just something they'll never use like their tile-based rendering patent.
Who knows.


Uttar

Uttar
10-28-02, 11:22 AM
Read all of the R300 vs NV30 article at Beyond3D.

Seems objective to me. The only thing I fail to understand is where the heck they got their bandwidth optimization info for the R300.

"Color Compression ( 12:1 )" , "Fast Color-Clear" , "Z-Compression ( 24:1 )"

Err, WTF?

I'm sorry, but I refuse to believe ATI has a 24:1 Z Compression technology. Whoever got that into his head is *insane*

AFAIK, Z Compression didn't change much since HyperZ 1. All that happened is that Z Complexity in real applications increased. IIRC, Z Compression with HyperZ 1 is between 4:1 and 8:1


As for Color Compression, I wouldn't be surprised if the R300 did in fact have that. But 12:1? That's insane...

Either the author is talking about lossy texture compression, or he read a paper which is either nonexistent or very hard to find.

Uttar
10-28-02, 11:51 AM
Sorry for posting again so soon after my last post in this thread, but oh well...

I think the following is very interesting.
http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=/netahtml/PTO/search-bool.html&r=1&f=G&l=50&co1=AND&d=PG01&s1=nVidia.AS.&OS=AN/nVidia&RS=AN/nVidia

It's a patent filed by nVidia on June 5, 2001 - too late for the NV20, and it's unlikely they'd have had the time to implement such a thing in a few months for the NV25.

It's about some type of Direct Memory Access circuit system.
Since I barely understand any of it, I won't try to speculate on it.

However, it's all about accelerating the transfer of data to an input/output device. Sound familiar? :)

There is, however, one BIG question here. What does the term "input/output device" really mean? Sure, a computer is an input/output device. But I doubt this is a way to accelerate keyboard typing speed...

The GPU is also an input/output device. So this could be a way to optimize AGP. However... isn't a pipeline ( such as the Vertex Processor ) an input/output device?

If the answer is "yes", a lot has just been answered. If the answer is "no", what the heck is this patent all about?


BTW, this patent still hasn't been granted to nVidia. That's nothing strange, since some of their patents have taken years to be approved.


Uttar

Joe DeFuria
10-28-02, 12:32 PM
AFAIK, Z Compression didn't change much since HyperZ 1. All that happened is that Z Complexity in real applications increased. IIRC, Z Compression with HyperZ 1 is between 4:1 and 8:1

Actually,

One big change with the R-300 and HyperZ 3 is that Z-Compression is fully utilized with AA. This was not the case with earlier versions.

So the "up to 24:1 compression" (I believe) comes from 8:1 compression with non AA scenes --> up to 24:1 with AA applied (multiple samples per final pixel).

So while 24:1 is indeed a marketing number only reached in the most unrealistic and ideal of cases, HyperZ 3 is significantly improved when using AA. And a 3X increase in efficiency relative to Hyper Z I/II is pretty accurate, IMO. (Again, when AA is turned on.)
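
The arithmetic behind the "up to 24:1" figure can be sketched in Python as below. The 3x AA factor is simply what 24:1 over 8:1 implies, and the 4:1 lower bound is the HyperZ 1 range quoted at the top of this post. Neither is a measured number, and `z_ratio_with_aa` is a hypothetical helper for illustration.

```python
# Sketch of the marketing arithmetic, under the assumption that AA
# multiplies the non-AA compression ratio by a fixed factor of 3.

def z_ratio_with_aa(base_ratio, aa_gain=3.0):
    """Effective Z compression ratio with AA, given the non-AA ratio."""
    return base_ratio * aa_gain


low = z_ratio_with_aa(4.0)   # from the 4:1 end of the quoted range
high = z_ratio_with_aa(8.0)  # from the 8:1 end of the quoted range
print(f"{low:.0f}:1 to {high:.0f}:1")  # prints "12:1 to 24:1"
```

So if the non-AA range really is 4:1 to 8:1, the same 3x factor gives a range with AA rather than a single 24:1 number.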

Uttar
10-28-02, 01:24 PM
Hmm, interesting. But then again, that would be between 12:1 and 24:1, not 24:1
Talk about marketing numbers...

Anyone got an idea about that color compression number?


Uttar