View Full Version : Epic Unreal warfare new engine


SavagePaladin
11-23-02, 12:24 AM
(semi off topic) please God no, Unreal 2's been delayed enough already :p

Bigus Dickus
11-23-02, 12:49 AM
Originally posted by Nv40

5) 1.6 million polygons for the main character!!!
6) 10 million polys for the full scene! Wow!
7) the GeForce FX should be able to move 20-30 characters of
that quality at the same time... pretty cool ..

:confused:

And how do you figure it will render 20-30 characters of that complexity at the same time?

The quoted figure for the NV30 vertex performance is 375 million tris/second. 20 characters at 1.6 million polygons is 32 million polygons, which gives roughly 12 frames per second. Hidden surface removal doesn't help because the geometry still has to be transformed. That is assuming maximum theoretical throughput, and no scene geometry. Adding in your estimated 8.4 million polygons for the rest of the scene, that gives just over 9 frames per second.

So yeah, I suppose the GF FX could move 20 characters of that quality at the same time, but it would be a slide show. Are those polygon counts accurate? No graphics card in the next year will be able to handle more than one, possibly two characters plus scene at acceptable framerates.
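
The arithmetic in the post above can be checked with a few lines of Python. This is only a sketch: the constant and function names are made up for illustration, and the inputs are the figures quoted in the thread (375 million tris/sec peak, 1.6 million polygons per character, 8.4 million for the rest of the scene), not measurements.

```python
# Back-of-the-envelope check of the figures quoted in the thread.
# All names here are hypothetical; the numbers are the posters' claims.
PEAK_TRIS_PER_SEC = 375e6    # quoted NV30 peak vertex throughput
TRIS_PER_CHARACTER = 1.6e6   # quoted polygon count of the main character
SCENE_TRIS = 8.4e6           # 10 million scene total minus one character


def frames_per_second(characters, scene_tris=0.0, tris_per_sec=PEAK_TRIS_PER_SEC):
    """Upper bound on fps if every triangle is transformed every frame."""
    tris_per_frame = characters * TRIS_PER_CHARACTER + scene_tris
    return tris_per_sec / tris_per_frame


characters_only = frames_per_second(20)
with_scene = frames_per_second(20, SCENE_TRIS)

print(f"20 characters, no scene:  {characters_only:.1f} fps")  # 11.7 fps
print(f"20 characters plus scene: {with_scene:.1f} fps")       # 9.3 fps
```

Both results assume the card sustains its peak rate, so they are ceilings rather than expected framerates.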

alex.burm
11-23-02, 01:00 AM
NO PLAYER MODEL SHOULD EVER USE 1 MILLION POLYS!!!!
it's just too much!!! 50k MAX. u can't even see that sort of detail anyway.
rather than concentrate on making higher and higher poly characters, why not focus on having faster framerates instead? why run a game with loads of detail at 20-30fps when u can run it at 100fps with less detail, a difference that's barely noticeable?

i say framerates are more important than increasing detail, which lowers fps anyway. :)

thcdru2k
11-23-02, 02:00 AM
who knows how far away this engine is from being used in games. perhaps it's not intended for the geforce fx, but tech after that.

Nv40
11-23-02, 02:12 AM
Originally posted by Bigus Dickus
:confused:

And how do you figure it will render 20-30 characters of that complexity at the same time?

The quoted figure for the NV30 vertex performance is 375 million tris/second. 20 characters at 1.6 million polygons is 32 million polygons, which gives roughly 12 frames per second. Hidden surface removal doesn't help because the geometry still has to be transformed. That is assuming maximum theoretical throughput, and no scene geometry. Adding in your estimated 8.4 million polygons for the rest of the scene, that gives just over 9 frames per second.

So yeah, I suppose the GF FX could move 20 characters of that quality at the same time, but it would be a slide show. Are those polygon counts accurate? No graphics card in the next year will be able to handle more than one, possibly two characters plus scene at acceptable framerates.

by the time those games ship, we will have an NV40 or R400,
with Athlon Hammers and 4 GHz P5s ....

i think Epic should be using the same techniques as id Software.
they should use invisible very-high-polygon characters, from 500k to 1 million polys, to project bump maps onto a low-poly version of the real characters. but look at the video and see it for yourself; the Epic programmer gave all the numbers, not me. the character looks 2x-3x more detailed than the Doom 3 ones.
also EverQuest 2 has very high quality characters too, with long hair and shiny armor and human-like animations.

so the GeForce FX must be rendering not the high version of the models but a "low" version with the bump maps of the
ultra-high-poly characters. it is a rendering technique to save performance, and it looks very cool. realtime soft shadows too.
what i see is more aggressive developers, finally pushing
the hardware to its limits .. and the return to 640x480 playing resolutions .. :)

StealthHawk
11-23-02, 02:52 AM
Originally posted by Bigus Dickus
:confused:

And how do you figure it will render 20-30 characters of that complexity at the same time?

Sweeney said that the new Unreal engine was capable of it. and since this was the launch of the gfFX the implication is that it will be playable on that card. he might have mentioned the card verbally, i don't remember.

do i believe it will be possible with characters that detailed? no. you'll probably have to turn down a lot of the detail as well as the resolution, but it's nice to see so much progress being made.

Bigus Dickus
11-23-02, 03:20 AM
I'm not disputing that there may be an Epic engine that can handle character models of 1+ million polygons. And I'm not disputing that the NV30 can probably handle some very detailed character models that truly look fantastic at playable framerates.

But Nv40 said the GFFX could render 20 to 30 1.6 million polygon characters at the same time, and that just ain't gonna happen. Even R400 and NV40 will be hard pressed to process 1.2 billion tris/sec, which is what 20 characters plus scene at that complexity would require at just 30 fps. And lowered resolution won't help... we're talking vertices here, not pixels.
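
For anyone who wants to see where the 1.2 billion figure comes from, here is a minimal sketch. The function name is hypothetical and the inputs are the thread's quoted numbers; it simply multiplies triangles per frame by the target framerate and compares the result with the quoted 375 million tris/sec peak.

```python
def required_tris_per_sec(characters, tris_per_character, scene_tris, target_fps):
    """Triangles that must be transformed each second to hold target_fps."""
    tris_per_frame = characters * tris_per_character + scene_tris
    return tris_per_frame * target_fps


# Figures quoted in the thread: 20 characters at 1.6M polys, 8.4M scene, 30 fps.
needed = required_tris_per_sec(20, 1.6e6, 8.4e6, 30)

# How far the quoted NV30 peak (375M tris/sec) falls short of that.
shortfall = needed / 375e6

print(f"required: {needed / 1e9:.2f} billion tris/sec")  # 1.21 billion
print(f"that is {shortfall:.1f}x the quoted NV30 peak")   # 3.2x
```

Resolution does not appear anywhere in the calculation, which is the point being made: the vertex load is the same at 640x480 as at 1600x1200.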

StealthHawk
11-23-02, 07:52 AM
using EQ2 as an example, and please note that i'm not sure of the polygon count here, we were informed that gfFX would render many many characters, 20-30+ at full speed on screen at once. i use this example since i vaguely remember it being said, and other people apparently heard it as well.

Sweeney *may* have said the same thing about the next Unreal. then again, maybe not. personally i remember something like the one character shown having 1 million polys, and i don't believe for a second that gfFX could render 20-30 on screen at the same time with that model complexity.

Mod
11-23-02, 09:30 AM
Originally posted by Bigus Dickus
:confused: The quoted figure for the NV30 vertex performance is 375 million tris/second. 20 characters at 1.6 million polygons is 32 million polygons, which gives roughly 12 frames per second. Hidden surface removal doesn't help because the geometry still has to be transformed. That is assuming maximum theoretical throughput, and no scene geometry. Adding in your estimated 8.4 million polygons for the rest of the scene, that gives just over 9 frames per second.

Isn't 375 million the number for unfilled polys, that is, no lighting and no texture? :confused:

With all custom details, shouldn't it go down to 1 or 2 fps? :confused:

Mod
11-23-02, 09:34 AM
Originally posted by Bigus Dickus
Even R400 and NV40 will be hard pressed to process 1.2 billion tris/sec, which is what 20 characters plus scene at that complexity would require at just 30 fps.

Have you come back from the future, or are you using Moore's law?

Uttar
11-23-02, 10:10 AM
Originally posted by Mod
Isn't 375 million the number for unfilled polys, that is, no lighting and no texture? :confused:

With all custom details, shouldn't it go down to 1 or 2 fps? :confused:

That 375 million/s number is for when you ONLY do the basic transformations of x,y,z and output those values and an input color. Nothing more.

You've got to differentiate Pixel Shaders & Vertex Shaders.
Texturing can only be done in Pixel Shaders.

Lighting, on the other hand, can be done in both Vertex Shaders & Pixel Shaders. The quality when it's done in the Pixel Shaders is higher, but it becomes a bottleneck very rapidly on current GPUs.

Now, there is also skinning for models. That obviously takes Vertex Shading power, and since those posts were talking about models, 375M/s is impossible.

And doing that on the CPU would simply kill it. No more physics, IO, ...
In other words, the 375M/s figure is NOT affected by texturing or filling the polygons, but since those are models, 375M/s simply won't happen. If it's highly optimized, doesn't have too many matrices, and uses branching, 200M/s maybe... But more is utopian IMO.
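
To put a number on that, here is a small sketch comparing the transform-only peak with the optimistic 200M/s guess for skinned models. The per-frame triangle count (20 characters at 1.6M plus an 8.4M scene) is taken from earlier in the thread; the variable names are made up for illustration, and 200M/s is an estimate, not a benchmark.

```python
# Triangles per frame for 20 characters plus scene, using the thread's figures.
TRIS_PER_FRAME = 20 * 1.6e6 + 8.4e6

# Vertex rates under discussion (tris/sec). The second one is a guess.
rates = {
    "peak, transform only (375M/s)": 375e6,
    "optimistic with skinning (200M/s)": 200e6,
}

# Upper bound on framerate for each rate.
fps_by_rate = {label: rate / TRIS_PER_FRAME for label, rate in rates.items()}

for label, fps in fps_by_rate.items():
    print(f"{label}: {fps:.2f} fps")
# peak, transform only (375M/s): 9.28 fps
# optimistic with skinning (200M/s): 4.95 fps
```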


Uttar

Bigus Dickus
11-23-02, 04:35 PM
Originally posted by Mod
Isn't 375 million the number for unfilled polys, that is, no lighting and no texture? :confused:

With all custom details, shouldn't it go down to 1 or 2 fps? :confused:

Yeah, I was just taking the simplistic approach to make the point, but I guess the situation is much more hopeless than I made it sound. :)

Bigus Dickus
11-23-02, 04:37 PM
Originally posted by Mod
Have you come back from the future, or are you using Moore's law?

Just a speculation, based on the 2x to 4x increase in vertex processing power per generation. Of course, they could be using a new primitive processor in combination with an uber-powerful VS that blows the doors off of my speculation, but I was just trying to be realistic.

We probably won't see cards capable of handling that many polygons on screen at playable framerates for a couple of years still.
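
As a rough sketch of that speculation: assuming vertex throughput really does grow by a fixed factor each generation (2x or 4x, as guessed above), the number of generations needed to get from the quoted 375M tris/sec to the 1.2 billion tris/sec target can be counted with a logarithm. The function name is hypothetical, and peak rates are used throughout, so this is the optimistic case.

```python
import math


def generations_needed(current_rate, required_rate, gain_per_generation):
    """Whole generations until current_rate * gain**n reaches required_rate."""
    if current_rate >= required_rate:
        return 0
    ratio = required_rate / current_rate
    return math.ceil(math.log(ratio) / math.log(gain_per_generation))


current = 375e6     # quoted NV30 peak, tris/sec
required = 1.212e9  # 20 characters plus scene at 30 fps

slow = generations_needed(current, required, 2.0)
fast = generations_needed(current, required, 4.0)

print(f"at 2x per generation: {slow} generations")  # 2
print(f"at 4x per generation: {fast} generation")   # 1
```

So one generation only gets there if it lands at the top of the 2x to 4x range, and only on paper.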

alex.burm
11-23-02, 09:40 PM
if the engine has an option to scale the models down to a normal 50k triangles, then what's the problem? i'd rather play at 100fps with lower detail models than at 10fps with INSANELY detailed ones.

all new engines should be scalable imho. framerate IS the most important factor in games, not detail. fps first, detail second.

borntosoul
11-23-02, 11:15 PM
when cards move to .09 micron then we will have the big jump everyone is waiting for, close to christmas next year or a bit after. if the radeon is up to 3 times faster than the gf4 on .15 micron, imagine how much faster those cards will be than the radeon, maybe even 5 times!

StealthHawk
11-23-02, 11:26 PM
Originally posted by borntosoul
when cards move to .09 micron then we will have the big jump everyone is waiting for, close to christmas next year or a bit after. if the radeon is up to 3 times faster than the gf4 on .15 micron, imagine how much faster those cards will be than the radeon, maybe even 5 times!

it has nothing to do with the process, it has everything to do with the bandwidth and use of bandwidth. if you notice, the r9700 is only 2.5-3x higher than gf4 performance in two situations, FSAA, and FSAA+AF.

of course the gf4 has terrible AF performance in D3D, and acceptable performance in OGL. but the r9700 really doesn't have less of a performance hit with AF than the r8500 does. so i don't see that changing between ATI cards.

then, let's look at FSAA. in this case, the more bandwidth or effective bandwidth you have, the better. but we are already at 4x FSAA with a minimal performance hit with the r9700. eventually you will get to a point where either the sampling pattern will be improved, or FSAA will be effectively free because of the large amounts of bandwidth available on the card.

in other words, i doubt we'll see such great jumps again, except maybe in very high resolutions like 1600X1200. again, the great 3x jumps are usually already only seen in resolutions like 1600. it will be very hard to get that same magnitude of increase over the r9700, unless the video card is already being stressed. and that will depend on the adoption of things like DX9.

borntosoul
11-23-02, 11:36 PM
i didn't say bandwidth has nothing to do with it :) but just save your post and look at it 1 year from now :) the things they will be able to do then will be so far removed from what we've got now it won't be funny. what makes you think that we won't get that sort of increase?

Chalnoth
11-24-02, 01:32 AM
Originally posted by borntosoul
i didn't say bandwidth has nothing to do with it :) but just save your post and look at it 1 year from now :) the things they will be able to do then will be so far removed from what we've got now it won't be funny. what makes you think that we won't get that sort of increase?

Because there would need to be a corresponding increase in memory bandwidth savings technologies, which is unlikely. Even if raw fillrates do increase by that much, I doubt we'll have enough effective memory bandwidth to generate a similar performance improvement.

Additionally, don't be too expectant that .09 micron will be that great. The smaller and smaller die processes go, the less of a benefit there will be (Perhaps asynchronous processing will alleviate this...could be interesting).

DIMA
11-24-02, 10:54 AM
And who said the character is actually 1.6 million polygons?

As far as I remember, Sweeney said that the original art is about 1.6 million polygons, while the ingame model (like in DOOM III) is much lower in polys.

No big surprise there, 20-30 characters on screen at once should be a no-brainer.

alex.burm
11-24-02, 12:43 PM
hawk,
nothing to do with feature size? of course it has "everything" to do with it, u gimp. :rolleyes: that is what digital logic devices are built from...

smaller transistors are faster than bigger ones for a start ;)
and smaller means u can pack more of them in the 0.5mmx0.5mm space thats reserved for chip cores.:)

so what youre saying is that we may as well play in 10x7 resolution, cos below that, its almost the same performance. well i vote for that!
of course mem bandwidth increases exponentially (ok linearly, but exp sounds better!) with increasing res, the vertices stay the same, so maybe they will improve that part of the design.

"Additionally, don't be too expectant that .09 micron will be that great. The smaller and smaller die processes go, the less of a benefit there will be (Perhaps asynchronous processing will alleviate this...could be interesting)."
what ARE you talking about? ive explained above...:rolleyes:

async processing? as in mem and core clocked asynchronously?

well the *biggest* benefit, if u want to talk about every possibility, would be to use SRAM and not DRAM for the memory. sram is static ram (dram is dynamic ram). static ram is MUCH MUCH faster than dram, so using that would help, except 128mb of sram would cost an enormous amount. for those that might not know, sram is what on-chip memories are made from...very fast.

dima,
(u russian btw?) :D

Chalnoth
11-24-02, 01:41 PM
Originally posted by alex.burm
async processing? as in mem and core clocked asynchronously?

No, nothing like that. I mean no clock speed. I saw a paper on this over at Beyond3D a short while ago.

Basically, the clock speed of a chip is set by its slowest part.

If you just let the transistors switch at their own pace, and include logic to synchronize different paths properly, your chip can run close to the average speed of the transistors, not the slowest speed.

The paper showed a bunch of gaussian distributions (bell-shaped things) of various die processes. Basically, the idea is that the smaller you go, the more uncertain your process is. That is, the smaller you go, the more difference there is between the different transistors. More switch more slowly, and more switch more quickly than the mean.

So, if you get rid of the overriding clock (and include the synchronization logic for output...), and let each transistor switch at its own pace, there's no problem with some transistors being slower than the others. The fast ones will cancel out the slow ones.

Anyway, from what I was reading, I don't really know how hard it would be to modify current fabrication to support asynch designs, though logic would dictate that it wouldn't be a challenge at all. The challenge would be in converting to the new mindset in design.
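
A toy simulation may make the idea more concrete. This is only a sketch under strong assumptions, not anything taken from the paper: path delays are drawn from a Gaussian, the clocked design is limited by the slowest path, and the self-timed design is idealized as running at the mean delay with zero synchronization overhead. All names and parameter values are invented for illustration.

```python
import random
import statistics


def simulate(num_paths=10_000, mean_delay=1.0, spread=0.1, seed=0):
    """Return (clocked_period, self_timed_period) for one simulated chip."""
    rng = random.Random(seed)
    # Gaussian path delays, clamped so that no delay is zero or negative.
    delays = [max(rng.gauss(mean_delay, spread), 0.01) for _ in range(num_paths)]
    clocked_period = max(delays)                 # global clock waits for the slowest path
    self_timed_period = statistics.mean(delays)  # idealized: no synchronization cost
    return clocked_period, self_timed_period


results = {}
for spread in (0.05, 0.10, 0.20):
    clocked, self_timed = simulate(spread=spread)
    results[spread] = clocked / self_timed
    print(f"spread {spread:.2f}: clocked period is {results[spread]:.2f}x the self-timed one")
```

In this model, the wider the distribution (standing in for a smaller, less certain process), the bigger the gap between the slowest path and the average, which is the argument for dropping the global clock. A real design would give some of that back to the synchronization logic.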

Bigus Dickus
11-24-02, 06:24 PM
You're talking about distributed asynchronous clocking? There's been a lot of research done along those lines by IBM and Motorola I believe.

It will be interesting to see what comes of it. It is possible however that an asynchronous chip with a higher "average" clockspeed is quite a bit less efficient than one with a lower clockspeed that is running in sync though. One logic unit might be running 10% faster, but if it "misses" the timing input for some data, it may have to wait another cycle to catch it. That's the biggest hurdle, and I think that's what all the research is concentrating on. If they can manage to make a way for clock timings to be driven on input signals (when input is needed) and on maximum logic unit speed when no input is needed, then they might make a breakthrough.

Chalnoth
11-24-02, 07:56 PM
Well, the main goal should be to make sure the process is, more or less, uncertain in a uniform fashion. That way, as long as your paths are long enough, if they're all the same length, they'll all finish very close to one another.

But yes, there is most definitely overhead to asynchronous processing. I was thinking more along the lines of extra transistors, though. It would definitely take some logic to get all of the final data synchronized.

Anyway, I don't really see what you're describing as much of a problem. It is an issue, yes, but it's never going to be as slow as clock-driven processing (Assuming that the overall clock doesn't make the transistors actually switch somewhat faster...).

borntosoul
11-24-02, 08:33 PM
i think there will be a 3x performance improvement around that time because the cards by then will be more optimized to run the higher colour depths, so not only in 1600x1200 with aa and af! and who knows, we might have some new technologies that will be revolutionary. we've got to think outside the norm

StealthHawk
11-24-02, 09:31 PM
Originally posted by alex.burm
hawk,
nothing to do with feature size? of course it has "everything" to do with it, u gimp. :rolleyes: that is what digital logic devices are built from...
yeah, you read what i said. and judging from comments on the internet, i'm completely correct :rolleyes:

everyone seems to think that nvidia's 8x and 6x modes will look inferior to ATI's current RGMS 6x. maybe they are wrong, maybe they are right. we'll have to wait and see.

but wait....gfFX has to be better than r9700 in every way, because it's built on a smaller process. obviously that logic doesn't pan out, because gfFX is not superior in every way, based on simple documentation.