Doom3

Ever since I listened to Carmack's speech at QuakeCon, I've been wondering about something…

In the speech he said Doom3 would use .map files instead of .bsp files, maybe with a secondary binary file for edge coherency & t-junction data…
Later on he wrote somewhere (in his .plan, I think) that currently 'Doom3' is using .map files only and no binary file would be used…

This basically means that all VSD preprocessing etc. would be done at level loading…
And although it's possible to draw an entire Q3 level without VSD at 90fps on GeForce hardware (I've tried it ;o),
I doubt Carmack is planning to have no VSD at all.

Especially if you consider that he's going to implement realtime lighting using stencil buffers, which basically means virtually everything is redrawn at least once per light…
He was talking about 8 passes per surface!
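
To make that cost concrete, here's a minimal sketch of what per-light multi-pass rendering might look like. Light, Surface, drawAmbient, drawLit and lightTouches are all invented placeholder names, not anything from id's code:

```cpp
#include <vector>
#include <GL/gl.h>

// Hypothetical multi-pass lighting loop: a base pass lays down depth and
// ambient, then every light adds one more additive pass over every surface
// it touches. That's where figures like "8 passes per surface" come from.
void renderScene(const std::vector<Light>& lights,
                 const std::vector<Surface>& surfaces)
{
    glDepthFunc(GL_LESS);
    for (size_t i = 0; i < surfaces.size(); ++i)
        drawAmbient(surfaces[i]);               // base pass fills the z-buffer

    glEnable(GL_BLEND);
    glBlendFunc(GL_ONE, GL_ONE);                // sum the light contributions
    glDepthFunc(GL_EQUAL);                      // only repaint base-pass pixels
    for (size_t l = 0; l < lights.size(); ++l)
        for (size_t i = 0; i < surfaces.size(); ++i)
            if (lightTouches(lights[l], surfaces[i]))
                drawLit(surfaces[i], lights[l]);   // a stencil shadow test
}                                                  // would mask this per light
```

Every surface a light reaches gets redrawn once for that light, so the effective overdraw multiplies fast.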

And considering that you need at least some VSD in order to correctly draw multiple alpha-blended and other transparent surfaces…

Add to this that he was mysteriously talking about Doom3 being more dynamic than anything seen to date, and about brush connectivity etc. etc.
It almost sounds like every brush would be dynamic and movable… although I doubt that.

It does sound like Doom3 is going to be a cool game / engine…

But it does leave me wondering what the hell he’s doing hehe

So, my question is, do you guys have any idea what kind of VSD Carmack is doing?

Considering that preprocessing will have to be lightning quick, since it'll be done at level loading time…
It sure as hell won't be BSP… (at least, I don't think so)
Probably some rough portal rendering…
Just turn on r_portalonly (or something) in Q3A… it's just as fast as using it with the BSP tree (at least on my GeForce1)
and considering that GeForce1/2/NV20 will be the target system for Doom3…

oh well.

Heh, yeah, I know, a bit off topic ;o)

There's a very interesting interview at VoodooExtreme. There, Carmack makes it (almost) clear that he's going to use a portal engine for Doom. I can't tell if it's the usual "clip the view frustum to the portal" engine, but I'm sure there will be no VSD preprocessing.
Also, there is no reason to store the geometry in a BSP, but that was also the case with Quake 3, so here Carmack is unpredictable.

Well, portal rendering itself needs to be preprocessed…
You don't magically get all of the portals and cells, after all; they need to be calculated somehow/somewhere…

As for "storing the geometry in a BSP"…
I'm talking about the BSP tree itself, basically… not about the geometry…
The geometry is quick & easy to calculate…
It's how to calculate the data you need to perform (rough) VSD quickly that's the mystery…

But thanks for the info about the article on VoodooExtreme, I'll be sure to check it out…

You can create some kind of "dynamic portal".

Brushes are used to define a region that is only visible from a certain spot. Then, if something moves into that brush, you add it to an "in-brush" object list. So if you don't have that brush in view, you just don't render anything in it.

With that, you can even make terrain rendering faster, if the map makers are clever enough to place the brushes at the right spots.

With today's more powerful machines and the size of the data, it's pretty "unclever" to use a perfect VSD algorithm. What you need is just very gross culling and scalable geometry.
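
A tiny sketch of that idea (Aabb, Frustum, Object and drawObject are assumed placeholder types, nothing real):

```cpp
#include <vector>

// Designer-placed brush defining a visibility region: moving objects inside
// it register themselves on the brush, and if the brush itself isn't in
// view, everything it contains is skipped with a single test.
struct VisBrush {
    Aabb bounds;                        // region the brush covers
    std::vector<Object*> contents;      // objects currently inside the region
};

void renderWorld(const Frustum& view, const std::vector<VisBrush>& brushes)
{
    for (size_t i = 0; i < brushes.size(); ++i) {
        if (!view.intersects(brushes[i].bounds))
            continue;                   // brush out of view: skip its contents
        for (size_t j = 0; j < brushes[i].contents.size(); ++j)
            drawObject(*brushes[i].contents[j]);
    }
}
```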

[This message has been edited by Gorg (edited 09-25-2000).]

I am doing some research into a portal engine myself and I think I know what he is talking about.

With portal rendering, there is no need for preprocessing of map data because you can easily create the cells on the fly. The trick is to come up with an algorithm to minimize the number of cells used to define a room, this allows for fewer splits of the view frustum. Because you only render what is visible, you have extra processing time for other things like dynamic lighting.
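
For reference, the classic recursive portal traversal looks roughly like this sketch (all names are invented; this is the textbook method, not Carmack's code):

```cpp
// Render the cell the eye is in, then recurse through each visible portal,
// narrowing the view frustum to that portal as you go. Only cells reachable
// through a chain of visible portals ever get drawn.
void renderCell(Cell& cell, const Frustum& frustum)
{
    if (cell.visitedThisFrame)
        return;                          // guard against portal loops
    cell.visitedThisFrame = true;

    drawGeometry(cell, frustum);

    for (size_t i = 0; i < cell.portals.size(); ++i) {
        Portal& p = cell.portals[i];
        if (!frustum.canSee(p.polygon))
            continue;
        Frustum narrowed = frustum.clipToPortal(p.polygon);
        renderCell(*p.neighbour, narrowed);   // descend with tighter frustum
    }
}
```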

And since the loading of the world data into cells is realtime, you can easily change the existing cells dynamically.

Though I think Carmack has something in mind to go with the already-known portal rendering method. Doom3 is supposed to support more open areas, which have been shown to be slow with portal rendering. He might have a hybrid rendering scheme that mixes portal rendering with a form of terrain rendering. It is quite possible that once you leave a cell through a portal, the rendering system jumps to an entirely different scheme to render what is beyond the portal.

Another benefit is that you can design levels using the engine in realtime. This idea is similar to the DukeNukem3D map editor.

Nothing that Carmack has stated about his engine has surprised me; it makes sense, and I have already seen engines that show off some of the technology he speaks of. I hope to be able to come up with my own solutions as I develop my own engine.

/skw|d

Well, building portal cells is fast, but I doubt it's fast enough to do in realtime…
At least not when you're doing rendering, AI, sound, physics and whatnot at the same time…

As for zero overdraw, that's overrated…
It's probably faster to have a lot of overdraw than to split a lot of polygons to get zero overdraw (generating more vertices to send over to the 3D card, plus the CPU overhead of the splitting itself).
Fillrate is not the problem with rendering; bandwidth is.

These days it's usually more effective to find a very rough estimate of what is visible and just send it over to the card than to try to find out exactly what is visible and draw only that…
Especially if you have to use a less optimal format when sending the data to the card…
Better to keep everything as static as possible and use matrices to move/rotate stuff…

I mean, I think Carmack said something about having roughly 30x overdraw per pixel??
And I remember him saying something about having about 8 passes per surface on average…

I mean, wow… that's a lot more overdraw than I thought in the first place…

This suggests to me that there could be lots of dynamic data in the scene, highly curved surfaces, etc. Maybe it won't all be indoors.

Remember that the BSP structure was only a means of visibility processing which was very well matched to the database being used. That database consisted of a series of convex hulls with preprocessed intervisibility information. The BSP traversal determined which leaf the eye was in to access a mask for which other leaf geometry sets were also potentially visible.
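
In code, that scheme boils down to something like this sketch (field names are invented, though the negative-index leaf encoding is the well-known Quake convention; BspNode, Vec3 and dot are assumed):

```cpp
// Walk the BSP from the root: at each node, step to the side of the
// splitting plane the eye is on. Negative child indices encode leaves.
int findLeaf(const BspNode* nodes, const Vec3& eye)
{
    int idx = 0;
    while (idx >= 0) {
        const BspNode& n = nodes[idx];
        float d = dot(n.plane.normal, eye) - n.plane.dist;
        idx = (d >= 0.0f) ? n.front : n.back;
    }
    return -(idx + 1);                   // decode the leaf index
}

// The eye's leaf selects a PVS bitmask: one bit per leaf in the map,
// set if that leaf is potentially visible from here.
bool leafVisible(const unsigned char* pvs, int leaf)
{
    return (pvs[leaf >> 3] & (1 << (leaf & 7))) != 0;
}
```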

As the nature of the database, hardware and surface descriptions change alternative means of visibility processing make sense.

I don't think it's a given that there will be no preprocessing of the scene, but it could well be the case: CPUs are much faster, and the kinds of processing required during the load may be offset by the time to load things like shaders and geometry. It may also be a mistake to assume that the rendering will be as efficient as in previous engines. A portal engine seems extremely limited for anything but running around a cluttered series of rooms with narrow connections (and I mean more cluttered than previous shooters). It's way too fussy for anything similar to what's gone before.

The edge connection list & t-junctions are required for dynamically remeshing curved surfaces without cracked seams. If there is no curvature, this information and geometry would be fixed and in the core description; tristripping on the fly isn't enough motivation for this. There may be much more payback from intense work on view-dependent tessellation algorithms for this type of database than from sophisticated model-space visibility culling. In addition, as the geometry gets arbitrarily complex and unpredictable, sorted eye-space occlusion culling with an emphasis on level-of-detail tessellation might make more sense for many types of scene, especially if you expect hardware will have some sort of coarse z-buffer testing… and everyone will. Heck, with coarse z-buffer testing you'll be focusing on geometry again rather than overdraw and fill, particularly with many textures in a single pass. You waste less bandwidth sending tris, and in the distance there are fewer of them.

Carmack's art has always been matching the engine to the technology. If you're trying to figure out what he's going to architect, you need to look at future card features (he runs on bargain-basement cards, but he targets the latest & greatest) and the type of database being drawn (look at the direction Quake3 moved the database in).

Coarse z-buffer testing means you need to sort on these cards anyway. It also means geometry, not fill, could become the issue.

Many textures & register combiners etc. mean single-pass shaders, again improving fill while reducing the number of times you send geometry.

The database mandates much more complex curved, even animated, surfaces, which in turn makes schemes like facet-aligned BSP trees impossible.

A BSP tree is also faster to generate when it sorts larger groups of objects by their bounding information instead of leaf-level data. If you want a reasonable BSP (or other structure) for quickly sorting big chunks of geometry, rather than getting down to a facet-level leaf to resolve visibility, it may make more sense to compute it when loading.

Well, if you thought I thought that Doom3 would have no VSD, then you misunderstood me…
It HAS to have some VSD, and therefore it requires at least some preprocessing…
Only it's done during load time…
And Carmack mentioned something about preprocessing not working the same as in previous id games, in the sense that Doom3 requires smarter designing from level designers because the preprocessing won't be foolproof (or something; sorry, bad explanation).
It comes down to level designers needing to use more hint/clip-type brushes to get good VSD compared to older id games…
He said something about the technology moving away from letting one tool do all the VSD preprocessing for you, towards more feedback from the level designers…

So a lot of VSD preprocessing information can already be in the .map file…

He IS calculating the t-junction data etc. while loading the level, because I doubt that he'd store raw polygon & edge data in the Doom3 .map files…
Most likely he'll use some updated version of the old .map file format…

Some form of VSD is always necessary, even with z-buffers…
At least when you're planning on displaying transparent surfaces…

As for portal rendering, you don't have to use convex-cell portal rendering; you can use portal rendering in the form of areas connected to areas…
Just like id did in Q2 & Q3…

Actually, without some sort of portal rendering, combining one form of VSD with another (e.g. terrain rendering) will be very hard to do…

Although I doubt id will have terrain in Doom3…
Maybe I'm wrong, but it doesn't seem to be Carmack's thang…

When I said on-the-fly, I meant on map loading; I don't see how I implied that you process the portals/cells every frame. You can update them or insert/remove them dynamically, however.

And overdraw is very important; the limitation of the hardware is the amount of polys you process. If you render a poly with a texture, dynamic lighting, shader effects… and it's not even visible… that is a waste of a lot of processing time. A BSP can suffer from 200% overdraw, so 2x the number of polys are sent to the hardware and get processed. If you construct the PVS from the camera in realtime, then you only send the visible polys to the hardware, because there is no overdraw.

As for portal rendering, you don't have to use convex-cell portal rendering; you can use portal rendering in the form of areas
connected to areas… Just like id did in Q2 & Q3…

That is not true; portal rendering requires convex hulls, and you adjoin convex hulls to form complex shapes. This gives you a lot of portals, each one of which will clip the view frustum, and a lot of splits are evil.
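
For what it's worth, "clipping the view frustum" at a portal usually means building a new frustum from planes through the eye and each portal edge, roughly like this sketch (names invented; it assumes the portal polygon is convex and consistently wound):

```cpp
// One plane per portal edge, each passing through the eye point. This only
// works because the portal is convex: the planes then bound exactly the
// volume visible through it. More portals = more of these built per frame.
Frustum frustumThroughPortal(const Vec3& eye, const Polygon& portal)
{
    Frustum f;
    size_t n = portal.verts.size();
    for (size_t i = 0; i < n; ++i) {
        const Vec3& a = portal.verts[i];
        const Vec3& b = portal.verts[(i + 1) % n];
        // Plane through the eye and one portal edge; with consistent
        // winding, its normal faces into the visible volume.
        Vec3 normal = normalize(cross(a - eye, b - eye));
        f.planes.push_back(Plane(normal, dot(normal, eye)));
    }
    return f;
}
```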

Quake2 only used BSP trees for the calculation of the PVS; Quake3 used a portal hack on top of a BSP tree for special effects like the transport destination view and rotating mirrors. Doom3 will be a portal-based engine, no more BSP. The main reason is to be able to increase the complexity of the levels and to remove the map-preprocessing requirement.

/skw|d

Not positive, but when Firestorm said you don't have to use convex hulls for portals, I think he MIGHT have meant you don't need convex hulls that match up with the room geometry. You can use the convex hulls as a sort of rough bounding box, to create general areas of the map.

Say you are trying to model a 5-story building, the stories connected by a staircase which sticks out of the side of the building (kinda like a fire escape). You can put a convex hull around each of the 5 stories, and a 6th convex hull around the staircase. Then a portal can be used to link each floor to the staircase. In this case, each floor of the building can be concave (hallways and rooms), but its bounding hull is convex.

Originally posted by skw|d:
That is not true; portal rendering requires convex hulls, and you adjoin convex hulls to form complex shapes. This gives you a lot of portals, each one of which will clip the view frustum, and a lot of splits are evil.

a lot of splits are always evil (well, in principle at least, maybe not always)

Portal rendering doesn't have to be fully convex; you can have non-convex cells too…
You can think of the non-convex parts as 'details' (like detail brushes) or anti-portals or whatnot…

I've seen it work; it is possible.

The key is how Carmack will define a cell (convex or not).
Here is a part from the interview at VE:
“In any case, the gross culling in the new engine is completely different from previous engines. It does require the designers to manually place portal brushes with some degree of intelligence, so it isn’t completely automated, but I expect that for commercial grade levels, there will be less portal brushes than there currently are hint brushes. It doesn’t have any significant pre-processing time, and it is an exact point-to-area, instead of cluster-to-cluster. There will probably also be an entity-state based pruning facility like areaportals, but I haven’t coded it yet.”
So I think it's clear that he's talking about relatively large (non-convex) areas connected by portals placed by the designer. That way he can get an estimate of what is visible and replace the precomputed PVS. But he must use some hierarchical space-partitioning algorithm within each cell to reduce overdraw. I haven't read any comment of Carmack's about that, but I think he will not use a BSP tree for it (that's what I was trying to say in my first post). The shortcomings of a BSP are well known, and you can observe them if you run a Quake 3 map with r_speeds enabled.

I think he may continue to keep each object in an AABB and put these bounding boxes inside an octree (or something like that) with relatively big octree cells (to avoid splits). This way he can get a rough estimate of what's visible, and the octree can also render outdoors efficiently (but I don't think DOOM 3 will have outdoors).
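
A sketch of that filing scheme (OctreeNode, Object, Aabb and contains are assumed types/helpers): each object's AABB sinks down the tree only as far as a child that fully contains it, so big cells keep objects from ever being split:

```cpp
// Insert an object into the smallest cell that wholly contains its AABB.
// Nothing is ever split: an object straddling children just stays higher up.
void insert(OctreeNode& node, Object* obj)
{
    for (int i = 0; i < 8; ++i) {
        OctreeNode* child = node.children[i];      // may be null at leaves
        if (child && contains(child->bounds, obj->aabb)) {
            insert(*child, obj);                   // descend while a child fits
            return;
        }
    }
    node.objects.push_back(obj);                   // this cell keeps it
}
```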

Also, he stated at FiringSquad (almost a year back, after the Quake 3 release) that the new engine will have the ability to cut the geometry anywhere, something you can do relatively easily with an octree and an area-connectivity graph (to add new portals when necessary).

And finally, note that in a Slashdot comment he made clear that the .map file will be loaded (and required) only for use with the built-in editor. The rendering engine will use pre-processed "compiled" geometry stored in a text file. The built-in editor is like the old editor, and it's integrated so as to share common code with the rendering engine.

[This message has been edited by pavlos (edited 09-27-2000).]

LordKronos: You are describing a culling technique that does work, but it cannot be used to get the pixel accuracy needed to render the 3D data. If you did, you would get tears in the walls and such.

Firestorm: You know something I don't; please explain to me how you can clip a view frustum with an arbitrary volume, because I would like to know.

Originally posted by skw|d:
LordKronos: You are describing a culling technique that does work, but it cannot be used to get the pixel accuracy needed to render the 3D data. If you did, you would get tears in the walls and such.

Not sure what you mean; it works fine for me.

Firestorm is right. You can use non-convex cells.
Using convex cells produces zero overdraw, but the overhead from the frustum clipping is enormous and does not make sense when using hardware acceleration.
There's a portal column at Flipcode that describes a portal engine. Here is a snippet:
“Another good way to get a portal engine up to speed is using concave sectors: Instead of using small sectors (or larger sectors with very little detail) we could also use larger sectors, if we would somehow find a way to handle the problems that this introduces…”
For the rest, go to http://www.flipcode.com/portal/

As for the clipping question, the answer is that you can't clip against an arbitrary volume. So, in my engine I clip against the 2D bounding box of the portal, always getting a 4-plane frustum. As you know, testing an AABB against a 4-plane frustum is only 4 dot products.
I know it's not perfect, but keep in mind that all the hardware accelerators perform guard-band clipping, so you only need a rough estimate of what's visible.
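
That test is the standard "positive vertex" trick; here's a sketch (Aabb, Plane, Vec3 and dot are assumed types/helpers) showing why it's one dot product per plane:

```cpp
// For each frustum plane, pick the AABB corner furthest along the plane's
// normal. If even that corner is behind the plane, the whole box is out.
// Four planes, four dot products: conservative but very cheap.
bool aabbOutsideFrustum(const Aabb& box, const Plane* planes, int numPlanes)
{
    for (int i = 0; i < numPlanes; ++i) {
        const Vec3& n = planes[i].normal;
        Vec3 p;                                  // "positive vertex"
        p.x = (n.x >= 0.0f) ? box.max.x : box.min.x;
        p.y = (n.y >= 0.0f) ? box.max.y : box.min.y;
        p.z = (n.z >= 0.0f) ? box.max.z : box.min.z;
        if (dot(n, p) - planes[i].dist < 0.0f)
            return true;                         // box fully behind this plane
    }
    return false;                                // inside or intersecting
}
```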

[This message has been edited by pavlos (edited 09-27-2000).]

An Axis-Aligned Bounding Box is a convex volume, so you are still clipping against a convex cell. It is true that, regardless of what GEOMETRY is contained inside the cell, as long as the cell is convex it can be clipped, and the side effect will be overdraw.

But going back to the topic of this thread: With all the polygon processing that will be done to get dynamic shadows and special effects, overdraw is not an option. By removing unneeded polys from the pipeline, you save time for more complex geometry and dynamic shadows.

For my research I am taking a set of data and processing it into cells and portals. To minimize the splits, the algorithm adjusts the current cell against its neighbors, to try and find a volume that is within a certain tolerance of its neighbor. My goal is to be able to load unprocessed data right into the engine for use, and still have all the benefits of zero overdraw.

/skw|d

Originally posted by skw|d:
An Axis-Aligned Bounding Box is a convex volume, so you are still clipping against a convex cell.

Hahaha, let's not argue about semantics, that will get us nowhere ;o)
Yes, you need a convex cell for clipping.
But who ever said that you need to clip?

But going back to the topic of this thread: With all the polygon processing that will be done to get dynamic shadows and special effects, overdraw is not an option. By removing unneeded polys from the pipeline, you save time for more complex geometry and dynamic shadows.

True, but fillrate isn't the problem (although yes, geometry is).
(And by the way, that is one of the reasons why I think Doom3 HAS to have at least some fairly good VSD, which is kinda the opposite of quick 'pre'processing during loading, IMHO.)
But calculating zero overdraw takes way too much processing time, relatively speaking.
Carmack said 30x overdraw, with 8 passes per surface… that is something like 4 surfaces drawn on top of each other, if you count the 8 passes as 8x overdraw…
You know how to create shadows with the stencil buffer, right? It's pretty cool stuff…
The thing which costs the most is calculating the shadow silhouette, but once you have the silhouette, you don't have to care how complex the geometry you're casting your shadows onto is…
(And silhouette calculations can be partly precalculated.)
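
A sketch of that silhouette step, assuming a mesh with a precomputed edge list (the kind of edge-coherency data mentioned earlier in this thread); Mesh, Edge, Triangle and Vec3 are all made-up names:

```cpp
#include <vector>

// A triangle faces the light if the light sits on its front side.
bool facesLight(const Triangle& t, const Vec3& lightPos)
{
    return dot(t.normal, lightPos - t.v0) > 0.0f;
}

// An edge is a silhouette edge for a given light if exactly one of its two
// adjacent triangles faces the light. Extruding those edges away from the
// light gives the shadow volume for the stencil passes.
void findSilhouette(const Mesh& mesh, const Vec3& lightPos,
                    std::vector<Edge>& silhouette)
{
    silhouette.clear();
    for (size_t i = 0; i < mesh.edges.size(); ++i) {
        const Edge& e = mesh.edges[i];
        bool frontA = facesLight(mesh.tris[e.triA], lightPos);
        bool frontB = facesLight(mesh.tris[e.triB], lightPos);
        if (frontA != frontB)              // lit face meets unlit face
            silhouette.push_back(e);
    }
}
```

The edge adjacency list itself never changes for rigid geometry, which is the part that can be precalculated.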

For my research I am taking a set of data and processing it into cells and portals. To minimize the splits, the algorithm adjusts the current cell against its neighbors, to try and find a volume that is within a certain tolerance of its neighbor. My goal is to be able to load unprocessed data right into the engine for use, and still have all the benefits of zero overdraw.

Sounds pretty cool…
Some time ago I was, like you, looking for the perfect algorithms which could produce zero overdraw as quickly as possible…
But a while back I realized that the problem at the current level of technology isn't overdraw (fillrate) but throughput/bandwidth…
You'll get a bigger performance gain by using display lists, using vertex buffers, decreasing the number of vertices (which means fewer splits; splits create more vertices) and sorting your textures…

Trust me, I've timed it all…
Fillrate, at least on GeForce-level hardware and up (and within a year that's going to be the low end), is basically infinite as far as you care… (at least for Q3A-level engines, probably not for Doom3-level engines)
It's the bandwidth which is going to eat your framerates away…
So 'zero overdraw' won't help you, because it will make it hard for you to send your data to the card in a compact form (vertex buffers/display lists etc.), it'll create more vertices through all the splitting that's necessary to get zero overdraw, and you have all the additional CPU cycles it takes…

Don’t believe me?
Create a program which loads a Q3A map, creates all the polygons from the brushes, creates a display list and draws everything
(sort those textures!)

Then, split everything in such a way that you only have the outer shell of the level…
Optimize it any way you want.
Use a display list again…
And then just watch the enormous performance drop.

And with zero overdraw you won't even be able to use a display list…
Vertex buffers (which are slower) might work with it, but I'm not entirely sure…

whoa, long post

[This message has been edited by Firestorm (edited 09-29-2000).]

You forgot to think about some things.

Zero overdraw means fewer polygons are being sent through the pipeline; the cost you avoid is more than just fillrate.

Think about this: you are sending polys to the hardware, asking it to do all the complex processing you talked about. All the polys must be textured and lit by dynamic lighting, dynamic shadow volumes must split some of them and shade them, and any other processing must be done to them… but they are never seen!

The whole point of VSD is to render what the viewer needs to see. The reason is not just fillrate, but avoiding all the other processing that needs to be done to the polys.

The time it takes to render a scene of polys is not linear: adding 10 times the number of polys does not make it 10 times slower; it is far worse. So effort put into removing polys that never need to be processed goes a long way, and IMHO that is where one should focus when designing an engine.

I hope you can see what I am getting at.
/skw|d

Originally posted by skw|d:
Think about this: you are sending polys to the hardware, asking it to do all the complex processing you talked about. All the polys must be textured and lit by dynamic lighting, dynamic shadow volumes must split some of them and shade them, and any other processing must be done to them… but they are never seen!

Well, "textured, lit by dynamic lighting" is fillrate, and "dynamic shadow volumes must split some"??
As far as I know, hardware doesn't have shadow volumes!?
The technique I was referring to with stencil buffers uses the z-buffer and does absolutely no clipping…
It's a pure fillrate thang…

And if you're referring to texture matrix calculations on the card etc.,
yes, those have an impact on performance, but geometry has more effect than anything else…

The whole point of VSD is to render what the viewer needs to see. The reason is not just fillrate, but avoiding all the other processing that needs to be done to the polys.

And you're absolutely right!
But the point is that balance is more important than zero overdraw…
There are more factors than just 'drawing the polygons you see'.
Sure, the less you draw, the faster everything is going to be; that's obvious.
But it's faster to draw fewer, less complex polygons (with fewer vertices), even if you have more overdraw…

The time it takes to render a scene of polys is not linear: adding 10 times the number of polys does not make it 10 times slower; it is far worse. So effort put into removing polys that never need to be processed goes a long way, and IMHO that is where one should focus when designing an engine.

It's all about balance: yes, you need to remove as many polygons you don't see as you reasonably can (without doing so many calculations that you actually start slowing everything down).
But clipping polygons is a bad thing most of the time…
And the most important factor to take into consideration is that things like vertex buffers usually have a much bigger impact on performance (in this case a positive one) than removing several polygons…

Look…
I was at SIGGRAPH 2000 this year, and in one of the courses they gave a very good piece of advice:
theory is good, but it's worth crap if you don't actually verify that what you think is going on IS actually going on.
Check everything, try everything, never assume.
Verify, verify, verify…

I was working on no-overdraw algorithms before I started testing everything… and because of those tests, I changed my mind…

Please take my advice and do some tests. You'll discover (like I did) that splitting is very bad for performance, much more so than overdraw.

Of course, if you split a polygon and end up with exactly the same amount of vertices, sure, it'll be just as fast…
(But of course you'll still have the extra CPU overhead.)

It's probably more effective to split a level up into some sort of cells (not too small, not too big, non-convex),
put them in display lists, and determine which portals are visible and which display lists should be called…
The added advantage of display lists is that the geometric data should already be on the graphics card…
And when you do things like multipass rendering, you just call the list for every pass and don't need to send it to the card again, because it's already resident.
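
Something like this sketch: the GL display-list calls are real, but the Cell type and the drawCellGeometry/setupPassState helpers are invented for illustration:

```cpp
#include <vector>
#include <GL/gl.h>

// Compile a cell's static geometry into a display list once, at load time.
GLuint compileCell(const Cell& cell)
{
    GLuint list = glGenLists(1);
    glNewList(list, GL_COMPILE);
    drawCellGeometry(cell);          // issue the cell's texture-sorted tris
    glEndList();
    return list;                     // geometry now lives driver-side
}

// Per frame: replay only the visible cells' lists, once per pass, without
// ever re-sending the vertex data over the bus.
void renderVisibleCells(const std::vector<GLuint>& lists,
                        const std::vector<bool>& visible, int numPasses)
{
    for (int pass = 0; pass < numPasses; ++pass) {
        setupPassState(pass);        // blend mode, textures, etc. per pass
        for (size_t i = 0; i < lists.size(); ++i)
            if (visible[i])
                glCallList(lists[i]);
    }
}
```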

But again, I'm talking about GeForce-level hardware…

I hope you realize how much faster vertex array ranges (NVIDIA-specific) are than display lists, how much faster display lists are than vertex buffers, and how much faster vertex buffers are than sending individual triangles, etc…
And the more static the data is, the faster it can basically be sent to the card…
Of course, when I say static, I mean from a geometric point of view… you can still rotate/translate the data with the help of matrices (and some extensions), and you can even do the same for texture coordinates using texture matrices…

[This message has been edited by Firestorm (edited 09-29-2000).]