Removing unneeded polys

I understand backface culling just fine; my question is about depth testing. Does OpenGL internally not draw things that are behind others? Is that the purpose of depth testing or not? I am looking for an easy way to tell OpenGL not to draw polys that are behind other polys and thus won't be seen anyway.

I kind of did things backwards. I have already designed a way to cull entire objects against my frustum with some plane tests, and a way to set a LOD from those tests. My problem is with my map. I want to render my map fast and efficiently, so I first thought about using one interleaved array for the entire map, but if OpenGL doesn't remove the faces behind one another, isn't this a HUGE performance hit? I have read about BSP trees and such, but those are a little out of my scope, and I have always thought that the simple answer tends to be the best; BSPs seem far too difficult, there must be an easier way.

If OpenGL doesn't remove the hidden faces, I have come up with another possibility; please let me know if you think this idea will work (I'm not sure if this is the theory behind a BSP or not). My idea is this: create a map for starters. Then create an index of each plane normal within the map, and an index of each poly that lies on that particular plane. Then, each time the camera is updated, check which planes lie within my viewing frustum; then, for those planes, check which of their polys lie within the frustum, and simply draw only those.

The only problem with this idea is that it seems quite intensive. Imagine a map of an apartment building: there could be 1000 planes, with an average of 50 polys per plane. Say we check 1000 planes to see if they're inside the frustum, and 30 planes turn out to be inside; we then check 1500 polys. Or maybe I should simply cut out the poly check, just render all the polys in those planes, save some CPU cycles, and let the video card handle a "little" extra data.

Anyone have any ideas? Or, getting back to my original question: will OpenGL simply not draw things that are behind others? Please help... Thanks.
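
In pseudocode, the plane-index idea would look something like this (every name here is made up, just to illustrate):

/* Sketch of the plane-index idea -- all names are made up. */
for (i = 0; i < map->plane_count; i++) {
    if (!plane_in_frustum(&map->planes[i], &frustum))
        continue;                             /* whole plane outside the view */
    for (j = 0; j < map->planes[i].poly_count; j++) {
        if (poly_in_frustum(&map->planes[i].polys[j], &frustum))
            draw_poly(&map->planes[i].polys[j]);   /* optional per-poly check */
    }
}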

That is what the Z buffer is for. It is a non-trivial problem to clip background polygons against foreground polygons in order not to render them at all. There are quick visibility tests you can perform on groups of polygons (but not in a "polygon soup" application). The Z buffer will sort out all depth problems for you.

So if I stay with my frustum culling for, let's say, objects and characters, and let OpenGL's depth test handle my map data, things will run smoothly? Figuring on a map with a Quake 3 level of detail: not a horrendous number of polys, but a fair amount. The reason I would like to stick with the frustum culling for the others is that I think there is no sense having OpenGL test all those polys just to see if they're visible, if that's how it works. So thank you very much. If I have misunderstood this at all, please let me know. Thanks.

Sorry, I got a bit lost in your original post, but I think I get the general gist of it.

The way depth testing works is basically like so…

When you draw a polygon, each pixel is tested against the depth buffer, and depending on what your depth function is, OpenGL decides whether or not to draw it.

So say you have drawn polygon 1, and your depth function is GL_LESS (the default). Assuming nothing else has been drawn, polygon 1 is drawn, and its depth is written into the depth buffer.

Now you try to draw polygon 2, which is completely behind polygon 1. Since polygon 2 is behind polygon 1, the depth will be greater than what is currently in the depth buffer, and the depth test will fail, so it won’t be drawn.

Now if you were to draw polygon 2 first, it would get drawn, and then when you draw polygon 1, it would simply draw over polygon 2 because the depth test would succeed in both cases.
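
In code, the setup is just the standard calls (this much is plain OpenGL; also remember to ask for a depth buffer when you create the context, e.g. GLUT_DEPTH if you use GLUT):

glEnable(GL_DEPTH_TEST);   /* turn on per-fragment depth testing */
glDepthFunc(GL_LESS);      /* pass where the incoming depth is smaller (the default) */

/* once per frame, clear color and depth together */
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);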

Well, I guess that would be faster than attempting to write my own algorithm to do that. The only way I could think of to make this method faster would be to first find the closest, say, 20 planes and render those first, then attempt the rest, most of which would fail, unless the 20 planes we started with were all close to the ground, leaving far open spaces to be seen. (Explained badly, but I hope you get the gist.) But doing so would require me to break the map up into a lot of chunks, like a couple of polys per plane. So I think it would be much faster just to leave the map in an interleaved array and let the depth buffer do its thing. It also might help with multitexturing (if I use CVAs): that way OpenGL already knows what is where, so it doesn't have to do the calculations twice.
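
If I did try the rough front-to-back idea, I imagine it would be little more than a qsort on distance, something like this (Chunk, dist_squared(), draw_chunk() and friends are all made-up names, and this is untested):

/* needs <stdlib.h> for qsort */
static int cmp_chunk(const void *a, const void *b)
{
    const Chunk *ca = a;
    const Chunk *cb = b;
    return (ca->dist_sq > cb->dist_sq) - (ca->dist_sq < cb->dist_sq);
}

/* each frame: compute distances, sort nearest first, draw in order */
for (i = 0; i < chunk_count; i++)
    chunks[i].dist_sq = dist_squared(chunks[i].center, cam);
qsort(chunks, chunk_count, sizeof(Chunk), cmp_chunk);
for (i = 0; i < chunk_count; i++)
    draw_chunk(&chunks[i]);   /* far pixels then mostly fail the z-test */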

Originally posted by dabeav:
So if I stay with my frustum culling for, let's say, objects and characters, and let OpenGL's depth test handle my map data, things will run smoothly?

Umm, no

Depth testing has nothing to do with speed (well, almost: it saves some memory bandwidth), just with image correctness. You still send geometry to the card; this geometry is still lit, rasterized, textured, etc., and then some pixels fail the depth test and don't get written to the frame buffer.

Speaking of Q3 maps, a GF4 easily handles ALL the leaf faces of q3dm11.bsp when put in one vertex array and drawn with a single glDrawElements (GL_POLYGON), 70+ fps. And that's without CVA or VAR. On the other hand, with BSP and glBegin/glEnd of the same data the frame rate drops to 20. And on the third hand, BSP + PVS culling + state sorting + vertex arrays easily makes 200+ fps.
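
For what it's worth, the one-big-array version is just the standard GL 1.1 vertex array calls; the Vertex layout below is my own, and the faces are assumed triangulated:

struct Vertex { float pos[3]; float st[2]; };

glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glVertexPointer(3, GL_FLOAT, sizeof(struct Vertex), verts[0].pos);
glTexCoordPointer(2, GL_FLOAT, sizeof(struct Vertex), verts[0].st);
glDrawElements(GL_TRIANGLES, index_count, GL_UNSIGNED_INT, indices);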

All this is to say that culling performed by the CPU is usually beneficial.

Regards,
-velco

Originally posted by velco:
Umm, no

Depth testing has nothing to do with speed (well, almost: it saves some memory bandwidth), just with image correctness. You still send geometry to the card; this geometry is still lit, rasterized, textured, etc., and

I don’t think so:

First: if you cull all the polygons outside the frustum, you save all the depth testing that you would otherwise have to do on these "extra" polygons.

Second: the depth test comes before texturing and lighting. There is no need to waste time doing calculations on fragments that will never be seen.

Third: by using frustum culling, you don't have to send occluded polygons to the card, so you save bandwidth too. Rendering 3,000 polys is not the same as rendering 30,000. Although a GF4 can render a Quake 3 level at 70 fps, it is a waste of power. In fact, Quake 3 doesn't need a GF4 because it selects the polys it has to render before sending them to the graphics pipeline. That is what allowed me to play Quake 3 on a Banshee.

As an added point, if you are truly concerned about sending data down and not using it: the new NVidia extension allows, as I understand it, testing whether any Z-writes were made during a section (so you can find out whether an entire object was occluded).

Occlusion testing helps too, but the Z-buffering definitely works better than coding it yourself. If you do your own occlusion test, make sure it is on higher-order objects, so that the time you spend on the test is made up for in tris not sent. However, we still send it all down now, and get good frame rates.

Originally posted by Azdo:
I don't think so:

First: if you cull all the polygons outside the frustum, you save all the depth testing that you would otherwise have to do on these "extra" polygons.

Yes. And you save much more than the depth testing: transformations, lighting calculations, rasterization, texturing.


Second: the depth test comes before texturing and lighting. There is no need to waste time doing calculations on fragments that will never be seen.

This is not true.

Please read the OpenGL 1.3 spec, page 10, section "2.4 Basic GL operation".


Third: by using frustum culling, you don't have to send occluded polygons to the card, so you save bandwidth too. Rendering 3,000 polys is not the same as rendering 30,000. Although a GF4 can render a Quake 3 level at 70 fps, it is a waste of power. In fact, Quake 3 doesn't need a GF4 because it selects the polys it has to render before sending them to the graphics pipeline. That is what allowed me to play Quake 3 on a Banshee.

My point exactly. Even on a relatively high-end card (4400), some operations performed by the CPU can improve the frame rate - it is not the case that "the GPU can do anything better/faster".

Regards,
-velco

Just to clarify: depth testing is a per-pixel operation, not a per-polygon one. It checks the depth values of individual pixels and, depending on the z value, draws one pixel instead of another.
BSP trees, as far as I know, allow you to disable depth testing, because they let you draw the polygons in the right order straight away: the furthest polygons first and the closest last. This offers a speed increase because depth sorting OpenGL-style is pretty slow...

There should be a way to tell OpenGL to just NOT render things that are behind others. I have come up with a system of zones: I have broken my map up into pieces by room. Then I surround each room with a bounding sphere, test whether the sphere is in the frustum, and just don't render that CHUNK if it isn't. Each room is a separate interleaved array, so this makes the calls very fast. The only other way I can think of to do it would be, after I find the visible zones, to test each plane of each zone to see if it's visible, but that sounds like a lot of unneeded calculations per frame. There REALLY should be something like a non-visible-face cull, similar to backface culling!!
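
For reference, the per-zone sphere test is cheap. Assuming the six frustum planes are stored as normalized plane equations (my own layout), it is just:

/* Returns 0 if the sphere is completely outside some frustum plane.
   planes[i] holds a normalized plane equation (a, b, c, d). */
int sphere_in_frustum(const float planes[6][4],
                      float x, float y, float z, float radius)
{
    int i;
    for (i = 0; i < 6; i++) {
        float d = planes[i][0]*x + planes[i][1]*y
                + planes[i][2]*z + planes[i][3];
        if (d < -radius)
            return 0;   /* fully behind this plane -> outside */
    }
    return 1;           /* inside or intersecting */
}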

Ahh, you make it sound so easy.

The problem is that you need to transform the primitives to even ask the question. Once projected, coarse Z can reject large amounts of hidden fragments. In addition, the occlusion extensions let you test the bounds of an object to see if it is visible before you draw it, but remember that this MIGHT be extra overhead with no gain for some applications.

OpenGL lacks the higher level concepts available to you in your application. From OpenGL’s perspective all you do is draw a bunch of primitives in some random order.

Look at the HP and NV occlusion extensions; that is probably as much as OpenGL can support without some kind of higher-level scene description, and it may be what you want.
http://www.nvidia.com/dev_content/gdc2002/GDC2002_occlusion_files/frame.htm
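
A rough sketch of how the HP test is used (check for the extension in the GL_EXTENSIONS string first; draw_bounding_box() and draw_object() stand in for your own code):

GLboolean visible;

glDepthMask(GL_FALSE);                      /* don't disturb the z-buffer */
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
glEnable(GL_OCCLUSION_TEST_HP);

draw_bounding_box(obj);                     /* cheap proxy geometry */

glDisable(GL_OCCLUSION_TEST_HP);
glGetBooleanv(GL_OCCLUSION_TEST_RESULT_HP, &visible);
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glDepthMask(GL_TRUE);

if (visible)
    draw_object(obj);                       /* only now send the real mesh */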

The WHOLE reason I use OpenGL is to stay platform independent, so it would be kind of silly to use card-specific code to cut processing time. Someone (not me, I'm not an API programmer) should develop an OpenGGL, or Open Game Graphics Language, as an add-on to OpenGL, much like GLUT did for windowing and input. I figure there should be two major developments in this area: first, it would be nice to have a method of only drawing what is visible in the frustum (meaning that data is dropped before it even hits the GPU for testing); second, it should include a way to cull non-visible objects within the frustum much faster than is currently done with depth testing.

Thank you all for one thing (well, not just one): I finally know what a BSP tree actually does; it simply draws things far to near, to eliminate the need for the depth test, thus speeding things up a lot.

I find this kind of a pain, though; it's hard to use CVAs, or even normal interleaved arrays, with this method.

I really think that OpenGL should rely a little more on the CPU and use some of its power to remove unneeded steps from the GPU. I understand that OpenGL is low-level, designed just to put the polys on the screen, but things would be so much better if it did this by more efficient means.

The key is: the Z buffer works on a per-pixel basis. So:
1/ It's slower than discarding unseen polygons.
2/ It's more precise than a painter's algorithm.

Discarding unseen geometry can be done by the Z buffer, but you can't rely on it alone if you have a lot of unseen geometry. A Z-buffer-only strategy is OK for, let's say, a single complicated object. For a huge scene, if you are looking for performance, you MUST reduce OpenGL's work to what might actually be seen.
Frustum culling is the first step, and can be enough in some situations. For a map like the Quake maps, where in front of you (in the frustum) there are lots of rooms and big walls, that is, where there is a huge amount of geometry in the frustum that will be hidden, you must look for an algorithm that can tell whether or not some geometry will be hidden by some other. BSP trees do just that, for static geometry. You can separate static and dynamic geometry, or you can try another algorithm, depending on your map/objects/dynamics of objects/dynamics of camera/etc...

if (have_CVA)
{
    glLockArraysEXT(0, vertex_count);   /* EXT_compiled_vertex_array */
    draw_it();
    glUnlockArraysEXT();
}
else
{
    /* ****, no CVA ... oh, WTH, just go ahead. */
    draw_it();
}


Someone (not me, I'm not an API programmer) should develop an OpenGGL, or Open Game Graphics Language, as an add-on to OpenGL, much like GLUT did for windowing and input. I figure there should be two major developments in this area: first, it would be nice to have a method of only drawing what is visible in the frustum (meaning that data is dropped before it even hits the GPU for testing); second, it should include a way to cull non-visible objects within the frustum much faster than is currently done with depth testing.

There are LOTS of graphics engines out there and ALL of them do frustum culling (the ones that don't, don't deserve the name "graphics engine"; not that I know of any).


Thank you all for one thing (well, not just one): I finally know what a BSP tree actually does; it simply draws things far to near, to eliminate the need for the depth test, thus speeding things up a lot.

You can use a BSP for frustum culling too: if the view frustum is entirely on one side of the separating plane, you can skip one of the subtrees.
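
During traversal that looks something like this (the node layout and classify_frustum() are made up; FRONT/BACK/STRADDLE are the possible classifications):

/* Sketch: cull whole BSP subtrees against the frustum. */
void draw_bsp(const Node *n, const Frustum *f)
{
    int side;
    if (n == NULL)
        return;
    if (n->is_leaf) {
        draw_leaf(n);                       /* emit this leaf's faces */
        return;
    }
    side = classify_frustum(f, &n->plane);  /* FRONT, BACK or STRADDLE */
    if (side != BACK)
        draw_bsp(n->front, f);
    if (side != FRONT)
        draw_bsp(n->back, f);
}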


I find this kind of a pain, though; it's hard to use CVAs, or even normal interleaved arrays, with this method.

It is not that hard.

  1. Traverse the tree and collect the geometry that's going to be drawn into some kind of list/array/whatever.

  2. Sort (or rather group) the list according to OpenGL state.

  3. Traverse the list. If the state doesn't change, copy the vertices into an array. If the state changes, flush (i.e. send to the card) the vertices so far with CVA or VA, as in the sketch below.
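
In code, step 3 might look roughly like this (all names are made up; treat it as a sketch):

/* 'items' is already grouped by state from step 2. */
const State *cur = NULL;
for (i = 0; i < item_count; i++) {
    if (items[i].state != cur) {
        flush_vertices();            /* glDrawElements what we have so far */
        apply_state(items[i].state); /* glBindTexture etc. */
        cur = items[i].state;
    }
    append_vertices(&items[i]);      /* copy into the big vertex array */
}
flush_vertices();                    /* don't forget the last batch */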


I really think that OpenGL should rely a little more on the CPU and use some of its power to remove unneeded steps from the GPU. I understand that OpenGL is low-level, designed just to put the polys on the screen, but things would be so much better if it did this by more efficient means.

This goes into the application's area of expertise. What is good for one app is not necessarily good for another, and one man's opinion of what is good is not necessarily shared by others (which you will probably see soon, when others comment on my three points above). It is good that OpenGL is low-level, because that way it does not enforce suboptimal decisions on applications.

All this IMHO,
-velco

edit: Hehe, I didn't know the forums are censored. Prolly I shoulda written "sh*t".

Originally posted by velco:
My point exactly. Even on a relatively high-end card (4400), some operations performed by the CPU can improve the frame rate - it is not the case that "the GPU can do anything better/faster".

Regards,
-velco


Sorry, I didn't understand you. We were saying the same thing...