Occlusion culling with ARB_occlusion_query

tdname · October 29, 2012, 10:01am

I’ve tried many approaches, but I can’t get occlusion culling working properly with the ARB_occlusion_query extension.

glClear(…)
BEGIN WHILE for each object…
— frustum culling check
— if not in frustum, continue to next object
— I turn off color mask
— glDisable(GL_CULL_FACE);
— I render bounding box
— glEnable(GL_CULL_FACE);
— check culling result with glGetQueryObjectuiv(…, &occlusion_result)
— I turn on color mask
— I render only visible object (visible = “occlusion_result” variable greater than 0)
END WHILE

Done this way, I get the bounding boxes drawn on screen!
How can I “render” without drawing to the screen?
I’ve already tried rendering to an FBO, but occlusion_query doesn’t seem to work there…so I chose to stay on the default “DRAW_FRAMEBUFFER”, which gives me fragment counts greater than zero.

Screenshot (UP: without occlusion_query):

danbartlett · October 29, 2012, 11:07am

Did you disable writing to the depth buffer too? Otherwise you will get objects that are drawn after the bounding box that fail the depth test when they aren’t meant to, including the object that is meant to be contained inside the bounding box.

tdname · October 29, 2012, 11:56am

Nope, depth_test is always enabled…in fact the “grey bounding box” (unshaded because colorMask is FALSE) visible in my screenshot sits exactly where it should, and when I move the camera it moves accordingly.

My problem seems to be just the bounding boxes being drawn during normal rendering.
None of the online examples use two passes…but in my engine a single pass produces that effect, and it’s wrong.

Maybe I could solve the problem with a second pass in this way:
void Render(){

1st glClear(color | depth)
BEGIN WHILE for each object
execute occlusion query
save occlusion query result in an object’s internal variable
END WHILE
2nd glClear(color | depth)
BEGIN WHILE for each object
render only object with previous internal variable to TRUE
END WHILE
}

But I’d like to know why I always have problems and can’t use any SIMPLE source code I find online…I’m a bit angry about that hahahaha
I always have to reinterpret it or completely rewrite it from scratch with a lot of changes.

danbartlett · October 29, 2012, 12:41pm

You have to disable depth writes for your bounding boxes, since you don’t want their depth value to be stored in the depth buffer - you only want the box to be tested to see if any values would be visible. Disable depth writing for your bounding box in a similar way to color writing:

glDepthMask(GL_FALSE);

AFAICT, it isn’t the bounding boxes that are being drawn to the color buffer, but the terrain that isn’t being drawn in that position due to your bounding box having already written a nearer depth value to the depth buffer, so the sky is visible through it.
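
To illustrate the effect described above, here is a toy CPU-side model of a depth buffer. It is not OpenGL code: `DepthBuffer`, `draw` and `visibleObjectSamples` are made-up names for this sketch, and the `depthWrite` flag only stands in for what glDepthMask controls. It assumes a GL_LESS-style test with the buffer cleared to 1.0.

```cpp
#include <cassert>
#include <vector>

// Toy 1-D depth buffer: smaller depth = nearer, cleared to the far value 1.0.
struct DepthBuffer {
    std::vector<float> depth;

    explicit DepthBuffer(int width) : depth(width, 1.0f) {
        assert(width > 0);
    }

    // Draws the span [x0, x1) at constant depth z and returns how many
    // samples passed the depth test. depthWrite stands in for glDepthMask:
    // when false, the test still runs but the buffer is left untouched.
    int draw(int x0, int x1, float z, bool depthWrite) {
        int passed = 0;
        for (int x = x0; x < x1; ++x) {
            if (z < depth[x]) {
                ++passed;
                if (depthWrite) depth[x] = z;
            }
        }
        return passed;
    }
};

// Draws a bounding-box front face, then the object inside the box.
// Returns how many samples of the object survive the depth test.
int visibleObjectSamples(bool boxWritesDepth) {
    DepthBuffer db(8);
    db.draw(2, 6, 0.4f, boxWritesDepth);  // box face, nearer than the object
    return db.draw(3, 5, 0.5f, true);     // the real object inside the box
}
```

In this model `visibleObjectSamples(true)` gives 0, because the box has already stored a nearer depth and the object behind its front face is rejected, while `visibleObjectSamples(false)` gives 2, because the box was only tested and never stored.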

tdname · October 29, 2012, 1:18pm

Yes, it works now. I don’t understand why, because I’m sure I had already tested it that way before.
Anyway, thanks!

Question: while waiting for asynchronous query results, which “actions” are allowed in the meantime? Are gl* functions (glMapBuffer, glReadPixels, and so on) allowed, or only CPU-side work (frustum culling, front-to-back object sorting, etc.)?

thokra · October 29, 2012, 3:30pm

You usually use the stall to perform CPU tasks. You might want to check out CHC++ (the revised Coherent Hierarchical Culling algorithm) and follow-up papers.
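
A rough sketch of that idea, with the GPU side faked so it can run anywhere. `FakeQuery` and its `pollsUntilReady` field are invented for this example; in real code the availability check would be glGetQueryObjectiv with GL_QUERY_RESULT_AVAILABLE, and the jobs would be things like frustum culling or sorting for the next frame.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Stand-in for a GPU query object. pollsUntilReady is an invented knob
// that makes the result "arrive" after a few availability checks.
struct FakeQuery {
    int pollsUntilReady;
    unsigned samples;

    bool resultAvailable() { return pollsUntilReady-- <= 0; }
};

// Runs queued CPU jobs while the query result is still pending.
// Returns how many jobs were completed before the result arrived.
int overlapCpuWork(FakeQuery& query, std::vector<std::function<void()>>& jobs) {
    std::size_t next = 0;
    while (!query.resultAvailable()) {
        if (next < jobs.size()) {
            jobs[next++]();  // useful CPU work instead of idling
        }
        // With no jobs left, a real renderer would simply wait for the result.
    }
    assert(next <= jobs.size());
    return static_cast<int>(next);
}
```

Once the loop exits, the sample count can be read without stalling; in this fake it is just `query.samples`.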

Dark_Photon · October 29, 2012, 4:18pm

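Also look at conditional rendering (glBeginConditionalRender / glEndConditionalRender, core since OpenGL 3.0), which lets the GPU skip draw calls based on a query result without reading it back on the CPU.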

tdname · October 30, 2012, 1:52am

CHC seems great, but at the moment I have many other things to do, especially because I don’t use C++/VC++ but Delphi, so to implement CHC I would have to convert all its source.

However I’ve got other questions:

I think OpenGL (or the shader) needs to know which object is behind another in order to return zero as the occlusion_query result…but if I disable DepthMask, how can it know that?
Sorting objects front to back: with DepthMask disabled, each rendered “layer”/object would cover the previous one, so FRONT objects (which are drawn first) would be covered by BACK ones (drawn afterwards). The back objects would end up rendered on top, leaving the front objects hidden behind them. I don’t understand this step: disable DepthMask and sort objects…

Now I get a “perfect” render (unlike my first screenshot), but all occlusion_query results are greater than zero even when a very big cube in front of the camera covers the whole scene.
I’ve tried temporarily enabling ColorMask and/or DepthMask to actually render the bounding boxes used in the occlusion queries, and all the boxes look fine…but the big cube in front of the camera doesn’t seem to occlude anything when I read the query results.

thokra · October 30, 2012, 2:53am

[…]so to implement CHC I would have to convert all its source.

No, you don’t. Just take the paper and implement it yourself - you’ll learn more and can tailor your code to your needs while writing and not while porting.

I think OpenGL (or the shader) needs to know which object is behind another in order to return zero as the occlusion_query result…but if I disable DepthMask, how can it know that?

No. The way occlusion queries work is that the GPU keeps track of how many samples passed the depth test (fragments that are rasterized but fail the test are not counted) and writes that number (or a boolean with GL_ANY_SAMPLES_PASSED) into the query object. You can then retrieve the count from the query object with glGetQueryObjectuiv(). Disabling the depth mask doesn’t mean a depth value used for depth testing isn’t generated. It only means the depth buffer will not be updated with those depth values. This is especially useful if you render a depth pass first and then do the occlusion queries with depth writes masked off.

Sorting objects front to back: with DepthMask disabled, each rendered “layer”/object would cover the previous one, so FRONT objects (which are drawn first) would be covered by BACK ones (drawn afterwards). The back objects would end up rendered on top, leaving the front objects hidden behind them. I don’t understand this step: disable DepthMask and sort objects…

Here’s one way with a depth-first pass (or depth-only pass, or depth pre-pass, or z-pre-pass or …):

1. Frustum cull your scene.
2. Sort visible objects coarsely front to back.
3. Render a z-pre-pass (no fragment shading, color writes masked off), maybe just large occluders here, like terrain, large buildings and stuff.
4. Disable depth writes with glDepthMask(GL_FALSE) (you’ll test against the depth buffer from the previous step).
5. Render the scene again, this time only the bounding volumes, each inside an occlusion query.
6. Read back the number of samples passed and decide whether to render or to cull (partially) occluded objects.
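
The order of those steps can be sketched with a small CPU-side model. This is only an illustration under simplifying assumptions, not OpenGL code: `Object`, `DepthBuffer` and `cullScene` are hypothetical names, each object is reduced to a screen span at constant depth, frustum culling and sorting are assumed to have happened already, and occluders drawn in the pre-pass are always kept.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical scene type for this sketch; not taken from any real engine.
struct Object {
    std::string name;
    int x0, x1;       // covered screen span [x0, x1)
    float z;          // constant depth, smaller = nearer
    bool isOccluder;  // large occluder drawn in the depth pre-pass
};

struct DepthBuffer {
    std::vector<float> depth;

    explicit DepthBuffer(int width) : depth(width, 1.0f) {
        assert(width > 0);
    }

    // Returns the number of samples passing a GL_LESS-style test.
    // depthWrite = false models glDepthMask(GL_FALSE).
    int draw(const Object& o, bool depthWrite) {
        int passed = 0;
        for (int x = o.x0; x < o.x1; ++x) {
            if (x < 0 || x >= static_cast<int>(depth.size())) continue;
            if (o.z < depth[x]) {
                ++passed;
                if (depthWrite) depth[x] = o.z;
            }
        }
        return passed;
    }
};

// Input: frustum-culled objects, coarsely sorted front to back.
// Output: names of the objects that should be rendered in the final pass.
std::vector<std::string> cullScene(const std::vector<Object>& frontToBack,
                                   int width) {
    DepthBuffer db(width);
    std::vector<std::string> toRender;

    // Depth pre-pass: large occluders only, depth writes on.
    for (const Object& o : frontToBack) {
        if (o.isOccluder) {
            db.draw(o, true);
            toRender.push_back(o.name);
        }
    }

    // Query pass: bounding volumes of everything else, depth writes off.
    for (const Object& o : frontToBack) {
        if (o.isOccluder) continue;
        int samplesPassed = db.draw(o, false);
        if (samplesPassed > 0) toRender.push_back(o.name);
    }
    return toRender;
}
```

With a wall covering columns 0 to 7 at depth 0.3, a sphere fully behind it is culled, while a tree that sticks out past the wall's edge still passes some samples and is kept.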

This is not the only way. There are multiple philosophies as to how to occlusion cull a scene in the best possible way. Something much simpler but still maybe worth it is a method by Croteam’s Dean Sekulic in GPU Gems.

DICE took a completely different route in their Frostbite engine. They implemented a software rasterizer which renders a coarse depth map of the scene using simplified proxy meshes for significant occluders, and they do the culling on the CPU. The same goes for the guys at Guerrilla.

tdname · October 30, 2012, 3:23am

Of course. But without your explanation (“disabling the depth mask doesn’t mean a depth value used for depth testing isn’t generated”) the queries’ behaviour was a mystery.
Thanks for the clarification.

So your recommendation is to do a z-pre-pass (1), an occlusion_query pass (2) to get the results, and finally the rendering pass (3)? That’s a lot of passes…
I know there are many approaches to and implementations of occlusion culling, but I don’t think the number of passes can go any lower…so do I have to give in and do all of those passes?
I’m angry about the limits of technology…

tdname · October 30, 2012, 6:59am

To avoid the sorting algorithm for now, I’ve loaded only a big cube and a small sphere behind it. Even when I swap the loading/rendering order (first to last, or last to first), both objects are always visible and pass the occlusion_query with fragment_count > 0 (370,000 for the cube and 42 for the sphere).

If the sphere goes off-screen (while still passing the frustum check, because the bounding-sphere radius is larger than the real object), or the object becomes too small to cover even one fragment, the query result drops to zero.
So the occlusion_query seems to work…a bit

Update 1: I’ve linked a different, very simple vertex + fragment shader pair that just executes “gl_Position = PVM * vertex” in the vertex shader and “out fragColor= vec4(1.0, 1.0, 1.0, 1.0)” in the fragment shader. No change.

Update 2: I’ve tested rendering the full objects during the occlusion_query (not just bounding boxes), without any significant change.