GL_OCCLUSION_TEST_HP = frustum culling?

It seems to me they do the same thing, right? Which of the two is supposed to be faster?

Frustum culling will remove everything that’s outside of the view frustum from being drawn.

Occlusion Query lets you know if there were any fragments of an object actually rendered to the frame buffer (and depending on the exact extension, the number of fragments drawn is returned).

“Occlusion Query lets you know if there were any fragments of an object actually rendered to the frame buffer (and depending on the exact extension, the number of fragments drawn is returned).”

OK, and then I render what’s visible and discard what is not. So it seems to me the same thing as the frustum culling method.

Fragments that fail the Z test won’t get drawn. Make sense yet?

If you insist, yes: they both have the same result, with the occlusion query doing a lot more and having a much higher price, high enough to often be useless even for occlusion culling. If you’re thinking about using it to get around frustum culling, don’t complain about horrible performance. It’s like taking photos of millions of pebbles just to figure out which of them have a certain color, when you could just as well look at them and see for yourself.

In my case, I have a forest of about 10000 trees. I have already applied DLOD to them, so the closest have about 10000 triangles and the most distant just 100. Now I’d like to exclude from rendering the ones that are outside the field of view. I currently use the occlusion query, testing each tree’s visibility against a bounding cube. It works, but it doesn’t give the performance I would like. So that was my question: would frustum culling give much better performance instead?

Frustum culling will be faster if that’s all you need. The advantage of occlusion culling is that it will also allow you to not draw objects that are hidden by other objects.
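For reference, the per-object frustum test is only a handful of multiply-adds on the CPU. Here is a minimal sketch in C, assuming each tree has a bounding sphere and that the six frustum planes are already available with unit-length normals pointing into the frustum (how you extract them from your matrices is not shown). The names Plane and sphere_in_frustum are made up for this example.

```c
#include <stdbool.h>

/* Plane a*x + b*y + c*z + d = 0. Assumption: (a, b, c) is unit length
   and points towards the inside of the frustum. */
typedef struct { float a, b, c, d; } Plane;

/* Returns false only when the sphere lies completely behind at least one
   plane. The test is conservative: a sphere near a frustum corner may be
   reported as visible even though it is just outside. */
bool sphere_in_frustum(const Plane planes[6],
                       float x, float y, float z, float radius)
{
    int i;
    for (i = 0; i < 6; ++i) {
        /* Signed distance from the sphere centre to the plane. */
        float dist = planes[i].a * x + planes[i].b * y
                   + planes[i].c * z + planes[i].d;
        if (dist < -radius)
            return false;   /* entirely outside this plane: cull */
    }
    return true;            /* inside or intersecting: draw */
}
```

With 10000 trees you would probably not run this on every tree individually but on groups of trees first (a grid or quadtree), and only test single trees inside groups that intersect the frustum.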

I wouldn’t use HP_OCCLUSION_TEST; it has been superseded by ARB_OCCLUSION_QUERY: http://oss.sgi.com/projects/ogl-sample/registry/ARB/occlusion_query.txt
The advantages over the HP version are listed in the spec.

[This message has been edited by Adrian (edited 02-22-2004).]

Originally posted by penetrator:
OK, and then I render what’s visible and discard what is not. So it seems to me the same thing as the frustum culling method.

Occlusion queries aren’t just for culling. Whilst they can be used to cull hidden objects, they can also be used for other effects, such as sizing a flare depending on how much of a light object is visible.
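As a rough illustration of the flare idea: if the query returns a sample count, the flare size can simply follow the visible fraction. This is a sketch under assumptions, not a recipe. total_samples is the number of samples the light’s proxy geometry covers when nothing hides it, which you would have to measure or estimate yourself, and flare_scale is a hypothetical helper name.

```c
/* Fraction of the light source that is visible, in the range [0, 1].
   visible_samples: sample count returned by the occlusion query.
   total_samples:   assumed sample count of the unoccluded light. */
float flare_scale(unsigned visible_samples, unsigned total_samples)
{
    if (total_samples == 0)
        return 0.0f;                       /* nothing to compare against */
    if (visible_samples > total_samples)
        visible_samples = total_samples;   /* clamp a stale or rough estimate */
    return (float)visible_samples / (float)total_samples;
}
```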

Originally posted by Jared:
If you insist, yes: they both have the same result, with the occlusion query doing a lot more and having a much higher price, high enough to often be useless even for occlusion culling.

I disagree. It is true, however, that you need to be pretty clever about how you use occlusion queries. You have to keep in mind that

(a) You have to transform and rasterize a bounding volume for the query, so the object you’re culling had better be significantly more expensive to draw than the bounding volume itself;
(b) The queries can have rather high latency, so you should structure your code so as to avoid having to go idle while waiting for the results to come back.

If used correctly, occlusion queries work really, really well in my experience. That said, they could and should still be combined with some sort of hierarchical CPU-based culling method to achieve optimal results.

Penetrator, could you provide some more details about how exactly you implemented your occlusion culling?

– Tom

Originally posted by DopeFish:
Occlusion queries aren’t just for culling. Whilst they can be used to cull hidden objects, they can also be used for other effects, such as sizing a flare depending on how much of a light object is visible.

It’s worth pointing out that the HP occlusion test doesn’t provide that functionality, only the NV and ARB versions do.

Originally posted by Tom Nuydens:
(b) The queries can have rather high latency, so you should structure your code so as to avoid having to go idle while waiting for the results to come back.

The HP version only provides a ‘Stop and Wait’ model. To take full advantage of CPU/GPU parallelism he should use ARB_OCCLUSION_QUERY.

The ARB occlusion query exists in the current WHQL NVidia drivers so I don’t know why it hasn’t been added to the list of extensions here: http://developer.nvidia.com/object/nvidia_opengl_specs.html

I’ve read a lot of posts about occlusion queries causing ‘bubbles’ in the pipeline. I’m not sure how this effect manifests itself and how big an impact it has. As far as I can tell occlusion queries, if used optimally, have a fill rate impact and little else.

Originally posted by Adrian:
The HP version only provides a ‘Stop and Wait’ model. To take full advantage of CPU/GPU parallelism he should use ARB_OCCLUSION_QUERY.

Absolutely, I should have made that clear.

Originally posted by Adrian:
I’ve read a lot of posts about occlusion queries causing ‘bubbles’ in the pipeline. I’m not sure how this effect manifests itself and how big an impact it has. As far as I can tell occlusion queries, if used optimally, have a fill rate impact and little else.

AFAIK this simply refers to the effect of finishing an occlusion query before the result is available. Doing so causes your pipeline to go idle until all pending commands have been executed, i.e. it’s equivalent to calling glFinish(). As you point out, proper use of occlusion queries will minimize this effect and the overhead beyond the inevitable fill rate cost should be negligible.

– Tom

AGP can’t handle simultaneous upstream and downstream data AFAIK, so transferring the query result from the video card to system memory could very well cause a pipeline bubble, as the card cannot hoover data from memory during that time.

This will of course be pretty insignificant compared to forcing GPU-CPU synchronization by ending a query prematurely.

Originally posted by harsman:
AGP can’t handle simultaneous upstream and downstream data AFAIK, so transferring the query result from the video card to system memory could very well cause a pipeline bubble, as the card cannot hoover data from memory during that time.

Presumably this bottleneck will disappear with PCI Express?

That’s why I said often. If you can identify a few good occluders and apply pretty much all the tips given in the specs, it might not be too bad. I tried a lot of things with them, but in the end the gain was minimal, and sometimes it was even slower (obviously, in some scenes doing occlusion culling is just wasted effort, no matter the method).
In the end, test-rendering the bounding boxes took about as much time as really rendering them. I even tried just drawing two lines connecting the corners. So it would have meant either writing everything around that extension or just ignoring it.

But I guess the real problem I have with it is that there are often much easier ways you don’t think about. So I consider it more of a “last chance” for situations where other methods won’t work.

Originally posted by Adrian:
Presumably this bottleneck will disappear with PCI Express?

Probably. PCI Express is supposed to have a separate upstream link. But since moving data from GPU to CPU is hardly the common case, I suspect the drivers won’t do a stellar job performance-wise anyway.

Originally posted by Tom Nuydens:
Penetrator, could you provide some more details about how exactly you implemented your occlusion culling?

This is the routine that renders the trees:

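GLboolean isVisible = GL_FALSE;

glEnable(GL_OCCLUSION_TEST_HP);                         /* start the occlusion test */
glDepthMask(GL_FALSE);                                  /* the test cube must not write depth */
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);    /* ... or color */
glutSolidCube(size*0.4f);                               /* bounding cube for the tree */
glDisable(GL_OCCLUSION_TEST_HP);                        /* end the test */
glDepthMask(GL_TRUE);
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glGetBooleanv(GL_OCCLUSION_TEST_RESULT_HP, &isVisible); /* waits for the result */
if (isVisible == GL_TRUE)
{
    glCallList(pine_tree);
}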

If you are storing all of your stuff on the card, then PCI Express or high AGP speeds won’t help, but I think a lot of people will benefit.
Main RAM will be like having extended memory.
I don’t remember reading about PCI Express memory but I’m sure it will have something like AGP memory.

As for glReadPixels, probably it will suck just as much as it does today if it already does.

Originally posted by V-man:
As for glReadPixels, probably it will suck just as much as it does today if it already does.

“ATI thinks PCI Express will allow multiple high performance graphics adaptors, allow new applications using the backchannel bandwidth, and increase the bar on graphics performance” http://www.theinquirer.net/?article=8000

By backchannel, are they referring to readback? Also, the term ‘bidirectional bandwidth’ is used frequently in articles about PCI Express. Does this mean something, or is it just marketing fluff?

I hoped NVidia/ATI would be clearer about what (if anything) PCI Express means to developers, particularly regarding readback speed. I’ve read a number of the technical documents and I still don’t really know what it’s going to mean in real terms.

Originally posted by Jared:
So it would have meant either writing everything around that extension or just ignoring it.

Well yes, you’d have to design your engine around the extension to get good results out of it, but in many cases it’s worth the effort IMHO.

Penetrator, unfortunately the code you posted is pretty close to the worst case scenario.

For starters, as has been mentioned more than once before, don’t use the HP extension – use the NV or ARB one. One way to avoid the latency of the queries is to retrieve and use the results in the next frame, not the current one. Because your bounding boxes somewhat overestimate the size of your trees anyway, popping will hopefully be insignificant.
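To make the "use the result in the next frame" idea concrete, here is a minimal sketch of just the bookkeeping, in plain C with no GL calls so it can be read in isolation. QueryState, query_state_update and query_state_issued are hypothetical names. In a real renderer the two inputs would be read with glGetQueryObjectuivARB, using GL_QUERY_RESULT_AVAILABLE_ARB and GL_QUERY_RESULT_ARB, and the bounding box would be drawn between glBeginQueryARB(GL_SAMPLES_PASSED_ARB, id) and glEndQueryARB(GL_SAMPLES_PASSED_ARB).

```c
#include <stdbool.h>

/* Per-tree state for a query issued in an earlier frame (hypothetical). */
typedef struct {
    bool query_pending;  /* a query was issued and its result not yet used */
    bool visible;        /* last known visibility; initialise to true */
} QueryState;

/* Call once per frame and per tree before deciding whether to draw it.
   result_available: has the GPU finished the pending query?
   samples_passed:   sample count of that query (ignored if not available).
   Returns whether the tree should be drawn this frame. */
bool query_state_update(QueryState *s, bool result_available,
                        unsigned samples_passed)
{
    if (s->query_pending && result_available) {
        s->visible = samples_passed > 0;
        s->query_pending = false;
    }
    /* Result not ready yet: reuse the old decision instead of stalling. */
    return s->visible;
}

/* Call right after issuing a new query for this tree. */
void query_state_issued(QueryState *s)
{
    s->query_pending = true;
}
```

The point of the design is that the CPU never blocks on a result: a tree whose result is late is simply drawn (or skipped) according to the previous answer, which is where the possible one-frame popping comes from.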

Next, I wonder in what kind of order you’re drawing your trees? The best way to do it is front to back. Back to front is the worst case scenario, presumably you’re somewhere in between (i.e. random order)?
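A front-to-back order can be obtained by sorting on squared distance to the camera each frame. The following is only a sketch: TreeInstance and sort_front_to_back are made-up names, and the camera position is assumed to be given in the same space as the tree positions.

```c
#include <stdlib.h>

typedef struct {
    float x, y, z;   /* tree position */
    float dist2;     /* squared distance to the camera, refreshed per frame */
} TreeInstance;

/* qsort comparator: nearest tree first. */
static int compare_front_to_back(const void *a, const void *b)
{
    const TreeInstance *ta = (const TreeInstance *)a;
    const TreeInstance *tb = (const TreeInstance *)b;
    if (ta->dist2 < tb->dist2) return -1;
    if (ta->dist2 > tb->dist2) return 1;
    return 0;
}

/* Recompute the distances for camera position (cx, cy, cz), then sort. */
void sort_front_to_back(TreeInstance *trees, size_t count,
                        float cx, float cy, float cz)
{
    size_t i;
    for (i = 0; i < count; ++i) {
        float dx = trees[i].x - cx;
        float dy = trees[i].y - cy;
        float dz = trees[i].z - cz;
        trees[i].dist2 = dx * dx + dy * dy + dz * dz;
    }
    qsort(trees, count, sizeof trees[0], compare_front_to_back);
}
```

An exact sort is not required for this purpose. A coarse ordering (for example sorting grid cells rather than single trees) already lets the near trees fill the depth buffer before the far ones are tested.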

Furthermore, you mention having 100-poly LODs for far away trees. You’re likely to be fillrate-limited for those, so doing an occlusion query (and rasterizing a bounding box) might cost you more than just rendering the tree in those cases. For these objects, having good CPU-based hierarchical frustum culling could help.

– Tom

Thank you Tom, I’m going to try both frustum culling and the ARB occlusion extension. I will post some results later …