GL_OCCLUSION_TEST_HP usage

Is this the correct way to use the new occlusion test extension on nvidia cards?

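GLboolean result = GL_FALSE;
// Keep the bounding box out of the depth and color buffers
glDepthMask(GL_FALSE);
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
glEnable(GL_OCCLUSION_TEST_HP);
// DRAW BOUNDING BOX HERE
glDisable(GL_OCCLUSION_TEST_HP);
// Restore normal writes before drawing the real object
glDepthMask(GL_TRUE);
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glGetBooleanv(GL_OCCLUSION_TEST_RESULT_HP, &result);

if ( result )
{
    // Draw Object Here
}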

This is exactly what I’m doing, but it isn’t working right: the results seem almost random.

If the usage is correct, my bug must be elsewhere :P

Thanks,
– Zeno

Looks correct to me.

– Matt

You have to draw in the correct order… maybe this is the bug?

No, it can’t be that. I’m not saying it doesn’t work in the performance sense (which would be affected by the order you sort drawables) but it doesn’t seem to work for me in the functional sense.

Sometimes things that should be culled appear and sometimes things that shouldn’t are invisible. I’ll check it out more tomorrow.

– Zeno

If anyone cares, the problem was that I was trying to render the boxes in immediate mode while the VERTEX_ARRAY client state was enabled :P Oops.

Unfortunately, the state changes (and pipeline stop) required for me to render the bounding box actually cause the program to run slower. It’s faster for me to render 40k tris per frame without the occlusion test than it is to do 5-10k with it (approximate results).

Perhaps if the geometry were more dense or I put the bounding box vertices in VAR memory, it would be worthwhile.

– Zeno
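One way to avoid toggling client state for the box is to send it through the same vertex-array path as the rest of the scene. The following is a minimal sketch of that idea, not Zeno's code: `build_box_corners` and `BOX_QUAD_INDICES` are made-up names, the corner numbering is an arbitrary choice, and the GL calls appear only in a comment because they need a live context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from the thread: writes the 8 corners of an
   axis-aligned box as 24 floats (x, y, z per corner).
   Corner i takes hi[] on an axis when the matching bit of i is set:
   bit 0 = x, bit 1 = y, bit 2 = z. */
void build_box_corners(const float lo[3], const float hi[3], float out[24])
{
    assert(lo != NULL && hi != NULL && out != NULL);
    for (int i = 0; i < 8; ++i) {
        for (int axis = 0; axis < 3; ++axis) {
            out[i * 3 + axis] = ((i >> axis) & 1) ? hi[axis] : lo[axis];
        }
    }
}

/* Six quads, wound counter-clockwise when seen from outside the box. */
const unsigned char BOX_QUAD_INDICES[24] = {
    0, 2, 3, 1,   /* z = lo */
    4, 5, 7, 6,   /* z = hi */
    0, 4, 6, 2,   /* x = lo */
    1, 3, 7, 5,   /* x = hi */
    0, 1, 5, 4,   /* y = lo */
    2, 6, 7, 3    /* y = hi */
};

/* With a GL context, the box could then be submitted roughly like this:
     glVertexPointer(3, GL_FLOAT, 0, corners);
     glDrawElements(GL_QUADS, 24, GL_UNSIGNED_BYTE, BOX_QUAD_INDICES);
*/
```

Whether this actually beats immediate mode for a single box depends on the driver and on where the arrays live, so it would need measuring.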

Does the occlusion test work on GeForce2-based cards? I tried it with the 27.10 and 27.20 drivers and it still doesn’t work.

HP_occlusion_test is a suboptimal extension for performance. This is entirely a result of the interface it provides, which makes it impossible to get any parallelism.

– Matt

Also, you can use the extension on older hardware by using the emulation registry option. The performance, however, will be pathetically bad – you’ll be running in software. This may be useful for testing purposes, but since the point of the extension is to improve performance, it’s not incredibly useful.

– Matt

Originally posted by mcraighead:
HP_occlusion_test is a suboptimal extension for performance. This is entirely a result of the interface it provides, which makes it impossible to get any parallelism.

– Matt

Matt -

Are you implying that a functionally equivalent extension could be made with a different interface that would be faster?

– Zeno

Yes, I think I’m implying that.

– Matt

Is that “equivalent extension” GL_NV_occlusion_query ?

Y.

The ever-mysterious GL_NV_occlusion_query… Cass said the new OGL specs will be online soon on the NVIDIA website.

“Soooooon, answers and questions will meet.” Zarglor Nitizer Snoopz

Originally posted by Zeno:
Are you implying that a functionally equivalent extension could be made with a different interface that would be faster?

The problem with this extension is that the glGetBooleanv(GL_OCCLUSION_TEST_RESULT_HP, &result) call would cause an implicit glFinish(). There are many ways to get around that problem. I’m not sure how nVidia will do it, but I suppose they will add some way to query whether the result is available yet, or something similar.
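To make the cost of that implicit finish concrete, here is a toy timing model, not a measurement of any driver. It assumes one "tick" is the time to submit one bounding box and that a result comes back a fixed `latency` ticks after its box is submitted. Both function names and the model itself are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model (an assumption, not a driver measurement):
   one tick = time to submit one bounding box. */

/* Read each result right after submitting its box, so the CPU waits out
   the full pipeline latency once per box. */
unsigned long ticks_test_then_read(size_t boxes, unsigned long latency)
{
    unsigned long now = 0;
    for (size_t i = 0; i < boxes; ++i) {
        now += 1;        /* submit box i */
        now += latency;  /* block until its result comes back */
    }
    return now;
}

/* Submit every box first, then collect the results in order. Most
   results are already back by the time they are read. */
unsigned long ticks_batch_then_read(size_t boxes, unsigned long latency)
{
    unsigned long now = (unsigned long)boxes;  /* all boxes submitted */
    for (size_t i = 0; i < boxes; ++i) {
        unsigned long ready_at = (unsigned long)(i + 1) + latency;
        if (ready_at > now) {
            now = ready_at;  /* wait only if this result is still pending */
        }
    }
    assert(now >= (unsigned long)boxes);  /* never faster than submission */
    return now;
}
```

In this model, 10 boxes at a latency of 5 ticks take 60 ticks when each result is read immediately and 15 ticks when the reads are deferred, which is the kind of parallelism a single blocking boolean cannot offer.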

Maybe I’m saying something stupid, but isn’t it possible to have that kind of extension run 100% in parallel without any feedback?

I was thinking of something like a command to tell OGL what the current bounding box is, and you make sure that all subsequent drawing calls will lie inside that bounding box. Like:

glBegin(GL_BOUNDING_BOX);
… many glVertex calls to specify bbox
glEnd();

glEnable(GL_OCCLUSION);
glBegin(GL_TRIANGLES);
… some glVertex inside the bbox
glEnd();
glDisable(GL_OCCLUSION);

What do you think?

Y.

Another stall in the HP version is when you enable the occlusion test itself - logically, you’d have to flush all preceding commands before processing the occlusion geometry. Then you specify the occlusion geometry, then query the result (forcing another stall).

It might be easy to change this to a single stall, but either way you empty the processing pipeline… which is one of the top ‘no-nos’ in performance graphics.

The pay-off comes when you have a massively complex object which you can represent with a simple bounding region. Imagine a 250,000 polygon object enclosed by a bounding box. There’s pay-off in that (no, “250,000” is not the magic break-even number - I just pulled it out of the air).
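A back-of-the-envelope version of that trade-off, as a sketch only: it assumes every cost can be expressed in one common unit (say, triangle-equivalents of frame time) and that the object is visible with some probability. The function names and all numbers fed to them are hypothetical.

```c
#include <assert.h>

/* Expected per-object cost when the box is tested first. All costs share
   one arbitrary unit; p_visible is the chance the test reports "visible",
   in which case the full object still has to be drawn. */
double expected_cost_with_test(double box_cost, double stall_cost,
                               double object_cost, double p_visible)
{
    assert(p_visible >= 0.0 && p_visible <= 1.0);
    return box_cost + stall_cost + p_visible * object_cost;
}

/* Nonzero when testing first is expected to be cheaper than always
   drawing the object. */
int occlusion_test_pays_off(double box_cost, double stall_cost,
                            double object_cost, double p_visible)
{
    return expected_cost_with_test(box_cost, stall_cost,
                                   object_cost, p_visible) < object_cost;
}
```

With a made-up stall cost of 20,000 units, a 12-unit box, and a 50% chance of being visible, a 250,000-unit object comes out ahead (145,012 versus 250,000), while a 10,000-unit object does not (25,012 versus 10,000). The real stall cost is the unknown that decides where the break-even sits.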

A better extension, in my opinion, would be one which ties two geometric representations together, rendering the second only if the first ‘would probably have been’ rendered. This could be done with some vertex-array equivalent calls such as:

glDrawOcclusionElements( GLenum mode,
GLsizei count1,
GLsizei count2,
GLenum type,
const GLvoid *indices1,
const GLvoid *indices2 )

Mode and type are the same for both the occlusion representation (count1, indices1) and the complex representation (count2, indices2).
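The intended semantics can be sketched in plain C with callbacks. This shows only the control flow: the decision here is made on the CPU, whereas the point of the proposal is that the card would make it without a readback. Every name below is hypothetical and none of it is a real OpenGL entry point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback types. The proxy callback draws the simple
   representation and returns nonzero if any of it would be visible. */
typedef int  (*proxy_fn)(void *ctx);
typedef void (*detail_fn)(void *ctx);

/* Draws the detailed representation only if the proxy was visible.
   Returns nonzero when the detailed representation was drawn. */
int draw_if_proxy_visible(proxy_fn draw_proxy, detail_fn draw_detail,
                          void *ctx)
{
    assert(draw_proxy != NULL && draw_detail != NULL);
    if (!draw_proxy(ctx)) {
        return 0;  /* proxy fully hidden: skip the expensive geometry */
    }
    draw_detail(ctx);
    return 1;
}

/* Stand-in back end for trying the control flow without a GL context. */
struct fake_scene {
    int proxy_visible;  /* what the fake occlusion test reports */
    int proxy_draws;    /* how many times the proxy was drawn */
    int detail_draws;   /* how many times the detail was drawn */
};

int fake_proxy(void *ctx)
{
    struct fake_scene *scene = ctx;
    scene->proxy_draws++;
    return scene->proxy_visible;
}

void fake_detail(void *ctx)
{
    struct fake_scene *scene = ctx;
    scene->detail_draws++;
}
```

Calling `draw_if_proxy_visible(fake_proxy, fake_detail, &scene)` with `scene.proxy_visible` set to 0 draws the proxy once and never touches the detailed geometry.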

Of course, if nVidia has come up with a nice parallelism-friendly extension which they hope to have adopted by multiple vendors, then I’d be tickled to use it.

Later,
– Jeff