VBO + Occlusion Query = Slow

Hi.

I have a really large static VBO and I have been looking into ways of removing parts of it to speed up my program.

I am going to be breaking it up and implementing octrees, but before that, I want to have a go at the ARB occlusion query.

I have tried to implement it, but it seems to really slow down my program.

The VBO I have holds 419430 vertices (it is a terrain). The occlusion query tells me how many samples it has to render, which seems fine to me, but I could be wrong. The actual drawing does not seem to remove anything, and it slows down my program considerably.

The code that I use is this:

 
	GLuint sampleCount;

	glGenQueriesARB(1, &query);

	glBeginQueryARB(GL_SAMPLES_PASSED_ARB, query);

	glRenderTerrain(tnormalt,tdetailnormal);

	glEndQueryARB(GL_SAMPLES_PASSED_ARB);

	glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
	glDepthMask(GL_TRUE);

	glGetQueryObjectuivARB(query, GL_QUERY_RESULT_ARB, &sampleCount);

	glUseProgramObjectARB(pTerrain);

	//glPolygonMode(GL_FRONT_AND_BACK, GL_LINE);

	if (sampleCount > 0) {
		glRenderTerrain(tnormalt,tdetailnormal);
	}

	glutSwapBuffers();
 

The line with glPolygonMode is just to show me what I can see. And it is showing everything that is supposed to be occluded.

Am I implementing this correctly? If I am, why does it slow down my program instead of speeding it up?

This is not good OQ usage.
An OQ returns how many samples “survive” the depth (and alpha and stencil) tests. In your case it would be better to split the terrain into an octree. Depending on the observer position, do frustum culling against the leaves and split the leaves into 3 groups… near, medium and far.

Render the near group… then render only the bboxes of the medium group and do an OQ for each leaf in the medium group. If a leaf's bbox passes the OQ, then that leaf is visible. Now… render the leaves that “survive” OQ testing. Then repeat all of these tests on the far leaves, just as for the medium leaves.
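The grouping step described above could be sketched roughly as follows. This is only an illustration: the `Leaf` struct, the `classify_leaves` helper and the two distance limits are hypothetical names and values, not something from the posts, and the leaves are assumed to have been frustum-culled already. Squared distances are compared to avoid a square root.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical distance bands; the thread does not give any thresholds. */
typedef enum { LEAF_NEAR, LEAF_MEDIUM, LEAF_FAR } LeafGroup;

typedef struct {
    float center[3];  /* centre of the leaf's bounding box */
    LeafGroup group;  /* filled in by classify_leaves() */
} Leaf;

/* Squared distance from the eye to the centre of a leaf. */
static float leaf_distance_sq(const Leaf *leaf, const float eye[3])
{
    float dx = leaf->center[0] - eye[0];
    float dy = leaf->center[1] - eye[1];
    float dz = leaf->center[2] - eye[2];
    return dx * dx + dy * dy + dz * dz;
}

/* Put every leaf into the near, medium or far group by distance. */
static void classify_leaves(Leaf *leaves, size_t count, const float eye[3],
                            float near_limit, float far_limit)
{
    assert(near_limit >= 0.0f && near_limit <= far_limit);

    for (size_t i = 0; i < count; ++i) {
        float d2 = leaf_distance_sq(&leaves[i], eye);
        if (d2 < near_limit * near_limit)
            leaves[i].group = LEAF_NEAR;    /* drawn directly, no query */
        else if (d2 < far_limit * far_limit)
            leaves[i].group = LEAF_MEDIUM;  /* bbox queried after near pass */
        else
            leaves[i].group = LEAF_FAR;     /* bbox queried after medium pass */
    }
}
```

The near group is then drawn first so that it fills the depth buffer, which is what gives the bbox queries of the medium and far groups something to be occluded by.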

OQ saves fillrate and pixel processing time. With proper usage it can also save vertex processing time, because you send low-poly models or bboxes instead of the full geometry.

Keep in mind that waiting for an OQ result might force the CPU to wait on the GPU. A better approach is to send OQ batches, then do something else, and come back later to read the OQ test results.

  A better approach is to send OQ batches, then do something else, and come back later to read the OQ test results.

or… test the last frame query!

Originally posted by Golgoth:

  A better approach is to send OQ batches, then do something else, and come back later to read the OQ test results.

or… test the last frame query!

Yes, why not, but it depends on several factors, such as how much the view changes between frames and the granularity of the OQ samples.
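A rough sketch of the "last frame query" idea, with the caveat about view changes folded in. Everything here is hypothetical and only meant as an illustration: `LeafVisibility`, `should_draw_leaf`, `store_query_result` and the `view_changed_a_lot` flag are made-up names, and how a large view change is detected is left to the application. The sample count would come from a `glGetQueryObjectuivARB(query, GL_QUERY_RESULT_ARB, &sampleCount)` call like the one in the first post, made one frame after the query was issued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-leaf record of the most recent query result (hypothetical layout). */
typedef struct {
    bool     has_result;    /* false until a result has been read back */
    unsigned last_samples;  /* samples passed in the previous frame's query */
} LeafVisibility;

/* Store a sample count that was read back from last frame's query. */
static void store_query_result(LeafVisibility *v, unsigned samples)
{
    assert(v != NULL);
    v->has_result = true;
    v->last_samples = samples;
}

/* Decide whether to draw a leaf this frame from last frame's result only,
 * so the CPU never has to wait for a query issued in the current frame. */
static bool should_draw_leaf(const LeafVisibility *v, bool view_changed_a_lot)
{
    assert(v != NULL);
    /* No result yet, or the old result is stale: draw to be safe. */
    if (!v->has_result || view_changed_a_lot)
        return true;
    return v->last_samples > 0;
}
```

A leaf that is skipped this way still needs its bbox query issued every frame, otherwise it can never be detected as visible again.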