GL_EXT_compiled_vertex_array problem.

Hi, I have changed my code to use the CVA extension, but I have only received a marginal speed increase.

The mesh is quite large, but I cannot see it being any bigger than Q3A meshes.

I have changed the vertex arrays to use an extra pad float (W coord) as was stated in a John Carmack paper - posted on this site somewhere - and this speeded things up a bit more.
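
As a rough illustration of the padding described above (a sketch only: `PaddedVertex` and `kPaddedStride` are made-up names, not taken from the engine), the idea is to round each vertex up to 16 bytes and tell OpenGL about the resulting stride:

```cpp
#include <cstddef>

// Hypothetical vertex layout: three position floats plus one pad float,
// so that each vertex occupies exactly 16 bytes.
struct PaddedVertex
{
	float x, y, z;
	float pad; // unused W slot; only there to round the vertex up to 16 bytes
};

static_assert(sizeof(PaddedVertex) == 16, "expected a 16-byte vertex");
static_assert(offsetof(PaddedVertex, pad) == 12, "pad should follow x, y, z");

// Byte stride to hand to glVertexPointer when only x, y, z are read:
//   glVertexPointer(3, GL_FLOAT, kPaddedStride, vertices);
// Alternatively, pass size 4 with stride 0 and keep pad set to 1.0f so it
// acts as a valid W coordinate.
constexpr int kPaddedStride = static_cast<int>(sizeof(PaddedVertex));
```

Whether the padding helps at all depends on the driver, so it is worth timing both layouts.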

I have no colour arrays specified and the textures aren’t working quite correctly yet - although I get something displayed - but still it really crawls.

Should I be setting up the arrays (and calling glDrawElements) on each frame?

I really need some help.

Thanks,
Luke A. Guest.

Just a simple question: are you using floats for your vertices?

If you are using doubles, try switching to floats… When I did that, I got an incredible boost in performance!

Something else: are you using display lists that embed your calls? If so, try to use glNewList(x,GL_COMPILE) and then glCallList(x) instead of glNewList(x,GL_COMPILE_AND_EXECUTE).

And if none of these apply to you, sorry, I am clueless!

Best regards.

Eric

Nope, I’m using floats (win32)… and I’m not using display lists at all - yet.

Luke.

Hi, here’s the code that gets called every frame to render a mesh.

The vertex buffer contains the vertices/texture coords/weights/colours etc. and the mesh segment contains the indices for the part of the mesh that is currently being rendered.

I originally had two viewports being drawn - the main one, which is the scene, and another one in the top corner, which contains another view into the current scene. I just removed the small viewport and gained only a minimal speedup.

I think it must be the way I’m rendering the meshes.

Thanks,
Luke A. Guest.

— Start —

void CDriver::Render(const V3DVertexBuffer &c_vbRender, const V3DIndexBuffer &c_ibIndices, const C3DMeshSegment &c_meshsegPortion)
{
Msg("OpenGL", "[CDriver::Render 2]");

const C3DIndexBuffer  &c_ibRealIndices = reinterpret_cast<const C3DIndexBuffer &>(c_ibIndices);
const C3DVertexBuffer &c_vbRealRender  = reinterpret_cast<const C3DVertexBuffer &>(c_vbRender);

glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_NORMAL_ARRAY);

if (c_vbRealRender.GetVertexFormat().GetTotalTextureCoords() > 0)
	{
	glEnableClientState(GL_TEXTURE_COORD_ARRAY);

	glTexCoordPointer(2, GL_FLOAT, 0, c_vbRealRender.GetTextures(c_meshsegPortion.m_dwVertexStart));
	}

if (c_vbRealRender.GetVertexFormat().IsHomogeneous() == true)
	{
	glVertexPointer(4, GL_FLOAT, 0, c_vbRealRender.GetVertices(c_meshsegPortion.m_dwVertexStart));
	}
else
	{
	// glVertexPointer(3, GL_FLOAT, 16, c_vbRealRender.GetVertices(c_meshsegPortion.m_dwVertexStart));
	glVertexPointer(3, GL_FLOAT, 0, c_vbRealRender.GetVertices(c_meshsegPortion.m_dwVertexStart));
	}

glNormalPointer(GL_FLOAT, 0, c_vbRealRender.GetNormals(c_meshsegPortion.m_dwVertexStart));

if (m_bHasCompiledVertexArrayExension == true)
	{
	glLockArraysEXT(c_meshsegPortion.m_dwVertexStart, c_meshsegPortion.m_dwVertexRange);
	}

glDrawElements(m_ePrimitives[c_meshsegPortion.m_primType], c_meshsegPortion.m_dwIndexRange, GL_UNSIGNED_SHORT, c_ibRealIndices.GetData() + c_meshsegPortion.m_dwIndexStart);

if (m_bHasCompiledVertexArrayExension == true)
	{
	glUnlockArraysEXT();
	}

if (c_vbRealRender.GetVertexFormat().GetTotalTextureCoords() > 0)
	{
	glDisableClientState(GL_TEXTURE_COORD_ARRAY);
	}

glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_NORMAL_ARRAY);
}

— End —

Are you using this code to do any multipass rendering though? If not, the CVA extension will only provide a minimal speed boost, if any at all. On the other hand, if you are using it for multipass rendering, you are losing the benefit of the CVA extension by prematurely unlocking the array.

[This message has been edited by DFrey (edited 10-12-2000).]

Hi,

We are not currently doing multipass rendering - but we will be…I’m currently looking at the GL_ARB_multitexture extension. We also have the code to set up each texture (stage), but we are not currently rendering multipass.

Also, when is it best to lock/unlock the arrays? I’m assuming that when you call glLockArraysEXT(), the arrays get compiled (shifted onto the card) and when glUnlockArraysEXT() is called, it is safe to edit the contents of the vertex buffer?

I essentially have to emulate D3D by wedging in the OGL code to fit the existing engine.

Thanks,
Luke A. Guest.

> Also, when is it best to lock/unlock the arrays? I’m assuming that when you call glLockArraysEXT(), the arrays get compiled (shifted onto the card) and when glUnlockArraysEXT() is called, it is safe to edit the contents of the vertex buffer?

Essentially correct. So your current code is fine for single pass rendering. That makes me wonder how much overhead your classes are introducing.
For multipass rendering the general approach in pseudo-code would be:
[i]
set vertex pointer
lock arrays

set texture and color pointers
set first rasterizing state
draw elements

set texture and color pointers
set second rasterizing state
draw elements

unlock arrays
[/i]
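
The ordering in the pseudo-code above can be sketched in C++ as follows. This is only an illustration: `CallRecorder`, its methods and `DrawMultipass` are hypothetical names, and the recorder logs each step instead of calling OpenGL so that the order can be checked without a GL context. In real code each method would forward to the GL function named in its comment.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the GL calls; it only records the call order.
struct CallRecorder
{
	std::vector<std::string> calls;

	void SetVertexPointer() { calls.push_back("vertex pointer"); } // glVertexPointer
	void LockArrays()       { calls.push_back("lock"); }           // glLockArraysEXT
	void DrawElements()     { calls.push_back("draw"); }           // glDrawElements
	void UnlockArrays()     { calls.push_back("unlock"); }         // glUnlockArraysEXT

	// glTexCoordPointer / glColorPointer for the given pass
	void SetPassPointers(int pass) { calls.push_back("pointers " + std::to_string(pass)); }

	// Texture binding, blending and other rasterizing state for the given pass
	void SetPassState(int pass) { calls.push_back("state " + std::to_string(pass)); }
};

// Draws passCount passes over one mesh segment using a single lock/unlock
// pair, so the locked vertex data stays valid across every pass.
template <class GL>
void DrawMultipass(GL &gl, int passCount)
{
	gl.SetVertexPointer();
	gl.LockArrays();

	for (int pass = 0; pass < passCount; ++pass)
	{
		gl.SetPassPointers(pass);
		gl.SetPassState(pass);
		gl.DrawElements();
	}

	gl.UnlockArrays();
}
```

The point of the structure is that the lock and unlock sit outside the loop; unlocking after every draw call gives up the reuse between passes.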

[This message has been edited by DFrey (edited 10-12-2000).]

I was just wondering…our mesh segments may be quite small.

Does the CVA only really produce better results when using large meshes?

Thanks,
Luke A. Guest.

The CVA extension is not really sensitive to mesh size, but to vertex reuse. The more often a vertex is reused, either (1) by appearing multiple times in the index array passed to glDrawElements, or (2) through multiple passes over the array (multiple calls to glDrawElements while the vertex array remains locked for the entire sequence of calls), the more benefit you get from the CVA extension. Some OpenGL drivers apparently accelerate case (1) even when the array is not locked, since locking is not strictly needed for that. However, I would not assume that any given OpenGL driver accelerates case (1) without the array being locked.
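
To get a feel for case (1), one could measure how often each vertex is referenced by a segment's index list. The helper below is a minimal sketch under that assumption; `AverageVertexReuse` is a made-up name and not part of OpenGL or of the engine discussed here.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Average number of times each distinct vertex is referenced by the index
// list. 1.0 means no reuse at all; higher values mean more reuse.
// Returns 0.0 for an empty index list.
double AverageVertexReuse(const std::vector<unsigned short> &indices)
{
	if (indices.empty())
		return 0.0;

	// Collect the distinct vertex indices referenced by this segment.
	const std::set<unsigned short> unique(indices.begin(), indices.end());

	return static_cast<double>(indices.size()) / static_cast<double>(unique.size());
}
```

For example, a quad drawn as two triangles with indices 0,1,2, 2,1,3 references four distinct vertices six times, giving 1.5, while an unshared triangle list gives 1.0.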

[This message has been edited by DFrey (edited 10-13-2000).]

> I have no colour arrays specified and the textures aren’t working quite correctly yet - although I get something displayed - but still it really crawls.

Many drivers have optimized their rendering path for the way Q3A does things. This means that if you do not specify texture coordinates or a color array, you may hit a non-optimized path.

Y.