VBO performance issue for dynamic geometry

I have just moved my drawing code from immediate mode to VBOs. When the 3D scene is static, performance is good. But when I animate the scene, performance drops drastically.
I have tried both glMapBufferARB(…, GL_WRITE_ONLY_ARB) and glBufferDataARB(…, GL_DYNAMIC_DRAW_ARB), but performance is equally bad with either.
The relevant code is listed below:

float* MapVerticiEx()
{
#ifdef USE_MAPBUFFERS
glBindBufferARB(GL_ARRAY_BUFFER_ARB, VBOVertices);
return (float*)glMapBufferARB(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB);
#else
return pfVertexM;
#endif
}

void UnMapVerticiEx(float* pDst)
{
#ifdef USE_MAPBUFFERS
memcpy(pDst, pfVertexM, 3 * sizeof(float) * NV);
glUnmapBufferARB(GL_ARRAY_BUFFER_ARB);
#else
glBindBufferARB(GL_ARRAY_BUFFER_ARB, VBOVertices);
glBufferDataARB(GL_ARRAY_BUFFER_ARB, NV * 3 * sizeof(float), pDst, GL_DYNAMIC_DRAW_ARB);
#endif
}

Please give me some advice. Thanks!

There are a large number of issues that can negatively affect your drawing speed. With the little information provided it can be just about any of them.

One thing that looks strange is that on UnMap* the non-mapped case uses pDst as the source, but the mapped case seems to use it as the destination for data already present in pfVertexM. As neither of them is set up in the code shown, I don’t know what to make of it.

IME the most common error is to submit many small batches of vertices, each followed by drawing call(s) referencing just those few vertices, instead of submitting one larger bunch of vertices just once and then referencing them with (if needed, multiple) drawing calls, with the indices properly biased for the new vertex offsets.
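To make the index biasing concrete, here is a minimal sketch in plain C with no GL calls. The names MeshBatch and batch_append and the fixed capacities are hypothetical, chosen only for illustration. Each mesh's indices are shifted by the number of vertices already stored, so everything can live in one vertex array and one index array that are uploaded once.

```c
#include <stddef.h>
#include <string.h>

#define BATCH_MAX_VERTS   1024  /* assumed capacity, in vertices */
#define BATCH_MAX_INDICES 4096  /* assumed capacity, in indices */

/* Hypothetical container: one shared vertex array and one shared index array. */
typedef struct {
    float        verts[BATCH_MAX_VERTS * 3];  /* x, y, z per vertex */
    unsigned int indices[BATCH_MAX_INDICES];
    size_t       vert_count;
    size_t       index_count;
} MeshBatch;

/* Append one mesh to the batch. Returns the position of its first index in
 * the shared index array, or (size_t)-1 if the mesh does not fit. */
static size_t batch_append(MeshBatch *b,
                           const float *verts, size_t nverts,
                           const unsigned int *idx, size_t nidx)
{
    size_t       first_index = b->index_count;
    unsigned int base        = (unsigned int)b->vert_count;
    size_t       i;

    if (b->vert_count + nverts > BATCH_MAX_VERTS ||
        b->index_count + nidx > BATCH_MAX_INDICES)
        return (size_t)-1;

    /* Copy the vertices behind those already stored. */
    memcpy(b->verts + b->vert_count * 3, verts, nverts * 3 * sizeof(float));

    /* Bias each index by the number of vertices that came before this mesh. */
    for (i = 0; i < nidx; ++i)
        b->indices[first_index + i] = idx[i] + base;

    b->vert_count  += nverts;
    b->index_count += nidx;
    return first_index;
}
```

The returned position is what a later draw call would use as the start of that mesh's index range, so several meshes can be drawn from the same pair of buffers.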

Another problem could be to use just a single VBO, which on unmap/map can effectively force flushes and therefore stalls. Try using two VBOs and toggle between them when submitting data. That way the server (gfx card) can process a number of operations using VBO 1 while you’re filling VBO 2 with new data.
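The toggle itself can be very small. A sketch, assuming the two buffer names were created elsewhere with glGenBuffersARB; VboPair and vbo_pair_flip are hypothetical names, and the GL calls appear only in the comment so the snippet compiles without a GL context.

```c
/* Hypothetical helper: two buffer names that are alternated each frame. */
typedef struct {
    unsigned int ids[2];  /* assumed to come from glGenBuffersARB(2, ids) */
    int          write;   /* index of the buffer being filled this frame */
} VboPair;

/* Switch to the other buffer and return its name for this frame's upload. */
static unsigned int vbo_pair_flip(VboPair *p)
{
    p->write ^= 1;
    return p->ids[p->write];
}

/* Per frame, roughly:
 *   unsigned int vbo = vbo_pair_flip(&pair);
 *   glBindBufferARB(GL_ARRAY_BUFFER_ARB, vbo);
 *   ... map, fill, unmap, then issue the draw calls from this same vbo ...
 * The buffer filled last frame is left alone, so draws still pending on it
 * do not have to finish before the new data is written.
 */
```

Whether two buffers are enough depends on how many frames the driver queues ahead; that is something to measure rather than assume.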

Last, but not least, keep track of states. Forgetting to turn off a performance-sucking state can have a very noticeable performance impact.

Thank you for the input, and sorry for the lack of information.


One thing that looks strange is that on UnMap* …

In my implementation, I use MapVerticiEx() and UnMapVerticiEx() as follows:

float *pVM = MapVerticiEx();
UnMapVerticiEx(pVM);

So when I define the macro USE_MAPBUFFERS, the code is equivalent to:

glBindBufferARB(GL_ARRAY_BUFFER_ARB, VBOVertices);
float *pVM = (float *)glMapBufferARB(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB);
memcpy(pVM, pfVertexM, 3 * sizeof(float) * NV); // pfVertexM is the updated vertex array
glUnmapBufferARB(GL_ARRAY_BUFFER_ARB);

If I do not define USE_MAPBUFFERS, the code is equivalent to:

glBindBufferARB(GL_ARRAY_BUFFER_ARB, VBOVertices);
glBufferDataARB(GL_ARRAY_BUFFER_ARB, NV * 3 * sizeof(float), pfVertexM, GL_DYNAMIC_DRAW_ARB);

I profiled my code: if I comment out Map* and UnMap*, performance improves the most, so I think Map* and UnMap* are the bottleneck. I suspect the reason is that Map* and UnMap* cause a memory copy between system and video memory.


IME the most common error is to submit many small batches of vertices …

The size of the vertex array is more than 65000, so I don’t think that is the problem.


Another problem could be to use just a single VBO …

This may be the issue. I animate every vertex in the vertex array, so the entire VBO has to be updated every frame.

Are you overwriting the entire contents of the buffer object when you map it? If so:

Switch to the STREAM_DRAW usage hint, and insert a BufferData(GL_ARRAY_BUFFER, size, NULL, GL_STREAM_DRAW) call before your map. By passing the NULL pointer, you’re telling the driver you don’t need the previous contents of the buffer, which lets it hand you fresh storage instead of waiting for pending draws on the old data to finish.
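Put together with the ARB entry points used earlier in the thread, the sequence could look like the sketch below. Everything around the call order is an assumption for illustration: BufferApi is a hypothetical table of function pointers standing in for the extension functions (which are fetched at run time anyway), plain C types replace GLenum and GLsizeiptrARB, and the fake_* functions are only a dummy backend so the order of calls can be dry-run without a GL context.

```c
#include <stddef.h>
#include <string.h>

/* Enum values from ARB_vertex_buffer_object, defined here only if no GL
 * header has provided them. */
#ifndef GL_ARRAY_BUFFER_ARB
#define GL_ARRAY_BUFFER_ARB 0x8892u
#define GL_STREAM_DRAW_ARB  0x88E0u
#define GL_WRITE_ONLY_ARB   0x88B9u
#endif

/* Hypothetical table standing in for the extension entry points. */
typedef struct {
    void          (*BindBuffer)(unsigned int target, unsigned int buffer);
    void          (*BufferData)(unsigned int target, ptrdiff_t size,
                                const void *data, unsigned int usage);
    void         *(*MapBuffer)(unsigned int target, unsigned int access);
    unsigned char (*UnmapBuffer)(unsigned int target);
} BufferApi;

/* Replace the whole buffer with `bytes` bytes from src.
 * Returns 1 on success, 0 if the map or the unmap failed. */
static int stream_upload(const BufferApi *gl, unsigned int vbo,
                         const void *src, ptrdiff_t bytes)
{
    void *dst;

    gl->BindBuffer(GL_ARRAY_BUFFER_ARB, vbo);
    /* NULL data: the old contents are not needed (buffer "orphaning"). */
    gl->BufferData(GL_ARRAY_BUFFER_ARB, bytes, NULL, GL_STREAM_DRAW_ARB);
    dst = gl->MapBuffer(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB);
    if (dst == NULL)
        return 0;
    memcpy(dst, src, (size_t)bytes);
    return gl->UnmapBuffer(GL_ARRAY_BUFFER_ARB) ? 1 : 0;
}

/* Dummy backend for dry runs: logs one letter per call. */
static char  fake_log[8];
static int   fake_n;
static float fake_store[16];

static void fake_bind(unsigned int t, unsigned int b)
{ (void)t; (void)b; fake_log[fake_n++] = 'B'; }

static void fake_data(unsigned int t, ptrdiff_t s, const void *d, unsigned int u)
{ (void)t; (void)s; (void)u; fake_log[fake_n++] = d ? 'D' : 'O'; } /* O = orphan */

static void *fake_map(unsigned int t, unsigned int a)
{ (void)t; (void)a; fake_log[fake_n++] = 'M'; return fake_store; }

static unsigned char fake_unmap(unsigned int t)
{ (void)t; fake_log[fake_n++] = 'U'; return 1; }
```

In a real program the table would be filled with glBindBufferARB, glBufferDataARB, glMapBufferARB and glUnmapBufferARB, and stream_upload would be called once per frame with pfVertexM and NV * 3 * sizeof(float).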