What are “best practices” for using vertex buffer objects with dynamically updated data?
As I read the extension spec, glMapBuffer will always stall until all drawing commands that use that buffer have completed. If the driver is brain-dead it might even stall until ALL drawing is completed. Anyone know if it’s that bad?
glBufferSubData probably stalls until all drawing with the buffer completes. If I used only glDrawRangeElements, a really smart driver might be able to figure out that the drawing and the update don’t overlap and avoid the stall. However, unless someone can definitively confirm it, it doesn’t seem wise to count on this. Anyone know? And even if it can avoid the stall, it has the downside of copying data rather than letting me write it directly into a buffer.
From this, I’m concluding that unlike DirectX with its D3DLOCK_NOOVERWRITE flag, OpenGL offers no performance-friendly multi-vendor way to update only part of a vertex buffer object if any part of it has been used for drawing. Thus, each block of dynamically-updated data should be in its own buffer that’s either replaced completely or not modified at all. Correct?
There are several strategies I can see here:
1. Always allocate a new vertex buffer object of the correct size. Delete it when it’s no longer used. Update it as needed with glBufferData( …NULL… ) + glMapBuffer, which will cause a new allocation within the driver.
2a. Have a pool of unused vertex buffers sitting around. When I need new buffer space, grab the best-fit from that pool (or allocate if necessary) and discard/replace its contents. When I’m done with a buffer, it goes into a queue. Queued buffers get returned to the free pool after buffer swaps (or fences if available) complete. Replacing the contents of an existing buffer is done with glBufferData on that buffer.
2b. Same as (2a), but replacing the contents of an existing buffer is done by tossing the current buffer into the unused queue and grabbing a new one from the free pool.
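To make the bookkeeping in (2a)/(2b) concrete, here is a rough sketch of the pool in C. It is only an illustration under some assumptions: every name (BufferPool, pool_acquire, pool_retire, pool_reclaim) is made up, buffer handles are plain integers standing in for GL buffer names, and "the GPU is done" is approximated by a frame counter that advances after each buffer swap. No GL calls are made; the comments mark where they would go.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_CAPACITY 64

/* Hypothetical record for one pooled buffer. */
typedef struct {
    unsigned handle;  /* stand-in for a GL buffer object name */
    size_t   size;    /* allocated size in bytes */
    unsigned frame;   /* frame in which the buffer was retired */
} PoolBuffer;

typedef struct {
    PoolBuffer free_list[POOL_CAPACITY];  /* safe to reuse right now */
    size_t     free_count;
    PoolBuffer pending[POOL_CAPACITY];    /* retired, GPU may still read them */
    size_t     pending_count;
    unsigned   next_handle;
} BufferPool;

void pool_init(BufferPool *p)
{
    assert(p != NULL);
    p->free_count = 0;
    p->pending_count = 0;
    p->next_handle = 1;
}

/* Best fit: return the smallest free buffer that holds `wanted` bytes,
   or create a new one if nothing in the free list is large enough. */
PoolBuffer pool_acquire(BufferPool *p, size_t wanted)
{
    size_t best = POOL_CAPACITY;  /* sentinel meaning "no candidate yet" */
    for (size_t i = 0; i < p->free_count; ++i) {
        if (p->free_list[i].size >= wanted &&
            (best == POOL_CAPACITY ||
             p->free_list[i].size < p->free_list[best].size)) {
            best = i;
        }
    }
    if (best != POOL_CAPACITY) {
        PoolBuffer found = p->free_list[best];
        p->free_list[best] = p->free_list[--p->free_count];
        /* A real version would discard the old contents here,
           e.g. with glBufferData on this buffer. */
        return found;
    }
    /* A real version would call glGenBuffers and glBufferData here. */
    PoolBuffer fresh = { p->next_handle++, wanted, 0 };
    return fresh;
}

/* Queue a buffer that the application no longer needs. It is not reusable
   until the frame that last drew from it has completed.
   Returns 0 if the pending queue is full, 1 otherwise. */
int pool_retire(BufferPool *p, PoolBuffer b, unsigned current_frame)
{
    if (p->pending_count == POOL_CAPACITY) {
        return 0;
    }
    b.frame = current_frame;
    p->pending[p->pending_count++] = b;
    return 1;
}

/* Call once the swap (or fence) for `completed_frame` has finished.
   Buffers retired in that frame or earlier move to the free list;
   the rest stay queued. */
void pool_reclaim(BufferPool *p, unsigned completed_frame)
{
    size_t kept = 0;
    for (size_t i = 0; i < p->pending_count; ++i) {
        if (p->pending[i].frame <= completed_frame &&
            p->free_count < POOL_CAPACITY) {
            p->free_list[p->free_count++] = p->pending[i];
        } else {
            p->pending[kept++] = p->pending[i];
        }
    }
    p->pending_count = kept;
}
```

With this layout, (2a) would be pool_acquire once followed by repeated glBufferData calls on the same handle, while (2b) would be pool_retire on the old buffer followed by a fresh pool_acquire for every update.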
My guess is that for dynamic data (used multiple times for each time it’s updated), approach 1 would be the best. For streamed data (written once, rendered once, and then discarded), 2a and 2b are equivalent and would be the best solution.
But I’m just speculating here. Does anybody know for sure?