Can you give some details of the stream-draw method?
I’ve tried glBufferDataARB() with GL_STREAM_DRAW_ARB and GL_STATIC_DRAW_ARB, but there is no difference in frame-rate using these different approaches.
Is there any reason I should try vertex arrays without VBOs?
A VBO should be faster than immediate mode even when the GL_STREAM_DRAW usage hint is used.
You can copy a new dataset with glBufferDataARB() every frame, or use the mapping technique: glMapBufferARB()/glUnmapBufferARB().
Here is a code snippet using map/unmap:
float *ptr = (float*)glMapBufferARB(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB);
if(ptr)
{
    // update dataset with given pointer to vertex buffer
    updateMyVBO(ptr);
    glUnmapBufferARB(GL_ARRAY_BUFFER_ARB); // release VBO after use
}
[quote]You can copy a new dataset with glBufferDataARB() every frame.[/quote]
No. glBufferData allocates a new buffer, so it is slow. It is better to use glBufferSubData to overwrite the existing buffer.
Originally posted by Overmind: [quote]You can copy a new dataset with glBufferDataARB() every frame.
No. glBufferData allocates a new buffer, so it is slow.[/quote]
It allocates new memory only if it has to, either because the buffer size or storage type changed or because the buffer is in use by the GPU. Otherwise it will likely reuse the existing memory.
The GL_DOUBLE type is not natively supported by most GPUs, so the driver has to convert the data on the CPU during the rendering command, which is almost certainly the cause of the slowdown you see.
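One way to avoid that per-draw conversion is to convert the vertex data to floats once on the application side before uploading it. The following is only a minimal sketch; the helper name doubles_to_floats is made up for illustration and is not part of OpenGL.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not an OpenGL call): converts `count` doubles
   into a newly allocated float array.
   Returns NULL if the allocation fails; the caller frees the result. */
float *doubles_to_floats(const double *src, size_t count)
{
    float *dst = malloc(count * sizeof *dst);
    if (!dst)
        return NULL;

    /* Narrow each value once here, so the driver does not have to
       do it during every rendering command. */
    for (size_t i = 0; i < count; ++i)
        dst[i] = (float)src[i];

    return dst;
}
```

The converted array can then be uploaded with glBufferDataARB() and described with GL_FLOAT instead of GL_DOUBLE in glVertexPointer(). This assumes float precision is sufficient for your data.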
Originally posted by Overmind: [quote]I wouldn’t be so sure the driver optimizes away the allocation. It doesn’t ever optimize away redundant state changes…[/quote]
What the driver optimizes depends on the cost of the change compared with the cost of the check (complexity of the check, frequency of the calls, probability that the check will avoid additional work).
At least Nvidia optimizes this, according to this paper.
The allocation can still happen if the GPU is using the buffer and the driver would otherwise have to wait. When replacing the content of an entire buffer that is currently in use by the GPU, glBufferSubData has the opposite problem: unless the driver detects that the entire buffer content is being replaced and optimizes that by allocating new memory, it needs to wait for the GPU. In both cases there will be either an allocation or a wait.
I would at least try using glBufferSubData, to see if it makes a difference.
That is a good idea.
Another thing to try would be to manually double-buffer the VBOs, in case the GPU's use of the buffer forces the driver to allocate memory or wait.
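A rough sketch of the bookkeeping for manual double buffering is below. It shows only the rotation between two buffer names and makes no GL calls itself; VboRing, vbo_ring_init and vbo_ring_acquire are hypothetical names invented for this example, and the buffer names are assumed to come from glGenBuffersARB().

```c
#include <stddef.h>

#define VBO_RING_SIZE 2  /* two buffers = double buffering */

/* Hypothetical helper type: stores the buffer names and remembers
   which one should be written on the next frame. */
typedef struct {
    unsigned int ids[VBO_RING_SIZE];
    size_t next;
} VboRing;

/* Copies VBO_RING_SIZE buffer names into the ring and starts at the first. */
void vbo_ring_init(VboRing *ring, const unsigned int *ids)
{
    for (size_t i = 0; i < VBO_RING_SIZE; ++i)
        ring->ids[i] = ids[i];
    ring->next = 0;
}

/* Returns the buffer name to update and draw from this frame, then
   advances so that the following frame gets the other buffer. */
unsigned int vbo_ring_acquire(VboRing *ring)
{
    unsigned int id = ring->ids[ring->next];
    ring->next = (ring->next + 1) % VBO_RING_SIZE;
    return id;
}
```

Each frame you would bind the returned name with glBindBufferARB(GL_ARRAY_BUFFER_ARB, id), update it, and draw from it, leaving the buffer from the previous frame untouched while the GPU may still be reading it. Whether two buffers are enough depends on how far the GPU runs behind the CPU, so treat the ring size as something to experiment with.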