In my VBO experiments I’ve found that while static VBOs are very efficient, I haven’t been able to bring streaming or dynamic VBOs up to the performance of immediate mode or good old (even uncompiled) vertex arrays.
Further, using glBufferSubData to update the whole dataset turned out to be faster than using glBufferData by a factor of 3, which is contrary to what the spec and the nVidia VBO performance paper suggest (tested on GF3 and FX5900). Manually managing several VBO sets and filling them alternately from frame to frame with glBufferSubData was even faster (if still not as fast as plain vertex arrays). Isn’t glBufferData supposed to take care of CPU/GPU synchronization issues?
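To make the "several VBO sets" variant concrete, here is a minimal sketch of the rotation bookkeeping only. The names `VboRing`, `vbo_ring_next` and `NUM_SETS`, and the choice of three sets, are hypothetical and for illustration; the GL calls are shown in a comment because they need a live context.

```c
#include <stddef.h>

/* Assumed number of buffer sets; purely illustrative. */
#define NUM_SETS 3

/* Hypothetical helper type: a fixed ring of buffer object names. */
typedef struct {
    unsigned int ids[NUM_SETS]; /* names as returned by glGenBuffers */
    size_t current;             /* index of the set filled most recently */
} VboRing;

/* Advance to the next set and return its buffer name, so that the
 * buffer written this frame is not the one drawn from last frame. */
unsigned int vbo_ring_next(VboRing *ring)
{
    ring->current = (ring->current + 1) % NUM_SETS;
    return ring->ids[ring->current];
}

/* Per-frame use (requires a GL context; size_bytes and vertices are
 * placeholders for the streamed data):
 *
 *   GLuint vbo = vbo_ring_next(&ring);
 *   glBindBuffer(GL_ARRAY_BUFFER, vbo);
 *   glBufferSubData(GL_ARRAY_BUFFER, 0, size_bytes, vertices);
 *   glVertexPointer(3, GL_FLOAT, 0, (const GLvoid *)0);
 *   ... draw call ...
 */
```

Each buffer in the ring is assumed to have been allocated once up front with glBufferData at its full size, so the per-frame glBufferSubData call only overwrites existing storage.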
I also tried issuing only a single glVertexPointer call, as hinted in the nVidia paper, but the framerate didn’t change at all.
Another odd result was the very poor performance of the ATI drivers when using vertex arrays: it’s 20-30% slower to use glVertexPointer/glDraw(Range)Elements than to just loop through the array yourself and specify the data with glBegin/glVertex. I’m not sure what’s going on, but where the nVidia drivers showed a 1:3 ratio between normal vertex arrays and static VBOs on a 5900, the ATI drivers exhibited ratios of 1:10 and beyond on 9700/9800 models, turning normal vertex arrays into a major bottleneck.
I wasn’t able to find any demos of VBOs in a streaming situation (where all vertices get updated each frame); all the demos I found were using static VBOs… Is it because that’s the only situation in which VBOs currently work?
Any link/URL to a streaming VBO demo would be most welcome; I want to leave VAR behind.
(see http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=011853 for the methodology/code that’s used)