I’ve encountered what looks to me like a performance oddity: with vertex lighting ON, immediate mode rendering is slightly faster than VBO or BuildList; with lighting OFF, the situation reverses.
By “immediate” I mean specifying triangles with glBegin/glNormal/glVertex/glEnd, while “BuildList” uses a display list built from the “immediate” calls, and “VBO” uses straight indexed vertex arrays stored in VBO buffers.
Here are the figures I get (GF3, Det 45.23, AXP 1800+) for 28000 triangles in a tristrip (all visible, none gets culled), with only one omni light in the scene:
Lighting ON:
immediate mode : 220 FPS
buildlist/VBO : 200 FPS
Lighting OFF (i.e. glDisable(GL_LIGHTING)):
immediate mode : 340 FPS
buildlist : 520 FPS
VBO : 510 FPS
While the triangle rate with lighting OFF doesn’t look too bad, the VBO/BuildList performance with lighting ON is somewhat depressing… Any idea why VBO would perform slower than immediate mode calls when the only difference is lighting being ON?
For reference, this is the immediate mode loop:
  glBegin(GL_TRIANGLE_STRIP);
  for i:=0 to indices.Count-1 do begin
    k:=indices[i];               // index into the shared normal/vertex lists
    glNormal3fv(@normals[k]);    // normal first, so it applies to the vertex that follows
    glVertex3fv(@vertices[k]);
  end;
  glEnd;
Note: “classic” vertex arrays and VBO have exactly the same performance as soon as there is a light ON, i.e. slower than immediate (despite the fact that “immediate” makes thousands of calls).
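To put the FPS figures above in terms of triangle throughput, here is a minimal sketch that simply multiplies the 28000 triangles per frame by each measured frame rate. The program name and the Report procedure are made up for this illustration and are not part of the benchmark itself; only the triangle count and the FPS values come from the figures above.

```pascal
program TriangleRate;
{$APPTYPE CONSOLE}

const
  // Triangles drawn per frame, from the benchmark description
  // (kept as a floating-point constant so the product is a real number)
  TrianglesPerFrame = 28000.0;

// Hypothetical helper: prints throughput in millions of triangles per second
procedure Report(const Path: string; Fps: Integer);
begin
  WriteLn(Path, ': ', TrianglesPerFrame * Fps / 1e6:0:2, ' Mtris/s');
end;

begin
  WriteLn('Lighting ON');
  Report('  immediate mode', 220);
  Report('  buildlist/VBO ', 200);
  WriteLn('Lighting OFF');
  Report('  immediate mode', 340);
  Report('  buildlist     ', 520);
  Report('  VBO           ', 510);
end.
```

That works out to roughly 5.6 to 6.2 Mtris/s with lighting ON versus 9.5 to 14.6 Mtris/s with lighting OFF. Since a tristrip uses about one vertex per triangle, the vertex rates are nearly the same numbers.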
Is that Visual Basic?
If it is, then maybe there’s something going wrong because of the way the pointer parameter is abused in VBO…? Just a thought… don’t know jack about Visual Basic, but I gather it doesn’t use pointers, so maybe its interface with a C DLL gets messed up in this extension.
That is Delphi code; it interfaces with OpenGL in exactly the same fashion as your C code, with pointers etc., though from experience I guess my ‘for’ loop is compiled more efficiently than your C equivalent
Btw, on the “classic” vertex array performance, I’ve an addendum: as long as no VBO call of any kind has been made, performance is similar to “immediate” (and even slightly faster). Once VBOs have been used, the performance of “classic” vertex arrays matches that of the VBOs when lighting is ON (i.e. slower, even though the VBOs have been disposed of).
With lighting OFF, classic vertex arrays are faster than immediate, but not as fast as VBO (as can be expected).
Could it be your lighting?
Nvidia wants you to NOT use 2 sided lighting, otherwise you hit a software path.
I think that’s it, and the rest should be entirely hw accelerated.
The next suspect is the driver. I think the newer ones are better tuned for the FX cards at the cost of older ones.
TwoSidedLighting isn’t used, and faceculling on/off has no impact on performance (as expected, all triangles being visible).
Using short (16-bit) indices gave me no performance delta, exactly the same performance figures
(the actual meshes can end up with more than 64k vertices in a chunk, so short indices wouldn’t have been a convenient solution anyway)
I’ve made another test (not willingly at first, but the results were interesting): I fired up a proggy that ate 100% of CPU time (a math calculations thing), then started the bench. Immediate mode performance dropped to about 170 FPS each time, while VBOs went down to 50 FPS… meaning that VBOs are transformed/lit on the CPU side???
Well, gotta wait for next driver release and hope for an improvement…