Thaellin
01-30-2001, 08:37 AM
Hi,
I'm having trouble getting my rendering loop up to snuff. I'm currently getting only 20,000 triangles per second on my system at home. A GLperf test case with similar GL states indicates my system should be capable of over 100,000 triangles per second.
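For anyone comparing figures, here is a minimal sketch of how a triangles-per-second number and the gap against a reference can be computed. It assumes the triangle count and the elapsed wall-clock time are already measured elsewhere. The function names `triangles_per_second` and `shortfall_factor` are hypothetical and not taken from my actual code or from GLperf.

```c
/* Hypothetical helper: throughput from a triangle count and elapsed time.
 * Returns 0.0 for a non-positive interval to avoid dividing by zero. */
static double triangles_per_second(unsigned long triangles, double elapsed_seconds)
{
    if (elapsed_seconds <= 0.0)
        return 0.0;
    return (double)triangles / elapsed_seconds;
}

/* Hypothetical helper: how many times faster the reference figure is
 * than the measured one. Returns 0.0 if nothing was measured. */
static double shortfall_factor(double reference_tps, double measured_tps)
{
    if (measured_tps <= 0.0)
        return 0.0;
    return reference_tps / measured_tps;
}
```

With the numbers above, `shortfall_factor(100000.0, 20000.0)` gives a factor of 5 between the GLperf result and what my loop achieves.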
I profiled my code, and have run through numerous optimization cycles already. I seem to be at a plateau right now, though. My code currently processes a list of the minimal object data which must be rendered into the scene. It does a small amount of display cache checking, then renders the OpenGL primitives which compose the object.
Timing indicates that over 90% of my function time is spent in the glDrawArrays call. Replacing it with glDrawElements had no impact. On a V3 system, replacing the call with manual stepping through the data arrays (glBegin/glVertex3fv/glEnd sets) improved performance significantly, though not enough, but had no effect on nVidia-based test systems. Interleaved arrays do not significantly alter performance (a minor performance /drop/, if anything).
I'm currently using the vertex, normal, and texture coordinate arrays. All data is tightly packed in separate single-precision floating-point arrays.
I simply have no clue how to get the improvement that GLperf shows me is possible.
I can implement more optimizations based on rejecting data, but they will not boost the triangle output of my function.
Does anyone know what could account for the performance difference? My head aches.
Thanks for any insight you can offer.
-- Jeff
[This message has been edited by Thaellin (edited 01-30-2001).]