I feel a bit foolish asking this (and this might not be the correct forum) but here it goes:
In my spare time I am setting up an engine, but I can't tell whether its performance is good or bad.
Some needed info: I have a GeForce4 Ti 4200 (AGP 4x, 64 MB) and a 1.6 GHz P4.
I checked the NVIDIA site and looked at the peak performance figures. I think I read somewhere that NVIDIA's numbers are measured under heavily optimized conditions and that I should not expect results close to them in my application. What I am wondering is whether I am "on the mark" or whether there is something I am doing wrong (since my results are much lower).
I set up my test app to generate "lots" of tris in order to check how well it performs. It has a skydome with approx 21.6K tris, a patch of ground with 8K tris, and a "pillar" in the center with 8K tris as well. At this primitive stage I just send everything to the card.
That is 38016 tris in total, spread over 3 textures. Geometry is drawn grouped by texture (i.e. only 3 texture changes per frame), using vertex lighting.
Static VBOs for everything.
Indices are GL_UNSIGNED_SHORT and I draw with glDrawRangeElements.
I am sending GL_TRIANGLES.
I only clear the Z buffer each frame.
My "HUD" displays position, rotation, and FPS. This is the only part that is still done with plain vertex arrays (VAs).
Currently the VBO indices are sent row by row. I read an article from NVIDIA about sending everything in one tri strip and using degenerates. Does this help much or is it not worth the bother?
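To make the degenerate-strip idea concrete, here is a minimal sketch of how a row-major vertex grid could be indexed as one strip. It is an illustration only, not my actual code: the function names buildGridStrip and triangleListIndexCount and the grid layout are assumptions made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: index a gridW x gridH vertex grid (row-major) as ONE
// triangle strip. Consecutive rows are stitched by repeating two indices,
// which yields zero-area (degenerate) triangles that cover no pixels.
std::vector<std::uint16_t> buildGridStrip(unsigned gridW, unsigned gridH)
{
    if (gridW < 2 || gridH < 2)
        throw std::invalid_argument("grid needs at least 2x2 vertices");
    if (static_cast<unsigned long long>(gridW) * gridH > 65536ull)
        throw std::invalid_argument("too many vertices for 16-bit indices");

    std::vector<std::uint16_t> indices;
    indices.reserve((gridH - 1) * 2 * gridW + (gridH - 2) * 2);

    for (unsigned row = 0; row + 1 < gridH; ++row) {
        if (row > 0) {
            // Stitch: repeat the last index of the previous row strip,
            // then the first index of this one.
            indices.push_back(indices.back());
            indices.push_back(static_cast<std::uint16_t>(row * gridW));
        }
        // One strip per row pair: alternate upper-row and lower-row vertices.
        for (unsigned col = 0; col < gridW; ++col) {
            indices.push_back(static_cast<std::uint16_t>(row * gridW + col));
            indices.push_back(static_cast<std::uint16_t>((row + 1) * gridW + col));
        }
    }
    return indices;
}

// Index count for the same grid drawn as a plain GL_TRIANGLES list.
std::size_t triangleListIndexCount(unsigned gridW, unsigned gridH)
{
    return static_cast<std::size_t>(gridW - 1) * (gridH - 1) * 6;
}
```

For a hypothetical 65x65 vertex grid (8192 tris), this gives 8446 strip indices versus 24576 indices as a triangle list. The result would be drawn with GL_TRIANGLE_STRIP instead of GL_TRIANGLES; whether the smaller index buffer makes a measurable difference on this card is exactly what I don't know.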
I set up some simple profiling code and included the percentages (in case it helps).
Performance (24-bit color, 8-bit alpha), fullscreen:
395 fps @ 800x600. Processing time: glDrawRangeElements 38%, SwapBuffers 24%
300 fps @ 1024x768. Processing time: glDrawRangeElements 44%, SwapBuffers 47%
At first glance I thought it was good but when doing the maths it does not seem that great. Opinions? Suggestions? Any help in this matter would be greatly appreciated as I am hesitant to continue until I feel this is resolved.
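The maths I mean is roughly the following, written out as a small sketch. The function names are made up for this post, and the pixel figure counts each screen pixel only once per frame (overdraw is ignored), so treat it as a lower bound.

```cpp
#include <cstdio>

// Triangles submitted per second at a given frame rate.
double trianglesPerSecond(double trisPerFrame, double fps)
{
    return trisPerFrame * fps;
}

// Screen pixels per second, counting each pixel once per frame
// (overlap between skydome, ground and pillar is ignored).
double pixelsPerSecond(int width, int height, double fps)
{
    return static_cast<double>(width) * height * fps;
}

// Time available per frame, in milliseconds.
double frameTimeMs(double fps)
{
    return 1000.0 / fps;
}

// Prints the figures for the two measurements given in this post.
void printReport()
{
    const double tris = 38016.0;  // total tris per frame, from above
    std::printf("800x600:  %.2f Mtris/s, %.1f Mpix/s, %.2f ms/frame\n",
                trianglesPerSecond(tris, 395.0) / 1e6,
                pixelsPerSecond(800, 600, 395.0) / 1e6,
                frameTimeMs(395.0));
    std::printf("1024x768: %.2f Mtris/s, %.1f Mpix/s, %.2f ms/frame\n",
                trianglesPerSecond(tris, 300.0) / 1e6,
                pixelsPerSecond(1024, 768, 300.0) / 1e6,
                frameTimeMs(300.0));
}
```

That works out to about 15.0 million tris/s at 800x600 and about 11.4 million tris/s at 1024x768. Since the triangle count is identical in both runs but the frame rate drops with resolution, my guess is that fill rate or the buffer swap is part of the limit rather than triangle throughput alone, but I am not sure how to confirm that.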



