Let’s say I have 20,000 triangles that get updated 50 times per second. What is the fastest way to render these triangles 5 times? I want to render them using 5 different materials, etc.
Would it be faster to generate a display list on the first pass and then draw the other 4 passes from that display list, or should I use vertex arrays 5 times?
Why is it a no-no? If I render the same contents 5 times using different materials etc., the driver could store the triangle data in on-board graphics memory, so logically I would benefit from using a display list. I would not have to transfer the same contents 5 times from CPU to graphics memory.
Display list compilation can be exceedingly slow. It’s definitely not designed to be used per frame, even though you are reusing it 5 times per frame. Display lists are designed to produce the fastest possible rendering performance, while not caring much about compilation performance.
As a point of reference, I have an app that generates display lists for models with on the order of 300,000 triangles. The display list compilation time for this is around 0.75 seconds. Scaled down linearly to 20,000 polygons, that is about 0.05 seconds per compile, which would yield a max of about 20 fps. Of course, my hardware is likely different from your own, or your target hardware, but you get the idea.
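For anyone who wants to plug in their own measurements, here is a minimal sketch of that estimate in C. It assumes compile time scales linearly with triangle count and that nothing else costs time in the frame, which is only a rough guess. The function names are made up for illustration.

```c
#include <assert.h>

/* Scale a measured display-list compile time to another triangle count.
   Assumes compile time grows linearly with the number of triangles. */
double scaled_compile_seconds(double measured_seconds,
                              double measured_triangles,
                              double target_triangles)
{
    assert(measured_triangles > 0.0);
    return measured_seconds * (target_triangles / measured_triangles);
}

/* Upper bound on the frame rate if one compile happens every frame
   and everything else in the frame is free. */
double max_fps(double seconds_per_frame)
{
    assert(seconds_per_frame > 0.0);
    return 1.0 / seconds_per_frame;
}
```

With the numbers above, scaled_compile_seconds(0.75, 300000, 20000) comes out at about 0.05 seconds, and max_fps of that is about 20.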
If you don’t want to use VAR, use LockArraysEXT(). That’s what it’s for – submitting the same geometry more than once.
However, some implementations of LockArraysEXT() cut off any optimization at a fairly low number of verts (1000? 4000?), so you might want to slice it into a bunch of little arrays, iterating over each material for each slice. Benchmark to be sure.
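A rough sketch of the slicing in C, assuming a non-indexed GL_TRIANGLES array. The Chunk type and plan_chunks() are hypothetical helpers, not part of any GL API. Slice sizes are rounded down to a multiple of 3 so that no triangle is split across two slices, which means a requested size of 1000 becomes 999.

```c
#include <assert.h>
#include <stddef.h>

/* One contiguous slice of a non-indexed GL_TRIANGLES vertex array. */
typedef struct {
    int first;  /* index of the first vertex in the slice */
    int count;  /* vertices in the slice, always a multiple of 3 */
} Chunk;

/* Split total_verts vertices into slices of at most max_chunk_verts.
   Writes at most out_cap slices to out and returns how many were written;
   if out_cap is too small, the remaining vertices are not covered. */
size_t plan_chunks(int total_verts, int max_chunk_verts,
                   Chunk *out, size_t out_cap)
{
    assert(total_verts >= 0 && total_verts % 3 == 0);
    assert(max_chunk_verts >= 3);

    /* Round the slice size down to a whole number of triangles. */
    int step = max_chunk_verts - (max_chunk_verts % 3);
    size_t n = 0;

    for (int first = 0; first < total_verts && n < out_cap; first += step) {
        int remaining = total_verts - first;
        out[n].first = first;
        out[n].count = remaining < step ? remaining : step;
        n++;
    }
    return n;
}
```

For each slice you would then call glLockArraysEXT(first, count), loop over the materials calling glDrawArrays(GL_TRIANGLES, first, count) once per material, and finish with glUnlockArraysEXT(). Whether that is actually faster depends on the driver, so time it against drawing the whole array.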
Will I benefit from dividing the triangle soup into smaller chunks of, let’s say, 1000 vertices, then doing LockArrays and drawing each sub-chunk several times, instead of drawing the complete soup multiple times? The data will not change between draws.
So instead of drawing 5x10000 I will draw 5x10x1000.
>>Will I benefit from dividing the triangle soup into smaller chunks of, let’s say, 1000 vertices, then doing LockArrays and drawing each sub-chunk several times<<
Yes.
Display lists and vertex arrays need to be kept under a certain size limit.
- E.g., from memory, in my testing on my TNT2 a display list of 65 KB ran at about half the speed of one of 63 KB.
- Also check the draw range elements extension (or something like that). You can do a glGet…() to get the maximum recommended number of vertices/indices to pass in a vertex array call. Going above this limit results in a speed loss (sometimes quite a bit).
IIRC the TNT2’s limit was 4096 for both.
DrawArrays! So there are no shared vertices.
Try DrawElements.
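If the soup does contain repeated vertices, one way to get an index list is to merge bit-identical vertices. The sketch below is only an illustration: Vec3 and build_indices() are made-up names, it looks at positions only, and the linear search is O(n^2), so a large mesh would want a hash table instead.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical vertex type: position only, to keep the sketch short. */
typedef struct { float x, y, z; } Vec3;

/* Build an index list for a triangle soup by merging bit-identical vertices.
   soup    : n input vertices (3 per triangle)
   unique  : receives the distinct vertices, must have room for n entries
   indices : receives n indices into unique
   Returns the number of distinct vertices. */
unsigned build_indices(const Vec3 *soup, unsigned n,
                       Vec3 *unique, unsigned short *indices)
{
    assert(n <= 65536u);  /* every index must fit in an unsigned short */

    unsigned n_unique = 0;
    for (unsigned i = 0; i < n; i++) {
        /* Linear search for a vertex with exactly the same bits. */
        unsigned j = 0;
        while (j < n_unique &&
               memcmp(&unique[j], &soup[i], sizeof(Vec3)) != 0)
            j++;
        if (j == n_unique)
            unique[n_unique++] = soup[i];  /* first time we see this vertex */
        indices[i] = (unsigned short)j;
    }
    return n_unique;
}
```

The result can be drawn with glDrawElements(GL_TRIANGLES, n, GL_UNSIGNED_SHORT, indices) with the vertex pointer set to the unique array. If normals or texture coordinates differ at a shared position, those vertices cannot be merged and the saving shrinks.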
Also try the Quake 3 format, IIRC 4fV 2fT 2fT 4ubC. Even though NVIDIA said other formats would work (at the start this was the only one that did), in my testing a year or so ago I still found the Quake 3 format gave the best results.
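Spelled out as a C struct, that format might look like the sketch below. This assumes the four attributes are interleaved in one array, which the shorthand does not actually say (they could just as well be four separate arrays), and the field names are invented.

```c
#include <assert.h>
#include <stddef.h>

/* One vertex in the "4fV 2fT 2fT 4ubC" order, assumed interleaved. */
typedef struct {
    float         position[4];   /* 4fV : x, y, z, w          */
    float         texcoord0[2];  /* 2fT : first texture unit  */
    float         texcoord1[2];  /* 2fT : second texture unit */
    unsigned char color[4];      /* 4ubC: r, g, b, a          */
} PackedVertex;

/* Byte stride to pass to the gl*Pointer calls for this layout. */
size_t packed_vertex_stride(void)
{
    assert(sizeof(float) == 4);  /* the layout assumes 32-bit floats */
    return sizeof(PackedVertex);
}
```

With 32-bit floats this is 36 bytes per vertex. The stride and the field offsets would go to glVertexPointer(4, GL_FLOAT, ...), glTexCoordPointer(2, GL_FLOAT, ...) and glColorPointer(4, GL_UNSIGNED_BYTE, ...).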
Also check the oldish performance PDFs on the NVIDIA website, or something along those lines.
LockArrays will help on non-T&L cards, and maybe help a little bit on some T&L implementations (such as 3dlabs?) but it doesn’t help you on GeForce hardware, AFAIK.
Also, are you sure that you’re vertex throughput bound? If you are, then perhaps making the vertices smaller (shorts instead of floats) will make it go faster. If you’re not vertex throughput bound, then the size of your window will have a great impact on speed.
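To make the shorts idea concrete: three floats are 12 bytes per position, three shorts are 6. Below is a minimal sketch of the conversion, assuming all coordinates fit in a known range [-extent, +extent]. The function names are hypothetical, and whether the driver has a fast path for GL_SHORT positions is something to benchmark.

```c
#include <assert.h>
#include <stdint.h>

/* Map a float in [-extent, +extent] to a signed 16-bit integer.
   Values outside the range are clamped. */
int16_t quantize_coord(float value, float extent)
{
    assert(extent > 0.0f);
    float scaled = value / extent * 32767.0f;
    if (scaled >  32767.0f) scaled =  32767.0f;
    if (scaled < -32767.0f) scaled = -32767.0f;
    /* Round to nearest by adding half a unit before truncating. */
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Scale factor to apply at draw time (for example with glScalef)
   so the quantized geometry keeps its original size. */
float dequantize_scale(float extent)
{
    assert(extent > 0.0f);
    return extent / 32767.0f;
}
```

The array would then be submitted with glVertexPointer(3, GL_SHORT, ...) and the scale folded into the modelview matrix. Precision drops to extent / 32767 per step, so this only makes sense if that is fine for the model.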
Anyway, for the best vertex throughput, you have to use GL_NV_vertex_array_range on nVIDIA hardware. Them’s just the ropes.
LockArrays is still better than not using it. (It lets you get the same benefits as DrawRangeElements, DRE.) The only reason to use LockArrays over DRE is if you plan to render multiple passes with exactly the same arrays, i.e., multiple DrawElements calls within a single lock.
As you can see from my explanation, I am using it for multiple passes within the same lock. So why don’t I get any speedup? Is it because I use vertex weights?
If I send the same array data multiple times, surely it must be possible to use AGP memory or similar to accelerate it?
I believe it is vital to be able to accelerate DrawArrays in cases where you cannot use indexed geometry but want to do multipass rendering or send the same data multiple times!
Then what can I do to get higher speed if I want to resend the same buffers multiple times but with different model matrices? I am not able to use indexed geometry…