Originally posted by maximian:
[b]Can you explain what you mean by “due to the cache”?
How can I manipulate my data so that it fits in the cache? Unfortunately, this data is derived from a surface scan, and there is little if any repetition.
In reference to drawing the triangle strip w/ too many vertices, I am not sure what you mean. In immediate mode, using triangle strips speeds up drawing by anywhere from 60% to 100%.
In VBO mode it does not improve performance!
[/b]
The vertex cache remembers the post-transform results of the last N (N=16, 24, 48, etc…) vertices, saving the memory fetch, transform, and lighting time whenever one of those remembered vertices is repeated.
Optimizing means trying to sort the triangles so that each shared vertex is reused while it is still in that cache, which keeps the number of cache misses as low as possible. There’s little hope of fitting everything in such a small cache, but the sort can help a lot with typical meshes that have most vertices shared by 3 to 6 triangles. There’s a free mesh optimizer from Nvidia that does this work for you, btw, even on pre-existing meshes.
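To make the effect of triangle order concrete, here is a rough sketch in Python that counts cache misses for an index list. It assumes a simple FIFO cache of 16 entries; real hardware may use a different size or replacement policy, and the names `count_cache_misses` and `quad_indices` are made up for this illustration. The test mesh is a hypothetical grid of 100 x 2 quads (303 vertices), drawn once row by row and once column by column.

```python
from collections import deque


def count_cache_misses(indices, cache_size=16):
    """Count how many indices miss an assumed FIFO post-transform cache."""
    cache = deque(maxlen=cache_size)  # the oldest entry is dropped automatically
    misses = 0
    for idx in indices:
        if idx not in cache:
            misses += 1          # this vertex has to be fetched and transformed
            cache.append(idx)
    return misses


def quad_indices(r, c, stride):
    """Two indexed triangles covering the quad at row r, column c."""
    v0 = r * stride + c
    v1 = v0 + 1
    v2 = v0 + stride
    v3 = v2 + 1
    return [v0, v2, v1, v1, v2, v3]


cols, rows = 100, 2
stride = cols + 1  # vertices per grid row

# Same triangles, two different submission orders.
row_major = [i for r in range(rows) for c in range(cols)
             for i in quad_indices(r, c, stride)]
col_major = [i for c in range(cols) for r in range(rows)
             for i in quad_indices(r, c, stride)]

print(count_cache_misses(row_major))  # 404: the middle row of vertices is transformed twice
print(count_cache_misses(col_major))  # 303: every vertex is transformed exactly once
```

Both orders draw the same 400 triangles. In the row-by-row order the shared middle row of vertices has already left the cache by the time the second row of quads needs it, so those vertices are transformed again. A mesh optimizer looks for an order closer to the second one.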
I don’t know whether the vertex cache does anything for non-indexed data these days, but it’s possible. Either way, the triangle strip is a primitive that’s designed to implicitly reuse vertices so caching wouldn’t add much unless your strips share other vertices too (as in the case of a mesh).
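The implicit reuse in a strip can be sketched by expanding a strip back into independent triangles. This is only an illustration (the function name `strip_to_triangles` is made up); it follows the usual OpenGL convention that every other triangle swaps its first two vertices so the winding stays consistent.

```python
def strip_to_triangles(strip):
    """Expand a triangle strip into a list of independent triangles.

    Triangle i uses strip vertices i, i+1, i+2. On odd triangles the first
    two vertices are swapped so that all triangles keep the same winding.
    """
    tris = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        tris.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return tris


strip = [0, 1, 2, 3, 4, 5]
print(strip_to_triangles(strip))
# [(0, 1, 2), (2, 1, 3), (2, 3, 4), (4, 3, 5)]
```

Six vertices in the strip produce four triangles, where an unindexed triangle list would need twelve. Each new triangle already reuses the previous two vertices, which is why a cache has little left to save inside a single strip.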
Anyway, if your data is in system memory or is sent in immediate mode without any nice AGP-mem buffering by the driver, then strips would be much faster: fewer glVertex calls and less data to transmit. But the difference between optimized indexed triangles and strips might be small once the data transfer or API calls are no longer the bottlenecks. Then it might come down to caching behavior or time to fetch the indices.
What I meant by “drawing too many vertices” is a common problem with triangle strips. The ‘count’ parameter is the number of vertices, which starts with 3 for the first triangle and adds 1 for each additional triangle (basically, v = numTri + 2). Some people try to use v = numTri * 3 or numTri * 2 or somesuch, meaning they’re rendering extra verts that don’t always show up as garbage on screen but do take time to transform.
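The count arithmetic above, as a small Python sketch. The helper names are made up for illustration, and it assumes the value ends up as the ‘count’ argument of a strip draw call such as glDrawArrays with GL_TRIANGLE_STRIP.

```python
def strip_vertex_count(num_triangles):
    """Vertices needed to draw num_triangles as one strip: 3 for the
    first triangle, plus 1 for each additional triangle."""
    if num_triangles < 1:
        return 0
    return num_triangles + 2


def wasted_vertices(num_triangles, count_passed):
    """Extra vertices transformed when too large a count is passed."""
    return max(0, count_passed - strip_vertex_count(num_triangles))


num_tri = 1000
print(strip_vertex_count(num_tri))            # 1002
print(wasted_vertices(num_tri, num_tri * 3))  # 1998 extra vertices
print(wasted_vertices(num_tri, num_tri * 2))  # 998 extra vertices
```

With the numTri * 3 mistake, a 1000-triangle strip asks for nearly three times the vertices it needs, and the extra ones are read from whatever follows the real data in the array.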
Avi