What is the optimal way to store/submit your vertex data?

Clearly immediate mode is not the right way to submit vertices, so it doesn’t really need mentioning.

A lot of DirectX 8 documentation (it’s running on the same graphics cards as GL, so it can still be applicable) says that in the optimal case a program would use two Vertex Buffers for storing all its data, one for static data and one for dynamic data. Is this accurate for OpenGL (vertex arrays) as well? Also, I wonder if they mean that this situation is optimal in the same way that a scene will optimally use only a single texture (even if no real scene would ever meet that limitation).

It seems to me that it would be pretty easy to deal with the “dynamic” vertex buffer, since it gets filled from the start every frame. Basically instead of using immediate mode calls you put the equivalent data into the vertex array memory and then make a draw call when you are done. If, however, you need to use vertices that have a different format, won’t you have to make another call to glVertexPointer/glInterleavedArrays/whatever to handle this change? Is there any benefit to storing vertices with different formats in the same memory buffer if you are going to have to respecify the arrays when switching formats? Maybe it’s better to have one dynamic array for each vertex format you use, and to minimize switches between them.
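To make the “fill it from the start every frame” idea concrete, here is a minimal sketch in plain C. It is not taken from any driver documentation: `VertexPT`, `DynamicBatch`, `batch_init`, `batch_push`, `batch_flush` and `count_submit` are hypothetical names, and the submit callback only stands in for the place where a real renderer would set its array pointers (glVertexPointer, glTexCoordPointer) and call glDrawArrays. One such batch per vertex format would match the “one dynamic array per format” idea.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vertex format: position plus one texture coordinate set. */
typedef struct { float x, y, z; float u, v; } VertexPT;

/* Called with the accumulated vertices; a real renderer would set the
   array pointers and issue the draw call here (assumption). */
typedef void (*SubmitFn)(const VertexPT *verts, size_t count, void *user);

/* A multiple of 3, so a full batch never splits a triangle in a list. */
enum { BATCH_CAPACITY = 3 * 256 };

typedef struct {
    VertexPT verts[BATCH_CAPACITY];
    size_t   count;   /* vertices written since the last flush */
    SubmitFn submit;
    void    *user;
} DynamicBatch;

static void batch_init(DynamicBatch *b, SubmitFn submit, void *user) {
    b->count = 0;
    b->submit = submit;
    b->user = user;
}

/* Submit whatever has accumulated, then start filling from the beginning. */
static void batch_flush(DynamicBatch *b) {
    if (b->count > 0) b->submit(b->verts, b->count, b->user);
    b->count = 0;
}

/* Stand-in for an immediate-mode vertex call: append, flushing when full. */
static void batch_push(DynamicBatch *b, VertexPT v) {
    if (b->count == BATCH_CAPACITY) batch_flush(b);
    b->verts[b->count++] = v;
}

/* Example callback that only counts vertices instead of drawing them. */
static void count_submit(const VertexPT *verts, size_t count, void *user) {
    (void)verts;
    *(size_t *)user += count;
}
```

Each frame the application would call `batch_push` wherever it used to emit a vertex in immediate mode and `batch_flush` once at the end, or whenever render state has to change.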

As for static geometry there are more issues. It still seems silly to store vertices with different formats in the same buffer when you are going to have to effectively “switch buffers” when switching formats. If you have static geometry such that you also store an index array, then this seems to form another roadblock in the “put everything in one buffer” plan.

If a model consists of some vertex data and an index array, and we put that vertex data somewhere in a large vertex array then we have two choices of how to handle drawing. We can leave the current vertex array pointing at the start of the large buffer and add a constant to each index from our index array data to deal with the offset, or we can “switch” the vertex array pointer when we need to draw the object for which we have index array data. Once you are in that situation it seems like you might as well have each “model” use a different vertex array. The only benefit I could see in this situation to storing all static vertex data in one buffer is that the program could allocate a large pool and use its own memory manager to decide what will fit in the static data pool.

Beyond this, even, is the issue of using display lists. If you are already drawing using vertex arrays, is it worth it to create display lists for frequently drawn objects? Or will display lists just waste memory since the vertex data might already be on the card in optimized form?
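The two ways of drawing an indexed model out of a shared vertex buffer can be sketched in plain C as follows. Both helpers are hypothetical (`rebase_indices` and `model_base_pointer` are names made up for this illustration), and 16-bit indices are an assumption. The first adds the model's starting vertex to every index once, at load time; the second computes the address you would hand to the array pointer call when “switching” to that model.

```c
#include <assert.h>
#include <stddef.h>

/* Option 1: keep the array pointer at the start of the shared buffer and
   rebase the model's indices by the position of its first vertex.
   Returns 0 if a rebased index would not fit in 16 bits; in that case
   dst may have been partially written. */
static int rebase_indices(const unsigned short *src, size_t count,
                          unsigned int base_vertex, unsigned short *dst) {
    size_t i;
    if (base_vertex > 0xFFFFu) return 0;
    for (i = 0; i < count; ++i) {
        unsigned int idx = (unsigned int)src[i] + base_vertex;
        if (idx > 0xFFFFu) return 0;
        dst[i] = (unsigned short)idx;
    }
    return 1;
}

/* Option 2: leave the indices alone and point the vertex array at the
   model's first vertex. stride is the size in bytes of one vertex. */
static const void *model_base_pointer(const void *buffer,
                                      size_t first_vertex, size_t stride) {
    return (const unsigned char *)buffer + first_vertex * stride;
}
```

Option 1 costs one pass over the indices when the model is loaded and limits the shared buffer to 65536 vertices with 16-bit indices; option 2 keeps the original index data untouched at the cost of respecifying the pointer per model.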

<I know this is rambling on, but I’ve nearly finished all my questions>
Finally, are there standard (not nVidia proprietary, for instance) ways to allocate AGP and video memory, to use “vertex array range”-type functionality or possibly even to get the synchronization model of NV_fence? I’d like to know the best path for getting vertices to the screen that will work on the widest range of hardware and won’t be heavily dependent on features of one board or another.

Thank you to anybody from ATI and nVidia who takes the time to read this and reply.

D3D has lots of silly restrictions that MS forces on people. There is no real reason for many of these restrictions. Don’t pay attention to what D3D says.

It is best to allocate one chunk of AGP memory with VAR and manage it in your application. You can allocate blocks of memory manually for static data, and you can use fence and a circular buffer for dynamic data.
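One possible arrangement of the circular buffer for dynamic data is sketched below in plain C. This is only an illustration under stated assumptions, not the layout the post has in mind: the buffer is split into a fixed number of equal segments, each guarded by one fence, and the fence is simulated with a flag so the sketch runs without a GL context. `RingAllocator`, `ring_init`, `ring_alloc`, `RING_FAIL` and the `waits` counter are hypothetical; the comments mark where glSetFenceNV and glFinishFenceNV from NV_fence would be called in a real program.

```c
#include <assert.h>
#include <stddef.h>

enum { SEGMENTS = 4 };
#define RING_FAIL ((size_t)-1)

typedef struct {
    size_t   segment_size;        /* bytes per segment */
    size_t   current;             /* segment currently being filled */
    size_t   used;                /* bytes handed out in that segment */
    int      in_flight[SEGMENTS]; /* 1 while the GPU may still read it */
    unsigned waits;               /* times we would have blocked on a fence */
} RingAllocator;

static void ring_init(RingAllocator *r, size_t segment_size) {
    size_t i;
    r->segment_size = segment_size;
    r->current = 0;
    r->used = 0;
    r->waits = 0;
    for (i = 0; i < SEGMENTS; ++i) r->in_flight[i] = 0;
}

/* Returns a byte offset into the AGP block, or RING_FAIL if the request
   is larger than one segment. */
static size_t ring_alloc(RingAllocator *r, size_t bytes) {
    size_t offset;
    if (bytes > r->segment_size) return RING_FAIL;
    if (r->used + bytes > r->segment_size) {
        /* Done with this segment: a real program would set its fence here,
           after the draw calls that read from it (glSetFenceNV). */
        r->in_flight[r->current] = 1;
        r->current = (r->current + 1) % SEGMENTS;
        if (r->in_flight[r->current]) {
            /* The GPU may still be reading the segment we are about to
               overwrite: a real program would block in glFinishFenceNV. */
            r->waits++;
            r->in_flight[r->current] = 0;
        }
        r->used = 0;
    }
    offset = r->current * r->segment_size + r->used;
    r->used += bytes;
    return offset;
}
```

With enough segments the CPU is normally several segments ahead of the GPU, so the wait is rarely taken; static data would live in a separate, manually managed part of the same AGP block.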

There is absolutely no reason to restrict yourself arbitrarily to just one vertex format. That is a stupid D3D restriction, from what I understand.

Yes, every time you render, you will probably want to change your pointers. This is not a problem, and I don’t see why you would think that it would be.

VAR/fence enables performance significantly better than what you can get with DX7, and still moderately better than DX8.

  • Matt

Thanks for that help. I was worried that changing vertex array pointers would be a “heavy” state change… but since it’s a client state there’s no reason it would be.

But VAR and fence will only work on nVidia cards… Is there anything I can do on systems that don’t have these extensions?

Originally posted by mcraighead:
VAR/fence enables performance significantly better than what you can get with DX7, and still moderately better than DX8.

Why then do I get 17 Million Tris/sec in the 3DMark 2001 T&L test and about 12 M Tris/sec in the VAR demo? Both have 1 light.

Too many factors to say. Obviously there’s some difference in the apps that I don’t know about.

I know of a number of problems with the 3DMark 2000 “high polygon” tests that make them extremely unreliable measures of polygon rates. I don’t know if those problems carry over to 2001, but I would argue that the 3DMark 2000 benchmarks were mostly flawed.

  • Matt