How to get rid of client arrays

CatDog · April 23, 2007, 2:22pm

Huh,

there’s this very interesting thread about the new object model, where Michael Gold pondered dropping client-side array support in Longs Peak. As I understood it, at least mixing client arrays with buffered arrays will not work anymore.

Now, well, that is exactly what I did. I’ve got vertices, normals, texture coords in VBOs, but color arrays are on the client side.

I’m doing this because I need to change the vertex colors dynamically and in real time. A per-vertex parameter (e.g. a force or some other value that can be scaled to a color table) is visualized on the surface of the meshes.
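
For illustration only, here is a minimal sketch of scaling a per-vertex value to a color table. The original post does not show its mapping, so the `Rgba8` layout, the three-entry `kColorTable`, and the `scalarToColor` function are all hypothetical; a real table would likely have many more entries.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 4-byte RGBA color, one per vertex.
struct Rgba8 { std::uint8_t r, g, b, a; };

// Hypothetical color table: blue (low), green (middle), red (high).
constexpr std::array<Rgba8, 3> kColorTable = {{
    {0, 0, 255, 255}, {0, 255, 0, 255}, {255, 0, 0, 255}
}};

// Map a scalar in [lo, hi] to the nearest table entry.
// Values outside the range are clamped to the first or last entry.
inline Rgba8 scalarToColor(float value, float lo, float hi) {
    assert(hi > lo);
    float t = (value - lo) / (hi - lo);
    t = std::max(0.0f, std::min(1.0f, t));
    const std::size_t idx =
        static_cast<std::size_t>(t * (kColorTable.size() - 1) + 0.5f);
    return kColorTable[idx];
}
```

A worker thread could call something like `scalarToColor(force, minForce, maxForce)` per vertex and write the result into the color array.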

These meshes are huge. Several million vertices. So I decided to leave the color arrays on the client side.

What would be the solution for this problem without client side color arrays?

Thanks a lot!

CatDog

Lord_crc · April 23, 2007, 4:24pm

I’m no expert, but couldn’t you use a VBO and use glMapBuffer to modify it directly?

Humus · April 23, 2007, 6:12pm

Yes. That’ll likely be faster.

CatDog · April 24, 2007, 5:57am

Originally posted by Lord crc:
“I’m no expert, but couldn’t you use a VBO and use glMapBuffer to modify it directly?”

This would be a tremendously difficult task. The colors are updated in random order over all meshes. It’s an asynchronous operation. When the next frame is rendered, I’m not waiting for the color calculation to finish. In fact, since the color update is not always a realtime operation, the arrays get updated from another thread that does all calculations and writes the colors to the arrays.

At the moment, this is not a problem: I can change any color at any time.

With MapBuffer, I would have to map and unmap the arrays thousands of times per frame, since mapping only works on the currently bound array (or am I wrong here?).

I still think that, without client arrays, I would be forced to maintain my color arrays on the client side just as before. But on each frame, I would need to copy (BufferSubData) the whole lot to the VBOs.

To me, this looks like a deterioration, but I might be wrong. If, for example, the driver does exactly the same thing (copies all client arrays to a temporary VBO before rendering), then there isn’t any drawback. But I don’t know what’s happening behind the scenes, which is the reason I’m asking here. Maybe someone can clarify this for me?

CatDog

Cyranose · April 24, 2007, 7:45am

“I still think, without client arrays, I would be forced to maintain my color arrays on the client side just as before. But on each frame, I needed to copy (BufferSubData) the whole stuff to the VBOs.”

Yes, that’s the simplest way to “get rid of” client arrays with minimal changes to your app.

This may not be immediately faster than old client arrays, but it shouldn’t be any worse. AFAIK, client arrays are indeed copied or streamed to videomem on most cards (not sure about cards that map main memory as pseudo-videomem).

This change lets LP simplify the driver model, which, from what I understand, may speed things up overall. At the very least, it should help make better drivers.

One immediate potential upside for you is that if your colors don’t change every frame, using a STATIC type VBO could actually make it faster.

And in your case, your non-color portions should probably be STATIC anyway, i.e., don’t interleave colors with the other vertex data, though they could (for most HW) all be in one giant VBO.

If only some broad sub-range of your colors changes, then you can potentially save some bus traffic too. You may want to keep a min/max changed color and only load the changed region(s).
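
A minimal sketch of the min/max idea, assuming 4-byte RGBA colors kept in a CPU-side array. The `DirtyColorArray` class and the `UploadFn` callback are hypothetical names; the callback stands in for a call such as `glBufferSubData(GL_ARRAY_BUFFER, offset, size, data)` so the sketch compiles without a GL context. It is not thread-safe as written, so a writer thread would need a lock or a hand-off step around it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Stand-in for the real upload, e.g.
//   glBufferSubData(GL_ARRAY_BUFFER, byteOffset, byteCount, data);
using UploadFn = std::function<void(std::size_t byteOffset,
                                    std::size_t byteCount,
                                    const void* data)>;

// Hypothetical helper: CPU-side RGBA colors plus one [lo, hi) dirty range.
class DirtyColorArray {
public:
    explicit DirtyColorArray(std::size_t vertexCount)
        : colors_(vertexCount * 4, 0), lo_(vertexCount), hi_(0) {}

    std::size_t vertexCount() const { return colors_.size() / 4; }
    bool dirty() const { return lo_ < hi_; }

    // Write one color and widen the dirty range to include this vertex.
    void setColor(std::size_t vertex, std::uint8_t r, std::uint8_t g,
                  std::uint8_t b, std::uint8_t a) {
        assert(vertex < vertexCount());
        std::uint8_t* p = &colors_[vertex * 4];
        p[0] = r; p[1] = g; p[2] = b; p[3] = a;
        lo_ = std::min(lo_, vertex);
        hi_ = std::max(hi_, vertex + 1);
    }

    // Upload only the dirty range, then reset it. Returns bytes uploaded.
    std::size_t flush(const UploadFn& upload) {
        if (!dirty()) return 0;
        const std::size_t offset = lo_ * 4;
        const std::size_t count = (hi_ - lo_) * 4;
        upload(offset, count, colors_.data() + offset);
        lo_ = vertexCount();
        hi_ = 0;
        return count;
    }

private:
    std::vector<std::uint8_t> colors_;
    std::size_t lo_;  // first dirty vertex
    std::size_t hi_;  // one past the last dirty vertex
};
```

A single range is cheap to track but degrades to a full copy when the first and last vertices are both touched, which is likely with colors updated in random order.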

And you might want to do the BufferSubData a few ms before the draw call that uses the data, or, for example, during the initial screen clear or even right before the SwapBuffers (e.g., for next-frame’s color data). There may be some benefit to be gained from that sort of parallelism, though it’s not clear.

Michael_Gold · April 24, 2007, 9:40am

BufferSubData is a good way to go here. To illustrate the problem: GPUs typically cannot DMA from client arrays, and when you supply a mix of client arrays and VBOs the driver has to either copy the client array to a temporary internal array, or fall back to a slower path that doesn’t use DMA. Worse, since the GL has no way of knowing if you modified your client array since the last call, there is no way to cache any of it - the entire thing must be copied on every draw call.

Alternatively, if you use BufferSubData, especially if you can track the dirty regions and efficiently update only those which you have touched, the GL is more likely to use the fast path, and buffer updates are under your control. Even if you sometimes dirty the entire buffer, less data is transferred from client memory to the GL on average per draw call. And if you always dirty the entire buffer, and copy the entire array each time, you still have the win of staying on the GPU fast path.
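
One possible way to track several dirty regions is to split the buffer into fixed-size blocks and coalesce adjacent dirty blocks into runs. This is only a sketch under assumptions: the `DirtyBlocks` class, its method names, and the block size are hypothetical and not something described in the thread. It is also not thread-safe as written.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical tracker: marks fixed-size blocks of a buffer as dirty and
// reports them as coalesced (byteOffset, byteCount) runs.
class DirtyBlocks {
public:
    DirtyBlocks(std::size_t totalBytes, std::size_t blockBytes)
        : totalBytes_(totalBytes), blockBytes_(blockBytes),
          dirty_(blockCount(totalBytes, blockBytes), 0) {}

    // Mark every block overlapped by [offset, offset + count) as dirty.
    void markBytes(std::size_t offset, std::size_t count) {
        assert(offset + count <= totalBytes_);
        if (count == 0) return;
        const std::size_t last = (offset + count - 1) / blockBytes_;
        for (std::size_t b = offset / blockBytes_; b <= last; ++b)
            dirty_[b] = 1;
    }

    // Return one run per group of adjacent dirty blocks and clear all marks.
    std::vector<std::pair<std::size_t, std::size_t>> takeRanges() {
        std::vector<std::pair<std::size_t, std::size_t>> runs;
        std::size_t b = 0;
        while (b < dirty_.size()) {
            if (!dirty_[b]) { ++b; continue; }
            const std::size_t first = b;
            while (b < dirty_.size() && dirty_[b]) { dirty_[b] = 0; ++b; }
            const std::size_t begin = first * blockBytes_;
            const std::size_t end = std::min(b * blockBytes_, totalBytes_);
            runs.emplace_back(begin, end - begin);
        }
        return runs;
    }

private:
    static std::size_t blockCount(std::size_t total, std::size_t block) {
        assert(block > 0);
        return (total + block - 1) / block;
    }

    std::size_t totalBytes_;
    std::size_t blockBytes_;
    std::vector<char> dirty_;  // one flag per block
};
```

Each returned run would then become one upload, for example `glBufferSubData(GL_ARRAY_BUFFER, run.first, run.second, colors + run.first)`, where `colors` is an assumed pointer to the CPU-side copy. Larger blocks mean fewer calls but more bytes copied; the right size would have to be measured.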

One of the problems with the current API is cases like this: what seems simpler often hides the complexity which exists under the covers. One of the goals of Longs Peak is to eliminate these “false promises”.

CatDog · April 24, 2007, 10:23am

“One of the goals of Longs Peak is to eliminate these ‘false promises’.”

Keep at it, you’re definitely on the right track!

Thanks to all, you helped me a lot!

CatDog