which API calls are asynchronous? (maximum throughput)

when using immediate mode calls (i.e. glNormal*, glTexCoord*, glVertex*) i am working under the assumption that a call is non-blocking unless the GPU has not yet finished with the previous call. is this a correct assumption for modern drivers?

i know that the various buffered calls (VBO, VAR, display lists, etc) are effectively asynchronous until certain other gl calls are made, but i don’t know how this applies to immediate mode calls.

for example, in a fill-bound application (e.g. volume rendering) where the geometry count is low, is it better to spend a little CPU time computing expensive vertices (transforming, interpolating, clipping, etc) and drawing them immediately, knowing that you can compute the next vertex while a triangle is rasterizing, or is it better to compute the few 100-1000 verts at once, dispatch them all at once with VBO, VAR, etc and while all that drawing occurs in the background, compute the next set of verts?

it is ultimately a question of maximizing CPU/GPU utilization, so they are both balanced in their load. i can’t tell though whether the CPU is fast enough to finish the next polygon and is blocking waiting for the last one to draw, or whether the GPU finishes drawing and is waiting for the CPU to issue the next polygon.
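
One rough way to reason about this balance is a toy timing model. The sketch below is an illustration under stated assumptions, not a measurement: `interleaved_ms` and `batched_ms` are hypothetical helpers, every polygon is assumed to cost the same fixed CPU and GPU time, and any buffering the driver does for immediate mode calls is ignored.

```c
/* Toy model (assumption): each polygon costs cpu_ms to compute and gpu_ms
   to rasterize, and the GPU can only start a polygon after the CPU has
   submitted it. Real drivers buffer commands, which this ignores. */

/* Interleaved: polygon i is handed to the GPU as soon as it is computed. */
double interleaved_ms(int n, double cpu_ms, double gpu_ms)
{
    double cpu_done = 0.0;  /* time at which the CPU finished the current polygon */
    double gpu_done = 0.0;  /* time at which the GPU finished the previous polygon */
    for (int i = 0; i < n; ++i) {
        cpu_done += cpu_ms;
        /* The GPU starts when both the polygon is ready and the GPU is free. */
        double start = cpu_done > gpu_done ? cpu_done : gpu_done;
        gpu_done = start + gpu_ms;
    }
    return gpu_done;
}

/* Batched, with no overlap: the GPU starts only after all n polygons exist. */
double batched_ms(int n, double cpu_ms, double gpu_ms)
{
    return n * cpu_ms + n * gpu_ms;
}
```

With 100 polygons at 0.01 ms of CPU and 1 ms of GPU time each, this model gives about 100.01 ms interleaved versus 101 ms for a single non-overlapped batch. If the next batch is computed while the current one draws, steady-state throughput in this model is limited by the slower side in both schemes, so the difference is mainly in how soon the GPU starts working.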

Typically on PCs, calls will block when a second swapbuffers is issued while an earlier one is still waiting for completion. I usually block before then because I’d rather reduce the latency, but each to their own. Some implementations may block on the first graphics call issued after the second swap, which can be better behaved w.r.t. CPU usage in a lot of situations.

Other things may cause graphics to block: readbacks (e.g. glReadPixels) certainly, but possibly stranger things too. None of this is set in stone, and there are many implementations out there.

If you want to do some CPU processing and you’d like to ensure something gets issued to keep graphics busy then a glFlush may be a good idea (not to be confused with a glFinish).
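
To make the glFlush/glFinish distinction concrete: glFlush only asks the implementation to start executing queued commands, while glFinish does not return until they have completed. That makes glFinish usable as a crude diagnostic for which side is waiting. The sketch below is only a heuristic under assumptions: `gpu_wait_fraction`, `measure_frame`, `now_ms`, `submit_frame` and the `USE_OPENGL` macro are hypothetical names, the `<GL/gl.h>` path and POSIX `clock_gettime` assume a Linux-style setup, and a driver that blocks inside the submission calls will skew the result.

```c
/* Fraction of the measured time the CPU spent waiting for the GPU to drain.
   Near 1: the GPU still had a backlog when submission ended.
   Near 0: the GPU kept pace with the CPU. */
double gpu_wait_fraction(double submit_ms, double finish_ms)
{
    double total = submit_ms + finish_ms;
    return total > 0.0 ? finish_ms / total : 0.0;
}

#ifdef USE_OPENGL  /* define only where GL headers and a current context exist */
#include <GL/gl.h>
#include <time.h>

static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1.0e6;
}

/* submit_frame is a hypothetical application callback that issues the
   frame's GL calls (immediate mode or otherwise). */
double measure_frame(void (*submit_frame)(void))
{
    double t0 = now_ms();
    submit_frame();   /* CPU work plus command submission */
    double t1 = now_ms();
    glFinish();       /* blocks until all issued commands have completed */
    double t2 = now_ms();
    return gpu_wait_fraction(t1 - t0, t2 - t1);
}
#endif
```

This is for profiling only. Calling glFinish every frame in a shipping renderer serializes the CPU and GPU, which is the opposite of what the glFlush suggestion is after.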

Originally posted by codemonkey76:
for example, in a fill-bound application (e.g. volume rendering) where the geometry count is low, is it better to spend a little CPU time computing expensive vertices (transforming, interpolating, clipping, etc) and drawing them immediately, knowing that you can compute the next vertex while a triangle is rasterizing, or is it better to compute the few 100-1000 verts at once, dispatch them all at once with VBO, VAR, etc and while all that drawing occurs in the background, compute the next set of verts?

The first. If you know that you’re completely fillrate bound and low geometry, start the rasterizer as early as possible.
As dorbie said, you can even go as far as sending an early glFlush after some polygons to keep the implementation from buffering these commands. (Don’t overdo it, though.)
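
A minimal sketch of that idea, assuming triangles are computed one at a time on the CPU: `flush_due`, `draw_polygons`, `compute_triangle` and the `USE_OPENGL` macro are hypothetical names, and the flush interval is something to tune by measurement rather than a recommended value. The GL calls themselves (glBegin, glVertex3fv, glEnd, glFlush) are the standard immediate mode entry points.

```c
#include <stddef.h>

/* Returns 1 when a flush should follow the polygon just submitted.
   An interval of 0 disables the early flushes. */
int flush_due(size_t submitted, size_t interval)
{
    return interval != 0 && submitted % interval == 0;
}

#ifdef USE_OPENGL  /* define only where GL headers and a current context exist */
#include <GL/gl.h>

/* compute_triangle is a hypothetical application callback that writes
   three xyz vertices into tri[9]. */
void draw_polygons(size_t count, size_t interval,
                   void (*compute_triangle)(size_t i, GLfloat tri[9]))
{
    GLfloat tri[9];
    for (size_t i = 0; i < count; ++i) {
        compute_triangle(i, tri);   /* the expensive CPU work */
        glBegin(GL_TRIANGLES);
        glVertex3fv(&tri[0]);
        glVertex3fv(&tri[3]);
        glVertex3fv(&tri[6]);
        glEnd();
        if (flush_due(i + 1, interval))
            glFlush();              /* ask the driver to start on what is queued */
    }
    glFlush();                      /* hand over whatever is still buffered */
}
#endif
```

Flushing after every polygon would likely cost more in driver overhead than it gains, so the interval is worth keeping coarse, for example every few dozen polygons, and checking against actual frame times.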