Hello everyone,
I have got an interesting question for the OpenGL gurus out there.
When I make a call to glDispatchCompute, and then I execute other commands, the OpenGL pipeline queues those commands sequentially. Does that mean that on the GPU the command after glDispatchCompute waits for glDispatchCompute to finish before executing?
I’ll clarify my question: when the CPU pushes commands to the GL pipeline, those commands are not necessarily executed immediately, as the GPU may be running a little behind the CPU. This is called the CPU-GPU async model.
At the same time, when I queue two rendering commands, I know that the GPU executes them one after the other.
Now let’s take an example.
I rendered all my geometry and I want to do a bunch of post process passes.
Let’s assume that I want to do a horizontal blur pass and then a vertical blur pass.
Pass 1: horizontal blur: pixel shader reads from the scene and writes the blurred result to texture object 1.
Pass 2: vertical blur: pixel shader reads from texture object 1 and writes to the final scene or another texture object.
Since the two passes are two different draw calls, I don’t have to put a wait between pass 1 and pass 2, because I know for a fact that the GPU won’t start any operation of pass 2 before pass 1 is done and texture object 1 has been completely written.
Now, can I assume the same with Compute Shaders and the rest of the GL pipeline?
In the same example as above, let’s assume that Pass 1 is a compute shader.
Pass 1: horizontal blur: compute shader reads from the scene and writes the blurred result to texture object 1.
Pass 2: vertical blur: pixel shader reads from texture object 1 and writes to the final scene or another texture object.
Do I have to put any sync/barrier or double buffering between the compute shader and the pixel shader? That is, will the GPU wait for the compute shader to finish before executing the pass 2 draw call, which reads from texture object 1 (the compute shader’s output)?
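To make the question concrete, here is a rough sketch of the call sequence I have in mind. From my reading of the spec, writes done by a shader through image stores are not automatically visible to later commands, and glMemoryBarrier with GL_TEXTURE_FETCH_BARRIER_BIT is the call that would sit between the dispatch and the draw, so that is the spot I am asking about. The blurPasses helper, its parameters, and the call-recording stand-ins are my own assumptions so that the snippet compiles without a GL context; in a real program these would be the actual GL 4.3 entry points with the same names.

```cpp
#include <string>
#include <vector>

// Stand-ins (assumption): these only record the call order so the sketch
// builds without a GL context. A real program gets the same names from its
// GL headers / loader instead.
using GLuint = unsigned int;
using GLint = int;
using GLsizei = int;
using GLenum = unsigned int;
using GLbitfield = unsigned int;
constexpr GLenum GL_TRIANGLES = 0x0004;
constexpr GLbitfield GL_TEXTURE_FETCH_BARRIER_BIT = 0x00000008;

std::vector<std::string> g_calls;  // recorded call names, in submission order

void glUseProgram(GLuint) { g_calls.push_back("glUseProgram"); }
void glDispatchCompute(GLuint, GLuint, GLuint) { g_calls.push_back("glDispatchCompute"); }
void glMemoryBarrier(GLbitfield) { g_calls.push_back("glMemoryBarrier"); }
void glDrawArrays(GLenum, GLint, GLsizei) { g_calls.push_back("glDrawArrays"); }

// Hypothetical helper: pass 1 as a compute dispatch, pass 2 as a draw call.
void blurPasses(GLuint computeProg, GLuint drawProg, GLuint groupsX, GLuint groupsY) {
    // Pass 1: horizontal blur, compute shader writes texture object 1.
    glUseProgram(computeProg);
    glDispatchCompute(groupsX, groupsY, 1);

    // The point in question: make the compute shader's image writes visible
    // to texture fetches issued by the commands that follow.
    glMemoryBarrier(GL_TEXTURE_FETCH_BARRIER_BIT);

    // Pass 2: vertical blur, pixel shader samples texture object 1.
    glUseProgram(drawProg);
    glDrawArrays(GL_TRIANGLES, 0, 3);  // single full-screen triangle (assumption)
}
```

Texture and image bindings are left out to keep it short. As far as I can tell the barrier is a GPU-side ordering command, so it should not stall the CPU the way glFinish would.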
The reason I ask is that the new OpenGL SuperBible, in its compute shader chapter, uses double buffers for the flocking example (a compute shader does the flocking on a uniform shader buffer that is then bound when drawing the geometry).
This is from the book:
“The flock positions and velocities need to be double buffered because we don’t want to partially update the position or velocity buffer while at the same time using them as a source for drawing commands”
But it seems to me that you could instead run the compute shader with the output buffer being the same as the input buffer (copy all the data to shared memory, do the calculation, write it back out), and then bind that single buffer for drawing.
Am I missing anything else?
Thanks in advance for any replies,
Paolo