Thread: glMultiDispatchComputeIndirect

  1. #1
    Junior Member Newbie
    Join Date
    Jul 2008
    Posts
    18

    glMultiDispatchComputeIndirect

    I would like to see a glMultiDispatchComputeIndirect entry point. This matters to me because my compute shader needs a global sync often. Currently the only way to get one is to call glDispatchComputeIndirect multiple times in a loop. The new procedure would issue those multiple dispatches from a single call, behaving like this:

    Code cpp:
    void glMultiDispatchComputeIndirect(GLintptr indirect, GLsizei computeCount) {
      for(GLsizei GLComputeInvocation = 0; GLComputeInvocation < computeCount; GLComputeInvocation++) {
        glUniform1i(locationof_GLComputeInvocation, GLComputeInvocation);
        glDispatchComputeIndirect(indirect);
      }
    }

    The same thing could be done for the non-indirect version. Some people might also like a three-dimensional compute count:

    Code cpp:
    void glMultiDispatchComputeIndirect(GLintptr indirect, GLsizei computeCountX, GLsizei computeCountY, GLsizei computeCountZ) {
      // Each loop is bounded by its own dimension's count.
      for(GLsizei GLComputeInvocationX = 0; GLComputeInvocationX < computeCountX; GLComputeInvocationX++) {
        for(GLsizei GLComputeInvocationY = 0; GLComputeInvocationY < computeCountY; GLComputeInvocationY++) {
          for(GLsizei GLComputeInvocationZ = 0; GLComputeInvocationZ < computeCountZ; GLComputeInvocationZ++) {
            glUniform3i(locationof_GLComputeInvocation, GLComputeInvocationX, GLComputeInvocationY, GLComputeInvocationZ);
            glDispatchComputeIndirect(indirect);
          }
        }
      }
    }
    The order of the dispatches may matter to some people, but the dispatches could also run in an undefined order, with each one given an invocation counter, or people could use atomic counters in their shaders.
    This procedure should allow some optimization in the driver/hardware, because they would know the same dispatch will be issued multiple times.
    Alternatively, a global sync could be added inside the compute shader, but I know that would be very difficult because it would require all workgroups to stay alive and communicate with each other.
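To illustrate the point about giving each dispatch a counter: a single linear invocation counter carries the same information as the three-dimensional index, so either form could be exposed. The sketch below shows one way to recover a 3D index from a linear counter. The names `InvocationIndex3` and `unflattenInvocation` are hypothetical and not part of OpenGL, and the "x varies fastest" layout is just an assumed convention.

```cpp
#include <cstdint>

// Hypothetical helper type, not part of OpenGL: a 3D invocation index.
struct InvocationIndex3 {
    std::uint32_t x, y, z;
};

// Map a linear counter in [0, countX * countY * countZ) to a 3D index,
// assuming x varies fastest, then y, then z.
// countX and countY must be nonzero.
inline InvocationIndex3 unflattenInvocation(std::uint32_t linear,
                                            std::uint32_t countX,
                                            std::uint32_t countY) {
    InvocationIndex3 idx;
    idx.x = linear % countX;
    idx.y = (linear / countX) % countY;
    idx.z = linear / (countX * countY);
    return idx;
}
```

With such a mapping, the 3D variant would not strictly need its own uniform layout: a shader or caller could derive the x/y/z values from one counter.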
    Last edited by Dark Photon; 06-20-2013 at 07:58 PM.

  2. #2
    Junior Member Newbie
    Join Date
    Jul 2008
    Posts
    18
    I think even just moving the loop into the driver instead of user code should save some kernel-user round trips.
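For comparison, here is a rough sketch of what the user-side loop looks like today. It assumes that each pass must see the previous pass's buffer or image writes, which in current OpenGL means issuing glMemoryBarrier between dispatches. `ComputeCalls` and `multiDispatchIndirect` are hypothetical names used only so the sketch compiles without a GL context. In real code the three callables would simply be glUniform1i, glDispatchComputeIndirect and glMemoryBarrier, and the barrier bits would be chosen to match how the shader writes its data (for example GL_SHADER_STORAGE_BARRIER_BIT for shader storage buffers).

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-in for the GL entry points used by the loop.
struct ComputeCalls {
    std::function<void(int, int)> uniform1i;              // wraps glUniform1i
    std::function<void(std::intptr_t)> dispatchIndirect;  // wraps glDispatchComputeIndirect
    std::function<void(std::uint32_t)> memoryBarrier;     // wraps glMemoryBarrier
};

// User-side emulation of the proposed call: one uniform update, one
// indirect dispatch and one memory barrier per pass.
inline void multiDispatchIndirect(const ComputeCalls& gl,
                                  int invocationLocation,
                                  std::intptr_t indirect,
                                  int computeCount,
                                  std::uint32_t barrierBits) {
    for (int i = 0; i < computeCount; ++i) {
        gl.uniform1i(invocationLocation, i);  // tell the shader which pass this is
        gl.dispatchIndirect(indirect);        // same indirect parameters every pass
        gl.memoryBarrier(barrierBits);        // make this pass's writes visible to the next
    }
}
```

Every pass costs three API calls from user code, which is the overhead a driver-side loop might be able to reduce.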
