Reinterpreting/binding GL_DRAW_INDIRECT_BUFFER as GL_ATOMIC_COUNTER_BUFFER?

Hello, I wanted to reinterpret and bind a single field of an indirect draw command (the instance count field) as an atomic counter uniform in a compute shader, as opposed to having a separate atomic counter buffer and copying its value into the draw command buffer (using glCopyBufferSubData).

When I tried this, my machine froze up and the graphics driver crashed. Is this an OpenGL limitation or a driver issue? The other alternative is to create an atomic counter buffer the size of a single indirect draw command, but this seems less than ideal, considering that OpenGL only requires a vendor to support a minimum of a single atomic integer for fragment/compute shaders, and that the other fields of the indirect draw command will only ever be initialized once and never modified again.

On a slightly unrelated note, what is really so special about GL_ATOMIC_COUNTER_BUFFER hardware-wise compared to using atomic operations on image/buffer type uniforms? Are atomic counters really more efficient than the latter, or is this just speculation about potential future hardware/compiler optimizations? Is it more to do with compiler code-gen optimizations than hardware circuitry? Aren't atomic counters still being accessed from global memory?

What you want to do should be doable. What graphics card do you use? Can you copy-paste some setup code? Did you properly use glMemoryBarrier to sync the atomic counter writes?

Regarding what’s special about atomic counters: some GPUs have special HW for atomic counters, which makes them far more efficient than an equivalent atomic operation on an image or storage buffer.

This definitely works, from an ordinary GLSL shader anyway, with the inconvenience that the shader writing the instance count needs to know the offset in the buffer at which to write the atomic (undesirable coupling). ARB_query_buffer_object looks interesting for avoiding all this monkey business while still avoiding a full CPU readback, but I haven’t actually tried it yet.
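The offset coupling mentioned above can be sketched roughly as follows. This is only an illustration, assuming glDrawArraysIndirect-style commands stored in a tightly packed array; the helper `instance_count_offset` is a hypothetical name, and the GL calls appear only in a comment because they need a live OpenGL 4.3 context that is not set up here.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>  /* uint32_t */

/* Field layout of one glDrawArraysIndirect command, as listed in the OpenGL spec. */
typedef struct {
    uint32_t count;
    uint32_t instanceCount;  /* the field to be written as an atomic counter */
    uint32_t first;
    uint32_t baseInstance;
} DrawArraysIndirectCommand;

/* Atomic counter buffer ranges must start at a multiple of 4 bytes. */
static_assert(sizeof(DrawArraysIndirectCommand) == 16, "four packed 32-bit fields");
static_assert(offsetof(DrawArraysIndirectCommand, instanceCount) % 4 == 0,
              "instanceCount must be 4-byte aligned");

/* Hypothetical helper: byte offset of instanceCount for command number `index`. */
static size_t instance_count_offset(size_t index)
{
    return index * sizeof(DrawArraysIndirectCommand)
         + offsetof(DrawArraysIndirectCommand, instanceCount);
}

/* Assumed usage with a live context, where `buf` holds the commands:
 *
 *   glBindBuffer(GL_DRAW_INDIRECT_BUFFER, buf);
 *   glBindBufferRange(GL_ATOMIC_COUNTER_BUFFER, 0, buf,
 *                     (GLintptr)instance_count_offset(0), sizeof(GLuint));
 *
 * and in GLSL:
 *
 *   layout(binding = 0, offset = 0) uniform atomic_uint instanceCount;
 */
```

Binding only the 4-byte range that covers the instance count keeps the shader-side declaration at offset 0, so the coupling to the command layout stays on the application side.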

Hi, sorry for taking so long to reply. I appreciate the feedback! :slight_smile:

I have an ATI 6780 using the latest beta drivers, Catalyst 13.10 beta. The latest stable version doesn’t support OpenGL 4.3.

I’m writing this from work so I don’t have access to my code at the moment, I’ll post some code later if it’s necessary.

I have barriers on GL_SHADER_STORAGE_BARRIER_BIT and GL_ATOMIC_COUNTER_BARRIER_BIT.
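For reference, a minimal sketch of a barrier mask for this case, assuming the counter written by the compute shader is consumed by an indirect draw right afterwards. The OpenGL 4.3 spec describes GL_COMMAND_BARRIER_BIT as the bit that covers command data sourced by Draw*Indirect commands, so it is included here in addition to the two bits above. The fallback constant values are the ones from the Khronos glcorearb.h header; the helper name is hypothetical.

```c
/* Fallback definitions so the sketch compiles without GL headers;
   the values match glcorearb.h. */
#ifndef GL_COMMAND_BARRIER_BIT
#define GL_COMMAND_BARRIER_BIT        0x00000040
#endif
#ifndef GL_ATOMIC_COUNTER_BARRIER_BIT
#define GL_ATOMIC_COUNTER_BARRIER_BIT 0x00001000
#endif
#ifndef GL_SHADER_STORAGE_BARRIER_BIT
#define GL_SHADER_STORAGE_BARRIER_BIT 0x00002000
#endif

/* Hypothetical helper: barrier bits to issue between a compute dispatch that
   writes the instance count and the indirect draw that reads it. */
static unsigned int barriers_before_indirect_draw(void)
{
    return GL_ATOMIC_COUNTER_BARRIER_BIT   /* later atomic counter accesses */
         | GL_SHADER_STORAGE_BARRIER_BIT   /* later storage buffer accesses */
         | GL_COMMAND_BARRIER_BIT;         /* Draw*Indirect command fetch   */
}

/* Assumed usage with a live context (group counts are placeholders):
 *
 *   glDispatchCompute(groupsX, 1, 1);
 *   glMemoryBarrier(barriers_before_indirect_draw());
 *   glDrawArraysIndirect(GL_TRIANGLES, (const void *)0);
 */
```

Whether the missing command bit has anything to do with the crash described above is not established; the sketch only shows which bits the spec associates with this access pattern.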

This weekend I discovered that glMemoryBarrier seems to be buggy on ATI drivers(?). I expected that timing code placed around the compute shader dispatch and the barrier would be enough to give me rough timings (so long as I wait on glFinish prior to this). On my setup, however, this seems to time only the dispatch call itself. In the end I used GPU PerfStudio 2 to get proper timings for my compute shader.

[QUOTE=aqnuep;1254974]
Regarding what’s special about atomic counters: some GPUs have special HW for atomic counters, which makes them far more efficient than an equivalent atomic operation on an image or storage buffer.[/QUOTE]

Do you have more details? For example, do you know of any GPUs that have such special circuitry? The only information I ever get is that there is/might be “special hardware”, which just seems like black-box voodoo magic. It would be nice to know more than this :stuck_out_tongue: and to know how prevalent it currently is.