MemoryBarrier doesn't work on AMD hardware
I have a fragment shader that does some heavy work using image load/store operations. memoryBarrier() is called inside the shader (and in between shader executions) to make sure synchronisation is performed. We checked the code of this shader multiple times and it looks good. Our GL 4.2 program runs fine on both nVidia and Intel hardware.
These pages seem to mention AMD's glMemoryBarrier implementation is incorrect (August 2013):
In this thread, Graham Sellers (part of the OpenGL driver team) said he had brought the problem to the attention of AMD's compiler team:
I also stumbled upon this research paper:
"On tested AMD hardware there was an issue with the multi-pass compute shader, where a call to “glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT)” does not seem to work correctly, requiring a call to “glFinish()” instead, which invalidates correctness of performance numbers, so these were omitted."
I tried adding glFlush/glFinish to all glMemoryBarrier calls, but that doesn't make my program work. That is probably because the GLSL memoryBarrier() function doesn't work as expected either. This function is critical in my scenario.
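To make the host-side ordering concrete, here is a minimal sketch of the pattern rather than my actual code. PassHooks, run_passes, trace_passes and the logging stand-ins are hypothetical names invented for illustration; in a real program the three hooks would wrap the draw call for one pass, glMemoryBarrier, and glFinish. The stand-ins only record the call order so the sketch runs without a GL context, and it does not cover the GLSL memoryBarrier() call inside the shader.

```c
#include <stdio.h>
#include <string.h>

/* Same value as GL_SHADER_IMAGE_ACCESS_BARRIER_BIT in the OpenGL 4.2 headers. */
#define IMAGE_ACCESS_BARRIER_BIT 0x00000020u

/* Hypothetical hooks. In a real renderer they would wrap the draw call for
   one pass, glMemoryBarrier, and glFinish. */
typedef struct {
    void (*draw_pass)(int pass);
    void (*memory_barrier)(unsigned int barriers);
    void (*finish)(void);
} PassHooks;

/* Run the passes in order, placing a barrier between consecutive passes so
   image stores from pass N are meant to be visible to image loads in pass
   N + 1. force_finish adds a glFinish-style call after each barrier. */
static void run_passes(const PassHooks *hooks, int pass_count, int force_finish)
{
    for (int pass = 0; pass < pass_count; ++pass) {
        hooks->draw_pass(pass);
        if (pass + 1 < pass_count) {
            hooks->memory_barrier(IMAGE_ACCESS_BARRIER_BIT);
            if (force_finish)
                hooks->finish();
        }
    }
}

/* Logging stand-ins (assumption: they replace the real GL calls here). */
static char call_log[256];

static void log_append(const char *text)
{
    strncat(call_log, text, sizeof call_log - strlen(call_log) - 1);
}

static void log_draw(int pass)
{
    char entry[32];
    snprintf(entry, sizeof entry, "draw(%d) ", pass);
    log_append(entry);
}

static void log_barrier(unsigned int barriers)
{
    char entry[32];
    snprintf(entry, sizeof entry, "barrier(0x%x) ", barriers);
    log_append(entry);
}

static void log_finish(void)
{
    log_append("finish ");
}

/* Record the call order for a given configuration and return it. */
static const char *trace_passes(int pass_count, int force_finish)
{
    const PassHooks hooks = { log_draw, log_barrier, log_finish };
    call_log[0] = '\0';
    run_passes(&hooks, pass_count, force_finish);
    return call_log;
}
```

Calling trace_passes(2, 1) records the draw for pass 0, then the barrier, then the finish, then the draw for pass 1, which is the ordering I tried with glFinish added after each barrier.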
I have several questions:
1) Can anybody here confirm the client-side glMemoryBarrier bug?
2) What can I do to work around the bug in the GLSL implementation? I'm stuck, am I not?
3) Is anybody from AMD here? Is there any plan to fix this bug? This seems pretty serious, and it was already reported about a year ago!