What exactly does memoryBarrier() do?

I’m confused by the explanations I’ve read of the memoryBarrier() function in GLSL.
Some people say memoryBarrier() can synchronize all invocations at a specific execution point.
But that doesn’t explain the result of the following compute shader.


#version 430 core
layout (local_size_x = 256) in;
layout(binding = 0, rgba32f) uniform coherent image2D test_img;
shared int arr[256];

void main()
{
    ivec2 p = ivec2(gl_GlobalInvocationID.xy);

    arr[p.x] = 0;
    arr[p.x] = 1;

    // memoryBarrierShared(); // on its own (without barrier()) this does not produce the desired output
    barrier();
    if(arr[p.x+1] != 1)
        imageStore(test_img, p, vec4(vec3(0.0f), 1.0f));
    else
        imageStore(test_img, p, vec4(1.0f));
}

I run this compute shader and then draw the contents of test_img to the screen with another GLSL program.
(There is a glMemoryBarrier(GL_ALL_BARRIER_BITS) call between the two.)

With only barrier(), the whole output image is white (except the right edge),
which means that barrier() synchronized all the invocations of the compute shader correctly.
But with only memoryBarrierShared() called, I couldn’t get the desired output.

My own conclusion is this:
memoryBarrier() can’t synchronize invocations by synchronizing memory accesses. More specifically, all memoryBarrier() does is wait for the completion of memory accesses that have already been performed in the invocations. It will not wait for memory-accessing code that has not yet executed, even though that code comes before the memoryBarrier() call in the source. The OpenGL Programming Guide says: “When memoryBarrier() is called, it ensures that any writes to memory that have been performed by the shader invocation have been committed to memory rather than lingering in caches or being scheduled after the call to memoryBarrier()”. That means, for example, with three invocations A, B and C: if both A and B have already run imageStore() on a coherent image variable, then a subsequent memoryBarrier() in A or B guarantees that this imageStore() has changed the data in main memory, not just in the cache. But if invocation C has not yet run its imageStore() when A or B calls memoryBarrier(), that call will not wait for C to run its imageStore().
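To make the distinction concrete, here is a minimal sketch of the shader with both calls combined. It is an illustration under assumptions, not a verified fix: it assumes memoryBarrierShared() is only responsible for the ordering and visibility of the calling invocation's shared writes, while barrier() is what actually makes every invocation in the work group wait. The names i and neighbour are hypothetical, as are indexing the shared array by gl_LocalInvocationIndex and wrapping the neighbour index to stay inside the array.

#version 430 core
layout (local_size_x = 256) in;
layout(binding = 0, rgba32f) uniform coherent image2D test_img;
shared int arr[256];

void main()
{
    ivec2 p = ivec2(gl_GlobalInvocationID.xy);
    uint  i = gl_LocalInvocationIndex;      // 0..255 inside this work group (assumed indexing)

    arr[i] = 1;                             // each invocation writes its own slot

    memoryBarrierShared();                  // memory ordering: this invocation's shared write
    barrier();                              // execution sync: all 256 invocations have reached this point

    uint  neighbour = (i + 1u) % 256u;      // wrap so the last invocation stays in bounds
    float v = (arr[neighbour] == 1) ? 1.0 : 0.0;
    imageStore(test_img, p, vec4(vec3(v), 1.0));
}

If the assumption holds, removing barrier() leaves nothing that forces the neighbouring invocation to have executed its write before arr[neighbour] is read, which would match the behaviour described above.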

But I’m not sure about this because I can’t find any similar explanation on the internet.
Is that correct?