hi everyone,
let's assume we dispatch a compute shader with 1 x 1 x 1 work groups, and the shader's task is to compute the sum of all integers from 1 to n = 999. there's a closed-form formula for the result:
result = n * (n + 1) / 2 = 999 * 1000 / 2 = 499500
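as a quick sanity check of that number on the CPU, here is a minimal sketch in plain C++ (not shader code). the helper names sum_by_loop, sum_by_formula and check_sum are made up for illustration:

```cpp
#include <cassert>
#include <cstdint>

// hypothetical helper: adds 1 + 2 + ... + n one term at a time
std::uint32_t sum_by_loop(std::uint32_t n)
{
    std::uint32_t total = 0;
    for (std::uint32_t i = 1; i <= n; ++i)
        total += i;
    return total;
}

// hypothetical helper: the closed form n * (n + 1) / 2
std::uint32_t sum_by_formula(std::uint32_t n)
{
    return n * (n + 1) / 2;
}

// both ways of computing the sum should agree for n = 999
void check_sum()
{
    assert(sum_by_loop(999) == sum_by_formula(999));
}
```

adding the term 0 changes nothing, so summing gl_LocalInvocationID.x over 0..999 should give the same value.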
using uint variables:
#version 450 core
layout (local_size_x = 1000, local_size_y = 1, local_size_z = 1) in;
layout (binding = 1, std430) buffer OutputBlock { uint Result; };
shared uint Total;
void main()
{
    // shared variables start out undefined, so one invocation zeroes the sum first
    if (gl_LocalInvocationID.x == 0)
        Total = 0u;
    barrier();
    atomicAdd(Total, gl_LocalInvocationID.x);
    barrier();
    if (gl_LocalInvocationID.x == 50)
        Result = Total;
}
here the actual math (reading the variable, computing the new value, writing it back) is done as one atomic step by the “atomicAdd(Total, gl_LocalInvocationID.x)” call. we then only need to wait for all invocations to reach the “barrier()” point in the code. finally, one particular invocation (here the 51st, with gl_LocalInvocationID.x == 50) writes the sum into the shader storage buffer.
that works.
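to spell out what i mean by "read, process, write in one step", here is a CPU-side analogue in C++ (a rough sketch, not GLSL). std::atomic stands in for the shared uint, and the names total, invocation, run_work_group and check_work_group are hypothetical. the 1000 "invocations" run one after another here, whereas a real work group runs them concurrently:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// stand-in for "shared uint Total" (hypothetical, CPU-side only)
std::atomic<std::uint32_t> total{0};

// models one invocation: fetch_add performs the read, the add and the
// write-back as a single indivisible step, like atomicAdd in the shader
void invocation(std::uint32_t local_id)
{
    total.fetch_add(local_id);
}

// runs local ids 0..999 sequentially and returns the accumulated sum
std::uint32_t run_work_group()
{
    total.store(0);
    for (std::uint32_t id = 0; id < 1000; ++id)
        invocation(id);
    return total.load();
}

// the accumulated sum should match n * (n + 1) / 2 for n = 999
void check_work_group()
{
    assert(run_work_group() == 999u * 1000u / 2u);
}
```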
question: how can we do the same with float variables?
i currently have this:
#version 450 core
layout (local_size_x = 1000, local_size_y = 1, local_size_z = 1) in;
layout (binding = 1, std430) buffer OutputBlock { float Result; };
shared float Total;
void main()
{
    memoryBarrierShared();
    Total += float(gl_LocalInvocationID.x);
    memoryBarrierShared();
    barrier();
    if (gl_LocalInvocationID.x == 50)
        Result = Total;
}
but it doesn't deliver the correct result. this is how i query the result:
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, buffer);
glBufferData(GL_SHADER_STORAGE_BUFFER, sizeof(float), NULL, GL_STATIC_READ);
/* execute shader */
glUseProgram(program);
glDispatchCompute(1, 1, 1);
glUseProgram(0);
/* reads via glGetBufferSubData are covered by GL_BUFFER_UPDATE_BARRIER_BIT, not GL_SHADER_STORAGE_BARRIER_BIT */
glMemoryBarrier(GL_BUFFER_UPDATE_BARRIER_BIT);
float result = 0.0f;
glGetBufferSubData(GL_SHADER_STORAGE_BUFFER, 0, sizeof(float), &result);
cout << "result = " << result << endl;
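my current guess about the float shader is that "Total += ..." is a plain read-modify-write, so invocations overwrite each other's results, and the memory barriers don't make it atomic. below is a CPU-side sketch in C++ (not GLSL) of that guess, next to one direction i am considering: a compare-and-swap loop on the float's bit pattern. as far as i can tell, the GLSL counterpart would use atomicCompSwap on a shared uint together with floatBitsToUint and uintBitsToFloat, but i have not verified that in a shader. all helper names are hypothetical, and the lockstep model is an assumption, since real hardware guarantees no particular interleaving:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");

// lost-update model (assumption): every invocation loads Total before any stores
float racy_sum_lockstep(unsigned n)
{
    float total = 0.0f;
    std::vector<float> loaded(n);
    for (unsigned id = 0; id < n; ++id)
        loaded[id] = total;              // all invocations read the initial 0
    for (unsigned id = 0; id < n; ++id)
        total = loaded[id] + float(id);  // each store overwrites the previous one
    return total;                        // only the last store survives
}

// bit casts, standing in for floatBitsToUint / uintBitsToFloat
std::uint32_t float_bits(float f) { std::uint32_t u; std::memcpy(&u, &f, sizeof u); return u; }
float bits_float(std::uint32_t u) { float f; std::memcpy(&f, &u, sizeof f); return f; }

// float add built from compare-and-swap on the stored bit pattern
void atomic_add_float(std::atomic<std::uint32_t>& bits, float value)
{
    std::uint32_t expected = bits.load();
    // retry until nobody changed the value between our read and our write;
    // on failure compare_exchange_weak refreshes 'expected' with the current bits
    while (!bits.compare_exchange_weak(expected, float_bits(bits_float(expected) + value)))
    {
    }
}

// accumulates 0..n-1 through the compare-and-swap add (sequentially here)
float cas_sum(unsigned n)
{
    std::atomic<std::uint32_t> bits{float_bits(0.0f)};
    for (unsigned id = 0; id < n; ++id)
        atomic_add_float(bits, float(id));
    return bits_float(bits.load());
}

// the lockstep model loses updates, so the two results should differ
void check_models()
{
    assert(racy_sum_lockstep(1000) != cas_sum(1000));
}
```

in the lockstep model only the last invocation's store survives, which would explain a result far below 499500. every partial sum here is an integer below 2^24, so float rounding should not be what breaks the shader.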