john_connor

04-28-2017, 10:10 AM

hi everyone,

let's assume we invoke a compute shader with a single (1 x 1 x 1) workgroup, and the compute shader's task is to compute the sum of all integers from 1 to n = 999. there's a formula for the result:

result = n * (n + 1) / 2 = 499500

https://en.wikipedia.org/wiki/1_%2B_2_%2B_3_%2B_4_%2B_%E2%8B%AF
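
as a quick sanity check of the closed form, here is a small C++ sketch that compares the formula against a plain loop. the function names gauss_sum and loop_sum are made up for illustration:

```cpp
#include <cstdint>

// closed form: 1 + 2 + ... + n = n * (n + 1) / 2
std::uint64_t gauss_sum(std::uint64_t n)
{
    return n * (n + 1) / 2;
}

// reference: add the numbers one by one
std::uint64_t loop_sum(std::uint64_t n)
{
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i)
        total += i;
    return total;
}
```

for n = 999 both functions return 499500. in the shaders, invocation 0 only contributes 0, so summing gl_LocalInvocationID.x over 1000 invocations should give the same value.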

using uint variables:

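#version 450 core

layout (local_size_x = 1000, local_size_y = 1, local_size_z = 1) in;
layout (binding = 1, std430) buffer OutputBlock { uint Result; };

shared uint Total;

void main()
{
    // shared variables start out undefined, so one invocation zeroes the total first
    if (gl_LocalInvocationID.x == 0)
        Total = 0u;
    barrier();

    // every invocation adds its own index atomically
    atomicAdd(Total, gl_LocalInvocationID.x);
    barrier();

    // after the barrier all additions are done; invocation 50 writes the sum out
    if (gl_LocalInvocationID.x == 50)
        Result = Total;
}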

here the actual math (reading the variable, adding to it, writing it back) is done in the "atomicAdd(Total, gl_LocalInvocationID.x)" instruction. we then only need to wait for all invocations to reach the "barrier()" point in the code; finally one particular invocation (here: the 51st, gl_LocalInvocationID.x == 50) is allowed to write the sum into the shader storage buffer.

that works.
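
to make the pattern explicit (atomic add per invocation, wait for everyone, then read the sum once), here is a rough CPU analog in C++. it is only a sketch under assumptions: std::atomic stands in for the shared variable, a handful of threads stand in for the invocations, joining the threads stands in for barrier(), and the name atomic_sum and its parameters are made up:

```cpp
#include <atomic>
#include <thread>
#include <vector>

// CPU stand-in for the workgroup: every "invocation" id in [0, invocations)
// is added to one shared counter with an atomic add.
unsigned atomic_sum(unsigned invocations, unsigned workers = 4)
{
    std::atomic<unsigned> total{0};   // explicitly zeroed before anyone adds to it

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&total, invocations, workers, w] {
            // worker w handles ids w, w + workers, w + 2 * workers, ...
            for (unsigned id = w; id < invocations; id += workers)
                total.fetch_add(id);  // plays the role of atomicAdd(Total, id)
        });
    }

    // joining all workers plays the role of barrier(): the sum is not read early
    for (std::thread& t : pool)
        t.join();

    return total.load();
}
```

atomic_sum(1000) adds the ids 0 to 999 and returns 499500, independent of how the threads interleave, because each fetch_add is an indivisible read-modify-write.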

question: how can we do the same with float variables?

i currently have this:

#version 450 core

layout (local_size_x = 1000, local_size_y = 1, local_size_z = 1) in;
layout (binding = 1, std430) buffer OutputBlock { float Result; };

shared float Total;

void main()
{
    memoryBarrierShared();
    Total += float(gl_LocalInvocationID.x);
    memoryBarrierShared();
    barrier();

    if (gl_LocalInvocationID.x == 50)
        Result = Total;
}

but it doesn't deliver the correct result. this is how i query the result:

glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, buffer);
glBufferData(GL_SHADER_STORAGE_BUFFER, sizeof(float), NULL, GL_STATIC_READ);

/* execute shader */
glUseProgram(program);
glDispatchCompute(1, 1, 1);
glUseProgram(0);

/* the result is read back with glGetBufferSubData, so the barrier has to cover buffer updates */
glMemoryBarrier(GL_BUFFER_UPDATE_BARRIER_BIT);

float result = 0.0f;
glGetBufferSubData(GL_SHADER_STORAGE_BUFFER, 0, sizeof(float), &result);
cout << "result = " << result << endl;
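
one direction that might work for the float case: keep the running total as raw bits in an unsigned integer and retry a compare-and-swap until it succeeds. in GLSL that would presumably mean a shared uint together with atomicCompSwap, floatBitsToUint and uintBitsToFloat. the sketch below is not that shader, only a CPU analog in C++ (std::atomic instead of shared memory, threads instead of invocations), and all function names in it are made up:

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>
#include <thread>
#include <vector>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");

// reinterpret the 32 bits of a float as an unsigned integer and back
// (the CPU counterpart of floatBitsToUint / uintBitsToFloat)
std::uint32_t float_bits(float f) { std::uint32_t u; std::memcpy(&u, &f, sizeof u); return u; }
float bits_float(std::uint32_t u) { float f; std::memcpy(&f, &u, sizeof f); return f; }

// add `value` to the float stored as raw bits in `total`, retrying until no
// other thread changed `total` between our read and our write
void atomic_add_float(std::atomic<std::uint32_t>& total, float value)
{
    std::uint32_t expected = total.load();
    std::uint32_t desired;
    do {
        desired = float_bits(bits_float(expected) + value);
        // on failure compare_exchange_weak reloads `expected` with the current bits
    } while (!total.compare_exchange_weak(expected, desired));
}

// sum the ids 0 .. invocations - 1 as floats, spread over a few worker threads
float atomic_float_sum(unsigned invocations, unsigned workers = 4)
{
    std::atomic<std::uint32_t> total{float_bits(0.0f)};  // start from 0.0f

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&total, invocations, workers, w] {
            for (unsigned id = w; id < invocations; id += workers)
                atomic_add_float(total, static_cast<float>(id));
        });
    }
    for (std::thread& t : pool)
        t.join();  // wait for all additions before reading the sum

    return bits_float(total.load());
}
```

atomic_float_sum(1000) gives exactly 499500 because every partial sum here is an integer below 2^24 and therefore exactly representable in a 32-bit float. with general float inputs the result would depend on the order in which the additions happen.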
