Hey everyone.
While reasoning about the problem in this thread, I started experimenting with image load/store. As a first exercise I went for simple compositing of two images inside a shader, storing the result in a third image. Aside from the fact that image load/store would not require a third image at all, I came across the following behavior: the first image is the correct result after the first frame, and the second is what I get when rendering the second, third, and every later frame:
[Attachment 245: result after the first frame] <-> [Attachment 246: result from the second frame onward]
It’s standard stuff - rendering a fullscreen quad with a pass-through vertex shader (plus tex coord) and the following fragment shader:
#version 420
layout(binding = 0, rgba8) uniform image2D ImageA;
layout(binding = 1, rgba8) uniform image2D ImageB;
layout(binding = 2, rgba8) uniform image2D ImageResult;
layout(binding = 2) uniform sampler2D ImageResultTex;
layout(location = 0) out vec4 FragColor;
in vec2 InterpTexCoord;
const int Size = 512;
const float Alpha = 0.5;
void main()
{
    // Map the interpolated [0, 1) texture coordinate to an integer texel coordinate
    ivec2 ImageTexCoord = ivec2(InterpTexCoord * Size);

    // ImageA and ImageB are bound as GL_READ_ONLY on the host side
    vec4 ColorA = imageLoad(ImageA, ImageTexCoord);
    vec4 ColorB = imageLoad(ImageB, ImageTexCoord);
    vec4 ColorResult = ColorA * (1.0 - Alpha) + ColorB * Alpha;

    // ImageResult is bound as GL_READ_WRITE on the host side
    imageStore(ImageResult, ImageTexCoord, ColorResult);

    // wrong results after the first frame
    FragColor = imageLoad(ImageResult, ImageTexCoord);

    // regular texture lookup is always correct
    //FragColor = texture(ImageResultTex, InterpTexCoord);
}
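For comparison, here is a minimal CPU-side sketch in C of what the shader should produce per texel. It is only a reference for checking read-back values, not part of the actual program. The names `texel_index`, `blend`, and `to_unorm8` are hypothetical helpers. The conversion in `to_unorm8` assumes the rgba8 format clamps to [0, 1] and rounds to the nearest 8-bit value, and tie rounding may differ between implementations.

```c
#include <assert.h>

/* Values copied from the shader constants above. */
#define SIZE 512
#define ALPHA 0.5f

typedef struct { float r, g, b, a; } Color;

/* Mirrors one component of ivec2(InterpTexCoord * Size):
   the float-to-int conversion truncates toward zero.
   A coordinate of exactly 1.0 would map to SIZE, which is out of range,
   so it is rejected here. */
static int texel_index(float tex_coord)
{
    assert(tex_coord >= 0.0f && tex_coord < 1.0f);
    return (int)(tex_coord * (float)SIZE);
}

/* Mirrors ColorA * (1.0 - Alpha) + ColorB * Alpha for one channel. */
static float blend_channel(float a, float b)
{
    return a * (1.0f - ALPHA) + b * ALPHA;
}

/* Applies the same blend to all four channels. */
static Color blend(Color a, Color b)
{
    Color out;
    out.r = blend_channel(a.r, b.r);
    out.g = blend_channel(a.g, b.g);
    out.b = blend_channel(a.b, b.b);
    out.a = blend_channel(a.a, b.a);
    return out;
}

/* Assumption: an rgba8 store clamps to [0, 1] and rounds to the
   nearest 8-bit value. */
static unsigned char to_unorm8(float v)
{
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    return (unsigned char)(v * 255.0f + 0.5f);
}
```

Reading the result texture back (for example with glGetTexImage) and comparing a few texels against `to_unorm8` applied to each channel of `blend(a, b)` should show whether the stored data or only the subsequent imageLoad is off.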
The image unit setup is done as follows:
glBindImageTexture(0, tbo_a_, 0, GL_FALSE, 0, GL_READ_ONLY, GL_RGBA8);
glBindImageTexture(1, tbo_b_, 0, GL_FALSE, 0, GL_READ_ONLY, GL_RGBA8);
glBindImageTexture(2, tbo_result_, 0, GL_FALSE, 0, GL_READ_WRITE, GL_RGBA8);
The texture objects are set up correctly as well.
As can be seen in the above code, only image loads return the wrong value after the first frame. Doing a normal lookup with the corresponding sampler always succeeds.
I assumed the above should work because, as I read the specs (GL and GLSL), memory transactions within a single invocation of the fragment shader are well-defined and need not be synchronized using a coherent qualifier or a memoryBarrier(). Is that correct?
I get no errors at any time. The GPU is a Radeon HD 6350 with Catalyst 12.5, which reports 8 image units.
Does anyone spot what’s going wrong?