Trouble passing Uniform Buffer Objects to shader

Hello,

I’m trying to use Uniform Buffer Objects to pass arbitrary data as a float array to a shader, and I’m having some trouble with that. I can get it to run, but I don’t get the results I expect.

Like I said, I’m only trying to pass an array of floats, which has a hard-coded size of 50 for now. Here is my OpenGL code (in Java with JOGL):

(Assume myShader is a valid pointer to a working shader.)


            int blockIndex = gl.glGetUniformBlockIndex(myShader, "MyBlock");
            FloatBuffer data = FloatBuffer.allocate(50);

            // Fill array with floats ranging from 1.0 to 0.0
            for (int i = 50; i > 0; i--) {
                data.put(((float)i) / 50f);
            }
            data.rewind();


            IntBuffer buf = Buffers.newDirectIntBuffer(1);
            gl.glGenBuffers(1, buf);
            int bo = buf.get();
            gl.glBindBuffer( GL2.GL_UNIFORM_BUFFER, bo );

            gl.glBufferData(GL2.GL_UNIFORM_BUFFER, 50 * Float.SIZE, data, GL.GL_DYNAMIC_DRAW);
            gl.glBindBufferBase( GL2.GL_UNIFORM_BUFFER, blockIndex, bo );
            gl.glBindBuffer(GL.GL_ARRAY_BUFFER, 0);


and here is my fragment shader code:


layout(std140) uniform MyBlock
{
  float myDataArray[50];
};

void main (void)
{
        int pos = int (gl_FragCoord.x) % 50;
        gl_FragColor = vec4(myDataArray[pos], myDataArray[pos], myDataArray[pos], 1);
}

This is the output:

[Attachment 877: screenshot of the actual output]

I would expect a smooth white-to-black gradient over 50 pixels that then repeats. Instead, the gradient spans only 13 pixels, and the remaining 37 pixels are all black. It’s as if my 50-element array gets crushed into the first 13 elements of the array the shader sees.

This is what I would expect (done in Photoshop):

[Attachment 878: mock-up of the expected output]

Any ideas?

I don’t know what the FloatBuffer class is doing, but your array elements are not supposed to be tightly packed in memory.
The OpenGL Programming Guide says this about array elements in a UBO with std140 layout:

The size of each element in the array will be the size of the element type, rounded up to a multiple of the size of a vec4. This is also the array’s alignment. The array’s size will be this rounded-up element’s size times the number of elements in the array.

A vec4 has 16 bytes. That means you need to align every float in the array at 16 byte boundaries. First element at offset 0, second at 16, third at 32 and so on.
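
To make the rule concrete, here is a minimal Java sketch (java.nio only, no JOGL calls) of padding a float array to the 16-byte stride before upload. The class name Std140FloatArray and the helper pad are made up for illustration. Note that the byte size is computed with Float.BYTES (4); Float.SIZE is the size in bits (32), not bytes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

class Std140FloatArray {
    // std140 rounds each array element up to the size of a vec4 (16 bytes).
    static final int STRIDE_BYTES = 16;
    static final int STRIDE_FLOATS = STRIDE_BYTES / Float.BYTES; // 4 floats per slot

    /** Copies tightly packed values into a buffer with one float per 16-byte slot. */
    static FloatBuffer pad(float[] values) {
        FloatBuffer padded = ByteBuffer
                .allocateDirect(values.length * STRIDE_BYTES) // zero-initialized
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (int i = 0; i < values.length; i++) {
            // Absolute put: element i goes to float index 4 * i, padding stays 0.
            padded.put(i * STRIDE_FLOATS, values[i]);
        }
        return padded; // position is still 0, so no rewind is needed
    }

    public static void main(String[] args) {
        float[] values = new float[50];
        for (int i = 0; i < values.length; i++) {
            values[i] = (50 - i) / 50f; // same values as the loop in the question
        }
        FloatBuffer padded = pad(values);
        int sizeInBytes = padded.capacity() * Float.BYTES;
        System.out.println("bytes to upload: " + sizeInBytes); // 800
        System.out.println("element 1 is stored at float index 4: " + padded.get(4));
    }
}
```

The padded buffer and its byte size would then be what gets handed to glBufferData; that call is left out here so the sketch runs without a GL context.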

Oh, and you mixed up some parameters: glBindBufferBase takes a binding point, not the uniform block index.
It should be something like this:


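int bindingpoint = 0;

// Look up the block and tie it to the chosen binding point.
int blockIndex = GL.GetUniformBlockIndex(myShader, "MyBlock");
GL.UniformBlockBinding(myShader, blockIndex, bindingpoint);

// Attach the buffer to the same binding point (not to the block index).
int bo = GL.GenBuffer();
GL.BindBufferBase(BufferRangeTarget.UniformBuffer, bindingpoint, bo);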

That means you need to align every float in the array at 16 byte boundaries. First element at offset 0, second at 16, third at 32 and so on.

I think you misread something. The floats ought to be tightly packed in the array. For arrays of composed types (e.g. structs) the rule you cited seems right, but not for simple types.

No, float array elements are indeed aligned to 16-byte boundaries in the std140 layout. They are tightly packed with the std430 layout, but that layout is for shader storage blocks only, as far as I know.

Look at this little example:

layout(std140, binding = 0) uniform Block
{
	float array[3];
} block;

When I query the array stride, I get a value of 16 bytes:
[Attachment 892: screenshot showing the queried array stride of 16 bytes]

(Uploaded the wrong file first. Can I delete the first one?)
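
The 16-byte stride also seems to explain the 13-pixel gradient in the original question. Below is a rough simulation in plain Java. It assumes the shader reads element i at byte offset 16 * i, and that anything beyond the 50 uploaded floats reads as zero (an assumption; in practice it is whatever happens to be in the buffer there). The class and method names are hypothetical.

```java
class StrideMismatchDemo {
    /** Reads element i the way a std140 float[] is laid out: at byte offset 16 * i. */
    static float[] readWithStd140Stride(float[] tightlyPacked, int elementCount) {
        final int strideFloats = 16 / Float.BYTES; // 4 floats per 16-byte slot
        float[] seen = new float[elementCount];
        for (int i = 0; i < elementCount; i++) {
            int src = i * strideFloats;
            // Assumption: anything past the uploaded data reads as 0.
            seen[i] = src < tightlyPacked.length ? tightlyPacked[src] : 0f;
        }
        return seen;
    }

    public static void main(String[] args) {
        // Tightly packed data, as uploaded in the question.
        float[] data = new float[50];
        for (int i = 0; i < data.length; i++) {
            data[i] = (50 - i) / 50f;
        }

        float[] seen = readWithStd140Stride(data, 50);
        int nonZero = 0;
        for (float f : seen) {
            if (f != 0f) nonZero++;
        }
        // Elements 0..12 map to source floats 0, 4, ..., 48; the rest fall off the end.
        System.out.println("non-zero elements seen by the shader: " + nonZero); // 13
    }
}
```

So the shader would see every fourth value in its first 13 elements and nothing useful after that, which matches a gradient squeezed into 13 pixels followed by 37 black ones.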

I guess you’re right. I have to say I’m quite surprised by this. Why would floats in a struct be tightly packed, but not in an array?

Because in a struct, you access each member individually by a compile-time-defined name. Therefore, the compiler knows at compile-time exactly what memory address “var.member” refers to. Internally, it can convert “var.member” directly into a memory address. So if you have a struct of 4 floats, “var.secondMember” can be converted internally into “vec4.y”.

In an array, you can access any element by a runtime value. If your hardware is based on vec4’s, how do you access the ith element of that, when ‘i’ is runtime-defined? You would need a way to access a vec4’s element by a runtime-defined value.

Now granted, GLSL requires that you can access a vector by a runtime-defined index, so…

If your hardware is based on vec4’s, how do you access the ith element of that, when ‘i’ is runtime-defined?

My idea would have been a bit-shift. If a statement like if(i==1) return v.y is possible - so is accessing a component.
How could this be even in std140? No - please don’t tell me. I believe it.

If a statement like if(i==1) return v.y is possible - so is accessing a component.

Or you could just do this:


uniform Buffer
{
  vec4 floatArray[XX];
};

...

float AccessFloatArray(int index)
{
  return floatArray[index >> 2][index & 0x3];
}

That is required to work, even in GLSL 1.50. Why the compiler isn’t required to do that for you, I can’t say.
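
For comparison, here is a small Java sketch of the same index arithmetic on the CPU side, which could be used to fill such a vec4 array. PackedFloatArray, pack, and access are illustrative names, not part of any API.

```java
class PackedFloatArray {
    /** Packs floats four to a vec4; the unused tail of the last vec4 stays 0. */
    static float[][] pack(float[] values) {
        int vec4Count = (values.length + 3) / 4; // round up to whole vec4s
        float[][] packed = new float[vec4Count][4];
        for (int i = 0; i < values.length; i++) {
            // index >> 2 selects the vec4, index & 0x3 selects the component.
            packed[i >> 2][i & 0x3] = values[i];
        }
        return packed;
    }

    /** Mirrors the AccessFloatArray function from the GLSL snippet. */
    static float access(float[][] packed, int index) {
        return packed[index >> 2][index & 0x3];
    }

    public static void main(String[] args) {
        float[] values = new float[50];
        for (int i = 0; i < values.length; i++) {
            values[i] = (50 - i) / 50f;
        }
        float[][] packed = pack(values);
        System.out.println("vec4 slots: " + packed.length);              // 13
        System.out.println("bytes in std140: " + packed.length * 16);    // 208
        System.out.println("element 5: " + access(packed, 5));
    }
}
```

With this packing, 50 floats take 13 vec4 slots (208 bytes) instead of the 800 bytes a std140 float[50] occupies, at the cost of the extra shift and mask per access.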

My guess would be the argument was that this requires two instructions whereas everybody knows that an array-access is one instruction.

Some vec4 based hardware can index into an array of vec4s, but dynamic vector component selection requires a series of conditional moves. That kind of overhead was apparently considered undesirable.