Problem with uniform buffer array and older AMD hardware

I am using a uniform buffer to pass a large number of light sources to a shader.

I define the buffer as follows:


	layout(std140) uniform LightBuffer
	{
		vec4 lights[NUM_LIGHTS];
	};

NUM_LIGHTS is set based on the maximum uniform buffer size reported by the system.
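A minimal sketch of how such a sizing step might look. The helper name `MaxLightsForBlock` is hypothetical and not part of the code in this post, and the byte limit is assumed to be the value queried with `glGetIntegerv(GL_MAX_UNIFORM_BLOCK_SIZE, ...)`:

```cpp
// Hypothetical helper (not from the original code): number of vec4 entries
// that fit into one uniform block of maxBlockBytes bytes.
// In std140 layout every element of a vec4 array occupies 16 bytes.
// maxBlockBytes is assumed to come from
// glGetIntegerv(GL_MAX_UNIFORM_BLOCK_SIZE, &value).
const unsigned int kVec4Bytes = 16;

unsigned int MaxLightsForBlock(int maxBlockBytes)
{
	if (maxBlockBytes <= 0) return 0;	// query failed or returned garbage
	return static_cast<unsigned int>(maxBlockBytes) / kVec4Bytes;
}
```

The GL specification guarantees at least 16384 bytes for this limit, which corresponds to 1024 vec4 entries; a driver reporting 65536 bytes would allow 4096.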
The buffer itself is at least 16 MB in size, and I bind the required part with


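	int LightBuffer::BindBufferRange(unsigned int index)
	{
		// Round the vec4 index down to the start of its aligned block.
		unsigned int offset = (index / mBlockAlign) * mBlockAlign;
		// Offset and size are in bytes: 16 bytes per vec4.
		glBindBufferRange(GL_UNIFORM_BUFFER, LIGHTBUFFER_BINDINGPOINT, mBufferId, offset*16, NUM_LIGHTS*16);
		// Index relative to the start of the bound range.
		return (index - offset);
	}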

NUM_LIGHTS has the same value in the shader and in the C++ code. In the glBindBufferRange call it is multiplied by 16 to convert from a vec4 count to a byte count. The value returned by this function is then passed to the shader as an index uniform.
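The arithmetic above can be checked without a GL context. The following is only a sketch under assumptions: `LightRange`, `BlockAlignInVec4` and `ComputeLightRange` are hypothetical names, and it assumes that mBlockAlign is expressed in vec4 units and derived from `GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT`, which the post does not state. For the `GL_UNIFORM_BUFFER` target, the byte offset passed to glBindBufferRange has to be a multiple of that alignment.

```cpp
// Hypothetical result type: the values that would be handed to
// glBindBufferRange and to the shader's index uniform.
struct LightRange
{
	unsigned int byteOffset;	// offset argument, in bytes
	unsigned int byteSize;		// size argument, in bytes
	unsigned int localIndex;	// index relative to the bound range
};

static unsigned int Gcd(unsigned int a, unsigned int b)
{
	while (b != 0) { unsigned int t = a % b; a = b; b = t; }
	return a;
}

// Smallest step in vec4 units whose byte offset (step * 16) is a multiple of
// offsetAlignBytes, i.e. lcm(offsetAlignBytes, 16) / 16.
// offsetAlignBytes is assumed to come from
// glGetIntegerv(GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, &value).
unsigned int BlockAlignInVec4(unsigned int offsetAlignBytes)
{
	if (offsetAlignBytes == 0) offsetAlignBytes = 1;
	return offsetAlignBytes / Gcd(offsetAlignBytes, 16u);
}

// Same arithmetic as BindBufferRange in the post, without the GL call.
LightRange ComputeLightRange(unsigned int index, unsigned int blockAlign, unsigned int numLights)
{
	unsigned int offset = (index / blockAlign) * blockAlign;
	LightRange r = { offset * 16u, numLights * 16u, index - offset };
	return r;
}
```

With an alignment of 256 bytes the block step is 16 vec4 entries, so index 37 maps to byte offset 512 and local index 5. Comparing these values against the alignment reported on an affected machine is one way to rule out a misaligned offset.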

This works fine on all NVIDIA and Intel hardware I tested it on, and also on modern AMD cards, but I have a problem with older AMD hardware, which acts as if the buffer’s contents were all 0.

So, is there any known problem with uniform buffers on AMD that one needs to be aware of? Unfortunately I do not have a card available that shows this problem, so it’s quite hard to track down.
So far it has only been reported by a handful of users, all on AMD drivers that do not implement GL 4.0 at all.