ATI problems with variable length for loops?

I’m currently using GLSL 1.5 and have noticed on two separate occasions that ATI cannot seem to handle for loops where the count is a uniform variable instead of a pre-defined constant.

For example, a blur where the blur radius is passed in as a uniform rather than hard coded to some fixed radius.

The effect of this appears to be that no iterations of the for loop get executed at all.
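
To make the case concrete, here is a minimal sketch of the kind of shader I mean. It is not my actual code: the names Source, TexelStep, BlurRadius, UV and Color are made up for illustration, and the blur is assumed to be a simple one-dimensional box filter.

```glsl
#version 150

// Hypothetical names: Source, TexelStep, BlurRadius, UV and Color are
// illustrative only and do not come from the real shader.
uniform sampler2D Source;
uniform vec2 TexelStep;   // UV offset between taps, e.g. vec2(1.0 / width, 0.0)
uniform int BlurRadius;   // loop bound supplied at run time

in vec2 UV;
out vec4 Color;

void main()
{
    vec4 sum = vec4(0.0);

    // 2 * BlurRadius + 1 taps centered on UV; the bound is a uniform,
    // so the compiler cannot know the trip count at compile time.
    for (int i = -BlurRadius; i <= BlurRadius; ++i)
        sum += texture(Source, UV + float(i) * TexelStep);

    Color = sum / float(2 * BlurRadius + 1);
}
```

With BlurRadius replaced by a constant, the same loop should behave as expected.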

However, NVidia seems to handle this fine.

Has anyone else experienced this issue?

Maybe I’m wrong, but “uniform flow control” is part of the GLSL 4.xx specification. ATI is pretty consistent in its spec implementation, so if you have declared GLSL 1.5, a variable loop length might not be allowed. Uniform flow control also requires new hardware. Did you try it on the same class of hardware (D3D11/GL4.0)?

If that’s the case, then my bad as I must have missed that. However, I have certainly found ATI to be consistent with their implementation of the spec.

I have not been able to test this on ATI DX11/GL4 class hardware. The NVidia hardware I’ve been testing on is DX11 though, so that may explain why it’s working there.

Just so I’m clear though – GL3.2 does not allow for variable loop length? i.e. it’s a 4.0+ only feature?

Sounds fishy. If loops can’t be controlled by run-time values of any kind, then there’s no support for dynamic branching at all.
My bet is on a driver bug.

Btw, it’s the third similar report here for the past week or so.

Only indexing a sampler array with a variable index is a 4.0+ feature, because it needs hardware support. The driver can handle all flow control cases except that one.

If you still have problems, please post the shader.

I’m sorry, it’s my mistake. Loops controlled by uniforms are allowed on SM4 hardware (I have tried it on a GF8600). I thought loops still had to be unrolled by the compiler, in which case the number of iterations must be known at compile time. That is still a limitation of mobile devices and OpenGL ES, but obviously not of desktop GPUs.

And to finish, one historical fact: the ATI Radeon 9500 (R300) did not support loops in fragment shaders at all. :) I bet your card is not that old, and if it supports SM4, there is probably a bug, just like Ilian said.
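
For targets where the iteration count does have to be known at compile time, a common pattern is to loop to a constant upper bound and exit early on the run-time value. The following is only a sketch under assumptions: all names (Source, Step, NumSamples, MAX_SAMPLES, UV, Color) are hypothetical, the bound of 16 is arbitrary, and this pattern does not necessarily sidestep the driver issue discussed in this thread.

```glsl
#version 150

// Hypothetical names throughout; none of these come from the thread's shaders.
uniform sampler2D Source;
uniform vec2 Step;         // UV offset between consecutive taps
uniform int NumSamples;    // run-time count, expected in [1, MAX_SAMPLES]

const int MAX_SAMPLES = 16;  // compile-time bound a compiler can unroll against

in vec2 UV;
out vec4 Color;

void main()
{
    vec4 sum = vec4(0.0);

    // Keep the count inside the range the loop can actually cover.
    int count = clamp(NumSamples, 1, MAX_SAMPLES);

    for (int i = 0; i < MAX_SAMPLES; ++i)
    {
        if (i >= count)
            break;  // early exit on the uniform-derived value
        sum += texture(Source, UV + float(i) * Step);
    }

    Color = sum / float(count);
}
```

The trade-off is that the constant bound caps the usable sample count, so it has to be chosen for the largest value the application will ever pass in.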

Thanks for the feedback guys. And yes, it still does not work.

The shader itself is very simple. It’s literally a for loop fetching some texels and computing an average. The count is defined by a uniform variable rather than a constant. I don’t have the code in front of me at the moment, but can post it tonight when I get home.

It’s just the case I mentioned: only indexing a sampler array with a variable index is a 4.0+ feature. You need to run it on the HD5000+ series. Otherwise the driver will report something like “indirect index to sampler array is not supported on the asic.”

Just so we’re clear on terminology – it’s a for loop fetching texels from a texture (single sampler), not a sampler array.

Here’s the code:

void main()
{
    vec4 C = texture( TEXTURE_0, LerpUV );

    vec2 V = DF_ComputePixelVelocity( TEXTURE_1, TEXTURE_2, LerpUV ) * MotionBlurInfo.y;
    
    vec2 BlurUV = LerpUV + V;
    
    const int NUM_SAMPLES = 8; // If this comes from a uniform, it doesn't work
    
    for( int i = 1; i < NUM_SAMPLES; i++, BlurUV += V )
        C.rgb += texture( TEXTURE_0, BlurUV ).rgb;
    
    C.rgb /= float( NUM_SAMPLES );

    OutC = vec4( C.rgb, 1 );
}

The shader looks good to me. I compiled it after commenting out the DF_ComputePixelVelocity call, and it works. Which hardware and driver are you using?

Yes, that shader as posted will indeed work. If you look at the comment on the line where I assign NUM_SAMPLES, you’ll see that it breaks (i.e. does not work) if the NUM_SAMPLES value comes from a uniform variable instead of an inline constant. Note, however, that even when NUM_SAMPLES is assigned a value from a uniform, the ATI shader log indicates that everything compiled and linked successfully.

I’m running either the latest or one version older than the latest ATI drivers.

The bug can be reproduced now. It’s related to uniform blocks and was fixed recently, but you will have to wait about three months for the new driver.
In the meantime, there are two workarounds to avoid the failure:

  1. Do not use a variable loop length, as you said.
  2. Use a plain uniform instead of a uniform block.
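
As a rough illustration of workaround 2, the loop count can be declared as a plain uniform in the default block instead of as a member of a named uniform block. This is a sketch, not the original shader: BlurParams, NumSamples, Source, Step, UV and Color are hypothetical names, and the commented-out block layout is an assumption about how the value was declared.

```glsl
#version 150

// Hypothetical names: BlurParams, NumSamples, Source, Step, UV, Color.

// Variant A (the case reported as affected): loop count inside a uniform block.
// layout(std140) uniform BlurParams
// {
//     int NumSamples;
// };

// Variant B (workaround 2): the same value as a plain, default-block uniform.
uniform int NumSamples;

uniform sampler2D Source;
uniform vec2 Step;   // UV offset between consecutive taps

in vec2 UV;
out vec4 Color;

void main()
{
    vec4 sum = vec4(0.0);
    int count = max(NumSamples, 1);  // guard against division by zero

    for (int i = 0; i < count; ++i)
        sum += texture(Source, UV + float(i) * Step);

    Color = sum / float(count);
}
```

On the application side, the plain uniform is set with glUniform1i on its location rather than through a buffer object bound to the block.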

Sorry for the inconvenience.

I’m not sure it is related to uniform blocks.

I’m also trying to use a for loop with a uniform condition (specified in the general block) in a fragment shader (with a sampler2DArray), with #version 400 defined, on a HD5770 and it doesn’t work. It compiles fine, but just doesn’t iterate the loop at all. If I use a constant expression, it loops ok.



in vec2 TexCoords;

uniform sampler2D DefaultDetailTexture;

const int MaxDetailTextures = 12;

uniform int NumDetailTextures;
uniform sampler2DArray AlphaTextureArray;
uniform sampler2DArray DetailTextureArray;

void main(void) {
	float alpha_accum = 0.0;
	vec3 detail_color = vec3(0.0);
	for (int i = 0; i < MaxDetailTextures; ++i) {
		//if (i >= NumDetailTextures || alpha_accum >= 1.0) {
		//	break;
		//}
		
		float alpha = texture(AlphaTextureArray, vec3(TexCoords, i)).r;
		
		detail_color += min(alpha, 1.0 - alpha_accum) * texture(DetailTextureArray, vec3(TexCoords, i)).rgb;
		alpha_accum = min(1.0, alpha_accum + alpha);
	}
	detail_color += (1.0 - alpha_accum) * texture(DefaultDetailTexture, TexCoords).rgb;
	
	gl_FragColor = vec4(detail_color, 1.0);
}


If I use NumDetailTextures instead of MaxDetailTextures as the loop bound, or uncomment the condition with the break in it, it doesn’t work.

It’s definitely not related to uniform block.

Are the results the same (both wrong) when you use NumDetailTextures or uncomment the if condition?

Yep. In both cases the shader compiles and runs, but doesn’t seem to execute any code in the loop.

So what’s the status on this, Frank? Have you guys been able to pinpoint exactly what the problem is? I think it’s clear at this point that there is definitely a problem with the compiler, at the very least with for loops whose bounds come from uniforms rather than constants.

I just got a chance to take a look at it, and it can’t be reproduced. Below is some of the code:

glGenTextures(2, tex);

glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, tex[0]);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 4, 4, 0, GL_RGBA, GL_UNSIGNED_BYTE, _texture1);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_REPLACE);

glActiveTexture(GL_TEXTURE1);
glBindTexture(GL_TEXTURE_2D_ARRAY, tex[1]);
glTexImage3D(GL_TEXTURE_2D_ARRAY, 0, GL_RGBA, 4, 4, 2, 0, GL_RGBA, GL_UNSIGNED_BYTE, _texture2);
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_REPLACE);
GLuint loc;
loc = glGetUniformLocationARB(p, "DefaultDetailTexture");
glUniform1iARB(loc, 0);
loc = glGetUniformLocationARB(p, "AlphaTextureArray");
glUniform1iARB(loc, 1);
loc = glGetUniformLocationARB(p, "DetailTextureArray");
glUniform1iARB(loc, 1);
loc = glGetUniformLocationARB(p, "NumDetailTextures");
glUniform1iARB(loc, 8);

No matter how I change the shader, the results on AMD and NVIDIA are the same. My environment is HD5770 + Vista + Catalyst 10.6.

If you still have problems, please send your program to me at frank.li@amd.com

Well, I tried making a minimal example, and found that I couldn’t reproduce the bug either. Then I went back to my original code and put the uniform loop conditional back in, and it worked fine.

Not a clue what I was doing wrong. Sorry for the waste of time.

It turns out the problem is still here, but it’s happening intermittently.

When I turn my computer on, the uniform conditional in my main program doesn’t work. I then run my attempt at a minimal reconstruction of the problem and that does work. Then I go back to the main program and run it, and it now works fine!

I don’t really have a clue how to track this down further.

It has not been intermittent for me – it has been 100%. I sent Frank a copy of the shader program source that was doing it for me. Not sure if he was able to test with that or not.

For now I’m getting by with hard-coding constants instead of using uniforms. Unfortunately, that is not (and cannot be) a permanent solution.