GLSL dynamic looping?

Hi all!

I'm having a rough time dealing with GLSL dynamic looping on a GeForce 7800 with NVIDIA 84.21 drivers!

Once per frame I'm sending how many lights the shader should go through… in this example I send 4 as Count…

uniform int Count;

void main()
{
for (int i = 0; i < Count; ++i)
{
...
}
}

No matter how I turn this around, and I did try a lot of different combinations, it simply doesn't work!

Interesting point here:

uniform int Count;

void main()
{
if (Count == 4)
{
… <---- it goes here and we have a winner!
}
}

I'm clueless. Some are convinced it can be done with SM 3.0 and HLSL… I'm using GLSL and it does not work for me…

Any hint would be really appreciated!

Thx

Are you by any chance trying to index into uniforms with your loop counter? That’s not supported by today’s hardware, so I could imagine that might cause problems, but it should of course result in software rendering instead.

Thx Humus!

Are you by any chance trying to index into uniforms with your loop counter?

Well, I'm using glUniform1i to send the data from the FFP to the shader… I've tried with constants as well… pretty much every possible combination… someone in another thread suggested pre-compiling many versions of the shader… tbh, I just can't visualize this, or how I would do that efficiently!

Is it possible to copy the value of the uniform into another var and then to use that as loop-variable?

Is it possible to copy the value of the uniform into another var and then to use that as loop-variable?
I've tried a zillion times… negative!

@Humus: I saw a demo, that stored a bone-count at each vertex and used that as loop-count. My “old” SM 2.0 hw couldn’t do that. I assume, that at least this kind of looping is possible on SM 3.0 hardware?

But what about all the demos of NV and ATI, that “showed off” a single shader calculating multiple lights? Was all that stuff only about instruction counts? I always thought the amount of lights to use was passed by a uniform…

Jan.

Originally posted by Golgoth:
Well, I'm using glUniform1i to send the data from the FFP to the shader…
What I meant was, is your code doing something like for instance this:

uniform vec4 array[64];

...

vec4 sum = vec4(0.0);
for (int i = 0; i < Count; i++){
    sum += array[i];
}

The problem here is that we access “array” with a non-constant index. This is not supported in hardware in the fragment shader on today’s GPUs. If Count was a compile time constant, then this would work fine since the loop would be unrolled. If not, you’ll have to pass the array as a texture and sample in the loop to get the value.

Originally posted by Jan:
@Humus: I saw a demo, that stored a bone-count at each vertex and used that as loop-count. My “old” SM 2.0 hw couldn’t do that. I assume, that at least this kind of looping is possible on SM 3.0 hardware?
SM3.0 hardware can do that in the vertex shader, but not in the fragment shader. It’s an unfortunate weakness of SM3.0, which thankfully will be gone in SM4.0, which will have an identical instruction set (to the extent it makes sense) in both the vertex and fragment shaders.

which thankfully will be gone in SM4.0

is that a fact?

Will SM 4.0 work on the GeForce 7800 series, or will I have to wait for another generation of video cards?

SM4.0 is DX10, so no current card supports that.

If Count was a compile time constant, then this would work fine since the loop would be unrolled. If not, you’ll have to pass the array as a texture and sample in the loop to get the value.

Hmm… I'm not familiar with this technique… are you referring to something like:

uniform sampler2D Count;

void main()
{
for (int i = 0; i < Count; i++)
{
}
}

“and sample in the loop to get the value”… what do you mean by that? Does “Count” have to be part of the final result?

uniform sampler1D Array;
uniform float invWidth;

uniform int Count;

void main(){
	for (int i = 0; i < Count; i++){
		vec4 arrayElement = texture1D(Array, (float(i) + 0.5) * invWidth);
		...
	}
}
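For the host side of this texture lookup, here is a minimal C sketch, not taken from the posts above. The helper names (texel_center, nearest_texel, pack_array) are hypothetical, and it assumes nearest filtering and an RGBA float texture format are available. It mirrors the (float(i) + 0.5) * invWidth expression and shows that sampling at the texel centre lands on element i.

```c
#include <stddef.h>

/* Texture coordinate of the centre of texel i in a 1D texture of the given
   width; mirrors (float(i) + 0.5) * invWidth from the shader. */
float texel_center(int i, int width)
{
    return ((float)i + 0.5f) / (float)width;
}

/* Texel that nearest filtering would pick for a coordinate s in [0, 1).
   Assumption: the texture uses nearest filtering. */
int nearest_texel(float s, int width)
{
    return (int)(s * (float)width);
}

/* Hypothetical helper: copy count RGBA elements (4 floats each) from src
   into a zero-padded buffer of width texels, ready for upload as a 1D
   texture. Elements beyond width are dropped. */
void pack_array(const float *src, int count, float *dst, int width)
{
    for (int i = 0; i < width * 4; i++)
        dst[i] = 0.0f;
    for (int i = 0; i < count && i < width; i++)
        for (int c = 0; c < 4; c++)
            dst[i * 4 + c] = src[i * 4 + c];
}
```

The packed buffer would then be uploaded with something like glTexImage1D, and invWidth set to 1.0 / width through glUniform1f.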

Thx, I'm reading the last post over and over and I can't figure out what you are suggesting:

void main()
{
	for (int i = 0; i < Count; ++i)
	{
		// Calculate attenuation.
		l_distance = distance(v_Vertex, gl_LightSource[i].position.xyz);
		l_lightAtt = 1.0 /
		(
			gl_LightSource[i].constantAttenuation +
			gl_LightSource[i].linearAttenuation    * l_distance +
			gl_LightSource[i].quadraticAttenuation * l_distance * l_distance
		);
	}
...
	);
}
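As a side note, the attenuation expression in the loop above can be written as a small CPU-side reference in C, which might help when checking shader output against known values. This is only a sketch; the function and parameter names are made up for illustration.

```c
/* Distance attenuation: 1 / (kc + kl * d + kq * d * d), where kc, kl and kq
   stand for the constant, linear and quadratic attenuation factors of a
   light and d is the distance from the light to the vertex. */
float attenuation(float kc, float kl, float kq, float d)
{
    return 1.0f / (kc + kl * d + kq * d * d);
}
```

With kc = 1 and kl = kq = 0 the light is not attenuated at all, whatever the distance.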

I can't wait to have this working. Can anyone help clear this up?

Well, that confirms that the problem is indeed what I suggested. You’re using your loop variable to index into gl_LightSource[]. Current hardware doesn’t support that. Easiest would probably be to remove that “uniform int Count” and compile 8 different versions of the shader with “#define Count 1”, “#define Count 2” etc. added to the beginning of the shader.

Alright, thx again Humus, I'm up for any voodoo stuff… I'll probably have to do the same thing for textures… sigh…

compile 8 different versions of the shader with “#define Count 1”, “#define Count 2” etc. added to the beginning of the shader.

Interesting, ok so I've added:

#define Count 1;
#define Count 2;
#define Count 3;
#define Count 4;
#define Count 5;
#define Count 6;
#define Count 7;
#define Count 8;

at the beginning of the shader and removed:
uniform int Count;

And? Is that the way to compile the same shader many times? What else do I have to do?

where can I find any info on this?

You misunderstood and that doesn’t translate.
First you can remove the semicolon at the end of the #define.
You need only one line with
#define Count x
and then replace the character at “x”'s position in the source string with 1 for 1 light, translate, then 2 for 2 lights, translate another, and so on until you have your eight shader programs.
Generating shaders this way can easily be done programmatically with string manipulations.

Hmmm… are you suggesting that I need 8 different shaders written to the hard drive? If yes, what do we need a #define for?

If no, I'm totally lost… can you give an example please?

That’d work but you can do it with one shader on the harddrive.
Let’s say you don’t add the define in the source, but just print it into it:

Warning: pseudo code, doesn’t compile as is. There are more elegant ways with the glShaderSource interface.

// Load the shader however you do it.
// Let char *strMyShaderSource point to the NUL-terminated C string with the source.
// Allocate another string, strFinalSource, which is bigger and can hold the additional line as well.
// shaderObj is a shader object created beforehand with glCreateShader().

GLuint programObjs[8];
for (int i = 0; i < 8; i++)
{
  // Put "#define Count <i + 1>" at the beginning of the shader.
  sprintf(strFinalSource, "#define Count %d\n%s", i + 1, strMyShaderSource);
  GLint length = (GLint)strlen(strFinalSource);
  glShaderSource(shaderObj, 1, (const GLchar **)&strFinalSource, &length);
  glCompileShader(shaderObj);
  // Each program keeps the executable it was linked with,
  // so shaderObj can be recompiled for the next variant.
  programObjs[i] = glCreateProgram();
  glAttachShader(programObjs[i], shaderObj);
  glLinkProgram(programObjs[i]);
}

There are cases where something else needs to be at the top of the source. You can also load a version with the define inside and search for a token you define like “@” and replace that with the right number.
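The token-replacement variant could look roughly like the following C sketch. The function name replace_token is hypothetical, and it only replaces the first occurrence of the token; it is meant as an illustration of the string manipulation, not as the one right way to do it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of src in which the first occurrence of
   token is replaced by the decimal value of count. Returns NULL if the
   token is not found or the allocation fails. The caller frees the result. */
char *replace_token(const char *src, const char *token, int count)
{
    const char *hit = strstr(src, token);
    if (hit == NULL)
        return NULL;

    char number[16];
    int numberLen = snprintf(number, sizeof number, "%d", count);

    size_t prefixLen = (size_t)(hit - src);
    size_t tokenLen = strlen(token);
    size_t suffixLen = strlen(hit + tokenLen);

    char *out = malloc(prefixLen + (size_t)numberLen + suffixLen + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, src, prefixLen);
    memcpy(out + prefixLen, number, (size_t)numberLen);
    /* Copy the rest of the source including its terminating NUL. */
    memcpy(out + prefixLen + (size_t)numberLen, hit + tokenLen, suffixLen + 1);
    return out;
}
```

A source containing the line "#define Count @" would then be expanded once per light count, and each result handed to glShaderSource. This keeps anything that has to come first in the file, such as a #version line, where it is.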

and when setting up lighting have something like:

switch (lightCount)
{
case 1: glUseProgram(programObjs[0]); break;
case 2: glUseProgram(programObjs[1]); break;
// ... and so on for the remaining light counts.
}

That could work indeed… but imagine I want to add multi-texturing as well in the same shader… that would be ridiculously insane… I'm so surprised no one ever considered doing this… I'm forced to go back to the drawing board and revise my rendering loop… I really don't like it… the only other decent way I can figure is to get rid of the gl_[i] variables, send every light's data through uniforms and use a light struct in the shader (until I figure out how to send it via a texture sampler)… but that is costly and not effective at all… so much fuss for basic stuff… I'm having a nightmare, someone wake me up!

Well, you could do glUseProgram(programObjs[lightCount-1]) instead. Or glUseProgram(programObjs[lightCount]) if you want to cover the case of zero lights too.
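A minimal C sketch of that index lookup, assuming the program array has one slot per light count starting at zero lights (so 9 entries for 0 to 8 lights). The helper name program_index is hypothetical, and the clamping is an addition for safety rather than something from the posts above.

```c
/* Map a light count to an index into an array of numPrograms program
   objects, where slot 0 holds the zero-light variant. Counts outside the
   valid range are clamped so the lookup never leaves the array. */
int program_index(int lightCount, int numPrograms)
{
    if (lightCount < 0)
        return 0;
    if (lightCount >= numPrograms)
        return numPrograms - 1;
    return lightCount;
}
```

Usage would be along the lines of glUseProgram(programObjs[program_index(lightCount, 9)]).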

As for multi-texture, it depends on what you mean by that. You can’t access sampler arrays with a dynamic index either on current hardware. If that index depended on another variable, then you’d need more shaders, but if you access it through “i” as well then you don’t need any more shaders.