GLSL dynamic looping?



Golgoth
05-15-2006, 10:07 PM
Hi all!

I'm having a rough time dealing with GLSL dynamic looping on a GeForce 7800 with NVIDIA 84.21 drivers!

Once per frame I'm sending how many lights the shader should go through... in this example I send 4 as Count...


uniform int Count;

void main()
{
    for (int i = 0; i < Count; ++i)
    {
        ...
    }
}

No matter which way I turn this around, and I did try a lot of different combinations, it is simply not working!

Interesting point here:


uniform int Count;

void main()
{
    if (Count == 4)
    {
        ... // <---- it goes here and we have a winner!
    }
}

I'm clueless. Some are convinced it can be done with SM 3.0 and HLSL... I'm using GLSL and it does not work for me...

Any hint would be really appreciated!

Thx

Humus
05-15-2006, 11:12 PM
Are you by any chance trying to index into uniforms with your loop counter? That's not supported by today's hardware, so I could imagine that might cause problems, but it should of course result in software rendering instead.

Golgoth
05-15-2006, 11:45 PM
Thx Humus!


Are you by any chance trying to index into uniforms with your loop counter?
Well, I'm using glUniform1i to send the data from the FFP to the shader... I've tried with constants as well... pretty much every possible combination... Someone in another thread suggested pre-compiling many versions of the shader... tbh, I just can't visualize this. How would I do that efficiently?

Jan
05-16-2006, 01:22 AM
Is it possible to copy the value of the uniform into another var and then to use that as loop-variable?

Golgoth
05-16-2006, 09:57 AM
Is it possible to copy the value of the uniform into another var and then to use that as loop-variable?
I've tried a zillion times... negative!

Jan
05-16-2006, 03:16 PM
@Humus: I saw a demo, that stored a bone-count at each vertex and used that as loop-count. My "old" SM 2.0 hw couldn't do that. I assume, that at least this kind of looping is possible on SM 3.0 hardware?

But what about all the demos of NV and ATI, that "showed off" a single shader calculating multiple lights? Was all that stuff only about instruction counts? I always thought the amount of lights to use was passed by a uniform...

Jan.

Humus
05-16-2006, 03:27 PM
Originally posted by Golgoth:
Well, I'm using glUniform1i to send the data from the FFP to the shader...
What I meant was: is your code doing something like this, for instance?


uniform vec4 array[64];

...

vec4 sum = vec4(0.0);
for (int i = 0; i < Count; i++){
    sum += array[i];
}

The problem here is that we access "array" with a non-constant index. This is not supported in hardware in the fragment shader on today's GPUs. If Count was a compile time constant, then this would work fine since the loop would be unrolled. If not, you'll have to pass the array as a texture and sample in the loop to get the value.

Humus
05-16-2006, 03:30 PM
Originally posted by Jan:
@Humus: I saw a demo, that stored a bone-count at each vertex and used that as loop-count. My "old" SM 2.0 hw couldn't do that. I assume, that at least this kind of looping is possible on SM 3.0 hardware?
SM3.0 hardware can do that in the vertex shader, but not in the fragment shader. It's an unfortunate weakness of SM3.0, which thankfully will be gone in SM4.0, which will have an identical instruction set (to the extent it makes sense) in both the vertex and fragment shaders.

Golgoth
05-16-2006, 03:34 PM
which thankfully will be gone in SM4.0
Is that a fact?

Will SM 4.0 work on the GeForce 7800 series, or will I have to wait for another generation of video cards?

Humus
05-17-2006, 12:35 AM
SM4.0 is DX10, so no current card supports that.

Golgoth
05-17-2006, 12:30 PM
If Count was a compile time constant, then this would work fine since the loop would be unrolled. If not, you'll have to pass the array as a texture and sample in the loop to get the value.
Hmm... I'm not familiar with this technique... do you mean something like:

uniform sampler2D Count;

void main()
{
    for (int i = 0; i < Count; i++)
    {
    }
}

and sample in the loop to get the value... What do you mean by that? Does "Count" have to be part of the final result?

Humus
05-17-2006, 04:40 PM
uniform sampler1D Array;
uniform float invWidth;

uniform int Count;

void main(){
    for (int i = 0; i < Count; i++){
        vec4 arrayElement = texture1D(Array, (float(i) + 0.5) * invWidth);
        ...
    }
}
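
On the application side, the array has to be packed into a 1D texture and invWidth has to be set to 1 / texture width. The following is only a minimal sketch of that bookkeeping in plain C, under assumptions that are not in the post above: each element is four floats (RGBA), and the helper names pack_array_texels, inv_width and texel_center are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy `count` RGBA elements (4 floats each) into a
   buffer that is `width` texels wide, zero-filling the unused tail. */
void pack_array_texels(const float *src, size_t count,
                       float *dst, size_t width)
{
    assert(src != NULL && dst != NULL && count <= width);
    for (size_t i = 0; i < width * 4; ++i)
        dst[i] = (i < count * 4) ? src[i] : 0.0f;
}

/* Value to send for the shader's invWidth uniform. */
float inv_width(size_t width)
{
    assert(width > 0);
    return 1.0f / (float)width;
}

/* Same formula as the shader: the coordinate of the center of texel i. */
float texel_center(size_t i, float invWidth)
{
    return ((float)i + 0.5f) * invWidth;
}
```

The packed buffer could then be uploaded with glTexImage1D. A floating-point internal format and GL_NEAREST filtering are probably what you want so that neighbouring elements are not blended, but whether a float format is available depends on the hardware and extensions, so treat that part as an assumption.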

Golgoth
05-17-2006, 06:22 PM
Thx, I'm reading the last post over and over and I can't figure out what you are suggesting:


void main()
{
    for (int i = 0; i < Count; ++i)
    {
        // Calculate attenuation.
        l_distance = distance(v_Vertex, gl_LightSource[i].position.xyz);
        l_lightAtt = 1.0 /
        (
            gl_LightSource[i].constantAttenuation +
            gl_LightSource[i].linearAttenuation * l_distance +
            gl_LightSource[i].quadraticAttenuation * l_distance * l_distance
        );
    }
    ...
    );
}

I can't wait to have this working. Can anyone help clear this up?

Humus
05-18-2006, 01:08 AM
Well, that confirms that the problem is indeed what I suggested. You're using your loop variable to index into gl_LightSource[]. Current hardware doesn't support that. Easiest would probably be to remove that "uniform int Count" and compile 8 different versions of the shader with "#define Count 1", "#define Count 2" etc. added to the beginning of the shader.

Golgoth
05-18-2006, 10:10 AM
Alright, thx again Humus, I'm up for any voodoo stuff... I'll probably have to do the same thing for textures... sigh...


compile 8 different versions of the shader with "#define Count 1", "#define Count 2" etc. added to the beginning of the shader.
Interesting, OK, so I've added:

#define Count 1;
#define Count 2;
#define Count 3;
#define Count 4;
#define Count 5;
#define Count 6;
#define Count 7;
#define Count 8;

at the beginning of the shader and removed:
uniform int Count;

And? Is that the way to compile the same shader many times? What else do I have to do?

Where can I find any info on this?

Relic
05-18-2006, 10:31 AM
You misunderstood and that doesn't translate.
First you can remove the semicolon at the end of the #define.
You need only one line with
#define Count x
and then replace the character at "x"'s position in the source string to 1 for 1 light, translate, then 2 for 2 lights, translate another, and so on until you have your eight shader programs.
Generating shaders this way can easily be done programmatically with string manipulations.

Golgoth
05-18-2006, 10:48 AM
Hmm... are you suggesting that I need 8 different shaders written to the hard drive? If yes, what do we need the #define for?

If no, I'm totally lost... can you give an example, please?

Relic
05-18-2006, 11:36 AM
That'd work but you can do it with one shader on the harddrive.
Let's say you don't add the define in the source, but just print it into it:

Warning pseudo code, doesn't compile. There are more elegant ways with the glShaderSource interface.


// Load the shader however you do it.
// Let char *strMyShaderSource point to the NUL-terminated C string with the source.
// Allocate another string, strFinalSource, which is bigger and can hold the additional line as well.

GLuint programObjs[8];
for (int i = 0; i < 8; i++)
{
    // Put "#define Count <i + 1>" at the beginning of the shader.
    sprintf(strFinalSource, "#define Count %d\n%s", i + 1, strMyShaderSource);
    GLint length = (GLint)strlen(strFinalSource);
    glShaderSource(shaderObj, 1, &strFinalSource, &length);
    glCompileShader(shaderObj);
    programObjs[i] = glCreateProgram();
    glAttachShader(programObjs[i], shaderObj);
    glLinkProgram(programObjs[i]);
}

There are cases where something else needs to be at the top of the source. You can also load a version with the define inside and search for a token you define like "@" and replace that with the right number.
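
The token variant could look roughly like the sketch below. It is plain C string handling with no GL calls, and the function name expand_count_token and its return convention are hypothetical, chosen only for illustration. It replaces the first occurrence of the token with the decimal light count, so the define can sit anywhere in the source, for example after a #version line, which has to come first.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy tmpl into out, replacing the first occurrence
   of `token` with the decimal value of `count`.
   Returns 0 on success, -1 if the token is missing or out is too small. */
int expand_count_token(const char *tmpl, char token, int count,
                       char *out, size_t outSize)
{
    assert(tmpl != NULL && out != NULL && token != '\0');

    const char *pos = strchr(tmpl, token);
    if (pos == NULL)
        return -1;

    /* Text before the token, then the number, then the rest of the source. */
    int prefixLen = (int)(pos - tmpl);
    int written = snprintf(out, outSize, "%.*s%d%s",
                           prefixLen, tmpl, count, pos + 1);

    return (written >= 0 && (size_t)written < outSize) ? 0 : -1;
}
```

With a source string containing "#define Count @", calling this once per light count gives the strings to hand to glShaderSource in the loop above.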

Golgoth
05-18-2006, 12:30 PM
and when setting up lighting have something like:

switch (lightCount)
{
    case 1: glUseProgram(programObjs[0]); break;
    case 2: glUseProgram(programObjs[1]); break;
    ...
}

That could work indeed... Imagine I want to add multi-texturing as well in the same shader... that would be ridiculously insane... I'm so surprised no one ever considered doing this... Forced to go back to the drawing board and revise my rendering loop... I really don't like it... The only other decent way I can figure is to get rid of the gl_[i] variables, send all the light data through uniforms, and use a light struct in the shader (till I figure out how to send it via a texture sampler)... but that is costly and not effective at all... So much fuss for basic stuff... I'm having a nightmare, someone wake me up!

Humus
05-18-2006, 10:25 PM
Well, you could do glUseProgram(programObjs[lightCount-1]) instead. Or glUseProgram(programObjs[lightCount]) if you want to cover the case of zero lights too.

As for multi-texture, it depends on what you mean with that. You can't access sampler arrays with a dynamic index either on current hardware. If that would be dependent on another variable, then you'd need more shaders, but if you access it through "i" as well then you don't need any more shaders.
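
For the zero-lights indexing mentioned above, a small sketch of the lookup. It assumes a table of MAX_LIGHTS + 1 programs where entry n was built with "#define Count n"; MAX_LIGHTS and program_index_for_lights are hypothetical names, and clamping to the largest variant is just one possible policy.

```c
#include <assert.h>

#define MAX_LIGHTS 8

/* Hypothetical helper: index into a table of MAX_LIGHTS + 1 programs,
   where entry n was built with "#define Count n". Counts above the
   maximum are clamped to the largest compiled variant. */
int program_index_for_lights(int lightCount)
{
    assert(lightCount >= 0);
    return (lightCount > MAX_LIGHTS) ? MAX_LIGHTS : lightCount;
}
```

The result would be used as glUseProgram(programObjs[program_index_for_lights(lightCount)]), with programObjs declared with MAX_LIGHTS + 1 entries.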

Relic
05-19-2006, 01:00 AM
Originally posted by Golgoth:
So much fuss for basic stuff... I'm having a nightmare, someone wake me up!
This stuff is basic in the vertex shader, where that lighting state comes from, and none of the above workarounds are needed.
It's a little more advanced inside fragment shaders because they don't support indirect uniform addressing.

Golgoth
05-19-2006, 09:49 AM
Well, you could do glUseProgram(programObjs[lightCount-1]) instead.
Of course!

This stuff is basic in the vertex shader, where that lighting state comes from, and none of the above workarounds are needed.
I've already tried transferring everything into the vertex shader and kept only this in the fragment shader:


It's a little more advanced inside fragment shaders because they don't support indirect uniform addressing.
Advanced??!!! I had another word in mind! :)


As for multi-texture, it depends on what you mean with that.
Well, let me try to give you the big picture. Here's a pseudo FFP render sequence!

Render()
{
    Enable Lighting
    Set Global States
    Set Render States (Including Multi-Texture State)
    Draw primitives with Lighting State
    Disable Lighting
    Set Render States
    Draw primitives without Lighting State
}

How you move this around is not really important here... we just want to play with shaders like blocks of code and move them around, don't we?

First, a question that comes to my mind is: can we expect to do:

If (Lighting)
    SetLightingShader()
If (MultiTexture)
    SetMultiTextureShader()
If (Fog)
    SetFogShader()
If (PolygonOffset)
    SetPolygonOffsetShader()

DrawPrimitive()

Communicate between shaders and draw a single pass?

or one shader = one pass?

In my FFP, as it is for now, each primitive has an array of render states. A shader is considered a render state, and it is always inserted as the first render state in the array. Since it overrides lots, if not all, of the render states... I'm facing choices here...

1 - Replace all RSs with one big shader with all RS and lots of if (condition) in it!
2 - Replace all RSs with an Array of shaders: One shader per RS!
3 - Insert one or more shader(s) among the FFP RSs!

Since we can't process x number of lights and/or textures inside a shader efficiently... can we really get rid of the FFP at the moment?

What would be considered good design when introducing shaders into an FFP?

Humus
05-19-2006, 03:24 PM
You can only have one shader per pass. But what you might want to do is to make a so called über-shader that does all passes at once, if that's possible in your case. So you basically just pass a boolean uniform for those conditions, like "Lighting" and "MultiTexture" etc. and use that in your shader.

Speaking of boolean uniforms, another option for the loop is to do this:


uniform bool count[8];
...

for (int i = 0; i < 8; i++){
    if (count[i]){
        ...
    }
}

If your count is for instance 5, you pass an array of booleans where the first 5 elements are true and the rest are false. In this case, since you have a compile time constant as your loop count the compiler will be able to unroll the loop, and it should also be able to handle the static branching so that you don't do unnecessary work. Depending on instruction count, this might work on PS2.0 hardware too.
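
On the application side, filling that boolean array could be sketched as follows. This is plain C with no GL calls; MAX_LIGHTS and fill_light_flags are hypothetical names, and ints are used because boolean uniforms can be loaded through the integer glUniform variants.

```c
#include <assert.h>

#define MAX_LIGHTS 8

/* Hypothetical helper: set the first activeCount flags to 1 (true)
   and the remaining ones to 0 (false). */
void fill_light_flags(int activeCount, int flags[MAX_LIGHTS])
{
    assert(activeCount >= 0 && activeCount <= MAX_LIGHTS);
    for (int i = 0; i < MAX_LIGHTS; ++i)
        flags[i] = (i < activeCount) ? 1 : 0;
}
```

The filled array could then be sent with something like glUniform1iv(location, MAX_LIGHTS, flags), where location comes from glGetUniformLocation for "count". That call sequence is an assumption about how the surrounding application is set up.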

Golgoth
05-19-2006, 04:49 PM
You can only have one shader per pass.
Slap... Ouch!


But what you might want to do is to make a so called über-shader that does all passes at once
So basically take all FFP RenderStates and stack them into a vertex and a fragment shader... and Slap... Ouch!


If your count is for instance 5, you pass an array of booleans where the first 5 elements are true and the rest are false. In this case, since you have a compile time constant as your loop count the compiler will be able to unroll the loop, and it should also be able to handle the static branching so that you don't do unnecessary work.
Already tried this, and for some hocus-pocus reason the result is this:

It all compiles and links successfully, and the boolean conditions are doing their job fine, but it takes as much processing to go through the entire loop with 1 light enabled as with 8 lights enabled! Try it and tell me I'm taking crazy pills!

zed
05-19-2006, 10:49 PM
So basically take all FFP RenderStates and stack them into a vertex and a fragment shader... and Slap... Ouch!
If you were doing it on the CPU (ideal world) then this is what you'd do. With new ATI cards apparently these sorts of choices are quite fast, and with newer cards from now on this is only gonna get faster (they're becoming more like CPUs).

Jan
05-20-2006, 02:44 AM
zed, could you explain to me why it is going to be this way? Are modern GPUs that fast at branching and that allergic to shader switches, or what is the reason?