Skeletal animation shader: need suggestions

I’ve created my first GLSL shader related to skeletal animation.
http://www.igrodel.ru/files/skeletal12.zip (exe+source in Turbo Delphi)
Some additional files are required to compile this sample in Turbo Delphi: http://www.igrodel.ru/files/igrodel10.zip

vertex:

uniform mat4 boneMat[32];
varying float texNum;

void main(void)
{
    float boneIndex[3];
    float boneWeight[3];

    // The texture number travels in the w component of the position.
    texNum = gl_Vertex[3];

    vec4 fixedTexCoord = gl_MultiTexCoord0;
    vec4 fixedColor = gl_Color;
    vec4 fixedVertex = gl_Vertex;

    vec4 finalVertex = vec4(0,0,0,1);

    // Bone indices are stored as index/255.0 and bone weights as plain floats:
    // one pair in texcoord z/w, two pairs in the color channels.
    boneIndex[0] = floor(fixedTexCoord[2]*255.0+0.001);
    boneWeight[0] = fixedTexCoord[3];
    boneIndex[1] = floor(fixedColor[0]*255.0+0.001);
    boneWeight[1] = fixedColor[1];
    boneIndex[2] = floor(fixedColor[2]*255.0+0.001);
    boneWeight[2] = fixedColor[3];

    // Restore the channels that were borrowed for skinning data.
    fixedTexCoord[2] = 0.0;
    fixedTexCoord[3] = 1.0;
    fixedColor[0] = 1.0;
    fixedColor[1] = 1.0;
    fixedColor[2] = 1.0;
    fixedColor[3] = 1.0;
    fixedVertex[3] = 1.0;

    // Blend the three bone matrices by their weights.
    mat4 finalMatrix = mat4(0);

    for (int i = 0; i < 3; i++)
        finalMatrix += boneWeight[i]*boneMat[int(boneIndex[i])];

    finalVertex = finalMatrix*fixedVertex;

    finalVertex[3] = 1.0;

    gl_Position = gl_ModelViewProjectionMatrix * finalVertex;
    gl_FrontColor = fixedColor;
    gl_TexCoord[0] = fixedTexCoord;
}

fragment:

uniform sampler2D myTexture0;
uniform sampler2D myTexture1;
uniform sampler2D myTexture2;
uniform sampler2D myTexture3;
uniform sampler2D myTexture4;
uniform sampler2D myTexture5;
uniform sampler2D myTexture6;
uniform sampler2D myTexture7;

varying float texNum;

void main(void)
{
 float texNum2 = floor(texNum*255.0-1.0+0.001); 

 if (texNum2==0.0)
  gl_FragColor = texture2D( myTexture0, gl_TexCoord[0].st ); 
 else if (texNum2==1.0)
  gl_FragColor = texture2D( myTexture1, gl_TexCoord[0].st ); 
 else if (texNum2==2.0)
  gl_FragColor = texture2D( myTexture2, gl_TexCoord[0].st ); 
 else if (texNum2==3.0)
  gl_FragColor = texture2D( myTexture3, gl_TexCoord[0].st ); 
 else if (texNum2==4.0)
  gl_FragColor = texture2D( myTexture4, gl_TexCoord[0].st ); 
 else if (texNum2==5.0)
  gl_FragColor = texture2D( myTexture5, gl_TexCoord[0].st ); 
 else if (texNum2==6.0)
  gl_FragColor = texture2D( myTexture6, gl_TexCoord[0].st ); 
 else if (texNum2==7.0)
  gl_FragColor = texture2D( myTexture7, gl_TexCoord[0].st ); 
}

I need suggestions on how to make it work with ATI cards. Some people have told me that it does not work there.

I am also interested in what you think of my shader's performance relative to what you’ve seen before.

And the last MYSTERIOUS THING:
When I use “…/255…” in my program and do “…*255.0+0.001…” in the shader, performance is higher than without this bone/texture index “normalisation”. Why?

Works fine on my X1800XL.

Ha! It works! Very good!!

Any ideas about the FPS drop?
When I pass uniforms in the range 0.0…1.0 I get maximum speed. When I use the same float uniforms in the range 0.0…255.0, rendering a large number of models becomes slower (30 fps less). I see this on my NVIDIA 7300GS.

So before passing a bone index I “normalize” it to the 0…1 range (boneIndex/255.0) and multiply it back by 255 in the shader (*255.0). It looks like the 0…1 range is a little faster than values above 1.0.
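The encode/decode round trip described above can be sketched as a small helper. This is only an illustration, not the poster's code: `decodeIndex` is a hypothetical name, and it rounds with +0.5 rather than the +0.001 used in the original shader, on the assumption that the application stored exactly index/255.0.

```glsl
uniform mat4 boneMat[32];

// Hypothetical helper: recovers an integer index that the application
// stored as index / 255.0 in a normalized attribute channel.
int decodeIndex(float encoded)
{
    // Adding 0.5 before floor() rounds to the nearest integer, so a small
    // error in either direction still yields the intended index.
    return int(floor(encoded * 255.0 + 0.5));
}

void main(void)
{
    // Single-bone case only, to keep the sketch short.
    int bone = decodeIndex(gl_MultiTexCoord0.z);
    vec4 skinned = boneMat[bone] * vec4(gl_Vertex.xyz, 1.0);
    gl_Position = gl_ModelViewProjectionMatrix * skinned;
}
```

The sketch does not explain the speed difference between the two ranges; it only shows the decoding step in isolation.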

Isn’t this wrong?
texNum = gl_Vertex[3];

Originally posted by V-man:
Isn’t this wrong?
texNum = gl_Vertex[3];

gl_Vertex[3] is equivalent to gl_Vertex.w.

That’s confusing because with draw buffers, we can write to gl_FragData[X] and this works like an array instead of component selection.

Yeah, because that is an array. :wink:
You should be able to do gl_FragData[X][3] to access its w component.

Indexing components this way would only be useful in loops, but current hardware cannot handle it with dynamic loop variables, only with compile-time constant expressions, because there is no address register or “swizzle register” for that.
Use suffixes like .w to make the code clearer.
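As a minimal sketch of the suffix style recommended here, the start of the vertex shader could be written with component names instead of numeric indices. The variable `position` is introduced only for this illustration.

```glsl
varying float texNum;

void main(void)
{
    // Same value as gl_Vertex[3], written with a component suffix.
    texNum = gl_Vertex.w;

    // Rebuild the position with w = 1.0 before transforming it,
    // since w was borrowed to carry the texture number.
    vec4 position = vec4(gl_Vertex.xyz, 1.0);
    gl_Position = gl_ModelViewProjectionMatrix * position;
}
```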

Ha! It works! Very good!!
Yes, but on an X1800, which is shader model 3.0 hardware. It’s very likely that you will run into problems, and almost certainly a performance drop, on shader model 2.0 hardware such as Radeon 9K, Radeon X and GeForce FX, because you use a lot of branching in your fragment shader.
And by the way - I don’t know if you used 8 2D textures for a reason (mipmapping?), but you could use a 3D texture and say goodbye to performance problems on shader model 2.0 hardware.
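The 3D texture idea might look roughly like the fragment shader below. It is a sketch under several assumptions that are not in the thread: the eight images have the same size and format, they are uploaded as the eight slices of one 3D texture, and the uniform name `myTextures` is hypothetical.

```glsl
// Assumption: one 3D texture whose 8 depth slices hold the 8 images.
uniform sampler3D myTextures;

varying float texNum;

void main(void)
{
    // Same decoding as the original fragment shader: slice number 0..7.
    float slice = floor(texNum * 255.0 - 1.0 + 0.001);

    // Sample at the center of the slice so that linear filtering
    // along the depth axis does not blend in the neighbouring slices.
    float r = (slice + 0.5) / 8.0;

    gl_FragColor = texture3D(myTextures, vec3(gl_TexCoord[0].st, r));
}
```

This replaces the if/else chain with a single lookup. Note that mipmaps of a 3D texture also shrink along the depth axis, which would mix the slices, so this sketch assumes mipmapping is not used.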

I used 8 2D textures so that I can render in one call. In this demo we can render models textured with 1 to 8 textures. I do not know which is faster:

  • sorting polygons by texture index and then doing N draw calls (for a model textured with N textures)
  • doing 1 draw call and choosing the texture in the fragment shader

Can I do an array of sampler2D uniforms?

uniform sampler2D myTexture[8];

instead of

uniform sampler2D myTexture0;
uniform sampler2D myTexture1;
uniform sampler2D myTexture2;
uniform sampler2D myTexture3;
uniform sampler2D myTexture4;
uniform sampler2D myTexture5;
uniform sampler2D myTexture6;
uniform sampler2D myTexture7;

I tried to declare an array of sampler2D uniforms, but it does not compile.

An array would help a lot: we could move the if / else if chain to the vertex shader. In that case the “IF” block becomes per-vertex.

At the moment the “IF” block runs per-pixel. Yes, it adds some overhead. I am just a newbie in GLSL. Is it possible to declare an array of sampler2D uniforms?

  • sorting polygons by texture index and then doing N draw calls (for a model textured with N textures)
    This is how it is usually done. It works much faster on old hardware when the object is near the camera (takes up a lot of screen space).
  • doing 1 draw call and choosing the texture in the fragment shader
    Dynamic branching is only supported on shader model 3.0 hardware (GeForce 6, Radeon X1k). There it will work fast enough, but on shader model 2.0 hardware it will be much slower because it will always sample all 8 textures for every pixel. It will work fine if the object is far away.
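To illustrate why the single-draw-call version is expensive without dynamic branching, here is a hand-flattened sketch of the original fragment shader. It is only an approximation of what such hardware ends up executing, not actual compiler output: every texture is fetched and all but one result is multiplied by zero.

```glsl
uniform sampler2D myTexture0;
uniform sampler2D myTexture1;
uniform sampler2D myTexture2;
uniform sampler2D myTexture3;
uniform sampler2D myTexture4;
uniform sampler2D myTexture5;
uniform sampler2D myTexture6;
uniform sampler2D myTexture7;

varying float texNum;

void main(void)
{
    float texNum2 = floor(texNum * 255.0 - 1.0 + 0.001);
    vec2 uv = gl_TexCoord[0].st;

    // All 8 fetches happen for every pixel; float(bool) is 1.0 or 0.0,
    // so only the matching texture contributes to the result.
    vec4 color = vec4(0.0);
    color += texture2D(myTexture0, uv) * float(texNum2 == 0.0);
    color += texture2D(myTexture1, uv) * float(texNum2 == 1.0);
    color += texture2D(myTexture2, uv) * float(texNum2 == 2.0);
    color += texture2D(myTexture3, uv) * float(texNum2 == 3.0);
    color += texture2D(myTexture4, uv) * float(texNum2 == 4.0);
    color += texture2D(myTexture5, uv) * float(texNum2 == 5.0);
    color += texture2D(myTexture6, uv) * float(texNum2 == 6.0);
    color += texture2D(myTexture7, uv) * float(texNum2 == 7.0);

    gl_FragColor = color;
}
```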

So, if you want good performance on all GPUs, sort your polygons by texture.
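With polygons sorted by texture, the fragment shader no longer needs to select anything. A minimal sketch, assuming the application binds the right texture before each of the N draw calls and that the single sampler is named `myTexture` (a hypothetical name):

```glsl
// Assumption: the application binds one texture per draw call.
uniform sampler2D myTexture;

void main(void)
{
    // One fetch per pixel, no branching on a texture number.
    gl_FragColor = texture2D(myTexture, gl_TexCoord[0].st);
}
```

The cost moves from the fragment shader to the application, which has to issue one draw call per texture.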
