I need suggestions on how to make it work with ATI cards. Some people have told me that it does not work there.
I am also interested in what you think about the performance of my shader compared to what you’ve seen before.
And the last MYSTERIOUS THING:
when I use “…/255…” in my program and “…*255.0+0.001…” in the shader, performance is higher than without this bone/texture index “normalisation”. Why?
Maybe someone has ideas about the FPS drop?
When I pass uniforms in the range 0.0…1.0 I get maximum speed. When I pass the same float uniforms in the range 0.0…255.0, rendering a huge number of models becomes slower (30 fps less). I see this on my NVIDIA 7300GS.
So before passing a bone index I “normalise” it to the 0…1 range (boneIndex/255.0) and multiply this uniform by 255 in the shader (*255.0). It looks like the 0…1 range is a little faster than values above 1.0.
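For reference, here is a minimal host-side sketch of the round trip described above. It only mirrors the arithmetic in C++ and says nothing about why the GPU is faster with one range or the other. The function names `encodeIndex` and `decodeIndex` are made up for illustration; the shader-side decode is assumed to be something like `int(value * 255.0 + 0.001)`.

```cpp
#include <cassert>
#include <cstdint>

// Host side: map an integer index 0..255 into the 0..1 range
// before it is uploaded as a float uniform.
float encodeIndex(std::uint8_t index) {
    return static_cast<float>(index) / 255.0f;
}

// CPU mirror of the assumed shader-side decode, int(value * 255.0 + 0.001).
// The small bias guards against the product landing slightly below the
// original integer after rounding, which truncation would turn into index - 1.
int decodeIndex(float normalized) {
    assert(normalized >= 0.0f && normalized <= 1.0f);
    return static_cast<int>(normalized * 255.0f + 0.001f);
}
```

With this bias every index from 0 to 255 should survive the encode/decode round trip unchanged.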
Yeah, because that is an array.
You should be able to write gl_FragData[X][3] to access its w component.
Indexing components this way would only be useful in loops, but current hardware cannot handle it with dynamic loop variables, only with compile-time constant expressions, because there is no address register or “swizzle register” for that.
Use suffixes like .w to make code clearer.
Ha! It works! Very good!!
Yes, but that is on an X1800, which is shader model 3.0 hardware. It’s very likely that you will run into problems, and almost certainly a performance drop, on shader model 2.0 hardware such as Radeon 9K, Radeon X and GeForce FX, because you use a lot of branching in your fragment shader.
And by the way - I don’t know if you used 8 2D textures for a reason (mipmapping?), but you could use a 3D texture and say goodbye to performance problems on shader model 2.0 hardware.
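If you go the 3D texture route, the texture index has to become the third texture coordinate. A rough sketch of that mapping, assuming the 8 images are stacked as slices of one 3D texture (so they must all have the same size) and that slice centres are addressed; `sliceCoord` is a hypothetical helper name:

```cpp
#include <cassert>

// Third texture coordinate (r) that addresses the centre of a given slice
// in a 3D texture with layerCount slices.
float sliceCoord(int layer, int layerCount) {
    assert(layerCount > 0 && layer >= 0 && layer < layerCount);
    return (static_cast<float>(layer) + 0.5f) / static_cast<float>(layerCount);
}
```

For 8 slices, slice 0 maps to 0.0625 and slice 7 to 0.9375. Keep in mind that linear filtering along r blends neighbouring slices when the coordinate is off-centre, and that 3D mipmaps also shrink the slice axis, which is one reason someone might prefer separate 2D textures.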
I tried to declare an array of sampler2D uniforms, but it does not compile.
An array would help a lot: we could move the if/else if/else chain to the vertex shader. In that case the “IF” block becomes per-vertex.
At the moment the “IF” block runs per-pixel. Yes, it adds some overhead. I am just a newbie in GLSL. Is it possible to declare an array of sampler2D uniforms?
sorting polygons by texture index and then doing N draw calls (model textured with N textures)
This is how it is usually done. It works much faster on old hardware when the object is near the camera (takes up a lot of screen space).
doing 1 draw call and choosing the texture in the fragment shader
Dynamic branching is only supported on shader model 3.0 hardware (GeForce 6, Radeon X1k). It will be fast enough there, but on shader model 2.0 hardware it will be much slower because it will always sample all 8 textures for every pixel. It will work fine if the object is far away.
So, if you want good performance on all GPUs, sort your polygons by texture.
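A minimal sketch of the sorting approach on the CPU side, assuming each triangle carries its own texture index. The `Triangle` and `Batch` structs and `buildBatches` are hypothetical names for illustration, not taken from the original program.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Triangle {
    unsigned indices[3];  // vertex indices of this triangle
    int textureIndex;     // which of the N textures it uses
};

struct Batch {
    int textureIndex;           // texture to bind for this batch
    std::size_t firstTriangle;  // offset into the sorted triangle list
    std::size_t triangleCount;  // number of triangles drawn with this texture
};

// Sorts triangles by texture index in place and returns one batch per
// distinct texture, so the model can be drawn with N draw calls.
std::vector<Batch> buildBatches(std::vector<Triangle>& triangles) {
    std::stable_sort(triangles.begin(), triangles.end(),
                     [](const Triangle& a, const Triangle& b) {
                         return a.textureIndex < b.textureIndex;
                     });
    std::vector<Batch> batches;
    for (std::size_t i = 0; i < triangles.size(); ++i) {
        assert(triangles[i].textureIndex >= 0);
        // Start a new batch whenever the texture changes.
        if (batches.empty() ||
            batches.back().textureIndex != triangles[i].textureIndex) {
            batches.push_back(Batch{triangles[i].textureIndex, i, 0});
        }
        ++batches.back().triangleCount;
    }
    return batches;
}
```

At draw time you would bind the texture for each batch (glBindTexture) and issue one glDrawElements call covering triangleCount * 3 indices starting at firstTriangle * 3. The sort only has to be done once per model, not per frame.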