Shader performance

02-28-2005, 12:55 AM
I'm getting a large performance drop when using a very basic fragment shader. My frame rate halves when I do this.

vec4 value,value2;

value=texture2D(myTexture, vec2(gl_TexCoord[0]));



yet if I replace the above line with


performance is fine again.

Replacing mix with min or some other really basic function still halves the framerate.

If the shader code were the bottleneck, I would expect that reducing my window size would leave fewer fragments to process and performance would go up again. But reducing the window size makes no difference to performance.

Does anyone know the reason for the performance drop? Thanks.

I'm using GF5900Ultra with FW 75.9

My only guess is that the driver can see that I don't need to use the shader if I just set gl_FragColor=value; and so it goes back to fixed function. This would imply there is a large fixed overhead with using a fragment shader. Could this be the reason?
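One way to test that guess is to keep a single shader and toggle only the extra operation, so nothing else changes between runs. The following is a minimal sketch, not the exact shader from this post: the `USE_MIX` macro is hypothetical, and the second operand and blend factor passed to `mix` are assumptions, since the original line is not shown above.

```glsl
// Hypothetical A/B test shader (GLSL 1.10 style, matching the thread).
// Uncomment the next line to compile the path that uses mix().
// #define USE_MIX

uniform sampler2D myTexture;

void main()
{
    vec4 value = texture2D(myTexture, vec2(gl_TexCoord[0]));
#ifdef USE_MIX
    // Assumed operands: the original mix() arguments are not shown in the post.
    vec4 value2 = vec4(0.96, 0.96, 1.0, 1.0);
    gl_FragColor = mix(value, value2, 0.5);
#else
    // Plain texture replace, the case reported as fast.
    gl_FragColor = value;
#endif
}
```

If the frame rate still halves only with `USE_MIX` defined, and still does not change with window size, that would point at something other than per-fragment cost.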

02-28-2005, 08:03 AM
GLSL uses 32-bit floats natively, and the GeForce FX isn't too good at that; it likes 16-bit better, unlike the GeForce 6 series and Radeons, which don't seem to suffer such a drop. Try using half (not strictly part of GLSL yet, though the keyword is reserved).

02-28-2005, 10:47 AM
Thanks, unfortunately half doesn't make any difference. I guess my card's just slow with GLSL.

02-28-2005, 10:46 PM
Sorry, I misread your original question; I missed the part about mix(..).
Obviously the mix one is going to be slower, as it uses more resources/instructions. Check the fasm_01.txt etc. files in the project directory.
You'll see something like this, along with the instructions used; it'll give you an idea of what's happening:
# 5 instructions, 0 R-regs
# 4 instructions, 2 R-regs, 0 H-regs

FWIW, with the new 75.xx driver I can't seem to enable this asm output.

Also, you're right about half not helping in that test; I just tried it. Though if you try the following

vec4 tex = texture2D( tex0, gl_TexCoord[0].xy );
vec4 value2 = vec4(.96,.96,1,1);
gl_FragColor = vec4( tex * value2 );

first with vec4 and then with half4, you'll notice a big speedup.
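For reference, the half-precision variant of that snippet might look roughly like this as a complete fragment shader. This is a sketch under assumptions: `half4` is not standard GLSL, and this relies on NVIDIA's compiler accepting Cg-style half types, so it will likely fail to compile on other vendors' drivers. The explicit `half4(...)` and `vec4(...)` conversions are added here to avoid depending on implicit conversions.

```glsl
// Assumes NVIDIA's GLSL compiler, which accepts Cg-style half types.
// Not portable GLSL.
uniform sampler2D tex0;

void main()
{
    // Convert the sampled colour to 16-bit precision explicitly.
    half4 tex = half4(texture2D(tex0, gl_TexCoord[0].xy));
    half4 value2 = half4(0.96, 0.96, 1.0, 1.0);
    // Multiply at half precision, then convert back for output.
    gl_FragColor = vec4(tex * value2);
}
```

Swapping `half4` back to `vec4` throughout gives the full-precision version to compare against.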

Try what you want with register combiners. I'm sure you should be able to do the shader you posted, and I'm guessing it's not going to be much (if at all) faster than the GLSL one.