revised shader performance question



kewuwsi
10-30-2007, 06:45 AM
It looks like something went wrong when I posted!
Hi all,
I have a fragment shader like this:
.....
void main(void)
{
    vec4 color = vec4(0.0);
    if(A)
    {
        // convolution to texture_0
        ......
        color = ..
    }
    else
    {
        // convolution to texture_1 & texture_0
        ........
        color =...
    }
    gl_FragColor = color;
}

It works OK.
The interesting thing is that performance drops a lot (by at least half) with the if/else branch above. Here are my test results (for the else part):
1) Comment out the code in the if branch (leaving just color = vec4(1.0, 0.0, 0.0, 1.0)):
performance improves.
2) Separate the if/else into two functions, then call these two functions in main() like this:
void main(void)
{
    vec4 color = vec4(0.0);
    if(A)
        color = caltexture_0(texture_0);
    else
        color = caltexture_1(texture_1);
    gl_FragColor = color;
}
Performance improves too.

3) Separate the if/else into two shaders: performance improves even more.

This only happens for complicated convolutions; for simple convolutions, there is no noticeable difference between the tests above.

I thought it was related to the compiler (I am working on Linux with an NVIDIA Quadro graphics card). Does anybody have a proven explanation for it?

Many thanks

Zengar
10-30-2007, 08:49 AM
Many cards don't support real dynamic branches in fragment shaders. They usually execute the whole code (both the 'then' and the 'else' parts) and choose the result with a conditional write.

Still, I am puzzled that your option 2) provides better performance. Are you sure that the functions perform the same calculations as the if() .. else code above?
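
To make the conditional write concrete, here is a rough sketch of what such hardware effectively runs when it flattens the branch. This is only an illustration, not what any particular driver emits: convolve0() and convolve1() are hypothetical placeholders for the elided convolution code, and I am assuming A is a bool uniform and that texture_0 and texture_1 are 2D samplers, none of which is stated in the original post.

```glsl
// Sketch only. convolve0/convolve1 are hypothetical stand-ins for the
// elided convolutions; A as a bool uniform is an assumption.
uniform sampler2D texture_0;
uniform sampler2D texture_1;
uniform bool A;

vec4 convolve0(vec2 uv)
{
    // Placeholder for the convolution over texture_0.
    return texture2D(texture_0, uv);
}

vec4 convolve1(vec2 uv)
{
    // Placeholder for the convolution over texture_1 and texture_0.
    return 0.5 * (texture2D(texture_0, uv) + texture2D(texture_1, uv));
}

void main(void)
{
    vec2 uv = gl_TexCoord[0].st;

    // Both sides are evaluated for every fragment...
    vec4 color0 = convolve0(uv);
    vec4 color1 = convolve1(uv);

    // ...and only the final write depends on the condition.
    gl_FragColor = A ? color0 : color1;
}
```

If the card behaves like this, every fragment pays for both convolutions, which would fit the observation that the slowdown only shows up when the convolutions are expensive.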

kewuwsi
10-30-2007, 10:56 AM
Thanks.
How do they choose the result with a conditional write?
Does it work like this?
void main(void)
{
    vec4 color0 = vec4(0.0);
    vec4 color1 = vec4(0.0);
    color0 = //convolution..calculation
    color1 = //convolution..calculation
    if(A)
        gl_FragColor = color0;
    else
        gl_FragColor = color1;
}
(How can I check whether my card works that way?)

For option 2, I am pretty sure it performs the same calculations.