Is the if-statement particularly slow?

Hi!

I tried different things on Humus’ Portal demo to see what I could do with GLSL.

I replaced the fragment shader code with this one:

void main(){
    vec4 base = texture2D(Base, texCoord);

    // Linear attenuation: falls to 0 at the light radius
    float distSqr = dot(lightVec, lightVec);
    float atten = clamp(1.0 - invRadius * sqrt(distSqr), 0.0, 1.0);

    float diffuse = 0.0;
    float specular = 0.0;

    // Intended to skip the lighting math for fragments outside the light radius
    if(atten != 0.0){
        vec3 bump = texture2D(Bump, texCoord).xyz * 2.0 - 1.0;
        bump = normalize(bump);

        vec3 lVec = lightVec * inversesqrt(distSqr);

        diffuse = clamp(dot(lVec, bump), 0.0, 1.0);
        specular = pow(clamp(dot(reflect(normalize(-viewVec), bump), lVec), 0.0, 1.0), 16.0);
    }

    gl_FragColor = ambient * base + (diffuse * base + 0.6 * specular) * atten;
}

But the frame rate dropped by 10. I thought the light radius might be large enough that every fragment was lit, in which case I had only added the workload of the if-statement on top of the original shader code.

So I divided the light radius by 2 (multiplied the invRadius variable by 2), and then it was clear that some fragments should be skipped, but the frame rate stayed at around 90, while it was 105 initially.

thx in advance
wizzo

If you don’t use a graphics card that supports Shader Model 3.0 (e.g. GeForce 6800), this if won’t give you any performance boost. This is because both paths of an if-statement (the if-path and the else-path) have to be computed. After that the GPU decides which values to keep: the ones from the if-path or the ones from the else-path. Pre-SM3.0 hardware isn’t able to choose at run time which code to execute.
So back to your example: unless you use a GeForce 6800, the code within the if-path will always be executed. It doesn’t matter whether atten == 0 or atten != 0, so you won’t get any extra performance.
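
To make that concrete, here is a rough sketch of what the shader effectively does on hardware without dynamic branching. It is only an illustration of the "compute everything, then select" idea, not actual compiler output. The uniform and varying declarations are assumptions (the original post shows only main()), in particular the type of ambient.

```glsl
// Illustrative sketch only: the branch from the shader above, flattened
// the way hardware without dynamic branching has to treat it.
// All declarations below are assumed; the original post omits them.
uniform sampler2D Base;
uniform sampler2D Bump;
uniform vec4 ambient;      // assumed type
uniform float invRadius;

varying vec2 texCoord;
varying vec3 lightVec;
varying vec3 viewVec;

void main(){
    vec4 base = texture2D(Base, texCoord);

    float distSqr = dot(lightVec, lightVec);
    float atten = clamp(1.0 - invRadius * sqrt(distSqr), 0.0, 1.0);

    // The body of the "if" is evaluated for every fragment, lit or not.
    vec3 bump = normalize(texture2D(Bump, texCoord).xyz * 2.0 - 1.0);
    vec3 lVec = lightVec * inversesqrt(distSqr);
    float litDiffuse = clamp(dot(lVec, bump), 0.0, 1.0);
    float litSpecular = pow(clamp(dot(reflect(normalize(-viewVec), bump), lVec), 0.0, 1.0), 16.0);

    // The condition only selects which results are kept afterwards.
    float diffuse = (atten != 0.0) ? litDiffuse : 0.0;
    float specular = (atten != 0.0) ? litSpecular : 0.0;

    gl_FragColor = ambient * base + (diffuse * base + 0.6 * specular) * atten;
}
```

The cost is therefore the original lighting math plus the extra compare and select, which fits the frame rate going down rather than up.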

Alright! That does explain why.
Too bad I just bought a Radeon 9800 =)

thx for your answer
wizzo

Wizzo, did you try Humus’ implementation of the “if” statement for SM2?

http://www.beyond3d.com/forum/viewtopic.php?t=13716&sid=7c7fd4b25cf6b6e623094b4e04a55392

Well, no I didn’t, but now that I’ve sneaked a peek, I’m definitely gonna try this.

But I can’t right now, because MY engine doesn’t support any shading language yet; I was just playing around with one of his demos =)

thanks for the tip anyway jpeter,
wizzo

About branching… It seems that the NV40 has only one instruction pointer (I’m not sure about this) for all 16 pipes. Can someone with an NV40 test this? For example, make a texture:

oxxx... (repeat this block)
xxxx...
xxxx...
xxxx...
.......

and another texture
oooo...
oooo...
oooo...
oooo...
.......

then draw a fullscreen quad (map texel to pixel)
with some fragment shader like:

...
vec4 col = texture2D(mask, uv);
if (col.r > 0.5) gl_FragColor = DoVeryExpensiveCalculation(); // 'o' texels, assuming 'o' is stored as red = 1.0 and 'x' as red = 0.0
else discard;
...

and check speed for both textures.

Current drivers don’t support SM3.0 in GLSL, so this test should use ARBFP.

yooyo

AFAIK the NV branching works on a block of pixels. Take a look at tb’s post at
http://www.forum-3dcenter.org/vbulletin/showthread.php?s=&postid=1954339#post1954339
Warning: this is a German forum site.

Originally posted by jpeter:
[b]Wizzo, did you try Humus’ implementation of the “if” statement for SM2?

http://www.beyond3d.com/forum/viewtopic.php?t=13716&sid=7c7fd4b25cf6b6e623094b4e04a55392 [/b]
While the technique is interesting, that thread you linked to went straight to hell. There is some interesting technical discussion, though, squeezed in between the flaming…

Humus, your work is amazing. Don’t let anyone flaming you (not here but elsewhere) put you off; keep up the good work.

BTW the dynamic branching doesn’t seem to work with my NV5900 (forceware 62.71).

Is there something I should do to help it run properly?

The GeForce FX series isn’t able to do dynamic branching in pixel shaders. You won’t get it to work on your card. But you can either buy a new graphics card or use software emulation (NVemulate).

The GeForce FX series isn’t able to do dynamic branching in pixel shaders.
For the sake of clarification, the GeForce FX 6800 is still an FX, and it can do dynamic branching in the fragment program. What he really means is that 5xxx cards can’t do fragment program branching.

Hi again,

Now that I’ve had some time to look at it, I’d like some clarification. I’m quite a beginner, so don’t hesitate to explain it to me like you would to a kid =)

So here are the steps:

1 - We render our first pass with the depthAmbiant shader ON, rendering for shadowed areas

2 - XXXX

3 - Finally we do our second pass with the lighting shader ON only where the stencil value says so, so that lit fragments are actually lit.

It’s the second step I’m missing:

The if shader returns, as a color param, the distance from the light to the fragment.
And I know that’s when the stencil buffer decides whether a fragment is lit or not, but I don’t get how.

thx to anyone
wizzo

Korval: It’s the GeForce 6800; no “FX”. FX <=> 5xxx. </nitpick>

Korval: It’s the GeForce 6800; no “FX”. FX <=> 5xxx. </nitpick>
Hmmm. I could have sworn I checked that before posting it. Oh well.

Originally posted by dingo_aus:
[b]BTW the dynamic branching doesn’t seem to work with my NV5900 (forceware 62.71).

Is there something I should do to help it run properly?[/b]
It seems this is actually the 5x00 hardware not being capable of doing early stencil rejection.

Originally posted by Corrail:
The GeForce FX series isn’t able to do dynamic branching in pixel shaders. You won’t get it to work on your card. But you can either buy a new graphics card or use software emulation (NVemulate).
Well, the point of that demo is to show that you can implement dynamic branching for some common situations without special hardware support, and still get as good or actually better performance than real dynamic branching.

Originally posted by wizzo:
[b]Hi again,

Now that I’ve had some time to look at it, I’d like some clarification. I’m quite a beginner, so don’t hesitate to explain it to me like you would to a kid =)

So here are the steps:

1 - We render our first pass with the depthAmbiant shader ON, rendering for shadowed areas

2 - XXXX

3 - Finally we do our second pass with the lighting shader ON only where the stencil value says so, so that lit fragments are actually lit.

It’s the second step I’m missing:

The if shader returns, as a color param, the distance from the light to the fragment.
And I know that’s when the stencil buffer decides whether a fragment is lit or not, but I don’t get how.

thx to anyone
wizzo[/b]
The result from the if-shader is in the alpha. By using the alpha test you kill fragments that are unlit. So the stencil is only updated for lit fragments.
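
As a minimal sketch of what such an if-shader pass could look like (an assumption based on the description above, not the demo’s actual source): the shader only evaluates the cheap condition and writes it to alpha. The host-side state named in the comments is likewise assumed, though the GL calls themselves are standard.

```glsl
// Hypothetical "if" pass: evaluates only the cheap condition.
// Assumed host-side state while this pass is drawn:
//   glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE); // no color output needed
//   glEnable(GL_ALPHA_TEST);   glAlphaFunc(GL_GREATER, 0.0f);
//   glEnable(GL_STENCIL_TEST); glStencilFunc(GL_ALWAYS, 1, 0xFF);
//   glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE);
uniform float invRadius;   // assumed, same meaning as in the lighting shader
varying vec3 lightVec;     // assumed, same varying as in the lighting shader

void main(){
    // Same attenuation term the lighting shader uses as its condition.
    float atten = clamp(1.0 - invRadius * length(lightVec), 0.0, 1.0);

    // Only alpha matters here. Fragments with atten == 0 fail the alpha
    // test and are killed before the stencil buffer is updated.
    gl_FragColor = vec4(0.0, 0.0, 0.0, atten);
}
```

The lighting pass would then be drawn with something like glStencilFunc(GL_EQUAL, 1, 0xFF), so the expensive shader only runs where the stencil was set, and with early stencil rejection the unlit fragments never reach it.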

It seems this is actually the 5x00 hardware not being capable of doing early stencil rejection.
Does this mean the ATI cards can do early stencil rejection?

Yes, and the 6800 too (though it seems a bit more sensitive).

Ah good to know - pity a 6800 at the moment would cost more than the rest of my PC combined :)

(and depreciate a lot faster too)