but the frame rate dropped by 10. I thought the light radius might be large enough for every fragment to be lit, so that I had simply added the workload of the if-statement to the original shader code.
So I divided the light radius by 2 (multiplied the invRadius variable by 2), and then it was clear that some fragments should no longer need processing, but the frame rate stayed at around 90, whereas it was 105 initially.
If you don’t use a graphics card that supports Shader Model 3.0 (GeForce 6800), this if won’t give you any performance boost. This is because both paths of an if-statement (the if-path and the else-path) have to be computed. After that, the GPU decides which values should be taken: the ones from the if-path or the ones from the else-path. Pre-SM3.0 hardware isn’t able to decide which code to execute.
So back to your example: unless you use a GeForce 6800, the code within the if-path will be executed. It doesn’t matter whether atten == 0 or atten != 0, so you won’t get any extra performance.
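To make the difference concrete, here is a rough CPU-side sketch in C (not shader code, and not how any real GPU is built). The names expensive_lighting, shade_predicated, shade_branching and the call counter are made up for illustration, and the `atten > 0` condition is an assumption about the original shader. It only models the cost: the predicated version always pays for the if-path, the branching version pays only when the condition is true.

```c
#include <stddef.h>

/* Hypothetical counter: how many times the expensive if-path ran. */
static size_t expensive_calls = 0;

/* Stand-in for the costly lighting math inside the if-path. */
static float expensive_lighting(float atten)
{
    expensive_calls++;
    return atten * 0.5f;
}

/* Pre-SM3.0 model: both paths are evaluated, then one result is selected. */
static float shade_predicated(float atten)
{
    float lit   = expensive_lighting(atten); /* always computed */
    float unlit = 0.0f;                      /* else-path result */
    return (atten > 0.0f) ? lit : unlit;     /* select afterwards */
}

/* SM3.0 model: the expensive path is skipped when atten is zero. */
static float shade_branching(float atten)
{
    if (atten > 0.0f)
        return expensive_lighting(atten);
    return 0.0f;
}
```

Both functions return the same value for any atten; only the number of expensive_lighting calls differs, which is why shrinking the light radius doesn’t help on hardware that behaves like the first one.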
About branching… It seems that the NV40 has only one instruction pointer for all 16 pipes (I’m not sure about this). Can someone with an NV40 test this? For example, make a texture:
oxxx... (repeat this block)
xxxx...
xxxx...
xxxx...
.......
and another texture
oooo...
oooo...
oooo...
oooo...
.......
then draw a fullscreen quad (map texel to pixel)
with some fragment shader like:
...
vec4 col = texture2D(mask, uv);
if (col.r == 'o') gl_FragColor = DoVeryExpensiveCalculation();
else discard;
...
and check speed for both textures.
Current drivers don’t support SM3.0 in GLSL, so this test should use ARBFP.
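In case it helps whoever runs the test, here is a small sketch in C for generating the two mask textures described above. The function names and the encoding ('o' stored as 255, 'x' as 0, one byte per texel) are my own assumptions; uploading the buffer as a texture is left out.

```c
#include <stddef.h>

/* Fill a w*h single-channel mask.
   sparse != 0: one 'o' texel in the corner of every 4x4 block ("oxxx" pattern).
   sparse == 0: every texel is 'o'.
   Assumed encoding: 'o' = 255, 'x' = 0. */
static void fill_mask(unsigned char *texels, size_t w, size_t h, int sparse)
{
    for (size_t y = 0; y < h; y++) {
        for (size_t x = 0; x < w; x++) {
            int is_o = !sparse || (x % 4 == 0 && y % 4 == 0);
            texels[y * w + x] = is_o ? 255 : 0;
        }
    }
}

/* Count the texels that would take the expensive path. */
static size_t count_o(const unsigned char *texels, size_t w, size_t h)
{
    size_t n = 0;
    for (size_t i = 0; i < w * h; i++)
        n += (texels[i] == 255);
    return n;
}
```

With the sparse mask only 1 texel in 16 takes the expensive path, but every 4x4 block still contains one such texel. So if the two textures run at about the same speed, that would suggest neighbouring pixels can’t branch independently.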
The GeForce FX series isn’t able to do dynamic branching in pixel shaders. You won’t get it to work on your card. But you can either buy a new graphics card or use a software rasterizer (NVemulate).
The GeForce FX series isn’t able to do dynamic branching in pixel shaders.
For the sake of clarification, the GeForce FX 6800 is still an FX, and it can do dynamic branching in the fragment program. What he really means is that 5xxx cards can’t do fragment program branching.
Now that I’ve had some time to look at it, I’d like some clarification. I’m quite a beginner, so don’t hesitate to explain it to me like you would to a kid =)
So here are the steps:
1 - We render our first pass with the depthAmbiant shader ON, rendering the shadowed areas
2 - XXXX
3 - Finally we do our second pass with the lighting shader ON only where the stencil value says so, so that lit fragments are actually lit.
It’s the second step I’m missing:
the if shader returns the distance from the light to the fragment as a color param.
And I know that’s when the stencil buffer decides whether a fragment is lit or not, but I don’t get how.
Originally posted by dingo_aus:
[b]BTW the dynamic branching doesn’t seem to work with my NV5900 (forceware 62.71).
Is there something I should do to help it run properly?[/b]
It seems this is actually the 5x00 hardware not being capable of doing early stencil rejection.
Originally posted by Corrail: The GeForce FX series isn’t able to do dynamic branching in pixel shaders. You won’t get it to work on your card. But you can either buy a new graphics card or use a software rasterizer (NVemulate).
Well, the point of that demo is to show that you can implement dynamic branching for some common situations without special hardware support, and still get as good or actually better performance than real dynamic branching.
Now that I’ve had some time to look at it, I’d like some clarification. I’m quite a beginner, so don’t hesitate to explain it to me like you would to a kid =)
So here are the steps:
1 - We render our first pass with the depthAmbiant shader ON, rendering the shadowed areas
2 - XXXX
3 - Finally we do our second pass with the lighting shader ON only where the stencil value says so, so that lit fragments are actually lit.
It’s the second step I’m missing:
the if shader returns the distance from the light to the fragment as a color param.
And I know that’s when the stencil buffer decides whether a fragment is lit or not, but I don’t get how.
thx to anyone
wizzo[/b]
The result from the if-shader is in the alpha. By using the alpha test you kill fragments that are unlit. So the stencil is only updated for lit fragments.
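If it helps to see step 2 and step 3 spelled out, here is a CPU-side sketch in C of what the passes do per fragment, as I understand the explanation above. It is not the demo’s code: the names, the fixed fragment count, and the linear falloff `1 - dist * inv_radius` are all assumptions for illustration, and the real thing is done with alpha test and stencil state rather than loops.

```c
#include <stddef.h>

#define NUM_FRAGMENTS 8

/* Hypothetical counter: how often the expensive lighting shader ran. */
static size_t lighting_runs = 0;

/* Cheap "if-shader": writes the branch condition to alpha.
   Assumed falloff: alpha > 0 means the fragment is inside the light radius. */
static float if_shader_alpha(float dist, float inv_radius)
{
    float atten = 1.0f - dist * inv_radius;
    return atten > 0.0f ? atten : 0.0f;
}

/* Step 2: the alpha test kills unlit fragments, so the stencil
   is only updated (set to 1) for lit fragments. */
static void mark_lit(const float *dist, float inv_radius, unsigned char *stencil)
{
    for (size_t i = 0; i < NUM_FRAGMENTS; i++) {
        stencil[i] = 0;                                   /* cleared stencil */
        if (if_shader_alpha(dist[i], inv_radius) > 0.0f)  /* alpha test passes */
            stencil[i] = 1;                               /* stencil write */
    }
}

/* Step 3: the stencil test rejects unlit fragments before the
   expensive lighting shader runs; lit ones get light added to color. */
static void light_pass(const float *dist, float inv_radius,
                       const unsigned char *stencil, float *color)
{
    for (size_t i = 0; i < NUM_FRAGMENTS; i++) {
        if (stencil[i] != 1)
            continue;                                     /* stencil test fails */
        lighting_runs++;
        color[i] += if_shader_alpha(dist[i], inv_radius);
    }
}
```

Doubling inv_radius (halving the radius) shrinks the set of fragments with stencil 1, so lighting_runs drops accordingly. On real hardware that saving only shows up if the stencil test happens before the fragment shader runs.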
It seems this is actually the 5x00 hardware not being capable of doing early stencil rejection.
Does this mean the ATI cards can do early stencil rejection?