Compiler options and extensions in GLSL on NVidia hardware

Hi,

One of the nice things about Cg is that it lets you add compiler options to your code. For example, it is possible to specify whether loops should be unrolled or use a data-dependent branch. Also, features can be enabled or disabled by targeting different profiles.

GLSL, on the other hand, appears to be more cryptic.
Now I have a problem with a GLSL shader containing a loop that appears to be compiled by unrolling rather than branching.
The loop looks like
for (i = 0; i < max; i++) { … }
and only works when max <= 8.
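For illustration, here is a minimal sketch of the kind of shader in question (the uniform names and texture offsets are hypothetical, not from the original post): a fragment shader looping over a bound supplied via a uniform, which the compiler cannot know at compile time and may therefore unroll up to some fixed hardware limit.

```glsl
// Hypothetical fragment shader illustrating the loop pattern.
// 'maxIterations' comes from a uniform, so its value is unknown at
// compile time; on hardware without real fragment branching the
// compiler may unroll the loop up to a fixed maximum (here,
// apparently 8 iterations), breaking for larger values.
uniform int maxIterations;
uniform sampler2D tex;

void main()
{
    vec4 accum = vec4(0.0);
    for (int i = 0; i < maxIterations; i++) {
        // Accumulate samples along a line of texels.
        accum += texture2D(tex, gl_TexCoord[0].xy
                                + float(i) * vec2(0.01, 0.0));
    }
    gl_FragColor = accum;
}
```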

I would like to ensure the driver is using fragment_program_2 as well as loop branching. How can I do so in GLSL?

Thanks!

-x

Hi Xelatihy,

I’m not aware of any way to specify compile options to the GLSL compiler, but you may have better luck asking this question on the GLSL forum.

Thanks -
Cass

I would like to ensure the driver is using fragment_program_2 as well as loop branching. How can I do so in GLSL?
Why would you care? You should trust the compiler to do what is best for the hardware you’re compiling to. And if nVidia’s compiler decides to unroll the loop, it probably is for performance reasons. Which means it will make your program go faster than it would have if you forced looping.

Plus, you probably would want this program to work on ATi GPUs too (that don’t have branching), so reliance on such a thing is kinda silly. Let the compiler do its job.

Thanks, I also posted on the shading forum.

BTW, I am trying to figure out why my shader won’t work, and it looks like the problem is loop unrolling not being done correctly.

Also, the compiler knows less than I do about certain things. For example, I know that certain loops should use conditionals since there is a high probability of exiting early, regardless of the loop length.
As far as I know, the compiler uses only static analysis to decide, while I know things that you can only discover using dynamic recompilation.

Thanks anyway and maybe I should just move back to Cg!

-x

If you want to see the shader output, add the following registry keys:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\NVIDIA Corporation\Global\OpenGL]

[HKEY_LOCAL_MACHINE\SOFTWARE\NVIDIA Corporation\Global\OpenGL\Debug]
"ShaderObjects"=dword:00000001
"WriteProgramObjectAssembly"=dword:00000001

This will force the driver to write the shader asm code into the app folder. If you are experienced in ARB vp/fp, you can analyze the code.

yooyo

BTW, I am trying to figure out why my shader won’t work, and it looks like the problem is loop unrolling not being done correctly.
Well, that’s a different story. Then it is a compiler bug, and you should report it to nVidia as such.

As far as I know, the compiler uses only static analysis to decide, while I know things that you can only discover using dynamic recompilation.
Yes, but the compiler knows its hardware better than you. For example, if you’re looping 4 times, and you have a conditional exit that sometimes breaks out the 1st time and sometimes later, it is not unreasonable for it to be more efficient (on average) in the hardware to simply never exit and mask out the results later on. Conditional branches can be very inefficient in fragment shaders, and you the user have no way of knowing when branching is more efficient than not branching.
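The masking strategy described above can be sketched roughly as follows (a hypothetical illustration, not actual compiler output; the texture and threshold are made up):

```glsl
// Hypothetical illustration of replacing an early exit with result
// masking: every iteration executes, but once the "exit" condition
// fires, later iterations contribute nothing to the result.
uniform sampler2D tex;

void main()
{
    vec4 accum = vec4(0.0);
    float active = 1.0;              // stays 1.0 until the exit fires
    for (int i = 0; i < 4; i++) {    // fixed count: fully unrollable
        vec4 texel = texture2D(tex, gl_TexCoord[0].xy
                                    + float(i) * vec2(0.1, 0.0));
        // Instead of 'if (texel.a < 0.5) break;' the contribution is
        // masked out: step() returns 0.0 once texel.a drops below 0.5,
        // and 'active' then remains 0.0 for all later iterations.
        active *= step(0.5, texel.a);
        accum += active * texel;
    }
    gl_FragColor = accum;
}
```

On hardware where fragment branches are expensive, executing all four iterations and multiplying by a mask can be cheaper on average than a real conditional exit.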

Plus, don’t forget that your code needs to work on ATi hardware too. Which throws a wrench into your attempt to optimize. You don’t know how different the two architectures are, so you have no way of knowing whether branching is more efficient on one vs the other, or what.

Thanks anyway and maybe I should just move back to Cg!
Oh, feel free… if you only want your shaders to work on half the hardware out there…