OpenGL redundant calls

Hi. Does OpenGL ignore redundant state changes? For example, does calling glUseProgram(1); glUseProgram(1); hurt the application as much as glUseProgram(1); glUseProgram(2);?

Yes, OpenGL implementations should not change state if it is not necessary. But since not all drivers are perfect, I would not rely on this, and even then you do not avoid the cost of the call into the driver.

I’ve been measuring CPU cycles on nVidia drivers in 2.1, 3.0 and 3.1 contexts. Redundant calls for binding VBOs, specifying vertex-attribute offsets/sizes/etc., changing textures, and changing blending/raster modes always take the same time to execute as non-redundant ones (and it’s always 1500+ cycles). Thus, you must filter out redundant calls yourself; the check costs 5-15 cycles.
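A minimal sketch of filtering redundant calls yourself, in C++. The names useProgramCached, g_currentProgram and driverUseProgram are made up for illustration; driverUseProgram is only a stand-in for glUseProgram so the snippet runs without a GL context. It assumes every program bind in the app goes through the wrapper, otherwise the cached value goes stale.

```cpp
#include <cstdint>

// Stand-in for glUseProgram (assumption: lets the sketch run without a GL
// context). It only counts how often the "driver" is entered.
static int g_driverCalls = 0;
static void driverUseProgram(std::uint32_t /*program*/) { ++g_driverCalls; }

// Hypothetical shadow copy of the currently bound program.
// A fresh GL context has no program bound, which OpenGL represents as 0.
static std::uint32_t g_currentProgram = 0;

// Forward the call only when the requested program differs from the cached one.
inline void useProgramCached(std::uint32_t program) {
    if (program == g_currentProgram) return;  // redundant: skip the driver call
    driverUseProgram(program);
    g_currentProgram = program;
}
```

With this, useProgramCached(1); useProgramCached(1); enters the driver once, while useProgramCached(1); useProgramCached(2); enters it twice.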

That’s very interesting info. Thanks.

Yes, I think it certainly depends on the cost/gain in CPU cycles of this kind of optimization. Conditional jumps may be very expensive and are not cache friendly…

Conditional jumps are still over a hundred times faster than doing the redundant call.
Just pack the state-cache nicely in an array or a struct of arrays, otherwise the C++ compiler is free to scatter those vars.
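One possible way to pack the state cache, as a rough C++ sketch. The StateCache struct, its fields, the 8 texture slots and needsTextureBind are all hypothetical choices for illustration, not anything from a real library. The point is only that members of one struct sit next to each other in memory, whereas separate globals can end up anywhere.

```cpp
#include <array>
#include <cstdint>

// Hypothetical shadow state, kept together in one small struct so the
// values a draw path checks sit close to each other in memory.
struct StateCache {
    std::uint32_t program = 0;
    std::uint32_t arrayBuffer = 0;
    std::uint32_t activeTextureUnit = 0;
    std::array<std::uint32_t, 8> texture2D{};  // one slot per tracked unit (8 is arbitrary)
    bool blendEnabled = false;
};

// Returns true when the real bind call is needed, and records the new value.
// Units outside the tracked range are always reported as needing a bind.
inline bool needsTextureBind(StateCache& s, std::uint32_t unit, std::uint32_t tex) {
    if (unit >= s.texture2D.size()) return true;  // untracked: always forward
    if (s.texture2D[unit] == tex) return false;   // same texture already bound
    s.texture2D[unit] = tex;
    return true;
}
```

The caller would do something like: if (needsTextureBind(cache, unit, tex)) { /* real GL bind here */ }.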

Yes, maybe, but I need evidence :stuck_out_tongue: This statement does make sense to me since it hardly depends on the platform you are working with, but anyway we are a bit off topic now.

I’ve been writing in x86/ARM asm for years (serious full-projects and optimizations), so I’d like some trust on that statement :wink: .

FWIW I employ my own state tables on both x86 and particularly the iPhone.

I trust you. :wink:

Haha! Ok Ilian I must believe you now. :slight_smile: I must admit that I do not have such an experience in asm programming.

BTW, gDEBugger shows which calls are redundant and how many there are. I wish GLIntercept had the same feature.

It seems logical that
if (cur_prog != 1) { glUseProgram(1); cur_prog = 1; }
should be faster than
glUseProgram(1);

  1. The first is just a compare and a jump.
  2. The call involves saving the stack frame, pushing params on the stack, jumping to the glUseProgram address, cleaning the stack, and jumping back (aka return), not to mention the work the driver does on that state change!

Seconded.

Yeah, what muddies the water a little is that one is conditional and subject to branch prediction misses and the wasted work/latency that entails, while the other is not.

Does seem intuitive though, what with all the validation work (i.e. conditional branching) and pointer chasing the driver prob does under-the-hood anyway, in addition to the function call overhead.

Would be interesting to reanalyze with bindless graphics and with all GL validation ripped out (if that were possible).

I have been beta testing Gremedy’s gDEBugger for quite some time across various release versions… The thing I personally use most in it is the Redundant State Change feature, on any new code. It’s amazingly helpful.

Particularly when I jumped from GL to GLES and had got all sloppy with the fixed function pipeline stuff!

The conditional jump is usually predicted quite early-on, as cur_prog won’t have been modified so recently.

Inside the glUseProgram, there ARE a bunch of other conditional jumps anyway.

Anyway, it’s best to use gDEBugger or a custom implementation to see where redundant calls are made in your app, and cache only those.
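For the "custom implementation" route, a rough C++ sketch of a debug-build counter. The names CallStats, g_stats, g_lastProgram and recordUseProgram are invented for this example; the idea is to call the recorder next to each real glUseProgram and look at the totals afterwards, then add caching only where the redundant count is high.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical per-function counters for a debug build.
struct CallStats {
    std::uint64_t total = 0;      // every call seen
    std::uint64_t redundant = 0;  // calls that repeated the previous value
};

static std::map<std::string, CallStats> g_stats;
static std::uint32_t g_lastProgram = 0;  // 0 = no program bound yet

// Call this alongside each glUseProgram(program) while profiling.
inline void recordUseProgram(std::uint32_t program) {
    CallStats& s = g_stats["glUseProgram"];
    ++s.total;
    if (program == g_lastProgram) ++s.redundant;  // same program as last time
    g_lastProgram = program;
}
```

After a frame or two, g_stats["glUseProgram"].redundant versus .total gives a quick idea of whether that call is worth wrapping.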

Ok, thanks.