OpenGL redundant calls

Jacek_Nowak · July 27, 2009, 8:51am

Hi. Does OpenGL ignore redundant state changes? For example, does calling glUseProgram(1); glUseProgram(1); hurt the application the same as glUseProgram(1); glUseProgram(2); ?

dletozeun · July 27, 2009, 9:01am

Yes, OpenGL implementations should not change state if it is not necessary. But since not all drivers are perfect, I would not rely on this, and in any case you do not avoid the cost of a call into the driver.

Ilian_Dinev · July 27, 2009, 10:54am

I’ve been measuring CPU cycles on nVidia drivers in 2.1, 3.0 and 3.1 contexts. Redundant calls for binding VBOs, specifying vertex-attribute offsets/sizes/etc., changing textures, and changing blending/raster modes always take the same time to execute (and it’s always 1500+ cycles). Thus, you must remove redundant calls yourself (a check costs 5-15 cycles).
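
A minimal sketch of removing a redundant call yourself, assuming every glUseProgram call in the application goes through one wrapper. The names use_program_cached, g_current_program and fake_glUseProgram are hypothetical; the fake function stands in for the real driver call so the sketch compiles without a GL context.

```cpp
#include <cstdint>

// Hypothetical stand-in for the real driver entry point, so the sketch
// compiles without an OpenGL context. It only counts how often it is called.
static int g_driver_calls = 0;
static void fake_glUseProgram(std::uint32_t /*program*/) { ++g_driver_calls; }

// Last program handed to the driver. 0 means "no program", GL's initial state.
static std::uint32_t g_current_program = 0;

// Forward the call only when the program actually changes.
inline void use_program_cached(std::uint32_t program) {
    if (program == g_current_program) return;  // redundant: skip the driver
    fake_glUseProgram(program);
    g_current_program = program;
}

// Issues four requests, two of them redundant, and returns how many
// actually reached the (fake) driver.
inline int count_driver_calls_example() {
    g_driver_calls = 0;
    g_current_program = 0;
    use_program_cached(1);
    use_program_cached(1);  // skipped
    use_program_cached(2);
    use_program_cached(2);  // skipped
    return g_driver_calls;
}
```

The cached value is only trustworthy if nothing else changes the program behind the wrapper’s back, and it would need to be kept per GL context.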

scratt · July 27, 2009, 11:04am

That’s very interesting info. Thanks.

dletozeun · July 27, 2009, 11:12am

Yes, I think it certainly depends on the cost/gain in CPU cycles of this kind of optimization. Conditional jumps may be very expensive and are not cache friendly…

Ilian_Dinev · July 27, 2009, 11:28am

Conditional jumps are still over a hundred times faster than doing the redundant call.
Just pack the state-cache nicely in an array or a struct of arrays, otherwise the C++ compiler is free to scatter those vars.
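
One way the packing could look, as a sketch only. The struct name, its fields and the choice of eight tracked texture units are assumptions for illustration; the point is that members of one struct stay together in memory, unlike separate globals.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout: the field names and the choice of tracked state are
// assumptions for illustration, not something any driver or library defines.
struct GlStateCache {
    std::uint32_t program;        // last program passed to glUseProgram
    std::uint32_t array_buffer;   // last buffer bound to GL_ARRAY_BUFFER
    std::uint32_t texture_2d[8];  // last GL_TEXTURE_2D binding per texture unit
    std::uint32_t blend_src;      // last glBlendFunc source factor
    std::uint32_t blend_dst;      // last glBlendFunc destination factor
    std::uint8_t  blend_enabled;  // 1 if GL_BLEND is enabled
};

// Keep the whole cache small so the checks touch very little memory.
static_assert(sizeof(GlStateCache) <= 64, "state cache grew past 64 bytes");

// One zero-initialized instance: all tracked state sits in one contiguous block.
static GlStateCache g_state = {};

// Returns true when the caller still has to issue the real bind,
// and records the new binding in the cache.
inline bool texture_needs_bind(GlStateCache& s, unsigned unit, std::uint32_t tex) {
    if (unit >= 8) return true;                   // untracked unit: always bind
    if (s.texture_2d[unit] == tex) return false;  // already bound: skip
    s.texture_2d[unit] = tex;
    return true;
}
```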

dletozeun · July 27, 2009, 3:07pm

Yes, maybe, but I need evidence. This statement does make sense to me since it hardly depends on the platform you are working with, but anyway we are a bit off topic now.

Ilian_Dinev · July 27, 2009, 7:37pm

I’ve been writing x86/ARM asm for years (serious full projects and optimizations), so I’d like some trust on that statement.

scratt · July 27, 2009, 7:48pm

FWIW I employ my own state tables on both x86 and particularly the iPhone.

I trust you.

dletozeun · July 28, 2009, 1:14am

Haha! Ok Ilian, I must believe you now. I must admit that I do not have such experience in asm programming.

_NK47 · July 28, 2009, 1:57am

btw gDEBugger shows which calls are redundant and how many there are. wish GLIntercept had the same feature.

It seems logical that
if (cur_prog != 1) { glUseProgram(1); cur_prog = 1; }
should be faster than
glUseProgram(1);

It’s just a compare and a jump.
A call involves saving the stack frame, pushing params on the stack, jumping to the glUseProgram address, cleaning the stack, and jumping back (aka return),
not to mention the work the driver does on that state change!

Dark_Photon · July 28, 2009, 5:24am

Seconded.

It seems logical that
if (cur_prog != 1) { glUseProgram(1); cur_prog = 1; }
should be faster than
glUseProgram(1);

It’s just a compare and a jump.

A call involves saving the stack frame, pushing params on the stack, jumping to the glUseProgram address, cleaning the stack, and jumping back (aka return), not to mention the work the driver does on that state change!

Yeah, what muddies the water a little is that one is conditional and subject to branch prediction misses and the wasted work/latency that entails, while the other is not.

Does seem intuitive though, what with all the validation work (i.e. conditional branching) and pointer chasing the driver prob does under-the-hood anyway, in addition to the function call overhead.

Would be interesting to reanalyze with bindless graphics and with all GL validation ripped out (if that were possible).

scratt · July 28, 2009, 5:55am

I have been Beta testing Gremedy’s gDEBugger for quite some time at various release versions… The thing I most use in that personally is the Redundant State change feature on any new code. It’s amazingly helpful.

Particularly when I jumped from GL to GLES and had got all sloppy with the fixed function pipeline stuff!

Ilian_Dinev · July 28, 2009, 6:42am

The conditional jump is usually predicted quite early-on, as cur_prog won’t have been modified so recently.

Inside the glUseProgram, there ARE a bunch of other conditional jumps anyway.

Anyway, it’s best to use gDEBugger or a custom implementation to see where redundant calls are made in your app, and cache only those.
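
A custom implementation for spotting redundant calls could be as small as the sketch below. It is an assumption about how one might do it, not how gDEBugger works; CallStats, record_use_program and redundant_ratio are hypothetical names. It only counts, so it can sit in front of the real call during profiling without changing behavior.

```cpp
#include <cstdint>

// Hypothetical counters; the names are assumptions for illustration.
struct CallStats {
    std::uint64_t total = 0;      // every request made by the application
    std::uint64_t redundant = 0;  // requests that repeated the current value
};

static CallStats g_use_program_stats;
static std::uint32_t g_tracked_program = 0;  // 0 = no program, GL's initial state

// Call this right before each real glUseProgram(program) while profiling.
// It skips nothing; it only records whether the call was redundant.
inline void record_use_program(std::uint32_t program) {
    ++g_use_program_stats.total;
    if (program == g_tracked_program) ++g_use_program_stats.redundant;
    g_tracked_program = program;
}

// Fraction of recorded calls that were redundant, in [0, 1];
// returns 0 when nothing has been recorded yet.
inline double redundant_ratio(const CallStats& s) {
    if (s.total == 0) return 0.0;
    return static_cast<double>(s.redundant) / static_cast<double>(s.total);
}
```

If the ratio for a given call turns out to be near zero, caching that piece of state is probably not worth the extra code.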

Dark_Photon · July 28, 2009, 9:17am

Ok, thanks.