OpenGL redundant calls



Jacek Nowak
07-27-2009, 08:51 AM
Hi. Does OpenGL ignore redundant state changes? For example, does calling glUseProgram(1); glUseProgram(1); cost the application as much as glUseProgram(1); glUseProgram(2);?

dletozeun
07-27-2009, 09:01 AM
Yes, OpenGL implementations should not change state when it is not necessary. But since not all drivers are perfect, I would not rely on this, and a redundant call still pays the cost of calling into the driver.

Ilian Dinev
07-27-2009, 10:54 AM
I've been measuring CPU cycles on nVidia drivers in 2.1, 3.0 and 3.1 contexts. Redundant calls for binding VBOs, specifying vertex-attribute offsets/sizes/etc., changing textures, and changing blending/raster modes always take the same time to execute as non-redundant ones (and it's always 1500+ cycles). Thus, you must remove redundant calls yourself (a check on your side costs 5-15 cycles).
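A rough sketch of how such a comparison could be set up, for anyone who wants to repeat the measurement on their own driver. Everything named here (fakeStateCall, measure, timeUnfiltered, timeFiltered) is hypothetical. The stand-in function only pretends to do driver work, so the timings it produces say nothing about a real driver, and it uses std::chrono wall-clock time rather than a cycle counter. To measure for real, replace the stand-in with the GL call you care about and run it with a current context.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for a driver entry point such as glUseProgram.
// It counts calls and burns a little time; it is NOT a model of a real driver.
static volatile std::uint64_t g_sink = 0;
static int g_calls = 0;
static void fakeStateCall(std::uint32_t value) {
    ++g_calls;
    for (int i = 0; i < 64; ++i) g_sink = g_sink + value;  // pretend work
}

struct Timing {
    double nsPerCall;   // average wall-clock nanoseconds per loop iteration
    int driverCalls;    // how many times the stand-in was actually entered
};

// Runs `body` `iterations` times and reports the average cost per iteration.
template <typename F>
Timing measure(int iterations, F&& body) {
    g_calls = 0;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) body();
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return { elapsed.count() / iterations, g_calls };
}

// Every iteration makes the (redundant) call.
Timing timeUnfiltered(int iterations) {
    return measure(iterations, [] { fakeStateCall(1); });
}

// Every iteration checks a shadow copy first and skips the redundant call.
Timing timeFiltered(int iterations) {
    std::uint32_t current = 0;
    return measure(iterations, [&current] {
        if (current != 1) { fakeStateCall(1); current = 1; }
    });
}
```

Comparing timeUnfiltered(n).nsPerCall with timeFiltered(n).nsPerCall gives the per-call difference for whatever function is plugged in; the driverCalls field confirms that the filter really did suppress the repeats.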

scratt
07-27-2009, 11:04 AM
That's very interesting info. Thanks.

dletozeun
07-27-2009, 11:12 AM
Quoting Ilian Dinev: "I've been measuring cpu-cycles on nVidia drivers in 2.1, 3.0 and 3.1 contexts, redundant calls for binding VBOs, specifying vtx-attrib offsets/sizes/etc, changing textures, changing blending/raster-modes - always take the same time to execute (and it's always 1500+ cycles). Thus, you must remove redundant calls yourself (5-15 cycles)."

Yes, I think it certainly depends on the cost/gain in CPU cycles of this kind of optimization. Conditional jumps may be very expensive and are not cache friendly...

Ilian Dinev
07-27-2009, 11:28 AM
Conditional jumps are still over a hundred times faster than doing the redundant call.
Just pack the state-cache nicely in an array or a struct of arrays, otherwise the C++ compiler is free to scatter those vars.
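A minimal sketch of what a packed state cache might look like, assuming the cached values are plain GL object ids and that 0 means "nothing bound" (the GL default). The names StateCache, useProgram, bindTexture2D and demoStateCache are made up for illustration, and fakeUseProgram / fakeBindTexture are stand-ins for the real glUseProgram / glBindTexture so the snippet runs without a GL context.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for driver entry points; in a real application
// these would be glUseProgram, glBindTexture, and so on.
static int g_driver_calls = 0;
static void fakeUseProgram(std::uint32_t) { ++g_driver_calls; }
static void fakeBindTexture(std::uint32_t) { ++g_driver_calls; }

// All shadowed state lives in one plain struct, so the fields sit next to
// each other in memory instead of being scattered as separate globals.
struct StateCache {
    std::uint32_t program = 0;
    std::uint32_t texture2D = 0;
    std::uint32_t arrayBuffer = 0;
    std::uint32_t blendEnabled = 0;
};
static_assert(std::is_standard_layout<StateCache>::value,
              "keep the cache a plain struct");
static_assert(sizeof(StateCache) == 4 * sizeof(std::uint32_t),
              "no padding expected between the fields");

static StateCache g_cache;

// Only call into the driver when the requested value differs from the shadow.
inline void useProgram(std::uint32_t id) {
    if (g_cache.program != id) {
        fakeUseProgram(id);
        g_cache.program = id;
    }
}

inline void bindTexture2D(std::uint32_t id) {
    if (g_cache.texture2D != id) {
        fakeBindTexture(id);
        g_cache.texture2D = id;
    }
}

// Issues five state requests and returns how many reached the fake driver.
int demoStateCache() {
    g_cache = StateCache{};
    g_driver_calls = 0;
    useProgram(1);
    useProgram(1);      // redundant, filtered out
    useProgram(2);
    bindTexture2D(7);
    bindTexture2D(7);   // redundant, filtered out
    return g_driver_calls;
}
```

One caveat with any shadow copy like this: it goes stale if other code (a library, a different context) changes GL state behind its back, so every state change has to go through the wrappers.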

dletozeun
07-27-2009, 03:07 PM
Quoting Ilian Dinev: "Conditional jumps are still over a hundred times faster than doing the redundant call."

Yes, maybe, but I need evidence :P This statement does make sense to me since it hardly depends on the platform you are working with, but anyway we are a bit off topic now.

Ilian Dinev
07-27-2009, 07:37 PM
I've been writing in x86/ARM asm for years (serious full-projects and optimizations), so I'd like some trust on that statement ;) .

scratt
07-27-2009, 07:48 PM
FWIW I employ my own state tables on both x86 and particularly the iPhone.

I trust you. ;)

dletozeun
07-28-2009, 01:14 AM
Quoting Ilian Dinev: "I've been writing in x86/ARM asm for years (serious full-projects and optimizations), so I'd like some trust on that statement ;) ."

Haha! Ok Ilian, I must believe you now. :) I must admit that I do not have that kind of experience in asm programming.

_NK47
07-28-2009, 01:57 AM
btw gDEBugger shows which calls are redundant and how many there are. wish GLIntercept had the same feature.

seems logical that
if (cur_prog != 1) { glUseProgram(1); cur_prog = 1; }
should be faster than
glUseProgram(1);
1. The check is just a compare and a jump.
2. The call involves setting up the stack frame, pushing params on the stack, jumping to the glUseProgram address, cleaning the stack, and jumping back (aka return),
not to mention the overhead of whatever the driver does on that state change!

Dark Photon
07-28-2009, 05:24 AM
Quoting _NK47: "btw gDebugger shows what and how many are redundant. wish GLIntercept had same feature."
Seconded.


Quoting _NK47:
seems logical that
if (cur_prog != 1) { glUseProgram(1); cur_prog = 1; }
should be faster than
glUseProgram(1);
1. just a compare and jump.
2. call includes save the stack frame, push params on stack, jump to glUseProgram address, clean stack, jump back (aka return), not to mention the overhead the driver does on that state change!

Yeah, what muddies the water a little is that one is conditional and subject to branch prediction misses and the wasted work/latency that entails, while the other is not.

Does seem intuitive though, what with all the validation work (i.e. conditional branching) and pointer chasing the driver prob does under-the-hood anyway, in addition to the function call overhead.

Would be interesting to reanalyze with bindless graphics and with all GL validation ripped out (if that were possible).

scratt
07-28-2009, 05:55 AM
I have been beta testing Gremedy's gDEBugger for quite some time, across various release versions. The feature I personally use most is the Redundant State Change report, which I run on any new code. It's amazingly helpful.

Particularly when I jumped from GL to GLES and had got all sloppy with the fixed function pipeline stuff!

Ilian Dinev
07-28-2009, 06:42 AM
The conditional jump is usually predicted quite early-on, as cur_prog won't have been modified so recently.

Inside the glUseProgram, there ARE a bunch of other conditional jumps anyway.

Anyway, it's best to use gDEBugger or a custom implementation to see where redundant calls are made in your app, and cache only those.
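For the "custom implementation" route, one possible shape is a small tracker that is fed every state-setting call and tallies how often each one repeats the previous value. This is only a sketch under assumptions: RedundancyTracker, CallStats and demoTracker are hypothetical names, the state is reduced to a single integer per call name, and it has nothing to do with how gDEBugger works internally. The string-keyed maps are slow, so this is meant for a profiling build, not for shipping.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

struct CallStats {
    int total = 0;      // every recorded call
    int redundant = 0;  // calls that repeated the previous value
};

class RedundancyTracker {
public:
    // Record a state-setting call identified by `name` with argument `value`.
    // Returns true when the call repeats the last value seen for that name.
    // The very first call for a name is never counted as redundant, because
    // the tracker does not know the initial state.
    bool record(const std::string& name, std::uint64_t value) {
        CallStats& s = stats_[name];
        ++s.total;
        auto it = last_.find(name);
        bool redundant = (it != last_.end() && it->second == value);
        if (redundant) ++s.redundant;
        last_[name] = value;
        return redundant;
    }

    // Tallies for one call name; zeros if it was never recorded.
    CallStats stats(const std::string& name) const {
        auto it = stats_.find(name);
        return it == stats_.end() ? CallStats{} : it->second;
    }

private:
    std::map<std::string, std::uint64_t> last_;
    std::map<std::string, CallStats> stats_;
};

// Feeds four glUseProgram requests through the tracker and returns the tally.
CallStats demoTracker() {
    RedundancyTracker tracker;
    tracker.record("glUseProgram", 1);
    tracker.record("glUseProgram", 1);  // redundant
    tracker.record("glUseProgram", 2);
    tracker.record("glUseProgram", 2);  // redundant
    return tracker.stats("glUseProgram");
}
```

Calling record() from thin wrappers around the GL entry points, then dumping the tallies at the end of a frame, shows which calls are worth caching and which are rarely repeated.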

Dark Photon
07-28-2009, 09:17 AM
Quoting Ilian Dinev: "The conditional jump is usually predicted quite early-on, as cur_prog won't have been modified so recently."
Ok, thanks.