Part of the Khronos Group
OpenGL.org

Thread: GLSL; return statement ...

  1. #1
    Junior Member Regular Contributor
    Join Date
    Apr 2014
    Posts
    109

    GLSL; return statement ...

    Hello,

    Simple question about GLSL and fragment shaders. May I return the "final color" early if I don't need to do any further processing in the void main function?

    Consider the following example:

    Code :
    #version 330 core
     
    out vec4 finalColor;
     
    uniform float drawWireframe;
     
    uniform vec4 materialColor;
     
    vec4 fancyLightingFunction(vec4 colorToProcess) {
        // Fancy lighting function here...
        return colorToProcess; // placeholder so the function returns a value
    }
     
    void main() {
        if (drawWireframe == 1.0) {
            finalColor = materialColor;
            // May I call "return" here?
        }
     
        finalColor = fancyLightingFunction(materialColor);
    }

    Thank you for your time.

  2. #2
    Member Regular Contributor Agent D's Avatar
    Join Date
    Sep 2011
    Location
    Innsbruck, Austria
    Posts
    281
    You have to understand how the processor architecture that this is run on works. Branches in general are very bad for performance.

    Shader executions are basically divided into groups, where each group has only one single instruction decoder that controls a bunch of
    parallel ALUs with register files and local memory attached to them. It's like an extremely wide SIMD architecture. There is only one instruction
    fetch and execution unit, so diverging control flow within a group is usually implemented by executing both branches and flagging the individual
    "cores" on whether the results should be used or not.

    Having a return statement in your shader code that is only taken by a few shader executions within a group will at best not influence performance
    at all. The code is still executed, but the results are ignored for those that hit the return statement, unless all shaders in the entire group hit the return
    statement, in which case the entire group can finish early.
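
    To make the masking idea concrete, here is a rough host-side model in C of one such group. It is only a sketch of the behaviour described above, not how any particular GPU or driver is implemented; the names GroupResult and run_group are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one lockstep group: all lanes share a single
   instruction stream, and a per-lane flag decides whose results count. */
typedef struct {
    size_t lanes_masked_off;   /* lanes that hit the early return */
    bool   ran_expensive_code; /* did the group still issue the remaining code? */
} GroupResult;

GroupResult run_group(const bool *returns_early, size_t lane_count)
{
    GroupResult result = { 0, false };

    /* Count the lanes whose results will be ignored from here on. */
    for (size_t i = 0; i < lane_count; ++i) {
        if (returns_early[i]) {
            ++result.lanes_masked_off;
        }
    }

    /* The remaining instructions are skipped only if no lane is still active. */
    result.ran_expensive_code = (result.lanes_masked_off < lane_count);
    return result;
}
```

    In this model a group where only half the lanes return early still pays for the rest of the shader; only a group where every lane returns early gets to skip it.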

    It is better if you simply use different shader programs for your wireframe rendering, rather than cluttering it with branches.
    Last edited by Agent D; 02-20-2015 at 12:06 PM.

  3. #3
    Junior Member Regular Contributor
    Join Date
    Apr 2014
    Posts
    109
    I see; that makes sense.

    I was trying to avoid shader switching but if that hurts performance then I'll go ahead with using different shaders.

  4. #4
    Senior Member OpenGL Guru
    Join Date
    Jun 2013
    Posts
    2,404
    Quote Originally Posted by Agent D View Post
    Having a return statement in your shader code that is only taken by a few shader executions within a group will at best not influence performance at all. The code is still executed, but the results are ignored for those that hit the return statement, unless all shaders in the entire group hit the return statement, so the entire group could finish early.
    If you look at the sample code, the condition expression is uniform (i.e. involves only uniforms and constants), so the value will be the same for all fragments within a group. Modern hardware will actually branch here. Older hardware lacks the ability to branch in the conventional manner, but an implementation may compile distinct variants of the shader for each case, and select the appropriate one prior to execution.

    OTOH, creating distinct variants of the shader yourself ensures that this will happen. You can use the preprocessor to simplify the process, replacing the condition with e.g. "#ifdef WIREFRAME ... #endif" then switching between "#define WIREFRAME\n" and an empty string. glShaderSource() takes an array of strings rather than a single string, which makes it easy to dynamically insert, remove or replace arbitrary chunks of source code.
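
    A minimal sketch of that string-array approach in C, assuming a fragment shader adapted from the first post. The helper build_fragment_sources and the constant names are hypothetical, and the lighting function is only a placeholder. Note that "#version" has to stay ahead of the inserted "#define", which is why it gets its own string.

```c
#include <stdbool.h>

/* Number of strings handed to glShaderSource() in this sketch. */
#define SOURCE_COUNT 3

/* "#version" must come first in the shader, so it is kept as a separate string. */
static const char *const VERSION_LINE     = "#version 330 core\n";
static const char *const WIREFRAME_DEFINE = "#define WIREFRAME\n";

/* Shader body adapted from the first post; the lighting function is a placeholder. */
static const char *const SHADER_BODY =
    "out vec4 finalColor;\n"
    "uniform vec4 materialColor;\n"
    "vec4 fancyLightingFunction(vec4 c) { return c; }\n"
    "void main() {\n"
    "#ifdef WIREFRAME\n"
    "    finalColor = materialColor;\n"
    "#else\n"
    "    finalColor = fancyLightingFunction(materialColor);\n"
    "#endif\n"
    "}\n";

/* Fills `sources` with the pieces for the requested variant. */
void build_fragment_sources(bool wireframe, const char *sources[SOURCE_COUNT])
{
    sources[0] = VERSION_LINE;
    sources[1] = wireframe ? WIREFRAME_DEFINE : ""; /* empty string for the lit variant */
    sources[2] = SHADER_BODY;
}

/* With a current GL context and a shader object `shader`, the call would be:
 *   glShaderSource(shader, SOURCE_COUNT, sources, NULL);
 *   glCompileShader(shader);
 */
```

    Calling the helper once with true and once with false gives the two source sets; each would be compiled and linked into its own program object, so the branch disappears from both variants.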

  5. #5
    Senior Member OpenGL Lord
    Join Date
    May 2009
    Posts
    5,907
    If you look at the sample code, the condition expression is uniform (i.e. involves only uniforms and constants), so the value will be the same for all fragments within a group. Modern hardware will actually branch here. Older hardware lacks the ability to branch in the conventional manner, but an implementation may compile distinct variants of the shader for each case, and select the appropriate one prior to execution.
    I believe the older hardware you refer to would be NVIDIA's pre-8xxx line. Since they're quite literally 10 years old this June, I'd say it's probably a moot point to consider them.

    Also, changing programs is a fairly heavyweight option. Odds are good that any internal change will be faster than your external one, as long as it doesn't provoke a recompile.

  6. #6
    Member Regular Contributor
    Join Date
    Dec 2009
    Posts
    251
    By the way, you seem to use a float uniform where a bool uniform would be more appropriate. If the driver knows that there are only 2 possible input values it can compile two shader variants to avoid dynamic branching (if this is a useful optimization on the given hardware). For a float there are millions of possible values, so it's more difficult for the driver to optimize this.

  7. #7
    Junior Member Regular Contributor
    Join Date
    Apr 2014
    Posts
    109
    Quote Originally Posted by mbentrup View Post
    By the way, you seem to use a float uniform where a bool uniform would be more appropriate. If the driver knows that there are only 2 possible input values it can compile two shader variants to avoid dynamic branching (if this is a useful optimization on the given hardware). For a float there are millions of possible values, so it's more difficult for the driver to optimize this.
    Thank you (and everyone else) for all of the feedback.

    Since I am a beginner maybe my thinking isn't correct on this but I will share it in hopes of hearing what experts have to say on the matter:

    I wanted to use one (1) shader for drawing such that I don't have to constantly switch shaders CPU-side. The shader would use float-based uniforms to choose whether to draw:
    • Wireframe
    • Skybox (using a texture2D sampler)
    • Standard shading (with different texture channels for ambient, diffuse, specular, emissive, etc.)
      • Within my standard shading model I plan to have the capability to calculate multiple lights and be able to position them, etc.
      • I would also have the capability to turn lighting "on" or "off" such that I can just show the ambient/diffuse textures without lighting calculations.


    My approach is the one shown as an example in my original post: if a "drawWireframe" uniform is equal to 1.0, use the wireframe section of code in my shader, but if that uniform is 0.0, do something else.

    In my mind this was simpler than using multiple shaders, as I am only altering uniforms in memory rather than changing shaders, which, from what I have read, should be avoided if you can.

    Let me know if this makes sense and if my thinking is correct here.

    Thank you.
    Last edited by tmason; 02-26-2015 at 08:43 AM. Reason: Formatting

  8. #8
    Senior Member OpenGL Lord
    Join Date
    May 2009
    Posts
    5,907
    I wanted to use one (1) shader for drawing such that I don't have to constantly switch shaders CPU-side.
    You have 3 shader scenarios. That's not "constantly" switching.

    Many actual applications switch between dozens, if not hundreds, of different shaders every frame. That you've managed to boil it down to 3 is really good enough.

    It's generally not a good idea to avoid a shader switch if the different scenarios have wildly different resource needs. For example, if one shader form needs 4 textures and another only uses 2, those should probably be two separate shaders, not governed by an internal switch.
