
Thread: Multiple objects: single or multiple shaders?

  1. #1
    Intern Contributor
    Join Date
    Sep 2012
    Posts
    76

    Multiple objects: single or multiple shaders?

    This topic is actually a follow-on of an earlier posting of mine "Generating pipelines..." where I learned from thokra's responses how to generate pipelines in 4.3. The reason to post a new thread is that other novices like me may be interested in this particular question.

    So, here's the scenario: Two objects (could be many more) need to be transformed/colored differently. Here are two ways I know to do this:

    1. Single vertex and fragment shader:
    Define a uniform, call it currentObj, in the VS and FS. Both VS and FS have code of the form
    if (currentObj == obj1) do obj1 stuff;
    if (currentObj == obj2) do obj2 stuff;

    The draw routine in the main program looks like
    glUniform(make currentObj = obj1);
    draw obj1;
    glUniform(make currentObj = obj2);
    draw obj2;

    2. Separate vertex and fragment shaders:
    Make VS1 and FS1 for obj1 and attach them to pipeline1. Make VS2 and FS2 for obj2 and attach them to pipeline2.

    The draw routine in the main program looks like
    glBindProgramPipeline(pipeline1);
    draw obj1;
    glBindProgramPipeline(pipeline2);
    draw obj2;

    Btw, I believe the above methods would apply to other shader stages as well if present.

    So, here are my questions:
    1. Are the above basically the two ways to handle multiple objects or are there others?
    2. What's the best?

    Thanks in advance,
    Sam

  2. #2
    Senior Member OpenGL Pro
    Join Date
    Apr 2010
    Location
    Germany
    Posts
    1,129
    It depends on what you mean by "transformed differently". It could mean that you have the same program logic for every object, e.g. a chaining of multiple matrices where only the matrices themselves change. Or you could have entirely different transformation logic which in addition might be data dependent. The same goes for other properties like colors etc.

    For the first case, where the logic is identical, you don't need branching in any way in the shader code because, well, the logic is identical. Just change the uniform values or, better yet, stick them in a uniform buffer object and just have the GL point to a different data store location with glBindBufferRange(). This also works well with instancing, only in that case you'd draw a number of instances with the same set of uniforms before changing the bound buffer range. This is also applicable to other areas such as material definitions or light source properties and so on. I like uniform buffers a lot.
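
A minimal sketch of the buffer-range idea in C, assuming every object's uniform data is packed into one uniform buffer. The offset given to glBindBufferRange() for a uniform buffer has to be a multiple of the implementation's GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, so each object's block is padded to an aligned slot. The helper names (align_up, ubo_offset) and the PerObject type mentioned in the comment are illustrative assumptions, not something from this thread; a real program would query the alignment with glGetIntegerv().

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of alignment (alignment > 0). */
static size_t align_up(size_t size, size_t alignment)
{
    assert(alignment > 0);
    return (size + alignment - 1) / alignment * alignment;
}

/* Byte offset of object `index` when every object's uniform block
 * occupies one aligned slot large enough for `block_size` bytes. */
static size_t ubo_offset(size_t index, size_t block_size, size_t alignment)
{
    return index * align_up(block_size, alignment);
}

/* Hypothetical draw loop (needs a GL context, so shown as a comment only):
 *
 *   for (i = 0; i < object_count; ++i) {
 *       glBindBufferRange(GL_UNIFORM_BUFFER, 0, ubo,
 *                         ubo_offset(i, sizeof(PerObject), alignment),
 *                         sizeof(PerObject));
 *       draw object i;
 *   }
 */
```

The shader stays identical for every object; only the bound range changes between draws.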

    If the logic is supposed to be different, there are actually three ways to change the code path:


    • dynamic branching, which is what you suggested first but is less desirable IMO
    • different shader programs, which work well but require care at the application level to avoid too many program switches (i.e. you need to batch hard)
    • use shader subroutines, effectively switching portions of functionality at run-time depending on what logic is needed for the current object


    I can't speak to what works best in your particular case. For a small number of objects it's irrelevant anyway. If you go to hundreds or thousands of objects, it's a different story.

    Technically, dynamic branching is the least favorable approach because if stuff goes wrong you lose performance due to branch prediction or the execution of additional branches you're not even trying to reach. At least in the fragment shader. I'm not certain how much impact dynamic branching has on current HD7000 or GTX600 series GPUs in worst case scenarios.

    With different programs (or separable programs for that matter) you get what you want but have to be careful not to introduce too many shader program switches, i.e. repeated calls to glUseProgram() or glUseProgramStages(), because each call alters program state and has to make the executables current for their respective stages, which takes some time and introduces overhead in the application. If you can aggregate multiple objects into a group which uses the same program or pipeline, you can reduce this overhead. Batching is a good idea in general to reduce API overhead. Do it, and do it hard if possible.
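
One way to picture the batching advice is the C sketch below, which sorts a queue of draws by program so that each program only needs to be bound once. The DrawItem struct and both function names are hypothetical and exist only for illustration; real renderers usually sort on more state than the program handle.

```c
#include <assert.h>
#include <stdlib.h>

/* One queued draw: which program (or pipeline) it needs and which mesh. */
typedef struct {
    unsigned program;  /* hypothetical program/pipeline handle */
    unsigned mesh;     /* hypothetical mesh identifier */
} DrawItem;

/* qsort comparator: ascending by program handle. */
static int by_program(const void *a, const void *b)
{
    const DrawItem *x = a, *y = b;
    return (x->program > y->program) - (x->program < y->program);
}

/* Group draws that share a program so each program is bound once. */
static void batch_by_program(DrawItem *items, size_t count)
{
    assert(items != NULL || count == 0);
    qsort(items, count, sizeof items[0], by_program);
}

/* Number of binds a "rebind whenever the program changes" loop would issue. */
static size_t count_program_switches(const DrawItem *items, size_t count)
{
    size_t switches = 0;
    for (size_t i = 0; i < count; ++i)
        if (i == 0 || items[i].program != items[i - 1].program)
            ++switches;
    return switches;
}
```

For a queue that alternates between two programs, sorting first reduces the number of binds from one per draw to two in total.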

    To me, switching functionality dynamically at runtime immediately makes me think of virtual functions or function pointers in C/C++. Shader subroutines are technically similar to function pointers. You define multiple subroutines which offer different logic and choose the appropriate one at runtime. Subroutines are basically nothing but functions to which a pointer with a specified name points at a given point in time. In the shader, this "pointer" is declared as a uniform which can be queried and altered by the application. The drawbacks are, at least in principle, that you have an additional indirection and of course a function call - which probably cannot be inlined by the compiler. Depending on the number of shader invocations I assume it can have a noticeable impact. However, I'm not aware of any actual performance comparisons with real-world code that prove or disprove the value of subroutines in a high-performance context. I'm pretty sure, however, there aren't many real applications out there which actually use subroutines anyway - or most of the GL4 stuff.
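
The function-pointer analogy can be sketched in plain C. This is only an illustration of the idea, not GLSL: ShadeFn stands in for a subroutine type, the two shade_* functions for subroutine variants, and the pointer passed to run_shader for the subroutine uniform. All of these names are made up for the example.

```c
#include <assert.h>

/* Signature shared by all variants, like a GLSL subroutine type. */
typedef float (*ShadeFn)(float base);

/* Two interchangeable variants with different logic. */
static float shade_plain(float base)  { return base; }
static float shade_darken(float base) { return base * 0.5f; }

/* Plays the role of the subroutine uniform: the application picks the
 * target, and the "shader" calls through it instead of branching. */
static float run_shader(ShadeFn shade, float base)
{
    assert(shade != 0);
    return shade(base);  /* the extra indirection mentioned above */
}
```

The call through the pointer is the indirection that a compiler generally cannot inline, which is the potential cost described in the post.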

    In the end, the only thing that's gonna give you certainty, at least for the platform you're developing on, is implementing multiple methods and simply profiling the result. Still, as I said, what works well on your platform is not guaranteed to run as fast on other platforms - but the opposite is true as well. That's the price we pay for doing cross-platform graphics programming.

    HTH.

  3. #3
    Intern Contributor
    Join Date
    Sep 2012
    Posts
    76
    As always, thokra, thank you for a clear and comprehensive reply. Shader subroutines are clearly what was missing from my list.

  4. #4
    Senior Member OpenGL Pro
    Join Date
    Jan 2012
    Location
    Australia
    Posts
    1,117
    "I'm not aware of any actual performance comparisons with real-world code that prove or disprove the value of subroutines"
    My tests to date imply a slight performance hit versus conditional branching based on a uniform when I have a small number of subroutine variations.
    The plus side is the shader code is much more readable.

  5. #5
    Senior Member OpenGL Pro
    Join Date
    Apr 2010
    Location
    Germany
    Posts
    1,129
    tonyo: If you increase the number of variations, how does it play out then? Did you test? What GPU?

  6. #6
    Senior Member OpenGL Pro
    Join Date
    Jan 2012
    Location
    Australia
    Posts
    1,117
    My test only had 4 subroutine variations and I was using an NVIDIA GTX 570. I have changed my code to use a uniform to switch between the 3 most common variants and then a subroutine for the wilder variations that I don't use often. This keeps the code reasonably neat. I haven't been back to test it more seriously because it is not my major bottleneck at the moment.

    There are 2 problems I see with dynamic subroutines.
    One is having to pass parameters even when the variation is not going to use any of them.
    The other is that you cannot just change 1 subroutine pointer; you have to rebuild all of them. Both of these things must affect performance.

    The logic I am using most in the fragment shader looks like this
    Code :
    fetch basic colour - single texture/vertex colour/uniform colour using uniform, all other variations like double sided textures, height based colours and specialised results lookups by dynamic subroutine
    lighting - by dynamic subroutine
    modify colour - wire frame, contouring and some other variations by dynamic subroutine
    store - by dynamic subroutine - depends on  g-buffer data I want to collect

    This means I can build the dynamic subroutine list once for a large number of render objects but can easily flick texturing on/off for a particular object.
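
The "rebuild all of them" point reflects how glUniformSubroutinesuiv() works: it takes one index per active subroutine uniform location of a stage in a single call, so changing one slot means resubmitting the whole array. A minimal C sketch of keeping that array on the application side follows; SubroutineState, set_subroutine and the bound of 8 slots are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_SUBROUTINE_UNIFORMS = 8 };  /* assumed upper bound for this sketch */

/* CPU-side copy of every subroutine selection for one shader stage. */
typedef struct {
    unsigned indices[MAX_SUBROUTINE_UNIFORMS];
    size_t   count;  /* active subroutine uniform locations in the stage */
    int      dirty;  /* nonzero when the array must be sent again */
} SubroutineState;

/* Change one slot; the whole array still has to be uploaded afterwards. */
static void set_subroutine(SubroutineState *s, size_t location, unsigned index)
{
    assert(location < s->count);
    if (s->indices[location] != index) {
        s->indices[location] = index;
        s->dirty = 1;
    }
}

/* With a GL context, a dirty state would be flushed in one call, e.g.
 *   glUniformSubroutinesuiv(GL_FRAGMENT_SHADER, (GLsizei)s->count, s->indices);
 */
```

Tracking a dirty flag lets the full list be built once for many render objects and resent only when a selection actually changes.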

  7. #7
    Intern Contributor
    Join Date
    Sep 2012
    Posts
    76
    Thokra,
    I have a question for you. You say
    "For the first case, where the logic is identical, you don't need branching in any way in the shader code because, well, the logic is identical. Just change the uniform values..." I want to clarify this in a particular case I have.

    Suppose I am drawing two different objects. Specifically, in the app I have:
    ...
    bind buffer
    glVertexAttribPointer(0, point to buffer location for coord values of obj1);
    glVertexAttribPointer(1, point to buffer location for color values of obj1);
    ...
    bind another buffer
    glVertexAttribPointer(2, point to buffer location for coord values of obj2);
    glVertexAttribPointer(3, point to buffer location for color values of obj2);

    Correspondingly, in the VS I have
    layout(location=0) in vec4 obj1Coords;
    layout(location=1) in vec4 obj1Colors;
    layout(location=2) in vec4 obj2Coords;
    layout(location=3) in vec4 obj2Colors;

    Now, in this case don't I need a conditional in the VS (or subroutines) which does something like the following?

    if (current obj == obj1)
    {gl_Position = projectionMatrix * modelViewMatrix * obj1Coords;
    ...}
    if (current obj == obj2)
    {gl_Position = projectionMatrix * modelViewMatrix * obj2Coords;
    ...}

    As the attributes change I can't just get the job done with changing a uniform in the app and do need some form of branching in the shader, right?
    Thanks again.

  8. #8
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948
    Why would you ever do that? Why would you not simply put the vertex data in the same buffer object and render them all with the same draw call (even if it's a multi-draw call)?
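
As a rough sketch of the "same buffer object, one multi-draw call" suggestion: if the objects' vertices are stored back to back in one buffer, a multi-draw call only needs the first vertex and the vertex count of each object. The helper name pack_ranges is hypothetical, and the GL call appears only in a comment because it needs a live context.

```c
#include <assert.h>
#include <stddef.h>

/* Fill first[i] with the index of object i's first vertex when all
 * objects are stored back to back in one vertex buffer.
 * Returns the total number of vertices. */
static int pack_ranges(const int *counts, int *first, size_t objects)
{
    int next = 0;
    assert(counts != NULL && first != NULL);
    for (size_t i = 0; i < objects; ++i) {
        first[i] = next;
        next += counts[i];
    }
    return next;
}

/* With a GL context, the two arrays could feed a single call such as
 *   glMultiDrawArrays(GL_TRIANGLES, first, counts, (GLsizei)objects);
 */
```

With this layout both objects read from the same attribute locations, so the vertex shader needs no per-object inputs or branching.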

  9. #9
    Intern Contributor
    Join Date
    Sep 2012
    Posts
    76
    Thanks for the response, Alfonse.
    I see your point but how about if obj1 and obj2 are, additionally, transformed differently, i.e., the modelview matrix is MV1 for obj1 and MV2 for obj2? In this case, I need to reset the modelview matrix uniform between drawing the two. Is it possible to do this within a multidraw call?

  10. #10
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948
    "I see your point but how about if obj1 and obj2 are, additionally, transformed differently, i.e., the modelview matrix is MV1 for obj1 and MV2 for obj2? In this case, I need to reset the modelview matrix uniform between drawing the two. Is it possible to do this within a multidraw call?"
    And what if the number of draw calls isn't your performance bottleneck? Unless and until you have hard profiling data that says otherwise, just render. Render in the most obvious manner possible. Because until you know where you're slow, you're not going to know how to make it fast. And until you know that you're slow, you're never going to be able to tell if any of the things you do are an actual improvement in real-world scenarios.
