
Thread: Nvidia 319/320 drivers and shared layout UBOs

  1. #1
    Member Regular Contributor malexander's Avatar
    Join Date
    Aug 2009
    Location
    Ontario
    Posts
    328

    Nvidia 319/320 drivers and shared layout UBOs

    I've run across a change in the Nvidia 319/320 series drivers with regard to std140/packed uniform buffer objects that is causing havoc in our application.

    Code :
    #version 150
     
    layout(shared) uniform scene
    {
         float A;
         float B;
         float C;
    };
     
    out float result;
     
    void main()
    {
         result = A + B;
    }

    In the above shader, when the number of uniforms is queried via glGetActiveUniformBlockiv(... GL_UNIFORM_BLOCK_ACTIVE_UNIFORMS) it is returning "2". This is a change from previous Nvidia drivers where the same code returned "3". While this seems to be in line with the whole notion of active uniforms, it isn't terribly convenient when applied to shared uniform blocks, especially when different shaders may use subsets of the uniforms within the uniform block.

    The reason this is causing problems for our application is that it queries all the offsets, names, and sizes of the shared uniform block from the shader and caches them in a C++ object representing the uniform block (backed by a buffer). While I can make modifications to allow piecemeal caching of the various active uniforms' data used by shaders as the object is reused, I'd rather not unless this is truly the intent of the GL spec. The GLSL and GL specs talk mostly about memory layout when discussing shared and std140 uniform blocks, while the OpenGL wiki seems to indicate that all uniforms should be considered active and not optimized out for packed/std140 (http://www.opengl.org/wiki/Interface...#Memory_layout).
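
    The piecemeal caching described above could look roughly like the following. This is only a sketch: `UniformInfo` and `UniformBlockCache` are hypothetical names, and the per-program list is assumed to have already been filled from the usual queries (for example `glGetActiveUniformsiv` with `GL_UNIFORM_OFFSET` and `GL_UNIFORM_SIZE`), so no GL calls appear here.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record for one uniform as reported by a single program.
struct UniformInfo {
    std::string name;
    int offset;  // byte offset within the uniform block
    int size;    // array size (1 for non-arrays)
};

// Hypothetical cache for one shared uniform block, filled in piecemeal as
// each program reports whatever subset of members its driver exposes.
class UniformBlockCache {
public:
    // Merge the uniforms reported by one program. Returns false, and leaves
    // the cache untouched, if a uniform that is already cached is reported
    // with a different offset or size.
    bool merge(const std::vector<UniformInfo>& reported) {
        // First pass: check for conflicts with what is already cached.
        for (const UniformInfo& u : reported) {
            auto it = members_.find(u.name);
            if (it != members_.end() &&
                (it->second.offset != u.offset || it->second.size != u.size)) {
                return false;
            }
        }
        // Second pass: add any members not seen before.
        for (const UniformInfo& u : reported) {
            members_.emplace(u.name, u);  // no-op if the name already exists
        }
        return true;
    }

    // Look up a cached member; returns nullptr if no program has reported it yet.
    const UniformInfo* find(const std::string& name) const {
        auto it = members_.find(name);
        return it == members_.end() ? nullptr : &it->second;
    }

    std::size_t count() const { return members_.size(); }

private:
    std::map<std::string, UniformInfo> members_;
};
```

    With the shader above, a program that only reports A and B would leave the cache with two entries; a later program that references C would add the third, and a mismatch in offsets would be flagged instead of silently overwriting.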

    So, should I be submitting this as a driver bug, or is this proper behaviour?

  2. #2
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948
    it isn't terribly convenient when applied to shared uniform blocks
    This statement is true if you replace everything after "it" with "is in violation of the specification."

    This is a driver bug.

    while the OpenGL wiki seems to indicate that all uniforms should be considered active and not optimized out for packed/std140
    No, it doesn't. It specifically says that `packed` is allowed to optimize uniforms out; it's shared and std140 that don't. I don't even know why you're mentioning packed and std140 when you're not using either.

  3. #3
    Member Regular Contributor malexander's Avatar
    Join Date
    Aug 2009
    Location
    Ontario
    Posts
    328
    No, it doesn't. It specifically says that `packed` is allowed to optimize uniforms out; it's shared and std140 that don't. I don't even know why you're mentioning packed and std140 when you're not using either.
    Typo: I meant to say shared/std140.

    The uniform block size is constant for shared regardless of which uniforms were used in the shader and the compiler isn't changing the offsets or sizes for the uniforms it is reporting. It isn't "optimizing them out" in terms of shuffling the uniform offsets around; it just isn't reporting the uniforms that were not referenced in the shader, and that is the source of my confusion.

  4. #4
    Senior Member OpenGL Pro
    Join Date
    Jan 2012
    Location
    Australia
    Posts
    1,117
    Why do you need to query locations in a std140 buffer? They are at known offsets from the start of the buffer, or am I misunderstanding something?
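
    To illustrate what "known offsets" means here, the following is a rough sketch of the std140 rules for scalar and vector float members only (arrays, matrices and nested structs are deliberately left out). The names `Std140Type` and `std140Offsets` are made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical subset of GLSL types covered by this sketch.
enum class Std140Type { Float, Vec2, Vec3, Vec4 };

struct Std140Layout {
    std::size_t align;  // base alignment in bytes
    std::size_t size;   // bytes consumed by the member
};

// std140 base alignment and size for float scalars and vectors:
// float = 4, vec2 = 8, vec3 and vec4 are both aligned to 16 bytes.
inline Std140Layout layoutOf(Std140Type t) {
    switch (t) {
        case Std140Type::Float: return {4, 4};
        case Std140Type::Vec2:  return {8, 8};
        case Std140Type::Vec3:  return {16, 12};
        case Std140Type::Vec4:  return {16, 16};
    }
    return {16, 16};
}

// Compute the byte offset of each member, in declaration order.
inline std::vector<std::size_t> std140Offsets(const std::vector<Std140Type>& members) {
    std::vector<std::size_t> offsets;
    std::size_t cursor = 0;
    for (Std140Type t : members) {
        const Std140Layout l = layoutOf(t);
        cursor = (cursor + l.align - 1) / l.align * l.align;  // round up to alignment
        offsets.push_back(cursor);
        cursor += l.size;
    }
    return offsets;
}
```

    For a block of three floats like the one in the first post, this gives offsets 0, 4 and 8 without asking the driver anything. With `shared` there is no such fixed rule, so the offsets still have to be queried.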

  5. #5
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948
    It isn't "optimizing them out" in terms of shuffling the uniform offsets around; it just isn't reporting the uniforms that were not referenced in the shader, and that is the source of my confusion.
    I've looked at the spec, and it seems clear that the shared notion is not defined in terms of active uniforms. So the uniforms in a shared or std140 buffer are not considered active, and therefore are not necessarily queryable. But the offsets will always be done as though they were.

    That being said, as you yourself pointed out, this is counterproductive to the whole point of using `shared` to begin with. You need to be able to query all of the uniform offsets from any program. As such, I've filed a bug on that (for all the good it will do).

  6. #6
    Member Regular Contributor malexander's Avatar
    Join Date
    Aug 2009
    Location
    Ontario
    Posts
    328
    Quote Originally Posted by tonyo_au View Post
    Why do you need to query locations in a std140 buffer. They are at known offsets from the start of the buffer or am I misunderstanding something.
    My main concern was with shared uniform blocks; std140 just happened to demonstrate this behaviour as well. Since more program information is being defined by shaders these days (like locations), I'd prefer not to have to define the structure twice and keep them synchronized. But I would be fine with std140 not reporting all uniforms, as long as I have an alternative in 'shared' to report everything.

    Quote Originally Posted by Alfonse Reinheart View Post
    I've looked at the spec, and it seems clear that the shared notion is not defined in terms of active uniforms. So the uniforms in a shared or std140 buffer are not considered active, and therefore are not necessarily queryable. But the offsets will always be done as though they were.

    That being said, as you yourself pointed out, this is counterproductive to the whole point of using `shared` to begin with. You need to be able to query all of the uniform offsets from any program. As such, I've filed a bug on that (for all the good it will do).
    Thanks. I've also sent an email to our Nvidia contact, and adjusted our code to be more defensive for this case.

  7. #7
    Senior Member OpenGL Pro
    Join Date
    Jan 2012
    Location
    Australia
    Posts
    1,117
    an alternative in 'shared' to report everything.

    I assume you mean includes, and I agree. I have set up a template that generates both structures, and I have a pseudo-include in the shader via a pragma.
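
    One possible way to generate both structures from a single definition is an X-macro, sketched below. This is not the template mentioned above, just an illustration; `SCENE_MEMBERS`, `SceneBlock` and `sceneBlockGlsl` are hypothetical names, and the trick only works as written when the C++ type name matches the GLSL one and the C++ struct layout happens to match std140 (true for plain floats, not in general).

```cpp
#include <cstddef>
#include <string>

// Hypothetical single source of truth for the members of the block.
#define SCENE_MEMBERS(X) \
    X(float, A)          \
    X(float, B)          \
    X(float, C)

// C++ side: a plain struct with one field per member.
struct SceneBlock {
#define X(type, name) type name;
    SCENE_MEMBERS(X)
#undef X
};

// GLSL side: the matching uniform block declaration as source text,
// ready to be prepended to a shader in place of an include.
inline std::string sceneBlockGlsl() {
    std::string src = "layout(std140) uniform scene\n{\n";
#define X(type, name) src += "    " #type " " #name ";\n";
    SCENE_MEMBERS(X)
#undef X
    src += "};\n";
    return src;
}
```

    Adding a member to `SCENE_MEMBERS` then updates the C++ struct and the generated GLSL text together, so the two definitions cannot drift apart.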
