Part of the Khronos Group
OpenGL.org

Thread: Uniform Buffer Objects (slow) - help needed

  1. #1
    Intern Newbie
    Join Date
    Feb 2010
    Posts
    39

    Uniform Buffer Objects (slow) - help needed

    I've already seen the "VBOs strangely slow?" thread. I've read through it twice, and it all makes sense. However, my problem is a bit different in that it deals with UBOs and not VBOs. Behind the scenes, though, are they entirely the same?

    glMapBufferRange makes good sense to me, as it seems to mimic (for the most part) what DirectX has always had in terms of buffer object locking/unlocking.

    The simple case that I currently have working is a shader program with a constant block that's updated via a UBO. The constant block is structured as follows:

    Code :
    uniform DF_GLOBALS
    {
        mat4    WorldView;
        mat4    WorldViewProj;
     
        vec4    BackBufferInfo;
        vec3    CameraInfo;
        vec2    ViewportInfo;
    };

    As you can see, there are fewer than 200 bytes of data there.
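To put a number on the block size, here is a minimal sketch in C that walks the members with the std140 alignment rules. It assumes std140 is in effect (the snippet above does not show a layout qualifier), and the names `Std140Member`, `align_up`, and `std140_layout` are made up for illustration. Rounding the total up to 16 bytes is also an assumption; the authoritative size comes from querying GL_UNIFORM_BLOCK_DATA_SIZE on the linked program.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align (align must be > 0). */
static size_t align_up(size_t offset, size_t align)
{
    assert(align > 0);
    return (offset + align - 1) / align * align;
}

/* Hypothetical description of one block member: std140 base alignment
   and size, both in bytes. */
typedef struct {
    const char *name;
    size_t      align;
    size_t      size;
} Std140Member;

/* std140 rules for the types used in DF_GLOBALS:
     mat4 -> four vec4 columns: align 16, size 64
     vec4 -> align 16, size 16
     vec3 -> align 16, size 12
     vec2 -> align 8,  size 8 */
static const Std140Member df_globals[] = {
    { "WorldView",      16, 64 },
    { "WorldViewProj",  16, 64 },
    { "BackBufferInfo", 16, 16 },
    { "CameraInfo",     16, 12 },
    { "ViewportInfo",    8,  8 },
};
enum { DF_GLOBALS_COUNT = sizeof df_globals / sizeof df_globals[0] };

/* Fill offsets[] with each member's byte offset and return the total
   size, rounded up to 16 bytes (an assumption, see the note above). */
static size_t std140_layout(const Std140Member *members, size_t count,
                            size_t *offsets)
{
    size_t cursor = 0;
    for (size_t i = 0; i < count; ++i) {
        cursor = align_up(cursor, members[i].align);
        offsets[i] = cursor;
        cursor += members[i].size;
    }
    return align_up(cursor, 16);
}
```

Under these assumptions the members land at offsets 0, 64, 128, 144, and 160, for a 176-byte block.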

    However, the problem is largely with glMapBufferRange and to a smaller degree, glUnmapBuffer.

    I'm currently mapping the entire buffer and using the following flags to request that the driver discard the buffer memory, which may still be in use, and hand me back a pointer to new memory if necessary:

    Code :
    GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT

    That call to glMapBufferRange alone is taking just under 1 millisecond. Maybe (hopefully) I'm doing something wrong?

    Using either GL_DYNAMIC_DRAW or GL_STREAM_DRAW at buffer creation time makes no difference.

    The old school method of glBufferData( NULL ) to discard in conjunction with glMapBuffer does help quite a bit (relatively speaking). But even then, it's still around the 0.1 to 0.2 ms range, per update.

    This becomes unbearably slow very quickly if I try to draw many objects, each with its own updated WorldView and/or WorldViewProj matrices. In that case I'm making many calls to glMapBufferRange per frame. Is there a better way to do that?

    Hardware is ATI HD4850 with latest drivers.

    Any ideas? Thanks.

  2. #2
    Junior Member Regular Contributor
    Join Date
    Mar 2009
    Posts
    153

    Re: Uniform Buffer Objects (slow) - help needed

    Some quick ideas:

    - try different ubo storage (std140, packed)
    - try glMapBuffer(GL_WRITE_ONLY) and use GL_DYNAMIC_DRAW and GL_UNIFORM_BUFFER target when creating buffer

  3. #3
    Intern Newbie
    Join Date
    Feb 2010
    Posts
    39

    Re: Uniform Buffer Objects (slow) - help needed

    Quote Originally Posted by randall
    - try different ubo storage (std140, packed)
    I'm already using std140. I don't see how that could/should affect mapping performance though.


    Quote Originally Posted by randall
    - try glMapBuffer(GL_WRITE_ONLY) and use GL_DYNAMIC_DRAW and GL_UNIFORM_BUFFER target when creating buffer
    I'm already using GL_UNIFORM_BUFFER as target at creation time. What else can you use there without OpenGL complaining? I've tried both GL_DYNAMIC_DRAW and GL_STREAM_DRAW, neither seems to make a difference.


    Just so I'm clear, what's the intended use of UBOs? Is it so you can quickly swap out different values for the same set of constants (i.e. init a couple of buffers once and then don't touch them again - just swap amongst them)? Don't see how that's useful though.

    Or are UBOs intended to be used like most other buffer-backed operations, which benefit from async updates? I assume this is the intended usage pattern, so you can quickly and repeatedly update any or all constants multiple times per frame via OpenGL's buffer object mechanism.

  4. #4
    Junior Member Regular Contributor
    Join Date
    Mar 2009
    Posts
    153

    Re: Uniform Buffer Objects (slow) - help needed

    The default ubo storage is shared. So the code you posted uses shared. It should not affect mapping performance but there might be a driver bug so you should check all storages.

    GL buffers are only raw containers of bytes. You can use any target for any purpose, but there may be a performance penalty.

    I recommend reading the extension spec, http://www.opengl.org/registry/specs/ARB/uniform_buffer_object.txt, which says:

    Storing uniform blocks in buffer objects enables several key use
    cases:

    - sharing of uniform data storage between program objects and
    between program stages

    - rapid swapping of sets of previously defined uniforms by storing
    sets of uniform data on the GL server

    - rapid updates of uniform data from both the client and the server

  5. #5
    Intern Newbie
    Join Date
    Feb 2010
    Posts
    39

    Re: Uniform Buffer Objects (slow) - help needed

    Quote Originally Posted by randall
    The default ubo storage is shared. So the code you posted uses shared.
    Yes, I know. The code I posted was only a snippet of a larger piece, where I specify:
    Code :
    layout(std140) uniform;
    I'm certain that all of my constants are using std140. Functionally, everything is working. Objects are rendering where they should be with proper orientations etc.

    Quote Originally Posted by randall
    - rapid updates of uniform data from both the client and the server
    Not so much. It's the performance that's just not there.

    I guess I could try using shared, but I'll have to query for and store offsets, which the code isn't currently set up to do.
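For reference, a rough sketch in C of what "query for and store offsets" might look like on the client side. The real offsets would come from glGetUniformIndices followed by glGetActiveUniformsiv with GL_UNIFORM_OFFSET, and the block size from GL_UNIFORM_BLOCK_DATA_SIZE; the code below only shows how the stored table could be used when writing into the mapped pointer. `BlockLayout`, `write_member`, and the `DF_*` enum are hypothetical names, and copying a mat4 as 16 contiguous floats assumes a matrix stride of 16 bytes (under shared that should be checked with GL_UNIFORM_MATRIX_STRIDE).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical indices for the members of DF_GLOBALS. */
enum {
    DF_WORLD_VIEW,
    DF_WORLD_VIEW_PROJ,
    DF_BACK_BUFFER_INFO,
    DF_CAMERA_INFO,
    DF_VIEWPORT_INFO,
    DF_MEMBER_COUNT
};

/* Offsets queried once after linking and then cached.
   offsets[]  : values as returned for GL_UNIFORM_OFFSET
   block_size : value as returned for GL_UNIFORM_BLOCK_DATA_SIZE */
typedef struct {
    int    offsets[DF_MEMBER_COUNT];
    size_t block_size;
} BlockLayout;

/* Copy float_count floats to the member's queried offset inside the
   mapped buffer. Assumes the member is stored contiguously, which for
   mat4 means a matrix stride of 16 bytes. */
static void write_member(unsigned char *mapped, const BlockLayout *layout,
                         int member, const float *values, size_t float_count)
{
    assert(member >= 0 && member < DF_MEMBER_COUNT);
    size_t offset = (size_t)layout->offsets[member];
    size_t bytes  = float_count * sizeof(float);
    assert(offset + bytes <= layout->block_size);
    memcpy(mapped + offset, values, bytes);
}
```

The rest of the update path would stay the same; only the fixed struct copy is replaced by one `write_member` call per uniform.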

  6. #6
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948

    Re: Uniform Buffer Objects (slow) - help needed

    Just so I'm clear, what's the intended use of UBOs?
    There are many intended uses of them. Frequency of updates, however, is not one of them. If you're trying to update a UBO more than once per frame, you're using it wrong.

    You can use UBOs to share particular data among several programs. For example, if you have a camera matrix and a projection matrix, these are constant for all entities in the scene. As long as all of these shaders use the same UBO, you can update them just by changing one UBO.

    You can also use UBOs to store per-instance data. But this doesn't mean you change the buffer's data every time you draw another instance.

    For example, if you have the model-to-world (MTW) matrix as per-instance data, you can allocate a "large" buffer object. From that buffer, you can allocate slices that are the size of the MTW matrix, one allocation for each instance. When you are preparing to draw your instances, you map the entire buffer and update everyone's per-instance data. Note that this happens only once per frame.
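The slicing scheme described above could be sketched in C roughly as follows. This is only an illustration under some assumptions: `Mat4`, `slice_stride`, `slice_offset`, and `write_all_instances` are made-up names, and the `mapped` pointer stands in for whatever a single glMapBufferRange call returns for the frame. Each slice has to start at a multiple of GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT (queried with glGetIntegerv), because that is what glBindBufferRange requires of its offset when binding to GL_UNIFORM_BUFFER.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-instance payload: one model-to-world matrix. */
typedef struct { float m[16]; } Mat4;

/* Distance in bytes between consecutive slices: the slice size rounded
   up to the driver's uniform buffer offset alignment. */
static size_t slice_stride(size_t slice_size, size_t offset_alignment)
{
    assert(offset_alignment > 0);
    return (slice_size + offset_alignment - 1)
           / offset_alignment * offset_alignment;
}

/* Byte offset of one instance's slice inside the large buffer. */
static size_t slice_offset(size_t instance, size_t stride)
{
    return instance * stride;
}

/* Write every instance's matrix into its slice in a single pass.
   Intended to run once per frame, between one map and one unmap. */
static void write_all_instances(unsigned char *mapped, size_t stride,
                                const Mat4 *model_to_world,
                                size_t instance_count)
{
    for (size_t i = 0; i < instance_count; ++i)
        memcpy(mapped + slice_offset(i, stride),
               &model_to_world[i], sizeof(Mat4));
}

/* At draw time no mapping happens. Each instance only rebinds its slice:
     glBindBufferRange(GL_UNIFORM_BUFFER, bindingIndex, ubo,
                       slice_offset(i, stride), sizeof(Mat4));
   where bindingIndex and ubo are the application's own values. */
```

With a 64-byte matrix and an alignment of 256, each instance occupies a 256-byte slice, so 1000 instances need a 256,000-byte buffer that is mapped once per frame.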
