Thread: Custom Vertex Attributes

  1. #1
    Junior Member Newbie
    Join Date
    May 2016
    Posts
    21

    Custom Vertex Attributes

    I am trying to implement batch rendering, where I pass a large number of triangle vertices and the corresponding object data (texture control flags, extra variables, etc.). Right now, each sprite has just over 164 bytes of data. I know that OpenGL guarantees at least 16 vertex attributes, each holding up to 4 floats. A lot of batch rendering tutorials use them, but they never try to pass that much data for each object.

    I was thinking about separating all 164 bytes of data into chunks of 12 bytes, and loading them all into 16 vertex attributes. However, I've read about the attributes reserved by OpenGL for various purposes, which means I can't use them.

    Technically, of the 164 bytes, 20 bytes are used for vertex/uv coordinates, while the rest are per object (so they remain the same for the same object). I want to do batch rendering for 10 000 objects, and that requires 1.44 megabytes of non-vertex/uv coordinate data.

    I would like to know what other options I have for passing that much data for batch rendering (with decent performance, i.e. better than enormous uniform arrays).

  2. #2
    Senior Member OpenGL Pro
    Join Date
    Jan 2007
    Posts
    1,714
    Quote: Originally Posted by CaptainSnugglebottom
    I've read about the attributes reserved by OpenGL for various purposes, which means I can't use them.
    This is a common misunderstanding; most of the time the restriction does not apply and you are perfectly free to use them.

    Where the restriction does apply is if you are attempting to use a generic attribute and its aliased fixed-function attribute in the same shader. So you cannot, for example, use glClientActiveTexture(GL_TEXTURE0)/glTexCoordPointer(...) and glVertexAttribPointer(8, ...) together.

    However, if you only use generic attributes you are completely free to use all of them and there is no problem. Note that this will always be the case if you're using a core profile.

    Likewise, if you never use glClientActiveTexture(GL_TEXTURE0)/glTexCoordPointer(...) you're completely free to use glVertexAttribPointer(8, ...) without any issues.
    Last edited by mhagain; 11-17-2017 at 07:35 AM.

  3. #3
    Senior Member OpenGL Guru
    Join Date
    Jun 2013
    Posts
    2,468
    Quote: Originally Posted by CaptainSnugglebottom
    Technically, of the 164 bytes, 20 bytes are used for vertex/uv coordinates, while the rest are per object (so they remain the same for the same object).
    In which case, you might be better off adding a single integer attribute for "object ID" and using that to index into uniform arrays (or textures if you would exceed the limits on uniforms). Or you might not (dependent fetches have a cost); you'd need to benchmark it to be sure.

  4. #4
    Junior Member Newbie
    Join Date
    May 2016
    Posts
    21
    I was thinking about using Uniform Buffers, since they can support up to 64 KB of data each, and I can have a few dozen of those.

    Would Uniform Buffers be significantly slower than vertex attributes, especially when it comes to that much data?

  5. #5
    Senior Member OpenGL Lord
    Join Date
    May 2009
    Posts
    5,924
    Technically, of the 164 bytes, 20 bytes are used for vertex/uv coordinates, while the rest are per object (so they remain the same for the same object).
    Then what you really want is to have some form of index for each vertex, which you use to fetch the per-object data from a buffer. Each vertex in the same object will have to carry a copy of that index, but two bytes per vertex is better than 144 bytes.
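    As a rough illustration of the index approach, here is a minimal CPU-side sketch in C++. The `Vertex` field layout (three position floats, two UV floats, a 16-bit index), the helper names, and the assumption of 4 vertices per sprite are all hypothetical; the thread only gives the 20-byte and 164-byte totals.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Per-vertex data: 20 bytes of coordinates plus a 2-byte object index.
// The field layout is an assumption; the thread only states the 20-byte total.
struct Vertex {
    float position[3];       // 12 bytes
    float uv[2];             //  8 bytes
    std::uint16_t objectId;  // index into the per-object array
    std::uint16_t pad;       // explicit padding, keeps the size a multiple of 4
};
static_assert(sizeof(Vertex) == 24, "unexpected padding in Vertex");

// 164 bytes total minus 20 bytes of coordinates, as stated in the thread.
constexpr std::size_t kPerObjectBytes = 164 - 20;

// Bytes needed when every vertex carries its own copy of the per-object data.
constexpr std::size_t duplicatedBytes(std::size_t objects, std::size_t vertsPerObject) {
    return objects * vertsPerObject * (20 + kPerObjectBytes);
}

// Bytes needed when vertices carry only an index and the per-object data
// is stored once per object in a separate buffer.
constexpr std::size_t indexedBytes(std::size_t objects, std::size_t vertsPerObject) {
    return objects * vertsPerObject * sizeof(Vertex) + objects * kPerObjectBytes;
}
```

    Under these assumptions, 10,000 sprites with 4 vertices each take about 6.56 MB when the per-object data is duplicated and about 2.4 MB with an index. An integer index like `objectId` would be specified with glVertexAttribIPointer rather than glVertexAttribPointer so that it reaches the shader as an integer.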

    I was thinking about using Uniform Buffers, since they can support up to 64 KB of data each
    No, they cannot. Even in OpenGL 4.6, the minimum value for GL_MAX_UNIFORM_BLOCK_SIZE is 16KB. Granted, most implementations of 4.0+ offer at least 64KB. But 16KB is the only value you can be certain of.

    What you want is best done with an SSBO, where the maximum size is measured in megabytes (the minimum required is 16MB, and most implementations offer limits that are "most of available GPU memory"). That way, you won't have to worry about whether adding more sprites will blow past some limit.
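    To put those limits into numbers, here is a small C++ sketch of the arithmetic. It assumes 144 bytes of tightly packed per-object data (164 minus 20, from the first post) and ignores any extra padding a GLSL block layout might add; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Per-object payload from the thread: 164 bytes total minus 20 bytes of coordinates.
constexpr std::size_t kPerObjectBytes = 144;

// How many whole objects fit into one block of the given size.
constexpr std::size_t objectsPerBlock(std::size_t blockBytes) {
    return blockBytes / kPerObjectBytes;
}

// How many blocks (or bound ranges) are needed to cover all objects.
inline std::size_t blocksNeeded(std::size_t objects, std::size_t blockBytes) {
    const std::size_t perBlock = objectsPerBlock(blockBytes);
    assert(perBlock > 0 && "block is too small for even one object");
    return (objects + perBlock - 1) / perBlock;  // round up
}
```

    With the guaranteed 16 KB uniform block size, 113 objects fit per block, so 10,000 objects need 89 blocks; at 64 KB it is 455 objects per block and 22 blocks. The whole 1.44 MB fits in a single block under the 16 MB SSBO minimum quoted above.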

  6. #6
    Senior Member OpenGL Guru
    Join Date
    Jun 2013
    Posts
    2,468
    Quote: Originally Posted by CaptainSnugglebottom
    I was thinking about using Uniform Buffers, since they can support up to 64 KB of data each, and I can have a few dozen of those.
    As Alfonse says, you aren't guaranteed more than 16K per UBO. Buffer textures can have at least 64K texels (so up to 1MB for e.g. GL_RGBA32UI), but are constrained to the available texture formats. SSBOs can be much larger but require OpenGL 4.3.

    Quote: Originally Posted by CaptainSnugglebottom
    Would Uniform Buffers be significantly slower than vertex attributes, especially when it comes to that much data?
    You'll have to measure it. Array accesses and texture lookups have a cost, but so does memory consumption. Which of the two wins out depends upon the specifics of the program, the hardware, screen resolution, etc.

  7. #7
    Junior Member Newbie
    Join Date
    May 2016
    Posts
    21
    64 KB is supported by the HD 4000 with the latest drivers, which is why I picked it. I'm not really targeting anything below the HD 4000.

    What you want is best done with an SSBO, where the maximum size is measured in megabytes (the minimum required is 16MB, and most implementations offer limits that are "most of available GPU memory"). That way, you won't have to worry about whether adding more sprites will blow past some limit.
    How much slower (or faster) is an SSBO compared to Uniform Buffers when it comes to uploading data? Since I have to load roughly 150 bytes for each of the 10k+ objects, I can either do it with a single SSBO, or with multiple 64 KB Uniform Buffers (one for each per-object variable).

    Since a single SSBO can fit everything I need, I'm guessing it will be quicker to load, since I don't have to bind/unbind when I load different variables (as opposed to binding/unbinding multiple Uniform Buffers when I switch between the data I'm trying to load).
    However, on OpenGL's side, is there any difference between an SSBO and a Uniform Buffer when it comes to performance, let's say, when using an SSBO and a Uniform Buffer of the same size?

  8. #8
    Senior Member OpenGL Lord
    Join Date
    May 2009
    Posts
    5,924
    How much slower (or faster) is SSBO compared to Uniform Buffers when it comes to uploading data?
    Buffer objects are not typed. I swear, someday I'm going to put that in my forum signature.

    The performance of transferring data to a buffer object has nothing to do with how it is used.

    I can either do it for a single SSBO, or multiple 64k long Uniform Buffers (one for each per object variable).
    I don't see why using a buffer for uniform data would in any way necessitate multiple distinct objects. You can bind a range of a buffer for use as uniform data. You would then change which range of the buffer to use based on which portion of the sprite batch you're rendering.
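    A minimal C++ sketch of how such ranges might be computed follows. Offsets passed to glBindBufferRange for GL_UNIFORM_BUFFER must be multiples of GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, so each batch is placed at an aligned offset. The `BufferRange` struct and both function names are hypothetical, and the alignment value would be queried from the driver at runtime.

```cpp
#include <cassert>
#include <cstddef>

// Offset and size of one batch's slice of a larger buffer.
struct BufferRange {
    std::size_t offset;
    std::size_t size;
};

// Round value up to the next multiple of alignment.
inline std::size_t alignUp(std::size_t value, std::size_t alignment) {
    assert(alignment > 0);
    return (value + alignment - 1) / alignment * alignment;
}

// Slice for batch number batchIndex when every batch holds objectsPerBatch
// objects of perObjectBytes each. offsetAlignment is the value reported for
// GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT.
inline BufferRange rangeForBatch(std::size_t batchIndex,
                                 std::size_t objectsPerBatch,
                                 std::size_t perObjectBytes,
                                 std::size_t offsetAlignment) {
    const std::size_t size = objectsPerBatch * perObjectBytes;
    const std::size_t stride = alignUp(size, offsetAlignment);
    return BufferRange{batchIndex * stride, size};
}

// With a GL context, each batch would then be bound before its draw call:
//   glBindBufferRange(GL_UNIFORM_BUFFER, bindingIndex, buffer, r.offset, r.size);
```

    For example, with 113 objects of 144 bytes per batch and an assumed alignment of 256, each slice is 16,272 bytes long and consecutive slices start 16,384 bytes apart.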

  9. #9
    Senior Member OpenGL Guru
    Join Date
    Jun 2013
    Posts
    2,468
    Quote: Originally Posted by CaptainSnugglebottom
    However, on OpenGL's side, is there any difference between SSBO and Uniform Buffer when it comes to performance, let's say, when using an SSBO and a Uniform Buffer of the same size?
    Maybe. Uniforms are guaranteed to be read-only, while SSBOs are read-write. The implementation can determine that a given shader never writes to an SSBO, but I wouldn't assume that it will always take full advantage of that.

    If you want to know for sure, measure it.

  10. #10
    Junior Member Newbie
    Join Date
    May 2016
    Posts
    21
    Buffer objects are not typed. I swear, someday I'm going to put that in my forum signature.

    The performance of transferring data to a buffer object has nothing to do with how it is used.
    Okay thanks, I will keep that in mind.


    Kind of a silly question, but can I do something like:

    Code :
    glBindVertexArray(VAO);
    glBindBuffer(GL_ARRAY_BUFFER,VBO);
    glBindBuffer(GL_SHADER_STORAGE_BUFFER,SSBO);
     
    ...
     
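    // For each object; arrayBufferStuff and ssBufferStuff are refilled on every iteration.
    for (int objectIndex = 0; objectIndex < graphics2DMaximumObjects; ++objectIndex) {
        // Arguments are: target, byte offset, byte size, source pointer.
        glBufferSubData(GL_ARRAY_BUFFER, sizeof(arrayBufferStuff)*objectIndex, sizeof(arrayBufferStuff), &arrayBufferStuff);
        glBufferSubData(GL_SHADER_STORAGE_BUFFER, sizeof(ssBufferStuff)*objectIndex, sizeof(ssBufferStuff), &ssBufferStuff);
    }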

    ... or should I buffer the data in C++ arrays first, before sending them to the GPU in one go? ...

    Code :
    glBindVertexArray(VAO);
    glBindBuffer(GL_ARRAY_BUFFER,VBO);
    glBufferData(GL_ARRAY_BUFFER, sizeof(arrayBufferStuff)*graphics2DMaximumObjects, &arrayBuffer, GL_STATIC_DRAW);
     
    glBindBuffer(GL_SHADER_STORAGE_BUFFER,SSBO);
    glBufferData(GL_SHADER_STORAGE_BUFFER, sizeof(ssBufferStuff)*graphics2DMaximumObjects, &ssBuffer, GL_STATIC_DRAW);

    I am especially sketched out by ...

    Code :
    glBindBuffer(GL_ARRAY_BUFFER,VBO);
    glBindBuffer(GL_SHADER_STORAGE_BUFFER,SSBO);

    ... since I don't know whether calling glBindBuffer twice in a row will overwrite binding points of two different types.
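    For reference, the second option (staging everything on the CPU and uploading once) could look roughly like the C++ sketch below. `StagingArray` and `SsBufferStuff` are hypothetical stand-ins for the poster's types, and the 36-float record is only an assumption chosen to match the 144 bytes of per-object data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-object record: 36 floats = 144 bytes.
struct SsBufferStuff {
    float values[36];
};

// CPU-side staging storage that is filled per object and uploaded in one call.
template <typename T>
class StagingArray {
public:
    explicit StagingArray(std::size_t count) : items_(count) {}

    // Overwrite the record for one object.
    void set(std::size_t index, const T& value) {
        assert(index < items_.size());
        items_[index] = value;
    }

    const T* data() const { return items_.data(); }
    std::size_t sizeBytes() const { return items_.size() * sizeof(T); }

private:
    std::vector<T> items_;
};

// With a GL context and the SSBO bound, the whole array goes up in one call:
//   glBufferData(GL_SHADER_STORAGE_BUFFER, staging.sizeBytes(), staging.data(), GL_DYNAMIC_DRAW);
```

    GL_DYNAMIC_DRAW in the comment is only a suggestion for data that is rewritten often. As for the two consecutive glBindBuffer calls: GL_ARRAY_BUFFER and GL_SHADER_STORAGE_BUFFER are separate binding targets, so binding a buffer to one does not replace the buffer bound to the other.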
