
Thread: glBufferData just one time at start

  1. #1
    Junior Member Newbie
    Join Date: Jul 2017
    Posts: 5

    glBufferData just one time at start

    Hi!

    I am an AI programmer trying to improve my knowledge of graphics programming in general.
    I am writing my own game engine. I have a lot of things done already, but I am still trying to figure out the best way of drawing a lot of sprites.

    An idea has been in my head for quite a long time: instead of updating the vertex buffer right before drawing each object, preallocate the positions and texture coordinates as soon as you create the texture.
    My engine is very data-driven, so I can serialize a lot of classes and data.
    I was thinking of creating the texture, the vertex buffer and the index buffer at startup and calling glBufferData directly with the proper positions and texture coordinates; then, when an object needs to be rendered, I would just pass the transformation matrix to the shader using a uniform.

    Does that approach make sense?

    The thing is that, so far, calling glBufferSubData for each renderable gives me rather poor performance (around 60 fps drawing 3000 sprites, which seems very, very slow to me).

    Thanks in advance!

  2. #2
    Senior Member OpenGL Pro
    Join Date: Jan 2007
    Posts: 1,715
    The way you seem to be doing this is that you have a single buffer, sized for one sprite, and you make 3000 glBufferSubData calls per frame. Alternatively, each sprite may have its own buffer, so you now make 3000 glBindBuffer and 3000 glBufferSubData calls per frame.

    Both of these are going to be slow.

    The alternative you mention is also going to be slow, because now it's 3000 matrix uploads to the GPU per frame.

    The right way is to have one buffer, sized for 3000 sprites. Make one glBufferSubData call per frame. Make one glDrawArrays/glDrawElements call per frame.

    If you can't make one glDraw* call per frame, perhaps because your sprites have different textures, so be it: use the parameters of your glDraw* call to select subranges of the buffer to draw, but at least get those buffer updates under control.

    Don't unbind or disable state after each draw call because it will mess with any state change filtering or batching that your driver might otherwise be able to do.

    Even this may not increase performance - with 3000 sprites, depending on how large they are, fillrate can be your primary bottleneck. If so, that's OK - at least you'll know you're bottlenecked for the right reason.
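
    Something like this, purely as a sketch (hypothetical struct and function names; it assumes a GL 3.x context is current, the function pointers are loaded via GLEW, each sprite is 4 vertices drawn as two indexed triangles, and the VAO, shader and index buffer are already bound):
    Code :
    #include <GL/glew.h>
    #include <vector>

    // Hypothetical per-vertex layout: position + texture coordinates.
    struct SpriteVertex {
        float x, y;
        float u, v;
    };

    static const int MAX_SPRITES = 3000;

    // Create one vertex buffer large enough for every sprite, once at startup.
    GLuint CreateSpriteVBO()
    {
        GLuint vbo = 0;
        glGenBuffers(1, &vbo);
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        // Allocate storage only; the vertex data is streamed in each frame.
        glBufferData(GL_ARRAY_BUFFER,
                     MAX_SPRITES * 4 * sizeof(SpriteVertex),
                     nullptr, GL_STREAM_DRAW);
        return vbo;
    }

    // Per frame: build all sprite vertices on the CPU, upload them with one
    // glBufferSubData call, then draw them with one glDrawElements call.
    void DrawSprites(GLuint vbo, const std::vector<SpriteVertex>& vertices)
    {
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        glBufferSubData(GL_ARRAY_BUFFER, 0,
                        vertices.size() * sizeof(SpriteVertex),
                        vertices.data());

        const GLsizei indexCount = GLsizei(vertices.size() / 4) * 6;
        glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_SHORT, nullptr);
    }

    The index buffer never changes, so it can be filled once at startup with the usual 0-1-2, 2-1-3 pattern for each quad.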

  3. #3
    Junior Member Newbie
    Join Date: Jul 2017
    Posts: 5
    Hi mhagain!

    Thanks for your reply. I was thinking about that possibility, but one more question arises.
    Right now my shader expects a modelviewMatrix per sprite (also a ProjectionMatrix, but that one is global for all of the sprites), so it is configured as a uniform. Do you recommend including the modelviewMatrix as part of the vertex attributes?
    Because I see only 2 options here:
    1.- Pass the matrix per vertex (so the matrix will be duplicated 4 times, which seems rather wasteful).
    2.- Perform the matrix transformation on the CPU before sending the vertices to the GPU.

    What would be the right approach?

    Thanks! I am enjoying graphics programming a lot!

  4. #4
    Senior Member OpenGL Pro
    Join Date: Jan 2007
    Posts: 1,715
    Approach 3 would be to use instancing; this would allow you to have one matrix per sprite but have it available as vertex attribs. Be aware though that in terms of performance there's not a huge gain from the memory saving; GPU programming is often counter-intuitive like this and sometimes "wasting memory" can actually lead to higher performance (which is one reason why I hate the word "waste" in this context; if it gives you something in return for it, surely it's "use", not "waste"?)
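
    For illustration, a sketch of that instanced setup (hypothetical names; it assumes GL 3.3, a bound VAO whose locations 0 and 1 already hold the per-vertex position and texture coordinates, and a buffer containing one column-major 4x4 matrix per sprite). A mat4 attribute occupies four consecutive attribute locations, each advanced once per instance via glVertexAttribDivisor:
    Code :
    #include <GL/glew.h>

    // Attribute locations 2..5 hold one mat4 per sprite.
    void SetupInstanceMatrixAttrib(GLuint matrixVBO)
    {
        glBindBuffer(GL_ARRAY_BUFFER, matrixVBO);
        for (int col = 0; col < 4; ++col) {
            GLuint loc = GLuint(2 + col);
            glEnableVertexAttribArray(loc);
            glVertexAttribPointer(loc, 4, GL_FLOAT, GL_FALSE,
                                  16 * sizeof(float),
                                  (const void*)(col * 4 * sizeof(float)));
            glVertexAttribDivisor(loc, 1);   // advance once per instance, not per vertex
        }
    }

    // One call draws every sprite; each instance reads its own matrix.
    void DrawSpritesInstanced(GLsizei spriteCount)
    {
        glDrawElementsInstanced(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT,
                                nullptr, spriteCount);
    }

    In the vertex shader the matrix then appears as a single attribute, declared as layout(location = 2) in mat4 instanceMatrix;, and is used in place of the per-sprite modelview uniform.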

  5. #5
    Senior Member OpenGL Guru
    Join Date: Jun 2013
    Posts: 2,474
    Quote Originally Posted by urosidoki
    Right now my shader expects a modelviewMatrix per sprite (also a ProjectionMatrix, but that one is global for all of the sprites), so it is configured as a uniform. Do you recommend including the modelviewMatrix as part of the vertex attributes?
    Because I see only 2 options here:
    1.- Pass the matrix per vertex (so the matrix will be duplicated 4 times, which seems rather wasteful).
    2.- Perform the matrix transformation on the CPU before sending the vertices to the GPU.
    mhagain gave you a third option (instancing). A fourth option is to send each sprite as a point and use a geometry shader to convert each point to a pair of triangles.

    Option 1 is feasible, but you might want to condense the matrix. For sprites, you often only need translation, rotation and uniform scale, in which case you only need 4 values, as the matrix will always have the form
    Code :
    u -v x
    v  u y
    0  0 1

    This may well be more efficient than either instancing or a geometry shader.
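
    As an illustration, that condensed form might be applied in the vertex shader along these lines (a sketch; the shader is shown as a C++ string, the attribute names are hypothetical, and the four values u, v, x, y are assumed to arrive as a single vec4 attribute):
    Code :
    // Vertex shader: reconstructs the 3x3 matrix above from 4 values
    // (u, v = scaled cosine/sine of the rotation, x, y = translation).
    const char* kSpriteVS = R"(
    #version 330 core
    layout(location = 0) in vec2 inPosition;   // sprite-local position
    layout(location = 1) in vec2 inTexCoord;
    layout(location = 2) in vec4 inTransform;  // (u, v, x, y)

    uniform mat4 projection;
    out vec2 texCoord;

    void main()
    {
        float u = inTransform.x;
        float v = inTransform.y;
        vec2  t = inTransform.zw;
        // Same as multiplying by the 3x3 matrix [u -v x; v u y; 0 0 1].
        vec2 world = vec2(u * inPosition.x - v * inPosition.y,
                          v * inPosition.x + u * inPosition.y) + t;
        gl_Position = projection * vec4(world, 0.0, 1.0);
        texCoord = inTexCoord;
    }
    )";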

    If you're particularly concerned about memory, you can store an array of matrices (one per sprite) as a uniform variable or (if limits on uniform storage are an issue) a texture, then index the array using either gl_VertexID or an integer vertex attribute. But the array/texture lookup has a cost relative to receiving the data directly as a vertex attribute.
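
    For example, the gl_VertexID variant could look roughly like this (a sketch with hypothetical names; the usable array size depends on GL_MAX_VERTEX_UNIFORM_COMPONENTS, so a large sprite count would need the texture route or split draws):
    Code :
    // Vertex shader sketch: one mat4 per sprite in a uniform array, indexed
    // from gl_VertexID. This assumes each sprite's 4 vertices are stored
    // consecutively in the vertex buffer.
    const char* kIndexedVS = R"(
    #version 330 core
    layout(location = 0) in vec2 inPosition;
    layout(location = 1) in vec2 inTexCoord;

    uniform mat4 projection;
    uniform mat4 transforms[64];   // one per sprite; 64 is just an example size

    out vec2 texCoord;

    void main()
    {
        mat4 model = transforms[gl_VertexID / 4];
        gl_Position = projection * model * vec4(inPosition, 0.0, 1.0);
        texCoord = inTexCoord;
    }
    )";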

    Also, if you're splitting draw calls because of different textures, consider using either an array texture or an array of samplers (with the texture ID passed similarly to the transformation) so that you can coalesce the draw calls. This would also avoid needing to group sprites by texture. In turn, that would allow you to sort by depth, so you can render either back to front (eliminating the need for a depth buffer) or front to back (maximising the effect of early depth tests).
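
    A sketch of the array-texture side of that (hypothetical names; it assumes GL 4.2+ for glTexStorage3D and that all sprite images share the same dimensions). The per-sprite layer index travels with the vertex data, so differently textured sprites no longer force separate draw calls:
    Code :
    #include <GL/glew.h>

    // Allocate a 2D array texture with one layer per sprite image.
    GLuint CreateSpriteArrayTexture(GLsizei width, GLsizei height, GLsizei layers)
    {
        GLuint tex = 0;
        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_2D_ARRAY, tex);
        glTexStorage3D(GL_TEXTURE_2D_ARRAY, 1, GL_RGBA8, width, height, layers);
        glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
        return tex;
    }

    // Upload one image into a given layer (pixel loading omitted here).
    void UploadLayer(GLuint tex, GLsizei width, GLsizei height,
                     GLint layer, const void* pixels)
    {
        glBindTexture(GL_TEXTURE_2D_ARRAY, tex);
        glTexSubImage3D(GL_TEXTURE_2D_ARRAY, 0,
                        0, 0, layer, width, height, 1,
                        GL_RGBA, GL_UNSIGNED_BYTE, pixels);
    }

    In the fragment shader the layer index becomes the third texture coordinate: declare uniform sampler2DArray sprites; and sample with texture(sprites, vec3(texCoord, layer)).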

  6. #6
    Junior Member Newbie
    Join Date: Jul 2017
    Posts: 5
    Hi guys I got news,

    Since my level is made of tiles, I just built a class called Mesh that collects all the vertices + indices for the tiles that share the same texture.
    So now I am using just 1 draw call, and the performance has increased a lot.

    I will try your suggestion about the matrix.
    I still have a lot of questions, but I will continue investigating a bit more.

    So thanks for your advice, it was very helpful!

  7. #7
    Senior Member OpenGL Guru
    Join Date: Jun 2013
    Posts: 2,474
    Quote Originally Posted by urosidoki
    Since my level is made of tiles
    FWIW, you can draw a tile map as a single large quad (pair of triangles), with the fragment shader dividing the pixel coordinates by the tile size to obtain the tile indices (quotient) and the offset within the tile (remainder).
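
    To illustrate the single-quad idea, the fragment shader could do the tile lookup along these lines (a sketch with hypothetical names; it assumes the tile images live in a 2D array texture and the per-tile indices live in an unsigned-integer map texture, one texel per tile):
    Code :
    const char* kTileMapFS = R"(
    #version 330 core
    in vec2 worldPos;                 // interpolated position in pixels
    uniform usampler2D tileMap;       // one texel per tile: the tile index
    uniform sampler2DArray tileSet;   // one layer per tile image
    uniform vec2 tileSize;            // tile size in pixels

    out vec4 fragColor;

    void main()
    {
        ivec2 tileCoord = ivec2(floor(worldPos / tileSize));  // which tile (quotient)
        vec2  offset    = mod(worldPos, tileSize) / tileSize; // where inside it (remainder)
        uint  tileIndex = texelFetch(tileMap, tileCoord, 0).r;
        fragColor = texture(tileSet, vec3(offset, float(tileIndex)));
    }
    )";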

    Even if you don't go that far, a grid of tiles warrants a different approach to sprites.

  8. #8
    Junior Member Newbie
    Join Date: Jul 2017
    Posts: 5
    Thanks GClements, I might try that just out of curiosity and to improve my skills with shaders!

    Oh, by the way, regarding the model matrix: I realized that I need to rotate the sprite vertices on the CPU anyway, because I need to calculate the AABB, so I will probably not even try to pass the matrix as a vertex attribute.

  9. #9
    Senior Member OpenGL Guru
    Join Date: Jun 2013
    Posts: 2,474
    Quote Originally Posted by urosidoki
    Oh, by the way, regarding the model matrix: I realized that I need to rotate the sprite vertices on the CPU anyway, because I need to calculate the AABB.
    Another option is to use transform feedback mode to capture the transformed vertices. But the likely pipeline stall from copying the data back to CPU memory may well make this approach slower overall than transforming the vertices on the CPU.

  10. #10
    Junior Member Newbie
    Join Date: Jul 2017
    Posts: 5
    Actually, I have been thinking of a more elegant solution.

    I could generate the AABB of the vertices without any rotation, then rotate that AABB (so it will no longer be an AABB, but an oriented box), and finally calculate the AABB of the rotated box.
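
    For example, a small sketch of that computation (hypothetical names): rotate the four corners of the unrotated AABB around the sprite's pivot and take the min/max of the results.
    Code :
    #include <algorithm>
    #include <cmath>

    struct AABB { float minX, minY, maxX, maxY; };

    // Rotate the unrotated AABB's corners by 'angle' (radians) around the
    // pivot (cx, cy) and return the axis-aligned box enclosing the result.
    AABB RotatedAABB(const AABB& box, float angle, float cx, float cy)
    {
        const float c = std::cos(angle), s = std::sin(angle);
        const float xs[2] = { box.minX, box.maxX };
        const float ys[2] = { box.minY, box.maxY };

        AABB out = { 1e30f, 1e30f, -1e30f, -1e30f };
        for (float x : xs)
            for (float y : ys) {
                // Rotate the corner around the pivot.
                const float rx = cx + (x - cx) * c - (y - cy) * s;
                const float ry = cy + (x - cx) * s + (y - cy) * c;
                out.minX = std::min(out.minX, rx);
                out.minY = std::min(out.minY, ry);
                out.maxX = std::max(out.maxX, rx);
                out.maxY = std::max(out.maxY, ry);
            }
        return out;
    }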

    I think that will work and I can still try to experiment with sending the matrix in the vertex array.

    Cheers!
