Let’s Build a Batcher!

Well hello there. After a lot of blood, sweat and failed experiments, I’ve decided to take this to the forums.

My goal is to build a draw call batcher that:
-is simple to use and set up
-can handle completely generic vertex data and shaders
-can handle transparency
-and is of course as performant as possible

I understand the design principles behind batching well enough. What I’m struggling to find are good practical examples that can be integrated into a game engine.

For my initial engine design, each renderable object has a ‘draw’ function that takes a batcher by reference. This function adds the object’s vertex data to the batcher and adds a draw command to the command queue.

The batcher itself has an ‘addDraw’ function. It takes the type of primitive to draw (tris, lines etc.), a pointer to an array of vertex data (passed as an array of bytes so that any generic vertex format can be used), a pointer to an array of GLushorts for the element data, a depth value used for sorting purposes only (it may not match the depth of every vertex!), and a stateGroup that represents the uniforms and OpenGL state that must be set for this draw command.

The problem is that for transparency, objects must be ordered back to front, and for efficient opaque rendering they must be grouped by shader, texture etc. before any draw calls are made.
What this means is that any changes to the OpenGL state made between adding draw data will not necessarily be applied in the order the data was added. In fact, draw commands essentially need to be attached to ‘descriptions’ of the OpenGL state for this re-ordered rendering to work properly. In my own attempts I created the ‘stateGroup’ class, which sets uniforms and transformation matrices in between draw calls.
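To make the two orderings concrete, here is a minimal sketch of how the command list could be sorted. Everything in it is an assumption for illustration: the DrawCommand fields, the idea of reducing a stateGroup to a single integer stateId, and the convention that a larger depth means farther from the camera.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical draw command holding only the fields needed for ordering.
struct DrawCommand {
    std::uint32_t stateId; // assumed id standing in for a stateGroup (shader, texture, uniforms)
    float depth;           // sort-only depth; larger = farther away (assumption)
    int tag;               // illustration only: identifies the command
};

// Opaque pass: group commands that share a state so each state is set once.
// stable_sort keeps submission order inside each state group.
inline void sortOpaque(std::vector<DrawCommand>& cmds) {
    std::stable_sort(cmds.begin(), cmds.end(),
        [](const DrawCommand& a, const DrawCommand& b) { return a.stateId < b.stateId; });
}

// Transparent pass: strictly back to front; state is ignored when ordering.
inline void sortTransparent(std::vector<DrawCommand>& cmds) {
    std::stable_sort(cmds.begin(), cmds.end(),
        [](const DrawCommand& a, const DrawCommand& b) { return a.depth > b.depth; });
}

// Counts how many times the state would have to be set for a given order
// (the first command always counts as one change).
inline int countStateChanges(const std::vector<DrawCommand>& cmds) {
    int changes = 0;
    for (std::size_t i = 0; i < cmds.size(); ++i) {
        if (i == 0 || cmds[i].stateId != cmds[i - 1].stateId) ++changes;
    }
    return changes;
}
```

With commands submitted in the state order 2, 1, 2, 1, the opaque sort brings the number of state changes down from four to two, while the transparent sort may well interleave states again because depth takes priority.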

Another problem is that for vertex data to be sorted, it must be stored in some sort of intermediate buffer before being uploaded to a vertex buffer object. I used a vector of chars (representing raw bytes, so that generic vertex data can be stored) and stored the offset and size of each draw command’s vertex data inside the draw command class. After the commands are sorted, the data is read back from the vector of chars and uploaded to a mapped vertex buffer in sorted order. The problem with this is that a copy of the vertex and element data is stored locally before being uploaded, which eats up a lot of extra time.
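The staging approach described above could look roughly like this. It is a simplified sketch rather than my actual code: the class and member names are made up, element data and the stateGroup are left out, and a plain char pointer stands in for the mapped vertex buffer so that it compiles without OpenGL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical draw command: where this draw's vertex bytes live in the staging vector.
struct DrawCommand {
    std::size_t vertexOffset; // byte offset into the staging vector
    std::size_t vertexSize;   // size in bytes
    float depth;              // sort-only depth; larger = farther away (assumption)
};

class StagingBatcher {
public:
    // First copy: vertex bytes go into the local staging vector.
    void addDraw(const void* vertexData, std::size_t vertexBytes, float depth) {
        const char* src = static_cast<const char*>(vertexData);
        commands_.push_back({staging_.size(), vertexBytes, depth});
        staging_.insert(staging_.end(), src, src + vertexBytes);
    }

    // Reorders the commands only; the staged bytes stay where they are.
    void sortBackToFront() {
        std::sort(commands_.begin(), commands_.end(),
            [](const DrawCommand& a, const DrawCommand& b) { return a.depth > b.depth; });
    }

    // Second copy: writes vertex data in command order into dst, which stands in
    // for a mapped vertex buffer. Returns the number of bytes written.
    std::size_t upload(char* dst, std::size_t dstCapacity) const {
        std::size_t written = 0;
        for (const DrawCommand& cmd : commands_) {
            if (cmd.vertexSize == 0) continue;
            assert(written + cmd.vertexSize <= dstCapacity);
            std::memcpy(dst + written, staging_.data() + cmd.vertexOffset, cmd.vertexSize);
            written += cmd.vertexSize;
        }
        return written;
    }

private:
    std::vector<char> staging_;         // local copy of all vertex bytes for this frame
    std::vector<DrawCommand> commands_; // one entry per addDraw call
};
```

Every vertex is copied twice per frame here, once in addDraw and once in upload, and that double copy is the cost in question.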

Also, as mentioned above, transparent and non-transparent objects must be sorted separately, so the batcher has separate transparent and non-transparent addDraw methods. This means the renderable has to ‘know’ whether it is transparent and call the correct method.

Now, how workable is this method? Is it really worth sorting all opaque objects at once, or should they be batched as they come, with a draw call dispatched whenever a different state is requested? At the moment the bottleneck seems to be the cost of storing local copies of the data. That seems necessary for the strict ordering of transparent objects, but not worth it for the performance sorting of opaques.
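For comparison, the ‘batch as they come’ alternative for opaques might be sketched like this. Again the names are hypothetical, an integer stateId stands in for a stateGroup, and a counter replaces the real draw call; in a real version the bytes would be written straight into the mapped vertex buffer rather than into a vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

class StreamingBatcher {
public:
    // Appends vertex bytes to the pending batch, flushing first if the
    // requested state differs from the state of the pending batch.
    void addDraw(std::uint32_t stateId, const void* data, std::size_t bytes) {
        if (!pending_.empty() && stateId != currentState_) flush();
        currentState_ = stateId;
        const char* src = static_cast<const char*>(data);
        pending_.insert(pending_.end(), src, src + bytes);
    }

    // Dispatches whatever is pending as a single draw call.
    void flush() {
        if (pending_.empty()) return;
        // A real implementation would apply the state and issue the draw here.
        ++drawCalls_;
        pending_.clear();
    }

    int drawCalls() const { return drawCalls_; }

private:
    std::vector<char> pending_;      // stand-in for the mapped vertex buffer
    std::uint32_t currentState_ = 0; // state of the pending batch
    int drawCalls_ = 0;              // number of batches dispatched so far
};
```

No sorting and no second copy are needed, but the number of draw calls now depends on submission order: objects submitted with states 1, 1, 2, 1 produce three draw calls, where a sorted batcher would need two.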

Oh, and it is assumed for this batcher that all data is dynamic and generated on the fly each frame.

Also, is this more appropriate for the advanced forum? I can move it if so.