Sprite Batcher Strategies

So, I want to make an efficient Sprite Batcher because I am nearly fill rate capped with a few Sprites.
Yesterday I implemented a SpriteSheet to eliminate all glBindTexture Calls during the scene. Works fine :).
My data is formatted this way: XYZ RGBA ST

Currently I send the position as a uniform to the shader. I guess I need to change that, transform the vertices on the CPU, and just send the final vertices.
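For illustration, a minimal sketch of what transforming on the CPU could look like with the XYZ RGBA ST layout above. The `Sprite` struct, its field names, and `buildQuad` are hypothetical; the sketch assumes each sprite is a rectangle centred on its position and rotated about the z axis.

```cpp
#include <array>
#include <cmath>

// One vertex in the layout from the post: XYZ RGBA ST (9 floats).
struct Vertex {
    float x, y, z;
    float r, g, b, a;
    float s, t;
};

// Hypothetical sprite description; the field names are illustrative only.
struct Sprite {
    float x, y, z;         // centre position
    float w, h;            // size
    float rotation;        // radians, about the z axis
    float s0, t0, s1, t1;  // sub-rectangle inside the sprite sheet
    float r, g, b, a;      // colour
};

// Produces the four final vertices of a sprite, so the shader no longer
// needs a per-sprite position uniform.
std::array<Vertex, 4> buildQuad(const Sprite& sp) {
    const float cs = std::cos(sp.rotation);
    const float sn = std::sin(sp.rotation);
    const float hw = sp.w * 0.5f;
    const float hh = sp.h * 0.5f;

    // Corners in local space, in order around the quad.
    const float lx[4] = {-hw,  hw, hw, -hw};
    const float ly[4] = {-hh, -hh, hh,  hh};
    const float ss[4] = {sp.s0, sp.s1, sp.s1, sp.s0};
    const float ts[4] = {sp.t0, sp.t0, sp.t1, sp.t1};

    std::array<Vertex, 4> out{};
    for (int i = 0; i < 4; ++i) {
        out[i] = Vertex{
            sp.x + lx[i] * cs - ly[i] * sn,  // rotate, then translate
            sp.y + lx[i] * sn + ly[i] * cs,
            sp.z,
            sp.r, sp.g, sp.b, sp.a,
            ss[i], ts[i]};
    }
    return out;
}
```

The four vertices returned here are what would be copied into the vertex buffer for that sprite.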

When the position of a sprite changes, I only need to update that specific sprite. But if you delete a sprite (which can happen very often in my type of game), what do you guys do then? Just rebuild the array and send everything again?

How do you keep track of which part of the VBO memory is “this one specific sprite”?
Or do you just send everything again?

Then for sorting:
I need to sort by z-order (yes, I use the depth test) for possible blending, and by the shader used, because I need to bind it before I call glDrawArrays.
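As a rough sketch of that sorting idea: order the sprites by z first and by shader second, then walk the sorted list and cut it into runs that share a shader, so each run needs one shader bind and one draw call. `DrawItem`, `Run`, `sortForDrawing` and `buildRuns` are hypothetical names, and the sketch assumes sprites with smaller z should be drawn first.

```cpp
#include <algorithm>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical per-sprite sort record.
struct DrawItem {
    float z;             // assumed: smaller z is drawn first
    unsigned shader;     // shader program id
    std::size_t sprite;  // index of the sprite this record refers to
};

// A range of consecutive items that can share one shader bind.
struct Run {
    unsigned shader;
    std::size_t first;
    std::size_t count;
};

// Orders by z, then by shader, so sprites on the same z layer that use the
// same shader end up next to each other.
void sortForDrawing(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return std::tie(a.z, a.shader) < std::tie(b.z, b.shader);
                     });
}

// Splits the sorted list into runs with the same shader.
std::vector<Run> buildRuns(const std::vector<DrawItem>& items) {
    std::vector<Run> runs;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (runs.empty() || runs.back().shader != items[i].shader) {
            runs.push_back({items[i].shader, i, 1});
        } else {
            ++runs.back().count;
        }
    }
    return runs;
}
```

This works best when sprites sit on a handful of discrete z layers; if every sprite has a unique z, the runs can degrade to one sprite each.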

Any good article or tip on how to organize all this is appreciated. When you search for “OpenGL Sprite Batcher” you get a lot of forum discussions but nothing really helpful, because those all cover individual smaller problems; there are no comprehensive articles.
Thanks a bunch.

fill rate capped with a few Sprites -> how many is ‘few’?

Your problem sounds similar to this thread, be sure to read it:
http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=290988

How big are your arrays? To ‘delete’ a sprite you can move it offscreen if you don’t want to rebuild the buffer every frame.
Rebuilding a few thousand vertices each frame should not be a problem.
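A third option, not suggested above but common for this kind of problem, is swap-and-pop: when a sprite is deleted, the last sprite in the array is moved into the hole, and a small map records which slot each sprite id currently occupies. The sketch below is CPU-side only and `SpriteStore` is a hypothetical name; the byte offset it returns is what one would pass to glBufferSubData to update a single sprite.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

struct Vertex { float x, y, z, r, g, b, a, s, t; };  // XYZ RGBA ST

class SpriteStore {
public:
    using Id = unsigned;

    // Appends a quad and returns a stable id for it.
    Id add(const Vertex (&quad)[4]) {
        const Id id = nextId_++;
        slotOf_[id] = ids_.size();
        ids_.push_back(id);
        verts_.insert(verts_.end(), quad, quad + 4);
        return id;
    }

    // Removes a sprite by moving the last quad into its slot.
    void remove(Id id) {
        const auto it = slotOf_.find(id);
        assert(it != slotOf_.end());
        const std::size_t slot = it->second;
        const std::size_t last = ids_.size() - 1;
        if (slot != last) {
            for (std::size_t k = 0; k < 4; ++k) {
                verts_[slot * 4 + k] = verts_[last * 4 + k];
            }
            ids_[slot] = ids_[last];
            slotOf_[ids_[slot]] = slot;  // the moved sprite has a new slot
        }
        ids_.pop_back();
        verts_.resize(verts_.size() - 4);
        slotOf_.erase(id);
    }

    // Byte offset of a sprite's four vertices inside the vertex buffer.
    std::size_t byteOffset(Id id) const {
        return slotOf_.at(id) * 4 * sizeof(Vertex);
    }

    std::size_t count() const { return ids_.size(); }
    const std::vector<Vertex>& vertices() const { return verts_; }

private:
    Id nextId_ = 1;
    std::vector<Id> ids_;                         // slot -> sprite id
    std::unordered_map<Id, std::size_t> slotOf_;  // sprite id -> slot
    std::vector<Vertex> verts_;                   // CPU copy of the VBO contents
};
```

Note that removal changes the order of sprites in the buffer, so this only fits when draw order is handled some other way (for example by the depth test) rather than by array position.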

So, I want to make an efficient Sprite Batcher because I am nearly fill rate capped with a few Sprites.

I don’t see how batching will help if you’re bottlenecked on fillrate.

Maybe I have chosen the wrong word, but this is a performance graph:
http://img11.imageshack.us/i/glperf.jpg/

That’s OpenGL calls per frame while rendering about 20 sprites, and that’s simply too much.
But are there no articles about the topic sprite batching?
I know how to draw more than one sprite with glDrawArrays; what I need is advice on how to properly implement and organize the C++ side of it.

but this is a performance graph:
http://img11.imageshack.us/i/glperf.jpg/

That doesn’t help, since most of the graph’s labels are cut off.

But are there no articles about the topic sprite batching?

Probably because they don’t call it that. If you want to know how to structure the code, search for particle systems.

Basically you need to look to using glDrawElements and draw multiple sprites in a single call. If you’re using strips you should be aware that they’re completely inappropriate for this use case; use GL_TRIANGLES and an index buffer instead. Stop sending position as a uniform and switch to doing transforms on the CPU. Your biggest bottleneck (aside from fillrate which is always going to be a bottleneck with sprites/particles) is the number of separate draw calls you’re doing; one for each sprite by the sound of things. This is not efficient use of the GPU.
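To make the GL_TRIANGLES plus index buffer suggestion concrete, here is a small sketch that generates the indices for a number of quads. It assumes each sprite's four vertices are stored consecutively and in order around the quad; `buildQuadIndices` is a hypothetical helper, and 16-bit indices are an arbitrary choice for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds indices for `quadCount` quads drawn as GL_TRIANGLES:
// two triangles (6 indices) for every 4 vertices.
std::vector<std::uint16_t> buildQuadIndices(std::size_t quadCount) {
    // 16-bit indices can address at most 65536 vertices, i.e. 16384 quads.
    assert(quadCount <= 16384);

    static const std::uint16_t pattern[6] = {0, 1, 2, 0, 2, 3};

    std::vector<std::uint16_t> indices;
    indices.reserve(quadCount * 6);
    for (std::size_t q = 0; q < quadCount; ++q) {
        const std::size_t base = q * 4;  // first vertex of this quad
        for (std::uint16_t p : pattern) {
            indices.push_back(static_cast<std::uint16_t>(base + p));
        }
    }
    return indices;
}
```

The result would be uploaded once into an element array buffer and reused for every glDrawElements call, since the pattern never depends on the sprite data.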

So keep a static index buffer, as it’ll never change (0,1,2, 0,2,3, 4,5,6, 4,6,7, etc.), and a dynamic vertex buffer for vertex data. Append each sprite to the dynamic buffer, draw them when the texture changes, and orphan the buffer when the current sprite would cause it to overflow.
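The append / draw-on-texture-change / overflow logic could be organized roughly like this. It is only a CPU-side sketch: `SpriteBatcher` and its callback are hypothetical, and the actual GL work (binding the texture, orphaning the vertex buffer with glBufferData and a null pointer, uploading, calling glDrawElements) is assumed to happen inside the callback.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

struct Vertex { float x, y, z, r, g, b, a, s, t; };  // XYZ RGBA ST

class SpriteBatcher {
public:
    // Called once per batch with the texture id and the vertices to draw.
    // A real callback would upload `verts` and issue one glDrawElements call.
    using FlushFn = std::function<void(unsigned texture, const std::vector<Vertex>& verts)>;

    SpriteBatcher(std::size_t maxSprites, FlushFn onFlush)
        : maxVerts_(maxSprites * 4), onFlush_(std::move(onFlush)) {
        verts_.reserve(maxVerts_);
    }

    // Appends one sprite; flushes first if the texture changes or the
    // sprite would not fit into the buffer any more.
    void add(unsigned texture, const Vertex (&quad)[4]) {
        const bool textureChanged = texture != texture_;
        const bool wouldOverflow = verts_.size() + 4 > maxVerts_;
        if (!verts_.empty() && (textureChanged || wouldOverflow)) {
            flush();
        }
        texture_ = texture;
        verts_.insert(verts_.end(), quad, quad + 4);
    }

    // Draws whatever is pending; call this at the end of the frame too.
    void flush() {
        if (verts_.empty()) return;
        onFlush_(texture_, verts_);
        verts_.clear();
    }

private:
    std::size_t maxVerts_;
    FlushFn onFlush_;
    unsigned texture_ = 0;
    std::vector<Vertex> verts_;
};
```

With a single sprite sheet the texture never changes, so in that case the whole scene ends up as one batch per buffer-full of sprites.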

That sounds complex so do what was recommended and search for info on particle systems (or billboarding). There will be lots of sample code available. Sorting is easy - just store the distance to the camera in each sprite struct and qsort them (tip: you don’t need the sqrt in the distance calc). I wouldn’t bother though unless you’ve definitively identified visual glitches from not doing so. Try switching off depth writing first and see if the result is acceptable.
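A sketch of the distance sort without the sqrt: comparing squared distances gives the same ordering as comparing distances, because both are non-negative. It uses std::sort rather than qsort since the thread is about C++; `Vec3`, `SortedSprite` and `sortBackToFront` are hypothetical names, and back-to-front order is assumed because that is what blending usually wants.

```cpp
#include <algorithm>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical sprite record that caches its squared camera distance.
struct SortedSprite {
    Vec3 position;
    float distSq;
};

// Squared distance between two points; no sqrt needed for ordering.
float distanceSquared(const Vec3& a, const Vec3& b) {
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Sorts sprites so that the farthest one comes first.
void sortBackToFront(std::vector<SortedSprite>& sprites, const Vec3& camera) {
    for (SortedSprite& s : sprites) {
        s.distSq = distanceSquared(s.position, camera);
    }
    std::sort(sprites.begin(), sprites.end(),
              [](const SortedSprite& a, const SortedSprite& b) {
                  return a.distSq > b.distSq;
              });
}
```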

Haha, if I read the truncated label on the left correctly, you have 60 frames per second. As this is exactly the typical refresh rate of a display, it means you have vertical sync on. Nice for the end result, but totally useless for benchmarking. I.e., you could render 10x the number of sprites and it might not even affect the framerate…

At least read this if you are serious about measuring performance : http://www.opengl.org/wiki/Performance

Nah, I am not bitching about the FPS. I know about VSYNC. I am a noob, but not such a big noob :D.
I am moaning about the 333 OpenGL calls per frame for roughly 20-30 sprites.
Sorry for the bad screenshot.
And well, I am more concerned about how to organize it in C++.
Those lists and maps of vertices… I don’t like it. But I got an idea last night, I will see what I can do with it.
Thanks so far for all links and suggestions.

You know about vsync, yet you forgot to disable it…
While it is on, SwapBuffers will block to hold the frame rate at 60 fps.