I have a scene with a few low-poly models (foliage), each instanced many times across the scene. Currently I draw everything with a single glDrawElementsInstanced call, with no culling or LOD. I want to improve performance in the most typical way possible, using culling and LOD.
I read and understood the RasterGrid article "Instance culling using geometry shaders", which I’ve seen referenced here many times. That solution sounds great for my case: I get to keep my precomputed instance positions and perform culling to prevent most of them from actually being drawn.
The article seems to imply that it is a more advanced way to do this task. I am more of a beginner, so rather than skipping straight to it, I would like to first consider the simpler precursor approaches to the same task, perhaps on the CPU.
My questions:
- Am I correct in thinking that this article presents a GPU workflow that is roughly equivalent to some CPU solution?
- If I cull on the CPU and generate an array of all the instances (positions) I want to draw, I would have to bind some newly generated buffer each frame (maybe a Uniform Buffer Object?). I would also need a draw call that handles a variable number of positions. What I have now is static, and I imagine that unchanging data gives some performance benefit. Which draw call would you use for a dynamic list of positions, and how would you get that data to the GPU?
- A very simple solution, using only techniques I have already used, would be to have a separate VAO for each ‘chunk’ of foliage, cull per chunk, and call glDrawElementsInstanced for each static chunk that was not culled. This doesn’t sound entirely bad to me, but it wouldn’t teach me how to cull the individual instances within a chunk. What I hope to do is cull by chunk first and do some very simple LOD draw, then implement a solution to my question #2 for the remaining instances.
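To make question #2 concrete, here is a minimal sketch of the CPU side as I picture it, not a definitive implementation. It assumes each instance can be bounded by a sphere of a single shared radius, and that the frustum is given as six planes whose normals point into the visible volume. The names Vec3, Plane, cullInstances and makeBoxFrustum are all made up for illustration, and makeBoxFrustum builds an axis-aligned box only as a stand-in for planes extracted from a real view-projection matrix. The OpenGL calls appear only in comments, and instanceVbo and indexCount there are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Plane in the form dot(n, p) + d = 0, with n pointing into the visible volume.
struct Plane { Vec3 n; float d; };

inline float signedDistance(const Plane& pl, const Vec3& p) {
    return pl.n.x * p.x + pl.n.y * p.y + pl.n.z * p.z + pl.d;
}

// Conservative sphere-vs-frustum test: reject only when the sphere lies
// entirely on the outside of at least one plane.
inline bool sphereVisible(const std::array<Plane, 6>& frustum,
                          const Vec3& center, float radius) {
    for (const Plane& pl : frustum) {
        if (signedDistance(pl, center) < -radius) return false;
    }
    return true;
}

// Compacts the positions that pass the test into `visible` (cleared first)
// and returns how many survived. Reusing `visible` across frames avoids
// reallocating every frame.
std::size_t cullInstances(const std::array<Plane, 6>& frustum,
                          const std::vector<Vec3>& positions,
                          float radius,
                          std::vector<Vec3>& visible) {
    assert(radius >= 0.0f);
    visible.clear();
    for (const Vec3& p : positions) {
        if (sphereVisible(frustum, p, radius)) visible.push_back(p);
    }
    return visible.size();
}

// Hypothetical helper: an axis-aligned box used as a stand-in frustum.
std::array<Plane, 6> makeBoxFrustum(const Vec3& lo, const Vec3& hi) {
    return {{
        Plane{Vec3{ 1.0f, 0.0f, 0.0f}, -lo.x}, Plane{Vec3{-1.0f, 0.0f, 0.0f}, hi.x},
        Plane{Vec3{0.0f,  1.0f, 0.0f}, -lo.y}, Plane{Vec3{0.0f, -1.0f, 0.0f}, hi.y},
        Plane{Vec3{0.0f, 0.0f,  1.0f}, -lo.z}, Plane{Vec3{0.0f, 0.0f, -1.0f}, hi.z},
    }};
}

// Per frame, with a GL context bound (shown as comments, not executed here):
//   std::size_t count = cullInstances(frustum, allPositions, radius, visible);
//   glBindBuffer(GL_ARRAY_BUFFER, instanceVbo);   // hypothetical buffer name
//   glBufferSubData(GL_ARRAY_BUFFER, 0, count * sizeof(Vec3), visible.data());
//   glDrawElementsInstanced(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT,
//                           nullptr, static_cast<GLsizei>(count));
```

In this sketch the positions would live in an ordinary vertex buffer that is allocated once at the maximum instance count and read as a per-instance attribute (glVertexAttribDivisor with a divisor of 1), rather than in a Uniform Buffer Object. The draw call stays glDrawElementsInstanced, and only its instance count changes from frame to frame. The same sphereVisible test could presumably be run once per chunk, with a larger radius, before looking at individual instances.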
Thank you for reading.