Equivalent of instancing?

I have searched the forum for this matter, but somehow didn’t find anything substantial.

So I was wondering, what’s the equivalent of D3D instancing for OpenGL? I realize that the problem it solves is not as severe in GL as it is in D3D, but still, is there a robust method that performs something similar? Right now I draw lots of objects the “naive” way: I bind the VBOs and repeatedly call DrawElements, loading a different matrix each time. Is there a well-supported alternative?

OpenGL instancing :slight_smile:

You actually have several options for it in OpenGL. You can use the functionality provided by either ARB_draw_instanced or ARB_instanced_arrays. The former made its way into core OpenGL in version 3.1, the latter in version 3.3. If your GPU does not support OpenGL 3.x, you can still use the extensions.

Actually you just have to use DrawElementsInstanced :wink:

Just to add to that: the instance data can be sourced from vertex arrays, uniform arrays, uniform buffer objects or texture buffer objects.
The choice depends on how many instances you have and on the size of the per-instance data.

This tutorial helped me a lot: http://sol.gfxile.net/instancing.html

Using (the last technique in the tutorial) glVertexAttribPointer + glVertexAttribDivisor + mat4’s in shader + my own matrix stack implementation turned out great and really efficient for high instance counts.
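For readers who have not set this up before, here is a rough sketch of the attribute layout behind that approach. A mat4 vertex attribute occupies four consecutive attribute locations, one vec4 column each, so the per-instance matrix is described as four attributes that all advance once per instance. The `ColumnAttrib` struct and the `mat4InstanceLayout` helper are hypothetical names made up for this illustration, and the sketch assumes tightly packed 4-byte floats; it only computes the numbers and makes no GL calls itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical description of one vec4 column of a per-instance mat4,
// holding the values one would pass to the GL attribute setup calls.
struct ColumnAttrib {
    unsigned location;   // attribute location of this column
    int components;      // 4, because each column is a vec4
    std::size_t stride;  // bytes between the matrices of consecutive instances
    std::size_t offset;  // byte offset of this column inside one matrix
    unsigned divisor;    // 1 = advance once per instance, not per vertex
};

// A mat4 attribute takes four consecutive locations starting at baseLocation.
// Assumes the instance buffer stores tightly packed float matrices.
std::array<ColumnAttrib, 4> mat4InstanceLayout(unsigned baseLocation) {
    const std::size_t columnBytes = 4 * sizeof(float);
    const std::size_t matrixBytes = 4 * columnBytes;
    std::array<ColumnAttrib, 4> layout{};
    for (unsigned c = 0; c < 4; ++c) {
        layout[c] = {baseLocation + c, 4, matrixBytes, c * columnBytes, 1u};
    }
    return layout;
}
```

With the instance VBO bound, each entry would then be fed to glEnableVertexAttribArray(location), glVertexAttribPointer(location, 4, GL_FLOAT, GL_FALSE, stride, offset) and glVertexAttribDivisor(location, 1), followed by a single glDrawElementsInstanced call for all instances.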

I’d like to ask this question about this topic:

Say you had this, for the case of basic shaders:

uniform mat4 model_matrices[max_num_instances];

which you would modify in a single glUniform* call.

instead of a single:

uniform mat4 model_matrix;

that you change for every glDrawElements() call. Would this help any with perf? You’d still need to send over an instance_id via a uniform or attribute. Say, if a large batch were sent with instance_id as a vertex attribute…

that you change for every glDrawElements() call. Would this help any with perf?

No. You’re not actually doing instancing if you’re changing a uniform before every draw call. Instancing requires the use of one of the glDraw*Instanced calls. You can either rely on gl_InstanceID or use an attribute divisor.

It might be slightly faster, but only if the upload of data is a major performance issue (and if there is actually a performance difference between uploading an integer and uploading a matrix).

No, there is no need for any vertex attribute, as you have the gl_InstanceID variable available in the shader, which you can use to index into the array of matrices. You can also use the vertex attribute divisor based method, where you feed the matrix as four separate vertex attributes (though the former is simpler in your case).
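One practical detail with the uniform-array variant: the array has a fixed size (max_num_instances in the question), so a scene with more instances than that has to be drawn in several batches. Below is a minimal sketch of the splitting step, assuming the matrices are stored contiguously on the CPU side. `Batch` and `splitIntoBatches` are hypothetical names for this illustration and are not part of any GL API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical range of instances drawn by one instanced draw call.
struct Batch {
    std::size_t first;  // index of the first instance matrix in the batch
    std::size_t count;  // number of instances in the batch
};

// Split totalInstances into batches of at most maxPerBatch instances,
// where maxPerBatch is the size of the uniform array in the shader.
std::vector<Batch> splitIntoBatches(std::size_t totalInstances,
                                    std::size_t maxPerBatch) {
    assert(maxPerBatch > 0);
    std::vector<Batch> batches;
    for (std::size_t first = 0; first < totalInstances; first += maxPerBatch) {
        batches.push_back({first, std::min(maxPerBatch, totalInstances - first)});
    }
    return batches;
}
```

For each batch you would upload `count` matrices starting at `first` with one glUniformMatrix4fv call and then issue one glDrawElementsInstanced call with `count` as the instance count. In the vertex shader, gl_InstanceID then runs from 0 to count - 1 and can index model_matrices directly.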

BTW: here is what I have in mind with this technique; I don’t know if the idea is sound.

Basically copies of the same vertex attributes, but with different instance_ids (as a vertex attribute), loaded into a VBO, of course. Like this:

VAs (instance_id = 0)
VAs (instance_id = 1)
VAs (instance_id = 2)

Is the idea sound (for the case where no instancing is available, only basic shaders)? Even two consecutive VA copies would halve the number of glDrawElements() calls, but there would probably be 4-5 consecutive copies, to reduce the number of draw calls by the corresponding factor.
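To make the proposal concrete, here is a rough CPU-side sketch of building such a replicated buffer. It is only an illustration under stated assumptions: the `Vertex` layout (a position plus a float instance id), the `Mesh` struct and the `replicate` function are all hypothetical, and the instance id is stored as a float on the assumption that the target shaders only accept float attributes. Note that the indices of each copy must be offset by the number of vertices that precede it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vertex layout: position plus the id of the copy it belongs to.
struct Vertex {
    float x, y, z;
    float instanceId;
};

struct Mesh {
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> indices;
};

// Build one mesh holding `copies` consecutive copies of `source`. Each copy
// carries its own instanceId, and its indices are shifted so that they refer
// to that copy's vertices.
Mesh replicate(const Mesh& source, std::size_t copies) {
    Mesh out;
    out.vertices.reserve(source.vertices.size() * copies);
    out.indices.reserve(source.indices.size() * copies);
    for (std::size_t c = 0; c < copies; ++c) {
        const auto base = static_cast<std::uint32_t>(c * source.vertices.size());
        for (Vertex v : source.vertices) {
            v.instanceId = static_cast<float>(c);
            out.vertices.push_back(v);
        }
        for (std::uint32_t i : source.indices) {
            out.indices.push_back(base + i);
        }
    }
    return out;
}
```

The replicated mesh would be uploaded once; a single glDrawElements call then draws all copies, with the shader using the instance id attribute to pick its matrix from a uniform array. The cost is that vertex and index memory grow by the number of copies.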

If you don’t have native instancing, you’re probably using pre-DX10 hardware, and at least for that generation of hardware there can be a 2x to 3x speedup using this technique, according to NVIDIA. Google for “NVidia pseudo instancing” to find some papers on the topic.

I am not the OP. I have:

OpenGL vendor string: ATI Technologies Inc.
OpenGL renderer string: AMD Radeon HD 6670
OpenGL version string: 4.1.11079 Compatibility Profile Context
OpenGL shading language version string: 4.10

Nothing to brag about, but it has instancing and I’d buy a Radeon 5450 or 6450, if I didn’t have instancing. Furthermore, the pseudo-instancing paper does not discuss the technique I proposed.