Vertex array object

hi

How can I use the ARB_vertex_array_object extension? (Please help with sample code.)

Read this: ARB_vertex_array_object

It’s just like anything else in GL: gen, bind, set up. Then when you want to reuse it, just bind it again.


if ( vao )
  glBindVertexArray( vao );
else
{
  glGenVertexArrays( 1, &vao );
  glBindVertexArray( vao );
  /* set up vertex array pointers and enables; these are recorded in the
     VAO -- e.g., assuming an existing VBO 'vbo' with packed vec3 positions: */
  glBindBuffer( GL_ARRAY_BUFFER, vbo );
  glEnableVertexAttribArray( 0 );
  glVertexAttribPointer( 0, 3, GL_FLOAT, GL_FALSE, 0, 0 );
}

Has anyone tried this extension yet? Does it give noticeable speed improvements? And does anyone know whether nVidia’s Linux drivers already expose it?

Tried it. No difference here. I’m using it on NVidia Linux (so yes, it’s exposed). 8800GTX + 3.73GHz hyperthreaded dual core.

However, IIRC my test case at the time was primarily lots of client arrays with some VBOs and display lists mixed in, and this was on a fairly fast CPU. You may see some benefit on other configs/loads. I’d be interested to hear more from NVidia/others about that.

I vaguely recall reading somewhere that the main goal of VAOs was to optimize away the expensive overhead of binding VBOs to vertex attribs. So maybe you’d see better results with 100% VBOs.

This extension is so easy to use, I just implemented it myself.

Well, a few stumbling blocks:
When you use VAOs, you STILL need to BIND the GL_ELEMENT_ARRAY_BUFFER and GL_ARRAY_BUFFER_ARB AFTER you have bound your VAO; otherwise it crashes. It seems these states are not stored inside the VAO. However, you no longer need to enable arrays and set up pointers. They might want to mention this somewhere (in the spec?); it took me some digging.
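
A minimal sketch of the draw path that worked for me under this constraint (‘vao’, ‘ibo’ and ‘indexCount’ are illustrative names, assumed to be set up elsewhere):

/* Observed workaround: bind the VAO first, then re-bind the element
   buffer before drawing. */
glBindVertexArray( vao );
glBindBuffer( GL_ELEMENT_ARRAY_BUFFER, ibo );
glDrawElements( GL_TRIANGLES, indexCount, GL_UNSIGNED_SHORT, 0 );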

Another problem is that, because of the awkward way vertex-attrib arrays are bound to shaders, one needs a VAO for EACH combination of vertex array and bound shader. In my case that makes about 100 VAOs. Otherwise, you guessed it, it crashes.
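
For what it’s worth, a minimal sketch of how such a per-combination cache could look (‘MAX_MESHES’, ‘MAX_PROGRAMS’ and ‘setupAttribPointersFor’ are hypothetical names, not from the original post):

/* Hypothetical cache: one VAO per (mesh, program) combination, since
   attrib locations can differ between programs. */
GLuint vaoCache[MAX_MESHES][MAX_PROGRAMS]; /* zero-initialized */

GLuint getVAO( int mesh, int prog )
{
  if ( vaoCache[mesh][prog] == 0 )
  {
    glGenVertexArrays( 1, &vaoCache[mesh][prog] );
    glBindVertexArray( vaoCache[mesh][prog] );
    setupAttribPointersFor( mesh, prog ); /* app-specific attrib setup */
  }
  else
    glBindVertexArray( vaoCache[mesh][prog] );
  return vaoCache[mesh][prog];
}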

Now, for the performance part: I use VBOs only, no display lists, no immediate mode, no client-side arrays. I took the biggest mesh that I could find, reduced the resolution to a small window, and switched VAOs on and off on the fly to test the difference.

It is: zero

There is really NO difference whatsoever, neither good nor bad. I should mention that this is on a GeForce 9600, Core 2 Quad 2.4 GHz, Vista 32-bit, latest drivers (installed today).

So I guess either this part of the pipeline is never the bottleneck for me, or the drivers are not yet optimized to take advantage of the extension. But maybe one of my cores is under less stress with VAOs; I don’t know.

Anyway, it is very easy to add (took me 45 minutes including debugging and performance testing), so I quite like it.

Jan.

When you use VAOs, you STILL need to BIND the GL_ELEMENT_ARRAY_BUFFER and GL_ARRAY_BUFFER_ARB AFTER you have bound your VAO; otherwise it crashes.

You only need to bind GL_ELEMENT_ARRAY_BUFFER; GL_ARRAY_BUFFER_ARB isn’t really a “binding point” in the same sense. Even without a VAO, you can call glBindBuffer(GL_ARRAY_BUFFER, 0) just before your glDraw* call.

That’s actually what I do to be sure that the glVertexAttribPointer calls always refer to the right buffers.
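
In code, that pattern might look like this (‘positionBufferName’ and the attrib index are assumptions for illustration):

/* The ARRAY_BUFFER binding is only consumed by glVertexAttribPointer,
   so unbinding right after keeps later code from accidentally
   repointing an attrib at the wrong buffer. */
glBindBuffer( GL_ARRAY_BUFFER, positionBufferName );
glVertexAttribPointer( 0, 3, GL_FLOAT, GL_FALSE, 0, 0 );
glBindBuffer( GL_ARRAY_BUFFER, 0 );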

To test this feature, I think it’s better to use small meshes and change the VAO as often as possible. Basically, just load 2 different meshes and render them 10000 times, but change the VAO used each time. It will be something like this (see the code sketch after the list):

  • Render Mesh 1
  • Render Mesh 2
  • Render Mesh 1
  • Render Mesh 2
  • Render Mesh 1
  • Render Mesh 2
  • etc…
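
A minimal sketch of that benchmark loop, assuming two pre-built VAOs (the ‘vao’ and ‘elementCount’ arrays are illustrative names):

/* Alternate between two VAOs on every draw to stress VAO switching. */
for ( int i = 0; i < 10000; ++i )
{
  int m = i & 1; /* alternate mesh 0 / mesh 1 */
  glBindVertexArray( vao[m] );
  glDrawElements( GL_TRIANGLES, elementCount[m], GL_UNSIGNED_SHORT, 0 );
}
glBindVertexArray( 0 );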

I don’t know about the current efficiency, but in time … it will become more efficient.

Yeah, I know your idea would be better if one wanted to really test it properly, but in my application a huge mesh means switching the VAO more often, because each VBO contains at most 2^16 vertices; so the bigger the mesh, the more VBOs I have, and thus more changes. It is always difficult to benchmark a feature with an existing engine/application, because your bottleneck is often somewhere else (in my case, fillrate and pixel-shader computations are definitely the limiting factors, rather than polygon throughput or CPU usage).

Anyway, from a theoretical standpoint I find this extension more than overdue (just like direct state access), so I am happy it is finally available.

Jan.

Interesting!

Do you think these limited-size buffers are still useful, considering VAOs and mapped buffer ranges?

Well, I did a very small test. I can switch buffer splitting on and off (at buffer creation time), so I switched it off and tried again, comparing speed with and without VAOs. In a very few cases VAOs gave a speed improvement (from 46 to 49 FPS), but in general there was no difference.

Using unsigned short indices, and thus only up to 2^16 vertices per buffer, does give you quite a speed improvement on all hardware. Especially on ATI, which does not benefit from glDrawRangeElements at all. I tested this a while back with big meshes, and especially with VRAM sizes smaller than your working set, this improves performance in every case.
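
As an illustration, rendering a mesh split into chunks of at most 2^16 vertices might look like this (‘chunks’, ‘chunkCount’ and the per-chunk fields are assumed names, not from the original post):

/* Each chunk has its own VAO/IBO and at most 2^16 vertices, so
   unsigned short indices suffice. */
for ( int c = 0; c < chunkCount; ++c )
{
  glBindVertexArray( chunks[c].vao );
  glBindBuffer( GL_ELEMENT_ARRAY_BUFFER, chunks[c].ibo );
  glDrawElements( GL_TRIANGLES, chunks[c].indexCount, GL_UNSIGNED_SHORT, 0 );
}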

The thing about VAOs is, even if they would allow the driver to optimize for the fact that you only use a subset of your data, this does not automatically imply that such optimizations are in place. With a limited VBO size the driver can be really stupid and it still gives you better speed. Also, with an offset into a buffer you only limit the lower end, not the upper: there is no way to tell the driver that this VAO is only used to operate on n vertices from the given buffer, so it must assume that draw calls will index all data starting after the offset. Only with glDrawRangeElements can you make this clearer to the driver, but the next draw call might access further data, so it might be more efficient to simply load the whole buffer right away.

IIRC “mapped buffer range” only applies to writing to buffers; I am only talking about pure rendering. For updating buffers, mapped buffer ranges are certainly a very sensible improvement.

Well, to make a long story short: yes, it seems that smaller buffers are, for now, still worth the effort.

Jan.

In my case, I was not rebinding ARRAY_BUFFER for the VBO cases. That binding was apparently captured in the VAO, which makes sense: you could have 1 VBO for each vertex attrib, and in that case VAOs would make no sense unless the VBO bindings were cached along with the offsets.
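
A sketch of that per-attrib capture, assuming two separate VBOs (‘positionVBO’ and ‘normalVBO’ are illustrative names) already filled with data:

/* The buffer bound to GL_ARRAY_BUFFER at glVertexAttribPointer time is
   recorded per attrib in the VAO, so each attrib can source from its
   own VBO and nothing needs rebinding at draw time. */
glBindVertexArray( vao );

glBindBuffer( GL_ARRAY_BUFFER, positionVBO );
glEnableVertexAttribArray( 0 );
glVertexAttribPointer( 0, 3, GL_FLOAT, GL_FALSE, 0, 0 );

glBindBuffer( GL_ARRAY_BUFFER, normalVBO );
glEnableVertexAttribArray( 1 );
glVertexAttribPointer( 1, 3, GL_FLOAT, GL_FALSE, 0, 0 );

glBindBuffer( GL_ARRAY_BUFFER, 0 ); /* the binding itself is not VAO state */
glBindVertexArray( 0 );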

But like you I was of course binding ELEMENT_ARRAY_BUFFER as usual, as that’s part of the draw call and not the vertex array bindings.

You are absolutely right, just tested it myself. Thanks for the info.

Jan.

ELEMENT_ARRAY_BUFFER_BINDING should also be stored in the VAO, as it is part of table 6.8, so after the initial setup you just need to bind the VAO and draw.

To be clear, the ELEMENT_ARRAY_BUFFER binding is part of the VAO and the ARRAY_BUFFER binding is not part of the VAO.

Strange, because it works with the one but not the other. Maybe a bug in nVidia’s implementation? I will need to check that again…

Jan.

From my tests, I would say that ELEMENT_ARRAY_BUFFER is not part of the VAO, because I need to bind it separately. Another weird behaviour is that the binding order matters.

This works:


	glBindVertexArray(vertexArrayName);
	glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, elementBufferName);

		glDrawElements(
			GL_TRIANGLES, 
			mesh.elementCount(), 
			GL_UNSIGNED_SHORT, 
			0);

	glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
	glBindVertexArray(0);

This doesn’t work; it crashes:


	glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, elementBufferName);
	glBindVertexArray(vertexArrayName);

		glDrawElements(
			GL_TRIANGLES, 
			mesh.elementCount(), 
			GL_UNSIGNED_SHORT, 
			0);

	glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
	glBindVertexArray(0);

And including glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, elementBufferName) during the VAO creation doesn’t work either.

And the “ARRAY_BUFFER binding” definitely seems to be part of the VAO … even though I would expect it not to be, just so one could change array buffers after VAO creation, but we can’t. VAOs are going to win the “highest number of a single kind of object” title.

Looks like the VAO extension and core 3.0 are not aligned, or there’s a mistake in one of them. The extension excludes only CLIENT_ACTIVE_TEXTURE from the stored state, but 3.0 leaves out all of table 6.10, meaning both CLIENT_ACTIVE_TEXTURE and ARRAY_BUFFER_BINDING.
With the 181.22 driver, Nvidia seems to have implemented the extension following the 3.0 spec.

When you do glBindVertexArray(vertexArrayName), that swaps in the whole VAO state, so rather than using elementBufferName you’ll use whatever ELEMENT_ARRAY_BUFFER was last bound while vertexArrayName was bound (assuming you haven’t since implicitly detached it by deleting it or something like that).
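
Under that reading, the element buffer has to be bound while the VAO is bound for the VAO to record it; a sketch of the setup order (names taken from the earlier snippets):

/* Bind the VAO first, then the element buffer, so the binding is
   recorded as part of the VAO's state. */
glGenVertexArrays( 1, &vertexArrayName );
glBindVertexArray( vertexArrayName );
glBindBuffer( GL_ELEMENT_ARRAY_BUFFER, elementBufferName );
/* ... attrib setup ... */
glBindVertexArray( 0 ); /* unbinding the VAO swaps the element binding out too */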

ARRAY_BUFFER is a selector just like CLIENT_ACTIVE_TEXTURE. You should go with what the 3.0 spec says in this case.

Unfortunately, as Jan and I said earlier, it doesn’t seem that GL_ELEMENT_ARRAY_BUFFER is attached to the VAO, at least in nVidia’s implementation. If we don’t bind it at draw time, it crashes.