VBO performance question.

I am working on an application that draws 2D CAD drawings. Most geometry is of the GL_LINE variety, though there are a few GL_TRIANGLE types in there as well.

My first pass at it used glBegin(), glVertex2f() …, glEnd(), which meant that the geometry was being pushed up to the video card on every draw.

Now, on my second pass at the code, I am implementing vertex buffer objects.

Everything is drawing just fine.
However, I am not seeing any speed improvements.
In fact, some drawings draw even slower when using VBOs.

As an example of my data, here is a breakdown of one file.

Its geometry consists of a total of 28,937 points.
It uses 2020 separate glDrawArrays() calls.
It has 3 vertex buffer objects. (I make 1 VBO for every 10,000 points.)

So, in a rendering pass:

  • glBindBufferARB() is being called 3 times.
  • glVertexPointer() and glDrawArrays() are being called 2020 times. (with different offsets into their respective vbos)

Yet this is somehow equal to, or slower than, calling glVertex2f() 28,937 times.

Any ideas on what could be wrong?
What type of things could slow it down?

I am using the static specifier when I feed it my points:

glBufferDataARB(GL_ARRAY_BUFFER_ARB, datasize, data, GL_STATIC_DRAW_ARB);

That is supposed to move the data up to the card, right?
Could something else interfere with that and cause it to keep the points off the card?

Here is the code, for reference.

This is the init logic for each VBO:

glGenBuffersARB(1, &vboid);
glBindBufferARB(GL_ARRAY_BUFFER_ARB, vboid);
glBufferDataARB(GL_ARRAY_BUFFER_ARB, datasize, data, GL_STATIC_DRAW_ARB);
glBindBufferARB(GL_ARRAY_BUFFER_ARB, 0);

This is the drawing code for each group:

if( this vbo isn't already the current vbo )
{
    glBindBufferARB(GL_ARRAY_BUFFER_ARB, vboid);
}

if( vertex array isn't already enabled )
{
    glEnableClientState(GL_VERTEX_ARRAY);
}
            
glVertexPointer(2, GL_FLOAT, VERTS_STRIDE, (const GLvoid *)offset);
glDrawArrays(mode, 0, numpoints);

Where mode, offset and numpoints are unique for each group.

“It uses 2020 separate glDrawArrays() calls”

This is your problem. Why are you not just doing 3 glDrawArrays() calls, one for each buffer?

If your triangles are a problem, put them in another buffer.

Each group (object?) is in its own glDrawArrays() call because each has its own mode (GL_LINE_STRIP, GL_LINE_LOOP, GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN, etc.) and there may have been a state change in between groups. Functions like glColor3f() may have been called, for example.

I have trouble believing that calling glVertexPointer() and glDrawArrays() 2020 times is slower than calling glVertex2f() 28,937 times.

This makes me suspect that the points are not being moved up to the video card’s memory when the VBO is defined. Instead, they are being moved on every glDrawArrays() call.

Is there any way to test to know where the data is?

I know it is up to the driver to decide where the data will live, and that my GL_STATIC_DRAW_ARB parameter is just a suggestion to the driver that I want it stored on the card. But maybe some other factor is causing the driver to keep the data local?

“Is there any way to test to know where the data is?”

As far as I know, no.

The glDrawArrays() path will be quicker, at least our benchmarks show this, but if you are doing state changes as well, all of this will kill performance.
Performance only comes at the cost of ease of coding. You have to be a lot smarter with your vertex buffers to cut down on state changes and OpenGL calls. I draw hundreds of thousands of polylines in negligible time, but I pay for it in the amount of memory I use.
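As one small illustration of cutting calls in the drawing loop from the question: instead of calling glVertexPointer() with a new byte offset for every group, the pointer could be set once per VBO and the `first` argument of glDrawArrays() used to pick the group. The sketch below only shows the offset arithmetic. The `Group` struct and `firstVertex` helper are hypothetical names, and it assumes every group's byte offset is a multiple of the real vertex stride (8 bytes for tightly packed 2-float vertices). Whether this actually helps will depend on the driver.

```cpp
#include <cstddef>

// Hypothetical record for one draw group, mirroring the
// (mode, offset, numpoints) triple used in the question.
struct Group {
    unsigned    mode;        // e.g. GL_LINE_STRIP
    std::size_t byteOffset;  // byte offset of the group's first vertex in its VBO
    int         numPoints;   // number of vertices in the group
};

// Convert a byte offset into the `first` argument of glDrawArrays().
// strideBytes must be the real distance between vertices in bytes
// (not 0, even if 0 is what gets passed to glVertexPointer).
inline int firstVertex(std::size_t byteOffset, std::size_t strideBytes) {
    return static_cast<int>(byteOffset / strideBytes);
}

// Intended use (needs a GL context, so shown as comments only):
//
//   once per VBO:
//     glBindBufferARB(GL_ARRAY_BUFFER_ARB, vboid);
//     glVertexPointer(2, GL_FLOAT, 0, (const GLvoid *)0);
//
//   once per group g:
//     glDrawArrays(g.mode, firstVertex(g.byteOffset, 2 * sizeof(float)), g.numPoints);
```

This keeps the 2020 glDrawArrays() calls but drops the 2020 glVertexPointer() calls to 3.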

I use large VBOs to minimize glDrawArrays() calls.
Each of my vertices has the colour included, and all 2-point vectors and triangles are drawn as line strips. I use an index buffer with a restart index set via glPrimitiveRestartIndex().
My triangle meshes are held in a VBO with an index buffer, so I don’t use GL_TRIANGLE_STRIP or GL_TRIANGLE_FAN.

Because I have noticed that much of our data consists of 2-point lines, I am in the process of creating a VBO just for these to save a little space.
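To make "each vertex has the colour included" concrete, here is a minimal sketch of one possible interleaved layout. The struct name, the choice of four unsigned bytes for colour, and the constant names are all assumptions for illustration, not necessarily the layout used in my code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical interleaved vertex: 2D position followed by an RGBA colour.
struct ColouredVertex {
    float        x, y;        // position, 8 bytes
    std::uint8_t r, g, b, a;  // colour, 4 bytes
};

// Stride and offsets that would be handed to the array pointer calls.
constexpr std::size_t kStride         = sizeof(ColouredVertex);
constexpr std::size_t kPositionOffset = offsetof(ColouredVertex, x);
constexpr std::size_t kColourOffset   = offsetof(ColouredVertex, r);

// With the VBO bound, the arrays would be set up once like this
// (needs a GL context, so shown as comments only):
//
//   glEnableClientState(GL_VERTEX_ARRAY);
//   glEnableClientState(GL_COLOR_ARRAY);
//   glVertexPointer(2, GL_FLOAT, kStride, (const GLvoid *)kPositionOffset);
//   glColorPointer(4, GL_UNSIGNED_BYTE, kStride, (const GLvoid *)kColourOffset);
```

With the colour stored per vertex, a colour change between two groups no longer needs a glColor3f() call, so it no longer forces the groups into separate draw calls. The price is the extra 4 bytes per vertex.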

Interesting!

“but if you are doing state changes as well - all this will kill performance”

So, you are saying that

glVertexPointer();
glDrawArrays();
glColor3f();
glVertexPointer();
glDrawArrays();

is much slower than the following?

glVertexPointer();
glDrawArrays();
glVertexPointer();
glDrawArrays();

“I use large VBOs to minimize glDrawArrays”

I am not sure what you are saying here.
Are you saying you don’t have to call glDrawArrays() as often?
If so, how do you handle different modes? (GL_LINE_STRIP, GL_LINE_LOOP, etc…)

As far as I can see, adding things like glColor3f() does affect performance, but remember these are accumulated effects, so it is rare that one change will suddenly make everything faster.

I will give you an example of some data I might want to draw:
a line v1,v2 - GL_LINES
a square v3,v4,v5,v6 - GL_LINE_LOOP
a polyline v7,v8,v9 - GL_LINE_STRIP

I put all of these in a VBO v1,v2 …v9

I create an index buffer with a restart index of 999

the indices in this buffer are
0,1,999 (the line) 2,3,4,5,2,999 (the square) 6,7,8,999 (the polyline)

Now I render with glPrimitiveRestartIndex(999) and glDrawElements(GL_LINE_STRIP, …).

Thus 1 call renders 3 objects, but at the cost of some extra memory on the GPU.
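The index list above can be generated mechanically. Below is a minimal sketch of one way to do it. The `Kind` enum, the `Primitive` struct and the `buildStripIndices` function are hypothetical names made up for this example, and only the three line modes from the example are handled.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the GL modes used in the example.
enum class Kind { Lines, LineLoop, LineStrip };

// One object in the shared VBO: its kind, first vertex index and vertex count.
struct Primitive {
    Kind          kind;
    std::uint32_t first;
    std::uint32_t count;
};

// Build one index list that draws every primitive as a GL_LINE_STRIP,
// with `restart` appended after each strip.
std::vector<std::uint32_t> buildStripIndices(const std::vector<Primitive>& prims,
                                             std::uint32_t restart) {
    std::vector<std::uint32_t> out;
    for (const Primitive& p : prims) {
        if (p.count == 0) continue;
        if (p.kind == Kind::Lines) {
            // Each pair of vertices becomes its own two-point strip.
            for (std::uint32_t i = 0; i + 1 < p.count; i += 2) {
                out.push_back(p.first + i);
                out.push_back(p.first + i + 1);
                out.push_back(restart);
            }
        } else {
            for (std::uint32_t i = 0; i < p.count; ++i) {
                out.push_back(p.first + i);
            }
            // A loop is closed by repeating its first vertex.
            if (p.kind == Kind::LineLoop) out.push_back(p.first);
            out.push_back(restart);
        }
    }
    return out;
}
```

Feeding it the line, the square and the polyline (vertex indices starting at 0, 2 and 6) with a restart value of 999 gives the same list as above. Two caveats: primitive restart has to be switched on with glEnable(GL_PRIMITIVE_RESTART) and needs OpenGL 3.1 or later, and the restart value must not collide with a real vertex index, so 999 only works for small buffers; the largest value of the index type is a safer choice.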

Hope that helps