Line drawing performance

Hi,

I’ve got to render a whole bunch of lines, or rather line strips. Number of vertices: 100,000 to 4,000,000.

I get around 3,500,000 lines per second on a Quadro FX500 and 10,000,000 lines per second on a 6800LE graphics board (lighting and depth-buffering disabled).

Is this the maximum performance that I can expect?
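
For reference, this is roughly how such figures are derived from the vertex count and the measured frame rate. It is only a sketch, assuming one GL_LINE_STRIP is drawn per frame; `linesPerSecond` is a made-up helper name and not part of the program below.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: line segments drawn per second when one line
// strip of vertexCount vertices is rendered at the given frame rate.
// A strip of N vertices contains N-1 segments.
double linesPerSecond(std::size_t vertexCount, double framesPerSecond)
{
    assert(framesPerSecond >= 0.0);
    if (vertexCount < 2)
        return 0.0;  // fewer than two vertices draw nothing
    return static_cast<double>(vertexCount - 1) * framesPerSecond;
}
```

For example, a strip of 1,000,001 vertices at 10 frames per second works out to 10 million lines per second.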

And a second question … I used raw GL calls, display lists and vertex buffer objects. They all render at the same speed. So what’s wrong with my code? :slight_smile:

TIA,

Moritz

  

void display(void)
{
    glClear(GL_COLOR_BUFFER_BIT);

    glColor4ub(128, 255, 128, 32);
    unsigned i;

    // several draw modes:
    // raw GL, line-strip as GL_LINE_STRIP or GL_LINES
    // displaylists,            "
    // vertex buffer objects,   "
    switch(dmode)
    {
    case 'r':
        glBegin(GL_LINE_STRIP);
        for (i = 0; i < LSIZE; i++)
        {
            glVertex3fv(lineData+3*i);
        }
        glEnd();
        break;

    case 'R':
        for (i = 1; i < LSIZE; i++)
        {
            glBegin(GL_LINES);
            glVertex3fv(lineData+3*(i-1));
            glVertex3fv(lineData+3*i);
            glEnd();
        }
        break;

    case 'd':
        if (rlist[0] == 0) // init displaylist once
        {
            rlist[0] = glGenLists(1);
            printf("rlist[0] = %d\n", rlist[0]);
            glNewList (rlist[0],  GL_COMPILE);
            glBegin(GL_LINE_STRIP);
            for (i = 0; i < LSIZE; i++)
            {
                glVertex3fv(lineData+3*i);
            }
            glEnd();
            glEndList();
        }
        else
        {
            glCallList(rlist[0]);
        }
        break;

    case 'D':
        if (rlist[1] == 0)
        {
            rlist[1] = glGenLists(1);
            printf("rlist[1] = %d\n", rlist[1]);
            glNewList (rlist[1],  GL_COMPILE);
            for (i = 1; i < LSIZE; i++)
            {
                glBegin(GL_LINES);
                glVertex3fv(lineData+3*(i-1));
                glVertex3fv(lineData+3*i);
                glEnd();
            }
            glEndList();
        }
        else
        {
            glCallList(rlist[1]);
        }
        break;

    case 'v':
        if (vlist[0] == 0)
        {
            glGenBuffersARB(1, vlist);
            glBindBufferARB( GL_ARRAY_BUFFER_ARB, vlist[0] );
            printf("vlist[0] = %d\n", vlist[0]);
            glBufferDataARB( GL_ARRAY_BUFFER_ARB, LSIZE*3*sizeof(float), lineData, GL_STATIC_DRAW_ARB );
        }
        else
        {
            glEnableClientState( GL_VERTEX_ARRAY );
            glBindBufferARB( GL_ARRAY_BUFFER_ARB, vlist[0] );
            glVertexPointer( 3, GL_FLOAT, 0, (char *) NULL );
            glDrawArrays( GL_LINE_STRIP, 0, LSIZE );
            glDisableClientState( GL_VERTEX_ARRAY );
        }
        break;

    case 'V':
        if (vlist[1] == 0)
        {
            glGenBuffersARB(1, vlist+1);
            glBindBufferARB( GL_ARRAY_BUFFER_ARB, vlist[1] );
            printf("vlist[1] = %d\n", vlist[1]);
            float *tmp = new float[2*(LSIZE-1)*3];
            unsigned long offset = 0;
            for (i = 1; i < LSIZE; i++)
            {
                memcpy(tmp+offset+0, lineData + 3*(i-1), 3*sizeof(float));
                memcpy(tmp+offset+3, lineData + 3*(i), 3*sizeof(float));
                offset += 6;
            }
            glBufferDataARB(GL_ARRAY_BUFFER_ARB, 2*(LSIZE-1)*3*sizeof(float), tmp, GL_STATIC_DRAW_ARB );
            glFlush();
            delete[] tmp;
        }
        else
        {
            glEnableClientState( GL_VERTEX_ARRAY );
            glBindBufferARB( GL_ARRAY_BUFFER_ARB, vlist[1] );
            glVertexPointer( 3, GL_FLOAT, 0, (char *) NULL );
            glDrawArrays( GL_LINES, 0, 2*(LSIZE-1) );
            glDisableClientState( GL_VERTEX_ARRAY );
        }
        break;

    default:
        break;
    }
	
    {
        OrthoMode o;
        glColor4ub(255,255,255,255);
        if (dhelp)
            drawHelp();
        if (ddmode)
            drawDrawMode();
        drawFPS();
    }

    glutReportErrors();
    glutSwapBuffers();
}
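
The 'V' case above builds its GL_LINES buffer with a temporary array and memcpy. The same expansion can be sketched with std::vector, which avoids the manual new/delete[]. `stripToLines` is a name made up for this illustration, and it assumes tightly packed x,y,z floats as in lineData.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: turn a line strip (x,y,z per vertex) into
// independent segments suitable for GL_LINES. Every interior vertex
// is duplicated, so N strip vertices become 2*(N-1) output vertices.
std::vector<float> stripToLines(const float* strip, std::size_t vertexCount)
{
    assert(strip != nullptr || vertexCount == 0);
    std::vector<float> lines;
    if (vertexCount < 2)
        return lines;  // nothing to draw
    lines.reserve(2 * (vertexCount - 1) * 3);
    for (std::size_t i = 1; i < vertexCount; ++i)
    {
        const float* a = strip + 3 * (i - 1);  // segment start
        const float* b = strip + 3 * i;        // segment end
        lines.insert(lines.end(), a, a + 3);
        lines.insert(lines.end(), b, b + 3);
    }
    return lines;
}
```

The result could then be uploaded with glBufferDataARB, passing lines.size()*sizeof(float) as the size and lines.data() as the pointer.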

Hi !

Most modern OpenGL hardware is pretty crappy at rendering lines because it is optimized for triangles (games).

You should get better performance with display lists, though. With a very large number of vertices the data might not fit in video memory; otherwise display lists should be faster than feeding OpenGL the vertices with glVertex…

Mikael

You should see a noticeable difference when using VAR or VBOs; I find it strange that you’re not.

If what mikael says is true, you can always double your first vertex and render your lines as triangles…see if that improves performance.

On a board with 128MB of RAM you can upload ~ 11 million vertices…I wouldn’t worry about running out of storage for verts :stuck_out_tongue:
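
The ~11 million figure is simple arithmetic: three floats per vertex are 12 bytes, and 128 MB divided by 12 bytes is roughly 11.2 million. As a sketch (`vertexCapacity` is a made-up helper, and it assumes 4-byte floats and that all of the memory is free for vertex data, which a real board will not grant):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: how many vertices of floatsPerVertex floats
// fit into the given number of bytes. Ignores everything else that
// lives in video memory (framebuffer, textures, ...).
std::size_t vertexCapacity(std::size_t bytes, std::size_t floatsPerVertex)
{
    assert(floatsPerVertex > 0);
    return bytes / (floatsPerVertex * sizeof(float));
}
```

Calling vertexCapacity(128u * 1024u * 1024u, 3) gives the number of x,y,z vertices that fit into 128 MB.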

Originally posted by Aeluned:
If what mikael says is true, you can always double your first vertex and render your lines as triangles…see if that improves performance.
OK, I tried that. Using a vertex twice to draw a line as a triangle does not work. Nothing will be rasterized. :smiley:
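
The reason nothing shows up is that a triangle with two identical vertices has zero area, so it covers no pixels. A small sketch of the area calculation (`triangleArea` is a made-up helper, not something from my program):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: area of the 3D triangle (a, b, c), computed as
// half the length of the cross product of two of its edge vectors.
double triangleArea(const float a[3], const float b[3], const float c[3])
{
    assert(a && b && c);
    const double u[3] = { b[0] - a[0], b[1] - a[1], b[2] - a[2] };
    const double v[3] = { c[0] - a[0], c[1] - a[1], c[2] - a[2] };
    const double cx = u[1] * v[2] - u[2] * v[1];
    const double cy = u[2] * v[0] - u[0] * v[2];
    const double cz = u[0] * v[1] - u[1] * v[0];
    return 0.5 * std::sqrt(cx * cx + cy * cy + cz * cz);
}
```

Passing the same vertex twice, as in triangleArea(p0, p0, p1), returns 0.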

I’ve created a very thin triangle strip based on the line strip’s geometry, but that’s not faster. Really strange.

Originally posted by Aeluned:
You should see a noticeable difference when using VAR or VBOs; I find it strange that you’re not.
Something must be wrong with my GLUT-program. Even this triangle strip won’t be rendered faster when I’m using display lists or VBOs.

It can’t be that difficult. :frowning:

Hi,
On my FX 5900, using vertex arrays and having all lines visible (not clipped), I’m getting
about 10.7 million antialiased lines / second.
For the aliased lines, about 85 million / second ( 4 million vertex line strip drawing test ).
I am only drawing 2D geometry, it may be different for 3D stuff…
I believe you should get better line drawing performance from a 6600 GT than a 6800 LE.
At some point, if you are using aliased lines and a very fast card, you would become
bus-limited on AGP, so then you should definitely use vertex buffer objects.
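
To put the bus-limit remark into numbers: with plain vertex arrays the vertex data has to cross the bus every frame. A rough sketch, assuming tightly packed 4-byte float coordinates and no index data; `busBytesPerSecond` is a hypothetical helper:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: bytes per second sent over the bus when the
// vertex data is re-sent from system memory for every frame.
// vertsPerLine is about 1 for a long line strip and 2 for GL_LINES.
double busBytesPerSecond(double linesPerSecond, double vertsPerLine,
                         std::size_t floatsPerVertex)
{
    assert(linesPerSecond >= 0.0 && vertsPerLine > 0.0);
    return linesPerSecond * vertsPerLine *
           static_cast<double>(floatsPerVertex * sizeof(float));
}
```

For 85 million lines per second drawn as one strip of 2D vertices this comes to about 680 MB/s, and with three floats per vertex to about 1 GB/s. That is already near the theoretical peak of AGP 4x (roughly 1 GB/s) and about half that of AGP 8x (roughly 2 GB/s). With a static VBO the data should not have to cross the bus on every frame.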

Originally posted by Aeluned:
You should see a noticeable difference when using VAR or VBOs; I find it strange that you’re not.
Not necessarily true if you’re not CPU- or bandwidth-limited.
I once read that there’s quite a bit of driver work when rendering lines because the driver will tessellate them into triangles.
So, a line is a quad (2 tris?) and vertex count bombs out!
I read this on OpenGL forum. I still don’t believe this too much but I think you may find this interesting.

Hi,
AFAIK lines are rendered as quads only once they exceed a certain width.
I am familiar with the AA rendering of line and quad strips on the Ti 4200 and FX 5900,
and the quality of the lines is much better than that of the equivalent quads, so a somewhat
different algorithm must be in use. I believe a Quadro FX 3000, which is pretty much
a GeForce FX 5900, renders AA lines 2.5 to 3 times faster than the latter; please
feel free to confirm or refute this (the driver uses a superior algorithm for Quadro
and a lesser one for GeForce, although the hardware is present on both chips).
Frankly, for drawing over 100,000 lines, it doesn’t make sense to use thick aliased
lines, but the original poster didn’t say whether antialiasing was enabled. For AA lines,
the numbers he mentioned are reasonable though.