Improving performance

Hi, I need to build a program that renders as many quads as possible, and I am not happy with my best solution so far.

I am using display lists, and my structure is the following:

for (each quad) {
    glNewList(list, GL_COMPILE);   // `list` is a placeholder for this quad's list id
    glBegin(GL_QUADS);
    glVertex3f(-size, -size, 0.0);
    glVertex3f( size, -size, 0.0);
    glVertex3f( size,  size, 0.0);
    glVertex3f(-size,  size, 0.0);
    glEnd();
    glEndList();
}

Execution:

for (each list) {
    glPushMatrix();
    glTranslatef(x, y, z);   // each quad has its own position in the 3D world
    billboardCheat();        // turns the quad towards the camera by modifying the modelview matrix
    glCallList(list);        // glCallList is the call that executes a display list
    glPopMatrix();
}

I wish I could put all the quads in a single list inside a single glBegin. Do you see any way to improve this?

The only way I can think of is to do the billboardCheat within the vertex shader.

If possible, make one display list for all the quads, instead of one per quad, as it seems you are doing now.

// Old: one display list and one glBegin/glEnd pair per quad
for (each quad) {
    glNewList(list, GL_COMPILE);   // `list` is a placeholder for this quad's list id
    glBegin(GL_QUADS);
    glVertex3f(-size, -size, 0.0);
    glVertex3f( size, -size, 0.0);
    glVertex3f( size,  size, 0.0);
    glVertex3f(-size,  size, 0.0);
    glEnd();
    glEndList();
}

// New: one display list and one glBegin/glEnd pair for all quads
glNewList(list, GL_COMPILE);

glBegin(GL_QUADS);
for (each quad) {
    glVertex3f(-size, -size, 0.0);
    glVertex3f( size, -size, 0.0);
    glVertex3f( size,  size, 0.0);
    glVertex3f(-size,  size, 0.0);
}
glEnd();

glEndList();

I.e. reduce the number of glBegin/glEnd calls, since these are most likely stored in the display list along with your vertex information.

Thanks, ZbuffeR, your idea is feasible. I could apply billboardCheat in the vertex shader and fold the glTranslatef offset directly into the glVertex3f coordinates; then I could put all the quads inside a single glBegin and a single list.
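As a rough sketch of the "fold the translation into the vertices" part (not the billboarding), the corner positions for all quads could be precomputed on the CPU like this. The function name `build_quad_vertices` and the flat x,y,z array layout are assumptions made for illustration, not something from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes 4 corners (12 floats) per quad into `out`.
 * `centers` holds n positions as x,y,z triples; `size` is the half-width,
 * matching the +/-size used in the glVertex3f calls above. */
void build_quad_vertices(const float *centers, size_t n, float size, float *out)
{
    /* Corner signs in the same order as the four glVertex3f calls. */
    static const float sx[4] = { -1.0f,  1.0f, 1.0f, -1.0f };
    static const float sy[4] = { -1.0f, -1.0f, 1.0f,  1.0f };

    assert(centers != NULL && out != NULL);

    for (size_t q = 0; q < n; ++q) {
        const float *c = &centers[q * 3];
        for (int k = 0; k < 4; ++k) {
            float *v = &out[(q * 4 + k) * 3];
            v[0] = c[0] + sx[k] * size;  /* translation folded into x */
            v[1] = c[1] + sy[k] * size;  /* translation folded into y */
            v[2] = c[2];                 /* quad stays in its own z plane */
        }
    }
}
```

The resulting array could then be submitted with glVertex3fv calls inside one glBegin(GL_QUADS)/glEnd pair, or handed to glVertexPointer. Note that the quads are not camera-facing yet; that step would still have to happen in the vertex shader.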

I have heard about vertex arrays and vertex buffer objects (VBOs), but I haven't used them. Are they, or any other method, faster than one list with a single glBegin for all vertices?

It depends on the implementation. For example, NVIDIA drivers seem unbeatable at optimizing display lists.
Still, learning VBOs is a good idea, as they also make it easier to update dynamic geometry.

It can easily be done with a vertex shader (VS) and vertex arrays.
You start by sending your quads as a vertex array, sending the same position 4 times (once for each vertex of the quad). The clever bit is sending a vec3 as the texture coordinate: the last float is set to 0,1,2,3 0,1,2,3, … 0,1,2,3 and is used to determine which corner of the quad the VS is processing.
As a uniform, we send the VS a modified modelview matrix, which is then indexed with the 3rd texture coordinate to obtain the camera-aligned vertex for the quad.
The result is an array of camera-aligned quads. In the rendering loop, either send a translation as a shader uniform, or add the translation to the centre position of the quad and bake it into the vertex array. The choice depends on whether the quads will move, e.g. particles of smoke.
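To make the per-vertex step a bit more concrete, here is a CPU-side sketch in C of what the vertex shader would compute for one vertex. It is only one possible way to get the corner offsets: it assumes the modelview matrix contains just rotation and translation, and the name `billboard_corner` and its parameters are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU-side equivalent of the per-vertex work described above.
 * mv     : 4x4 modelview matrix in OpenGL column-major order
 *          (assumed here to contain only rotation and translation)
 * center : quad centre, the same value for all four vertices
 * corner : 0..3, the value carried in the third texture coordinate */
void billboard_corner(const float mv[16], const float center[3],
                      float size, int corner, float out[3])
{
    static const float sx[4] = { -1.0f,  1.0f, 1.0f, -1.0f };
    static const float sy[4] = { -1.0f, -1.0f, 1.0f,  1.0f };

    assert(mv != NULL && center != NULL && out != NULL);
    assert(corner >= 0 && corner < 4);

    /* Rows 0 and 1 of the rotation part are the camera's right and up
     * axes expressed in object space. */
    const float right[3] = { mv[0], mv[4], mv[8] };
    const float up[3]    = { mv[1], mv[5], mv[9] };

    for (int i = 0; i < 3; ++i)
        out[i] = center[i] + size * (sx[corner] * right[i] + sy[corner] * up[i]);
}
```

In a real shader the four offsets would more likely be precomputed once per frame and uploaded as the uniform, so that each vertex only does a lookup by corner index and one addition.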

On a similar note, can display lists be created by specifying vertex arrays and indices, instead of a million glVertex3f calls?

You could use point-sprites, available in OpenGL core since 2.0, and via extensions GL_ARB_point_sprite or GL_NV_point_sprite from version 1.4 onwards.

You would just need to provide the positions, rather than quads.

example code (Delphi):

var
params: array[0..2] of GLfloat = (1,0.0,0.02);
begin
glEnable(GL_POINT_SPRITE);
glPointSize(32);
glPointParameterf(GL_POINT_FADE_THRESHOLD_SIZE, 20);
glPointParameterf(GL_POINT_SIZE_MIN, 0);
glPointParameterf(GL_POINT_SIZE_MAX, 128);
glPointParameterfv(GL_POINT_DISTANCE_ATTENUATION, @params);
// to flip the sprite upside down (openGL 2.0+ only)
//glPointParameteri(GL_POINT_SPRITE_COORD_ORIGIN, GL_LOWER_LEFT);

glTexEnvf(GL_POINT_SPRITE,GL_COORD_REPLACE,GL_TRUE);

GLMaterialLibrary1.ApplyMaterial('snowflake', rci);

// Fill the vertices list
vertices.Clear;

// For every particle, add to list:
vertices.Add(aParticle.Position);
vertices.Add(aParticle2.Position);

glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3,GL_FLOAT,0,vertices.List);
glDrawArrays(GL_POINTS,0,vertices.Count);
glDisableClientState(GL_VERTEX_ARRAY);

GLMaterialLibrary1.UnApplyMaterial(rci);

end;

Forgive me, but aren’t point sprites limited in size?
On most NV hardware that I’ve used point_size max = 63 so that does not make for very large quads.
Good solution for small particles though.

Plus, point-sprites are limited to squares. Trees, poles, etc can’t be done as efficiently with them.

If you have geometry shaders, you can adopt a method similar to the one BionicBytes mentions: send one central point per quad, and have the geometry shader emit the vertices of the quad you want as a triangle strip.

This effectively gives you ‘point sprites’ with no size limit.
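For illustration only, here is the expansion such a geometry shader would perform, written as plain C rather than GLSL so it can be checked on the CPU. The function name `emit_quad_strip`, the separate half-width and half-height, and the caller-supplied right/up vectors are all assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU-side stand-in for the geometry shader: expands one
 * centre point into the 4 vertices of a triangle strip.
 * right/up are the camera axes the quad should be aligned to. */
void emit_quad_strip(const float center[3], const float right[3],
                     const float up[3], float half_width, float half_height,
                     float out[4][3])
{
    /* Strip order: bottom-left, bottom-right, top-left, top-right,
     * which yields the two triangles (0,1,2) and (1,2,3). */
    static const float sx[4] = { -1.0f,  1.0f, -1.0f, 1.0f };
    static const float sy[4] = { -1.0f, -1.0f,  1.0f, 1.0f };

    assert(center != NULL && right != NULL && up != NULL && out != NULL);

    for (int k = 0; k < 4; ++k)
        for (int i = 0; i < 3; ++i)
            out[k][i] = center[i]
                      + sx[k] * half_width  * right[i]
                      + sy[k] * half_height * up[i];
}
```

Because width and height are independent here, the same approach would also cover non-square billboards such as the trees and poles mentioned above.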