glCallList very slow

I have a model with about 10,000 triangles. I put it into a display list and then display it 100 times, each time with a different translation. The problem is, this takes about 7 seconds! It is horribly slow. This seems like a very small model (throughput is usually measured in millions of polygons per second, right?), so I don’t know what the problem is. I measured the time with a timer class, and everything was fast except the glCallList calls.

Any thoughts?

Thanks,

Dave

Do you have accelerated OpenGL?
Check glGetString(GL_VENDOR), and also GL_RENDERER and GL_VERSION. If the vendor is ‘Microsoft’ something, it is not hardware accelerated.
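
As a rough illustration of that check, here is a minimal sketch that classifies the vendor and renderer strings. The helper name LooksLikeSoftwareGL and the marker list are assumptions made for this example, and the list is not exhaustive. It works on plain strings, so it can be tried without a GL context.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Lower-case copy, so the substring checks below are case-insensitive.
std::string ToLower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: returns true if the vendor/renderer strings look like
// a software implementation. The markers are an assumed, incomplete list:
// Microsoft's generic implementation and common Mesa software rasterizers.
bool LooksLikeSoftwareGL(const std::string& vendor, const std::string& renderer)
{
    static const char* const markers[] = {
        "microsoft", "gdi generic", "llvmpipe", "softpipe", "software rasterizer"
    };
    const std::string v = ToLower(vendor);
    const std::string r = ToLower(renderer);
    for (const char* m : markers)
    {
        if (v.find(m) != std::string::npos || r.find(m) != std::string::npos)
            return true;
    }
    return false;
}
```

In a real program, with a GL context current, the two arguments would come from reinterpret_cast<const char*>(glGetString(GL_VENDOR)) and the same call with GL_RENDERER.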

I also discovered that linking against OpenGL.lib instead of OpenGL32.lib will switch to this Microsoft driver. I had a hard time figuring out what the problem was.

Here is the output

Vendor: NVIDIA Corporation
Renderer: Quadro FX 3450/4000 SDI/PCI/SSE2
Version: 2.1.2 NVIDIA 173.14.12

does that look good?

That looks fantastic.
Download glviewer and run some rendering tests, or any sample that uses display lists.
http://www.realtech-vr.com/glview/download.html

By the way, here is a small benchmark that compares immediate-mode calls, display lists, and vertex arrays.
http://www.codesampler.com/oglsrc/oglsrc_9.htm#ogl_benchmark_sphere

I am running Linux… is there a Linux version of that, or something equivalent?

There surely is, but I don’t know of one. Can’t you run it under Wine?

Vendor: NVIDIA Corporation
Renderer: Quadro FX 3450/4000 SDI/PCI/SSE2
Version: 2.1.2 NVIDIA 173.14.12

It looks fine and your code should be hardware accelerated. On Linux, a quick test is to run glxgears and give us the fps value, although that number is not very meaningful.

Show us how you build and use your display list. How do you update “your translation”?

glxgears says:

58538 frames in 5.0 seconds = 11707.478 FPS

Here is my drawing function:


	
void ModelFile::DrawTriangles(const bool Normals, const bool Named) const
{
	bool HasColors = false;
	if(NumColors() > 0)
		HasColors = true;
	else
		glColor3ub(100, 100, 100);
	
	if(Named)
		glPushName(1);
	
	if(NumTriangles() > 0)
	{
		//cout << "Drawing " << NumTriangles() << "Triangles with " << NumColors() << " colors." << endl;
		
		for(int i = 0; i < NumTriangles(); i++)
		{
			if(Named)
				glLoadName(i);
			
			if(HasColors) //color!
			{
				geom_Color<unsigned char> cu = Colors_[i];	
				glColor3ub(cu.getR(), cu.getG(), cu.getB());
			}
				
			Triangles_[i].Draw(Normals);		
		}
	}
	else
	{
		DrawPoints();
	}
}

I make the display list with:


int list;
list = glGenLists(1);
glNewList(list, GL_COMPILE);
Scene.DrawTriangles(true, true);
glEndList();

and then I call it like this


glMatrixMode( GL_PROJECTION );
glLoadIdentity();
gluPerspective(50, 1, 1, 100); //field of view angle, aspect ratio, znear, zfar

glMatrixMode( GL_MODELVIEW );
glLoadIdentity();
//gluLookAt(2, 2, 10, 0, 0, 0, 0, 1, 0);
gluLookAt(Location_.getX(), Location_.getY(), Location_.getZ(), P.getX(), P.getY(), P.getZ(), 0, 1, 0);

//display(Model);
display(list);


The camera location is simply updated between calls.

Is there any reason this would be so slow?

Thanks,
Dave

Hmm, can you also post the Triangles::Draw(bool) function?
I’m afraid that you are calling glBegin / glEnd for every triangle… :expressionless:

You are correct. I assume from your frowning face that this is very bad?!

I’ll take that out and let you know what happens.

Thanks,

Dave

Sure, glBegin/glEnd for each triangle is slow in immediate mode, but being compiled in a display list should make it fast anyway.

However, what is in display(list)?

Haha sorry, forgot to include that.


	glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
	glEnable(GL_DEPTH_TEST);
	//Model.DrawTriangles(true,true);
	glCallList(list);
	SDL_GL_SwapBuffers();

Removing the glBegin() and glEnd() calls from each triangle sped it up significantly, but it still takes 2 minutes to draw the model 16,000 times. That still seems like way too long to me.

Dave

glEnable(GL_DEPTH_TEST); should be done only once, not in the render loop. This can cause the driver to dirty some caches. In general, you cannot benchmark GPU graphics the same way as CPU programs, because the driver queues commands and the GPU executes them asynchronously.
Here are some tools that trace GL calls, so that you can spot redundant calls and similar problems:
http://www.opengl.org/sdk/tools/

16000 / 120 = 133 fps

133 fps × 10000 triangles = 1.3 million tris/sec

Not that bad, but I have no idea what to expect from this Quadro’s performance.
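
The same estimate can be written as a small helper, which makes it easy to redo with other timings. This is only a sketch of the arithmetic above. The names Throughput and EstimateThroughput are made up for this example.

```cpp
// Result of the back-of-the-envelope estimate.
struct Throughput {
    double framesPerSecond;
    double trianglesPerSecond;
};

// Hypothetical helper: frames drawn, wall-clock seconds, triangles per frame.
Throughput EstimateThroughput(double frames, double seconds, double trianglesPerFrame)
{
    const double fps = frames / seconds;
    return { fps, fps * trianglesPerFrame };
}
```

With the figures from this thread, EstimateThroughput(16000, 120, 10000) gives roughly 133 fps and 1.3 million triangles per second.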

What display resolution are you using? Do you happen to have a dual-screen display?
Try a small window (e.g. 64x64 pixels) to see whether the speed changes significantly. If it does, your case is limited not by geometry but by rasterization: texturing, blending, shaders, anisotropic filtering, non-mipmapped textures, antialiasing…

when I run glxgears, I get:
54337 frames in 5.0 seconds = 10867.398 FPS

These are the kind of numbers I would like to see! But I guess that is not drawing nearly as many triangles…

I do have dual screens, but I’m just using a 640x480 window on one of them.

WHOA - I just tried with a 320x240 window and it takes 10 seconds instead of 2 minutes!! So why is there such a huge difference?? I am not doing anything with lighting, texturing, etc. What I’ve shown here is the only OpenGL that I’m doing.
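
To put those two runs side by side, here is a small sketch that compares the pixel count ratio with the time ratio. The timings are the rough figures quoted above (about 2 minutes and about 10 seconds), so treat the result as an order-of-magnitude indication only. WindowRun, PixelRatio and TimeRatio are hypothetical names.

```cpp
// One timed run: window size in pixels and total wall-clock time in seconds.
struct WindowRun {
    int width;
    int height;
    double seconds;
};

// How many times more pixels run a covers than run b.
double PixelRatio(const WindowRun& a, const WindowRun& b)
{
    return (static_cast<double>(a.width) * a.height) /
           (static_cast<double>(b.width) * b.height);
}

// How many times longer run a took than run b.
double TimeRatio(const WindowRun& a, const WindowRun& b)
{
    return a.seconds / b.seconds;
}
```

For {640, 480, 120.0} against {320, 240, 10.0}, the pixel ratio is 4 and the time ratio is 12. The time depends strongly on the window size, which suggests the cost is dominated by per-pixel work rather than by the 10,000 triangles.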

Thanks,

Dave

glBegin / glEnd should add some overhead even in the display list.
Do you really need the glPushName and glLoadName calls?

P.S. Why do you want to reach 10,000 fps? The refresh rate of the display is limited to 75-80 Hz.
To get a higher fps you should use more advanced techniques such as VBOs and simple shaders, and optimize your mesh for vertex cache hits.

The names are the entire reason I am doing this. I am not interested in actually seeing the image. I am just using OpenGL to do the clipping and tell me which triangle is in the center of the image, because that will be the first triangle hit by a ray travelling in the direction the camera is looking.

I need it to go fast because I need to determine this triangle for millions of rays, and I don’t want to wait for a year! Haha

Well, GL names are not hardware accelerated!

  1. You don’t need a large window; a 1x1 pixel window will be enough.
  2. Instead of names, a better way is to give each triangle a unique color; RGB8 provides 16.7 million unique IDs. Draw the scene, then glReadPixels the pixel(s) of interest.
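
A minimal sketch of the color ID idea in point 2, assuming an 8-bit-per-channel framebuffer. The names Rgb8, kNoTriangle, TriangleIdToColor and ColorToTriangleId are hypothetical, and reserving white as the "nothing hit" background value is a choice made for this example.

```cpp
#include <cassert>
#include <cstdint>

// One RGB8 pixel, as read back with GL_RGB / GL_UNSIGNED_BYTE.
struct Rgb8 {
    std::uint8_t r, g, b;
};

// Assumption for this sketch: the background is cleared to white, so the
// value 0xFFFFFF means "no triangle" and is never used as a triangle ID.
constexpr std::uint32_t kNoTriangle = 0xFFFFFFu;

// Pack a triangle index into a unique 24-bit color.
Rgb8 TriangleIdToColor(std::uint32_t id)
{
    assert(id < kNoTriangle);
    return { static_cast<std::uint8_t>((id >> 16) & 0xFFu),
             static_cast<std::uint8_t>((id >> 8) & 0xFFu),
             static_cast<std::uint8_t>(id & 0xFFu) };
}

// Recover the triangle index (or kNoTriangle) from a pixel that was read back.
std::uint32_t ColorToTriangleId(const Rgb8& c)
{
    return (static_cast<std::uint32_t>(c.r) << 16) |
           (static_cast<std::uint32_t>(c.g) << 8) |
            static_cast<std::uint32_t>(c.b);
}
```

When drawing, each triangle i would get glColor3ub(c.r, c.g, c.b) with c = TriangleIdToColor(i). After rendering, glReadPixels(x, y, 1, 1, GL_RGB, GL_UNSIGNED_BYTE, &pixel) returns the color at the pixel of interest. For the IDs to survive unchanged, anything that alters colors (lighting, blending, dithering, antialiasing, texturing) would need to be disabled.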

Edit: to avoid being bitten by a slow path, read Appendix E and try to avoid deprecated features:
http://www.opengl.org/registry/doc/glspec30.20080811.pdf

You can keep display lists; they are fast, especially on NVIDIA hardware.

So glColor3f would be much, much faster than glLoadName?

The explanation is more complex than that. But yes, you will get much better performance.