Abysmal performance w/ many polygons on recent Macs

Hi! I’ve been having trouble with performance in my OpenGL app (Cn3D). It’s gotten worse each time I’ve upgraded my Mac and the OS X version. The rendering is very noticeably slower on 10.3 on my G4 than it was on 10.1 on a G3. I have not made any changes to the rendering engine in a couple of years; even with the same binary, performance has degraded almost to the point of uselessness on new Macs.

The problem seems to occur during the creation (not the display) of large display lists, e.g. with objects that have ~500,000 polygons. That is, any time I have to redraw the object(s), it takes up to several minutes. On a PC or Unix (I do Solaris, Linux, IRIX), it takes at most a few seconds, usually much less. Once the object is drawn (display lists are created), performance is fine, in terms of live rendering and manipulation of the lists. It’s only the redrawing that’s taking forever.

Does anybody have any idea why this might be happening, in particular with newer Macs, or tips to improve performance? (Aside from the obvious reducing the number of polygons, I mean :wink: - I’m doing that, too) I’m at a loss here, and people are definitely starting to complain of the unusability of the app on Macs nowadays.

Thanks for any advice, and let me know if I’ve left out any pertinent info.

- Paul

Okay, let me amend my statement slightly. Doing more careful profiling, I see the code between glNewList and glEndList is going plenty fast; that is, constructing the display lists isn’t the bottleneck.

It’s the first time glCallList is called on the newly created display lists that things are really, really slow. After the first time, subsequent calls to glCallList go at a decent speed (~hundreds of times faster than the first time).
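For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of a timing helper in C, assuming a POSIX system with clock_gettime. The names draw_fn, time_one_call, time_runs and fake_draw are made up for illustration and are not from Cn3D. In a real app the callback would wrap glCallList followed by glFinish, so that any work the driver defers is counted in the first call rather than leaking into a later one.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type. In a real app the function would do
   something like: glCallList(list); glFinish(); */
typedef void (*draw_fn)(void *user_data);

/* Returns the wall-clock seconds spent inside one call to fn. */
static double time_one_call(draw_fn fn, void *user_data)
{
    struct timespec start, end;

    assert(fn != NULL);
    clock_gettime(CLOCK_MONOTONIC, &start);
    fn(user_data);
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Calls fn `runs` times in a row and stores each duration in out[],
   so out[0] (the first execution) can be compared with the rest. */
static void time_runs(draw_fn fn, void *user_data, double *out, int runs)
{
    assert(out != NULL);
    for (int i = 0; i < runs; ++i)
        out[i] = time_one_call(fn, user_data);
}

/* Placeholder standing in for the real draw call: it only counts
   how many times it was invoked. */
static void fake_draw(void *user_data)
{
    int *calls = (int *)user_data;
    ++*calls;
}
```

Comparing out[0] against the later entries shows whether the cost really sits in the first execution of the list, which is the pattern described above.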

So, the question is, what’s happening the first time a display list is executed that’s taking so much time? I assume the underlying drivers/hardware are optimizing the lists somehow? Is there some way to control from my code what’s going on behind the scenes at this time?

Any suggestions?

Yes, Apple optimizes the display list during compilation, and this can potentially be slow. In my game, compiling very tiny (<500 vertices) lists occasionally takes longer than 1/60 second which makes me drop frames. :frowning:

The only recommendation I’ve seen is to ensure you are specifying homogeneous data all the time; that is, if you give vertex/color/normal for some data in a begin/end block, do it for all of that data.

In any event you should log a bug in Apple’s RADAR with an example application so they can work on improving the speed.

I can second the recommendation to send uniform data during display list compilation. In one application I was compiling a display list of about 10000 uniform-colored hexes with code like this:

NewList
    Begin(TRIANGLE_FAN)
         Color
         Vertex
         Vertex
         Vertex
         Vertex
         Vertex
         Vertex
         Vertex
    End
EndList

That list was taking several minutes to compile on my dual 867 G4. Changing the code to look like this:

NewList
    Begin(TRIANGLE_FAN)
         Color
         Vertex
         Color
         Vertex
         Color
         Vertex
         Color
         Vertex
         Color
         Vertex
         Color
         Vertex
         Color
         Vertex
    End
EndList

(where all the color calls set the same color) made display list compilation nearly instantaneous (< 2s) on the same machine.

Ah, excellent suggestion, thanks! Yes, that helps quite a bit. That takes the overall redrawing time from ~200 seconds down to ~5-20 seconds. Of course, it takes a lot longer now to construct the display lists, because in my case I have objects of largely solid colors, so specifying a color for every vertex is extremely redundant. So now the time is spent mostly between glNewList and glEndList; even the first call to glCallList now takes <1 sec.

So, this is a big improvement. I’m still not getting the performance I do on an equivalent PC, but it’s a big help.

Thanks again!

One more point - there is a downside to “homogenizing” the data this way, at least in my app (again, with hundreds of thousands of polygons). True, now at least display list creation/optimization is not taking nearly so long, but “live” performance is way down. That is, doing real-time manipulation of the object (change matrix, redisplay - call glCallList) as the user drags the mouse around (e.g. to rotate) is much, much slower with “homogenized” data, I guess because the display lists are much longer - many more glMaterial calls in my case. Oddly, though, the program uses less system memory with the “homogenized” data.

I haven’t benchmarked this, but if you only have to deal with Color/Vertex, another idea might be:

NewList
    Color
    Begin(TRIANGLE_FAN)
         Vertex
         Vertex
         Vertex
         Vertex
         Vertex
         Vertex
         Vertex
    End
EndList

Since color is a state change.

Yeah, that’s actually more or less what I originally had (although I do normals, too), so I’m guessing that won’t make much difference. My (very slow) display lists were composed of many triangle strips, interspersed with a few calls to glMaterial (outside glBegin/End).

Now my next problem is to test what happens if I use lots of gluCylinder and gluSphere objects, whether those suffer from the same problems, and if so, how I can fix that, since there I have no control over per-vertex calls…

Immediate mode calls on the Mac are very slow, due to large function call overhead.

If you don’t mind a little bit of Mac-specific code, you can use AGLMacro/CGLMacro to reduce that overhead. In my tests, it makes the calls take about half as long as usual.

You could also use vertex arrays and DrawElements to avoid that overhead, though be aware that Mac OS X 10.3.0 - 10.3.3 crash when compiling a DrawElements or DrawArrays call into a display list.

Of course, the GLU quadrics won’t take advantage of this technique, so if you’re relying on them, you won’t get the advantage. You could, of course, grab the GLU source code from Mesa and CGLMacro-ize them :slight_smile:
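To make the vertex array route a bit more concrete, here is a rough sketch in C of packing positions plus one shared color into an interleaved buffer, so every vertex carries the same attributes (the "homogeneous" layout discussed earlier) without one function call per attribute. The helper name pack_uniform_color and the FLOATS_PER_VERTEX constant are invented for this example; the layout is the one OpenGL calls GL_C3F_V3F (three color floats followed by three position floats). The code only fills memory and makes no GL calls itself.

```c
#include <assert.h>
#include <stddef.h>

/* r, g, b, x, y, z per vertex: matches the GL_C3F_V3F interleaved layout. */
#define FLOATS_PER_VERTEX 6

/* Hypothetical helper: copies `count` positions (xyz triples) into `out`,
   prefixing each with the same rgb color. `out` must have room for
   count * FLOATS_PER_VERTEX floats. Returns the number of floats written. */
static size_t pack_uniform_color(const float rgb[3], const float *xyz,
                                 size_t count, float *out)
{
    assert(rgb != NULL && xyz != NULL && out != NULL);

    for (size_t i = 0; i < count; ++i) {
        float *dst = out + i * FLOATS_PER_VERTEX;
        dst[0] = rgb[0];          /* color, repeated for every vertex */
        dst[1] = rgb[1];
        dst[2] = rgb[2];
        dst[3] = xyz[3 * i + 0];  /* position */
        dst[4] = xyz[3 * i + 1];
        dst[5] = xyz[3 * i + 2];
    }
    return count * FLOATS_PER_VERTEX;
}
```

The filled buffer could then be handed to glInterleavedArrays(GL_C3F_V3F, 0, out) and drawn with glDrawArrays(GL_TRIANGLE_FAN, 0, count). I have not benchmarked this on the Mac, and the 10.3.0 - 10.3.3 crash mentioned above still applies if the draw call is compiled into a display list.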

Can you get away with enabling GL_COLOR_MATERIAL and using glColor to change the material property, or do you need more control than that affords?
