Help! OSX performance incredibly slow vs Windows



jameswh
06-23-2010, 04:06 AM
Hi

I've recently started on a Mac port of a Windows app. The app uses OpenGL to render a 2D interface (featuring lots of antialiasing, alpha blending, etc.).

I started on the Windows version first, wrongly assuming the OSX version would be a breeze because of Apple's apparently tight integration of OpenGL.

However, rendering the scene on OSX is incredibly slow compared to the Windows build. On the Windows debug build, it's less than 0.5 ms. On OSX (10.5 with Nvidia drivers) it's taking around 5-7 ms (in both cases I'm timing before the swapbuffers call). I've heard that Nvidia drivers are bad on OSX, but I find it hard to believe this is the only explanation for such a huge difference. (This is on the same machine, using Boot Camp!)

I'm totally lost. Does anyone have any ideas or tips, or know of common pitfalls, etc.?

ZbuffeR
06-23-2010, 05:39 AM
Read this for starters:
http://www.opengl.org/wiki/Performance
(it looks like you are not measuring time correctly)

And avoid doing perf tests with a debug build; it can activate a lot of unnecessary checks in the executable, and each compiler does this differently.
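
One way to separate the two measurements being discussed is sketched below in C. It is only a rough illustration, not taken from the wiki page: the names `frame_fn`, `frame_timing`, `time_frames` and `demo_submit` are hypothetical, and the GL work is passed in as callbacks so the harness compiles without a GL context. In a real app, `submit` would issue the frame's GL calls and `finish` would call `glFinish()`, so that `total_ms` includes the time until the commands have actually completed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime under strict C modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: `submit` stands in for the code that issues
   one frame of GL calls, `finish` for a call such as glFinish(). */
typedef void (*frame_fn)(void *user);

typedef struct {
    double submit_ms;  /* average time spent issuing commands */
    double total_ms;   /* average time including the wait for completion */
} frame_timing;

/* Monotonic wall-clock time in milliseconds. */
static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1000.0 + (double)ts.tv_nsec / 1.0e6;
}

/* Run `frames` frames and return per-frame averages. Averaging over many
   frames smooths out one-off costs such as first-use setup. */
frame_timing time_frames(frame_fn submit, frame_fn finish, void *user, int frames)
{
    frame_timing t = {0.0, 0.0};
    assert(submit != NULL && frames > 0);
    for (int i = 0; i < frames; ++i) {
        double t0 = now_ms();
        submit(user);          /* commands are queued here */
        double t1 = now_ms();
        if (finish != NULL)
            finish(user);      /* block until the queued work is done */
        double t2 = now_ms();
        t.submit_ms += t1 - t0;
        t.total_ms += t2 - t0;
    }
    t.submit_ms /= frames;
    t.total_ms /= frames;
    return t;
}

/* Stand-in for real rendering code: just counts how often it ran. */
static void demo_submit(void *user)
{
    ++*(int *)user;
}
```

Calling `time_frames(demo_submit, NULL, &counter, 100)` exercises the harness without any GL at all. Comparing `submit_ms` and `total_ms` on both platforms, in a release build, would show whether the extra time is spent issuing the calls or waiting for them to complete.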

jameswh
06-23-2010, 06:01 AM
Thanks for the reply

Perhaps I'm confused, but I wasn't really interested in how long the graphics card takes to render all the data. I specifically want to compare the time taken on the CPU to *prepare* the scene, before the swapbuffers call. (The scene is relatively simple compared to a modern video game.)

So what is freaking me out is that the same block of code (essentially a bunch of calls to glCallLists with a bit of immediate-mode stuff) is taking approximately 10x as long on OSX as on Windows. As a result, the app shows almost no activity in Windows Task Manager, but shows CPU usage of around 20% in OSX Activity Monitor. And it's not really any better in a Release build.

It simply seems like the glXXXX calls are taking much longer on OSX. This is what is confusing me, as I thought the whole advantage of OpenGL is that the calls are supposed to be asynchronous (as stated in the link you mentioned!).

Of course I expect some differences in call overhead across platforms, but the factor I'm seeing is just totally beyond anything that would seem reasonable. Like the doc says, OpenGL is pipeline-based, and it seems strange that loading the pipeline is so much slower on OSX.

Am I misunderstanding something here?

ZbuffeR
06-23-2010, 07:14 AM
jameswh wrote: "Perhaps I'm confused, but I wasn't really interested in how long the graphics card takes to render all the data. I specifically want to compare the time taken on the CPU to *prepare* the scene, before the swapbuffers call. (The scene is relatively simple compared to a modern video game.)"
This is flawed, as slower scene preparation does not mean much if the rendering time is faster.

The rest of your post is correct though; higher average CPU usage is a signal.

The GL system on a Mac is quite different from Windows/Linux, as Apple prefers to implement a good part of it themselves. Maybe you hit some slower path due to format conversion or whatever.

Sorry I cannot help you much; this touches the limits of my Mac knowledge.

jameswh
06-23-2010, 01:55 PM
I've done further measurements, and it definitely seems to be the case that all the glXXXX calls are just taking way longer on OSX than on Windows/Boot Camp. The time taken to call glCallLists also seems to depend on the complexity of the lists, so it seems it's not optimised by the hardware at all; maybe the list is even being unrolled somewhere on the CPU.

Having looked it up on the web, there seem to be some reports that Nvidia drivers are totally unoptimised on OSX. Which is really depressing...

On the positive side, I have some ideas for workarounds to "compile" static parts of the GUI into textures on the fly. I need some advice on how to do this, so I will make a new post in the advanced forum - any tips would be really appreciated!

It's days like this when I really hate coding :(
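
The idea of "compiling" static GUI parts into textures could be organised around a dirty flag, roughly as in the C sketch below. This is an assumption about how such a cache might look, not something from the thread: `cached_layer`, `layer_init`, `layer_invalidate`, `layer_draw` and the `demo_*` helpers are hypothetical names, and the two callbacks stand in for real GL code. In an actual renderer, `redraw` would render the widget into an offscreen texture (for example through a framebuffer object) and `blit` would draw that texture as a single quad.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type standing in for real GL drawing code. */
typedef void (*layer_fn)(void *user);

typedef struct {
    int dirty;        /* nonzero when the cached image is stale */
    layer_fn redraw;  /* expensive: re-render the contents into the cache */
    layer_fn blit;    /* cheap: draw the cached image to the screen */
    void *user;       /* passed through to both callbacks */
} cached_layer;

void layer_init(cached_layer *l, layer_fn redraw, layer_fn blit, void *user)
{
    assert(l != NULL && redraw != NULL && blit != NULL);
    l->dirty = 1;     /* nothing has been cached yet */
    l->redraw = redraw;
    l->blit = blit;
    l->user = user;
}

/* Call whenever the layer's contents change (text edited, state toggled). */
void layer_invalidate(cached_layer *l)
{
    l->dirty = 1;
}

/* Per-frame draw: pay the expensive path only when something changed. */
void layer_draw(cached_layer *l)
{
    if (l->dirty) {
        l->redraw(l->user);
        l->dirty = 0;
    }
    l->blit(l->user);
}

/* Stand-ins for real rendering code: they only count invocations. */
typedef struct {
    int redraws;
    int blits;
} demo_counts;

static void demo_redraw(void *user) { ((demo_counts *)user)->redraws++; }
static void demo_blit(void *user)   { ((demo_counts *)user)->blits++; }
```

With this structure, a static panel costs one textured quad per frame after its first draw, and the many per-widget GL calls are only issued again after `layer_invalidate`. Whether that is a net win depends on how often the GUI actually changes.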

Groovounet
06-25-2010, 10:30 AM
Hi,

Display lists are not a hardware feature but a software feature that optimizes your calls, making glCallLists more efficient than issuing the same commands individually, which saves CPU. Yes, in some sense the display lists are unrolled. The amount of optimization depends on the implementation, and chances are that Apple, which doesn't seem to care enough about OpenGL, did some lazy work.

nVidia is not to blame, because they actually just implement a sub-layer of the driver. The rest is Apple's work, and I have the feeling that both AMD and nVidia would be really happy to build the entire drivers. Chances are that the combination of the Apple layer and the nVidia/AMD layer makes every call slower than on Windows, where they can take shortcuts (or avoid long paths?).
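
To picture what "unrolled" would mean for CPU cost, here is a deliberately naive toy model in C. It does not describe how any real driver works; `toy_list`, `toy_list_begin`, `toy_list_record`, `toy_list_call` and `demo_cmd` are hypothetical names. The list is just an array of recorded commands, and calling it replays them one by one, so the cost of a call grows with the number of commands in the list. That would match the observation that glCallLists time depends on list complexity.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LIST_MAX 64

/* A recorded command: a function applied to some state. */
typedef void (*toy_cmd)(void *state);

typedef struct {
    toy_cmd cmds[TOY_LIST_MAX];
    size_t count;
} toy_list;

/* Start recording into an empty list. */
void toy_list_begin(toy_list *l)
{
    assert(l != NULL);
    l->count = 0;
}

/* Append one command; returns 0 if the list is full, 1 otherwise. */
int toy_list_record(toy_list *l, toy_cmd c)
{
    assert(l != NULL && c != NULL);
    if (l->count >= TOY_LIST_MAX)
        return 0;
    l->cmds[l->count++] = c;
    return 1;
}

/* Naive "unrolled" replay: one CPU-side dispatch per recorded command.
   Returns the number of commands executed. */
size_t toy_list_call(const toy_list *l, void *state)
{
    assert(l != NULL);
    for (size_t i = 0; i < l->count; ++i)
        l->cmds[i](state);
    return l->count;
}

/* Stand-in command: increments an int counter. */
static void demo_cmd(void *state)
{
    ++*(int *)state;
}
```

An optimizing implementation would do more work at record time, for instance merging or pre-packing commands, so that replay is cheaper than issuing each command separately. A replay loop like the one above gets none of that benefit.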