multithreaded OpenGL WTF?

Am I the only one irritated by this so-called “Multithreaded OpenGL” Apple is talking about? This awesome new “technology”, of course only available on Mac OS X. :confused:
It seems Steve Jobs has invented a new breed of graphics cards which are capable of doing this, while still using the manufacturers’ hardware reference designs…
:wink:

And no more info on the provided link. Cheap PR stunt from Blizzard and Apple :stuck_out_tongue:

There is information available… like this developer technote: http://developer.apple.com/technotes/tn2006/tn2085.html

Basically, it offloads the CPU work associated with making an OpenGL call (allocation of driver resources, pixel format conversion, etc.) to a secondary thread. Obviously, that means that the application as a whole is doing more work (thread synchronization and data copying + all the work it was already doing), but if the second CPU was otherwise under-utilized, the quicker return of the OpenGL call on the main thread can improve performance.

If your application was already CPU-bound on all CPUs, you will lose out with the multithreaded GL. That’s why it’s opt-in; applications have to explicitly enable it if they feel they will benefit from it. Basically, it’s an easy way for a developer to get a multithreaded application out of a single-threaded one; if you’ve already done the work of threading your app well you’ll likely gain nothing from it.
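
For anyone who wants to try it: per the technote linked above, the opt-in is a single CGL call. A minimal sketch (the enable can fail on configurations that don’t support it, so check the result):

// Opt in to the multithreaded GL engine (see TN2085 linked above).
// If CGLEnable fails, the context simply stays single-threaded.
#include <OpenGL/OpenGL.h>

static void enable_multithreaded_gl(void)
{
    CGLContextObj ctx = CGLGetCurrentContext();
    CGLError err = CGLEnable(ctx, kCGLCEMPEngine);
    if (err != kCGLNoError) {
        // Not supported here; keep running on the normal single-threaded path.
    }
}

You’d call this once after making your context current; everything after that is unchanged GL code.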

It’s pretty buggy in 10.4.8; WoW must be doing things pretty strictly on whatever clean path Apple has provided to not be crashing.

Thanks for the relevant link and clarifications.

Thanks for the info. That is exactly what I was expecting…
This has nothing to do with OpenGL, however. It’s just a framework for multithreaded applications with support for OpenGL in a single thread. It’s nothing new at all!
But maybe it is for Mac users. I am not familiar with the OS X API.

Since part of the OpenGL framework’s workload has been parallelized, OpenGL calls will return control to your application sooner and more CPU cycles will be available for its main thread.
Parallelism means execution in parallel, not waiting for execution in parallel, which is what this actually is.

I think the phrasing is misleading (intentionally so) and agree with ZbuffeR: it’s just a cheap PR trick.

Don’t recent drivers from nVidia and ATI do this already? I think I read somewhere that they already utilize multithreading, but I might be wrong.

Jan.

I did some simple tests yesterday on my X1900 and the results really surprised me. I attached two threads of the same process to the same hDC (I called wglCreateContext in both threads) and it works. The geometry rendered from both threads is combined in the same back buffer. My test application is very simple (I just render two quads without depth test), so I’m wondering if it will always work or if I was just lucky.

In the MSDN documentation they say:

“An application can perform multithread drawing by making different rendering contexts current to different threads, supplying each thread with its own rendering context and device context.”

In my test app I have different rendering contexts but the same device context for both threads, so I don’t follow the spec. There’s a lot of information on the internet about the many new effects we can do with programmable hardware, but only little information about multithreaded rendering.
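
For what it’s worth, the setup MSDN describes would look roughly like this; this is just a sketch of my understanding (pixel format selection and error checks omitted, and it assumes the main thread already called SetPixelFormat on the window):

// Each thread gets its own DC and its own rendering context, and only ever
// makes that context current on itself -- the arrangement MSDN sanctions.
#include <windows.h>
#include <GL/gl.h>

static DWORD WINAPI renderThread(LPVOID param)
{
    HWND wnd = (HWND)param;
    HDC dc = GetDC(wnd);               // per-thread device context
    HGLRC rc = wglCreateContext(dc);   // per-thread rendering context
    wglMakeCurrent(dc, rc);

    // ... issue this thread's GL commands here ...

    wglMakeCurrent(NULL, NULL);
    wglDeleteContext(rc);
    ReleaseDC(wnd, dc);
    return 0;
}

Whether two threads drawing through DCs of the same window really end up composed in one back buffer, as in my test, is exactly the part the documentation doesn’t seem to promise.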

Originally posted by def:
This has nothing to do with OpenGL, however. It’s just a framework for multithreaded applications with support for OpenGL in a single thread. It’s nothing new at all!
Um, I’m not certain, but I think you may have misinterpreted it. When you opt in, every GL call submits work to another thread. It has nothing to do with anything that isn’t GL, and indeed, it couldn’t work with most other APIs where synchronous execution is required.

[quote] Since part of the OpenGL framework’s workload has been parallelized, OpenGL calls will return control to your application sooner and more CPU cycles will be available for its main thread.
Parallelism means execution in parallel, not waiting for execution in parallel, which is what this actually is.
[/QUOTE]There is no waiting involved… you make a GL call, and it starts executing asynchronously on the other thread some time later. You wait only for the work required to transfer the call to the other thread, not for any of the work that the call itself does.
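
To make the “no waiting” part concrete: conceptually it behaves like a command queue (this is purely an illustration, not Apple’s actual code). The app thread only pays for the enqueue; the expensive driver-side work happens on the worker later:

// Conceptual illustration only -- not Apple's implementation. The "GL call"
// on the app thread just appends a command and returns; a worker thread
// pops commands and does the expensive work some time later.
#include <pthread.h>
#include <stddef.h>

typedef struct cmd {
    struct cmd *next;
    void (*execute)(void *arg);
    void *arg;
} cmd;

static cmd *head, *tail;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

void submit(cmd *c)                     /* what the app thread pays for */
{
    c->next = NULL;
    pthread_mutex_lock(&lock);
    if (tail) tail->next = c; else head = c;
    tail = c;
    pthread_cond_signal(&nonempty);
    pthread_mutex_unlock(&lock);        /* returns immediately */
}

void *worker(void *unused)              /* validation, conversion, etc. */
{
    for (;;) {
        pthread_mutex_lock(&lock);
        while (!head) pthread_cond_wait(&nonempty, &lock);
        cmd *c = head;
        head = c->next;
        if (!head) tail = NULL;
        pthread_mutex_unlock(&lock);
        c->execute(c->arg);             /* the real work, off the app thread */
    }
    return NULL;
}

A call that needs an answer back (glGetError, glFinish, readbacks) has to wait for that queue to drain, which is where the cost shows up.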

I think the phrasing is misleading (intentionally so) and agree with ZbuffeR: it’s just a cheap PR trick.
I think the phrasing succinctly captures what it’s doing, and if an 80% performance improvement for WoW is a “cheap PR trick”, I’ll take a cheap PR trick any day of the week :wink:

If your GL calls are lightweight, meaning the command just needs to get to the GPU, then there is no need for a second thread.
I don’t know why, but other libs like OpenAL seem to create another thread.

What evidence is there that there is a benefit?

AFAIK OpenAL does its audio processing in the second thread. That makes it independent of the main app, so that it can process audio data at the necessary update rate, no matter how fast the main app runs.

I think what the threads are used for makes a big difference, because the audio processing actually needs to continue over the following frames.

So i don’t think one can compare the two use-cases.

Jan.

Originally posted by Jan:
[b] Don’t recent drivers from nVidia and ATI do this already? I think I read somewhere that they already utilize multithreading, but I might be wrong.

Jan. [/b]
I’ve heard this too; it would be good if NVidia/ATI could tell us how it compares to the Apple OpenGL multithreading and, if it is significantly different, when we will see it on the PC (if ever). If it’s an OS-related feature, perhaps Vista will allow NVidia/ATI to implement something similar? Hopefully this will all be sorted out by the time I get a quad core :slight_smile:

I don’t know how audio works exactly, but isn’t it like GL? Make a sound BO and tell the system to play it, the system being the sound card. Or maybe the sound mixing is done in software, and that’s why a thread is needed.

I remember someone said he had a dual-core 64-bit CPU and WinXP 64. The nVidia driver spawned a thread that used 100% of the second CPU even when he wasn’t doing anything.

What evidence is there that there is a benefit?
Evidence? An 80% performance improvement in WoW isn’t good enough for you?

Although it does change the performance cost of various GL API calls, other behavior should not change. It does not allow an app to submit commands to the same GL context from multiple threads at once. An app is, of course, free to use other threads for its own work, as long as each context is only used from a single thread at once.

The performance boost comes from the state validation within GL occurring on another thread, allowing it to run in parallel with your code. GL calls effectively return earlier, allowing your code to continue.

hmm, does it mean that for code like:

glDrawElements(.........);   // suppose this call generates an error
glGetError();                // queried immediately after the draw call
sleep(5);
glGetError();                // queried again, a few seconds later

The first glGetError can return NO_ERROR because it hasn’t seen the error generated by DrawElements, and the second glGetError can return an error code?

Nope, it’ll still work. However, the first glGetError will likely be expensive, as it must wait for any outstanding commands to complete. It isn’t a full glFinish (you only have to wait for the software-side work on the other thread, not for the HW), but it will hurt.

Basically, you should be able to just flip the switch on for any app, and the results should be unchanged from a correctness point of view*. The app might actually be slower in the case that OneSadCookie describes (hence the need for many apps to be modified to avoid constructs like this), but it should still be correct.

If you’ve observed otherwise, you really need to file a radar.

*Assuming the app is correctly written. If an app relies on undefined behavior, then ‘undefined’ very well may have changed. This is especially true for apps that may not get their fencing quite right when using extensions such as APPLE_flush_buffer_range. It is especially especially true for apps with latent threading errors (i.e., trying to re-enter a GL context), even if you were getting lucky before.
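
If you want to keep error checking for development without paying those sync points in shipping builds, the usual trick is a debug-only wrapper. A hypothetical sketch (GL_CHECK is not an Apple API, just a macro you’d define yourself):

// Hypothetical debug-only error check. In debug builds every wrapped call is
// followed by glGetError (which forces a sync with the GL worker thread); in
// release builds the check compiles away, so no sync point is introduced.
#include <stdio.h>
#include <OpenGL/gl.h>   /* <GL/gl.h> elsewhere */

#ifdef DEBUG
#define GL_CHECK(call) \
    do { \
        call; \
        GLenum gl_check_err = glGetError(); \
        if (gl_check_err != GL_NO_ERROR) \
            fprintf(stderr, "%s -> GL error 0x%04x (%s:%d)\n", \
                    #call, gl_check_err, __FILE__, __LINE__); \
    } while (0)
#else
#define GL_CHECK(call) call
#endif

/* usage: GL_CHECK(glDrawElements(GL_TRIANGLES, count, GL_UNSIGNED_SHORT, indices)); */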

Hmm, I always thought the underlying hardware was responsible for OpenGL being single threaded…
If “Multithreaded OpenGL” means I can do CPU work 80% faster (WoW), that’s great, but OpenGL is still the same as before.
Who is saying that actual raw OpenGL performance is getting better through Multithreaded OpenGL? Raise your hands, please. :wink:

You’re right, of course, multithreaded OpenGL won’t let you draw more polygons with more complex shaders or anything. All it does is spread the CPU load (and in fact, increase it slightly).

Still, if that makes it easier to make a fast game, who’s to complain? Nobody ever claimed that this was magic :wink:

Well, just for the record, there’s a multithreading switch in the latest nVidia drivers for WinXP (only present if you are running on a multi-core CPU), but it actually made our app run a lot slower, so we had to turn it off! YMMV…

EDIT: Correction! After reading the article, I checked whether we had any glGetError() calls in the code, and we did, so I #ifdef-ed them out, and now our app runs okay with the multithreaded optimization. I say okay because it’s not any faster (our app is already threaded, and is not really CPU limited), but it makes the framerate a bit jerky (even when locked to vsync!), so I still keep it turned off.

There is also a flag “Generate/Log OpenGL errors” (or something similar; written from memory) in the driver settings (just above the MT-flag). What does it generate and where can I find the logs?