create display lists in another thread

Hi,

I create display lists using glNewList() for a rendering context (say RC). Since the whole graphics caching process takes a very long time, I am trying to move it to another thread (or process). Is that possible? glNewList() must be called while the RC is current, but the RC is in the original thread. If the RC must remain current for the whole graphics caching process, that defeats the purpose. Could somebody help me out? Any help is greatly appreciated.

By the way, I am working in an MS Windows environment.

JD

Create a second rendering context RC2 that shares display lists with RC (via wglShareLists()). You can then use RC2 in the second thread to create the display lists. When you are finished, signal your main thread, which can then safely use the lists in RC.
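Put concretely, that recipe might look roughly like the sketch below. This is untested and illustrative only: names like BuildListsThread, g_listBase, and hDC2 (a second device context, e.g. from a hidden window with the same pixel format as the main one, since each thread wants its own DC) are made up for illustration, and all error handling is elided.

```c
// Sketch only, error handling elided. hRC is the main thread's existing
// rendering context; hDC2 is a second DC with the same pixel format
// (e.g. from a hidden window), since each thread wants its own DC.
#include <windows.h>
#include <GL/gl.h>

static HGLRC  hRC2;         // second context, shares list names with hRC
static HANDLE hListsReady;  // event signaled when the lists are built
static GLuint g_listBase;

DWORD WINAPI BuildListsThread(LPVOID param)
{
    HDC hDC2 = (HDC)param;
    wglMakeCurrent(hDC2, hRC2);         // bind RC2 in this thread only

    g_listBase = glGenLists(1);
    glNewList(g_listBase, GL_COMPILE);  // GL_COMPILE: record, don't execute
    /* ... issue the expensive geometry here ... */
    glEndList();
    glFinish();                         // make sure the list is complete

    wglMakeCurrent(NULL, NULL);
    SetEvent(hListsReady);              // tell the main thread it's safe
    return 0;
}

void StartBackgroundCaching(HDC hDC2, HGLRC hRC)
{
    hRC2 = wglCreateContext(hDC2);
    wglShareLists(hRC, hRC2);           // RC and RC2 now share list names
    hListsReady = CreateEvent(NULL, TRUE, FALSE, NULL);
    CreateThread(NULL, 0, BuildListsThread, (LPVOID)hDC2, 0, NULL);
    // The main thread keeps hRC current and keeps rendering; once
    // hListsReady is signaled it can glCallList(g_listBase) as usual.
}
```

Note that wglShareLists() is called before RC2 has created any objects, which is when sharing is guaranteed to succeed.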

The issue is, as skynet alludes to, that you can’t be doing something productive with OpenGL/the GPU in your main draw thread while you’re using OpenGL to build display lists in the background thread.

You can however overlap display list building with some non-OpenGL-call related CPU work via threading.

However, in that case there’s no reason to build display lists in a background thread. Use your main “GL” draw thread for the lists, and put the non-OpenGL, CPU-heavy tasks in another thread. Might be simpler.

Hi Skynet,

Thanks for the answer. I’ve copied the documentation for wglShareLists() below. It seems there is a limitation on list sharing across processes. Maybe it’s OK to share lists across threads within the same process?

“You can only share display lists with rendering contexts within the same process. However, not all rendering contexts in a process can share display lists. Rendering contexts can share display lists only if they use the same implementation of OpenGL functions. All client rendering contexts of a given pixel format can always share display lists.”

JD

Hi Dark,

I don’t quite understand what you meant by “not productive”. Maybe I have to explain my purpose more clearly. Basically, the graphics caching will be used for an animation. I want it to run in the background so the main thread is freed for the user to access other features until he/she is notified by the second thread that the animation is ready to go. That way the user has other things to do while waiting for the animation.

I understand your suggestion about shifting non-OpenGL tasks to another thread. But in my case, graphics caching alone is already time-consuming enough to frustrate the user.

Thanks for any further help.

JD

@Dark Photon

He didn’t say what he actually wants to put into these DLs. I just tried to come up with a way to create a DL in a second thread and share it with the main context. And since GL_COMPILE should not actually execute the commands, they should not interfere with the rendering (or whatever else he is doing with GL) in the main thread.

@iyoung77
Yes, you can share lists between contexts that are each bound to different threads within your application.

If you actually implement what’s been suggested, would you please come back later and report how it worked out?

OK, I will try it and then come back to report. The second context is a dummy one, so it can use a bitmap or anything other than a real window, right?
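For what it’s worth, a tiny hidden window that reuses the main window’s pixel format is one common way to get a dummy DC that satisfies the “same pixel format can always share” clause quoted above. A rough, untested sketch, where hMainDC and hRC stand in for the main window’s DC and context:

```c
// Sketch: create a hidden 1x1 window whose DC gets the same pixel format
// as the main window, so its context can share lists with hRC.
// Error checks elided; hMainDC and hRC are the main window's DC/context.
HWND hWnd = CreateWindow("STATIC", "", WS_POPUP, 0, 0, 1, 1,
                         NULL, NULL, GetModuleHandle(NULL), NULL);
HDC  hDC2 = GetDC(hWnd);

PIXELFORMATDESCRIPTOR pfd;
int fmt = GetPixelFormat(hMainDC);             // reuse main window's format
DescribePixelFormat(hMainDC, fmt, sizeof(pfd), &pfd);
SetPixelFormat(hDC2, fmt, &pfd);

HGLRC hRC2 = wglCreateContext(hDC2);
wglShareLists(hRC, hRC2);                      // share list names with hRC
```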

JD

“I don’t quite understand what you meant by ‘not productive’.”
By that I mean everything I’ve read states that, with a single GPU, there’s little to no benefit in trying to access that GPU via OpenGL simultaneously from multiple threads. To access GL simultaneously in multiple threads, each needs its own GL context, and GL context switches are expensive. Also, AFAIK there is no defined subset of the OpenGL API that is guaranteed to be CPU-only, with no per-GPU state locks, and thus guaranteed to thread efficiently against another thread that’s also making OpenGL calls to a GL context bound to the same GPU at the same time.

For instance, check out the Equalizer Parallel OpenGL FAQ.

The only exception I’ve seen to this is that some folks get great results from mapping GL buffers (glMapBuffer) in the GL thread, passing the mapped CPU memory pointers to a background thread, and doing the fill (e.g. memcpy) of that mapped buffer in the background thread. This saves time in the GL thread where it’d otherwise be blocked waiting on that memory fill to complete (which can take several ms for a couple MB of data) and can therefore push more GL commands down the pipe during this time. When complete, a message is sent back to the GL thread to unmap the buffer. Note that this background thread doesn’t need a GL context, and this is perfectly valid as the background thread makes no OpenGL API calls. It’s just shuffling data into CPU memory blocks.

But in the same breath, I’ll caveat that this is all second-hand info. I personally haven’t tried multithreaded fighting over a single GPU. So please do report back with your results – many will be interested – but read that FAQ above carefully to save yourself being stumped if you try something and just get a crash out of it.

Could be. Worth a shot anyway. The BG (background) thread needs its own GL context to tickle GL at the same time as the FG (foreground) thread, so I guess the question is…

…with each thread having its own GL context bound:

  1. can the BG thread call glNewList( GL_COMPILE ) … glEndList() and everything in between via the GL API without ever triggering per-GPU context locks that would starve out the FG thread’s GL stream, and

  2. will the GL context switching overhead incurred on every task switch between the FG and BG GL threads (each having its own GL context referring to the same GPU) be tolerable, or at least far less than the time needed to compile the display lists single-threaded in the foreground

…on any vendor’s GPU/driver combination. And if this works, is it defined or undefined behavior?

Undefined, methinks, which is the unsettling part. Kinda like reading from and writing to the same texture simultaneously in the old days. Will it work? …maybe.

Perhaps it is time to switch to vertex buffer objects… but that’s just a suggestion; it is a bit off topic.

No, quite relevant IMO. If we had the definitive “here’s how to make VBOs run as fast as our display lists” whitepaper from NVidia then heck, there’d be no sense in even considering display lists (except when you can pre-compile all you’ll ever need on startup, which we and the original poster apparently can’t, and even then only because DLists are super simple to build and use).
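For anyone following along, the VBO path being suggested looks roughly like this in GL 1.5-style fixed-function code – upload once with STATIC_DRAW, then draw many times, much like calling a compiled display list. A sketch only; numVerts and vertexData stand in for the poster’s real data, and extension loading and error checks are elided.

```c
/* One-time setup: upload geometry to a buffer object. */
GLuint vbo;
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, numVerts * 3 * sizeof(GLfloat),
             vertexData, GL_STATIC_DRAW);   /* hint: write once, draw often */

/* Per frame: roughly the equivalent of glCallList(). */
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_FLOAT, 0, (void *)0); /* offset into the bound VBO */
glDrawArrays(GL_TRIANGLES, 0, numVerts);
```

One relevant difference for this thread: glBufferData/glMapBuffer uploads are plain data transfers, so the expensive fill can be prepared in CPU memory on a background thread without a second GL context.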

Relevant thread link