
renderer in separate thread



karx11erx
05-29-2008, 07:48 AM
I have tried to put my rendering code in a separate thread, but it doesn't work. I suppose that's because the thread doesn't know the OpenGL rendering context (which got created beforehand in the main app). Is there a way to get this to work w/o having to create another context in the thread - just let the thread and the app share the same context? How?

Zengar
05-29-2008, 08:21 AM
An OpenGL context is bound to a thread; you cannot call OpenGL operations on it from another thread. In other words, all OpenGL operations on a single context have to be called from the thread that made the context current (with wglMakeCurrent).

karx11erx
05-29-2008, 08:36 AM
I should be able to make some context current from another thread, I just need to have the handle, don't I?

Zengar
05-29-2008, 08:53 AM
Yes, but it first has to be "disconnected" from the old thread (call wglMakeCurrent(0, 0) in the old thread).

karx11erx
05-29-2008, 09:27 AM
I know. I have done the following:

Setup (in main thread):


HDC currentDC;
HGLRC currentContext;

currentDC = wglGetCurrentDC ();
currentContext = wglGetCurrentContext ();

Main thread:


if (currentDC && currentContext && wglMakeCurrent (0, 0)) {
    WaitForRenderThreads ();
    if (RunRenderThreads (rtRenderFrame))
        nError = wglMakeCurrent (currentDC, currentContext);
    else { //fall back if thread failed
        nError = wglMakeCurrent (currentDC, currentContext);
        GameRenderFrame ();
    }
}
else { //fall back if context switch failed
    nError = glGetError ();
    GameRenderFrame ();
}

Render thread:


extern HDC currentDC;
extern HGLRC currentContext;

GLuint nError;
if (wglMakeCurrent (currentDC, currentContext))
    GameRenderFrame ();
else
    nError = glGetError ();
wglMakeCurrent (0, 0);

wglMakeCurrent() always returns 0 in the render thread, and nError is always 1282 (GL_INVALID_OPERATION). Context switching in the main thread works though. I don't know what else I would have to provide in the render thread (which has been spawned off the main thread after the OpenGL context in the main thread has been created, and has not created any OpenGL context of its own). Do I have to do some basic initializations to make wglMakeCurrent() work?

Edit:

I have found that wglMakeCurrent() doesn't work at all in the render thread, not even wglMakeCurrent (0, 0).

Edit 2:

It doesn't even work if I create a separate render context in the render thread first to have initialized something there. I am clueless.

Zengar
05-29-2008, 09:43 AM
Well, I would like to see your thread code (what RunRenderThreads and the like do). What you are doing somehow makes no sense to me, I see no functionality... For example, wglMakeCurrent(0, 0) can practically never fail, which is why your first "if" is useless. More importantly, you are trying to split a single resource (an rc) over multiple threads, which is a very useless thing to do.

Anyway, why are you doing this in the first place? If you want more performance, I must disappoint you: this is the best way to slow your rendering down. And to be honest, if the rest of your code is similarly overcomplicated, it is no wonder you can't get past 3 million triangles per second...

karx11erx
05-29-2008, 09:51 AM
This code is still experimental and contains a lot of diagnostics stuff. I will certainly minimize context changes before releasing it.

I wrote that even wglMakeCurrent (0, 0) fails in the render thread.

The render thread waits for a global semaphore to be set and then starts to render the entire scene (that's the GameRenderFrame() call).

The program has several renderers: a 2D renderer for the menu stuff and a 3D renderer for the game. I want the 3D renderer to run parallel to the AI, particle, lightning etc. handlers, because these take about 25% to 50% of the total frame time, so running them parallel to the renderer could theoretically halve frame time (on a multi core CPU). As e.g. AI and physics have to be synchronized with the renderer, the non-rendering stuff will wait for the renderer to finish rendering the current frame before starting the next AI/physics/etc. frame.

If you want to see the rendering code, get render/render.cpp, render/fastrender.cpp, ogl/ogl_lib.cpp, ogl/ogl_fastrender.cpp, lighting/dynlight.cpp and lighting/headlight.cpp from my SVN repository for a start.

The only question I need to get answered though is how to get context switching in the render thread to work. I have not been asking how to make my rendering code work or be faster here. It does work, so it's good enough for experimenting.

karx11erx
05-29-2008, 10:24 AM
I got it sorted out.

The initialization code launched the render thread and immediately proceeded to reenabling its render context, hence the render thread failed when claiming it.

Same went for the main frame loop: The context got disabled in the main thread, the render thread got started and the main thread immediately enabled its render context due to the render thread running in parallel; hence the render thread couldn't claim the context.

Ooh yeah, the pitfalls of parallel code execution.
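
For anyone who wants to see the ordering problem without an OpenGL window, here is a minimal sketch. MockContext is a made-up stand-in for an HGLRC (it is not a real API); all it models is that a context can be current in only one thread at a time. renderOnWorker() waits for the worker before reclaiming, while renderOnWorkerTooEarly() reproduces the mistake described above by reclaiming the context before the worker gets to it.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for an OpenGL context: like a real one, it can be
// current in at most one thread at a time.
class MockContext {
public:
    // Plays the role of wglMakeCurrent(dc, rc): fails while another thread
    // still holds the context.
    bool makeCurrent() {
        std::lock_guard<std::mutex> lock(mutex_);
        const std::thread::id self = std::this_thread::get_id();
        if (owner_ != std::thread::id{} && owner_ != self)
            return false;
        owner_ = self;
        return true;
    }
    // Plays the role of wglMakeCurrent(0, 0) in the owning thread.
    void release() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (owner_ == std::this_thread::get_id())
            owner_ = std::thread::id{};
    }
private:
    std::mutex mutex_;
    std::thread::id owner_;   // default-constructed id means "not current"
};

// Correct order: release, let the worker claim and release the context,
// wait for the worker, and only then reclaim the context.
bool renderOnWorker(MockContext& ctx) {
    ctx.release();
    bool workerOk = false;
    std::thread worker([&] {
        workerOk = ctx.makeCurrent();
        // ... rendering would happen here ...
        ctx.release();
    });
    worker.join();
    const bool reclaimed = ctx.makeCurrent();
    assert(reclaimed);
    return workerOk && reclaimed;
}

// The mistake: the caller reclaims the context before the worker claims it,
// so the worker's makeCurrent() fails.
bool renderOnWorkerTooEarly(MockContext& ctx) {
    ctx.release();
    ctx.makeCurrent();
    bool workerOk = false;
    std::thread worker([&] {
        workerOk = ctx.makeCurrent();
        if (workerOk)
            ctx.release();
    });
    worker.join();
    return workerOk;
}
```

In the real program the join() would be replaced by whatever signal tells the main thread that the render thread has released the context again.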

You could have seen that if you're so smart.

Zengar
05-29-2008, 04:07 PM
Well, I assumed that you at least knew enough about threads not to make this stupidest of mistakes :p

But I hope we can continue the discussion without insults...

I would really like to hear if your multi-threaded renderer yields good results when it is ready. If you don't mind, please keep us informed!

karx11erx
05-30-2008, 07:01 AM
Okeys. :)

Nobody is perfect, and neither are my coding skills (but they are really close ... ^_^).

Hehe.

My profiling code was ... incomplete. The 3D renderer consumes 95 percent of the entire frame time, so having it in an extra thread wasn't as useful as I had supposed.
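
A rough way to put numbers on this, as a sketch under idealized assumptions (two cores, perfect overlap, no synchronization cost): the parallel frame can never be shorter than the longer of the two parts. bestCaseSpeedup is a name made up for this example.

```cpp
#include <algorithm>
#include <cassert>

// Best-case speedup from running the renderer in parallel with all other
// per-frame work. renderFraction is the renderer's share of the
// single-threaded frame time (0..1). Real gains will be lower because
// synchronization and data copying are ignored here.
double bestCaseSpeedup(double renderFraction) {
    assert(renderFraction >= 0.0 && renderFraction <= 1.0);
    const double otherFraction = 1.0 - renderFraction;
    // The parallel frame takes as long as the slower of the two parts.
    return 1.0 / std::max(renderFraction, otherFraction);
}
```

With the renderer at 95 percent this gives about 1.05, so at best roughly 5 percent of the frame time can be saved; with an even 50/50 split it would give 2.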

But it was a nice exercise. :)

karx11erx
06-01-2008, 03:28 PM
I have been asked to add some detail about how to do this, so here it is. I am writing this off the top of my head, so I hope it's not too far from the truth. This stuff is Windows specific; you will have to find out about the corresponding Linux (X) and OS X calls yourself (I don't know them either).



//first get the current device and render contexts
HDC currentDC = wglGetCurrentDC ();
HGLRC currentRC = wglGetCurrentContext ();

Set up a render thread looking like this (I used SDL for this):


#ifdef _WIN32
#  include <windows.h>
#  define G3_SLEEP(_t) Sleep (_t)
#else
#  include <unistd.h>
#  define G3_SLEEP(_t) usleep ((_t) * 1000)
#endif

typedef enum renderStates {
    renderWait,
    renderInit,
    renderScene,
    renderQuit,
    renderExit
} tRenderStates;

//volatile keeps the compiler from caching the flag in the polling loops
volatile tRenderStates renderState = renderWait;

int _CDECL_ RenderThread (void *pThreadId)
{
    for (;;) {
        while (renderState == renderWait)
            G3_SLEEP (0); //wait a real short time
        if (renderState == renderQuit)
            break;
        else if (renderState == renderInit)
            wglMakeCurrent (currentDC, currentRC); //claim the context the main thread released
        else
            RenderScene ();
        renderState = renderWait; //tell the main thread this step is done
    }
    wglMakeCurrent (0, 0); //release the context so the main thread can reclaim it
    renderState = renderExit;
    return 0;
}

Game initialization and loop:


void RunGame (void)
{
    //do all the init stuff
    wglMakeCurrent (0, 0); //release the context before the render thread claims it
    renderState = renderInit;
    while (renderState != renderWait)
        G3_SLEEP (0); //wait for the render thread to finish init.
    do {
        DoAllOtherGameStuff ();
        while (renderState != renderWait)
            G3_SLEEP (0); //wait for any prev render frame to finish
        //copy any data that must not be changed during rendering
        renderState = renderScene;
    } while (!bQuitGame); //bQuitGame: placeholder for the quit game condition
    while (renderState != renderWait)
        G3_SLEEP (0); //let the last frame finish so renderQuit is not overwritten
    renderState = renderQuit;
    while (renderState != renderExit)
        G3_SLEEP (0);
    wglMakeCurrent (currentDC, currentRC);
}
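
For comparison, here is a portable sketch of the same handshake using std::thread and std::atomic instead of a plain global and Sleep (0). It is not the code from my program: runFrames is a made-up name, the counter stands in for RenderScene (), and the OpenGL context hand-over is left out so that it runs anywhere.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

enum class RenderState { Wait, Scene, Quit, Exit };

// Runs `frames` frames with the renderer in its own thread and returns how
// many frames the render thread actually processed.
int runFrames(int frames) {
    std::atomic<RenderState> state{RenderState::Wait};
    int rendered = 0;   // touched only by the render thread until join()

    std::thread renderer([&] {
        for (;;) {
            const RenderState s = state.load();
            if (s == RenderState::Wait) {
                std::this_thread::yield();   // nothing to do yet
                continue;
            }
            if (s == RenderState::Quit)
                break;
            ++rendered;                      // stands in for RenderScene ()
            state.store(RenderState::Wait);  // frame done
        }
        state.store(RenderState::Exit);
    });

    for (int i = 0; i < frames; ++i) {
        // All other game work would run here, overlapping the previous frame.
        while (state.load() != RenderState::Wait)
            std::this_thread::yield();       // previous frame still rendering
        state.store(RenderState::Scene);
    }
    // Let the last frame finish before asking the thread to quit; otherwise
    // the renderer's "frame done" store could overwrite the Quit request.
    while (state.load() != RenderState::Wait)
        std::this_thread::yield();
    state.store(RenderState::Quit);
    renderer.join();
    assert(state.load() == RenderState::Exit);
    return rendered;
}
```

Calling runFrames (100) should report 100 rendered frames. The atomic makes the flag accesses well defined across threads, which a plain enum variable does not guarantee.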

I have not been using SDL's semaphore handling for synchronizing threads because apparently it waits at least 1 ms before starting a thread, and that is way too slow, at least if you are using a lot of threads, or are computing a lot of sequential tasks and just parallelizing each of the tasks. That is something you can do e.g. with time-consuming subtasks of the main or render threads. Just as an example, you could speed up sorts that way by splitting the elements to be sorted between two threads, quick sorting each half and then merging the two sorted subsets back into one in linear time. Another application would be to have two threads executing Perlin functions in parallel, e.g. for lightning bolt computation.
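
The two-thread sort mentioned above could look roughly like this. It is a sketch using the C++ standard library rather than SDL, and parallelSort is a name made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>
#include <vector>

// Sorts v by sorting each half in its own thread and merging the results.
void parallelSort(std::vector<int>& v) {
    const auto half = static_cast<std::vector<int>::difference_type>(v.size() / 2);
    const auto mid = v.begin() + half;

    // Lower half in a helper thread, upper half in the calling thread.
    // The two ranges do not overlap, so no locking is needed.
    std::thread lower([&] { std::sort(v.begin(), mid); });
    std::sort(mid, v.end());
    lower.join();

    // Merge the two sorted halves back into one sorted range.
    std::inplace_merge(v.begin(), mid, v.end());
    assert(std::is_sorted(v.begin(), v.end()));
}
```

Starting a thread has a cost of its own, so this only pays off for fairly large arrays, or when the helper thread already exists and just gets handed the work.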

You need to make sure no data that the renderer needs is changed while it renders a frame. Depending on time spent in certain parts of the renderer, you can use semaphores to keep the main thread from changing that data (for very short times), or make copies of the data for the renderer.

I had used something like the following for the particle system:


bool bRenderParticles = false;
bool bUpdateParticles = false;

void RenderParticles (void)
{
    while (bUpdateParticles)
        G3_SLEEP (0);
    bRenderParticles = true;
    //render all particles
    bRenderParticles = false;
}


void UpdateParticles (void)
{
    while (bRenderParticles)
        G3_SLEEP (0);
    bUpdateParticles = true;
    //now create, move, delete particles
    //this would heavily interfere with the renderer
    //and copying all that data would be way too much
    bUpdateParticles = false;
}

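This code assumes that the particle renderer and the updater never start at the same moment: checking the other flag and setting your own are two separate steps, so both threads could pass the check together. If that can happen, you will have to make sure one thread is prioritized and the other suspended until the prioritized one finishes.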

You may want to use that type of approach for all effects that only take a very short time to be rendered, and copy stuff like actor states to a buffer used during rendering.
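
Copying state for the renderer could be sketched like this. ActorState and ActorSnapshot are made-up names and the three floats are placeholder data; the point is only that the renderer works on its own copy while the game thread keeps changing the live data.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

struct ActorState { float x, y, z; };   // hypothetical per-actor data

class ActorSnapshot {
public:
    // Game thread: copy the live state once per frame, before the renderer
    // is told to start.
    void publish(const std::vector<ActorState>& live) {
        std::lock_guard<std::mutex> lock(mutex_);
        snapshot_ = live;
    }

    // Render thread: fetch a private copy to use for the whole frame.
    std::vector<ActorState> acquire() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return snapshot_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<ActorState> snapshot_;
};

// Usage sketch: changes made after publish() do not reach the renderer's copy.
void snapshotExample() {
    ActorSnapshot shared;
    std::vector<ActorState> live{{1.0f, 2.0f, 3.0f}};
    shared.publish(live);
    live[0].x = 99.0f;                   // game thread keeps simulating
    const std::vector<ActorState> frame = shared.acquire();
    assert(frame.size() == 1 && frame[0].x == 1.0f);
}
```

The lock is held only for the duration of the copy, so the two threads block each other far less than they would if the renderer locked the live data for a whole frame.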

Actually there is a lot of theory and math about asynchronous thread execution and about solving the challenges and problems arising with it (like deadlocks and thrashing, or proper balancing), but in this simple case my code samples may point you in the right direction.