nVidia multithreading issues

I’m trying to do rendering in a second thread. Actually, I already managed to do it. But now, as soon as I’ve used the second thread for rendering even once, my app crashes in nvogl.dll with an “unknown software exception” at program exit. Additionally, I magically “lose” 1 ms per frame; it looks like the time is spent when the rendering context is handed over to another thread.

Currently, “Threaded Optimization” is disabled in the driver. If I set it to “enable” or “auto”, wglMakeCurrent() fails sporadically for no apparent reason. But I’m not worried about that, since that Threaded Optimization stuff gains nothing anyway.

This happens on a GF8800GTS (ForceWare 175.16). I have tested the same application on an ATI FireGL 5600; none of the problems could be seen there.

Are there any known issues with multithreaded rendering on NVIDIA cards? Anything I should look out for?
Just to clarify, this is how I use the threads:

  1. MainThreadA:
    create context, create textures, VBOs, etc.

  2. Per Frame:
    unbind context with wglMakeCurrent(0,0)
    hand over context to ThreadB

  3. Thread B:
    render scene
    unbind context
    signal finished frame

  4. MainThreadA:
    bind context, render additional stuff
    call wglSwapBuffers()
    Goto 2:

Currently, this might seem like poor use of multithreaded rendering. But in the future there will be several ThreadBs, each working on a different part of the final image on its own GPU. ThreadA will compose the parts and present the result.
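In code, that per-frame hand-off looks roughly like this (a minimal sketch; the names and the event-based synchronization are my own choices, error handling mostly omitted):

```cpp
#include <windows.h>

// Shared state (illustrative names, created during step 1 in MainThreadA).
// g_renderStart / g_renderDone are auto-reset events from
// CreateEvent(NULL, FALSE, FALSE, NULL).
HDC    g_hdc;
HGLRC  g_ctx;
HANDLE g_renderStart, g_renderDone;

// Steps 2 and 4: per-frame loop in MainThreadA.
void FrameLoop()
{
    for (;;)
    {
        wglMakeCurrent(0, 0);                         // step 2: unbind...
        SetEvent(g_renderStart);                      // ...and hand over to ThreadB
        WaitForSingleObject(g_renderDone, INFINITE);  // wait for the frame

        if (!wglMakeCurrent(g_hdc, g_ctx))            // step 4: take it back
            return;                                   // log GetLastError() here

        // ... render additional stuff ...
        SwapBuffers(g_hdc);
    }
}

// Step 3: ThreadB.
DWORD WINAPI RenderThread(LPVOID)
{
    for (;;)
    {
        WaitForSingleObject(g_renderStart, INFINITE);
        wglMakeCurrent(g_hdc, g_ctx);   // bind the handed-over context

        // ... render scene ...

        wglMakeCurrent(0, 0);           // unbind (this implies a flush)
        SetEvent(g_renderDone);         // signal finished frame
    }
}
```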

thanks for your time!

Do you have a context per thread?

Later, I will have one context per GPU, and each GPU will render in one thread. At the moment I have just one context, which alternates between the main thread and the render thread. Of course, that one context is only bound to (and used by) one thread at any time, so currently there is no “real” parallel rendering. But that will change as soon as I can use 2+ GPUs (for instance, for stereo rendering).

I wonder if your problem would be solved simply by having a unique context for each thread that accesses GL. I had a similar problem with NVidia drivers before: one thread did the GL setup (once, at program start), and a separate thread did all GL access afterwards. It wouldn’t work with a single context, even though there was no overlap in usage between the threads.
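For what it’s worth, a sketch of that setup: two contexts that share their objects via wglShareLists, so textures/VBOs created in one thread are visible in the other (names and structure are mine):

```cpp
// One context per thread, object sharing via wglShareLists (sketch).
HGLRC ctxMain   = wglCreateContext(g_hdc);
HGLRC ctxWorker = wglCreateContext(g_hdc);

// Share textures, VBOs, display lists etc. between the two contexts.
// Do this before ctxWorker is made current for the first time.
if (!wglShareLists(ctxMain, ctxWorker))
    /* handle failure */;

// The main thread binds ctxMain once and keeps it bound:
wglMakeCurrent(g_hdc, ctxMain);

// The worker thread binds ctxWorker once in its own thread function:
//     wglMakeCurrent(g_hdc, ctxWorker);
// (Having both threads draw to the same window at the same time is its own
// can of worms; workers would typically render to FBOs/pbuffers instead.)
```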

Can you post some code here? If the app has multiple threads and one context, it must unbind the context from the running thread before binding it on another thread. Because of that, the app needs good synchronization between the threads. Anyway, frequently rebinding the context may lead to performance issues, so it’s better to have one rendering thread with a message/job queue, and to have the other threads just post messages/jobs to it.
In my applications I usually create the window in the UI thread, then spawn the rendering thread and create the context there, and post notification messages to the render thread when the window gets resized, minimized, etc. That keeps the rendering thread free of window message processing, which stays in the UI thread.
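A bare-bones sketch of such a queue (C++11 primitives for brevity; Win32 events/critical sections work just as well):

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <condition_variable>

std::deque<std::function<void()>> g_jobs;
std::mutex              g_jobsMutex;
std::condition_variable g_jobsCv;

// Any thread: post work for the render thread (e.g. "window resized").
void PostJob(std::function<void()> job)
{
    {
        std::lock_guard<std::mutex> lock(g_jobsMutex);
        g_jobs.push_back(std::move(job));
    }
    g_jobsCv.notify_one();
}

// Render thread: the context is bound here once and never rebound.
void RenderThreadMain()
{
    // wglMakeCurrent(g_hdc, g_ctx);  // bind once, keep forever
    for (;;)
    {
        std::unique_lock<std::mutex> lock(g_jobsMutex);
        g_jobsCv.wait(lock, []{ return !g_jobs.empty(); });
        std::function<void()> job = std::move(g_jobs.front());
        g_jobs.pop_front();
        lock.unlock();

        job();  // e.g. handle the resize, then draw a frame
    }
}
```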

If the app has multiple threads and one context, it must unbind the context from the running thread before binding it on another thread.

I’m doing that. In the future I plan to use one thread/context per GPU, each permanently bound.

My current concern is not how to do the multithreaded rendering. I just want to know if there are certain non-obvious pitfalls or driver bugs that I should be aware of. It seems I’m experiencing something like this bug:

http://developer.nvidia.com/forums/index.php?showtopic=318

The constant loss of 1 ms per frame is still unsolved, but I’m hoping it will go away once I start binding each context permanently to one thread.

Update:

New problem:
When using occlusion queries in the threads, the driver quickly locks up somewhere in the middle of nvogl32.dll.
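The query usage itself is just the standard GL 1.5 pattern, roughly this (simplified, not the actual app code):

```cpp
// Occlusion query as used in the render thread (standard GL 1.5 API).
GLuint query;
glGenQueries(1, &query);

glBeginQuery(GL_SAMPLES_PASSED, query);
// ... draw the occludee / bounding volume ...
glEndQuery(GL_SAMPLES_PASSED);

// Poll until the result is available, then fetch it.
GLuint available = 0;
while (!available)
    glGetQueryObjectuiv(query, GL_QUERY_RESULT_AVAILABLE, &available);

GLuint samples = 0;
glGetQueryObjectuiv(query, GL_QUERY_RESULT, &samples);

glDeleteQueries(1, &query);
```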

The crash at exit is probably resolved; the cleanup code was broken.

Based on my experience, you should not use OpenGL in a multithreaded fashion, except for the well-tested / often-used driver code paths, such as resource (texture/geometry) loading in a separate, non-rendering thread.

The drivers are riddled with race conditions and threading problems AFAIK, and the problems you are seeing have the stench of that.

My hope is that when I finally can do some tests in a “production environment” (meaning two Quadro cards supporting NV_gpu_affinity, or some AMD/ATI equivalent), those multithreading issues will eventually go away.
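For reference, the per-GPU plan with WGL_NV_gpu_affinity would look roughly like this (a sketch based on the extension spec; the HGPUNV type and the function-pointer typedefs come from wglext.h):

```cpp
#include <windows.h>
#include <GL/gl.h>
#include "wglext.h"

// Sketch: one affinity DC + context per GPU (WGL_NV_gpu_affinity).
PFNWGLENUMGPUSNVPROC         wglEnumGpusNV;
PFNWGLCREATEAFFINITYDCNVPROC wglCreateAffinityDCNV;

void CreatePerGpuContexts()
{
    wglEnumGpusNV = (PFNWGLENUMGPUSNVPROC)
        wglGetProcAddress("wglEnumGpusNV");
    wglCreateAffinityDCNV = (PFNWGLCREATEAFFINITYDCNVPROC)
        wglGetProcAddress("wglCreateAffinityDCNV");

    HGPUNV gpu;
    for (UINT i = 0; wglEnumGpusNV(i, &gpu); ++i)
    {
        HGPUNV gpuList[2] = { gpu, NULL };   // NULL-terminated GPU list
        HDC affinityDC = wglCreateAffinityDCNV(gpuList);
        // ... choose and set a pixel format on affinityDC here ...
        HGLRC ctx = wglCreateContext(affinityDC);
        // Hand (affinityDC, ctx) to a dedicated thread for GPU i, which
        // binds it once with wglMakeCurrent() and keeps it bound.
    }
}
```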