Extremely weird performance issue

Hi folks. We have been struggling with a very strange performance issue in our flight simulation app.

We have received reports of very bad frame rates on GeForce 7 series cards across various driver versions. It also seems to happen primarily on dual-core CPUs.

The expected frame rate is about 70-100 fps when looking horizontally over the terrain. The low frame rates we have seen sit at a seemingly constant 20 fps.

Now, what we just found out is that if you look straight up for about 5 seconds, the frame rate suddenly pops back to where it should be, even when looking down at the terrain again. It seems that once you manage to keep the frame rate above 60 for a few seconds, the performance problem corrects itself.

Our test computer for this issue is a Dell XPS 1710 with a GeForce 7950. The NVIDIA driver version is 94.22, but we’ve tried different versions as well.

This only happens on Windows. On the same computer under Linux, we get full performance all the time.

We have tried with and without VBOs and display lists for the terrain rendering.

Any ideas why this happens?

Cheers!

  1. Go into Regedit and determine the current primary display card
    by looking in HKEY_LOCAL_MACHINE\HARDWARE\DeviceMap\Video
    and note the GUID (globally unique identifier assigned by Windows)
    for the entry “\device\video0”, which is the long string in braces { }
    at the end of the entry.

  2. Go to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Video\{guid}\0000,
    where {guid} is the GUID from the previous step.

  3. Under the “0000” key, create a new DWORD value called OGL_ThreadControl
    and give it a value of 2. This disables multithreading in the driver
    for all OpenGL [OGL] applications.
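
If it helps, the same change can be packaged as a .reg file. This is only a sketch: {guid} must be replaced with the GUID noted in step 1, and merging the file still requires administrator rights.

Windows Registry Editor Version 5.00

; Replace {guid} with the GUID noted for "\device\video0" in step 1.
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Video\{guid}\0000]
"OGL_ThreadControl"=dword:00000002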

This is due to a driver “optimization” in NVIDIA drivers dating from the 8x.xx series. It is described in the NVIDIA driver release notes.

Thanks! That fixed it. But most of our users are not that comfortable with editing registry settings.

We could of course do the registry changes in the application, but I don’t know if I’m very comfortable with that either.

Is there any other way we could programmatically disable this threading locally in our application?

Cheers

This setting is system wide, but you can modify the registry from your code via the Windows API.

In some recent drivers I have also seen an option to change it from the display control panel: select Performance & Quality Settings, then Advanced Settings, and look for “Threaded optimization”. I have this option on my Quadro NV card.
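
For reference, a minimal sketch of the Windows-API route might look like the following. The {guid} subkey is a placeholder (it is machine specific and has to be looked up as in the steps above), the helper name is made up for illustration, and writing under HKLM requires administrator rights.

#include <windows.h>

// Sketch: write OGL_ThreadControl = 2 (disable driver multithreading for all
// OpenGL apps). The {guid} part of the path is a placeholder and must be
// looked up under HKLM\HARDWARE\DeviceMap\Video first.
bool DisableDriverThreading(DWORD* oldValue, bool* hadOldValue)
{
    const char* subKey =
        "SYSTEM\\CurrentControlSet\\Control\\Video\\{guid}\\0000";

    HKEY key = NULL;
    if (RegOpenKeyExA(HKEY_LOCAL_MACHINE, subKey, 0,
                      KEY_QUERY_VALUE | KEY_SET_VALUE, &key) != ERROR_SUCCESS)
        return false;

    // Remember the previous value (if any) so it can be restored on exit.
    DWORD size = sizeof(*oldValue);
    *hadOldValue = RegQueryValueExA(key, "OGL_ThreadControl", NULL, NULL,
                                    reinterpret_cast<BYTE*>(oldValue),
                                    &size) == ERROR_SUCCESS;

    DWORD newValue = 2;  // 2 = disable multithreading in the driver
    LONG err = RegSetValueExA(key, "OGL_ThreadControl", 0, REG_DWORD,
                              reinterpret_cast<const BYTE*>(&newValue),
                              sizeof(newValue));
    RegCloseKey(key);
    return err == ERROR_SUCCESS;
}

// On exit: write *oldValue back with RegSetValueExA if hadOldValue was true,
// or remove the value with RegDeleteValueA otherwise.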

I’ve seen this problem too, but I haven’t tried the registry setting (I’ll try this later though). It showed up for me whenever I started rendering to a cube map via FBO. Sometimes the framerate pops back to where it should be and sometimes it just stays low. Not sure if this information helps anyone, but I figured I’d throw it out there in case it’s related to cubemaps or FBOs somehow. =)

Kevin B

You could read out the registry setting and restore it upon application exit.

Or you could have an option in your settings, e.g. “disable driver threading”, with a small explanation, so that the user can choose whether to disable it.

Other than that, I don’t know; it’s a bad situation.

Jan.

NVIDIA should be profiling their drivers at runtime and flipping the switch themselves if it’s messing things up.

One issue with NV ‘profiling’ (especially with visual simulation apps) is that the developer/integrator doesn’t want the performance to vary too much from one frame to the next. Predictable ‘medium’ performance is much easier to deal with and manage than occasional ‘fast’ and intermittent ‘slow’ for the same scene content.
As well as the OpenGL threading setting, try selecting a ‘visual simulation’ profile, as it will reduce the amount of intelligence and analysis that the driver applies to the OpenGL command stream.

You can work around the NVIDIA driver thread by setting the process affinity to a single CPU when you create the window and OpenGL context, then restoring it afterwards (use the sysMask).

This is the only way I’ve found to control NVIDIA’s threading from the app. Not nice, but it works pretty well.

For example:

 
// Pin the process to CPU 0 while the window and GL context are created,
// so the driver behaves as if it were on a single-core system.
::SetProcessAffinityMask(::GetCurrentProcess(), 0x1);

CreateWindowEx(...);
wglCreateContext(...);

// Read back the system-wide affinity mask and restore it, so the rest
// of the app can use all cores again.
DWORD_PTR procMask, sysMask;
::GetProcessAffinityMask(::GetCurrentProcess(), &procMask, &sysMask);
::SetProcessAffinityMask(::GetCurrentProcess(), sysMask);

In this day and age, when everybody expects your programs to take advantage of dual/quad-core machines, I don’t think setting the process affinity is a good decision.

Y.

Since I restore it afterwards, it’s not an issue for the rest of the app. (We do run a bunch of intensive threads and make use of multiple cores.)

I just want some control over NVIDIA’s GL driver, since its auto-threading feature gives me problems.

eldritch, could you post a link to your application so NVIDIA engineers can try to fix this?
Using the OGL_ThreadControl registry key is the best solution right now.

eldritch, can you get me a repro application showing the problem? We’ll take a look.

Thanks,
Barthold
NVIDIA

Our game shows a similar problem. In our setup, though, we have complete control over the target computers, so I just disable threaded optimization from the driver control panel.

Does your game use QueryPerformanceCounter for timing? On dual-core systems, each core has its own performance counter, and those counters are not synced, so you never know which core executes the QueryPerformanceCounter call. In some cases you can get a negative delta time between two frames (which should be impossible, right?). This negative delta time might screw up your physics, AI, and other calculations. There is a fix for XP; Google for KB896256.
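
For what it’s worth, a common workaround (a sketch only, not code from any app in this thread) is to pin the QueryPerformanceCounter reads to a single core and to clamp negative deltas defensively:

#include <windows.h>

// Returns the time since the previous call in seconds, never negative.
double FrameDeltaSeconds()
{
    static LARGE_INTEGER freq;   // zero-initialized (static storage)
    static LARGE_INTEGER last;

    if (freq.QuadPart == 0)
        QueryPerformanceFrequency(&freq);

    // Temporarily pin this thread to CPU 0 so consecutive reads always
    // come from the same core's counter.
    DWORD_PTR oldMask = SetThreadAffinityMask(GetCurrentThread(), 1);
    LARGE_INTEGER now;
    QueryPerformanceCounter(&now);
    SetThreadAffinityMask(GetCurrentThread(), oldMask);

    double dt = 0.0;
    if (last.QuadPart != 0)
        dt = double(now.QuadPart - last.QuadPart) / double(freq.QuadPart);
    last = now;

    // Guard against out-of-sync counters: never report a negative delta.
    return dt < 0.0 ? 0.0 : dt;
}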

We had the exact same problem, and it had been haunting me for a long time. MarcusL’s “SetProcessAffinityMask” code fixed it straight away. Thanks a lot.

I will send you a mail, barthold, and I can supply a repro application if you have not yet received one.

Greetz David

Guys, please take a look at this. For optimal performance, try not to make many queries into the OpenGL driver each frame, for example glGetError(), glGetFloatv(), etc.

http://developer.nvidia.com/object/multi-thread-gdc-2006.html

Barthold
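
One simple way to follow that advice is to compile per-call glGetError() checks only into debug builds. This is just a sketch; the GL_CHECK macro name is made up for illustration.

#include <windows.h>   // must precede GL/gl.h on Windows
#include <GL/gl.h>
#include <cstdio>

#ifdef _DEBUG
  // Debug builds: check for errors after every wrapped call.
  #define GL_CHECK(call)                                                    \
      do {                                                                  \
          call;                                                             \
          GLenum err = glGetError();                                        \
          if (err != GL_NO_ERROR)                                           \
              std::fprintf(stderr, "GL error 0x%04X after %s (%s:%d)\n",    \
                           err, #call, __FILE__, __LINE__);                 \
      } while (0)
#else
  // Release builds: no per-frame round-trips into the driver.
  #define GL_CHECK(call) call
#endif

// Usage:
//   GL_CHECK(glBindTexture(GL_TEXTURE_2D, textureId));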

It reminds me of the Apple approach: using the Core Duo to improve performance in games like WoW. I think they reported some huge improvement, like 400%, or maybe my memory is failing me.

I’ve read that before. It’s interesting, but it doesn’t say what happens if all cores are working at 100%. I am not certain there is a benefit to a driver thread then. (But I’m not sure; multi-threading is not intuitive… yet.) :)

Our app does a fair number of glGet() calls, and that’s not something I can easily fix at the moment.

However, I don’t see why I should be getting variations in performance just because of that. Is it because the GL driver dynamically switches multi-threading on and off?

Btw, our render thread is pretty much only doing rendering, so I’m not convinced that an async layer in the GL driver helps that much. At least not in our case, since our current usage pattern implies a lot of synchronization.

Opening this thread again, since the latest NVIDIA drivers for XP (162.18) seem to get around my fix and use threading no matter what.

So, CPU usage jumped by 20-25% on my quad-core (i.e. almost a whole extra core was used) and FPS is still the same.

Arrgh. And no option to disable this in the control panel either. Jeez.