Vertex shader in program 2 is being recompiled based on GL state



RealtimeSlave
07-31-2017, 12:21 PM
Hello,

Using a debug OpenGL context, I frequently get the following warning:

Program/shader state performance warning: Vertex shader in program 2 is being recompiled based on GL state.

I checked which of my programs has handle 2 and it is my "render depth only"-shader program
which only gets a vertex shader attached but no fragment shader (I use it for rendering shadow
map depth values only).

According to https://www.khronos.org/opengl/wiki/Fragment_Shader it is legit to attach no fragment shader:

Fragment shaders are technically an optional shader stage. If no fragment shader is used, then the color values of the output Fragment (https://www.khronos.org/opengl/wiki/Fragment) have undefined values. However, the depth and stencil values for the output fragment have the same values as the inputs.

This is useful for doing rendering where the only useful output is the fragment's depth, and you want to use the depth computed by the system, rather than some other depth. Such depth-only rendering is used for shadow mapping operations as well as depth pre-pass optimizations.



My vertex shader:

#version 150 core
uniform mat4 u_MVPM;
in vec3 a_Position;
void main(void)
{
    gl_Position = u_MVPM * vec4(a_Position, 1.0);
}

Does anyone know why this shader program is being "recompiled"? As far as I know, recompilation is bad for performance.
Do I need to change my vertex shader somehow? Maybe I forgot to write some varying output? Thanks!

My hardware and driver specs:


Operating system: Windows 10 Home, 64-bit
DirectX version: 12.0
GPU processor: GeForce GTX 1080
Driver version: 384.94
Direct3D API version: 12
Direct3D feature level: 12_1
CUDA cores: 2560
Core clock: 1607 MHz
Memory data rate: 10010 MHz
Memory interface: 256-bit
Memory bandwidth: 320.32 GB/s
Total available graphics memory: 16367 MB
Dedicated video memory: 8192 MB GDDR5X
System video memory: 0 MB
Shared system memory: 8175 MB


UPDATE: It seems to happen only randomly, sometimes also other shaders. Really flaky behavior.

Silence
08-01-2017, 12:47 AM
This can happen if you change some OpenGL state (blending, depth). It could also happen to the shaders which simulate the fixed-function pipeline.

john_connor
08-01-2017, 03:06 AM
Just a guess: are you using the same (vertex) shader object for 2 (or more) different program objects?

GClements
08-01-2017, 04:33 AM
[Quoting Silence:] This could also happen to the shaders which simulate the fixed function pipeline.

That would seem the most likely culprit here, as there's nothing in the vertex shader which would explain recompilation.

Without a fragment shader, the implementation will generate one which calculates the correct colour for each fragment using the fixed-function pipeline. It will probably do this regardless of whether the framebuffer has any colour attachments. It's likely that the shader will be recompiled if you change any relevant state (lighting, texturing, etc).

I suggest adding a fragment shader which just writes zero to its output. This will be much simpler than the fragment shader used to emulate the fixed-function pipeline, and less likely to be recompiled.
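
A minimal sketch of that suggestion, assuming a C++ application that keeps its GLSL sources as string constants. The names kDepthOnlyFragmentSource and o_Color are hypothetical and do not come from the thread; the shader only writes zero to a single colour output and uses the same #version 150 core as the vertex shader in the first post.

```cpp
#include <string>

// Hypothetical constant: GLSL source for a trivial fragment shader to pair
// with the depth-only vertex shader. "o_Color" is an assumed output name.
const std::string kDepthOnlyFragmentSource =
    "#version 150 core\n"
    "out vec4 o_Color;\n"
    "void main(void)\n"
    "{\n"
    "    o_Color = vec4(0.0);\n"
    "}\n";
```

The source would be compiled with glCreateShader(GL_FRAGMENT_SHADER), glShaderSource and glCompileShader, then attached with glAttachShader before glLinkProgram. Whether this makes the warning disappear on a given driver has not been verified here.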

Silence
08-01-2017, 04:54 AM
[Quoting GClements:] That would seem the most likely culprit here, as there's nothing in the vertex shader which would explain recompilation.

Without a fragment shader, the implementation will generate one which calculates the correct colour for each fragment using the fixed-function pipeline. It will probably do this regardless of whether the framebuffer has any colour attachments. It's likely that the shader will be recompiled if you change any relevant state (lighting, texturing, etc).

I suggest adding a fragment shader which just writes zero to its output. This will be much simpler than the fragment shader used to emulate the fixed-function pipeline, and less likely to be recompiled.
[End of quote]

Yes, but as the OP explicitly said, the program name (here 2) is the name of one of his own programs (the depth-only one). So my guess is that the OP is simply enabling/disabling depth testing, or changing the depth function or mask. The driver will then rebuild the shader as an optimization, in order to activate or inhibit some of its statements.
Removing such state change calls might help here.

RealtimeSlave
08-02-2017, 01:09 PM
[Quoting john_connor:] just a guess: are you using the same (vertex) shader object for 2 (or more) different program objects ?

Hello, no. For each program I always attach a new vertex shader and a new fragment shader.

RealtimeSlave
08-02-2017, 01:21 PM
[Quoting Silence:] Removing such state change calls might help in here.

Which state change calls exactly? I think I have no state change calls which can just be removed without breaking functionality. Or maybe I misunderstood you

Dark Photon
08-02-2017, 08:55 PM
[Quoting RealtimeSlave:]
[NVIDIA ARB_debug_output performance warning:]
Program/shader state performance warning: Vertex shader in program 2 is being recompiled based on GL state.
...
Does anyone know why this shader program is being "recompiled"? As far as I know, recompilation is bad for performance.

[Quoting RealtimeSlave:] Which state change calls exactly? I think I have no state change calls which can just be removed without breaking functionality. Or maybe I misunderstood you


Hi RealtimeSlave.

I don't work for NVidia so I can't give you all the details, but I can give you some insight to go on. To save duplication, see this thread (https://www.opengl.org/discussion_boards/showthread.php/199464-Effects-of-driver-sided-program-re-linking) for more info.

In short, the OpenGL state / shader boundary isn't defined in GL drivers the way the spec defines it. Sometimes, GL state external to GL shader source is compiled into the shader. Change that state? The driver needs a different compiled shader under-the-hood (even though at the GL API level, it's just one shader), and your app is likely to "hitch" or possibly break frame while the driver goes off to generate that shader variant. I think that's what's going on here inside the NVidia driver.

For a vertex shader triggering this, I'd focus your digging on state the driver might pre-bake into a vertex shader, such as related to vertex attribute formats, texture formats, etc.

Given the shader, you should be able to put it through some methodical testing (change state; render; change state; render. Did the NVidia driver kick out any warnings? If not, rinse and repeat with different state) and ideally nail down which state change in your app is triggering this.

If/when you do figure it out, please do follow up! I for one am very interested in what state(s) you find triggering recompilation on NVidia, and I know I'm not alone!

Silence
08-03-2017, 12:36 AM
[Quoting RealtimeSlave:] Which state change calls exactly?

What I had in mind is any state change affecting depth functionality (see my post above). But I might be wrong. See also GClements' post about the fact that you don't have a fragment shader.


[Quoting RealtimeSlave:] I think I have no state change calls which can just be removed without breaking functionality. Or maybe I misunderstood you

All that is asked is that you run some tests, even if they break your rendering, in order to find what is causing these recompilations.

RealtimeSlave
08-03-2017, 04:37 AM
The problem is really the flaky behavior. Some days I never see this warning. And when I get the warning, I cannot reproduce it a second time with the same shader. So I cannot even reproduce it properly, probably due to some race condition.

Dark Photon
08-04-2017, 08:23 PM
[Quoting RealtimeSlave:] The problem is really the flaky behavior. Some days I never see this warning. And when I get the warning, I cannot reproduce it a second time with the same shader. So I cannot even reproduce it properly, probably due to some race condition.

Hmm. That's interesting.

Do you always run:
1) with a debug context,
2) with a glDebugMessageCallback() plugged in, and
3) with logging of those performance warnings enabled?

(For performance reasons, I'd expect the answer to be no.) You're of course going to need to do so to be able to get this perf warning.
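
As an illustration of the three points above, here is a hedged sketch of a debug callback that filters for this particular warning. It is written with plain C++ types so it builds without GL headers; recompiledProgram and onGlDebugMessage are hypothetical names, the message wording is assumed to match the text quoted in this thread, and the constant simply mirrors the value of GL_DEBUG_TYPE_PERFORMANCE.

```cpp
#include <cstddef>
#include <cstdio>
#include <optional>
#include <string_view>

// Numeric value of GL_DEBUG_TYPE_PERFORMANCE (OpenGL 4.3 / KHR_debug),
// repeated here so the sketch compiles without GL headers.
constexpr unsigned int kGlDebugTypePerformance = 0x8250;

// Hypothetical helper: returns the program handle named in a warning such as
// "... Vertex shader in program 2 is being recompiled based on GL state.",
// or std::nullopt for any other message. The wording is assumed from the thread.
std::optional<unsigned int> recompiledProgram(unsigned int type, std::string_view message)
{
    if (type != kGlDebugTypePerformance) return std::nullopt;
    if (message.find("is being recompiled") == std::string_view::npos) return std::nullopt;

    constexpr std::string_view marker = "in program ";
    const std::size_t pos = message.find(marker);
    if (pos == std::string_view::npos) return std::nullopt;

    unsigned int handle = 0;
    bool sawDigit = false;
    for (std::size_t i = pos + marker.size(); i < message.size(); ++i) {
        const char c = message[i];
        if (c < '0' || c > '9') break;
        handle = handle * 10 + static_cast<unsigned int>(c - '0');
        sawDigit = true;
    }
    if (!sawDigit) return std::nullopt;
    return handle;
}

// Callback with the parameter list of GLDEBUGPROC, written with plain types.
// With real GL headers, use GLenum/GLuint/GLsizei/GLchar and APIENTRY instead.
void onGlDebugMessage(unsigned int /*source*/, unsigned int type, unsigned int /*id*/,
                      unsigned int /*severity*/, int length, const char* message,
                      const void* /*userParam*/)
{
    const std::string_view text = (length >= 0)
        ? std::string_view(message, static_cast<std::size_t>(length))
        : std::string_view(message);
    if (const auto program = recompiledProgram(type, text)) {
        std::fprintf(stderr, "recompile warning for program %u\n", *program);
    }
}

// Registration, once a debug context is current (shown as comments because it
// needs GL headers and a function loader):
//   glEnable(GL_DEBUG_OUTPUT);
//   glEnable(GL_DEBUG_OUTPUT_SYNCHRONOUS);
//   glDebugMessageCallback(onGlDebugMessage, nullptr);
//   glDebugMessageControl(GL_DONT_CARE, GL_DEBUG_TYPE_PERFORMANCE,
//                         GL_DONT_CARE, 0, nullptr, GL_TRUE);
```

Enabling GL_DEBUG_OUTPUT_SYNCHRONOUS is a design choice for debugging: the callback then runs on the calling thread before the offending GL call returns, which makes it possible to break in the debugger and see which draw call triggered the message.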

It could be you don't see this consistently because you're letting your app use the NVidia GL driver's shader cache (as you'd normally want to). The shader cache allows the NVidia GL driver to cache compiled/optimized shader variants across multiple runs of your application. So it could be that you only see this perf warning when it has to produce a shader variant that it's not seen in this run or any recent previous run of your app.

For testing purposes, try nuking the NV GL shader cache before every run of your app. Then see if you can repro this perf warning consistently. For details on the shader disk cache (for Linux), see this link (http://us.download.nvidia.com/XFree86/Linux-x86/367.57/README/openglenvvariables.html) and search down for "shader disk cache". On Linux, you'd just nuke $HOME/.nv/GLCache/ to clear the driver's on-disk shader cache. I'm not sure what you do to clear it on Windows, but I'm sure a websearch will turn this up (here's a post (https://forums.geforce.com/default/topic/1020898/why-does-shader-cache-option-do-absolutely-nothing-in-driver-profiles-/?offset=4) that lists a few possibilities). Also, more info on the NVidia disk shader cache (on Linux) is available here (https://www.opengl.org/discussion_boards/showthread.php/181986-NVidia-driver-internal-on-disk-shader-cache-%28and-draw-time-shader-recompilation%29).

If you can get it to happen consistently, you can capture a GL call trace, bring it up in your favorite call trace viewer, and go looking at the exact GL state active when you're executing that draw call that triggers it. Exploring variations on that state should lead you to the cause.

If that fails to help you, another thought...

To help you get a line on this tough problem, you might add some code to the debug callback that will (when it receives one of the perf warnings) log out some contextual state information to help you collect evidence on the instigating state combination for each occurrence. Things like:

1) the bound shader program (reference to shader source for vert/frag/etc. shaders),
2) vertex attribute formats,
3) blend modes,
4) bound draw framebuffer and the format of its attachments,
5) buffer test and write masks,
6) shader sampler uniform settings,
7) bound texture formats and parameters (e.g. whether depth compare is enabled for depth textures).

After a few data points, hopefully you'll start seeing a pattern.
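
A hedged sketch of how such a state log could look. It covers only part of the list above (bound program, depth test/mask/function, blending, draw framebuffer); vertex attribute formats and texture parameters would need further queries such as glGetVertexAttribiv or glGetTexLevelParameteriv. All names are hypothetical, the enum values are copied from the GL headers so the code builds without them, and the state getter is passed in as a function so it can be replaced by a fake. As far as I understand the KHR_debug specification, calling GL from inside the callback itself is undefined, so this variant assumes synchronous debug output, only sets a flag in the callback, and does the queries right after the draw call returns.

```cpp
#include <functional>
#include <sstream>
#include <string>

// GL enum values copied from the OpenGL headers (assumption: no GL headers here).
constexpr unsigned int kGlCurrentProgram         = 0x8B8D; // GL_CURRENT_PROGRAM
constexpr unsigned int kGlDepthTest              = 0x0B71; // GL_DEPTH_TEST
constexpr unsigned int kGlDepthWritemask         = 0x0B72; // GL_DEPTH_WRITEMASK
constexpr unsigned int kGlDepthFunc              = 0x0B74; // GL_DEPTH_FUNC
constexpr unsigned int kGlBlend                  = 0x0BE2; // GL_BLEND
constexpr unsigned int kGlDrawFramebufferBinding = 0x8CA6; // GL_DRAW_FRAMEBUFFER_BINDING

// Stand-in for glGetIntegerv. In a real application pass
//   [](unsigned int pname, int* out) { glGetIntegerv(pname, out); }
using GetIntegerFn = std::function<void(unsigned int, int*)>;

// Formats a handful of integer state values as "label=value " pairs.
std::string describeState(const GetIntegerFn& getInteger)
{
    struct Query { const char* label; unsigned int pname; };
    static const Query queries[] = {
        {"program",          kGlCurrentProgram},
        {"depth_test",       kGlDepthTest},
        {"depth_writemask",  kGlDepthWritemask},
        {"depth_func",       kGlDepthFunc},
        {"blend",            kGlBlend},
        {"draw_framebuffer", kGlDrawFramebufferBinding},
    };
    std::ostringstream out;
    for (const Query& q : queries) {
        int value = 0;
        getInteger(q.pname, &value);
        out << q.label << '=' << value << ' ';
    }
    return out.str();
}

// Set to true by the debug callback when it sees a recompile warning.
bool g_recompileWarningPending = false;

// Call right after each draw call. Returns the state line to log, or an empty
// string when no warning was flagged since the last call.
std::string logStateIfFlagged(const GetIntegerFn& getInteger)
{
    if (!g_recompileWarningPending) return {};
    g_recompileWarningPending = false;
    return describeState(getInteger);
}
```

Collecting one such line per warning, together with the program handle from the message, should make it easier to spot which state value differs between occurrences.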

RealtimeSlave
08-09-2017, 12:06 PM
In Visual Studio DEBUG mode I always use the debug OpenGL context; in RELEASE mode, the normal context.

So at the moment, during development, I always have the debug OpenGL context.

Thanks for the hint about the shader cache. I'll give it a try on Windows when I find some time
and come back to this thread then :-)

Thanks

Dark Photon
08-09-2017, 06:24 PM
[Quoting RealtimeSlave:] Thanks for the hint with the shader cache. I'll give it a try on Windows when I find some time
and come back to this thread then :-)

Sounds good. BTW, I looked on a Windows box running NVidia drivers today, and it appears that the NVidia GL driver shader cache is stored in:

<USERDIR>/AppData/Roaming/NVIDIA/GLCache/

So just recursively remove that folder.

If you use compute shaders, there's another directory at the GLCache level that contains cached versions of those.