NVidia driver-internal on-disk shader cache (and draw-time shader recompilation)
(Background follows in a second, but first my questions:)
Has anyone built up some experience with this driver feature that they'd care to share, particularly on how to keep the driver from "writing to its disk cache at render time"?
And do you know some examples of what GL state is in the "shader key" for an internal precompiled shader?
Even after prerendering with each of the shaders/materials in an initialization phase, I'm still seeing some first-render spikes when rendering for the user on the first app run after a shader is built for the first time (or when the shader cache has been nuked beforehand). Obviously, this is very undesirable for performance.
The NVidia driver-internal on-disk GL precompiled "shader cache" is something NVidia added in the 290.03 beta drivers on Oct 21, 2011 (Phoronix post) to speed up subsequent compilation and rendering with that same shader.
For direct rendering contexts (the usual case), the driver by default writes precompiled shaders to a database under $HOME/.nv/GLCache/ (on Linux). There is an analogous cache for OpenCL/CUDA kernels at $HOME/.nv/ComputeCache/. These paths, as well as the enabled state of these caches, can be changed by various means. The driver appears to store the compiled assembly code along with some interface uniforms/attributes and such in text format in this cache. Having the cache enabled, present, and up-to-date on startup improves performance for second and subsequent runs of an application.
The trick is figuring out how to prod/trick the driver so that it offloads all the "shader compiling/optimization" to startup/init time and doesn't do any of this at render time. More on that:
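One of the "various means" on Linux is a pair of environment variables described in the NVIDIA driver README, `__GL_SHADER_DISK_CACHE` and `__GL_SHADER_DISK_CACHE_PATH`. The following is only a minimal sketch of building a launch environment that points the GL cache at a scratch directory for testing. The helper name `cache_env` and the app name in the comment are made up for illustration, and the variable names should be checked against the README shipped with your driver version.

```python
import os
from pathlib import Path


def cache_env(gl_cache_dir, enable=True, base_env=None):
    """Return a copy of the environment with the GL shader disk cache configured.

    cache_env is a hypothetical helper, not part of any NVIDIA tooling.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Variable names as documented in the NVIDIA Linux driver README
    # (assumption: your driver version still honors them).
    env["__GL_SHADER_DISK_CACHE"] = "1" if enable else "0"
    env["__GL_SHADER_DISK_CACHE_PATH"] = str(Path(gl_cache_dir).expanduser())
    return env


if __name__ == "__main__":
    env = cache_env("~/tmp/gl-cache-test")
    print(env["__GL_SHADER_DISK_CACHE"], env["__GL_SHADER_DISK_CACHE_PATH"])
    # To launch an app with these settings (./my_gl_app is a placeholder):
    #   import subprocess
    #   subprocess.run(["./my_gl_app"], env=env)
```

Pointing the cache at an empty scratch directory gives a "cold cache" run without touching the real $HOME/.nv/GLCache/ contents.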
I'd thought until recently that the whole "dynamic recompilation of shaders at render time" business was a distant memory left over from GeForce 7 days due to hardware architecture limitations on older GPUs. Not so! Come to find out, even on very recent top-of-the-line GPUs, the driver is still dynamically recompiling shaders at render time, and not only that, writing to disk (the shader cache) in such cases. This is of course very bad for performance and something to be avoided.
How do I know?
- Nuke the on-disk shader cache
- Run GL app in a DEBUG context, with a glDebugMessageCallbackARB() callback plugged in. Print out everything.
On this run (where it's having to rebuild the on-disk shader cache from scratch) at application render time you may see one or more PERFORMANCE warnings about:
"Fragment Shader is going to be recompiled because the shader key based on GL state mismatches"
(No hint is given as to what GL state is in the "shader key", though.) On top of that, the modification timestamps on the NV shader cache disk files indicate that these files are being written not only during startup (where shader compilation occurs) but also after startup, at render time in front of the user.
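The timestamp check can be automated. Below is a rough sketch, not a definitive tool: it snapshots file modification times under the cache directory at the end of init and again after rendering, then lists what changed. The function names are invented for this example, and the directory layout inside the cache is not assumed (it just walks every file).

```python
from pathlib import Path


def snapshot_mtimes(cache_dir):
    """Map each file under cache_dir to its modification time in seconds."""
    root = Path(cache_dir).expanduser()
    if not root.is_dir():
        return {}  # cache nuked or not yet created
    return {p: p.stat().st_mtime for p in root.rglob("*") if p.is_file()}


def changed_since(before, after):
    """Return files that are new, or have a newer mtime, in `after`."""
    return sorted(
        p for p, mtime in after.items()
        if p not in before or mtime > before[p]
    )


if __name__ == "__main__":
    cache = "~/.nv/GLCache"  # default location; adjust if relocated
    at_init_end = snapshot_mtimes(cache)
    # ... render some frames in the application here ...
    later = snapshot_mtimes(cache)
    for path in changed_since(at_init_end, later):
        print("written after init:", path)
```

Any path printed here was written after the init snapshot, i.e. during what should have been pure render time.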
These seem to correspond to the occurrence of first-render spikes on the first run with a new shader (or a removed shader cache). I've seen a lot of "what is this PERF warning and how do I fix it" posts on the net, but no decent responses offering a resolution -- most people just ignore it because no concrete suggestion is provided. Removing the shader cache before startup at least makes this warning appear consistently, however.
Now, by prerendering with shader/state combinations before "render time in front of the user", I can move the occurrence of this warning to an "init" phase. However, I'm still seeing some first-render frame spikes when rendering for the user on the first run, which do not reappear on subsequent app runs (this suggests state persisted across runs, à la the shader cache, which makes me suspect it, but that's just a guess at this point).
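To confirm that prerendering really has moved every recompile warning into the init phase, the debug output can be filtered by time. This is just a sketch under assumptions: the app's debug callback is assumed to append `(timestamp_seconds, text)` pairs to a list, and `late_recompiles` is a made-up helper name. Only the warning text itself comes from the driver output quoted above.

```python
# Substring of the driver's PERFORMANCE warning quoted earlier in the post.
RECOMPILE_MARKER = (
    "is going to be recompiled because the shader key "
    "based on GL state mismatches"
)


def late_recompiles(messages, init_done_at):
    """Return recompile warnings logged at or after init_done_at.

    messages: iterable of (timestamp_seconds, text) pairs, assumed to be
    collected by the application's GL debug message callback.
    """
    return [
        (t, text) for t, text in messages
        if t >= init_done_at and RECOMPILE_MARKER in text
    ]


if __name__ == "__main__":
    log = [
        (0.5, "Fragment Shader " + RECOMPILE_MARKER),  # during init: fine
        (3.2, "Fragment Shader " + RECOMPILE_MARKER),  # during render: bad
        (3.3, "some unrelated debug message"),
    ]
    for t, text in late_recompiles(log, init_done_at=2.0):
        print(f"{t:.1f}s: {text}")
```

An empty result means no recompile warnings fired after init, so any remaining first-render spikes would have to come from somewhere the warning does not report.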
So before I dig further on this, I thought I'd check with the group, offer what I've found out so far, and see if anyone else has additional info on:
- avoiding "driver on-disk shader cache writes" at render time, or
- getting rid of "dynamic recompilation of shaders" at render time
scenarios with the NVidia drivers.