Part of the Khronos Group
OpenGL.org


Thread: AMD glLinkProgram Performance Tips?


  1. #1
    Newbie Newbie
    Join Date
    Jun 2013
    Posts
    3

    AMD glLinkProgram Performance Tips?

    Hi,

I am having trouble with glLinkProgram on AMD drivers (both Windows and Linux). The compilation time is absolutely absurd, and it is killing my game engine. I am seeing 5 to 10 seconds for single shader compiles, where on NV drivers it is too fast to measure. The shaders in question are generated by the engine and are quite math-heavy.

Does anyone have general tips / advice for speeding up shader compilation on AMD? For example, should I try hand-inlining heavily nested function calls, hand-unrolling loops, etc.? Given my lack of knowledge about shader compilers, I don't really know how to proceed with making my shaders more compiler-friendly.

    Thanks for any tips!

  2. #2
    Advanced Member Frequent Contributor
    Join Date
    Apr 2010
    Posts
    645
Hmm, I don't know about the GLSL compilers, but those for C/C++ can sometimes run into performance issues with very large functions (blocks, really) that have many variables, for example due to the use of algorithms that are quadratic in the number of instructions or variables.

    Haven't used AMD cards in a while, but 5-10 secs sounds outrageously long to me. Are you using a debug context? You could try shader binaries (if your hardware supports them) that you cache on disk, that way you only pay the link time penalty once.
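The binary-caching idea might look roughly like the sketch below, using ARB_get_program_binary (core in GL 4.1). This is a minimal outline, not a drop-in implementation: error handling is trimmed, `program` is assumed to be an already-linked program object, and the file path is whatever your engine's cache layout dictates.

```c
#include <stdio.h>
#include <stdlib.h>
#include <GL/glew.h>  /* or your GL loader of choice */

/* Save a linked program's driver-specific binary so future runs
 * can skip the expensive compile/link. */
static void save_program_binary(GLuint program, const char *path)
{
    GLint length = 0;
    glGetProgramiv(program, GL_PROGRAM_BINARY_LENGTH, &length);
    if (length <= 0) return;

    void *blob = malloc(length);
    GLenum format = 0;
    glGetProgramBinary(program, length, NULL, &format, blob);

    FILE *f = fopen(path, "wb");
    fwrite(&format, sizeof format, 1, f);  /* format is driver-specific */
    fwrite(blob, 1, length, f);
    fclose(f);
    free(blob);
}

/* Try to restore a cached binary; returns 0 on failure (e.g. after a
 * driver update), in which case the caller must fall back to a full
 * compile + link and re-save the cache. */
static int load_program_binary(GLuint program, const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f) return 0;

    GLenum format = 0;
    fread(&format, sizeof format, 1, f);
    fseek(f, 0, SEEK_END);
    long length = ftell(f) - (long)sizeof format;
    fseek(f, (long)sizeof format, SEEK_SET);

    void *blob = malloc(length);
    fread(blob, 1, length, f);
    fclose(f);

    glProgramBinary(program, format, blob, (GLsizei)length);
    free(blob);

    GLint ok = GL_FALSE;
    glGetProgramiv(program, GL_LINK_STATUS, &ok);
    return ok == GL_TRUE;
}
```

Note that some drivers only produce a retrievable binary if you call `glProgramParameteri(program, GL_PROGRAM_BINARY_RETRIEVABLE_HINT, GL_TRUE)` before the original link, and the blob is only valid for the exact driver/GPU that produced it, hence the mandatory fallback path.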

  3. #3
    Newbie Newbie
    Join Date
    Jun 2013
    Posts
    3
    Quote Originally Posted by carsten neumann View Post
    Hmm, I don't know about the GLSL compilers, but those for C/C++ can sometimes run into performance issues with very large functions (blocks really) that have many variables, due to the use of algorithms that are quadratic in the number of instructions or variables for example.

    Haven't used AMD cards in a while, but 5-10 secs sounds outrageously long to me. Are you using a debug context? You could try shader binaries (if your hardware supports them) that you cache on disk, that way you only pay the link time penalty once.
Thanks very much Carsten, I was completely unaware of the GL binary facilities! Wish I had known about these sooner. That will certainly help. Still open to compilation insights if anyone has them; in the meantime I am sure binaries will lift a lot of the load.

    And no, it's not a debug context. It's pretty terrible because, as I said, on NV drivers it's virtually instant...ouch, come on now AMD...

  4. #4
    Advanced Member Frequent Contributor
    Join Date
    Dec 2007
    Location
    Hungary
    Posts
    985
Sometimes drivers trick you: even when compilation looks virtually instant, the driver may just have handed the actual compilation job off to a separate thread, so your code isn't blocked until the moment you actually try to use the shader (the latency didn't disappear, it just got deferred).

So, first, I would make sure you measure compilation time properly. To do so, do the following:
1. Compile and link your shaders
2. Render some simple primitive using the shaders (e.g. a point)
3. Use glReadPixels or another mechanism to make sure the rendering actually happened and wasn't deferred as well
4. Measure the time of all three steps; this gives you a better estimate of how much time the compilation actually required.
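The four steps above might be sketched like this. It's a rough outline assuming a current GL context and a hypothetical `compile_and_link()` helper; the glReadPixels at the end synchronizes the pipeline so deferred driver work is included in the measurement (a `glFinish()` would serve the same purpose).

```c
#include <GL/glew.h>
#include <time.h>

GLuint compile_and_link(const char *vs_src, const char *fs_src); /* hypothetical helper */

/* Returns wall-clock seconds for compile + link + first actual use,
 * which is what the application really pays for a shader. */
double measure_true_compile_time(const char *vs_src, const char *fs_src)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    GLuint program = compile_and_link(vs_src, fs_src);  /* step 1 */
    glUseProgram(program);

    glDrawArrays(GL_POINTS, 0, 1);                      /* step 2: trivial draw */

    unsigned char pixel[4];
    glReadPixels(0, 0, 1, 1, GL_RGBA, GL_UNSIGNED_BYTE, pixel); /* step 3: force completion */

    clock_gettime(CLOCK_MONOTONIC, &t1);                /* step 4: total elapsed */
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```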
    Disclaimer: This is my personal profile. Whatever I write here is my personal opinion and none of my statements or speculations are anyhow related to my employer and as such should not be treated as accurate or valid and in no case should those be considered to represent the opinions of my employer.
    Technical Blog: http://www.rastergrid.com/blog/

  5. #5
    Newbie Newbie
    Join Date
    Jun 2013
    Posts
    3
    Quote Originally Posted by aqnuep View Post
    Sometimes drivers trick you, as even when the compilation looks virtually instant, it could be because the driver just transmitted the actual compilation job to a separate thread and thus won't block your code to continue until the time when you actually try to use the shader (thus the latency didn't disappear, but just got delayed).

    So, at first, I would make sure you measure compilation time properly. In order to do so, do the following:
    1. Compile your shaders
    2. Render some simple primitive using the shaders (e.g. a point)
    3. Use glReadPixels or other mechanism to make sure the rendering actually happened and not delayed as well
    4. Measure the time of all the 3 steps, it will give you a better estimate on how much time the compilation actually required.
Thanks aqnuep, but the shaders are used immediately to generate geometry, so I am quite sure of the compilation time. On NV they are compiled and able to start displaying the geometry with virtually no delay, so I do think it's actually the AMD compiler. But it is very surprising to me that the difference is so dramatic...

  6. #6
    Senior Member OpenGL Guru Dark Photon's Avatar
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,124
    Quote Originally Posted by xeonxt View Post
    ...the shaders are used immediately to generate geometry, so I am quite sure of the compilation time. On NV they are compiled and able to start displaying the geometry with virtually no delay
    Unless you are nuking the NV-driver-internal on-disk precompiled GL shader cache before doing this test, don't be so sure.

    If you've run with that shader before, it's probably just loading a precompiled version off-disk (or more likely, from a memory cache of that on-disk data thanks to the OS caching of disk accesses, so it's blindingly fast), not actually compiling it on-the-fly. There are precompiled caches for OpenCL/CUDA kernels as well.

    On Linux, the default paths for these caches are: $HOME/.nv/GLCache and $HOME/.nv/ComputeCache, respectively.

    On Windows, %APPDATA%\NVIDIA\GLCache and %APPDATA%\NVIDIA\ComputeCache, respectively.
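To rule the cache out on Linux, something like the following gives a genuinely cold compile; the paths and the `__GL_SHADER_DISK_CACHE` variable come from NVIDIA's driver README, and the benchmark binary name is a placeholder.

```shell
# Wipe the NVIDIA GL shader cache so the next run really compiles
# (the cache is rebuilt lazily, so deleting it is safe):
rm -rf "$HOME/.nv/GLCache"

# Or disable the on-disk shader cache entirely for one benchmark run:
export __GL_SHADER_DISK_CACHE=0
echo "__GL_SHADER_DISK_CACHE=$__GL_SHADER_DISK_CACHE"
# ./my_engine_benchmark    # hypothetical benchmark binary
```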

    Websearch these paths for hits. For more info, see:

    * NVIDIA's OpenGL Shader Disk Cache For Linux
    * NVidia Linux Driver README - Chapter 11
    * CUDA Pro Tip: Understand Fat Binaries and JIT Caching
    * NVidia driver-internal on-disk shader cache (and draw-time shader recompilation)

    I don't know if AMD has a similar mechanism in-place. Check their driver docs.
    Last edited by Dark Photon; 06-22-2013 at 10:06 AM.
