Part of the Khronos Group
OpenGL.org


Thread: Shader Performance

  1. #1
    Senior Member OpenGL Pro
    Join Date
    Jan 2012
    Location
    Australia
    Posts
    1,117

    Shader Performance

    A short while ago there was a thread on measuring the performance of different shader subroutines.
    I have been having a look at NVIDIA's NVPerfKit SDK. It appears to have counters that give the instruction rate by shader stage, i.e. vertex, tessellation, geometry and fragment.

    I have not used this yet. Has anyone else looked at this data?

  2. #2
    Senior Member OpenGL Pro Aleksandar's Avatar
    Join Date
    Jul 2009
    Posts
    1,153
    I have been using NV PerfKit for a long time, but only the first release (called PerfSDK), which does not support Fermi/Kepler. Version 2.2 was released a year ago, but I had trouble interpreting the values retrieved from the counters, so I stopped experimenting.
    At the end of April, NV released version 3.0.0.13123. The link was valid for just one day and, unfortunately, I didn't download it. Obviously there is some problem.

    As you can read in the user guide, *_shader_instruction_rate retrieves "the % of all shader instructions seen on the first SM unit that were executing defined shader(s)". How do you plan to use that to measure the performance of different shader routines?

  3. #3
    Senior Member OpenGL Pro
    Join Date
    Jan 2012
    Location
    Australia
    Posts
    1,117
    I was wondering whether, if I selected a different subroutine via dynamic subroutine selection, a change in the instruction rate would correspond to a better (or worse) implementation.

    I have started using version 2.2. It seems to return 0 for the "OGL ..." variables on my Quadro 5000 (although it finds the variable names), but they are OK on my GeForce card; the GPU values seem OK on both.
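
    For anyone following along, the dynamic subroutine selection being discussed looks roughly like this in GLSL (GL 4.0+); the function and uniform names here are illustrative, not from the original poster's code:

    ```glsl
    // Two interchangeable implementations behind one subroutine type.
    subroutine vec3 ShadeModel(vec3 n, vec3 l);

    subroutine(ShadeModel)
    vec3 shadeLambert(vec3 n, vec3 l) {
        return vec3(max(dot(n, l), 0.0));
    }

    subroutine(ShadeModel)
    vec3 shadeHalfLambert(vec3 n, vec3 l) {
        float d = dot(n, l) * 0.5 + 0.5;
        return vec3(d * d);
    }

    // The active implementation is chosen from the CPU side, per draw, via
    // glGetSubroutineIndex(...) and glUniformSubroutinesuiv(...).
    subroutine uniform ShadeModel shade;
    ```

    The idea above would be to flip the subroutine uniform between the two implementations and watch whether the *_shader_instruction_rate counter moves with it.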

  4. #4
    Junior Member Newbie
    Join Date
    Jun 2013
    Posts
    25
    Another, less complicated option might be to grab a copy of nVidia's nvEmulate. You can set it to write out the assembly instructions for your shader. If you rework the shader and end up with fewer instructions, then you probably have a more efficient shader. I suspect this assumption will not always hold, but it likely will in most cases.
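
    Once you have the assembly dumps, comparing instruction counts can be scripted. This is a rough sketch: the dump file name and its contents are stand-ins (real nvEmulate output names and formats depend on your settings and driver), and the count is crude since it just ignores comment/header lines:

    ```shell
    # Crude instruction count for a dumped shader assembly file:
    # count lines that are not comments ('#'), profile headers ('!!'), or blank.
    count_instructions() {
        grep -c -v -e '^#' -e '^!!' -e '^$' "$1"
    }

    # Stand-in dump so the sketch runs on its own; replace with a real dump file.
    cat > shader_dump.txt <<'EOF'
    !!NVfp5.0
    MOV R0, fragment.texcoord[0];
    MUL R0, R0, program.local[0];
    END
    EOF

    count_instructions shader_dump.txt   # prints 3 for the stand-in above
    ```

    Running this before and after a shader change gives you the two numbers to compare, with the caveat from above that fewer instructions does not always mean faster.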
