OpenGL Profiling

Hi All,

Profiling is a very important process in software development. Accurately profiling multithreaded execution in a multi-core preemptive environment that dynamically changes its performance state during execution is quite challenging, but that is exactly what has to be done for all graphics applications.

Before delving deeper into the topic, I would like to hear from you guys: which profilers are you using for developing OpenGL-based applications?

Here is a short list of possible profilers:

  • gDEBugger
  • AMD gDEBugger - Visual Studio 2010 Extension
  • AMD GPU PerfStudio 2.8
  • NVIDIA Parallel Nsight
  • NVIDIA Platform Analyzer (NPA)
  • VSProfileLib – Very Simple Profile Library

Please extend the list with other profilers that support OpenGL. NPA doesn’t directly support OpenGL, but it can be used for profiling GPU processes. Also, NV Parallel Nsight has very modest support for OpenGL compared to D3D.

On Linux, I use BuGLe with the stats_calltimes filter:

http://www.opengl.org/sdk/tools/BuGLe/documentation/stats_calltimes.7.php

Thanks, overlay!

Unfortunately, I’m not using Linux and have never tried BuGLe. At first glance it seems to be an interceptor that hooks into the OpenGL binaries and enables post-execution analysis of textual dumps. Please correct me if I’m wrong.

What really excites me is AMD GPUPerfAPI. I’m not using AMD graphics cards, so I cannot try it at the moment, but it looks extraordinary. Has anyone in this community ever tried GPUPerfAPI? I just need confirmation that the vertex, tessellation, geometry and fragment shader utilization counters work for OpenGL. I have never encountered anything similar for NV platforms.

For the second time I am amazed by AMD’s level of openness!

AMD also supports the GL_AMD_performance_monitor extension, which is similar to the GPUPerfAPI.

Yes, I saw that extension some time ago, but the spec was quite unclear. Which counters can be accessed through GL_AMD_performance_monitor?
I have to take a closer look again. Thanks for the suggestion!

P.S. The GL_AMD_performance_monitor spec is almost 5 years old.

Can anyone point me to a spec that deciphers the names of the groups and counters reported by AMD_performance_monitor?

There are 92 groups, each with up to 236 counters. The names reported by glGetPerfMonitorGroupStringAMD/glGetPerfMonitorCounterStringAMD are encoded. For example, a counter name could be CB1_000.
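For anyone who wants to reproduce such a listing, here is a minimal sketch of the enumeration in C. The four function pointers mirror the query entry points of GL_AMD_performance_monitor as I understand their signatures, and the two-step "ask for the count, then fetch the IDs" pattern is the usual way to call them. Everything else is an assumption for illustration: the PerfMonitorApi table, dump_perf_counters, the local GL typedefs and the fake driver (with made-up names like GRP0_000) exist only so the sketch compiles and runs without GL headers or a context. In a real program the pointers would be loaded from the driver.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-ins for the GL types, so the sketch builds without GL headers. */
typedef unsigned int GLuint;
typedef int GLint;
typedef int GLsizei;
typedef char GLchar;

/* Hypothetical table of the extension's query entry points. */
typedef struct {
    void (*GetGroups)(GLint *numGroups, GLsizei groupsSize, GLuint *groups);
    void (*GetCounters)(GLuint group, GLint *numCounters, GLint *maxActiveCounters,
                        GLsizei counterSize, GLuint *counters);
    void (*GetGroupString)(GLuint group, GLsizei bufSize, GLsizei *length,
                           GLchar *groupString);
    void (*GetCounterString)(GLuint group, GLuint counter, GLsizei bufSize,
                             GLsizei *length, GLchar *counterString);
} PerfMonitorApi;

/* Prints every group and counter name.
 * Returns the total number of counters, or -1 if an allocation fails. */
int dump_perf_counters(const PerfMonitorApi *api, FILE *out)
{
    assert(api != NULL && out != NULL);

    GLint numGroups = 0;
    api->GetGroups(&numGroups, 0, NULL);            /* first call: count only */
    if (numGroups <= 0) return 0;

    GLuint *groups = malloc((size_t)numGroups * sizeof *groups);
    if (groups == NULL) return -1;
    api->GetGroups(NULL, numGroups, groups);        /* second call: group IDs */

    int total = 0;
    for (GLint g = 0; g < numGroups; ++g) {
        GLchar name[256] = "";
        GLsizei len = 0;
        GLint numCounters = 0, maxActive = 0;

        api->GetGroupString(groups[g], (GLsizei)sizeof name, &len, name);
        api->GetCounters(groups[g], &numCounters, &maxActive, 0, NULL);
        fprintf(out, "group %u \"%s\": %d counters, max %d active\n",
                groups[g], name, numCounters, maxActive);
        if (numCounters <= 0) continue;

        GLuint *counters = malloc((size_t)numCounters * sizeof *counters);
        if (counters == NULL) { free(groups); return -1; }
        api->GetCounters(groups[g], NULL, NULL, numCounters, counters);

        for (GLint c = 0; c < numCounters; ++c) {
            api->GetCounterString(groups[g], counters[c],
                                  (GLsizei)sizeof name, &len, name);
            fprintf(out, "  counter %u \"%s\"\n", counters[c], name);
        }
        total += numCounters;
        free(counters);
    }
    free(groups);
    return total;
}

/* --- Fake driver with invented data, only so the sketch can run. --- */
static void fake_groups(GLint *n, GLsizei size, GLuint *groups)
{
    if (n) *n = 2;
    for (GLsizei i = 0; groups && i < size && i < 2; ++i) groups[i] = (GLuint)i;
}
static void fake_counters(GLuint group, GLint *n, GLint *maxActive,
                          GLsizei size, GLuint *counters)
{
    GLint count = (group == 0) ? 3 : 1;   /* group 0 has 3 counters, group 1 has 1 */
    if (n) *n = count;
    if (maxActive) *maxActive = 2;
    for (GLsizei i = 0; counters && i < size && i < count; ++i) counters[i] = (GLuint)i;
}
static void fake_group_string(GLuint group, GLsizei bufSize, GLsizei *length, GLchar *s)
{
    int n = snprintf(s, (size_t)bufSize, "GRP%u", group);
    if (length) *length = n;
}
static void fake_counter_string(GLuint group, GLuint counter, GLsizei bufSize,
                                GLsizei *length, GLchar *s)
{
    int n = snprintf(s, (size_t)bufSize, "GRP%u_%03u", group, counter);
    if (length) *length = n;
}
const PerfMonitorApi fake_api = {
    fake_groups, fake_counters, fake_group_string, fake_counter_string
};
```

Calling dump_perf_counters(&fake_api, stdout) walks the fake data. Swapping in pointers obtained from the real driver should produce the coded names discussed above, but the sketch does nothing to decode them.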

Also, can PerfMonitors overlap? A monitor can activate/track an arbitrary number of counters from different groups. What happens if the total number of counters selected from the same group (but spread over several monitors) exceeds the maximum number of active counters for that group?
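Until someone can say what the driver actually does in that case, the situation can at least be detected on the application side. The following is only a sketch of such a check in C: CounterSelection, requested_in_group and exceeds_group_limit are hypothetical helper names, not part of the extension, and max_active is assumed to be the maxActiveCounters value reported by glGetPerfMonitorCountersAMD for the group.

```c
#include <stddef.h>

/* Hypothetical bookkeeping record: one monitor selecting `count`
 * counters from one group. */
typedef struct {
    unsigned monitor;  /* monitor ID, e.g. from glGenPerfMonitorsAMD */
    unsigned group;    /* group ID the counters belong to */
    int count;         /* number of counters selected from that group */
} CounterSelection;

/* Total number of counters requested from `group` across all selections,
 * regardless of which monitor they belong to. */
int requested_in_group(const CounterSelection *sel, size_t n, unsigned group)
{
    int total = 0;
    for (size_t i = 0; i < n; ++i) {
        if (sel[i].group == group) total += sel[i].count;
    }
    return total;
}

/* Returns 1 if the combined request exceeds the group's limit, else 0. */
int exceeds_group_limit(const CounterSelection *sel, size_t n,
                        unsigned group, int max_active)
{
    return requested_in_group(sel, n, group) > max_active;
}
```

This only tells the application that it is asking for more than the reported limit; it says nothing about how the driver reacts.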

I need to appeal to NVIDIA to update PerfSDK. It is sad not to support a two-year-old product. Namely, PerfSDK (ver. 6.72.0719.0645, released on July 19th last year) does not support Fermi. :(

Another problem is the inability to profile 32-bit applications on a 64-bit OS using NVIDIA PerfSDK.