Name

    AMD_performance_monitor

Name Strings

    GL_AMD_performance_monitor

Contributors

    Dan Ginsburg
    Aaftab Munshi
    Dave Oldcorn
    Maurice Ribble
    Jonathan Zarge

Contact

    Dan Ginsburg (dan.ginsburg 'at' amd.com)

Status

    ???

Version

    Last Modified Date: 11/29/2007

Number

    OpenGL Extension #360
    OpenGL ES Extension #50

Dependencies

    None

Overview

    This extension enables the capture and reporting of performance monitors.
    Performance monitors contain groups of counters which hold arbitrary
    counted data.  Typically, the counters hold performance-related data
    from the underlying hardware.  The extension is general enough to allow
    the implementation to choose which counters to expose and pick the data
    type and range of the counters.  The extension also allows counting to
    start and end on arbitrary boundaries during rendering.

Issues

    1.  Should this be an EGL or OpenGL/OpenGL ES extension?

        Decision - Make this an OpenGL/OpenGL ES extension

        Reason - We would like to expose this extension in both OpenGL and
        OpenGL ES, which makes EGL an unsuitable choice.  Further, support
        for EGL is not a requirement and there are platforms that support
        OpenGL ES but not EGL, making it difficult to make this an EGL
        extension.

    2.  Should the API support multipassing?

        Decision - No.

        Reason - Multipassing should be left to the application.
        Supporting it in the API would make the API unnecessarily
        complicated.  A major issue is that, depending on which counters
        are to be sampled, the number of passes and which counters get
        selected in each pass can be difficult to determine.  It is much
        easier to give a list of counters categorized by groups with
        specific information on the number of counters that can be selected
        from each group.

    3.  Should we define a 64-bit data type for UNSIGNED_INT64_AMD?

        Decision - No.

        Reason - While counters can be returned as 64-bit unsigned integers,
        the data is passed back to the application inside of a void*.
        Therefore, there is no need in this extension to define a 64-bit
        data type (e.g., GLuint64).  It will be up to the application to
        declare a native 64-bit unsigned integer and cast the returned data
        to that type.

New Procedures and Functions

    void GetPerfMonitorGroupsAMD(int *numGroups, sizei groupsSize,
                                 uint *groups)

    void GetPerfMonitorCountersAMD(uint group, int *numCounters,
                                   int *maxActiveCounters,
                                   sizei countersSize, uint *counters)

    void GetPerfMonitorGroupStringAMD(uint group, sizei bufSize,
                                      sizei *length, char *groupString)

    void GetPerfMonitorCounterStringAMD(uint group, uint counter,
                                        sizei bufSize, sizei *length,
                                        char *counterString)

    void GetPerfMonitorCounterInfoAMD(uint group, uint counter,
                                      enum pname, void *data)

    void GenPerfMonitorsAMD(sizei n, uint *monitors)

    void DeletePerfMonitorsAMD(sizei n, uint *monitors)

    void SelectPerfMonitorCountersAMD(uint monitor, boolean enable,
                                      uint group, int numCounters,
                                      uint *counterList)

    void BeginPerfMonitorAMD(uint monitor)

    void EndPerfMonitorAMD(uint monitor)

    void GetPerfMonitorCounterDataAMD(uint monitor, enum pname,
                                      sizei dataSize, uint *data,
                                      int *bytesWritten)

New Tokens

    Accepted by the <pname> parameter of GetPerfMonitorCounterInfoAMD

        COUNTER_TYPE_AMD                     0x8BC0
        COUNTER_RANGE_AMD                    0x8BC1

    Returned as a valid value in <data> parameter of
    GetPerfMonitorCounterInfoAMD if <pname> = COUNTER_TYPE_AMD

        UNSIGNED_INT                         0x1405
        FLOAT                                0x1406
        UNSIGNED_INT64_AMD                   0x8BC2
        PERCENTAGE_AMD                       0x8BC3

    Accepted by the <pname> parameter of GetPerfMonitorCounterDataAMD

        PERFMON_RESULT_AVAILABLE_AMD         0x8BC4
        PERFMON_RESULT_SIZE_AMD              0x8BC5
        PERFMON_RESULT_AMD                   0x8BC6

Addition to the GL specification

    Add a new section called Performance Monitoring

    A performance monitor consists of a number of hardware and software
    counters that can be sampled by the GPU and reported back to the
    application.  Performance counters are organized as a single hierarchy
    where counters are categorized into groups.
    Each group has a list of counters that belong to the group and can be
    sampled, and a maximum number of counters that can be sampled at the
    same time.

    The command

        void GetPerfMonitorGroupsAMD(int *numGroups, sizei groupsSize,
                                     uint *groups);

    returns the number of available groups in <numGroups>, if <numGroups>
    is not NULL.  If <groupsSize> is not 0 and <groups> is not NULL, then
    the list of available groups is returned.  The number of entries that
    will be returned in <groups> is determined by <groupsSize>.  If
    <groupsSize> is 0, no information is copied.  Each group is identified
    by a unique unsigned int identifier.

    The command

        void GetPerfMonitorCountersAMD(uint group, int *numCounters,
                                       int *maxActiveCounters,
                                       sizei countersSize,
                                       uint *counters);

    returns the following information.  For each group, it returns the
    number of available counters in <numCounters>, the maximum number of
    counters that can be active at any time in <maxActiveCounters>, and the
    list of counters in <counters>.  The number of entries that can be
    returned in <counters> is determined by <countersSize>.  If
    <countersSize> is 0, no information is copied.  Each counter in a group
    is identified by a unique unsigned int identifier.  If <group> does not
    reference a valid group ID, an INVALID_VALUE error is generated.

    The command

        void GetPerfMonitorGroupStringAMD(uint group, sizei bufSize,
                                          sizei *length,
                                          char *groupString)

    returns the string that describes the group name identified by <group>
    in <groupString>.  The actual number of characters written to
    <groupString>, excluding the null terminator, is returned in <length>.
    If <length> is NULL, then no length is returned.  The maximum number of
    characters that may be written into <groupString>, including the null
    terminator, is specified by <bufSize>.  If <bufSize> is 0 and
    <groupString> is NULL, the number of characters that would be required
    to hold the group string, excluding the null terminator, is returned in
    <length>.  If <group> does not reference a valid group ID, an
    INVALID_VALUE error is generated.

    The command

        void GetPerfMonitorCounterStringAMD(uint group, uint counter,
                                            sizei bufSize, sizei *length,
                                            char *counterString);

    returns the string that describes the counter name identified by
    <group> and <counter> in <counterString>.
    The actual number of characters written to <counterString>, excluding
    the null terminator, is returned in <length>.  If <length> is NULL,
    then no length is returned.  The maximum number of characters that may
    be written into <counterString>, including the null terminator, is
    specified by <bufSize>.  If <bufSize> is 0 and <counterString> is NULL,
    the number of characters that would be required to hold the counter
    string, excluding the null terminator, is returned in <length>.  If
    <group> does not reference a valid group ID, or <counter> does not
    reference a valid counter within the group ID, an INVALID_VALUE error
    is generated.

    The command

        void GetPerfMonitorCounterInfoAMD(uint group, uint counter,
                                          enum pname, void *data);

    returns the following information about a counter.  For a <counter>
    belonging to <group>, we can query the counter type and counter range.
    If <pname> is COUNTER_TYPE_AMD, then <data> returns the type.  Valid
    type values returned are UNSIGNED_INT, UNSIGNED_INT64_AMD,
    PERCENTAGE_AMD, FLOAT.  If the type value returned is PERCENTAGE_AMD,
    then this describes a float value that is in the range [0.0 .. 100.0].
    If <pname> is COUNTER_RANGE_AMD, <data> returns two values representing
    a minimum and a maximum.  The counter's type is used to determine the
    format in which the range values are returned.  If <group> does not
    reference a valid group ID, or <counter> does not reference a valid
    counter within the group ID, an INVALID_VALUE error is generated.

    The command

        void GenPerfMonitorsAMD(sizei n, uint *monitors)

    returns a list of monitors.  These monitors can then be used to select
    groups/counters to be sampled, to start multiple monitoring sessions
    and to return counter information sampled by the GPU.  At creation
    time, the performance monitor object has all counters disabled.  The
    value of the PERFMON_RESULT_AVAILABLE_AMD, PERFMON_RESULT_AMD, and
    PERFMON_RESULT_SIZE_AMD queries will all initially be 0.

    The command

        void DeletePerfMonitorsAMD(sizei n, uint *monitors)

    is used to delete the list of monitors created by a previous call to
    GenPerfMonitorsAMD.
    If a monitor ID in the list <monitors> does not reference a previously
    generated performance monitor, an INVALID_VALUE error is generated.

    The command

        void SelectPerfMonitorCountersAMD(uint monitor, boolean enable,
                                          uint group, int numCounters,
                                          uint *counterList);

    is used to enable or disable a list of counters from a group to be
    monitored as identified by <monitor>.  The <enable> argument determines
    whether the counters should be enabled or disabled.  <group> specifies
    the group ID under which counters will be enabled or disabled.  The
    <numCounters> argument gives the number of counters to be selected from
    the list <counterList>.  If <monitor> is not a valid monitor created by
    GenPerfMonitorsAMD, then an INVALID_VALUE error will be generated.  If
    <group> is not a valid group, an INVALID_VALUE error will be generated.
    If <numCounters> is less than 0, an INVALID_VALUE error will be
    generated.

    When SelectPerfMonitorCountersAMD is called on a monitor, any
    outstanding results for that monitor become invalidated and the result
    queries PERFMON_RESULT_SIZE_AMD and PERFMON_RESULT_AVAILABLE_AMD are
    reset to 0.

    The command

        void BeginPerfMonitorAMD(uint monitor);

    is used to start a monitor session.  Note that BeginPerfMonitorAMD
    calls cannot be nested.  In addition, given the groups and counters
    enabled for a monitor, the implementation may be unable to sample all
    of the necessary counters, in which case the monitor session fails to
    start and an INVALID_OPERATION error will be generated.

    While BeginPerfMonitorAMD does mark the beginning of performance
    counter collection, the counters do not begin collecting immediately.
    The API is asynchronous: performance counter collection does not begin
    until the graphics hardware processes the BeginPerfMonitorAMD command.

    The command

        void EndPerfMonitorAMD(uint monitor);

    ends a monitor session started by BeginPerfMonitorAMD.  If a
    performance monitor is not currently started, an INVALID_OPERATION
    error will be generated.
    Note that there is an implied overhead to collecting performance
    counters that may or may not distort performance depending on the
    implementation.  For example, some counters may require a pipeline
    flush, thereby causing a change in the performance of the application.
    Further, the frequency at which an application samples may distort the
    accuracy of counters which are variant (e.g., non-deterministic based
    on the input).  While the effects of sampling frequency are
    implementation dependent, general guidance can be given that sampling
    at a high frequency may distort both performance of the application and
    the accuracy of variant counters.

    The command

        void GetPerfMonitorCounterDataAMD(uint monitor, enum pname,
                                          sizei dataSize, uint *data,
                                          int *bytesWritten);

    is used to return counter values that have been sampled for a monitor
    session.  If <pname> is PERFMON_RESULT_AVAILABLE_AMD, then <data> will
    indicate whether the result is available or not.  If <pname> is
    PERFMON_RESULT_SIZE_AMD, <data> will contain the actual size, in bytes,
    of all counter results being sampled.  If <pname> is
    PERFMON_RESULT_AMD, <data> will contain results.  For each counter of a
    group that was selected to be sampled, the information is returned as
    group ID, followed by counter ID, followed by counter value.  The size
    of the counter value returned will depend on the counter value type.
    The argument <dataSize> specifies the number of bytes available in the
    <data> buffer for writing.  If <bytesWritten> is not NULL, it gives the
    number of bytes written into the <data> buffer.  It is an
    INVALID_OPERATION error for <data> to be NULL.  If <pname> is
    PERFMON_RESULT_AMD and <dataSize> is less than the number of bytes
    required to store the results as reported by a PERFMON_RESULT_SIZE_AMD
    query, then results will be written only up to the number of bytes
    specified by <dataSize>.

    If no BeginPerfMonitorAMD/EndPerfMonitorAMD has been issued for a
    monitor, then the result of querying for PERFMON_RESULT_AVAILABLE_AMD
    and PERFMON_RESULT_SIZE_AMD will be 0.
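
    The query sequence described above (availability first, then size,
    then results) might be wrapped as in the following C sketch.  It is
    illustrative only and not part of the extension: CounterDataFn,
    tryReadMonitor, g_fakeReady and fakeGetData are hypothetical names,
    the fixed-width integer types stand in for the GL types, and the fake
    driver exists only so the sketch can run without a GL context.  In a
    real program the function pointer would refer to
    glGetPerfMonitorCounterDataAMD.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Token values from the New Tokens section. */
#define GL_PERFMON_RESULT_AVAILABLE_AMD 0x8BC4
#define GL_PERFMON_RESULT_SIZE_AMD      0x8BC5
#define GL_PERFMON_RESULT_AMD           0x8BC6

/* Hypothetical pointer type shaped like GetPerfMonitorCounterDataAMD. */
typedef void (*CounterDataFn)(uint32_t monitor, uint32_t pname,
                              int32_t dataSize, uint32_t *data,
                              int32_t *bytesWritten);

/* Non-blocking read: returns a malloc'd result buffer and stores its byte
 * count in *bytesOut, or returns NULL if no result is available yet (or
 * if allocation fails).  The caller frees the buffer. */
uint32_t *tryReadMonitor(CounterDataFn getData, uint32_t monitor,
                         int32_t *bytesOut)
{
    uint32_t available = 0, size = 0;
    uint32_t *buf;

    assert(getData != NULL && bytesOut != NULL);
    *bytesOut = 0;

    /* Step 1: is a result available at all? */
    getData(monitor, GL_PERFMON_RESULT_AVAILABLE_AMD,
            (int32_t)sizeof available, &available, NULL);
    if (!available)
        return NULL;

    /* Step 2: how many bytes does the result need? */
    getData(monitor, GL_PERFMON_RESULT_SIZE_AMD,
            (int32_t)sizeof size, &size, NULL);
    if (size == 0)
        return NULL;

    /* Step 3: fetch the result records. */
    buf = (uint32_t *)malloc(size);
    if (buf == NULL)
        return NULL;
    getData(monitor, GL_PERFMON_RESULT_AMD, (int32_t)size, buf, bytesOut);
    return buf;
}

/* Fake driver, for illustration only: exposes one 12-byte record that
 * becomes available once g_fakeReady is nonzero. */
int g_fakeReady = 0;

void fakeGetData(uint32_t monitor, uint32_t pname, int32_t dataSize,
                 uint32_t *data, int32_t *bytesWritten)
{
    static const uint32_t record[3] = { 1, 7, 42 };
    int32_t written = 0;

    (void)monitor;
    if (pname == GL_PERFMON_RESULT_AMD && dataSize > 0) {
        /* Write only as many bytes as the caller's buffer can hold. */
        written = dataSize < (int32_t)sizeof record
                      ? dataSize : (int32_t)sizeof record;
        memcpy(data, record, (size_t)written);
    } else if (pname == GL_PERFMON_RESULT_AVAILABLE_AMD && dataSize >= 4) {
        data[0] = (uint32_t)(g_fakeReady != 0);
        written = 4;
    } else if (pname == GL_PERFMON_RESULT_SIZE_AMD && dataSize >= 4) {
        data[0] = (uint32_t)sizeof record;
        written = 4;
    }
    if (bytesWritten != NULL)
        *bytesWritten = written;
}
```

    Returning NULL instead of spinning on the availability query keeps the
    caller free to continue rendering and retry later, which fits the
    asynchronous behavior described for BeginPerfMonitorAMD.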

    When SelectPerfMonitorCountersAMD is called on a monitor, the results
    stored for the monitor become invalidated and the value of
    PERFMON_RESULT_AVAILABLE_AMD and PERFMON_RESULT_SIZE_AMD queries should
    behave as if no BeginPerfMonitorAMD/EndPerfMonitorAMD has been issued
    for the monitor.

Errors

    INVALID_OPERATION error will be generated if BeginPerfMonitorAMD is
    unable to begin monitoring with the currently selected counters.

    INVALID_OPERATION error will be generated if BeginPerfMonitorAMD is
    called when a performance monitor is already active.

    INVALID_OPERATION error will be generated if EndPerfMonitorAMD is
    called when a performance monitor is not currently started.

    INVALID_VALUE error will be generated if the <group> parameter to
    GetPerfMonitorCountersAMD, GetPerfMonitorGroupStringAMD,
    GetPerfMonitorCounterStringAMD, GetPerfMonitorCounterInfoAMD, or
    SelectPerfMonitorCountersAMD does not reference a valid group ID.

    INVALID_VALUE error will be generated if the <counter> parameter to
    GetPerfMonitorCounterInfoAMD does not reference a valid counter ID in
    the group specified by <group>.

    INVALID_VALUE error will be generated if any of the monitor IDs in the
    <monitors> parameter to DeletePerfMonitorsAMD do not reference a valid
    generated monitor ID.

    INVALID_VALUE error will be generated if the <monitor> parameter to
    SelectPerfMonitorCountersAMD does not reference a monitor created by
    GenPerfMonitorsAMD.

    INVALID_VALUE error will be generated if the <numCounters> parameter to
    SelectPerfMonitorCountersAMD is less than 0.
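
    The PERFMON_RESULT_AMD record layout (group ID, then counter ID, then
    counter value) can be walked as in the C sketch below.  The sketch
    assumes that UNSIGNED_INT, FLOAT and PERCENTAGE_AMD values occupy one
    32-bit word and UNSIGNED_INT64_AMD values occupy two; this matches the
    word strides used in the Sample Usage code but is an assumption, not
    normative text.  CounterSample, CounterTypeFn, decodeCounterData and
    demoTypeOf are hypothetical names; in a real program the type lookup
    would wrap glGetPerfMonitorCounterInfoAMD with COUNTER_TYPE_AMD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Token values from the New Tokens section. */
#define GL_UNSIGNED_INT        0x1405
#define GL_FLOAT               0x1406
#define GL_UNSIGNED_INT64_AMD  0x8BC2
#define GL_PERCENTAGE_AMD      0x8BC3

/* Hypothetical record type for one decoded counter sample. */
typedef struct {
    uint32_t group;
    uint32_t counter;
    uint32_t type;
    union { uint32_t u32; uint64_t u64; float f32; } value;
} CounterSample;

/* Hypothetical callback returning the counter type for (group, counter). */
typedef uint32_t (*CounterTypeFn)(uint32_t group, uint32_t counter);

/* Decodes up to maxSamples records from a result buffer of `bytes` bytes.
 * Returns the number of complete records decoded.  Decoding stops at the
 * first truncated record or unknown counter type. */
size_t decodeCounterData(const uint32_t *data, size_t bytes,
                         CounterTypeFn typeOf,
                         CounterSample *out, size_t maxSamples)
{
    size_t words = bytes / sizeof(uint32_t);
    size_t pos = 0, n = 0;

    assert(data != NULL && typeOf != NULL && out != NULL);

    while (n < maxSamples && pos + 2 <= words) {
        CounterSample s;
        size_t valueWords;

        s.group   = data[pos];
        s.counter = data[pos + 1];
        s.type    = typeOf(s.group, s.counter);

        switch (s.type) {
        case GL_UNSIGNED_INT64_AMD: valueWords = 2; break;
        case GL_UNSIGNED_INT:
        case GL_FLOAT:
        case GL_PERCENTAGE_AMD:     valueWords = 1; break;
        default:                    return n;   /* unknown type */
        }
        if (pos + 2 + valueWords > words)
            return n;                           /* truncated record */

        /* memcpy avoids the alignment and aliasing problems of casting
         * a uint32_t pointer to a 64-bit or float pointer. */
        memset(&s.value, 0, sizeof s.value);
        memcpy(&s.value, &data[pos + 2], valueWords * sizeof(uint32_t));

        out[n++] = s;
        pos += 2 + valueWords;
    }
    return n;
}

/* Stand-in type lookup, for illustration only: group 1 counters are
 * FLOAT, group 2 counters are UNSIGNED_INT64_AMD, anything else is
 * reported as unknown (0). */
uint32_t demoTypeOf(uint32_t group, uint32_t counter)
{
    (void)counter;
    if (group == 1) return GL_FLOAT;
    if (group == 2) return GL_UNSIGNED_INT64_AMD;
    return 0;
}
```

    Because a short <dataSize> can cut the buffer off partway through a
    record, the decoder checks each record's full length before reading
    its value rather than trusting the byte count alone.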

New State

Sample Usage

    typedef struct
    {
        GLuint *counterList;
        int     numCounters;
        int     maxActiveCounters;
    } CounterInfo;

    void
    getGroupAndCounterList(GLuint **groupsList, int *numGroups,
                           CounterInfo **counterInfo)
    {
        GLint          n;
        GLuint        *groups;
        CounterInfo   *counters;

        // Query the number of groups, then fetch the group IDs
        glGetPerfMonitorGroupsAMD(&n, 0, NULL);
        groups = (GLuint*) malloc(n * sizeof(GLuint));
        glGetPerfMonitorGroupsAMD(NULL, n, groups);
        *numGroups = n;

        *groupsList = groups;
        counters = (CounterInfo*) malloc(sizeof(CounterInfo) * n);
        for (int i = 0 ; i < n; i++ )
        {
            // Query the counter count and active limit for this group
            glGetPerfMonitorCountersAMD(groups[i], &counters[i].numCounters,
                                        &counters[i].maxActiveCounters,
                                        0, NULL);

            counters[i].counterList =
                (GLuint*) malloc(counters[i].numCounters * sizeof(GLuint));

            // Fetch the counter IDs for this group
            glGetPerfMonitorCountersAMD(groups[i], NULL, NULL,
                                        counters[i].numCounters,
                                        counters[i].counterList);
        }

        *counterInfo = counters;
    }

    static int  countersInitialized = 0;

    int
    getCounterByName(char *groupName, char *counterName, GLuint *groupID,
                     GLuint *counterID)
    {
        // Cached across calls; filled in on the first call only
        static int          numGroups;
        static GLuint      *groups;
        static CounterInfo *counters;
        int                 i = 0;

        if (!countersInitialized)
        {
            getGroupAndCounterList(&groups, &numGroups, &counters);
            countersInitialized = 1;
        }

        for ( i = 0; i < numGroups; i++ )
        {
            char curGroupName[256];
            glGetPerfMonitorGroupStringAMD(groups[i], 256, NULL,
                                           curGroupName);
            if (strcmp(groupName, curGroupName) == 0)
            {
                *groupID = groups[i];
                break;
            }
        }

        if ( i == numGroups )
            return -1;           // error - could not find the group name

        for ( int j = 0; j < counters[i].numCounters; j++ )
        {
            char curCounterName[256];

            glGetPerfMonitorCounterStringAMD(groups[i],
                                             counters[i].counterList[j],
                                             256, NULL, curCounterName);
            if (strcmp(counterName, curCounterName) == 0)
            {
                *counterID = counters[i].counterList[j];
                return 0;
            }
        }

        return -1;           // error - could not find the counter name
    }

    void
    drawFrameWithCounters(void)
    {
        GLuint group[2];
        GLuint counter[2];
        GLuint monitor;
        GLuint *counterData;

        // Get group/counter IDs by name.
        // Note that normally the counter and group names need to be
        // queried for because each implementation of this extension on
        // different hardware could define different names and groups.
        // This is just provided to demonstrate the API.
        getCounterByName("HW", "Hardware Busy", &group[0],
                         &counter[0]);
        getCounterByName("API", "Draw Calls", &group[1],
                         &counter[1]);

        // create perf monitor ID
        glGenPerfMonitorsAMD(1, &monitor);

        // enable the counters
        glSelectPerfMonitorCountersAMD(monitor, GL_TRUE, group[0], 1,
                                       &counter[0]);
        glSelectPerfMonitorCountersAMD(monitor, GL_TRUE, group[1], 1,
                                       &counter[1]);

        glBeginPerfMonitorAMD(monitor);

        // RENDER FRAME HERE
        // ...

        glEndPerfMonitorAMD(monitor);

        // read the counters
        GLuint resultSize;
        glGetPerfMonitorCounterDataAMD(monitor, GL_PERFMON_RESULT_SIZE_AMD,
                                       sizeof(GLuint), &resultSize, NULL);

        counterData = (GLuint*) malloc(resultSize);

        GLint bytesWritten;
        glGetPerfMonitorCounterDataAMD(monitor, GL_PERFMON_RESULT_AMD,
                                       resultSize, counterData,
                                       &bytesWritten);

        // display or log counter info
        GLsizei wordCount = 0;

        while ( (4 * wordCount) < bytesWritten )
        {
            GLuint groupId = counterData[wordCount];
            GLuint counterId = counterData[wordCount + 1];

            // Determine the counter type
            GLuint counterType;
            glGetPerfMonitorCounterInfoAMD(groupId, counterId,
                                           GL_COUNTER_TYPE_AMD,
                                           &counterType);

            if ( counterType == GL_UNSIGNED_INT64_AMD )
            {
                unsigned __int64 counterResult =
                    *(unsigned __int64*)(&counterData[wordCount + 2]);

                // Print counter result

                wordCount += 4;
            }
            else if ( counterType == GL_FLOAT ||
                      counterType == GL_PERCENTAGE_AMD )
            {
                // PERCENTAGE_AMD is a float in the range [0.0 .. 100.0]
                float counterResult = *(float*)(&counterData[wordCount + 2]);

                // Print counter result

                wordCount += 3;
            }
            else if ( counterType == GL_UNSIGNED_INT )
            {
                GLuint counterResult = counterData[wordCount + 2];

                // Print counter result

                wordCount += 3;
            }
            else
            {
                break;       // unknown counter type - stop parsing
            }
        }

        // release the result buffer and the monitor
        free(counterData);
        glDeletePerfMonitorsAMD(1, &monitor);
    }

Revision History

    11/29/2007 - dginsburg
        + Clarified the default state of a performance monitor object on
          creation

    11/09/2007 - dginsbur
        + Clarify what happens if SelectPerfMonitorCountersAMD is called on
          a monitor with outstanding query results.
        + Rename counterSize to countersSize
        + Remove some ';' typos

    06/13/2007 - dginsbur
        + Add language on the asynchronous nature of the API and
          counter accuracy/performance distortion.
        + Add myself as the contact
        + Remove INVALID_OPERATION error when countersList is NULL
        + Clarify 64-bit issue
        + Make PERCENTAGE_AMD counters float rather than uint
        + Clarify accuracy distortion on variant counters only
        + Tweak to overview language

    06/09/2007 - dginsbur
        + Fill in errors section and make many more errors explicit
        + Fix the example code so it compiles

    06/08/2007 - dginsbur
        + Modified GetPerfMonitorGroupString and
          GetPerfMonitorCounterString to be more client/server
          friendly.
        + Modified example.
        + Renamed parameters/variables to follow GL conventions.
        + Modified several 'int' param types to 'sizei'
        + Modified counters type from 'int' to 'uint'
        + Renamed argument 'cb' and 'cbret'
        + Better documented GetPerfMonitorCounterData
        + Add AMD adornment in many places that were missing it

    06/07/2007 - dginsbur
        + Cleanup formatting, remove tabs, make fit in proper page width
        + Add FLOAT and UNSIGNED_INT to list of COUNTER_TYPEs
        + Fix some bugs in the example code
        + Rewrite introduction
        + Clarified Issue 1 reasoning
        + Added Issue 3 regarding use of 64-bit data types
        + Added revision history

    03/21/2007 - Initial version written.  Written by amunshi.