Thread: Shader run time

  1. #1
    Junior Member Newbie
    Join Date
    Mar 2013

    Shader run time

    I'm back for more. This time I'm confused by some results I'm getting while measuring the performance of my shaders. The interesting part of my shader looked like this:

    Code glsl:
    for(int i = 0; i < probeSurfelCount; i++){
    	// Flattened index of the i-th surfel for the probe selected by index.y
    	int probeIndex = probeSurfelCount*index.y + i;
    	int surfIndex = surfelIndex[probeIndex];
    	color += colorIn[surfIndex].rgb * weightGroup[probeIndex].weights[index.x];
    }

    Here surfelIndex, colorIn and weightGroup are all buffer objects I'm using to pass data (weights is just a float[6]). In this configuration the run time was 1.8 ms. To test a theory, I removed the weight from the "color +=" line and the time dropped to 1.4 ms. Replacing the weight with a static float gave the same result as removing it, so I assumed the cost came from the lookup itself and that limiting the number of lookups into the large buffers should reduce the run time.
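For checking that a rewritten shader still produces the same values, a CPU-side reference of the loop above can be handy. The following is only a sketch under assumptions: the `Color` and `WeightGroup` types, the function name `gather_probe_color`, and the choice to start the accumulator at zero are illustrative and not taken from the original code.

```c
#include <assert.h>

/* Hypothetical host-side mirrors of the shader buffers. */
typedef struct { float r, g, b, a; } Color;        /* one colorIn[] element */
typedef struct { float weights[6]; } WeightGroup;  /* one weightGroup[] element */

/* Accumulate the weighted surfel colors for the probe selected by index_y
   and the weight slot selected by index_x, following the shader loop.
   Assumption: the accumulator starts at zero. */
static void gather_probe_color(const int *surfelIndex, const Color *colorIn,
                               const WeightGroup *weightGroup,
                               int probeSurfelCount, int index_x, int index_y,
                               float out_rgb[3])
{
    assert(probeSurfelCount >= 0);
    assert(index_x >= 0 && index_x < 6);

    out_rgb[0] = out_rgb[1] = out_rgb[2] = 0.0f;
    for (int i = 0; i < probeSurfelCount; i++) {
        int probeIndex = probeSurfelCount * index_y + i;
        int surfIndex = surfelIndex[probeIndex];
        float w = weightGroup[probeIndex].weights[index_x];
        out_rgb[0] += colorIn[surfIndex].r * w;
        out_rgb[1] += colorIn[surfIndex].g * w;
        out_rgb[2] += colorIn[surfIndex].b * w;
    }
}
```

Comparing this output against a read-back of the shader result for a few probes is one way to confirm that a restructured buffer layout has not changed what the loop computes.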

    So after some changes in the rest of my code the buffers surfelIndex and weightGroup were combined into a single buffer. The shader changed to reflect this and ended up with:

    Code glsl:
    for(int i = 0; i < probeSurfelCount; i++){
    	int probeIndex = probeSurfelCount*index.y + i;
    	// Copy the whole struct (index plus all six weights) into a local
    	SurfelRef temp = surfelRefs[probeIndex];
    	int surfIndex = temp.index;
    	color += colorIn[surfIndex].rgb * temp.weights[index.x];
    }

    Where a SurfelRef is a struct that looks as follows:

    Code glsl:
    struct SurfelRef {
    	int index;
    	float[6] weights;
    };

    However, this increased the run time to 3.8 ms. Removing the weight from the "color +=" line again brought it back down to 1.4 ms. I would expect "SurfelRef temp = surfelRefs[probeIndex]" to eliminate the lookup cost for the weights almost completely, so I am fairly confused about why this change increases the run time by so much. If anything, I would expect the run time to go down. If it were related to the size of the buffer going up, I would expect the run time without the weight to increase as well, but that remains the same.
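Since the buffer size comes up here, it may help to work out how many bytes one SurfelRef element occupies. The post does not say which memory layout the buffer block uses, so the sketch below simply computes the element stride for the two standard layouts, std430 and std140, using the GLSL layout rules for a struct holding an int followed by a float[6]. The helper names `align_up` and `surfel_ref_stride` are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of alignment. */
static size_t align_up(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) / alignment * alignment;
}

/* Byte stride of one element of
 *     struct SurfelRef { int index; float weights[6]; };
 * scalar_array_stride is the stride of one float inside the array:
 * 4 under std430, 16 under std140 (array elements are padded to vec4 size).
 * Which layout the real buffer uses is an assumption to be checked. */
static size_t surfel_ref_stride(size_t scalar_array_stride)
{
    assert(scalar_array_stride == 4 || scalar_array_stride == 16);

    const size_t weight_count = 6;
    size_t offset = 4;                                /* int index: bytes 0..3 */
    offset = align_up(offset, scalar_array_stride);   /* start of weights[] */
    offset += weight_count * scalar_array_stride;     /* end of weights[] */
    /* The struct is padded to its largest member alignment, which here
       equals the array's element stride. */
    return align_up(offset, scalar_array_stride);
}
```

Under these rules `surfel_ref_stride(4)` gives 28 bytes per element and `surfel_ref_stride(16)` gives 112 bytes, so copying a whole SurfelRef into a local could touch noticeably more memory than reading a single int and a single float. Whether that accounts for the measured difference is not something the numbers in the post can confirm.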

    Anyone have any ideas or explanations?

    (As a side note, I am using glQueryCounter(queryID[i], GL_TIMESTAMP); and glGetQueryObjecti64v(queryID[0], GL_QUERY_RESULT, &startTime); to get my performance times, so I would assume they are correct.)
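For reference, GL_TIMESTAMP query results are reported in nanoseconds, so the millisecond figures quoted in this post would come from the difference of two such values. A minimal sketch of that conversion follows; the function name `gpu_elapsed_ms` and the variable `endTime` mentioned in the comment are illustrative, not taken from the original code.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a pair of GL_TIMESTAMP results (nanoseconds) to elapsed
 * milliseconds. In the application the two values would be the 64-bit
 * results read back with glGetQueryObjecti64v(..., GL_QUERY_RESULT, ...),
 * for example startTime from queryID[0] and a hypothetical endTime from
 * queryID[1]. */
static double gpu_elapsed_ms(int64_t start_ns, int64_t end_ns)
{
    assert(end_ns >= start_ns);  /* the end query must be issued after the start query */
    return (double)(end_ns - start_ns) / 1.0e6;
}
```

Reading a query with GL_QUERY_RESULT waits until the result is available, so the two timestamps bracket the GPU work issued between the two glQueryCounter calls rather than just the CPU-side submission.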

    (Edit: Minor change to the first code section.)
    Last edited by Mustard; 04-17-2013 at 04:08 AM.
