Part of the Khronos Group
OpenGL.org

Thread: When you use uniform buffer objects...

  1. #1
    Junior Member Newbie
    Join Date
    Feb 2016
    Posts
    24

    When you use uniform buffer objects...

    Last week I had a frustrating problem with depth testing. My program worked perfectly on an NVIDIA card (driver version 364.72) but not on AMD (driver version 15.201.1001.1005). It turned out that the problem was caused by the arrays inside a GLSL uniform block. It is due to the bug on AMD driver, I think, or maybe my original code didn't obey OpenGL specification and AMD couldn't swallow it. Anyway, the bug was hard to find, because neither glGetError nor the shader and program info logs reported any errors, and the data inside the uniform block seemed to be valid. It only seemed so: the data in the uniform block arrays was just gibberish on the AMD card. To help others avoid this bug, I'll show below which approach works and which doesn't. Let's start with the wrong one:

    NOTE: CODE BELOW DOESN'T WORK ON AMD CARD, BUT IT WILL WORK ON NVIDIA

    In a shader program we have a uniform block:

    Code :
    layout(shared) uniform TILES {
      int tiiliaX;
      int tiiliaY;
      int fmodX;
      int fmodY;
      float horizontalLevel[4*(MAKSIMIRIVI/BLOCK_SIZE+2)];   //this is an array which won't work on AMD
      float verticalLevel[4*(MAKSIMIRIVI/BLOCK_SIZE+2)];     //this is an array which won't work on AMD
    };

    and its counterpart on the CPU side is

    Code :
    	struct TILES {
    	  int tiiliaX;
    	  int tiiliaY;
    	  int fmodX;
    	  int fmodY;
    	  float horizontalLevel[4*(MAKSIMIRIVI/BLOCK_SIZE+2)];
    	  float verticalLevel[4*(MAKSIMIRIVI/BLOCK_SIZE+2)];
    	  GLint locations[6];
    	  GLuint bindingPoint;
    	} tiles;

    and we want to use it on three separate shader programs:

    Code :
    // let's assign a binding point
      ubIndex=glGetUniformBlockIndex(passes[5].ohjelmaID, "TILES");
      glUniformBlockBinding(passes[5].ohjelmaID, ubIndex, tiles.bindingPoint);
      ubIndex=glGetUniformBlockIndex(passes[6].ohjelmaID, "TILES");
      glUniformBlockBinding(passes[6].ohjelmaID, ubIndex, tiles.bindingPoint);
      ubIndex=glGetUniformBlockIndex(passes[7].ohjelmaID, "TILES");
      glUniformBlockBinding(passes[7].ohjelmaID, ubIndex, tiles.bindingPoint);
     
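    // here we create a single uniform buffer
    // (block indices are per program; ubIndex was last queried from passes[7], so ask that program for the size)
      glGetActiveUniformBlockiv(passes[7].ohjelmaID, ubIndex, GL_UNIFORM_BLOCK_DATA_SIZE, &uboSize2);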
      glGenBuffers(1, &uboBuffer2);
      glBindBuffer(GL_UNIFORM_BUFFER, uboBuffer2);
      glBufferData(GL_UNIFORM_BUFFER, uboSize2, NULL, GL_DYNAMIC_DRAW);
     
    // this buffer lives on the CPU side: we collect all uniform block data in it first and then send it to the GPU in a single call
      cpuBuffer3=(GLubyte*) malloc(uboSize2);
     
    // query the byte offset of every block member
      const GLchar *muuttujat2[] = {"tiiliaX", "tiiliaY", "fmodX", "fmodY", "horizontalLevel", "verticalLevel"};
      GLuint indeksit2[6];
      glGetUniformIndices(passes[6].ohjelmaID, 6, muuttujat2, indeksit2);
      glGetActiveUniformsiv(passes[6].ohjelmaID, 6, indeksit2, GL_UNIFORM_OFFSET, tiles.locations);
      glBindBuffer(GL_UNIFORM_BUFFER, 0);

    To fill the uniform block above, use the code below:

    Code :
    	glBindBuffer(GL_UNIFORM_BUFFER, uboBuffer2);
    	glBindBufferBase(GL_UNIFORM_BUFFER, tiles.bindingPoint, uboBuffer2);
     
    	// tiles.locations follows the order of the name list: tiiliaX, tiiliaY, fmodX, fmodY, horizontalLevel, verticalLevel
    	memcpy(cpuBuffer3 + tiles.locations[0], &tiles.tiiliaX, sizeof(GLint));
    	memcpy(cpuBuffer3 + tiles.locations[1], &tiles.tiiliaY, sizeof(GLint));
    	memcpy(cpuBuffer3 + tiles.locations[2], &tiles.fmodX, sizeof(GLint));
    	memcpy(cpuBuffer3 + tiles.locations[3], &tiles.fmodY, sizeof(GLint));
    	memcpy(cpuBuffer3 + tiles.locations[4], tiles.horizontalLevel, 4*(MAKSIMIRIVI/BLOCK_SIZE+2)*sizeof(GLfloat));
    	memcpy(cpuBuffer3 + tiles.locations[5], tiles.verticalLevel, 4*(MAKSIMIRIVI/BLOCK_SIZE+2)*sizeof(GLfloat));
     
    	glBufferData( GL_UNIFORM_BUFFER, uboSize2, cpuBuffer3, GL_DYNAMIC_DRAW );

    Okay, that was the original version, which works on NVIDIA but not on AMD. Next, take a look at the source that works on both NVIDIA and AMD:

    NOTE: CODE BELOW WORKS ON BOTH NVIDIA AND AMD

    In a shader program we have a uniform block

    Code :
    layout(std140, shared, column_major) uniform TILES {
      ivec4 tiilet;
      vec4 horizontalLevel[MAKSIMIRIVI/BLOCK_SIZE+2];
      vec4 verticalLevel[MAKSIMIRIVI/BLOCK_SIZE+2];
    };


    and its counterpart on the CPU side is

    Code :
    	struct TILES {
    	  float horizontalLevel[4*(MAKSIMIRIVI/BLOCK_SIZE+2)];
    	  float verticalLevel[4*(MAKSIMIRIVI/BLOCK_SIZE+2)];
    	  int tiilet[4];
    	  GLint locations[3];
    	  GLuint bindingPoint;
    	  GLint sizes[3];
    	} tiles;

    and we want to use it on three separate shader programs:

    Code :
    // let's assign a binding point
      ubIndex=glGetUniformBlockIndex(passes[5].ohjelmaID, "TILES");
      glUniformBlockBinding(passes[5].ohjelmaID, ubIndex, tiles.bindingPoint);
      ubIndex=glGetUniformBlockIndex(passes[6].ohjelmaID, "TILES");
      glUniformBlockBinding(passes[6].ohjelmaID, ubIndex, tiles.bindingPoint);
      ubIndex=glGetUniformBlockIndex(passes[7].ohjelmaID, "TILES");
      glUniformBlockBinding(passes[7].ohjelmaID, ubIndex, tiles.bindingPoint);
     
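    // here we create a single uniform buffer
    // (block indices are per program; ubIndex was last queried from passes[7], so ask that program for the size)
      glGetActiveUniformBlockiv(passes[7].ohjelmaID, ubIndex, GL_UNIFORM_BLOCK_DATA_SIZE, &uboSize2);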
      glGenBuffers(1, &uboBuffer2);
      glBindBuffer(GL_UNIFORM_BUFFER, uboBuffer2);
      glBufferData(GL_UNIFORM_BUFFER, uboSize2, NULL, GL_DYNAMIC_DRAW);
     
     
    // query the byte offset and array size of every block member
      const GLchar *muuttujat2[] = {"tiilet", "horizontalLevel", "verticalLevel"};
      GLuint indeksit2[3];
      glGetUniformIndices(passes[6].ohjelmaID, 3, muuttujat2, indeksit2);
      glGetActiveUniformsiv(passes[6].ohjelmaID, 3, indeksit2, GL_UNIFORM_OFFSET, tiles.locations);
      glGetActiveUniformsiv(passes[6].ohjelmaID, 3, indeksit2, GL_UNIFORM_SIZE, tiles.sizes);
      glBindBuffer(GL_UNIFORM_BUFFER, 0);
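
    One thing worth checking after the queries above: glGetUniformIndices does not raise a GL error for a name that matches no active uniform. It writes GL_INVALID_INDEX into the output array instead, so a misspelled member name is easy to miss. A minimal sketch of such a check follows. The helper name is hypothetical, and the constant is redefined locally (with the same value as GL_INVALID_INDEX) so the snippet compiles without GL headers; it assumes GLuint is a 32-bit unsigned int.

```c
#include <assert.h>
#include <stddef.h>

/* Same value OpenGL defines for GL_INVALID_INDEX; redefined here so the
   sketch compiles without GL headers. */
#define INVALID_INDEX 0xFFFFFFFFu

/* Returns the position of the first name that was not found,
   or -1 if every queried index is valid. */
int first_invalid_index(const unsigned int *indices, size_t count)
{
    assert(indices != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (indices[i] == INVALID_INDEX)
            return (int)i;
    }
    return -1;
}
```

    Calling first_invalid_index(indeksit2, 3) right after glGetUniformIndices would then point at the offending entry of muuttujat2, if any.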

    To fill the uniform block above, use the code below:

    Code :
    	glBindBuffer(GL_UNIFORM_BUFFER, uboBuffer2);
    	glBindBufferBase(GL_UNIFORM_BUFFER, tiles.bindingPoint, uboBuffer2);
     
    	glBufferSubData(GL_UNIFORM_BUFFER, tiles.locations[0], tiles.sizes[0]*4*sizeof(GLint), tiles.tiilet);
    	glBufferSubData(GL_UNIFORM_BUFFER, tiles.locations[1], tiles.sizes[1]*4*sizeof(GLfloat), tiles.horizontalLevel);
    	glBufferSubData(GL_UNIFORM_BUFFER, tiles.locations[2], tiles.sizes[2]*4*sizeof(GLfloat), tiles.verticalLevel);
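
    The CPU struct still stores float[4*N] while the shader declares vec4[N]. Since a vec4 is four consecutive floats, the packed CPU array can be uploaded unchanged as long as the vec4 array stride is 16 bytes, which is what the upload code above assumes. The shader then finds the original float number i in element i / 4, component i % 4. A small sketch of that mapping, with hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>

/* Position of one float inside an array of vec4. */
typedef struct {
    size_t element;    /* index into the vec4 array */
    size_t component;  /* 0 = x, 1 = y, 2 = z, 3 = w */
} Vec4Slot;

/* Maps an index into the packed CPU float array to its vec4 slot. */
Vec4Slot vec4_slot_for(size_t floatIndex)
{
    Vec4Slot slot;
    slot.element = floatIndex / 4;
    slot.component = floatIndex % 4;
    return slot;
}

/* Number of vec4 elements needed for floatCount packed floats.
   The packing only works when the count is a multiple of four. */
size_t vec4_count_for(size_t floatCount)
{
    assert(floatCount % 4 == 0);
    return floatCount / 4;
}
```

    In GLSL the same lookup would read horizontalLevel[i / 4][i % 4].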

    While there is std140 layout, you cannot rely on it. Actually code above doesn't need it, it's just to make AMD happy.

    Have a nice day
    Last edited by mamannon; 04-18-2016 at 02:33 AM. Reason: fixes

  2. #2
    Senior Member OpenGL Lord
    Join Date
    May 2009
    Posts
    5,906
    While there is std140 layout, you cannot rely on it.
    No, you can rely on `std140` layout. Your code simply didn't.

    Code :
    layout(shared) uniform TILES

    Code :
    layout(std140, shared, column_major) uniform TILES

    Both of these use `shared`, not `std140` (FYI: if you put multiple values that are mutually exclusive in `layout`, then the last one takes precedence). If you use `shared`, you must query the shader for the offsets, strides, and so forth for everything.
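
    To illustrate the point about strides: with `shared` layout the distance between consecutive array elements is implementation-defined, so a float array may have a stride of 4 bytes on one driver and 16 on another. A single packed memcpy is only correct in the first case. Below is a minimal sketch of a stride-aware copy into a CPU staging buffer. The helper name is hypothetical, and `offset` and `stride` are assumed to be the values returned by glGetActiveUniformsiv for GL_UNIFORM_OFFSET and GL_UNIFORM_ARRAY_STRIDE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies `count` tightly packed floats from `src` into the staging buffer
   `dst`, placing element i at byte `offset + i * stride`.
   `offset` and `stride` are assumed to come from glGetActiveUniformsiv
   queries with GL_UNIFORM_OFFSET and GL_UNIFORM_ARRAY_STRIDE. */
void copy_float_array_strided(unsigned char *dst, size_t offset, size_t stride,
                              const float *src, size_t count)
{
    assert(stride >= sizeof(float));
    for (size_t i = 0; i < count; ++i)
        memcpy(dst + offset + i * stride, &src[i], sizeof(float));
}
```

    With a stride of 4 this behaves like one packed memcpy; with a stride of 16 each float lands at the start of its own 16-byte slot.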

    That's not to say that there aren't issues with some drivers' support for `std140` (though these are primarily around the use of 3-element vectors). But before you can claim you're getting a driver bug, you have to actually be using `std140` layout. And you aren't.
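
    For comparison, here is roughly what `std140` would guarantee for the first TILES block (four ints followed by two float arrays) if it were actually in effect. Under the std140 rules, every array element is padded to the alignment of a vec4, so each float occupies a 16-byte slot. The sketch below computes the resulting offsets; the struct and function names are hypothetical, and `elementCount` stands for 4*(MAKSIMIRIVI/BLOCK_SIZE+2), whose value the thread does not give.

```c
#include <assert.h>
#include <stddef.h>

enum { STD140_VEC4_ALIGN = 16 };  /* base alignment of a vec4 in bytes */

/* Byte offsets of the members of the first TILES block under std140. */
typedef struct {
    size_t tiiliaX, tiiliaY, fmodX, fmodY;
    size_t horizontalLevel, verticalLevel;
    size_t arrayStride;  /* bytes between consecutive float elements */
    size_t blockSize;    /* total size of the block in bytes */
} Std140TilesLayout;

static size_t align_up(size_t value, size_t alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

Std140TilesLayout std140_tiles_layout(size_t elementCount)
{
    Std140TilesLayout l;
    size_t cursor = 0;
    assert(elementCount > 0);

    /* Scalars are aligned to their own size (4 bytes). */
    l.tiiliaX = cursor; cursor += 4;
    l.tiiliaY = cursor; cursor += 4;
    l.fmodX   = cursor; cursor += 4;
    l.fmodY   = cursor; cursor += 4;

    /* Arrays: element alignment and stride are rounded up to a vec4. */
    l.arrayStride = STD140_VEC4_ALIGN;
    cursor = align_up(cursor, STD140_VEC4_ALIGN);
    l.horizontalLevel = cursor; cursor += elementCount * l.arrayStride;
    l.verticalLevel   = cursor; cursor += elementCount * l.arrayStride;

    l.blockSize = align_up(cursor, STD140_VEC4_ALIGN);
    return l;
}
```

    For eight elements per array this gives offsets 16 and 144 for the two arrays and a block size of 272 bytes, four times what tightly packed floats would need. Packing four floats into each vec4 avoids that padding.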

  3. #3
    Junior Member Newbie
    Join Date
    Feb 2016
    Posts
    24
    Quote Originally Posted by Alfonse Reinheart
    No, you can rely on `std140` layout. Your code simply didn't.
    Yes, that's true... I tested it without 'std140' and it worked. While debugging I just added all the bells and whistles to get it working.

    I didn't claim I found a driver bug; I only suggested that the problem lay either in my code or in the driver code. However, say what you say, AMD driver is limited compared to NVIDIA.

  4. #4
    Senior Member OpenGL Lord
    Join Date
    May 2009
    Posts
    5,906
    Quote Originally Posted by mamannon
    I tested it without 'std140' and it worked.
    You got lucky; that doesn't mean your code works. Just like a wild write won't necessarily crash your program, but that doesn't mean your program "works".

    Quote Originally Posted by mamannon
    I didn't claim I found a driver bug
    "It is due to the bug on AMD driver,..." I don't know how to interpret that as anything except an accusation of there possibly being a bug in AMD's drivers.

  5. #5
    Junior Member Newbie
    Join Date
    Feb 2016
    Posts
    24
    Quote Originally Posted by Alfonse Reinheart
    "It is due to the bug on AMD driver,..." I don't know how to interpret that as anything except an accusation of there possibly being a bug in AMD's drivers.
    You should read my original sentence to the end.

  6. #6
    Senior Member OpenGL Pro
    Join Date
    Jan 2007
    Posts
    1,681
    Reading the original sentence, to the end, we get:
    It is due to the bug on AMD driver, I think, or maybe my original code didn't obey OpenGL specification and AMD couldn't swallow it.
    However we also see statements like "Actually code above doesn't need it, it's just to make AMD happy" and "say what you say, AMD driver is limited compared to NVIDIA".
