Uniform Buffers in OpenGL 3.0+

Hi, I have a question: how do I connect a uniform buffer to an array of structs in a vertex shader the modern OpenGL 3.0+ way?
I think glUniformBuffer is deprecated, isn't it?

I understand that I should create and fill the buffer like this:


GLuint vbuffer;
glGenBuffers(1, &vbuffer);
glBindBuffer(GL_UNIFORM_BUFFER_EXT, vbuffer);
glBufferData(GL_UNIFORM_BUFFER_EXT, vsize, tmp,GL_STATIC_READ);

But where can I get the maximum uniform buffer size in bytes?
And how do I connect the buffer with the declaration in the shader source?
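
On the size question: the per-block limit is implementation-dependent and is read with glGetIntegerv(GL_MAX_UNIFORM_BLOCK_SIZE, ...); OpenGL 3.1 and later guarantee at least 16384 bytes. Below is a minimal sketch, not tied to any particular driver, of how many array elements fit in one block. The helper name maxElementsPerBlock and the 32-byte stride in the usage note are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Smallest GL_MAX_UNIFORM_BLOCK_SIZE a GL 3.1+ implementation may report.
const std::size_t kGuaranteedMinBlockBytes = 16384;

// Number of whole array elements that fit in one uniform block.
// maxBlockBytes would come from glGetIntegerv(GL_MAX_UNIFORM_BLOCK_SIZE, &v)
// in a real program; elementStride is the array stride of one struct in bytes.
std::size_t maxElementsPerBlock(std::size_t maxBlockBytes, std::size_t elementStride)
{
    assert(elementStride > 0);
    return maxBlockBytes / elementStride;
}
```

With a struct that occupies 32 bytes per element, maxElementsPerBlock(kGuaranteedMinBlockBytes, 32) gives 512 instances per block on any conforming implementation; the queried limit is often larger.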

Take this reply in context. I would like to learn this too; I am just reading ARB_uniform_buffer_object (http://www.opengl.org/registry/specs/ARB/uniform_buffer_object.txt), which has a section named “Sample Code”. I suggest reading that too.

Assume in your shader code


    layout(std140) uniform colors0
    {
        float DiffuseCool;
        float DiffuseWarm;
        vec3  SurfaceColor;
        vec3  WarmColor;
        vec3  CoolColor;
    };

It looks like the relevant code to “connect it [the uniform buffer object] with the declaration in the shader source” is:


        //Update the uniforms using ARB_uniform_buffer_object
        glGenBuffers(1, &buffer_id);

        //There's only one uniform block here, the 'colors0' uniform block. 
        //It contains the color info for the gooch shader.
        uniformBlockIndex = glGetUniformBlockIndex(prog_id, "colors0");
        
        //We need to get the uniform block's size in order to back it with the
        //appropriate buffer
        glGetActiveUniformBlockiv(prog_id, uniformBlockIndex,
                                     GL_UNIFORM_BLOCK_DATA_SIZE,
                                     &uniformBlockSize);
        glError();
        
        //Create UBO.
        glBindBuffer(GL_UNIFORM_BUFFER, buffer_id);
        glBufferData(GL_UNIFORM_BUFFER, uniformBlockSize,
                     NULL, GL_DYNAMIC_DRAW);

        //Now we attach the buffer to UBO binding point 0...
        glBindBufferBase(GL_UNIFORM_BUFFER, 0, buffer_id);
        //And associate the uniform block to this binding point.
        glUniformBlockBinding(prog_id, uniformBlockIndex, 0);
        glError();

...

// then when you go to render

        glBindBuffer(GL_UNIFORM_BUFFER, buffer_id);
        //We can use BufferData to upload our data to the shader,
        //since we know it's in the std140 layout
        //(my comment: 80 == sizeof(colors))
        glBufferData(GL_UNIFORM_BUFFER, 80, colors, GL_DYNAMIC_DRAW);
        //With a non-standard layout, we'd use BufferSubData for each uniform.
        glBufferSubData(GL_UNIFORM_BUFFER, offset, singleSize, &colors[8]);

Maybe that will give us some hints :)

So I understand that I could use the std140 layout for the UBO, which makes it simple to update all at once but costs VRAM (every component is expanded to float4, I assume?).

The alternative is my own UBO layout with well-packed structures (no holes), but then do I need to call glBufferSubData() for every struct in the UBO? For example, 10 times in a for loop if it holds data for 10 instances?

I think the second approach can also cost some performance because of fetching data that is not float4-aligned.
So in the above example the best approach would be to use std140 with a modified struct:


    layout(std140) uniform colors0
    {
        struct warm 
        {
        float Diffuse;
        vec3  Color;
        } Warm;

        struct cool 
        {
        float Diffuse;
        vec3  Color;
        } Cool;

        vec3  SurfaceColor;
    };

I think it would now be packed as three float4, so we lose
only 4 bytes to padding, and sizeof(colors) should be
3x4x4 = 48.

If something is wrong, please correct me :).

I believe that’s correct per rule 3:

http://www.opengl.org/registry/specs/ARB/uniform_buffer_object.txt

  (1) If the member is a scalar consuming <N> basic machine units, the
      base alignment is <N>.

  (2) If the member is a two- or four-component vector with components
      consuming <N> basic machine units, the base alignment is 2<N> or
      4<N>, respectively.

  (3) If the member is a three-component vector with components consuming
      <N> basic machine units, the base alignment is 4<N>.

  (4) If the member is an array of scalars or vectors, the base alignment
      and array stride are set to match the base alignment of a single
      array element, according to rules (1), (2), and (3), and rounded up
      to the base alignment of a vec4. The array may have padding at the
      end; the base offset of the member following the array is rounded up
      to the next multiple of the base alignment.

  (5) If the member is a column-major matrix with <C> columns and <R>
      rows, the matrix is stored identically to an array of <C> column
      vectors with <R> components each, according to rule (4).

  (6) If the member is an array of <S> column-major matrices with <C>
      columns and <R> rows, the matrix is stored identically to a row of
      <S>*<C> column vectors with <R> components each, according to rule
      (4).

  (7) If the member is a row-major matrix with <C> columns and <R> rows,
      the matrix is stored identically to an array of <R> row vectors
      with <C> components each, according to rule (4).

  (8) If the member is an array of <S> row-major matrices with <C> columns
      and <R> rows, the matrix is stored identically to a row of <S>*<R>
      row vectors with <C> components each, according to rule (4).

  (9) If the member is a structure, the base alignment of the structure is
      <N>, where <N> is the largest base alignment value of any of its
      members, and rounded up to the base alignment of a vec4. The
      individual members of this sub-structure are then assigned offsets
      by applying this set of rules recursively, where the base offset of
      the first member of the sub-structure is equal to the aligned offset
      of the structure. The structure may have padding at the end; the
      base offset of the member following the sub-structure is rounded up
      to the next multiple of the base alignment of the structure.

  (10) If the member is an array of <S> structures, the <S> elements of
       the array are laid out in order, according to rule (9).
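
The rules above can be applied mechanically, which is a handy way to double-check layouts like the ones discussed in this thread. The following is a minimal sketch that covers only float and vec3 members with 4-byte components (rules 1, 3 and 9). The names Kind, Layout, roundUp and layoutStd140 are made up for this example and are not part of any OpenGL API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Only the member kinds used in this thread, with 4-byte components.
enum class Kind { Float, Vec3 };

struct Layout {
    std::vector<std::size_t> offsets; // byte offset of each member
    std::size_t size;                 // bytes spanned, including struct tail padding
};

std::size_t roundUp(std::size_t value, std::size_t alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

// Lays out members starting at byte offset `base`. When asStruct is true,
// the start and the end are rounded up to 16 bytes, following rule (9).
Layout layoutStd140(const std::vector<Kind>& members, std::size_t base, bool asStruct)
{
    std::size_t cursor = asStruct ? roundUp(base, 16) : base;
    const std::size_t start = cursor;
    Layout out{{}, 0};
    for (Kind k : members) {
        const std::size_t align = (k == Kind::Float) ? 4 : 16; // rules (1) and (3)
        const std::size_t bytes = (k == Kind::Float) ? 4 : 12;
        cursor = roundUp(cursor, align);
        out.offsets.push_back(cursor);
        cursor += bytes;
    }
    if (asStruct) {
        cursor = roundUp(cursor, 16); // tail padding of a structure
    }
    out.size = cursor - start;
    return out;
}
```

For the flat colors0 block (float, float, vec3, vec3, vec3) this gives offsets 0, 4, 16, 32 and 48, so the two floats share the first 16 bytes rather than each taking a full float4. For a struct declared as float then vec3, the vec3 has to start at offset 16 and the struct spans 32 bytes; declaring the vec3 first lets the float sit at offset 12, so the struct spans 16 bytes. Under these rules the 48-byte estimate for the nested version holds when each struct lists its vec3 before its float. The size a driver reports for GL_UNIFORM_BLOCK_DATA_SIZE may include extra padding, so it is still worth querying.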

PS: that said, it might well turn out that the “packed” form comes out smaller on common implementations; then you just need to query the offsets at runtime. Setting up your code so that it does not require things to line up the same way as a C struct can be helpful.
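
For a packed or shared layout, the member offsets come from glGetUniformIndices followed by glGetActiveUniformsiv with GL_UNIFORM_OFFSET. One possible way to use them, sketched here under the assumption that the offsets have already been queried, is to fill a CPU-side staging buffer and upload it in one call. The class name BlockStaging and its methods are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// CPU-side staging copy of one uniform block.
class BlockStaging {
public:
    // blockSize would be the value queried with GL_UNIFORM_BLOCK_DATA_SIZE.
    explicit BlockStaging(std::size_t blockSize) : bytes_(blockSize, 0) {}

    // Copies `count` floats to the given byte offset inside the block.
    // The offset would be a GL_UNIFORM_OFFSET value in a real program.
    void writeFloats(std::size_t offset, const float* src, std::size_t count)
    {
        assert(offset + count * sizeof(float) <= bytes_.size());
        std::memcpy(bytes_.data() + offset, src, count * sizeof(float));
    }

    const unsigned char* data() const { return bytes_.data(); }
    std::size_t size() const { return bytes_.size(); }

private:
    std::vector<unsigned char> bytes_;
};
```

Once every member has been written, a single glBufferSubData(GL_UNIFORM_BUFFER, 0, staging.size(), staging.data()) uploads the whole block, so one GL call per member or per struct is not required even when the layout is not std140.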

Can I do something like this (treat the parts that use my framework as pseudocode)?

In the constructor I prepare the buffer/program connection once:


m_gpu = enRenderingContext::getPointer();

 // Create VBO with tile description
 uint8 row[] = {2, FLOAT3, FLOAT2 };
 m_tileID = m_gpu->render.input.create((void*)tile,3,row);

 // Create VS, FS and combined program
 m_objectsID[0] = loadShader(EN_VERTEX, string("resources/T2_vs.txt"));
 m_objectsID[1] = loadShader(EN_FRAGMENT, string("resources/Fbasic.txt"));
 m_progID       = m_gpu->render.program.create(2,m_objectsID);
 m_gpu->render.program.compile(m_progID);

 // Connect VBO to program input
 m_ogl_progID = m_gpu->render.program.handle(m_progID);
 glBindAttribLocation(m_ogl_progID, 0, "mPosition");
 glBindAttribLocation(m_ogl_progID, 1, "mTexCoord0");

 // Get handles to program params
 m_ogl_mvpID = glGetUniformLocation(m_ogl_progID, "matMVP");

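 // Create UBO for instancing data (the buffer must be bound before glBufferData can allocate it)
 glGenBuffers(1, &m_uboID);
 glBindBuffer(GL_UNIFORM_BUFFER, m_uboID);
 glBufferData(GL_UNIFORM_BUFFER, (T2_MAX_SECTORS * sizeof(sector_desc)), NULL, GL_DYNAMIC_DRAW);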

 // Connect UBO to UB in program (through binding table)
 m_ogl_ubID = glGetUniformBlockIndex(m_ogl_progID, string("Sectors").c_str() ); 
 if (m_ogl_ubID != GL_INVALID_INDEX)
    {
    sint32 ub_size = 0;
    glGetActiveUniformBlockiv(m_ogl_progID, m_ogl_ubID, GL_UNIFORM_BLOCK_DATA_SIZE, &ub_size);

    if (ub_size == (T2_MAX_SECTORS * sizeof(sector_desc)))
       {
       glBindBufferBase(GL_UNIFORM_BUFFER, 0, m_uboID);
       glUniformBlockBinding(m_ogl_progID, m_ogl_ubID, 0);   // 0 or m_uboID ??
       }
    }

Then, when rendering, I just do something like this:


// Fill Tiles Data Buffer (UBO)
 glBindBuffer(GL_UNIFORM_BUFFER, m_uboID);
 glBufferData(GL_UNIFORM_BUFFER, sectors * sizeof(sector_desc), &m_sectors[0],GL_DYNAMIC_DRAW); // or GL_STATIC_READ ?? is this correct ??

 // Select the program first: glUniform* calls affect the program currently in use
 // (assuming program.use() wraps glUseProgram)
 m_gpu->render.program.use(m_progID);

 // Set the rest of params
 float4x4 mvpMatrix = mul(camera->projectionMatrix(),mul(camera->matrix(),this->matrix()));  
 glUniformMatrix4fv(m_ogl_mvpID, 1, GL_FALSE, (const GLfloat *)&mvpMatrix.m); // set the planet projection matrix

 m_gpu->render.input.draw(m_tileID,EN_TRIANGLES,sectors);

So basically I connect everything together once and then
just update the UBO with new data. But this isn't working, so I'm trying to figure out where the error could be.

Do you have any suggestions?
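
One thing worth checking in the constructor is the comparison between ub_size and T2_MAX_SECTORS * sizeof(sector_desc): it can only hold if the C++ struct has the same size as one array element in the shader, and under std140 every struct element is padded to a multiple of 16 bytes (rules 9 and 10). The thread does not show sector_desc, so the struct below, its GLSL counterpart and the value of T2_MAX_SECTORS are assumptions chosen purely for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-sector data; the real sector_desc is not shown in the thread.
// Assumed GLSL counterpart:
//   struct Sector { vec3 origin; float scale; vec4 texRect; };
//   layout(std140) uniform Sectors { Sector sectors[T2_MAX_SECTORS]; };
struct alignas(16) sector_desc {
    float origin[3];  // vec3 at offset 0
    float scale;      // float at offset 12, fills the vec3's fourth slot
    float texRect[4]; // vec4 at offset 16
};

// Compile-time checks that the C++ layout matches the std140 offsets.
static_assert(offsetof(sector_desc, scale) == 12, "scale must follow origin");
static_assert(offsetof(sector_desc, texRect) == 16, "texRect must be 16-byte aligned");
static_assert(sizeof(sector_desc) % 16 == 0, "array stride must be a multiple of 16");

const std::size_t T2_MAX_SECTORS = 64; // assumed value

// Bytes of UBO storage used by `count` sectors.
std::size_t sectorBytes(std::size_t count)
{
    assert(count <= T2_MAX_SECTORS);
    return count * sizeof(sector_desc);
}
```

If the static_asserts pass for the real struct, uploading the array with a single call is safe; if they fail, reordering members or adding explicit padding floats is usually simpler than switching layouts.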

EDIT:
It turns out that it works, and the error was not in the UBOs.

Second question:
Does updating a UBO every frame hurt performance a lot?

> Does updating a UBO every frame hurt performance a lot?

I was wondering the same thing. I did some very specific benchmarks with my particular code (i.e. risky to extrapolate to the general case) but found that using the uniform block decreased the time to draw my scene significantly (by about 50%). So the UBO actually helped my performance, even though I sent the entire block of data each frame (glBindBuffer + glBufferData, just as you do in your render function).

I am attaching my code, which is based on GLUT and the OpenGL Mathematics (GLM) library and uses the CPU timing instruction rdtsc. I also have some 2 MB vertex/normal files (blender_droid_noidtluom.*) that are too big to post here :). This may be too specific for anyone else to compile and run, but it shows what I used for this very specific benchmark. I am including the code just to be complete.


unsigned long long int rdtsc(void);

// g++ -DFULLSCREEN -DUSE_IDLE -DREGULATE_FPS -I./include/ glut_opengl3_helloworld_uniform_block.cpp  lib/libglut.a -lXxf86vm -L./lib/ -lglut -lGLEW -DGL3_PROTOTYPES
// g++              -DUSE_IDLE -DREGULATE_FPS -I./include/ glut_opengl3_helloworld_uniform_block.cpp  lib/libglut.a -lXxf86vm -L./lib/ -lglut -lGLEW -DGL3_PROTOTYPES
// -D [ USE_IDLE | REGULATE_FPS | XGLUNIFORMMATRIX4FV ]

#include <GL/glew.h>     // great opengl extensions helper
#include <GL/gl3.h>
#include <GL/freeglut.h> // use of opengl 3.0 context requires freeglut!

//OpenGL Mathematics (GLM).  A C++ mathematics library for 3D graphics.
#include <glm/glm.h>
#include <glm/glmext.h>
#include <glm/GLM_VIRTREV_address.h> // gives helper function address(X) instead of using &X[0][0]
namespace glm
{
          // read glm/gtx/transform.h for function prototypes
          using GLM_GTX_transform;
          using GLM_GTX_matrix_projection;
          using GLM_GTX_transform2; // for lookAt
}
using namespace glm;

#include <cstdio>
#include <cmath>
#include <iostream>
#include <cstring>
#include <cassert>

#define MAKE_GLSL_STRING(A) #A

#ifdef XGLUNIFORMMATRIX4FV
inline void XglUniformMatrix4fv (const GLint &a, const GLsizei &b, const GLboolean &c, const mat4 &d)
{
	glUniformMatrix4fv (a, b, c, address(d));
};
#endif

//quick helper for printing mat4 values
std::ostream& operator << (std::ostream& os, glm::mat4& M)
{
 os << M[0][0] << "\t" << M[0][1] << "\t" << M[0][2] << "\t" << M[0][3] << std::endl;
 os << M[1][0] << "\t" << M[1][1] << "\t" << M[1][2] << "\t" << M[1][3] << std::endl;
 os << M[2][0] << "\t" << M[2][1] << "\t" << M[2][2] << "\t" << M[2][3] << std::endl;
 os << M[3][0] << "\t" << M[3][1] << "\t" << M[3][2] << "\t" << M[3][3] << std::endl;
 return os;
}

//quick helper for printing flat matrix values
void show_gl4x4 (GLfloat *M)
{
 std::cout <<  M[0] << "\t" <<  M[1] << "\t" <<  M[2] << "\t" <<  M[3] << std::endl;
 std::cout <<  M[4] << "\t" <<  M[5] << "\t" <<  M[6] << "\t" <<  M[7] << std::endl;
 std::cout <<  M[8] << "\t" <<  M[9] << "\t" << M[10] << "\t" << M[11] << std::endl;
 std::cout << M[12] << "\t" << M[13] << "\t" << M[14] << "\t" << M[15] << std::endl;
}

GLfloat uniformBlock_colors0[] =
{                    //layout(std140) uniform colors0
  0.45, 0.45,   0,0, //float DiffuseCool (offset 0), float DiffuseWarm (offset 4), 8 bytes padding
  0.75,0.75,0.75, 1, //vec3  SurfaceColor (offset 16)
  0.0,0.0,1.0,    1, //vec3  WarmColor (offset 32)
  0.0,1.0,0.0,    1, //vec3  CoolColor (offset 48)
};
GLuint sizeof_uniformBlock_colors0 = sizeof(uniformBlock_colors0);

//convenience map into uniformBlock_colors0
//(std140 gives a float 4-byte alignment, so the two floats are adjacent)
GLfloat &DiffuseCool = (GLfloat&)uniformBlock_colors0[0];
GLfloat &DiffuseWarm = (GLfloat&)uniformBlock_colors0[1];
vec3 &SurfaceColor = (vec3&)uniformBlock_colors0[4];
vec3 &WarmColor = (vec3&)uniformBlock_colors0[8];
vec3 &CoolColor = (vec3&)uniformBlock_colors0[12];

GLfloat uniformBlock_matrix1[] =
{                  //layout(std140) uniform matrix1
  1.0,0.0,0.0,0.0, //mat4  glm_ProjectionMatrix
  0.0,1.0,0.0,0.0,
  0.0,0.0,1.0,0.0,
  0.0,0.0,0.0,1.0,
  1.0,0.0,0.0,0.0, //mat4  glm_ModelViewMatrix
  0.0,1.0,0.0,0.0,
  0.0,0.0,1.0,0.0,
  0.0,0.0,0.0,1.0,
  1.0,0.0,0.0,0.0, //mat4  glm_NormalMatrix
  0.0,1.0,0.0,0.0,
  0.0,0.0,1.0,0.0,
  0.0,0.0,0.0,1.0,
};
GLuint sizeof_uniformBlock_matrix1 = sizeof(uniformBlock_matrix1);

//convenience map into uniformBlock_matrix1
mat4 &glm_ProjectionMatrix = (mat4&)uniformBlock_matrix1[0]; 
mat4 &glm_ModelViewMatrix = (mat4&)uniformBlock_matrix1[16]; 
mat4 &glm_NormalMatrix = (mat4&)uniformBlock_matrix1[32]; 

class CustomGraphicsPipeline
{
			GLuint shader_id;
      GLuint uniformBlock_matrix1_id; // uniform block matrix1
      GLuint uniformBlock_colors0_id; // uniform block colors0
#ifdef XGLUNIFORMMATRIX4FV
      mat4   glm_ProjectionMatrix; //cpu-side 
			GLint  glm_ProjectionMatrix_id; //cpu-side hook to shader uniform
      mat4   glm_ModelViewMatrix; //cpu-side 
			GLint  glm_ModelViewMatrix_id; //cpu-side hook to shader uniform
      mat4   glm_NormalMatrix; //cpu-side 
			GLint  glm_NormalMatrix_id; //cpu-side hook to shader uniform
#endif
			GLuint vao_id[2]; // vertex array object hook id
			GLuint vao_elementcount[2]; // number of attribute elements in vao ie number of vertices
		public:
			CustomGraphicsPipeline() :
       shader_id(NULL),
#ifdef XGLUNIFORMMATRIX4FV
       glm_ProjectionMatrix( mat4(1.0f) ), // identity matrix
       glm_ProjectionMatrix_id(NULL),
       glm_ModelViewMatrix( mat4(1.0f) ), // identity matrix
       glm_ModelViewMatrix_id(NULL),
       glm_NormalMatrix( mat4(1.0f) ), // identity matrix
       glm_NormalMatrix_id(NULL),
#else
       uniformBlock_matrix1_id(NULL),
#endif
       uniformBlock_colors0_id(NULL)
      {
       vao_id[0]=NULL; 
       vao_id[1]=NULL;
       vao_elementcount[0]=0; 
       vao_elementcount[1]=0;
      }
			~CustomGraphicsPipeline() {}

      bool Init()										// All Setup For OpenGL Goes Here
      {
        GLint ret = true; // optimistic return value upon completion of execution 

#ifdef DEBUG
        std::cout << "GL_VERSION: " << glGetString(GL_VERSION) << std::endl;
        std::cout << "GL_EXTENSIONS: " << glGetString(GL_EXTENSIONS) << std::endl;
        std::cout << "GL_RENDERER: " << glGetString(GL_RENDERER) << std::endl;
        std::cout << "GL_VENDOR: " << glGetString(GL_VENDOR) << std::endl;
        std::cout << "GLU_VERSION: " << gluGetString(GLU_VERSION) << std::endl;
        std::cout << "GLU_EXTENSIONS: " << gluGetString(GLU_EXTENSIONS) << std::endl;
#ifdef GLUT_XLIB_IMPLEMENTATION
        std::cout << "GLUT_API_VERSION: " << GLUT_API_VERSION << std::endl;
        std::cout << "GLUT_XLIB_IMPLEMENTATION: " << GLUT_XLIB_IMPLEMENTATION << std::endl;
#endif
#endif

        glEnable(GL_DEPTH_TEST);
        glClearColor(0.2f, 0.2f, 0.2f, 0.5f);

      //Data destined for video memory; the arrays can be local (and discarded once uploaded to the GPU).
        GLfloat vertices0[] = { // dgl_Vertex
          1.0, -1.0, 0.0, 1.0, // xyzw 
         -1.0, -1.0, 0.0, 1.0,
          0.0,  1.0, 0.0, 1.0
        };
        size_t Nbytes_vertices0=sizeof(vertices0);

        GLfloat normals0[] = { // dgl_Normal
          0.0,  0.0, 1.0, 1.0, // xyzw
          0.0,  0.0, 1.0, 1.0, 
          0.0,  0.0, 1.0, 1.0
        };
        size_t Nbytes_normals0=sizeof(normals0);

        GLfloat colors0[] = { // dgl_Color
          0.0, 0.0, 1.0, 1.0, //rgba
          0.0, 1.0, 0.0, 1.0,
          1.0, 0.0, 0.0, 1.0
        };
        size_t Nbytes_colors0=sizeof(colors0);

        GLfloat vertices1[] = { // dgl_Vertex
#include "blender_droid_noidtluom.vertices"
        };
        size_t Nbytes_vertices1=sizeof(vertices1);

        GLfloat normals1[] = { // dgl_Normal
#include "blender_droid_noidtluom.normals"
        };
        size_t Nbytes_normals1=sizeof(normals1);


        GLfloat colors1[] = { // dgl_Color
#include "blender_droid_noidtluom.vertices"
        };
        size_t Nbytes_colors1=sizeof(colors1);

        const GLchar *g_vertexShader[] = 
        {
          //GLSL "#" directives can't be inside MAKE_GLSL_STRING, put before MAKE_GLSL_STRING!
          "#version 150\n",
          MAKE_GLSL_STRING(
           // Vertex shader for Gooch shading
           // Author: Randi Rost
           // Copyright (c) 2002-2006 3Dlabs Inc. Ltd.
           // See 3Dlabs-License.txt for license information
           //uniform mat4 glm_ProjectionMatrix; // replaces deprecated gl_ProjectionMatrix see http://www.lighthouse3d.com/opengl/glsl/index.php?minimal
           //uniform mat4 glm_ModelViewMatrix; // replaces deprecated gl_ModelViewMatrix
           //[tsm]http://www.lighthouse3d.com/opengl/glsl/index.php?normalmatrix
           //[tsm]gl_NormalMatrix is transpose(inverse(gl_ModelViewMatrix))
           //uniform mat4 glm_NormalMatrix; // replaces deprecated gl_ModelViewMatrix

#ifndef XGLUNIFORMMATRIX4FV
           layout(std140) uniform matrix1
           {
#endif
             uniform mat4 glm_ProjectionMatrix; // replaces deprecated gl_ProjectionMatrix see http://www.lighthouse3d.com/opengl/glsl/index.php?minimal
             uniform mat4 glm_ModelViewMatrix; // replaces deprecated gl_ModelViewMatrix
             //[tsm]http://www.lighthouse3d.com/opengl/glsl/index.php?normalmatrix
             //[tsm]gl_NormalMatrix is transpose(inverse(gl_ModelViewMatrix))
             uniform mat4 glm_NormalMatrix;
#ifndef XGLUNIFORMMATRIX4FV
           };
#endif

           in		 vec4 dgl_Vertex; // replaces deprecated gl_Vertex
           in		 vec4 dgl_Normal; // replaces deprecated gl_Normal
           in		 vec4 dgl_Color; // replaces deprecated gl_Color
          
           out float NdotL;
           out vec3  ReflectVec;
           out vec3  ViewVec;
           invariant out	vec4 Color; // to fragment shader
         
           vec3 LightPosition = vec3(5.0, 5.0, 5.0); 

           void main(void)
           {
               vec3 ecPos      = vec3 (glm_ModelViewMatrix * dgl_Vertex);
               vec3 tnorm      = normalize(glm_NormalMatrix * dgl_Normal).xyz; //[tsm] kludge added .xyz at end
               vec3 lightVec   = normalize(LightPosition - ecPos);
               ReflectVec      = normalize(reflect(-lightVec, tnorm));
               ViewVec         = normalize(-ecPos);
               NdotL           = (dot(lightVec, tnorm) + 1.0) * 0.5;
         	     Color = dgl_Color;
               gl_Position = glm_ProjectionMatrix*glm_ModelViewMatrix*dgl_Vertex; // replaces deprecated ftransform()
           }
          )
        };

        const GLchar *g_fragmentShader[] = 
        {
          //GLSL "#" directives can't be inside MAKE_GLSL_STRING, put before MAKE_GLSL_STRING!
          "#version 150\n",
          "#extension GL_ARB_uniform_buffer_object : enable\n",
          MAKE_GLSL_STRING(
            // Fragment shader for Gooch shading, adapted for ARB_uniform_buffer_object
          
            layout(std140) uniform colors0
            {
                float DiffuseCool;
                float DiffuseWarm;
                vec3  SurfaceColor;
                vec3  WarmColor;
                vec3  CoolColor;
            };
          
            invariant in vec4 Color; // from vertex shader
            in float NdotL;
            in vec3  ReflectVec;
            in vec3  ViewVec;

            out vec4 dgl_FragColor; // replaces deprecated gl_FragColor
          
            void main (void)
            {  
                vec3 kcool    = min(CoolColor + DiffuseCool * SurfaceColor, 1.0);
                vec3 kwarm    = min(WarmColor + DiffuseWarm * SurfaceColor, 1.0); 
                vec3 kfinal   = mix(kcool, kwarm, NdotL);
        
                vec3 nreflect = normalize(ReflectVec);
                vec3 nview    = normalize(ViewVec);
        
                float spec    = max(dot(nreflect, nview), 0.0);
                spec          = pow(spec, 32.0);
        
                dgl_FragColor = vec4 (min(kfinal + spec, 1.0), 1.0); // gl_FragColor is deprecated
                //dgl_FragColor = Color; // gl_FragColor is deprecated
            }
          )
        };

      // compile Vertex shader
        GLuint m_vxShaderId = glCreateShader(GL_VERTEX_SHADER);
        GLsizei nlines_vx = sizeof(g_vertexShader)/sizeof(const GLchar*);
        glShaderSource(m_vxShaderId, nlines_vx, (const GLchar**)g_vertexShader, NULL);
        glCompileShader(m_vxShaderId);
        CheckShader(m_vxShaderId, GL_COMPILE_STATUS, &ret, "unable to compile the vertex shader!");

      // compile Fragment shader
        GLuint m_fgShaderId = glCreateShader(GL_FRAGMENT_SHADER);
        GLsizei nlines_fg = sizeof(g_fragmentShader)/sizeof(const GLchar*);
        glShaderSource(m_fgShaderId, nlines_fg, (const GLchar**)g_fragmentShader, NULL);
        glCompileShader(m_fgShaderId);
        CheckShader(m_fgShaderId, GL_COMPILE_STATUS, &ret, "unable to compile the fragment shader!");
	  
      // link shaders
        shader_id = glCreateProgram();
        glAttachShader(shader_id, m_vxShaderId);
        glAttachShader(shader_id, m_fgShaderId);
        glLinkProgram(shader_id);
        CheckShader(shader_id, GL_LINK_STATUS, &ret, "unable to link the program!");
 
      //hooks from CPU to GPU
#ifdef XGLUNIFORMMATRIX4FV
        //define Uniform hooks
        glm_ProjectionMatrix_id = glGetUniformLocation(shader_id, "glm_ProjectionMatrix");
        glm_ModelViewMatrix_id = glGetUniformLocation(shader_id, "glm_ModelViewMatrix");
        glm_NormalMatrix_id = glGetUniformLocation(shader_id, "glm_NormalMatrix");
#endif

        //better to send block of uniforms rather than a single uniform at a time?
        //"layout(std140) uniform colors0"
        defineUniformBlockObject(0,"colors0",uniformBlock_colors0_id);

#ifndef XGLUNIFORMMATRIX4FV
        //better to send block of uniforms rather than a single uniform at a time?
        //"layout(std140) uniform matrix1"
        defineUniformBlockObject(1,"matrix1",uniformBlock_matrix1_id);
#endif

        //guard that all attributes have same number of elements
        assert(Nbytes_vertices0/4/sizeof(GLfloat)==Nbytes_colors0/4/sizeof(GLfloat));
        vao_elementcount[0]=Nbytes_vertices0/4/sizeof(GLfloat); // number of elements for first VAO
        assert(Nbytes_vertices1/4/sizeof(GLfloat)==Nbytes_colors1/4/sizeof(GLfloat));
        vao_elementcount[1]=Nbytes_vertices1/4/sizeof(GLfloat); // number of elements for second VAO

        //create and define vertex array objects
        glGenVertexArrays(2, &vao_id[0]); // vao_id[#] will be referenced in Draw()
        defineVertexArrayObject(vao_id[0],Nbytes_vertices0,4,GL_FLOAT,vertices0,normals0,colors0); //VertexAttribArray: vertices0, colors0
        defineVertexArrayObject(vao_id[1],Nbytes_vertices1,4,GL_FLOAT,vertices1,normals1,colors1); //VertexAttribArray: vertices1, colors1

      // finally, use the shader for rendering 
        glUseProgram(shader_id);            // select the shaders program

        return ret;										// Initialization Went OK?
      }

      void defineVertexArrayObject(GLuint vaoId, size_t Nbytes, GLint size, GLenum type, GLfloat *vertices, GLfloat *normals, GLfloat *colors)
      {
        //enable vertex array object to be defined
        glBindVertexArray(vaoId);

        //generate VBO foreach 'in'; dgl_Vertex, dgl_Normal, and dgl_Color
        GLuint m_vboId[3];
        glGenBuffers(3, &m_vboId[0]);

        //"in		 vec4 dgl_Vertex;",
        glBindBuffer(GL_ARRAY_BUFFER, m_vboId[0] );	// enable the 1st VBO
        glBufferData(GL_ARRAY_BUFFER, Nbytes, vertices, GL_STATIC_DRAW); // fill the VBO with vertices data
        const GLuint index_mPosition = glGetAttribLocation(shader_id,"dgl_Vertex"); // get ID for "dgl_Vertex"
        glVertexAttribPointer(index_mPosition, size, type, GL_FALSE, 0, 0); // VBO point to the "dgl_Vertex" attribute
        glEnableVertexAttribArray(index_mPosition);		// enable VBO vertex attribute ("dgl_Vertex")

        //"in		 vec4 dgl_Normal;",
        glBindBuffer(GL_ARRAY_BUFFER, m_vboId[1] );	// enable the 2nd VBO
        glBufferData(GL_ARRAY_BUFFER, Nbytes, normals, GL_STATIC_DRAW); // fill the VBO with normals data
        const GLuint index_mNormal = glGetAttribLocation(shader_id,"dgl_Normal"); // get ID for "dgl_Normal"
        glVertexAttribPointer(index_mNormal, size, type, GL_FALSE, 0, 0); // VBO point to the "dgl_Normal" attribute
        glEnableVertexAttribArray(index_mNormal);		// enable VBO vertex attribute ("dgl_Normal")

        //"in		 vec4 dgl_Color;",
        glBindBuffer(GL_ARRAY_BUFFER, m_vboId[2]);	// enable the 3rd VBO
        glBufferData(GL_ARRAY_BUFFER, Nbytes , colors, GL_STATIC_DRAW); // fill the 3rd VBO with colors data
        const GLuint index_mcolor = glGetAttribLocation(shader_id,"dgl_Color"); // get ID for "dgl_Color"
        glVertexAttribPointer(index_mcolor, size, type, GL_FALSE, 0, 0); // VBO point to the "dgl_Color" attribute
        glEnableVertexAttribArray(index_mcolor);		// enable VBO vertex attribute ("dgl_Color")
      }

      void defineUniformBlockObject(GLuint binding_point, const char *GLSL_block_string, GLuint &uniformBlock_id)
      {
        // externally used values //////////////////////////////////////////
	      //Update the uniforms using ARB_uniform_buffer_object
	      glGenBuffers(1, &uniformBlock_id);
        ////////////////////////////////////////////////////////////////////

	      //"layout(std140) uniform GLSL_block_string"
	      GLuint  uniformBlockIndex = glGetUniformBlockIndex(shader_id, GLSL_block_string);

	      //And associate the uniform block to binding point
	      glUniformBlockBinding(shader_id, uniformBlockIndex, binding_point);

	      //Now we attach the buffer to UBO binding_point...
	      glBindBufferBase(GL_UNIFORM_BUFFER, binding_point, uniformBlock_id);

	      //We need to get the uniform block's size in order to back it with the
	      //appropriate buffer
	      GLsizei uniformBlockSize;
	      glGetActiveUniformBlockiv(shader_id, uniformBlockIndex,
		      GL_UNIFORM_BLOCK_DATA_SIZE,
		      &uniformBlockSize);
        //uniformBlockSize is NOT same as sizeof_uniformBlock_colors0

	      //Create UBO.
	      glBindBuffer(GL_UNIFORM_BUFFER, uniformBlock_id);
	      glBufferData(GL_UNIFORM_BUFFER, uniformBlockSize, NULL, GL_DYNAMIC_DRAW);
      }

      void CheckShader(GLuint id, GLuint type, GLint *ret, const char *onfail)
      {
       //Check if something is wrong with the shader
       switch(type) {
       case(GL_COMPILE_STATUS):
         glGetShaderiv(id, type, ret);
         if(*ret == false){
          int infologLength =  0;
          glGetShaderiv(id, GL_INFO_LOG_LENGTH, &infologLength);
          GLchar buffer[infologLength];
          GLsizei charsWritten = 0;
          std::cout << onfail << std::endl;
          glGetShaderInfoLog(id, infologLength, &charsWritten, buffer);
          std::cout << buffer << std::endl;
         }
         break;
       case(GL_LINK_STATUS):
         glGetProgramiv(id, type, ret);
         if(*ret == false){
          int infologLength =  0;
          glGetProgramiv(id, GL_INFO_LOG_LENGTH, &infologLength);
          GLchar buffer[infologLength];
          GLsizei charsWritten = 0;
          std::cout << onfail << std::endl;
          glGetProgramInfoLog(id, infologLength, &charsWritten, buffer);
          std::cout << buffer << std::endl;
         }
         break;
       default:
         break;
       };
      }

      void Draw()									// Here's Where We Do All The Drawing
      {
        unsigned long long before = rdtsc(); // CPU timestamp before issuing the frame's GL calls

        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT );

        // just before drawing update the normal matrix
        glm_NormalMatrix = transpose(inverse(glm_ModelViewMatrix));

        //set uniform block colors0
        glBindBuffer(GL_UNIFORM_BUFFER, uniformBlock_colors0_id);
        //We can use BufferData to upload our data to the shader,
        //since we know it's in the std140 layout
        glBufferData(GL_UNIFORM_BUFFER, sizeof_uniformBlock_colors0, uniformBlock_colors0, GL_DYNAMIC_DRAW);
        //With a non-standard layout, we'd use BufferSubData for each uniform.
        //  glBufferSubData(GL_UNIFORM_BUFFER, offset, singleSize, &uniformBlock_colors0[8]);

#ifndef XGLUNIFORMMATRIX4FV
        //set uniform block matrix1
        glBindBuffer(GL_UNIFORM_BUFFER, uniformBlock_matrix1_id);
        //We can use BufferData to upload our data to the shader,
        //since we know it's in the std140 layout
        glBufferData(GL_UNIFORM_BUFFER, sizeof_uniformBlock_matrix1, uniformBlock_matrix1, GL_DYNAMIC_DRAW);
        //With a non-standard layout, we'd use BufferSubData for each uniform.
        //  glBufferSubData(GL_UNIFORM_BUFFER, offset, singleSize, &uniformBlock_colors0[8]);
#else
        //glm_ModelViewMatrix *= translate(       vec3(0.0, 0.0, 0.0) );
        //glm_ModelViewMatrix *=    rotate(angle, vec3(0.0, 1.0, 0.0) ); 
        //glm_ModelViewMatrix *=     scale(       vec3(0.5, 0.5, 0.5) ); 
        //XglUniformMatrix4fv(glm_ProjectionMatrix_id, 1, false, glm_ProjectionMatrix );
        XglUniformMatrix4fv(glm_ModelViewMatrix_id, 1, false, glm_ModelViewMatrix );	// set the rotation/translation/scale matrix 
        XglUniformMatrix4fv(glm_NormalMatrix_id, 1, false, transpose(inverse(glm_ModelViewMatrix)) );	// set the rotation/translation/scale matrix 
#endif
        glBindVertexArray(vao_id[0]);		// select the vertex array object:vao_id[0] by definiton using vertices0,normals0,colors0
        glDrawArrays(GL_TRIANGLES, 0, vao_elementcount[0]);	// draw the array (at the speed of light)

#ifdef XGLUNIFORMMATRIX4FV
        //glm_ModelViewMatrix *= translate(       vec3(0.0, 0.0, 0.0) );
        //glm_ModelViewMatrix *=    rotate(angle, vec3(0.0, 1.0, 0.0) ); 
        //glm_ModelViewMatrix *=     scale(       vec3(0.5, 0.5, 0.5) ); 
        //XglUniformMatrix4fv(glm_ProjectionMatrix_id, 1, false, glm_ProjectionMatrix );
        XglUniformMatrix4fv(glm_ModelViewMatrix_id, 1, false, glm_ModelViewMatrix );	// set the rotation/translation/scale matrix 
        XglUniformMatrix4fv(glm_NormalMatrix_id, 1, false, transpose(inverse(glm_ModelViewMatrix)) );	// set the rotation/translation/scale matrix 
#endif
        glBindVertexArray(vao_id[1]);		// select the vertex array object:vao_id[1] by definition using vertices1,normals1,colors1
        glDrawArrays(GL_TRIANGLES , 0, vao_elementcount[1]);	// draw the array (at the speed of light)
        unsigned long long after = rdtsc(); // CPU timestamp after the draw calls were issued
        std::cout << after-before << std::endl; // elapsed CPU ticks for this frame
      }
      
      void CameraRotate(GLfloat angle)
      { 
        vec4 eye_pos(0.0, 0.0, 10.0, 1.0); // start camera at z=10
        eye_pos = rotate(angle, vec3(0.0, 1.0, 0.0))*eye_pos; // rotate camera around y-axis
        glm_ModelViewMatrix  = mat4(1.0); //glLoadIdentity
        glm_ModelViewMatrix  *= lookAt( vec3(eye_pos),
                                        vec3(0.0, 0.0, 0.0),
                                        vec3(0.0, 1.0, 0.0) 
                                      );
      }

      void reshape(GLfloat w, GLfloat h)
      { 
        glViewport(0,0,w,h);						// Reset The Current Viewport

        glm_ProjectionMatrix = mat4(1.0); //glLoadIdentity
        //glm_ProjectionMatrix *=  ortho3D(0.f, 2.f, 0.f, 1.f, 0.f, 1.f); // identical to glOrtho(0.f, 2.f, 0.f, 1.f, 0.f, 1.f);
        //glm_ProjectionMatrix *=  frustum(1.f, 2.f, 1.f, 2.f, 1.0e-45f, 2.f); // identical glFrustum(1.f, 2.f, 1.f, 2.f, 1.f, 2.f) but problem with 0.f, 2.f, 0.f, 1.f, 0.f, 1.f) -- solution near and far must both be > 0 (error if equal to zero!);
        glm_ProjectionMatrix *= perspective(20.0f,w/h,1.0f,21.0f);

#ifdef XGLUNIFORMMATRIX4FV
        XglUniformMatrix4fv(glm_ProjectionMatrix_id, 1, false, glm_ProjectionMatrix );
#endif
      }
};

// GLOBAL ///////////////////////////////////////////////////////////
CustomGraphicsPipeline Scene;
/////////////////////////////////////////////////////////////////////
unsigned long long int rdtsc(void)
{
// http://www.cs.wm.edu/~kearns/001lab.d/rdtsc.html
   unsigned a, d; // low and high 32 bits of the time-stamp counter
   __asm__ volatile("rdtsc" : "=a" (a), "=d" (d));
   return ((unsigned long long)a) | (((unsigned long long)d) << 32);
}

double gettime_ms() 
{
  const double ticks_per_ms = 3214818125.0/1000.; // TSC ticks per ms, hardwired from main.cpp for a ~3.2 GHz CPU; adjust for your machine
  return rdtsc()/ticks_per_ms;
}

size_t gFramesPerSecond = 1;
const int desiredFPS = 50;
size_t gPhysicsPerSecond = 1;
const int desiredPPS=desiredFPS*2;

class MeasureCallsPerSecond
{
  size_t Frames;        // calls counted in the current 100 ms window
  double PreviousClock; // [milliSeconds]
  double Clock;         // [milliSeconds]
  double NextClock;     // [milliSeconds]
  size_t CallsPerSecond;

public:
  MeasureCallsPerSecond():
		Frames(0),
	  PreviousClock(gettime_ms()),
	  Clock(gettime_ms()),
	  NextClock(gettime_ms()),
    CallsPerSecond(1)
	{};
	~MeasureCallsPerSecond(){};

	size_t measure() {
    ++Frames;
    Clock = gettime_ms(); // has limited resolution, so average over 100 ms
    if ( Clock < NextClock ) return CallsPerSecond;

    CallsPerSecond = Frames*10; // calls counted in 100 ms, scaled to calls per second

    PreviousClock = Clock;
    NextClock = Clock+100; // next update 100 ms in the future
    Frames=0;

		return CallsPerSecond;
	};
};

class condition_block 
{
  double desiredCPS;
  double PreviousClock;
  double Clock;
  double deltaT;

public:
  condition_block(double _desiredCPS):
    desiredCPS(_desiredCPS),
	  PreviousClock(gettime_ms()),
	  Clock(gettime_ms()),
    deltaT(0.)
	{};
	~condition_block(){};
	                  
  bool regulate()
	{
		bool status = false;
    #ifdef REGULATE_FPS
    Clock = gettime_ms();
    deltaT=Clock-PreviousClock;
    if (deltaT < 1000./desiredCPS) {status=true;} else {status=false; PreviousClock=Clock;}
    #endif
		return status;
	};
};


//#define USE_IDLE
#ifdef USE_IDLE
void idle()
{
	static condition_block condition_block_pps(desiredPPS);
	if (condition_block_pps.regulate()) return;
#else
static void timer(int value)
{
	glutTimerFunc(30,timer,++value); // come back in 30msec
#endif

	static MeasureCallsPerSecond PPS;
  gPhysicsPerSecond = PPS.measure(); //only call once per physics step to measure PPS 
  double dt = 1./gPhysicsPerSecond; // delta time since last call

  //put your specific Physics code here
  //... this code will run at desiredPPS
  static GLfloat Angle=0.0;
  Angle+= 360./8.*dt; // 360 degrees rotation in 8 seconds
  Scene.CameraRotate(Angle);
  //end your specific Physics code here

  //update the uniform block colors0
  DiffuseCool = 0.45*cos(radians(Angle));
  DiffuseWarm = 0.45*sin(radians(Angle));
  SurfaceColor = vec3(0.75,0.75,0.75);
  WarmColor = vec3(0.0,0.0,1.0);
  CoolColor = vec3(0.0,1.0,0.0);

	static condition_block condition_block_fps(desiredFPS);
	if (condition_block_fps.regulate()) return;

	glutPostRedisplay();
}

static void display(void)
{
	static MeasureCallsPerSecond FPS;
  gFramesPerSecond = FPS.measure(); //only call once per frame loop to measure FPS 
//  printf("FPS %d\n",gFramesPerSecond); 
//  printf("PPS %d\n",gPhysicsPerSecond);
  fflush(stdout);

	Scene.Draw();
  glutSwapBuffers();
}

static void reshape(int width, int height)// Resize And Initialize The GL Window
{
  Scene.reshape(width,height);
}

static void key(unsigned char k, int x, int y)
{
  switch (k) {
  case 27:  // Escape
    exit(0);
    break;
  default:
    return;
  }
}

int main( int argc, char *argv[] )
{
   glutInit( &argc, argv );
   glutInitContextVersion( 3, 0 ); // freeglut is so cool! get openGL 3.0 context
   //glutInitContextFlags( int flags )
   glutInitDisplayMode (GLUT_DOUBLE | GLUT_RGB);
#ifdef FULLSCREEN
	glutGameModeString("1920x1200:32");
	assert( glutGameModeGet(GLUT_GAME_MODE_POSSIBLE) ); // bail if false
	glutEnterGameMode();
#else
   glutInitWindowSize (480, 480);
   glutCreateWindow ("hello world from openGL 3.0/freeglut/glew/glm");
#endif

   glewInit();

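   if ( !Scene.Init() ) { // bail if scene initialization fails (kept out of assert(), which is compiled away under NDEBUG)
      std::cerr << "Scene.Init() failed" << std::endl;
      return 1;
   }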

//#define USE_IDLE
#ifdef USE_IDLE
   glutIdleFunc(idle);
#else
   glutTimerFunc(0,timer,0);
#endif
   glutDisplayFunc(display);
   glutReshapeFunc(reshape);
   glutKeyboardFunc(key);

   glutMainLoop();

   return 0;
}
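
For the `colors0` block shown earlier, here is a minimal sketch of how the five members could be mirrored on the CPU side, assuming the block really is declared `layout(std140)`. Under std140 a `float` is aligned to 4 bytes and a `vec3` to 16 bytes, so the offsets should come out as 0, 4, 16, 32 and 48. The names `Vec3Std140`, `Colors0Std140` and `packColors0` are made up for this illustration and are not part of the program above. Without `layout(std140)` the offsets are implementation-defined and have to be queried (`glGetActiveUniformsiv` with `GL_UNIFORM_OFFSET`).

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstring>   // std::memcpy

// Hypothetical plain 3-float vector (12 bytes), used only for this sketch.
struct Vec3Std140 { float x, y, z; };

// Hypothetical CPU-side mirror of the 'colors0' uniform block.
// Assumption: the GLSL block uses layout(std140), so every vec3 starts
// on a 16-byte boundary.
struct Colors0Std140 {
    float DiffuseCool;                     // offset 0
    float DiffuseWarm;                     // offset 4
    alignas(16) Vec3Std140 SurfaceColor;   // offset 16
    alignas(16) Vec3Std140 WarmColor;      // offset 32
    alignas(16) Vec3Std140 CoolColor;      // offset 48
};

// Compile-time checks that the C++ layout matches the offsets derived above.
static_assert(sizeof(float) == 4, "sketch assumes 32-bit float");
static_assert(offsetof(Colors0Std140, DiffuseCool)  == 0,  "unexpected offset");
static_assert(offsetof(Colors0Std140, DiffuseWarm)  == 4,  "unexpected offset");
static_assert(offsetof(Colors0Std140, SurfaceColor) == 16, "unexpected offset");
static_assert(offsetof(Colors0Std140, WarmColor)    == 32, "unexpected offset");
static_assert(offsetof(Colors0Std140, CoolColor)    == 48, "unexpected offset");

// Copies the mirror into a staging buffer of 'capacity' bytes.
// Returns the number of bytes written (up to the end of CoolColor),
// or 0 if the buffer is too small.
inline std::size_t packColors0(const Colors0Std140& src,
                               unsigned char* dst, std::size_t capacity)
{
    const std::size_t used = offsetof(Colors0Std140, CoolColor) + sizeof(Vec3Std140);
    if (capacity < used) return 0;
    std::memcpy(dst, &src, used);
    return used;
}
```

With the uniform buffer bound, the packed bytes could then be uploaded with something like `glBufferSubData(GL_UNIFORM_BUFFER, 0, used, staging)`. The buffer itself should still be sized from the queried `GL_UNIFORM_BLOCK_DATA_SIZE`, as in the setup code earlier, rather than from `sizeof(Colors0Std140)`.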

If you really want to compile this, you can download the large model data it loads:


wget http://24.130.61.216/temporary/blender_droid_noidtluom.vertices 

wget http://24.130.61.216/temporary/blender_droid_noidtluom.normals

wget http://24.130.61.216/temporary/blender_droid_noidtluom.credit


The original 3D Blender model (droid.blend) is from http://e2-productions.com/repository/modules/PDdownloads/singlefile.php?cid=20&lid=7
and was converted with TooL-v0.2.5 (with a slight modification).