AMD: struct array in UBO



_blitz
02-15-2011, 08:20 AM
My GPU (AMD 5650, Catalyst 11.1, Ubuntu x86) doesn't seem to fetch data correctly from an array of structures in the uniform block of a fragment shader.
Here's the GLSL fragment code:

#version 330 core

#define MAX_LAYERS 4

// \struct Gerstner
struct GerstnerParams
{
vec2 cst_size; ///< size of the plane [0, cst_size.x] x [0, cst_size.y]
vec2 cst_k; ///< wave vector
float cst_A; ///< wave amplitude
float cst_w; ///< wave frequency
float cst_phi; ///< wave phase
float private; ///< pow2 byte size
};

// Gerstner attributes
layout(std140)
uniform Gerstners
{
GerstnerParams cst_gerstners[MAX_LAYERS];
};

// current time
uniform float cst_time;


// id and ndc pos
in vec2 vsout_pos;
flat in int vsout_id;


// xyz position
out vec3 fsout_pos;


void main()
{
// get params
int ID = vsout_id;
vec2 SIZE = cst_gerstners[ID].cst_size;
vec2 Ki = cst_gerstners[ID].cst_k;
float Ai = cst_gerstners[ID].cst_A;
float OMEGAi = cst_gerstners[ID].cst_w;
float PHIi = cst_gerstners[ID].cst_phi;

vec2 x0 = vsout_pos * SIZE * 0.5;

float Ki_dot_x0 = dot(Ki, x0);
float wt = OMEGAi * cst_time;
float term = Ki_dot_x0 - wt + PHIi; // precompute cosine and sine term
vec2 X = normalize(Ki) * Ai * sin(term); // X = [x, z]^t
float y = Ai * cos(term);

// final position
fsout_pos = vec3(X.x, y, X.y);
}

And the corresponding batch :

glBindFramebuffer(GL_FRAMEBUFFER, gl::fbuffers[gl::fbuffer::FIRST]);
glViewport(0,0,512,512);
glDisable(GL_DEPTH_TEST);
glEnable(GL_BLEND);
glBlendFunc(GL_ONE, GL_ONE);
glBlendEquation(GL_FUNC_ADD);
static float clear_color[] = {0.0f, 0.0f, 0.0f, 0.0f};
glClearBufferfv(GL_COLOR, 0, clear_color);
glUseProgram(gl::programs[gl::program::SUM]);
glBindVertexArray(gl::varrays[gl::varray::QUAD]);
glUniform1f(TIME_LOCATION, float(t));
glDrawArrays( GL_TRIANGLES,
0,
24 );
Basically, I'm accumulating some data in a texture by drawing full-screen quads (identified by vsout_id).

Data is recovered from cst_gerstners[0] and cst_gerstners[3] only.

GL_UNIFORM_OFFSET gives correct results (based on std140), and the data in the UBO is correct...

I tried the code on an Nvidia GPU (8800GTS, 260.19.06) and it works, which is why I suspect the driver.

Should I post a minimal example?
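
For anyone who wants to check the shader output against known values, here is a minimal CPU sketch of the per-layer displacement computed by the fragment shader in this post. The Vec2/Vec3 types and the function name gerstner_displacement are made up for illustration; only the arithmetic is taken from the shader.

```cpp
#include <cmath>

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

// CPU reference for one layer of the fragment shader above.
// ndc_pos plays the role of vsout_pos; the other parameters mirror the
// members of GerstnerParams and the cst_time uniform.
// Note: k must be non-zero, otherwise the normalisation divides by zero.
Vec3 gerstner_displacement(Vec2 ndc_pos, Vec2 size, Vec2 k,
                           float A, float omega, float phi, float time)
{
    // x0 = vsout_pos * SIZE * 0.5
    const Vec2 x0 = { ndc_pos.x * size.x * 0.5f, ndc_pos.y * size.y * 0.5f };
    // term = dot(Ki, x0) - OMEGAi * t + PHIi
    const float term = (k.x * x0.x + k.y * x0.y) - omega * time + phi;
    const float k_len = std::sqrt(k.x * k.x + k.y * k.y);
    const float s = A * std::sin(term);
    // fsout_pos = vec3(X.x, y, X.y) with X = normalize(Ki) * Ai * sin(term)
    const Vec3 out = { k.x / k_len * s, A * std::cos(term), k.y / k_len * s };
    return out;
}
```

Comparing a few texels read back from the render target with this function would show which layers actually contribute.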

Groovounet
02-15-2011, 08:49 AM
Hi _Blitz

You need to ensure that the alignment of the data in the buffer is correct.

glGetIntegerv(GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, &UniformBufferOffset);

That should fix your problem.

_blitz
02-15-2011, 09:05 AM
GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT gives me 256.
However, querying the byte offsets of the first element of the structure in the array (with glGetActiveUniformsiv(gl::programs[gl::program::SUM], 4, indexes, GL_UNIFORM_OFFSET, params);)
with the indices of:
const GLchar* names[] = {
"cst_gerstners[0].cst_A",
"cst_gerstners[1].cst_A",
"cst_gerstners[2].cst_A",
"cst_gerstners[3].cst_A" };

gives me 0, 32, 64, and 96 respectively (as if the data were tightly packed, which it is supposed to be).

Thus, if I follow GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, it won't work... And as I mentioned, it works on Nvidia.
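
As far as I understand the spec, GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT only constrains the offset argument of glBindBufferRange, that is, where a block may start inside a buffer object; it says nothing about the offsets of members inside the block. A rough sketch of how that value would typically be used (align_up and block_start are illustrative names, and the 256 is simply the value reported above):

```cpp
#include <cassert>
#include <cstddef>

// Round `offset` up to the next multiple of `alignment`.
// `alignment` stands in for the value returned by
// glGetIntegerv(GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, ...), e.g. 256 here.
std::size_t align_up(std::size_t offset, std::size_t alignment)
{
    assert(alignment > 0);
    return (offset + alignment - 1) / alignment * alignment;
}

// Hypothetical helper: byte offset at which the i-th uniform block would
// start when several blocks of `block_size` bytes share one buffer object
// and each one is bound with glBindBufferRange.
std::size_t block_start(std::size_t i, std::size_t block_size, std::size_t alignment)
{
    return i * align_up(block_size, alignment);
}
```

With a single 128-byte block bound at offset 0 through glBindBufferBase, as in this thread, the constraint is trivially satisfied, so the 0/32/64/96 member offsets are not in conflict with it.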

Groovounet
02-15-2011, 09:45 AM
Remember that just because it works on nVidia doesn't mean it's standard behaviour ;)

I am quite surprised that GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT returns 256, but maybe. (I had 64 in my head for some reason.)

Anyway, that's what the driver should expect, and that's why your program doesn't work.

Alfonse Reinheart
02-15-2011, 11:17 AM
He may be running into this bug (http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=287747#Post287747).

_blitz
02-15-2011, 11:26 AM
Remember that just because it works on nVidia doesn't mean it's standard behaviour ;)

I am quite surprised that GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT returns 256, but maybe. (I had 64 in my head for some reason.)

Anyway, that's what the driver should expect, and that's why your program doesn't work.
How do you explain then that data is recovered from index 0 and 3 ?

Alfonse Reinheart
02-15-2011, 11:30 AM
How do you explain then that data is recovered from index 0 and 3 ?

It would be easier to do so if you showed us the code where you're uploading data to the buffer object.

Groovounet
02-15-2011, 11:31 AM
You mean, as if the array of structures was a structure of array?

Kind of crazy, yes. Sounds like a bug then. :p

_blitz
02-15-2011, 11:44 AM
He may be running into this bug (http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=287747#Post287747).
"He" used the function of the thread you mentioned.
Output :

Uniform block "Gerstners":
0000: 35664 cst_gerstners[0].cst_size
0008: 35664 cst_gerstners[0].cst_k
0016: 5126 cst_gerstners[0].cst_A
0020: 5126 cst_gerstners[0].cst_w
0024: 5126 cst_gerstners[0].cst_phi
0028: 5126 cst_gerstners[0].private
0032: 35664 cst_gerstners[1].cst_size
0040: 35664 cst_gerstners[1].cst_k
0048: 5126 cst_gerstners[1].cst_A
0052: 5126 cst_gerstners[1].cst_w
0056: 5126 cst_gerstners[1].cst_phi
0060: 5126 cst_gerstners[1].private
0064: 35664 cst_gerstners[2].cst_size
0072: 35664 cst_gerstners[2].cst_k
0080: 5126 cst_gerstners[2].cst_A
0084: 5126 cst_gerstners[2].cst_w
0088: 5126 cst_gerstners[2].cst_phi
0092: 5126 cst_gerstners[2].private
0096: 35664 cst_gerstners[3].cst_size
0104: 35664 cst_gerstners[3].cst_k
0112: 5126 cst_gerstners[3].cst_A
0116: 5126 cst_gerstners[3].cst_w
0120: 5126 cst_gerstners[3].cst_phi
0124: 5126 cst_gerstners[3].private
cst_gerstners[0] and cst_gerstners[3] are read okay by the shader, but not the ones in between.
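
The offsets printed above can also be cross-checked at compile time on the C++ side. This is only a sketch: GerstnerParamsCpu and element_offset are names I made up, and it assumes 4-byte floats with no compiler padding. The last member is called reserved because private is a C++ keyword.

```cpp
#include <cstddef>

// CPU-side mirror of the GLSL struct GerstnerParams (names follow the shader).
// Assumption: float is 4 bytes and the compiler inserts no padding here.
struct GerstnerParamsCpu
{
    float cst_size[2]; // vec2,  expected offset 0
    float cst_k[2];    // vec2,  expected offset 8
    float cst_A;       // float, expected offset 16
    float cst_w;       // float, expected offset 20
    float cst_phi;     // float, expected offset 24
    float reserved;    // float, expected offset 28 ("private" in the shader)
};

// These mirror the offsets reported for cst_gerstners[0].
static_assert(offsetof(GerstnerParamsCpu, cst_size) == 0,  "cst_size offset");
static_assert(offsetof(GerstnerParamsCpu, cst_k)    == 8,  "cst_k offset");
static_assert(offsetof(GerstnerParamsCpu, cst_A)    == 16, "cst_A offset");
static_assert(offsetof(GerstnerParamsCpu, cst_w)    == 20, "cst_w offset");
static_assert(offsetof(GerstnerParamsCpu, cst_phi)  == 24, "cst_phi offset");
static_assert(offsetof(GerstnerParamsCpu, reserved) == 28, "reserved offset");
static_assert(sizeof(GerstnerParamsCpu) == 32, "array stride must be 32 bytes");

// Byte offset of element i of cst_gerstners[]. std140 rounds the array
// stride up to a multiple of 16 bytes; 32 already is one.
constexpr std::size_t element_offset(std::size_t i)
{
    return i * sizeof(GerstnerParamsCpu);
}
```

If these assertions hold, a plain memcpy of the C++ array into the buffer should match the layout the driver reports (0, 32, 64, 96 for the four elements).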

_blitz
02-15-2011, 11:58 AM
How do you explain then that data is recovered from index 0 and 3 ?

It would be easier to do so if you showed us the code where you're uploading data to the buffer object.
No problem ! Here's the code :

namespace gerstner {
namespace param {
enum {
FIRST = 0,
SECOND,
THIRD,
FOURTH,

MAX
};
}
typedef struct _GerstnerParameters {
float x0max; ///< width of the simulated plane
float z0max; ///< depth of the simulated plane
float kx; ///< wave vector x
float ky; ///< wave vector y
float A; ///< wave amplitude
float omega; ///< wave frequency
float phi; ///< wave phase
char __reserved[4]; ///< round size to pow of two
} Params;

Params params[param::MAX];
}
/* ... */
// upload Gerstner params (this is not optimal, uniforms should be updated only when modified)
glBindBuffer(GL_UNIFORM_BUFFER, gl::buffers[gl::buffer::UNIFORM_GERSTNER_PARAMS]);
gerstner::Params* params = (gerstner::Params*) glMapBufferRange( GL_UNIFORM_BUFFER,
0,
sizeof(gerstner::Params) * gerstner::param::MAX,
GL_MAP_WRITE_BIT |
GL_MAP_UNSYNCHRONIZED_BIT |
GL_MAP_INVALIDATE_RANGE_BIT);

std::memcpy(params, &(gerstner::params[0].x0max), sizeof(gerstner::Params) * gerstner::param::MAX);
glUnmapBuffer(GL_UNIFORM_BUFFER);

I checked the data in gDebugger.

You can check out the code here : https://subversion.assembla.com/svn/gpuocean/
(lots of pdfs so the checkout may take some time, sorry about that), directory sample03.

Alfonse Reinheart
02-15-2011, 12:25 PM
GL_MAP_UNSYNCHRONIZED_BIT |


OK, so you're using unsynchronized access. What steps are you taking to ensure that your upload is properly synchronized with OpenGL?

Also, I'm not really sure what the purpose of the invalidate range is in tandem with unsynchronized access.

_blitz
02-15-2011, 12:32 PM
GL_MAP_UNSYNCHRONIZED_BIT |


OK, so you're using unsynchronized access. What steps are you taking to ensure that your upload is properly synchronized with OpenGL?

Also, I'm not really sure what the purpose of the invalidate range is in tandem with unsynchronized access.
Swapbuffers is my synchroniser, the update occurs once per frame.

Alfonse Reinheart
02-15-2011, 02:27 PM
Swapbuffers is my synchroniser, the update occurs once per frame.

SwapBuffers does not necessarily block until rendering has completed. It may, but it does not have to.

And because of that, it probably doesn't. No point in blocking the CPU while the GPU renders, after all. As long as the driver is threaded, it can do the actual swap once the GPU signals that the rendering is over.

_blitz
02-15-2011, 03:19 PM
You mean that a buffer may be swapped while the GPU hasn't finished writing to it? Whenever this happens you'd end up with random renderings... Never seen this happen, perhaps most of the swap buffers implementations have a sync command before the swap actually occurs?
Anyway, if I add glFinish at the end of the render function or remove the GL_MAP_UNSYNCHRONIZED_BIT flag from glMapBufferRange, it still doesn't work; the same issue remains.

Alfonse Reinheart
02-15-2011, 06:07 PM
You mean that a buffer may be swapped while the GPU hasn't finished writing to it?

I mean SwapBuffers will return before it actually swaps the buffers. Much like glReadPixels returns before it finishes, or possibly even starts, reading pixels.

_blitz
02-16-2011, 01:28 AM
Okay, here's a minimal example.
The bug is present on my platform.

/// /////////////////////////////////////////////////////////////
/// \author Jonathan DUPUY
/// \date 16.02.2011
/// \version 1.0
///
/// \sample Testing Uniform Array

/// /////////////////////////////////////////////////////////////

#define GL3_PROTOTYPES 1
#include <GL/gl3.h>
#define __gl_h_
#include <GL/freeglut.h>

#include <stdint.h>
#include <string>
#include <vector>
#include <cmath>
#include <cstring>
#include <unistd.h>
#include <sstream>
#include <algorithm>
#include <iomanip>
#include <iostream>

#ifdef _WIN32
#include <windows.h>
#include <winbase.h>
#else
#include <sys/time.h>
#endif

#define UTILS_BUFFER_OFFSET(i) ((char*)NULL+(i))
/////////////////////////////////////////////////////////////
// Variables
/////////////////////////////////////////////////////////////
namespace
{
// window attribs
namespace window
{
int32_t width = 800;
int32_t height = 500;
const int32_t gl_major = 3;
const int32_t gl_minor = 3;
const char* name = "Driver test";
}

// OpenGL variables
namespace gl
{
// buffer objects
namespace buffer
{
enum
{
VERTEX_QUAD,

UNIFORM_PARAMS,
MAX
};
}
// vertex array objects
namespace varray
{
enum
{
QUAD = 0,
MAX
};
}

// programs
namespace program
{
enum
{
SUM = 0,
MAX
};
}


GLuint buffers[buffer::MAX];
GLuint varrays[varray::MAX];
GLuint programs[program::MAX];
}

// scene params
namespace scene
{
}

// color parameters (one entry per quad)
namespace color
{
namespace param
{
enum
{
FIRST = 0,
SECOND,
THIRD,
FOURTH,

MAX
};
}
typedef struct _ShaderParameters
{
float red; ///< red channel
float green; ///< green channel
float blue; ///< blue channel
float alpha; ///< alpha channel
char __reserved[16]; ///< pad the struct to 32 bytes
} Params;

Params params[param::MAX];
}



}


/////////////////////////////////////////////////////////////
// Functions
/////////////////////////////////////////////////////////////
bool onInit();
void onShutdown();
void onRender();
void onResize(int32_t w, int32_t h);
void onKeyboard(uint8_t k, int32_t x, int32_t y);
void onSpecialKey(int32_t k, int32_t x, int32_t y);
void onMouseButton(int32_t button, int32_t state, int32_t x, int32_t y);
void onMouseMotion(int32_t x, int32_t y);
void onMousePassiveMotion(int32_t x, int32_t y);
void onIdle();


/////////////////////////////////////////////////////////////
// Main
/////////////////////////////////////////////////////////////
int main(int argc, char *argv[])
{
// create a GL context
glutInit(&argc, argv);
glutInitContextVersion(window::gl_major, window::gl_minor);
glutInitContextProfile(GLUT_COMPATIBILITY_PROFILE) ; // can't go to core profile here because of tw
// glutInitContextFlags (GLUT_FORWARD_COMPATIBLE | GLUT_DEBUG);

// configure glut
glutInitDisplayMode(GLUT_DEPTH | GLUT_DOUBLE | GLUT_RGBA);
glutInitWindowSize(window::width , window::height);
glutInitWindowPosition(32, 32);

glutCreateWindow(window::name);

glutReshapeFunc(onResize);
glutDisplayFunc(onRender);
glutKeyboardFunc(onKeyboard);
glutSpecialFunc(onSpecialKey);
glutMouseFunc(onMouseButton);
glutMotionFunc(onMouseMotion);
glutPassiveMotionFunc(onMousePassiveMotion);
glutIdleFunc(onIdle);
glutCloseFunc(onShutdown);

// try running the app and clean if necessary
if(onInit())
glutMainLoop();
else
onShutdown();

return 0;
}


/////////////////////////////////////////////////////////////
// Helper functions
/////////////////////////////////////////////////////////////

// get time in seconds
double time()
{
#ifdef _WIN32
__int64 time;
__int64 cpuFrequency;
QueryPerformanceCounter((LARGE_INTEGER*) &time);
QueryPerformanceFrequency((LARGE_INTEGER*) &cpuFrequency);
return time / double(cpuFrequency);
#else
static double t0 = 0;
timeval tv;
gettimeofday(&tv, NULL);
if (!t0)
{
t0 = tv.tv_sec; // don't care about useconds here
}
return double(tv.tv_sec-t0) + double(tv.tv_usec) / 1e6;
#endif
}

void print_uniform_block_info(GLuint prog, GLint block_index, std::string const &indent = std::string())
{
// Fetch uniform block name:
GLint name_length;
glGetActiveUniformBlockiv(prog, block_index, GL_UNIFORM_BLOCK_NAME_LENGTH, &name_length);
std::string block_name(name_length, 0);
glGetActiveUniformBlockName(prog, block_index, name_length, NULL, &block_name[0]);

// Fetch info on each active uniform:
GLint active_uniforms = 0;
glGetActiveUniformBlockiv(prog, block_index, GL_UNIFORM_BLOCK_ACTIVE_UNIFORMS, &active_uniforms);

std::vector<GLuint> uniform_indices(active_uniforms, 0);
glGetActiveUniformBlockiv(prog, block_index, GL_UNIFORM_BLOCK_ACTIVE_UNIFORM_INDICES, reinterpret_cast<GLint*>(&uniform_indices[0]));

std::vector<GLint> name_lengths(uniform_indices.size(), 0);
glGetActiveUniformsiv(prog, uniform_indices.size(), &uniform_indices[0], GL_UNIFORM_NAME_LENGTH, &name_lengths[0]);

std::vector<GLint> offsets(uniform_indices.size(), 0);
glGetActiveUniformsiv(prog, uniform_indices.size(), &uniform_indices[0], GL_UNIFORM_OFFSET, &offsets[0]);

std::vector<GLint> types(uniform_indices.size(), 0);
glGetActiveUniformsiv(prog, uniform_indices.size(), &uniform_indices[0], GL_UNIFORM_TYPE, &types[0]);

std::vector<GLint> sizes(uniform_indices.size(), 0);
glGetActiveUniformsiv(prog, uniform_indices.size(), &uniform_indices[0], GL_UNIFORM_SIZE, &sizes[0]);

std::vector<GLint> strides(uniform_indices.size(), 0);
glGetActiveUniformsiv(prog, uniform_indices.size(), &uniform_indices[0], GL_UNIFORM_ARRAY_STRIDE, &strides[0]);

// Build a string detailing each uniform in the block:
std::vector<std::string> uniform_details;
uniform_details.reserve(uniform_indices.size());
for(std::size_t i = 0; i < uniform_indices.size(); ++i)
{
GLuint const uniform_index = uniform_indices[i];

std::string name(name_lengths[i], 0);
glGetActiveUniformName(prog, uniform_index, name_lengths[i], NULL, &name[0]);

std::ostringstream details;
details << std::setfill('0') << std::setw(4) << offsets[i] << ": " << std::setfill(' ') << std::setw(5) << types[i] << " " << name;

if(sizes[i] > 1)
{
details << "[" << sizes[i] << "]";
}

details << "\n";
uniform_details.push_back(details.str());
}

// Sort uniform detail strings alphabetically. (Since the detail strings
// start with the uniform's zero-padded byte offset, this will order the
// uniforms in the order they are laid out in memory.)
std::sort(uniform_details.begin(), uniform_details.end());

// Output details:
std::cout << indent << "Uniform block \"" << block_name << "\":\n";
for(auto detail = uniform_details.begin(); detail != uniform_details.end(); ++detail)
{
std::cout << indent << " " << *detail;
}
}


GLvoid printShaderLog(GLuint shader)
{
GLint infologLength = 0;
GLint charsWritten = 0;
GLchar *infoLog;

glGetShaderiv(shader, GL_INFO_LOG_LENGTH, &infologLength);

if (infologLength > 0)
{
infoLog = (GLchar *)malloc(infologLength);
glGetShaderInfoLog(shader, infologLength, &amp;charsWritten, infoLog);
std::cout << infoLog << "\n";
free(infoLog);
}
}

/////////////////////////////////////////////////////////////
// Call back impl
/////////////////////////////////////////////////////////////
// Init
bool onInit()
{
// ---------------------
// init GL
glGenBuffers(gl::buffer::MAX, gl::buffers);
glGenVertexArrays(gl::varray::MAX, gl::varrays);
for(uint8_t i = 0; i < gl::program::MAX; ++i)
gl::programs[i] = glCreateProgram();

// vertex buffer to draw 4 quads
std::vector<float> quad_vertices;
for(uint8_t i = 0; i < 4u; ++i)
{
quad_vertices.push_back(-1.0f); // ll
quad_vertices.push_back(-1.0f);
quad_vertices.push_back( 1.0f); // ur
quad_vertices.push_back( 1.0f);
quad_vertices.push_back(-1.0f); // ul
quad_vertices.push_back( 1.0f);
quad_vertices.push_back(-1.0f); // ll
quad_vertices.push_back(-1.0f);
quad_vertices.push_back( 1.0f); // lr
quad_vertices.push_back(-1.0f);
quad_vertices.push_back( 1.0f); // ur
quad_vertices.push_back( 1.0f);
}
// param buffer
color::params[0].red = 0.1f;
color::params[0].green = 0.0f;
color::params[0].blue = 0.0f;
color::params[0].alpha = 0.0f;

color::params[1].red = 0.0f;
color::params[1].green = 0.25f;
color::params[1].blue = 0.75f;
color::params[1].alpha = 0.0f;

color::params[2].red = 0.0f;
color::params[2].green = 0.75f;
color::params[2].blue = 0.25f;
color::params[2].alpha = 0.0f;

color::params[3].red = 0.9f;
color::params[3].green = 0.0f;
color::params[3].blue = 0.0f;
color::params[3].alpha = 0.0f; // the four quads blended together should give white

glBindBuffer(GL_ARRAY_BUFFER, gl::buffers[gl::buffer::VERTEX_QUAD]);
glBufferData(GL_ARRAY_BUFFER, sizeof(float)*quad_vertices.size(), &quad_vertices[0], GL_STATIC_DRAW);
glBindBuffer(GL_ARRAY_BUFFER, 0);

glBindBuffer(GL_UNIFORM_BUFFER, gl::buffers[gl::buffer::UNIFORM_PARAMS]);
glBufferData(GL_UNIFORM_BUFFER, sizeof(color::Params)*color::param::MAX, &(color::params[0].red), GL_STREAM_COPY);
glBindBuffer(GL_UNIFORM_BUFFER, 0);

glBindBufferBase( GL_UNIFORM_BUFFER,
gl::buffer::UNIFORM_PARAMS,
gl::buffers[gl::buffer::UNIFORM_PARAMS]);

// vertex arrays
glBindVertexArray(gl::varrays[gl::varray::QUAD]);
glBindBuffer(GL_ARRAY_BUFFER, gl::buffers[gl::buffer::VERTEX_QUAD]);

glEnableVertexAttribArray(0);
glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, UTILS_BUFFER_OFFSET(0));
glBindVertexArray(0);

// programs
const GLchar* vshader_src[] =
{
"#version 330 core \n",
"layout(location=0) in vec2 vsin_pos;\n",
"flat out int vsout_id;\n",
"void main()\n",
"{\n",
" vsout_id = gl_VertexID/6;\n",
" gl_Position = vec4(vsin_pos, 0.0, 1.0);\n",
"}\n"
};

const GLchar* fshader_src[] =
{
"#version 330 core \n",
"struct Parameter\n",
"{\n",
" vec2 rg;\n",
" vec2 ba;\n",
" vec4 reserved;\n",
"};\n",
"layout(std140) uniform Params\n",
"{\n",
" Parameter cst_params[4];\n"
"};\n",
"flat in int vsout_id;\n",
"out vec4 fsout_color;\n",
"void main()\n",
"{\n",
" int ID = vsout_id;\n",
" fsout_color = vec4(cst_params[ID].rg, cst_params[ID].ba);\n",
" vec4 tmp = cst_params[ID].reserved;\n",
"}\n"
};

GLuint vshader_sum = glCreateShader(GL_VERTEX_SHADER);
glShaderSource(vshader_sum, 8, vshader_src, NULL);
glCompileShader(vshader_sum);

printShaderLog(vshader_sum);

GLuint fshader_sum = glCreateShader(GL_FRAGMENT_SHADER);
glShaderSource(fshader_sum, 18, fshader_src, NULL);
glCompileShader(fshader_sum);
GLint length = 0;
// glGetShaderiv(fshader_sum, GL_SHADER_SOURCE_LENGTH, &length);
GLchar buffer[512];
glGetShaderSource(fshader_sum, 512, &length, buffer);
std::cout << "source : \n" << buffer << "\n";
printShaderLog(fshader_sum);

glAttachShader(gl::programs[gl::program::SUM], vshader_sum);
glAttachShader(gl::programs[gl::program::SUM], fshader_sum);
glLinkProgram(gl::programs[gl::program::SUM]);

glUseProgram(gl::programs[gl::program::SUM]);
glUniformBlockBinding( gl::programs[gl::program::SUM],
glGetUniformBlockIndex(gl::programs[gl::program::SUM], "Params"),
gl::buffer::UNIFORM_PARAMS );

print_uniform_block_info(gl::programs[gl::program::SUM], glGetUniformBlockIndex(gl::programs[gl::program::SUM], "Params"));

glUseProgram(0);
glDeleteShader(vshader_sum);
glDeleteShader(fshader_sum);

glClearColor(0.0f,0.0f,0.0f,0.0f);

glEnable(GL_DEPTH_TEST);
glViewport(0,0,window::width, window::height);

// check errors
int error = glGetError();
std::cout << "gl error : " << error << "\n";
if(error)
return false;

return true;
}


/////////////////////////////////////////////////////////////
// Shutdown
void onShutdown()
{
glDeleteBuffers(gl::buffer::MAX, gl::buffers);
glDeleteVertexArrays(gl::varray::MAX, gl::varrays);
for(uint8_t i = 0; i < gl::program::MAX; ++i)
glDeleteProgram(gl::programs[i]);

// make sure we're done
glFinish();
}


/////////////////////////////////////////////////////////////
// Render
void onRender()
{
// time
double t = time();

// upload colors
// glBindBuffer(GL_UNIFORM_BUFFER, gl::buffers[gl::buffer::UNIFORM_PARAMS]);
// gerstner::Params* params = (gerstner::Params*) glMapBufferRange( GL_UNIFORM_BUFFER,
// 0,
// sizeof(gerstner::Params) * gerstner::param::MAX,
// GL_MAP_WRITE_BIT |
// GL_MAP_UNSYNCHRONIZED_BIT |
// GL_MAP_INVALIDATE_RANGE_BIT);

// std::memcpy(params, &(gerstner::params[0].x0max), sizeof(gerstner::Params) * gerstner::param::MAX);
// glUnmapBuffer(GL_UNIFORM_BUFFER);

glClear(GL_COLOR_BUFFER_BIT);
glDisable(GL_DEPTH_TEST);
glEnable(GL_BLEND);
glBlendFunc(GL_ONE, GL_ONE);
glBlendEquation(GL_FUNC_ADD);
// draw 4 full-screen quads
glUseProgram(gl::programs[gl::program::SUM]);
glBindVertexArray(gl::varrays[gl::varray::QUAD]);
glDrawArrays( GL_TRIANGLES,
0,
24 );

glBindVertexArray(0);
glUseProgram(0);

glutSwapBuffers();
}


/////////////////////////////////////////////////////////////
// Resize
void onResize(int32_t w, int32_t h)
{
window::width = w;
window::height = h;
glViewport(0,0,w,h);
}


/////////////////////////////////////////////////////////////
// Keyboard
void onKeyboard(uint8_t k, int32_t x, int32_t y)
{
// handle esc key
switch(k)
{
case 27:
onShutdown();
exit(0);
break;
}
}


/////////////////////////////////////////////////////////////
// Special keys
void onSpecialKey(int32_t k, int32_t x, int32_t y)
{

}

/////////////////////////////////////////////////////////////
// Passive mouse
void onMousePassiveMotion(int32_t x, int32_t y)
{

}


/////////////////////////////////////////////////////////////
// Button event
void onMouseButton(int32_t button, int32_t state, int32_t x, int32_t y)
{

}


/////////////////////////////////////////////////////////////
// Mouse active motion
void onMouseMotion(int32_t x, int32_t y)
{

}


/////////////////////////////////////////////////////////////
// idle
void onIdle()
{
// sleep
#ifdef _WIN32
Sleep(1);
#else
usleep(1000);
#endif

onRender();
}


Makefile :

BIN = sample

CPP = g++
LDD = g++

CPPFLAGS = -ansi -pedantic -std=c++0x -I ./
LDDFLAGS = -lm -lGL -lglut

FILES_SRC = $(wildcard *.cpp)
FILES_OBJ = $(FILES_SRC:%.cpp=%.o)
FILES_DEP = $(FILES_SRC:%.cpp=%.d)

default: $(BIN)

all: $(BIN)

$(BIN): $(FILES_OBJ)
$(LDD) -o $@ $^ $(LDDFLAGS)

$(DIR_OBJ)/%.o: $(DIR_SRC)/%.c
$(CPP) $(CPPFLAGS) -o $@ -c $<

$(DIR_DEP)/%.d: $(DIR_SRC)/%.c
$(CPP) $(CPPFLAGS) -MM -MD -o $@ $<

-include $(FILES_DEP)

# cleanup rules
.PHONY: clean distclean

clean :
rm -rf $(FILES_OBJ) $(FILES_DEP)

distclean:
rm -f $(BIN)

(You will need to provide the GL/gl3.h)

The bug is still present: buffer[0] and buffer[3] seem to be read, the others are ignored?! (And I'm not even mapping the buffer anymore.)

EDIT: Not working on Catalyst 11.2 either.

_blitz
02-18-2011, 12:50 AM
I made a small mistake in the makefile (but it compiled anyway)
so here's a valid one

BIN = sample

CPP = g++
LDD = g++

CPPFLAGS = -ansi -pedantic -std=c++0x -I ./
LDDFLAGS = -lm -lGL -lglut

FILES_SRC = $(wildcard *.cpp)
FILES_OBJ = $(FILES_SRC:%.cpp=%.o)
FILES_DEP = $(FILES_SRC:%.cpp=%.d)

default: $(BIN)

all: $(BIN)

$(BIN): $(FILES_OBJ)
$(LDD) -o $@ $^ $(LDDFLAGS)

%.o: %.cpp
$(CPP) $(CPPFLAGS) -o $@ -c $<

%.d: %.cpp
$(CPP) $(CPPFLAGS) -MM -MD -o $@ $<

-include $(FILES_DEP)

.PHONY: clean distclean

clean :
rm -rf $(FILES_OBJ) $(FILES_DEP)

distclean:
rm -f $(BIN)

Has anyone tried to run the sample yet ?
Could I have confirmation that this is an abnormal behaviour ?
Thanks !

Dark Photon
02-18-2011, 11:42 AM
Has anyone tried to run the sample yet ?
Could I have confirmation that this is an abnormal behaviour ?
Thanks !
Don't have an ATI card to try it on, but I did try it on NVidia. Had to make the mods below (I don't understand how that "auto" ref in your code compiles -- any insight?). But first here are the results on NVidia GTX285 with 260.19.36 drivers (Linux):




source :
#version 330 core
struct Parameter
{
vec2 rg;
vec2 ba;
vec4 reserved;
};
layout(std140) uniform Params
{
Parameter cst_params[4];
};
flat in int vsout_id;
out vec4 fsout_color;
void main()
{
int ID = vsout_id;
fsout_color = vec4(cst_params[ID].rg, cst_params[ID].ba);
vec4 tmp = cst_params[ID].reserved;
}


Uniform block "Params\x00":
0000: 35664 cst_params[0].rg\x00
0008: 35664 cst_params[0].ba\x00
0016: 35666 cst_params[0].reserved\x00
0032: 35664 cst_params[1].rg\x00
0040: 35664 cst_params[1].ba\x00
0048: 35666 cst_params[1].reserved\x00
0064: 35664 cst_params[2].rg\x00
0072: 35664 cst_params[2].ba\x00
0080: 35666 cst_params[2].reserved\x00
0096: 35664 cst_params[3].rg\x00
0104: 35664 cst_params[3].ba\x00
0112: 35666 cst_params[3].reserved\x00
gl error : 0

And the mods:


> diff tst.orig.cxx tst.cxx
149c149
< glutInitContextProfile(GLUT_COMPATIBILITY_PROFILE) ; // can't go to core profile here because of tw
---
> //glutInitContextProfile(GLUT_COMPATIBILITY_PROFILE) ; // can't go to core profile here because of tw
263c263
< for(auto detail = uniform_details.begin(); detail != uniform_details.end(); ++detail)
---
> for( unsigned i = 0; i < uniform_details.size(); i++ )
265c265
< std::cout << indent << " " << *detail;
---
> std::cout << indent << " " << uniform_details[ i ];
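
A side note on the \x00 suffixes in the output above: as far as I can tell they come from the diagnostic function, not from the driver. GL_UNIFORM_BLOCK_NAME_LENGTH and GL_UNIFORM_NAME_LENGTH count the null terminator, so a std::string sized with that length keeps a trailing '\0'. The sketch below reproduces this without a GL context; fake_get_name is a made-up stand-in for glGetActiveUniformName.

```cpp
#include <cstring>
#include <string>

// Stand-in for glGetActiveUniformName (hypothetical, no GL context needed):
// copies buf_size - 1 characters of `src` into `dst` and null-terminates it.
void fake_get_name(const char* src, int buf_size, char* dst)
{
    std::memcpy(dst, src, static_cast<std::size_t>(buf_size - 1));
    dst[buf_size - 1] = '\0';
}

// reported_length mimics GL_UNIFORM_NAME_LENGTH, which includes the terminator.
std::string fetch_name(const char* src, bool trim)
{
    const int reported_length = static_cast<int>(std::strlen(src)) + 1;
    std::string name(reported_length, '\0');
    fake_get_name(src, reported_length, &name[0]);
    if (trim && !name.empty())
        name.resize(name.size() - 1); // drop the terminator counted in the length
    return name;
}
```

Resizing the string by one after the query, as in the trim branch, should make the printed names come out clean.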

wien
02-18-2011, 01:24 PM
I don't understand how that "auto" ref in your code compiles -- any insight?
That's a C++0x feature (http://en.wikipedia.org/wiki/C%2B%2B0x#Type_inference). You'll need VS2010 or a recent GCC (4.5 I think) with -std=c++0x for it to work.
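
To illustrate, here is a small sketch of the same loop written both ways. The function names are made up; join_auto needs -std=c++0x (or later), join_explicit compiles as C++98.

```cpp
#include <string>
#include <vector>

// C++98 spelling: the iterator type has to be written out in full.
std::string join_explicit(const std::vector<std::string>& v)
{
    std::string out;
    for (std::vector<std::string>::const_iterator it = v.begin(); it != v.end(); ++it)
        out += *it;
    return out;
}

// C++0x spelling: `auto` lets the compiler deduce the iterator type.
std::string join_auto(const std::vector<std::string>& v)
{
    std::string out;
    for (auto it = v.begin(); it != v.end(); ++it)
        out += *it;
    return out;
}
```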

_blitz
02-19-2011, 03:38 AM
I just copy/pasted the UBO diagnostic function from this thread: http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=287747#Post287747
With Nvidia, I get a white screen (meaning that all 4 elements in the array are read), whereas with ATI/AMD, I get a red screen (first and last elements read). Could you confirm the Nvidia result I got?
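
The white versus red reasoning can be checked with a few lines of arithmetic. This sketch emulates the GL_ONE/GL_ONE additive blend on the CPU with the colours from onInit(); it assumes (my assumption, not something verified on the hardware) that an element the shader fails to read contributes zero.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

struct Rgba { float r, g, b, a; };

// The four colours uploaded by the sample (values copied from onInit()).
const std::array<Rgba, 4> kParams = {{
    {0.1f, 0.0f,  0.0f,  0.0f},
    {0.0f, 0.25f, 0.75f, 0.0f},
    {0.0f, 0.75f, 0.25f, 0.0f},
    {0.9f, 0.0f,  0.0f,  0.0f},
}};

// Sums the colours of the elements flagged as readable, then clamps each
// channel to 1.0 as a normalized colour buffer would.
// Assumption: an unreadable element contributes (0, 0, 0, 0).
Rgba accumulate(const std::array<bool, 4>& readable)
{
    Rgba sum = {0.0f, 0.0f, 0.0f, 0.0f};
    for (std::size_t i = 0; i < kParams.size(); ++i)
    {
        if (!readable[i])
            continue;
        sum.r += kParams[i].r;
        sum.g += kParams[i].g;
        sum.b += kParams[i].b;
        sum.a += kParams[i].a;
    }
    sum.r = std::min(sum.r, 1.0f);
    sum.g = std::min(sum.g, 1.0f);
    sum.b = std::min(sum.b, 1.0f);
    sum.a = std::min(sum.a, 1.0f);
    return sum;
}
```

With all four elements readable, every colour channel sums to roughly 1 (white); with only elements 0 and 3, red sums to roughly 1 while green and blue stay at 0, which matches the red screen.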

Dark Photon
02-20-2011, 05:57 PM
With Nvidia, I get a white screen (meaning that all 4 elements in the array are read), whereas with ATI/AMD, I get a red screen (first and last elements read). Could you confirm the Nvidia result I got?
White screen, like you.


That's a C++0x feature. You'll need VS2010 or a recent GCC (4.5 I think) with -std=c++0x for it to work.
Ah! Thanks. This is the first time C++0x has crossed my radar. If I'd just used his options it would have worked (GCC 4.5.0 here).

Groovounet
02-20-2011, 06:18 PM
Off topic



Ah! Thanks. This is the first time C++0x has crossed my radar. If I'd just used his options it would have worked (GCC 4.5.0 here).


If it's the first time then have a look at the new static_assert keyword, magnificent!

Dark Photon
02-22-2011, 06:27 PM
Will do. Thanks.

frank li
03-22-2011, 07:14 PM
We confirmed it's a driver bug. It will be fixed soon. Thanks for your feedback.

_blitz
03-23-2011, 12:26 AM
Great !