Passing large mat4 array to fragment shader



Iaolia456
04-15-2016, 02:04 AM
Hello. Firstly, I'm new to GLSL. Right now I'm porting my CPU-side code to a GLSL implementation.

[short version]
What is the best way to pass 224 mat4 values to the fragment shader and store them there so that I can do calculations with them?

[long version]
In my code I'm doing matrix multiplication to project 3D points, so I need each camera's transformation and parameters, which are stored as mat4 (I compute camera param * camera transform * 3D point). Each camera has 2 mat4 and there are 112 cameras in total, so there are 224 mat4 to be exact. Right now I'm declaring uniform arrays of mat4 in the fragment shader like so,

uniform mat4 cam_views[128];
uniform mat4 cam_params[128];

and I pass the value to the shader like so,


//cam_trans and cam_params are arrays of GLfloat of size [16 * 128]
glUniformMatrix4fv(glGetUniformLocation(shaderProgramID, "cam_views"), 128, GL_TRUE, cam_trans);
glUniformMatrix4fv(glGetUniformLocation(shaderProgramID, "cam_params"), 128, GL_TRUE, cam_params);

The problem is that when I try to access the uniform in my shader, it seems that the value of the uniform is null. My code instantly crashes with a read access violation error. Here is an example of the fragment shader code that causes the crash. Note that if I change cam_views[current_cam] to cam_views[any constant less than 128] it works fine.


for (int current_cam = 0; current_cam < 128; current_cam++) {
    vec4 pixel3D_pos_acam = cam_views[current_cam] * pixel3D_coord;
    vec4 img_2d_pos = cam_params[current_cam] * pixel3D_pos_acam;
    img_2d_pos.x /= img_2d_pos.z;
    img_2d_pos.y /= img_2d_pos.z;
    fColor = vec4(img_2d_pos.x, 0, 0, 1); //it runs fine without this line
}

So my question is what is the best (and correct) way to pass such a huge array of mat4 to the shader? I have heard of using a float texture, but if I use one, how can I read the values back and build a mat4? I need to do matrix multiplication. Or maybe a uniform buffer? I don't quite understand how to use either of these.

Extra question: After this I also need to access many (112) huge arrays of float (360,000 elements each) within the shader. I also need to access image data (112 images of 600*600) within the shader. So how do I pass those two huge data sets to the shader? Everything combined needs around 800 MB of memory.


Thank you

GClements
04-15-2016, 11:15 AM
The problem is that when I try to access the uniform in my shader, it seems that the value of the uniform is null. My code instantly crashes with a read access violation error. Here is an example of the fragment shader code that causes the crash. Note that if I change cam_views[current_cam] to cam_views[any constant less than 128] it works fine.

You can obtain the maximum number of uniform components and vectors with glGetIntegerv() with parameters GL_MAX_FRAGMENT_UNIFORM_COMPONENTS and GL_MAX_FRAGMENT_UNIFORM_VECTORS. A mat4 is 4 vectors, so you aren't guaranteed to be able to store more than 64 of them in the default uniform block (that's assuming that there are no other uniform variables).
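As a rough illustration of that arithmetic, here is a minimal C++ sketch. It assumes the limit has already been read with glGetIntegerv(GL_MAX_FRAGMENT_UNIFORM_COMPONENTS, ...); the function name and the other_components parameter are hypothetical and only serve the example.

```cpp
#include <cassert>

// Number of mat4 uniforms that fit in the default uniform block.
// max_components:   value reported for GL_MAX_FRAGMENT_UNIFORM_COMPONENTS.
// other_components: components already used by other uniforms (hypothetical parameter).
// A mat4 occupies 16 components (4 vectors of 4 floats each).
int max_mat4_uniforms(int max_components, int other_components = 0) {
    assert(max_components >= other_components);
    return (max_components - other_components) / 16;
}
```

With a limit of 1024 components this gives 64 matrices, which matches the figure above. The 224 matrices from the question would need 224 * 16 = 3584 components.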



So my question is what is the best (and correct) way to pass such a huge array of mat4 to the shader?

Use a named uniform block in the shader. In the client code, you need to bind a suitably-sized buffer object to the corresponding binding point with glBindBufferBase().



I have heard of using a float texture, but if I use one, how can I read the values back and build a mat4? I need to do matrix multiplication.

With a float texture, you'd need to fetch four texels (each of which is a vec4) and combine them into a mat4.
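To make the layout concrete, here is a small CPU-side C++ sketch that mimics what the shader would do. It assumes a texture 4 texels wide with one matrix per row and RGBA32F texels stored as a flat float array; Vec4, Mat4, fetch_texel and fetch_mat4 are hypothetical names for illustration, not OpenGL API.

```cpp
#include <array>
#include <cstddef>
#include <vector>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>; // four vec4 values, like the arguments of GLSL's mat4(v0, v1, v2, v3)

// Emulates an unfiltered texel fetch from a 4-texel-wide RGBA32F texture
// whose data is a flat float array (4 floats per texel, 4 texels per row).
Vec4 fetch_texel(const std::vector<float>& texels, int x, int y) {
    const std::size_t base = (static_cast<std::size_t>(y) * 4 + static_cast<std::size_t>(x)) * 4;
    return Vec4{texels[base], texels[base + 1], texels[base + 2], texels[base + 3]};
}

// Builds matrix number `index` from the four texels of row `index`.
Mat4 fetch_mat4(const std::vector<float>& texels, int index) {
    Mat4 m{};
    for (int x = 0; x < 4; ++x) {
        m[x] = fetch_texel(texels, x, index);
    }
    return m;
}
```

In GLSL, the four fetched vec4 values become the columns of the mat4 constructor. If the uploaded floats are in row-major order (as the GL_TRUE transpose flag in the question suggests), the result would be the transpose of the intended matrix and would need transposing.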



Extra question: After this I also need to access many (112) huge arrays of float (360,000 elements each) within the shader. I also need to access image data (112 images of 600*600) within the shader. So how do I pass those two huge data sets to the shader? Everything combined needs around 800 MB of memory.

That much data could be a problem.

You'll probably want to use one or more 2D array textures. OpenGL 4 requires support for 1024x1024 textures and at least 256 layers for an array texture, but I don't believe that it necessarily requires support for a 1024x1024x256 array texture.

A uniform block may be limited to 16384 bytes, and a buffer texture may be limited to 65536 texels (each of which may be up to 4 floats). Shader storage buffer objects have a much higher limit (which can be queried with GL_MAX_SHADER_STORAGE_BLOCK_SIZE), but there may still be issues with attempting to allocate most of the available video memory as a single buffer object.
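A quick back-of-the-envelope check of the float arrays against those minimum limits can be sketched in C++ as follows. The sizes come from the question; the constant and function names are hypothetical, and real limits should be queried at run time rather than assumed.

```cpp
#include <cstdint>

// Sizes from the question: 112 arrays of 360,000 floats each.
constexpr std::int64_t kArrayCount = 112;
constexpr std::int64_t kFloatsPerArray = 360000;
constexpr std::int64_t kBytesPerFloat = 4; // GLfloat is a 32-bit float

// Minimum limits quoted above; an implementation may allow more.
constexpr std::int64_t kMinUniformBlockBytes = 16384;
constexpr std::int64_t kMinBufferTextureTexels = 65536;

// Total size of all float arrays in bytes.
constexpr std::int64_t total_float_bytes() {
    return kArrayCount * kFloatsPerArray * kBytesPerFloat;
}

// Texels needed to hold `floats` values when each RGBA32F texel carries 4 floats.
constexpr std::int64_t rgba_texels_needed(std::int64_t floats) {
    return (floats + 3) / 4;
}

constexpr bool one_array_fits_uniform_block() {
    return kFloatsPerArray * kBytesPerFloat <= kMinUniformBlockBytes;
}

constexpr bool one_array_fits_buffer_texture() {
    return rgba_texels_needed(kFloatsPerArray) <= kMinBufferTextureTexels;
}
```

Under these assumptions the float arrays alone take 161,280,000 bytes, and a single array needs 90,000 RGBA texels, so even one array exceeds both minimum limits.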

Iaolia456
04-18-2016, 08:33 PM
Thanks for the reply.

It turns out that the problem that causes the shader to crash is array indexing with a dynamic variable; the array index must be a constant (sorry, I can't post a link to the stackoverflow question. The question's title is "GLSL for-loop array index"). For example, you can't do this

for (int current_cam = 0; current_cam < 128; current_cam++) {
    vec4 pixel3D_pos_acam = cam_views[current_cam] * pixel3D_coord;
}

but all of these are ok

cam_views[0];
cam_views[29];
cam_views[111];

As for whether all the data fits in the shader's uniform variables, I have no idea whether 200+ mat4 can fit or not. So I'm a little concerned that the uniform size limit will cause a nasty bug later. For this reason I have already switched to using a float texture to pass the data instead of uniform variables. Here's how I did it.

fragment shader declaration

uniform sampler2D cam_views_tex;

C++ preparation code

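GLuint cam_trans_texID;
glActiveTexture(GL_TEXTURE1);
glGenTextures(1, &cam_trans_texID);
glBindTexture(GL_TEXTURE_2D, cam_trans_texID);
//no mipmaps are uploaded, so switch off mipmap filtering; otherwise the texture is incomplete
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
//cam_trans is an array of GLfloat of size [16*256]: one mat4 per row, 4 RGBA texels per row
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, 4, 256, 0, GL_RGBA, GL_FLOAT, cam_trans);
//the sampler reads from texture unit 1 (the program must be in use for this call)
glUniform1i(glGetUniformLocation(shaderProgramID, "cam_views_tex"), 1);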



You'll probably want to use one or more 2D array textures
Thanks for the advice, I will try that out for loading the 600*600 images once I finish debugging my shader code.

GClements
04-19-2016, 03:51 AM
It turns out that the problem that causes the shader to crash is array indexing with a dynamic variable; the array index must be a constant

That is incorrect.

The only case where indices must be constant expressions is if the array is declared without an explicit size, e.g.


uniform mat4 cam_views[];


In that case, all accesses must use constant expressions so that the compiler can infer the array size from the code. If the array is explicitly sized, this isn't an issue.



(sorry, I can't post a link to the stackoverflow question. The question's title is "GLSL for-loop array index").

Comments to that reply have pointed out that it is incorrect.



As for whether all the data fits in the shader's uniform variables, I have no idea whether 200+ mat4 can fit or not. So I'm a little concerned that the uniform size limit will cause a nasty bug later. For this reason I have already switched to using a float texture to pass the data instead of uniform variables.

Using a texture avoids the size limitations of using the default uniform block, and was fairly common before uniform buffer objects were added. However, it's likely to be slower than using a uniform variable. If you're targeting versions which support uniform buffer objects (OpenGL 3.1 or later, OpenGL ES 3.0 or later), I'd suggest using those instead.

Iaolia456
04-24-2016, 08:55 PM
Thank you for your reply again.

Just to finish this off, I have successfully done it with the help of a uniform buffer block as you mentioned. The next thing to do is to use a texture array to pass the image data and another big 2D array of GLfloat (bigger than the 65536-byte limit of the uniform buffer block) to the shader. I might post another topic if I need help with texture arrays. Anyway, thanks for your help again.

For the convenience of people who might come here in the future, here is how to use a uniform buffer block.

To check the buffer block size limit use (in C++),

GLint max_buffer_size;
glGetIntegerv(GL_MAX_UNIFORM_BLOCK_SIZE, &max_buffer_size);
cout << "max buffer size: " << max_buffer_size << " bytes" << endl;

fragment shader: declaration of the buffer block

uniform camera_views {
    mat4 transformation[256];
    mat4 parameter[256];
    int block_belong[256];
    bool need_masked[256];
};

C++ code: to set it up and pass value to it


//get the index location and size of the block
GLuint block_index = glGetUniformBlockIndex(shaderProgramID, "camera_views");
GLint block_size;
glGetActiveUniformBlockiv(shaderProgramID, block_index, GL_UNIFORM_BLOCK_DATA_SIZE, &block_size);

//create temporary buffer on CPU side to hold the data
GLubyte *block_buffer = (GLubyte*)malloc(block_size);

//find the offset location (from index) of each variable in the block
const GLchar *var_name[] = { "transformation", "parameter", "block_belong", "need_masked" };
GLuint indices[4];
glGetUniformIndices(shaderProgramID, 4, var_name, indices);
GLint offset[4];
glGetActiveUniformsiv(shaderProgramID, 4, indices, GL_UNIFORM_OFFSET, offset);

//fill temporary buffer
memcpy(block_buffer + offset[0], trans, camera_views.size() * 16 * sizeof(GLfloat));
memcpy(block_buffer + offset[1], param, camera_views.size() * 16 * sizeof(GLfloat));
memcpy(block_buffer + offset[2], block_belong, camera_views.size() * sizeof(GLint));
memcpy(block_buffer + offset[3], need_masked, camera_views.size() * sizeof(GLboolean));

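//create OpenGL buffer, fill it, and attach it to the uniform block through a binding point
GLuint binding_point = 0;
GLuint buffer_handle;
glGenBuffers(1, &buffer_handle);
glBindBuffer(GL_UNIFORM_BUFFER, buffer_handle);
glBufferData(GL_UNIFORM_BUFFER, block_size, block_buffer, GL_DYNAMIC_DRAW);
//the block index is not a binding point: assign the block to a binding point, then bind the buffer there
glUniformBlockBinding(shaderProgramID, block_index, binding_point);
glBindBufferBase(GL_UNIFORM_BUFFER, binding_point, buffer_handle);
free(block_buffer); //the data has been copied into the buffer object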

Finally, fragment shader: to access a value of this block inside the shader, access it just like a normal global variable

transformation[5]


edit: clean up some messy text

GClements
04-25-2016, 08:09 AM
For the convenience of people who might come here in the future, here is how to use a uniform buffer block.

Note that for arrays and matrices, you should also query the strides with GL_UNIFORM_ARRAY_STRIDE and GL_UNIFORM_MATRIX_STRIDE respectively, as the data isn't guaranteed to have the same representation as in C. If a stride isn't equal to the element size, you won't be able to simply memcpy() entire arrays; you'll have to loop over the elements.
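A minimal C++ sketch of such an element-by-element copy might look like this. It assumes the offset and stride were obtained from glGetActiveUniformsiv() with GL_UNIFORM_OFFSET and GL_UNIFORM_ARRAY_STRIDE; copy_with_stride is a hypothetical helper name, not part of OpenGL.

```cpp
#include <cstddef>
#include <cstring>

// Copies `count` elements of `elem_size` bytes each from a tightly packed
// C array into a staging buffer in which consecutive elements are `stride`
// bytes apart, starting at byte `offset`.
void copy_with_stride(unsigned char* dst, std::size_t offset, std::size_t stride,
                      const void* src, std::size_t elem_size, std::size_t count) {
    const unsigned char* in = static_cast<const unsigned char*>(src);
    for (std::size_t i = 0; i < count; ++i) {
        std::memcpy(dst + offset + i * stride, in + i * elem_size, elem_size);
    }
}
```

For example, an int array with a queried stride of 16 would be copied with elem_size = sizeof(GLint) and stride = 16. A bool member is read from the buffer as a 32-bit unsigned integer, so a GLboolean array would first need converting to 4-byte values before being copied this way.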

Alternatively, you can use the std140 layout, which ensures a fixed representation, although not necessarily the same representation that C will use; some data types will require padding.
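As an illustration of that padding, here is a C++ sketch of a struct that would mirror the camera_views block if it were declared with layout(std140) in the shader. The struct and type names are hypothetical, and the sketch assumes 256 elements per array as in the block posted above. Under std140, each element of an int or bool array is padded to 16 bytes, while a column-major mat4 takes 64 bytes.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCameras = 256;

// One element of an int or bool array under std140: a 4-byte value padded to 16 bytes.
struct alignas(16) PaddedInt {
    std::int32_t value;
    std::int32_t pad[3];
};

// CPU-side mirror of the block, assuming it is declared with layout(std140).
struct CameraViewsStd140 {
    float transformation[kCameras][16]; // mat4: 4 columns of vec4, 64 bytes each
    float parameter[kCameras][16];
    PaddedInt block_belong[kCameras];   // int array, stride 16
    PaddedInt need_masked[kCameras];    // bool array, stored as 4-byte values, stride 16
};

static_assert(sizeof(PaddedInt) == 16, "array element must be padded to 16 bytes");
static_assert(offsetof(CameraViewsStd140, parameter) == 16384, "unexpected offset");
static_assert(offsetof(CameraViewsStd140, block_belong) == 32768, "unexpected offset");
static_assert(offsetof(CameraViewsStd140, need_masked) == 36864, "unexpected offset");
static_assert(sizeof(CameraViewsStd140) == 40960, "unexpected block size");
```

With a mirror like this, the whole struct could be uploaded with a single glBufferData() call, without querying offsets. Note that 40960 bytes is above the 16384-byte minimum for GL_MAX_UNIFORM_BLOCK_SIZE, so the limit still needs checking.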