Mat4x3 Uniform Registers?

I remember reading that mat4x3 takes one more register than mat3x4 because it is stored as 4 columns of vec3, but I can’t find anything that confirms this anymore. Has this changed in the spec at all? Is a mat4x3 automatically stored in 3 registers as well, or do I have to use mat3x4 instead?

Also:



// I believe these do the same thing...

mat3x4 a;

result = transpose(a) * vec4(somevalue, 1); // better as maintains "order"
result = vec4(somevalue, 1) * a; // similar performance as above?


Just tried it, cross-compiling GLSL to NV assembly (gp5vp profile), and it looks like what you say is the case. mat4x3 takes 4 uniform slots. mat3x4 takes 3. Which makes intuitive sense. GLSL is column major by default, so it’s all about the number of column vectors.

In a test I just did, passing in a mat4x3, postmultiplying it by a vec4 directly, and outputting the vec3 from the shader consumes 8 instructions (w/ 2 R-Regs). However, if I pass in a mat3x4, transpose it, and postmultiply that by the vec4 to output a vec3, I get 19 instructions (3 R-regs). Lots of extra moves. So a penalty of 11 instructions and 1 R-reg to eliminate use of one uniform slot while keeping the v2 = A*v multiplication order.

This also talks about it:

You can try using row_major. Then ideally it should be all about the number of rows.
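A minimal sketch of the row_major suggestion, assuming the matrix is moved into a std140 uniform block (as far as I know, the matrix layout qualifiers only affect block members, not plain uniforms in the default block). The block name `Transforms`, the member `modelView`, and the attribute `position` are made up for illustration. Under std140, a row_major mat4x3 is laid out as 3 rows padded to vec4, versus 4 padded columns for the column-major default. Whether a given driver's register allocation follows that layout is something to check in the generated assembly.

```glsl
#version 330 core

// Hypothetical block and member names, for illustration only.
// std140 + row_major: a mat4x3 (4 columns, 3 rows) is stored as
// 3 row vectors, each padded to a vec4 (48 bytes instead of 64).
layout(std140, row_major) uniform Transforms {
    mat4x3 modelView;
};

layout(location = 0) in vec3 position;
out vec3 viewPos;

void main() {
    // The qualifier changes only the storage layout;
    // the multiplication is still written as M * v.
    viewPos = modelView * vec4(position, 1.0);
    gl_Position = vec4(viewPos, 1.0);
}
```

The shader-side math keeps the v2 = A*v order; only the way the application fills the buffer changes (row by row instead of column by column).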

// I believe these do the same thing…

mat3x4 a;

result = transpose(a) * vec4(somevalue, 1); // better as maintains “order”
result = vec4(somevalue, 1) * a; // similar performance as above?

Yep. The cost of the latter is 8 instructions (w/ 2 R-reg), so that’s definitely one way to go. As I said, here the former consumes considerably more assembly instructions, and so prob isn’t your best bet. But check into this and see.
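For reference, here is a small sketch of the cheaper form in a complete vertex shader. The uniform name `modelViewRows` and the attribute `position` are hypothetical, and the instruction counts quoted above came from one compiler and profile, so treat this as something to measure rather than a guaranteed win.

```glsl
#version 330 core

// Hypothetical names, for illustration only.
// mat3x4 = 3 columns of vec4; here each column holds one ROW
// of the 3x4 affine transform, so it needs 3 vec4 slots.
uniform mat3x4 modelViewRows;

layout(location = 0) in vec3 position;

void main() {
    vec4 p = vec4(position, 1.0);

    // vec4 * mat3x4 treats p as a row vector:
    // component i of the result is dot(p, modelViewRows[i]).
    vec3 viewPos = p * modelViewRows;

    // Mathematically the same value as:
    //   transpose(modelViewRows) * p
    // but without asking the compiler to materialize the transpose.
    gl_Position = vec4(viewPos, 1.0);
}
```

Each output component is a single 4-component dot product against one uniform slot, which is consistent with the short instruction sequence reported for this form.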
