Common subexpression elimination?

doug65536 · October 17, 2011, 9:32pm

I’ve seen shaders that do pretty expensive-looking redundant operations, like matrix products (I know GPUs are amazing at those, but thousands and thousands of them unnecessarily…).

Sometimes the operands are uniforms (for example), meaning that every time it will be the same result.

For example: gl_Position = mvmatrix * projmatrix * gl_Vertex;

(mvmatrix * projmatrix will always have the same result if both are uniform - until they change that is)

Should I expect/assume that a shader compiler will hoist invariants like this out of the loop so it won’t actually do a matrix product per vertex? Or should I worry about it and do extra work to lift that work onto the CPU and deliver a final matrix?

Thanks!

Alfonse_Reinheart · October 18, 2011, 12:13am

“Should I expect/assume that a shader compiler will hoist invariants like this out of the loop so it won’t actually do a matrix product per vertex?”

Out of what loop? Your example doesn’t show a loop.

There are obvious places where you can make the shader run faster. For example, you should never do the example you posted. It should always be this:


gl_Position = projmatrix * (mvmatrix * gl_Vertex);

And that assumes you don’t use an explicit temporary (because you need the camera-space position to do lighting).
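As a rough illustration of the explicit-temporary version, here is a minimal GLSL 1.20 sketch. The uniform names come from the thread; the `cameraPos` varying is a hypothetical name added for illustration. The reasoning behind the parentheses: `*` associates left to right in GLSL, so `a * b * c` is evaluated as `(a * b) * c`. Counted naively, a mat4 * mat4 product is 64 scalar multiplies and a mat4 * vec4 product is 16, so the unparenthesized form costs about 80 multiplies per vertex (unless the compiler folds it) and the form below costs about 32. Note also that the corrected line applies mvmatrix to the vertex first and projmatrix second, which is the order this sketch assumes.

```glsl
#version 120

// Uniform names follow the thread; the application is assumed to set both.
uniform mat4 mvmatrix;    // model space -> camera space
uniform mat4 projmatrix;  // camera space -> clip space

// Hypothetical varying: camera-space position handed to the fragment
// shader for lighting.
varying vec3 cameraPos;

void main()
{
    // First mat4 * vec4 product (about 16 multiplies).
    vec4 posCamera = mvmatrix * gl_Vertex;

    // Second mat4 * vec4 product, reusing the temporary (about 16 more).
    gl_Position = projmatrix * posCamera;

    cameraPos = posCamera.xyz;
}
```

Real hardware issues fused multiply-adds and may schedule things differently, so treat the counts as a relative comparison only.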

Anything beyond that is dealer’s choice. As previously mentioned, you oftentimes need the camera-space position to pass to the fragment shader to do lighting (or do per-vertex lighting). Is it worthwhile to extract the cases where you don’t need to do it? That’s up to you. How many variations of shader code do you want lying around? Is having more shader variants and possible uniform composition variations worth the possibly insignificant performance increase?

That’s why you should always profile before trying to do these sorts of things. Implement first, get everything working, then profile it.

I can answer your question though: you can certainly assume that GLSL compilers will not magically combine two uniforms into one just because you use them in a “constant” expression. It is possible that they might, but it’s nothing you should rely on.

Dark_Photon · October 18, 2011, 4:34am

Or, even better, just premultiply MVP on the CPU and pass it into the shader. Then you only have one matrix-vector multiply for each of your hundreds or thousands of verts, rather than two (à la gl_ModelViewProjectionMatrix in GLSL 1.20).
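A minimal sketch of the shader side of that suggestion, assuming GLSL 1.20 and a hypothetical uniform named `mvp` that the application recomputes (projection times modelview) only when one of the two matrices changes:

```glsl
#version 120

// Hypothetical uniform: projection * modelview, multiplied once on the CPU
// per matrix change instead of once per vertex.
uniform mat4 mvp;

void main()
{
    // A single mat4 * vec4 product per vertex.
    gl_Position = mvp * gl_Vertex;

    // With the fixed-function matrix stacks, the built-in equivalent is:
    // gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
}
```

The trade-off is the one raised earlier in the thread: if lighting needs the camera-space position, a separate modelview matrix still has to be supplied alongside `mvp`.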

doug65536 · October 18, 2011, 10:44am

The loop is the processing of each vertex. It processes many vertices, I’m calling that “the loop”. Sorry, I thought that was obvious.

Thanks for the great tip, Alfonse. I see that doing the transform twice per vertex would be much better than multiplying the matrices themselves per vertex.

Dark, I am doing the “even better” thing and providing the shader with both an MV and an MVP matrix (in a VBO, using instancing).
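One way such a setup might look on the shader side, as a sketch only: it assumes GLSL 3.30 with explicit attribute locations, and the names `instanceMV`, `instanceMVP`, and `cameraPos` are hypothetical. The actual attribute layout used in this post is not shown in the thread.

```glsl
#version 330 core

layout(location = 0) in vec4 position;

// A mat4 attribute occupies four consecutive locations, one per column.
// Hypothetical per-instance matrices read from the instance VBO.
layout(location = 1) in mat4 instanceMV;   // locations 1..4
layout(location = 5) in mat4 instanceMVP;  // locations 5..8

// Hypothetical output: camera-space position for lighting.
out vec3 cameraPos;

void main()
{
    // Two mat4 * vec4 products per vertex, no mat4 * mat4 product.
    cameraPos   = (instanceMV * position).xyz;
    gl_Position = instanceMVP * position;
}
```

On the application side, each of the eight column attributes would presumably need its own attribute pointer and a divisor of 1 (via `glVertexAttribDivisor`) so that the matrices advance once per instance rather than once per vertex.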

I guess I felt like it was a slightly stupid design for GLSL to make you provide both a MV and MVP matrix for everything and figured because you ALWAYS need to do that (if you do a perspective projection and lighting calculations), that shader compilers would hoist out redundancies like that.

ugluk · October 18, 2011, 11:22am

Thanks, Dark, for the insights.

imported_kyle · October 18, 2011, 12:28pm

AFAIK, the only optimization in GLSL you can depend on is constant folding in built-in functions. All else is up to your favorite GLSL compiler, so if you want to run on multiple vendors, you pretty much have to take care of this yourself.

Alfonse_Reinheart · October 18, 2011, 4:19pm

“I guess I felt like it was a slightly stupid design for GLSL to make you provide both a MV and MVP matrix for everything and figured because you ALWAYS need to do that (if you do a perspective projection and lighting calculations)”

In what way do they “force” you to do that? You don’t have to use matrices at all; nobody’s forcing you to do anything.

You are doing it this way because you have certain needs. If you were drawing text using a simple ortho matrix that worked in window coordinates, you wouldn’t need two matrices, now would you? That’s the point of shaders: you use what you need. You need two matrices, because you need the intermediate stage between model space and clip-space. If you weren’t doing lighting, you could just use one.

ugluk · October 18, 2011, 11:15pm

Yeah, I remember a user (Dinev, I think) who used quaternions instead of matrices in his shaders. He didn’t use matrices at all, if I remember correctly.
