Efficiency of Using Inverse Matrices to Calculate World Coordinates

I am creating a 2D game engine for myself and have a question about the efficiency of my method of finding the world coordinates that screen coordinates correspond to. My method uses inverse matrices:

1.) Turn the screen coordinates into OpenGL coordinates (from 0 - 1080 to -1.0 - 1.0)
2.) Plug those new coordinates into a vec4
3.) Multiply by the inverse orthographic projection matrix
4.) Multiply by the inverse view matrix
5.) Extract the x and y values of the vec4

I believe that this is a good solution, but I’m not sure if it is the best solution. I’m essentially asking if there is a better, quicker way to convert from screen coordinates to world coordinates. Any input is appreciated!

You can merge steps 3 and 4, i.e. multiply the projection and view matrices together and invert the result, so you only need to multiply by one matrix.
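As a rough illustration of merging the two steps, here is a minimal sketch in plain Python. Everything in it is an assumption for the sake of the example: the helper names (`ortho`, `translate`, `invert`, `screen_to_world`), row-major nested lists for matrices, a screen origin at the top-left with y pointing down, and a view matrix that is only a camera translation. The point is that `invert(mat_mul(proj, view))` is computed once, and each conversion is then a single matrix-vector multiply.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


def invert(m):
    """Invert a 4x4 matrix by Gauss-Jordan elimination with partial pivoting."""
    n = 4
    aug = [list(m[i]) + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ortho(left, right, bottom, top, near=-1.0, far=1.0):
    """Standard OpenGL-style orthographic projection matrix."""
    return [
        [2.0 / (right - left), 0.0, 0.0, -(right + left) / (right - left)],
        [0.0, 2.0 / (top - bottom), 0.0, -(top + bottom) / (top - bottom)],
        [0.0, 0.0, -2.0 / (far - near), -(far + near) / (far - near)],
        [0.0, 0.0, 0.0, 1.0],
    ]


def translate(tx, ty):
    """Translation-only matrix; a camera at (cx, cy) uses translate(-cx, -cy)."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def screen_to_world(sx, sy, width, height, inv_view_proj):
    """Steps 1, 2 and 5, with steps 3 and 4 merged into one multiply."""
    # Assumes screen y grows downward, so it is flipped for OpenGL's y-up NDC.
    ndc = [2.0 * sx / width - 1.0, 1.0 - 2.0 * sy / height, 0.0, 1.0]
    world = mat_vec(inv_view_proj, ndc)
    return world[0], world[1]


# Invert the combined matrix once, then reuse it for every lookup.
proj = ortho(-960.0, 960.0, -540.0, 540.0)
view = translate(-100.0, -50.0)  # hypothetical camera at (100, 50)
inv_view_proj = invert(mat_mul(proj, view))
center = screen_to_world(960, 540, 1920, 1080, inv_view_proj)
```

Because the inverse of `proj * view` equals `inverse(view) * inverse(proj)`, this gives the same result as the two separate multiplies in the original steps.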

For 2D with an orthographic projection, you don’t need a 4x4 matrix; 2x3 would probably suffice. And if you don’t support rotation, you can simplify it further to just dst=src*scale+offset. IOW, don’t use 16+16=32 multiplies where 2 or 4 would suffice.
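For the no-rotation case, the dst=src*scale+offset form might look like the sketch below. The function name `make_screen_to_world` and its parameters are made up for illustration; `left`, `right`, `bottom` and `top` are assumed to be the world-space edges of the visible region (so the camera position is already folded in), and the screen origin is assumed to be top-left with y pointing down.

```python
def make_screen_to_world(width, height, left, right, bottom, top):
    """Precompute scale and offset so each lookup costs two multiplies."""
    scale_x = (right - left) / width
    # Negative because screen y is assumed to grow downward while world y grows upward.
    scale_y = -(top - bottom) / height
    offset_x = left
    offset_y = top

    def screen_to_world(sx, sy):
        # dst = src * scale + offset, applied per axis
        return sx * scale_x + offset_x, sy * scale_y + offset_y

    return screen_to_world


# Visible region for a hypothetical camera centred on (100, 50), 1920x1080 world units wide.
to_world = make_screen_to_world(1920, 1080, -860.0, 1060.0, -490.0, 590.0)
center = to_world(960, 540)
```

The scale and offset only need recomputing when the camera or the window size changes.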

Specifically regarding inversion, it depends.

If you invert once on the CPU, then upload the already-inverted matrix to your shader as a uniform, you’re only going to be eating the cost of the inversion once per frame, which statistically will be down in the noise on any performance graph. Otherwise it’s just the same as any other matrix multiply (which you may be able to optimize further as GClements suggests).

If you invert per vertex in your GLSL things get a little more interesting (or exciting, depending on your point of view).

A reasonably intelligent GLSL compiler might be able to detect the pattern, convert it to a one-time-only inversion, then cache and reuse the result, and this might be particularly true if you’re using the old (and deprecated) gl_ModelViewProjectionMatrix and friends. That’s probably the best case and would leave you really no worse off than if you had done your own one-time-only inversion. However, you are completely at the mercy of how intelligent your implementation’s GLSL compiler is, which is a place you probably don’t want to be.

The worst case is that you get a full inversion per vertex.

Even that might not be a huge (if any) performance impact, because ALU operations are generally cheap and you’re more likely to be bottlenecked on fillrate or texture lookups. That, of course, doesn’t mean that scenarios where you’re actually bottlenecked on ALU don’t exist.

If it were me I’d invert one-time-only on the CPU and upload the inverted matrix, eating the cost of using additional uniform slots, because I’d know that performance of everything that follows would be more dependable and predictable, and I wouldn’t be reliant on my driver doing the right thing.
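One way to keep the inversion to a one-time cost on the CPU is to cache it and recompute only when the camera changes. The sketch below is an assumption-laden illustration, not anyone's actual engine code: `Camera2D` and its methods are hypothetical names, and it uses the 2x3 affine form mentioned earlier in the thread (stored as a 6-tuple) rather than a 4x4 matrix. The cached tuple is the value you would hand to your shader as a uniform if the GPU needs it.

```python
import math


class Camera2D:
    """Hypothetical 2D camera that caches its inverse view transform."""

    def __init__(self, x=0.0, y=0.0, angle=0.0, zoom=1.0):
        self.x, self.y, self.angle, self.zoom = x, y, angle, zoom
        self._inverse = None  # cached view-to-world transform

    def move_to(self, x, y):
        """Change the camera position and invalidate the cached inverse."""
        self.x, self.y = x, y
        self._inverse = None

    def view_matrix(self):
        """World-to-view as a 2x3 affine (a, b, tx, c, d, ty)."""
        cos_a, sin_a = math.cos(self.angle), math.sin(self.angle)
        # Linear part: rotate by -angle, then scale by zoom.
        a, b = self.zoom * cos_a, self.zoom * sin_a
        c, d = -self.zoom * sin_a, self.zoom * cos_a
        # Translation part moves the camera position to the origin.
        tx = -(a * self.x + b * self.y)
        ty = -(c * self.x + d * self.y)
        return (a, b, tx, c, d, ty)

    def inverse_view(self):
        """Invert the 2x3 affine analytically, at most once per camera change."""
        if self._inverse is None:
            a, b, tx, c, d, ty = self.view_matrix()
            det = a * d - b * c  # equals zoom squared, so nonzero for zoom != 0
            ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
            self._inverse = (ia, ib, -(ia * tx + ib * ty),
                             ic, id_, -(ic * tx + id_ * ty))
        return self._inverse

    def view_to_world(self, vx, vy):
        """Apply the cached inverse: four multiplies and four adds."""
        a, b, tx, c, d, ty = self.inverse_view()
        return a * vx + b * vy + tx, c * vx + d * vy + ty


cam = Camera2D(100.0, 50.0, angle=math.pi / 2, zoom=2.0)
origin_in_world = cam.view_to_world(0.0, 0.0)
```

Note that in this sketch only `move_to` invalidates the cache; a fuller version would do the same for changes to `angle` and `zoom`.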