Regarding the use of Homogeneous Coordinates

Mukund · May 8, 2011, 6:05am

Hello Everyone,

I am reading a book on the mathematics involved in graphics. On homogeneous coordinates, this is what I read:

Basically, homogeneous coordinates define a point in a plane using three
coordinates instead of two. Initially, Plücker located a homogeneous point
relative to the sides of a triangle, but later revised his notation to the one
employed in contemporary mathematics and computer graphics. This states
that for a point P with coordinates (x, y) there exists a homogeneous point
(x, y, t) such that X = x/t and Y = y/t. For example, the point (3, 4) has
homogeneous coordinates (6, 8, 2), because 3 = 6/2 and 4 = 8/2. But the
homogeneous point (6, 8, 2) is not unique to (3, 4); (12, 16, 4), (15, 20, 5) and
(300, 400, 100) are all possible homogeneous coordinates for (3, 4).
The reason why this coordinate system is called ‘homogeneous’ is because
it is possible to transform functions such as f (x, y) into the form f (x/t, y/t)
without disturbing the degree of the curve. To the non-mathematician this
may not seem anything to get excited about, but in the field of projective
geometry it is a very powerful concept.

I have seen homogeneous coordinates used quite a lot in graphics, but I have never quite understood their essence. Could anyone please elaborate on why they are so useful and what their advantages are, with a clear example? I read some links online, but I'm still not very clear about it.

Using [x, y, 1] instead of [x, y] does make it possible to express all transformations as matrices, so I understood that part. But why do we always use a “1” there?

Thanks.

Dark_Photon · May 8, 2011, 11:47am

To your latter point, you don’t always put a 1 there. You put a 1 for “position vectors” and you put a 0 there for “direction vectors”.

Now as to what it’s used for? Say you have a bunch of transforms you need to apply to get a bunch of points (or vectors) from one space to another (x being a point and A…G being transforms):

y = G * F * E * D * C * B * A * x

For each point, you can apply each transform one-at-a-time in the right order, right? But that’s needlessly expensive.

An alternative is to precompute the product of all the transforms (H=GFEDCBA), and then just apply one transform to each point … much cheaper:

y = H * x
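As a minimal sketch of that saving, assuming column vectors and three made-up 2x2 transforms (the matrices `A`, `B`, `C` and the helpers `matmul` and `apply` are illustrative names, not from any library):

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, v):
    """Multiply matrix m by column vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

# Three hypothetical transforms, applied in the order A, then B, then C.
A = [[2.0, 0.0], [0.0, 3.0]]   # non-uniform scale
B = [[0.0, -1.0], [1.0, 0.0]]  # rotate 90 degrees counter-clockwise
C = [[1.0, 0.5], [0.0, 1.0]]   # shear along x

points = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# One transform at a time: three matrix-vector products per point.
slow = [apply(C, apply(B, apply(A, p))) for p in points]

# Precompute H = C * B * A once, then one matrix-vector product per point.
H = matmul(C, matmul(B, A))
fast = [apply(H, p) for p in points]
```

Both lists hold the same points; the second version pays for the matrix products once instead of once per point.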

Well, with certain transforms like rotate and scale, you don’t need an extra coordinate to encode the transform in matrix form. For 2D, just use 2D points/vectors and 2x2 matrices. For 3D, just use 3D points/vectors and 3x3 matrices. No sweat.

But for some transforms (like translation and perspective), you need another coordinate in order to “jam” the transform into a matrix form so that it can be multiplied with other transforms. Think of this as kind of a “trick”.

So for 2D, you’d need 3D vectors (last coord is 1, for positions) and 3x3 matrices. For 3D, you’d need 4D vectors and 4x4 matrices (again, last coord is 1 for positions).

With a translate matrix for instance, that “1” in the position vector ends up getting multiplied by the translation encoded in the matrix, effectively adding in the translation. With a direction vector (“0” in the last coordinate), you end up multiplying that translation encoded in the matrix by 0, effectively getting rid of it. Which is what you want: position vectors are affected by translations, but not direction vectors.
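A small 2D sketch of that effect, assuming column vectors and a 3x3 translation matrix with the offset in the last column (`translation` and `apply` are hypothetical helper names):

```python
def apply(m, v):
    """Multiply matrix m by column vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def translation(tx, ty):
    """3x3 matrix that translates 2D homogeneous points by (tx, ty)."""
    return [[1.0, 0.0, tx],
            [0.0, 1.0, ty],
            [0.0, 0.0, 1.0]]

T = translation(5.0, -2.0)

position = [3.0, 4.0, 1.0]   # last coordinate 1: a position
direction = [3.0, 4.0, 0.0]  # last coordinate 0: a direction

moved_position = apply(T, position)    # the 1 picks up the offset: [8.0, 2.0, 1.0]
moved_direction = apply(T, direction)  # the 0 cancels the offset: [3.0, 4.0, 0.0]
```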

Mukund · May 9, 2011, 2:22am

Thanks for the reply Dark Photon.

Well, with certain transforms like rotate and scale, you don’t need an extra coordinate to encode the transform in matrix form. For 2D, just use 2D points/vectors and 2x2 matrices. For 3D, just use 3D points/vectors and 3x3 matrices. No sweat.

So, basically, homogeneous coordinates help us do all transformations using a single matrix (4x4 in the case of 3D points/vectors). Say I need to apply a scale, a rotate and a translate to an object having 4 vertices. Without them, I do:

for each point
    scale
    rotate
    translate
end for

and here, translate could not be handled the same way as scale and rotate (which only need 3x3 matrices), so it would have to be applied as a separate step.

Whereas using homogeneous coordinates, I can do:

M = scale * rotate * translate
for each point
    Multiply point by M
end for

Am I right about that?

Alfonse_Reinheart · May 9, 2011, 3:00am

So, basically homogeneous coordinates help us do all transformations using a single matrix(4x4 in case of 3D points/vectors).

Yes and no.

Everything Dark Photon said was mathematically true, except for the fact that he called them “Homogeneous coordinates.” This isn’t the technically correct term.

What he’s describing are affine transformations. By adding an extra coordinate, you are able to perform translation operations with matrix multiplications.

Homogeneous coordinates are a way to handle projective geometric spaces easily. These are non-Euclidean geometries that represent non-linear projections, like a perspective projection.

Now, they also use an extra coordinate. The transformation between a Cartesian coordinate system and a Homogeneous one is done by dividing all of the components of the Homogeneous coordinates by the last coordinate.

Note that if the last coordinate is 0, this division is impossible. This represents a point that is, in projective space, at infinity. Homogeneous math still works on these coordinates, but they cannot be transformed back into Cartesian coordinates.
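A minimal sketch of that division, reusing the numbers from the book excerpt quoted in the first post (`to_cartesian` is a hypothetical helper, and returning `None` for a point at infinity is just one possible convention):

```python
def to_cartesian(h):
    """Divide a homogeneous point by its last coordinate.

    Returns None when the last coordinate is 0, since a point at
    infinity has no Cartesian equivalent.
    """
    *coords, w = h
    if w == 0:
        return None
    return tuple(c / w for c in coords)

# Different homogeneous triples that all name the Cartesian point (3, 4).
same_point = [to_cartesian(h) for h in [(6, 8, 2), (12, 16, 4), (300, 400, 100)]]

# Last coordinate 0: a point at infinity, so no Cartesian result.
at_infinity = to_cartesian((6, 8, 0))
```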

In general, the way it works in the graphics pipeline is that you multiply your positions by an affine matrix that transforms them into a space relative to the viewing region. Then you transform those positions into a homogeneous coordinate system with a perspective projection matrix.

So the whole time, you’re using 4D positions. It’s just that they don’t become homogeneous coordinates until the final transformation.
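To make that last step concrete, here is a rough sketch using a simplified textbook perspective matrix that projects onto the plane z = d while looking down +z. This is an assumption chosen for illustration; it is not the matrix OpenGL builds, which also remaps depth and looks down -z. The helper names are hypothetical.

```python
def apply(m, v):
    """Multiply matrix m by column vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def simple_perspective(d):
    """Simplified perspective matrix: copies z / d into the last coordinate."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0 / d, 0.0]]

P = simple_perspective(2.0)

view_point = [4.0, 6.0, 8.0, 1.0]  # position in view space, last coordinate still 1

# After the projection matrix the last coordinate is no longer 1.
homogeneous = apply(P, view_point)  # [4.0, 6.0, 8.0, 4.0]

# Dividing by the last coordinate gives the point on the plane z = 2.
w = homogeneous[3]
projected = [c / w for c in homogeneous[:3]]  # [1.0, 1.5, 2.0]
```

Points farther away get a larger last coordinate, so the division shrinks them more, which is the perspective effect.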

Dark_Photon · May 9, 2011, 4:24am

Yes, you’ve got it.
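One detail worth spelling out is the multiplication order. With column vectors, as in y = H * x, the transform applied first sits rightmost, so scale, then rotate, then translate becomes M = translate * rotate * scale. Writing it as scale * rotate * translate gives the same result only under the row-vector convention (point * M). A small 2D sketch, with hypothetical helper names and a hard-coded 90 degree rotation to keep the numbers exact:

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, v):
    """Multiply matrix m by column vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

S = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]   # scale by 2
R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # rotate 90 degrees
T = [[1.0, 0.0, 10.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # translate by (10, 0)

# Four vertices of a unit square, as positions (last coordinate 1).
square = [[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0], [0.0, 1.0, 1.0]]

# Scale, then rotate, then translate, one step at a time.
step_by_step = [apply(T, apply(R, apply(S, p))) for p in square]

# Same thing with one precomputed matrix (column-vector order).
M = matmul(T, matmul(R, S))
combined = [apply(M, p) for p in square]

# The opposite order translates first and gives different points.
M_other = matmul(S, matmul(R, T))
other_order = [apply(M_other, p) for p in square]
```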

Dark_Photon · May 9, 2011, 4:37am

I don’t know what pure math guys call it, though colloquially homogeneous coordinates in 3D graphics is synonymous with 4-vectors and 4x4 transforms. The OpenGL Red Book among others refers to them as such:

So it sounds like I am using the term homogeneous coordinates properly. They support both affine transforms (e.g. translation) and projective transforms (e.g. perspective).

Mukund · May 10, 2011, 6:41am

Thanks a lot Alfonse Reinheart and Dark Photon.