NitroGL's 4x4 matrix inverse may have been derived with the help of Gaussian elimination but, as already pointed out, it is not the Gaussian elimination algorithm. It was most likely derived from Cramer's rule (the cofactor expansion), exploiting the fact that a homogeneous-coordinate transformation matrix has a row and/or column that is mostly zeros. For a matrix this small, that approach is typically the fastest (faster than Gaussian elimination). One of the problems with using someone else's code, though, is that you can't be sure what assumptions were made to begin with unless it is very well documented.
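
To make the "mostly zeros" shortcut concrete, here is a minimal sketch of a cofactor-based inverse. It is not NitroGL's code. It assumes the column-vector convention (translation in the last column) and a last row of exactly (0, 0, 0, 1), and the name `invert_affine` and the `eps` threshold are my own choices for illustration.

```python
def invert_affine(m, eps=1e-12):
    """Invert a 4x4 matrix (list of four rows) whose last row is (0, 0, 0, 1).

    Assumption: column-vector convention, so the translation is the last column.
    """
    if list(m[3]) != [0, 0, 0, 1]:
        raise ValueError("last row is not (0, 0, 0, 1); matrix is not affine")

    (a, b, c, tx), (d, e, f, ty), (g, h, i, tz) = m[0], m[1], m[2]

    # First-column cofactors of the upper-left 3x3 block, reused for the determinant.
    c00 = e * i - f * h
    c01 = f * g - d * i
    c02 = d * h - e * g
    det = a * c00 + b * c01 + c * c02
    if abs(det) < eps:
        raise ValueError("upper-left 3x3 block is singular (or nearly so)")

    s = 1.0 / det
    # Inverse of the 3x3 block = adjugate / determinant.
    r = [
        [c00 * s, (c * h - b * i) * s, (b * f - c * e) * s],
        [c01 * s, (a * i - c * g) * s, (c * d - a * f) * s],
        [c02 * s, (b * g - a * h) * s, (a * e - b * d) * s],
    ]
    # The translation of the inverse is -(inverse 3x3 block) * t.
    t = (tx, ty, tz)
    out = [row + [-(row[0] * t[0] + row[1] * t[1] + row[2] * t[2])] for row in r]
    out.append([0.0, 0.0, 0.0, 1.0])
    return out
```

Note that the function refuses a matrix whose last row is anything else, which is exactly the kind of hidden assumption that needs to be documented: feed a projection matrix to a routine like this without the check and you get a wrong answer with no warning.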

I think that what the OP and many of you want is an inverse of a 4x4 homogeneous-coordinate transformation matrix, regardless of whether the matrix is affine or not. However, it may help to think of an “affine” matrix as just another transformation matrix, one with special properties that preserve lines and parallelism.

Double precision should be used before implementing a partial pivoting strategy such as row pivoting, because in some situations partial pivoting can also produce inaccurate solutions, and because certain matrices, such as diagonally dominant and positive definite matrices, don't require pivoting at all. Because the FPU in most computers does its arithmetic internally in double precision, the use of double precision is almost free, apart from a few extra memory transfers and some extra storage. The storage for a 4x4 matrix inverse is negligible.
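
For anyone who wants the general case (affine or not), here is a rough sketch of Gauss-Jordan elimination with row (partial) pivoting, done entirely in double precision. Python floats are IEEE 754 doubles, so no extra work is needed for that part. The function name `invert_4x4` and the `eps` singularity threshold are assumptions of mine, not something from the code discussed in this thread.

```python
def invert_4x4(m, eps=1e-12):
    """Invert a general 4x4 matrix (list of four rows) by Gauss-Jordan elimination."""
    n = 4
    # Build the augmented matrix [M | I] in double precision.
    a = [[float(x) for x in row] + [1.0 if r == c else 0.0 for c in range(n)]
         for r, row in enumerate(m)]

    for col in range(n):
        # Row pivoting: use the remaining row with the largest magnitude in this column.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < eps:
            raise ValueError("matrix is singular (or nearly so)")
        a[col], a[pivot] = a[pivot], a[col]

        # Scale the pivot row so the pivot element becomes 1.
        p = a[col][col]
        a[col] = [x / p for x in a[col]]

        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                f = a[r][col]
                if f != 0.0:
                    a[r] = [x - f * y for x, y in zip(a[r], a[col])]

    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in a]
```

This handles projection matrices too, since it makes no assumption about the last row. It is slower than a hand-expanded cofactor routine, but for a single 4x4 the difference rarely matters.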