I'll admit I've never implemented the routine you've mentioned but for the very same reason you make your suggestion over co-factors, it seems more difficult to implement.
Here's my implementation, I've never thoroughly benchmarked but it's sufficient for my needs.
i, j, k, and l are the matrix row vectors.
Vec4<T> i, j, k, l;T Determinant(){ T x = -l.x*(i.y*(j.z*k.w-j.w*k.z) - i.z*(j.y*k.w-j.w*k.y) + i.w*(j.y*k.z-j.z*k.y)); T y = l.y*(i.x*(j.z*k.w-j.w*k.z) - i.z*(j.x*k.w-j.w*k.x) + i.w*(j.x*k.z-j.z*k.x)); T z = -l.z*(i.x*(j.y*k.w-j.w*k.y) - i.y*(j.x*k.w-j.w*k.x) + i.w*(j.x*k.y-j.y*k.x)); T w = l.w*(i.x*(j.y*k.z-j.z*k.y) - i.y*(j.x*k.z-j.z*k.x) + i.z*(j.x*k.y-j.y*k.x)); return (x + y + z + w);}Mat4x4<T> Adjoint(){ Mat4x4<T> adj; // Transposed matrix of cofactors // First row adj.i.x = (j.y*(k.z*l.w-k.w*l.z) - j.z*(k.y*l.w-k.w*l.y) + j.w*(k.y*l.z-k.z*l.y)); adj.i.y = -(i.y*(k.z*l.w-k.w*l.z) - i.z*(k.y*l.w-k.w*l.y) + i.w*(k.y*l.z-k.z*l.y)); adj.i.z = (i.y*(j.z*l.w-j.w*l.z) - i.z*(j.y*l.w-j.w*l.y) + i.w*(j.y*l.z-j.z*l.y)); adj.i.w = -(i.y*(j.z*k.w-j.w*k.z) - i.z*(j.y*k.w-j.w*k.y) + i.w*(j.y*k.z-j.z*k.y)); // Second row adj.j.x = -(j.x*(k.x*l.w-k.w*l.z) - j.z*(k.x*l.w-k.w*l.x) + j.w*(k.x*l.z-k.z*l.x)); adj.j.y = (i.x*(k.z*l.w-k.w*l.z) - i.z*(k.x*l.w-k.w*l.w) + i.w*(k.x*l.z-k.z*l.x)); adj.j.z = -(i.x*(j.z*l.w-j.w*l.z) - i.z*(j.x*l.w-j.w*l.x) + i.w*(j.x*l.z-j.z*l.x)); adj.j.w = (i.x*(j.z*k.w-j.w*k.z) - i.z*(j.x*k.w-j.w*k.x) + i.w*(j.x*k.z-j.z*k.x)); // Third row adj.k.x = (j.x*(k.y*l.w-k.w*l.y) - j.y*(k.x*l.w-k.w*l.x) + j.w*(k.x*l.y-k.y*l.x)); adj.k.y = -(i.x*(k.y*l.w-k.w*l.y) - i.y*(k.x*l.w-k.w*l.x) + i.w*(k.x*l.y-k.y*l.x)); adj.k.z = (i.x*(j.y*l.w-j.w*l.y) - i.y*(j.x*l.w-j.w*l.x) + i.w*(j.x*l.y-j.y*l.x)); adj.k.w = -(i.x*(j.y*k.w-j.w*k.y) - i.y*(j.x*k.w-j.w*k.x) + i.w*(j.x*k.y-j.y*k.x)); // Fourth Row adj.l.x = -(j.x*(k.y*l.z-k.z*l.y) - j.y*(k.x*l.z-k.z*l.x) + j.z*(k.x*l.y-k.y*l.x)); adj.l.y = (i.x*(k.y*l.z-k.z*l.y) - i.y*(k.x*l.z-k.z*l.x) + i.z*(k.x*l.y-k.y*l.x)); adj.l.z = -(i.x*(j.y*l.z-j.z*l.y) - i.y*(j.x*l.z-j.z*l.x) + i.z*(j.x*l.y-j.y*l.x)); adj.l.w = (i.x*(j.y*k.z-j.z*k.y) - i.y*(j.x*k.z-j.z*k.x) + i.z*(j.x*k.y-j.y*k.x)); return adj;}// Inverse using Kramer's rule.Mat4x4<T> Invert(){ Mat4x4<T> adj = Adjoint(); T invdet = 1; // If the determinant is 0 there is no inverse. // 1/0 == undefined. if(Determinant() != 0) invdet = 1/T(Determinant()); Mat4x4<T> inv; inv.i = adj.i * invdet; inv.j = adj.j * invdet; inv.k = adj.k * invdet; inv.l = adj.l * invdet; return inv;}
There's a lot of room for optimization.
If the OP is going for speed/efficiency I'd recommend the looking into LU decomposition.