Combining matrices, scaling, and perspective projection

10 comments, last by 4everlive 15 years, 1 month ago
I'm writing a web app which performs software scale, rotate, translate, and perspective, but I'm battling with the matrices. My matrices are row-major and I'm using a left-handed coordinate system. Rotation and scaling work fine, but I'm having difficulty with translation and perspective.

I start with an identity matrix:

Data = new double[] {
    1, 0, 0, 0,
    0, 1, 0, 0,
    0, 0, 1, 0,
    0, 0, 0, 1 };

And then add translation:

/// <summary>Adds translation to this matrix.</summary>
public void Translate(double x, double y, double z) {
    Mat4 t = new Mat4(
        1, 0, 0, 0,
        0, 1, 0, 0,
        0, 0, 1, 0,
        x, y, z, 1);
    Multiply(t);
}

The multiply method performs standard matrix multiplication, so if I translate the identity matrix, e.g. Translate(5, 0, 0), I end up with a matrix like this:

1, 0, 0, 0
0, 1, 0, 0
0, 0, 1, 0
5, 0, 0, 1

If I now multiply a 4D vector by this matrix, e.g. [1, 1, 1, 1], it transforms to [1, 1, 1, 6]. How do I get from this to the 3D vector I want, i.e. [6, 1, 1]?

What I am trying to do is:

• Combine any number of scale, rotate, and translate operations into a single 4x4 matrix,
• add the perspective transform at the end,
• multiply all 4D vectors in my scene by this matrix,
• and finally, extract the 2D coordinates for painting on the screen.

What are the steps needed for doing this? I have scale, rotate, and translate matrices, but not perspective, which is my other problem. Even with just a rotating cube (I guess this is "hello world") I cannot get perspective to work. I've tried to apply various matrices from OpenGL (after converting to row-major) to DirectX, but no joy. Results range from simple orthographic projection (same as just stripping off z) to the corners of the cube flying off the screen and re-entering later.

So once again, I need to understand the workflow. What is the procedure for applying perspective, and what should the perspective matrix look like?
I have a suspicion that your matrix-vector multiplication function is implemented incorrectly. Could you post it?
My vectors (struct Vec4) have a data member:
public readonly double[] Data; // Data.Length == 4 (x,y,z,w)

Multiplication is then performed as follows:

/// <summary>Multiplies this by mat.</summary>
public void Multiply(Mat4 mat) {

    Vec4 tmp = new Vec4(this); // copy constructor - see below
    double[] v = tmp.Data;
    double[] m = mat.Data;

    Data[0] = v[0] * m[0] + v[1] * m[1] + v[2] * m[2] + v[3] * m[3];
    Data[1] = v[0] * m[4] + v[1] * m[5] + v[2] * m[6] + v[3] * m[7];
    Data[2] = v[0] * m[8] + v[1] * m[9] + v[2] * m[10] + v[3] * m[11];
    Data[3] = v[0] * m[12] + v[1] * m[13] + v[2] * m[14] + v[3] * m[15];
}

The copy constructor performs a deep copy:

/// <summary>Copy constructor.</summary>
public Vec4(Vec4 source) {
    Data = new double[4];
    source.Data.CopyTo(Data, 0);
}


I *think* it's correct, because it works when multiplying a vector by a matrix containing rotations and scaling.
One more question: is your matrix data stored in row-major order, or column-major order?
It's row major - mentioned in my first post ;-)
Quote:It's row major - mentioned in my first post ;-)
Indeed. The reason I asked is that people often say they're using (e.g.) row-major matrices, when what they actually mean is that they're using row-vector notation. So I just wanted to make sure.

Anyway, it looks to me as if your matrix-vector multiplication function is transposed from what it should be. If it appears to be working in some cases, it may just be accidental. In any case, the results that you're getting when you apply the translation matrix are exactly what you'd expect from having the mult code backwards.

Here's an example using a 2x2 matrix (for clarity), where each number doubles as the element's index in its array:

[0 1][0 1] = [0*0+1*2  0*1+1*3]
     [2 3]
As you can see, the pattern is the opposite of what you have in your function. Assuming row-vector notation and row-major matrix storage (which appear to be the conventions that you're using), the matrix indices in your matrix-vector mult function should be the transpose of what they are now.
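
To make the index pattern concrete, here is a minimal sketch of the multiplication, assuming row-vector notation and row-major storage in a flat array of 16 doubles. The names RowVectorMath and Transform are made up for illustration and are not part of the poster's Vec4/Mat4 classes.

```csharp
using System;

public static class RowVectorMath
{
    // Computes v * M for a row vector v (x, y, z, w) and a 4x4 matrix M
    // stored row-major in a flat array. Output component 'col' is the dot
    // product of v with COLUMN 'col' of M, so the row index varies in the sum.
    public static double[] Transform(double[] v, double[] m)
    {
        var result = new double[4];
        for (int col = 0; col < 4; col++)
        {
            double sum = 0;
            for (int row = 0; row < 4; row++)
                sum += v[row] * m[row * 4 + col];
            result[col] = sum;
        }
        return result;
    }

    public static void Main()
    {
        // Translation by (5, 0, 0) stored in the bottom row, as in the first post.
        double[] t = {
            1, 0, 0, 0,
            0, 1, 0, 0,
            0, 0, 1, 0,
            5, 0, 0, 1 };
        double[] p = Transform(new double[] { 1, 1, 1, 1 }, t);
        Console.WriteLine(string.Join(", ", p)); // 6, 1, 1, 1
    }
}
```

With the indices arranged this way, the translation ends up in x rather than in w, which is the [6, 1, 1] result the original post was after (plus w = 1).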
Ok, I just tried changing my Translate() method from

Mat4 t = new Mat4(
1, 0, 0, 0,
0, 1, 0, 0,
0, 0, 1, 0,
x, y, z, 1);
Multiply(t);

to

Mat4 t = new Mat4(
1, 0, 0, x,
0, 1, 0, y,
0, 0, 1, z,
0, 0, 0, 1);
Multiply(t);

And this also works.

So I'm confused, is it not perhaps my Translate() method which is wrong? I read the "Matrix and Vector Manipulation for Computer Graphics" pdf from this forum and it appears (to me) that my multiplication is correct.

In using row-major order, which of the above translation matrices have the right form, i.e. should the x, y, z transforms be stored in the w components of rows 1, 2, and 3, or should it be stored in row 4?
I skimmed the paper you mentioned. It looks like the paper uses column-vector notation, but I didn't see any mention of matrix storage order, or any examples showing how to perform matrix-vector multiplication using strictly 1-d arrays. So unless I'm missing something, it doesn't look like there's anything there to compare your multiplication function to.
Quote:In using row-major order, which of the above translation matrices have the right form, i.e. should the x, y, z transforms be stored in the w components of rows 1, 2, and 3, or should it be stored in row 4?
See, this is where it gets confusing - whether the matrix is row- or column-major has absolutely nothing to do with whether the translation is stored in the bottom row or the right-most column. (Also, the 'w' components aren't 'locked into' the right-most column - in fact, it's probably best to just think of matrix elements, e.g. m11, rather than in terms of x, y, z, and w.)

Basically, there are two choices you have to make: whether you're going to use row- or column-vector notation, and whether you're going to use row- or column-major storage. Although in some cases there may be practical reasons to choose a particular pair of conventions, the two issues are basically orthogonal. In other words, any of the possible combinations is valid.

So this leaves you with four options:

1. Row vectors and row-major matrices
2. Row vectors and column-major matrices
3. Column vectors and row-major matrices
4. Column vectors and column-major matrices

The four options can be further categorized as follows:

Basis vector elements are contiguous:

1. Row vectors and row-major matrices
4. Column vectors and column-major matrices

Basis vector elements are not contiguous:

2. Row vectors and column-major matrices
3. Column vectors and row-major matrices

OpenGL is generally thought of as using option 4, while DirectX uses option 1. You'll note that both of these fall into the category of 'contiguous basis vector elements', which means, in short, that OpenGL and DirectX matrices are actually interoperable. (At least in terms of layout - when it comes to how the various transform matrices are constructed, there are some differences.)
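
As a rough illustration of that interoperability claim, the sketch below builds the same translation under option 1 and under option 4 and compares the flat arrays. The class and method names are hypothetical, chosen only for this example.

```csharp
using System;
using System.Linq;

public static class LayoutDemo
{
    // Option 1: row vectors, row-major storage. The translation sits in the
    // bottom ROW, and rows are written out one after another.
    public static double[] TranslationRowVectorRowMajor(double x, double y, double z)
    {
        double[,] m = {
            { 1, 0, 0, 0 },
            { 0, 1, 0, 0 },
            { 0, 0, 1, 0 },
            { x, y, z, 1 } };
        var flat = new double[16];
        for (int r = 0; r < 4; r++)
            for (int c = 0; c < 4; c++)
                flat[r * 4 + c] = m[r, c];
        return flat;
    }

    // Option 4: column vectors, column-major storage. The translation sits in
    // the right-most COLUMN, and columns are written out one after another.
    public static double[] TranslationColumnVectorColumnMajor(double x, double y, double z)
    {
        double[,] m = {
            { 1, 0, 0, x },
            { 0, 1, 0, y },
            { 0, 0, 1, z },
            { 0, 0, 0, 1 } };
        var flat = new double[16];
        for (int c = 0; c < 4; c++)
            for (int r = 0; r < 4; r++)
                flat[c * 4 + r] = m[r, c];
        return flat;
    }

    public static void Main()
    {
        double[] a = TranslationRowVectorRowMajor(5, 6, 7);
        double[] b = TranslationColumnVectorColumnMajor(5, 6, 7);
        Console.WriteLine(a.SequenceEqual(b));           // True
        Console.WriteLine(string.Join(" ", a.Skip(12))); // 5 6 7 1
    }
}
```

In both cases the translation lands in elements 12, 13, and 14 of the array, which is why the two layouts can be passed between the APIs unchanged.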

This post is getting a little long, so I'll get to the point. Before going any further, you'll need to settle on a choice of conventions (that is, choose one of the four options outlined above). That choice will then determine how the various functions in your library (matrix multiplication, construction of transforms, etc.) will need to be implemented.
Ah, I'm starting to see the light - pardon my ignorance.

Looking at it again, yes, the paper uses column-vector notation. What I thought was row-major matrix order (or the reason for my multiplication being correct) was based on "6 Matrix Vector Transformations" on page 5 (my multiplication is done the same way). But I see what you're saying: it's almost more of a notational convenience, and the implementation is independent. Thank you for the clarification. As you can probably tell, I'm more than a little confused ;-)

I've settled on row-vectors and row-major matrices. I initially had it the other way around (for OpenGL) but then realised that it makes no difference, and a book I'm reading (love it, but it will be elementary for you - "3D Math Primer for Graphics and Game Development") has some sensible arguments for choosing row-vectors (and it uses row-major matrices, as does DirectX).

So given my choices (option 1: row-vector and row-major matrices) I fixed my Translate() method and now have scale, rotate, translate, and perspective working. But in terms of perspective, please explain to me:

The projection matrix I'm using is from the Direct3D docs:

w 0 0 0
0 h 0 0
0 0 Q 1
0 0 -QN 0

Where:
w = X scaling factor
h = Y scaling factor
N = near Z
F = far Z
Q = F / (F - N)

This matrix seems to be in column-major order, not that I have an intuitive mathematical understanding, but it only works if I transpose it ;-) so I end up with code like this:

(Edit: I'm not actually transposing it, I'm swapping two cells)

double w = 1;
double h = 1;
double N = 10;
double F = 1;
double Q = F / (F - N);

Mat4 matProj = new Mat4(
w, 0, 0, 0,
0, h, 0, 0,
0, 0, -Q * N, 0,
0, 0, Q, 1);

Which works! A happy moment for me. But from other docs I've read it appears that the projection matrix doesn't actually perform the projection, it only sets up w, which you still have to divide by in order to derive the (x,y) pair for painting, e.g.:

x = vec.X / vec.W;
y = vec.Y / vec.W;

Is there a way to incorporate this division into the compound matrix, so that I only need to perform one matrix multiplication on each vector and then simply discard z and w, and paint x,y? Or do I need to do this with a 3x3 matrix, in which case the single divisions will be quicker?
Quote:I've settled on row-vectors and row-major matrices. I initially had it the other way around (for OpenGL) but then realised that it makes no difference, and a book I'm reading (love it, but it will be elementary for you - "3D Math Primer for Graphics and Game Development") has some sensible arguments for choosing row-vectors (and it uses row-major matrices, as does DirectX).
Oh, I often recommend 3D Math Primer - I think it's a great book :)
Quote:But in terms of perspective, please explain to me:

The projection matrix I'm using is from the Direct3D docs:

w 0 0 0
0 h 0 0
0 0 Q 1
0 0 -QN 0

Where:
w = X scaling factor
h = Y scaling factor
N = near Z
F = far Z
Q = F / (F - N)

This matrix seems to be in column-major order, not that I have an intuitive mathematical understanding, but it only works if I transpose it ;-) so I end up with code like this:

(Edit: I'm not actually transposing it, I'm swapping two cells)

double w = 1;
double h = 1;
double N = 10;
double F = 1;
double Q = F / (F - N);

Mat4 matProj = new Mat4(
w, 0, 0, 0,
0, h, 0, 0,
0, 0, -Q * N, 0,
0, 0, Q, 1);

Which works! A happy moment for me. But from other docs I've read it appears that the projection matrix doesn't actually perform the projection, it only sets up w, which you still have to divide by in order to derive the (x,y) pair for painting, e.g.:

x = vec.X / vec.W;
y = vec.Y / vec.W;
Hm, I'm not sure why you'd need to swap those two elements - that seems a little suspicious.
Quote:Is there a way to incorporate this division into the compound matrix, so that I only need to perform one matrix multiplication on each vector and then simply discard z and w, and paint x,y? Or do I need to do this with a 3x3 matrix, in which case the single divisions will be quicker?
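The perspective division will need to be performed as a separate step, I'm fairly certain: a matrix multiplication is linear, whereas dividing by a w that differs for every vertex is not, so the division can't be folded into the compound matrix. (Also, using 3x3 matrices won't help you here, since you're working with 4-d homogeneous coordinates.)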
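
For what it's worth, here is a minimal sketch of the two steps, assuming row vectors, row-major storage, and the Direct3D matrix exactly as quoted above (without swapping any cells). The names ProjectionDemo, Perspective, and Project, and the near/far values of 1 and 10, are assumptions made for the example.

```csharp
using System;

public static class ProjectionDemo
{
    // Row vector times row-major 4x4 matrix (v * M).
    static double[] Transform(double[] v, double[] m)
    {
        var result = new double[4];
        for (int col = 0; col < 4; col++)
            for (int row = 0; row < 4; row++)
                result[col] += v[row] * m[row * 4 + col];
        return result;
    }

    // The projection matrix from the Direct3D docs, laid out row-major
    // for use with row vectors. Q = F / (F - N).
    public static double[] Perspective(double w, double h, double near, double far)
    {
        double q = far / (far - near);
        return new double[] {
            w, 0, 0, 0,
            0, h, 0, 0,
            0, 0, q, 1,
            0, 0, -q * near, 0 };
    }

    // Step 1: one matrix multiplication. Step 2: the perspective divide,
    // done separately because it depends on each vertex's own w.
    public static (double X, double Y) Project(double[] point, double[] matrix)
    {
        double[] clip = Transform(point, matrix);
        return (clip[0] / clip[3], clip[1] / clip[3]);
    }

    public static void Main()
    {
        double[] proj = Perspective(1, 1, 1, 10);
        var nearPoint = Project(new double[] { 2, 2, 2, 1 }, proj); // both coordinates are 1
        var farPoint = Project(new double[] { 2, 2, 8, 1 }, proj);  // both coordinates are 0.25
        Console.WriteLine($"{nearPoint.X}, {nearPoint.Y}");
        Console.WriteLine($"{farPoint.X}, {farPoint.Y}");
    }
}
```

The same x and y land closer to the centre as z grows, which is the perspective effect. Note that w comes out equal to the view-space z here, so points at or behind the eye (w <= 0) would need to be clipped before dividing.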

If you're looking for ways to optimize the pipeline, you might try searching the forum archives for 'software renderer' (I'd be willing to bet you'd find some threads discussing how to optimize a software-based graphics pipeline).
