Confused a bit on the matrix math in 3D programming

Started by Purebe March 10, 2012 01:32 AM

6 comments, last by Purebe 12 years, 1 month ago

100

Author

March 10, 2012 01:32 AM

I'm reading the 3D programming math chapter of 'Beginning XNA 3.0 Game Programming'. I had a university class (3D game design) where I pretty much screwed up a good chance to learn the math because I was slacking, and now I'm actually interested and need to learn the math behind it all so I can do it proficiently.

With that out of the way...I feel like I'm getting this stuff (since I've already made a few 3D games before in XNA 3.1 my ideas here are somewhat clear, but I was really just trial-and-erroring my way through most of the 3D stuff) because it makes a lot of sense in retrospect. BUT...the book is not very clear on the following and I'd like some help to clear it up:

1. Since XNA uses a right-handed coordinate system, objects with negative Z values are what would be visible if the camera was at the origin. What I'm a bit confused about is this function's parameters and how they work:

Matrix.CreatePerspectiveFieldOfView(MathHelper.ToRadians(45.0f), aspectRatio, 1.0f, 10.0f)    // parameters: FOV, Aspect Ratio, Near View Plane, Far View Plane

The book explains it like this:

"... then create a perspective projection matrix, "looking" in a 45-degree angle as the field of view. The rendering happens for objects from 1 to 10 units from the screen (z values from -1 to -10)."

I'm confused because it seems to me if the near view plane is set to be 1.0f, it should be at 1.0f along the z axis, which would be behind the screen, going further out behind the screen to 10.0f at the far view plane. However, it's saying this isn't the case, and I have a few ideas about why that might be but none seem obvious enough to not warrant asking for clarification.

2. What are these matrices ('view', 'world', 'projection', etc.)? The book lists the following as definitions of each matrix:

View: defines the camera position and direction. Usually created using Matrix.CreateLookAt.

Projection: used to map the 3D scene coordinates to screen coordinates. Usually created through Matrix.CreatePerspective...etc

World: used to apply transformations to all objects in the 3D scene.

This gives me a great idea of what they are used for but I'm fuzzy on what they are. Here's my foggy understanding of each:

View: My understanding is that this is the camera "in concept." In other words, every frame there is a camera matrix at the origin (which is a hard idea for me to get: a camera is just a matrix? Not that it would be able to do anything just as a matrix, but maybe you know what I mean?), and then it's translated, rotated, or whatever by matrix multiplication with this view matrix.

Projection: This seems to be the "functionality" of the camera. I have no idea what gets multiplied by this matrix or how it's used, REALLY foggy on this one.

World: I'm also really foggy about this one. It almost seems like every object's position matrix is multiplied by the world matrix and that is what sets up the scene for rendering. But if that's the case, then you aren't really using cameras; you are just using a camera at the origin and moving the entire world to fit your scene? Confusing.

Sorry about the length of the post, trying to be as concise as I possibly can be.

21st Century Moose

13,459

March 10, 2012 02:21 AM

I personally think that the camera analogy is really really bad as it can lead to confusions such as those you've expressed. It does seem to have stuck, however. I've always found it easier to just think of objects being positioned and moving in a 3D scene, which is truer to what's really happening - maybe that's just me?

So, the view matrix represents where you are in the scene and what direction you're looking in. Like coordinates on a map.

Projection is what makes a scene look 3D - far away objects look smaller, you get the nice perspective effect of lines receding into the distance, and so on. (It can be used for other things too, but let's not confuse things too much right now). Regarding the near plane confusion - there's no need to worry about the details of this for now. There is other stuff going on that sets everything up correctly; just think of the near and far planes as distances from where you are in the direction you're looking, and it may help.

World - you're actually right. It isn't really a camera at all, it's just defining where objects are in the scene - the "camera" is just an analogy that's supposed to help make things easier to understand; it's just an abstraction of what's really happening. If you think about it, however, there is no difference between "you moving and the scene staying still" versus "you staying still and the scene moving" - I don't really want to invoke Albert Einstein here, but it is actually all relative. Your thinking about this seems to be the same as mine - you've mentally picked the frame of reference that is "you" and see things in terms of how they appear from that frame of reference (the "camera" analogy works the other way around).

For most static objects in the scene you won't actually need a world matrix - you just set it to identity, position view appropriately, set up projection and it works.

Where world comes in handy is when the scene contains other objects that also move. In those cases the objects will have different positions relative to you and relative to the scene, and they must also be moved to their positions in the scene. So they get a non-identity world matrix.

A separate view and world aren't actually needed - you can combine them into a single matrix for the same effect - but having them separate does make things a little easier and cleaner IMO.

So:

View: where you are in the scene and which direction you're looking in.
Projection: make the scene look 3D.
World: position of other objects in the scene.

That's a fairly rough and admittedly incomplete description, but I hope it helps a little.
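On the near/far plane question from the first post, a small numeric sketch may help. It is written in plain Python rather than XNA so it runs without the framework, and the helper names (`perspective_fov_rh`, `transform`, `depth_of`) are made up for illustration. It assumes the usual right-handed, row-vector perspective formula (the kind of matrix Matrix.CreatePerspectiveFieldOfView is generally described as building) and a 16:9 aspect ratio. The point it shows: the near and far values are distances in front of the viewer, so near = 1 and far = 10 correspond to z = -1 and z = -10.

```python
import math

def perspective_fov_rh(fov_y, aspect, near, far):
    """Right-handed perspective matrix for row vectors (v * M). Illustrative helper."""
    y_scale = 1.0 / math.tan(fov_y / 2.0)
    x_scale = y_scale / aspect
    return [
        [x_scale, 0.0, 0.0, 0.0],
        [0.0, y_scale, 0.0, 0.0],
        [0.0, 0.0, far / (near - far), -1.0],
        [0.0, 0.0, near * far / (near - far), 0.0],
    ]

def transform(v, m):
    """Multiply a 4-component row vector by a 4x4 matrix."""
    return [sum(v[k] * m[k][c] for k in range(4)) for c in range(4)]

def depth_of(z, proj):
    """Depth after the perspective divide for a point on the view axis.

    Returns None when w <= 0, i.e. the point is not in front of the viewer.
    """
    _, _, z_clip, w = transform([0.0, 0.0, z, 1.0], proj)
    return z_clip / w if w > 0 else None

proj = perspective_fov_rh(math.radians(45.0), 16 / 9, 1.0, 10.0)

# z = -1 sits on the near plane (depth 0), z = -10 on the far plane (depth 1).
# z = +1 is behind the viewer, so it never gets a valid depth.
for z in (-1.0, -5.5, -10.0, 1.0):
    print(z, depth_of(z, proj))
```

The -1 in the third row is what turns a negative view-space z into a positive w, which is why "1 unit away" ends up meaning z = -1 in a right-handed setup.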

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

Purebe

100

Author

March 10, 2012 02:38 AM

Okay so the view matrix represents where I'm at and objects that should be visible based off the projection matrix (I'm still confused on this one. What gets multiplied by the projection matrix?) are multiplied by the view matrix to position them into the area where they should appear so when they are rendered to the screen they look right? My deduction here seems really off.

If you do not need a separate view and world matrix, are they put together somehow or do they interact with different objects?

I'm not sure that makes sense either. If I knew more about this I'd be able to ask better formulated questions but as is I'm just really mucked up, sorry.

Adaline

710

March 10, 2012 06:59 AM

Hello

First of all you have your World, and in this world (world space) each entity is placed by its own world matrix:
The world matrix of an entity contains a translation, a rotation and possibly a scaling. So there is one world matrix per entity.

Then you want to render your world from a particular point of view (view space). The view matrix transforms world space into view space: the origin becomes the camera position, the z axis is aligned with the direction the camera is looking (in XNA's right-handed system the camera looks down the negative z axis), and so on.

Then you want to control how the 3D scene is rendered (projected) on the 2D screen (clip space). The projection matrix defines how the scene is projected. We can see it as defining the optical properties of the camera.

I hope it can help you see how transforms work

Nico

Purebe

100

Author

March 10, 2012 07:40 AM

Okay, so that means that the world matrix is actually just defining properties of an entity in the world (more precisely, it is multiplied by the "position" matrix of each entity to get them properly situated in world space)

But...what is being multiplied by the view matrix? Each entity in the world space that can be seen looking out from where the view matrix position is? (But that can only be determined by the projection matrix...but what does the projection matrix get multiplied by?)

I'm starting to get a better understanding of it I believe but I'm still not clear on the view matrix and the projection matrix. I understand that the view matrix "sets up" the camera and the projection matrix "sets up" what it can see, I think, but, I don't understand how that is achievable by matrix math?

Adaline

710

March 10, 2012 08:24 AM

The complete transform for a given object is the matrix M = Mworld * Mview * Mprojection

(using row vectors, which is the convention XNA uses).

How do you transform an object? By multiplying each one of its vertices by M.
(At the rendering level, essentially, you simply bind M and the object to the GPU and issue a draw call.)

Maybe seeing matrices as sort of 'converters' from one space (frame) to another can help here?

A matrix is a powerful and practical tool to encode and manipulate affine and projective transformations; reading up on those may help.
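To make M = Mworld * Mview * Mprojection concrete, here is a rough sketch in plain Python (not XNA, so it runs standalone). All helper names are hypothetical, the matrices use the row-vector convention, and the scene is an assumption chosen for simplicity: an object placed at (2, 0, -5), and a camera at (0, 0, 3) with no rotation, so the view matrix is just a translation by (0, 0, -3). It checks that pushing a vertex through the three matrices one at a time gives the same result as multiplying the matrices together first.

```python
import math

def matmul(a, b):
    """4x4 matrix product a * b."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def transform(v, m):
    """Row vector times 4x4 matrix."""
    return [sum(v[k] * m[k][c] for k in range(4)) for c in range(4)]

def translation(tx, ty, tz):
    """Translation matrix for row vectors: the offset lives in the last row."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [tx, ty, tz, 1.0]]

def perspective_fov_rh(fov_y, aspect, near, far):
    """Right-handed perspective matrix for row vectors (assumed formula)."""
    y_scale = 1.0 / math.tan(fov_y / 2.0)
    return [[y_scale / aspect, 0.0, 0.0, 0.0],
            [0.0, y_scale, 0.0, 0.0],
            [0.0, 0.0, far / (near - far), -1.0],
            [0.0, 0.0, near * far / (near - far), 0.0]]

world = translation(2.0, 0.0, -5.0)      # where the object sits in the scene
view = translation(0.0, 0.0, -3.0)       # camera at (0, 0, 3), looking down -Z
projection = perspective_fov_rh(math.radians(45.0), 16 / 9, 1.0, 10.0)

vertex = [0.0, 0.0, 0.0, 1.0]            # the object's own origin

# One space at a time: object space -> world space -> view space -> clip space.
in_world = transform(vertex, world)
in_view = transform(in_world, view)
step_by_step = transform(in_view, projection)

# Or build M once and apply it to every vertex of the object.
combined = matmul(matmul(world, view), projection)
all_at_once = transform(vertex, combined)

print(in_view)        # the vertex as seen from the camera
print(step_by_step)
print(all_at_once)
```

Building the combined matrix once per object and reusing it for every vertex is the reason the three matrices are usually multiplied together before drawing.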

Nico

Trienco

2,555

March 10, 2012 08:36 AM

Every object has a world matrix. This matrix is also very simple to read, because the four columns (or rows, in XNA's row-vector layout) are nothing but "right", "up", "forward", "position".

If you pretend that the camera is a regular object with a world matrix, the view matrix is simply the inverse of it. There is no "camera" and no way to actually change your point of view; you just move everything the opposite way.

Every object that is rendered will go through the view matrix, its world matrix and the projection matrix (or rather one single matrix you get by multiplying them all). 3D is nothing but taking 3D points, putting them through a bunch of matrices and getting a 2D point in screen coordinates. Simply speaking and ignoring all the extra stuff that makes 3D look cool and fancy.
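The "view matrix is the inverse of the camera's world matrix" idea can be checked numerically. The following is only a sketch in plain Python with made-up helper names, using row vectors and an assumed camera that is rotated 30 degrees about the Y axis and placed at (4, 1, 6). Because the camera's world matrix is "rotate, then translate", its inverse is "translate back, then rotate back", and for a pure rotation the inverse is just the transpose.

```python
import math

def matmul(a, b):
    """4x4 matrix product a * b."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def transform(v, m):
    """Row vector times 4x4 matrix."""
    return [sum(v[k] * m[k][c] for k in range(4)) for c in range(4)]

def translation(tx, ty, tz):
    """Translation matrix for row vectors: the offset lives in the last row."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [tx, ty, tz, 1.0]]

def rotation_y(angle):
    """Rotation about the Y axis for row vectors."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, -s, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [s, 0.0, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transpose(m):
    return [list(col) for col in zip(*m)]

angle = math.radians(30.0)

# Treat the camera like any other object: rotate it, then move it into place.
camera_world = matmul(rotation_y(angle), translation(4.0, 1.0, 6.0))

# The view matrix undoes that: move back to the origin, then rotate back.
view = matmul(translation(-4.0, -1.0, -6.0), transpose(rotation_y(angle)))

# The last row of the camera's world matrix is its position.
print(camera_world[3])

# camera_world * view should be the identity (up to rounding).
for row in matmul(camera_world, view):
    print([round(x, 9) for x in row])

# A point at the camera's position lands on the origin of view space.
print(transform([4.0, 1.0, 6.0, 1.0], view))
```

Nothing here ever "moves a camera"; the view matrix just moves every point the opposite way, which is the relativity argument made earlier in the thread.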

f@dz http://festini.device-zero.de

Purebe

100

Author

March 10, 2012 08:57 AM

@Nico: I'm trying to understand how the final rendered scene is set up.

My current "understanding" is that for any entity you take that entity's world matrix, multiply it by the projection and view matrix, and then perform that transformation matrix on the entity itself. Then if that entity is visible from the origin it should be rendered on screen, and if it's not that's how we know not to render it in the final scene. Hopefully this is correct.
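For what it's worth, the "is it visible?" part can be sketched as a test on the projected point. This is plain Python with hypothetical helper names, not XNA code, and it assumes the Direct3D-style clip volume (x and y between -w and w, z between 0 and w) and a 16:9 aspect ratio. It also tests single points that are already in view space, whereas real hardware clips whole triangles, so treat it as an illustration of the idea only.

```python
import math

def perspective_fov_rh(fov_y, aspect, near, far):
    """Right-handed perspective matrix for row vectors (assumed formula)."""
    y_scale = 1.0 / math.tan(fov_y / 2.0)
    return [[y_scale / aspect, 0.0, 0.0, 0.0],
            [0.0, y_scale, 0.0, 0.0],
            [0.0, 0.0, far / (near - far), -1.0],
            [0.0, 0.0, near * far / (near - far), 0.0]]

def transform(v, m):
    """Row vector times 4x4 matrix."""
    return [sum(v[k] * m[k][c] for k in range(4)) for c in range(4)]

def is_inside_clip_volume(clip):
    """True if a clip-space point lies inside the visible volume."""
    x, y, z, w = clip
    return w > 0 and -w <= x <= w and -w <= y <= w and 0 <= z <= w

def is_visible(view_space_point, proj):
    """Project a view-space point and test it against the clip volume."""
    x, y, z = view_space_point
    return is_inside_clip_volume(transform([x, y, z, 1.0], proj))

proj = perspective_fov_rh(math.radians(45.0), 16 / 9, 1.0, 10.0)

print(is_visible((0.0, 0.0, -5.0), proj))    # in front, between near and far
print(is_visible((0.0, 0.0, -20.0), proj))   # past the far plane
print(is_visible((0.0, 0.0, 1.0), proj))     # behind the viewer
print(is_visible((100.0, 0.0, -5.0), proj))  # far off to the side
```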

@Trienco: That helps me understand the view matrix and world matrix relationship much better, thank you! (actually it seems your entire post is exactly what I was looking for, but as I wrote above I'm not entirely certain I get it yet but I think this makes sense.)

PS: Is there any way to enable spell checking on these forums? Firefox usually works in these boxes but it doesn't seem to want to on this particular website.
