Started by Mar 21 2006 06:38 AM

10 replies to this topic

Posted 21 March 2006 - 06:38 AM

I posted this at comp.graphics.algorithms as well, but I have no replies so far, so I wanted to ask here as well:
I have a unit sphere (the globe) and a camera that orbits it. Imagine an input system that allows two points of contact (e.g. two fingers). I want to ensure that as the user drags his two fingers around, the latitude / longitude positions that he originally clicked on remain under his fingers at all times. A couple of scenarios might help to explain the concept.
User puts Left and Right fingers down about 300 pixels apart in the center of the screen. User drags right finger towards the top of the screen. That would cause a counter-clockwise rotation about the camera's z vector as well as translating the camera towards the globe (and I think a positive x axis post-translation, but it's hard to envision correctly).
User puts Left and Right fingers down about 300 pixels apart in the center of the screen. User drags both fingers away from each other (left finger -x, right finger +x). This would result in the camera translating towards the globe (zoom in). The general concept is that whatever lat/long coordinates the user clicks on remain under the respective fingers until "mouse up". This encompasses rotation and translation. It is very easy to approximate this behavior using an arbitrary value to rotate the proper direction, and translate the proper direction, but I am looking for an exact result.
I believe the Screen Space coordinates I will be getting from the input device are only useful in calculating the World Space coordinates. So assume that I have the World Space coordinates of:
* Original Camera position - (the World Space location of the camera at the time of the mouse down event) - C from now on
* Left Finger (LF) start point - (the mouse down event) - LF1 from now on
* LF current - (the current left finger position) - LF2
* Right Finger (RF) start point - RF1
* RF current - RF2
as my inputs, under one constraint:
* The Camera is always looking at the origin (0, 0, 0).
How would I compute where the Camera should be positioned and oriented to ensure that the original latitude and longitude map directly to LF2 and RF2?
Is this type of system best represented as quaternions or matrices (or perhaps something else)?
Any help is much appreciated.

Posted 21 March 2006 - 11:26 PM

Quote:

User puts Left and Right fingers down about 300 pixels apart in the center of the screen. User drags right finger towards the top of the screen. That would cause a counter-clockwise rotation about the camera's z vector as well as translating the camera towards the globe (and I think a positive x axis post-translation, but it's hard to envision correctly).

Calculate the initial screen space vector from LF1 to RF1 (in normalized coordinates in (-1,1)). Normalize it and call it u_{1}.

So long as there is user input, calculate the same vector for the current coordinates LF2, RF2; let it be u_{2}.

These vectors will be at an angle θ to each other, which is the angle you should use to rotate the camera around Z.

This can be achieved by multiplying the view matrix V by the Z-rotation matrix R_{z}, as in:

        [ cosθ  -sinθ  0  0 ]
R_{z} = [ sinθ   cosθ  0  0 ]
        [  0      0    1  0 ]
        [  0      0    0  1 ]

V' = R_{z}*V
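The angle and the Z-rotation described above can be sketched in C++ as follows. This is only a minimal sketch: the names `Vec2`, `Mat4`, `signed_angle` and `rotation_z` are made up for illustration, the finger vectors are assumed to be given in normalized screen coordinates with Y increasing upwards, and the matrix assumes column vectors (v' = R*v).

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Vec2 { double x, y; };

// Signed angle in radians (counter-clockwise positive) that rotates a onto b.
// The vectors do not need to be normalized; atan2 only needs sin/cos up to scale.
double signed_angle(const Vec2& a, const Vec2& b)
{
    const double cross = a.x * b.y - a.y * b.x; // |a||b|*sin(theta)
    const double dot   = a.x * b.x + a.y * b.y; // |a||b|*cos(theta)
    assert(cross != 0.0 || dot != 0.0);          // neither vector may be zero
    return std::atan2(cross, dot);
}

using Mat4 = std::array<std::array<double, 4>, 4>;

// 4x4 rotation about Z for column vectors (v' = R*v).
Mat4 rotation_z(double theta)
{
    const double c = std::cos(theta);
    const double s = std::sin(theta);
    Mat4 r{};                       // value-initialized: all zeros
    r[0][0] = c;  r[0][1] = -s;
    r[1][0] = s;  r[1][1] = c;
    r[2][2] = 1.0;
    r[3][3] = 1.0;
    return r;
}
```

Using `atan2` on the cross and dot products gives the sign of the rotation directly, which an `acos` of the dot product alone would not.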

Quote:

User puts Left and Right fingers down about 300 pixels apart in the center of the screen. User drags both fingers away from each other (left finger -x, right finger +x). This would result in the camera translating towards the globe (zoom in). The general concept is that whatever lat/long coordinates the user clicks on remain under the respective fingers until "mouse up". This encompasses rotation and translation. It is very easy to approximate this behavior using an arbitrary value to rotate the proper direction, and translate the proper direction, but I am looking for an exact result.

This is not "well defined".

Imagine the user touching two locations on the globe, which map onto the same horizontal line very close to the top of the monitor; now imagine that the user moves his fingers away from each other on the same parallel line. You want both points to remain under his fingers.

The problem is that it wouldn't be long that these points should map outside the screen (pixel Y<0), since the effect you want is merely zoom-in.

However, the user keeps both his fingers along the same parallel line very close to the top of the monitor, which effectively constrains the globe-point to remain on the very same line too.

Isn't there an ambiguity here?

Posted 22 March 2006 - 01:40 AM

Quote:

Original post by someusername

This can be achieved by multiplication of the view matrix V by the Z-rotation

I think there also needs to be a post translation involved. For instance, if the user puts the Left Finger and Right Finger down near the top of the globe (300 pixels apart) and doesn't move his Left Finger, and sort of draws a circle around it with his Right Finger, the rotation should happen about the Left Finger, not the center of the screen. If he moves his Right Finger up and his Left Finger down, I think that would rotate about the center point between the fingers at twice the rate.

Quote:

Original post by someusername

This is not "well defined".

Imagine the user touching two locations on the globe, which map onto the same horizontal line very close to the top of the monitor; now imagine that the user moves his fingers away from each other on the same parallel line. You want both points to remain under his fingers.

The problem is that it wouldn't be long that these points should map outside the screen (pixel Y<0), since the effect you want is merely zoom-in.

However, the user keeps both his fingers along the same parallel line very close to the top of the monitor, which effectively constrains the globe-point to remain on the very same line too.

Isn't there an ambiguity here?

I hope not. There are some constraints, that is, the camera is an orbit camera, so it must always be looking at (0, 0, 0), and the lat/long coordinates that were clicked on must remain under the respective fingers. The result you describe would happen if I were to merely translate the camera down its z-vector, but that is basically zooming in to the center. That wouldn't keep the lat/long positions under the user's fingers. The effect I

Posted 22 March 2006 - 07:11 AM

Quote:

Original post by mfawcett

Quote:

Original post by someusername

This can be achieved by multiplication of the view matrix V by the Z-rotation

I think there also needs to be a post translation involved. For instance, if the user puts the Left Finger and Right Finger down near the top of the globe (300 pixels apart) and doesn't move his Left Finger, and sort of draws a circle around it with his Right Finger, the rotation should happen about the Left Finger, not the center of the screen.

Ah, indeed. If you want the rotation to be performed around LF, then instead of simply multiplying the view matrix by the Z-rotation matrix R_{z}, you should multiply it by T^{-1}*R_{z}*T.

T should be:

    [ 1 0 0 0  ]
T = [ 0 1 0 dy ]
    [ 0 0 1 0  ]
    [ 0 0 0 1  ]

where dy = LF.y*ProjectionMatrix(2,2). It may need to be divided by 2, I'm not 100% sure.

I assume that LF.y is the Y component of LF, expressed in screen space (ranging in [-1,1]). It should increase upwards.

Quote:

Original post by mfawcett

If he moves his Right Finger up and his Left Finger down, I think that would rotate about the center point between the fingers at twice the rate.

Isn't it more intuitive to zoom in, in such a case? like when moving the fingers apart horizontally?

Quote:

Original post by mfawcett

The result you describe would happen if I were to merely translate the camera down its z-vector, but that is basically zooming in to the center.

I thought you *wanted* to translate the camera along its Z in this case, since you posted

Quote:

User drags both fingers away from each other (left finger -x, right finger +x). This would result in the camera translating towards the globe (zoom in)

Well, this complicates the problem...

You can transform LF and RF into your 3d scene. You'll get two vectors v1 and v2, representing the directions implied by LF and RF in the untransformed globe's local space, and a point p0 through which they both pass. This point will actually be the position of the camera in the same frame...

So, the points where these vectors intersect the globe can be found...
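Finding where a finger ray hits the globe is a standard ray/sphere test. Here is a minimal sketch, assuming a unit sphere centered at the origin (as in the original question) and a ray that starts at the camera position; `Vec3` and `intersect_unit_sphere` are hypothetical names, not part of any code posted in this thread.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Nearest intersection of the ray origin + t*dir (t >= 0) with the unit sphere
// centered at the world origin. Returns false if the ray misses the sphere or
// if the nearest hit lies behind the ray origin.
bool intersect_unit_sphere(const Vec3& origin, const Vec3& dir, Vec3& hit)
{
    const double a = dot(dir, dir);
    assert(a > 0.0);                              // direction must be non-zero
    const double b = 2.0 * dot(origin, dir);
    const double c = dot(origin, origin) - 1.0;   // radius squared is 1
    const double disc = b * b - 4.0 * a * c;
    if (disc < 0.0)
        return false;                             // ray misses the globe
    const double t = (-b - std::sqrt(disc)) / (2.0 * a); // smaller root = front face
    if (t < 0.0)
        return false;
    hit = Vec3{origin.x + t * dir.x, origin.y + t * dir.y, origin.z + t * dir.z};
    return true;
}
```

The returned point can then be converted to latitude/longitude, which is what has to stay under each finger.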

Then you get another two vectors v1', v2' (representing LF2 and RF2), passing through another point p0', and you want them to also intersect the globe at the same points...

But since p0'

Could it be that {v1, v2} will be pair-wise parallel to {v1', v2'} since, after all, they are collinear to the same longitude/latitude on the globe?...

Let me give this some more thought, and I'll get back...

Are you planning on implementing this?

What *is* this "input system" about, anyway?

edit:

mistakenly used LF1/RF1 instead of LF2/RF2

[Edited by - someusername on March 22, 2006 1:11:37 PM]

Posted 22 March 2006 - 07:34 AM

Quote:

Original post by someusername

Ah, indeed. If you want the rotation to be performed around LF

If the LF is stationary that is the case. What if both move? Do you envision the rotation happening about the midpoint between LF and RF?

Quote:

Original post by someusername

Isn't it more intuitive to zoom in, in such a case? like when moving the fingers apart horizontally?

That's how I currently have it working, and most users feel the same as you at first, but after a while it's annoying. This is for a pretty huge touch table (4 foot by 6 foot at waist height). If I'm trying to zoom in to a certain location, it's annoying to have to zoom gesture, pan, zoom gesture, pan, repeat...That's the behavior you get when you always zoom to center. After some time using the touch table, you start to realize that what you really want is the points under your fingers to remain under your fingers.

Quote:

Original post by someusername

I thought you *wanted* to translate the camera along its Z in this case, since you posted

Yes, sorry. I should've been clearer. The example I posted would be if the user's fingers were along the vertical center of the globe.

Quote:

Original post by someusername

Let me give this some more thought, and I'll get back...

That would be great, I appreciate you taking the time to think this one over.

Quote:

Original post by someusername

Are you planning on implementing this?

What *is* this "input system" about, anyway?

Yes. Here is a link to what I am working on. I am responsible for the User Interface of that. There is a short video to show you it in action, but it's from a year or two ago, so it's pretty out of date.

Here's a video showing an approximation of the behavior I'd like. It's a longer video, and the part I'm specifically talking about is towards the end, but you can clearly tell that they are just approximating the user's motions.

Posted 22 March 2006 - 09:26 AM

Here's my latest thought:

There are a few steps I'm unsure of (marked with question marks).

constraint: C's z-vector is always a ray from the center of the screen to the origin.

Let R = rotation((RF1 - LF1), (RF2 - LF2))

Let Cd = magnitude(C)

Let Sf = mag(RF2 - LF2) / mag(RF1 - LF1)

Let I1 = the intersection of C's z-vector and the globe

Let VLF = I1 - LF2

VLF /= Sf

Rotate VLF by inverse(R)

Let I2 = VLF + LF1

C's new position = I2 * (Cd / Sf) ?

C's new z-vector = normalize(I2)

C's new rotation = R

(Edit: I was wrong on a few parts, hopefully fixed now)

[Edited by - mfawcett on March 22, 2006 4:26:49 PM]
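The first steps of the list above (R and Sf) can be sketched as follows. This is an assumption-laden illustration: it reads rotation(a, b) as the shortest-arc rotation taking direction a onto direction b, expressed as an axis and an angle, and all names (`Vec3`, `AxisAngle`, `shortest_arc`, `scale_factor`) are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

Vec3 cross(const Vec3& a, const Vec3& b)
{
    return Vec3{a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

double length(const Vec3& v) { return std::sqrt(dot(v, v)); }

struct AxisAngle { Vec3 axis; double angle; };

// Shortest-arc rotation taking the direction of a onto the direction of b.
// For parallel or anti-parallel inputs the axis is not unique; a zero axis is
// returned in that case and the caller has to pick one.
AxisAngle shortest_arc(const Vec3& a, const Vec3& b)
{
    const Vec3 c = cross(a, b);
    const double s = length(c);                   // |a||b|*sin(angle)
    const double angle = std::atan2(s, dot(a, b));
    const Vec3 axis = (s > 0.0) ? Vec3{c.x / s, c.y / s, c.z / s} : Vec3{0.0, 0.0, 0.0};
    return AxisAngle{axis, angle};
}

// Sf = mag(RF2 - LF2) / mag(RF1 - LF1), taking the two difference vectors as input.
double scale_factor(const Vec3& start_span, const Vec3& current_span)
{
    const double start_len = length(start_span);
    assert(start_len > 0.0);                      // fingers must start apart
    return length(current_span) / start_len;
}
```

Whether a shortest-arc rotation is really what the touch table needs is an open question in this thread; the sketch only pins down one possible meaning of the notation.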

Posted 22 March 2006 - 10:30 AM

Here is the source to my orbit camera. Free to use for all purposes, no guarantees.

Where:

U is in the range of -PI/2 to PI/2

V is in the range of 0 to 2PI

W is in the range of 0.0 to INF

To set these, call SetUVW. For ray tracing purposes, there is also the setup function SetImagePlaneConfig().

For ray tracing purposes, the final image plane ray corner vectors are stored in the u*v* member variables.

My preferred texture mapping / vertex order is as such:

// UV texture space / vertex winding order
//     _________
//  v1|1       4|
//    |         |
//    |         |
//    |         |
//  v0|2_______3|
//     u0     u1

For other camera use:

See look_at, up and right for orientation vectors after initial camera setup.

See eye for the camera position in R3 after initial camera setup.
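As a quick orientation before the listings, here is a small sketch of what the rig computes for the eye position. It is derived by hand from the Reset/Rotate/Translate steps in uv_rig.cpp below (start with look_at = (0, 0, -1), rotate about X by U, then about Y by V, then place the eye at -look_at * W); `orbit_eye` and `Vec3` are hypothetical names and are not part of the posted source.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

// Closed form of the eye position produced by the rig:
//   u = latitude  in [-PI/2, PI/2]
//   v = longitude in [0, 2*PI]
//   w = distance from the origin (orbit radius), w >= 0
Vec3 orbit_eye(double u, double v, double w)
{
    assert(w >= 0.0);
    return Vec3{-w * std::cos(u) * std::sin(v),
                 w * std::sin(u),
                 w * std::cos(u) * std::cos(v)};
}
```

With U = V = 0 the eye sits on the +Z axis at distance W, looking back at the origin, and for any U and V it stays at distance W from the origin.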

point_3.h

#ifndef POINT_3

#define POINT_3

#include <cmath>

class point_3

{

public:

float x, y, z;

point_3(void);

point_3(const float &x, const float &y, const float &z);

point_3 operator-(void);

void zero(void);

void normalize(void);

void rotate_x(const float &radians);

void rotate_y(const float &radians);

};

#endif

point_3.cpp

#include "point_3.h"

point_3::point_3(void)

{

zero();

}

point_3::point_3(const float &x, const float &y, const float &z)

{

this->x = x;

this->y = y;

this->z = z;

}

point_3 point_3::operator-(void)

{

return point_3(-this->x, -this->y, -this->z);

}

void point_3::zero(void)

{

x = y = z = 0.0f;

}

void point_3::normalize(void)

{

float len = sqrt(x*x + y*y + z*z);

x /= len;

y /= len;

z /= len;

}

void point_3::rotate_x(const float &radians)

{

float t_y = y;

y = t_y*cos(radians) + z*sin(radians);

z = t_y*-sin(radians) + z*cos(radians);

}

void point_3::rotate_y(const float &radians)

{

float t_x = x;

x = t_x*cos(radians) + z*-sin(radians);

z = t_x*sin(radians) + z*cos(radians);

}

uv_rig.h

#ifndef UV_RIG_H

#define UV_RIG_H

// uv_rig.h::Fig. 1

//

// UV camera rig

//

// latitude: | longitude: | radius: |

// *_*_ | ___ | ___ |

// */ \ | / \ | / \ |

// u: *| x | | v: |**x**| | w: | x**| |

// *\___/ | \___/ | \___/ |

// * * | | |

//

// where u { }

//

#include "point_3.h"

#include <cfloat>

#define PI_HALF (1.5707963267f)

#define PI (3.1415926535f)

#define PI_2 (6.2831853071f)

#define RAD_TO_DEG_COEFFICIENT (57.2957795f)

class uv_rig

{

protected:

float u; // -PI_HALF ... PI_HALF

float v; // 0 ... PI_2

float w;

int ip_width;

int ip_height;

float ip_fov;

public:

// eye location, after rotation*translation

point_3 eye;

// look_at unit vector, after rotation

point_3 look_at;

// up and right unit vectors, after rotation

point_3 up;

point_3 right;

// image plane look-at unit vectors, after rotation

point_3 u0v1, u0v0, u1v0, u1v1;

public:

uv_rig(void);

void SetUVW(const float u, const float v, const float w);

void GetUVW(float &u, float &v, float &w) const;

void SetImagePlaneConfig(const int &width, const int &height, const float &fov);

void GetImagePlaneConfig(int &width, int &height, float &fov) const;

inline float GetIPFOV(void){ return ip_fov; }

inline const int GetIPWidth(void){ return ip_width; }

inline const int GetIPHeight(void){ return ip_height; }

inline float GetU(void){ return u; }

inline float GetV(void){ return v; }

inline float GetW(void){ return w; }

protected:

void Transform(void);

void Reset(void);

void Rotate(void);

void Translate(void);

void ConstructImagePlane(void);

};

#endif

uv_rig.cpp

#include "uv_rig.h"

uv_rig::uv_rig(void)

{

u = v = 0.0f;

w = 5.0f;

ip_width = 1;

ip_height = 1;

ip_fov = PI/4.0f; // 1/8th of a circle

Transform();

}

void uv_rig::SetUVW(const float u_radians, const float v_radians, const float w_units)

{

u = u_radians;

v = v_radians;

w = w_units;

static float gimbal_lock_buffer = FLT_EPSILON * 1E3;

if(u < -PI_HALF + gimbal_lock_buffer)

u = -PI_HALF + gimbal_lock_buffer;

else if(u > PI_HALF - gimbal_lock_buffer)

u = PI_HALF - gimbal_lock_buffer;

while(v < 0.0f)

v += PI_2;

while(v > PI_2)

v -= PI_2;

if(w < 0.0f)

w = 0.0f;

else if(w > 10000.0f)

w = 10000.0f;

Transform();

}

void uv_rig::GetUVW(float &u_radians, float &v_radians, float &w_units) const

{

u_radians = u;

v_radians = v;

w_units = w;

}

void uv_rig::SetImagePlaneConfig(const int &width, const int &height, const float &fov)

{

if(width < 1)

ip_width = 1;

else

ip_width = width;

if(height < 1)

ip_height = 1;

else

ip_height = height;

if(fov < 1.0f)

ip_fov = 1.0f;

else if(fov > PI_2 - 1.0f)

ip_fov = PI_2 - 1.0f;

else

ip_fov = fov; // in-range values are stored as given

Transform();

}

void uv_rig::GetImagePlaneConfig(int &width, int &height, float &fov) const

{

width = ip_width;

height = ip_height;

fov = ip_fov;

}

void uv_rig::Transform(void)

{

Reset();

Rotate();

Translate();

}

void uv_rig::Reset(void)

{

eye.zero();

look_at.zero();

up.zero();

right.zero();

// eye.x += translate_u;

// eye.y += translate_v;

look_at.z = -1.0f;

up.y = 1.0f;

right.x = 1.0f;

ConstructImagePlane();

}

void uv_rig::Rotate(void)

{

// rotate about the world x axis

look_at.rotate_x(u);

up.rotate_x(u);

u0v1.rotate_x(u);

u0v0.rotate_x(u);

u1v0.rotate_x(u);

u1v1.rotate_x(u);

// rotate about the world y axis

look_at.rotate_y(v);

up.rotate_y(v);

right.rotate_y(v);

u0v1.rotate_y(v);

u0v0.rotate_y(v);

u1v0.rotate_y(v);

u1v1.rotate_y(v);

}

void uv_rig::Translate(void)

{

// place the eye directly across the sphere from the look-at vector's "tip",

// then scale the sphere radius by w

eye.x = -look_at.x*w;

eye.y = -look_at.y*w;

eye.z = -look_at.z*w;

look_at.x = 0.0;

look_at.y = 0.0;

look_at.z = 0.0;

}

void uv_rig::ConstructImagePlane(void)

{

// uv_rig.cpp::ConstructImagePlane::Fig. 1

//

// split the frustum down the middle using a plane that is parallel to the shorter sides

//

// ___a___________

// |\ | /|

// | \ | / |

// | \ R| / |b

// |______\ /______|

//

// R = field of view / 2.0 (radians)

// a = tan(R) (units)

// b = a * s/l; (units)

// s = shortest side (pixels)

// l = longest side (pixels)

float ip_half_w = 0.0f;

float ip_half_h = 0.0f;

if(ip_width >= ip_height)

{

ip_half_w = tan(0.5f*ip_fov*(static_cast<float>(ip_width - 1)/static_cast<float>(ip_width)));

ip_half_h = ip_half_w*(static_cast<float>(ip_height)/static_cast<float>(ip_width));

}

else

{

ip_half_h = tan(0.5f*ip_fov*(static_cast<float>(ip_height - 1)/static_cast<float>(ip_height)));

ip_half_w = ip_half_h*(static_cast<float>(ip_width)/static_cast<float>(ip_height));

}

u0v1.x = -ip_half_w;

u0v1.y = ip_half_h;

u0v1.z = look_at.z;

u0v0.x = -ip_half_w;

u0v0.y = -ip_half_h;

u0v0.z = look_at.z;

u1v0.x = ip_half_w;

u1v0.y = -ip_half_h;

u1v0.z = look_at.z;

u1v1.x = ip_half_w;

u1v1.y = ip_half_h;

u1v1.z = look_at.z;

}

Posted 22 March 2006 - 12:02 PM

I am speechless! Honestly! I hadn't even imagined that touch-screen technology has advanced so much; not to mention the consequences!

Very interesting project, indeed.

Quote:

Original post by mfawcett

If the LF is stationary that is the case. What if both move? Do you envision the rotation happening about the midpoint between LF and RF?

This would be the reasonable thing to expect, yes.

Btw, now that I noticed it, I thought that the LF/RF were screen space coordinates. In my posts above, I was considering them in that frame -unless explicitly stated otherwise.

Quote:

Original post by mfawcett

After some time using the touch table, you start to realize that what you really want is the points under your fingers to remain under your fingers.

With that maxim in mind, it becomes obvious that the transformations implied by the user's fingers, should be performed on the data, in the very same frame that they are originally input: the screen space.

Both transformation schemes we discussed (the rotation and the scaling) are trivial to implement on transformed, screen-space 2D vertices.

And the reference points *would* remain under the respective fingers, 100% accurately.

However, you want to be able to propagate these transformations down to the very camera...

At least, since the camera FOV doesn't have to change, this means that all transformation should be focused on the view matrix...

I am pretty much sure that I got the rotation part right, in my previous post.

The scaling seems a bit troublesome though...

Quote:

Original post by mfawcett

Let R = rotation((RF1 - LF1), (RF2 - LF2))

Let Cd = magnitude(C)

Let Sf = mag(RF2 - LF2) / mag(RF1 - LF1)

Let I1 = the intersection of C's z-vector and the globe

Let VLF = I1 - LF2

VLF /= Sf

Rotate VLF by inverse(R)

Let I2 = VLF + LF1

C's new position = I2 * (Cd / Sf) ?

C's new z-vector = normalize(I2)

C's new rotation = R

I cannot assert that, because I fail to follow your reasoning on this one, sorry. :/

I'll check this again tomorrow, with a fresh pair of eyes.

Posted 23 March 2006 - 01:33 AM

OK, I think I've got this... I haven't checked taby's source code above -I don't know whether it has covered you- but you can also check these out. I will present my reasoning first, to convince you that -at least- I know what I'm talking about :). From there on, if it doesn't seem to work, it should be something trivial that should be easily corrected...

I just want to clarify that the pseudo-code below uses the column-vector convention, i.e. it assumes that vectors are column matrices and transform as v' = M*v. If your convention is the opposite, all matrices and their products should be transposed using the identity:

(M_{1}*M_{2}*...*M_{n})^{T} = M_{n}^{T}*M_{n-1}^{T}*...*M_{1}^{T}

Also, I believe that the co-ordinates' system's handedness will not affect this, because this is usually adjusted through the sign of a single member ( (4,3) ) of the projection matrix (negation of camera Z before turning the frustum into a cuboid), which doesn't affect any of our calculations here.

After clearing these, I can get into more details...

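**The rotation**

(This is pretty much the same thing I described in a previous post)

You want to be able to rotate the camera around its local Z by an angle implied by the user's fingers. The goal is to rotate the "camera space" vertices around an axis parallel to the camera's local Z, which passes through the "camera space" pivot point.

This pivot point will lie on the near face of the view frustum, which is z=z_{near} (in camera space). Its x,y coordinates can be found from the "screen space" pivot point ("screen space" meaning expressed with coordinates in [-1,1]) and the (1,1) and (2,2) members of the projection matrix, which for a standard perspective projection equal 2*z_{near}/width and 2*z_{near}/height, where width and height are the dimensions of the view frustum at its "near" clipping plane z=z_{near}.

The origin of the camera space, however, is the position of the camera. We can't apply a rotation matrix directly to the view matrix V (as in R_{Z}*V) because we want to rotate around an arbitrary point, not the origin. We will have to translate the origin to the pivot point first, then apply the rotation, and then restore the origin.

I also believe that it will be more accurate to perform these transformations on the snapshot of the view matrix taken at the "user fingers down" event, rather than perform them "incrementally" at successive frames on the "previous" view matrix. E.g. it's always good to know the total angle, instead of the delta-rotation from the previous frame (this prevents loss of accuracy and gives control over the final state).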

So, here's some pseudo-code for this:

// I assume V_{0} is the view matrix when the user first started rotating, V is the current view matrix, P is the projection matrix and φ is the total angle to rotate by. SPivotX, SPivotY are assumed to be the "screen-space" coordinates of the pivot point, expressed in [-1,1], increasing to the right and upwards.

// find the pivot point (assuming a standard perspective matrix, where P(1,1) = 2*z_{near}/width and P(2,2) = 2*z_{near}/height)
dx = SPivotX*z_{near}/P(1,1)
dy = SPivotY*z_{near}/P(2,2)
dz = z_{near}

// Calculate the translation to the pivot point. (dx, dy, z_{near}) are the camera-space coordinates of the pivot.
    [ 1 0 0 -dx ]
T = [ 0 1 0 -dy ]
    [ 0 0 1 -dz ]
    [ 0 0 0  1  ]

// The rotation matrix
        [ cos(φ) -sin(φ) 0 0 ]
R_{Z} = [ sin(φ)  cos(φ) 0 0 ]
        [   0       0    1 0 ]
        [   0       0    0 1 ]

// Calculate the new view matrix
V = T^{-1}*R_{Z}*T*V_{0}

// We always operate on V_{0}, so it's not *that* necessary to re-orthogonalize the view matrix, but it wouldn't hurt...
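The pseudo-code above can be sketched in C++ roughly like this. It is a minimal sketch under stated assumptions: column vectors (v' = M*v), a pivot (dx, dy, dz) that has already been converted to camera space, and helper names (`Mat4`, `Vec4`, `translation`, `rotation_z`, `multiply`, `transform`, `rotate_about_pivot`) that are invented here for illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat4 = std::array<std::array<double, 4>, 4>;
using Vec4 = std::array<double, 4>;

Mat4 identity()
{
    Mat4 m{};                                  // all zeros
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0;
    return m;
}

// Translation by (x, y, z); the offset lives in the 4th column.
Mat4 translation(double x, double y, double z)
{
    Mat4 m = identity();
    m[0][3] = x; m[1][3] = y; m[2][3] = z;
    return m;
}

Mat4 rotation_z(double phi)
{
    Mat4 m = identity();
    m[0][0] = std::cos(phi); m[0][1] = -std::sin(phi);
    m[1][0] = std::sin(phi); m[1][1] =  std::cos(phi);
    return m;
}

Mat4 multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

Vec4 transform(const Mat4& m, const Vec4& v)
{
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int k = 0; k < 4; ++k)
            r[i] += m[i][k] * v[k];
    return r;
}

// V = T^{-1} * R_Z * T * V0, where T moves the pivot to the origin.
Mat4 rotate_about_pivot(const Mat4& v0, double phi, double dx, double dy, double dz)
{
    const Mat4 t     = translation(-dx, -dy, -dz);
    const Mat4 t_inv = translation(dx, dy, dz);
    return multiply(t_inv, multiply(rotation_z(phi), multiply(t, v0)));
}
```

A quick sanity check for any implementation of this step is that the pivot itself must not move: transforming (dx, dy, dz, 1) by T^{-1}*R_{Z}*T should return the same point for every angle.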

**The scaling**

You want the user to be able to touch two points on the screen, and zoom into or out of the image, with his fingers remaining over the very same points he first touched. This makes it obvious that this scaling shouldn't be performed around the origin, but around the midpoint of the user's fingers. Its magnitude will be given by the ratio of the distance between the fingers at any given instant to their initial distance.

However, we don't want any actual scaling to take place. We merely want to translate the camera to a new position which accounts for the implied zoom, without even altering the camera FOV.

In that case, the new camera position will come from the scaling of the old one around the "camera-space" pivot implied by the midpoint of the fingers. It's unnecessary to scale the entire view matrix directly, because we'd have to do a lot of normalizations. We can simply translate the origin to the desired point, scale the position of the camera in that frame, and hardcode it in the view matrix...

// V_{0} is the original view matrix, V is the current view matrix, P is the projection matrix. SPivotX, SPivotY are assumed to be the "screen-space" coordinates of the pivot point (midpoint of the two fingers), expressed in [-1,1], increasing to the right and upwards. Sf is the scale factor, and should be given by the initial distance of the fingers, divided by their current distance. Sx, Sy, Sz will be the scaled position of the camera.

// find the pivot point (assuming a standard perspective matrix, where P(1,1) = 2*z_{near}/width and P(2,2) = 2*z_{near}/height)
dx = SPivotX*z_{near}/P(1,1)
dy = SPivotY*z_{near}/P(2,2)
dz = z_{near}

// The camera position in that frame is -(dx, dy, z_{near})
// Scale the position of the camera to the desired ratio. (ignore the minus sign above)
Sx = Sf*dx
Sy = Sf*dy
Sz = Sf*dz

// Hardcode the new position in the view matrix. v1, v2, v3 are the row vectors of the view matrix (the camera's local axes). Ignore the 4th component.
V(1,4) = - dot{ (Sx,Sy,Sz), v1 }
V(2,4) = - dot{ (Sx,Sy,Sz), v2 }
V(3,4) = - dot{ (Sx,Sy,Sz), v3 }

// That's it. The view matrix should be ready.
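The idea of "scaling the camera position around a pivot" can be illustrated in isolation. The sketch below is not a transcription of the pseudo-code above: it assumes the camera and the pivot are both given in the same (world) frame, and it ignores the orbit constraint that the camera must keep looking at the origin. `finger_scale_factor` and `scale_about_pivot` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { double x, y; };
struct Vec3 { double x, y, z; };

// Sf as defined above: initial finger distance divided by current finger distance.
// Spreading the fingers apart gives Sf < 1, i.e. the camera moves towards the pivot.
double finger_scale_factor(const Vec2& lf1, const Vec2& rf1, const Vec2& lf2, const Vec2& rf2)
{
    const double initial = std::hypot(rf1.x - lf1.x, rf1.y - lf1.y);
    const double current = std::hypot(rf2.x - lf2.x, rf2.y - lf2.y);
    assert(current > 0.0);                    // fingers must not coincide
    return initial / current;
}

// New camera position = pivot + Sf * (camera - pivot).
Vec3 scale_about_pivot(const Vec3& camera, const Vec3& pivot, double sf)
{
    return Vec3{pivot.x + sf * (camera.x - pivot.x),
                pivot.y + sf * (camera.y - pivot.y),
                pivot.z + sf * (camera.z - pivot.z)};
}
```

For example, with the camera at (0, 0, 5), the pivot at (0, 0, 1) and the fingers twice as far apart as when they touched down, Sf is 0.5 and the camera ends up at (0, 0, 3), halving its distance to the pivot.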

That must be it. If I'm missing something it must be trivial, because I've used parts of all these for other stuff and I know they work (both theoretically and in practice!)

The procedure above should also work through direct matrix products, just like the "T^{-1}*R_{Z}*T*V_{0}" approach above, by substituting the rotation matrix with the global scale matrix

[ 1 0 0 0    ]
[ 0 1 0 0    ]
[ 0 0 1 0    ]
[ 0 0 0 1/Sf ]

but I can't guarantee that. It seems like it should work though...
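To see why a matrix that only changes the (4,4) element acts as a uniform scale, it helps to follow one point through it. A minimal sketch, with the hypothetical names `Vec3` and `apply_global_scale`, assuming the usual homogeneous divide by w afterwards:

```cpp
#include <cassert>

struct Vec3 { double x, y, z; };

// Apply diag(1, 1, 1, 1/sf) to the homogeneous point (x, y, z, 1) and divide by w.
Vec3 apply_global_scale(const Vec3& p, double sf)
{
    assert(sf != 0.0);
    // The matrix leaves x, y, z untouched and only changes w from 1 to 1/sf.
    const double w = 1.0 / sf;
    // The homogeneous divide brings the point back to w = 1,
    // which multiplies every coordinate by sf.
    return Vec3{p.x / w, p.y / w, p.z / w};
}
```

So the matrix scales every point by Sf about the origin of whatever frame it is applied in, which is why it has to be sandwiched between T and T^{-1} to scale about the pivot.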

That's it. This is how I'd go about the same problem. For the rotation, I'm almost certain. For the scaling, I'm very confident that this is the right way to do it. If the effect you want to achieve is as if the entire 3d scene was scaled around the pivot, I believe this will work. The pivot will always map to the same point on screen, so I guess the rest of the points will behave as I expect.

I don't know whether I'll be able to help you anymore with this, but if you have any questions/feedback, you know where to post :)

edit:

bug in the "scaling part".

Original post was:

Quote:

V(4,1) = - dot{ (Sx,Sy,Sz), v1 }

V(4,2) = - dot{ (Sx,Sy,Sz), v2 }

V(4,3) = - dot{ (Sx,Sy,Sz), v3 }

The symmetric of those indices were to be used.

Corrected now.

[Edited by - someusername on March 23, 2006 11:33:47 AM]

Posted 23 March 2006 - 04:54 AM

Quote:

Original post by someusername

I don't know whether I'll be able to help you anymore with this, but if you have any questions/feedback, you know where to post :)

Thanks, rate++. I'll get back to this thread if I get it working correctly. I'm currently only allotted 25% of my time for it, so it's slow going.

Posted 23 March 2006 - 05:22 AM

Well, I occasionally find some problems -posted here- very interesting and spend quite some time thinking how they could be dealt with.

Quote:

I don't know whether I'll be able to help you anymore with this, [...]

Because I really doubt I can come up with anything else, that is... :)

Btw, I've just found a small bug in my previous post, about the scaling. I'm off to correct it now.