Moving objects with the mouse in 3D

10 comments, last by steven katic 15 years, 7 months ago
OK, what's happened is that I've found myself trying to write some sort of article/tutorial to explain the diagram and its implementation (I can tell you it's easier to implement than to explain an implementation in writing). It is a work in progress that may sound a bit rough around the edges or incomplete at the moment, so by all means ask any questions, ebody. For now, here is some more food for thought about that diagram:

Overview
As you can see in the diagram, the process is broken up into a series of steps from A to I. The result of each step is a piece of data that is used in getting the object from B to I as the mouse pointer is moved from A to E.

If you take another look at the diagram, you may notice that EFGI is just a translated version of ADCB. A telltale sign that ADCB may be invariant? Why not... let's treat ADCB (i.e. ABCD) as invariant. It will simplify our work dramatically (if you aren't familiar with the term invariant as used in computer science, you can always look it up). So we are going to use ABCD as an invariant to move the object by the vector BI in 3D space when the mouse pointer is moved by the vector AE on the screen. As long as ABCD is maintained as invariant, moving the object with the mouse pointer can be done from any camera/viewport position and rotation (OK, there will be a few exceptions/special cases, like when the camera/viewport plane is perpendicular to the virtual plane, but we'll get to that in due course).

Here's how we can do it:
Calculate ABCD when the user picks the object. When the mouse pointer moves from A to E, derive F, G and I, then move the object by the vector BI. We can do this easily when ABCD is invariant: because the relationship between A, B, C and D doesn't change, we can use it to derive F, G and I.

Here's the first step (when the object is picked):

A: Get the 2D mouse pointer position (call it MousePointer2DPos).
B: Derive the 3D position on the object that the mouse pointer hit (call it MousePointer3DPos).
C: Map the MousePointer3DPos perpendicularly (and down, in this case) onto the virtual plane (call it MappedMousePointer3DPos).
D: Derive the 2D position (on the screen viewport) of the MappedMousePointer3DPos from step C (call it MappedMousePointer2DPos).
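
Step C is probably the least familiar of the four, so here is a minimal sketch of it in Python. It assumes the virtual plane is described by a point on the plane and a unit-length normal; the function name map_onto_plane and the sample values are made up for illustration and are not from the diagram.

```python
def dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def map_onto_plane(point, plane_point, plane_normal):
    """Drop `point` perpendicularly onto a plane (step C).

    Assumes `plane_normal` has unit length.
    """
    # Signed distance of the point from the plane, measured along the normal.
    offset = tuple(p - q for p, q in zip(point, plane_point))
    distance = dot(offset, plane_normal)
    # Step back along the normal by that distance to land on the plane.
    return tuple(p - distance * n for p, n in zip(point, plane_normal))


# Hypothetical values: B is 2 units above a ground plane y = 0.
mouse_pointer_3d_pos = (1.0, 2.0, -3.0)
mapped_mouse_pointer_3d_pos = map_onto_plane(
    mouse_pointer_3d_pos, (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(mapped_mouse_pointer_3d_pos)  # (1.0, 0.0, -3.0)
```

With a horizontal plane this just zeroes out the height, but the same function should work for a tilted virtual plane as long as the normal is normalised.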

Now let's stop for a moment. It all sounds a bit convoluted so far... even to me. Some of these steps should sound pretty ordinary to anyone familiar with projecting and unprojecting back and forth between a 2D screen space and a virtual 3D world space. For example, gluUnProject() can be used to obtain the MousePointer3DPos in step B, and gluProject() can be used to obtain the MappedMousePointer2DPos in step D. Another reason I stopped here is that steps A to D are only done once, at the beginning of the whole process of getting the object from B to I. Basically we can group steps A to D into a sub-process called PickAnObject(). It's much like any other 'object picking' solution, but we have 2 additional pieces of information obtained from steps C and D. Typically we can use PickAnObject() the following way (using pseudocode):

ProcessMouseDown(int x, int y)
{
    if we are in Select and Move Object Mode
        if left mouse button is down
            PickAnObject(x, y);
}

Well, so far nothing new, except C and D.
Forget about C for the moment; just look at it as something we have to do to get the data resulting at D. At D we have the MappedMousePointer2DPos. This point is used as the start point of our translation of the cube. As soon as we get the 2D mouse position on the screen as it moves (at E), we move D by the vector AE to get the end point of the translation (at F). Note in the diagram that the position and distance of E are arbitrary. Note that the position and orientation of the camera/viewport are also arbitrary at the moment. This is probably one of the things I like about this implementation: it doesn't care about that information directly. All it needs to know is how and when to get the position and rotation as needed. Well, it turns out the only time we need them is when we project and unproject, so we'll just use gluProject() and gluUnProject(). That keeps things simple. If you are a matrix manipulation aficionado you can search for any optimizations at the matrix level if you wish. But for the moment, gluProject() and gluUnProject() will be satisfactory.
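
For anyone curious about what the projection in step D boils down to, here is a rough Python sketch of the arithmetic gluProject() is documented to perform. It is not the GLU source. The matrices here are plain lists of rows (OpenGL itself stores them column-major), and the identity matrices in the example are placeholders standing in for a real modelview and projection.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (a list of 4 rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def project(obj, modelview, projection, viewport):
    """Map a 3D point to window coordinates, in the manner of gluProject()."""
    eye = mat_vec(modelview, [obj[0], obj[1], obj[2], 1.0])
    clip = mat_vec(projection, eye)
    if clip[3] == 0.0:
        return None  # cannot do the perspective divide
    ndc = [clip[i] / clip[3] for i in range(3)]  # each in [-1, 1] if visible
    vx, vy, vw, vh = viewport
    # Window origin is bottom-left here; mouse coordinates often have a
    # top-left origin, so y may need flipping before comparing the two.
    return (vx + vw * (ndc[0] + 1.0) / 2.0,
            vy + vh * (ndc[1] + 1.0) / 2.0,
            (ndc[2] + 1.0) / 2.0)


# Placeholder matrices: identity modelview and projection, 800x600 viewport.
IDENTITY = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
window_pos = project((0.5, -0.5, 0.0), IDENTITY, IDENTITY, (0, 0, 800, 600))
print(window_pos)  # (600.0, 150.0, 0.5)
```

gluUnProject() runs the same chain backwards using the inverse of the combined matrix, which is why the two calls are enough to keep the camera details out of the rest of the process.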

Now, on with the process.

E. Get the 2D mouse pointer position when it moves. This is actually a repeat of step A (but obviously the mouse pointer has just moved to a new spot). Let's suppose for a moment that the letters A to I in the diagram are geometric points. Then we can say the following: our aim is to translate B to I when the mouse pointer is moved from A to E. That is, translate the object by the vector BI. But we need the point I. Here's one way to get it:

The same as before, we map between the screen and the 3D space, but we will go backwards this time to find the point I. This is described next.

F. Derive the next MappedMousePointer2DPos position. This is just the point at D translated by the vector AE (i.e. by the mouse pointer's move from A to E). This is a very interesting point (is that a pun?), because it is the position on the 2D screen/viewport that G (in 3D space) would map to if it were mapped onto the 2D screen/viewport. If we look carefully, we can see that G is mapped to F indirectly via the relationship between C and D. The relationship between C and D defines the mapping of the viewport to the virtual plane. If we maintain this relationship as an invariant, we can use it to freely move the object with the mouse.
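
Step F is only 2D vector arithmetic. A tiny Python sketch, with made-up pixel values and a hypothetical function name drag_target_2d:

```python
def drag_target_2d(mapped_2d_at_pick, mouse_at_pick, mouse_now):
    """Step F: translate D by the mouse movement AE."""
    ae = (mouse_now[0] - mouse_at_pick[0], mouse_now[1] - mouse_at_pick[1])
    return (mapped_2d_at_pick[0] + ae[0], mapped_2d_at_pick[1] + ae[1])


# Made-up screen positions in pixels.
a = (400, 300)   # MousePointer2DPos at pick time (A)
d = (410, 260)   # MappedMousePointer2DPos at pick time (D)
e = (430, 310)   # current mouse position (E)

f = drag_target_2d(d, a, e)
print(f)  # (440, 270)

# The invariant: F sits at the same screen offset from E as D did from A.
assert (f[0] - e[0], f[1] - e[1]) == (d[0] - a[0], d[1] - a[1])
```

Only D and A need to be stored at pick time; every later mouse move reuses them.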

G. [note: this has been edited]. Cast a ray from F on the viewing plane to hit the virtual plane in 3D space. This ray is cast the same way as the ray from C to D, just in the opposite direction. The hit point is the new MappedMousePointer3DPos. (Note: in a viewport using an orthographic projection, the rays FG and CD will be parallel to each other and perpendicular to the viewing plane, simply because of the... err... orthogonal nature of the projection (needs verification?). With a little more thought, you may come to realise that for viewports using the orthographic projection we can accomplish a lot of our task(s) without the virtual plane altogether, and exploit the characteristics of the orthographic projection instead. But then we may need to write more (other) code to take the position and rotation of the viewport/camera into account more directly in our calculations [a trade-off to consider?].)
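
The ray cast in step G comes down to a ray/plane intersection. Here is a minimal Python sketch of that part only. How the ray is built from F is left open; one option would be to call gluUnProject() on F at window depths 0 and 1 and use the two resulting points. The function name and the sample numbers are illustrative assumptions.

```python
def dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def intersect_ray_plane(origin, direction, plane_point, plane_normal, eps=1e-9):
    """Return the point where a ray hits a plane, or None if it never does."""
    denom = dot(direction, plane_normal)
    if abs(denom) < eps:
        # Ray runs parallel to the plane: the edge-on special case.
        return None
    to_plane = tuple(p - o for p, o in zip(plane_point, origin))
    t = dot(to_plane, plane_normal) / denom
    if t < 0.0:
        return None  # the plane is behind the ray origin
    return tuple(o + t * d for o, d in zip(origin, direction))


# Made-up ray from a camera 10 units above the ground plane y = 0.
hit = intersect_ray_plane((0.0, 10.0, 0.0), (1.0, -2.0, 0.0),
                          (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(hit)  # (5.0, 0.0, 0.0)
```

The None result for a parallel ray corresponds to the case where the view direction lies in the virtual plane, so a caller would probably just skip the move for that frame.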

I. Translate G by the vector CB to get I. Then translate the object by the vector BI (could we also just translate the object by the vector CG?).
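
On the question in step I: since I = G + CB, we get BI = I - B = G + (B - C) - B = G - C, so translating by CG should give the same result. A quick numeric check in Python, using made-up points:

```python
def add(a, b):
    """Component-wise sum of two 3D vectors."""
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    """Component-wise difference a - b of two 3D vectors."""
    return tuple(x - y for x, y in zip(a, b))


# Made-up points: B on the object, C directly below it on the plane y = 0,
# and G where the ray from F hits the plane.
b = (1.0, 2.0, -3.0)
c = (1.0, 0.0, -3.0)
g = (4.0, 0.0, -1.0)

cb = sub(b, c)      # vector from C up to B
i = add(g, cb)      # step I: translate G by CB
bi = sub(i, b)      # translation applied to the object
cg = sub(g, c)      # the proposed shortcut

print(bi, cg)  # (3.0, 0.0, 2.0) (3.0, 0.0, 2.0)
assert bi == cg
```

If that holds in general, point I never needs to be computed unless the new hit position on the object is wanted for its own sake.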

Let us know if this has helped ebody...
(You might have found your own better solution by now?)

PS. I think treating the letters A to I as both process steps and geometric points might confuse people? (The explanation seems to be screaming out for a demo implementation too.)

[Edited by - steven katic on September 14, 2008 1:48:32 AM]
Here is an implementation example in C# XNA (unfortunately not OpenGL) that someone has done at:

http://www.ziggyware.com/readarticle.php?article_id=189

Which is perhaps a simpler method?
"For plane translations we do not need to project anything from 3D to 2D before calculating differences. We have a convenient ray/plane intersection method at our disposal that can calculate the intersection points of mouse rays from the mouse start and end positions onto the plane on which we want to translate. The amount of translation is simply the difference between the first mouse intersection with the plane and the second mouse intersection with the plane, as these intersection points already exist in 3D. Adding this difference to the existing translation component results in the correct amount of translation for the corresponding mouse input." from the link above.

Or perhaps not simpler (if I resort to splitting hairs about minimising 2D/3D projection use):
The XNA example implementation seems to continually project from 2D to 3D twice during the translation operation (for the start and end points), whereas in my example you would project twice (from 2D to 3D once, and from 3D to 2D once) at the start of the translation operation, then continually project once (for the end point) from 2D to 3D during the rest of the operation.
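
For comparison, the ray/plane approach quoted above can be sketched roughly as follows in Python. This is not the XNA code from the article. It assumes the two mouse rays have already been built from the start and end mouse positions, and all names and values are illustrative.

```python
def dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def intersect_ray_plane(origin, direction, plane_point, plane_normal):
    """Return where a ray hits a plane, or None if it runs parallel to it."""
    denom = dot(direction, plane_normal)
    if abs(denom) < 1e-9:
        return None
    to_plane = tuple(p - o for p, o in zip(plane_point, origin))
    t = dot(to_plane, plane_normal) / denom
    return tuple(o + t * d for o, d in zip(origin, direction))


def plane_drag_delta(start_ray, end_ray, plane_point, plane_normal):
    """Translation = second plane hit minus first plane hit."""
    first = intersect_ray_plane(*start_ray, plane_point, plane_normal)
    second = intersect_ray_plane(*end_ray, plane_point, plane_normal)
    if first is None or second is None:
        return (0.0, 0.0, 0.0)  # no sensible move when a ray misses the plane
    return tuple(s - f for s, f in zip(second, first))


# Made-up mouse rays as (origin, direction) from an eye 10 units above y = 0.
eye = (0.0, 10.0, 0.0)
start_ray = (eye, (1.0, -10.0, 1.0))
end_ray = (eye, (4.0, -10.0, 3.0))

delta = plane_drag_delta(start_ray, end_ray, (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(delta)  # (3.0, 0.0, 2.0)
```

Both rays go from 2D to 3D, which is the "twice per move" cost mentioned above; the start ray could presumably be intersected once at pick time and cached if that mattered.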

[Edited by - steven katic on September 17, 2008 6:14:52 PM]

