Incorrect hit/picking distance after scaling

Started by BaSSraf · 15 comments, last by BaSSraf
Hey guys, I use a very simple ray/sphere intersection test for object picking. I transform the picking ray into model space using the inverse of the model's world matrix, and when the matrix is unscaled it works flawlessly. After I scale the object, however, the bounding sphere's intersection distance incorrectly comes out much closer to the camera (i.e. the intersection distance is too small for the actual distance), even if I move the object back to the same position as before. Can anyone take a guess at why scaling screws up this simple object picking? (Happy new year! BaSSraf)
All the examples I could find on the net use the inverse of the model matrix to transform the pick-ray direction into object space, but none of those examples use scaling in their model matrix.

If I don't use any scaling it works fine, but as soon as I start scaling models the picking distance is incorrect. Any idea what I might be missing here?

Thanks :)
Similar problem here for me :(. My picking finally works correctly under "normal" circumstances, but when I scale or rotate the whole scene after setting up my view/cam (via gluLookAt or a self-made transform), the resulting ray and intersection point are messed up.

After several non-working attempts with gluUnProject, I tried to understand the math involved and implement it on my own; take a look at this article for very good background info on mouse picking.

I finally was glad to find working code here. It works, but not in the case of scene rotation/scaling (translation works, though). Does anybody have an idea why this happens, or how to take scaling and rotation into account?


static math::Vector3 ScreenToWorldCoordinates(const int mouseX, const int mouseY, const float fovy, const float zNear)
{
    int x = mouseX;
    int y = mouseY;
    float value_fov = math::DegreeToRad(fovy);

    GLint viewport[4];
    glGetIntegerv(GL_VIEWPORT, viewport);
    float value_aspect = (float)viewport[2] / ((float)viewport[3]);
    float half_window_width  = (float)viewport[2] / 2.f;
    float half_window_height = (float)viewport[3] / 2.f;

    //mathematical handling of the difference between
    //your mouse position and the 'center' of the window
    float modifier_x;
    float modifier_y;

    float point[3];            //the untransformed ray will be put here
    float point_dist = zNear;  //it'll be put this far on the Z plane
    float camera_origin[3];    //this is where the camera sits, in 3dspace
    float point_xformed[3];    //this is the transformed point
    GLfloat pulledMatrix[16];  //pulledMatrix is the OpenGL matrix.

    //These lines are the biggest part of this function.
    //This is where the mouse position is turned into a mathematical
    //'relative' of 3d space. The conversion to an actual point
    modifier_x = std::tan(value_fov * 0.5f) * ((1.0f - x / half_window_width) * (value_aspect));
    modifier_y = std::tan(value_fov * 0.5f) * -(1.0f - y / half_window_height);

    //These 3 take our modifier_x/y values and our 'casting' distance
    //to throw out a point in space that lies on the point_dist plane.
    //If we were using completely untransformed, untranslated space,
    //this would be fine - but we're not :)
    point[0] = modifier_x * point_dist;
    point[1] = modifier_y * point_dist;
    point[2] = point_dist;

    //Next we make an openGL call to grab our MODELVIEW_MATRIX -
    //This is the matrix that rasters 3d points to 2d space - which is
    //kinda what we're doing, in reverse
    glGetFloatv(GL_MODELVIEW_MATRIX, pulledMatrix);
    //Some folks would then invert the matrix - I invert the results.

#if 0
    //apply global scene rotation (only around y-axis in my case)
    math::Matrix4x4 rot;
    float rotY = g_globalSceneRotY;
    float C = math::Cos(rotY);
    float D = math::Sin(rotY);
    rot.m[0][0] =  C;
    rot.m[0][2] =  D;
    rot.m[2][0] = -D;
    rot.m[2][2] =  C;

    //convert float-16-array to my matrix-class for easier calculation
    math::Matrix4x4 mv;
    mv.Set(pulledMatrix[0], pulledMatrix[4], pulledMatrix[8],  pulledMatrix[12],
           pulledMatrix[1], pulledMatrix[5], pulledMatrix[9],  pulledMatrix[13],
           pulledMatrix[2], pulledMatrix[6], pulledMatrix[10], pulledMatrix[14],
           pulledMatrix[3], pulledMatrix[7], pulledMatrix[11], pulledMatrix[15]);

    //multiply with global rotation matrix
    mv *= rot;

    //set
    pulledMatrix[0]  = mv.f[0].x;
    pulledMatrix[4]  = mv.f[0].y;
    pulledMatrix[8]  = mv.f[0].z;
    pulledMatrix[12] = mv.f[0].w;

    pulledMatrix[1]  = mv.f[1].x;
    pulledMatrix[5]  = mv.f[1].y;
    pulledMatrix[9]  = mv.f[1].z;
    pulledMatrix[13] = mv.f[1].w;

    pulledMatrix[2]  = mv.f[2].x;
    pulledMatrix[6]  = mv.f[2].y;
    pulledMatrix[10] = mv.f[2].z;
    pulledMatrix[14] = mv.f[2].w;

    pulledMatrix[3]  = mv.f[3].x;
    pulledMatrix[7]  = mv.f[3].y;
    pulledMatrix[11] = mv.f[3].z;
    pulledMatrix[15] = mv.f[3].w;
#endif

    //First, to get the camera_origin, we transform the 12, 13, 14
    //slots of our pulledMatrix - this gets us the actual viewing
    //position we are 'sitting' at when the function is called
    camera_origin[0] = -(pulledMatrix[0]  * pulledMatrix[12] +
                         pulledMatrix[1]  * pulledMatrix[13] +
                         pulledMatrix[2]  * pulledMatrix[14]);
    camera_origin[1] = -(pulledMatrix[4]  * pulledMatrix[12] +
                         pulledMatrix[5]  * pulledMatrix[13] +
                         pulledMatrix[6]  * pulledMatrix[14]);
    camera_origin[2] = -(pulledMatrix[8]  * pulledMatrix[12] +
                         pulledMatrix[9]  * pulledMatrix[13] +
                         pulledMatrix[10] * pulledMatrix[14]);

    //Second, we transform the position we generated earlier - the '3d'
    //mouse position - by our viewing matrix.
    point_xformed[0] = -(pulledMatrix[0]  * point[0] +
                         pulledMatrix[1]  * point[1] +
                         pulledMatrix[2]  * point[2]);
    point_xformed[1] = -(pulledMatrix[4]  * point[0] +
                         pulledMatrix[5]  * point[1] +
                         pulledMatrix[6]  * point[2]);
    point_xformed[2] = -(pulledMatrix[8]  * point[0] +
                         pulledMatrix[9]  * point[1] +
                         pulledMatrix[10] * point[2]);

    return math::Vector3(point_xformed[0], point_xformed[1], point_xformed[2]);
}


The part inside #if 0 ... #endif is my attempt to incorporate the global scene rotation. I thought multiplying the pulled modelview matrix with a rotation matrix should work, but it didn't. I even thought about matrix coefficient order/alignment... no success.

And yeah, I know the code is dirty/hacky, but I didn't want to touch the working code from mt-wudan ;). Btw, he hasn't got a clue either; I already posted to his forum.

--

Concerning your problem, BaSSraf: maybe you should post some code...

[Edited by - ghostd0g on January 4, 2006 7:00:31 AM]
Over here, rotation doesn't seem to be any problem; bounding boxes and spheres are returning consistent distances. But when I throw in the scaling, the distance seems to be linearly scaled (smaller) by the object's scaling!?

It seems to me there must be something wrong with the transformation of the picking ray into model space; it seems the scaling in the inverted model matrix moves the origin of the picking ray closer to the object.

I just found out that the problem is not there when I simply transform the bounding box/sphere into world space and keep the picking ray in world space as well. (So this actually solves the problem, but I still don't know why the scaling screws up the ray-into-model-space transform.)
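For reference, that world-space workaround might look something like this in D3DX-style C++ (a hedged sketch only; worldMatrix, centerModel and radiusModel are illustrative names, not code from this thread, and it assumes d3dx9math.h plus <algorithm> for std::max):

// Move the bounding sphere into world space and leave the pick ray alone.
D3DXVECTOR3 centerWorld;
D3DXVec3TransformCoord(&centerWorld, &centerModel, &worldMatrix);

// A sphere only stays a sphere under uniform scale; to stay conservative,
// grow the radius by the largest row scale of the world matrix.
D3DXVECTOR3 row1(worldMatrix._11, worldMatrix._12, worldMatrix._13);
D3DXVECTOR3 row2(worldMatrix._21, worldMatrix._22, worldMatrix._23);
D3DXVECTOR3 row3(worldMatrix._31, worldMatrix._32, worldMatrix._33);
float maxScale = std::max(D3DXVec3Length(&row1),
                 std::max(D3DXVec3Length(&row2), D3DXVec3Length(&row3)));
float radiusWorld = radiusModel * maxScale;

// Intersect the untouched world-space pick ray against (centerWorld, radiusWorld);
// the resulting distance is already a true world distance.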
The ray in world coordinates is P_w + s*D_w, where P_w is the origin of the ray, D_w is a unit-length direction, and s >= 0 is the ray parameter. The model coordinates for a triangle are V0_m, V1_m, and V2_m. The model-to-world transformation is X_w = M*X_m + T, where X_m is a model point, X_w is the corresponding world point, M is a matrix (includes rotation and scaling), and T is a translation vector. The inverse transformation (world-to-model) is X_m = Inverse(M)*(X_w - T). The world coordinates for the triangle vertices are V0_w = M*V0_m + T, V1_w = M*V1_m + T, and V2_w = M*V2_m + T.

The ray in model coordinates is Q_m + s*E_m = Inverse(M)*(P_w - T) + s*Inverse(M)*D_w. If M has scaling factors, E_m is not necessarily a unit-length vector. This does not matter (see below).

The intersection of the ray and the model triangle is obtained by solving the equation Q_m + s*E_m = b0*V0_m + b1*V1_m + b2*V2_m, where b0, b1, and b2 are barycentric coordinates (b0+b1+b2 = 1). If there is an intersection, then s >= 0, 0 <= b0 <= 1, 0 <= b1 <= 1, and 0 <= b2 <= 1. This is what a picking system solves numerically.

Now apply the model-to-world transformation: M*Q_m + s*M*E_m + T = b0*(M*V0_m + T) + b1*(M*V1_m + T) + b2*(M*V2_m + T). Using b0+b1+b2 = 1 and the various definitions of the ray/triangle quantities, the transformed equation is P_w + s*D_w = b0*V0_w + b1*V1_w + b2*V2_w. So the s, b0, b1, and b2 values are the same whether you solve for the intersection in model coordinates or in world coordinates.

If your picking system is producing what appears to be an incorrect distance along the ray, it is probably because you are computing the distance between two points in model space as Length(b0*V0_m + b1*V1_m + b2*V2_m - Q_m). This is incorrect as a world distance because it is equal to Length(s*E_m) = s*Length(E_m), but Length(E_m) is not guaranteed to be 1. The correct distance along the ray is the value s (recall I assumed D_w is unit length).
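To make the bookkeeping concrete, here is a minimal sketch of that last point in C++ (Vec3, TransformPoint, TransformVector, IntersectRayTriangle and Length are placeholder helpers, not anyone's code from this thread):

// Assumes rayDirW (= D_w) is unit length in world space and invM = Inverse(M).
Vec3 Q_m = TransformPoint(invM, rayOriginW);   // point: translation applies (w = 1)
Vec3 E_m = TransformVector(invM, rayDirW);     // vector: translation ignored (w = 0)

float s = 0.0f;
float worldDist = 0.0f;

// Case 1: the intersector accepts the non-unit E_m and solves Q_m + s*E_m = hit.
// Because D_w was unit length, s is already the world-space distance.
if (IntersectRayTriangle(Q_m, E_m, V0_m, V1_m, V2_m, &s))
    worldDist = s;

// Case 2: the intersector insists on a unit-length direction. Then the value
// it returns is a distance measured in (scaled) model units, so undo the scale:
float lenE = Length(E_m);
if (IntersectRayTriangle(Q_m, E_m / lenE, V0_m, V1_m, V2_m, &s))
    worldDist = s / lenE;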

Thanks "Wasting Time" but I have difficulties implementing your mentioned correction into my existing direct3d code :(

The construction of the picking-ray:
v.x =  ( ( ( 2.0 * x ) / ViewPort.Camera.ScreenWidth  ) - 1 ) / Projection._11;
v.y = -( ( ( 2.0 * y ) / ViewPort.Camera.ScreenHeight ) - 1 ) / Projection._22;
v.z = 1.0;

// Get the inverse view matrix
D3DXMatrixInverse(m, null, ViewPort.Camera.View);

// Transform the screen space pick ray into 3D space
PickRay.Direction.x = v.x*m._11 + v.y*m._21 + v.z*m._31;
PickRay.Direction.y = v.x*m._12 + v.y*m._22 + v.z*m._32;
PickRay.Direction.z = v.x*m._13 + v.y*m._23 + v.z*m._33;
PickRay.Direction.Normalize;
PickRay.Origin.x = m._41;
PickRay.Origin.y = m._42;
PickRay.Origin.z = m._43;

// calc origin as intersection with near frustum
PickRay.Origin = PickRay.Origin + (PickRay.Direction * ViewPort.Camera.NearZ);


The actual picking code:

// ray from world space into model space
RayModelSpace = PickRay.GetTransformed(CombinedModelMatrix.GetInverse);
D3DXIntersect(Mesh, RayModelSpace.Origin, RayModelSpace.Direction, Hit,
              null, null, null, &Distance, null, null);


The code above returns the correct distance for an object, say 10.0, but when I scale the object by a factor of 5 (and move it away from the camera to compensate for the scaling) it gives me a distance of 2, i.e. 10/5. How would I compensate for, or eliminate, the scaling component in the CombinedModelMatrix?
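One way to compensate without touching CombinedModelMatrix might be the following (a hedged C++-style D3DX sketch continuing the snippet above; it works no matter how GetTransformed changed the direction's length): compute the model-space hit point from the returned Distance, push it back through the forward matrix, and measure the distance in world space.

// Model-space hit point from the distance D3DXIntersect returned.
D3DXVECTOR3 hitModel = RayModelSpace.Origin + RayModelSpace.Direction * Distance;

// Back to world space (w = 1, so translation applies).
D3DXVECTOR3 hitWorld;
D3DXVec3TransformCoord(&hitWorld, &hitModel, &CombinedModelMatrix);

// Measure the distance where it is meaningful: in world space.
D3DXVECTOR3 toHit = hitWorld - PickRay.Origin;
float worldDistance = D3DXVec3Length(&toHit);   // 10.0 again, not 10.0/5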
This problem is being discussed in another thread too. I've been looking into it, and it seems that the problem is with the transformation of the ray origin and direction into model space coordinates.

Although the ray origin (a point) has to be multiplied by the entire inverse world matrix in order to be taken back into model space, the ray direction is a vector and should be transformed as one. This means that if you use a 4D vector for the ray direction, you'll have to set its w-component to zero, to keep it unaffected by the translation part of the world matrix.

Let P, O, S be the translation, orientation, and scaling matrices of your object. They'll look something like this...
    [ ScaleX    0      0     0 ]        [ localX.x  localY.x  localZ.x  0 ]
S = [   0    ScaleY    0     0 ],   O = [ localX.y  localY.y  localZ.y  0 ]
    [   0      0    ScaleZ   0 ]        [ localX.z  localY.z  localZ.z  0 ]
    [   0      0      0      1 ]        [    0         0         0      1 ]

    [ 1  0  0  pos.x ]
P = [ 0  1  0  pos.y ]
    [ 0  0  1  pos.z ]
    [ 0  0  0    1   ]

The world matrix, WM, is:

             [ ScaleX*localX.x  ScaleY*localY.x  ScaleZ*localZ.x  pos.x ]
WM = P*O*S = [ ScaleX*localX.y  ScaleY*localY.y  ScaleZ*localZ.y  pos.y ]
             [ ScaleX*localX.z  ScaleY*localY.z  ScaleZ*localZ.z  pos.z ]
             [        0                0                0           1   ]

As you can clearly see, the orientation 3x3 part of the world matrix has the three scaling values baked into its column vectors. If you extract the orientation from the world matrix, you'll have to normalize each column vector before using it to transform the ray direction vector. (Be careful: this is not the same as normalizing the vector after the transform.)

In my program I use the following approach, and it seems to work for any camera position and object position/orientation/scaling. The exact formulas I used are:

ModelSpaceRayOrigin = WM^-1 * rayOrigin = S^-1 * O^-1 * P^-1 * rayOrigin
ModelSpaceRayDir = O^-1 * {rayDir.x, rayDir.y, rayDir.z, 0}


You'll have to use 4-vectors for the ray origin and direction. It is extremely important that the w-component of rayOrigin is 1 (because it's supposed to be a point), but the w-component of rayDir must be 0.

This code seems to remain unaffected by the bug you're discussing.
I haven't had the chance to try it in a realtime application, so if anyone does, I'd appreciate the feedback.

edit:
btw, these are in column-major format. If anyone tries this in DirectX or any other API that uses the row-major convention, you'll have to transpose each matrix:

WM = (P*O*S)^T = S^T * O^T * P^T
ModelSpaceRayOrigin = rayOrigin * WM^-1 = rayOrigin * (P^T)^-1 * (O^T)^-1 * (S^T)^-1
ModelSpaceRayDir = {rayDir.x, rayDir.y, rayDir.z, 0} * (O^T)^-1 = {rayDir.x, rayDir.y, rayDir.z, 0} * O


(as always, look out for any trivial mistakes)
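In D3DX-flavored C++, that recipe might look roughly like this (a sketch with illustrative names only; row-vector convention, so the rows of the 3x3 part are the scaled local axes):

// Ray origin is a point (w = 1): use the full inverse world matrix.
D3DXMATRIX invWorld;
D3DXMatrixInverse(&invWorld, NULL, &worldMatrix);
D3DXVECTOR3 rayOriginModel;
D3DXVec3TransformCoord(&rayOriginModel, &rayOriginWorld, &invWorld);

// Ray direction is a vector (w = 0): use the pure rotation O only.
// Rebuild O by normalizing the first three rows of the world matrix...
D3DXVECTOR3 axisX(worldMatrix._11, worldMatrix._12, worldMatrix._13);
D3DXVECTOR3 axisY(worldMatrix._21, worldMatrix._22, worldMatrix._23);
D3DXVECTOR3 axisZ(worldMatrix._31, worldMatrix._32, worldMatrix._33);
D3DXVec3Normalize(&axisX, &axisX);
D3DXVec3Normalize(&axisY, &axisY);
D3DXVec3Normalize(&axisZ, &axisZ);

// ...then apply O^-1 = O^T: each model-space component is the dot product of
// the world direction with the corresponding (unit) world axis.
D3DXVECTOR3 rayDirModel(D3DXVec3Dot(&rayDirWorld, &axisX),
                        D3DXVec3Dot(&rayDirWorld, &axisY),
                        D3DXVec3Dot(&rayDirWorld, &axisZ));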
Thanks someusername, and now I understand why it is happening.
One remaining problem: the "CombinedMatrix" is constructed by combining the RST matrices (P, O, S as you call them) of the object AND the object's parent(s), so I can't simply extract the orientation part from it...

(Now that I say that... I'll have a go at it and see what it brings me :)

It's funny that ray picking is such a common thing, yet no one seems to have much of a problem with object hierarchies... (or is no one using them?)

Thanks, I'll let you know.
Now that I think of it...
Since I'm already using D3DXVec3TransformNormal() to transform the ray direction, I think this won't resolve the problem :(
D3DXVec3TransformNormal will produce erroneous results if you don't normalize the first three rows as 3D vectors.
If scaling was applied and you don't do the above procedure, the determinant of the upper-left 3x3 part of the matrix will not be 1, so what you get from D3DXVec3TransformNormal will definitely not be a rotation of the original normal.
Try reading the magnitude of the normal before and after the transform: if you don't normalize the orientation axes, the magnitude will change.
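A quick way to check (and correct) this in C++-style D3DX might be the following (hypothetical names; the numbers in the comments reuse the scale-by-5 example from earlier in the thread):

// dirWorld is unit length; after transforming by a scaled inverse world
// matrix it no longer is, and that stretch is exactly the error factor.
D3DXVECTOR3 dirModel;
D3DXVec3TransformNormal(&dirModel, &dirWorld, &invWorldMatrix);
float lenAfter = D3DXVec3Length(&dirModel);   // e.g. 1/5 for a uniform scale of 5

D3DXVec3Normalize(&dirModel, &dirModel);      // give the intersector a unit direction
// ... run the intersection test with dirModel, which yields modelDistance ...

float worldDistance = modelDistance / lenAfter;   // 2.0 / (1/5) = 10.0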

Quote:
One remaining problem: the "CombinedMatrix" is constructed by combining the RST matrices (P, O, S as you call them) of the object AND the object's parent(s), so I can't simply extract the orientation part from it...

Don't you render the object in the scene? Use the very same world matrix. It doesn't matter that it's part of a hierarchy chain, as long as you have the final combined world matrix.
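In other words (a hedged sketch; the matrix names here are made up): whatever chain produces the matrix you render with is the single matrix to invert for picking.

// Row-vector convention: the child's local transforms apply first.
D3DXMATRIX combinedWorld = localScale * localRotation * localTranslation
                         * parentCombinedWorld;

// Invert this one matrix for the pick transform; there is no need to walk
// the hierarchy a second time.
D3DXMATRIX invCombined;
D3DXMatrixInverse(&invCombined, NULL, &combinedWorld);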

