Why Diana Gruber's wrong about Quats

Started by
163 comments, last by Shaterri 20 years, 10 months ago
Diana Gruber has written a very interesting article about Quaternions for the Gamedev site; if you haven''t read it yet, you can find it here. It''s certainly worth reading as one person''s perspective on the issue. Unfortunately, I also find it to be inaccurate (or at least misrepresentative) on several counts. Mrs. Gruber is welcome to her own opinions on whether quaternions are worthwhile for her own programming efforts, but at the same time obviously a lot of people _have_ implemented quaternion-based rotations for their graphics and physics engines, and physicists (not just mathematicians) have been actively using them since Hamilton discovered them (as the most convenient representation of the double cover of SO(3) by S^3, to put it in the most technical terms) so I think there''s at least some empirical evidence that they _do_ indeed do something ''that couldn''t be done more easily using more traditional mathematics'', to use her phrase. But she''s also wrong about some of the specifics; here''s why I think so. Firstly, Mrs. Gruber compares building a rotation from a quaternion to building a rotation from a rotation angle and an axis; I won''t reproduce the formulas but you can get them from her article. Her implication is that building from a rotation angle is easier, but let''s count. Computing the matrix from a quaternion involves 10 multiplications: squaring each of w, x, y, and z, and computing the product of each of these with the others (w*x, w*y, w*z, x*y, x*z, and y*z). And of course, w^2 + x^2 + y^2 + z^2 (the term she puts into the lower right corner) is just 1, which implies that one of the squarings can be eliminated and brings us down to 9 multiplies. The fastest way I can find to compute all the products needed to build the rotation matrix from angle/axis takes 10 multiplies: s*x, s*y, s*z, t*x, t*y, and then tx*x, tx*y, tx*z, ty*y, ty*z, and finally tz^2 can be computed as t-tx^2-ty^2 without any additional multiplications. She also fails to mention that if you use the axis-angle representation then you have to compute the cosine and sine of your angle of rotation, and those will far and away dominate the costs of multiplication. Secondly, while examining the cost of building a rotation (matrix) initially, she completely ignores the cost of composing rotations. Two quaternions can be multiplied straightforwardly with twelve multiplies and a handful of additions, and it''s possible to multiply two quaternions using just 8 multiplies (this is a classical result, but I don''t have a good reference and my own code doesn''t use it; can anyone else provide a pointer?). The standard way of multiplying two 3x3 rotation matrices takes 27 multiplies and it''s possible to show that two 3x3 matrices can''t be multiplied with fewer than 17 multiplications. If composition of rotations is a major part of your application (for, e.g., a kinematics system), then this can be a fairly substantial improvement. She talks about the benefits of quaternions (stability and unit length) being ''obviously present in the rotation matrix (since they are the same matrix).'' But a quaternion is _not_ a matrix, and it''s not an axis-and-angle pair either. All three of these are ways of representing a rotation (that is, an element of SO(3)) -- but they''re different representations. The fact that it''s possible to convert from the quaternion representation to a matrix representation does not mean that the representations are equivalent in terms of stability, any more that it means they''re equivalent in terms of mathematical costs (as we saw above). I''m not going to suggest that quaternions are more stable than a rotation matrix, but I _will_ say that they''re easier to stabilize. Consider, again, a kinematics animation system. Let''s say your object is rotating with nonconstant velocity, so that you have to multiply by an instantaneous rotation matrix every frame. Eventually floating-point roundoff will make the aggregate rotation non-orthogonal, whatever representation you use (i.e., it won''t be an exact representation of a rotation any more). This usually leads to shearing, stretching, or compression of the transformed object. In the case of a quaternion representation of rotations it''s easy to re-orthogonalize your rotation; all that''s involved is a normalization of your quaternion, just as you''d normalize a vector. But the process of ensuring that a rotation matrix is orthogonal and of re-orthogonalizing it is substantially more complex. Mrs. Gruber also goes on to dismiss the issue of Slerps, mentioning mostly the ''alternative method'', and tries to scare off the reader by talking about ''logarithms, exponents, the fourth dimension...'' -- but of course, that''s not how anyone actually _implements_ it. The primary use for the exponential interpretation of quaternion slerps is for dynamics, again; if you need to know the angular velocity along, say, a spline rotation curve then it''s possible to compute it by taking the derivative of the quaternion ''position'' formula (but I won''t go into the details of this, because it _is_ a fairly involved process and I don''t want everyone''s eyes glazing over). As far as actual implementation of slerps -- well, here''s a direct paste of the source code from my quaternion engine:
    
// Slerp interpolates between two unit quaternions, maintaining ''uniticity''.
Quaternion Slerp(Quaternion Q0, Quaternion Q1, float T) {
  float CosTheta = Q0.DotProd(Q1);
  float Theta = acosf(CosTheta);
  float SinTheta = sqrtf(1.0f-CosTheta*CosTheta);
  
  float Sin_T_Theta = sinf(T*Theta)/SinTheta;
  float Sin_OneMinusT_Theta = sinf((1.0f-T)*Theta)/SinTheta;

  Quaternion Result = Q0*Sin_OneMinusT_Theta;
  Result += (Q1*Sin_T_Theta);

  return Result;
}
    
Now, certainly cosines (and inverse cosines) and square roots are a little bit scary, but they''re hardly _that_ frightening. I don''t think you''re likely to find many graphics programmers who can''t make sense of the code above. She asks ''Can''t we find a simpler way to interpolate between vectors in R3?'' and goes on to demonstrate an easy answer to her question. The problem is, interpolating between two vectors with a rotation is _NOT_ the same as interpolating between two rotations! For instance, if a camera is rotating it has not only a lookat vector but also an up vector (in effect, a twist angle) and Mrs. Gruber''s method of rotating between two vectors doesn''t deal with this issue at all. One might think that if you had two rotations in axis-angle form -- (v1, theta1) and (v2, theta2) -- then the two could be interpolated by using her formulas to interpolate v1 and v2 and then simply taking theta as (1-t)*theta1+t*theta2. The problem with this formula is that it''s not mathematically accurate (the rotation is at inconstant speed and isn''t geodesic), which makes the motion of tumbling objects look wrong in an animation. The only accurate way of interpolating between two rotations is with a slerp, and the easiest way of representing a slerp is between two quaternions. I''m sorry that Mrs. Gruber doesn''t seem understand the various benefits that using quaternions really _can_ offer and sorry that she''s so active in crusading against them; while they''re certainly not necessary for every project I hope this post makes it a little more clear why one would want to use them and why (to answer the title of her article) ''Yes, some of us really do need quaternions''.
Advertisement
The contents and proofs in Mrs Grubers Article are valid, but she forgets one thing - ''Gimbal Lock''. This alone is something well worth using Quaternions to get around.

I really don''t see the point in writing an article so large when I can tell her in one paragraph why a Quaternion should be used, and I haven''t even bothered to outline the many other benefits (there are tons of stuff posted on this topic).

If people have been asking for Quarternions support in her fastgraph library then maybe there is good reason for it. is anyone actually using fastgraph for a full commercial game project though? To define commercial, I don''t want follow ups telling me someone is selling a poker game. What about some serious games?
Actually, the issue of Gimbal lock is only relevant to the question of ''quaternions vs. Euler angles'', and in Mrs. Gruber''s defense she never once mentions Euler angles in her article. Euler angles are yet another representation of the rotation group (one I forgot to mention), and they _do_ suffer from the problem of Gimbal lock just as the anonymous poster said, but neither the axis-angle and rotation matrix representations that she offers nor quaternions suffer from it. They''re an excellent reason to use quaternions over Euler angles but they don''t actually address the issue in Mrs. Gruber''s article (or my reply) of quaternions vs. rotation matrices.
Another thread here on gamedev: Quaternions - Why use them?

Shaterri,

Identify yourself further, please.

Your message is long, and I don''t want to address all your points at once. So don''t think I am ignoring details. I''ll get to them, eventually.

Let''s start with paragraph 2. The terms w,x,y,z. What are those exactly? Unless I am mistaken, they are shorthand symbols that represent quaternion elements. The usual representation for a rotation quaternion is q=cos(a)+u*sin(a) where a is the rotation angle divided by 2, and u(x'',y'',z'') is the unit vector representing the axis of rotation.

So your elements actually look something like this:

w = cos(theta/2)
x = x'' sin(theta/2)
y = y'' sin(theta/2)
z = z'' sin(theta/2)

It seems to me you have left a few sines, cosines, divides and multiplies out of your operation count.

More later...

Diana
Check out Real Time Rendering (ISBN 1-56881-101-2) page 47, where you will find a formula for conversion from a unit quaternion to an orthogonal matrix involving no trigonometric functions.


I don''t mean to intrude or anything, but I guess what would decide the final straw in terms of what is most efficient, would be to profile both well written quaternion and well written matrix code, for equivalant operations, and compare the results.

One thing quaternions might be better choice for is sending arbitrary orientations (and not just around a single axis), over a network. Their smaller size could make a difference when a signifcant number of objects need to be kept track of, especially when squeezing stuff through a 56K modem. To go off on a slight tangent, you could lose some precision, and convert the floating point values of the quaternion to a signed 16-bit 0:15 fixed point representation (they''re normalized values, same as an 3x3 axial frame without scaling information). I digress... anyway, this would be more ideal than sending a 3x3 axial frame (the upper 3x3 sub-matrix).

One thing a matrix suffers from that a quaternion does not, is that over time, if you concatenate various orientations together into one matrix, is that the values degrade due to loss in floating point precision. This is known as de-orthonormalization. In otherwords, the axes aren''t quite perpindicular to each other. You can perform an orthonormalization operation on the matrix to line-up the axes again, but it''s an operation that can be advoided. When an axial frame is generated from a quaternion, the axes are orthonormal. Of course, this problem is rarely encountered because most programmers generally regenerate the rotation matrix each frame, and there''s no noticable entropy.

Diana, you''ve got some good points though. Most programmers treat quaternion math as a black box, hell, I do. Vector and affine space transformations are a ton more intuitive to understand, and easily implemented. Adding quaternions to your code base means another data construct, and more coding... and a less stream-lined pipeline. Maybe in the end, it''ll just boil down to preference, who knows.

Just some thoughts to add to the discussion. Cheers!

- Jesse Towner
Sorry for not identifying myself previously. My name''s Steven Stadnicki (scruff@halcyon.com) and I should note that I''m not currently in the game development industry, although I have been in the past; currently I''m working at a media processor company doing graphics driver and windowing system API work. I have about 10 years of professional 3d graphics and games programming experience under my belt, though, and another 4-5 years of hobbyist study of the subject before that, so I''m not exactly a babe in the woods either.

I''m going to focus on the meat of the reply here:

quote:
The terms w,x,y,z. What are those exactly? Unless I am mistaken, they are shorthand symbols that represent quaternion elements. The usual representation for a rotation quaternion is q=cos(a)+u*sin(a) where a is the rotation angle divided by 2, and u(x'',y'',z'') is the unit vector representing the axis of rotation.


Actually, you''re mistaken. Mathematically speaking, a quaternion is simply a pair of four numbers [w, x, y, z] that represent a point in R^4, along with a special element [1, 0, 0, 0] that acts as the identity quaternion, certain set of rules that tell you how you multiply two of these things to get another one, and another set of rules telling how you can find the inverse element for any given element (which is to say, they form a group). Computer graphics (and by and large, physics) is interested in the subgroup of unit quaternions, that is, the ones for which w^2+x^2+y^2+z^2 = 1; that''s because this subgroup (you can convince yourself that multiplying two unit quaternions leads to another unit quaternion if you try hard enough, so it _is_ a subgroup) corresponds to the set of all possible rotations in three-space (aside from the minor detail that q and -q represent the same rotation).

The ''representation'' you defined is actually a mapping; it''s how you get from one representation of this rotation group (by axis and angle) to another (by quaternions). Certainly if your initial data consists of an axis and a rotation angle then you have to do a little bit of trigonometry to convert it to quaternion representation. But if your data isn''t in that form then it''s quite possible to use quaternions without ever having to keep any data in axis-angle formula. I''ve got an application that does exactly this; it simulates a randomly tumbling object by slerping between randomly generated rotations. Note that this involves more than just spinning the object around one axis, then spinning it around another one, then... etc etc; the object legitimately tumbles, twisting around several axes at the same time as it interpolates orientations. This is an effect that''s difficult to get (accurately) with axis/angle representation, nearly impossible to get with rotation matrices as the ''base'' representation, and easy to get with quaternions.
This is the anonymous poster who initially refered to ''gimbal lock''.

Whoever shot me down is quite right about the gimbal lock with reference to Euler angles, and in fact I should have pointed out that I was talking about the programming of character animation and implicitly refering to the various methods involved.

Yes, while Mrs Grubers article does not mention Euler angles, the title of her article does question the use of quarternions and that is what I meant to answer.

So...use them for character animation and save yourself a lot of time, after all why would you use anything else? Have you ever tried fitting 1000s of frames of rotational animation into a PS?


Anonymous:

Yeah, it would appear their smaller size has many advantages, from optimizing how much data is sent over a network, to conserving memory (especially on systems where this is a commodity) and diskspace.

- Jesse Towner

This topic is closed to new replies.

Advertisement