The approach uses a Proportional-Integral-Derivative (PID) control loop on the angle of the body to apply torque to it in a well-controlled, predictable manner. Two examples of the approach are shown in the video: one for a missile that can only move in the direction it is facing, and another for an "Entity", like a game character, that moves independently of its facing direction.

The video below shows this in action. You don't have to watch the whole video to get the idea, but there is a lot in there to see...

Whether your game is in 2-D or 3-D, you often have the need to make an object "turn" to face another direction. This could be a character's walking direction as they are moving, the direction they are shooting while sitting crouched, the direction a missile is flying in, the direction a car is racing towards, etc. This is the job of the "character controller", the piece of code in your system responsible for the basic "movement" operations that a character must undergo (seek, turn, arrive, etc.).

Building games that use physics engines is a lot of fun and adds a level of realism to the game that can dramatically improve the gameplay experience. Objects collide, break, spin, bounce, and move in more realistic ways.

Moving in more realistic ways, however, also means the objects become harder to steer directly.

Most important of all, this is a physics engine. You should be using it as such to make bodies move as expected.

Our goal is to create a solution to change the facing direction of the body by applying turning torque to it.

If we decouple the problem of "how it turns" from "how it moves", we can use the same turning solution for other types of moving bodies where the facing direction needs to be controlled. For this article, we are considering a missile that is moving towards its target.

Here, the missile is moving in a direction and has a given velocity. The angle of the velocity is measured relative to the x-axis. The "facing direction" of the missile is directly down the nose and the missile can only move forward. We want to turn it so that it is facing towards the target, which is at a different angle. For the missile to hit the target, it has to be aiming at it. Note that if we are talking about an object that is not moving, we can just as easily use the angle of the body relative to the x-axis as the angle of interest.
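One practical detail: the difference between two angles should be wrapped into \([-\pi, \pi]\) so the body always turns the short way around. Here is a minimal standalone sketch of such a helper (the article's code base does this with a `MathUtilities::AdjustAngle` function; this version is my assumption of its behavior, not the original):

```cpp
#include <cassert>
#include <cmath>

// Wrap an angle in radians into [-pi, pi) so an angular error always
// represents the shortest turn direction.
double WrapAngle(double rads)
{
    const double kPi = 3.14159265358979323846;
    rads = std::fmod(rads + kPi, 2.0 * kPi); // now in (-2*pi, 2*pi)
    if (rads < 0.0)
        rads += 2.0 * kPi;                   // now in [0, 2*pi)
    return rads - kPi;                       // shift into [-pi, pi)
}
```

Without this wrap, an error of, say, 350 degrees would make the controller spin the long way around instead of turning 10 degrees the other way.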

The basic idea behind a control system is to take the difference of "what you want the value to be" and "what the value is" and adjust your input to the system so that, over time, the system converges to your desired value.

From this Wikipedia article:

A familiar example of a control loop is the action taken when adjusting hot and cold faucets (valves) to maintain the water at a desired temperature. This typically involves the mixing of two process streams, the hot and cold water. The person touches the water to sense or measure its temperature. Based on this feedback they perform a control action to adjust the hot and cold water valves until the process temperature stabilizes at the desired value.

There is a huge body of knowledge in controls system theory. Polynomials, poles, zeros, time domain, frequency domain, state space, etc. It can seem daunting to the uninitiated.

Let's start with the basic variable we want to "control", the difference between the angle we want to be facing and the angle of the body/velocity:

\(e(t) = desired - actual\)

Here, \(e(t)\) is the "error". We want to drive the error to 0. We apply torque to the body to turn it in the direction that drives \(e(t)\) towards 0. To do this, we create a function \(f(.)\), feed \(e(t)\) into it, and apply torque to the body based on the result. Torque makes bodies turn:

\(torque(t) = I * f(e(t)), I \equiv Angular Inertia \)

The first and most obvious choice is to apply a torque that is proportional to \(e(t)\) itself. When the error is large, a large torque is applied. When the error is small, a small torque is applied. Something like:

\(f(e(t)) = K_p * e(t)\)

And applying the torque with just proportional feedback will work. Somewhat. The problem is that when the error is small, the corrective torque is small as well. So as the body turns and \(e(t)\) gets small (nearing the intended angle), the retarding torque is also small, the body overshoots, and it swings past the intended angle. Then it starts to swing back and oscillates around the target. If the value \(K_p\) is not too large, the oscillation settles into a "damped" (exponentially decaying) sinusoid, losing a large part of its amplitude each cycle (stable solution). It may also spiral off towards infinity (unstable solution), or just oscillate forever around the target point (marginally stable solution). If you reduce \(K_p\) so that it is not moving so fast, then when \(e(t)\) is large, you don't have a lot of driving force to get moving.
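A minimal standalone simulation (not from the article's code base) makes the overshoot concrete: a unit-inertia body at rest, driven by proportional torque alone, swings well past its target and keeps oscillating.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Simulate P-only torque control of a unit-inertia body that starts at
// rest at angle 0, and return the largest angle reached. With no damping
// term, the body always swings past the target.
double MaxAngleUnderPControl(double target, double Kp, double dt, int steps)
{
    double angle = 0.0, omega = 0.0, maxAngle = 0.0;
    for (int i = 0; i < steps; ++i) {
        double torque = Kp * (target - angle); // proportional feedback only
        omega += torque * dt;                  // I = 1, so alpha = torque
        angle += omega * dt;
        maxAngle = std::max(maxAngle, angle);
    }
    return maxAngle;
}
```

With \(K_p = 4\) and a target of 1 radian, the body peaks near 2 radians before swinging back, exactly the undamped oscillation described above.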

A pure proportional controller also has a bias (Steady State Error) that keeps the final output different from the input. The error is a function of \(K_p\) of the form:

\(Steady State Error = \frac{desired}{1 + constant * K_p} \)

So increasing the \(K_p\) value makes the bias smaller (good). But this will also make it oscillate more (bad).

The next term to add is the integral term, the "I" in PID:

\(f(e(t)) = K_p * e(t) + \int\limits_{-\infty}^{now} K_i * e(t) dt \)

For each time step, if \(e(t)\) has a constant value, the integral term will work to counter it:

- If direction to the target suddenly changes a small amount, then over each time step, this difference will build up and create turning torque.
- If there is a bias in the direction (e.g. Steady State Error), this will accumulate over the time steps and be countered.

We don't have to calculate the actual integral. We probably don't want to anyway since it stretches back to \(-\infty\) and an error back in the far past should have little effect on our near term decisions.

We can estimate the integral over the short term by summing the value of \(e(t)\) over the last several cycles and multiplying by the time step (Euler Integration) or some other numerical technique. In the code base, the Composite Simpson's Rule technique was used.
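As a standalone illustration of that estimate (not the article's class itself), composite Simpson's rule over equally spaced samples looks like this; note it assumes an odd sample count, i.e. an even number of intervals:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Composite Simpson's rule over equally spaced samples e[0..n-1],
// spaced dt apart. Requires an odd number of samples.
double SimpsonIntegral(const std::vector<double>& e, double dt)
{
    const std::size_t n = e.size();
    double sum = e[0] + e[n - 1];
    for (std::size_t i = 1; i < n - 1; i += 2) sum += 4.0 * e[i]; // odd indices
    for (std::size_t i = 2; i < n - 1; i += 2) sum += 2.0 * e[i]; // even indices
    return sum * dt / 3.0;                                        // scale by h/3
}
```

Simpson's rule is exact for cubics, so integrating samples of \(x^2\) on \([0, 1]\) returns exactly \(1/3\).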

Most PID controllers stop at the "PI" version. The proportional part gets the output swinging towards the input and the integral part knocks out the bias or any steady external forces that might be countering the proportional control. Adding the derivative term, the "D" in PID, gives the full controller:

\(f(e(t)) = K_p * e(t) + \int\limits_{-\infty}^{now} K_i * e(t) dt + K_d * \frac{de(t)}{dt}\)

Consider what happens when \(e(t)\) is oscillating. Its behavior is like a sine function. The derivative of this is a cosine function, whose magnitude is largest when \(e(t)\) crosses zero. That is to say, the derivative is largest when \(e(t)\) is swinging through the position we want to achieve, so the derivative term applies its strongest counter-torque exactly where proportional control overshoots. Conversely, when the oscillation is at its edge, about to change direction, the rate of change switches from positive to negative (or vice versa), so the derivative passes through zero there.

Just like the integral, the derivative can be estimated numerically. This is done by taking differences over the last several \(e(t)\) values (see the code).
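Putting the three estimated terms together, a bare-bones standalone loop shows the full controller converging on a unit-inertia body (the gains here are illustrative, not tuned values from the article's demo):

```cpp
#include <cassert>
#include <cmath>

// Drive a unit-inertia body from angle 0 to `target` with a full PID
// loop (rectangle-rule integral, backward-difference derivative).
// Returns the absolute error remaining after `steps` iterations.
double FinalErrorUnderPid(double target, double Kp, double Ki, double Kd,
                          double dt, int steps)
{
    double angle = 0.0, omega = 0.0;
    double integral = 0.0, prevError = target;
    for (int i = 0; i < steps; ++i) {
        double error = target - angle;
        integral += error * dt;                  // I term accumulator
        double deriv = (error - prevError) / dt; // D term estimate
        prevError = error;
        double torque = Kp * error + Ki * integral + Kd * deriv;
        omega += torque * dt;                    // I = 1, so alpha = torque
        angle += omega * dt;
    }
    return std::fabs(target - angle);
}
```

Unlike the P-only case, the derivative term damps the oscillation and the integral term removes the steady-state bias, so the error shrinks towards zero.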

Because we are software-minded, whatever algorithm we use for a PID controller, we want to wrap it into a convenient package, give it a clean interface, and hide everything except what the user needs. The controller is "owned" by the entity that is doing the turning.


The interface itself is generic: the controller operates on any scalar error signal, not just angles, so the same class can be reused for other control problems.


```cpp
/********************************************************************
 * File   : PIDController.h
 * Project: Interpolator
 *
 ********************************************************************
 * Created on 10/13/13 By Nonlinear Ideas Inc.
 * Copyright (c) 2013 Nonlinear Ideas Inc. All rights reserved.
 ********************************************************************
 * This software is provided 'as-is', without any express or implied
 * warranty. In no event will the authors be held liable for any
 * damages arising from the use of this software.
 *
 * Permission is granted to anyone to use this software for any
 * purpose, including commercial applications, and to alter it and
 * redistribute it freely, subject to the following restrictions:
 *
 * 1. The origin of this software must not be misrepresented; you must
 *    not claim that you wrote the original software. If you use this
 *    software in a product, an acknowledgment in the product
 *    documentation would be appreciated but is not required.
 * 2. Altered source versions must be plainly marked as such, and
 *    must not be misrepresented as being the original software.
 * 3. This notice may not be removed or altered from any source
 *    distribution.
 */

#ifndef __Interpolator__PIDController__
#define __Interpolator__PIDController__

#include "CommonSTL.h"
#include "MathUtilities.h"

/* This class is used to model a Proportional-
 * Integral-Derivative (PID) Controller. This
 * is a mathematical/control system approach
 * to driving the state of a measured value
 * towards an expected value.
 */
class PIDController
{
private:
   double _dt;
   uint32 _maxHistory;
   double _kIntegral;
   double _kProportional;
   double _kDerivative;
   double _kPlant;
   vector<double> _errors;
   vector<double> _outputs;

   enum
   {
      MIN_SAMPLES = 3
   };

   /* Given two sample outputs and
    * the corresponding inputs, make
    * a linear prediction a time step
    * into the future.
    */
   double SingleStepPredictor(double x0, double y0,
                              double x1, double y1,
                              double dt) const
   {
      /* Given y0 = m*x0 + b
       *       y1 = m*x1 + b
       *
       * Solve for m, b
       *
       * => m = (y1-y0)/(x1-x0)
       *    b = y1-m*x1
       */
      assert(!MathUtilities::IsNearZero(x1-x0));
      double m = (y1-y0)/(x1-x0);
      double b = y1 - m*x1;
      double result = m*(x1 + dt) + b;
      return result;
   }

   /* This function is called whenever
    * a new input record is added.
    */
   void CalculateNextOutput()
   {
      if(_errors.size() < MIN_SAMPLES)
      {
         // We need a certain number of samples
         // before we can do ANYTHING at all.
         _outputs.push_back(0.0);
      }
      else
      {
         // Estimate each part.
         size_t errorSize = _errors.size();

         // Proportional
         double prop = _kProportional * _errors[errorSize-1];

         // Integral - Use Extended Simpson's Rule
         double integral = 0;
         for(uint32 idx = 1; idx < errorSize-1; idx += 2)
         {
            integral += 4*_errors[idx];
         }
         for(uint32 idx = 2; idx < errorSize-1; idx += 2)
         {
            integral += 2*_errors[idx];
         }
         integral += _errors[0];
         integral += _errors[errorSize-1];
         // Simpson's rule scales the weighted sum by h/3, where
         // h is the sample spacing (_dt).
         integral *= _dt/3;
         integral *= _kIntegral;

         // Derivative
         double deriv = _kDerivative * (_errors[errorSize-1]-_errors[errorSize-2]) / _dt;

         // Total P+I+D
         double result = _kPlant * (prop + integral + deriv);
         _outputs.push_back(result);
      }
   }

public:
   void ResetHistory()
   {
      _errors.clear();
      _outputs.clear();
   }

   void ResetConstants()
   {
      _kIntegral = 0.0;
      _kDerivative = 0.0;
      _kProportional = 0.0;
      _kPlant = 1.0;
   }

   PIDController() :
      _dt(1.0/100),
      _maxHistory(7)
   {
      ResetConstants();
      ResetHistory();
   }

   void SetKIntegral(double kIntegral) { _kIntegral = kIntegral; }
   double GetKIntegral() { return _kIntegral; }

   void SetKProportional(double kProportional) { _kProportional = kProportional; }
   double GetKProportional() { return _kProportional; }

   void SetKDerivative(double kDerivative) { _kDerivative = kDerivative; }
   double GetKDerivative() { return _kDerivative; }

   void SetKPlant(double kPlant) { _kPlant = kPlant; }
   double GetKPlant() { return _kPlant; }

   void SetTimeStep(double dt)
   {
      _dt = dt;
      assert(_dt > 100*numeric_limits<double>::epsilon());
   }
   double GetTimeStep() { return _dt; }

   void SetMaxHistory(uint32 maxHistory)
   {
      _maxHistory = maxHistory;
      assert(_maxHistory >= MIN_SAMPLES);
   }
   uint32 GetMaxHistory() { return _maxHistory; }

   void AddSample(double error)
   {
      _errors.push_back(error);
      while(_errors.size() > _maxHistory)
      {
         // If we got too big, remove the history.
         // NOTE: This is not terribly efficient. We
         // could keep all this in a fixed size array
         // and then do the math using the offset from
         // the beginning and modulo math. But this
         // gets complicated fast. KISS.
         _errors.erase(_errors.begin());
      }
      CalculateNextOutput();
   }

   double GetLastError()
   {
      size_t es = _errors.size();
      if(es == 0)
         return 0.0;
      return _errors[es-1];
   }

   double GetLastOutput()
   {
      size_t os = _outputs.size();
      if(os == 0)
         return 0.0;
      return _outputs[os-1];
   }

   virtual ~PIDController()
   {
   }
};

#endif /* defined(__Interpolator__PIDController__) */
```

This is a very simple class to use. You set it up by calling the setters for the time step and the gain constants, feed it error samples with AddSample(...), and read the latest control value back with GetLastOutput().

Looking at the code the missile uses to apply its turning torque:

```cpp
void ApplyTurnTorque()
{
   Vec2 toTarget = GetTargetPos() - GetBody()->GetPosition();

   float32 angleBodyRads = MathUtilities::AdjustAngle(GetBody()->GetAngle());
   if(GetBody()->GetLinearVelocity().LengthSquared() > 0)
   {  // Body is moving
      Vec2 vel = GetBody()->GetLinearVelocity();
      angleBodyRads = MathUtilities::AdjustAngle(atan2f(vel.y,vel.x));
   }
   float32 angleTargetRads = MathUtilities::AdjustAngle(atan2f(toTarget.y, toTarget.x));
   float32 angleError = MathUtilities::AdjustAngle(angleBodyRads - angleTargetRads);
   _turnController.AddSample(angleError);

   // Negative Feedback
   float32 angAcc = -_turnController.GetLastOutput();

   // This is as much turn acceleration as this
   // "motor" can generate.
   if(angAcc > GetMaxAngularAcceleration())
      angAcc = GetMaxAngularAcceleration();
   if(angAcc < -GetMaxAngularAcceleration())
      angAcc = -GetMaxAngularAcceleration();

   float32 torque = angAcc * GetBody()->GetInertia();
   GetBody()->ApplyTorque(torque);
}
```

If you look carefully at the video, there is a distinct difference in the way path following works for the missile vs. the character.


I have also, quite deliberately, left out a bit of key information on how to tune the constants for the PID controller. There are numerous articles on Google for how to tune a PID control loop, and I have to leave something for you to do, after all.

You will also note the default values: a time step of 1/100 of a second and a history of 7 samples. Both can be changed through the setters to match your simulation rate.

The source code for this, written in cocos2d-x/C++, can be found on github here.

There are two cases for "shooting" at something:

- Non-Rotating Shooter
- Rotating Shooter

This article only covers the case of the non-rotating shooter. The second case is a bit more complicated. I've posted a second article "Throwing a Winning Pass" to cover that.

Your goal is to predict the future position of the target so that you can launch something at it and have them collide. It is given that you know the following:

- The position of the shooter when the projectile will be launched: \(\vec{P_s}\)
- The position of the target when the shooter will launch the projectile (i.e. now): \(\vec{P_T^0}\)
- The speed at which your projectiles travel: \(S_b\)
- The velocity of the target, \(\vec{v_T}\)

What you would like to find is \(\vec{P_T^1}\), the final position where the projectile will intercept the target.

We are going to have to assume that the target will travel at a constant velocity while the projectile is moving. Without this, the projectile would have to adapt after it has been launched (because it cannot predict the future position of the target if it is changing velocity in an unpredictable way). It should be pointed out that this assumption is not very limiting; in practice it still works out pretty well as long as the "edge" cases are handled well (target right in front of you or headed straight at you).

Consider the image below:

This image shows the situation for the shooter. The target is moving with some constant velocity and the intercept position for the projectile, when launched, is unknown until the calculation is done. When the projectile is launched, it will intercept the target after \(t_B\) seconds. The projectile is launched with a constant speed. We don't know its direction yet, but we know the magnitude of its velocity, its speed will be \(S_b\).

If the target is at position \(\vec{P_T^0}\) now, it will be at position \(\vec{P_T^1}\), given by:

(1) \(\vec{P_T^1} = \vec{P_T^0} + \vec{v_T} * t_B\)

Since the projectile will have traveled for the same amount of time, it will have moved from \(\vec{P_s}\) to \(\vec{P_T^1}\) as well. In that time, it will have moved a distance of \(S_b * t_B\). Since we are talking about vector quantities here, we can write this as:

\(\mid\vec{P_T^1}-\vec{P_s}\mid = S_b * t_B\)

If we square both sides and break it into components to get rid of the absolute value:

(2) \((P_{Tx}^1 - P_{Sx})^2 +(P_{Ty}^1 - P_{Sy})^2 = S_b^2 * t_B^2\)

Breaking (1) into components as well and substituting back into (2) for the value of \(P_{Tx}^1\) and \(P_{Ty}^1\), we get the following:

\((P_{Tx}^0 - P_{Sx} + v_{Tx}t_B)^2 + (P_{Ty}^0 - P_{Sy} + v_{Ty}t_B)^2 = S_b^2 * t_B^2\)

For the sake of simplicity, we are going to define:

\(\vec{R} = \vec{P_T^0} - \vec{P_s} \) (this is a constant)

After some algebra, this gives us the **final** equation:

\(t_B^2(v_{Tx}^2 + v_{Ty}^2-S_b^2) + t_B(2*R_x*v_{Tx} + 2*R_y*v_{Ty}) + (R_x^2 + R_y^2) = 0\)

This is a quadratic in \(t_B\):

\(t_B = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}\)

where:

\( a = v_{Tx}^2 + v_{Ty}^2-S_b^2\)

\( b =2(R_x*v_{Tx} + R_y*v_{Ty})\)

\( c = R_x^2 + R_y^2\)

You can test the discriminant, \(b^2-4ac\):

- \(b^2-4ac < 0 \Rightarrow\) No solution.
- \(b^2-4ac = 0 \Rightarrow\) One solution.
- \(b^2-4ac > 0 \Rightarrow\) Two solutions; pick the lowest positive value of \(t_B\).

Once you have solved the quadratic for \(t_B\), you can then substitute it back into (1) and calculate the intercept position, \(\vec{P_T^1}\).
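The quadratic step can be sketched on its own (a standalone illustration with names of my choosing, not the article's code). For example, a shooter at the origin, a target 10 units away moving perpendicular at 5 units/s, and a projectile speed of 10 units/s gives an intercept time of \(2/\sqrt{3} \approx 1.155\) seconds:

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Solve the quadratic above for the intercept time tB. Returns a
// negative value when no forward-in-time intercept exists. R is the
// (constant) vector from shooter to target.
double InterceptTime(double Rx, double Ry, double vTx, double vTy, double Sb)
{
    double a = vTx*vTx + vTy*vTy - Sb*Sb;
    double b = 2.0*(Rx*vTx + Ry*vTy);
    double c = Rx*Rx + Ry*Ry;
    if (std::fabs(a) < 1e-12)                  // speeds match: linear equation
        return (std::fabs(b) < 1e-12) ? -1.0 : -c/b;
    double disc = b*b - 4.0*a*c;
    if (disc < 0.0) return -1.0;               // complex roots: no intercept
    double q = std::sqrt(disc);
    double t1 = (-b + q)/(2.0*a);
    double t2 = (-b - q)/(2.0*a);
    if (t1 > t2) std::swap(t1, t2);            // order so t1 <= t2
    return (t1 >= 0.0) ? t1 : t2;              // smallest non-negative root,
                                               // or negative if none exists
}
```

A target running directly away faster than the projectile yields only negative roots, which the caller should treat as "no shot".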

Putting this together and covering some edge cases:

```cpp
/* Calculate the future position of a moving target so that
 * a projectile launched immediately can intercept (collide)
 * with it.
 *
 * Some situations where this might be useful for an AI to
 * make this calculation:
 *
 * 1. Shooting a projectile at a moving target.
 * 2. Launching a football or soccer ball to a player.
 * 3. Figuring out the best position to jump towards in
 *    a platform game.
 *
 * The output value, solution, is the position that the
 * intercept will occur at and the location that the
 * projectile should be launched towards.
 *
 * The function will return false if a solution cannot
 * be found. Consider the case of a target moving away
 * from the shooter faster than the speed of the
 * projectile and you will see at least one case where
 * this calculation may fail.
 */
bool CalculateInterceptShotPosition(const Vec2& pShooter,
                                    const Vec2& pTarget0,
                                    const Vec2& vTarget,
                                    float64 sProjectile,
                                    Vec2& solution)
{
   // This formulation uses the quadratic equation to solve
   // the intercept position.
   Vec2 R = pTarget0 - pShooter;
   float64 a = vTarget.x*vTarget.x + vTarget.y*vTarget.y - sProjectile*sProjectile;
   float64 b = 2*(R.x*vTarget.x + R.y*vTarget.y);
   float64 c = R.x*R.x + R.y*R.y;
   float64 tBullet = 0;

   // If the target and the shooter have already collided, don't bother.
   if(R.LengthSquared() < 2*DBL_MIN)
   {
      return false;
   }

   // If the squared velocity of the target and the bullet are the same, the equation
   // collapses to tBullet*b = -c. If they are REALLY close to each other (float tol),
   // you could get some weirdness here. Do some "is it close" checking?
   if(fabs(a) < 2*DBL_MIN)
   {
      // If the b value is 0, we can't get a solution.
      if(fabs(b) < 2*DBL_MIN)
      {
         return false;
      }
      tBullet = -c/b;
   }
   else
   {
      // Calculate the discriminant to figure out how many solutions there are.
      float64 discriminant = b*b - 4*a*c;
      if(discriminant < 0)
      {
         // All solutions are complex.
         return false;
      }
      if(discriminant > 0)
      {
         // Two solutions. Pick the smaller one.
         // Calculate the quadratic.
         float64 quad = sqrt(discriminant);
         float64 tBullet1 = (-b + quad)/(2*a);
         float64 tBullet2 = (-b - quad)/(2*a);
         if((tBullet1 < 0) && (tBullet2 < 0))
         {
            // This would be really odd.
            // Both times are negative.
            return false;
         }
         else if(tBullet2 < 0 && tBullet1 >= 0)
         {
            // One negative, one positive.
            tBullet = tBullet1;
         }
         else if(tBullet1 < 0 && tBullet2 >= 0)
         {
            // One negative, one positive.
            tBullet = tBullet2;
         }
         else if(tBullet1 < tBullet2)
         {
            // First less than second.
            tBullet = tBullet1;
         }
         else
         {
            // Only choice left.
            tBullet = tBullet2;
         }
      }
      else
      {
         tBullet = -b/(2*a);
      }
   }

   // If the time is negative, we can't get there from here.
   if(tBullet < 0)
   {
      return false;
   }

   // Calculate the intercept position.
   solution = pTarget0 + tBullet*vTarget;
   return true;
}
```

I have posted a working solution, with a simulation of using the above function and which you can tinker with, on github.

It is assumed that the reader has insight into the theory behind inertia tensors and/or has an existing physics system that can make use of such a matrix. Knowledge of multivariate calculus is also assumed.

Firstly, what is meant by a capsule? Simply put, it is a cylinder that, instead of flat ends, has hemispherical ends. By cylinder I mean a circular one (not elliptical), and both hemispheres have the same radius. It should also be mentioned that the body is solid (not hollow).


**Figure 2.** Capsule decomposition.

The approach that we will be using to calculate the inertia tensor can be seen from Figure 2. The body will be decomposed into three parts: the upper hemisphere, the lower hemisphere, and the cylinder. Inertia tensors will be calculated for each of them and then summed to give the complete capsule tensor. The well-known equations for the moments and products of inertia that make up a general inertia tensor are given below in Cartesian coordinates.

**(eq. 1)** \(I_x = \int_V (y^2+z^2)\rho \, dV,\quad I_y = \int_V (x^2+z^2)\rho \, dV,\quad I_z = \int_V (x^2+y^2)\rho \, dV\)

**(eq. 2)** \(\mathbf{I} = \left[ \begin{matrix} I_x & 0 & 0 \\ 0 & I_y & 0 \\ 0 & 0 & I_z \\ \end{matrix} \right]\)

Note that in eq. 2 the products of inertia are equal to zero. There is a theorem that says it is possible to find three mutually perpendicular axes for every body such that the tensor produced from those axes has zero products of inertia. Those axes are called the principal axes. Furthermore, for bodies with constant density, an axis of rotational symmetry is a principal axis. We have three perpendicular axes of rotational symmetry for our capsule: the y axis of continuous rotational symmetry, and the x and z axes of discrete rotational symmetry. I will not go into detail about what axes of rotational symmetry are, but I recommend reading up on them.

**(eq. 3)** \(I = \int_V r^2 \rho \, dV = \int_V r^2 \, dV \quad (\rho = 1)\), where \(r\) is the distance from the axis of rotation.

Since we will integrate over space, eq. 3 will be used with a density of 1. If some other constant density is required, the end result can simply be multiplied by that constant.

Before we dive into integration, here is the Cartesian coordinate system we will be using for the cylinder:

**Figure 3.** Cylinder and its position relative to coordinate system.

Let's start with the easiest moment of inertia to compute, the one around the y axis. As previously mentioned, the integrand is just the squared distance from the axis. With that in mind, we can leave the Cartesian coordinate system and use the cylindrical one; the y axis remains unchanged. Also note that solutions will be written immediately, without showing the intermediate steps.

**(eq. 4)** \(I_y = \int_{-H/2}^{H/2} \int_0^R \int_0^{2\pi} r^3 \, d\theta \, dr \, dy = \frac{\pi H R^4}{2}\)

First we integrate around the full \(2\pi\) circle to get the moment of inertia of a circular curve around the y axis. After that we integrate those curves over the interval \([0, R]\) to get the moment of inertia of a disc perpendicular to the y axis. Lastly, those discs are added up across the \([-H/2, H/2]\) interval to get the moment of inertia of the whole cylinder. The integral for the mass of the cylinder is very similar and follows the same intuition. It is given by:

**(eq. 5)** \(m_c = \int_{-H/2}^{H/2} \int_0^R \int_0^{2\pi} r \, d\theta \, dr \, dy = \pi R^2 H\)

Now, slightly more complicated integrals are the ones for the moments of inertia around the x and z axes. Those are equal, so only one calculation is needed. Note that we are back in Cartesian coordinates for this one and the integral is set up like we are calculating \(I_x\):

**(eq. 6)** \(I_x = I_z = \frac{\pi H R^4}{4} + \frac{\pi H^3 R^2}{12}\)

That is it for the cylinder; I will not go into detail about the intuition for this integral since there is more work to be done for the hemisphere caps.

The following is how the hemisphere is defined relative to the Cartesian coordinate system for the purpose of integration:

**Figure 4.** Hemisphere and its position relative to coordinate system.

The easiest moment of inertia to calculate for the hemispheres is also the one around the y axis.

**(eq. 7)** \(I_y = \int_0^R \int_0^{\sqrt{R^2-y^2}} \int_0^{2\pi} r^3 \, d\theta \, dr \, dy = \frac{4 \pi R^5}{15}\)

We are in the cylindrical coordinate system and the intuition is similar to eq. 4. The only difference is that now the radius of each disc is a function of its y position, \(\sqrt{R^2 - y^2}\). Then we just integrate those discs from 0 to R and we end up with the whole hemisphere. As with the cylinder, the integral for the mass of a hemisphere follows the same intuition and is given by:

**(eq. 8)** \(m_{hs} = \int_0^R \int_0^{\sqrt{R^2-y^2}} \int_0^{2\pi} r \, d\theta \, dr \, dy = \frac{2 \pi R^3}{3}\)

Now, when we try to calculate the moments of inertia around the x and z axes (which are equal), things get tricky. Note that the integral is set up like we are calculating \(I_x\):

**(eq. 9)** \(I_x = I_z = \frac{4 \pi R^5}{15}\)

**Figure 5.** Center of mass of a hemisphere.

As can be seen in Figure 5, the center of mass is not at the origin of the coordinate system, but the axis for which we calculated the moment of inertia passes through the origin. In order to work with this moment of inertia (to apply Steiner's parallel-axis rule) we need it expressed about the center of mass. This is where we calculate the distance from the origin to the center of mass:

**(eq. 10)** \(\bar{y} = \frac{1}{m_{hs}} \int_V y \, dV = \frac{3R}{8}\)

**(eq. 11)** \(I_{x,cm} = I_x - m_{hs} \bar{y}^2 = \frac{83}{320} m_{hs} R^2\)

Moving this moment back out to the capsule's center, a distance \(H/2 + \bar{y}\) along the y axis, gives:

**(eq. 12)** \(I_x = I_z = I_{x,cm} + m_{hs} \left( \frac{H}{2} + \bar{y} \right)^2 = \frac{2}{5} m_{hs} R^2 + m_{hs} \left( \frac{H^2}{4} + \frac{3HR}{8} \right)\)

Moments of inertia for the hemisphere cap on the bottom of the capsule do not differ from those calculated here.

**(eq. 13)** \(m_c = \rho \pi H R^2,\quad m_{hs} = \frac{2}{3} \rho \pi R^3,\quad m = m_c + 2 m_{hs}\)

Equations in eq. 13 give us the separated masses of the cylinder and hemisphere as well as the mass of the whole body. Notice that the mass of the hemisphere appears twice in the total, once for each cap.

**(eq. 14)** \(I_y = \frac{1}{2} m_c R^2 + 2 \left( \frac{2}{5} m_{hs} R^2 \right),\quad I_x = I_z = m_c \left( \frac{R^2}{4} + \frac{H^2}{12} \right) + 2 \left( \frac{2}{5} m_{hs} R^2 + m_{hs} \left( \frac{H^2}{4} + \frac{3HR}{8} \right) \right)\)

Although the previously shown formulas for the moments of inertia are not given in terms of mass, substituting the masses from eq. 13 expresses them in that form.

The following code implementation is given in a C-like language and is fairly optimized.

```cpp
#define PI        3.141592654f
#define PI_TIMES2 6.283185307f

const float oneDiv3  = (float)(1.0 / 3.0);
const float oneDiv8  = (float)(1.0 / 8.0);
const float oneDiv12 = (float)(1.0 / 12.0);

void ComputeRigidBodyProperties_Capsule(float capsuleHeight, float capsuleRadius,
                                        float density, float &mass,
                                        float3 &centerOfMass, float3x3 &inertia)
{
   float cM;  // cylinder mass
   float hsM; // mass of hemispheres
   float rSq = capsuleRadius*capsuleRadius;

   cM  = PI*capsuleHeight*rSq*density;
   hsM = PI_TIMES2*oneDiv3*rSq*capsuleRadius*density;

   // from cylinder
   inertia._22 = rSq*cM*0.5f;
   inertia._11 = inertia._33 = inertia._22*0.5f + cM*capsuleHeight*capsuleHeight*oneDiv12;

   // from hemispheres
   float temp0 = hsM*2.0f*rSq / 5.0f;
   inertia._22 += temp0 * 2.0f;
   float temp1 = capsuleHeight*0.5f;
   float temp2 = temp0 + hsM*(temp1*temp1 + 3.0f*oneDiv8*capsuleHeight*capsuleRadius);
   inertia._11 += temp2 * 2.0f;
   inertia._33 += temp2 * 2.0f;
   inertia._12 = inertia._13 = inertia._21 = inertia._23 = inertia._31 = inertia._32 = 0.0f;

   mass = cM + hsM * 2.0f;
   centerOfMass = {0.0f, 0.0f, 0.0f};
}
```
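A quick sanity check on these formulas: with a cylinder height of zero, the capsule degenerates into a solid sphere, whose tensor should be the familiar \(\frac{2}{5} m R^2\) on every axis. The sketch below restates the same mass and inertia formulas with a minimal stand-in type (float3x3 here is a hypothetical struct, since the original's math types are not shown):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the float3x3 used above (only the fields
// the capsule formulas touch).
struct float3x3 { float _11, _12, _13, _21, _22, _23, _31, _32, _33; };

// The same mass/inertia formulas as ComputeRigidBodyProperties_Capsule,
// restated so the H = 0 degenerate case is easy to exercise.
void CapsuleInertia(float H, float R, float density, float &mass, float3x3 &I)
{
    const float pi = 3.14159265f;
    float rSq = R*R;
    float cM  = pi*H*rSq*density;               // cylinder mass
    float hsM = (2.0f/3.0f)*pi*rSq*R*density;   // one hemisphere's mass
    I = {};
    I._22 = 0.5f*rSq*cM;                        // cylinder about y
    I._11 = I._33 = 0.5f*I._22 + cM*H*H/12.0f;  // cylinder about x/z
    float t0 = 0.4f*hsM*rSq;                    // hemisphere about y: (2/5) m R^2
    I._22 += 2.0f*t0;
    float t2 = t0 + hsM*(0.25f*H*H + 0.375f*H*R); // Steiner shift to capsule center
    I._11 += 2.0f*t2;
    I._33 += 2.0f*t2;
    mass = cM + 2.0f*hsM;
}
```

With \(H = 0\) and \(R = 1\), the mass comes out to \(\frac{4}{3}\pi\) and all three diagonal moments agree at \(\frac{2}{5} m R^2\), as a solid sphere requires.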


Capsules defined in a way which was presented here are very commonly used in games and physical simulations, so this tensor should be of use to anyone building their own 3D game engine or similar software.

Although a right-handed Cartesian coordinate system was used in this article, you can also use the code in a left-handed system as long as the y axis is the longitudinal axis (the axis that goes through the caps).

The somewhat tedious processes of integration were skipped and only the end results were shown, mostly because they would take too much space. If you are curious whether those expressions are correct, you can try them in Wolfram Alpha or integrate them yourself.

Also, you can follow me on Twitter.

The forums are replete with people trying to learn how to efficiently find inverses of matrices or solve the classic \(Ax=b\) problem. Here's a decent method that is fairly easy to learn and implement. Hopefully it might also serve as a stepping stone to learning some of the more advanced matrix factorization methods, like Cholesky, QR, or SVD.

In 1948, Alan Turing came up with LU decomposition, a way to factor a matrix and solve \(Ax=b\) with numerical stability. Although there are many different schemes to factor matrices, LU decomposition is one of the more commonly-used algorithms. Interestingly enough, Gaussian elimination can be implemented as LU decomposition, and the computational effort expended is about the same as well.

So why would anyone want to use this method? First of all, the time-consuming elimination step can be formulated so that only the entries of \(A\) are required. As well, if there are several right-hand-side vectors \(b\) for which \(Ax=b\) needs to be solved, then we only need to do the decomposition once. This last benefit helps us compute the inverse of \(A\) very quickly.

Let's start with an \(Ax=b\) problem, where you have an \(n \times n\) matrix \(A\) and an \(n \times 1\) column vector \(b\), and you're trying to solve for an \(n \times 1\) column vector \(x\). The idea is that we break up \(A\) into two matrices: \(L\), which is a lower triangular matrix, and \(U\), which is an upper triangular matrix. It would look something like this:

\[
\left[ \begin{matrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33} \\
\end{matrix} \right] =
\left[ \begin{matrix}
l_{11} & 0 & 0 \\
l_{21} & l_{22} & 0 \\
l_{31} & l_{32} & l_{33} \\
\end{matrix} \right]
\left[ \begin{matrix}
u_{11} & u_{12} & u_{13} \\
0 & u_{22} & u_{23} \\
0 & 0 & u_{33} \\
\end{matrix} \right]
\]

This seems like a lot more work, but allows us to use some cool math tricks. Our original equation, substituted with our decomposition is now \((LU)x=b\):

\[
\left(
\left[ \begin{matrix}
l_{11} & 0 & 0 \\
l_{21} & l_{22} & 0 \\
l_{31} & l_{32} & l_{33} \\
\end{matrix} \right]
\left[ \begin{matrix}
u_{11} & u_{12} & u_{13} \\
0 & u_{22} & u_{23} \\
0 & 0 & u_{33} \\
\end{matrix} \right]
\right)
\left[ \begin{matrix}
x_1 \\ x_2 \\ x_3 \\
\end{matrix} \right] =
\left[ \begin{matrix}
b_1 \\ b_2 \\ b_3 \\
\end{matrix} \right]
\]

If we shift the parentheses, we get the following:

\[
\left[ \begin{matrix}
l_{11} & 0 & 0 \\
l_{21} & l_{22} & 0 \\
l_{31} & l_{32} & l_{33} \\
\end{matrix} \right]
\left(
\left[ \begin{matrix}
u_{11} & u_{12} & u_{13} \\
0 & u_{22} & u_{23} \\
0 & 0 & u_{33} \\
\end{matrix} \right]
\left[ \begin{matrix}
x_1 \\ x_2 \\ x_3 \\
\end{matrix} \right]
\right) =
\left[ \begin{matrix}
b_1 \\ b_2 \\ b_3 \\
\end{matrix} \right]
\]

Looking just inside the parentheses, we can see another \(Ax=b\) type problem. If we say that \(Ux=d\), where \(d\) is a different column vector from \(b\), we have 2 separate \(Ax=b\) type problems:

\[
\left[ \begin{matrix}
l_{11} & 0 & 0 \\
l_{21} & l_{22} & 0 \\
l_{31} & l_{32} & l_{33} \\
\end{matrix} \right]
\left[ \begin{matrix}
d_1 \\ d_2 \\ d_3 \\
\end{matrix} \right] =
\left[ \begin{matrix}
b_1 \\ b_2 \\ b_3 \\
\end{matrix} \right]
\\
\left[ \begin{matrix}
u_{11} & u_{12} & u_{13} \\
0 & u_{22} & u_{23} \\
0 & 0 & u_{33} \\
\end{matrix} \right]
\left[ \begin{matrix}
x_1 \\ x_2 \\ x_3 \\
\end{matrix} \right] =
\left[ \begin{matrix}
d_1 \\ d_2 \\ d_3 \\
\end{matrix} \right]
\]

It looks like we just created more work for ourselves, but we actually made it easier. If we look at the problems, the matrices on the left are in a form like row echelon form. Using forward and back substitutions, we can solve easily for all the elements of the \(x\) and \(d\) matrices. We first solve \(Ld=b\) for \(d\), then we substitute into \(Ux=d\) to solve for \(x\). The real trick to this method is the decomposition of \(A\).

There are a couple catches to this method:

- The matrix \(A\) must be square to use LU factorization. Other factorization schemes will be necessary if \(A\) is rectangular.
- We have to be sure that \(A\) is a nonsingular (i.e. invertible) matrix. If it can't be inverted, then the decomposition will produce an \(L\) or \(U\) that is singular and the method will fail because there is no unique solution.
- To avoid division by zero or by really small numbers, we have to implement a pivoting scheme just like with Gaussian elimination. This makes the problem take the form \(PA=LU\), where P is a permutation matrix that allows us to swap the rows of A. P is usually the identity matrix with rows swapped such that \(PA\) produces the \(A\) matrix with the same rows swapped as P. Then the \(Ax=b\) problem takes the form \(LUx=Pb\) since \(PA=LU\).
- Some of the entries in the \(L\) and \(U\) matrices must be known before the decomposition, or else the system has too many unknowns and not enough equations to solve for all the entries of both matrices. For what's formally known as Doolittle decomposition, the diagonal entries of the \(L\) matrix are all 1. If we use Crout decomposition, the diagonals of the \(U\) matrix are all 1.

For a square matrix \(A\) with entries \(a_{ij},\,i=1,\cdots,n,\,j=1,\cdots,n\), the Crout decomposition is as follows:

First:

\[

l_{i1} = a_{i1},\,\textrm{for}\,i=1,2,\cdots,n\\

u_{1j} = \frac{a_{1j}}{l_{11}},\,\textrm{for}\,j=2,3,\cdots,n

\]

For \(j=2,3,\cdots,n-1\):

\[

l_{ij} = a_{ij}-\sum_{k=1}^{j-1}l_{ik}u_{kj},\,\textrm{for}\,i=j,j+1,\cdots,n \\

u_{jk} = \frac{a_{jk}-\sum_{i=1}^{j-1}l_{ji}u_{ik}}{l_{jj}},\,\textrm{for}\,k=j+1,j+2,\cdots,n

\]

Finally:

\[

l_{nn}=a_{nn}-\sum_{k=1}^{n-1}l_{nk}u_{kn}

\]

That's it! Notice that \(A\) is traversed in a specific order to build \(L\) and \(U\). To start building \(L\), we begin at \(a_{11}\) and traverse down the column to \(a_{n1}\). To start building \(U\), we begin at \(a_{12}\) and traverse along the row to \(a_{1n}\). We then extend \(L\) by starting at \(a_{22}\) and traversing down the column to \(a_{n2}\), extend \(U\) by starting at \(a_{23}\) and traversing along the row to \(a_{2n}\), and so on.

Since the 1's on the diagonal of \(U\) are known in advance, as are the 0's in both the \(L\) and \(U\) matrices, there's no need to store them. In fact, there's a storage-saving scheme where all the computed entries of both matrices are packed into a single matrix the same size as \(A\).

From here, \(d\) can be solved very easily with forward-substitution:

\[

d_1 = \frac{b_1}{l_{11}} \\

d_i = \frac{b_i - \sum_{j=1}^{i-1}l_{ij}d_j}{l_{ii}},\,\textrm{for}\,i=2,3,\cdots,n

\]

As well, \(x\) can be solved very easily with back-substitution:

\[

x_n = d_n \\

x_i = d_i - \sum_{j=i+1}^n u_{ij}x_j,\,\textrm{for}\,i=n-1,n-2,\cdots,1

\]

Let's try to solve the following problem using LU decomposition:

\[

\left [ \begin{matrix}

3 & -0.1 & -0.2 \\

0.1 & 7 & -0.3 \\

0.3 & -0.2 & 10 \\

\end{matrix} \right ]

\left [ \begin{matrix}

x_1 \\ x_2 \\ x_3 \\

\end{matrix} \right ]

=

\left [ \begin{matrix}

7.85 \\ -19.3 \\ 71.4 \\

\end{matrix} \right ]

\]

First, we copy the first column of \(A\) into \(L\), and the first row of \(A\) (scaled by \(l_{11}\), excluding the first element) into \(U\):

\[

L =

\left [ \begin{matrix}

3 & 0 & 0 \\

0.1 & - & 0 \\

0.3 & - & - \\

\end{matrix} \right ],\,\,\,

U =

\left [ \begin{matrix}

1 & -0.0333 & -0.0667 \\

0 & 1 & - \\

0 & 0 & 1 \\

\end{matrix} \right ]

\]

Then we compute the following entries:

\[

l_{22} = a_{22} - l_{21}u_{12} = 7-(0.1)(-0.0333) = 7.00333 \\

l_{32} = a_{32} - l_{31}u_{12} = -0.2-(0.3)(-0.0333) = -0.19 \\

u_{23} = \frac{a_{23}-l_{21}u_{13}}{l_{22}} = \frac{-0.3-(0.1)(-0.0667)}{7.00333} = -0.0419 \\

l_{33} = a_{33}-l_{31}u_{13}-l_{32}u_{23} = 10-(0.3)(-0.0667)-(-0.19)(-0.0419) = 10.012 \\

\]

Inserting them into our matrices:

\[

L =

\left [ \begin{matrix}

3 & 0 & 0 \\

0.1 & 7.00333 & 0 \\

0.3 & -0.19 & 10.012 \\

\end{matrix} \right ],\,\,\,

U =

\left [ \begin{matrix}

1 & -0.0333 & -0.0667 \\

0 & 1 & -0.0419 \\

0 & 0 & 1 \\

\end{matrix} \right ]

\]

This is the LU factorization of the matrix; you can verify that \(LU=A\). Now we use forward- and back-substitution to solve the problem:

\[

d_1 = \frac{b_1}{l_{11}} = 7.85/3 = 2.6167 \\

d_2 = \frac{b_2-l_{21}d_1}{l_{22}} = (-19.3-(0.1)(2.6167))/7.00333 = -2.7932 \\

d_3 = \frac{b_3-l_{31}d_1-l_{32}d_2}{l_{33}} = (71.4-(0.3)(2.6167)-(-0.19)(-2.7932))/10.012 = 7 \\

\]

\[

x_3 = d_3 = 7 \\

x_2 = d_2 - u_{23}x_3 = -2.7932-(-0.0419)(7) = -2.5 \\

x_1 = d_1 - u_{12}x_2 - u_{13}x_3 = 2.6167-(-0.0333)(-2.5)-(-0.0667)(7) = 3 \\

\]

So the solution to the problem is:

\[

\left [ \begin{matrix}

x_1 \\ x_2 \\ x_3 \\

\end{matrix} \right ]

=

\left [ \begin{matrix}

3 \\ -2.5 \\ 7 \\

\end{matrix} \right ]

\]

This solution can easily be verified by plugging it back into \(Ax=b\).
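The entire procedure is short enough to sketch in Python. This is a minimal, pivot-free implementation of the Crout formulas and the substitution steps above, run against the worked example (illustrative only; it assumes \(A\) is square and nonsingular):

```python
def crout(A):
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[float(i == j) for j in range(n)] for i in range(n)]
    for i in range(n):                      # first column of L
        L[i][0] = A[i][0]
    for j in range(1, n):                   # scaled first row of U
        U[0][j] = A[0][j] / L[0][0]
    for j in range(1, n):
        for i in range(j, n):               # column j of L
            L[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(j))
        for k in range(j + 1, n):           # row j of U
            U[j][k] = (A[j][k] - sum(L[j][i] * U[i][k] for i in range(j))) / L[j][j]
    return L, U

def solve(L, U, b):
    n = len(b)
    d = [0.0] * n
    for i in range(n):                      # forward substitution: Ld = b
        d[i] = (b[i] - sum(L[i][j] * d[j] for j in range(i))) / L[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):          # back substitution: Ux = d
        x[i] = d[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))
    return x

A = [[3, -0.1, -0.2], [0.1, 7, -0.3], [0.3, -0.2, 10]]
b = [7.85, -19.3, 71.4]
L, U = crout(A)
x = solve(L, U, b)   # -> approximately [3, -2.5, 7]
```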

Here's some quick MATLAB code for LU decomposition:

function [L,U] = lucrout(A)
    [~,n] = size(A);
    L = zeros(n,n);
    U = eye(n,n);
    L(1,1) = A(1,1);
    for j=2:n
        L(j,1) = A(j,1);
        U(1,j) = A(1,j) / L(1,1);
    end
    for j=2:n-1
        for i=j:n
            L(i,j) = A(i,j);
            for k=1:j-1
                L(i,j) = L(i,j) - L(i,k)*U(k,j);
            end
        end
        for k=j+1:n
            U(j,k) = A(j,k);
            for i=1:j-1
                U(j,k) = U(j,k) - L(j,i)*U(i,k);
            end
            U(j,k) = U(j,k) / L(j,j);
        end
    end
    L(n,n) = A(n,n);
    for k=1:n-1
        L(n,n) = L(n,n) - L(n,k)*U(k,n);
    end
end

LU decomposition shines when solving a series of \(Ax=b\) problems with the same \(A\) matrix and different \(b\) vectors. This is advantageous for computing the inverse of \(A\) because only one decomposition is required. The inverse of a matrix \(A^{-1}\) is the matrix such that \(AA^{-1}=I\), where \(I\) is the identity matrix. For a general 3x3 matrix, this looks like:

\[

\left [ \begin{matrix}

A_{11} & A_{12} & A_{13} \\

A_{21} & A_{22} & A_{23} \\

A_{31} & A_{32} & A_{33} \\

\end{matrix} \right ]

\left [ \begin{matrix}

a_{11}^\prime & a_{12}^\prime & a_{13}^\prime \\

a_{21}^\prime & a_{22}^\prime & a_{23}^\prime \\

a_{31}^\prime & a_{32}^\prime & a_{33}^\prime \\

\end{matrix} \right ]

=

\left [ \begin{matrix}

1 & 0 & 0 \\

0 & 1 & 0 \\

0 & 0 & 1 \\

\end{matrix} \right ]

\]

This can be set up as \(n\) separate \(Ax=b\) problems, one for each column of the identity matrix; each solution \(x\) becomes the corresponding column of the inverse matrix. For example, the first problem is:

\[

\left [ \begin{matrix}

A_{11} & A_{12} & A_{13} \\

A_{21} & A_{22} & A_{23} \\

A_{31} & A_{32} & A_{33} \\

\end{matrix} \right ]

\left [ \begin{matrix}

a_{11}^\prime \\

a_{21}^\prime \\

a_{31}^\prime \\

\end{matrix} \right ]

=

\left [ \begin{matrix}

1 \\

0 \\

0 \\

\end{matrix} \right ]

\]

The second column of the inverse can be computed by changing \(b\) to \([0,1,0]^T\), the third column with \([0,0,1]^T\), and so on. This method is quick because only back- and forward-substitution is required to solve for the column vectors after the initial LU decomposition.
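As a sketch of this idea, here is the column-by-column inverse using SciPy's `lu_factor`/`lu_solve` (assuming SciPy is available); the factorization happens once, and each column requires only substitution:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[3.0, -0.1, -0.2],
              [0.1,  7.0, -0.3],
              [0.3, -0.2, 10.0]])

lu, piv = lu_factor(A)              # one LU decomposition
n = A.shape[0]
inv = np.empty((n, n))
for j in range(n):
    e = np.zeros(n)
    e[j] = 1.0                      # j-th column of the identity
    inv[:, j] = lu_solve((lu, piv), e)   # only substitution here

assert np.allclose(A @ inv, np.eye(n))
```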

There are a lot of other matrix factorization schemes besides LU, like Cholesky or QR factorization, but the general idea of decomposing a matrix into other matrices is roughly the same. The real key to computational savings comes from knowing beforehand what kind of matrix you're factoring and choosing the appropriate algorithm. For example, in structural finite element analysis, the matrix being decomposed is always symmetric positive definite. Cholesky decomposition is significantly more efficient than LU for those kinds of matrices, so it's preferred.

LU decomposition is a great tool for anyone working with matrices. Hopefully you can make use of this simple, yet powerful method.

Making a Game Engine: Core Design Principles

Moving objects around in a scene is not hard to do. You simply draw objects at different locations each frame. Adding rotation and resizing on top of that makes the code a little trickier, but still not hard to manage. But start trying to move objects relative to one another, such as putting a sword in a character's hand, and things can get really tricky. The sword should stay in the character's hand regardless of what direction they rotate or where they move. This is where using matrices to represent these transformations is extremely valuable.

In order for a coordinate to have any meaning, there needs to be a point of reference to base that coordinate off of. Imagine you have a dot on a piece of paper. The dot does not have a coordinate unless you come up with a way to measure its location. One way would be to determine the distance the point is from the left side of the page and its distance from the bottom. These two measurements give the coordinate meaning since it has a point of reference, the bottom left corner of the page. The point of reference is known as a basis, and any points measured relative to the same basis are said to be in the same coordinate system. If a worm were on a ball, you could devise a way to measure the location of the worm on the ball. This coordinate would be the worm's location in the ball's coordinate space.

The ball itself also has a location, but its location is measured in world coordinates. Let's assume that the location of the ball is measured from the bottom left corner of the image to the center of the ball. If you were to throw that ball, the worm would move with the ball. The position of the worm in world coordinates would move as the ball moves, even though its position in ball coordinates stays the same. Transformations give us a way to convert between these coordinate systems relatively easily using matrices, allowing us to keep the worm on the ball as it travels through the air.

Matrices have a lot of useful properties that lend themselves to solving many complex problems. We will be using them to transform objects on the screen. This article will focus on three simple transformations: translation, rotation, and scaling. I will be using 2D transformations for the sake of simplicity. The principles in the article also apply to 3D transformations. If you want to do 3D transformations, search around on the web for 3D transformation matrices. The jump from 2D to 3D is trivial with the exception of rotations. I am assuming you already know what matrices are and how to multiply them, what the inverse and transpose of a matrix are, and other basic matrix operations. (For help with matrices, try this video series)

So far this talk of coordinate spaces and matrices may seem overly complicated when all you need to do is move or rotate objects. How exactly do these things help move objects on the screen anyway? Well, imagine you define a character sprite with a width of 50 and a height of 100. You could define four points that represent the corners of the sprite image: (-25, 0), (-25, 100), (25, 100), (25, 0). These points represent the corners of the sprite relative to the character. So what happens when the character moves? The transform of the player converts the points of the sprite to world space. Then the camera transform converts from world space to screen space to be drawn. If the player clicks on the screen, you can use the inverse of these transforms to determine where on the character they clicked.

A translation simply changes an object's location. The matrix used to represent this is pretty simple too.

\[

T =

\left [ \begin{matrix}

1 & 0 & t_x \\

0 & 1 & t_y \\

0 & 0 & 1 \\

\end{matrix} \right ]

\]

This translates points by \((t_x, t_y)\).

The rotation matrix rotates points around the origin:

\[

R =

\left [ \begin{matrix}

\cos\theta & -\sin\theta & 0 \\

\sin\theta & \cos\theta & 0 \\

0 & 0 & 1 \\

\end{matrix} \right ]

\]

The above matrix will rotate counterclockwise by \(\theta\) radians.

Scaling an object changes its size. Scalar values in the range (0, 1) make objects smaller. A scale of 1 keeps it the same size, a scale greater than 1 makes it bigger. Really all the scale is doing is taking the current size and multiplying by the scale value.

\[

S =

\left [ \begin{matrix}

s_x & 0 & 0 \\

0 & s_y & 0 \\

0 & 0 & 1 \\

\end{matrix} \right ]

\]

This scales along the x and y directions by \(s_x\) and \(s_y\).

Now that we have a few different transformation matrices, we need a way to transform points using them. To do this, treat the point as a 3x1 column matrix, padding the extra slot with a 1, and multiply the transform matrix by it.

The result of that multiplication is another 3x1 matrix. Taking the top two elements of the resulting matrix gives the new x, y pair of the point in the new coordinate system.
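As a rough sketch, here is what those matrices and the point multiplication look like in Python (the function names are just for illustration):

```python
import math

def translate(tx, ty):
    # shifts points by (tx, ty)
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def rotate(theta):
    # rotates points counterclockwise by theta radians around the origin
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def scale(sx, sy):
    # scales points by sx along x and sy along y
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]

def transform_point(M, p):
    x, y = p
    col = [x, y, 1]   # pad to a 3x1 homogeneous column
    res = [sum(M[i][k] * col[k] for k in range(3)) for i in range(3)]
    return (res[0], res[1])   # take the top two elements

transform_point(translate(5, 2), (1, 1))   # -> (6, 3)
```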

Transforming a direction is slightly different than transforming a point. Take a look at the example below.

The ball has been squashed. This is achieved by scaling the y direction down while keeping the x the same. If directions were transformed using the same matrix that was used to transform points, you would get the red arrow as a result, but notice how the red arrow is no longer perpendicular to the ball. We want the green arrow, which still is. To get that result, use the following equation:

\[

\vec{d}^{\,\prime} = (M^{-1})^T \vec{d}

\]

As shown above, to get the proper matrix for transforming directions you need the inverse transpose of the transform matrix. You can avoid transposing a 3x3 matrix if you simply multiply with a row vector in front of the matrix. Also note that the third element is a 0 instead of a 1. This makes transforming a direction unaffected by any translations, which is the proper behavior: directions still point the same way regardless of where they are moved to.

The three basic transformations listed above aren't that useful on their own. To make things interesting, we want to be able to join them together, allowing objects to be translated, rotated, and scaled simultaneously. To do this, we simply multiply the matrices together.

The resulting matrix applies all of the individual transformations in a single multiplication, with the transform closest to the vector applied first.

One thing to note is that rotations pivot around the origin. This means if you move an object up ten units and then rotate it clockwise 90 degrees, it will end up 10 units to the right. If you want the object to pivot around its own center, rotate it first and then move it.

Since matrices can be multiplied to concatenate transforms you can easily keep an object at the same position relative to another, like the example of the worm on a ball described above. You first create a transform using the position and orientation of the worm on the ball. This transform represents the conversion of the worm's local coordinates to the ball coordinates. You then create the transform of the ball relative to world coordinates. Since the worm is on the ball, to find the transform from worm coordinates to world coordinates you simply multiply them together.

Matrix insectToWorldTransform = ballToWorldTransform * insectToBallTransform;

Before moving into the next section I should mention the difference between column vectors and row vectors. I have been using column vectors in this article. This means vectors are represented as a 3x1 matrix that is multiplied on the right of the transform.

The alternative is row vectors. A row vector is a 1x3 matrix and is multiplied on the left of the matrix.

Mathematically, these two methods are the same. Either one works. Some prefer how compact column vectors are to write out on paper. Others find row vectors more intuitive. The important thing is to pick one and be consistent with it. However, the graphics package you are using may require matrices for row vectors or for column vectors. If you are using the other method, you will need to convert your matrices before passing them in. To do so, simply transpose the matrix. As an example, the translate matrix using column vectors is

\[

\left [ \begin{matrix}

1 & 0 & t_x \\

0 & 1 & t_y \\

0 & 0 & 1 \\

\end{matrix} \right ]

\]

The same matrix for row vectors is its transpose:

\[

\left [ \begin{matrix}

1 & 0 & 0 \\

0 & 1 & 0 \\

t_x & t_y & 1 \\

\end{matrix} \right ]

\]

Also, keep in mind that when using row vectors the order to multiply matrices in is reversed.

So using row vectors, the leftmost matrix is applied to the point first and the matrix on the far right is applied last. An easy way to remember the difference is that the matrix closest to the vector is applied first. This means the order is left to right using row vectors and right to left using column vectors.
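A quick numerical check of the equivalence between the two conventions (hypothetical 2D homogeneous matrices, using NumPy):

```python
import numpy as np

T = np.array([[1, 0, 5], [0, 1, 2], [0, 0, 1]], float)   # translate by (5, 2)
R = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], float)  # rotate 90 deg CCW

p_col = np.array([1.0, 1.0, 1.0])   # column-vector convention
p_row = p_col.copy()                # row-vector convention

a = T @ R @ p_col                   # right-to-left: rotate, then translate
b = p_row @ R.T @ T.T               # left-to-right with transposed matrices

assert np.allclose(a, b)            # both conventions agree
```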

Now that we have covered some basics of how transforms work, let's build a transform class. Since we don't want to deal with matrices at a high level, we will allow the transform to be manipulated using a position, rotation, and scale. The transform class will also have children to allow for a hierarchy. When the parent transform moves, all of the children move with it, and each child's position, rotation, and scale are defined relative to its parent. Finally, the transform will extend the component class that is attached to a game object. Game objects and components are touched upon in this article.

class Transform extends Component {
    // the parent transform of this transform;
    // if it is null then the parent transform
    // is the world coordinate system
    private Transform parent;

    // all of the transforms that have this
    // transform set as their parent
    private Transform[] children;

    // the position relative to the parent transform
    private Vector2 localPosition = new Vector2(0.0f, 0.0f);

    // rotation relative to the parent
    private float localRotation = 0.0f;

    // scale relative to the parent
    private Vector2 localScale = new Vector2(1.0f, 1.0f);

    // specifies if the localToWorldMatrix
    // needs to be recalculated
    private bool isDirty = false;

    // the matrix that converts local coordinates
    // to world coordinates
    private Matrix localToWorldMatrix = Matrix.identity;

    // specifies if the worldToLocalMatrix
    // needs to be recalculated
    private bool isInverseDirty = false;

    // the matrix that converts world coordinates
    // to local coordinates
    private Matrix worldToLocalMatrix = Matrix.identity;

    /*
     * Whenever any change happens that changes the localToWorldMatrix,
     * this should be called. That way the next time localToWorldMatrix
     * is requested it will be recalculated.
     */
    private void setDirty() {
        // only update the dirty flags if they aren't already set
        if (!isDirty) {
            isDirty = true;
            isInverseDirty = true;

            // set all children to be dirty since any modification
            // of a parent transform also affects its children's
            // localToWorldMatrix
            foreach (Transform child in children) {
                child.setDirty();
            }
        }
    }

    // change the parent transform.
    // setting it to null makes the
    // transform a child of world coordinates
    public void setParent(Transform value) {
        // remove this from the previous parent
        if (parent != null) {
            parent.children.remove(this);
        }

        // assign new parent
        parent = value;

        // add this to the new parent
        if (parent != null) {
            parent.children.add(this);
        }

        // changing parents affects the world position
        setDirty();
    }

    public Transform getParent() {
        return parent;
    }

    // calculates the matrix that converts
    // from local coordinates to the coordinate space
    // of the parent transform
    public Matrix calculateLocalToParentMatrix() {
        // Matrix.translate creates a translation matrix
        // that shifts by (localPosition.x, localPosition.y).
        // Matrix.rotate rotates by localRotation radians.
        // Matrix.scale scales by a factor of (localScale.x, localScale.y).
        // These are the basic transforms described previously
        // in this article.
        return Matrix.translate(localPosition) *
               Matrix.rotate(localRotation) *
               Matrix.scale(localScale);
    }

    // gets the matrix that converts from local
    // coordinates to world coordinates
    public Matrix getLocalToWorldMatrix() {
        // if the dirty flag is set, the localToWorldMatrix
        // is out of date and needs to be recalculated
        if (isDirty) {
            if (parent == null) {
                // if the parent is null then the parent is
                // the world, so the localToWorldMatrix
                // is the same as the local to parent matrix
                localToWorldMatrix = calculateLocalToParentMatrix();
            } else {
                // if there is a parent, the localToWorldMatrix
                // is calculated recursively using the parent's
                // localToWorldMatrix concatenated with the
                // local to parent matrix
                localToWorldMatrix = parent.getLocalToWorldMatrix() *
                                     calculateLocalToParentMatrix();
            }

            // clear the dirty flag since the
            // matrix is now up to date
            isDirty = false;
        }

        return localToWorldMatrix;
    }

    public Matrix getWorldToLocalMatrix() {
        if (isInverseDirty) {
            // the inverse is out of date, so it needs to be updated.
            // the worldToLocalMatrix is the inverse of the localToWorldMatrix
            worldToLocalMatrix = getLocalToWorldMatrix().inverse();

            // clear the dirty flag since the
            // matrix is now up to date
            isInverseDirty = false;
        }

        return worldToLocalMatrix;
    }

    // transforms a point from local coordinates to world coordinates
    public Vector2 transformPoint(Vector2 point) {
        // matrix multiply, padding the extra element with a 1
        Matrix3x1 transformResult = getLocalToWorldMatrix() * Matrix3x1(point.x, point.y, 1);

        return new Vector2(transformResult[1,1], transformResult[2,1]);
    }

    // transforms a direction from local coordinates to world coordinates
    public Vector2 transformDirection(Vector2 direction) {
        // matrix multiply, padding the extra element with a 0.
        // notice that the worldToLocalMatrix is used here
        // and the direction is multiplied as a row matrix in front of
        // the transform matrix. This is the proper way to transform
        // directions, as described earlier in this article.
        Matrix1x3 transformResult = Matrix1x3(direction.x, direction.y, 0) * getWorldToLocalMatrix();

        return new Vector2(transformResult[1,1], transformResult[1,2]);
    }

    // transforms a point from world coordinates to local coordinates
    public Vector2 inverseTransformPoint(Vector2 point) {
        // same logic as transformPoint, only with the inverse matrix
        Matrix3x1 transformResult = getWorldToLocalMatrix() * Matrix3x1(point.x, point.y, 1);

        return new Vector2(transformResult[1,1], transformResult[2,1]);
    }

    // transforms a direction from world coordinates to local coordinates
    public Vector2 inverseTransformDirection(Vector2 direction) {
        // same logic as transformDirection, only with the inverse of the
        // inverse localToWorldMatrix, which is just the localToWorldMatrix
        Matrix1x3 transformResult = Matrix1x3(direction.x, direction.y, 0) * getLocalToWorldMatrix();

        return new Vector2(transformResult[1,1], transformResult[1,2]);
    }

    public Vector2 getLocalPosition() {
        return localPosition;
    }

    // sets the position relative to the parent
    // and marks the transform as dirty
    public void setLocalPosition(Vector2 value) {
        localPosition = value;

        // set the dirty flag since the localToWorldMatrix needs to be updated
        setDirty();
    }

    /* localRotation and localScale should also have getters and setters
     * like the local position does. Be sure to call setDirty in the
     * setters for each of them. */
}

There is a lot of code there, so let me explain the key points. First of all, I am using a dirty flag to indicate when the transform matrices are out of date. That allows the position, rotation, and scale to be changed multiple times while the matrix is only actually recalculated once it is requested, reducing the number of unneeded recalculations. Also note that the dirty flag is only set if the transform is not already dirty. This keeps the recursive setDirty call from repeatedly walking a subtree of children that has already been marked dirty.

The inverse matrix is also stored in the transform to keep from having to calculate it every time it is needed. It has its own dirty flag so it isn't recalculated unless it is actually used.

Now that we have a transform class, let's see how it can be used. First of all, a renderer component can use the transform to draw its game object in the right place on screen:

class Renderer extends Component {
    void render(graphics) {
        graphics.setTransformMatrix(this.gameObject.transform.getLocalToWorldMatrix());
        graphics.drawSprite(this.frame);
    }

    // whatever else the renderer does
    // would go here
    ...
}

Keep in mind the above code doesn't account for the view transform. Think of the view transform as the camera for the scene. If you want to be able to scroll the view of the scene, you should specify a camera object and multiply all transforms by the inverse of the camera's transform:

graphics.setTransformMatrix(
    camera.transform.getWorldToLocalMatrix() *
    gameObject.transform.getLocalToWorldMatrix());

The transform can also be used in game logic code. For example, here is a mine that detonates when the player comes within sensor range:

class Mine extends Behavior {
    void update(float deltaTime, InputState input) {
        // transforming the zero vector gets that
        // transform's origin in world coordinates
        Vector2 playerWorldPosition = player.transform.transformPoint(new Vector2(0.0, 0.0));

        // take the player's world position and convert it to the mine's
        // local coordinates
        Vector2 playerLocalPosition = this.transform.inverseTransformPoint(playerWorldPosition);

        // since playerLocalPosition is the player's position relative to the mine,
        // the magnitude of that position is the distance from the mine to the player
        if (playerLocalPosition.magnitudeSquared() < sensorRange * sensorRange) {
            this.detonate();
        }
    }
}

Transforms may take a while to fully grasp, but once understood they help simplify problems that would otherwise be a huge challenge to tackle. Hopefully this article helped you understand how to use them in your games.

For this article, I'm assuming you know all about vectors, points, dot and cross products. We'll cover some quick properties of polynomials, some things about some basic curves, and then go over intersection tests.

The Stone-Weierstrass theorem states that any continuous function defined on a closed interval can be uniformly approximated as closely as desired with a polynomial function. For graphics and game development, that means that most 2D objects we deal with can be expressed as polynomials. The key to these intersection test methods is using that to our advantage.

In order to find all our intersection points, we need to know how many intersections there can be between 2 polynomials. Bezout's theorem states that for a planar algebraic curve of degree \(m\) and another of degree \(n\), the number of intersection points, counted with multiplicity and including complex points and points at infinity, is exactly \(mn\).

Bezout's theorem also extends to surfaces. A surface of degree \(m\) and a curve of degree \(n\) intersect at exactly \(mn\) points, again counted with multiplicity.

The key here is to identify how to count intersections. For example, if 2 curves are tangent, that intersection counts twice. If they are tangent and also have the same curvature there, it counts 3 times. Simple intersections (not tangent and not self-intersecting) are counted once. So how do we count intersections at infinity? Using homogeneous coordinates, of course!

Counting intersections at infinity sounds hard, but we can use homogeneous coordinates to represent points at infinity with finite numbers. A 2D point \((x,y)\) is written as the homogeneous triple \((X,Y,W)\), where \(x = X/W\) and \(y = Y/W\).

This lets us define infinities with finite numbers. Let's say we have the point \((2,3,1)\) in homogeneous coordinates. If we change the weight coordinate \(W\) so that we have \((2,3,0.5)\), the 2D point becomes \((4,6)\). Smaller weights mean larger projected coordinates. As the weight approaches zero, the 2D point gets larger and larger, until at \(W = 0\) the projected point is at infinity. Thus, any homogeneous point \((X,Y,0)\) is a point at infinity.
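A tiny sketch of this projection (the `project` helper is just for illustration):

```python
def project(X, Y, W):
    # convert a homogeneous point to 2D; W == 0 is a point at infinity
    if W == 0:
        return None
    return (X / W, Y / W)

project(2, 3, 1)     # -> (2.0, 3.0)
project(2, 3, 0.5)   # -> (4.0, 6.0)
project(2, 3, 0)     # -> None (point at infinity)
```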

Usually, polynomials have terms that differ in their algebraic degrees. Some are quadratic, some cubic, some constant, etc. The polynomial takes the following form:

\[f(x,y) = \sum_{i+j\le n} a_{ij}x^iy^j = 0 \]

However, the same curve can be expressed in homogeneous form by adding a homogenizing variable \(W\):

\[f(X,Y,W) = \sum_{i+j+k = n} a_{ij}X^iY^jW^k = 0 \]

Here, every term in the polynomial has the same degree \(n\), since the exponents of each term sum to \(n\). Setting \(W=1\) recovers the original polynomial.

To use polynomials effectively, we need to be familiar with how they can be expressed. There are basically 3 types of equations that can be used to describe planar curves: parametric, implicit, and explicit. If you've only had high-school math, you've probably dealt with explicit curves a lot and not so much with the others.

Let's start with a very basic curve: the line. It's a degree-1 polynomial. There are many definitions of this kind of curve. Some are helpful for games and others...not so much. Some you may have seen in algebra class and others you may be seeing for the first time:

- Slope-intercept form: \(y = mx + b\)
- Point-slope form: \(y - y_0 = m(x - x_0)\)
- Parametric form: \(x = x_0 + ut,\; y = y_0 + vt\)
- Implicit form: \(ax + by + c = 0\)

In secondary schools, they use the first 2 methods almost exclusively, but these turn out to be the least helpful for computing. Here, I've tried to order the definitions from least helpful to most helpful for our needs.

Probably the most useful of these, the implicit form is nice for a few reasons. First, we can express it in homogeneous coordinates: \(aX+bY+cW=0\). From here, we can treat the coefficients as an ordered triple \((a,b,c)\) and rewrite the equation as a dot product, giving a simple test of whether a point lies on a given line in implicit form:

\[L(a,b,c) \cdot P(X,Y,W) = 0\]

bool pointOnLine = fabs(dot(L,P)) < epsilon

Although this isn't strictly an intersection test, the method is valuable enough to mention here. Points that lie on the line satisfy \(L(a,b,c) \cdot P(X,Y,W) = 0\), but if the point is not on the line, the sign of the scalar value tells us on which side of the line the point lies. If \(L(a,b,c) \cdot P(X,Y,W) > 0\), the point lies on one side of the line; if \(L(a,b,c) \cdot P(X,Y,W) < 0\), it lies on the other. This is a convenient way of dividing a plane into 2 separate regions.

This scalar value is also a scaled distance from the line. If only the relative position of a point from the closest point on a line is desired, then this scalar can suffice. However, if true distance is required, then the full signed distance formula for a point \((x,y)\) from an implicit line defined by the ordered triple \((a,b,c)\) is given by the following:

\[D = \frac{ax+by+c}{\sqrt{a^2+b^2}}\]
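A minimal sketch of the side test and the signed distance formula (helper names are illustrative):

```python
import math

def line_side(L, P):
    # sign of L . P tells which side of the line the homogeneous point is on
    a, b, c = L
    X, Y, W = P
    return a * X + b * Y + c * W

def signed_distance(L, x, y):
    # true signed distance from the 2D point (x, y) to the implicit line
    a, b, c = L
    return (a * x + b * y + c) / math.hypot(a, b)

# the line x = 2, i.e. 1*x + 0*y - 2 = 0
L = (1, 0, -2)
line_side(L, (5, 0, 1))    # > 0: one side of the line
line_side(L, (0, 0, 1))    # < 0: the other side
signed_distance(L, 5, 0)   # -> 3.0
```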

A parametric line through \((x_0,y_0)\) with direction \((u,v)\) can be written in implicit form with a consistent sense of direction:

\[\begin{aligned}A&=-v\\B&=u\\C&=vx_0-uy_0\\\end{aligned}\]

We want to use this notion of direction to apply to the signed distance formula so we can tell what side of the line we are on when looking in the direction of the line. Using this convention, we can say that if the sign of the dot product is greater than zero, it lies on the left side of the line. If the dot product is less than zero, it lies on the right side.

This ordered triple gives a convenient way to define coordinate axes. For example, the x-axis can be defined as \((0,1,0)\) and the y-axis as \((1,0,0)\). The ordered triple is also really nice because we can compute it really easily if we have 2 points. Let's go back to our vector math to see how this works.

In the above picture, two vectors \(\vec{a}\) and \(\vec{b}\) are crossed to produce a 3rd vector \(\vec{c} = \vec{a}\times\vec{b}\). This vector \(\vec{c}\) is orthogonal to the vectors \(\vec{a}\) and \(\vec{b}\), meaning that \(\vec{c}\cdot\vec{a} = 0\) and \(\vec{c}\cdot\vec{b} = 0\). This is important for this next neat trick.

Imagine we have points \(P_1\) and \(P_2\). To tie back into the example above, let \(\vec{a} = P_1 = (X_1,Y_1,W_1)\) and \(\vec{b} = P_2 = (X_2,Y_2,W_2)\). If we cross these vectors, we get a vector \(\vec{c} = \vec{a} \times\vec{b}\) for which \(\vec{c}\cdot\vec{a} = 0\) and \(\vec{c}\cdot\vec{b} = 0\). Since \(\vec{a}\) and \(\vec{b}\) are points, \(\vec{c} = (a,b,c)\) is the ordered triple of an implicit line that passes through both of them. We can check that the line \((a,b,c)\) passes through both points with the dot-product test.

\[L(a,b,c) = P(X_1,Y_1,W_1) \times P(X_2,Y_2,W_2)\]
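A quick numerical sketch of this trick using NumPy's cross product (the points here are arbitrary):

```python
import numpy as np

def line_through(P1, P2):
    # cross product of two homogeneous points gives the implicit line (a, b, c)
    return np.cross(P1, P2)

P1 = np.array([0.0, 1.0, 1.0])   # the 2D point (0, 1)
P2 = np.array([2.0, 3.0, 1.0])   # the 2D point (2, 3)
L = line_through(P1, P2)

# both points satisfy L . P = 0, so they lie on the line
assert abs(np.dot(L, P1)) < 1e-12
assert abs(np.dot(L, P2)) < 1e-12
```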

Using the same kind of logic, we can get the point at which 2 lines intersect. To use the cross product example again, let's let the vectors be the ordered triples of 2 implicit lines, \(\vec{a} = (a_1,b_1,c_1)\) and \(\vec{b} = (a_2,b_2,c_2)\). Again, crossing these vectors yields a vector \(\vec{c} = \vec{a} \times\vec{b}\) for which \(\vec{c}\cdot\vec{a} = 0\) and \(\vec{c}\cdot\vec{b} = 0\). Since \(\vec{a}\) and \(\vec{b}\) are lines, we could say that \(\vec{c} = (X,Y,W)\), the intersection point of both lines. This point has to be on both lines, and we can verify that using the dot-product method.

\[P(X,Y,W) = L(a_1,b_1,c_1) \times L(a_2,b_2,c_2)\]

Lines are degree-1 algebraic curves, so Bezout's theorem says 2 lines must intersect at exactly 1 point. So what happens with parallel lines? If we use homogeneous coordinates, the intersection point takes the form \((X,Y,0)\), a point at infinity, so Bezout's theorem still holds in that case.

Vector P = cross(new Vector(a1,b1,c1), new Vector(a2,b2,c2));

// check to see if the lines are parallel
if (P[2] != 0) {
    // transform back to (x,y) from (X,Y,W)
    x = P[0] / P[2];
    y = P[1] / P[2];
}

Sometimes it's advantageous to define some curves as parametric and some as implicit to solve for intersections. Most times, it's better to define the simpler curve as parametric and the more complex curve as implicit, if possible. This method solves for all algebraic intersections, which may or may not be "real" intersections. As well, multiple roots might come up, in which case we have to reconcile this with our intersection counting methods above.

Let's take a fairly common case: line-circle intersection. Bezout's theorem says we will get 2 intersections since a circle is a degree-2 algebraic curve. The parametric line at point \((x_0,y_0)\) with direction \((c,d)\) and implicit circle centered at \((a,b)\) with radius \(r\) are defined as follows:

\[ \begin{aligned} C(x,y) &= (x-a)^2+(y-b)^2-r^2 = 0 \\

L(x(t),y(t)) &= \begin{cases} x &= x_0+ct \\ y&=y_0+dt \end{cases} \end{aligned}\]

We can substitute the parametric equations into the implicit equation to get a polynomial in terms of the parameter \(t\). The roots of the polynomial are the parameter values at which the line intersects the circle. Substituting, we get:

\[

\begin{aligned}

0 &= (x_0+ct-a)^2+(y_0+dt-b)^2-r^2 \\

&= [c^2t^2+2c(x_0-a)t+(x_0-a)^2]+[d^2t^2+2d(y_0-b)t+(y_0-b)^2]-r^2 \\

&= (c^2+d^2)t^2+2[c(x_0-a)+d(y_0-b)]t + [(x_0-a)^2+(y_0-b)^2-r^2] \\

&= At^2+Bt+C \\

\end{aligned}

\]

This looks promising. The quadratic formula can solve really quickly for the roots of the polynomial:

\[t=\frac{-B\pm\sqrt{B^2-4AC}}{2A}\]

The result of this will either be 2 complex roots, 1 real root, or 2 real roots, depending on the discriminant \(B^2-4AC\). The 1 real root case means the line is tangent to the circle, which according to our intersection counting rules above, we count twice.

That might be nice mathematically, but what does it mean for us? The parameter \(t\) has to be a real number, so the real roots are the parameter values at which the line actually intersects the circle; plugging the roots back into the parametric line gives us the Cartesian points of intersection. What about the complex roots? Since \(t\) is restricted to real numbers, in the case of 2 complex roots we say the line doesn't intersect the circle.

A = c*c + d*d;
B = 2*(c*(x0-a) + d*(y0-b));
C = (x0-a)*(x0-a) + (y0-b)*(y0-b) - r*r;
disc = B*B - 4*A*C;

if (disc == 0)
{
    // Tangent: one double root.
    t = -B/(2*A);
    x1 = x0 + c*t;
    y1 = y0 + d*t;
    return [Point(x1,y1)];
}
else if (disc > 0)
{
    t1 = (-B + sqrt(disc))/(2*A);
    t2 = (-B - sqrt(disc))/(2*A);
    x1 = x0 + c*t1;
    y1 = y0 + d*t1;
    x2 = x0 + c*t2;
    y2 = y0 + d*t2;
    return [Point(x1,y1), Point(x2,y2)];
}
else
    return [];
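A hypothetical Python version of the same routine; the only liberty taken is comparing the discriminant against a small tolerance instead of exact zero, which is safer in floating point (the direction vector is assumed nonzero):

```python
import math

# Sketch of line-circle intersection: line through (x0, y0) with direction
# (c, d), circle centered at (a, b) with radius r. Returns 0, 1, or 2
# intersection points; the tangent (double-root) case is detected with a
# small tolerance on the discriminant.

def line_circle(x0, y0, c, d, a, b, r, eps=1e-12):
    A = c*c + d*d
    B = 2.0 * (c*(x0 - a) + d*(y0 - b))
    C = (x0 - a)**2 + (y0 - b)**2 - r*r
    disc = B*B - 4.0*A*C
    if disc < -eps:
        return []                        # two complex roots: no intersection
    if disc <= eps:
        t = -B / (2.0*A)                 # tangent: one double root
        return [(x0 + c*t, y0 + d*t)]
    s = math.sqrt(disc)
    pts = []
    for t in ((-B + s)/(2.0*A), (-B - s)/(2.0*A)):
        pts.append((x0 + c*t, y0 + d*t))
    return pts
```

For the unit circle at the origin, a horizontal line through (-2, 0) hits it at (-1, 0) and (1, 0), the line y = 1 is tangent at (0, 1), and the line y = 2 misses entirely.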

Line-plane intersection is a simpler case than the line-circle intersection, although it involves a surface and a curve. A plane is algebraically a degree-1 polynomial in implicit form, so according to Bezout's theorem, a line should intersect it at exactly 1 point. We take the plane in implicit form and the line in parametric form and apply our method as above:

\[

\begin{aligned}

P(x,y,z) &= ax+by+cz+d=0 \\

L(x(t),y(t),z(t)) &= \begin{cases} x &= x_0+ut \\ y&=y_0+vt \\ z &= z_0+wt \\ \end{cases} \end{aligned}\]

We substitute the parametric equations into the implicit form and solve for \(t\) as before:

\[

\begin{aligned}

P(x(t),y(t),z(t)) = 0 &= a(x_0+ut)+b(y_0+vt)+c(z_0+wt)+d \\

&= (ax_0+by_0+cz_0+d) + (au+bv+cw)t \\

t &= -\frac{ax_0+by_0+cz_0+d}{au+bv+cw} \\

&= -\frac{(a,b,c,d)\cdot(x_0,y_0,z_0,1)}{(a,b,c,d)\cdot(u,v,w,0)}

\end{aligned}

\]

By substituting the value for \(t\) into the parametric line, we get the intersection point of the line and the plane. If the line runs parallel to the plane, the dot product of the plane normal and the line direction (which is the denominator) will be zero.

Vector4 PL = new Vector4(a,b,c,d);
Vector4 P  = new Vector4(x0,y0,z0,1);
Vector4 D  = new Vector4(u,v,w,0);

den = dot(PL,D);
if (den != 0) // check for div by zero
{
    t = -dot(PL,P) / den;
    x = x0 + u*t;
    y = y0 + v*t;
    z = z0 + w*t;
}
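A hypothetical Python sketch of the same computation, written with the 4-vector dot products from the derivation:

```python
# Sketch of line-plane intersection: plane a*x + b*y + c*z + d = 0 and the
# line (x0, y0, z0) + t*(u, v, w). Returns the hit point, or None when the
# line is parallel to the plane (the denominator is zero).

def dot4(p, q):
    return sum(pi*qi for pi, qi in zip(p, q))

def line_plane(plane, p0, direction):
    P = (p0[0], p0[1], p0[2], 1.0)                       # homogeneous point
    D = (direction[0], direction[1], direction[2], 0.0)  # homogeneous direction
    den = dot4(plane, D)
    if den == 0:
        return None                                      # parallel to the plane
    t = -dot4(plane, P) / den
    return tuple(p0[i] + direction[i]*t for i in range(3))
```

For instance, the plane z = 1, written as (0, 0, 1, -1), is hit by a vertical line from the origin at (0, 0, 1), while a horizontal line lying in z = 0 returns None.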

One method of ray-triangle intersection is basically a two-step process: first, compute the point where the ray (line) intersects the triangle's plane; second, determine whether that point lies inside the triangle. Barycentric coordinates are a common method of doing this, but another approach is to create lines in the plane with directions chosen so that a point inside the triangle lies on the same side of each line (i.e. to the left or right of all of them). The side of the line can be computed using the scaled signed distance as detailed above.

Just to illustrate how general this method is, let's take a more advanced example:

\[ \begin{aligned} C(x,y) &= x^2+y^2-1 = 0 \\

L(x(t),y(t)) &= \begin{cases} x &= 2t-1 \\ y&=8t^2-9t+1 \end{cases} \end{aligned}\]

The picture above shows the curves. The red curve is the parametric curve and the black curve is the implicit curve. As you can see, there are 4 real intersections, and since both curves are degree-2, Bezout's theorem predicts exactly 4, so we should end up with all real roots.

Inserting definitions of the parametric equations into the implicit form, we get:

\[

\begin{aligned}

f(x(t),y(t)) &= (2t-1)^2+(8t^2-9t+1)^2-1 \\

&= 64t^4-144t^3+101t^2-22t+1 \\

&= 0

\end{aligned}

\]

This is a quartic polynomial, so a more advanced numeric root finding method needs to be used, like bisection, regula falsi, or Newton's method. The roots of the polynomial are t = 0.06118, 0.28147, 0.90735, and 1.0. We do have all real roots, so Bezout's theorem is satisfied.
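To make this concrete, here is a hypothetical Python sketch that brackets the sign changes of this quartic by sampling and then bisects each bracket (regula falsi or Newton's method would converge faster; the sampling resolution is an assumption):

```python
def f(t):
    # 64 t^4 - 144 t^3 + 101 t^2 - 22 t + 1, in Horner form
    return (((64*t - 144)*t + 101)*t - 22)*t + 1

def bisect(g, lo, hi, iters=80):
    # assumes g(lo) and g(hi) bracket a root
    for _ in range(iters):
        mid = 0.5*(lo + hi)
        if g(lo)*g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5*(lo + hi)

def real_roots(g, lo, hi, samples=2100):
    # sample for sign changes, then bisect each bracketing interval
    roots = []
    step = (hi - lo)/samples
    a, ga = lo, g(lo)
    for _ in range(samples):
        b = a + step
        gb = g(b)
        if ga == 0:
            roots.append(a)           # a sample landed exactly on a root
        elif ga*gb < 0:
            roots.append(bisect(g, a, b))
        a, ga = b, gb
    if ga == 0:
        roots.append(a)
    return roots
```

Running `real_roots(f, 0.0, 1.05)` recovers the four roots quoted above to high precision.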

If the parametric curves are rational, then this method needs to be slightly modified. Rational parametric curves are usually of the form:

\[x = \frac{a(t)}{c(t)}, \, y = \frac{b(t)}{c(t)}\]

We can use homogeneous coordinates here really well. Since we have the mapping \((x,y) = (X/W,Y/W)\), we can define each homogeneous coordinate as \(X = a(t),\,Y = b(t),\,W=c(t)\). We also need to modify the implicit curve to handle homogeneous coordinates too, but this isn't very hard. Let's illustrate with an example:

\[

\begin{aligned}

S(x,y) &= x^2+2x+y^2+1 = 0 \\

P(x(t),y(t)) &= \begin{cases}

x &= \frac{t^2+t}{t^2+1} \\

y &= \frac{2t}{t^2+1} \\

\end{cases}

\end{aligned}\]

Our parametric curve in homogeneous coordinates is \(X = t^2+t\), \(Y = 2t\), and \(W = t^2+1\). Our implicit curve can be changed to use homogeneous coordinates by substituting in the mapping \((x,y) = (X/W,Y/W)\) and clearing denominators:

\[

\begin{aligned}

S &= \left(\frac{X}{W}\right)^2+2\left(\frac{X}{W}\right)+\left(\frac{Y}{W}\right)^2+1 \\

W^2 S &= X^2+2XW+Y^2+W^2 \\

\end{aligned}

\]

We can substitute our homogeneous coordinate equation into our modified implicit function:

\[

\begin{aligned}

W^2 S &= (t^2+t)^2 + 2(t^2+t)(t^2+1) + (2t)^2 + (t^2+1)^2 \\

&= 4t^4+4t^3+9t^2+2t+1 \\

&= 0 \\

\end{aligned}

\]

This quartic has no real roots: all 4 roots are complex, so the curves have no real intersection. That makes sense geometrically, since the implicit curve \(x^2+2x+y^2+1 = (x+1)^2+y^2 = 0\) contains only the single real point \((-1,0)\), which the parametric curve never passes through. Both curves are degree-2, so we expect 4 intersections; counting the complex roots we have exactly 4, and Bezout's theorem is satisfied.

The way to solve intersections of 2 parametric curves is via implicitization: convert one of the parametric curves to an implicit form (for example, by using resultants), then substitute the other parametric curve into it and solve for the roots as above.

A Bezier curve is just a degree-n Bernstein polynomial, which means it's just a regular polynomial of a different form. The above methods work well for Bezier curves, but they are more efficient and more numerically stable if modified slightly to take advantage of the Bernstein form. We won't cover this here, but there is literature out there on this as well.

These common methods of intersection testing can greatly aid any game programmer, whether purely for graphics or other game-specific logic. Hopefully you can make some use of these simple, yet powerful techniques.

A lot of the information here was taught to me by Dr. Thomas Sederberg (associate dean at Brigham Young University, inventor of the T-splines technology, and recipient of the 2006 ACM SIGGRAPH Computer Graphics Achievement Award). The rest came from my own study of relevant literature. The article image is from his self-published book.

Written by Michael Schmidt Nissen

Version 0.5.2, December 18th 2014

The damped spring based on Hooke's law of elasticity is often regarded as a black sheep in the field of game physics, and some papers and websites even warn the reader against implementing them. There are a couple of good reasons for this.

The most straight-forward reason is that springs are notoriously difficult to tune. It can be very time consuming to adjust the spring stiffness and damping coefficients in order to make a simulation behave just right. For large systems of interconnected particles and springs the task can be daunting. If a variable is changed, for instance a spring coefficient, the mass of a particle, or the simulation time-step, you might have to re-tune the entire system from scratch.

A more profound reason for avoiding springs is that they are potentially unstable. When the spring coefficients are increased above some unknown threshold, the simulation will start to oscillate chaotically and explode in all directions. Everyone who has ever worked with springs has experienced this frustrating phenomenon. The problem is that there’s no good method to determine in advance whether a given spring system will show this behaviour or not. Thus, the programmer's or game designer's ability to balance a spring system often relies on experience and gut feeling, which makes it look more like black magic than actual physics.

In this article I’ll introduce a modified spring equation that is easy to tune, displays maximum stiffness and damping, and is guaranteed to keep the spring system stable under all conditions. The math behind the equation is surprisingly simple, which makes me wonder why it doesn’t appear to have been mentioned or implemented earlier.

The improved spring equation stems from trying to answer a simple question: Is it possible to make a spring that reaches its rest state in just one simulation loop? It turns out that the answer is yes! This wouldn’t be possible in the real world, since it would require infinitely high stiffness and damping coefficients or zero mass. But in the not-quite real world of discrete, time-stepping physics simulation, this is achievable.

To explain how this is done, let's look at a one dimensional harmonic oscillator with a damped spring attached to a fixed point in one end and to a moving body in the other. Assume that the body is displaced a distance \(x\) from its rest position, and that we want to move it back in a single time step \(\Delta t\).

Distance can be expressed as velocity multiplied by time

\[ \large x = -{v} {\Delta t} \]

The minus sign simply means that we're moving in the opposite direction of the displacement. Velocity can be expressed as acceleration multiplied by time

\[ \large x = -{a} {\Delta t^{2}} \]

Newton's 2nd law of motion states that acceleration equals force over mass

\[ \large x = -\dfrac{f}{m} {\Delta t^{2}} \]

Now we can isolate force

\[ \large f= -\dfrac{m}{\Delta t^{2}} x \]

And since the spring force in Hooke's law of elasticity is defined as

\[ \large f= -k x \]

It becomes immediately clear that

\[ \large k = \dfrac{m}{\Delta t^{2}} \]

Which is the exact spring stiffness coefficient value required to move the particle back to its rest position in one simulation loop. However, since we are not doing anything to stop the particle, it will keep oscillating back and forth through the rest position. We need to add damping, which is done with the second part of the spring equation.
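As a sanity check, here is a small Python sketch of this claim, assuming semi-implicit Euler integration (velocity updated before position; the mass and time step values are arbitrary). With \(k = m/\Delta t^2\), one step returns a resting, displaced particle exactly to \(x = 0\), but it arrives with nonzero velocity, which is why the oscillation continues:

```python
# One integration step with the derived stiffness k = m / dt^2.
# Assumes semi-implicit Euler: v += (f/m)*dt, then x += v*dt.

m, dt = 2.0, 1.0/60.0
x, v = 0.5, 0.0                 # displaced half a unit, at rest

k = m / dt**2                   # the derived stiffness coefficient
f = -k * x                      # Hooke's law
v += (f / m) * dt               # integrate acceleration into velocity
x += v * dt                     # integrate velocity into position

# x is now exactly 0, but v = -0.5/dt, so the body overshoots next step.
```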

Velocity is equal to acceleration multiplied by time step

\[ \large v = -{a} {\Delta t} \]

Which is equal to force over mass

\[ \large v = -\dfrac{f}{m} {\Delta t} \]

When isolating force we get

\[ \large f = -\dfrac{m}{\Delta t} v \]

And since the damping force is defined

\[ \large f = -d v \]

We immediately see by inspection that the damping coefficient is

\[ \large d = \dfrac{m}{\Delta t} \]

This is the exact damping coefficient value needed to stop the particle in one simulation loop. Now we can write the complete damped spring equation.

\[ \large f= -\dfrac{m}{\Delta t^{2}} x -\dfrac{m}{\Delta t} v \]

This spring equation has some very interesting properties. The first thing we notice is the lack of coefficients. We've simply replaced the stiffness and damping coefficients \(k\) and \(d\) with expressions in mass and time step.

Now we have a really powerful spring equation. It is easy to implement, very stable, and reaches its rest state in just one loop. But in our quest towards a better spring it has lost its springiness. We need to somehow get the softness and bounciness back again, and for this purpose we will re-introduce the spring and damping coefficients. To avoid confusion, the new coefficients are named \(C_{k}\) and \(C_{d}\):

\[ \large f= -\dfrac{m}{\Delta t^{2}} C_{k} x -\dfrac{m}{\Delta t} C_{d} v \qquad [ 0 \leq C_{k}, C_{d} \leq 1 ] \]

As we can see, the new coefficients are simply the fraction of completely rigid behavior we would like the spring to display. Soft, bouncy springs would usually have values in the range of 0.001 – 0.00001. At the other end of the scale, values of just 0.1 – 0.01 are enough to display rigid behavior. Setting both values to 1.0 would of course still satisfy the constraint in one loop.

Please notice that spring behavior is determined exclusively by these two coefficients. Particle mass and simulation time-step have no influence on how the spring behaves, and changing them wouldn't make it necessary to re-tune the system!
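The full equation with the tuning fractions can be sketched in a few lines of Python (again assuming semi-implicit Euler integration; the masses and time step below are arbitrary). With both coefficients at 1.0, the displacement is removed in one step and the remaining velocity in the next:

```python
# One step of the complete spring with tuning fractions Ck and Cd.
# Semi-implicit Euler is assumed: velocity first, then position.

def spring_step(x, v, m, dt, Ck, Cd):
    f = -(m / dt**2) * Ck * x - (m / dt) * Cd * v
    v += (f / m) * dt
    x += v * dt
    return x, v

m, dt = 1.0, 0.01
x, v = 0.3, -2.0                              # displaced and moving

x, v = spring_step(x, v, m, dt, 1.0, 1.0)     # displacement removed
x, v = spring_step(x, v, m, dt, 1.0, 1.0)     # residual velocity removed
```

Values of Ck and Cd well below 1.0 give back the soft, springy behavior described above.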

Interestingly, the spring will get less rigid and less stable if we increase the coefficients beyond these values. This makes it natural to treat them as maximum values

\[ \large k_{max} = \dfrac{m}{\Delta t^{2}} \]

\[ \large d_{max} = \dfrac{m}{\Delta t} \]

This allows us to simplify the spring equation

\[ \large f= -k_{max} C_{k} x -d_{max} C_{d} v \]

There is an important conclusion to be drawn from this. It is a misunderstanding to think that spring stiffness and damping can be increased towards infinity by simply increasing \(k\) and \(d\). Beyond \(k_{max}\) and \(d_{max}\), larger values only make the spring overshoot its target and the simulation less stable, not stiffer.

It is only slightly more complicated to constrain two free-moving particles with the improved spring. To do this, we need to introduce the concept of reduced mass, the effective inertial mass of a two-body system

\[ \large m_{red} = \dfrac{m_1 m_2}{(m_1 + m_2)} \]

Since the inverse mass quantity is often already pre-computed for other purposes, it can also be useful to define reduced mass as

\[ \large m_{red}= \dfrac{1}{ ( \dfrac{1}{m_1} + \dfrac{1}{m_2} ) } \]

For two connected particles, the maximum coefficient values are

\[ \large k_{max} = \dfrac{m_{red}}{\Delta t^{2}} \]

\[ \large d_{max} = \dfrac{m_{red}}{\Delta t} \]

When replacing mass with reduced mass we get

\[ \large f= -\dfrac{m_{red}}{\Delta t^{2}} C_{k} x -\dfrac{m_{red}}{\Delta t} C_{d} v \]

This spring equation will bring the two connected particles to equilibrium distance in one loop and make them stand still relative to each other. However, since energy and momentum are conserved, the particles may rotate around each other, which will make the bodies come to rest at a larger distance depending on how fast they rotate.
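A quick Python sketch of the two-particle case in 1D (semi-implicit Euler assumed, masses and time step arbitrary): with \(C_k = 1\) and both particles initially at rest, a single step restores the rest length exactly, which is precisely what the reduced mass buys us:

```python
# Two free particles in 1D connected by the stiff spring, using reduced
# mass. Equal and opposite forces are applied; one semi-implicit Euler
# step brings the pair back to the rest length.

def step_pair(p1, p2, m1, m2, rest, dt):
    m_red = (m1 * m2) / (m1 + m2)         # reduced mass
    x = (p2 - p1) - rest                  # displacement along the axis
    f = -(m_red / dt**2) * x              # force on particle 2; -f on particle 1
    v1 = (-f / m1) * dt
    v2 = ( f / m2) * dt
    return p1 + v1 * dt, p2 + v2 * dt

# two resting particles 3 units apart, rest length 1
p1, p2 = step_pair(0.0, 3.0, 1.0, 2.0, 1.0, 1.0/60.0)
# p2 - p1 is now exactly the rest length
```

Note how the heavier particle moves less: the correction is distributed in inverse proportion to mass, so momentum is conserved.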

The spring equation can also be used for angular springs, also commonly named rotational or torsional springs. Rather than keeping two particles at a fixed distance by applying opposite equal forces, the angular spring will try to keep two rotating bodies at a fixed angle by applying opposite equal torques. The equation introduces the concept of reduced moment of inertia, defined analogously to reduced mass

\[ \large I_{red} = \dfrac{I_1 I_2}{(I_1 + I_2)} \]

Or alternatively

\[ \large I_{red}= \dfrac{1}{ ( \dfrac{1}{I_1} + \dfrac{1}{I_2} ) } \]

The maximum coefficient values for angular springs are

\[ \large k_{max} = \dfrac{I_{red}}{\Delta t^{2}} \]

\[ \large d_{max} = \dfrac{I_{red}}{\Delta t} \]

When replacing the variables of linear motion with those of angular motion we get

\[ \large \tau = -\dfrac{I_{red}}{\Delta t^{2}} C_{k} \theta -\dfrac{I_{red}}{\Delta t} C_{d} \omega \]

This spring equation will keep two rotating bodies at any given rest angle. If both coefficients are set to 1, the constraint will be solved in one loop. The spring allows for angular displacements larger than one full turn, making it possible to “wind up” bodies like the coil spring of a mechanical clock.

Today a lot of physics engines and simulation methods are based on impulses - direct changes in velocity - rather than forces and accelerations. The linear and angular spring equations described above work equally well if we redesign them to work with impulses. Here are the equations without further ado

\[ \large J_{linear}= -\dfrac{m_{red}}{\Delta t} C_{k} x - m_{red} C_{d} v \]

\[ \large J_{angular} = -\dfrac{I_{red}}{\Delta t} C_{k} \theta - I_{red} C_{d} \omega \]

In this particular case there is no difference between the force-based and impulse-based spring. They should return the exact same results.

When connecting a multitude of springs and particles into larger bodies, we run into the same trouble as any other type of distance and velocity constraint. Rather than cooperating, the constraints tend to compete against each other, and this spells trouble. When a spring moves two particles to satisfy distance and velocity, it usually means dissatisfying one or several other springs. It is outside the scope of this article to dig deeply into this problem, but the author would like to provide a bit of advice on how to prevent the worst disasters.

When two or more springs are connected to the same particle, which is the case in any kind of rope, mesh, or deformable body, setting the coefficients to the maximum value of 1.0 will lead to stability problems. Although the spring equation is stable when used with a single particle or a pair of particles, this is sadly not the case for higher numbers of particles. The author of this article has, after some lengthy tampering, worked out that a safe upper limit for both the stiffness and damping coefficients is

\[ \large c_{max} \approx \dfrac{1}{n + 1} \]

Where \(n\) is the number of springs connected to the particle.

http://en.wikipedia.org/wiki/Reduced_mass

The quickest way to do a single evaluation of a point on the Bezier surface is to evaluate the Bernstein basis functions directly. Recall the formal Bernstein basis function form:

\[

B_{i,n}(t) = \binom{n}{i} (1-t)^{n-i} t^i

\]

Naive evaluation will take a long time computationally, especially considering that the binomial term is often expressed in terms of factorials. Let's see if we can't use a trick or two to cut this down. Bernstein basis functions have some unique properties. Leveraging these properties will help us do more quick evaluations. Here's a short list of the properties of Bernstein basis functions:

- Nonnegativity: \(B_{i,n}(u) \geq 0 \,\, \forall i,n \, \text{and} \, 0 \leq u \leq 1\)
- Partition of Unity: \(\sum \limits_{i=0}^n B_{i,n}(u) = 1\)
- Endpoint Interpolation: \(B_{0,n}(0) = B_{n,n}(1) = 1\)
- Single Maximum: \(\text{max} \left ( B_{i,n}(u) \right ) \, \text{at} \, u=\frac{i}{n} \)
- Recursive: \( B_{i,n}(u) = (1-u)B_{i,n-1}(u) + uB_{i-1,n-1}(u),\,\,B_{i,n}(u) = 0 \, \text{if} \, i < 0 \, \text{or} \, i > n \)
- Derivative: \( B'_{i,n}(u) = n \left ( B_{i-1,n-1}(u) - B_{i,n-1}(u) \right ), \, B_{-1,n-1}(u) = B_{n,n-1}(u) = 0 \)

The recursive property allows us to construct higher order basis functions from lower order ones. From the Bernstein basis definition, we know that \(B_{0,0} = 1\), so we can set up a loop to construct the basis functions from degree-0 to degree-n. For Bernstein polynomials of degree n, we will have (n+1) basis functions at the end. Code in C# to compute the Bernstein basis functions would look like this:

public double[] GetBasisFunctions(int n, double t)
{
    double[] F = new double[n+1];
    F[0] = 1;
    double u = 1.0 - t;

    for (int j = 1; j <= n; j++)
    {
        // Build the degree-j functions in place from the degree-(j-1) ones.
        for (int i = j; i >= 0; i--)
        {
            double lp = (i > n ? 0 : u * F[i]);
            double rp = (i - 1 < 0 ? 0 : t * F[i-1]);
            F[i] = lp + rp;
        }
    }
    return F;
}

Bernstein basis forms are good for evaluating single points, but what if you want to evaluate a lot of points all at once, like a grid? Let's look at the matrix formulation of a Bezier curve. This is actually a good starting point because we can expand this to work for surfaces. To "derive" the formulas, let's take the expansion of a cubic (degree-3) Bezier curve:

\[ B(t) = \binom{3}{0}P_0(1-t)^3 + \binom{3}{1}P_1t(1-t)^2 + \binom{3}{2}P_2(1-t)t^2 + \binom{3}{3}P_3t^3 \]

This isn't anything you haven't seen before. But, we can convert this into an equivalent matrix form:

\[

B(t)=

\begin{bmatrix} P_0 & P_1 & P_2 & P_3 \\ \end{bmatrix}

\begin{bmatrix} \binom{3}{0}(1-t)^3 \\ \binom{3}{1}t(1-t)^2 \\ \binom{3}{2}t^2(1-t) \\ \binom{3}{3}t^3 \\ \end{bmatrix}

\]

If you're not comfortable with this, multiply it out and check the result. This is a nice math trick, but there is no significant advantage to this, is there? Not yet, but there can be if we notice that we've separated the equation so that the parametric variable

\[

B(t)=

\begin{bmatrix} P_0 & P_1 & P_2 & P_3 \\ \end{bmatrix}

\begin{bmatrix} \binom{3}{0}(-t^3+3t^2-3t+1) \\ \binom{3}{1}(t^3-2t^2+t) \\ \binom{3}{2}(-t^3+t^2) \\ \binom{3}{3}t^3 \\ \end{bmatrix}

\]

This looks like it's getting complicated. Well, it is and it isn't. If we split this column vector up into a couple of more matrices (I promise we're at the end of this), we can get something more simplified:

\[

B(t)=

\begin{bmatrix} P_0 & P_1 & P_2 & P_3 \\ \end{bmatrix}

\begin{bmatrix}

\binom{3}{0}(-1) & \binom{3}{0}(3) & \binom{3}{0}(-3) & \binom{3}{0}(1) \\

\binom{3}{1}(1) & \binom{3}{1}(-2) & \binom{3}{1}(1) & 0 \\

\binom{3}{2}(-1) & \binom{3}{2}(1) & 0 & 0 \\

\binom{3}{3}(1) & 0 & 0 & 0 \\

\end{bmatrix}

\begin{bmatrix} t^3 \\ t^2 \\ t \\ 1 \end{bmatrix}

\]

Isn't that beautiful? We've reduced the Bezier curve to 3 matrices and separated out the parametric variable into a simple column vector. If we want to evaluate the whole curve efficiently, we can simply multiply the first 2 matrices, and multiply it by the column vector for every parameter value. The simple form for the triangular part of the matrix holding the binomial coefficient can be calculated with this formula:

\[ M(i,j) = \begin{cases}

\binom{n}{i} \binom{n-i}{j} (-1)^{n+i-j}; & 0 \le i \le n, \, 0 \le j \le n-i \\

0, & \text{otherwise} \\

\end{cases}

\]
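As a sketch, the closed form translates directly into code; for \(n = 3\) it reproduces the 4x4 matrix of the cubic example (Python, using `math.comb` for the binomial coefficients):

```python
from math import comb

# Build the (n+1) x (n+1) Bezier coefficient matrix from the closed form.
# Row i, column j holds the coefficient of t^(n-j) in the i-th Bernstein
# basis function of degree n.

def bezier_matrix(n):
    M = [[0]*(n+1) for _ in range(n+1)]
    for i in range(n+1):
        for j in range(n - i + 1):
            M[i][j] = comb(n, i) * comb(n - i, j) * (-1)**(n + i - j)
    return M
```

For n = 3 this yields the matrix above with the binomials multiplied through: [[-1, 3, -3, 1], [3, -6, 3, 0], [-3, 3, 0, 0], [1, 0, 0, 0]].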

Now, this form isn't just elegant, it's useful. For what? Well...for a couple things. First off, we can get derivative information here really easily. The derivative of the curve is dB/dt, so by differentiating the parametric terms, we can calculate the derivative. And since the parametric terms are all collected in a column vector, that's the only matrix we need to change! To calculate the derivative in the example, it's just:

\[

B'(t)=

\begin{bmatrix} P_0 & P_1 & P_2 & P_3 \\ \end{bmatrix}

\begin{bmatrix}

\binom{3}{0}(-1) & \binom{3}{0}(3) & \binom{3}{0}(-3) & \binom{3}{0}(1) \\

\binom{3}{1}(1) & \binom{3}{1}(-2) & \binom{3}{1}(1) & 0 \\

\binom{3}{2}(-1) & \binom{3}{2}(1) & 0 & 0 \\

\binom{3}{3}(1) & 0 & 0 & 0 \\

\end{bmatrix}

\begin{bmatrix} 3t^2 \\ 2t \\ 1 \\ 0 \end{bmatrix}

\]

Now, you might be saying, "Hey, this form is way harder than the control point subtraction method of finding the derivative of the curve!" Well, to that I say....yeah, it is...for curves at least. When we discuss surfaces, you'll thank me for this. I promise.

The other thing this form allows us to do is to express surfaces very simply.

In math jargon, a Bezier surface patch is called a tensor-product surface because it is the product of pairs of univariate blending functions, in this case, the Bernstein basis functions. This might be alien speak to you, but don't worry, you won't have to understand it. Really, this means a couple of things:

- This surface is formed from a rectangular grid of control points.
- This surface is a function of two independent variables.
- This surface depends on two separate blending functions.

\[ S(u,v) = \sum_{i=0}^m \sum_{j=0}^n P_{ij}B_i^m(u)B_j^n(v) \]

Here, you can see how similar this is to a Bezier curve. The main differences are the presence of another variable and thus another blending function, and the fact that the control points form a grid or net. To evaluate a point on the surface, simply plug in the values for \(u\) and \(v\).

The only thing we have to understand now is that there are 2 parameter coordinates for each point on the surface, \(u\) and \(v\).

For this method, we can evaluate the Bernstein basis functions for the u- and v-parameters. Then, we just have to multiply the correct evaluated Bernstein basis functions together with the control points, just like in the formal definition.

As mentioned before, this method is fine for evaluating a few points on the surface, but it's not optimal for quick evaluation of the entire surface. This is because we have to compute new Bernstein basis function sets for each (u,v) point. For large sets of evaluation points, the matrix method is better.

To make it look even more like the Bezier curve form, we can convert this to matrix form using the technique as explained above. The resulting form for a bicubic surface patch (cubic in both u- and v-parameter directions) is as follows:

\[

S(u,v) =

\begin{bmatrix} v^3 & v^2 & v & 1 \\ \end{bmatrix}

\begin{bmatrix}

\binom{3}{0}(-1) & \binom{3}{0}(3) & \binom{3}{0}(-3) & \binom{3}{0}(1) \\

\binom{3}{1}(1) & \binom{3}{1}(-2) & \binom{3}{1}(1) & 0 \\

\binom{3}{2}(-1) & \binom{3}{2}(1) & 0 & 0 \\

\binom{3}{3}(1) & 0 & 0 & 0 \\

\end{bmatrix}

\begin{bmatrix}

P_{00} & P_{01} & P_{02} & P_{03} \\

P_{10} & P_{11} & P_{12} & P_{13} \\

P_{20} & P_{21} & P_{22} & P_{23} \\

P_{30} & P_{31} & P_{32} & P_{33} \\

\end{bmatrix}

\begin{bmatrix}

\binom{3}{0}(-1) & \binom{3}{0}(3) & \binom{3}{0}(-3) & \binom{3}{0}(1) \\

\binom{3}{1}(1) & \binom{3}{1}(-2) & \binom{3}{1}(1) & 0 \\

\binom{3}{2}(-1) & \binom{3}{2}(1) & 0 & 0 \\

\binom{3}{3}(1) & 0 & 0 & 0 \\

\end{bmatrix}

\begin{bmatrix} u^3 \\ u^2 \\ u \\ 1 \end{bmatrix}

\]

Again, if you're not comfortable, multiply it out and check this for yourself. As you can see, in this form the surface really doesn't look too much different than the curve! Sure, there's another set of matrices on the left side and the center matrix is now not a vector, but that's really it. A simplified form of this is:

\[ S(u,v) = \mathbf{v}\mathbf{M_v}\mathbf{P}\mathbf{M_u}\mathbf{u} \]

The Mu and Mv matrices are constructed with the same formula as above. Notice, however, that the patch doesn't need to be bicubic; each direction can be a different degree. If in the u-direction the patch is quadratic (degree-2) and quartic (degree-4) in the v-direction, it only really changes the size of the matrices, not the formulation. In this example, the Mu matrix would be 3x3, the u-parameter matrix would be 3x1, the Mv matrix would be 5x5, the v-parameter matrix would be 1x5, and the control point grid matrix would be 5x3. Thus, the multiplication would produce a 1x1 matrix, the evaluated point on the surface.

If we want to evaluate the surface quickly, we can precompute the multiplication of the center 3 matrices (Mv*P*Mu) and then change the u- and v-parameter matrices for every value and then multiply to get the points on the surface:

\[ S(u,v) =

\begin{bmatrix} v^3 & v^2 & v & 1 \\ \end{bmatrix}

\mathbf{M_vPM_u}

\begin{bmatrix} u^3 \\ u^2 \\ u \\ 1 \end{bmatrix}

\]
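Here is a hypothetical Python sketch of this evaluation scheme for a bicubic patch of scalar control values (each coordinate of a real control net would be evaluated the same way), checked against the direct Bernstein double sum:

```python
from math import comb

# Evaluate a degree-n x degree-n patch as vrow * M * P * M * ucol, where M
# is the Bezier coefficient matrix built from the closed form above. The
# core product M*P*M could be precomputed once per patch.

def bezier_matrix(n):
    M = [[0.0]*(n+1) for _ in range(n+1)]
    for i in range(n+1):
        for j in range(n - i + 1):
            M[i][j] = comb(n, i) * comb(n - i, j) * (-1.0)**(n + i - j)
    return M

def mat_mul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def eval_patch(P, u, v):
    n = len(P) - 1                              # square grid assumed here
    M = bezier_matrix(n)
    core = mat_mul(mat_mul(M, P), M)            # M * P * M
    vrow = [v**(n - j) for j in range(n + 1)]   # e.g. [v^3, v^2, v, 1]
    ucol = [u**(n - j) for j in range(n + 1)]
    return sum(vrow[a] * sum(core[a][b]*ucol[b] for b in range(n + 1))
               for a in range(n + 1))

# Reference: the direct double sum from the formal definition.
def bernstein(i, n, t):
    return comb(n, i) * (1 - t)**(n - i) * t**i

def eval_direct(P, u, v):
    n = len(P) - 1
    return sum(P[j][i] * bernstein(i, n, u) * bernstein(j, n, v)
               for i in range(n + 1) for j in range(n + 1))
```

Both paths agree, and at (u, v) = (0, 0) the patch interpolates the corner control value, as expected from endpoint interpolation.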

This is all fine and good to get the surface points, but when we render surfaces, we need surface normals! Thus, we'll need to compute derivatives. Normally, this would be somewhat difficult. Although there are other ways of computing the derivative, we'll illustrate this using the matrix formulation.

In the curve example, we saw how only the column vector needed to be changed in order to get the derivative. We can do something similar to get the derivatives at evaluated points. However, we can't just take the derivative of both the u- and v-parameter vectors at the same time. Why not?

It's because the surface depends on more than one variable. Because the surface has 2 directions, you have to take partial derivatives to get a tangent vector in each parameter direction (\(u\) and \(v\)).

Thus, for the tangent in the u-direction:

\[ \frac{\partial S}{\partial u} =

\begin{bmatrix} v^3 & v^2 & v & 1 \\ \end{bmatrix}

\mathbf{M_vPM_u}

\begin{bmatrix} 3u^2 \\ 2u \\ 1 \\ 0 \end{bmatrix}

\]

And for the tangent in the v-direction:

\[ \frac{\partial S}{\partial v} =

\begin{bmatrix} 3v^2 & 2v & 1 & 0 \\ \end{bmatrix}

\mathbf{M_vPM_u}

\begin{bmatrix} u^3 \\ u^2 \\ u \\ 1 \end{bmatrix}

\]

Notice that for the tangents in each direction, the center matrix doesn't change. Now, computing the surface normal is as simple as computing the cross product: \( N = \frac{\partial S}{\partial u} \times \frac{\partial S}{\partial v} \).

Now you should have enough knowledge to be able to make cool-looking Bezier surfaces! There are other techniques to generate surface points and normals, but this should get you started. Experiment around and if you've got other neat tricks for Bezier surfaces, let us all know!

A lot of the information here was taught to me by Dr. Thomas Sederberg (associate dean at Brigham Young University, inventor of the T-splines technology, and recipient of the 2006 ACM SIGGRAPH Computer Graphics Achievement Award). The rest came from my own study of relevant literature. The images were taken from Dr. Sederberg's book.

To quickly sum things up, Bezier (pronounced [be.zje]) curves use the Bernstein basis functions multiplied by the control points and summed together.

A Bernstein polynomial is simply where the basis functions are multiplied by coefficients and summed: \( B_n(t) = \sum_{i=0}^n c_ib_{i,n}(t) = \sum_{i=0}^n c_i \binom{n}{i} t^i (1-t)^{n-i} \), where the \(c_i\) are the coefficients and the \(b_{i,n}\) are the basis functions.

You've probably seen this equation so many times you know it by heart: \[ C(t) = \sum_{i=0}^n \binom{n}{i} (1-t)^{n-i} t^i P_i \] N-dimensional Bezier curves can be expressed as N Bernstein polynomials where the coefficients of the Bernstein polynomials are the control points in a certain dimension. For example, a 2D Bezier curve can be expressed as 2 Bernstein polynomials, one for the x-direction and one for the y-direction:

\[ \begin{cases} x(t) &= \sum_{i=0}^n \binom{n}{i} (1-t)^{n-i} t^i x_i \\ y(t) &= \sum_{i=0}^n \binom{n}{i} (1-t)^{n-i} t^i y_i \\ \end{cases} \]

The control points are given as \( P_i = (x_i,y_i) \). There are n+1 control points for a degree-n Bezier curve. 3D and 4D curves can be expressed in this same fashion. Bezier curves can have any bounds on the parameter \(t\); a curve defined on the interval \([t_0,t_1]\) is written:

\[ C_{[t_0,t_1]}(t) = \sum_{i=0}^n \binom{n}{i} \left ( \frac{t_1-t}{t_1-t_0} \right )^{n-i} \left ( \frac{t-t_0}{t_1-t_0} \right )^{i} P_i \]

Bezier curves allow the user to intuitively control the shape of the curve. As the control points move in a certain direction, the curve moves in that direction too. We can gain some additional control of our curve by creating rational Bezier curves.

Before we get too far into rational Beziers, let's go over evaluation of Bezier curves. We could just do the simple evaluation of Bezier curves and get our answer. Let's say we're not so naive and use the recurrence relation for the binomial coefficient (if you're not familiar with the binomial coefficient, read the Wikipedia article): \( \binom{n}{i} = \frac{n-i+1}{i} \binom{n}{i-1} \). For 3D curves, the pseudocode might go like this:

function evaluate(double t)
{
    x = 0, y = 0, z = 0;
    nChooseI = 1;
    for (int i = 0; i < controlpoints.length; i++)
    {
        u = pow(t, i);
        uComp = pow(1 - t, n - i);
        p = u * uComp;
        if (i != 0)
        {
            nChooseI *= (n - i + 1) / i;
        }
        x += nChooseI * controlpoints[i].x * p;
        y += nChooseI * controlpoints[i].y * p;
        z += nChooseI * controlpoints[i].z * p;
    }
    return x, y, z;
}

Even with all of our tricks, we're still computing the power terms on each loop. So what can we do to get faster? It turns out there are 4 simple methods for quick evaluation:

- Use the de Casteljau method.
- Use a modified form of Horner's algorithm.
- If you want to compute the whole curve quickly, use a forward difference table.
- Construct a set of matrices and multiply for evaluation.

The de Casteljau method is a technique that takes advantage of the nature of the Bernstein basis functions. To start, let's say the t-parameter of a Bezier curve is between 0 and 1. If that curve is a degree-1, it represents a line between the 2 control points. If t=0, the evaluated point is P0; if t=1, the evaluated point is P1. If t=0.5, the point is halfway between the control points. The t-parameter tells us how far to go between P0 and P1. The Bezier curve takes the form \( P(t) =(1-t)P_0+tP_1 \). This is how linear interpolation is done.

Now let's say we had a degree-2 curve. The Bernstein basis functions would arise if we linearly interpolated between P0 and P1, between P1 and P2, then linearly interpolated between those 2 linearly interpolated points. The t-parameter would tell us how far to interpolate at each step. Mathematically, it would look something like this:

\[

\begin{align}

P(t)&=[(1-t)P_0+tP_1](1-t)+[(1-t)P_1+tP_2]t \\

P(t)&=(1-t)^2P_0+t(1-t)P_1+t(1-t)P_1+t^2P_2 \\

P(t)&=(1-t)^2P_0+2t(1-t)P_1+t^2P_2 \\

\end{align}

\]

The Wikipedia article shows how the curve is constructed in this manner. Since the degree for this is 2, 2 linear interpolations are done. The higher order the polynomial, the more linear interpolations are done. This is precisely how the de Casteljau method works. Mathematically, the de Casteljau method looks like this:

\[ P_i^j = (1-\tau)P_i^{j-1}+\tau P_{i+1}^{j-1}; j=1,\dots,n; i=0,\dots,n-j \]

The tau must be between 0 and 1. Incidentally, the de Casteljau method is useful for breaking up a Bezier curve into 2 Bezier curves. In the formula above, the points \( P_0^0,P_0^1,\dots,P_0^n \) are the control points for the Bezier curve on the interval \( (0, \tau) \) of the original curve, and the points \( P_0^n,P_1^{n-1},\dots,P_n^0 \) are the control points for the Bezier curve on the interval \( (\tau, 1) \) of the original curve.

The de Casteljau method is really nice because it's numerically stable: the repeated convex combinations keep rounding error from growing, so heavy use of it won't throw off your calculations.
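The repeated-interpolation idea above is short enough to sketch directly. Here is a minimal Python version (the function name `de_casteljau` and the tuple-of-points representation are my choices, not from the article):

```python
def de_casteljau(points, t):
    """Evaluate a Bezier curve at t by repeated linear interpolation.

    points: list of (x, y) control points; t in [0, 1].
    """
    pts = [list(p) for p in points]
    n = len(pts) - 1
    for j in range(1, n + 1):          # n rounds of interpolation
        for i in range(n - j + 1):     # i = 0 .. n-j, as in the formula above
            for k in range(len(pts[i])):
                pts[i][k] = (1 - t) * pts[i][k] + t * pts[i + 1][k]
    return tuple(pts[0])
```

For a degree-2 curve with P0=(0,0), P1=(1,2), P2=(2,0), evaluating at t=0.5 matches the expanded Bernstein form \( (1-t)^2P_0+2t(1-t)P_1+t^2P_2 \).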

The following code is the modified version of Horner's algorithm for quick evaluation. It takes advantage of the recurrence relation for the binomial coefficient as well as implements the nested multiplication like the classic Horner's algorithm:

function eval_horner(t) {
    u = 1 - t;
    bc = 1;    // running binomial coefficient C(n, i)
    tn = 1;    // running power t^i
    tmp = controlpoints[0] * u;
    for (i = 1; i < n; i++) {
        tn *= t;
        bc *= (n - i + 1) / i;    // keep this division in floating point
        tmp = (tmp + tn * bc * controlpoints[i]) * u;
    }
    return (tmp + tn * t * controlpoints[n]);
}
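Translated into runnable form (Python here, for a 1D curve; the name `bezier_horner` is mine), with the binomial coefficient kept in floating point so the division doesn't truncate:

```python
def bezier_horner(controlpoints, t):
    """Evaluate a 1D Bezier curve via the modified Horner scheme above."""
    n = len(controlpoints) - 1
    u = 1.0 - t
    bc = 1.0   # running binomial coefficient C(n, i)
    tn = 1.0   # running power t^i
    tmp = controlpoints[0] * u
    for i in range(1, n):
        tn *= t
        bc *= (n - i + 1) / i
        tmp = (tmp + tn * bc * controlpoints[i]) * u
    return tmp + tn * t * controlpoints[n]
```

For the quadratic with control values [1, 3, 2], evaluation at t=0.5 gives 0.25·1 + 0.5·3 + 0.25·2 = 2.25, matching the Bernstein expansion.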

If you're evaluating a curve at evenly-spaced intervals, forward differencing is the fastest method, even faster than Horner's algorithm. For a degree-n polynomial, Horner's algorithm requires on the order of n multiplications and n additions per point, while forward differencing needs only n additions per point once it is initialised.

This method requires n+1 curve evaluations to initiate the forward differencing. The easiest explanation of the method is to demonstrate it. Let's say that we had a degree-4 polynomial and the first 5 evaluations were: f(i)=1, f(i+1)=3, f(i+2)=2, f(i+3)=5, and f(i+4)=4. We build a table by subtracting f(i+1)-f(i), putting that value in the next row, and so on, like the following:

\[

\begin{matrix}

t & t_i & t_{i+1} & t_{i+2} & t_{i+3} & t_{i+4} & t_{i+5} & t_{i+6} & t_{i+7} \\

f(t) & 1 & 3 & 2 & 5 & 4 & & & \\

\delta_1(t) & 2 & -1 & 3 & -1 & & & & \\

\delta_2(t) & -3 & 4 & -4 & & & & & \\

\delta_3(t) & 7 & -8 & & & & & & \\

\delta_4(t) & -15 & & & & & & & \\

\end{matrix}

\]

The method states that the number in the last row, -15 in this case, will stay constant for every column in that row. That means that if we put -15 in the next column, then add -15 to the number above it (-8), store that value in the next column on that row, and do this until we reach the value row, we can compute the value for f(i+5). Repeat for f(i+6), and all successive values for f(t). It would look something like this:

\[

\begin{matrix}

t & t_i & t_{i+1} & t_{i+2} & t_{i+3} & t_{i+4} & t_{i+5} & t_{i+6} & t_{i+7} \\

f(t) & 1 & 3 & 2 & 5 & 4 & -24 & -117 & -328 \\

\delta_1(t) & 2 & -1 & 3 & -1 & -28 & -93 & -211 & \\

\delta_2(t) & -3 & 4 & -4 & -27 & -65 & -118 & & \\

\delta_3(t) & 7 & -8 & -23 & -38 & -53 & & & \\

\delta_4(t) & -15 & -15 & -15 & -15 & & & & \\

\end{matrix}

\]

This method is fast, but it's numerically unstable: over the course of the forward difference, rounding errors accumulate and eventually throw off the computation. That matters a great deal in the CAD world, but for rendering, forward differencing is usually sufficient.
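The table walk above can be coded directly. A small Python sketch (the function names are mine): seed with the n+1 known evaluations, keep the trailing diagonal of the difference table, and push the constant bottom difference upward with nothing but additions.

```python
def difference_diagonal(values):
    """Trailing diagonal of the difference table: f(i+n), d1(i+n-1), ..., dn(i)."""
    diag = [values[-1]]
    row = list(values)
    while len(row) > 1:
        row = [row[i + 1] - row[i] for i in range(len(row) - 1)]
        diag.append(row[-1])
    return diag

def extend(values, count):
    """Produce the next `count` evaluations using only additions."""
    diag = difference_diagonal(values)
    out = []
    for _ in range(count):
        # propagate the constant bottom difference upward through the table
        for row in range(len(diag) - 2, -1, -1):
            diag[row] += diag[row + 1]
        out.append(diag[0])
    return out
```

Running it on the worked example reproduces the extended table: `extend([1, 3, 2, 5, 4], 3)` yields -24, -117, -328.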

There is a matrix formulation that can be quick as well, but that discussion is better suited to the article on Bezier surfaces.

A rational Bezier curve adds a weight to each control point. As the weight increases, the curve bends toward that control point. The formulation for a rational Bezier curve is given by:

\[

B(t) = \frac{\sum_{i=0}^n w_i \, P_i \, b_i(t)}{\sum_{i=0}^n w_i \, b_i(t)} = \left (

\frac{\sum_{i=0}^n w_i \, x_i \, b_i(t)}{\sum_{i=0}^n w_i \, b_i(t)},

\frac{\sum_{i=0}^n w_i \, y_i \, b_i(t)}{\sum_{i=0}^n w_i \, b_i(t)},

\frac{\sum_{i=0}^n w_i \, z_i \, b_i(t)}{\sum_{i=0}^n w_i \, b_i(t)}

\right )

\]

As you can see, if all the weights are 1, then the curve becomes a normal Bezier curve. Rational Bezier curves help us because:

- Rational Bezier curves can represent conic sections exactly.
- Perspective projections of Bezier curves are rational Bezier curves.
- The weights allow for additional control of the curve.
- Reparameterization of the curve is accomplished by changing the weights in a certain fashion (which I won't get into here).

Rational curves allow us to represent circular arcs (since a circle is a conic section). The image below shows what this would look like:

A degree-2 rational Bezier curve can represent an arc. The control points P0 and P2 must be on the circle, and the control point P1 is at the intersection of the circle's tangent lines at P0 and P2. The weights of the control points P0 and P2 are 1, while the weight of control point P1 is \( \cos(\theta/2) \), where theta is the angle of the arc. A degree-3 (cubic) rational Bezier curve can also represent a circular arc. As in the degree-2 case, the endpoints P0 and P3 must be on the circle, and P1 and P2 must lie on the circle's tangent lines at P0 and P3, respectively. The distance of P1 from P0 (and of P2 from P3) is given by \( e = \frac{2\sin{(\theta / 2)}}{1+2\cos{(\theta / 2)}} \).

Yup, just like the title says, rational Bezier curves can be represented as 4D Bezier curves. I know some of you might be scratching your heads and clicking to a new page, but let me explain. If we create a mapping from a 4D curve to our 3D rational curve, we can do normal Bezier curve evaluation using the algorithms above without having to modify them.

The mapping is simple: if in 3D we have (x, y, z) and weight w, then in 4D we express it as (X, Y, Z, W) = (x*w, y*w, z*w, w). The conversion from 4D back to 3D is then as simple as division. This does create odd equalities - the point (x, y, z) in 3D maps to multiple points in 4D, such as (2x, 2y, 2z, 2), (3x, 3y, 3z, 3), and so on - but for our purposes this is just fine. The mapping works for rational curves because, as we stated above, when all the weights equal 1 the rational curve is identical to a normal Bezier curve. By elevating the problem to a higher dimension, we gain a lot of computational simplicity.

(As an aside, a similar mapping can be made from 2D rational Bezier curves to 3D normal Bezier curves, and you can picture the 3D-to-2D mapping as the projection of the 3D curve onto the plane z=1.)

In 4D, that means a 3D rational Bezier curve takes the following form:

\[

\begin{cases}

X(t)&=\sum_{i=0}^n x_iw_ib_i(t)\\

Y(t)&=\sum_{i=0}^n y_iw_ib_i(t)\\

Z(t)&=\sum_{i=0}^n z_iw_ib_i(t)\\

W(t)&=\sum_{i=0}^n w_ib_i(t)\\

\end{cases}

\]

If you look at the top 3 equations, they're identical to the expanded numerator of the rational Bezier definition, and the bottom equation is identical to its denominator. This means that if we want to compute a 3D rational Bezier curve, we can simply lift it to 4D space, do one additional computation in our curve evaluation, then do a perspective divide to bring the point back to 3D. In fact, we can lift all our regular Bezier curves into 4D by making the weights 1, and then just perspective-divide to get the point back to 3D.
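Here is a sketch of the trick in 2D (Python; the function name is mine). The control points and weights in the example are the standard conic construction for a quarter circle, with the middle weight \( \cos(45^\circ) = \sqrt{2}/2 \):

```python
import math

def eval_homogeneous(points, weights, t):
    """Evaluate a 2D rational Bezier curve by lifting to homogeneous
    coordinates, running plain de Casteljau, then dividing by W."""
    # lift (x, y) with weight w to (x*w, y*w, w)
    pts = [[x * w, y * w, w] for (x, y), w in zip(points, weights)]
    n = len(pts) - 1
    for j in range(1, n + 1):
        for i in range(n - j + 1):
            for k in range(3):
                pts[i][k] = (1 - t) * pts[i][k] + t * pts[i + 1][k]
    X, Y, W = pts[0]
    return (X / W, Y / W)   # perspective divide back to 2D

# quarter circle from (1, 0) to (0, 1); middle weight cos(45 degrees)
p = eval_homogeneous([(1, 0), (1, 1), (0, 1)], [1, math.sqrt(2) / 2, 1], 0.5)
```

Evaluating at t=0.5 lands exactly on the unit circle, which is a quick sanity check that the homogeneous evaluation reproduces the rational curve.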

One of the primary difficulties with the Bezier formulation is that each control point influences the entire curve. Also, the number of control points is directly tied to the degree of the resulting Bernstein polynomials; a Bezier curve with 6 control points yields a 5th-degree polynomial. Evaluating higher-degree polynomials takes extra computation time (even with Horner's algorithm), so we want to avoid high-degree polynomials if possible. These problems have been overcome by B-splines. B-splines aren't much harder to understand once you have Bezier curves down. In the future, if there's enough response here, I'll add a basic introduction to B-splines and easy ways to generate them (not necessarily computationally efficient ones), and how to do them by hand too.

A lot of the information here was taught to me by Dr. Thomas Sederberg (associate dean at Brigham Young University, inventor of the T-splines technology, and recipient of the 2006 ACM SIGGRAPH Computer Graphics Achievement Award). The rest came from my own study of relevant literature.

Of course, you probably already knew that. Anyhow, the purpose of this article is to present the first of two methods for modelling collisions in your own games. I'm hoping to release another article detailing the second method in the near future. The method described in this article, which I call the 'radial' method, handles collisions with objects that can be enclosed by bounding cylinders.

The second method, which I'll cover in the follow-up article, handles flat obstacles such as walls.

I created these methods (or rather implemented them, as it's likely they've been used before) with two aims. First off, each method should allow me to detect collisions, which is an obvious requirement of collision detection! Second, each method should handle collisions in such a way as to 'slide' the player/object around the obstacle rather than bringing them to an awkward halt as soon as they touch it.

As mentioned, this system allows you to handle collisions with objects which can be enclosed by bounding cylinders. Each cylinder can be defined by two pieces of data - or rather, a vector and a float - one holding the position of the cylinder's centre, the other its radius.

The player can move the camera around in a first person view, and is able to walk among the trees. We will use the 'radial' collision detection method to prevent the camera from passing through the trees. When the player attempts to walk through a tree, the algorithm will slide them smoothly around it, as seen in pretty much all current first person games. This process involves two steps, which are repeated every frame:

- **If the camera has entered a tree's bounding cylinder, register a collision.**
- **Calculate a new location for the camera which puts it back outside the bounding cylinder.**

So, how do we do this in practice?

We'll start out by defining the location of each tree using an array of 2D vectors. We only need 2D vectors because the trees are not going to affect our motion in the up-down direction. If you wanted to use this algorithm for objects on 3D terrain rather than a 2D plane, it should work just as well with the same vector format because it's not possible for a walking person to collide with a tree from above or below, only from the side. Because of this, we only need to worry about the positions of the trees in the x-z plane, whether we have a flat terrain or not. As mentioned, we will use DirectX syntax and functions from here on in:

//The array contains 3 vectors - one for each tree:
D3DXVECTOR2 v_treePositions[3] = {D3DXVECTOR2(0, 1), D3DXVECTOR2(1, 0), D3DXVECTOR2(-1, 0)};

We will also define a variable to describe the radius of each tree trunk. We will assume they're all the same radius, but if you wanted to personalise each one, you could define an array of numbers, or add a third component to the vectors describing tree locations.

//Each tree is 1m in radius:
float f_treeRadius = 1.0f;

...and while we're at it, we will define a vector to hold the location of our first-person camera as it changes from frame to frame:

//This is a global vector holding the camera's location.
//It starts at the centre of the world (0, 0, 0):
D3DXVECTOR3 v_camPos(0, 0, 0);

OK, now each tree has an entry in our program, and we know where our player's camera is from frame to frame. All that remains is to put this information to use! We will define a function which will be called every frame. This function will loop through the tree locations, comparing each one to the location of the camera for that frame. If it registers a collision (i.e. if the camera's location is closer to one of the trees than is permitted by the bounding radius), then we will push the camera back outside the bounding radius. The effect is cumulative, meaning that each tree makes its own contribution to the overall position change. This way, the algorithm also works for closely bound groups of trees where the camera could be in contact with more than one at once.

Let's go through the function line by line:

D3DXVECTOR2 handle_Collisions_Radial()
{
    D3DXVECTOR2 v_deltaPos = D3DXVECTOR2(0, 0);

    for (int i = 0; i < 3; i++)
    {
        D3DXVECTOR2 v_vectorToCentre(v_camPos.x - v_treePositions[i].x, v_camPos.z - v_treePositions[i].z);

        if (D3DXVec2Length(&v_vectorToCentre) < f_treeRadius)
        {
            v_deltaPos += ((f_treeRadius / D3DXVec2Length(&v_vectorToCentre)) - 1) * v_vectorToCentre;
        }
    }

    return v_deltaPos;
}

First, we define a vector, v_deltaPos, which will accumulate the total position change. Next, we enter a loop which iterates through all the trees; since we have three trees, we have three iterations. For each tree, we build the vector from the tree's centre to the camera and compare its length to the bounding radius.

Once the loop is complete, we return the resulting position change.

The process is fairly intuitive until the final step. The maths behind it works like this: the camera has moved inside the tree's bounding radius, and we want to push it back out along the line from the tree's centre to the camera. Call the vector from the centre to the camera **v** (v_vectorToCentre in the code) and the bounding radius r. The corrected camera position should sit exactly on the boundary, so we need a correction \( \Delta v \), parallel to **v**, satisfying \( |v + \Delta v| = r \).

This can easily be rearranged to get:

\( \Delta v = \left( \frac{r}{|v|} - 1 \right) v \)

We have now calculated our value of \( \Delta v \) - exactly the expression accumulated into v_deltaPos in the loop above. All that remains is to apply it to the camera's position:

//Call the function to assess collisions:
D3DXVECTOR2 v_camPosCollisionDelta = handle_Collisions_Radial();

//Since this function returns a D3DXVECTOR2 and our camera position is a D3DXVECTOR3, we
//will need to create a dummy D3DXVECTOR3 with no y-component to add to v_camPos:
D3DXVECTOR3 v_camPosCollisionDelta3D(v_camPosCollisionDelta.x, 0, v_camPosCollisionDelta.y);

//Finally, we will move the camera according to our collision delta vector:
v_camPos += v_camPosCollisionDelta3D;

//Now that we have our new camera position, we can continue to render the frame as normal...

That's it! You may be surprised at how well this simple method works. The effect is quite satisfying, and it certainly added an extra professional depth to my exploits in the first-person camera world. Of course, this method could be used to handle collisions involving objects other than a first-person camera, and subjects other than trees. This is also a bare-bones version of the algorithm, and it could easily be adapted to handle collisions in 2D games as well as collisions between circles and spheres and their environment (throughout, we assumed that the first person camera was a simple point rather than an object with dimensions).
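Language aside, the whole algorithm fits in a few lines. Here is a Python sketch of the same loop (the names are mine, not the D3DX ones), with one small extra guard against the degenerate case where the camera sits exactly on a tree's centre:

```python
import math

TREES = [(0.0, 1.0), (1.0, 0.0), (-1.0, 0.0)]   # tree centres in the x-z plane
TREE_RADIUS = 1.0

def handle_collisions_radial(cam_x, cam_z):
    """Return the (x, z) correction that pushes the camera out of any tree."""
    dx_total, dz_total = 0.0, 0.0
    for tx, tz in TREES:
        vx, vz = cam_x - tx, cam_z - tz            # vector from tree centre to camera
        dist = math.hypot(vx, vz)
        if 0.0 < dist < TREE_RADIUS:               # inside the bounding cylinder
            scale = TREE_RADIUS / dist - 1.0       # push back onto the boundary
            dx_total += scale * vx
            dz_total += scale * vz
    return dx_total, dz_total
```

A camera at (0.5, 0) is half a metre inside the tree at (1, 0), so the function returns (-0.5, 0), placing the camera exactly on the bounding circle.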

I intend to write another article describing another collision method which can be used to simulate walls and interiors/exteriors of buildings and things like hedges and maze walls. Until then, thanks for reading!

A vector has a direction and a magnitude, or length. One way to represent a vector is using polar coordinates. In polar coordinates the vector's direction is measured by the angle it forms with the positive x-axis, so a vector pointing right is at 0 degrees and a vector pointing up is at 90 degrees. The vector also has magnitude, measured as the distance from the origin of the vector to its end. Rotating a vector in polar coordinates is trivial: you simply change the angle. This simple rotation operation makes the polar representation appealing, but don't take the bait. The polar representation makes other operations more difficult. As a simple example, adding two vectors cannot be done directly in polar coordinates; you would have to convert to rectangular first, add, then convert back to polar.

A vector in rectangular coordinates is represented by an x and y value. This is the location of the head of the vector if you were to plot it on the Cartesian plane. The appeal of rectangular coordinates may not be immediately apparent, but as we go over some of the uses of vectors it becomes clear why they are preferred.

While it is better to do all of the mathematical operations in one representation when possible, it is sometimes necessary to convert between the two representations. The conversions are based on the properties of a right triangle.

To convert from polar to rectangular:

x = cos(angle) * magnitude
y = sin(angle) * magnitude

And to convert back to polar:

// to get the magnitude, use the Pythagorean theorem
magnitude = sqrt(x * x + y * y)
angle = atan2(y, x)

That is it. Just some basic trigonometry.
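The two conversions are inverses of each other, which is easy to check with a quick round trip (Python sketch; the function names are mine):

```python
import math

def polar_to_rect(angle, magnitude):
    """Convert (angle, magnitude) to rectangular (x, y)."""
    return (math.cos(angle) * magnitude, math.sin(angle) * magnitude)

def rect_to_polar(x, y):
    """Convert rectangular (x, y) back to (angle, magnitude)."""
    return (math.atan2(y, x), math.hypot(x, y))

# round trip: 60 degrees, length 5
angle, magnitude = rect_to_polar(*polar_to_rect(math.pi / 3, 5.0))
```

Up to floating-point rounding, the round trip returns the original angle and magnitude.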

There is a class of vectors called unit vectors. Simply put, they have a length of one. This property is very useful when dealing with angles; more on that later. Any vector with a length greater than zero can be normalized, or, in other words, modified to have a length of one. In rectangular coordinates you simply divide both x and y by the magnitude.

Normalize a vector:

function normalize() {
    float magnitude = sqrt(x * x + y * y)
    x = x / magnitude
    y = y / magnitude
}

The above code has a problem: if the length of the vector is 0, the x and y values become invalid. In many languages they become NaN or infinity, and any value you then add to or multiply by such a result becomes invalid too. For example, if you normalized a player velocity that happened to be zero, the velocity would become infinite; when you then used that velocity to update the position, the position would become infinite as well. These invalid values can propagate through the code, causing weird things to happen. The fix is fairly easy, though.

function normalize() {
    magnitude = sqrt(x * x + y * y)
    if (magnitude > 0.0) {
        x = x / magnitude
        y = y / magnitude
    }
}

Now the code will leave a zero vector as is, so it won't go corrupting all of the floating-point values in the game.
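The same guard in runnable form (a Python sketch, returning a new vector instead of mutating in place):

```python
import math

def normalize(x, y):
    """Return a unit-length version of (x, y); a zero vector is returned unchanged."""
    magnitude = math.hypot(x, y)
    if magnitude > 0.0:
        return (x / magnitude, y / magnitude)
    return (x, y)   # zero vector: leave it alone rather than produce NaN/inf
```

Normalizing (3, 4) gives (0.6, 0.8), and the zero vector passes through untouched.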

Let's take an example where we have a turret and we want the turret to fire at the player. How do we calculate the heading of the bullet? Well, it is pretty straight-forward. The first step is finding a vector that represents the direction from the turret to the player.

// subtract the player position from the turret position
// to get the direction
direction = player.position - turret.position

// the above code is really just shorthand for
direction.x = player.position.x - turret.position.x
direction.y = player.position.y - turret.position.y

// when using vectors you should use a vector class that lets you write out
// operations like a - b, or a.sub(b), instead of having to subtract
// each component individually

Now we have a vector, in rectangular coordinates, that represents the direction from the turret to the player, with the magnitude of the vector being the distance between the two. If we used that vector as-is for the velocity, the bullet would always take exactly one second to reach the player, no matter how far apart the two objects were. We want to be able to control the speed of the bullet, however. In polar coordinates we could simply set the magnitude to the speed we want, but that would require converting to polar to make the change, then back to rectangular to update the bullet's position. We want to take the rectangular approach here.

// make the length of direction one
direction.normalize()

// set the velocity of the bullet
bullet.velocity = speed * direction

// remember, the above code is the same as
bullet.velocity.x = speed * direction.x
bullet.velocity.y = speed * direction.y

// since speed is just a number, not a vector, it
// scales both components equally

And that's it. The unit vector direction could also be used to rotate a sprite to face the same direction, though that will depend on how your graphics library handles sprite rotation. If the library expects an angle, you will have to extract the angle of direction using atan2(direction.y, direction.x), as in the conversion shown earlier.
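Putting the two steps together (a Python sketch with hypothetical positions; the function name is mine):

```python
import math

def bullet_velocity(turret_pos, player_pos, speed):
    """Aim a bullet from turret_pos at player_pos, travelling at the given speed."""
    dx = player_pos[0] - turret_pos[0]
    dy = player_pos[1] - turret_pos[1]
    dist = math.hypot(dx, dy)
    # normalize the direction, then scale it by the desired speed
    return (speed * dx / dist, speed * dy / dist)
```

A turret at the origin firing at a player at (3, 4) with speed 10 produces the velocity (6, 8): a unit direction of (0.6, 0.8) scaled by 10.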

This is where using rectangular coordinates really starts to make sense. You may have come across this problem before: imagine two objects moving in different directions, and you want to find the angle between their movements. Let's first look at the problem in polar coordinates. It may seem simple: just take the absolute value of one angle minus the other. For example, given angles of 50 and 90 degrees, you subtract one from the other and are left with 40 degrees. But what if one angle is 310 degrees and the other is 20 degrees? Since degrees wrap around at 360 for a full circle, 360 degrees is the same as 0 degrees, and 20 degrees is the same as 380. If we applied the basic algorithm to 20 and 310 we would get 290 degrees. However, since 20 is the same direction as 380, we can substitute 380 for 20, and using 380 and 310 we are left with 70 degrees, the actual rotation we were after. There are algorithms that handle this problem while staying in polar coordinates, but I am not going to show them here. We are interested in using rectangular coordinates.

To find the angle between vectors, we will use a useful little equation called the dot product, which is defined as follows:

// dot product
dot(a, b) = a.magnitude * b.magnitude * cos(angleBetween(a, b))

// also equal to
dot(a, b) = a.x * b.x + a.y * b.y

So how can we use this equation to find the angle between two vectors? Set the two forms of the dot product equal to each other and solve for the angle:

angleBetween(a,b) = acos((a.x * b.x + a.y * b.y) / (a.magnitude * b.magnitude))

This equation can be even simpler if we represent directions using unit vectors, since the magnitude of a unit vector is one. Keep in mind this only works if the two vectors are unit vectors.

angleBetweenUnitVectors(a, b) = acos(a.x * b.x + a.y * b.y)

// also the same as
angleBetweenUnitVectors(a, b) = acos(dot(a, b))

And there we have it, the angle between two unit vectors. This is a simple, elegant solution to the problem written as a single line of code.
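Here is that one-liner in runnable form (a Python sketch; the clamp is a small practical addition to guard against floating-point rounding pushing the dot product just outside [-1, 1]):

```python
import math

def angle_between_unit_vectors(a, b):
    """Angle in radians between unit vectors a and b; result is in [0, pi]."""
    dot = a[0] * b[0] + a[1] * b[1]
    # clamp against rounding error before taking the arc cosine
    return math.acos(max(-1.0, min(1.0, dot)))
```

The angle between (1, 0) and (0, 1) comes out as 90 degrees (pi/2 radians), and a vector against itself gives 0.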

Another common problem in games is changing an object's heading by a certain amount - for example, rotating the heading of an object by five degrees every frame. It isn't immediately apparent how to do this in rectangular coordinates, but the answer is fairly simple: you multiply the two vectors together as if they were complex numbers. In essence, 2D vectors are complex numbers where x is the real part and y is the imaginary part.

(a + bi) * (c + di) = (a*c - b*d) + (b*c + a*d)i

// or using x and y
function rotateVector(Vector2 a, Vector2 b) {
    float x = a.x * b.x - a.y * b.y;
    float y = a.x * b.y + a.y * b.x;
    return new Vector2(x, y);
}

For 2D vectors, the order of the multiplication doesn't matter: complex multiplication is commutative, so rotateVector(a, b) gives the same result as rotateVector(b, a). The result's angle is the sum of the two input angles, and its length is the product of the two lengths, so to rotate by a given angle without changing the length we multiply by a unit vector pointing in that direction:

function unitVector(float angle) {
    // this returns a unit vector in the direction of angle
    // it will always be of length 1 since, in this case,
    // magnitude^2 = cos(angle)^2 + sin(angle)^2 = 1
    return new Vector2(cos(angle), sin(angle));
}

// rotate the velocity by 10 degrees
velocity = rotateVector(velocity, unitVector(degToRads(10)))
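The same complex-number rotation in runnable form (Python sketch; the names are mine):

```python
import math

def rotate_vector(ax, ay, bx, by):
    """Multiply (ax + ay*i) by (bx + by*i); rotates a by b's angle, scales by b's length."""
    return (ax * bx - ay * by, ax * by + ay * bx)

def unit_vector(angle):
    """Unit-length vector pointing in the given direction (radians)."""
    return (math.cos(angle), math.sin(angle))

# rotate the vector (1, 0) by 90 degrees
x, y = rotate_vector(1.0, 0.0, *unit_vector(math.pi / 2))
```

Rotating (1, 0) by 90 degrees lands on (0, 1), up to floating-point rounding.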

In this situation we want to take the current heading of the object and rotate it to face a new heading. A great example is a turret that can only turn a certain number of degrees per second. We want to make sure it always turns in the most direct way: if the player is directly to the turret's left, we don't want the turret to turn 270 degrees to the right, we want it to turn 90 to the left. To do this, we will need to add one more tool to our toolbox: slerping.

Slerp is short for spherical linear interpolation. To understand how it works, first imagine two vectors both coming out of the origin, then draw a path from the end of one vector to the other. There are infinitely many ways to get from point a to point b, but we are interested in a direct path that smoothly changes the angle of the vector: a curved path between the ends of the vectors, forming a circle or a section of a circle. This is what slerping does, and it is implemented as follows:

// a and b should be unit vectors
// t is a value in the range [0, 1]
// when t = 0, slerp returns a
// when t = 1, slerp returns b
// any value in between returns a
// vector between a and b
function slerp(Vector2 vectorA, Vector2 vectorB, float t) {
    float cosAngle = dot(vectorA, vectorB);
    float angle = acos(cosAngle);
    return (vectorA * sin((1 - t) * angle) + vectorB * sin(t * angle)) / sin(angle);
}

There is a lot there to take in, and you might have to do a little more research on slerping to fully understand it. You will likely want to use an established implementation, however: the simple implementation shown here has stability issues when the angle between the vectors approaches 0 or 180 degrees, where sin(angle) approaches zero and the division blows up.

// rotates rotateFrom towards rotateTo by maxDeltaTheta radians and returns
// the result. If the angle between them is less than maxDeltaTheta then it
// simply returns rotateTo
// rotateFrom and rotateTo should both be unit vectors
function rotateTowards(Vector2 rotateFrom, Vector2 rotateTo, float maxDeltaTheta) {
    float angleBetween = acos(dot(rotateFrom, rotateTo));
    if (angleBetween <= maxDeltaTheta) {
        return rotateTo;
    } else {
        // interpolate by the fraction of the remaining angle we can cover this frame
        return slerp(rotateFrom, rotateTo, maxDeltaTheta / angleBetween);
    }
}

// and an example of how it can be used
Vector2 playerOffset = player.position - turret.position

// the input to rotateTowards needs to be a unit vector
playerOffset.normalize()

// turretDirection is a unit vector specifying where the turret is pointing
// deltaTime is a float holding the number of seconds between this frame and the last
// turretTurnRate is the turn rate of the turret in radians/second
turretDirection = rotateTowards(turretDirection, playerOffset, turretTurnRate * deltaTime)
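The pair of functions is easy to test in isolation. A Python sketch (names mine), with a small near-parallel guard added to the slerp for the stability issue mentioned above:

```python
import math

def slerp(a, b, t):
    """Spherical interpolation between unit vectors a and b, t in [0, 1]."""
    dot = max(-1.0, min(1.0, a[0] * b[0] + a[1] * b[1]))
    angle = math.acos(dot)
    if angle < 1e-6:              # nearly parallel: sin(angle) ~ 0, so just return b
        return b
    s = math.sin(angle)
    wa = math.sin((1 - t) * angle) / s
    wb = math.sin(t * angle) / s
    return (wa * a[0] + wb * b[0], wa * a[1] + wb * b[1])

def rotate_towards(cur, target, max_delta):
    """Turn unit vector cur toward unit vector target by at most max_delta radians."""
    dot = max(-1.0, min(1.0, cur[0] * target[0] + cur[1] * target[1]))
    angle = math.acos(dot)
    if angle <= max_delta:
        return target
    # cover max_delta of the remaining angle this step
    return slerp(cur, target, max_delta / angle)

# turn (1, 0) toward (0, 1), limited to 45 degrees per step
v = rotate_towards((1.0, 0.0), (0.0, 1.0), math.pi / 4)
```

Starting 90 degrees apart with a 45-degree limit, one step lands the heading halfway, at 45 degrees; a generous limit snaps straight to the target.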

You may have noticed how rotateTowards never needs any special logic to choose the turning direction: slerp always moves along the shorter arc between the two vectors, so the turret automatically turns the most direct way.

As I have created games, I have found vectors to be vastly more useful than trying to represent directions using angles. Vectors have always led me to more robust and elegant solutions that exhibit correct behavior, even in the odd cases I didn't think to try. The more familiar you get with them, the easier it becomes to manipulate a game world, allowing you to complete seemingly hard challenges simply by building on the basic vector operations. If you are serious about game development, you should understand how vectors work; they are a powerful asset to have in your toolbox. This article explained how things work in 2D space. To make the step into 3D, you will need to work with quaternions. They are a great way to represent 3D rotations and share many of the same properties as 2D unit vectors.