Original Post: Limitless Curiosity
Of the various phases of a physics engine, constraint resolution was personally the hardest for me to understand. I had to read a lot of different papers and articles to fully understand how it works, so I decided to write this article to make it easier for future me to relearn it if, for example, I forget how it works.
This article tackles the problem by working through an example and then deriving a general formula from it. So let us delve into a pretty common scenario: two of our rigid bodies collide and penetrate each other, as depicted below.
From the scenario above we can formulate:
We don't want our rigid bodies to intersect each other, so we construct a constraint requiring the penetration depth to be non-negative.
\(C: d \geq 0\)
This is an inequality constraint. We can transform it into a simpler equality constraint by solving it only when the two bodies are actually penetrating; if the two rigid bodies don't collide, we don't need any constraint resolution. So:
if \(d \geq 0\), do nothing; else if \(d < 0\), solve \(C: d = 0\)
Now we can solve this equation by calculating \(\Delta \vec{p1}, \Delta \vec{p2}, \Delta \vec{r1}\), and \(\Delta \vec{r2}\) that satisfy the constraint above. This is called the position-based method. It satisfies the constraint immediately in the current frame, but it might cause a jittery effect.
A more modern and preferable method, used in Box2D, Chipmunk, Bullet, and my own physics engine, is the impulse-based method. In this method, we derive a velocity constraint equation from the position constraint equation above.
We are working in 2D, so angular velocities and the cross products of two vectors are scalars.
Next, we need to find \(\Delta V\), or the impulse, that satisfies the velocity constraint. This \(\Delta V\) is caused by a force we call the 'constraint force'. The constraint force acts only along the direction of illegal movement, in our case the penetration normal. We don't want this force to do any work, or to contribute to or restrict motion in any legal direction.
\(\lambda\) is a scalar called the Lagrangian multiplier. To understand why the constraint force acts in the \(J^{T}\) direction (remember J is a 1 × 12 matrix, so \(J^{T}\) is a 12 × 1 matrix, i.e. a 12-dimensional vector), recall the equation for a three-dimensional plane.
Now we can draw a similarity between equation (1) and equation (2), where \(\vec{n}^{T}\) plays the role of J and \(\vec{v}\) plays the role of V. We can therefore interpret equation (1) as a plane in 12-dimensional space, with \(J^{T}\) as its normal. If a point lies off a plane, the shortest path from the point to the plane's surface is along the normal direction.
After we calculate the Lagrangian multiplier, we have a way to get back the impulse from equation(3). Then, we can apply this impulse to each rigid body.
Note that solving the velocity constraint doesn't mean that we satisfy the position constraint. When we solve the velocity constraint, there is already a violation in the position constraint. We call this violation position drift. What we achieve is stopping the two bodies from penetrating deeper (the penetration depth stops growing). It might be fine for a slow-moving object, as the position drift is not noticeable, but it becomes a problem as objects move faster. The animation below demonstrates what happens when we solve only the velocity constraint.
So instead of purely solving the velocity constraint, we add a bias term to fix any violation of the position constraint.
So what is the value of the bias? As mentioned before, we need this bias to fix the positional drift, so we want it to be proportional to the penetration depth.
This method is called Baumgarte stabilization, and \(\beta\) is the Baumgarte term. The right value for this term can differ between scenarios; we need to tweak it between 0 and 1 to find a value that keeps our simulation stable.
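To make the role of the bias concrete, here is a minimal one-dimensional sketch (the types, names, and numbers are mine, not from any particular engine): a single contact along the normal, solved with one impulse that includes a Baumgarte bias term.

```cpp
#include <cassert>
#include <cmath>

// One rigid body reduced to 1D: inverse mass and velocity along the contact normal.
struct Body1D { double invMass; double v; };

// Solve a single 1D contact with Baumgarte stabilization.
// d    = penetration depth (> 0 while overlapping)
// beta = Baumgarte term, tweaked between 0 and 1
// After the impulse, the relative normal velocity equals (beta / dt) * d,
// so the pair separates just fast enough to push out the position drift.
void solveContact1D(Body1D& a, Body1D& b, double d, double beta, double dt) {
    double jv = b.v - a.v;                           // JV: relative normal velocity
    double bias = (beta / dt) * d;                   // positional correction term
    double effMass = 1.0 / (a.invMass + b.invMass);  // inverse of J(M^-1)(J^T)
    double lambda = -(jv - bias) * effMass;          // Lagrangian multiplier
    a.v -= a.invMass * lambda;                       // apply impulse J^T * lambda
    b.v += b.invMass * lambda;
}
```

For example, with a.v = 2 and b.v = 0 (bodies approaching), d = 0.1, beta = 0.2 and dt = 1/60, the post-impulse relative velocity is (0.2 × 60) × 0.1 = 1.2, i.e. the pair now separates.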
If our world consisted of only two rigid bodies and one contact constraint, the above method would work decently. But in most games there are more than two rigid bodies, and one body can collide and penetrate two or more others, so we need to satisfy all the contact constraints simultaneously. For a real-time application, solving all these constraints simultaneously is not feasible. Erin Catto proposes a practical solution called sequential impulse. The idea is similar to projected Gauss-Seidel: we calculate \(\lambda\) and \(\Delta V\) for each constraint one by one, from constraint 1 to constraint n (n = number of constraints). After we finish iterating through the constraints and applying each \(\Delta V\), we repeat the process from constraint 1 to constraint n for a specified number of iterations. This algorithm converges to the actual solution: the more we repeat the process, the more accurate the result. In Box2D, Erin Catto set ten as the default number of iterations.
Another thing to notice is that while satisfying one constraint we might unintentionally satisfy another. Say, for example, that we have two different contact constraints on the same rigid body.
When we solve \(\dot{C1}\), we might incidentally make \(\dot{d2} \geq 0\). Remember that equation (5) is a formula for \(\dot{C}: \dot{d} = 0\), not \(\dot{C}: \dot{d} \geq 0\), so we don't need to apply it to \(\dot{C2}\) anymore. We can detect this by looking at the sign of \(\lambda\): a negative sign means the constraint is already satisfied, and using the negative lambda as an impulse would pull the bodies closer together instead of pushing them apart. It is fine for an individual \(\lambda\) to be negative, but we need to make sure the accumulated \(\lambda\) is never negative. In each iteration, we add the current lambda to normalImpulseSum, then clamp normalImpulseSum between 0 and positive infinity. The actual Lagrangian multiplier that we use to calculate the impulse is the difference between the new normalImpulseSum and the previous normalImpulseSum.
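A minimal sketch of that accumulate-and-clamp step (the variable names mirror the description above, not any particular engine):

```cpp
#include <algorithm>
#include <cassert>

// Accumulate lambda into the running sum, clamp the sum to [0, +infinity),
// and return the impulse that should actually be applied this iteration.
double accumulateAndClamp(double& normalImpulseSum, double lambda) {
    double oldSum = normalImpulseSum;
    normalImpulseSum = std::max(0.0, normalImpulseSum + lambda);
    return normalImpulseSum - oldSum;
}
```

If the sum is 1.5 and the next lambda is -2.0, the sum clamps to 0 and the applied impulse is -1.5: we remove exactly what was previously applied, never more.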
Okay, now we have successfully resolved contact penetration in our physics engine. But what about simulating objects that bounce when a collision happens? This bouncing property is called restitution. The coefficient of restitution, denoted \(C_{r}\), is the ratio of the separating speed after the collision to the closing speed before the collision.
The coefficient of restitution only affects the velocity along the normal direction, so we need to take the dot product with the normal vector.
Notice that in this specific case \(V_{initial}\) is just JV. If we look back at our constraint above, we set \(\dot{d}\) to zero because we assumed the object does not bounce back (\(C_{r}=0\)). So, if \(C_{r} \neq 0\), we can modify our constraint so that the desired velocity is \(V_{final}\) instead of 0.
We can merge our old bias term with the restitution term to get a new bias value.
// Init constraint.
// Calculate J(M^-1)(J^T). This term is constant, so we can calculate it up front.
for (int i = 0; i < constraint->numContactPoint; i++) {
    ftContactPointConstraint *pointConstraint = &constraint->pointConstraint[i];

    pointConstraint->r1 = manifold->contactPoints[i].r1 -
        (bodyA->transform.center + bodyA->centerOfMass);
    pointConstraint->r2 = manifold->contactPoints[i].r2 -
        (bodyB->transform.center + bodyB->centerOfMass);

    real kNormal = bodyA->inverseMass + bodyB->inverseMass;

    // Calculate r cross normal.
    real rnA = pointConstraint->r1.cross(constraint->normal);
    real rnB = pointConstraint->r2.cross(constraint->normal);

    // Calculate J(M^-1)(J^T).
    kNormal += (bodyA->inverseMoment * rnA * rnA +
                bodyB->inverseMoment * rnB * rnB);

    // Save the inverse of J(M^-1)(J^T).
    pointConstraint->normalMass = 1 / kNormal;

    pointConstraint->positionBias = m_option.baumgarteCoef * manifold->penetrationDepth;

    ftVector2 vA = bodyA->velocity;
    ftVector2 vB = bodyB->velocity;
    real wA = bodyA->angularVelocity;
    real wB = bodyB->angularVelocity;

    ftVector2 dv = (vB + pointConstraint->r2.invCross(wB) -
                    vA - pointConstraint->r1.invCross(wA));

    // Calculate JV.
    real jnV = dv.dot(constraint->normal);
    pointConstraint->restitutionBias = -restitution * (jnV + m_option.restitutionSlop);
}

// Solve constraint.
while (numIteration > 0) {
    for (int i = 0; i < m_constraintGroup.nConstraint; ++i) {
        ftContactConstraint *constraint = &(m_constraintGroup.constraints[i]);

        int32 bodyIDA = constraint->bodyIDA;
        int32 bodyIDB = constraint->bodyIDB;
        ftVector2 normal = constraint->normal;
        ftVector2 tangent = normal.tangent();

        for (int j = 0; j < constraint->numContactPoint; ++j) {
            ftContactPointConstraint *pointConstraint = &(constraint->pointConstraint[j]);

            ftVector2 vA = m_constraintGroup.velocities[bodyIDA];
            ftVector2 vB = m_constraintGroup.velocities[bodyIDB];
            real wA = m_constraintGroup.angularVelocities[bodyIDA];
            real wB = m_constraintGroup.angularVelocities[bodyIDB];

            // Calculate JV. dv is the relative velocity at the contact point,
            // and jnV is its component along the contact normal.
            ftVector2 dv = (vB + pointConstraint->r2.invCross(wB) -
                            vA - pointConstraint->r1.invCross(wA));
            real jnV = dv.dot(normal);

            // Calculate lambda.
            real nLambda = (-jnV + pointConstraint->positionBias / dt +
                            pointConstraint->restitutionBias) * pointConstraint->normalMass;

            // Add lambda to the accumulated normal impulse and clamp.
            real oldAccumI = pointConstraint->nIAcc;
            pointConstraint->nIAcc += nLambda;
            if (pointConstraint->nIAcc < 0) {
                pointConstraint->nIAcc = 0;
            }

            // The impulse we actually apply is the clamped difference.
            real I = pointConstraint->nIAcc - oldAccumI;

            // Calculate linear impulse.
            ftVector2 nLinearI = normal * I;

            // Calculate angular impulse.
            real rnA = pointConstraint->r1.cross(normal);
            real rnB = pointConstraint->r2.cross(normal);
            real nAngularIA = rnA * I;
            real nAngularIB = rnB * I;

            // Apply linear impulse.
            m_constraintGroup.velocities[bodyIDA] -= constraint->invMassA * nLinearI;
            m_constraintGroup.velocities[bodyIDB] += constraint->invMassB * nLinearI;

            // Apply angular impulse.
            m_constraintGroup.angularVelocities[bodyIDA] -= constraint->invMomentA * nAngularIA;
            m_constraintGroup.angularVelocities[bodyIDB] += constraint->invMomentB * nAngularIB;
        }
    }
    --numIteration;
}
In this article, we have learned how to resolve contact penetration by defining it as a constraint and solving it. But this framework is not only useful for contact penetration: we can do many more cool things with constraints, such as implementing hinge joints, pulleys, springs, etc.
Here is the step-by-step process of constraint resolution:
This marks the end of this article. Feel free to ask if something is still unclear. And please inform me if there are inaccuracies in my article. Thank you for reading.
NB: Box2D uses sequential impulse, but it no longer uses Baumgarte stabilization; it uses full NGS (nonlinear Gauss-Seidel) to resolve position drift. Chipmunk still uses Baumgarte stabilization.
References
Physics simulations commonly use the Coulomb friction model, which requires a coefficient of friction between the materials of two interacting objects to calculate the friction force. Similarly, a coefficient of restitution is required to calculate the collision response force. But how do you determine these coefficients for each pair of materials?
A common approach is to define values of friction and restitution for each individual material and then combine them to get the material-material values. Given individual material friction values \(x\) and \(y\), we need to define a combination function \(f(x,y)\). Similarly, \(r(x,y)\) for restitution (using \(x\) and \(y\) to mean the material restitution values this time).
The values of \(f(x,y)\) and \(r(x,y)\) should lie between \(x\) and \(y\); in particular, \(f(x,x) = x\) and \(r(x,x) = x\).
For any constant \(c\), \(f(x,c)\) and \(r(x,c)\) should be monotonically increasing (they shouldn't ever decrease as \(x\) increases). They should also avoid long flat sections, so as to be able to discriminate between materials. For instance, putting boxes of ice and rubber on a plane and increasing the slope, the ice should slip first, whether the plane is made of ice or rubber.
\(f(x,y)\) should favour slippy over grippy, i.e. \(f(0,1) \lt 0.5\). This corresponds to the intuitive notion that ice should have a greater effect on the combined friction value than rubber, e.g. vehicle tyres on ice.
I decided that I wanted \(r(x,y)\) to favour the extremes, in a similar way that \(f(x,y)\) should favour values towards \(0\). So very bouncy or very energy-absorbing materials predominate over average ones. In a game world, you then have the option of making the world less or more bouncy for all objects within it by changing the world material, while maintaining a reasonably wide range of restitution values when the world is set to an averagely bouncy material.
Candidates for \(f\) that are often used are the minimum, the arithmetic average, and the geometric average.
\(f_{min}(x,y) = min(x,y)\)
This has too many flat sections and doesn't adequately discriminate between materials, e.g. \(f(0.1,0.1) = f(0.1,1)\).
\(f_{aa}(x,y) = \frac{x + y}{2}\)
This doesn't favour slippy over grippy enough.
\(f_{ga}(x,y) = \sqrt{{x}{y}}\)
This is good, but the values aren't equally spaced. For this reason, I have found an alternative function - a weighted sum of \(x\) and \(y\).
\(f_{ws}(x,y) = \frac{{x}{w_{x}} + {y}{w_{y}}}{w_{x} + w_{y}}\) where \(w_{x} = \sqrt{2}(1 - x) + 1\)
\(f_{ws}\) has a more regular spacing between values than \(f_{ga}\). The trade-off is that it doesn't cover the full range of values for \(f(x,1)\) (\(f(0,1) \approx 0.3\)). However, they are approximately equal for \(0.2 \le x \le 1\).
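A sketch of \(f_{ws}\) in code (my own transcription of the formula above; the function name is mine):

```cpp
#include <cassert>
#include <cmath>

// Weighted-sum friction combination: slippier materials get larger weights,
// so the combined value is pulled towards the slippier of the two.
double combineFriction(double x, double y) {
    double wx = std::sqrt(2.0) * (1.0 - x) + 1.0;
    double wy = std::sqrt(2.0) * (1.0 - y) + 1.0;
    return (x * wx + y * wy) / (wx + wy);
}
```

Note that combineFriction(x, x) == x for any x, and combineFriction(0, 1) ≈ 0.293, below 0.5, so slippy beats grippy as required.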
As with friction, the minimum, the arithmetic average, and the geometric average are often used.
\(r_{min}\) has the same objections as \(f_{min}\).
\(r_{ga}\) is not suitable because it would require the world material to have a restitution near \(1\) to provide a wide range of combined values.
\(r_{aa}\) is better, but it doesn't give a very wide range of values for \(r(x,0.5)\). We can improve on this range by allowing some flatness and defining \(r\) as a piecewise min/max/sum function.
\(r_{pmms} = \begin{cases}min(x,y) & \text{if }x \lt 0.5 \land y \lt 0.5\\max(x,y) & \text{if }x \gt 0.5 \land y \gt 0.5\\x + y - 0.5 & \text{otherwise}\end{cases}\)
This has the same shortcomings as \(r_{min}\) at the corners where the min and max functions are used. But you can't avoid that if you want to have the full range of values for \(r(x,0.5)\) and still satisfy \(r(x,x) = x\). As with friction, I have created a weighted sum function that I think is better.
\(r_{ws}(x,y) = \frac{{x}{w_{x}} + {y}{w_{y}}}{w_{x} + w_{y}}\) where \(w_{x} = \sqrt{2}\left\vert{2x - 1}\right\vert + 1\)
As with \(f_{ws}\), \(r_{ws}\) sacrifices some of the range for \(r(x,0.5)\) to provide better continuity in the corners.
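A sketch of \(r_{ws}\) in code (again my own transcription of the formula above, with an illustrative name):

```cpp
#include <cassert>
#include <cmath>

// Weighted-sum restitution combination: materials far from 0.5 (very bouncy
// or very energy-absorbing) get larger weights and so dominate average ones.
double combineRestitution(double x, double y) {
    double wx = std::sqrt(2.0) * std::fabs(2.0 * x - 1.0) + 1.0;
    double wy = std::sqrt(2.0) * std::fabs(2.0 * y - 1.0) + 1.0;
    return (x * wx + y * wy) / (wx + wy);
}
```

For example, combineRestitution(0.9, 0.5) ≈ 0.77: the bouncy material dominates the average one, as intended.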
The constant \(\sqrt{2}\) is the choice that maximizes the function's range while maintaining monotonicity. These graphs show what \(f_{ws}\) looks like with \(w_{x} = c(1 - x) + 1\), for \(c = 0.5\), \(c = 1\), \(c = \sqrt{2}\), \(c = 2\), and \(c = 4\). For \(c \lt \sqrt{2}\), \(f_{ws}(x,1)\) has less range. For \(c \gt \sqrt{2}\), \(f_{ws}\) is no longer monotonic: near \(f_{ws}(0.1,0.9)\) it starts to curve back, making a more grippy material have a lower combined friction! You can see this more clearly for higher values of \(c\).
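The cut-off can be checked numerically. The helper below (mine, using the \(w_{x} = c(1 - x) + 1\) notation above) confirms that \(c = 4\) breaks monotonicity near \(f_{ws}(0.1, 0.9)\) while \(c = \sqrt{2}\) does not:

```cpp
#include <cassert>
#include <cmath>

// f_ws with a configurable weight constant c, for experimenting
// with the trade-off between range and monotonicity.
double combineFrictionC(double x, double y, double c) {
    double wx = c * (1.0 - x) + 1.0;
    double wy = c * (1.0 - y) + 1.0;
    return (x * wx + y * wy) / (wx + wy);
}
```

With c = 4, raising one material's friction from 0.8 to 0.9 (against a 0.1 partner) lowers the combined value; with c = √2 it rises, as it should.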
The method to combine material friction and restitution values may seem like a minor detail, but sometimes the devil's in the details.
Since the whole concept of defining values for friction and restitution for a single material and then combining them isn't physically accurate, this is obviously just a case of finding a suitable function rather than attempting to model reality. You may have different requirements for your functions, but I hope this discussion of the alternatives is useful.
I made the graphs at the WolframAlpha web site. I find it's a pretty useful tool. Here's an example plot.
The video below contains the playlist for all the videos in this series, which can be accessed via the playlist icon at the top of the embedded video frame. The first video in the series is loaded automatically.
2D physics libraries like Box2D and Chipmunk are fine things, but not for every game. Sometimes they're total overkill. Other times, technical constraints force you to go your own way. Sometimes you just want to avoid the well-worn path trod by countless Limbo and Angry Birds knock-offs. Whatever the reason, here are some algorithms to get you started should you opt to do it yourself.
BACKGROUND
Having worked on a physics-library-based space shooter, I was acutely aware of the difficulties. First of all, you need convex polygons. Our spaceships weren't convex, so we had to divide them into a bunch of little triangles using a fancy algorithm. When the ships collided, the triangles would fly apart and snap back together as if held together with elastic, sometimes with part of the other ship wedged in between - not the effect we had in mind. And then there were the issues with "bullet tunneling", and the general pain of storing part of our objects' state in a black box - even if it is an open-source black box.
If I had to do it all over again, I'd use simple bounding-circle collisions. That's exactly what the MARS guys did.
After that ordeal, and with HTML5 coming into vogue a few years back, I decided to make a nice, simple, slow-paced, sailing ship battle game... in JavaScript. I tried a JS port of the Chipmunk physics engine, but it was just too slow. So, I looked up the algorithms I really needed, and implemented them in about 100 lines of JavaScript. And when HTML5 soured on me and I switched to C++ for my next game, porting the code was a piece of cake. I love working this way.
DESIGN
There are three kinds of objects in my naval combat game:
1. Ships - a few slow-moving convex polygons
2. Cannonballs - many fast-moving point particles
3. Land masses - a few large static polygons
I only need to check for ship-vs-ship, ship-vs-cannonball, and ship-vs-land collisions. What are the algorithms for those?
1. Ship vs Cannonball - polygon vs line segment.
2. Ship vs Land - point vs polygon. Shorelines are fuzzy and imprecise, so for this purpose I treat ships as point particles.
3. Ship vs Ship - polygon vs polygon, specifically convex hulls (literally!)
There's no need for joints, polygon stacking, or polygon-vs-polygon continuous collision detection (Box2D creator Erin Catto wrote some great papers about that; it's rather advanced for DIYers but definitely worth reading).
I also use bounding boxes for first-pass collision detection. That's just an optimization, and it's easy, so I'll leave it as an exercise to the reader.
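For completeness, that first pass is just an overlap test on axis-aligned bounding boxes. A minimal sketch, in C++ for brevity (types and names are mine):

```cpp
#include <cassert>

// Axis-aligned bounding box for the broad phase.
struct AABB { double minX, minY, maxX, maxY; };

// Two boxes overlap iff their intervals overlap on both the x and y axes.
bool aabbOverlap(const AABB& a, const AABB& b) {
    return a.minX <= b.maxX && b.minX <= a.maxX &&
           a.minY <= b.maxY && b.minY <= a.maxY;
}
```

Only pairs that pass this cheap test go on to the exact (and more expensive) tests below.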
Potentially I might need a spatial partitioning scheme (e.g. quadtrees) if I create an MMO with huge sprawling maps. I haven't done that, and it could get complicated, so I'll leave it at that and move on.
ALGORITHMS
A few notes on the following code:
- This is working JavaScript code from my game, with only minor edits for clarity.
- Porting to C, C++, and similar languages should be very direct in most cases.
- Polygons are stored as flat interleaved arrays: [x1,y1, x2,y2, ...]
- Points are stored as [x,y] arrays for consistency with polygons. (Yes, p[0] is a little harder to type than p.x but I got used to it and I prefer it now. And anyway, it'll be a moot point if/when array-oriented languages come back into vogue.)
One piece of advice: if you intend to roll your own physics engine, allow yourself a few weeks to iron it out (maybe months, if you have ambitious goals). Study the algorithms. Code up some demos. Step through the algorithms in a debugger. Get comfortable with the math. Look for other algorithms that might suit your particular needs better than these ones.
LINE VS LINE
This is the standard line intersection formula, useful for all kinds of things.
function lineIntersection(a,b, c,d){
var v,u,det,r,s;
v= [ d[0]-c[0], d[1]-c[1] ];
u= [ b[0]-a[0], b[1]-a[1] ];
det= u[0]*v[1] - u[1]*v[0]; // zero when the segments are parallel
r= ((a[1]-c[1])*v[0] - (a[0]-c[0])*v[1]) / det;
s= ((a[1]-c[1])*u[0] - (a[0]-c[0])*u[1]) / det;
// Parallel segments yield NaN/Infinity, so the range test below returns false.
return (0<=r && r<=1 && 0<=s && s<=1);
}
LINE VS POLYGON
To detect when a cannonball hits a ship, trace a line from the cannonball's previous position (last frame) to its current position, and test to see if it crossed any line segment of the ship's hull. This is reasonably accurate even if the framerate drops to a miserable 10 fps. This is an easy form of Continuous Collision Detection that's ideal for CCD's main use cases: bullets and beam weapons.
// a, b = [x,y]
// edges = [x1,y1, x2,y2, ...]
//
function linePolygonIntersection(a,b, edges){
var c,d,i;
d= edges.slice(edges.length-2);
for(i=0; i<edges.length; i+=2){
c= d;
d= [ edges[i], edges[i+1] ];
if(lineIntersection(a,b, c,d)) return true;
}
return false;
}
POINT VS POLYGON
// point = [x,y]
// polygon = [x1,y1, x2,y2, ...]
//
function isPointInPolygon(point, polygon){
var tx= point[0], ty=point[1];
var a;
var b= polygon.slice(polygon.length-2);
var yflag0= (b[1] >= ty);
var yflag1;
var inside_flag=false;
for(var j=0; j<polygon.length; j+=2){
a= b;
b= [ polygon[j], polygon[j+1] ];
yflag1= (b[1] >= ty);
// Does this edge straddle the test point's y value?
if(yflag0 != yflag1){
// If so, check which side of the edge the point lies on,
// and toggle the inside flag when the +X ray crosses the edge.
if( ((b[1]-ty)*(a[0]-b[0]) >= (b[0]-tx)*(a[1]-b[1])) == yflag1 ){
inside_flag= !inside_flag;
}
}
yflag0= yflag1;
}
return inside_flag;
}
POLYGON VS POLYGON
I'm using the Separating Axis Theorem (SAT) algorithm. It's not "the best" but it's relatively simple and fast enough for most purposes.
Right: if we project along every line of both ship polygons, we always get an overlap (no separation), so the ships have collided. The axis with the smallest overlap (red) provides the 'separation vector' which tells us exactly how far we must move one ship to resolve the collision state. We could also move each ship half that amount, or whatever makes sense.
Left: there is at least one separating axis, therefore the ships have not collided.
// Algorithm:
// For each side, compute its "axis" (normal vector), B. Normalize it.
// Project each vertex, of both polygons, onto the axis.
// Note: dot(A,B) is an adequate approximation of proj(A,B) for this purpose.
// Keep track of min and max values for each polygon.
// If there's an axis with no overlap - a SEPARATING AXIS - we can stop.
// If the min/max values overlap on all sides, we have a collision.
// Use the smallest overlap to separate the shapes.
//
// P1,P2 = two *convex* polygons in [x,y, x,y, ...] format
// cx = canvas context for debug drawing (optional)
//
function collidePolyPoly(P1, P2, cx){
var collision= true;
var dmin= -Infinity;
var collaxis= null;

function _projectVerts(P1, P2, flip){
var A=P1, B=P2;
var v1;
var v2= A.slice(A.length-2);
for(var i=0; i<A.length; i+=2){
v1= v2;
v2= [ A[i], A[i+1] ];
// Axis = normalized normal of edge v1->v2.
var ax= -(v2[1]-v1[1]), ay= v2[0]-v1[0];
var len= Math.sqrt(ax*ax + ay*ay);
ax/= len;  ay/= len;
// Project every vertex of both polygons onto the axis,
// keeping track of the min and max for each polygon.
var minA= Infinity, maxA= -Infinity, minB= Infinity, maxB= -Infinity;
for(var j=0; j<A.length; j+=2){
var p= A[j]*ax + A[j+1]*ay;
if(p<minA) minA= p;
if(p>maxA) maxA= p;
}
for(var j=0; j<B.length; j+=2){
var p= B[j]*ax + B[j+1]*ay;
if(p<minB) minB= p;
if(p>maxB) maxB= p;
}
// Signed separation: negative means the projections overlap.
var d= Math.max(minB-maxA, minA-maxB);
if(d>=0){
collision= false; // separating axis found - no collision
return;
}
if(d>dmin){ // smallest overlap so far
dmin= d;
collaxis= flip? [-ax,-ay] : [ax,ay];
}
}
}

_projectVerts(P1, P2, false);
if(collision) _projectVerts(P2, P1, true);
// On collision, collaxis scaled by dmin is the separation vector.
return collision? [collaxis[0]*dmin, collaxis[1]*dmin] : null;
}
NOTES
I mentioned that I'm using similar algorithms in my current C++ sidescroller project. One advantage over physics libraries is that I'm able to use the same structure-of-arrays data for collisions, skeleton animation, and rendering.
These old algorithms are heavy on Structures-of-Arrays, a staple of modern CPU/GPU/cache optimization and Data Oriented Design. That was a happy accident. The code felt questionable to me at first, I hadn't heard of DOD, and optimization would have been premature at the time. It was only when I started porting to C++/OpenGL that I realized how well these algorithms fit today's hardware.
While backporting recent improvements from C++ to JS, I used interleaved arrays to overcome JS's lack of pointers. For example, by storing splines as flat arrays (ax1,ay1, ax2,ay2, ax3,ay3, ...) I can refer to any spline node, point, or x/y value by an array index, which eliminates a bit of slow/cumbersome indirection.
SOURCES
Line Intersection formula from the comp.graphics.algorithms FAQ (section 1.03)
Point vs Polygon
ptinpoly.c (by Eric Haines, in Graphics Gems IV) (I used the last algorithm, "CrossingsMultiplyTest")
SAT visual explanation + Flash demo
SAT tutorial with formulas + VB code
Related article on gamedev.net: Shooting At Stuff (how to aim at moving targets)
ARTICLE UPDATE LOG
29 Nov 2015: Initial release
2 Dec 2015: Formatting fixes
Leading the Target https://www.gamedev.net/articles/programming/math-and-physics/leading-the-target-r4223/
Where should we aim if we want to hit a moving target with a finite-speed projectile? This is one of the recurrent questions from beginner game developers. If we naively aim at the target's current position, by the time our projectile gets there the target will have moved, so we need to aim ahead of the target's current position. The technical name for the technique is "deflection", but most people use "leading the target". Wikipedia page here.
There are several variations of the problem, where perhaps the projectile is a missile that needs to accelerate, or where the shooter is a turret that is currently aiming in some direction and needs time to aim somewhere else... We'll cover the simple case first, and then we'll present a general template for solving variations of the problem.
Plain-vanilla deflection
Let's assume the shooter can aim and shoot anywhere instantly, the target is moving at a constant velocity and the projectile will travel at a constant velocity too. We are given as inputs the target's current position, its velocity and the speed of our projectile. We'll use coordinates where the shooter is at the origin and has zero velocity.
First-order correction
As mentioned before, if we naively aim at the target's current position, by the time the projectile gets there, the target will have moved. We can compute how long it will take for the projectile to get to the target's current position, compute where the target will be then and aim there instead.
Position compute_first_order_correction(Position target_position, Vector target_velocity, float projectile_speed) {
float t = distance(Origin, target_position) / projectile_speed;
return target_position + t * target_velocity;
}
This simple piece of code is probably good enough in many cases (if the target is moving slowly compared to the projectile speed, if the target is moving perpendicularly to the shooter-to-target vector, or if we want to sometimes miss because a more precise solution would be detrimental to the fun of the game).
Iterative approximation
For a more precise solution, you could iterate this first-order correction until it converges.
Position iterative_approximation(Position target_position, Vector target_velocity, float projectile_speed) {
float t = 0.0f;
for (int iteration = 0; iteration < MAX_ITERATIONS; ++iteration) {
float old_t = t;
t = distance(Origin, target_position + t * target_velocity) / projectile_speed;
if (std::fabs(t - old_t) < EPSILON)
break;
}
return target_position + t * target_velocity;
}
Computing the answer directly
In the iterative approximation, we would stop if we found a place where old_t and t match. This gives us an equation to solve:
t = distance(Origin, target_position + t * target_velocity) / projectile_speed
Let's do some computations to try to solve it.
t = sqrt(dot_product(target_position + t * target_velocity, target_position + t * target_velocity)) / projectile_speed
t^2 * projectile_speed^2 = dot_product(target_position + t * target_velocity, target_position + t * target_velocity)
t^2 * projectile_speed^2 = dot_product(target_position, target_position) + 2 * t * dot_product(target_position, target_velocity) + t^2 * dot_product(target_velocity, target_velocity)
This is a second-degree equation in t which we can easily solve, leading to the following code:
// a*x^2 + b*x + c = 0
float first_positive_solution_of_quadratic_equation(float a, float b, float c) {
    if (std::fabs(a) < 1e-6f) {
        // Degenerate (linear) case: happens when the target's speed
        // equals the projectile's speed.
        if (std::fabs(b) < 1e-6f)
            return -1.0f; // Indicate there is no solution
        float x = -c / b;
        return (x > 0.0f) ? x : -1.0f;
    }
    float discriminant = b*b - 4.0f*a*c;
    if (discriminant < 0.0f)
        return -1.0f; // Indicate there is no solution
float s = std::sqrt(discriminant);
float x1 = (-b-s) / (2.0f*a);
if (x1 > 0.0f)
return x1;
float x2 = (-b+s) / (2.0f*a);
if (x2 > 0.0f)
return x2;
return -1.0f; // Indicate there is no positive solution
}
Position direct_solution(Position target_position, Vector target_velocity, float projectile_speed) {
float a = dot_product(target_velocity, target_velocity) - projectile_speed * projectile_speed;
float b = 2.0f * dot_product(target_position, target_velocity);
float c = dot_product(target_position, target_position);
float t = first_positive_solution_of_quadratic_equation(a, b, c);
if (t <= 0.0f)
return Origin; // Indicate we failed to find a solution
return target_position + t * target_velocity;
}
The general case
There are many variations of the problem we could consider: Accelerating targets, accelerating projectiles, situations where it takes time to aim at a new direction... All of them can be solved following the same template.
The things that could change can be encoded in two functions:
- position_of_target_at(time)
- time_to_hit(position)
All we are really doing is finding a time t at which the following equation holds:
t = time_to_hit(position_of_target_at(t))
We then compute where the target will be at time t and aim there.
Just as before, we could do a first-order correction, use iterative approximation or solve the problem directly. It might be the case that an analytical solution can be found, like we did in the previous section, but things can get messy quickly and you may have to resort to a numerical solution.
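The template above can be sketched as a generic fixed-point iteration (the function names follow the pseudocode; the concrete target and projectile models passed in are placeholders you would swap for your own):

```cpp
#include <cassert>
#include <cmath>
#include <functional>

struct Vec2 { double x, y; };

// Fixed-point iteration for t = time_to_hit(position_of_target_at(t)).
// Converges when time_to_hit changes slowly relative to the target's motion.
double solve_intercept_time(const std::function<Vec2(double)>& position_of_target_at,
                            const std::function<double(Vec2)>& time_to_hit) {
    double t = 0.0;
    for (int i = 0; i < 50; ++i) {
        double old_t = t;
        t = time_to_hit(position_of_target_at(t));
        if (std::fabs(t - old_t) < 1e-9)
            break;
    }
    return t;
}
```

For a constant-velocity target starting at (10, 0) with velocity (0, 5) and a projectile of speed 10, this converges to the same t ≈ 1.1547 that the direct quadratic solution gives.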
Conclusion
This article covered three methods to implement deflection in your games: First-order correction, iterative approximation and directly finding the solution. You'll need to use some judgement to decide which one to use. Hopefully this article gives you enough to make an informed decision.
Article Update Log
30 Oct 2015: Initial release
4 Nov 2015: Minor fixes
2D Transforms 101 https://www.gamedev.net/articles/programming/math-and-physics/2d-transforms-101-r4212/
A presentation/tutorial on 2D transforms for programmers and practitioners favouring intuition over mathematical rigour; animations are used to illustrate the effect of every transform explained.
Click Here to view presentation.
Firefox, Chrome or Opera recommended as they support animations; see Requirements below for details. Press ? while in the presentation for controls and Esc for an overview and quick navigation. Comments and suggestions are welcome.
Overview
Transformations are a general mathematical concept that is applicable to anyone working in:
- Computer Graphics
- Animation
- Game Programming (2D and 3D)
- Image Processing
- Motion Capture
- UI Design
- Computer Vision
- Robotics
- Aeronautics
- Parallel Computing
Concepts discussed are dimension-independent; it's just easier to explain and visualize things in 2D, but they're applicable to higher dimensions without loss of generality.
Instead of dealing only with elementary transforms (rotate, scale, translate) on points, as most resources do, it also covers:
- Basic math behind transforms (without matrices)
- Matrix introduced past the basics since it's just a tool
- Composite transforms -- concatenation of multiple transforms
- Anti-commutativity
- Transforms about an arbitrary origin
- Active and passive viewpoints of transforms
- Transformation of coordinate systems
- Mapping between multiple coordinate systems
- Hierarchical Transforms
Requirements
The presentation was hand-crafted using HTML5, CSS3, JavaScript and SVG, so nothing more than a browser is required to view it. Firefox, Chrome or Opera, even if you have an older version, is highly recommended, since support for SVG animations isn't that great in other browsers; with browsers like Safari, Edge or IE you will not be able to see these animations -- caveat lector.
You can also view the presentation on a mobile or a tablet; the content, without any loss of fidelity, should get resized and fit to given form factor, thanks to vector graphics and CSS3. Touch (tap and swipe left/right) is supported both for navigation and animation control.
Animations
Transformations are better understood visually than with just dry theory. Hence every transformation workout is accompanied, along with the math, by an animation. Images with a graph sheet background contain animations. Simply click on such an image to cycle through the animation. Every click changes the image's state, animating it into a new figure. When you see the original figure you started with, it denotes the end of all the animation the image contains.
If you're using a keyboard to navigate the presentation, clicking an image steals focus from the presentation and sets it on the image; simply click anywhere outside the image to give focus back to the presentation. Now press space / -> to continue with the presentation.
Solution
The solution to the problem of mapping boy space to street space, posed at the end of the presentation, is in map_boy_to_street.md.
SVG
The demonstrations/interactive animations embedded within the presentation were done in SVG, the XML-based, open standard file format for vector graphics with support for animation. In plain English, all it has are instructions like move to, line to, rect, circle, ellipse, etc. in a very readable XML format and no unintelligible binary data. So a reader (yes, a human one too) can easily read and understand it, and a renderer can render it at any resolution without any loss in fidelity. The presentation's first slide has a 3D-transformed background which too is an SVG -- it should show up as something similar to this; a simple check to see how well your browser supports SVGs and CSS3 transforms.
It's highly recommended that you fiddle with the SVGs under the images directory. The SVG format is very similar to PostScript (which also has commands like move to, line to, etc.) and is an excellent Hello World test bed (a Short, Self-Contained, Correct Example) for learning transformations or 2D graphics in general. They also have a tag for grouping, which may be used to learn hierarchical transformations. An SVG is also far more readable than a PS, PDF or XAML file. Just open it in a (modern) browser to view it (no, not Edge, it doesn't do 3D CSS3 transforms in SVGs yet :), open it in your favourite text editor, muck around, save and refresh your browser to see the changes immediately; rinse and repeat.
Credits
- Computer Graphics using OpenGL, Francis Hill and Stephen Kelley
- 3-D Computer Graphics, Samuel R. Buss
- Essential Math for Games and Interactive Applications, James Van Verth and Lars Bishop
- 3D Math Primer, Fletcher Dunn and Ian Parberry
- reveal.js, the presentation framework, generously shared under the MIT licence by Hakim El Hattab
- MathJax for rendering beautiful math equations on any browser, American Mathematical Society
- Elementary affine transforms chart shared under CC 3.0, used as first slide background, CM Lee
- What is or isn't a linear transform, shared under CC 3.0, Ldo
- Reveal.js on Github pages, Vasko Zdravevski
Mon, 02 Nov 2015 00:40:00 +0000
Math for Game Developers: Advanced Vectors
https://www.gamedev.net/articles/programming/math-and-physics/math-for-game-developers-advanced-vectors-r4181/
Math for Game Developers is exactly what it sounds like - a weekly instructional YouTube series wherein I show you how to use math to make your games. Every Thursday we'll learn how to implement one game design, starting from the underlying mathematical concept and ending with its C++ implementation. The videos will teach you everything you need to know; all you need is a basic understanding of algebra and trigonometry. If you want to follow along with the code sections, it will help to know a bit of programming already, but it's not necessary. You can download the source code that I'm using from GitHub, from the description of each video. If you have questions about the topics covered or requests for future topics, I would love to hear them! Leave a comment, or ask me on my Twitter, @VinoBS.
The video below contains the playlist for all the videos in this series, which can be accessed via the playlist icon at the top of the embedded video frame. The first video in the series is loaded automatically.
Advanced Vectors
[playlist=PLW3Zl3wyJwWMcLmUYXMIIxCiLKGOWLETh]
Wed, 16 Sep 2015 06:19:00 +0000
Particle Systems using Constrained Dynamics
https://www.gamedev.net/articles/programming/math-and-physics/particle-systems-using-constrained-dynamics-r4170/
Simulating physics can be fairly complex. Spatial motion (vehicles, projectiles, etc.), friction, collision, explosions, and other types of physical interactions are complicated enough to describe mathematically, but making them accurate when computed adds another layer on top of that. Making it run in real time adds even more complexity. There is lots of active research into quicker and more accurate methods. This article is meant to showcase a really interesting way to simulate particles with constraints in a numerically stable way. As well, I'll try to break down the underlying principles so it's more understandable by those who forgot their physics.
Note: the method presented in this article is described in the paper "Interactive Dynamics" by Witkin, Gleicher, and Welch, published by the ACM in 1990.
A posted PDF of the paper can be found here: http://www.cs.cmu.edu/~aw/pdf/interactive.pdf
A link to another article by Witkin on this subject can be found here: https://www.cs.cmu.edu/~baraff/pbm/constraints.pdf
Physical Theory
Newton's Laws
Everyone's familiar with Newton's second law: \( F = ma\). It forms the basis of Newtonian mechanics. It looks very simple by itself, but usually it's very hard to deal with Newton's laws because of the number of equations involved. The number of ways a body can move in space is called the degrees of freedom. For full 3D motion, we have 6 degrees of freedom for each body and thus need 6 equations per body to solve for the motion. For ease of explaining this method, we will consider translations only, but it can be extended to rotations as well.
We need to devise an easy way to build and compute this system of equations. For a point mass moving in 3D, we can set up the general equations as a matrix equation:
\[ \left [ \begin{matrix} m_1 & 0 & 0 \\ 0 & m_1 & 0 \\ 0 & 0 & m_1 \\ \end{matrix} \right ] \left [ \begin{matrix} a_{1x} \\ a_{1y} \\ a_{1z} \\ \end{matrix} \right ] = \left [ \begin{matrix} F_{1x} \\ F_{1y} \\ F_{1z} \\ \end{matrix} \right ]\]
This can obviously be extended to include accelerations and net forces for many particles as well. The abbreviated equation is:
\[ M \ddot{q} = F \]
where \(M\) is the mass matrix, \(\ddot{q}\) is acceleration (the second time derivative of position), and \(F\) is the sum of all the forces on the body.
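To make this concrete, here is a minimal sketch (the helper name `solveAcceleration` is illustrative, not from the paper): for a single point mass the mass matrix is diagonal, so solving \( M \ddot{q} = F \) reduces to a component-wise division.

```cpp
#include <array>

// For a single 3D point mass the mass matrix is m * Identity, so
// solving M * a = F for the acceleration is a component-wise division.
std::array<double, 3> solveAcceleration(double mass,
                                        const std::array<double, 3>& force)
{
    return { force[0] / mass, force[1] / mass, force[2] / mass };
}
```

The same idea scales to many particles: with a diagonal (or block-diagonal) mass matrix, each particle's acceleration can be computed independently.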
Motivating Example
One of the problems with computing with Newton-Euler methods is that we have to compute all the forces in the system to understand how the system will evolve, or in other words, how the bodies will move with respect to each other. Let's take a simple example of a pendulum.
Technically, we have a force on the wall by the string, and a force on the ball by the string. In this case, we can reduce it to the forces shown and solve for the motion of the ball. Here, we have to figure out how the string is pulling on the ball ( \( T = mg \cos{\theta}\) ), and then break it into components to get the forces in each direction. This yields the following:
\[ \left [ \begin{matrix} m & 0 \\ 0 & m \\ \end{matrix} \right ] \left [ \begin{matrix} \ddot{q}_x \\ \ddot{q}_y \\ \end{matrix} \right ] = \left [ \begin{matrix} -T\sin{\theta} \\ -mg+T\cos{\theta} \\ \end{matrix} \right ]\]
We can then model this motion without needing to use the string. The ball can simply exist in space and move according to this equation of motion.
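As a rough sketch of integrating these equations of motion directly, using the article's tension \( T = mg \cos{\theta}\) (which holds near rest; a moving pendulum would also need the centripetal term), with hypothetical names:

```cpp
#include <cmath>

// Explicit-Euler integration of the pendulum equations of motion above,
// with the article's tension T = m*g*cos(theta). The string itself is
// never modeled; the ball just moves under these forces.
struct State { double x, y, vx, vy; };

State stepPendulum(State s, double m, double g, double dt)
{
    double theta = std::atan2(s.x, -s.y);          // angle from the vertical
    double T = m * g * std::cos(theta);            // string tension
    double ax = (-T * std::sin(theta)) / m;        // x equation of motion
    double ay = (-m * g + T * std::cos(theta)) / m;// y equation of motion
    s.vx += ax * dt;  s.vy += ay * dt;             // integrate velocity
    s.x  += s.vx * dt; s.y  += s.vy * dt;          // integrate position
    return s;
}
```

Releasing the mass from rest at a small angle, it accelerates back toward the vertical, as expected.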
Constraints and Constraint Forces
One thing we've glossed over without highlighting in the pendulum example is the string. We were able to ignore the fact that the mass is attached to the string, so does the string actually do anything in this example? Well, yes and no. The string provides the tension to hold up the mass, but anything could do that. We could have had a rod or a beam hold it up. What it really does is define the possible paths the mass can travel on. The motion of the mass is dictated, or constrained, by the string. Here, the mass is traveling on a circular path about the point where the pendulum hangs on the wall. Really, anything that constrains the mass to this path with no additional work can do this. If the mass was a bead on a frictionless circular wire with the same radius, we would get the same equations of motion!
If we rearrange the pendulum's equations of motion, we can illustrate a point:
\[ \left [ \begin{matrix} m & 0 \\ 0 & m \\ \end{matrix} \right ] \left [ \begin{matrix} \ddot{q}_x \\ \ddot{q}_y \\ \end{matrix} \right ] = \left [ \begin{matrix} 0 \\ -mg \\ \end{matrix} \right ] + \left [ \begin{matrix} -T\sin{\theta} \\ T\cos{\theta} \\ \end{matrix} \right ]\]
In our example, the only applied force on the mass is gravity. That is represented as the first term on the right hand side of the equation. So what's the second term? That is the constraint force, or the other forces necessary to keep the mass constrained to the path. We can consider that a part of the net forces on the mass, so the modified equation is:
\[ M_{ij} \ddot{q}_j = Q_j + C_j \]
where \(M\) is the mass matrix, \(\ddot{q}\) is acceleration (the second time derivative of position), \(Q\) is the sum of all the applied forces on the body, and \(C\) is the sum of all the constraint forces on the body. Let's note as well that the mass matrix is basically diagonal. It's definitely sparse, so that can work to our advantage later when we're working with it.
Principle of Virtual Work
The notion of adding constraint forces can be a bit unsettling because we are adding more forces to the body, which you would think would change the energy in the system. However, if we take a closer look at the pendulum example, we can see that the tension in the string is acting perpendicular (orthogonal) to the motion of the mass. If the constraint force is orthogonal to the displacement, then there is no additional work being done on the system, meaning no energy is being added to the system. This is called d'Alembert's principle, or the principle of virtual work.
Believe it or not, this is a big deal! This is one of the key ideas in this method. Normally, springs are used to create the constraint forces to help define the motion between objects. For this pendulum example, we could treat the string as a very stiff spring. As the mass moves, it may displace a small amount from the circular path (due to numerical error). Then the spring force will move the mass back toward the constraint. As it does this, it may overshoot a little or a lot! In addition, the spring constants can be very high, creating what are aptly named stiff equations. This causes the numerical integration to take unnecessarily tiny time steps where normally larger ones would suffice. These problems are well-known in the simulation community and many techniques have been created to avoid making the equations of motion stiff.
As illustrated above, as long as the constraint forces don't do any work on the system, we can use them. There are lots of combinations of constraint forces that can be used that satisfy d'Alembert's principle, but we can illustrate a simple way to get those forces.
Witkin's Method
Constraints
Usually a simulation has a starting position \(q = \left [ \begin{matrix} x_1 & y_1 & z_1 & x_2 & y_2 & z_2 & \cdots \\ \end{matrix} \right ] \) and velocity \(\dot{q} = \left [ \begin{matrix} \dot{x}_1 & \dot{y}_1 & \dot{z}_1 & \dot{x}_2 & \dot{y}_2 & \dot{z}_2 & \cdots \\ \end{matrix} \right ] \). The general constraint function is based on the state \(q(t)\) and possibly on time as well: \(c(q(t))\). The constraints need to be implicit, meaning that the constraints should be an equation that equals zero. For example, the 2D implicit circle equation is \(x^2 + y^2 - R^2 = 0\).
Remember there are multiple constraints to take into consideration. The vector that stores them will be denoted in index notation as \(c_i\). Taking the total derivative with respect to time gives:
\[\dot{c}_i=\frac{d}{dt}c_i(q(t),t)=\frac{\partial c_i}{\partial q_j}\dot{q}_j+\frac{\partial c_i}{\partial t}\]
The first term in this equation is actually a matrix, the Jacobian of the constraint vector \(J = \partial c_i/\partial q_j\), left-multiplied to the velocity vector. The Jacobian is made of all the partial derivatives of the constraints. The second term is just the partial time derivative of the constraints.
Differentiating again to get \(\ddot{c_i}\) yields:
\[\ddot{c}_i = \frac{\partial c_i}{\partial q_j} \ddot{q}_j + \frac{\partial \dot{c}_i}{\partial q_j} \dot{q}_j + \frac{\partial^2 c_i}{\partial t^2}\]
Looking at the results, the first term is the Jacobian of the constraints multiplied by the acceleration vector. The second term is actually the Jacobian of the time derivative of the constraint. The third term is the second partial time derivative of the constraints.
The formulas for the complicated terms, like the Jacobians, can be calculated analytically ahead of time. As well, since the constraints are position constraints, the second time derivatives are accelerations.
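For the 2D circle constraint mentioned earlier, these terms can be written down by hand. A small sketch (names are illustrative) showing that \(\dot{c} = J\dot{q}\) vanishes when the velocity is tangent to the circle, i.e. a legal velocity:

```cpp
#include <array>

// Circle constraint c(q) = x^2 + y^2 - R^2 and its Jacobian row
// J = dc/dq = [2x, 2y]; cdot = J * qdot evaluates the velocity constraint.
double constraintC(double x, double y, double R) { return x*x + y*y - R*R; }

std::array<double, 2> constraintJ(double x, double y) { return { 2*x, 2*y }; }

double constraintCdot(double x, double y, double vx, double vy)
{
    auto J = constraintJ(x, y);
    return J[0] * vx + J[1] * vy;   // row of J times the velocity vector
}
```

A tangent velocity gives \(\dot{c} = 0\) (legal motion); a radial velocity gives a nonzero \(\dot{c}\) (illegal motion off the circle).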
Newton's Law with Constraints
If we solve the matrix Newton's Law equation for the accelerations, we get:
\[\ddot{q}_j = W_{jk}\left ( C_k + Q_k \right )\]
where \(W = M^{-1}\), the mass matrix inverse. If we were to replace this with the acceleration vector from our constraint equation, we would get the following:
\[\frac{\partial c_i}{\partial q_j} W_{jk}\left ( C_k + Q_k \right ) + \frac{\partial \dot{c}_i}{\partial q_j} \dot{q}_j + \frac{\partial^2 c_i}{\partial t^2} = 0\]
\[JW(C+Q) + \dot{J} \dot{q} + c_{tt} = 0\]
Here, the only unknowns are the constraint forces. From our discussion before, we know that the constraint forces must satisfy the principle of virtual work. As we said before, the forces need to be orthogonal to the displacements, or the legal paths. We will take the gradient of the constraint path to get vectors orthogonal to the path. The reason why this works will be explained later. Since the constraints are placed in a vector, the gradient of that vector would be the Jacobian matrix of the constraints: \(\partial c/\partial q\). Although the row vectors of the matrix have the proper directions to make the dot product with the displacements zero, they don't have the right magnitudes to force the masses to lie on the constraint. We can construct a vector of scalars that will multiply with the row vectors to make the magnitudes correct. These are known as Lagrange multipliers. This would make the equation for the constraint forces as follows:
\[C_j = \lambda_i \frac{\partial c_i}{\partial q_j}, \quad \text{or in matrix form} \quad C = J^T \lambda\]
Plugging that equation back into the augmented equation for Newton's law:
\[ \left ( -JWJ^T \right ) \lambda = JWQ + \dot{J}\dot{q} + c_{tt}\]
Note that the only unknowns here are the Lagrange multipliers.
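With a single constraint, \((-JWJ^T)\) is a 1x1 "matrix", so the solve is one division. A sketch for a bead constrained to a circle under gravity (names are illustrative): \(J = [2x, 2y]\), \(\dot{J} = [2\dot{x}, 2\dot{y}]\), \(W = (1/m)I\), \(Q = (0, -mg)\), and \(c_{tt} = 0\).

```cpp
// Scalar specialization of (-J W J^T) lambda = J W Q + Jdot qdot + c_tt
// for one bead on a circle: J = [2x, 2y], Jdot = [2vx, 2vy],
// W = (1/m) I, applied force Q = (0, -m*g), c_tt = 0.
double solveLambda(double x, double y, double vx, double vy,
                   double m, double g)
{
    double JWQ    = (1.0 / m) * (2*y * (-m * g));   // J W Q
    double JdotQd = 2 * (vx*vx + vy*vy);            // Jdot . qdot
    double JWJt   = (1.0 / m) * (4 * (x*x + y*y));  // J W J^T
    return -(JWQ + JdotQd) / JWJt;
}
```

At the bottom of the circle at rest, the resulting constraint force \(C = J^T \lambda\) exactly cancels gravity, which is what the string tension would do.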
Attempt at an Explanation of the Constraint Force Equation
If you're confused at how Witkin got that equation for the constraint forces, that's normal. I'll attempt to relate it to something easier to visualize and understand: surfaces. Let's take a look at the equation of a quadric surface:
\[Ax^2+By^2+Cz^2+Dxy+Eyz+Fxz+Gx+Hy+Iz+J=0\]
The capital letters denote constants. Notice also that the equation is implicit. We can see that the equation for an ellipsoid is a quadric surface:
\[f(x,y,z) = (1/a^2)x^2+(1/b^2)y^2+(1/c^2)z^2-1=0\]
For a point (x,y,z) to be on the surface, it must satisfy this equation. To put it into more formal math terms, we could say the surface takes a point in \(\mathbb{R}^3\) and maps it to the zero vector in \(\mathbb{R}\), which is just 0. Any movement on this surface is "legal" because the new point will still satisfy the surface equation. If we were to take the gradient of this ellipsoid equation, we'd get:
\[ \left [ \begin{matrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \\ \frac{\partial f}{\partial z} \\ \end{matrix} \right ] = \left [ \begin{matrix} \frac{2x}{a^2} \\ \frac{2y}{b^2} \\ \frac{2z}{c^2} \\ \end{matrix} \right ] \]
This vector is normal to the surface at the given (x,y,z) coordinates. If we were to visualize a plane tangent to the ellipsoid at that point, the dot product of the normal and any vector in the tangent plane would be zero by definition.
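A quick numeric check of this orthogonality (the helper name is hypothetical):

```cpp
#include <array>

// Gradient of the implicit ellipsoid f = x^2/a^2 + y^2/b^2 + z^2/c^2 - 1.
// At a surface point it is normal to the surface, so its dot product
// with any tangent direction is zero.
std::array<double, 3> ellipsoidGradient(double x, double y, double z,
                                        double a, double b, double c)
{
    return { 2*x / (a*a), 2*y / (b*b), 2*z / (c*c) };
}
```

For the unit sphere at (1, 0, 0), the gradient is (2, 0, 0) and any tangent direction such as (0, 1, 0) dots to zero with it.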
With this same type of thinking, we can see the constraints as a type of algebraic surface since they're also implicit equations (it's hard to visualize these surfaces geometrically since they can be n-dimensional). Just like with the geometry example, if we were to take the gradient of the constraints, the resulting vector would be orthogonal to a tangent plane on the surface. In more formal math terms, the constraints/surfaces can be called the null space since they contain all the points (vectors) that map to the zero vector. The gradients/normals to these constraints are termed the null space complement. The constraint force equation produces vectors that lie in the null space complement and are therefore orthogonal to the constraint surface.
The purpose of these math terms is to help generalize this concept for use in situations where the problems are not easy to visualize or intuitively understand.
Calculating the State
With these equations in place, the process of calculating the system state can now be summed up as follows:
- Construct the \(W\), \(J\), \(\dot{J}\), \(c_{tt}\) matrices.
- Multiply and solve the \(Ax=b\) problem for the Lagrange multipliers \(\lambda\).
- Compute the constraint forces using \(C = J^T \lambda\).
- Compute the accelerations using \(\ddot{q} = W(C+Q)\).
- Integrate the accelerations to get the new velocities \( \dot{q}(t) \) and positions \( q(t) \).
This process can be optimized to take advantage of sparsity in the matrices. The Jacobian matrix will generally be sparse since each constraint typically depends on only a few particles. This can help with the matrix multiplication on both sides. The main challenge will be in building these matrices.
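The five steps above can be sketched end-to-end for a bead constrained to a circle of radius R under gravity; everything is scalar here because there is a single constraint (names are illustrative, and symplectic Euler integration is used for simplicity):

```cpp
// One full Witkin step for a bead on a circle under gravity:
// build J and Jdot, solve for lambda, form C = J^T lambda, compute
// accelerations qddot = W (Q + C), then integrate.
struct Bead { double x, y, vx, vy; };

void witkinStep(Bead& b, double m, double g, double dt)
{
    double Jx  = 2*b.x,  Jy  = 2*b.y;   // Jacobian row of c = x^2 + y^2 - R^2
    double Jdx = 2*b.vx, Jdy = 2*b.vy;  // time derivative of that row
    double Qx  = 0.0,    Qy  = -m * g;  // applied force: gravity
    double W   = 1.0 / m;               // inverse mass (diagonal, scalar here)

    double JWJt = W * (Jx*Jx + Jy*Jy);  // (J W J^T), a 1x1 "matrix"
    double rhs  = W * (Jx*Qx + Jy*Qy) + (Jdx*b.vx + Jdy*b.vy);
    double lambda = -rhs / JWJt;        // solve (-J W J^T) lambda = rhs

    double Cx = Jx * lambda, Cy = Jy * lambda;  // constraint force J^T lambda
    b.vx += W * (Qx + Cx) * dt;         // qddot = W (Q + C)
    b.vy += W * (Qy + Cy) * dt;
    b.x  += b.vx * dt;
    b.y  += b.vy * dt;
}
```

Starting the bead at (R, 0) at rest, it swings down like a pendulum while staying close to the circle, with only the small drift the Feedback section below addresses.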
Feedback
Due to numerical error, there will be a tendency to drift away from the proper solution. Recall that in order to derive the modified Newton's law equation above, we forced \( c_i = 0 \) and \( \dot{c_i} = 0 \). Numerical error can produce solutions that won't satisfy these equations within a certain tolerance, so we can use a method used in control systems engineering: the PID loop.
In most systems we want to control, there are inputs to the system (force, etc.) and measurements of a system's state (angle, position, etc.). A PID loop feeds the error in the actual and desired state back into the force inputs to drive the error to zero. For example, the human body has many different inputs to the force on the muscles when we stand upright. The brain measures many different things to see if we're centered on our feet or if we're teetering to one side or another. If we're falling or we're off-center, the brain makes adjustments to our muscles to stay upright. A PID loop does something similar by measuring the error and feeding that back into the inputs. If done correctly, the PID system will drive the error in the measured state to zero by changing the inputs to the system as needed.
Here, we use the error in the constraint and the constraint derivative as feedback into the system to better control the numerical drift. We augment the forces by adding terms that account for \(c_i = 0\) and \(\dot{c_i} = 0\):
\[ F_j = Q_j + C_j + \alpha c_i \frac{\partial c_i}{\partial q_j} + \beta \dot{c_i} \frac{\partial c_i}{\partial q_j} \]
This isn't a true force being added, since these extra terms vanish when the constraints are satisfied. They are just there to inhibit numerical drift, so the constants \(\alpha\) and \(\beta\) are "magic", meaning that they are determined empirically to see what "fits" better.
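A sketch of these feedback terms for the circle constraint (hypothetical names; note that with the sign convention written above, \(\alpha\) and \(\beta\) would be chosen negative so the terms push back toward \(c = 0\)):

```cpp
// Stabilizing feedback force alpha*c*(dc/dq) + beta*cdot*(dc/dq) for the
// circle constraint c = x^2 + y^2 - R^2. alpha and beta are the empirical
// "magic" constants from the article.
struct Force2 { double x, y; };

Force2 feedbackForce(double x, double y, double R,
                     double vx, double vy,
                     double alpha, double beta)
{
    double c    = x*x + y*y - R*R;       // constraint value (should be 0)
    double Jx   = 2*x, Jy = 2*y;         // dc/dq
    double cdot = Jx*vx + Jy*vy;         // dc/dt
    double s = alpha * c + beta * cdot;  // combined feedback scale
    return { s * Jx, s * Jy };
}
```

On the constraint with a legal (tangent) velocity the feedback is exactly zero; off the constraint, a negative \(\alpha\) produces a force pointing back toward the surface.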
Conclusion
Witkin's method for interactive simulations is pretty widely applicable. Although it can obviously be used for movable truss structures and models of Tinkertoys, he also talks about using it for deformable bodies, non-linear optimization and keyframe animation as well. There are lots of applications of this method. Hopefully this showcase of Witkin's method will help make this interesting solution more accessible to anyone building any type of simulation engine.
Article Update Log
27 Aug 2015: Initial release
Thu, 27 Aug 2015 18:39:13 +0000
Math for Game Developers: Probability and Randomness
https://www.gamedev.net/articles/programming/math-and-physics/math-for-game-developers-probability-and-randomness-r4114/
Math for Game Developers is exactly what it sounds like - a weekly instructional YouTube series wherein I show you how to use math to make your games. Every Thursday we'll learn how to implement one game design, starting from the underlying mathematical concept and ending with its C++ implementation. The videos will teach you everything you need to know; all you need is a basic understanding of algebra and trigonometry. If you want to follow along with the code sections, it will help to know a bit of programming already, but it's not necessary. You can download the source code that I'm using from GitHub, from the description of each video. If you have questions about the topics covered or requests for future topics, I would love to hear them! Leave a comment, or ask me on my Twitter, @VinoBS.
The video below contains the playlist for all the videos in this series, which can be accessed via the playlist icon at the top of the embedded video frame. The first video in the series is loaded automatically.
Probability and Randomness
[playlist=PLW3Zl3wyJwWOkXKMWvuH5y_41_SjBNVAd]
Thu, 09 Jul 2015 18:27:08 +0000
PID Control of Physics Bodies
https://www.gamedev.net/articles/programming/math-and-physics/pid-control-of-physics-bodies-r3885/
This article discusses an approach to making physics bodies rotate smoothly, whether they are stationary or actively moving. I used cocos2d-x and Box2D, but the basic approach will work for any physics body (even a 3D one if you are trying to rotate in a 2D plane).
The approach uses a Proportional-Integral-Derivative (PID) control loop on the angle of the body to apply torque to it in a well controlled and predictable manner. Two examples of using the approach are shown in the video, one for a missile that can only move in the direction it is facing and another for an "Entity" like a game character that moves independent of its facing direction.
The video below shows this in action. You don't have to watch the whole video to get the idea, but there is a lot in there to see...
Facing Direction
Whether your game is in 2-D or 3-D, you often have the need to make an object "turn" to face another direction. This could be a character's walking direction as they are moving, the direction they are shooting while sitting crouched, the direction a missile is flying in, the direction a car is racing towards, etc. This is the job of the "character controller", the piece of code in your system responsible for the basic "movement" operations that a character must undergo (seek, turn, arrive, etc.).
Building games that use physics engines is a lot of fun and adds a level of realism to the game that can dramatically improve the gameplay experience. Objects collide, break, spin, bounce, and move in more realistic ways.
Moving in more realistic ways is not what you usually think about, though, when you think about facing direction. Your usual concern is something like "The character needs to turn to face left in 0.5 seconds." From the standpoint of physics, this means you want to apply torque to make it turn 90° left in 0.5 seconds. You want it to stop exactly on the spot. You don't want to worry about things like angular momentum, which will tend to keep it turning unless you apply counter-torque. You really don't want to think about applying counter-torque to make it stop "on a dime". Box2D will allow you to manually set the position and angle of a body. However, if you manually set the position and angle of a physics body in every frame, it can interfere (in my experience) with the collision response of the physics engine.
Most important of all, this is a physics engine. You should be using it as such to make bodies move as expected.
Our goal is to create a solution to change the facing direction of the body by applying turning torque to it.
NOTE THAT PID CONTROL LOOPS CAN BE USED FOR LOTS OF OTHER THINGS. This example just happens to be the one I chose.
If we decouple the problem of "how it turns" from "how it moves", we can use the same turning solution for other types of moving bodies where the facing direction needs to be controlled. For this article, we are considering a missile that is moving towards its target.
Here, the missile is moving in a direction and has a given velocity. The angle of the velocity is measured relative to the x-axis. The "facing direction" of the missile is directly down the nose and the missile can only move forward. We want to turn it so that it is facing towards the target, which is at a different angle. For the missile to hit the target, it has to be aiming at it. Note that if we are talking about an object that is not moving, we can just as easily use the angle of the body relative to the x-axis as the angle of interest.
Feedback Control Systems 101
The basic idea behind a control system is to take the difference of "what you want the value to be" and "what the value is" and adjust your input to the system so that, over time, the system converges to your desired value.
From this wikipedia article:
A familiar example of a control loop is the action taken when adjusting hot and cold faucets (valves) to maintain the water at a desired temperature. This typically involves the mixing of two process streams, the hot and cold water. The person touches the water to sense or measure its temperature. Based on this feedback they perform a control action to adjust the hot and cold water valves until the process temperature stabilizes at the desired value.
There is a huge body of knowledge in controls system theory. Polynomials, poles, zeros, time domain, frequency domain, state space, etc. It can seem daunting to the uninitiated. It can seem daunting to the initiated as well! That being said, while there are more "modern" solutions to controlling the facing direction, we're going to stick with PID Control. PID control has the distinct advantages of having only three parameters to "tune" and a nice intuitive "feel" to it.
PID Control
Let's start with the basic variable we want to "control", the difference between the angle we want to be facing and the angle of the body/velocity:
\(e(t) = desired - actual\)
Here, \(e(t)\) is the "error". We want to drive the error to 0. We apply torque to the body to make it turn so that \(e(t)\) moves towards 0. To do this, we create a function \(f(.)\), feed \(e(t)\) into it, and apply torque to the body based on the result. Torque makes bodies turn:
\(torque(t) = I * f(e(t)), \quad I \equiv \text{Angular Inertia} \)
Proportional Feedback
The first and most obvious choice is to apply a torque that is proportional to the \(e(t)\) itself. When the error is large, large force is applied. When the error is small, small force is applied. Something like:
\(f(e(t)) = K_p * e(t)\)
And applying the torque with just proportional feedback will work. Somewhat. The problem is that when the error is small, the corrective torque is small as well. So as the body turns and \(e(t)\) gets small (nearing the intended angle), the retarding torque is small also. So the body overshoots and goes past the intended angle. Then it starts to swing back and eventually oscillates about a steady state. If the value \(K_p\) is not too large, the oscillation should settle into a "damped" (exponentially decaying) sinusoid, dropping a large amount each oscillation (stable solution). It may also spiral off towards infinity (unstable solution), or just oscillate forever around the target point (marginally stable solution). If you reduce \(K_p\) so that it is not moving so fast, then when \(e(t)\) is large, you don't have a lot of driving force to get moving.
A pure proportional error also has a bias (Steady State Error) that keeps the final output different from the input. The error is a function of \(K_p\) of the form:
\(\text{Steady State Error} = \frac{desired}{1 + \text{constant} \cdot K_p} \)
So increasing the \(K_p\) value makes the bias smaller (good). But this will also make it oscillate more (bad).
Integral Feedback
The next term to add is the integral term, the "I" in PID:
\(f(e(t)) = K_p * e(t) + \int\limits_{-\infty}^{now} K_i * e(t) \, dt \)
For each time step, if \(e(t)\) has a constant value, the integral term will work to counter it:
- If direction to the target suddenly changes a small amount, then over each time step, this difference will build up and create turning torque.
- If there is a bias in the direction (e.g. Steady State Error), this will accumulate over the time steps and be countered.
The integral term works to counter any constant offset being applied to the output. At first, it works a little but over time, the value accumulates (integrates) and builds up, pushing more and more as time passes.
We don't have to calculate the actual integral. We probably don't want to anyway since it stretches back to \(-\infty\) and an error back in the far past should have little effect on our near term decisions.
We can estimate the integral over the short term by summing the value of \(e(t)\) over the last several cycles and multiplying by the time step (Euler Integration) or some other numerical technique. In the code base, the Composite Simpson's Rule technique was used.
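A minimal running estimate of the integral term over a finite history, using a simple rectangle rule rather than the Composite Simpson's Rule the code base uses (the class name is illustrative):

```cpp
#include <cstddef>
#include <deque>

// Running estimate of Ki * integral(e dt) over the last maxHistory
// samples. A rectangle-rule sketch; the article's code base uses the
// Composite Simpson's Rule instead.
class IntegralEstimator
{
public:
    IntegralEstimator(double ki, double dt, std::size_t maxHistory)
        : _ki(ki), _dt(dt), _maxHistory(maxHistory) {}

    double update(double error)
    {
        _errors.push_back(error);
        if (_errors.size() > _maxHistory)
            _errors.pop_front();             // forget the far past
        double sum = 0.0;
        for (double e : _errors) sum += e;
        return _ki * sum * _dt;              // Ki * approximate integral
    }

private:
    double _ki, _dt;
    std::size_t _maxHistory;
    std::deque<double> _errors;
};
```

A constant error accumulates each step until the history window fills, after which the estimate saturates, mirroring the "ignore the far past" behavior described above.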
Derivative Feedback
Most PID controllers stop at the "PI" version. The proportional part gets the output swinging towards the input and the integral part knocks out the bias or any steady external forces that might be countering the proportional control. However, we still have oscillations in the output response. What we need is a way to slow down the rotation as the body is heading towards the target angle. The proportional and integral components work to push towards it. By looking at the derivative of \(e(t)\), we can estimate its value in the near term and apply force to drive it towards not changing. This is a counter-torque to the proportional and integral components:
\(f(e(t)) = K_p * e(t) + \int\limits_{-\infty}^{now} K_i * e(t) dt + K_d * \frac{de(t)}{dt}\)
Consider what happens when \(e(t)\) is oscillating. Its behavior is like a sine function. The derivative of this is a cosine function, and its maximum occurs when \(e(t) = 0\). That is to say, the derivative is largest when \(e(t)\) is swinging through the position we want to achieve. Conversely, when the oscillation is at its edge, about to change direction, its rate of change passes through zero, so the derivative is smallest there. So the derivative term will apply counter-torque hardest when the body is swinging through the point we want to be at, countering the oscillation, and least when we are at either edge of the "swing".
Just like the integral, the derivative can be estimated numerically. This is done by taking differences over the last several \(e(t)\) values (see the code).
Using derivative control is not usually a good idea in real control systems. Sensor noise can make it appear as if \(e(t)\) is changing rapidly back and forth, causing the derivative to spike back and forth with it. However, in our case, unless we are looking at a numerical issue, we should not have a problem.
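Putting the three terms together, a compact discrete sketch of the full controller, \(f(e) = K_p e + K_i \sum e \, \Delta t + K_d (e - e_{prev})/\Delta t\). This is a simplified stand-in, not the article's PIDController class:

```cpp
// Minimal discrete PID update. The integral is a plain running sum and
// the derivative a backward difference; the article's code base uses
// fancier numerical estimates for both.
class SimplePID
{
public:
    SimplePID(double kp, double ki, double kd, double dt)
        : _kp(kp), _ki(ki), _kd(kd), _dt(dt) {}

    double update(double error)
    {
        _integral += error * _dt;            // accumulate the I term
        double derivative = _first ? 0.0 : (error - _prev) / _dt;
        _first = false;
        _prev  = error;
        return _kp * error + _ki * _integral + _kd * derivative;
    }

private:
    double _kp, _ki, _kd, _dt;
    double _integral = 0.0;
    double _prev     = 0.0;
    bool   _first    = true;
};
```

The output of `update` would then be scaled by the body's angular inertia and applied as torque each physics step.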
Classes and Sequences
Because we are software minded, we want to wrap whatever algorithm we use for a PID controller into a convenient package, give it a clean interface, and hide everything except what the user needs. This needs to be "owned" by the entity that is doing the turning.
The MovingEntityInterface represents a "Moving Entity". In the case of this demo, it can be an entity like a Missile, which moves forward only, or a "character", which can turn while moving. While they have different methods internally for "applying thrust" they both have nearly identical methods for controlling turning. This allows the implementation of a "seek" behavior tailored more to the entity type.
The interface itself is generic so that the MainScene class can own an instance and manipulate it without worrying about what type it is.
The PIDController class itself has this interface:
/********************************************************************
* File : PIDController.h
* Project: Interpolator
*
********************************************************************
* Created on 10/13/13 By Nonlinear Ideas Inc.
* Copyright (c) 2013 Nonlinear Ideas Inc. All rights reserved.
********************************************************************
* This software is provided 'as-is', without any express or implied
* warranty. In no event will the authors be held liable for any
* damages arising from the use of this software.
*
* Permission is granted to anyone to use this software for any
* purpose, including commercial applications, and to alter it and
* redistribute it freely, subject to the following restrictions:
*
* 1. The origin of this software must not be misrepresented; you must
* not claim that you wrote the original software. If you use this
* software in a product, an acknowledgment in the product
* documentation would be appreciated but is not required.
* 2. Altered source versions must be plainly marked as such, and
* must not be misrepresented as being the original software.
* 3. This notice may not be removed or altered from any source
* distribution.
*/
#ifndef __Interpolator__PIDController__
#define __Interpolator__PIDController__
#include "CommonSTL.h"
#include "MathUtilities.h"
/* This class is used to model a Proportional-
* Integral-Derivative (PID) Controller. This
* is a mathematical/control system approach
* to driving the state of a measured value
* towards an expected value.
*
*/
class PIDController
{
private:
double _dt;
uint32 _maxHistory;
double _kIntegral;
double _kProportional;
double _kDerivative;
double _kPlant;
vector<double> _errors;
vector<double> _outputs;
enum
{
MIN_SAMPLES = 3
};
/* Given two sample outputs and
* the corresponding inputs, make
* a linear prediction a time step
* into the future.
*/
double SingleStepPredictor(
double x0, double y0,
double x1, double y1,
double dt) const
{
/* Given y0 = m*x0 + b
* y1 = m*x1 + b
*
* Solve for m, b
*
* => m = (y1-y0)/(x1-x0)
* b = y1-m*x1
*/
assert(!MathUtilities::IsNearZero(x1-x0));
double m = (y1-y0)/(x1-x0);
double b = y1 - m*x1;
double result = m*(x1 + dt) + b;
return result;
}
/* This function is called whenever
* a new input record is added.
*/
void CalculateNextOutput()
{
if(_errors.size() < MIN_SAMPLES)
{ // We need a certain number of samples
// before we can do ANYTHING at all.
_outputs.push_back(0.0);
}
else
{ // Estimate each part.
size_t errorSize = _errors.size();
// Proportional
double prop = _kProportional * _errors[errorSize-1];
// Integral - Use Extended Simpson's Rule
double integral = 0;
for(uint32 idx = 1; idx < errorSize-1; idx+=2)
{
integral += 4*_errors[idx];
}
for(uint32 idx = 2; idx < errorSize-1; idx+=2)
{
integral += 2*_errors[idx];
}
integral += _errors[0];
integral += _errors[errorSize-1];
integral *= _dt/3; // Extended Simpson's Rule: (h/3) * weighted sum
integral *= _kIntegral;
// Derivative
double deriv = _kDerivative * (_errors[errorSize-1]-_errors[errorSize-2]) / _dt;
// Total P+I+D
double result = _kPlant * (prop + integral + deriv);
_outputs.push_back(result);
}
}
public:
void ResetHistory()
{
_errors.clear();
_outputs.clear();
}
void ResetConstants()
{
_kIntegral = 0.0;
_kDerivative = 0.0;
_kProportional = 0.0;
_kPlant = 1.0;
}
PIDController() :
_dt(1.0/100),
_maxHistory(7)
{
ResetConstants();
ResetHistory();
}
void SetKIntegral(double kIntegral) { _kIntegral = kIntegral; }
double GetKIntegral() { return _kIntegral; }
void SetKProportional(double kProportional) { _kProportional = kProportional; }
double GetKProportional() { return _kProportional; }
void SetKDerivative(double kDerivative) { _kDerivative = kDerivative; }
double GetKDerivative() { return _kDerivative; }
void SetKPlant(double kPlant) { _kPlant = kPlant; }
double GetKPlant() { return _kPlant; }
void SetTimeStep(double dt) { _dt = dt; assert(_dt > 100*numeric_limits<double>::epsilon());}
double GetTimeStep() { return _dt; }
void SetMaxHistory(uint32 maxHistory) { _maxHistory = maxHistory; assert(_maxHistory >= MIN_SAMPLES); }
uint32 GetMaxHistory() { return _maxHistory; }
void AddSample(double error)
{
_errors.push_back(error);
while(_errors.size() > _maxHistory)
{ // If we got too big, remove the history.
// NOTE: This is not terribly efficient. We
// could keep all this in a fixed size array
// and then do the math using the offset from
// the beginning and modulo math. But this
// gets complicated fast. KISS.
_errors.erase(_errors.begin());
}
CalculateNextOutput();
}
double GetLastError() { size_t es = _errors.size(); if(es == 0) return 0.0; return _errors[es-1]; }
double GetLastOutput() { size_t os = _outputs.size(); if(os == 0) return 0.0; return _outputs[os-1]; }
virtual ~PIDController()
{
}
};
This is a very simple class to use. You set it up by calling the SetKXXX functions as needed, set the time step for integration, and call AddSample(...) each update cycle with the error term.
Looking at the Missile class, which owns an instance of this, the step update (called in Update) looks like this:
void ApplyTurnTorque()
{
Vec2 toTarget = GetTargetPos() - GetBody()->GetPosition();
float32 angleBodyRads = MathUtilities::AdjustAngle(GetBody()->GetAngle());
if(GetBody()->GetLinearVelocity().LengthSquared() > 0)
{ // Body is moving
Vec2 vel = GetBody()->GetLinearVelocity();
angleBodyRads = MathUtilities::AdjustAngle(atan2f(vel.y,vel.x));
}
float32 angleTargetRads = MathUtilities::AdjustAngle(atan2f(toTarget.y, toTarget.x));
float32 angleError = MathUtilities::AdjustAngle(angleBodyRads - angleTargetRads);
_turnController.AddSample(angleError);
// Negative Feedback
float32 angAcc = -_turnController.GetLastOutput();
// This is as much turn acceleration as this
// "motor" can generate.
if(angAcc > GetMaxAngularAcceleration())
angAcc = GetMaxAngularAcceleration();
if(angAcc < -GetMaxAngularAcceleration())
angAcc = -GetMaxAngularAcceleration();
float32 torque = angAcc * GetBody()->GetInertia();
GetBody()->ApplyTorque(torque);
}
Nuances
If you look carefully at the video, there is a distinct difference in the way path following works for the missile vs. the character (called the MovingEntity in the code). The missile can overshoot the path easily, especially when its maximum turn rate is reduced.
The MovingEntity always moves more directly towards the points because it is using a "vector feedback" of its position vs. the target position to adjust its velocity. This is more like a traditional "seek" behavior than the missile.
I have also, quite deliberately, left out a bit of key information on how to tune the constants for the PID controller. There are numerous articles on Google for how to tune a PID control loop, and I have to leave something for you to do, after all.
You will also note that the default value for _dt, the time step, is set to 0.01 seconds. You can adjust this value to match the timestep you actually intend to use, but there will be tradeoffs in the numerical simulation (roundoff error, system bandwidth concerns, etc.) that you will encounter. In practice, I use the same controller with the same constants across multiple differently sized physical entities without tweaking, and the behavior seems realistic enough (so far) that I have not had to go hunting for minor tweaks to parameters. Your mileage may vary.
The source code for this, written in cocos2d-x/C++, can be found on github here. The PIDController class has no dependencies other than standard libraries and should be portable to any system.
Article Update Log
10 Dec 2014: Deconflated torque and force and other textual updates to make article more clear.
3 Nov 2014: Text correction in "Integral" section.
29 Oct 2014: Initial release

Capsule Inertia Tensor
Although there are robust algorithms that calculate body mass, center of mass and inertia tensor from an input triangle mesh, there are several reasons why alternatives should be used when possible. Probably the first one is speed. When working with triangles we end up with an O(n) algorithm, as we need to process each of the n triangles. Calculating a mathematically defined body such as a capsule gives us an O(1) algorithm. Secondly, we often want to create a rigid body for our physics simulation but have only one mesh, a very detailed one for rendering. In such cases it is inconvenient to construct a special mesh for physics calculations, and much easier to pass a few arguments to a function for a specific geometric primitive. Last, and less significant (at least for games), is accuracy. A mesh that consists of triangles will never truly have round parts on its surface. This goes for spheres, cylinders, capsules and many other primitives, so if we pass our capsule as a mesh we will end up with slightly incorrect results. The analytical method presented here provides accurate results.
It is assumed that the reader has insight into the theory behind inertia tensors and/or has an existing physics system that can make use of such a matrix. Knowledge of multivariate calculus is also assumed.
Concept
Firstly, what is meant by saying capsule? Simply put, it is a cylinder that has hemispherical ends instead of flat ones. By cylinder I mean a circular one (not elliptical), and both hemispheres have the same radius as the cylinder. It should also be mentioned that the given body is solid (not hollow).
Here H is the height of the cylinder part and R is the radius of the hemisphere caps. The coordinate system has its origin at the center of mass of the capsule, and its axes are the ones we will be calculating moments of inertia for.
The approach we will use to calculate the inertia tensor can be seen in Figure 2. The body will be decomposed into three parts: upper hemisphere, lower hemisphere and cylinder. Inertia tensors will be calculated for each of them and then summed to produce the complete capsule tensor. Below, in Cartesian coordinates, are the well-known equations for the moments and products of inertia needed for the inertia tensor in general.
\[ \large I_{xx} = \int_V (y^2+z^2)\,dm \quad I_{yy} = \int_V (x^2+z^2)\,dm \quad I_{zz} = \int_V (x^2+y^2)\,dm \quad I_{xy} = -\int_V xy\,dm \;\text{(etc.)} \qquad \text{(eq. 1)} \]
V represents the three-dimensional region of integration and dm is an infinitesimal amount of mass at some point in our capsule. The integrand is simply the squared distance from the axis.
\[ \large I = \begin{bmatrix} I_{xx} & 0 & 0 \\ 0 & I_{yy} & 0 \\ 0 & 0 & I_{zz} \end{bmatrix} \qquad \text{(eq. 2)} \]
Note that in eq. 2 the products of inertia are equal to zero. There is a theorem that says it is possible to find three (mutually perpendicular) axes for every body such that the tensor produced from those axes has zero products of inertia. Those axes are called the principal axes. Furthermore, for bodies with constant density, an axis of rotational symmetry is a principal axis. We have three perpendicular axes of rotational symmetry for our capsule: the y axis of continuous rotational symmetry, and the x and z axes of discrete rotational symmetry. I will not go into detail about what axes of rotational symmetry are, but I recommend checking it out.
\[ \large dm = \rho\,dV, \quad \rho = 1 \;\Rightarrow\; dm = dV \qquad \text{(eq. 3)} \]
Since we will integrate over space, eq. 3 will be used. If the density is required to be some constant other than 1, the end result can simply be multiplied by that constant.
Cylinder
Before we dive into integration, here is the Cartesian coordinate system we will be using for the cylinder:
Figure 3. Cylinder and its position relative to coordinate system.
Let's start with the easiest moment of inertia to compute, the one around the y axis. As previously mentioned, the integrand is just the squared distance from the axis. With that in mind we can leave the Cartesian coordinate system and use the cylindrical one. The y axis remains unchanged. Also note that solutions will be written immediately without showing any steps.
\[ \large I_{yy}^{cy} = \int_{-H/2}^{H/2} \int_0^R \int_0^{2\pi} r^2 \cdot r \;d\theta\,dr\,dy = \frac{\pi R^4 H}{2} \qquad \text{(eq. 4)} \]
First we add up (integrate) around the circle of circumference \(2\pi r\) to get the moment of inertia of a circular curve around the y axis. After that we integrate those curves over the interval [0, R] to get the moment of inertia of a disc surface perpendicular to the y axis. Lastly those discs are added up across the [-H/2, H/2] interval to get the moment of inertia of the whole cylinder. The integral for the mass of the cylinder is very similar and follows the same intuition. It is given by:
\[ \large m_{cy} = \int_{-H/2}^{H/2} \int_0^R \int_0^{2\pi} r \;d\theta\,dr\,dy = \pi R^2 H \qquad \text{(eq. 5)} \]
Now, slightly more complicated integrals are the ones for the moments of inertia around the x and z axes. These are equal, so only one calculation is needed. Note that we are back in Cartesian coordinates for this one, and the integral is set up as if we are calculating Ixx.
\[ \large I_{xx}^{cy} = I_{zz}^{cy} = \int_V (y^2+z^2)\,dV = \pi R^2 H \left( \frac{R^2}{4} + \frac{H^2}{12} \right) \qquad \text{(eq. 6)} \]
This is it for the cylinder as I will not go into detail about intuition for this integral since there is more work to be done for hemisphere caps.
Hemisphere caps
The following is how the hemisphere is defined relative to Cartesian coordinate system for the purpose of integration:
Figure 4. Hemisphere and its position relative to coordinate system.
The easiest moment of inertia to calculate for the hemispheres is also the one around the y axis.
\[ \large I_{yy}^{hs} = \int_0^R \frac{\pi}{2} \left( R^2 - y^2 \right)^2 dy = \frac{4\pi R^5}{15} \qquad \text{(eq. 7)} \]
We are in the cylindrical coordinate system and the intuition is similar to eq. 4. The only difference is that now the radius of the disc we need to integrate is a function of the y position of the disc (\(\sqrt{R^2 - y^2}\)). Then we just integrate those discs from 0 to R and we end up with the whole hemisphere region. As with the cylinder, the integral for the mass of a hemisphere follows the same intuition and is given by:
\[ \large m_{hs} = \int_0^R \pi \left( R^2 - y^2 \right) dy = \frac{2\pi R^3}{3} \qquad \text{(eq. 8)} \]
Now, when we try to calculate the moments of inertia around the x and z axes (which are equal), things get tricky. Note that the integral is set up as if we are calculating Ixx:
\[ \large I_{xx}^{hs} = \int_0^R \left( \frac{\pi}{4}\left( R^2 - y^2 \right)^2 + \pi \left( R^2 - y^2 \right) y^2 \right) dy = \frac{4\pi R^5}{15} = \frac{2}{5} m_{hs} R^2 \qquad \text{(eq. 9)} \]
As can be seen in Figure 5, the center of mass is not at the origin of the coordinate system, but the axis for which we calculated the moment of inertia passes through the origin. In order to work with this moment of inertia (to apply Steiner's rule) we need it about an axis through the center of mass. This is where we calculate the distance b and use it in the reverse Steiner's rule:
\[ \large b = \frac{1}{m_{hs}} \int_0^R \pi \left( R^2 - y^2 \right) y \, dy = \frac{3R}{8} \qquad \text{(eq. 10)} \]
\[ \large I_{cm} = I_{xx}^{hs} - m_{hs} b^2 = \frac{2}{5} m_{hs} R^2 - m_{hs} \left( \frac{3R}{8} \right)^2 \qquad \text{(eq. 11)} \]
Where Icm is the moment of inertia about an axis that goes through the center of mass and m is the hemisphere mass from eq. 8. Now it's finally possible to calculate Ixx and Izz for our hemisphere caps. We can now apply Steiner's rule to translate the axes from the center of mass of a hemisphere to the center of mass of the whole body (capsule):
\[ \large I_{xx}^{cap} = I_{zz}^{cap} = I_{cm} + m_{hs} \left( \frac{H}{2} + \frac{3R}{8} \right)^2 = \frac{2}{5} m_{hs} R^2 + m_{hs} \left( \frac{H^2}{4} + \frac{3HR}{8} \right) \qquad \text{(eq. 12)} \]
Moments of inertia for the hemisphere cap on the bottom of the capsule do not differ from those here.
The tensor
\[ \large m_{cy} = \pi R^2 H \rho \qquad m_{hs} = \frac{2\pi R^3}{3} \rho \qquad m = m_{cy} + 2 m_{hs} \qquad \text{(eq. 13)} \]
The equations in eq. 13 give us the separate masses of the cylinder and hemisphere as well as the mass of the whole body. Notice how the mass of the hemisphere (mhs) is multiplied by two. This is, of course, because there are two hemispheres, one on each end of the capsule. Also note our previous assumption that density is equal to one unit of measurement, which is why mass is equal to volume. If needed, mcy and mhs can simply be multiplied by some constant density and the results will remain accurate. The given code is written in such a way. The full tensor uses these values and is given by:
\[ \large I = \begin{bmatrix} I_{xx} & 0 & 0 \\ 0 & I_{yy} & 0 \\ 0 & 0 & I_{zz} \end{bmatrix} \quad I_{yy} = \frac{1}{2} m_{cy} R^2 + 2 \cdot \frac{2}{5} m_{hs} R^2 \quad I_{xx} = I_{zz} = m_{cy}\left( \frac{R^2}{4} + \frac{H^2}{12} \right) + 2\, m_{hs}\left( \frac{2R^2}{5} + \frac{H^2}{4} + \frac{3HR}{8} \right) \qquad \text{(eq. 14)} \]
Although the previously shown formulas for the moments of inertia are not given in terms of mcy and mhs, by substituting those values the equality can easily be shown. The parts that represent moments of inertia of a hemisphere are multiplied by two, because we have two of them.
Code
The following code implementation is given in a C-like language and is fairly optimized.
#define PI 3.141592654f
#define PI_TIMES2 6.283185307f
const float oneDiv3 = (float)(1.0 / 3.0);
const float oneDiv8 = (float)(1.0 / 8.0);
const float oneDiv12 = (float)(1.0 / 12.0);
void ComputeRigidBodyProperties_Capsule(float capsuleHeight, float capsuleRadius, float density,
float &mass, float3 &centerOfMass, float3x3 &inertia)
{
float cM; // cylinder mass
float hsM; // mass of hemispheres
float rSq = capsuleRadius*capsuleRadius;
cM = PI*capsuleHeight*rSq*density;
hsM = PI_TIMES2*oneDiv3*rSq*capsuleRadius*density;
// from cylinder
inertia._22 = rSq*cM*0.5f;
inertia._11 = inertia._33 = inertia._22*0.5f + cM*capsuleHeight*capsuleHeight*oneDiv12;
// from hemispheres
float temp0 = hsM*2.0f*rSq / 5.0f;
inertia._22 += temp0 * 2.0f;
float temp1 = capsuleHeight*0.5f;
float temp2 = temp0 + hsM*(temp1*temp1 + 3.0f*oneDiv8*capsuleHeight*capsuleRadius);
inertia._11 += temp2 * 2.0f;
inertia._33 += temp2 * 2.0f;
inertia._12 = inertia._13 = inertia._21 = inertia._23 = inertia._31 = inertia._32 = 0.0f;
mass = cM + hsM * 2.0f;
centerOfMass = {0.0f, 0.0f, 0.0f};
}
It might seem that centerOfMass is useless here, but I have left it in because it is an important part of rigid body properties and often a requirement for physics engines.
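As a quick sanity check (my own sketch, not part of the original article), here is a plain-double transcription of the routine with the float3/float3x3 types dropped. With H = 0 the capsule degenerates to a solid sphere, so the diagonal should reduce to the familiar \(\frac{2}{5} m R^2\):

```cpp
#include <cassert>
#include <cmath>

// Plain-double transcription of ComputeRigidBodyProperties_Capsule
// (vector/matrix types replaced by scalars for portability).
void CapsuleMassAndInertia(double H, double R, double density,
                           double& mass, double& Iyy, double& Ixx)
{
    const double pi = 3.14159265358979323846;
    double rSq = R * R;
    double cM  = pi * H * rSq * density;              // cylinder mass
    double hsM = 2.0 * pi / 3.0 * rSq * R * density;  // one hemisphere's mass
    // from cylinder
    Iyy = rSq * cM * 0.5;
    Ixx = Iyy * 0.5 + cM * H * H / 12.0;
    // from the two hemispheres
    double t0 = hsM * 2.0 * rSq / 5.0;
    Iyy += t0 * 2.0;
    double t2 = t0 + hsM * (H * 0.5 * H * 0.5 + 3.0 / 8.0 * H * R);
    Ixx += t2 * 2.0;
    mass = cM + hsM * 2.0;
}
```

With H = 0, R = 1 and unit density this yields the solid-sphere values \(m = \frac{4}{3}\pi R^3\) and \(I = \frac{2}{5} m R^2\) on every axis, which agrees with eq. 14.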
Conclusion
Capsules defined in a way which was presented here are very commonly used in games and physical simulations, so this tensor should be of use to anyone building their own 3D game engine or similar software.
Although a right-handed Cartesian coordinate system was used in this article, you can also use the code in a left-handed system as long as the y axis is the longitudinal axis (the axis that goes through the caps).
The somewhat tedious processes of integration were skipped and only the end results were shown. This is mostly because it would take too much space. If you are curious whether those expressions are correct, you can try them on Wolfram or try to integrate them yourself.
Also, you can follow me on Twitter.
Article Update Log
December 1, 2014: Initial release
December 3, 2014: Updated figures, equations and text
Download article:
Towards a Simpler, Stiffer, and more Stable Spring
Written by Michael Schmidt Nissen
Version 0.5.2, December 18th 2014
The trouble with springs
The damped spring based on Hooke's law of elasticity is often regarded as a black sheep in the field of game physics, and some papers and websites even warn the reader against implementing them. There are a couple of good reasons for this.
The most straight-forward reason is that springs are notoriously difficult to tune. It can be very time consuming to adjust the spring stiffness and damping coefficients in order to make a simulation behave just right. For large systems of interconnected particles and springs the task can be daunting. If a variable is changed, for instance a spring coefficient, the mass of a particle, or the simulation time-step, you might have to re-tune the entire system from scratch.
A more profound reason for avoiding springs is that they are potentially unstable. When the spring coefficients are increased above some unknown threshold, the simulation will start to oscillate chaotically and explode in all directions. Everyone who has ever worked with springs has experienced this frustrating phenomenon. The problem is that there is no good method to determine in advance whether a given spring system will show this behaviour or not. Thus the programmer's or game designer's ability to balance a spring system often relies on experience and gut feeling, which makes it look more like black magic than actual physics.
In this article I'll introduce a modified spring equation that is easy to tune, displays maximum stiffness and damping, and is guaranteed to keep the spring system stable under all conditions. The math behind the equation is surprisingly simple, which makes me wonder why it doesn't appear to have been mentioned or implemented earlier.
Making a better spring
The improved spring equation stems from trying to answer a simple question: Is it possible to make a spring that reaches its rest state in just one simulation loop? It turns out that the answer is yes! This wouldn't be possible in the real world, since it would require infinitely high stiffness and damping coefficients or zero mass. But in the not-quite real world of discrete, time-stepping physics simulation, this is achievable.
To explain how this is done, let's look at a one-dimensional harmonic oscillator with a damped spring attached to a fixed point at one end and to a moving body at the other. Assume that the body is displaced a distance x from its rest position. Now let us find the exact amount of force required to move the mass this distance in one simulation loop.
Distance can be expressed as velocity multiplied by time
\[ \large x = -{v} {\Delta t} \]
The minus sign simply means that we're moving in the opposite direction of the displacement. Velocity can be expressed as acceleration multiplied by time
\[ \large x = -{a} {\Delta t^{2}} \]
Newton's 2nd law of motion states that acceleration equals force over mass
\[ \large x = -\dfrac{f}{m} {\Delta t^{2}} \]
Now we can isolate force
\[ \large f= -\dfrac{m}{\Delta t^{2}} x \]
And since the spring force in Hooke's law of elasticity is defined as
\[ \large f= -k x \]
It becomes immediately clear that
\[ \large k = \dfrac{m}{\Delta t^{2}} \]
Which is the exact spring stiffness coefficient value required to move the particle back to its rest position in one simulation loop. However, since we are not doing anything to stop the particle, it will keep oscillating back and forth through the rest position. We need to add damping, which is done with the second part of the spring equation.
Velocity is equal to acceleration multiplied by time step
\[ \large v = -{a} {\Delta t} \]
Which is equal to force over mass
\[ \large v = -\dfrac{f}{m} {\Delta t} \]
When isolating force we get
\[ \large f = -\dfrac{m}{\Delta t} v \]
And since the damping force is defined
\[ \large f = -d v \]
We immediately see by inspection that the damping coefficient is
\[ \large d = \dfrac{m}{\Delta t} \]
This is the exact damping coefficient value needed to stop the particle in one simulation loop. Now we can write the complete damped spring equation.
\[ \large f= -\dfrac{m}{\Delta t^{2}} x -\dfrac{m}{\Delta t} v \]
This spring equation has some very interesting properties. The first thing we notice is the lack of coefficients. We've simply replaced k with \(m/\Delta t^2\) and d with \(m/\Delta t\). When implementing the equation we see that it really does work! The spring reaches equilibrium in one loop, completely independent of particle position, velocity, mass, and simulation time-step. This is the stiffest possible spring, and it displays behavior more similar to a rigid constraint than a soft, bouncy spring. The equation also has another interesting property: it simply cannot blow up, no matter what values you feed it. Practically speaking, we can regard the spring as being unconditionally stable.
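The one-loop claim is easy to verify with a minimal one-dimensional sketch. The integrator here is semi-implicit Euler (my assumption; the article does not prescribe one), and the names are mine:

```cpp
#include <cassert>
#include <cmath>

struct Particle1D { double x, v, m; };

// One step of the "one-loop" spring: f = -(m/dt^2)*x - (m/dt)*v,
// integrated with semi-implicit Euler (velocity first, then position).
void RigidSpringStep(Particle1D& p, double rest, double dt)
{
    double disp = p.x - rest;
    double f = -(p.m / (dt * dt)) * disp - (p.m / dt) * p.v;
    p.v += (f / p.m) * dt;   // v becomes -disp/dt, regardless of mass
    p.x += p.v * dt;         // x lands exactly on 'rest'
}
```

One step puts the particle on the rest position; the step after that zeroes its velocity, since the damping term then sees only the arrival velocity.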
Re-introducing the spring coefficients
Now we have a really powerful spring equation. It is easy to implement, very stable, and reaches its rest state in just one loop. But in our quest towards a better spring it has lost its springiness. We need to somehow get the softness and bounciness back again, and for this purpose we will re-introduce the spring and damping coefficients. To avoid confusion, the new coefficients are named Ck and Cd. While the original coefficients could represent any positive numerical value, these both lie in the interval between zero and one.
\[ \large f= -\dfrac{m}{\Delta t^{2}} C_{k} x -\dfrac{m}{\Delta t} C_{d} v \qquad [ 0 \leq C_{k}, C_{d} \leq 1 ] \]
As we can see, the new coefficients are simply the fraction of completely rigid behavior we would like the spring to display. Soft, bouncy springs would usually have values in the range of 0.001 - 0.00001. At the other end of the scale, values of just 0.1 - 0.01 are enough to display rigid behavior. Setting both values to 1.0 would of course still satisfy the constraint in one loop.
Please notice that spring behavior is determined exclusively by these two coefficients. Particle mass or simulation time-step has no influence on how the spring behaves, and changing them wouldn't make it necessary to tune the system!
Interestingly, the spring will get less rigid and less stable if we increase Ck or Cd above 1. If we keep increasing the coefficient values, the system will start oscillating chaotically, and at some point it will explode. In other words, we have determined the exact upper limit for the two spring coefficients, which we define
\[ \large k_{max} = \dfrac{m}{\Delta t^{2}} \]
\[ \large d_{max} = \dfrac{m}{\Delta t} \]
This allows us to simplify the spring equation
\[ \large f= -k_{max} C_{k} x -d_{max} C_{d} v \]
There is an important conclusion to be drawn from this. It is a misunderstanding to think that spring stiffness and damping can be increased towards infinity by simply increasing k and d. A system of very stiff springs doesn't necessarily blow up because the integration algorithm can't handle it, but because the coefficients might have been set to a value that simply does not make sense.
Two connected particles with different mass
It is only slightly more complicated to constrain two free-moving particles with the improved spring. To do this, we need to introduce the concept of reduced mass. This is a quantity that can be used to compute interactions between two bodies as if one body was stationary, which allows us to use the equation we've already derived. The reduced mass for two particles with mass m1 and m2 is defined as
\[ \large m_{red} = \dfrac{m_1 m_2}{(m_1 + m_2)} \]
Since the inverse mass quantity is often already pre-computed for other purposes, it can also be useful to define reduced mass as
\[ \large m_{red}= \dfrac{1}{ ( \dfrac{1}{m_1} + \dfrac{1}{m_2} ) } \]
For two connected particles, the maximum coefficient values are
\[ \large k_{max} = \dfrac{m_{red}}{\Delta t^{2}} \]
\[ \large d_{max} = \dfrac{m_{red}}{\Delta t} \]
When replacing mass with reduced mass we get
\[ \large f= -\dfrac{m_{red}}{\Delta t^{2}} C_{k} x -\dfrac{m_{red}}{\Delta t} C_{d} v \]
This spring equation will bring the two connected particles to equilibrium distance in one loop and make them stand still relative to each other. However, since energy and momentum are conserved, the particles may rotate around each other, which will make the bodies come to rest at a larger distance depending on how fast they rotate.
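A sketch of the two-particle case (names mine), applying the equal-and-opposite force computed with the reduced mass; with \(C_k = C_d = 1\) the pair reaches the rest distance in one step while conserving momentum:

```cpp
#include <cassert>
#include <cmath>

struct Body1D { double x, v, m; };

// One spring step between two free bodies in 1D using reduced mass.
void TwoBodySpringStep(Body1D& a, Body1D& b, double restLen, double dt,
                       double Ck, double Cd)
{
    double mRed = (a.m * b.m) / (a.m + b.m);
    double disp = (b.x - a.x) - restLen;   // positive when stretched
    double vRel = b.v - a.v;
    double f = -(mRed / (dt * dt)) * Ck * disp - (mRed / dt) * Cd * vRel;
    b.v += (f / b.m) * dt;                 // equal and opposite forces
    a.v -= (f / a.m) * dt;
    a.x += a.v * dt;
    b.x += b.v * dt;
}
```

Note that each particle moves in inverse proportion to its mass, which is exactly what the reduced-mass formulation encodes.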
Angular springs
The spring equation can also be used for angular springs, commonly named rotational or torsional springs. Rather than keeping two particles at a fixed distance by applying equal and opposite forces, the angular spring will try to keep two rotating bodies at a fixed angle by applying equal and opposite torques. The equation introduces the concept of reduced moment of inertia, which is calculated in a similar way to reduced mass
\[ \large I_{red} = \dfrac{I_1 I_2}{(I_1 + I_2)} \]
Or alternatively
\[ \large I_{red}= \dfrac{1}{ ( \dfrac{1}{I_1} + \dfrac{1}{I_2} ) } \]
The maximum coefficient values for angular springs are
\[ \large k_{max} = \dfrac{I_{red}}{\Delta t^{2}} \]
\[ \large d_{max} = \dfrac{I_{red}}{\Delta t} \]
When replacing the variables of linear motion with those of angular motion we get
\[ \large \tau = -\dfrac{I_{red}}{\Delta t^{2}} C_{k} \theta -\dfrac{I_{red}}{\Delta t} C_{d} \omega \]
This spring equation will keep two rotating bodies at any given rest angle. If both coefficients are set to 1, the constraint will be solved in one loop. The spring allows for angular displacements larger than one full turn, making it possible to "wind up" bodies like the coil spring of a mechanical clock.
Impulse-based springs
Today a lot of physics engines and simulation methods are based on impulses (direct changes in velocities) rather than forces and accelerations. The linear and angular spring equations described above work equally well if we redesign them to work with impulses. Here are the equations without further ado
\[ \large J_{linear}= -\dfrac{m_{red}}{\Delta t} C_{k} x - C_{d} v \]
\[ \large J_{angular} = -\dfrac{I_{red}}{\Delta t} C_{k} \theta - C_{d} \omega \]
In this particular case there is no difference between the force-based and impulse-based spring. They should return the exact same results.
Limitations and pitfalls
When connecting a multitude of springs and particles into larger bodies, we run into the same trouble as with any other type of distance and velocity constraint. Rather than cooperating, the constraints tend to compete against each other, and this spells trouble. When a spring moves two particles to satisfy distance and velocity, it usually means dissatisfying one or several other springs. It is outside the scope of this article to dig deeply into this problem, but the author would like to provide a bit of advice on how to prevent the worst disasters.
When two or more springs are connected to the same particle, which is the case in any kind of rope, mesh, or deformable body, setting the coefficients to the maximum value of 1.0 will lead to stability problems. Although the spring equation is stable when used with a single particle or a pair of particles, this is sadly not the case for higher numbers of particles. The author of this article has, after some lengthy tampering, worked out that a safe upper limit for both the stiffness and damping coefficients is
\[ \large c_{max} \approx \dfrac{1}{n + 1} \]
Where n denotes the highest number of springs attached to any of the two particles connected by the spring. So for example, in a rope where any particle is connected by at most two springs, Ck and Cd can both safely be set to 0.33, and in a square mesh, where any particle is at most connected by four springs, they can be set to 0.2.
Reference
http://en.wikipedia.org/wiki/Reduced_mass

Shooting At Stuff
You often have to have an entity controlled by an AI "shoot" at something in a 2D game. By "shoot" here, I mean launch something towards another moving something in the hopes that it will intercept it. This may be a bullet, a grenade, or trying to jump to a moving platform. If the target is moving, shooting at the current position (1) will usually miss except in trivial cases and (2) make your entity look dumb. When you play hide and seek, you don't run to where the person is unless they are stopped. You run to where the (moving) thing you are chasing is going to be.
There are two cases for "shooting" at something:
- Non-Rotating Shooter
- Rotating Shooter
This article only covers the case of the non-rotating shooter. The second case is a bit more complicated. I've posted a second article "Throwing a Winning Pass" to cover that.
Your goal is to predict the future position of the target so that you can launch something at it and have them collide. It is given that you know the following:
- The position of the shooter when the projectile will be launched: \(\vec{P_s}\)
- The position of the target when the shooter will launch the projectile (i.e. now): \(\vec{P_T^0}\)
- The speed at which your projectiles travel: \(S_b\)
- The velocity of the target, \(\vec{v_T}\)
What you would like to find is \(\vec{P_T^1}\), the final position where the projectile will intercept the target.
We are going to have to assume that the target will travel at a constant velocity while the projectile is moving. Without this, the projectile will have to adapt after it has been launched (because it cannot predict the future position of the target if it is changing velocity in an unpredictable way). It should be pointed out that this is not limiting and in fact, still works out pretty well as long as the "edges" are handled well (target right in front of you or headed straight at you).
The Problem Space
Consider the image below:
This image shows the situation for the shooter. The target is moving with some constant velocity and the intercept position for the projectile, when launched, is unknown until the calculation is done. When the projectile is launched, it will intercept the target after \(t_B\) seconds. The projectile is launched with a constant speed. We don't know its direction yet, but we know the magnitude of its velocity, its speed will be \(S_b\).
If the target is at position \(\vec{P_T^0}\) now, it will be at position \(\vec{P_T^1}\), given by:
(1) \(\vec{P_T^1} = \vec{P_T^0} + \vec{v_T} * t_B\)
Since the projectile will have traveled for the same amount of time, it will have moved from \(\vec{P_s}\) to \(\vec{P_T^1}\) as well. In that time, it will have moved a distance of \(S_b \times t_B\). Since we are talking about vector quantities here, we can write this as:
\(\mid\vec{P_T^1}-\vec{P_s}\mid = S_b * t_B\)
If we square both sides and break it into components to get rid of the absolute value:
(2) \((P_{Tx}^1 - P_{Sx})^2 +(P_{Ty}^1 - P_{Sy})^2 = S_b^2 * t_B^2\)
Breaking (1) into components as well and substituting back into (2) for the value of \(P_{Tx}^1\) and \(P_{Ty}^1\), we get the following:
\((P_{Tx}^0 - P_{Sx} + v_{Tx}t_B)^2 + (P_{Ty}^0 - P_{Sy} + v_{Ty}t_B)^2 = S_b^2 * t_B^2\)
For the sake of simplicity, we're going to redefine:
\(\vec{R} = \vec{P_T^0} - \vec{P_s}\) (this is a constant)
After some algebra, this gives us the final equation:
\(t_B^2(v_{Tx}^2 + v_{Ty}^2 - S_b^2) + t_B(2*R_x*v_{Tx} + 2*R_y*v_{Ty}) + (R_x^2 + R_y^2) = 0\)
This is a quadratic in \(t_B\):
\(t_B = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}\)
where:
\( a = v_{Tx}^2 + v_{Ty}^2 - S_b^2\)
\( b =2(R_x*v_{Tx} + R_y*v_{Ty})\)
\( c = R_x^2 + R_y^2\)
You can test the discriminant, \(b^2-4ac\):
< 0 \(\Rightarrow\) No Solution.
= 0 \(\Rightarrow\) One solution.
> 0 \(\Rightarrow\) Two solutions, pick the lowest positive value of \(t_B\).
Once you have solved the quadratic for \(t_B\), you can then substitute it back into (1) and calculate the intercept position, \(\vec{P_T^1}\).
The Code
Putting this together and covering some edge cases:
/* Calculate the future position of a moving target so that
 * a projectile launched immediately can intercept (collide)
 * with it.
 *
 * Some situations where an AI might find this calculation
 * useful:
 *
 * 1. Shooting a projectile at a moving target.
 * 2. Launching a football or soccer ball to a player.
 * 3. Figuring out the best position to jump towards in
 *    a platform game.
 *
 * The output value, solution, is the position where the
 * intercept will occur and the location that the
 * projectile should be launched towards.
 *
 * The function will return false if a solution cannot
 * be found. Consider the case of a target moving away
 * from the shooter faster than the speed of the
 * projectile and you will see at least one case where
 * this calculation may fail.
 */
bool CalculateInterceptShotPosition(const Vec2& pShooter,
                                    const Vec2& pTarget0,
                                    const Vec2& vTarget,
                                    float64 sProjectile,
                                    Vec2& solution)
{
   // This formulation uses the quadratic equation to solve
   // for the intercept position.
   Vec2 R = pTarget0 - pShooter;
   float64 a = vTarget.x*vTarget.x + vTarget.y*vTarget.y - sProjectile*sProjectile;
   float64 b = 2*(R.x*vTarget.x + R.y*vTarget.y);
   float64 c = R.x*R.x + R.y*R.y;
   float64 tBullet = 0;
   // If the target and the shooter have already collided, don't bother.
   if(R.LengthSquared() < 2*DBL_MIN)
   {
      return false;
   }
   // If the squared speeds of the target and the bullet are the same, the
   // equation collapses to tBullet*b = -c. If they are REALLY close to each
   // other (float tolerance), you could get some weirdness here. Do some
   // "is it close" checking?
   if(fabs(a) < 2*DBL_MIN)
   {
      // If the b value is 0, we can't get a solution.
      if(fabs(b) < 2*DBL_MIN)
      {
         return false;
      }
      tBullet = -c/b;
   }
   else
   {
      // Calculate the discriminant to figure out how many solutions there are.
      float64 discriminant = b*b - 4*a*c;
      if(discriminant < 0)
      {  // All solutions are complex.
         return false;
      }
      if(discriminant > 0)
      {  // Two solutions. Pick the smaller one.
         float64 quad = sqrt(discriminant);
         float64 tBullet1 = (-b + quad)/(2*a);
         float64 tBullet2 = (-b - quad)/(2*a);
         if((tBullet1 < 0) && (tBullet2 < 0))
         {  // Both times are negative. This would be really odd.
            return false;
         }
         else if(tBullet2 < 0 && tBullet1 >= 0)
         {  // One negative, one positive.
            tBullet = tBullet1;
         }
         else if(tBullet1 < 0 && tBullet2 >= 0)
         {  // One negative, one positive.
            tBullet = tBullet2;
         }
         else if(tBullet1 < tBullet2)
         {  // Both positive; first is smaller.
            tBullet = tBullet1;
         }
         else
         {  // Only choice left.
            tBullet = tBullet2;
         }
      }
      else
      {  // One solution.
         tBullet = -b / (2*a);
      }
   }
   // If the time is negative, we can't get there from here.
   if(tBullet < 0)
   {
      return false;
   }
   // Calculate the intercept position.
   solution = pTarget0 + tBullet*vTarget;
   return true;
}
I have posted a working solution on github, with a simulation using the above function that you can tinker with.
Article Update Log
17 Jan 2015: Fixed hole in pos/neg check.
3 Nov 2014: Cleaned up some text.
1 Nov 2014: Added code example.
1 Nov 2014: Update discriminant description.
28 Oct 2014: Initial Release

Math for Game Developers: Calculus
https://www.gamedev.net/articles/programming/math-and-physics/math-for-game-developers-calculus-r3833/
Math for Game Developers is exactly what it sounds like - a weekly instructional YouTube series wherein I show you how to use math to make your games. Every Thursday we'll learn how to implement one game design, starting from the underlying mathematical concept and ending with its C++ implementation. The videos will teach you everything you need to know, all you need is a basic understanding of algebra and trigonometry. If you want to follow along with the code sections, it will help to know a bit of programming already, but it's not necessary. You can download the source code that I'm using from GitHub, from the description of each video. If you have questions about the topics covered or requests for future topics, I would love to hear them! Leave a comment, or ask me on my Twitter, @VinoBS
This series is ongoing - check back every Thursday for new content!
The video below contains the playlist for all the videos in this series, which can be accessed via the playlist icon at the top of the embedded video frame. The first video in the series is loaded automatically.
Calculus
[playlist=PLW3Zl3wyJwWMC516An98BF8Pn6nfXaEWe]
Simple but Effective Collisions Part 1: Radial Collision Handling
https://www.gamedev.net/articles/programming/math-and-physics/simple-but-effective-collisions-part-1-radial-collision-handling-r3147/
The illusion of smooth collision detection is paramount to a realistic gaming experience. Imagine if you were playing a horror game and you suddenly slipped through the dungeon walls, or you were driving a vehicle through a forest, only to crash into a tree with no effect or feedback. The facade of realism would instantly be broken. The horror game's environment would no longer be immersive, and driving your vehicle would simply feel like moving a camera through a world you cannot touch or feel.
Of course, you probably already knew that. Anyhow, the purpose of this article is to present the first of two methods for modelling collisions in your own games. I'm hoping to release another article detailing the second method in the near future. The method described in this article, which I call 'radial' collision handling, can be used to model objects which can be enclosed by bounding cylinders. It is ideal for objects such as trees and poles, and can even be used to simulate collisions with small rectangular objects.
I call the second method 'nodal' collision handling, and it can be used to provide bounding regions for a series of connected walls in any arrangement. It could be used to model the interior of a house or hut, or the walls of a maze. It will hopefully be discussed in a future article.
I created these methods (or rather implemented them, as it's likely they've been used before) with two aims. First off, each method should allow me to detect collisions, which is an obvious requirement of collision detection! Second, each method should handle collisions in such a way as to 'slide' the player/object around the obstacle rather than bringing them to an awkward halt as soon as they touch it.
Radial collision handling
The Concept
As mentioned, this system allows you to handle collisions with objects which can be enclosed by bounding cylinders. These cylinders can be defined by two values - a vector and a float: one containing a location, and the other containing a radius. Let's imagine we have a simple scenario with a triangle of trees placed on a simple flat plane:
The player can move the camera around in a first person view, and is able to walk among the trees. We will use the 'radial' collision detection method to prevent the camera from passing through the trees. When the player attempts to walk through a tree, the algorithm will slide them smoothly around it, as seen in pretty much all current first person games. This process involves two steps, which are repeated every frame:
- If the camera has entered a tree's bounding cylinder, register a collision.
- Calculate a new location for the camera which puts it back outside the bounding cylinder.
So, how do we do this in practice?
The Implementation
We'll start out by defining the location of each tree using an array of 2D vectors. We only need 2D vectors because the trees are not going to affect our motion in the up-down direction. If you wanted to use this algorithm for objects on 3D terrain rather than a 2D plane, it should work just as well with the same vector format because it's not possible for a walking person to collide with a tree from above or below, only from the side. Because of this, we only need to worry about the positions of the trees in the x-z plane, whether we have a flat terrain or not. We will use DirectX syntax and functions from here on in:
//The array contains 3 vectors - one for each tree:
D3DXVECTOR2 v_treePositions[3] = {D3DXVECTOR2(0, 1), D3DXVECTOR2(1, 0), D3DXVECTOR2(-1, 0)};
We will also define a variable to describe the radius of each tree trunk. We will assume they're all the same radius, but if you wanted to personalise each one, you could define an array of numbers, or add a third component to the vectors describing tree locations.
//Each tree is 1m in radius:
float f_treeRadius = 1.0f;
...and while we're at it, we will define a vector to hold the location of our first-person camera as it changes from frame to frame:
//This is a global vector holding the camera's location. It starts at the centre of the world (0, 0, 0):
D3DXVECTOR3 v_camPos(0, 0, 0);
OK, now each tree has an entry in our program, and we know where our player's camera is from frame to frame. All that remains is to put this information to use! We will define a function which will be called every frame. This function will loop through the tree locations, comparing each one to the location of the camera for that frame. If it registers a collision (i.e. if the camera's location is closer to one of the trees than is permitted by the bounding radius), then we will push the camera back outside the bounding radius. The effect is cumulative, meaning that each tree makes its own contribution to the overall position change. This way, the algorithm also works for closely bound groups of trees where the camera could be in contact with more than one at once.
Let's go through the function line by line:
D3DXVECTOR2 handle_Collisions_Radial()
{
   D3DXVECTOR2 v_deltaPos = D3DXVECTOR2(0, 0);
   for (int i = 0; i < 3; i++)
   {
      //Note: the y component of each tree's D3DXVECTOR2 stores its world z coordinate.
      D3DXVECTOR2 v_vectorToCentre(v_camPos.x - v_treePositions[i].x, v_camPos.z - v_treePositions[i].y);
      if (D3DXVec2Length(&v_vectorToCentre) < f_treeRadius)
      {
         v_deltaPos += ((f_treeRadius / D3DXVec2Length(&v_vectorToCentre)) - 1) * v_vectorToCentre;
      }
   }
   return v_deltaPos;
}
First, we notice that the function returns a D3DXVECTOR2. This corresponds to the amount by which the camera needs to move to put it outside of the bounding cylinders of all the trees. It returns a D3DXVECTOR2 because the trees only affect the location of the camera in the x-z plane.
Second, we create an inline vector called v_deltaPos which will be filled with the overall change in position required to put the camera outside the bounding radii of all the trees. This is the vector which the function will eventually return.
Next, we enter a loop which will iterate through all the trees. Since we have three trees, we will have three iterations.
Third, we create a temporary vector called v_vectorToCentre. This contains a directional vector between the location of the camera and the location of the current tree in the x-z plane.
Fourth, we compare the length of this vector to the bounding radius. Remember that the +y direction is up, so the comparison happens purely in the x-z (ground) plane.
Fifth and finally, if the camera is indeed within the bounding radius of the current tree, we will add on just enough of v_vectorToCentre to put it back outside the tree trunk.
Once the loop is complete, we return the resulting position change.
The process is fairly intuitive until the final step. This diagram explains the maths behind it a little more clearly:
As you can see, the location of the camera has moved within the bounding radius of a tree. In order to put the camera back outside of the bounding radius while simultaneously making it slide smoothly around the edge, we need to add a certain amount of the vector v_vectorToCentre to the camera's position until it is once again outside the bounding radius, such that:
(1 + k) * |v_vectorToCentre| = f_treeRadius
In this equation, |v_vectorToCentre| denotes the length of the vector between the camera position and the tree's centre. It is calculated using the D3DXVec2Length() function. We are looking for the value of k - that is, the linear multiple of v_vectorToCentre which we should add to our camera position in order to put it back outside the bounding radius. In order to find the value of k, we can first divide through by |v_vectorToCentre|. This gives us:
1 + k = f_treeRadius / |v_vectorToCentre|
This can easily be rearranged to get:
k = (f_treeRadius / |v_vectorToCentre|) - 1
We have now calculated our value of k. All that remains is to add this linear multiple of v_vectorToCentre to our overall change in location (v_deltaPos). If you review the fifth step in the function, you can see that this is exactly what is happening. If called every frame, this function will calculate a collision contribution for each tree in the loop. Once this contribution is acquired, it should be added to the camera position vector as follows:
//Call the function to assess collisions:
D3DXVECTOR2 v_camPosCollisionDelta = handle_Collisions_Radial();
//Since this function returns a D3DXVECTOR2 and our camera position is a D3DXVECTOR3, we
//will need to create a dummy D3DXVECTOR3 with no y-component to add to v_camPos:
D3DXVECTOR3 v_camPosCollisionDelta3D(v_camPosCollisionDelta.x, 0, v_camPosCollisionDelta.y);
//Finally, we will move the camera according to our collision delta vector:
v_camPos += v_camPosCollisionDelta3D;
//Now that we have our new camera position, we can continue to render the frame as normal...
Conclusion
That's it! You may be surprised at how well this simple method works. The effect is quite satisfying, and it certainly added an extra professional depth to my exploits in the first-person camera world. Of course, this method could be used to handle collisions involving objects other than a first-person camera, and subjects other than trees. This is also a bare-bones version of the algorithm, and it could easily be adapted to handle collisions in 2D games as well as collisions between circles and spheres and their environment (throughout, we assumed that the first person camera was a simple point rather than an object with dimensions).
I intend to write another article describing another collision method which can be used to simulate walls and interiors/exteriors of buildings and things like hedges and maze walls. Until then, thanks for reading!
Article Update Log
19 May 2013: Initial release

Matrix Inversion using LU Decomposition
https://www.gamedev.net/articles/programming/math-and-physics/matrix-inversion-using-lu-decomposition-r3637/
Introduction
The forums are replete with people trying to learn how to efficiently find inverses of matrices or solve the classic \(Ax=b\) problem. Here's a decent method that is fairly easy to learn and implement. Hopefully it might also serve as a stepping stone to learning some of the more advanced matrix factorization methods, like Cholesky, QR, or SVD.
Overview
In 1948, Alan Turing came up with LU decomposition, a way to factor a matrix and solve \(Ax=b\) with numerical stability. Although there are many different schemes to factor matrices, LU decomposition is one of the more commonly-used algorithms. Interestingly enough, Gaussian elimination can be implemented as LU decomposition. The computational effort expended is about the same as well.
So why would anyone want to use this method? First of all, the time-consuming elimination step can be formulated so that only the entries of \(A\) are required. As well, if \(Ax=b\) must be solved for several different right-hand-side vectors \(b\), then we only need to do the decomposition once. This last benefit helps us compute the inverse of \(A\) very quickly.
How It Works
Let's start with an \(Ax=b\) problem, where you have an \(n\)x\(n\) matrix \(A\) and \(n\)x\(1\) column vector \(b\) and you're trying to solve for a \(n\)x\(1\) column vector \(x\). The idea is that we break up \(A\) into two matrices: \(L\), which is a lower triangular matrix, and \(U\), which is an upper triangular matrix. It would look something like this:
\[
\left [
\begin{matrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33} \\
\end{matrix} \right ] = \left [
\begin{matrix}
l_{11} & 0 & 0 \\
l_{21} & l_{22} & 0 \\
l_{31} & l_{32} & l_{33} \\
\end{matrix} \right ] \left [
\begin{matrix}
u_{11} & u_{12} & u_{13} \\
0 & u_{22} & u_{23} \\
0 & 0 & u_{33} \\
\end{matrix} \right ]
\]
This seems like a lot more work, but allows us to use some cool math tricks. Our original equation, substituted with our decomposition is now \((LU)x=b\):
\[
\left (
\left [ \begin{matrix}
l_{11} & 0 & 0 \\
l_{21} & l_{22} & 0 \\
l_{31} & l_{32} & l_{33} \\
\end{matrix} \right ]
\left [ \begin{matrix}
u_{11} & u_{12} & u_{13} \\
0 & u_{22} & u_{23} \\
0 & 0 & u_{33} \\
\end{matrix} \right ]
\right )
\left [ \begin{matrix}
x_1 \\
x_2 \\
x_3 \\
\end{matrix} \right ] =
\left [ \begin{matrix}
b_1 \\
b_2 \\
b_3 \\
\end{matrix} \right ]
\]
If we shift the parentheses, we get the following:
\[
\left [ \begin{matrix}
l_{11} & 0 & 0 \\
l_{21} & l_{22} & 0 \\
l_{31} & l_{32} & l_{33} \\
\end{matrix} \right ]
\left (
\left [ \begin{matrix}
u_{11} & u_{12} & u_{13} \\
0 & u_{22} & u_{23} \\
0 & 0 & u_{33} \\
\end{matrix} \right ]
\left [ \begin{matrix}
x_1 \\
x_2 \\
x_3 \\
\end{matrix} \right ]
\right )
=
\left [ \begin{matrix}
b_1 \\
b_2 \\
b_3 \\
\end{matrix} \right ]
\]
Looking just inside the parentheses, we can see another \(Ax=b\) type problem. If we say that \(Ux=d\), where \(d\) is a different column vector from \(b\), we have 2 separate \(Ax=b\) type problems:
\[
\left [ \begin{matrix}
l_{11} & 0 & 0 \\
l_{21} & l_{22} & 0 \\
l_{31} & l_{32} & l_{33} \\
\end{matrix} \right ]
\left [ \begin{matrix}
d_1 \\
d_2 \\
d_3 \\
\end{matrix} \right ]
=
\left [ \begin{matrix}
b_1 \\
b_2 \\
b_3 \\
\end{matrix} \right ]
\\
\left [ \begin{matrix}
u_{11} & u_{12} & u_{13} \\
0 & u_{22} & u_{23} \\
0 & 0 & u_{33} \\
\end{matrix} \right ]
\left [ \begin{matrix}
x_1 \\
x_2 \\
x_3 \\
\end{matrix} \right ]
=
\left [ \begin{matrix}
d_1 \\
d_2 \\
d_3 \\
\end{matrix} \right ]
\]
It looks like we just created more work for ourselves, but we actually made it easier. If we look at the problems, the matrices on the left are in a form like row echelon form. Using forward and back substitutions, we can solve easily for all the elements of the \(x\) and \(d\) matrices. We first solve \(Ld=b\) for \(d\), then we substitute into \(Ux=d\) to solve for \(x\). The real trick to this method is the decomposition of \(A\).
The Catch
There are a couple catches to this method:
- The matrix \(A\) must be square to use LU factorization. Other factorization schemes will be necessary if \(A\) is rectangular.
- We have to be sure that \(A\) is a nonsingular (i.e. invertible) matrix. If it can't be inverted, then the decomposition will produce an \(L\) or \(U\) that is singular and the method will fail because there is no unique solution.
- To avoid division by zero or by really small numbers, we have to implement a pivoting scheme just like with Gaussian elimination. This makes the problem take the form \(PA=LU\), where P is a permutation matrix that allows us to swap the rows of A. P is usually the identity matrix with rows swapped such that \(PA\) produces the \(A\) matrix with the same rows swapped as P. Then the \(Ax=b\) problem takes the form \(LUx=Pb\) since \(PA=LU\).
- Some of the entries in the \(L\) and \(U\) matrices must be known before the decomposition, or else the system has too many unknowns and not enough equations to solve for all the entries of both matrices. For what's formally known as Doolittle decomposition, the diagonal entries of the \(L\) matrix are all 1. If we use Crout decomposition, the diagonals of the \(U\) matrix are all 1.
The method presented here won't have the pivoting part implemented, but it shouldn't be a problem to implement later.
The Algorithm
For a square matrix \(A\) with entries \(a_{ij},\,i=1,\cdots,n,\,j=1,\cdots,n\), the Crout decomposition is as follows:
First:
\[
l_{i1} = a_{i1},\,\textrm{for}\,i=1,2,\cdots,n\\
u_{1j} = \frac{a_{1j}}{l_{11}},\,\textrm{for}\,j=2,3,\cdots,n
\]
For \(j=2,3,\cdots,n-1\):
\[
l_{ij} = a_{ij}-\sum_{k=1}^{j-1}l_{ik}u_{kj},\,\textrm{for}\,i=j,j+1,\cdots,n \\
u_{jk} = \frac{a_{jk}-\sum_{i=1}^{j-1}l_{ji}u_{ik}}{l_{jj}},\,\textrm{for}\,k=j+1,j+2,\cdots,n
\]
Finally:
\[
l_{nn}=a_{nn}-\sum_{k=1}^{n-1}l_{nk}u_{kn}
\]
That's it! If you notice, \(A\) is being traversed in a specific way to build \(L\) and \(U\). To start building \(L\), we start with \(a_{11}\) and then traverse down the rows in the same column until we hit \(a_{n1}\). To start building \(U\), we start at \(a_{12}\) and traverse along the same row until we hit \(a_{1n}\). We then build \(L\) further by starting at \(a_{22}\) and traverse along the column until we hit \(a_{n2}\), then we build \(U\) further by starting at \(a_{23}\) and traverse along the row until \(a_{2n}\), and so on.
Since the 1's on the diagonal of \(U\) are given as well as the 0's in both the \(L\) and \(U\) matrices, there's no need to store anything else. In fact, there's a storage-saving scheme where all the calculated entries of both matrices can be stored in a single matrix as large as \(A\).
From here, \(d\) can be solved very easily with forward-substitution:
\[
d_1 = \frac{b_1}{l_{11}} \\
d_i = \frac{b_i - \sum_{j=1}^{i-1}l_{ij}d_j}{l_{ii}},\,\textrm{for}\,i=2,3,\cdots,n
\]
As well, \(x\) can be solved very easily with back-substitution:
\[
x_n = d_n \\
x_i = d_i - \sum_{j=i+1}^n u_{ij}x_j,\,\textrm{for}\,i=n-1,n-2,\cdots,1
\]
A Numerical Example
Let's try to solve the following problem using LU decomposition:
\[
\left [ \begin{matrix}
3 & -0.1 & -0.2 \\
0.1 & 7 & -0.3 \\
0.3 & -0.2 & 10 \\
\end{matrix} \right ]
\left [ \begin{matrix}
x_1 \\ x_2 \\ x_3 \\
\end{matrix} \right ]
=
\left [ \begin{matrix}
7.85 \\ -19.3 \\ 71.4 \\
\end{matrix} \right ]
\]
First, we copy the first column to the \(L\) matrix and the scaled first row except the first element to the \(U\) matrix:
\[
L =
\left [ \begin{matrix}
3 & 0 & 0 \\
0.1 & - & 0 \\
0.3 & - & - \\
\end{matrix} \right ],\,\,\,
U =
\left [ \begin{matrix}
1 & -0.0333 & -0.0667 \\
0 & 1 & - \\
0 & 0 & 1 \\
\end{matrix} \right ]
\]
Then we compute the following entries:
\[
l_{22} = a_{22} - l_{21}u_{12} = 7-(0.1)(-0.0333) = 7.00333 \\
l_{32} = a_{32} - l_{31}u_{12} = -0.2-(0.3)(-0.0333) = -0.19 \\
u_{23} = \frac{a_{23}-l_{21}u_{13}}{l_{22}} = \frac{-0.3-(0.1)(-0.0667)}{7.00333} = -0.0419 \\
l_{33} = a_{33}-l_{31}u_{13}-l_{32}u_{23} = 10-(0.3)(-0.0667)-(-0.19)(-0.0419) = 10.012 \\
\]
Inserting them into our matrices:
\[
L =
\left [ \begin{matrix}
3 & 0 & 0 \\
0.1 & 7.00333 & 0 \\
0.3 & -0.19 & 10.012 \\
\end{matrix} \right ],\,\,\,
U =
\left [ \begin{matrix}
1 & -0.0333 & -0.0667 \\
0 & 1 & -0.0419 \\
0 & 0 & 1 \\
\end{matrix} \right ]
\]
This is the LU factorization of the matrix. You can check that \(LU=A\). Now, we use forward- and back-substitution to solve the problem:
\[
d_1 = \frac{b_1}{l_{11}} = 7.85/3 = 2.6167 \\
d_2 = \frac{b_2-l_{21}d_1}{l_{22}} = (-19.3-(0.1)(2.6167))/7.00333 = -2.7932 \\
d_3 = \frac{b_3-l_{31}d_1-l_{32}d_2}{l_{33}} = (71.4-(0.3)(2.6167)-(-0.19)(-2.7932))/10.012 = 7 \\
\]
\[
x_3 = d_3 = 7 \\
x_2 = d_2 - u_{23}x_3 = -2.7932-(-0.0419)(7) = -2.5 \\
x_1 = d_1 - u_{12}x_2 - u_{13}x_3 = 2.6167-(-0.0333)(-2.5)-(-0.0667)(7) = 3 \\
\]
So the solution to the problem is:
\[
\left [ \begin{matrix}
x_1 \\ x_2 \\ x_3 \\
\end{matrix} \right ]
=
\left [ \begin{matrix}
3 \\ -2.5 \\ 7 \\
\end{matrix} \right ]
\]
This solution can easily be verified by plugging it back into \(Ax=b\).
MATLAB Code
Here's some quick MATLAB code for LU decomposition:
function [L,U] = lucrout(A)
   [~,n] = size(A);
   L = zeros(n,n);
   U = eye(n,n);
   L(1,1) = A(1,1);
   for j=2:n
      L(j,1) = A(j,1);
      U(1,j) = A(1,j) / L(1,1);
   end
   for j=2:n-1
      for i=j:n
         L(i,j) = A(i,j);
         for k=1:j-1
            L(i,j) = L(i,j) - L(i,k)*U(k,j);
         end
      end
      for k=j+1:n
         U(j,k) = A(j,k);
         for i=1:j-1
            U(j,k) = U(j,k) - L(j,i)*U(i,k);
         end
         U(j,k) = U(j,k) / L(j,j);
      end
   end
   L(n,n) = A(n,n);
   for k=1:n-1
      L(n,n) = L(n,n) - L(n,k)*U(k,n);
   end
end
Matrix Inverse with LU Decomposition
LU decomposition is nice for solving a series of \(Ax=b\) problems with the same \(A\) matrix and different \(b\) vectors. This is advantageous for computing the inverse of \(A\) because only one decomposition is required. The definition of the inverse of a matrix \(A^{-1}\) is a matrix such that \(AA^{-1}=I\), where \(I\) is the identity matrix. This looks like this for a general 3x3 matrix:
\[
\left [ \begin{matrix}
A_{11} & A_{12} & A_{13} \\
A_{21} & A_{22} & A_{23} \\
A_{31} & A_{32} & A_{33} \\
\end{matrix} \right ]
\left [ \begin{matrix}
a_{11}^\prime & a_{12}^\prime & a_{13}^\prime \\
a_{21}^\prime & a_{22}^\prime & a_{23}^\prime \\
a_{31}^\prime & a_{32}^\prime & a_{33}^\prime \\
\end{matrix} \right ]
=
\left [ \begin{matrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
\end{matrix} \right ]
\]
This can be set up as \(n\) separate \(Ax=b\) problems with different \(b\) vectors, where each solution vector \(x\) becomes the corresponding column of the inverse matrix. For example, the first problem is:
\[
\left [ \begin{matrix}
A_{11} & A_{12} & A_{13} \\
A_{21} & A_{22} & A_{23} \\
A_{31} & A_{32} & A_{33} \\
\end{matrix} \right ]
\left [ \begin{matrix}
a_{11}^\prime \\
a_{21}^\prime \\
a_{31}^\prime \\
\end{matrix} \right ]
=
\left [ \begin{matrix}
1 \\
0 \\
0 \\
\end{matrix} \right ]
\]
The second column of the inverse can be computed by changing \(b\) to \([0,1,0]^T\), the third column with \([0,0,1]^T\), and so on. This method is quick because only back- and forward-substitution is required to solve for the column vectors after the initial LU decomposition.
WARNING: As you can see, to get the inverse of this matrix, we end up having to solve n Ax=b problems for each of the columns of the inverse. If you're trying to get the inverse of the matrix just to solve an Ax=b problem, you're introducing more numerical error into your solution and slowing down your computation time. In short, make sure you really need the matrix inverse and never use the matrix inverse to solve a system of equations!
Beyond LU Decomposition
There are a lot of other matrix factorization schemes besides LU, like Cholesky or QR factorization, but the general idea of decomposing a matrix into other matrices is roughly the same. The real key to computational savings comes from knowing beforehand what kind of matrix you're factoring and choosing the appropriate algorithm. For example, in structural finite element analysis, the matrix being decomposed is always symmetric positive definite. Cholesky decomposition is way more efficient and quicker than LU for those kinds of matrices, so it's preferred.
Conclusion
LU decomposition is a great tool for anyone working with matrices. Hopefully you can make use of this simple, yet powerful method.
Article Update Log
28 Apr 2014: Added warning to matrix inverse
16 Apr 2014: Initial release

Math for Game Developers: Fragment Shaders
https://www.gamedev.net/articles/programming/math-and-physics/math-for-game-developers-fragment-shaders-r3619/
Math for Game Developers is exactly what it sounds like - a weekly instructional YouTube series wherein I show you how to use math to make your games. Every Thursday we'll learn how to implement one game design, starting from the underlying mathematical concept and ending with its C++ implementation. The videos will teach you everything you need to know, all you need is a basic understanding of algebra and trigonometry. If you want to follow along with the code sections, it will help to know a bit of programming already, but it's not necessary. You can download the source code that I'm using from GitHub, from the description of each video. If you have questions about the topics covered or requests for future topics, I would love to hear them! Leave a comment, or ask me on my Twitter, @VinoBS
The video below contains the playlist for all the videos in this series, which can be accessed via the playlist icon in the bottom-right corner of the embedded video frame once the video is playing. The first video in the series is loaded automatically.
Fragment Shaders
[playlist=PLW3Zl3wyJwWMpFSRpeMmSBGDShbkiV1Cq]
Making a Game Engine: Transformations
https://www.gamedev.net/articles/programming/math-and-physics/making-a-game-engine-transformations-r3566/
See Also:
Making a Game Engine: Core Design Principles
Moving objects around in a scene is not hard to do. You simply draw objects at different locations each frame. Adding rotation and resizing on top of that makes the code a little trickier, but still not hard to manage. But then you start trying to move objects relative to one another, such as putting a sword in a character's hand, and things can get really tricky. The sword should stay in the character's hand regardless of what direction they rotate or where they move. This is where using matrices to represent these transformations is extremely valuable.
Coordinate Systems
In order for a coordinate to have any meaning there needs to be a point of reference to base that coordinate off of. Imagine you have a dot on a piece of paper. The dot does not have a coordinate unless you come up with a way to measure its location. One way would be to determine the distance the point is from the left side of the page and its distance from the bottom. These two measurements give the coordinate meaning since it has a point of reference, the bottom left corner of the page. The point of reference is known as a basis, and any points measured relative to the same basis are said to be in the same coordinate system. If a worm were on a ball you could devise a way to measure the location of the worm on the ball. This coordinate would be the worm's location in the ball's coordinate space.
The ball itself also has a location but its location is measured in world coordinates. Let's assume that the location of the ball is measured from the bottom left corner of the image to the center of the ball. If you were to throw that ball the worm would move with the ball. The position of the worm in world coordinates would move as the ball moves, even though its position in ball coordinates stays the same. Transformations give us a way to convert between these coordinate systems relatively easily using matrices, allowing us to keep the worm on the ball as it travels through the air.
Matrices have a lot of useful properties that lend themselves to solving many complex problems. We will be using them to transform objects on the screen. This article will focus on three simple transformations: translation, rotation, and scaling. I will be using 2D transformations for the sake of simplicity. The principles in the article also apply to 3D transformations. If you want to do 3D transformations, search around on the web for 3D transformation matrices. The jump from 2D to 3D is trivial with the exception of rotations. I am assuming you already know what matrices are and how to multiply them, what the inverse and transpose of a matrix are, and other basic matrix operations. (For help with matrices, try this video series)
So far this talk of coordinate spaces and matrices may seem overly complicated when all you need to do is move or rotate objects. How exactly do these things help move objects on the screen anyway? Well, imagine you define a character sprite with a width of 50 and a height of 100. You could define four points that represent the corners of the sprite image: (-25, 0), (-25, 100), (25, 100), (25, 0). These points represent the corners of the sprite relative to the character. So what happens when the character moves? The transform of the player converts the points of the sprite to world space. Then the camera transform converts from world space to screen space to be drawn. If the player clicks on the screen you can use the inverse of these transforms to determine where on the character they clicked.
Basic Transformations
Translation
A translation simply changes an object's location. The matrix used to represent this is pretty simple too.
T[sub]x[/sub] and T[sub]y[/sub] are the number of units to move in the x and y directions respectively.
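As a quick sketch (pure Python, hypothetical helper names, column-vector convention), the 3x3 translation matrix and its effect on a point might look like this:

```python
def translate(tx, ty):
    # 3x3 homogeneous translation matrix (column-vector convention)
    return [[1, 0, tx],
            [0, 1, ty],
            [0, 0, 1]]

def mul_point(m, p):
    # multiply the 3x3 matrix by the column vector (x, y, 1)
    x, y = p
    return (m[0][0]*x + m[0][1]*y + m[0][2],
            m[1][0]*x + m[1][1]*y + m[1][2])

print(mul_point(translate(5, -2), (1, 1)))  # (6, -1): the point moved 5 right, 2 down
```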
Rotation
Rotates points around the origin.
The above matrix will rotate points counter-clockwise by θ radians.
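A sketch of the rotation matrix (pure Python, hypothetical helper name; the matrix entries follow the standard counter-clockwise form):

```python
import math

def rotate(theta):
    # 3x3 homogeneous rotation matrix: counter-clockwise by theta radians
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0],
            [s,  c, 0],
            [0,  0, 1]]

m = rotate(math.pi / 2)  # 90 degrees counter-clockwise
x, y = 1, 0
# (1, 0) rotates to roughly (0, 1), up to floating-point error
print(m[0][0]*x + m[0][1]*y, m[1][0]*x + m[1][1]*y)
```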
Scaling
Scaling an object changes its size. Scalar values in the range (0, 1) make objects smaller. A scale of 1 keeps it the same size, a scale greater than 1 makes it bigger. Really all the scale is doing is taking the current size and multiplying by the scale value.
Scales along the x and y directions by S[sub]x[/sub] and S[sub]y[/sub] respectively. When S[sub]x[/sub] = S[sub]y[/sub] the size changes uniformly. If you only modify a single value then the object will stretch, meaning you can make an object wider by setting S[sub]x[/sub] to a value greater than 1 and keeping S[sub]y[/sub] the same. You can even make a value negative, which will flip the object.
Using Matrices as Transforms
Transforming Points
Now that we have a few different transformation matrices, we need a way to transform points using them. To do this you simply treat the point as a 3x1 matrix, padding the extra slot with a 1, and multiply the transform matrix by it.
The result of that multiplication is another 3x1 matrix. Taking the top two elements of the resulting matrix gives the new x, y pair of the point in the new coordinate system.
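The two steps above can be sketched like this (pure Python, hypothetical helper name, column-vector convention):

```python
def transform_point(m, p):
    # treat (x, y) as the column vector (x, y, 1), multiply by the
    # 3x3 matrix, and keep only the top two elements of the result
    x, y = p
    return (m[0][0]*x + m[0][1]*y + m[0][2]*1,
            m[1][0]*x + m[1][1]*y + m[1][2]*1)

# translation by (10, 20), written out by hand
t = [[1, 0, 10],
     [0, 1, 20],
     [0, 0, 1]]
print(transform_point(t, (1, 2)))  # (11, 22)
```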
Transforming Directions
Transforming a direction is slightly different than transforming a point. Take a look at the example below.
The ball has been squashed. This is achieved by scaling the y direction down while keeping the x the same. If directions were transformed using the same matrix that was used to transform points, then you would get the red arrow as a result, but notice how the red arrow is no longer perpendicular to the ball. We want the green arrow, which still is. To get that result, use the following equation.
As shown above, to get the proper matrix for transforming directions you need the inverse transpose of the transform matrix. You can avoid transposing a 3x3 matrix if you simply multiply a row vector in front of the matrix. Also take note that the third element is a 0 instead of a 1. This makes transforming a direction unaffected by any translations, which is the proper behavior since directions still point the same way regardless of where they are moved to.
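A small sketch of the squashed-ball example (pure Python, hypothetical helper name). To keep it short it assumes the transform is a pure scale, whose inverse can be written by hand:

```python
# squash: scale y by 0.5; the inverse of that scale multiplies y by 2
inv = [[1, 0, 0],
       [0, 2, 0],
       [0, 0, 1]]

def transform_direction(inv_m, d):
    # row vector (x, y, 0) times the inverse matrix -- equivalent to
    # multiplying the inverse transpose by the column vector
    x, y = d
    return (x*inv_m[0][0] + y*inv_m[1][0],
            x*inv_m[0][1] + y*inv_m[1][1])

# a surface normal (1, 1) on the ball becomes (1, 2): it steepens so it
# stays perpendicular to the squashed surface, unlike the point transform,
# which would flatten it to (1, 0.5)
print(transform_direction(inv, (1, 1)))
```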
Concatenating transforms
The three basic transformations listed above aren't that useful on their own. To make things interesting we want to be able to join them together, allowing objects to be translated, rotated, and scaled simultaneously. To do this we simply multiply the matrices together.
The resulting matrix, M, is a combination of scaling an object, rotating it, then translating it. Keep in mind that the order of T, R, and S matters. Since S is the right-most matrix, its transformation is applied first when transforming a point, followed by R, followed by T. To see this, grab some object nearby, such as a pencil, and hold it so it is pointing to the right. Now imagine that the pencil is first scaled by half in the x direction, making the pencil shorter but not changing its thickness. Then the pencil is rotated 90 degrees upward. The result is a short pencil of the same thickness pointing upward. Now switch the transformation order: first rotate it upward, then scale. The result is a pencil of the same length but half the thickness. So pay attention to the transform order.
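The pencil example can be checked numerically (pure Python, hypothetical helper names):

```python
import math

def mat_mul(a, b):
    # 3x3 matrix product
    return [[sum(a[i][k]*b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, p):
    # transform the point (x, y, 1), keeping the top two elements
    return (m[0][0]*p[0] + m[0][1]*p[1], m[1][0]*p[0] + m[1][1]*p[1])

s = [[0.5, 0, 0], [0, 1, 0], [0, 0, 1]]   # halve x
c, sn = math.cos(math.pi/2), math.sin(math.pi/2)
r = [[c, -sn, 0], [sn, c, 0], [0, 0, 1]]  # rotate 90 degrees CCW

tip = (1, 0)  # pencil tip pointing right
print(apply(mat_mul(r, s), tip))  # scale then rotate: roughly (0, 0.5), a short pencil pointing up
print(apply(mat_mul(s, r), tip))  # rotate then scale: roughly (0, 1), full length pointing up
```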
One thing to note is that rotations pivot around the origin. This means if you move an object up ten units and then rotate it clockwise 90 degrees, it will end up 10 units to the right. If you want the object to pivot around its own center, rotate it first, then move it.
Since matrices can be multiplied to concatenate transforms you can easily keep an object at the same position relative to another, like the example of the worm on a ball described above. You first create a transform using the position and orientation of the worm on the ball. This transform represents the conversion of the worm's local coordinates to the ball coordinates. You then create the transform of the ball relative to world coordinates. Since the worm is on the ball, to find the transform from worm coordinates to world coordinates you simply multiply them together.
Matrix wormToWorldTransform = ballToWorldTransform * wormToBallTransform;
Column Vector vs Row Vector
Before moving into the next section I should mention the difference between column vectors and row vectors. I have been using column vectors in this article. This means vectors are represented as a 3x1 matrix that is multiplied on the right of the transform.
The alternative is row vectors. A row vector is a 1x3 matrix and is multiplied on the left of the transform matrix.
Mathematically, these two methods are the same, and either one works. Some prefer how compact column vectors are to write out on paper; others find row vectors more intuitive. The important thing is to pick one and be consistent with it. However, the graphics package you are using may expect matrices built for row vectors or for column vectors. If you are using the other method you will need to convert your matrices before passing them in. To do so, simply transpose the matrix. As an example, the translation matrix using column vectors is
The same matrix for row vectors is
Also, keep in mind that when using row vectors the order to multiply matrices in is reversed.
So using row vectors, the left most matrix is applied first to the point and the matrix on the far right is applied last. An easy way to remember the difference is the matrix closest to the vector is applied first. This means the order is left to right using row matrices and right to left using column matrices.
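The equivalence of the two conventions can be demonstrated directly (pure Python; the matrices are written out by hand):

```python
def transpose(m):
    # swap rows and columns of a 3x3 matrix
    return [[m[j][i] for j in range(3)] for i in range(3)]

# column-vector translation matrix and its row-vector equivalent
t_col = [[1, 0, 5],
         [0, 1, 7],
         [0, 0, 1]]
t_row = transpose(t_col)

# column convention: p' = M * p   (p is a 3x1 column)
col = (t_col[0][0]*2 + t_col[0][1]*3 + t_col[0][2],
       t_col[1][0]*2 + t_col[1][1]*3 + t_col[1][2])
# row convention:    p' = p * M   (p is a 1x3 row)
row = (2*t_row[0][0] + 3*t_row[1][0] + 1*t_row[2][0],
       2*t_row[0][1] + 3*t_row[1][1] + 1*t_row[2][1])
print(col, row)  # both give (7, 10)
```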
Transform Class
Now that we have covered some basics of how transforms work, let's build a transform class. Since we don't want to deal with matrices at a high level, we will allow the transform to be manipulated using a position, rotation, and scale. The transform class will also have children to allow for a hierarchy. When the parent transform moves, all of the children move with it, and each child's position, rotation, and scale are defined relative to its parent. Finally, the transform will extend the component class that is attached to a game object. Game objects and components are touched upon in this article.
class Transform extends Component
{
// the parent transform of this transform
// if it is null then the parent transform
// is the world coordinate system
private Transform parent;
// all of the transforms that have this
// transform set as their parent
private Transform[] children;
// the position relative to the parent transform
private Vector2 localPosition = new Vector2(0.0f, 0.0f);
// rotation relative to the parent
private float localRotation = 0.0f;
// scale relative to the parent
private Vector2 localScale = new Vector2(1.0f, 1.0f);
// specifies if the localToWorldTransform
// needs to be recalculated
private bool isDirty = false;
// the transform that converts local coordinates
// to world coordinates
private Matrix localToWorldMatrix = Matrix.identity;
// specifies if the worldToLocalMatrix
// needs to be recalculated
private bool isInverseDirty = false;
// the transform that converts world coordinates
// to local coordinates
private Matrix worldToLocalMatrix = Matrix.identity;
/*
* Whenever any change happens that changes the localToWorldMatrix
* this should be called. That way the next time localToWorldMatrix
* is requested it will be recalculated
*/
private void setDirty()
{
// only update dirty boolean if it isn't already dirty
if (!isDirty)
{
isDirty = true;
isInverseDirty = true;
// set all children to be dirty since any modification
// of a parent transform also affects its children's
// localToWorldTransform
foreach (Transform child in children)
{
child.setDirty();
}
}
}
// change the parent transform.
// setting it to null makes the
// transform a child of world coordinates
public void setParent(Transform value)
{
// remove this from the previous parent
if (parent != null)
{
parent.children.remove(this);
}
// assign new parent
parent = value;
// add this to new parent
if (parent != null)
{
parent.children.add(this);
}
// changing parents affects
// the world position
setDirty();
}
public Transform getParent()
{
return parent;
}
// calculates the transform matrix that converts
// from local coordinates to the coordinate space
// of the parent transform
public Matrix calculateLocalToParentMatrix()
{
// Matrix.translate creates a translation matrix
// that shifts by (localPosition.x, localPosition.y)
// Matrix.rotate rotates by localRotation radians
// Matrix.scale scales by a factor of (localScale.x, localScale.y)
// These are the basic transforms that are described previously
// in this article
return Matrix.translate(localPosition) *
Matrix.rotate(localRotation) *
Matrix.scale(localScale);
}
// gets the matrix that converts from local
// coordinates to world coordinates
public Matrix getLocalToWorldMatrix()
{
// if the dirty flag is set, then the
// localToWorldMatrix is out of date
// and needs to be recalculated
if (isDirty)
{
if (parent == null)
{
// if the parent is null then the parent is
// the world so the localToWorldMatrix
// is the same as local to parent matrix
localToWorldMatrix = calculateLocalToParentMatrix();
}
else
{
// if there is a parent, then the localToWorldMatrix
// is calculated recursively using the parent's localToWorldMatrix
// concatenated with the local to parent matrix
localToWorldMatrix = parent.getLocalToWorldMatrix() * calculateLocalToParentMatrix();
}
// clear the dirty flag since the
// matrix is now up to date
isDirty = false;
}
return localToWorldMatrix;
}
public Matrix getWorldToLocalMatrix()
{
if (isInverseDirty)
{
// the inverse is out of date
// so it needs to be updated
// the worldToLocalMatrix is the inverse of
// the localToWorldMatrix
worldToLocalMatrix = getLocalToWorldMatrix().inverse();
// clear the dirty flag since the
// matrix is now up to date
isInverseDirty = false;
}
return worldToLocalMatrix;
}
// transforms a point from local coordinates to world coordinates
public Vector2 transformPoint(Vector2 point)
{
// matrix multiply padding the extra element with a 1
Matrix3x1 transformResult = getLocalToWorldMatrix() * Matrix3x1(point.x, point.y, 1);
return new Vector2(transformResult[1,1], transformResult[2,1]);
}
// transforms a direction from local coordinates to world coordinates
public Vector2 transformDirection(Vector2 point)
{
// matrix multiply padding the extra element with a 0
// notice that the worldToLocalMatrix is used here
// and the point is multiplied as a row matrix before the
// transform matrix. This is the proper way to transform
// directions as described before in this article
Matrix1x3 transformResult = Matrix1x3(point.x, point.y, 0) * getWorldToLocalMatrix();
return new Vector2(transformResult[1,1], transformResult[1,2]);
}
// transforms a point from world coordinates to local coordinates
public Vector2 inverseTransformPoint(Vector2 point)
{
// same logic as transformPoint only with the inverse matrix
Matrix3x1 transformResult = getWorldToLocalMatrix() * Matrix3x1(point.x, point.y, 1);
return new Vector2(transformResult[1,1], transformResult[2,1]);
}
// transforms a direction from world coordinates to local coordinates
public Vector2 inverseTransformDirection(Vector2 point)
{
// same logic as transformDirection only with the inverse of the
// inverse localToWorldMatrix which is just the localToWorldMatrix
Matrix1x3 transformResult = Matrix1x3(point.x, point.y, 0) * getLocalToWorldMatrix();
return new Vector2(transformResult[1,1], transformResult[1,2]);
}
public Vector2 getLocalPosition()
{
return localPosition;
}
// sets the position relative to the parent
// and marks the transform as dirty
public void setLocalPosition(Vector2 value)
{
localPosition = value;
// set the dirty flag since the localToWorldMatrix needs to be updated
setDirty();
}
/* localRotation and localScale should also have getters and setters
* like the local position does. Be sure to call setDirty in the
* setters for each of them */
}
There is a lot of code there, so let me explain the key points. First of all, I am using a dirty flag to indicate when the transform matrices are out of date. That allows the position, rotation, and scale to be changed multiple times while the matrix is only actually recalculated once it is requested, reducing the amount of unneeded recalculation. Also take note that the dirty flag is only set if the transform is not already dirty. This keeps the setDirty call from propagating to all of the children every time the transform is modified, setting the dirty flag only when necessary.
The inverse matrix is also stored in the transform to keep from having to calculate it every time it is needed. It has its own dirty flag so it isn't recalculated unless it is actually requested.
Using the Transform Class
Now that we have a transform class, let's see how it can be used. First of all, the localToWorldMatrix can be used in draw calls. Most drawing libraries will allow you to specify a matrix to position objects on the screen.
class Renderer extends Component
{
void render(graphics)
{
graphics.setTransformMatrix(this.gameObject.transform.getLocalToWorldMatrix());
graphics.drawSprite(this.frame);
}
// whatever else the renderer does
// would go here
...
}
Keep in mind the above code doesn't account for the view transform. Think of the view transform as the camera for the scene. If you wanted to be able to scroll the view of the scene you should specify a camera object and multiply all transforms by the worldToLocalMatrix from the transform of the camera.
graphics.setTransformMatrix(
camera.transform.getWorldToLocalMatrix() *
gameObject.transform.getLocalToWorldMatrix());
The transform could be used in game logic code
class Mine extends Behavior
{
void update(float deltaTime, InputState input)
{
// transforming the zero vector gets that
// transform's origin in world coordinates
Vector2 playerWorldPosition = player.transform.transformPoint(new Vector2(0.0, 0.0));
// take the player's world position and convert it to the mine's
// local coordinates
Vector2 playerLocalPosition = this.transform.inverseTransformPoint(playerWorldPosition);
// since playerLocalPosition is the player's position relative to the mine
// the magnitude of the position is the distance from the mine to the player
if (playerLocalPosition.magnitudeSquared() < sensorRange * sensorRange)
{
this.detonate();
}
}
}
Conclusion
Transforms may take a while to fully grasp, but once understood they help simplify problems that would otherwise be a huge challenge to tackle. Hopefully this article helped you understand how to use them in your games.
Article Update Log
5 Feb 2014: Initial release

Math for Game Developers: Triangle Meshes
https://www.gamedev.net/articles/programming/math-and-physics/math-for-game-developers-triangle-meshes-r3540/
Math for Game Developers is exactly what it sounds like - a weekly instructional YouTube series wherein I show you how to use math to make your games. Every Thursday we'll learn how to implement one game design, starting from the underlying mathematical concept and ending with its C++ implementation. The videos will teach you everything you need to know; all you need is a basic understanding of algebra and trigonometry. If you want to follow along with the code sections, it will help to know a bit of programming already, but it's not necessary. You can download the source code that I'm using from GitHub, from the description of each video. If you have questions about the topics covered or requests for future topics, I would love to hear them! Leave a comment, or ask me on my Twitter, @VinoBS
The video below contains the playlist for all the videos in this series, which can be accessed via the playlist icon in the bottom-right corner of the embedded video frame once the video is playing. The first video in the series is loaded automatically.
Triangle Meshes
[playlist=PLW3Zl3wyJwWNdt_cjnzfzR27GkxnyDP4r]
Math for Game Developers: Geometry Testing
https://www.gamedev.net/articles/programming/math-and-physics/math-for-game-developers-geometry-testing-r3539/
Math for Game Developers is exactly what it sounds like - a weekly instructional YouTube series wherein I show you how to use math to make your games. Every Thursday we'll learn how to implement one game design, starting from the underlying mathematical concept and ending with its C++ implementation. The videos will teach you everything you need to know; all you need is a basic understanding of algebra and trigonometry. If you want to follow along with the code sections, it will help to know a bit of programming already, but it's not necessary. You can download the source code that I'm using from GitHub, from the description of each video. If you have questions about the topics covered or requests for future topics, I would love to hear them! Leave a comment, or ask me on my Twitter, @VinoBS
The video below contains the playlist for all the videos in this series, which can be accessed via the playlist icon in the bottom-right corner of the embedded video frame once the video is playing. The first video in the series is loaded automatically.
Geometry Testing
[playlist=PLW3Zl3wyJwWN6V7IEb2BojFYOlgpryp1-]
Advanced Intersection Test Methods
https://www.gamedev.net/articles/programming/math-and-physics/advanced-intersection-test-methods-r3536/
When working with geometry, you'll need to do some intersection tests at some point. Sometimes it's directly related to graphics, but sometimes it helps determine other useful things, like optimum paths. This article is meant to give the up-and-coming game developer a few more tools in their computational toolbox.
For this article, I'm assuming you know all about vectors, points, dot and cross products. We'll cover some quick properties of polynomials, some things about some basic curves, and then go over intersection tests.
Note: In an attempt to make this article more accessible to those people uncomfortable with making code from straight math, I've added some coding tips to help you figure out how to use these methods.
Polynomials
The Stone-Weierstrass theorem states that any continuous function defined on a closed interval can be uniformly approximated as closely as desired with a polynomial function. For graphics and game development, that means that most 2D objects we deal with can be expressed as polynomials. The key to these intersection test methods is using that to our advantage.
Bezout's Theorem
In order to find all our intersection points, we need to know how many intersections there can be between 2 polynomials. Bezout's theorem states that a planar algebraic curve of degree n and a second planar algebraic curve of degree m intersect in exactly mn points if we properly count complex intersections, intersections at infinity, and possible multiple intersections. If they intersect in more than that number of points, then they intersect at infinitely many points, which means they are the same curve.
Bezout's theorem also extends to surfaces. A surface of degree m intersects a surface of degree n in a curve of degree mn. As well, a space curve of degree m intersects a surface of degree n in mn points.
The key here is to identify how to count intersections. For example, if 2 curves are tangent, they intersect twice. If they have the same curvature, then they intersect 3 times. Simple intersections (not tangent and not self-intersecting) are counted once. So how do we count intersections at infinity? Using homogeneous coordinates, of course!
Homogeneous Coordinates
Counting intersections at infinity sounds hard, but we can use homogeneous coordinates to do this. Here, we define a point in 3D homogeneous space \((X,Y,W)\) to correspond to a 2D point \((x,y)\) whose coordinates are \((X/W,Y/W)\). This means a 3D homogeneous point \((4,2,2)\) corresponds to the 2D point \((4/2,2/2) = (2,1)\). Going the other way, the 2D point \((3,1)\) becomes the 3D point \((3,1,1)\), since the transformation back to 2D is simply \((3/1,1/1) = (3,1)\). This creates some weird equalities, but you can visualize this 3D-2D transformation as projecting the points (and curves) in 3D onto the plane \(w=1\).
This helps us define infinities with finite numbers. Let's say we have the point \((2,3,1)\) in homogeneous coordinates. If we changed the weight coordinate W so that we had \((2,3,0.5)\), the 2D point would be \((4,6)\). Smaller weights mean larger projected coordinates. As the weight approaches zero, the 2D point gets larger and larger, until at W = 0 the projected point is at infinity. Thus, any homogeneous point \((X,Y,0)\) is a point at infinity.
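The projection back to 2D can be sketched in a few lines (pure Python, hypothetical helper name):

```python
def to_2d(X, Y, W):
    # project a homogeneous point (X, Y, W) onto the plane w = 1
    if W == 0:
        return None  # a point at infinity has no finite 2D image
    return (X / W, Y / W)

print(to_2d(4, 2, 2))    # (2.0, 1.0)
print(to_2d(2, 3, 0.5))  # (4.0, 6.0): smaller weights push the point outward
print(to_2d(2, 3, 0))    # None: the point at infinity
```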
Aside: Equations in Homogeneous Form
Usually, polynomials have terms that differ in their algebraic degrees. Some are quadratic, some cubic, some constant, etc. The polynomial takes the following form:
\[f(x,y) = \sum_{i+j\le n} a_{ij}x^iy^j = 0 \]
However, the same curve can be expressed in homogeneous form by adding a homogenizing variable w:
\[f(X,Y,W) = \sum_{i+j+k = n} a_{ij}X^iY^jW^k = 0 \]
Here, every term in the polynomial is of degree n.
Equation Types
To use polynomials effectively, we need to be familiar with how they can be expressed. There are basically 3 types of equations that can be used to describe planar curves: parametric, implicit, and explicit. If you've only had high-school math, you've probably dealt with explicit curves a lot and not so much with the others.
Parametric: The curve is specified by a real number \(t\) and each coordinate is given by a function of \(t\), like \(x = x(t)\) and \(y = y(t)\). Rational polynomials are defined the same way, but with each coordinate divided by an additional weight function \(w(t)\), like \(x = x(t)/w(t)\) and \(y = y(t)/w(t)\).
Implicit: This curve takes the form \(f(x,y) = 0\), where the point \((x,y)\) is on the curve.
Explicit: This is really a special case of both parametric and implicit forms. The explicit form of a curve is the classic \(y=f(x)\) form.
Lines
Let's start with a very basic curve: the line. It's a degree-1 polynomial. There are many definitions of this kind of curve. Some are helpful for games and others...not so much. Some you may have seen in algebra class and others you may be seeing for the first time.
Common Forms
Slope-intercept: \( y = mx+b \)
Point-slope: \( y - y_0 = m(x-x_0) \)
Affine: \( P(t) = [(t_1-t)P_0+(t-t_0)P_1] / (t_1-t_0) \)
Vector Implicit: \((P-P_0)\cdot n = 0\)
Parametric: \( P(x(t),y(t)) = (x_0+at,y_0+bt) \)
Algebraic Implicit: \( ax+by+c=0 \)
In secondary schools, they use the first 2 methods almost exclusively, but these turn out to be the least helpful for computing. Here, I've tried to order the definitions from least helpful to most helpful for our needs.
Sometimes it's helpful to illustrate how useful these forms can be with a motivating example. This will hopefully help to connect formulas like dot product to traditional algebra and polynomials so you can see the connections.
Motivating Example: Closest Point on 3D Line to a Given Point
In 3D, we can't represent a line as an implicit equation like ax+by+c=0. It becomes a parametric equation:
\[ L(x(t),y(t),z(t)) = \begin{cases} x&=x_0+ut \\ y&=y_0+vt \\ z&=z_0+wt \end{cases}\]
where \(\vec{a}=(u,v,w)\) defines the vector along the line and \(P_0=(x_0,y_0,z_0)\) defines a point on the line. These two things define the line completely.
What we want is an expression for the closest point on that line to another given point, which we'll name \(Q\). If we drew a vector from the closest point on the line to \(Q\) we would see that this vector would be perpendicular to \(\vec{a}\), or in other words, the dot product will be zero. This point will be unique on the line, so we should be able to solve for it algebraically. Remember the equation for the general point is \(P=P_0+\vec{a}t\).
The dot product equation would look like this:
\[ \begin{aligned} \vec{a} \cdot (Q-P) &= 0 \\ (u,v,w) \cdot (dx-ut,dy-vt,dz-wt) &= 0 \end{aligned} \]
where \( (dx,dy,dz) = (x_Q-x_{P_0},y_Q-y_{P_0},z_Q-z_{P_0} ) \). If we expand this out and solve for \(t\), we get the following expression:
\[ t = \frac{\vec{a} \cdot (Q-P_0)}{\vec{a} \cdot \vec{a}} \]
If we restrict \(\vec{a}\) to be a unit vector, the denominator will become 1 and vanish. This gives us a very simple expression for the point on the line, \( P = P_0 + \left [ \vec{a} \cdot (Q-P_0) \right ] \vec{a} \). The parameter multiplied to the vector is simply the dot product of the vector \(\vec{a}\) with the vector from the known point on the line to the given point. This serves to show 2 things:
The dot product helps us find components of vectors in certain directions. We can see that what we get back from the dot product is the component of the \((Q-P_0)\) vector in the direction of \(\vec{a}\) vector. This is consistent with the linear algebra definition.
If we're clever, we can use parametric and implicit equations to build polynomials and solve for the parameter values using algebra. These parameter values will help us find the points they're related to, if we need that information.
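The closest-point derivation above translates almost directly into code (pure Python, hypothetical helper name; \(\vec{a}\) is assumed to be a unit vector so the denominator vanishes):

```python
def closest_point_on_line(p0, a, q):
    # p0: a point on the line, a: unit direction vector, q: the query point
    # t = a . (q - p0); closest point = p0 + t * a
    dx, dy, dz = (q[0]-p0[0], q[1]-p0[1], q[2]-p0[2])
    t = a[0]*dx + a[1]*dy + a[2]*dz
    return (p0[0] + t*a[0], p0[1] + t*a[1], p0[2] + t*a[2])

# line along the x axis through the origin; query point above it
print(closest_point_on_line((0, 0, 0), (1, 0, 0), (3, 4, 0)))  # (3, 0, 0)
```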
Implicit Form, Line-Point, and Line-Line Intersection
Probably one of the most useful forms, the implicit form is nice for a few reasons. First, we can express it in homogeneous coordinates: \(aX+bY+cW=0\). From here, we can define the coordinates as an ordered triple \((a,b,c)\) and then modify the equation to use the dot product so we can use a simple test if a point lies on a given line in implicit form:
\[L(a,b,c) \cdot P(X,Y,W) = 0\]
Coding Tip: Any decent vector library will have a dot product method for 3D vectors (if not, coding one up is trivial). Passing in the ordered triple as one vector and the point in homogeneous form to the dot product function will return a scalar value (number). That scalar can be tested against an epsilon value to see if it's near enough to the line:
bool pointOnLine = fabs(dot(L, P)) < epsilon;
Signed Distance From a Line
Although this isn't strictly an intersection test, this method is valuable enough to mention here. While points that lie on a line satisfy the equation \(L(a,b,c) \cdot P(X,Y,W) = 0\), if the point is not on the line, the sign of the scalar value tells us which side of the line the point lies on. If \(L(a,b,c) \cdot P(X,Y,W) > 0\), the point lies on one side of the line; if \(L(a,b,c) \cdot P(X,Y,W) < 0\) it lies on the other. This is a convenient way of dividing a plane into 2 separate regions.
This scalar value is also a scaled distance from the line. If only the relative position of a point from the closest point on a line is desired, then this scalar can suffice. However, if true distance is required, then the full signed distance formula for a point \((x,y)\) from an implicit line defined by the ordered triple \((a,b,c)\) is given by the following:
\[D = \frac{ax+by+c}{\sqrt{a^2+b^2}} \]
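The signed distance formula is a one-liner (pure Python, hypothetical helper name):

```python
import math

def signed_distance(a, b, c, x, y):
    # distance of (x, y) from the implicit line ax + by + c = 0,
    # with the sign telling which side of the line the point is on
    return (a*x + b*y + c) / math.sqrt(a*a + b*b)

# the vertical line x = 2, written implicitly as 1*x + 0*y - 2 = 0
print(signed_distance(1, 0, -2, 5, 0))  # 3.0: three units on the positive side
print(signed_distance(1, 0, -2, 0, 0))  # -2.0: two units on the other side
```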
Side note: In parametric form, the line is defined as a point \((x_0,y_0)\) and a direction vector \(\vec{d}= (u,v)\). Through a quick derivation, we can equate the implicit form and parametric form to determine the "direction" of the line. Here, the implicit coefficients are related to the direction vector like this:
\[\begin{aligned}A&=-v\\B&=u\\C&=vx_0-uy_0\\\end{aligned}\]
We want to use this notion of direction to apply to the signed distance formula so we can tell what side of the line we are on when looking in the direction of the line. Using this convention, we can say that if the sign of the dot product is greater than zero, it lies on the left side of the line. If the dot product is less than zero, it lies on the right side.
The reader should note that if this "right is less than zero, left is greater than zero" convention bothers anyone, you can flip this by multiplying the A, B, C coefficients by -1 to make "right is greater than 0, left is less than 0".
Implicit Line from 2 Points
This ordered triple forms a convenient way to define coordinate axes. For example, the x-axis can be defined as \((0,1,0)\) and the y-axis can be defined as \((1,0,0)\). The ordered triple is also convenient because we can compute it easily if we have 2 points. Let's go back to our vector math to see how this might work.
In the above picture, two vectors \(\vec{a}\) and \(\vec{b}\) are crossed to produce a 3rd vector \(\vec{c} = \vec{a}\times\vec{b}\). This vector \(\vec{c}\) is orthogonal to the vectors \(\vec{a}\) and \(\vec{b}\), meaning that \(\vec{c}\cdot\vec{a} = 0\) and \(\vec{c}\cdot\vec{b} = 0\). This is important for this next neat trick.
Imagine we have points \(P_1\) and \(P_2\). To tie back into the example above, let's let \(\vec{a} = P_1 = (x_1,y_1,w_1)\) and \(\vec{b} = P_2 = (x_2,y_2,w_2)\). If we cross these vectors, we get a vector \(\vec{c} = \vec{a} \times\vec{b}\) for which \(\vec{c}\cdot\vec{a} = 0\) and \(\vec{c}\cdot\vec{b} = 0\). Since \(\vec{a}\) and \(\vec{b}\) are points, we could say that \(\vec{c} = (a,b,c)\), an implicit ordered triple that passes through both points. We can check that the line given by \((a,b,c)\) passes through both points by the dot product method.
\[L(a,b,c) = P(X_1,Y_1,W_1) \times P(X_2,Y_2,W_2)\]
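The cross-product trick for finding a line through two points can be checked directly (pure Python, hypothetical helper name):

```python
def cross(u, v):
    # standard 3D cross product
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

# line through (0, 0) and (1, 1) in homogeneous form
L = cross((0, 0, 1), (1, 1, 1))
print(L)  # (-1, 1, 0): the line -x + y = 0, i.e. y = x

# both points satisfy L . P = 0, confirming they lie on the line
print(L[0]*0 + L[1]*0 + L[2]*1, L[0]*1 + L[1]*1 + L[2]*1)  # 0 0
```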
Line-Line Intersection
Using the same kind of logic, we can get the point at which 2 lines intersect. To use the cross product example again, let's let the vectors be the ordered triples of 2 implicit lines, \(\vec{a} = (a_1,b_1,c_1)\) and \(\vec{b} = (a_2,b_2,c_2)\). Again, crossing these vectors yields a vector \(\vec{c} = \vec{a} \times\vec{b}\) for which \(\vec{c}\cdot\vec{a} = 0\) and \(\vec{c}\cdot\vec{b} = 0\). Since \(\vec{a}\) and \(\vec{b}\) are lines, we could say that \(\vec{c} = (X,Y,W)\), the intersection point of both lines. This point has to be on both lines, and we can verify that using the dot-product method.
\[P(X,Y,W) = L(a_1,b_1,c_1) \times L(a_2,b_2,c_2)\]
Connection to Bezout's Theorem
Lines are degree-1 algebraic curves. Bezout's theorem states that 2 lines must intersect at exactly 1 point. So what happens with parallel lines? Well, if we use homogeneous coordinates, the intersection point will take the form \((X,Y,0)\), a point at infinity. Bezout's theorem still holds for that case.
Coding Tip: Any decent vector library will have a cross product method for 3D vectors (if not, coding one up is trivial). Passing in the ordered triples as vectors to the cross product function will return a 3D vector. This is the point in homogeneous form. In pseudocode:
Vector P = cross(new Vector(a1,b1,c1), new Vector(a2,b2,c2));
// check to see if the lines are parallel
if(P[2] != 0)
{
// this is to transform back to (x,y) from (X,Y,W)
x = P[0] / P[2];
y = P[1] / P[2];
}
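To make the tip concrete, here's a minimal runnable sketch in Python (the names `cross`, `join`, and `meet` are my own, not from any particular library). The same cross product computes both the line through two points and the intersection of two lines:

```python
def cross(a, b):
    # standard 3D cross product; works on homogeneous points and lines alike
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def join(p1, p2):
    # line (a, b, c) through two homogeneous points (X, Y, W)
    return cross(p1, p2)

def meet(l1, l2):
    # intersection point (X, Y, W) of two implicit lines (a, b, c)
    return cross(l1, l2)

# line through (0,0) and (1,1), i.e. y = x, in homogeneous form:
line = join((0, 0, 1), (1, 1, 1))
# intersect it with the vertical line x = 2, i.e. 1*x + 0*y - 2 = 0:
X, Y, W = meet(line, (1, 0, -2))
if W != 0:                       # W == 0 would mean the lines are parallel
    print(X / W, Y / W)          # → 2.0 2.0
```
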
Parametric-Implicit Curve Intersection
Sometimes it's advantageous to define some curves as parametric and some as implicit to solve for intersections. Most times, it's better to define the simpler curve as parametric and the more complex curve as implicit, if possible. This method solves for all algebraic intersections, which may or may not be "real" intersections.
Line-Circle Intersection
There are cases where there might be multiple roots, in which case we have to reconcile this with our intersection counting methods above (i.e. we have to count some of them differently). Let's take a fairly common case: line-circle intersection. Bezout's theorem says we will get 2 intersections since a circle is a degree-2 algebraic curve. The parametric line at point \((x_0,y_0)\) with direction \((c,d)\) and implicit circle centered at \((a,b)\) with radius \(r\) are defined as follows:
\[ \begin{aligned} C(x,y) &= (x-a)^2+(y-b)^2-r^2 = 0 \\
L(x(t),y(t)) &= \begin{cases} x &= x_0+ct \\ y&=y_0+dt \end{cases} \end{aligned}\]
We can substitute the parametric equations into the implicit equation to get a polynomial in terms of the parameter \(t\). The roots of the polynomial are the parameter values at which the line intersects the circle. Substituting, we get:
\[
\begin{aligned}
0 &= (x_0+ct-a)^2+(y_0+dt-b)^2-r^2 \\
&= [c^2t^2+2c(x_0-a)t+(x_0-a)^2]+[d^2t^2+2d(y_0-b)t+(y_0-b)^2]-r^2 \\
&= (c^2+d^2)t^2+2[c(x_0-a)+d(y_0-b)]t + [(x_0-a)^2+(y_0-b)^2-r^2] \\
&= At^2+Bt+C \\
\end{aligned}
\]
This looks promising. The quadratic formula quickly solves for the roots of the polynomial:
\[t=\frac{-B\pm\sqrt{B^2-4AC}}{2A}\]
The result of this will either be 2 complex roots, 1 real root, or 2 real roots, depending on the discriminant \(B^2-4AC\). The 1 real root case means the line is tangent to the circle, which according to our intersection counting rules above, we count twice. (Note: technically we have 1 real root with multiplicity 2 since we have the roots \(t = (-B+0)/2A\) and \(t = (-B-0)/2A\).) Bezout's theorem still holds.
That might be nice mathematically, but what does that mean for us? Well, the parameter \(t\) has to be a real number, so the real roots are values of \(t\) that the line intersects, so plugging the roots back into the parametric line gives us the Cartesian points of the intersection. What about the complex roots? Well, the restriction on \(t\) is that it has to be a real number, so in the case of 2 complex roots, we say the line doesn't intersect the circle.
Coding Tip: Here, it's easy enough to get the constants A, B, and C by following the formula. To check for intersections, just test if the discriminant \(B^2-4AC \ge 0\). If it's equal to zero, the line is tangent and the single parameter value is \(t = -B/(2A)\). Otherwise, use the quadratic formula to find the values of \(t\). Either way, plug the values of \(t\) back into the parametric line equation to get the intersection points. In pseudocode:
A = c*c+d*d;
B = 2*(c*(x0-a)+d*(y0-b));
C = (x0-a)*(x0-a)+(y0-b)*(y0-b)-r*r;
disc = B*B-4*A*C;
if(disc == 0)
{
t = -B/(2*A);
x1 = x0 + c*t;
y1 = y0 + d*t;
return [Point(x1,y1)];
}
else if(disc > 0)
{
t1 = (-B+sqrt(disc))/(2*A);
t2 = (-B-sqrt(disc))/(2*A);
x1 = x0 + c*t1;
y1 = y0 + d*t1;
x2 = x0 + c*t2;
y2 = y0 + d*t2;
return [Point(x1,y1), Point(x2,y2)];
}
else return [];
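If you'd rather see it runnable, here's the same logic as a small Python function (the name is mine). Collecting the roots in a set lets the tangent case fall out naturally as a single point:

```python
import math

def line_circle_intersections(x0, y0, c, d, a, b, r):
    # line: (x0 + c*t, y0 + d*t); circle: center (a, b), radius r
    A = c*c + d*d
    B = 2 * (c*(x0 - a) + d*(y0 - b))
    C = (x0 - a)**2 + (y0 - b)**2 - r*r
    disc = B*B - 4*A*C
    if disc < 0:
        return []                     # complex roots: no real intersection
    roots = {(-B + math.sqrt(disc)) / (2*A),
             (-B - math.sqrt(disc)) / (2*A)}   # a set, so disc == 0 gives 1 root
    return [(x0 + c*t, y0 + d*t) for t in sorted(roots)]

# horizontal line through (-2, 0) hitting the unit circle at (-1, 0) and (1, 0):
print(line_circle_intersections(-2, 0, 1, 0, 0, 0, 1))   # → [(-1.0, 0.0), (1.0, 0.0)]
```
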
Line-Sphere Intersection
Line-sphere intersection is just like line-circle intersection. Algebraically, a sphere is also a degree-2 polynomial, and Bezout's theorem says we will still only get 2 intersections. The parametric line at point \((x_0,y_0,z_0)\) with direction \((p,q,r)\) and implicit sphere centered at \((a,b,c)\) with radius \(R\) are defined as follows:
\[ \begin{aligned} S(x,y,z) &= (x-a)^2+(y-b)^2+(z-c)^2-R^2 = 0 \\
L(x(t),y(t),z(t)) &= \begin{cases} x &= x_0+pt \\ y&=y_0+qt \\ z&=z_0+rt \end{cases} \end{aligned}\]
We can substitute the parametric equations into the implicit equation to get a polynomial in terms of the parameter \(t\). The roots of the polynomial are the parameter values at which the line intersects the sphere. Substituting, we get:
\[
\begin{aligned}
0 &= (x_0+pt-a)^2+(y_0+qt-b)^2+(z_0+rt-c)^2-R^2 \\
&= [p^2t^2+2p(x_0-a)t+(x_0-a)^2]+[q^2t^2+2q(y_0-b)t+(y_0-b)^2]+[r^2t^2+2r(z_0-c)t+(z_0-c)^2]-R^2 \\
&= (p^2+q^2+r^2)t^2+2[p(x_0-a)+q(y_0-b)+r(z_0-c)]t + [(x_0-a)^2+(y_0-b)^2+(z_0-c)^2-R^2] \\
&= At^2+Bt+C \\
\end{aligned}
\]
The solution method is exactly the same as for line-circle intersections.
Line-Plane Intersection
This is a simpler case than the line-circle intersection, although it involves a surface and a curve. A plane is algebraically a degree-1 polynomial in implicit form, so according to Bezout's theorem, they should intersect at exactly 1 point. We take the plane in implicit form and the line in parametric form and apply our method as above:
\[
\begin{aligned}
P(x,y,z) &= ax+by+cz+d=0 \\
L(x(t),y(t),z(t)) &= \begin{cases} x &= x_0+ut \\ y&=y_0+vt \\ z &= z_0+wt \\ \end{cases} \end{aligned}\]
We substitute the parametric equations into the implicit form and solve for \(t\) as before:
\[
\begin{aligned}
P(x(t),y(t),z(t)) = 0 &= a(x_0+ut)+b(y_0+vt)+c(z_0+wt)+d \\
&= (ax_0+by_0+cz_0+d) + (au+bv+cw)t \\
t &= -\frac{ax_0+by_0+cz_0+d}{au+bv+cw} \\
&= -\frac{(a,b,c,d)\cdot(x_0,y_0,z_0,1)}{(a,b,c,d)\cdot(u,v,w,0)}
\end{aligned}
\]
By substituting the value for \(t\) into the parametric line, we get the intersection point of the line and the plane. If the line runs parallel to the plane, the dot product of the plane normal and the line direction (which is the denominator) will be zero.
Coding Tip: Getting the plane and line into this form is probably the trickiest thing here. Once you've done that, you can use the dot product operation for a 4D vector to get what you're after (if your vector library doesn't support 4D dot product, either code it up or use the equation given above before the dot product representation). In pseudocode:
Vector4 PL = new Vector4(a,b,c,d);
Vector4 P = new Vector4(x0,y0,z0,1);
Vector4 D = new Vector4(u,v,w,0);
den = dot(PL,D);
if(den != 0) // check for div by zero
{
t = -dot(PL,P) / den;
x = x0 + u*t;
y = y0 + v*t;
z = z0 + w*t;
}
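Here's a runnable Python version of the tip, spelled out with plain arithmetic so the 4D dot products are visible (the function name is mine):

```python
def line_plane_intersection(plane, p0, direction):
    # plane = (a, b, c, d) for the implicit plane ax + by + cz + d = 0
    # p0 = a point on the line, direction = the line's direction vector
    a, b, c, d = plane
    x0, y0, z0 = p0
    u, v, w = direction
    den = a*u + b*v + c*w             # dot((a,b,c,d), (u,v,w,0))
    if den == 0:
        return None                   # the line is parallel to the plane
    t = -(a*x0 + b*y0 + c*z0 + d) / den
    return (x0 + u*t, y0 + v*t, z0 + w*t)

# the z-axis meets the plane z = 5 at (0, 0, 5):
print(line_plane_intersection((0, 0, 1, -5), (0, 0, 0), (0, 0, 1)))   # → (0.0, 0.0, 5.0)
```
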
Ray-Triangle Intersection
One method of ray-triangle intersection is basically a two-step process: first, compute the point where the ray (line) intersects the triangle's plane; second, determine whether that point lies inside the triangle. Barycentric coordinates are a common way of doing this, but another method is to create lines in the plane along the triangle's edges, oriented so that a point inside the triangle lies on the same side of every line (i.e. the point is on the left of each line, or on the right of each line). The side of a line a point falls on can be computed using the scaled signed distance as detailed above.
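Assuming the plane hit from the first step has been expressed in 2D coordinates within the triangle's plane, the second step can be sketched like this (a hypothetical Python version; the sign bookkeeping handles either winding order):

```python
def side(p, a, b):
    # scaled signed distance of p from the line through a and b:
    # positive on one side, negative on the other, zero on the line
    return (b[0] - a[0])*(p[1] - a[1]) - (b[1] - a[1])*(p[0] - a[0])

def point_in_triangle(p, a, b, c):
    # p is inside iff it's on the same side of all three edges
    s1, s2, s3 = side(p, a, b), side(p, b, c), side(p, c, a)
    has_neg = s1 < 0 or s2 < 0 or s3 < 0
    has_pos = s1 > 0 or s2 > 0 or s3 > 0
    return not (has_neg and has_pos)
```
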
Higher-Degree Curve Intersection
Circle-Circle Intersection
This implicit-parametric curve intersection method even works for higher degrees. Analytically, the Abel-Ruffini theorem states that there is no general algebraic solution (in radicals) for polynomials of degree 5 or higher, so beyond that we'll have to use numerical methods.
In the case of circle-circle intersection, Bezout's theorem says we could get up to 4 intersections, since a circle is a degree-2 algebraic curve; however, only up to 2 of them will ever be real intersections.
The parametric circle at point \((x_1,y_1)\) with radius \(r_1\) and implicit circle centered at \((x_0,y_0)\) with radius \(r_0\) are defined as follows:
\[ \begin{aligned} C(x,y) &= (x-x_0)^2+(y-y_0)^2-r_0^2 = 0 \\
L(x(t),y(t)) &= \begin{cases} x &= x_1+r_1 \cos{t} \\ y &= y_1+r_1 \sin{t} \end{cases} \end{aligned}\]
We can combine the equations as before:
\[
\begin{aligned}
0 &= (x_1+r_1 \cos{t} - x_0)^2+(y_1+r_1 \sin{t}-y_0)^2-r_0^2 \\
&= [(x_1-x_0)+r_1 \cos{t}]^2 + [(y_1-y_0)+r_1 \sin{t}]^2 - r_0^2 \\
&= [(x_1-x_0)^2 + 2 r_1 \cos{t}(x_1-x_0)+r_1^2 \cos^2{t}] + [(y_1-y_0)^2 + 2 r_1 \sin{t}(y_1-y_0)+r_1^2 \sin^2{t}] - r_0^2 \\
&= [(x_1-x_0)^2 + (y_1-y_0)^2] + 2 r_1 [\cos{t}(x_1-x_0)+\sin{t}(y_1-y_0)] + r_1^2 [\cos^2{t} + \sin^2{t}] - r_0^2 \\
&= [(x_1-x_0)^2 + (y_1-y_0)^2 + (r_1^2- r_0^2)] + [2 r_1 (x_1-x_0)] \cos{t} + [2 r_1 (y_1-y_0)] \sin{t} \\
&= - c + a \cos{t} + b \sin{t} \\
\end{aligned}
\]
where \(a = 2 r_1 (x_1-x_0)\), \(b = 2 r_1 (y_1-y_0)\), and \(c = -[(x_1-x_0)^2 + (y_1-y_0)^2 + r_1^2 - r_0^2]\). Solving such a problem is a bit tricky, but we can take advantage of the sum angle identity because it has a similar structure:
\[ \sin{(\alpha + \beta)} = \sin{\alpha}\cos{\beta}+\cos{\alpha}\sin{\beta} \]
If we set \(t = \alpha\), we just have to find a \(\beta\) such that \(b = \cos{\beta}\) and \(a = \sin{\beta}\). There are two problems, however. Although \(\sin^2{\beta} + \cos^2{\beta} = 1\), we can't be sure that \(a^2+b^2=1\). Also, \(a\) and \(b\) have to be between -1 and 1, because that's the range of both the sine and cosine functions. If we alter the equation above to meet these criteria, we can apply the sum angle identity and solve.
If we let \(A = \frac{a}{\sqrt{a^2+b^2}}\) and \(B = \frac{b}{\sqrt{a^2+b^2}}\), both coefficients are between -1 and 1 and \(A^2+B^2=1\). Thus we can set \(A = \sin{\beta}\) and \(B = \cos{\beta}\), and we can set up the following equation:
\[
\begin{aligned}
a \cos{t} + b \sin{t} &= \sqrt{a^2+b^2}(A \cos{t} + B \sin{t}) \\
&= \sqrt{a^2+b^2}(\sin{\beta} \cos{t} + \cos{\beta} \sin{t}) \\
&= \sqrt{a^2+b^2} \sin{(t +\beta)} \\
\end{aligned}
\]
Substituting into the top equation, we get \(\sqrt{a^2+b^2} \sin{(t +\beta)} = c\), or:
\[\sin{(t +\beta)} = C\]
where \(C = \frac{c}{\sqrt{a^2+b^2}}\).
When solving this equation, remember that while the arcsine function returns only one value, there are technically 2 values of \(t+\beta\) on the interval \([0,2\pi)\) that satisfy the equation: if \(\arcsin{C}\) is one solution, then \(\pi - \arcsin{C}\) is the other.
With this we can solve the following with a few steps:
Solve for \(\beta\) by using \(\tan{\beta} = A/B\).
Solve for the 2 possible values of \(t+\beta\) by using the last equation.
Solve for the possible values of \(t\).
Solve for the intersections by substituting the values of \(t\) into the parametric equation.
Just like in the line-circle case, we can separate the 0, 1, and 2 intersection cases as follows:
If \(|C| > 1 \) , then the circles don't intersect at all.
If \(|C| = 1 \) , then the circles are tangent.
If \(|C| < 1 \) , then the circles intersect in 2 places.
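To see the whole recipe end to end, here's a runnable Python sketch of the method (the function name and the modulo-2π normalization of t are my own choices; the normalization folds the two branches onto the same angle when they coincide):

```python
import math

def circle_circle_intersections(x0, y0, r0, x1, y1, r1):
    # implicit circle: center (x0, y0), radius r0
    # parametric circle: center (x1, y1), radius r1
    a = 2 * r1 * (x1 - x0)
    b = 2 * r1 * (y1 - y0)
    c = -((x1 - x0)**2 + (y1 - y0)**2 + r1*r1 - r0*r0)
    norm = math.hypot(a, b)
    if norm == 0:
        return []                      # concentric circles: no isolated hits
    C = c / norm
    if abs(C) > 1:
        return []                      # the circles don't intersect
    beta = math.atan2(a, b)            # sin(beta) = A, cos(beta) = B
    # the two solutions of sin(t + beta) = C, normalized into [0, 2*pi)
    ts = {(math.asin(C) - beta) % (2*math.pi),
          (math.pi - math.asin(C) - beta) % (2*math.pi)}
    return [(x1 + r1*math.cos(t), y1 + r1*math.sin(t)) for t in sorted(ts)]
```

In the tangent case \(|C| = 1\), both branches produce the same intersection point (up to floating point rounding).
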
Advanced Intersections
Just to illustrate how general this method is, let's take a more advanced example:
\[ \begin{aligned} C(x,y) &= x^2+y^2-1 = 0 \\
L(x(t),y(t)) &= \begin{cases} x &= 2t-1 \\ y&=8t^2-9t+1 \end{cases} \end{aligned}\]
The picture above shows the curves. The red curve is the parametric curve and the black curve is the implicit curve. As you can see, there are 4 real intersections; both curves are degree-2, so Bezout's theorem predicts exactly 4 intersections, which means all 4 roots should turn out to be real.
Inserting definitions of the parametric equations into the implicit form, we get:
\[
\begin{aligned}
f(x(t),y(t)) &= (2t-1)^2+(8t^2-9t+1)^2-1 \\
&= 64t^4-144t^3+101t^2-22t+1 \\
&= 0
\end{aligned}
\]
This is a quartic polynomial, so a more advanced numeric root finding method needs to be used, like bisection, regula falsi, or Newton's method. The roots of the polynomial are t = 0.06118, 0.28147, 0.90735, and 1.0. We do have all real roots, so Bezout's theorem is satisfied.
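As a quick illustration, here's a bare-bones Newton's method in Python applied to this quartic. Treat it as a sketch, not a production root finder: there's no bracketing, no convergence tolerance, and no guard against a zero derivative.

```python
def f(t):
    # the quartic from the example above
    return 64*t**4 - 144*t**3 + 101*t**2 - 22*t + 1

def fprime(t):
    return 256*t**3 - 432*t**2 + 202*t - 22

def newton(t, iterations=50):
    # basic Newton iteration; converges quickly near a simple root
    for _ in range(iterations):
        t = t - f(t) / fprime(t)
    return t

root = newton(0.9)     # converges to the root near 0.90735
exact = newton(1.1)    # converges to the root at t = 1
```

You can also check t = 1 by hand: 64 - 144 + 101 - 22 + 1 = 0.
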
Higher-Degree Rational Curve Intersection
If the parametric curves are rational, then this method needs to be slightly modified. Rational parametric curves are usually of the form:
\[x = \frac{a(t)}{c(t)}, \, y = \frac{b(t)}{c(t)}\]
Homogeneous coordinates serve us really well here. Since we have the mapping \((x,y) = (X/W,Y/W)\), we can define each homogeneous coordinate as \(X = a(t),\,Y = b(t),\,W=c(t)\). We also need to modify the implicit curve to handle homogeneous coordinates, but this isn't very hard. Let's illustrate with an example:
\[
\begin{aligned}
S(x,y) &= x^2+2x+y^2+1 = 0 \\
P(x(t),y(t)) &= \begin{cases}
x &= \frac{t^2+t}{t^2+1} \\
y &= \frac{2t}{t^2+1} \\
\end{cases}
\end{aligned}\]
Our parametric curve in homogeneous coordinates is \(X = t^2+t\), \(Y = 2t\), and \(W = t^2+1\). Our implicit curve can be changed to use homogeneous coordinates by substituting in the mapping \((x,y) = (X/W,Y/W)\):
\[
\begin{aligned}
S &= \left(\frac{X}{W}\right)^2+2\left(\frac{X}{W}\right)+\left(\frac{Y}{W}\right)^2+1 \\
&= X^2+2XW+Y^2+W^2 \\
\end{aligned}
\]
We can substitute our homogeneous coordinate equation into our modified implicit function:
\[
\begin{aligned}
S &= (t^2+t)^2 + 2(t^2+t)(t^2+1) + (2t)^2 + (t^2+1)^2 \\
&= 4t^4+4t^3+9t^2+2t+1 \\
&= 0 \\
\end{aligned}
\]
This quartic has no real roots: the homogenized implicit form is \((X+W)^2+Y^2\), a sum of squares, and the two squares never vanish simultaneously for real \(t\). (Indeed, the implicit curve \(x^2+2x+y^2+1 = (x+1)^2+y^2 = 0\) contains only the single real point \((-1,0)\), which the parametric curve never reaches.) So we get 4 complex roots. Both curves are degree-2, so we expect 4 intersections; counting the complex ones, we have exactly 4, and Bezout's theorem is satisfied.
Parametric-Parametric Curve Intersection
The way to solve intersections of 2 parametric curves is via implicitization of one of the curves. This can be a very fast method, but it suffers from numeric instability for high degree polynomials. We won't cover implicitization in this article, but there is a lot of literature on it out there.
Bezier Curve Intersection
A Bezier curve is just a degree-n Bernstein polynomial, which means it's just a regular polynomial of a different form. The above methods work well for Bezier curves, but they are more efficient and more numerically stable if modified slightly to take advantage of the Bernstein form. We won't cover this here, but there is literature out there on this as well.
Conclusion
These common methods of intersection testing can greatly aid any game programmer, whether purely for graphics or other game-specific logic. Hopefully you can make some use of these simple, yet powerful techniques.
References
A lot of the information here was taught to me by Dr. Thomas Sederberg (associate dean at Brigham Young University, inventor of the T-splines technology, and recipient of the 2006 ACM SIGGRAPH Computer Graphics Achievement Award). The rest came from my own study of relevant literature. The article image is from his self-published book, Computer-Aided Geometric Design.
Article Update Log
22 Apr 2017: Added a section for circle-circle intersection
23 Nov 2015: Added a motivating example to help those new to algebraic geometry
17 Nov 2015: Added a section for line-sphere intersection
03 Feb 2014: Added coding tips for applying the methods
15 Jan 2014: Initial release
Problem Solving Using Vectors
https://www.gamedev.net/articles/programming/math-and-physics/problem-solving-using-vectors-r3123/
A few common problems game developers run into are how to rotate a sprite to face a specific direction, how to find the angle between two headings, and how to rotate from one direction to another given a specific turn rate. It is quite common for beginners to try to tackle these problems using an angle to represent rotations and directions. There is a better way.
Overview of Vectors
Polar Coordinates
A vector has a direction and magnitude, or length. One way to represent a vector is using polar coordinates. In polar coordinates the vector's direction is measured by the angle it forms with the positive x-axis, so a vector pointing right is at 0 degrees and a vector pointing up is at 90 degrees. The vector also has magnitude, measured by the distance from the origin of the vector to its end. Rotating a vector in polar coordinates is trivial: you simply change the angle. This simple rotation operation makes the polar representation appealing, but don't take the bait. The polar representation makes other operations more difficult. As a simple example, adding two vectors cannot be done directly in polar coordinates; you would have to convert to rectangular first, then back to polar.
Rectangular Coordinates
A vector in rectangular coordinates is represented by an x and y value. This is the location of the head of the vector if you were to plot it on the Cartesian plane. The appeal of rectangular coordinates may not be immediately apparent, but as we go over some of the uses of vectors it becomes clear why they are preferred.
Conversion Between Representations
While it is better to do all of the mathematical operations in one representation when possible, it is sometimes necessary to convert between the two representations. The conversions are based on the properties of a right triangle.
To convert from polar to rectangular:
x = cos(angle) * magnitude
y = sin(angle) * magnitude
And to convert back to polar:
// to get the magnitude, use the Pythagorean theorem
magnitude = sqrt(x * x + y * y)
angle = atan2(y, x)
That is it. Just some basic trigonometry.
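For reference, here's the round trip in runnable Python (function names are mine; math.hypot handles the Pythagorean step):

```python
import math

def polar_to_rect(angle, magnitude):
    return (math.cos(angle) * magnitude, math.sin(angle) * magnitude)

def rect_to_polar(x, y):
    # math.hypot computes sqrt(x*x + y*y)
    return (math.atan2(y, x), math.hypot(x, y))

# a vector pointing up (90 degrees) with length 2:
x, y = polar_to_rect(math.pi / 2, 2.0)       # approximately (0.0, 2.0)
angle, magnitude = rect_to_polar(x, y)       # back to (pi/2, 2.0)
```
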
Unit Vectors
There is a class of vectors called unit vectors. Simply put, they have a length of one. This property is very useful when dealing with angles; more on that later. Any vector with a length greater than zero can be normalized or, in other words, be modified to have a length of one. In rectangular coordinates you simply divide both x and y by the magnitude.
Normalize a vector:
function normalize()
{
float magnitude = sqrt(x * x + y * y)
x = x / magnitude
y = y / magnitude
}
The above code has a problem. If the length of the vector is 0, the x and y values will become invalid. In many languages they become NaN or infinity. If you were to normalize a zero vector with the above code and then use the result, any value added to or multiplied by it would also become invalid. For example, if you normalized a player velocity that was zero, the velocity would become infinite, and when you then tried to update the position using that velocity, the position would become infinite as well. These invalid values can propagate through the code, causing weird things to happen. The fix is fairly easy though.
function normalize()
{
magnitude = sqrt(x * x + y * y)
if (magnitude > 0.0)
{
x = x / magnitude
y = y / magnitude
}
}
Now the code will leave a zero vector as is, so it won't go corrupting all of the floating point values in the game.
Applications of Vectors
Moving Towards a Target
Let's take an example where we have a turret and we want the turret to fire at the player. How do we calculate the heading of the bullet? Well, it is pretty straight-forward. The first step is finding a vector that represents the direction from the turret to the player.
// subtract the player position from the turret position
// to get the direction
direction = player.position - turret.position
// the above code is really just shorthand for
direction.x = player.position.x - turret.position.x
direction.y = player.position.y - turret.position.y
// when using vectors you should use a vector class that lets you write out
// operations like a - b, or a.sub(b) instead of having to subtract
// each component individually
Now we have a vector, in rectangular coordinates, that represents the direction from the turret to the player with the magnitude of the vector being distance between the two. If we used that vector as it is for the velocity then the bullet would always take one second regardless of how far the two objects were from each other. We want to be able to control the velocity of the bullet, however. If we were to use polar coordinates we could simply set the magnitude to be the velocity we want, but that would require us to convert to polar to make the change, then back to rectangular in order to update the bullet's position. We want to take the rectangular approach here.
// make the length of direction one
direction.normalize()
// set the velocity of the bullet
bullet.velocity = speed * direction
// remember, the above code is the same as
bullet.velocity.x = speed * direction.x
bullet.velocity.y = speed * direction.y
// since speed is just a number value, not a vector, it
// scales both components equally
And that's it. The unit vector direction could also be used to rotate a sprite to face in the same direction, but that will depend on how your graphics library handles the rotation of sprites. If the graphics library expects an angle, you will have to extract the angle of direction using atan2, as shown previously in converting from rectangular to polar.
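Putting the section together, here's a runnable Python sketch (names are mine) that computes a bullet velocity from a turret toward the player, including the zero-distance guard from the normalize discussion above:

```python
import math

def aim(turret_pos, player_pos, speed):
    # velocity of magnitude `speed` pointing from the turret to the player
    dx = player_pos[0] - turret_pos[0]
    dy = player_pos[1] - turret_pos[1]
    magnitude = math.hypot(dx, dy)
    if magnitude == 0:
        return (0.0, 0.0)             # player is on top of the turret: no direction
    return (speed * dx / magnitude, speed * dy / magnitude)

print(aim((0, 0), (3, 4), 10.0))      # → (6.0, 8.0)
```
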
Angle Between Two Directions
This is where using rectangular coordinates really starts to make sense. You may have come across this problem before. Let's imagine there are two objects moving in different directions and you want to find the angle between their movement. Let's first look at this problem using polar coordinates. This may seem like a simple problem: simply take the absolute value of one angle minus the other. For example, given the two angles 50 and 90, you subtract one from the other and are left with 40 degrees. But what if one angle is 310 degrees and the other is 20 degrees? Since degrees go up to 360 for a full circle, 360 degrees is the same as 0 degrees, and 20 degrees is the same as 380. If we applied the basic algorithm to 20 and 310 we would get 290 degrees. However, since 20 is the same direction as 380, we can substitute and, using 380 and 310, we are left with 70 degrees, the actual rotation we were after. There are algorithms that can handle this problem while staying in polar coordinates, but I am not going to show them here. We are interested in using rectangular coordinates.
To find the angle between vectors, we will use a useful little equation called the dot product and is defined as follows:
// dot product
dot(a, b) = a.magnitude * b.magnitude * cos(angleBetween(a,b))
// also equal to
dot(a, b) = a.x * b.x + a.y * b.y
So how can we use this equation to find the angle between a and b in rectangular coordinates? Simply set the two equations equal to each other and solve for the angle. The angle calculated here is the smallest angle between the two vectors, meaning it could be in either the clockwise or the counterclockwise direction, and it will always be between 0 and pi radians, or 0 to 180 degrees.
angleBetween(a,b) = acos((a.x * b.x + a.y * b.y) / (a.magnitude * b.magnitude))
This equation can be even simpler if we represent directions using unit vectors, since the magnitude of a unit vector is one. Keep in mind this only works if the two vectors are unit vectors.
angleBetweenUnitVectors(a,b) = acos(a.x * b.x + a.y * b.y)
// also the same as
angleBetweenUnitVectors(a,b) = acos(dot(a,b))
And there we have it, the angle between two unit vectors. This is a simple, elegant solution to the problem written as a single line of code.
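Here's that one-liner as runnable Python, plus one practical safeguard the pseudocode glosses over: clamping the dot product before acos, since floating point rounding can nudge it just past ±1 and make acos fail:

```python
import math

def angle_between_unit_vectors(a, b):
    # a and b must be unit vectors; result is in [0, pi]
    d = a[0]*b[0] + a[1]*b[1]
    # clamp: rounding can push the dot product just outside [-1, 1]
    return math.acos(max(-1.0, min(1.0, d)))

angle = angle_between_unit_vectors((1.0, 0.0), (0.0, 1.0))   # pi/2, i.e. 90 degrees
```
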
Rotate by Another Angle
Another common problem in games is changing an object's heading by a certain amount. For example, we may need to rotate the heading of an object by five degrees every frame. It isn't immediately apparent how to do this in rectangular coordinates. The answer is fairly simple: you multiply the two vectors together as if they were complex numbers. In fact, 2D vectors in essence are complex numbers, where x is the real part and y is the imaginary part.
(a + bi) * (c + di) = (a*c - b*d) + (b*c + a*d)i
// or using x and y
function rotateVector(Vector2 a, Vector2 b)
{
float x = a.x * b.x - a.y * b.y;
float y = a.x * b.y + a.y * b.x;
return new Vector2(x, y);
}
For 2D vectors, the order of a and b doesn't matter; switching them gives the same answer. However, this function also has the effect that the resulting vector's magnitude is a.magnitude * b.magnitude. So when rotating vectors where you want to preserve the magnitude, use a unit vector.
function unitVector(float angle)
{
// this returns a unit vector in the direction of angle
// it will always be of length 1 since, in this case
// magnitude = sqrt(cos(angle)^2 + sin(angle)^2) = 1
return new Vector2(cos(angle), sin(angle));
}
// rotate the velocity by 10 degress
velocity = rotateVector(velocity, unitVector(degToRads(10)))
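Here's the same idea as runnable Python; rotating (1, 0) by 90 degrees lands on (0, 1) up to floating point error:

```python
import math

def rotate_vector(ax, ay, bx, by):
    # complex-style multiply: rotates (ax, ay) by the angle of (bx, by)
    # and scales it by the magnitude of (bx, by)
    return (ax*bx - ay*by, ax*by + ay*bx)

def unit_vector(angle):
    # unit-length direction vector for the given angle
    return (math.cos(angle), math.sin(angle))

ux, uy = unit_vector(math.radians(90))
x, y = rotate_vector(1.0, 0.0, ux, uy)   # approximately (0.0, 1.0)
```

Because (ux, uy) is a unit vector, the magnitude of any rotated vector is preserved.
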
Rotate Towards Target
In this situation we want to take the current heading of the object and rotate to face a new heading. A great example is a turret that can only turn a certain amount of degrees per second. We want to make sure it always turns in the most direct way. For example, if the player is directly to the turret's left we don't want the turret to turn 270 degrees to the right, we want it to turn 90 to the left. To do this, we will need to add one more tool in our toolbox: Slerping.
Slerp is short for spherical linear interpolation. To understand how it works, first imagine two vectors both coming out of the origin, then draw a path from the end of one vector to the other. In reality, there are an infinite number of ways to get from point a to point b, but we are interested in a direct path that smoothly changes the angle of the vector. The path we want the point to travel along is a curved path between the ends of the vectors, forming a circle or a section of a circle. This is what slerping is, and it is implemented as follows:
// a and b should be unit vectors
// t is a value in the range [0, 1]
// when t = 0, slerp returns a
// when t = 1, slerp returns b
// any value between 0 and 1 will be
// a vector between a and b
function slerp(Vector2 vectorA, Vector2 vectorB, float t)
{
float cosAngle = dot(vectorA, vectorB);
float angle = acos(cosAngle);
return (vectorA * sin((1 - t) * angle) + vectorB * sin(t * angle)) / sin(angle);
}
There is a lot there to take in, and you might have to do a little more research on slerping to fully understand it. You will probably want to use an existing implementation, however: the simple implementation shown here has stability issues when vectorA and vectorB point in nearly the same direction, because floating point rounding errors make the sin(angle) denominator approach zero. Anyway, now we have our slerp, so let's use it to help us rotate towards a target.
// rotates rotateFrom towards rotateTo by maxDeltaTheta radians and returns
// the result. If the angle between them is less than maxDeltaTheta then it
// simply returns rotateTo
// rotateFrom and rotateTo should both be unit vectors
function rotateTowards(Vector2 rotateFrom, Vector2 rotateTo, float maxDeltaTheta)
{
float angleBetween = acos(dot(rotateFrom, rotateTo));
if (angleBetween <= maxDeltaTheta)
{
return rotateTo;
}
else
{
return slerp(rotateFrom, rotateTo, maxDeltaTheta / angleBetween);
}
}
// and an example of how it can be used
Vector2 playerOffset = player.position - turret.position
// the input to rotates towards needs to be a unit vector
playerOffset.normalize()
// turretDirection is a unit vector specifying where the turret is pointing
// delta time is a float holding the number of seconds between this frame and the last
// turretTurnRate is the turn rate of the turret in radians/second
turretDirection = rotateTowards(turretDirection, playerOffset, turretTurnRate * deltaTime)
You may have noticed that rotateTowards calculates the angle between the two vectors and that the same calculation is done again right away inside of slerp. It would probably be a good idea to inline the slerp code into rotateTowards so it doesn't have to do the redundant calculation, but I will keep it separate here for simplicity.
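For completeness, here's a runnable Python sketch of the pair. It's illustrative, not production-ready: the nearly parallel case is handled with a crude epsilon guard, and the antiparallel case (vectors 180 degrees apart) is still degenerate. Note that the interpolation parameter passed to slerp is the fraction maxDeltaTheta / angleBetween of the total angle:

```python
import math

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1]

def slerp(a, b, t):
    # spherical linear interpolation between unit vectors a and b;
    # falls back to b when the vectors are nearly parallel, since
    # sin(angle) -> 0 makes the formula unstable there
    angle = math.acos(max(-1.0, min(1.0, dot(a, b))))
    if angle < 1e-6:
        return b
    s = math.sin(angle)
    wa = math.sin((1 - t) * angle) / s
    wb = math.sin(t * angle) / s
    return (wa*a[0] + wb*b[0], wa*a[1] + wb*b[1])

def rotate_towards(frm, to, max_delta):
    # rotate unit vector frm towards unit vector to by at most max_delta radians
    angle = math.acos(max(-1.0, min(1.0, dot(frm, to))))
    if angle <= max_delta:
        return to
    # interpolate by the fraction of the total angle we're allowed to cover
    return slerp(frm, to, max_delta / angle)

halfway = rotate_towards((1.0, 0.0), (0.0, 1.0), math.pi / 4)   # 45 of the 90 degrees
```
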
Conclusion
As I have created games, I have found using vectors to be vastly more useful than trying to represent directions using angles. Vectors have always led me to more robust and elegant solutions that seem to exhibit correct behavior, even in the odd cases I didn't think to try. The more familiar you get with them, the easier it becomes to manipulate a game world, allowing you to complete seemingly hard challenges simply by building on the basic building blocks of vector operations. If you are serious about game development, then you should understand how vectors work; they are a powerful asset to have in your toolbox. This article explained how things work in 2D space. To make the step into 3D you will need to work with quaternions. They are a great way to represent 3D rotations and have many of the same properties as 2D unit vectors.
Article Update Log
20 Dec 2013: First publish of the article
23 Dec 2013: Small revisions to clarify some things
11 Jan 2014: Fixed some typos
Practical Guide to Bezier Curves
https://www.gamedev.net/articles/programming/math-and-physics/practical-guide-to-bezier-curves-r3166/
This is meant as an intermediate guide to Bezier curves. I'm assuming you've all read about Bezier curves and the Bernstein basis functions and have a general idea about how it all works. This is meant as the next level up and some ways to implement them in practical code. This isn't meant as a thorough exploration of the mathematics, but as a jump-off point. I'm sure others here know more than I do, but I'd like to introduce some things here that can be helpful to those who want to delve into this topic.
The Story So Far...
Bernstein Basis Polynomials
To quickly sum things up, Bezier (pronounced [be.zje]) curves use the Bernstein basis functions multiplied by control points to create a curve. A Bernstein basis function takes the form \( b_{i,n}(t) = \binom{n}{i}t^i(1-t)^{n-i} \), where i is the ith basis function and n is the degree of the Bernstein basis. The basis functions form a partition of unity, which means that for any parameter value between 0 and 1, \( \sum_{i=0}^n b_i(t) = 1 \).
A Bernstein polynomial is simply where the basis functions are multiplied by coefficients and summed: \( B_{i,n}(t) = \sum_{i=0}^n c_ib_{i,n}(t) = \sum_{i=0}^n c_i \binom{n}{i} t^i (1-t)^{n-i} \), where c is the coefficient for the ith term. This polynomial is a parametric curve. To clarify: because of the coefficients, the terms of a Bernstein polynomial generally no longer sum to 1.
Bezier Curves
You've probably seen this equation so many times you know it by heart: \[ C(t) = \sum_{i=0}^n \binom{n}{i} (1-t)^{n-i} t^i P_i \] N-dimensional Bezier curves can be expressed as N Bernstein polynomials where the coefficients of the Bernstein polynomials are the control points in a certain dimension. For example, a 2D Bezier curve can be expressed as 2 Bernstein polynomials, one for the x-direction and one for the y-direction:
\[ \begin{cases} x(t) &= \sum_{i=0}^n \binom{n}{i} (1-t)^{n-i} t^i x_i \\ y(t) &= \sum_{i=0}^n \binom{n}{i} (1-t)^{n-i} t^i y_i \\ \end{cases} \]
The control points are given as \( P_i = (x_i,y_i) \). There are n+1 control points for a degree-n Bezier curve. 3D and 4D curves can be expressed in this same fashion. Bezier curves can have any bounds on the t-parameter, but unless specified it can be assumed that it only varies from 0 to 1. Note: if the interval isn't between 0 and 1, the curve definition takes on a slightly modified form:
\[ C_{[t_0,t_1]}(t) = \sum_{i=0}^n \binom{n}{i} \left ( \frac{t_1-t}{t_1-t_0} \right )^{n-i} \left ( \frac{t-t_0}{t_1-t_0} \right )^{i} P_i \]
Bezier curves allow the user to intuitively control the shape of the curve. As the control points move in a certain direction, the curve moves in that direction too. We can gain some additional control of our curve by creating rational Bezier curves.
The Cool, New Stuff
Bezier Curve Evaluation
Before we get too far into rational Beziers, let's go over evaluation of Bezier curves. We could just do the simple evaluation of Bezier curves and get our answer. Let's say we're not so naive and used the recurrence relation for the binomial coefficient (if you're not familiar with the binomial coefficient, read the Wikipedia article): \( \binom{n}{i} = \frac{n-i+1}{i} \binom{n}{i-1} \). For 3D curves, the pseudocode might go like this:
function evaluate(double t)
{
x = 0, y = 0, z = 0;
nChooseI = 1;
for(int i = 0; i <= n; i++)
{
if(i > 0)
nChooseI *= (n - i + 1.0) / i;
p = pow(1 - t, n - i) * pow(t, i);
x += nChooseI * controlpoints[i].x * p;
y += nChooseI * controlpoints[i].y * p;
z += nChooseI * controlpoints[i].z * p;
}
return x,y,z;
}
Even with all of our tricks, we're still computing the power terms on each loop. So what can we do to get faster? It turns out there are 4 simple methods for quick evaluation:
- Use the de Casteljau method.
- Use a modified form of Horner's algorithm.
- If you want to compute the whole curve quickly, use a forward difference table.
- Construct a set of matrices and multiply for evaluation.
de Casteljau Method
The de Casteljau method is a technique that takes advantage of the nature of the Bernstein basis functions. To start, let's say the t-parameter of a Bezier curve is between 0 and 1. If that curve is degree-1, it represents a line between the 2 control points. If t=0, the evaluated point is P0; if t=1, the evaluated point is P1. If t=0.5, the point is halfway between the control points. The t-parameter tells us how far to go between P0 and P1. The Bezier curve takes the form \( P(t) =(1-t)P_0+tP_1 \). This is how linear interpolation is done.
Now let's say we had a degree-2 curve. The Bernstein basis functions would arise if we linearly interpolated between P0 and P1, between P1 and P2, then linearly interpolated between those 2 linearly interpolated points. The t-parameter would tell us how far to interpolate at each step. Mathematically, it would look something like this:
\[
\begin{align}
P(t)&=[(1-t)P_0+tP_1](1-t)+[(1-t)P_1+tP_2]t \\
P(t)&=(1-t)^2P_0+t(1-t)P_1+t(1-t)P_1+t^2P_2 \\
P(t)&=(1-t)^2P_0+2t(1-t)P_1+t^2P_2 \\
\end{align}
\]
The Wikipedia article shows how the curve is constructed in this manner. Since the degree for this is 2, 2 linear interpolations are done. The higher order the polynomial, the more linear interpolations are done. This is precisely how the de Casteljau method works. Mathematically, the de Casteljau method looks like this:
\[ P_i^j = (1-\tau)P_i^{j-1}+\tau P_{i+1}^{j-1}; j=1,\dots,n; i=0,\dots,n-j \]
The tau must be between 0 and 1. Incidentally, the de Casteljau method is useful for breaking up a Bezier curve into 2 Bezier curves. In the formula above, the points \( P_0^0,P_0^1,\dots,P_0^n \) are the control points for the Bezier curve on the interval \( (0, \tau) \) of the original curve, and the points \( P_0^n,P_1^{n-1},\dots,P_n^0 \) are the control points for the Bezier curve on the interval \( (\tau, 1) \) of the original curve.
The de Casteljau method is really nice because it's numerically stable, meaning that repeated use of it won't accumulate enough floating-point error to throw off your calculations.
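The repeated-interpolation scheme above translates directly into code. Here is a minimal Python sketch (the function name is mine, not from the article) operating on scalar control points; for 2D or 3D curves, apply it per coordinate:

```python
def de_casteljau(points, t):
    """Evaluate a Bezier curve at parameter t by repeated
    linear interpolation of the control points (de Casteljau)."""
    pts = list(points)          # work on a copy; degree n = len(points) - 1
    n = len(pts) - 1
    for j in range(1, n + 1):   # one round of lerps per degree
        for i in range(n - j + 1):
            pts[i] = (1 - t) * pts[i] + t * pts[i + 1]
    return pts[0]               # the last surviving point is P(t)
```

If you keep the intermediate `pts[0]` values from each round, you also get the control points for the subdivided left half of the curve, as described above.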
Modified Horner's Method
The following code is the modified version of Horner's algorithm for quick evaluation. It takes advantage of the recurrence relation for the binomial coefficient as well as implements the nested multiplication like the classic Horner's algorithm:
function eval_horner(t)
{
    u = 1 - t;
    bc = 1;
    tn = 1;
    tmp = controlpoints[0] * u;
    for(int i = 1; i < n; i++)
    {
        tn *= t;
        bc *= (n - i + 1) / i;  // binomial recurrence; use floating-point division
        tmp = (tmp + tn * bc * controlpoints[i]) * u;
    }
    return (tmp + tn * t * controlpoints[n]);
}
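For reference, here's the same algorithm as runnable Python (my own translation of the pseudocode, with scalar control points; note the binomial recurrence relies on floating-point division):

```python
def eval_horner(points, t):
    """Evaluate a Bezier curve via a Horner-style nested form,
    combined with the binomial-coefficient recurrence."""
    n = len(points) - 1
    u = 1.0 - t
    bc = 1.0                    # running binomial coefficient C(n, i)
    tn = 1.0                    # running power t^i
    tmp = points[0] * u
    for i in range(1, n):
        tn *= t
        bc *= (n - i + 1) / i
        tmp = (tmp + tn * bc * points[i]) * u
    return tmp + tn * t * points[n]
```

Each loop iteration costs a handful of multiplies and one add, with no `pow` calls at all.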
Forward Differencing
If you're evaluating a curve at evenly-spaced intervals, forward differencing is the fastest method, even faster than Horner's algorithm. For a degree-n polynomial, Horner's algorithm requires n multiplies and n adds. However, the forward differencing method only requires n additions, after a small initialization (which we could use Horner's algorithm for). There is a derivation for this method that I won't show here, but you can try it out yourself, evaluate with Horner's algorithm as well, and compare the results.
This method requires n+1 curve evaluations to initiate the forward differencing. The easiest explanation of the method is to demonstrate it. Let's say that we had a degree-4 polynomial and the first 5 evaluations were: f(i)=1, f(i+1)=3, f(i+2)=2, f(i+3)=5, and f(i+4)=4. We build a table by subtracting f(i+1)-f(i), putting that value in the next row, and so on, like the following:
\[
\begin{matrix}
t & t_i & t_{i+1} & t_{i+2} & t_{i+3} & t_{i+4} & t_{i+5} & t_{i+6} & t_{i+7} \\
f(t) & 1 & 3 & 2 & 5 & 4 & & & \\
\delta_1(t) & 2 & -1 & 3 & -1 & & & & \\
\delta_2(t) & -3 & 4 & -4 & & & & & \\
\delta_3(t) & 7 & -8 & & & & & & \\
\delta_4(t) & -15 & & & & & & & \\
\end{matrix}
\]
The method states that the number in the last row, -15 in this case, will stay constant for every column in that row. That means that if we put -15 in the next column, then add -15 to the number above it (-8), store that value in the next column on that row, and do this until we reach the value row, we can compute the value for f(i+5). Repeat for f(i+6), and all successive values for f(t). It would look something like this:
\[
\begin{matrix}
t & t_i & t_{i+1} & t_{i+2} & t_{i+3} & t_{i+4} & t_{i+5} & t_{i+6} & t_{i+7} \\
f(t) & 1 & 3 & 2 & 5 & 4 & -24 & -117 & -328 \\
\delta_1(t) & 2 & -1 & 3 & -1 & -28 & -93 & -211 & \\
\delta_2(t) & -3 & 4 & -4 & -27 & -65 & -118 & & \\
\delta_3(t) & 7 & -8 & -23 & -38 & -53 & & & \\
\delta_4(t) & -15 & -15 & -15 & -15 & & & & \\
\end{matrix}
\]
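The table-extension procedure above can be sketched in Python (the function name and interface are my own). It builds the difference table from the initial samples, then produces each new sample with additions only:

```python
def forward_difference_extend(values, degree, count):
    """Given degree+1 consecutive samples of a degree-n polynomial at
    evenly-spaced parameters, append `count` more samples using only adds."""
    # Build the difference table down to the constant bottom row.
    table = [list(values)]
    for _ in range(degree):
        prev = table[-1]
        table.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    out = list(values)
    for _ in range(count):
        # The bottom row stays constant; add upward through the table.
        table[degree].append(table[degree][-1])
        for row in range(degree - 1, -1, -1):
            table[row].append(table[row][-1] + table[row + 1][-1])
        out.append(table[0][-1])
    return out
```

Running it on the article's degree-4 example reproduces the filled-in table: the samples 1, 3, 2, 5, 4 continue with -24, -117, -328.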
This method is fast, but it's numerically unstable: over the course of the forward difference, errors creep in and throw off the computation. That matters in the CAD world, but for rendering, forward differencing is usually sufficient.
Matrix Evaluation
There is a matrix evaluation that can be quick as well, but that discussion is better suited to the discussion on Bezier surfaces. See the Bezier surface article here.
Rational Bezier Curves
A rational Bezier curve adds a weight to each control point. As the weight increases, the curve bends toward that control point. The formulation for a rational Bezier curve is given by:
\[
B(t) = \frac{\sum_{i=0}^n w_i \, P_i \, b_i(t)}{\sum_{i=0}^n w_i \, b_i(t)} = \left (
\frac{\sum_{i=0}^n w_i \, x_i \, b_i(t)}{\sum_{i=0}^n w_i \, b_i(t)},
\frac{\sum_{i=0}^n w_i \, y_i \, b_i(t)}{\sum_{i=0}^n w_i \, b_i(t)},
\frac{\sum_{i=0}^n w_i \, z_i \, b_i(t)}{\sum_{i=0}^n w_i \, b_i(t)}
\right )
\]
As you can see, if all the weights are 1, then the curve becomes a normal Bezier curve. Rational Bezier curves help us because:
- Rational Bezier curves can represent conic sections exactly.
- Perspective projections of Bezier curves are rational Bezier curves.
- The weights allow for additional control of the curve.
- Reparameterization of the curve is accomplished by changing the weights in a certain fashion (which I won't get into here).
Rational curves allow us to represent circular arcs (since a circle is a conic section). The image below shows what this would look like:
A degree-2 rational Bezier curve can represent an arc. The control points P0 and P2 must be on the circle, but the control point P1 is at the intersection of the extended tangent lines of the circle at P0 and P2. The weights of the control points P0 and P2 are 1, but the weight of control point P1 is cos(theta/2), where theta is the angle of the arc. A degree-3 (cubic) Bezier curve can also represent a circular arc. Like in the degree-2 case, the endpoints P0 and P3 must be on the circle, but P1 and P2 must lie on the extended tangent lines of the circle at P0 and P3, respectively. The distance that P1 and P2 must be from P0 and P3, respectively, is given by the equation \( e = \frac{2\sin{(\theta / 2)}}{1+2\cos{(\theta / 2)}} \).
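As a sanity check of the degree-2 arc construction, here is a hedged Python sketch (all names are mine): a 90-degree arc of the unit circle from (1,0) to (0,1), with P1 = (1,1) at the intersection of the tangents and middle weight cos(theta/2). Every evaluated point should be at distance 1 from the origin:

```python
import math

def rational_quadratic(p0, p1, p2, w1, t):
    """Evaluate a degree-2 rational Bezier curve in 2D with
    endpoint weights 1 and middle weight w1."""
    b0 = (1 - t) ** 2
    b1 = 2 * t * (1 - t)
    b2 = t ** 2
    w = b0 + w1 * b1 + b2                     # denominator (weight sum)
    x = (b0 * p0[0] + w1 * b1 * p1[0] + b2 * p2[0]) / w
    y = (b0 * p0[1] + w1 * b1 * p1[1] + b2 * p2[1]) / w
    return x, y

theta = math.pi / 2                           # 90-degree arc
pt = rational_quadratic((1, 0), (1, 1), (0, 1), math.cos(theta / 2), 0.5)
```

With a plain (all-weights-1) quadratic through the same control points, the midpoint would bulge off the circle; the weight cos(theta/2) pulls it back on.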
Rational Bezier Curves as 4D Bezier Curves
Yup, just like the title says, rational Bezier curves can be represented as 4D Bezier curves. I know some of you might be scratching your head and clicking to a new page, but let me explain. If we create a mapping from a 4D curve to our 3D rational curve, we can do normal Bezier curve evaluation using the algorithms above without modifying them. The mapping is simple: if in 3D we have (x,y,z) and weight w, then in 4D we can express it as (X,Y,Z,W) = (x*w, y*w, z*w, w). That means that the conversion from 4D back to 3D is as simple as division. This does create some odd equivalences: a point (x,y,z) in 3D maps to multiple points in 4D, such as (2x, 2y, 2z, 2), (3x, 3y, 3z, 3), and so on. For our purposes this is just fine, but it's something to note. This mapping works for rational curves because, as we stated above, when all the weights equal 1, the rational curve is identical to a normal Bezier curve. By elevating the problem to a higher dimension, we gain a lot of computational simplicity. (As an aside, a similar mapping can be made from 2D rational Bezier curves to 3D normal Bezier curves, and you can picture the 3D-to-2D mapping as the projection of the 3D curve onto the plane z=1.)
In 4D, that means a 3D rational Bezier curve takes the following form:
\[
\begin{cases}
X(t)&=\sum_{i=0}^n x_iw_ib_i(t)\\
Y(t)&=\sum_{i=0}^n y_iw_ib_i(t)\\
Z(t)&=\sum_{i=0}^n z_iw_ib_i(t)\\
W(t)&=\sum_{i=0}^n w_ib_i(t)\\
\end{cases}
\]
If you look at the top 3 equations, they're identical to the expanded numerator of the rational Bezier definition. If you notice, the bottom equation is identical to the denominator of the rational Bezier. This means that if we want to compute a 3D rational Bezier curve, we can simply elevate it to 4D space, do one more additional computation in our curve evaluation, then just do the perspective divide of the point back to 3D. In fact, we can elevate all our regular Bezier curves into 4D by making the weights 1, and then just do perspective divide to get the point back to 3D.
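As a sketch of this 4D trick in Python (names are mine; the inner loop is plain de Casteljau applied componentwise to the lifted points):

```python
def eval_rational_as_4d(points, weights, t):
    """Evaluate a 3D rational Bezier curve by lifting control points to
    homogeneous 4D (wx, wy, wz, w), running ordinary de Casteljau on the
    4D points, then doing the perspective divide back to 3D."""
    # Lift each control point into 4D.
    pts = [[w * x, w * y, w * z, w] for (x, y, z), w in zip(points, weights)]
    n = len(pts) - 1
    # Ordinary (non-rational) de Casteljau in 4D.
    for j in range(1, n + 1):
        for i in range(n - j + 1):
            pts[i] = [(1 - t) * a + t * b for a, b in zip(pts[i], pts[i + 1])]
    X, Y, Z, W = pts[0]
    # Perspective divide maps the 4D result back to the 3D rational curve.
    return X / W, Y / W, Z / W
```

When every weight is 1, W(t) is identically 1 and the divide is a no-op, so this reduces to ordinary Bezier evaluation, exactly as the text predicts.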
NOTE: Elevating rational Bezier curves makes evaluation and a few things easier, but it doesn't make all problems easier. For example, if you want the derivative of a rational Bezier curve, you can't just elevate it to 4D and then take the derivative and then project back. Derivatives of rational Bezier curves involve the quotient rule from calculus, something that a higher dimensionality doesn't simplify.
Conclusion
One of the primary difficulties with the Bezier formulation is that each control point influences the entire curve. Also, the number of control points is directly tied to the degree of the resulting Bernstein polynomials; thus, a Bezier curve with 6 control points yields a 5th-degree polynomial. Evaluating higher-degree polynomials takes extra computation time (even with Horner's algorithm), so we want to avoid high-degree polynomials if possible. These problems have been overcome by B-splines, which aren't much harder to understand once you get Bezier curves down. In the future, if there's enough response here, I'll add a basic introduction to B-splines and easy ways to generate them (not necessarily computationally efficient) and do them by hand too.
Article Updates
04 Aug 2013: Updated equations to use MathJax
05 Jun 2013: Fixed some typographical errors
28 May 2013: Initial release
References
A lot of the information here was taught to me by Dr. Thomas Sederberg (associate dean at Brigham Young University, inventor of the T-splines technology, and recipient of the 2006 ACM SIGGRAPH Computer Graphics Achievement Award). The rest came from my own study of relevant literature.