•      Sign In
• Create Account

Like
21Likes
Dislike

# Intersection Math & Algorithms - Learn to Derive

By Vilem Otte | Published Apr 17 2013 09:24 AM in Math and Physics
Peer Reviewed by (Toothpix, Josh Vega, Michael Tanczos)

algorithms point ray linear algebra vector plane normal aabb bounding box intersection sphere

One of the most important group of algorithms inside game engines are intersection algorithms. Even though there are plenty of sites showing code for intersection algorithms, there are only a few describing the math backround behind this, and algorithm derivation. To understand this article you need to know just a bit about linear algebra.

# Primitive description

Let's start with a mathematical description of primitives we're going to intersect. For this article I've used just ray (line), plane, sphere, axis-aligned bounding box and triangle ... these should be enough for a basic game engine.

## Ray

This is the simplest (and very useful) primitive you can think about. Let us define it with two 4-component numbers, point (further referred as origin) and direction. Mathematically the definition is:

$\boldsymbol{x}=\boldsymbol{o}+t\cdot\boldsymbol{d}$

Where:
• x represents any point on a line
• o represents some exact specified point on line (e.g. origin)
• d represents direction vector of line
• t is parameter within some specified range
The range specifies whether our line is infinite in both directions (+d and -d), whether it's just in a single direction, or whether it's finite.

Writing a ray object in some programming language is straight-forward of course, you should end up with something like this:

class Ray
{
public:
float4 origin;
float4 direction;

<<constructor>>

<<methods>>
};


Constructor and methods are straight-forward (and should be also inline for performance reasons).

## Plane

Another basic primitive. There are more ways to define a plane, but we will derive one that is well known and used. What is a plane? It's a point set that contains every such point in space, that when we create vector from that point to our known point on a plane, that vector has to be perpendicular to plane normal. And we know that perpendicular vectors are those whose dot product is equal to zero. So:

$\vec{\boldsymbol{n}}\cdot(\boldsymbol{x}-\boldsymbol{p})=0$

Where:
• n is unit vector in the plane normal direction
• x is every point on the plane
• p is our specified point on the plane
Expanding this yields:

$n_{x}\cdot(x-p_{x})+n_{y}\cdot(y-p_{y})+n_{z}\cdot(z-p_{z})=0$
$n_{x}\cdot x + n_{z}\cdot z + n_{z}\cdot z - (n_{x}\cdot p_{x} + n_{y}\cdot p_{y} + n_{z}\cdot p_{z})=0$

Substitute:

$-(n_{x}\cdot p_{x} + n_{y}\cdot p_{y} + n_{z}\cdot p_{z})=-\vec{\bold{n}}\cdot\bold{p}=d$
$n_{x}=a$
$n_{y}=b$
$n_{z}=c$

And we get something very familiar:

$a \cdot x + b \cdot y + c \cdot z + d = 0$

Which is a plane equation (so we proved that the definition we use is actually a plane). Definition in programming language could be like this:

class Plane
{
public:
float4 point;
float4 normal;

<<constructor>>

<<methods>>
};


A very simple definition. Let's jump ahead.

## Sphere

Yet another basic primitive, that is widely used. The definition is surface containing all points in distance from sphere's center less-or-equal to sphere's radius. We will define it using an analytic equation:

$(x - p_{x})^2 + (y - p_{y})^2 + (z - p_{z})^2 = r^2$

Where:
• x, y and z are coordinates of point in sphere
• p represents sphere's center
• r represents sphere's radius
A sphere class is again very trivial:

class Sphere
{
public:
float4 center;
float radius;

<<constructor>>

<<methods>>
};


So much for sphere; simple, yet very useful primitive.

## Axis-aligned Box

From here on I'll call this one just AABB (which stands for Axis-Aligned Bounding Box). Definition is very simple, as we work in Cartesian coordinate system, we have perpendicular axes X, Y and Z - AABB is defined as volume between minimum and maximum point in this system, the sides are made by planes orthogonal to planes formed by all 3 combinations of 2 base axis (XY, XZ and YZ). Mathematically we can define it as:

$\bold{min_{p}} \leq \bold{x} \leq \bold{max_{p}}$

Where:
• min_p stands for AABB minimum point
• max_p stands for AABB maximum point
• x represents points inside AABB
In programming language it can look like:

class AABB
{
public:
float4 min_p;
float4 max_p;

<<constructor>>

<<methods>>
};


Let's jump ahead to last object we're going to use.

## Triangle

The first and only primitive in this article that is going to be defined by more than 2 values, yet very important. Most of the game worlds are described by triangles, most of the simulation use objects described as triangles, and every more (or less) complex 3D object can be decomposed into triangles. They're awesome; as opposed to other boring N-gons, triangles are never concave!

Let's define triangle with 3 points, then any point on triangle can be described as:

$\bold{x} = \bold{A}\cdot \alpha + \bold{B}\cdot \beta + \bold{C}\cdot \gamma$

With conditions:

$\alpha+\beta+\gamma=1,\alpha\geq0,\beta\geq0,\gamma\geq0$

Where:
• x is every point in triangle
• A, B, C are points defining triangle
• a, ß, ? are so called barycentric coordinates (e.g. parameters in the equation above)
A triangle inside of a program could be defined as:

class Triangle
{
public:
float4 p[3];

<<constructor>>

<<methods>>
};


# Intersections

So now that we have defined some of the basic primitives, it's time to derive intersection equation and also intersection algorithms between the primitives. As I want to keep the article short (and I don't want to storm you with zillion of equations and exhausting equation solving), I will just show you 4 derivations - Ray-Plane (analytic derivation), Ray-Sphere (geometric derivation), Ray-AABB (tricky derivation) and Ray-Triangle (more complex analytic derivation). You should get an idea of how the intersection equation can be derived and in the end you should be able to derive the rest of them on your own.

## Ray-Plane intersection

I will derive Ray-Plane intersection analytically, as this is one of the most simple approaches to the derivation. When we want to derive some intersection algorithm analytically, we should write equations of those 2 objects right away, so (I intentionally used commonly known plane equation, instead of our definition):

$\boldsymbol{x}=\boldsymbol{o}+t\cdot\boldsymbol{d}$
$a \cdot x + b \cdot y + c \cdot z + d = 0$

Now we're looking for a point that lies on both, the ray and the plane - so we're looking for x. So we first write the first equation in scalar form:

$x=o_{x}+t\cdot d_{x}$
$y=o_{y}+t\cdot d_{y}$
$z=o_{z}+t\cdot d_{z}$

Let's substitute these 3 to plane equation:

$a \cdot (o_{x}+t\cdot d_{x}) + b \cdot (o_{y}+t\cdot d_{y}) + c \cdot (o_{z}+t\cdot d_{z}) + d = 0$
$a \cdot o_{x}+ a \cdot t \cdot d_{x} + b \cdot o_{y} + b \cdot t \cdot d_{y} + c \cdot o_{z} + c \cdot t \cdot d_{z} + d = 0$
$a \cdot t \cdot d_{x} + b \cdot t \cdot d_{y} + c \cdot t \cdot d_{z} = -(a \cdot o_{x} + b \cdot o_{y} + c \cdot o_{z} + d)$
$t \cdot (a \cdot d_{x} + b \cdot d_{y} + c \cdot d_{z}) = -(a \cdot o_{x} + b \cdot o_{y} + c \cdot o_{z} + d)$
$t = -{a \cdot o_{x} + b \cdot o_{y} + c \cdot o_{z} + d \over {a \cdot d_{x} + b \cdot d_{y} + c \cdot d_{z}}}$

It's time to reverse substitution we did when "deriving" plane equation, e.g.:

$-(n_{x}\cdot p_{x} + n_{y}\cdot p_{y} + n_{z}\cdot p_{z})=-\vec{\bold{n}}\cdot\bold{p}=d$
$n_{x}=a$
$n_{y}=b$
$n_{z}=c$

$t = -{{n_{x} \cdot o_{x} + n_{y} \cdot o_{y} + n_{z} \cdot o_{z} - (n_{x}\cdot p_{x} + n_{y}\cdot p_{y} + n_{z}\cdot p_{z})} \over {n_{x} \cdot d_{x} + n_{y} \cdot d_{y} + n_{z} \cdot d_{z}}}$

And transformed to vector form we get:

$t = {-{{\vec{\bold{n}}\cdot\bold{o}-\vec{\bold{n}}\cdot\bold{p}}\over{\vec{\bold{n}}\cdot\vec{\bold{d}}}}}$

But we can optimize this a little bit:

$t = {-{{\vec{\bold{n}}\cdot(\bold{o}-\bold{p})}\over{\vec{\bold{n}}\cdot\vec{\bold{d}}}}}$

Voila, that's what we have been looking for! So basically we need just subtraction, 2 dot products and 1 division. In program this can look like (assuming we want to intersect planes in front of ray origin in direction of ray):

bool IntrRayPlane(const Ray* r, const Plane* p, float* t)
{
float denom = dot(p->normal, r->direction);

// Test whether ray and plane aren't parallel
if(fabsf(dotND) < EPSILON)
return false;

*t = -(dot(p->normal, r->origin - p->point) / denom);
return ((*t) > 0.0f);
}


Almost as simple as the equation. Let's jump ahead, time for spheres!

## Ray-Sphere intersection

I could perform analytic derivation like in the Ray-Plane intersection derivation, although Ray-Sphere intersection is just one of the cases that can be easily derived using geometry. Let's start with an image:

We're now looking for 2 distances:

$t_{0} = |\bold{O} - \bold{P}|$
$t_{1} = |\bold{O} - \bold{P'}|$

Finding C-O is quite straightforward, it's just simple:

$\bold{L}=\bold{p}-\bold{o}$

Another thing we need to find is distance along the ray to point X, but as we know ray direction, it's just simple projection of L.

$t_x = \bold{L}\cdot{\vec{\bold{d}}}$

Right now we should look again at the image, we can see 3 important right angle triangle, one formed by {d, OX and OC}, another one {d, r, PX} and last one {d, r, P'X}. We need to know distance PX (respectively P'X, they're equal). So let's use Pythagorean theorem twice:

$d^2 = \bold{L}\cdot{\bold{L}}+t_{x}^2$
$t_{PX}=\sqrt{r^2-d^2}$

And our 2 hit distances are:

$t_{0} = t_{x} - t_{PX}$
$t_{1} = t_{x} + t_{PX}$

Time for the code (searching just nearest hit):

bool IntrRaySphere(const Ray* r, const Sphere* s, float* t)
{
float rad2 = s->radius * s->radius;
float4 L = s->center - r->origin;

float tPX = dot(L, r->direction);
if(tPX < 0.0)
return false;

float dsq = dot(L, L) - tPX * tPX;
if(dsq > rad2)
return false;

float thit = sqrt(rad2 - dsq);
*t = tPX - thit;
if(*t < 0.0f)
*t = tPX + thit;

return ((*t) < 0.0f);
}


Just a side note. Analytic derivation in this case is also very trivial, you will end up with quadratic equation, where determinant is negative in case of no hit, zero in case of single point hit (e.g. tangent line), and positive in case of secant.

Let's jump ahead to boxes!

## Ray-AABB intersection

Another important one, although a bit tricky. Let's start with an image again:

You can see, that box in 2D is formed by 2 slabs, horizontal and vertical. 3D is analogous to this, but of course it's formed by 3 slabs. For each slab we want to find minimum distance and maximum distance, e.g.:

$\bold{t_{min}} = {{(\bold{a} - \bold{o})} \over {\vec{\bold{d}}}}$
$\bold{t_{min}} = {{(\bold{b} - \bold{o})} \over {\vec{\bold{d}}}}$

Where a and b represents minimum (respectively maximum) point of our AABB. What we get is the minimum and maximum distance to each slab. Now we have to correctly order the distances along the ray direction, this is quite simple:

$\bold{t_{near}} = min(\bold{t_{min}}, \bold{t_{max}})$
$\bold{t_{far}} = max(\bold{t_{min}}, \bold{t_{max}})$

Where min/max is defined as:

$min(\bold{a}, \bold{b}) = \lbrace min(a_{x}, b_{x}), min(a_{y}, b_{y}), ... , min(a_{n}, b_{n}) \rbrace$
$max(\bold{a}, \bold{b}) = \lbrace max(a_{x}, b_{x}), max(a_{y}, b_{y}), ... , max(a_{n}, b_{n}) \rbrace$

Entry/exit point is then defined as maximum of near distances (see the image), thus:

$enter = hmax(\bold{t_{min}})$
$enter = hmin(\bold{t_{max}})$

Where hmax/hmin function is horizontal minimum/maximum, defined as:

$hmin(\bold{a}) = min(a_{x}, a_{y}, ... , a_{n})$
$hmax(\bold{a}) = max(a_{x}, a_{y}, ... , a_{n})$

Of course a hit only occurs when the exit point is larger than zero, and the exit point is further than the entry point (if it isn't, we missed the AABB). Time to code:

bool IntrRayAABB(const Ray* r, const AABB* b, float* enter, float* exit)
{
float4 tmin = (b->minimum - r->origin) / r->direction;
float4 tmax = (b->maximum - r->origin) / r->direction;

float4 tnear = f4min(tmin, tmax);
float4 tfar = f4min(tmin, tmax);

*enter = max(max(tnear.x, 0.0f), max(tnear.y, tnear.z));
*exit = min(tfar.x, min(tfar.y, tfar.z));

return (*exit > 0.0f && *enter < *exit);
}


It can be a little more optimized, although I'll leave that to reader as an exercise.

## Ray-Triangle intersection

I'd like to say, there is like a dozen of ways to intersect a ray with a triangle. Lots of them don't even use a triangle defined by 3 points, and re-define it in better suited way. I'll show you how to derive plain good old Barycentric test with one huge advantage, you don't have to precompute a triangle in any way, just use the standard definition with 3 points.

Let's start with equations we already have:

$\boldsymbol{x}=\boldsymbol{o}+t\cdot\boldsymbol{d}$
$\bold{x} = \bold{A}\cdot \alpha + \bold{B}\cdot \beta + \bold{C}\cdot \gamma$

The triangle equation actually is equal to this:

$\bold{x} = \bold{A} \cdot (1 - \beta - \gamma) + \bold{B} \cdot \beta + \bold{C} \cdot \gamma$

And let's arrange it like this:

$\bold{x} = \bold{A} + (\bold{B} - \bold{A}) \cdot \beta + (\bold{C} - \bold{A}) \cdot \gamma$

Substituting line equation for x yields:

$\bold{o} + t \cdot \vec{\bold{d}} = \bold{A} + (\bold{B} - \bold{A}) \cdot \beta + (\bold{C} - \bold{A}) \cdot \gamma$
$-t \cdot \vec{\bold{d}} + \beta \cdot (\bold{B} - \bold{A}) + \gamma \cdot (\bold{C} - \bold{A}) = \bold{o} - \bold{A}$

Those are 3 equations with 3 unknown variables. Instead of thinking about some more clever way, let's solve it in brute force manner with Cramer's rule. First let's re-arrange it to better form:

$\begin{pmatrix}-{\vec{\bold{d}}&(\bold{B}-\bold{A})&(\bold{C}-\bold{A})\end{pmatrix} \begin{pmatrix}t\\\beta\\\gamma\end{pmatrix}=\bold{o}-\bold{A}$

For clarity let's substitute e1 for B-A, e2 for C-A and p for o-A.

$\begin{pmatrix}-{\vec{\bold{d}}&\vec{\bold{e1}}&\vec{\bold{e2}}\end{pmatrix} \begin{pmatrix}t\\\beta\\\gamma\end{pmatrix}=\vec{\bold{p}}$

Now it's time to apply Cramer's rule, e.g.:

$\begin{pmatrix}t\\\beta\\\gamma\end{pmatrix}={1\over{\begin{vmatrix}-{\vec{\bold{d}}&\vec{\bold{e1}}&\vec{\bold{e2}}\end{vmatrix}}}\cdot\begin{pmatrix}\begin{vmatrix}\vec{\bold{p}}&\vec{\bold{e1}}&\vec{\bold{e2}}\end{vmatrix}\\\begin{vmatrix}-{\vec{\bold{d}}}&\vec{\bold{p}}&\vec{\bold{e2}}\end{vmatrix}\\\begin{vmatrix}-{\vec{\bold{d}}}&\vec{\bold{e1}}&\vec{\bold{p}}\end{vmatrix}\end{pmatrix}$

But matrix determinant can be written as scalar triple product, so:

$\begin{pmatrix}t\\\beta\\\gamma\end{pmatrix}={1\over{(\vec{\bold{d}}\times\vec{\bold{e2}})\cdot\vec{\bold{e1}}}}\cdot\begin{pmatrix}(\vec{\bold{p}}\times\vec{\bold{e1}})\cdot\vec{\bold{e2}}\\(\vec{\bold{d}}\times\vec{\bold{e2}})\cdot\vec{\bold{p}}\\(\vec{\bold{p}}\times\vec{\bold{e1}})\cdot\vec{\bold{d}}\end{pmatrix}$

I know this might look scary right now, but it really is simple. For those that understand code easier than math I also add code:

bool IntrRayTriangle( const Ray* r, const Triangle* t, float* thit, float* u, float* v)
{
float4 e1 = t->v[1] - t->v[0];
float4 e2 = t->v[2] - t->v[0];

float4 pvec = cross(r->direction, e2);

float det = dot(e1, pvec);
if(det > -EPSILON && det < EPSILON)
return false;

float inv_det = 1.0f / det;

float4 tvec = r->origin - t->v[0];
*u = dot(tvec, pvec) * inv_det;
if(*u < 0.0f || *u > 1.0f)
return false;

float4 qvec = cross(tvec, e1);
*v = dot(r->direction, qvec) * inv_det;
if(*v < 0.0f || *u + *v > 1.0f)
return false;

*thit = dot(e2, qvec) * inv_det;

return (*thit > 0.0f);
}


# Conclusion

Intersection algorithms, as I've said, are core algorithms for practically every game engine! And not just game engines, ray-tracers, physics simulators, collision detection libraries, even some AI simulations use them, etc. If you made it here, then I have to say thanks for reading my article, and I hope you've learned something new.

9 Apr 2013: Initial release

## License

GDOL (Gamedev.net Open License)

Comments

I did some grammar clean up but didn't mess around too much with the various definitions. Anyone with a better understanding of the terms can help the author clean those up if they feel it necessary.

Is float4 defined as a record of two floats? If it is, it would be good if you explained it, even if it is obvious.

It is not clear in the article why 4-dimensional vectors are used to represents 3D points and vectors. Even if it is quite common in the game and computer graphics industry, it is not very common in other fields. Since this article is about computing intersections between very simple mathematical objects, I wouldn't assume the reader is familiar with that convention.

Moreover, I would fix the notation to make it more consistent. For example, points are represented using uppercase letters in some equations and using lowercase letters in others. Vectors do sometimes have an arrow above the letter and sometimes not...

Ad float4 - using float4 to represent 3D points and vectors is quite common, there are 2 main reasons why I've chosen float4 representation of 3D points and vectors (and why you should always prefer it).

1st - performance (computation) reasons, I bet most of you know that there is SSE in CPUs today (since Pentium III - thats 1999, 14 years), using float4 (even defined as only struct of 4 floats) with few compiler flags (-mfpmath=see for gcc for example) will vectorize the code, resulting in better performance.

2nd - basically this code can also be used on GPU (and it's quite common to do ray tracing on GPU these days). GPU is huge SIMD machine and works mostly with 4-byte, 16-byte or 64-byte primitives (e.g. float, float4, float16). Now what if you want to transfer a ton of rays defined with float3 origin & direction, you either do it through new struct containing float[3]'s and experience a lot of cache misses (e.g. low performance), or use float4 in the first place. You could convert it before sending too, but that can waste even more time than cache misses on GPU.

Anyway you're right that this should be added into article.

Ad points, vectors notation - it's common in central Europe countries to represent points as bold symbols (either uppercase or lowercase), vectors as bold symbol with an arrow. I though don't know what exactly are other conventions (but I know they differ (even ISO standard for this allows for several different ways), and some differ a lot) - but I know this might be confusing.

F.e. according to ISO 80000 you can express point with any of these:

$latex.php?latex=\bold{x}~~~\bold$

And you can express vector with any of these:

$latex.php?latex=\vec{v}~~~\bold%$

So in this case I don't get idea, why it was *standardized* (because this standard actually doesn't standardize anything).

Anyway, if there is some reference for notations here on gamedev (or even another article just using some math notations), please post a link there and I'll update the notations to match.

Ad grammar - Thanks for clean up. I'm still working on my English.

I know why you have decided to use 4D vectors to represents both points and vectors. I was simply suggesting to include this motivation in the article.

You can represents points and vectors in several different ways. It isn't very important which one you decide to use. But you should be consistent. If you decide that a point is represented using an uppercase bold letter, all points should use this notation.

Then may I make a suggestion?

Since this article is about algorithms, why not drop all the performance related aspects in your code (e.g. float4), and the language specific constructs (e.g. class, public), and present the algorithms in a more generic language (some sort of pseudo code)?

Again, it is just a suggestion. But I think the article can be understood by more people if it is more generic.

I think your plane representation can be made simpler by directly using the four dimensional vector n = (a, b, c, d) (it is actually a point in a 3-dimensional projective space). It is more compact and less ambiguous (it is unique up to scaling). Moreover, if you represent your 3D points using 4D vectors in the form (x, y, z, 1) and 3D vectors in the form (x, y, z, 0) as it is common in affine (and projective) geometry, the plane equation simply becomes dot(n, P) = 0 where P = (x, y, z, 1). dot is clearly the 4D dot product.

By substituting the ray equation in this new equation you then obtain

dot(n, O + t*d) = dot(n, O) + t*dot(n, d) = 0

and finally

t = - dot(n, O)/dot(n, d).

BTW, in AABB - Ray algorithm.. Should't it be:

float4 tnear = f4min(tmin, tmax);
float4 tfar = f4max(tmin, tmax);

??

Note: Please offer only positive, constructive comments - we are looking to promote a positive atmosphere where collaboration is valued above all else.

PARTNERS