Annoying things about Fauerby's and Nettle's sphere-mesh collision detection guides

Started by
3 comments, last by Defend 8 years, 2 months ago

This is a discussion of the method for colliding a moving sphere against a mesh of triangles.

Here's what Paul Nettle wrote in the year 2000: http://pisa.ucsd.edu/cse125/2003/cse190g2/Collision.pdf

And here's what Kasper Fauerby wrote in 2003: http://www.peroxide.dk/papers/collision/collision.pdf

They've both been annoying me silly because they both have mistakes, and I haven't found anything at all that talks about solutions even though Fauerby's method in particular is referenced quite often. So I find myself looking at a 12-year old document with definite mistakes, yet seemingly no resistance from online communities.

In the case of Fauerby's work, I had coincidentally implemented the same mistake before knowing his code even existed. Once I saw his code, I saw the same logical error. So my question then became, if Fauerby has done the same thing I have, shouldn't his collision detection also fail? I threw his code into Unity and, well, it didn't fail. But the error was definitely there in the code. So what's the deal?

The logical error is in the calculation of the padding distance: the distance that a sphere is supposed to stop short of hitting a tri (in order to avoid floating point errors and wotnot). This distance means that the sphere should never actually touch any tri, but always stop a tiny bit before it.

The problem is that Fauerby applies this distance along the direction of the sphere's velocity, instead of along a direction that separates sphere from triangle. In other words, Fauerby only stops the sphere short of the point it would collide with, not the triangle it would collide with.

To demonstrate, make a point right now on your desk. Now, keeping your finger 5cm away from that point, how close can you get to the actual desk? Infinitely close. Just put your finger on the desk somewhere that's 5cm away from the point you drew. This satisfies Fauerby's padding operation but as you can see, your finger is already colliding with the desk.

The mistake is on page 46 in Fauerby's code linked above:


VECTOR V = vel; 
V.SetLength(collisionPackage->nearestDistance - veryCloseDistance);
newBasePoint = collisionPackage->basePoint + V;

The padding "veryCloseDistance" is being applied in the direction of the sphere's motion "vel". This separates the sphere from the point of collision, not from the triangle of collision. It comes into effect practically 100% of the time the sphere moves along a surface.

So why did his collision method still work?

Fauerby's code actually has another error, and the second one really does happen to compensate for the first. As soon as the sphere gets "too close" to a collision point (which should mean too close to a triangle, but it doesn't), his calculation for the plane it's meant to slide along goes wrong. By accident, it gains a slight upwards incline (assuming the plane is the flat ground), pushing the sphere up and out of the "too close" vicinity. This goes back and forth, every 2nd frame, and with no upwards acceleration being applied by the user, and with gravity in effect, close inspection reveals the sphere to actually be jumping up and down as it slides along a perfectly flat plane.

So if you grab Fauerby's code and copy pasta it into your IDE, things will work well enough for your eye to be satisfied. But I don't like it.

Here's a screen of the overall situation. Each cyan or magenta line is a new frame as the sphere moves to the right:

[attachment=30200:whole.png]

Here's the action at the base, close up:

[attachment=30198:bottom.png]

If you look closely, you can see that every cyan frame is actually lifted off the horizontal that the magenta frames are on. This is because the magenta frames were found to be "too close", and as a result, their sliding plane normals were incorrectly calculated, pushing the sphere upwards a little for the next frame.

Look even closer and you'll see white lines too. Except, only near the magenta lines. Those white lines are actually pure "up" lines, and they're really behind every coloured line there; it's just that the cyan lines hide them perfectly. The magenta lines do not hide the white lines perfectly, because the magenta lines are not going perfectly upwards.

If we look a little higher, that is made more clear:

[attachment=30199:Top.png]

So if you haven't guessed it already, the cyan and magenta lines are both showing the sliding plane normal that was calculated for that frame. The plane that the sphere is sliding along is perfectly flat (hardcoded, not modelled), so all those sliding plane normals should be perfectly upwards. But those magenta mistakes are actually the only things that keep Fauerby's sphere from slipping through triangles.

I'll come back to the problems with Nettle's code later. In that case, I haven't implemented his method and I don't plan to, because I believe I can already see a real mistake in it (as well as the same 2 mistakes Fauerby used).

Advertisement

I don't think these are mistakes. The sliding collision response done with floating point is supposed to move the sphere away from the sliding plane by a small value. Otherwise, the floating point errors would cause the sphere to pass through the plane. The veryCloseDistance value also might not be a mistake. I'm guessing that the veryCloseDistance value is the distance from the plane projected onto the velocity vector. Maybe this value was already available earlier during the algorithm, and it is being used instead of deriving a new value along the normal, which would increase the floating point error effect. The rest of the algorithm probably also does distance calculations based on the velocity instead of the plane normal.

It might be also that the veryCloseDistance is velocity-based in order to stop the sphere from colliding gradually, based on the angle of the velocity vector to the plane. This is because floating point errors cause more trouble at sharper angles. In any case, the velocity veryCloseDistance will never be smaller than the normal-based one, so this is not really going to cause the algorithm to fail. It will just spread the actual collision detection across the next frames, until the veryCloseDistance value is the same as the normal-based value. And from what you said, this is exactly how the algorithm is supposed to work, right?

I haven't read the document though - just guessing.

When I write about Paul Nettle's guide, that will be along the lines of "I think this is wrong, do you guys agree?"

But that's not what I was doing when writing about Fauerby's guide. Your post is kind of taking incorrect guesses at things my post already explained. That's ok, I admit mine is a long post and it's hard to fully explain problems with a 48-page guide. But I was stating, not guessing, when I said:

The problem is that Fauerby applies this distance along the direction of the sphere's velocity, instead of along a direction that separates sphere from triangle.

The veryCloseDistance value is not a projection, it's just a constant that always gets applied along the velocity vector only. There is no 'normal-based' version of it. That was one of the first fixes I thought of, but if it was a projection I can show how this would lead to another error of colliding well before reaching a triangle.

Also, colliding gradually is exactly what the first error (the veryCloseDistance ... finger-infinitely-close-to-your-desk error) causes if the second error (the bogus sliding plane normal) isn't happening also. I say that from having toyed with different veryCloseDistances to try to fend off the flaw. It literally makes things soggy as you move around.

Thanks for taking the interest though. smile.png I wanted to email the authors but things are so old it'd feel like emailing Patrick Stewart about Star Trek. So I vented here.

I'm working on my own version of this collision detection that doesn't use padding at all, but instead is insensitive to embedded collisions. It avoids the two mistakes in Fauerby's guide. Plus, Fauerby's guide also always runs vertex sweep tests before edge sweep tests, instead of recognising that edge sweep tests will often make vertex sweep tests redundant (as the whole process is all about delaying unnecessary tests as much as possible).

Just felt like adding a reply to say something else I just remembered.

Fauerby's approach also doesn't use the sphere's velocity to check for collisions, but rather, its translation.

To put it another way, the function doesn't ask "How fast for how long?", but simply, "How far?". The TL;DR of this is that it means you can't keep track of inertia.

It's not obvious at first, because he uses a variable named "vel" in the collideAndSlide() function, and a variable named "velocity" in the CollisionPacket object. The checkTriangle() function even does some calculations with time intervals, so it looks like it really is using velocities. With all the t0 and t1 floating around his code, I was confused for quite a while. But ultimately you can see that none of his functions, nor the CollisionPacket structure, receive a time value to work with, even though the sphere's final position is being determined from the velocity variables. By the time any of Fauerby's code begins, the sphere's velocity vector has already been converted to a distance vector. I don't think that's made very clear to the reader.

The calculations in the checkTriangle() function are actually using a parameterised equation for the sphere's position. Page 11 has this:


C(t) = basePoint + t ? velocity, t ? [0, 1]

Which is the same as saying:


end position = start position + t * max travel distance

When t == 0, end position == start position.

When t == 1, end position = start position + max travel distance.

The variable t isn't time, it's just the equation's parameter that must fall between 0 and 1.

Or you can look at page 46, where a collision-free frame returns:


// If no collision we just move along the velocity 
if (collisionPackage->foundCollision == false) 
{ 
    return pos + vel; 
}

As can be seen, "vel" is simply added straight to "pos".

What's the problem though? Asking "How fast for how long" is pretty much the same thing as "How far". And it seems simpler since it uses fewer variables.

But imagine the final collision in a frame...

Say the sphere only has a small distance left to travel; maybe just 5% of the total distance it moved that frame.

The sphere hits a wall at some angle, adjusts its end position to be some distance along that wall, and then never collides again (in that frame). Call this Case A.

Now imagine the same collision, with the same velocity, but happening half a frame earlier. So the sphere has somewhere around 50% of its total distance (for that frame) left to travel still. As before, it hits the wall at the same angle, adjusts its end position to be some distance along that wall, and never collides again (in that frame). Call this case B.

Obviously, the end position for Case B is going to be much further along the wall than the end position for Case A. That makes sense; Case B was demonstrating an earlier hit so of course it moved further.

But (ignoring friction) Case A and Case B both still finish with the same velocity. Fauerby's code has no way of knowing this. If the final deflected distance moved in a frame is very small, you can't tell if that means the final velocity was very small, or if it means the final period of movement was very small. You could muck around with checking those deflection angles, or comparing final distances to initial distances, or some other set of calculations, but these would all be a much bigger mess than just using velocity in the first place.

Or you could make it the caller's job instead, comparing the direction of motion at the beginning of a frame, to the direction of motion at the end of a frame (using the final sliding plane normal). But this wouldn't know the difference between a gentle, many-collisions, 90° turn (preserving much velocity), and a sharp, one-collision, 90° turn (preserving absolutely no velocity). Two-hit or three-hit collisions in a single frame aren't too uncommon, and this would stuff that up.

So that's why Fauerby's guide also doesn't allow the programmer to keep track of the sphere's inertia. It's not too difficult to fix (if you were to use his code), but you would have to add a time parameter to the CollisionPacket object at the very least, and would also have to be a bit careful when switching to the parameterised equations I believe. (I don't use them at all in my own code.)

Something else annoying!

http://arxiv.org/ftp/arxiv/papers/1211/1211.0059.pdf

It's a paper that declares its intent to look at areas of Fauerby's code that aren't robust. It's presented formally but I get the impression it was written for academic assessment.

One problem is on page 4, where it says:

"... once the sphere hits the first triangle, its motion should be restricted to the sliding plane, losing a degree of freedom."

Jeff Linahan goes on from there to state that when the sphere meets its 2nd plane during a frame, it loses a 2nd degree of freedom, only allowed to travel along the crease that the 2 planes form. Then if it hits a 3rd plane, it loses 3 degrees of freedom and therefore, all motion entirely. Linahan uses that chain of reasoning to limit the number of recursions to 3, instead of Fauerby's 5 (which is already just an arbitrary safe number), and to then claim that his changes therefore result in better performance.

And it's just bonkers.

It's not the performance that I'm concerned with either; it's that Linahan is up and redefining the function's job then working incorrectly from his redefinition. Simply, Linahan believes the quote above. He believes the sphere's motion has to be constrained to the plane of any triangle it touches. His reasoning why is given in this image:

[attachment=30419:restrict.png]

The sphere's velocity is pushing it downwards, but as a result of the angled plane, it is sliding along the purple line. As soon as it reaches the lower plane, the flat one, it "should" stop completely.

This does make some sense. It simply says the sphere should only ever move in response to the original velocity direction (the green line), and never the redirected velocity it changes to along its journey. So in the pic above, as soon as it hits that flat plane, it obviously has no sideways velocity, so should stop. Fair enough.

But this is as far as Linahan looks. From that example alone he declares that the sphere's motion is restricted to the first plane it was moving along. Why? No reason! Just one example and we're good to go! So what if that flat plane had been replaced by an angled plane of only slightly weaker incline than the first? Of course the sphere shouldn't stick to the first plane. It's just silly. And from there, the whole 3 degrees of freedom thing is simply wrong. Sure, if the sphere's motion *was* restricted to the planes it touches, it would fit. But it isn't.

What bothers me more with Linahan's paper is that despite its claim to focus on robustness, it misses the 2 bona-fide mistakes in Fauerby's method. It misses the problem of the sphere padding itself away from a collision *point*, instead of the collision *triangle*, plus despite all Linahan's focussing on the sliding plane calculations, it misses the fact that Fauerby quite simply got them wrong any time the sphere is too close to the collision point.

Linahan also misses the most obvious not-a-robust-use-of-floats example in the whole thing. On page 38 of Fauerby's method, we have this:


if (normalDotVelocity == 0.0f) { ...

This is a check to see if the sphere is pushing into the plane at all. If the Dot product of the velocity and the plane's normal is zero, the sphere is moving parallel to the plane, so no collision is possible. And anyone can see the problem here, especially in the context of "numerical robustness". Float comparison. Again it's just silly; silly that this is missed. If the sphere travels parallel to the plane, in real life real mathematics that Dot product will always be zero. In the ugly world of float comparisons, that Dot product will never be exactly zero. Always a tiny bit over or a bit less, positive or negative. It's the last thing in all the code you'd want to rely on (I've tested this, it definitely is borked), so why does Fauerby's get away with it? Because whenever the comparison fails, it just checks for a collision as always anyway. And as always, if a collision is found, it accidentally slips a little closer to the triangle, and as always, if it gets too close to the tri, the sliding plane accidentally shunts the sphere back up and away from it.

In other words, because it literally doesn't matter.

Anyway, back to Linahan's paper. Linahan's own conclusion says it all really.

We implemented the improved algorithm from Table II, and were not able to observe a single instance of the sphere jittering or penetrating the triangle mesh. Whereas a careful implementation of this improved algorithm is very effective at maintaining the invariant that the sphere must never intersect the geometry, there are a few unsolved problems. We noticed that the sphere snags and often gets caught on the edges of triangles, which can be described as the opposite of the jitteriness problem. Tweaking the tolerance value can mitigate this problem; however, a deeper understanding of the cause seems necessary here.

They solved something that wasn't broken, missed the things that were, and wrapped it all up with "Hmm, gets stuck on edges often, doesn't it?"

So yeah. For all its formality and initial appearance of focus on robustness, Linahan's paper appears to be a rushed job.

This topic is closed to new replies.

Advertisement