Why is collision cast expensive?

Started by
32 comments, last by EWClay 11 years, 2 months ago
Why is collision cast expensive and why not do collision cast
on every frame? Why does this bring the ps3 to its knees? Does
this apply to pc games too? When they say expensive, are they
saying the cpu uses a lot of memory?
Collision Cast is bought up at 9:38 of the video
20 Guass? This is mentioned at 9:48
Collision Cast: Tracing a line between points to see
if anything collides with the line
Can I assume other collision techniques like bounding box collisions are
just as expensive?
Advertisement

It's not the cost of individual collision checks you have to worry about - if you are performing just one check, ray-vs-sphere is only a little more expensive than box-vs-box.

What you need to do worry about is checking against every other object. If you have N objects in your scene, it requires N-1 ray-vs-sphere checks for each ray you cast, and at you will need to cast multiple rays for each player and AI character in the level...

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

If you want a normal vector like the one he describes, then you need to know the polygon that the player is "standing" on. This requires a much finer calculation than a bounding box, as you need the exact polygon to get that normal vector. How expensive it is depends on the complexity of the world, but when compared to the super-cheap alternative of subtraction of two vectors, it's extremely expensive.

It's not that you shouldn't do collision casts per frame, as you do these all the time for things like shooting projectiles and such. It's just that you only have so much processing power to spare if you want your game to run smoothly, so you'd rather save calculations for other things. One of the guys mentions that the reason why there are only so many monsters in the game is because more would be too much for the hardware.

Also, keep in mind that this hardware is the PS1 (I believe) so we're talking about much worse hardware than today.

Just a little note - the game above was originally released on PS2 (on which the developers probably play the game), but eventually got it's release on PS3 in HD as well.

ok, I personally call this operation a 'traceline', but either way.


the main cost has to do with finding everything that might be located along the line between the two points (likewise for the box case), which includes both the world geometry and other objects.

dunno about their case, but my engine dynamically builds a BSP-like tree for sorting all this out (mostly by eliminating nearly everything that can be determined not to collide).


the trick here is figuring out how to quickly rebuild the BSP in real-time, which in my case mostly consists of a naive algorithm for finding the "center of mass" (average of the origins) and relative mass distribution relative to the origin. a naive solution is simply to take the sum of the square of the distances from the center, but a slightly more advanced algorithm instead produces a 3x3 matrix, then one can look for the vector with the greatest length. at each step, the set is split in half, and repeated for the halves, generally with any objects that cross the plane being included inside the node in question.


otherwise, the cost of doing these sorts of line-check operations can easily go up in an O(n^2) relation to how many objects exist, whereas with a BSP tree it is more around O(n log2 n).

and, part of this is because my AIs are a bit lazy and do generally do tracelines and similar checks (bounding-box move checks, ...) pretty much every tick. another solution here was that any idle AIs past a certain distance will "freeze" until a/the player moves within a certain distance of them, at which point their AIs will reactivate (they also go onto a slower 4Hz tick, *1).

for example, a moving AI may take into account the direction they are walking, applied gravity, ... and this will be fed into a function that tries to figure out where they will be able to move to (generally with some amount of probing), causing any number of collision queries to be performed.

also, AI enemy targeting is pretty much querying anything within their visibility area (a big sphere query), determining whether or not it is a potential enemy, and whether or not it is inside their view-frustum and they have a clear line-of-sight (more tracelines, it is considered a pass if any of them are not blocked), ...

so, these sorts of (small) costs can easily add up.


*1: different sets of 'ticks' happen at different rates.
4Hz: idle tick, for things which aren't really speed-critical (like activation/deactivation based on distance, ...);
10Hz: AI tick, generally used for enemy AI checks (checking for targets, movements, ...);
16Hz: movement physics tick, used for player movements and projectiles, and usual (non-fancy) physics;
100Hz (virtual): used by fancy physics (rigid body dynamics), only exists when entities using fancy-physics are active (*2).

*2: this is a little off, but related. doing collision checks/... at 100Hz is expensive, even with a BSP, so it is not done unless specially needed (usually because objects which use rotational movement are not exactly stable at much lower tick-rates). nearly everything else is either resting or sliding AABBs (normal bounding-boxes) though, which don't really require any fancy checks (for the most part).

certain bounding-box objects may still require subdivision in some cases though (mostly sub-dividing the movement of rockets or other fast-moving projectiles), to make sure they actually collide with things in their path. even at 16Hz a rocket can still easily exist on both sides of an enemy or a wall without otherwise registering a collision, so a subdivided tick is used to minimize this (at the cost of making rockets a bit more expensive).

16Hz was generally chosen for movement and physics mostly as it is a little less "jittery" for movement than 10Hz, and partly because my client-side LERP'ing was resulting in nasty "rubbery" interpolation effects at 10Hz, which were greatly reduced at 16Hz.


or such...

If you want a normal vector like the one he describes, then you need to know the polygon that the player is "standing" on. This requires a much finer calculation than a bounding box, as you need the exact polygon to get that normal vector. How expensive it is depends on the complexity of the world, but when compared to the super-cheap alternative of subtraction of two vectors, it's extremely expensive.

It's not that you shouldn't do collision casts per frame, as you do these all the time for things like shooting projectiles and such. It's just that you only have so much processing power to spare if you want your game to run smoothly, so you'd rather save calculations for other things. One of the guys mentions that the reason why there are only so many monsters in the game is because more would be too much for the hardware.

Also, keep in mind that this hardware is the PS1 (I believe) so we're talking about much worse hardware than today.

Great explanation. You got a point there.

dunno about their case, but my engine dynamically builds a BSP-like tree for sorting all this out (mostly by eliminating nearly everything that can be determined not to collide).

BSP as In Binary space partitioning? Im reading up on it right now. So that tree structure is suppose to focus on the area that requires collision testing and obscure the areas that do not need collision testing?


dunno about their case, but my engine dynamically builds a BSP-like tree for sorting all this out (mostly by eliminating nearly everything that can be determined not to collide).

BSP as In Binary space partitioning? Im reading up on it right now. So that tree structure is suppose to focus on the area that requires collision testing and obscure the areas that do not need collision testing?


Yes, it is one of several spatial partitioning techniques. Quadtrees and Loose Quadtrees are also useful for items placed around a terrain.

Most major collision or physics systems will use two phases.

First is called a broad phase. This uses a BSP or Quadtree or some other spatial system. That way instead of testing against every item in the world, you are only testing against objects that are very likely to collide. The only compute cost involved is navigating a binary spatial tree, so it is possible to eliminate hundreds or thousands of items with just ten or so comparisons.

The second pass is called a narrow phase. This is where it tests the details of the model. It may check against the actual 3D mesh, or against a physics object like an invisible sphere.


As an example... Imagine casing a ray across a level with 800 objects on it. First comes the broad phase. It searches the binary tree very quickly to discover only six items are close to the ray. Next comes the narrow phase. Those six items discovered in the broad phase are tested with a more precise mesh collision test. This phase may discover that the bullet went under the arm of one actor, over the shoulder of another actor, and was a head-shot on a third actor.



dunno about their case, but my engine dynamically builds a BSP-like tree for sorting all this out (mostly by eliminating nearly everything that can be determined not to collide).

BSP as In Binary space partitioning? Im reading up on it right now. So that tree structure is suppose to focus on the area that requires collision testing and obscure the areas that do not need collision testing?


Yes, it is one of several spatial partitioning techniques. Quadtrees and Loose Quadtrees are also useful for items placed around a terrain.

Most major collision or physics systems will use two phases.

First is called a broad phase. This uses a BSP or Quadtree or some other spatial system. That way instead of testing against every item in the world, you are only testing against objects that are very likely to collide. The only compute cost involved is navigating a binary spatial tree, so it is possible to eliminate hundreds or thousands of items with just ten or so comparisons.

The second pass is called a narrow phase. This is where it tests the details of the model. It may check against the actual 3D mesh, or against a physics object like an invisible sphere.


As an example... Imagine casing a ray across a level with 800 objects on it. First comes the broad phase. It searches the binary tree very quickly to discover only six items are close to the ray. Next comes the narrow phase. Those six items discovered in the broad phase are tested with a more precise mesh collision test. This phase may discover that the bullet went under the arm of one actor, over the shoulder of another actor, and was a head-shot on a third actor.


yep. the main notable thing is that for real-time tree-building, a quick-and-dirty BSP building strategy is preferable over a slower but more balanced BSP algo (these slower algos are often used for off-line BSP building).

I have typically also added in AABB tests for the coarser checks, since, even if we know a collision is likely, it is generally cheaper to eliminate possible collisions using AABB checks first, rather than using more expensive OBB/polyhedron/mesh checks up-front.


typically, for physics ticks, an entire movement will be boxed. for example, the starting and ending positions will be boxed (usually conservatively, assuming radius=box-size), then a bounding box will be made which includes the boxes both the start and end positions. if the two movement boxes don't collide, then there is no real chance of the objects themselves colliding.

after this point, it may be possible to instead treat the movement as a pair of moving spheres. if at the CPA (closest point of approach) there is no collision (ADD: "distance>(radius1+radius2)"), then the collision can also be eliminated.

then, more precise collision checks can be done by checking the objects at the positions they would be in at the CPA, and if they still collide, then any collision handling can be done (such as applying contact forces and similar).


though, it is also notable how, for the most part, I still most often end up using simple AABBs.

dunno about their case, but my engine dynamically builds a BSP-like tree for sorting all this out (mostly by eliminating nearly everything that can be determined not to collide).


the trick here is figuring out how to quickly rebuild the BSP in real-time

BSP... BSP... BSP... what have you got dudes?

Realtime BSP. Are you serious?

Trying to make BSP realtime is like trying to adapt an internal combustion engine to run on electricity only. It's just easier to use an electric motor. Bullet uses SAP and AABB trees. The latter is quite faster when dealing with a lot of moving geometry.

I have typically also added in AABB tests for the coarser checks...
though, it is also notable how, for the most part, I still most often end up using simple AABBs.

I mean, that says all!

Previously "Krohm"

This topic is closed to new replies.

Advertisement