# frustrum culling methods

## Recommended Posts

HI i was wondering what frustrum culling methods folks are using?

i've seen testing bounding spheres vs planes mentioned.

Edited by Norman Barrows

##### Share on other sites

This is not just about frustum culling but it is in here.

##### Share on other sites

Sticking with the basic beginner friendly methods, there is pretty much the "classic" approach of extracting frustum planes and checking the distance again each of them and the "radar" approach, where you transform things to camera space and just compare z vs. near/far and x/y vs half the width/height of the frustum for the given z.

There is no huge difference in terms of complexity or performance, so I'd simply use whatever seems more intuitive to you.

##### Share on other sites

and the "radar" approach, where you transform things to camera space and just compare z vs. near/far and x/y vs half the width/height of the frustum for the given z.

There is no huge difference in terms of complexity or performance, so I'd simply use whatever seems more intuitive to you.

well i started with a world space clip in x-z only, then went to a camera space clip in x-z only. but i'm computing headings to the center and corners of bounding boxes. x/y vs 1/2 width/height @ given Z sounds more efficient. my only problem is when a bounding box spans the frustrum and is so big that the center and all four corners fall outside the frustrum.

##### Share on other sites

http://www.cse.chalmers.se/~uffe/vfc_bbox.pdf

http://fgiesen.wordpress.com/2010/10/17/view-frustum-culling/

A simple one is to extract the 6 planes from your view-projection, and then test each object's bounding sphere against those planes, taking it's radius into account.

The above describe a similar scheme extended to AABB's. To support OBBs, you can rotate the planes by the boxes inverse orientation so that the box becomes axis aligned.

##### Share on other sites

What I have done in the past and it worked very well was simple sphere vs frustum with plane caching.

The idea being if a sphere vs frustum plane fails the test, the store that plane and test it first next time round as it will most likely be behind that same plane on the next frame.

I wrote a little about it here:

http://blog.bwhiting.co.uk/?p=355

It works very very quickly and was fine with anything upto 10,000 objects culled in about 1ms. With c++ this could probably be 10 times faster.

##### Share on other sites

well i started with a world space clip in x-z only, then went to a camera space clip in x-z only. but i'm computing headings to the center and corners of bounding boxes. x/y vs 1/2 width/height @ given Z sounds more efficient. my only problem is when a bounding box spans the frustrum and is so big that the center and all four corners fall outside the frustrum.

That shouldn't be a problem, unless you make the mistake of completely ignoring which plane the corners are outside of. All corners have to be outside of the same plane, even if that means not culling a few special cases. Usually that's not a big deal and you're better off with conservative quick and dirty frustum culling than being super precise and wasting more time on culling objects than it would have taken to render them.

Which also means that spheres are usually more than sufficient and if you can easily group a bunch of objects into chunks, it's a waste of time to cull every single object individually.

For a large terrain, I use a sphere tree. It's crude and draws the occasional invisible chunk, but culling takes almost no time at all.

##### Share on other sites

The other useful property of any kind of hierarchical partitioning that's rarely mentioned is: if a given node is fully inside the frustum, then all of it's child nodes are guaranteed to also be fully inside; this generalizes to: frustum checks only need to be done for nodes where the parent intersects the frustum (and for the root node, where there is no parent).  Using this knowledge you can quite easily (and quickly) skip over a huge number of tests with trivial accept and reject cases.

Edited by mhagain

##### Share on other sites

A simple one is to extract the 6 planes from your view-projection, and then test each object's bounding sphere against those planes, taking it's radius into account.
The above describe a similar scheme extended to AABB's. To support OBBs, you can rotate the planes by the boxes inverse orientation so that the box becomes axis aligned.

the title in question is a FPS/RPG. as such, there's very little to clip above and below the frustrum, so i don't bother with clipping to the upper and lower frustrum planes. i'm trying to keep the culling overhead as low as possible, so i'm trying to stay away from complex intersection tests, additional transforms, etc. I was forced to go to camera space, because you can look down over the edge of a 100 foot cliff down into a canyon below (and actually climb up and down the cliff face too).

i was just curious how the "real world" does it.   sphere/AABB vs plane etc all just sounds like overkill to me. do people really have clock cycles to burn on such trivialities?  please explain, am i missing something here? are there situations where things like AABB vs plane are required?

the problem with large bboxes that more than "double span" the frustrum is the same idea as "spanning triangle clipping" in a poly engine. easily dealt with with a couple extra checks.

am i correct in assuming that in camera space "abs(x) > 1/2 frustrum width @ given z" is faster than "plane-sphere intersection test" (assuming everything is pre-computed) ?   I haven't looked up the formulas, other than those clever "change of coord system" dots in the article cited above (metal gear 3 was it?).

##### Share on other sites

The idea being if a sphere vs frustum plane fails the test, the store that plane and test it first next time round as it will most likely be behind that same plane on the next frame.

very clever. optimize cull order on the fly!

##### Share on other sites

Usually that's not a big deal and you're better off with conservative quick and dirty frustum culling than being super precise and wasting more time on culling objects than it would have taken to render them.

yes this is the approach i'm using.   i actually cull to a pie slice (IE just left,right, and far, not above or below) that's about 20 feet (D3D units) wider than the frustrum on each side and starts 20 feet behind the camera.

##### Share on other sites

Which also means that spheres are usually more than sufficient and if you can easily group a bunch of objects into chunks, it's a waste of time to cull every single object individually.

For a large terrain, I use a sphere tree. It's crude and draws the occasional invisible chunk, but culling takes almost no time at all.

hmm...

the world is procedurally generated when the player starts a new game. what gets culled?  ground quads, trees, rocks, bushes, PCs, NPCs, monsters, dropped items, missile in flight. that's about it. the usual stuff.

ground quads are simply drawn around the camera out to clip range. the world map defines the texture tile set used (sand, leaves, etc). a "pattern map" defines the texture used for each quad. a procedural heightmap function based on the terrain in the world map is used to heightmap a quad before drawing.

trees rocks bushes etc are located via a terrain flag in the world map (yes there are rocks here), and a "pattern map" of what rock is where. the pattern maps are sparse matrices, IE lists with type (defines mesh and texture, or model), coords, scale, rotation, pre-estimated bounding box coords, etc.

PCs are in another list.   NPCs and monsters are in yet another list. Dropped items are yet another list. and active missiles is the final list of stuff to draw.

Models (any object requiring more than 1 mesh and 1 texture to draw) are clipped or not clipped in their entirety, so clipping a model automatically clips all its meshes. that's about the only place where the data is already tree-like. models are a limb based system with up to 20 limbs (currently), and any limb can be the parent of any other limb, or have object space as its parent frame of reference.

as you can see, everything is in lists, and none of these object types encapsulates another. that's already handled by the individual models. clip a tree and you clip its trunk/branch mesh, and all its leaf meshes. clip a monster model and you clip its body, limbs, head, fangs, claws, etc.

is there a way to put these lists into a tree that would speed things up? or is it already as "treed out" as possible?

Edited by Norman Barrows

##### Share on other sites

i was just curious how the "real world" does it.   sphere/AABB vs plane etc all just sounds like overkill to me. do people really have clock cycles to burn on such trivialities?  please explain, am i missing something here? are there situations where things like AABB vs plane are required?

It's a simple case of "divide and conquer" - all surfaces in your world are divided into some form of tree structure, with popular examples being a BSP tree, quadtree or octree (there are other variants but these will suffice for the purposes of this discussion).  I'm going to deal with the more simple case of a BSP tree here because it's easier to explain and visualize, but this can be quite simply expanded to quadtree/octree once you get the principle.

So you begin with a single large bounding box which is the volume enclosing the entire world and containing all static geometry.  Pick a plane and split it in two with that plane; divide your geometry among each half.  Then, for each half, continue doing likewise until you reach some program-specific lower-bound.  Straight away you can see that this is a recursive algorithm, but the recursion depth is quite low so it's no big deal, and you're trading off the overhead of recursion versus huge acceleration of accept/reject cull tests.

Now to cull you just start again at the top "node" of the tree - i.e. the big bounding box that encloses everything.  You can run a frustum test on this if you wish but it will always be visible.  Take each "child node" (i.e the two halves) and test these.  Recurse.  Here's where the real performance gain is.  If a node is outside of the frustum then all of it's child nodes are also outside; no need to test them and no need to recurse down that half.  In the best-case scenario here you can lop off fully half of your world geometry with a single cull test; you've already halved the performance impact of cull tests.  The lesser-known trick is that if a node is fully inside the frustum then all of it's child nodes are also fully inside, so no need to test them either.  Both of these cases work because each node fully encloses all of it's child nodes.

You should now be seeing how the number of cull tests gets dramatically reduced - instead of doing potentially tens of thousands of tests on each individual surface you're doing a much lower number of tests, discarding huge chunks of geometry at a time, and eventually filtering down to only having to test cases where a node may intersect the frustum (which is only true if that node's parent intersects it).

##### Share on other sites

i was just curious how the "real world" does it.   sphere/AABB vs plane etc all just sounds like overkill to me. do people really have clock cycles to burn on such trivialities?

Sphere vs plane is a dot product. You can do 12 dot products with 9 instructions.

Have you read the Dice/Battlefield presentation that HardlineDigital posted above?

Using a brute force algorithm, they do 15,000 sphere vs frustum tests in 1ms, or in 0.3ms when multi-threaded. That's enough tests for almost any game.

Comparing spheres vs planes is so simple, that adding extra early-out paths (e.g. if you're outside the first plane, don't check the other 5) actually makes the whole algorithm slower!

Also, tightening up the code so that it takes the least amount of clock cycles is useless unless you first optimize your data layouts. The number of clock cycles taken up by a single cache miss is comparable to about 50 sphere vs frustum tests, so optimal code is useless without optimal data layouts.

there's very little to clip above and below the frustrum, so i don't bother with clipping to the upper and lower frustrum planes. i'm trying to keep the culling overhead as low as possible, so i'm trying to stay away from complex intersection tests, additional transforms, etc. I was forced to go to camera space, because you can look down over the edge of a 100 foot cliff down into a canyon below (and actually climb up and down the cliff face too).

If you're looking down a cliff face, isn't there potentially a lot of stuff up/down from the camera?

##### Share on other sites

So you begin with a single large bounding box which is the volume enclosing the entire world and containing all static geometry.

ok, that right there assumes the existence of a "world mesh", correct? and "divide and conquer" methods are used to cull individual triangles from that mesh, right?

that's not what i have going on here. the static world is assembled from a small collection of meshes and models, one ground quad, 2 tree models, 2 rock meshes, and maybe 4 plant models draws EVERYTHING! just change the textures, scale, rotation, etc.

The entire game takes a sort of CSG approach to drawing vs cell or level based. so many things are drawn with primitives like plane, cube, sphere, etc, and also with game specific primitive shapes such as "rock" and "stick" and "stone knife blade".

since this leads to large batches of small numbers of triangles vs one world mesh, i use a drawing list to feed the entire scene to directx in the most efficient order. you make all your draw calls for the scene, but they go into the list, not to directx. when its time to present, the list is sorted on texture, then mesh, then near to far, and maybe some other stuff, can't recall offhand <g>. and then the entire ordered list is sent to directx.

in the past i've experimented with N x N sized ground quads vs the 1 x 1 size (2 tris) i use now. but the engine is based on Dx9 fixed function for max compatability. for max speed it follows the mantra of 256x256 textures, and just one texture per mesh. the one texture per mesh rule means you can only use one 256x256 across an N x N ground quad, which usually leads to moire patterns. using just 4 tiles and a 10 foot 1x1 ground quad and pattern maps, i've been able to do desert sand with no moire patterns.

so i guess i'm more interested in generalized methods of culling entire meshes or collections of meshes, vs specific methods for culling triangles from a single huge world mesh.

but i must say, the explanation was the most lucid i've yet seen.

##### Share on other sites

Also, tightening up the code so that it takes the least amount of clock cycles is useless unless you first optimize your data layouts.

too true, processors almost run faster than we can feed them.

##### Share on other sites

If you're looking down a cliff face, isn't there potentially a lot of stuff up/down from the camera?

up, sort of. down, not much. in the up direction, you have what was in front of you when looking ahead. but its not a huge level mesh, its just ground quads, rock and tree models drawn out to 300 D3D units.  since everything is in lists, and there is no world mesh, nothing ever gets drawn outside clip range. so i never have to cull anything that's not within 300 feet of the camera worst case scenario.   imagine you had a doom level world mesh that always extended out to a radius of just 300 units from the camera and you only had to clip that, not every triangle in the entire level.

below the camera is the cliff face, and nothing else. just a few ground quads.

climbing cliffs is the only place where there's any significant amount of stuff above or below the frustrum that's not there when you look straight ahead. and i'm not even sure that's an accurate statement.

ok, outdoors scene, camera looking straight ahead. clip range, say 300 feet. ground is drawn as individual 10' x 10' quads out to clip range. tree models (collections of meshes), plant models, rock meshes. maybe some animated animal models (models can be static or animated). nothing drawn beyond clip range.   clouds , sun and moon are drawn separately in a previous pass.

so as you're looking forward, what do you see? specifically, whats left/right/above/below you?   left and right you have more of the world, ground quads, trees, rocks, etc. lots of stuff to cull there. below? well there's nothing under the ground quads, so not much to cull there. and above? except for the occasional animated model of an avian (pterasaurs, anyone?), there's nothing above you really. remember, the sky, clouds, sun, and moon have already been drawn in a previous pass, and now are just a background, like a cleared screen.

this is why i don't bother with the upper and lower plane clips.

its only when you look down that 100 foot cliff that you have lots of stuff above the upper plane. since that doesn't happen often in the game, i decided it was better to not add the overhead of upper plane clip all the time just for that special case. birds over your head are about the only thing that omitting the upper plane check allows through during normal circumstances.

##### Share on other sites

You can do 12 dot products with 9 instructions

SIMD right?

##### Share on other sites

So you begin with a single large bounding box which is the volume enclosing the entire world and containing all static geometry.

ok, that right there assumes the existence of a "world mesh", correct?

Not necessarily.  The tree can be constructed based on arbitrary bounds and doesn't have to contain static objects (or any objects at all) at construction time; it can be a purely abstract structure into which objects you're going to draw are placed at runtime.  So for the case you have, all you're doing is inserting objects into the tree at runtime then making your culling decisions based on which nodes of the tree are visible, rather than culling per-object.

##### Share on other sites

The most sparing solution I came up with was:

1- first check wheather the AABS center is enough behind the near plane

2 - if it is not, check wheather it is behind left plane

3- if it is still not, check right plane

4-  if still not, check for bottom and top plane unless you are CPU bound,

5- fall in render queue

##### Share on other sites

So for the case you have, all you're doing is inserting objects into the tree at runtime then making your culling decisions based on which nodes of the tree are visible, rather than culling per-object.

so i'm doing a left/right sort insertion into a K-tree thing, then eh?

and if i have a critter just outside the left plane, i can skip all the children to the left of it. that kind of idea?

ok, but that may be overkill, depending on the number of objects to cull. i can see it being useful for a world mesh of 10,000 tri's. but i'm usually more on the order of 500 objects or so. maybe 2000 worst case. brute force may be sufficient. Abrash on the doom engine (CGDC 96): "the pc is good at doing simple stuff lot of times really really fast". so a loop for each list of objects to cull (trees, rocks, critters, etc) may be sufficient. I haven't noticed any cull related slowdowns. i do plan to replace my angle calculations with either "abs(x) vs 1/2 frustrum width", or sphere-plane intersection.

## Create an account

Register a new account

• ### Forum Statistics

• Total Topics
628379
• Total Posts
2982351

• 10
• 9
• 15
• 24
• 11