frustrum culling methods

Started by
19 comments, last by Norman Barrows 11 years ago

Usually that's not a big deal and you're better off with conservative quick and dirty frustum culling than being super precise and wasting more time on culling objects than it would have taken to render them.

yes this is the approach i'm using. i actually cull to a pie slice (IE just left,right, and far, not above or below) that's about 20 feet (D3D units) wider than the frustrum on each side and starts 20 feet behind the camera.

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

Advertisement

Which also means that spheres are usually more than sufficient and if you can easily group a bunch of objects into chunks, it's a waste of time to cull every single object individually.

For a large terrain, I use a sphere tree. It's crude and draws the occasional invisible chunk, but culling takes almost no time at all.

hmm...

the world is procedurally generated when the player starts a new game. what gets culled? ground quads, trees, rocks, bushes, PCs, NPCs, monsters, dropped items, missile in flight. that's about it. the usual stuff.

ground quads are simply drawn around the camera out to clip range. the world map defines the texture tile set used (sand, leaves, etc). a "pattern map" defines the texture used for each quad. a procedural heightmap function based on the terrain in the world map is used to heightmap a quad before drawing.

trees rocks bushes etc are located via a terrain flag in the world map (yes there are rocks here), and a "pattern map" of what rock is where. the pattern maps are sparse matrices, IE lists with type (defines mesh and texture, or model), coords, scale, rotation, pre-estimated bounding box coords, etc.

PCs are in another list. NPCs and monsters are in yet another list. Dropped items are yet another list. and active missiles is the final list of stuff to draw.

Models (any object requiring more than 1 mesh and 1 texture to draw) are clipped or not clipped in their entirety, so clipping a model automatically clips all its meshes. that's about the only place where the data is already tree-like. models are a limb based system with up to 20 limbs (currently), and any limb can be the parent of any other limb, or have object space as its parent frame of reference.

as you can see, everything is in lists, and none of these object types encapsulates another. that's already handled by the individual models. clip a tree and you clip its trunk/branch mesh, and all its leaf meshes. clip a monster model and you clip its body, limbs, head, fangs, claws, etc.

is there a way to put these lists into a tree that would speed things up? or is it already as "treed out" as possible?

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

i was just curious how the "real world" does it. sphere/AABB vs plane etc all just sounds like overkill to me. do people really have clock cycles to burn on such trivialities? please explain, am i missing something here? are there situations where things like AABB vs plane are required?

It's a simple case of "divide and conquer" - all surfaces in your world are divided into some form of tree structure, with popular examples being a BSP tree, quadtree or octree (there are other variants but these will suffice for the purposes of this discussion). I'm going to deal with the more simple case of a BSP tree here because it's easier to explain and visualize, but this can be quite simply expanded to quadtree/octree once you get the principle.

So you begin with a single large bounding box which is the volume enclosing the entire world and containing all static geometry. Pick a plane and split it in two with that plane; divide your geometry among each half. Then, for each half, continue doing likewise until you reach some program-specific lower-bound. Straight away you can see that this is a recursive algorithm, but the recursion depth is quite low so it's no big deal, and you're trading off the overhead of recursion versus huge acceleration of accept/reject cull tests.

Now to cull you just start again at the top "node" of the tree - i.e. the big bounding box that encloses everything. You can run a frustum test on this if you wish but it will always be visible. Take each "child node" (i.e the two halves) and test these. Recurse. Here's where the real performance gain is. If a node is outside of the frustum then all of it's child nodes are also outside; no need to test them and no need to recurse down that half. In the best-case scenario here you can lop off fully half of your world geometry with a single cull test; you've already halved the performance impact of cull tests. The lesser-known trick is that if a node is fully inside the frustum then all of it's child nodes are also fully inside, so no need to test them either. Both of these cases work because each node fully encloses all of it's child nodes.

You should now be seeing how the number of cull tests gets dramatically reduced - instead of doing potentially tens of thousands of tests on each individual surface you're doing a much lower number of tests, discarding huge chunks of geometry at a time, and eventually filtering down to only having to test cases where a node may intersect the frustum (which is only true if that node's parent intersects it).

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

i was just curious how the "real world" does it. sphere/AABB vs plane etc all just sounds like overkill to me. do people really have clock cycles to burn on such trivialities?

Sphere vs plane is a dot product. You can do 12 dot products with 9 instructions.

Have you read the Dice/Battlefield presentation that HardlineDigital posted above?

Using a brute force algorithm, they do 15,000 sphere vs frustum tests in 1ms, or in 0.3ms when multi-threaded. That's enough tests for almost any game.

Comparing spheres vs planes is so simple, that adding extra early-out paths (e.g. if you're outside the first plane, don't check the other 5) actually makes the whole algorithm slower!

Also, tightening up the code so that it takes the least amount of clock cycles is useless unless you first optimize your data layouts. The number of clock cycles taken up by a single cache miss is comparable to about 50 sphere vs frustum tests, so optimal code is useless without optimal data layouts.

there's very little to clip above and below the frustrum, so i don't bother with clipping to the upper and lower frustrum planes. i'm trying to keep the culling overhead as low as possible, so i'm trying to stay away from complex intersection tests, additional transforms, etc. I was forced to go to camera space, because you can look down over the edge of a 100 foot cliff down into a canyon below (and actually climb up and down the cliff face too).

If you're looking down a cliff face, isn't there potentially a lot of stuff up/down from the camera?

So you begin with a single large bounding box which is the volume enclosing the entire world and containing all static geometry.

ok, that right there assumes the existence of a "world mesh", correct? and "divide and conquer" methods are used to cull individual triangles from that mesh, right?

that's not what i have going on here. the static world is assembled from a small collection of meshes and models, one ground quad, 2 tree models, 2 rock meshes, and maybe 4 plant models draws EVERYTHING! just change the textures, scale, rotation, etc.

The entire game takes a sort of CSG approach to drawing vs cell or level based. so many things are drawn with primitives like plane, cube, sphere, etc, and also with game specific primitive shapes such as "rock" and "stick" and "stone knife blade".

since this leads to large batches of small numbers of triangles vs one world mesh, i use a drawing list to feed the entire scene to directx in the most efficient order. you make all your draw calls for the scene, but they go into the list, not to directx. when its time to present, the list is sorted on texture, then mesh, then near to far, and maybe some other stuff, can't recall offhand <g>. and then the entire ordered list is sent to directx.

in the past i've experimented with N x N sized ground quads vs the 1 x 1 size (2 tris) i use now. but the engine is based on Dx9 fixed function for max compatability. for max speed it follows the mantra of 256x256 textures, and just one texture per mesh. the one texture per mesh rule means you can only use one 256x256 across an N x N ground quad, which usually leads to moire patterns. using just 4 tiles and a 10 foot 1x1 ground quad and pattern maps, i've been able to do desert sand with no moire patterns.

so i guess i'm more interested in generalized methods of culling entire meshes or collections of meshes, vs specific methods for culling triangles from a single huge world mesh.

but i must say, the explanation was the most lucid i've yet seen.

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

Also, tightening up the code so that it takes the least amount of clock cycles is useless unless you first optimize your data layouts.

too true, processors almost run faster than we can feed them.

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

If you're looking down a cliff face, isn't there potentially a lot of stuff up/down from the camera?

up, sort of. down, not much. in the up direction, you have what was in front of you when looking ahead. but its not a huge level mesh, its just ground quads, rock and tree models drawn out to 300 D3D units. since everything is in lists, and there is no world mesh, nothing ever gets drawn outside clip range. so i never have to cull anything that's not within 300 feet of the camera worst case scenario. imagine you had a doom level world mesh that always extended out to a radius of just 300 units from the camera and you only had to clip that, not every triangle in the entire level.

below the camera is the cliff face, and nothing else. just a few ground quads.

climbing cliffs is the only place where there's any significant amount of stuff above or below the frustrum that's not there when you look straight ahead. and i'm not even sure that's an accurate statement.

ok, outdoors scene, camera looking straight ahead. clip range, say 300 feet. ground is drawn as individual 10' x 10' quads out to clip range. tree models (collections of meshes), plant models, rock meshes. maybe some animated animal models (models can be static or animated). nothing drawn beyond clip range. clouds , sun and moon are drawn separately in a previous pass.

so as you're looking forward, what do you see? specifically, whats left/right/above/below you? left and right you have more of the world, ground quads, trees, rocks, etc. lots of stuff to cull there. below? well there's nothing under the ground quads, so not much to cull there. and above? except for the occasional animated model of an avian (pterasaurs, anyone?), there's nothing above you really. remember, the sky, clouds, sun, and moon have already been drawn in a previous pass, and now are just a background, like a cleared screen.

this is why i don't bother with the upper and lower plane clips.

its only when you look down that 100 foot cliff that you have lots of stuff above the upper plane. since that doesn't happen often in the game, i decided it was better to not add the overhead of upper plane clip all the time just for that special case. birds over your head are about the only thing that omitting the upper plane check allows through during normal circumstances.

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

You can do 12 dot products with 9 instructions

SIMD right?

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

So you begin with a single large bounding box which is the volume enclosing the entire world and containing all static geometry.

ok, that right there assumes the existence of a "world mesh", correct?

Not necessarily. The tree can be constructed based on arbitrary bounds and doesn't have to contain static objects (or any objects at all) at construction time; it can be a purely abstract structure into which objects you're going to draw are placed at runtime. So for the case you have, all you're doing is inserting objects into the tree at runtime then making your culling decisions based on which nodes of the tree are visible, rather than culling per-object.

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

The most sparing solution I came up with was:

1- first check wheather the AABS center is enough behind the near plane

2 - if it is not, check wheather it is behind left plane

3- if it is still not, check right plane

4- if still not, check for bottom and top plane unless you are CPU bound,

5- fall in render queue

This topic is closed to new replies.

Advertisement