# Frustum Culling Question

This topic is 1040 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

Hi all,

I have a quick question regarding my current frustum-culling implementation, that being: is it necessary to factor in the transformation matrix of each game object/entity (it's model->world transform) when constructing the clip matrix? I haven't seen any references sources do this and it sounds very ineffecient to have to reconstruct the clip matrix for each object? I'm currently constructing the clip matrix via : projection * view * model->world and can't help but think that while it works it's wrong in some way...

I can include code if required but I think the question is relatively self-contained.

Any help would be appreciated :)

##### Share on other sites

If you don't factor in the model's transformation how would you know if it's in your frustum? Nothing you wrote sounds out of the ordinary, Model View Projection matrices have MVP as an acronym for a reason!

##### Share on other sites

On a basic level? Yes. But there are methods to speed this up.

This is where you come into the spatial partitioning algorithms often used in games. These algorithms speed up the frustrum check by acting as a "set" of data in a given volume.

Remember that a set is a relational container of data, that can also exist in other sets. A venn-diagram if you will. For games however, it's more optimal if we DONT check the same object more than once.. (even if it barely impacts performance) so it will usually be a venn-diagram with no overlapping circles.

When we frustrum check, we check to see if these volumes/planes are within the frustrum, before we start checking if each object is in the frustrum or not.

Basically...

Quad-Trees, and Oct-trees work by recursion. You start off with very large volumes, and you work your way downwards, checking smaller volumes, till you finally get to the actual meshes to check. For small worlds... these tend to be more costly. For larger worlds these are efficient.

The trees... are more like graphs to be honest. These are designed to fit well with hallway like games. I won't talk about them much... because I am not that familiar with them. But they use a concept of portals, and rooms. If you can see a portal, then chances are you can see a room. If you can see the room, then you -can- do a frustrum check. But you can also choose to not waste time on that if you wish. It might just be faster to render everything than to spend some time to see if everything can be seen or not.

This is why Skyrim and Fallout flip flops between the BSPs and Quad-Trees like a schitzophrenic squirrel.

Edited by Tangletail

##### Share on other sites

For the static objects, the easiest option is to build one of trees Tangletail mentioned as the scene is being loaded; but if your scene is dynamic, then these trees will have to be rebuilt every frame. With dynamic scenes your best bet is to use the GPU and NVIDIA has a great paper on the subject. The examples they give are in CUDA; because well... That's their thing. But the algorithms could be implemented in OpenCl or Compute shaders all the same.

http://devblogs.nvidia.com/parallelforall/thinking-parallel-part-iii-tree-construction-gpu/

BVH Trees are also a great tool.

Edited by GameGeezer

##### Share on other sites

Basically...

Quad-Trees, and Oct-trees work by recursion. You start off with very large volumes, and you work your way downwards, checking smaller volumes, till you finally get to the actual meshes to check. For small worlds... these tend to be more costly. For larger worlds these are efficient.

Are hierarchical structures even necessary for frustum culling? My CPU can cull ~450,000,000 AABBs per second (on 1 thread). 1 ms would be 450,000 AABBs. I'm sure you'll run into a dozen other problems before bruteforce frustum culling becomes a bottleneck.

That's up to you really Modern CPUs can go higher than that to be honest, there is a reason to the madness however. If you are building a game engine which must calculate physics, logic, and a whole mess of other stuff, do you honestly want your engine to waste time checking against everything in a five mile radius?

That's ~450,000,000 AABB checks... vs only 4,600 AABB checks plus the small amount of time it would take to create some of the hierarchy's on demand.

keep in mind that 16ms is 60fps. 32ms = 30fps. And that's per update cycle.

Edited by Tangletail

##### Share on other sites

Hey guys, thank you all for taking the time to get back to me on this, much appreciated.

@GameGeezer: It was after looking at rastertek's tutorial on this topic that I decided to post the question, in the tutorial, he uses only the view and projection matrices (from the looks of it anyway), that's why I thought I was missing a trick or something.

@Tangletail: Thanks for mentioning spatial partitioning, I am indeed aware of this and how it can help you discard large swathes or objects/entities at a time. This is what I'll be looking into in more detail next.

Thanks again everyone :)

##### Share on other sites

It was after looking at rastertek's tutorial on this topic that I decided to post the question

my cull code is based in part on that tutorial. depending on the tutorial you used, you might find a bug in the rastertek code.  i combined it with code from another tutorial to get correct code. something related to calculation of the near plane as i recall. the tutorial made assumptions that didn't hold in the general case, something like that.

in my current project, for large outdoor scenes i calculate which terrain chunks are visible, then cull the meshes in the visible chunks. note that frustum cull for objects that span the frustum are a special case and different from your typical Bsphere type checks.

##### Share on other sites
You don't need clip space matrices for culling. You can extract the 6 planes from your view-proj matrix, and then use them to test your world-space bounding volumes.

Are hierarchical structures even necessary for frustum culling? My CPU can cull ~450,000,000 AABBs per second (on 1 thread). 1 ms would be 450,000 AABBs. I'm sure you'll run into a dozen other problems before bruteforce frustum culling becomes a bottleneck.

Yep. DICE went from spatial partitioning on Battlefield 2, to a dumb flat array of all objects on their next game... And the stupid flat array (that should be far worse in BigO notation) was way faster due to the realities on how modern CPUs/RAM work.
You probably only get a win from partitioning when your data set is too big to fit into cache, which probably is well over 10k objects. Under that size, the dumb methods are likely faster. You just have to arrange your data properly -- e.g. a contiguous array of bounding spheres can fit 16k objects into a 256KiB blob (with good SSE compatible alignment to boot).

##### Share on other sites

You probably only get a win from partitioning when your data set is too big to fit into cache, which probably is well over 10k objects. Under that size, the dumb methods are likely faster. You just have to arrange your data properly -- e.g. a contiguous array of bounding spheres can fit 16k objects into a 256KiB blob (with good SSE compatible alignment to boot).

Where does cache size fit in to this?  Isn't the only relevant thing that the cache line is pre-caching data for linear access?  I mean if you're only frustum culling you'll only use the data once or in one temporal locality... you won't visit the data again later, so I am not seeing why cache size plays into this.

1. 1
2. 2
Rutin
21
3. 3
4. 4
A4L
15
5. 5

• 13
• 26
• 10
• 11
• 9
• ### Forum Statistics

• Total Topics
633737
• Total Posts
3013606
×