Best data structure and culling method for indoors dynamic environments

I didn't want to call the data structure a scenegraph as I'm honestly not fond of the term. I'm looking for ideas with certain properties I'd like to explore in more detail. As such, this thread isn't a question, but rather a somewhat blind stab to see if there are any good ideas I haven't come across yet.

The data structure:

- the target is mostly indoorsy environments, but ones that can be prone to big and complete changes. In that regard what I'm dealing with is pseudo-dynamic geometry: everything can change, but much of the time doesn't. As such it's important to be able to quickly rebuild the scene hierarchy when needed
- emphasis is on minimizing actual geometry, but giving individual objects more detail, meaning that early culling should have extra effect
- parallelism: someone posted this here on GD a few days ago, and even though I won't be targeting consoles or managing anywhere near the 15k objects mentioned in the presentation, I'm intrigued by the idea of parallel culling for a number of frusta without bottlenecking the system on serialized hierarchy traversal
- the best scheme I've found for my needs (after a fair amount of reading) is BIH: I love its simplicity and high performance. Sadly, being a hierarchical method, it's also as anti-parallel as they get. I'm intrigued by the Battlefield presentation, though, which mentions they managed to get a 3x speedup over a BVH (already one of the faster methods out there; someone wrote somewhere that he got a BVH to within 5-10% of a kD-tree's speed) by simply brute-forcing a regular grid
- level of detail and storage volumes: I feel heavily inclined towards AABBs for their simplicity and speed at the moment. Since my idea is to sacrifice tree depth for the benefit of better leaf granularity (I'm thinking all the way down to triangle pairs), a few extra false positives won't hurt performance
- frustum culling is still faster with spheres, so my idea is to use bounding spheres for bigger objects and bounding boxes for smaller ones. It'd be cool to switch from spheres to AABBs to OBBs the deeper into the tree a node is (see the sketch right after this list)
- one thing that requires special attention is figuring out what the best solution for merging rendering and collision is: I need per-polygon collision data, which actually blends well with the rendering data, but might make the code unnecessarily interdependent and unscalable
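To make the sphere-vs-box tradeoff concrete, here's a minimal sketch of both tests against a six-plane frustum (my own illustrative code, not from any of the engines or papers mentioned here); the AABB version pays three extra multiplies and absolute values per plane, which is why spheres win for bulk culling:

[code]
#include <cmath>

struct Plane { float nx, ny, nz, d; }; // normalized, pointing into the frustum

// Sphere test: one dot product per plane, early out on the first rejection.
bool SphereInFrustum(const Plane f[6], float cx, float cy, float cz, float r)
{
    for (int i = 0; i < 6; ++i)
        if (f[i].nx * cx + f[i].ny * cy + f[i].nz * cz + f[i].d < -r)
            return false;
    return true; // inside or intersecting (conservative)
}

// AABB test (center c, half-extents e): the projected radius of the box onto
// each plane normal adds three multiplies and fabs calls per plane.
bool AabbInFrustum(const Plane f[6], const float c[3], const float e[3])
{
    for (int i = 0; i < 6; ++i) {
        float dist = f[i].nx * c[0] + f[i].ny * c[1] + f[i].nz * c[2] + f[i].d;
        float radius = e[0] * std::fabs(f[i].nx)
                     + e[1] * std::fabs(f[i].ny)
                     + e[2] * std::fabs(f[i].nz);
        if (dist < -radius)
            return false;
    }
    return true;
}
[/code]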

Culling:

I'm actually very curious about what Battlefield does in that it renders bounding boxes to a software buffer and does low res depth look-ups to test whether an object is visible. It's possible I'm missing details from the paper or I'm just stupid, but at the moment I can't see how they can handle false negatives when only using bounding boxes and severely limited lookups. What about more complex objects (more and smaller bounding boxes?)?

Also, do they have an extra hardware culling stage that they use for frame-to-frame heuristics/error checking (by allowing overdraw of "potentially visible" objects in HW)?

Collision:

I need to do a lot of it. More than usual due to debris flying around in the scene that needs to be collided with the geometry. As such it's imperative to put extra emphasis on the fastest possible way of getting the smallest possible testable dataset (which is why I'm currently bent on sacrificing tree depth to gain accuracy).




So yeah, if you have comments, ideas or thoughts, then do share! :) I'd like to figure out what the most rational solution would be before implementing anything as it's quite the investment time-wise.

Smallest possible dataset is probably not something worth shooting for. You're going to find that there's a cutoff point beyond which it's actually faster to just send things to the GPU and let it sort them out than it is to cull them on the CPU before sending. I haven't seriously benchmarked this, but I'm quite confident that chunks of flying debris would comfortably fall within that cutoff, and that the point is actually positioned at objects a good deal larger and more complex than that. Going for smallest possible would also seriously hinder your ability to do good draw call batching, which is still quite important for performance, even if using D3D10/11 or OpenGL (draw calls may be cheaper but they're still not free).

[quote]Smallest possible dataset is probably not something worth shooting for. You're going to find that there's a cutoff point beyond which it's actually faster to just send things to the GPU and let it sort them out than it is to cull them on the CPU before sending. I haven't seriously benchmarked this, but I'm quite confident that chunks of flying debris would comfortably fall within that cutoff, and that the point is actually positioned at objects a good deal larger and more complex than that. Going for smallest possible would also seriously hinder your ability to do good draw call batching, which is still quite important for performance, even if using D3D10/11 or OpenGL (draw calls may be cheaper but they're still not free).[/quote]


I'm actually most concerned about generic geometry, which cannot easily be separated into meshes or chunks. With arbitrary faces, the face itself becomes the natural cutoff point. I'm not concerned with what the debris is, but rather with how its collision and visibility should be handled in the grand scheme of things.

[quote]- frustum culling[/quote]
I believe Frostbite still uses a hierarchy for their bounding spheres. It is stored linearly, though, by depth. That lends itself very well to testing higher-resolution primitives at a certain depth, though I'm sure they're limiting it to 2 or 3 levels, so it's pretty much all or nothing.
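To make "stored linearly" concrete, here's one common flat layout (my own guess at the flavor of the thing; Frostbite's actual format may well differ): nodes in depth-first order, each with a skip index to the first node outside its subtree, so traversal never chases pointers and a rejected subtree is dropped in a single jump:

[code]
#include <cstdint>
#include <vector>

struct Plane4 { float nx, ny, nz, d; }; // normalized, pointing into the frustum

// Hypothetical flattened node: the whole tree lives in one array.
struct CullNode {
    float    cx, cy, cz, radius; // bounding sphere
    uint32_t skip;               // first array index past this node's subtree
    int32_t  object;             // >= 0 on leaves, -1 on internal nodes
};

static bool SphereVisible(const Plane4 f[6], const CullNode& n)
{
    for (int i = 0; i < 6; ++i)
        if (f[i].nx * n.cx + f[i].ny * n.cy + f[i].nz * n.cz + f[i].d < -n.radius)
            return false;
    return true;
}

void CullLinear(const std::vector<CullNode>& nodes, const Plane4 frustum[6],
                std::vector<int32_t>& visible)
{
    uint32_t i = 0;
    while (i < nodes.size()) {
        const CullNode& n = nodes[i];
        if (!SphereVisible(frustum, n)) {
            i = n.skip;          // reject the whole subtree in one step
            continue;
        }
        if (n.object >= 0)
            visible.push_back(n.object);
        ++i;                     // descend: children follow contiguously
    }
}
[/code]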

[quote]- one thing that requires special attention is figuring out what the best solution for merging rendering and collision is[/quote]
On Toy Soldiers 2, all collision operated on the renderable geometry. For that setup it was perfect. I've since implemented a generic physics solution which can operate on the same data, or on some other lower-resolution data.

[quote]- renders bounding boxes to a software buffer and does low res depth look-ups to test whether an object is visible.[/quote]
I would say they are not rendering the bounding volumes, but rather using an AABB (in depth-buffer space) to check the depth buffer for occlusion. There will be tons of false positives; however, the gain from all the useful rejections is supposed to outweigh that.


[quote]- collision[/quote]
We used a quadtree, with kD-trees at each draw-call level. All of our debris simply raycast into the world. From what I could tell, Battlefield and all of the popular FPS shooters have very negligible physics. It's kind of a shame.

Quadtrees are terribly slow. Our new system uses sweep-and-prune and frame-to-frame GJK to hopefully give us much more accurate collision for the same cost.
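For reference, the broad phase being described could look something like this single-axis sweep-and-prune sketch (illustrative only; a real implementation keeps the endpoint list sorted incrementally between frames instead of re-sorting, and prunes on y/z as well):

[code]
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

struct Box { float minX, maxX; int id; }; // y/z bounds omitted for brevity

// Sort intervals on one axis, then sweep: only objects whose x-intervals
// overlap become candidate pairs for the exact (e.g. GJK) narrow phase.
std::vector<std::pair<int, int>> SweepAndPrune(std::vector<Box> boxes)
{
    std::sort(boxes.begin(), boxes.end(),
              [](const Box& a, const Box& b) { return a.minX < b.minX; });

    std::vector<std::pair<int, int>> pairs;
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        for (std::size_t j = i + 1; j < boxes.size(); ++j) {
            if (boxes[j].minX > boxes[i].maxX)
                break;           // list is sorted: nothing further can overlap
            pairs.emplace_back(boxes[i].id, boxes[j].id);
        }
    }
    return pairs;
}
[/code]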


[quote]
[quote]- frustum culling[/quote]
I believe Frostbite still uses a hierarchy for their bounding spheres. It is stored linearly, though, by depth. That lends itself very well to testing higher-resolution primitives at a certain depth, though I'm sure they're limiting it to 2 or 3 levels, so it's pretty much all or nothing.
[/quote]


I'm not sure what you mean by "stored linearly" - could you elaborate?


[quote]
[quote]- one thing that requires special attention is figuring out what the best solution for merging rendering and collision is[/quote]
On Toy Soldiers 2, all collision operated on the renderable geometry. For that setup it was perfect. I've since implemented a generic physics solution which can operate on the same data, or on some other lower-resolution data.
[/quote]

I'll be needing per-primitive collision for the world and per-object approximate collision (a low-resolution bounding mesh) for independent meshes. Which actually brings up an interesting question (below).


[quote]
[quote]- renders bounding boxes to a software buffer and does low res depth look-ups to test whether an object is visible.[/quote]
I would say they are not rendering the bounding volumes, but rather using an AABB (in depth-buffer space) to check the depth buffer for occlusion. There will be tons of false positives; however, the gain from all the useful rejections is supposed to outweigh that.
[/quote]

I was implying that AABBs are the bounding volumes in this case, not spheres, so we're talking about the same thing :)


[quote]
[quote]- collision[/quote]
We used a quadtree, with kD-trees at each draw-call level. All of our debris simply raycast into the world. From what I could tell, Battlefield and all of the popular FPS shooters have very negligible physics. It's kind of a shame.

Quadtrees are terribly slow. Our new system uses sweep-and-prune and frame-to-frame GJK to hopefully give us much more accurate collision for the same cost.
[/quote]

I need to update the tree in realtime as I'll be applying scaling and rotation to world geometry, making the entire world essentially one big deformable/dynamic geometric set.


The point I'm curious about is how world geometry is supposed to be stored in the hierarchy. Take a sizable room with intricate walls: a BSP tree will naturally divide the geometry up into per-poly planes, giving a good approximation of what will be collidable or even visible when you raycast through it (with the added benefit of PVS, a luxury I won't have). The benefit here is that generation of the tree is unaided and automatic, albeit slow and done offline. When sorting world geometry into a BV-style tree, however, there are no clear guidelines for how to do it or how to group what. How do I split up a room with indents and an uneven wall layout (e.g. a room shaped like the letter E)? At face value the room itself becomes the root-level node, and the next levels will either overlap horribly (some walls being long and some very short) or the tree will become increasingly complex when using non-spherical volumes (OBBs sound like the best solution here). The only alternative I see is using artist-defined volumes for grouping, which I'd like to avoid.

It is this construction phase I'm most confused about.
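One unaided option is the standard top-down median split used by BIH/BVH builders, sketched below over a plain triangle soup (this is generic textbook construction, not any particular engine's). An E-shaped room separates into its prongs after a few splits, though long thin walls can still produce overlapping children unless a smarter split heuristic (e.g. SAH) is used:

[code]
#include <algorithm>
#include <vector>

struct Tri { float centroid[3]; /* vertices omitted for brevity */ };

// Recursively split a triangle range at the median centroid along the
// longest axis of the centroid bounds. Node/bounding-box records for each
// range would be emitted here; only the partitioning is shown.
void BuildNode(std::vector<Tri>& tris, std::size_t lo, std::size_t hi,
               std::size_t maxLeaf)
{
    if (hi - lo <= maxLeaf)
        return; // small enough: becomes a leaf

    // Bound the centroids and pick the longest axis to split on.
    float mn[3] = { 1e30f, 1e30f, 1e30f }, mx[3] = { -1e30f, -1e30f, -1e30f };
    for (std::size_t i = lo; i < hi; ++i)
        for (int a = 0; a < 3; ++a) {
            mn[a] = std::min(mn[a], tris[i].centroid[a]);
            mx[a] = std::max(mx[a], tris[i].centroid[a]);
        }
    int axis = 0;
    for (int a = 1; a < 3; ++a)
        if (mx[a] - mn[a] > mx[axis] - mn[axis]) axis = a;

    // Median split: equal triangle counts per child, no tuning required.
    std::size_t mid = (lo + hi) / 2;
    std::nth_element(tris.begin() + lo, tris.begin() + mid, tris.begin() + hi,
                     [axis](const Tri& a, const Tri& b)
                     { return a.centroid[axis] < b.centroid[axis]; });

    BuildNode(tris, lo, mid, maxLeaf);
    BuildNode(tris, mid, hi, maxLeaf);
}
[/code]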

Unreal uses imported meshes (I think they're called "static meshes") for more complex geometry and portals for spatial distribution, but it's up to the BSP tree to handle immediate user-edited geometry. Using a BV-based tree throws everything into the same pot, and the only solution I can think of is not pre-checking any of the world geometry, but rather either brute-forcing it through the pipeline or letting the hardware do all of the culling within the currently visible space/portals (since doing per-poly frustum checks would kill all performance).


[quote]- level of detail and storage volumes: I feel heavily inclined towards AABBs for their simplicity and speed at the moment. Since my idea is to sacrifice tree depth for the benefit of better leaf granularity (I'm thinking all the way down to triangle pairs), a few extra false positives won't hurt performance[/quote]
I strongly encourage using AABBs; they worked really well for me.

[quote]- one thing that requires special attention is figuring out what the best solution for merging rendering and collision is: I need per-polygon collision data, which actually blends well with the rendering data, but might make the code unnecessarily interdependent and unscalable[/quote]
I'm afraid that's not going to be generally feasible: see this UDK page for a visualization of collision objects (scroll to the bottom). In general, collision against polygon soups is - in my opinion - way too slow to be viable on a large scale. I suggest keeping the collision model fully separate from the graphics representation. In case a hit is detected and accurate collision resolution is required, bypass the broad collision detection on request with ad-hoc tests.
Having spent some time writing my system's data structure and having tried to write a physics system, I feel the need to warn you that both tasks are fairly involved. There will be a day you'll have to use a physics API; when that day comes, all the work on physics will go down the drain. I strongly encourage thinking ahead.

[quote]More than usual due to debris flying around in the scene that needs to be collided with the geometry. As such it's imperative to put extra emphasis on the fastest possible way of getting the smallest possible testable dataset (which is why I'm currently bent on sacrificing tree depth to gain accuracy).[/quote]
As far as I've understood, this kind of collision testing is typically performed on simplified models, perhaps using Smoothed Particle Hydrodynamics. I can tell for sure Bullet is not going to handle this in its current implementation. A small amount of debris is no problem, however. Please quantify the amount of queries/objects required.

[quote]I need to update the tree in realtime as I'll be applying scaling and rotation to world geometry, making the entire world essentially one big deformable/dynamic geometric set.[/quote]
I strongly advocate against that; if the feature is required, consider using ad-hoc entities. Just allowing the world to be natively dynamic is fairly complex and drops the chances of optimization by chunking/batching. Management will require consistent mapping/updating of constant buffers on a per-primitive basis. Worse: how is the asset pipeline going to work? I'm not sure I follow your line of thinking. I'd be extremely scared of mixing this with the lightmapper you were considering a few days ago.

[quote]How do I split up a room with indents and an uneven wall layout (e.g. a room shaped like the letter E)? At face value the room itself becomes the root-level node, and the next levels will either overlap horribly (some walls being long and some very short) or the tree will become increasingly complex when using non-spherical volumes (OBBs sound like the best solution here). The only alternative I see is using artist-defined volumes for grouping, which I'd like to avoid.[/quote]
Agreed. I'd like to read some suggestions on that as well.

[quote]Unreal uses imported meshes (I think they're called "static meshes") for more complex geometry and portals for spatial distribution...[/quote]
I suppose you're talking about "gameplay portals" here? What's the problem with having a unified approach?


[quote]Culling:
I'm actually very curious about what Battlefield does in that it renders bounding boxes to a software buffer and does low res depth look-ups to test whether an object is visible. It's possible I'm missing details from the paper or I'm just stupid, but at the moment I can't see how they can handle false negatives when only using bounding boxes and severely limited lookups. What about more complex objects (more and smaller bounding boxes?)?[/quote]
They don't render bounding boxes; they render special low-poly occluders and then test the bounding boxes against the result.

[quote]Also, do they have an extra hardware culling stage that they use for frame-to-frame heuristics/error checking (by allowing overdraw of "potentially visible" objects in HW)?[/quote]
No, the culling code won't change on the fly, so there is no need for any heuristics/error checking; if something is 'bad', the level designer/artist will see it and fix the occluders.

[quote]Unreal uses imported meshes (I think they're called "static meshes") for more complex geometry and portals for spatial distribution, but it's up to the BSP tree to handle immediate user-edited geometry. Using a BV-based tree throws everything into the same pot, and the only solution I can think of is not pre-checking any of the world geometry, but rather either brute-forcing it through the pipeline or letting the hardware do all of the culling within the currently visible space/portals (since doing per-poly frustum checks would kill all performance).[/quote]
If you're talking about UT3, it looks like they're using hardware occlusion queries to figure out what parts of the level are visible; you can see the typical artefacts if you pay attention.
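For completeness, the hardware path mentioned here is the standard occlusion-query round trip; a sketch in OpenGL (assuming GL 1.5+ declarations are available via your header or extension loader; D3D has an equivalent query object). The frame-late readback is exactly where the typical artefacts come from:

[code]
#include <GL/gl.h> // assumes GL 1.5+ declarations are available

GLuint query = 0;
GLuint samplesVisible = 0;

void IssueQuery()
{
    if (!query) glGenQueries(1, &query);
    glBeginQuery(GL_SAMPLES_PASSED, query);
    // DrawBoundingBox(); // hypothetical: cheap proxy mesh, color/depth writes off
    glEndQuery(GL_SAMPLES_PASSED);
}

void ReadResult()
{
    // Reading back in the same frame stalls the pipeline; real code polls
    // GL_QUERY_RESULT_AVAILABLE and uses last frame's result instead,
    // which is why objects can pop in a frame late.
    glGetQueryObjectuiv(query, GL_QUERY_RESULT, &samplesVisible);
    bool visible = (samplesVisible > 0);
    (void)visible; // feeds the next frame's draw decision
}
[/code]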


[quote]I strongly encourage using AABBs; they worked really well for me.[/quote]


For object-based data that's my thought as well. I think I have a solution for generic geometry, though it's probably pretty specific to my needs (below).


[quote]I'm afraid that's not going to be generally feasible: see this UDK page for a visualization of collision objects (scroll to the bottom). In general, collision against polygon soups is - in my opinion - way too slow to be viable on a large scale. I suggest keeping the collision model fully separate from the graphics representation. In case a hit is detected and accurate collision resolution is required, bypass the broad collision detection on request with ad-hoc tests.
Having spent some time writing my system's data structure and having tried to write a physics system, I feel the need to warn you that both tasks are fairly involved. There will be a day you'll have to use a physics API; when that day comes, all the work on physics will go down the drain. I strongly encourage thinking ahead.[/quote]

I'm familiar with the link, and my solution is this: use the level outline geometry per-poly, but add collision meshes for objects if needed. I'm also thinking of going pretty CSG-heavy in the editor to merge level geometry down to a pulp, which might allow me to create approximate collision meshes for geometry that's too detailed. My whole thought, however, is to keep the amount of geometry down and either focus on textures or automatic subdivision on the chip. The main requirement here is keeping level editing times as manageable as possible, though, not a desperate desire to create an all-purpose scenegraph.

[quote]As far as I've understood, this kind of collision testing is typically performed on simplified models, perhaps using Smoothed Particle Hydrodynamics. I can tell for sure Bullet is not going to handle this in its current implementation. A small amount of debris is no problem, however. Please quantify the amount of queries/objects required.[/quote]

This is just an estimate, but I'm thinking a capped case of around 200 per second. Most of the time the number of particles will be zero, but my primary concern is framerate fluctuations when things get heavy. I cannot defer particle response for more than a few frames.

[quote]I strongly advocate against that (NOTE: rebuilding the scenegraph); if the feature is required, consider using ad-hoc entities. Just allowing the world to be natively dynamic is fairly complex and drops the chances of optimization by chunking/batching. Management will require consistent mapping/updating of constant buffers on a per-primitive basis. Worse: how is the asset pipeline going to work? I'm not sure I follow your line of thinking. I'd be extremely scared of mixing this with the lightmapper you were considering a few days ago.[/quote]

At the minimum I need entire areas of the level to be rotated or translated, screwing up any fixed geometry. I see no problem with transforming the geometry and bounding volumes on the GPU. The good thing is that this won't screw up the spatial distribution either. However, some of the more esoteric ideas I'm toying with will require destruction on a more global scale. It's possible I'll drop these thoughts, though, as they're kind of outlandish anyway.
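If it helps, the cheap middle ground between "fully static" and "rebuild every frame" is usually a refit: when a region is rotated or translated, recompute its leaf boxes from the transformed geometry and run one reverse pass to re-union the internal nodes. A sketch, assuming the tree is stored so children always come after their parents (tree quality still degrades under large rotations, so an occasional full rebuild stays necessary):

[code]
#include <algorithm>
#include <vector>

struct RefitNode {
    float mn[3], mx[3]; // AABB
    int   left, right;  // child indices, -1 on leaves
};

// Leaves are assumed already updated from the transformed geometry.
// Iterating from the back visits every child before its parent.
void Refit(std::vector<RefitNode>& nodes)
{
    for (int i = (int)nodes.size() - 1; i >= 0; --i) {
        RefitNode& n = nodes[i];
        if (n.left < 0)
            continue; // leaf
        const RefitNode& a = nodes[n.left];
        const RefitNode& b = nodes[n.right];
        for (int k = 0; k < 3; ++k) {
            n.mn[k] = std::min(a.mn[k], b.mn[k]); // parent = union of children
            n.mx[k] = std::max(a.mx[k], b.mx[k]);
        }
    }
}
[/code]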

PS - it's not a lightmapper, as I mentioned in said thread :). It works just like a real-time lightmapper, but it generates different output. Adding compatibility with deformable geometry is a bit of a headache, but it won't affect the game nearly as much as lightmapping would.





[quote]
[quote]Unreal uses imported meshes (I think they're called "static meshes") for more complex geometry and portals for spatial distribution...[/quote]
I suppose you're talking about "gameplay portals" here? What's the problem with having a unified approach?
[/quote]

No, I'm talking about imported static geometry.




My solution to automatic geometry clipping: I'm basing my editor on the concept of Areas. Each area is defined by a user-created bounding volume (built as an N-sided cylinder-type mesh from 2D base vertices plus a variable height), and these volumes make up the only static geometry in the game. No geometry can exist outside of an area, making each area essentially a collection of portals that cordon off a region of space that is considered all visible at once (that is, subject to occlusion testing and culling) as well as the only applicable geometric subset used for collision (several areas might be considered if the player is passing from one to another). Any more detailed geometry needs to be constructed and constrained inside an area, doing away with leaks and holes. The only way to exit an area is through an opening that is created automatically where two areas intersect (the user cannot clip an area mesh). A general design guideline is to follow common architectural hierarchy: a room is an area, a doorway is an area, a corridor is an area, etc. By intersecting these (even by immediate adjacency), a section of the walls of both areas is removed.

By implication, (large) open areas are not supported.

My idea is to build a scenegraph for each area and load areas dynamically as the need arises, replacing level loads with chunked streaming. By keeping areas relatively small, this creates automatic and natural portaling that allows for rough collision/visibility tree creation. It's up to the artist to make sure things don't get out of hand by keeping open space relatively constrained. I'm thinking of precalculating visibility between areas by periodic raycasting from the barycenters of a unit grid: 1) in the first pass, intersecting only area boundaries, and if visibility is found, 2) intersecting the offending rays with the geometry inside the areas to determine more precisely when and if an adjacent area is visible (e.g. if a movable mesh is intersected, it can be assumed that both areas can see each other only once the mesh has moved). Cross-area visibility is checked up to N levels or can be defined by the artist.

When rendering, area boundaries are not culled - they are assumed to be relatively low-detail (around 10-100 N-sided polys per area, which are triangulated), whereas all geometry inside the areas is assumed to be made up of meshes that have a specific bounding volume. Each mesh is intersected with other meshes to minimize hidden surfaces, but the resulting geometry is kept attached to the original mesh handles; a new bounding volume is calculated for the clipped geometry and used as a lookup for collision and visibility. Whether these volumes merit a separate tree structure is a different question. For complex surfaces a DOP approach might be the only solution, but I'm not prioritizing this at all for now - I'm just assuming relatively low geometric detail to start with.
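As a sanity check of the idea, the per-frame visibility walk over such areas could be as simple as the sketch below (all names hypothetical; a real version would also shrink the frustum to each opening's screen-space bounds before recursing, which is omitted here):

[code]
#include <vector>

struct Opening { int targetArea; /* portal polygon omitted */ };

struct Area {
    std::vector<Opening> openings; // automatic openings to adjacent areas
    bool visited = false;          // reset before each traversal
};

// Start from the area containing the camera; recurse through openings up
// to N levels, collecting areas whose contents should be culled/drawn.
void TraverseAreas(std::vector<Area>& areas, int current, int depthLeft,
                   std::vector<int>& visibleAreas)
{
    Area& a = areas[current];
    if (a.visited || depthLeft < 0)
        return;
    a.visited = true;
    visibleAreas.push_back(current);

    for (const Opening& o : a.openings)
        TraverseAreas(areas, o.targetArea, depthLeft - 1, visibleAreas);
}
[/code]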

I THINK this approach would work pretty nicely with small-to-moderate sized interior scenes and I'm not deluding myself to add outside support at this stage. If anyone can find flaws in this reasoning or has some suggestions, then now is definitely the time I'd like to hear them! :)

I just wanted to answer those questions related to my post:
* Stored linearly in that the data container's traversal time is linear.
* They render polygons to the depth buffer; then for the test (here's where I'm speculating) they compute the AABB in depth-buffer space - a screen-space rectangle x, y, w, h plus zmin, zmax - and just run over all the covered pixels as fast as possible searching for a pass. This likely uses a hierarchical optimization.
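Assuming that speculation is right, and a conventional depth convention (larger = farther), the innermost test could be as simple as scanning the rectangle for a single texel the box might be visible through (sketch, names mine):

[code]
#include <vector>

// Returns true if any texel in the rect stores a depth farther than the
// box's nearest depth, i.e. the object could be visible there; early-outs
// on the first such texel. False means fully occluded.
bool RectPassesDepthTest(const std::vector<float>& depth, int bufWidth,
                         int x, int y, int w, int h, float zmin)
{
    for (int j = y; j < y + h; ++j)
        for (int i = x; i < x + w; ++i)
            if (depth[j * bufWidth + i] > zmin)
                return true;
    return false;
}
[/code]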


[quote]This is just an estimate, but I'm thinking a capped case of around 200 per second. Most of the time the number of particles will be zero, but my primary concern is framerate fluctuations when things get heavy. I cannot defer particle response for more than a few frames.[/quote]
Ok. If the simulated objects don't clump together, I think that would be viable with full dynamics on most systems.
