How would you do frustum culling on Sponza?

Started by
7 comments, last by 215648 9 years, 11 months ago

So I just got this question in my mind: the Sponza model from Crytek has about 350 sub-models (i.e. subsets), and if we do not merge the subsets that share the same mesh type and texture, wouldn't it result in a massive number of draw calls? When we merge the subsets that use the same material, the draw calls/subsets drop from 350 to 23. But now we have the problem of frustum culling. How would you do frustum culling with this?


350 total renderables is doable even on mobile, so you could just brute-force it. In a real case where the optimization matters, you can use a heuristic for sub-mesh merging that only merges sub-meshes that are close together.

In our current project I don't merge anything; I just frustum cull everything (that is actually loaded in memory), sort the results, and draw everything with instancing. This gives me accurate culling results, and the draw call count is still manageable.
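For reference, a brute-force pass like the one described above could be sketched in C++ roughly as follows. This is only a minimal sketch, not the poster's engine code: the `Vec3`, `Plane`, `Aabb`, `isVisible` and `cullAll` names are made up for illustration, and it assumes the six frustum planes have already been extracted with their normals pointing into the frustum.

```cpp
#include <array>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Plane stored as dot(n, p) + d = 0, with n pointing into the frustum (assumption).
struct Plane { Vec3 n; float d; };

// Axis-aligned bounding box of one renderable.
struct Aabb { Vec3 min, max; };

using Frustum = std::array<Plane, 6>;

// Returns false only if the box lies completely behind at least one plane.
// The test is conservative: a box near a frustum corner can pass even
// though it is not actually visible.
inline bool isVisible(const Aabb& box, const Frustum& frustum) {
    for (const Plane& p : frustum) {
        // Pick the box corner that lies farthest along the plane normal.
        const Vec3 v{p.n.x >= 0.0f ? box.max.x : box.min.x,
                     p.n.y >= 0.0f ? box.max.y : box.min.y,
                     p.n.z >= 0.0f ? box.max.z : box.min.z};
        if (p.n.x * v.x + p.n.y * v.y + p.n.z * v.z + p.d < 0.0f) {
            return false;  // even the farthest corner is behind this plane
        }
    }
    return true;
}

// Brute force: test every renderable and return the indices that pass.
inline std::vector<std::size_t> cullAll(const std::vector<Aabb>& boxes,
                                        const Frustum& frustum) {
    std::vector<std::size_t> visible;
    visible.reserve(boxes.size());
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        if (isVisible(boxes[i], frustum)) visible.push_back(i);
    }
    return visible;
}
```

With around 350 boxes this is a few thousand multiply-adds per frame, which is why brute force is a reasonable starting point before trying anything cleverer.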

But now we have the problem of frustum culling. How would you do frustum culling with this?

Why is this a problem unique to sub-meshes? They have bounding boxes the same as regular meshes do.
The only difference at all is that if the main mesh is entirely inside the frustum, none of the sub-meshes are checked (all are added to the render queue).
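To illustrate the early-out described above, here is a hedged C++ sketch. It swaps the bounding boxes for bounding spheres purely to keep the code short, and all names (`Sphere`, `Containment`, `classify`, `Mesh`, `cullMesh`) are hypothetical. It assumes unit-length plane normals pointing into the frustum.

```cpp
#include <array>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
struct Plane { Vec3 n; float d; };   // dot(n, p) + d >= 0 is the inside half-space
struct Sphere { Vec3 c; float r; };  // bounding sphere (stand-in for a bounding box)
using Frustum = std::array<Plane, 6>;

enum class Containment { Outside, Intersecting, Inside };

// Three-way test: fully outside, straddling a plane, or fully inside.
inline Containment classify(const Sphere& s, const Frustum& f) {
    bool fullyInside = true;
    for (const Plane& p : f) {
        const float dist = p.n.x * s.c.x + p.n.y * s.c.y + p.n.z * s.c.z + p.d;
        if (dist < -s.r) return Containment::Outside;  // entirely behind this plane
        if (dist < s.r) fullyInside = false;           // crosses this plane
    }
    return fullyInside ? Containment::Inside : Containment::Intersecting;
}

struct Mesh {
    Sphere bounds;                      // bounds of the whole mesh
    std::vector<Sphere> subMeshBounds;  // one bound per sub-mesh
};

// Appends the indices of the sub-meshes to draw to renderQueue.
// Returns how many per-sub-mesh tests were actually performed.
inline std::size_t cullMesh(const Mesh& mesh, const Frustum& f,
                            std::vector<std::size_t>& renderQueue) {
    const Containment whole = classify(mesh.bounds, f);
    if (whole == Containment::Outside) return 0;  // nothing to draw
    if (whole == Containment::Inside) {
        // Main mesh entirely inside: queue every sub-mesh without testing it.
        for (std::size_t i = 0; i < mesh.subMeshBounds.size(); ++i) {
            renderQueue.push_back(i);
        }
        return 0;
    }
    // Main mesh straddles the frustum: fall back to per-sub-mesh tests.
    std::size_t tests = 0;
    for (std::size_t i = 0; i < mesh.subMeshBounds.size(); ++i) {
        ++tests;
        if (classify(mesh.subMeshBounds[i], f) != Containment::Outside) {
            renderQueue.push_back(i);
        }
    }
    return tests;
}
```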


L. Spiro


If it's one single model that the camera is inside of, there are some special-purpose tricks you could use. This is the kind of situation that FIFA/NBA games would optimize for when rendering their backgrounds.
You could split the model up into groups of polygons. First separate the central floor into one group, and then divide the rest into radial slices, like cutting a pizza.
Leave the model as "one draw", but sort all these separate groups into different contiguous regions of its index buffer.
E.g. Say there's 7 groups, the index buffer contains the indices to draw group 1, followed by the indices to draw group 2...and so on up to group 7. Sort these groups according to their angular/slice location, so if two slices are next to each other around the "pizza", they're also next to each other in the index buffer.

Now that you've pre-prepared your model in this way:
If poly-group #1,2,3,4 are visible, you issue a single draw with the indices from group1.begin to group4.end.
If poly-group 1,2,6,7 are visible, then you issue two draws - one with indices from group1.begin to group2.end, and one with indices from group6.begin to group7.end.
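The range merging described above could look something like this in C++. It is a sketch under assumptions, not the original implementation: `IndexRange` and `buildDraws` are hypothetical names, the groups are assumed to be stored back to back in the index buffer in slice order, and `visible` is assumed to hold one flag per group from an earlier culling pass.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Half-open range [begin, end) of positions in the shared index buffer.
struct IndexRange {
    std::uint32_t begin;
    std::uint32_t end;
};

// groups[i] is the index range of poly-group i; visible[i] says whether that
// group passed culling. Precondition: visible.size() == groups.size().
// Returns one range per draw call, with adjacent visible groups fused.
inline std::vector<IndexRange> buildDraws(const std::vector<IndexRange>& groups,
                                          const std::vector<bool>& visible) {
    std::vector<IndexRange> draws;
    for (std::size_t i = 0; i < groups.size(); ++i) {
        if (!visible[i]) continue;
        if (!draws.empty() && draws.back().end == groups[i].begin) {
            draws.back().end = groups[i].end;  // extend the previous draw
        } else {
            draws.push_back(groups[i]);        // start a new draw
        }
    }
    return draws;
}
```

Each returned range would then map onto one indexed draw with a start index of `begin` and an index count of `end - begin`. Note that the slices wrap around the "pizza" but the index buffer does not, so the first and last slice always end up in separate draws in this sketch.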

P.s. I totally stole this idea from DigitalFragment.

But now we have the problem of frustum culling. How would you do frustum culling with this?

Why is this a problem unique to sub-meshes? They have bounding boxes the same as regular meshes do.
The only difference at all is that if the main mesh is entirely inside the frustum, none of the sub-meshes are checked (all are added to the render queue).
L. Spiro

Suppose there are 20 sub-meshes (all using the same material and texture), 10 of which are outside the frustum. Without merging these same-type sub-meshes, we would end up with 10 extra draw calls (20 if all are visible), while once we have merged them, we can no longer do frustum culling on the individual sub-meshes. This is the situation I was asking about: what should I do? (The mesh is a level containing many objects, i.e. sub-meshes.)
what about merge and cull at the sub mesh level simultaneously? i use something like this to generate terrain chunks on the fly from underlying map data structures. A complex chunk can contain 5,000 meshes. items that pass cull get merged. then the results of the cull-merge get drawn. i also use an approach similar to Hodgman's "radial slicing" of the scene to cull entire chunks.
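A rough C++ sketch of the "cull, then merge what survives" idea, for the simple case where all sub-meshes share one vertex buffer and one material (that is an assumption made here, not something stated in the post). `SubMesh` and `mergeVisible` are hypothetical names, and `passedCull` stands in for the result of whatever frustum test ran earlier.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct SubMesh {
    std::vector<std::uint32_t> indices;  // indices into the shared vertex buffer
    bool passedCull;                     // result of an earlier frustum test
};

// Concatenates the indices of every sub-mesh that passed culling. The result
// is meant to be uploaded to a dynamic index buffer and drawn in one call.
inline std::vector<std::uint32_t> mergeVisible(const std::vector<SubMesh>& subMeshes) {
    // First pass: count, so the output is allocated only once.
    std::size_t total = 0;
    for (const SubMesh& sm : subMeshes) {
        if (sm.passedCull) total += sm.indices.size();
    }
    // Second pass: append the surviving index lists in order.
    std::vector<std::uint32_t> merged;
    merged.reserve(total);
    for (const SubMesh& sm : subMeshes) {
        if (sm.passedCull) {
            merged.insert(merged.end(), sm.indices.begin(), sm.indices.end());
        }
    }
    return merged;
}
```

The trade-off is a per-frame copy and buffer upload in exchange for fewer draw calls, so whether it wins depends on the scene and is worth measuring.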
in general i've found for questions like this, a good approach is:
1. try brute force. the Abrash and ID way. if that doesn't cut it....
2. reorganize your data into the format the pipeline likes best - then draw. this would include culls, merges, etc.
sometimes a middle ground between 1 and 2 is good enough to get the job done.
BTW, lots of draw calls is something one can live with. a complex scene in my current title can tip the scales at over 18,000 calls per render - and that's fixed function with no shaders and no instancing. state changes almost seem to be a bigger performance issue than draw calls. but both are still issues.
Also, sometimes you'll find that cull helps render times, and sometimes it doesn't help that much. it all depends on what you're drawing. so start with brute force, then try some culling, then get jiggy with it and merge, etc.

Norm Barrows


But now we have the problem of frustum culling. How would you do frustum culling with this?

Why is this a problem unique to sub-meshes? They have bounding boxes the same as regular meshes do.
The only difference at all is that if the main mesh is entirely inside the frustum, none of the sub-meshes are checked (all are added to the render queue).
L. Spiro

Suppose there are 20 sub-meshes (all using the same material and texture), 10 of which are outside the frustum. Without merging these same-type sub-meshes, we would end up with 10 extra draw calls (20 if all are visible), while once we have merged them, we can no longer do frustum culling on the individual sub-meshes. This is the situation I was asking about: what should I do? (The mesh is a level containing many objects, i.e. sub-meshes.)

I think the "cutting edge" solution to this problem would be to not merge, cull as usual, and use something like GL's MultiDraw* to submit many draw calls for distinct objects.
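As a hedged illustration of the MultiDraw* route, the sketch below only builds the two arrays that `glMultiDrawElements` expects and leaves the GL call itself in a comment, so it compiles without GL headers. `SubMeshRange`, `MultiDrawArgs` and `buildMultiDraw` are hypothetical names. It assumes all sub-meshes live in one bound vertex/index buffer pair with 32-bit indices and share the same state.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct SubMeshRange {
    std::uint32_t firstIndex;  // position of the first index in the shared index buffer
    std::uint32_t indexCount;  // number of indices in this sub-mesh
    bool visible;              // result of an earlier frustum test
};

struct MultiDrawArgs {
    std::vector<int> counts;           // GLsizei is typically a typedef for int (assumption)
    std::vector<const void*> offsets;  // byte offsets into the bound index buffer
};

// Collects one (count, byte offset) pair per visible sub-mesh.
inline MultiDrawArgs buildMultiDraw(const std::vector<SubMeshRange>& ranges) {
    MultiDrawArgs args;
    for (const SubMeshRange& r : ranges) {
        if (!r.visible) continue;
        args.counts.push_back(static_cast<int>(r.indexCount));
        const std::uintptr_t byteOffset =
            static_cast<std::uintptr_t>(r.firstIndex) * sizeof(std::uint32_t);
        args.offsets.push_back(reinterpret_cast<const void*>(byteOffset));
    }
    return args;
}

// With a GL context and the buffers bound, submission would then be one call:
//   glMultiDrawElements(GL_TRIANGLES, args.counts.data(), GL_UNSIGNED_INT,
//                       args.offsets.data(),
//                       static_cast<GLsizei>(args.counts.size()));
```

This keeps per-sub-mesh culling while issuing a single API call per material, at the cost of rebuilding two small arrays each frame.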

I'm not even sure why this needs a solution. 350 draw calls isn't huge on modern hardware; sure, it's not going to be as fast as 23, but if it gives you single-digit framerates, it's going to be because you're doing something else wrong. The bad old days of D3D9 on XPDM are behind us.

By way of experimentation, I just unbatched some old D3D9 code so that it went from 116 draw calls to 1141, and got very similar framerates on WDDM.

None of this is to say that batching is bad, and I'm certainly not claiming that you shouldn't batch. But batching is just an optimization like any other, so treat it the same way as you would all other optimizations: don't do it pre-emptively, profile your program, determine if you need it, and be careful that the extra work you're doing to build batches doesn't wipe out the perf gain you get from using them.

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

Thanks for all of the useful methods above. I'll keep your advice in mind when I'm coding.

