Can you use tessellation for GPU culling?

3 comments, last by MJP 11 years, 6 months ago
I noticed that if I use a tessellation factor of 0, the geometry just doesn't render. Does this mean that each frame I can set the view frustum parameters in an instance buffer and, in the shaders, just check the position of each instance to see whether it's in the frustum or not, and if it's not, tessellate it to 0?
You probably could, but it's not going to be as efficient as you might think.

First of all, your geometry will still need to go through all preceding stages, even if it's going to be culled. For typical scenes that can be a huge amount of geometry, certainly more than what you will actually end up drawing, so the performance impact can be significant.

Secondly, you're going to miss out on the main advantage of frustum culling in software, which is being able to take advantage of a hierarchical scene tree to trivially accept or reject huge chunks of geometry. Basically, with such a setup you know that if one node is outside of the frustum then all of its child nodes are also guaranteed to be outside, and likewise if it's fully inside (the only case where child nodes need to be tested is if their parent intersects the frustum). This lets you lop off colossal parts of your scene with no work whatsoever.

And thirdly, having extra shader stages active is always going to impose some extra overhead, as it means that all geometry needs to go through more stages.


I guess you could, but what would you gain from it? You would already have processed the vertices, and if the triangle can be frustum culled then it will get clipped anyway before rasterization. If you're actually using tessellation to amplify geometry then it's definitely a win to set the tessellation factor to zero for patches that are back-facing or outside the frustum, since this will save you from having to do extra work in the tessellator and in the domain shader stage. But if you're not actually using the tessellation stages to tessellate, then there's not really any point.
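
For reference, here's a minimal sketch of how that zero-factor trick might look in an HLSL hull shader: the patch constant function tests a bounding sphere around the patch against the frustum and outputs zero tessellation factors so the tessellator discards it. The constant buffer layout and the crude bounding-sphere construction are just assumptions for illustration, not anything prescribed in this thread.

// Six view-frustum planes, assumed to point inward (xyz = normal, w = distance).
cbuffer CullingConstants : register(b0)
{
    float4 FrustumPlanes[6];
};

struct VSOut
{
    float3 PositionWS : WORLDPOS;
};

struct PatchConstants
{
    float EdgeFactors[3] : SV_TessFactor;
    float InsideFactor   : SV_InsideTessFactor;
};

// True if the bounding sphere lies completely behind any frustum plane.
bool SphereOutsideFrustum(float3 center, float radius)
{
    [unroll]
    for (int i = 0; i < 6; ++i)
    {
        if (dot(FrustumPlanes[i].xyz, center) + FrustumPlanes[i].w < -radius)
            return true;
    }
    return false;
}

PatchConstants ConstantsHS(InputPatch<VSOut, 3> patch)
{
    PatchConstants output;

    // Crude bounding sphere around the three control points.
    float3 center = (patch[0].PositionWS + patch[1].PositionWS + patch[2].PositionWS) / 3.0f;
    float radius = max(distance(center, patch[0].PositionWS),
                   max(distance(center, patch[1].PositionWS),
                       distance(center, patch[2].PositionWS)));

    // A tessellation factor of 0 tells the tessellator to cull the patch entirely.
    float factor = SphereOutsideFrustum(center, radius) ? 0.0f : 8.0f;
    output.EdgeFactors[0] = factor;
    output.EdgeFactors[1] = factor;
    output.EdgeFactors[2] = factor;
    output.InsideFactor   = factor;
    return output;
}

[domain("tri")]
[partitioning("integer")]
[outputtopology("triangle_cw")]
[outputcontrolpoints(3)]
[patchconstantfunc("ConstantsHS")]
VSOut MainHS(InputPatch<VSOut, 3> patch, uint i : SV_OutputControlPointID)
{
    return patch[i];
}

As said above, this only pays off if the tessellator and domain shader are doing real work that the zero factor lets you skip.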

Well, I got the idea after I saw this guy http://rastergrid.co...ometry-shaders/ using a geometry shader to do culling. Weird thing is, samples with stream out always run quite slowly on my PC. I know it's not a very high end GPU (ATI Mobility Radeon HD 5650), but so far from what I've seen, CPU frustum culling is more efficient than using the GPU to do it, or am I missing something? I think I need a vote of confidence before I decide which method to try to implement. Using a scene graph would definitely be less of a headache, but if there's an efficient GPU culling technique out there, I'd be glad to learn about it.
Geometry shaders in general are typically not very fast, and stream out can make it worse because of all the memory traffic. IMO it's a dead end if you're interested in scene traversal/culling on the GPU. Instead I would recommend trying a compute shader that performs the culling, and then fills out a buffer with "DrawInstancedIndirect" or "DrawIndexedInstancedIndirect" arguments based on the culling results. I'd suspect that could actually be really efficient if you're already using a lot of instancing.
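
A very rough sketch of what that compute shader could look like in HLSL is below. The instance layout, register assignments, and buffer names are made up for the example; the idea is just that each thread tests one instance against the frustum, survivors are compacted into a second instance buffer, and InstanceCount in the indirect arguments is bumped atomically.

// Per-instance data as uploaded by the CPU (layout is an assumption).
struct InstanceData
{
    float4x4 World;
    float3   BoundsCenter;
    float    BoundsRadius;
};

cbuffer CullingConstants : register(b0)
{
    float4 FrustumPlanes[6];   // xyz = plane normal (pointing inward), w = distance
    uint   NumInstances;
};

StructuredBuffer<InstanceData>   AllInstances     : register(t0);
RWStructuredBuffer<InstanceData> VisibleInstances : register(u0);

// Five UINTs matching the DrawIndexedInstancedIndirect argument layout:
// IndexCountPerInstance, InstanceCount, StartIndexLocation, BaseVertexLocation, StartInstanceLocation.
RWByteAddressBuffer IndirectArgs : register(u1);

// True if the bounding sphere lies completely behind any frustum plane.
bool SphereOutsideFrustum(float3 center, float radius)
{
    [unroll]
    for (int i = 0; i < 6; ++i)
    {
        if (dot(FrustumPlanes[i].xyz, center) + FrustumPlanes[i].w < -radius)
            return true;
    }
    return false;
}

[numthreads(64, 1, 1)]
void CullCS(uint3 dispatchID : SV_DispatchThreadID)
{
    uint instanceIndex = dispatchID.x;
    if (instanceIndex >= NumInstances)
        return;

    InstanceData instance = AllInstances[instanceIndex];
    if (SphereOutsideFrustum(instance.BoundsCenter, instance.BoundsRadius))
        return;

    // Grab a slot in the compacted buffer by atomically incrementing InstanceCount,
    // which lives at byte offset 4 of the indirect arguments.
    uint slot;
    IndirectArgs.InterlockedAdd(4, 1, slot);
    VisibleInstances[slot] = instance;
}

On the CPU side you would reset InstanceCount to zero and fill in the other four arguments each frame, Dispatch the culling shader, bind VisibleInstances as a shader resource for the draw pass, and then issue DrawIndexedInstancedIndirect with the arguments buffer.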

In general you don't want to draw broad conclusions like "the CPU is better than the GPU for frustum culling", because it's actually a complex problem with a lot of variables. Whether or not it's worth it to try doing culling on the GPU will depend on things like:

  • Complexity of the scene in terms of number of meshes and materials
  • What kind of CPU/GPU you have
  • How much frame time is available on the CPU vs. GPU
  • What feature level you're targeting
  • How much instancing you use
  • Whether or not you use any spatial data structures that could possibly accelerate culling
  • How efficiently you implement the actual culling on the CPU or GPU

One thing that can really tip the scales here is that currently, even with DrawInstancedIndirect, there's no way to avoid the CPU overhead of draw calls and binding states/textures if you perform culling on the GPU. This is why I mentioned that it would probably be more efficient if you use a lot of instancing, since your CPU overhead will be minimal. Another factor that can play into this heavily is if you want to perform some calculations on the GPU that determine the parameters of a view or projection used for rendering, for instance something like Sample Distribution Shadow Maps. In that case, performing culling on the GPU would avoid having to read back results from the GPU onto the CPU.
