D3D11 frustum culling on the GPU

Started by
12 comments, last by Dingleberry 7 years, 7 months ago

I'm going to implement frustum culling on the GPU (I'm not sure yet whether to use a geometry shader or a compute shader).

The plan is to first fill a consume structured buffer with the objects, cull them, and output the survivors to an append structured buffer.

I'm not sure whether this is efficient, and I don't know the details of how to use append/consume buffers.

Is there any sample I can learn from?


Um...

I think you may be missing the point of frustum culling...

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

Um...

I think you may be missing the point of frustum culling...

what point?

You can take a look at http://www.wihlidal.ca/Presentations/GDC_2016_Compute.pdf

"Some people use Singletons, some people are Simpletons." - Bill Gates
"Yum yum, I luv Cinnabon." - Mahatma Gandhi

The world matrices of the static instances are stored in a cbuffer, and I don't want to frustum cull these static objects and then modify that cbuffer, so I think culling them on the GPU and drawing them indirectly may be better.

FWIW, GPU based scene traversal and culling is a state of the art engine design topic.

I've been working in graphics engines for 10 years and it's the kind of thing that would cause me to sit down for a solid week of planning. There's a bunch of GDC presentations from people who are currently doing it, but you're not going to find a tutorial that will hold your hand through it yet.

The short version though -- you're going to want to merge as much of your pipeline state (fixed function / shaders) and resources as possible. That means using texture atlases, texture arrays, and giant buffers that hold geometry for many meshes at once. This will let you reduce the draw count substantially. Then you're going to want to split every mesh into many smaller clusters, which are associated with different culling structures such as bounding volumes and normal cones for backface culling. Then you write a CS to cull your clusters and produce a list of visible clusters. Then you compact that list. Then you write a CS to step inside each cluster and cull the triangles that it's made of and produce a list of visible triangles, and then compact that list. Then you use draw-indirect to draw your list of visible triangles.
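A rough CPU-side reference for one step of that pipeline can make the data flow easier to follow. The sketch below covers only the cluster-level backface test with a normal cone, followed by compaction of the surviving clusters' indices and the arguments a single indirect draw would read. The `Cluster` layout, the function names, and the simplified cone test (it uses the cluster center only, ignores the cluster's spatial extent, and assumes a cone half-angle below 90 degrees) are assumptions for illustration, not taken from any of the presentations mentioned. In a real implementation this loop would be a compute shader.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Hypothetical cluster record: a center plus a cone that bounds all triangle normals.
struct Cluster {
    Vec3 center;
    Vec3 cone_axis;        // unit length
    float cone_sin;        // sine of the cone half-angle (half-angle < 90 degrees)
    uint32_t first_index;  // range of this cluster in the shared index buffer
    uint32_t index_count;
};

// Same field order as the argument buffer read by DrawIndexedInstancedIndirect.
struct DrawIndexedArgs {
    uint32_t index_count_per_instance;
    uint32_t instance_count;
    uint32_t start_index_location;
    int32_t  base_vertex_location;
    uint32_t start_instance_location;
};

// True when every normal in the cone points away from the eye.
// Simplification: only the cluster center is considered, not its extent.
bool cluster_backfacing(const Cluster& c, Vec3 eye) {
    Vec3 v{c.center.x - eye.x, c.center.y - eye.y, c.center.z - eye.z};
    float len = std::sqrt(dot(v, v));
    return len > 0.0f && dot(c.cone_axis, v) >= c.cone_sin * len;
}

// Copies the indices of surviving clusters into one compacted buffer and
// returns the arguments for a single indirect draw over that buffer.
DrawIndexedArgs cull_and_compact(const std::vector<Cluster>& clusters,
                                 const std::vector<uint32_t>& indices,
                                 Vec3 eye,
                                 std::vector<uint32_t>& compacted) {
    compacted.clear();
    for (const Cluster& c : clusters) {
        assert(c.first_index + c.index_count <= indices.size());
        if (cluster_backfacing(c, eye)) continue;
        compacted.insert(compacted.end(),
                         indices.begin() + c.first_index,
                         indices.begin() + c.first_index + c.index_count);
    }
    return {static_cast<uint32_t>(compacted.size()), 1u, 0u, 0, 0u};
}
```

Frustum and per-triangle culling would slot into the same loop as additional rejection tests; they are left out here to keep the sketch short.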

Sorry to say, I'm just looking for an implementation of AABB frustum culling on the GPU.

Something like this:


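// Per-instance data; the AABB is assumed to be stored in world space.
struct InstData
{
    matrix world_mat;
    float4 aabb_min; // xyz used
    float4 aabb_max; // xyz used
};

// 1000 x 96 bytes is over the 64 KB cbuffer limit, so a structured buffer is used instead.
StructuredBuffer<InstData> insts : register(t0);

cbuffer CB_Cull : register(b0)
{
    float4 frustum_planes[6]; // assumed: xyz = inward normal, w = distance
    uint   obj_count;         // number of ids stored in id_of_all_objs
};

AppendStructuredBuffer<uint>  id_of_visible  : register(u0);
ConsumeStructuredBuffer<uint> id_of_all_objs : register(u1);

// Returns false only if the box is fully behind one of the planes.
bool inFrustum(float3 bmin, float3 bmax)
{
    for (uint i = 0; i < 6; ++i)
    {
        float4 p = frustum_planes[i];
        float3 v = float3(p.x >= 0 ? bmax.x : bmin.x,
                          p.y >= 0 ? bmax.y : bmin.y,
                          p.z >= 0 ? bmax.z : bmin.z);
        if (dot(p.xyz, v) + p.w < 0) return false;
    }
    return true;
}

[numthreads(64, 1, 1)]
void cs_cull(uint3 thread_id : SV_DispatchThreadID)
{
    if (thread_id.x >= obj_count) return; // never consume more ids than were stored
    uint id = id_of_all_objs.Consume();
    if (inFrustum(insts[id].aabb_min.xyz, insts[id].aabb_max.xyz))
        id_of_visible.Append(id);
}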
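
For checking a shader like this against known input, a CPU reference of the same AABB-versus-frustum test can be useful. The sketch below assumes six planes stored as an inward-pointing normal plus a distance, and a world-space AABB; the type and function names are made up for illustration. It is the common conservative test that rejects a box only when it lies completely behind one plane, so a few boxes near frustum corners may still be reported visible.

```cpp
#include <array>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Plane in the form dot(n, p) + d = 0, with the normal pointing into the frustum.
struct Plane { Vec3 n; float d; };

struct Aabb { Vec3 min, max; };

// Returns false only when the box lies entirely behind at least one plane.
bool aabb_in_frustum(const Aabb& b, const std::array<Plane, 6>& planes) {
    for (const Plane& p : planes) {
        // Pick the box corner that lies farthest along the plane normal.
        Vec3 v{p.n.x >= 0.0f ? b.max.x : b.min.x,
               p.n.y >= 0.0f ? b.max.y : b.min.y,
               p.n.z >= 0.0f ? b.max.z : b.min.z};
        if (p.n.x * v.x + p.n.y * v.y + p.n.z * v.z + p.d < 0.0f) return false;
    }
    return true;
}

// CPU stand-in for the compute pass: push_back plays the role of Append().
std::vector<uint32_t> cull(const std::vector<Aabb>& boxes,
                           const std::array<Plane, 6>& planes) {
    std::vector<uint32_t> visible;
    for (uint32_t i = 0; i < boxes.size(); ++i)
        if (aabb_in_frustum(boxes[i], planes)) visible.push_back(i);
    return visible;
}
```

On the D3D11 side, the hidden counter of the append buffer can be copied into an indirect-arguments buffer with `ID3D11DeviceContext::CopyStructureCount`, so the visible count never has to be read back to the CPU before `DrawInstancedIndirect` or `DrawIndexedInstancedIndirect`. Note that the order of ids in an append buffer is not deterministic.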
  

Um...

I think you may be missing the point of frustum culling...

what point?

Usually, to save tons of CPU work, to maintain streaming, and to manage memory. Frustum culling on the CPU is quite efficient, and you might not get any benefit from doing it on the GPU unless you have a very specific case.

Wow! I think I'm going about this the wrong way...

Not sending a ton of unused data to the GPU is another part of it. It's less of an issue than it was a few years ago, but it's still pretty relevant. I guess I could see doing a sloppy CPU cull and then doing cleanup and clipping on the GPU, but I can't imagine that it would be more effective to upload render instructions for every object in the scene and then leave the GPU to sort it all out.


