Octree Occlusion Culling?

Started by
4 comments, last by 3TATUK2 10 years, 5 months ago

I've implemented rudimentary octree occlusion culling... Under heavy load, performance is about 2x what I get without it... But it makes light loads much slower (big overhead, I guess)

It was a nightmare to even get working... And I'm guessing there are still many tweaks that could be applied to improve performance

I almost think I overcomplicated the entire thing and might entirely scrap it and restart...

On a 50K poly map, I get about 100 FPS

A 200K map in quake easily runs at 200+ FPS >_<

Does anyone have any good material on render culling? As specific to "GL octree occlusion" as possible ;) Thanks


You could always optimize your octree.

If the scene is small, for example, could you use larger octree cells, perhaps only a single one (if the scene is really small, effectively using no octree at all)?

Or maybe there's something in the octree implementation, or in how you modify/access it, that's making it less performant.

Try profiling it, I guess.
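The idea of sizing cells to the scene could be sketched roughly like this. The function name, the `target_per_leaf` parameter and the even-distribution assumption are all hypothetical, not something from the implementation discussed here; it only picks the smallest depth at which a leaf would hold about `target_per_leaf` triangles.

```c
#include <assert.h>

/* Hypothetical helper: choose a subdivision depth so that each leaf would
   hold roughly target_per_leaf triangles, ASSUMING the triangles are
   spread evenly through the scene. Depth 0 means "no subdivision". */
int choose_octree_depth(long triangle_count, long target_per_leaf, int max_depth)
{
    assert(triangle_count >= 0 && target_per_leaf > 0 && max_depth >= 0);

    int depth = 0;
    long leaves = 1; /* a tree of depth d has 8^d leaves */

    /* Subdivide once more while the leaves would still be over budget. */
    while (depth < max_depth && triangle_count > leaves * target_per_leaf) {
        leaves *= 8;
        depth++;
    }
    return depth;
}
```

With a budget of 1000 triangles per leaf, a 500-triangle scene gets depth 0 (effectively no octree), a 50K scene gets depth 2 and a 200K scene gets depth 3. Real maps are not evenly distributed, so this is only a starting point to profile against.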

o3o

What kind of algorithm are you using for occlusion determination?

Cheers!

waterlimon, that's a good idea: basing the octree depth on geometry complexity/size (density). There are a couple of faults in my implementation which could definitely be optimized. For example, I think it needs at least one recursive subdivision, and it has a small maximum depth of about 6. It also uses a huge amount of memory, which grows exponentially with depth (a depth of 6 takes something like 500MB).

kauna, I'm using GL occlusion queries...
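To put a number on the exponential growth, here is a small sketch that counts the nodes in a fully subdivided octree. The function names and the `bytes_per_node` parameter are made up for illustration; the actual per-node size in the implementation above is not known.

```c
#include <assert.h>
#include <stddef.h>

/* Number of nodes in a fully subdivided octree of the given depth
   (depth 0 = root only): 1 + 8 + 8^2 + ... + 8^depth. */
unsigned long long full_octree_node_count(int depth)
{
    assert(depth >= 0 && depth <= 20); /* keeps 8^depth inside 64 bits */

    unsigned long long total = 0;
    unsigned long long level = 1; /* nodes on the current level */
    for (int d = 0; d <= depth; d++) {
        total += level;
        level *= 8;
    }
    return total;
}

/* Memory for the node structs alone, for a hypothetical node size. */
unsigned long long full_octree_bytes(int depth, size_t bytes_per_node)
{
    return full_octree_node_count(depth) * (unsigned long long)bytes_per_node;
}
```

A full tree of depth 6 has 299,593 nodes, and every extra level multiplies that by roughly 8. If the 500MB figure is for such a tree, that averages out to well over a kilobyte per node, which would suggest the per-node payload matters more than the node struct itself, but that is a guess without seeing the rest of the code.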

This is my octree struct:


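#include <stdbool.h>

/* HxrTrianglePlaneList is defined elsewhere in the project. */
typedef struct HxrOctree
{
    bool base;
    float left, right, bottom, top, back, front;  /* axis-aligned bounds of this node */
    struct HxrOctree* leaves[8];                  /* the eight child nodes */
    HxrTrianglePlaneList trianglePlanes;          /* per-node triangle data */
} HxrOctree;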

One quick thing I just noticed is that left/right/bottom/top/back/front are unnecessary.. I could just use a centerpoint and an extent value.. also possibly change HxrOctree*[8] to HxrOctree**

Actually, on second thought... the HxrOctree** would take the same memory, because I would still have to malloc those 8... and using a centerpoint would trade smaller memory size for slower performance, so maybe not worth it.
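For reference, the centerpoint-plus-extent layout mentioned above might look like the sketch below. It assumes cubic nodes (one half-extent for all three axes), and `HxrCubeBounds`, `cube_to_minmax` and `cube_child` are hypothetical names, not part of the original code.

```c
#include <assert.h>

/* Hypothetical compact bounds: 4 floats instead of 6.
   ASSUMES every node is a cube, so one half-extent covers all axes. */
typedef struct {
    float cx, cy, cz; /* center of the node */
    float half;       /* half the edge length */
} HxrCubeBounds;

/* Recover the min/max corners when a test needs them. */
void cube_to_minmax(const HxrCubeBounds *b, float min[3], float max[3])
{
    min[0] = b->cx - b->half;  max[0] = b->cx + b->half;
    min[1] = b->cy - b->half;  max[1] = b->cy + b->half;
    min[2] = b->cz - b->half;  max[2] = b->cz + b->half;
}

/* Derive a child's bounds from its parent and octant index 0..7.
   Bit 0 selects +x, bit 1 selects +y, bit 2 selects +z. */
HxrCubeBounds cube_child(const HxrCubeBounds *parent, int octant)
{
    assert(octant >= 0 && octant < 8);

    float q = parent->half * 0.5f; /* child half-extent */
    HxrCubeBounds c;
    c.cx = parent->cx + ((octant & 1) ? q : -q);
    c.cy = parent->cy + ((octant & 2) ? q : -q);
    c.cz = parent->cz + ((octant & 4) ? q : -q);
    c.half = q;
    return c;
}
```

The extra cost is a few adds per node visit. Whether that outweighs the smaller node is something only profiling would show; since child bounds can be derived during traversal, they would not strictly need to be stored at all.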

Do you allocate the nodes individually or do you store them in a contiguous array?

The latter would be preferable, to take advantage of the cache.

You might also consider using a sparse octree instead, to get the memory usage down to sane levels. It might even improve performance, since you have fewer nodes and thus they are more likely to fit in cache. Of course, it will have some overhead when you need to create or destroy nodes.
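Both suggestions can be combined. The following is a minimal sketch, not the poster's implementation: nodes live in one growable array, children are referenced by index instead of pointer, and a child slot stays empty until something needs it. `PoolNode`, `NodePool`, `HXR_NO_CHILD` and the `first_tri`/`tri_count` payload fields are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define HXR_NO_CHILD (-1) /* marks an octant that was never populated */

typedef struct {
    int32_t child[8];   /* index into the pool, or HXR_NO_CHILD */
    int32_t first_tri;  /* hypothetical payload: range in a triangle array */
    int32_t tri_count;
} PoolNode;

typedef struct {
    PoolNode *nodes;    /* one contiguous block for the whole tree */
    int32_t count, capacity;
} NodePool;

/* Append one empty node; returns its index or HXR_NO_CHILD on failure. */
int32_t pool_alloc(NodePool *p)
{
    if (p->count == p->capacity) {
        int32_t new_cap = p->capacity ? p->capacity * 2 : 64;
        PoolNode *grown = realloc(p->nodes, (size_t)new_cap * sizeof(PoolNode));
        if (!grown) return HXR_NO_CHILD;
        p->nodes = grown;
        p->capacity = new_cap;
    }
    PoolNode *n = &p->nodes[p->count];
    for (int i = 0; i < 8; i++) n->child[i] = HXR_NO_CHILD;
    n->first_tri = 0;
    n->tri_count = 0;
    return p->count++;
}

/* Create a child only when it is actually needed (the "sparse" part). */
int32_t pool_get_or_create_child(NodePool *p, int32_t parent, int octant)
{
    assert(parent >= 0 && parent < p->count && octant >= 0 && octant < 8);

    int32_t existing = p->nodes[parent].child[octant];
    if (existing != HXR_NO_CHILD) return existing;

    /* realloc may move the array, so index through p->nodes afterwards
       rather than holding a PoolNode pointer across this call. */
    int32_t idx = pool_alloc(p);
    if (idx != HXR_NO_CHILD) p->nodes[parent].child[octant] = idx;
    return idx;
}

void pool_free(NodePool *p)
{
    free(p->nodes);
    p->nodes = NULL;
    p->count = p->capacity = 0;
}
```

Using 32-bit indices instead of pointers also halves the size of the child table on a 64-bit build, and indices stay valid when the array grows, which pointers would not.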

o3o

Very good points. The nodes are allocated individually rather than in a contiguous array, and the tree is not "sparse". By "sparse" I guess you mean not populating nodes which have nothing inside them...

I will attempt these. Thank you very much!

Also, things like the quake engine use precomputed PVS... which I think is probably considerably harder than just using per-frame testing... and also probably uses more memory.

I may consider this, but would have to think more in depth about it for a while... For example, what would the general method be?

Something like: for each node, render 6 camera shots, along both directions of all three axes, and then put the visible sub-nodes into its PVS list? Sounds very CPU expensive... I bet they even bake the PVS into the file itself instead of generating it at load time
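However the visibility ends up being computed, the result can be stored compactly as one bit per pair of nodes. The sketch below shows only that storage side; `PvsTable` and its functions are hypothetical names, and it says nothing about how a real engine fills the table.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical PVS storage: row `from` is a bit vector with one bit per
   node, set if that node is potentially visible from `from`. */
typedef struct {
    int node_count;
    size_t row_bytes; /* bytes per row, rounded up to whole bytes */
    uint8_t *bits;    /* node_count rows, all initially zero */
} PvsTable;

/* Returns 1 on success, 0 if allocation failed. */
int pvs_init(PvsTable *t, int node_count)
{
    assert(node_count > 0);
    t->node_count = node_count;
    t->row_bytes = ((size_t)node_count + 7) / 8;
    t->bits = calloc((size_t)node_count, t->row_bytes);
    return t->bits != NULL;
}

void pvs_set_visible(PvsTable *t, int from, int to)
{
    assert(from >= 0 && from < t->node_count && to >= 0 && to < t->node_count);
    t->bits[(size_t)from * t->row_bytes + (size_t)to / 8] |= (uint8_t)(1u << (to % 8));
}

int pvs_is_visible(const PvsTable *t, int from, int to)
{
    assert(from >= 0 && from < t->node_count && to >= 0 && to < t->node_count);
    return (t->bits[(size_t)from * t->row_bytes + (size_t)to / 8] >> (to % 8)) & 1;
}

void pvs_free(PvsTable *t)
{
    free(t->bits);
    t->bits = NULL;
}
```

At run time the lookup is just a bit test for the node the camera is in, which is why a precomputed set can be so cheap per frame. Uncompressed, 1,000 nodes need 125 KB, and the size grows with the square of the node count, so this only makes sense for a modest number of nodes (or with compression).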

