Octree Occlusion Culling?


I've implemented rudimentary octree occlusion culling... Under heavy load it gives about a 2x performance improvement over not using it... but it makes light loads much slower (big overhead, I guess).


It was a nightmare even to get working... and I'm guessing there are still many tweaks that could be applied to improve performance.


I almost think I overcomplicated the entire thing and might entirely scrap it and restart...


On a 50K poly map, I get about 100 FPS


A 200K-poly map in Quake easily runs at 200+ FPS >_<


Does anyone have any good material on render culling? As specific to "GL octree occlusion" as possible ;) Thanks


You could always optimize your octree.


If the scene is small, for example, could you use larger octree cells, perhaps only a single one? (If the scene is really small, that's effectively no octree at all.)


Or maybe there's something in the octree implementation, or in how you modify/access it, that's making it less performant.


Try profiling it, I guess.


waterlimon, that's a good idea: basing the octree depth on the geometry's complexity/size (density). There are a couple of faults in my implementation that could definitely be optimized. For example, I think it currently requires at least one recursive subdivision, and it has a small maximum depth of about 6... It also uses an exponentially growing amount of memory (a depth of 6 takes something like 500 MB).


kauna, I'm using GL occlusion queries...
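For reference, the basic GL occlusion-query flow looks roughly like this. This is a sketch, not the poster's code: it assumes a current GL context, and `drawNodeBounds()`/`drawNodeContents()` are hypothetical helpers that render a node's bounding box and its real geometry.

```c
/* Sketch of a per-node occlusion test with GL occlusion queries.
 * Assumes a current GL context; drawNodeBounds()/drawNodeContents()
 * are hypothetical helpers, not from the thread. */
GLuint query;
GLuint samples = 0;

glGenQueries(1, &query);

glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE); /* don't write color */
glDepthMask(GL_FALSE);                               /* or depth */

glBeginQuery(GL_SAMPLES_PASSED, query);
drawNodeBounds(node);                                /* cheap proxy geometry */
glEndQuery(GL_SAMPLES_PASSED);

glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glDepthMask(GL_TRUE);

/* Note: reading the result immediately stalls the pipeline; real code
 * should poll GL_QUERY_RESULT_AVAILABLE or reuse last frame's result. */
glGetQueryObjectuiv(query, GL_QUERY_RESULT, &samples);
if (samples > 0)
    drawNodeContents(node);                          /* visible: draw for real */

glDeleteQueries(1, &query);
```

The synchronous readback at the end is one plausible source of the low-load overhead mentioned earlier: with few polygons, the GPU stalls waiting on query results cost more than the culling saves.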


This is my octree struct:


typedef struct HxrOctree {
    bool base;
    float left, right, bottom, top, back, front;
    struct HxrOctree* leaves[8];
    HxrTrianglePlaneList trianglePlanes;
} HxrOctree;

One quick thing I just noticed is that left/right/bottom/top/back/front are unnecessary... I could just use a center point and an extent value. I could also possibly change HxrOctree* leaves[8] to HxrOctree**.


Actually, on second thought... the HxrOctree** would take the same memory, because I would still have to malloc those 8 pointers... and using a center point would trade a smaller memory size for slower performance, so maybe it's not worth it.

Edited by 3TATUK2


Do you allocate the nodes individually or do you store them in a contiguous array?


The latter would be preferable, to take advantage of the cache.


You might also consider using a sparse octree instead, to get the memory usage down to sane levels. It might even improve performance, since you'd have fewer nodes and they'd be more likely to fit in cache. Of course, it will add some overhead whenever you need to create or destroy nodes.


Very good points. It's individually allocated rather than contiguous, and not "sparse". By "sparse" I guess you mean: don't allocate nodes that have nothing inside them...


I will attempt these. Thank you very much!


Also, engines like Quake use a precomputed PVS... which I think is probably considerably harder than just doing per-frame testing... and it probably uses more memory, too.


I may consider this, but would have to think more in depth about it for a while... For example, what would the general method be?

Something like: for each node, render six camera shots along both directions of all three axes, and then put the visible sub-nodes into its PVS list? Sounds very CPU-expensive... I bet they even bake the PVS into the map file itself instead of generating it at load time.

