Occlusion Culling + Octree

16 comments, last by duhroach 21 years, 6 months ago
quote:
However, I found that a lot of cards won't give you access to the z-buffer surface unless you only run in 16-bit modes. (DX8)

That's weird. I don't know too much about D3D, but in OpenGL you can get full access to the 24-bit depth buffer in all screen modes. I would guess that D3D allows the same functionality?

quote:
AFAIK only some HP gfx cards support occlusion querying so far,

All GeForce3+ cards support hardware occlusion queries. So does the new Radeon.

quote:
So that's why I thought of the idea of rendering the occluders to the z-buffer first, then the scene front to back, and thus taking advantage of the early z-buffer reject.

That's not true occlusion culling. In fact, you aren't culling anything; you are just rejecting pixels. A speedup of 30% is nice, but with true occlusion culling you can easily get speedups of 1000% or even more (depending on your scene). So the performance hit of reading back the z-buffer (or SW rendering it) might very well be worth it. And keep in mind that we are talking about a fallback option here: on newer 3D cards, the whole process is HW assisted.
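To make the distinction concrete, here is a minimal Python sketch of the "occluders first, then front to back" ordering described in the quote. The object layout (`position`, `is_occluder`) and the function name `draw_order` are assumptions made up for illustration, not anything from either poster's engine.

```python
def draw_order(objects, camera):
    """Return objects in submission order: occluders first, then everything
    else, each group sorted nearest-first so the early z reject can discard
    hidden pixels. Nothing is removed from the list."""
    def dist_sq(obj):
        # Squared distance is enough for sorting and avoids a sqrt.
        return sum((p - c) ** 2 for p, c in zip(obj["position"], camera))

    occluders = [o for o in objects if o["is_occluder"]]
    others = [o for o in objects if not o["is_occluder"]]
    return sorted(occluders, key=dist_sq) + sorted(others, key=dist_sq)
```

Note that the returned list has the same length as the input: every object is still submitted and transformed, which is why this only saves fill rate rather than geometry work.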

I just ran a little test on our engine. I forced fallback mode (SW rendering on the CPU (hand-optimized ASM), 500 occlusion faces max., 128² occlusion map, non-hierarchic) and timed a test scene (5 million faces). Without occlusion culling, I have around 1.25M faces in the view. With the culling, it gets down to 155k. That's about 8 times fewer.
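As an illustration of the fallback path, here is a minimal Python sketch of testing an object against a software occlusion map. It is not the engine code mentioned above: `make_depth_map`, `draw_occluder_rect` and `is_occluded` are hypothetical names, occluders are simplified to axis-aligned screen rectangles at constant depth, and smaller depth values are assumed to mean "nearer to the camera".

```python
import math

SIZE = 128  # occlusion map resolution, matching the 128x128 map in the post


def make_depth_map(size=SIZE):
    """Empty map: every pixel starts at 'infinitely far'."""
    return [[math.inf] * size for _ in range(size)]


def draw_occluder_rect(depth_map, x0, y0, x1, y1, depth):
    """Stand-in for rasterising an occluder face: fill an axis-aligned
    screen rectangle (inclusive bounds), keeping the nearest depth."""
    for y in range(y0, y1 + 1):
        row = depth_map[y]
        for x in range(x0, x1 + 1):
            row[x] = min(row[x], depth)


def is_occluded(depth_map, x0, y0, x1, y1, nearest_depth):
    """Conservative test: the object is hidden only if every pixel of its
    screen-space bounding rectangle holds an occluder that is strictly
    nearer than the object's nearest point."""
    for y in range(y0, y1 + 1):
        row = depth_map[y]
        for x in range(x0, x1 + 1):
            if row[x] >= nearest_depth:
                return False  # uncovered pixel, or occluder is behind the object
    return True


depth_map = make_depth_map()
draw_occluder_rect(depth_map, 20, 20, 100, 100, depth=10.0)  # one big wall
print(is_occluded(depth_map, 40, 40, 60, 60, nearest_depth=25.0))    # True
print(is_occluded(depth_map, 90, 90, 110, 110, nearest_depth=25.0))  # False
```

The second query fails because part of its rectangle sticks out past the wall. Objects that pass the test are skipped entirely, which is where the drop in face count comes from.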

There is perhaps one additional thing to mention: the results of occlusion culling *highly* depend on the nature of your scene geometry. If you are already using a portal engine in a closed environment (Quake style), then occlusion culling won't be very effective at all, and the overhead might outweigh the benefits. But on complex, open and/or outdoor scenes with lots of large (but complex) features (mountains, buildings, ...), occlusion culling can be incredibly efficient.

/ Yann

[edited by - Yann L on October 2, 2002 8:13:26 AM]
Hi,
I can't copy-paste quotes because I'm using a *gasp* WebTV to get to the internet at the moment. (Should have brought my computer with me.)

So :-

"Getting the Zbuffer in 32 bits." Yup, I also found OpenGL could do this, but not DX8! (What a pain.)

"GeForce 3+ have occlusion culling." Didn't know that, but at the time of my research I only had a GeForce2. But that seems good.

However, the other points are valid, but developing a view-culling architecture is not that easy in my situation, since it has to run on consoles (the PC is just a test bed for design ideas). As such, some things which are dead quick on a PC are very slow on a PS2 (which has no L2 cache, for instance).

As for the speed increase, it would not be that high for our current system as it is a heavily portaled system. But I'm researching ways to remove portal systems, even though they are good, because it can be very time-consuming to generate "correct" geometry and portals. Especially if it's artists doing most of the work!

Anyhow an interesting thread.

(If only I could get to a proper computer. Damn, been ill.)

Heya guys, thanks for the input on the topic.
I've got the theory down, now to just figure out its code implementation.

thanks again!
~Main

==
Colt "MainRoach" McAnlis
Programmer
www.badheat.com/sinewave
==
Colt "MainRoach" McAnlis
Graphics Engineer - http://mainroach.blogspot.com

quote:
"Getting the Zbuffer in 32 bits." Yup, I also found OpenGL could do this, but not DX8! (What a pain.)

I didn't know that. Quite surprising that DX8 lacks such an important feature (e.g. for image compositing).

quote:
However, the other points are valid, but developing a view-culling architecture is not that easy in my situation, since it has to run on consoles (the PC is just a test bed for design ideas). As such, some things which are dead quick on a PC are very slow on a PS2 (which has no L2 cache, for instance).

OK, in that case you'll obviously be more limited in your choice. I guess that SW rendering the occlusion map is not an option on a PS2. But I don't know enough about the architecture and performance considerations of your target consoles to really be of any help here.

quote:
As for the speed increase, it would not be that high for our current system as it is a heavily portaled system. But I'm researching ways to remove portal systems, even though they are good, because it can be very time-consuming to generate "correct" geometry and portals. Especially if it's artists doing most of the work!

Yep, I've also come to a point where portals are no longer an option. The concept is nice, but besides the drawbacks you mentioned, they are just too restrictive in the type of geometry they accept. The good news is that they can definitely be replaced by a good occlusion culling system, which will yield similar culling results, but on a wider range of geometry (no cells/portals required) and without preprocessing.

/ Yann
Why don't you guys think about PVS?

PVS is a precomputation algorithm, i.e. you have to construct your PVS before you can render your map. So it could only work in portal/BSP/cell type engines.

Outdoor map formats (octree/heightmap) can't really do a PVS calculation, as there's no "cap" on where we should stop saying "I can see you." This is because there are no "rooms" in outdoor formats. If you've got 2 miles' worth of polygons, PVS would tell you that you can see every polygon in that 2-mile radius, so we render all of them.

On another note: going over the HOM method of things, couldn't we use OpenGL's mipmap generation to create our Z pyramid of hierarchy images? And if we can, can we access them the same as if we made them ourselves?
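One caveat on the mipmap idea, sketched in Python below under stated assumptions: standard mipmap generation typically box-filters (averages) each 2x2 block, whereas a conservative Z pyramid needs the farthest depth of each block. The function names `build_z_pyramid` and `average_level` are hypothetical, the depth image is assumed square with a power-of-two side, and larger values are assumed to mean farther away.

```python
def build_z_pyramid(depth):
    """Return a list of levels, level 0 being the full-resolution depth
    image. Each coarser texel stores the MAX (farthest) of its 2x2 block."""
    levels = [depth]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        half = len(prev) // 2
        levels.append([
            [max(prev[2 * y][2 * x], prev[2 * y][2 * x + 1],
                 prev[2 * y + 1][2 * x], prev[2 * y + 1][2 * x + 1])
             for x in range(half)]
            for y in range(half)
        ])
    return levels


def average_level(depth):
    """One level of box filtering, shown only for comparison."""
    half = len(depth) // 2
    return [
        [(depth[2 * y][2 * x] + depth[2 * y][2 * x + 1] +
          depth[2 * y + 1][2 * x] + depth[2 * y + 1][2 * x + 1]) / 4.0
         for x in range(half)]
        for y in range(half)
    ]


# Three near occluder pixels (0.2) and one hole showing the far plane (1.0).
block = [[0.2, 0.2],
         [0.2, 1.0]]
print(build_z_pyramid(block)[1])  # farthest depth: the hole is preserved
print(average_level(block))       # averaged depth: the hole is smeared away
```

With the averaged value (about 0.4), an object at depth 0.5 would be judged hidden even though it can be seen through the hole; the max-based level keeps 1.0 and correctly refuses to cull it. Averaging does suit the opacity maps of HOM itself, so the mipmap shortcut may fit that half of the method better than the depth half.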

thanks

~Main

==
Colt "MainRoach" McAnlis
Programmer
www.badheat.com/sinewave

[edited by - duhroach on October 4, 2002 2:36:32 AM]
==
Colt "MainRoach" McAnlis
Graphics Engineer - http://mainroach.blogspot.com

Hi,
I did look into PVS, and it can be a nice system. I considered dividing up the world into squares, then finding the visible set from each section to its neighbours. Basically "voxelise" the source section, do ray casts out from each "PVS-voxel" to each of its 8 neighbours, and mark down which ones it can see into. This can be stored in a byte, with one bit for each neighbour.

The resolution of your PVS grid for each section depends on the size/complexity of your scene. Then, to render, you see which section and PVS-voxel(s) you are in, and "flood fill" outwards from these, clipping with the view frustum as well.
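The byte-per-cell scheme and the flood fill can be sketched roughly as follows in Python. This is an illustration under assumptions, not the poster's implementation: the bit order in `NEIGHBOURS`, the function name `flood_fill_visible` and the `in_frustum` callback are all made up here, and the ray-cast precomputation that would fill `vis_bits` is left out.

```python
from collections import deque

# Bit i of a cell's byte says whether the cell can see into the neighbour
# at offset NEIGHBOURS[i] (dx, dy). The ordering is an arbitrary choice.
NEIGHBOURS = [(-1, -1), (0, -1), (1, -1),
              (-1, 0),           (1, 0),
              (-1, 1),  (0, 1),  (1, 1)]


def flood_fill_visible(vis_bits, start, in_frustum=lambda cell: True):
    """Breadth-first flood fill from the camera's cell. A neighbour is
    entered only if the current cell's bit for it is set and the
    (hypothetical) frustum test accepts it."""
    height, width = len(vis_bits), len(vis_bits[0])
    visible = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        mask = vis_bits[y][x]
        for bit, (dx, dy) in enumerate(NEIGHBOURS):
            if not mask & (1 << bit):
                continue  # precomputed rays found no line of sight
            nx, ny = x + dx, y + dy
            if not (0 <= nx < width and 0 <= ny < height):
                continue  # neighbour lies outside the grid
            cell = (nx, ny)
            if cell in visible or not in_frustum(cell):
                continue
            visible.add(cell)
            queue.append(cell)
    return visible


# One row of three cells: the first sees east, the second sees nothing.
grid = [[0b00010000, 0b00000000, 0b00000000]]
print(sorted(flood_fill_visible(grid, (0, 0))))  # [(0, 0), (1, 0)]
```

Storage is one byte per PVS-voxel, so the memory cost scales directly with grid resolution, which matches the trade-off described in the next paragraph.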

The only problem was memory. On a PC with 128MB+ it's not really an issue, but on a console that only has ~32MB it was too much overhead, or the resolution was so low that it was not worth it.

Anyhow, the next big bottleneck in games is more the collision testing and AI; drawing the scene is less of an issue in comparison (i.e. 60%-80% of CPU time is for the game code, not the draw code!).

roll on hardware intersection routines
quote: Original post by Anonymous Poster
Why don't you guys think about PVS?

It's traditionally used as a precomputed visibility map for indoor levels. However, if you use heightmapped terrain, it might be interesting to divide the terrain into 16x16 patches, precompute visibility between them and use this for occlusion culling. It speeds up rendering, and since the work is done offline it costs next to no CPU time in-game.
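A rough Python sketch of that precomputation, with several loud assumptions: "16x16 patches" is read as patches of 16x16 height samples, visibility is tested only between patch centres (a real tool would have to sample many points per patch to be conservative), and `line_of_sight`, `build_patch_visibility` and the `eye_height` and `steps` parameters are hypothetical.

```python
def line_of_sight(height, a, b, eye_height=2.0, steps=64):
    """True if the straight segment between grid points a and b, raised
    eye_height above the terrain at both ends, clears every sampled
    terrain height in between."""
    (ax, ay), (bx, by) = a, b
    za = height[ay][ax] + eye_height
    zb = height[by][bx] + eye_height
    for i in range(1, steps):
        t = i / steps
        x = round(ax + (bx - ax) * t)
        y = round(ay + (by - ay) * t)
        if height[y][x] > za + (zb - za) * t:
            return False  # terrain pokes above the sight line
    return True


def build_patch_visibility(height, patch=16):
    """Offline step: symmetric patch-to-patch visibility table, indexed by
    patch number (row-major). Centre-to-centre only, so NOT conservative."""
    rows, cols = len(height) // patch, len(height[0]) // patch
    centres = [(px * patch + patch // 2, py * patch + patch // 2)
               for py in range(rows) for px in range(cols)]
    n = len(centres)
    table = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            seen = i == j or line_of_sight(height, centres[i], centres[j])
            table[i][j] = table[j][i] = seen
    return table


# Flat terrain, three patches in a row, with a ridge between patches 1 and 2.
terrain = [[50.0 if 30 <= x <= 33 else 0.0 for x in range(48)]
           for y in range(16)]
table = build_patch_visibility(terrain)
print(table[0])  # what patch 0 can see
```

At run time the renderer would only look up the row for the patch containing the camera. The table grows with the square of the patch count, so large terrains would need a distance cutoff or a compressed representation.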

This topic is closed to new replies.
