PhillipHamlyn

Occlusion Query Testing - Clarification on Frequency of Test


Hi,

 

I've experimented with occlusion querying and culling using a pre-render pass on simplified meshes of my landscape, and have hit a few problems in getting a good balance between mesh complexity and over-aggressive culling with simple AABB-style meshes. This is especially true when trying to cull my landscape tiles, for which there just doesn't seem to be a good "simplified mesh" that doesn't lead to popping artefacts.

 

I read on GameDev that the other approach is to weave the occlusion queries into the main render pipeline (taking care to avoid stalling), check the results on the following frame, and skip the render of any objects that passed the frustum cull but subsequently didn't get anything rendered. I have two questions on implementing this, which I think are not framework specific.

 

1) I need to pre-sort my objects front-to-back to ensure that occluded objects are rejected based on depth value and not just overdrawn. I believe that overdrawn pixels get added to the occlusion query results even though they are never visible. Is that the correct interpretation?
2) If I reject occluded objects on the following frame, at what point should I attempt to write them again to retest their visibility? Is this simply based on knowledge of the world dynamics (i.e. camera and object location changes), or is there a more technical approach?

 

Thanks in advance.


1) An occlusion query returns the number of pixels which passed the z-test in the 'region' defined by the occlusion query begin/end. Your interpretation is correct :)

 

2) I don't know exactly how your render loop works, but what we did in a game I worked on, which used a z-prepass, was the following:

 

   - Render all visible geometry to the z-buffer (this pass actually used the occlusion query results from the previous frame)

   - Issue occlusion queries for the next frame. Because we issued the occlusion queries after the z-buffer was completely filled with occluders, no sorting was necessary. It is however useful (on most common GPU architectures) to render the frontmost surfaces first, unless it costs you too much CPU time.

   - Render the 3D scene (shadow maps, light buffer, main pass) + post-processing

 

What you basically want is enough GPU work between issuing the occlusion queries and asking for their results, so that reading them back does not stall.
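The "use last frame's results, issue queries for the next frame" bookkeeping can be sketched on the CPU side as below. This is only an illustration under assumptions: `LatentOcclusion`, its method names and the integer object IDs are hypothetical, and the actual query issue and readback (in Direct3D 11, polling the query with `ID3D11DeviceContext::GetData`) is left out.

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical bookkeeping for occlusion results that arrive one frame late.
// The class and its names are illustrative; they do not come from any engine.
class LatentOcclusion {
public:
    // Decide whether to draw an object this frame, using last frame's result.
    // Objects with no result yet are treated as visible, so a new object is
    // never skipped before it has been tested at least once.
    bool shouldDraw(int objectId) const {
        auto it = visibleLastFrame_.find(objectId);
        return it == visibleLastFrame_.end() || it->second;
    }

    // Store a query result once the GPU has made it available.
    // samplesPassed is the count reported by the occlusion query.
    void storeResult(int objectId, std::size_t samplesPassed) {
        pending_[objectId] = samplesPassed > 0;
    }

    // Call once per frame after collecting results: the results gathered
    // during this frame become the ones consulted during the next frame.
    void endFrame() {
        for (const auto& kv : pending_) {
            visibleLastFrame_[kv.first] = kv.second;
        }
        pending_.clear();
    }

private:
    std::unordered_map<int, bool> visibleLastFrame_;
    std::unordered_map<int, bool> pending_;
};
```

Note that an object skipped by `shouldDraw` still needs some cheap query (for example against its bounding volume) in later frames, otherwise it can never be reported visible again.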


2) I too want to know how to use this without some old-school occlusion culling technique like portals or a BSP tree. I'd still like to know why these are considered old-school for rendering while you still need them for collisions. The current Doom uses only a small number of shaders, so switching should not be as much of a problem as it used to be.


Thanks born49 -

 

So occlusion queries return the number of pixels rejected by Z-testing at the point they were tested, so the result doesn't accurately reflect the number of pixels visible when the render completes. That makes sense to me, especially since I submit Begin/End commands on either side of my DrawIndexedInstanced call and the render pipeline always does things in the order it's been given. That means my counter will start before my mesh is rendered and stop immediately after. Subsequent mesh renders might overwrite my pixels and alter the depth buffer against which the Z-test initially passed, but since I've told the pipeline to stop collecting once my mesh render has completed, the measure can only contain my object's information.

 

I have sorted my occluders front-to-back anyway to take advantage of early Z-rejection (although I have an additional question as to how that is compatible with batching calls based on resource usage, which suggests I should sort by resource set, not world position, but that's another story).

 

I didn't quite understand where in your engine you actually reject a mesh because of its previous-frame occlusion. I guess that was in the step "...used occlusion queries from the previous frame.."? When did you then reset the "occluded" flag for a previously occluded mesh? Do you always render all your geometry to the z-buffer and then use occlusion queries to exclude the objects that fail from the main pass? I can see how that would work, but when I tried it, the performance benefit I was hoping for from skipping the geometry rendering of occluded objects was outweighed by the fact that I was rendering all of them all of the time in the z-buffer pass, and only benefiting from not rendering them in the main pass. My pixel shaders in the main pass aren't complicated enough to outweigh the cost of the double render. Maybe I am expecting too much benefit from this technique if the double-pass method is quite a common solution.

 

My specific problem is distant forests: in most cases they are occluded by a landscape tile, but fall within the view frustum. I use imposters and pre-rendered landscape coverage for really distant treelines to reduce the hit, but I still want to eliminate them from the frame completely if I can. Maybe I'll just monitor the camera position and do another render/query for each forest in turn when the viewer has moved an appropriate distance. It's a bit of a bodge but I can't see another way.
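A minimal sketch of that "retest after the viewer has moved far enough" idea, assuming one trigger per forest. The names `Vec3`, `RetestTrigger` and `threshold` are hypothetical, and the right threshold would depend on the scene.

```cpp
#include <cmath>

// Minimal 3D point type for the sketch (hypothetical, not from any library).
struct Vec3 { float x, y, z; };

// Euclidean distance between two points.
inline float dist(const Vec3& a, const Vec3& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Remembers where the camera was when an object was last tested and asks
// for a new occlusion test once the camera has moved at least `threshold`.
struct RetestTrigger {
    Vec3 lastTestedFrom;
    float threshold;

    // Returns true (and re-arms itself at the new position) when the camera
    // has moved far enough that the cached result should be refreshed.
    bool shouldRetest(const Vec3& cameraPos) {
        if (dist(cameraPos, lastTestedFrom) < threshold) {
            return false;
        }
        lastTestedFrom = cameraPos;
        return true;
    }
};
```

This only covers camera movement; an object or occluder that moves on its own would need a separate trigger.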


I don't want to be pedantic, but it is always good to get the terminology right! :)

 

1) Portals != occlusion culling. Portals are a technique for determining the set of visible objects (which is also the case for occlusion culling). With portals you determine what is visible; with occlusion culling you determine what is NOT visible. I would say that classic portal-based techniques are more limited than occlusion culling, as they are mostly tied to static geometry, while occlusion culling can work with dynamic objects without problems. Also, geometry like a dense forest of trees works very well with occlusion culling, while portalizing this kind of geometry is probably impossible :)

 

2) A BSP tree is a spatial subdivision structure. It was (in the old-school sense) used to spatially index a set of polygons. Nowadays we no longer operate at the polygon level, but more likely at the object level. You definitely need some kind of spatial subdivision structure in the renderer to cull things efficiently. Some kind of (dynamic) AABB tree will do.

 

 

Some tips on how to get occlusion culling working (using hardware occlusion queries):

 

1) Visibility culling operates hierarchically on your spatial subdivision. You don't want to test occlusion at object-level granularity only (although you probably do want to for entities that are expensive to render, such as lights). You want to cull away entire sub-trees of your spatial subdivision. For this to work nicely, your spatial subdivision should be able to supply some kind of 'persistent' node ID which is used as the key for the visibility cache.

 

2) Maintain visibility info in some kind of cache. This cache maps a spatial subdivision node ID to the visibility info obtained through occlusion queries. When an object (subtree) is visible, you can safely assume that it will remain visible for some time (we used a value of about 0.5 seconds + a random value). Using this, you can save a lot of occlusion queries which would otherwise have to be issued, which is especially helpful in scenes where there are not many good occluders (like looking at a forest from an airplane). So when you detect a visible node, you insert it into the cache together with some time info, so that its query can be re-issued at a lower frequency than per-frame.

 

3) For nodes which are not visible, you must issue an occlusion query every frame. A 'problem' arises whenever a node changes its visibility from not-visible to visible. At that point you don't have visibility info in your cache for its children, so to be entirely correct you would have to render all child nodes. This would, however, defeat the purpose of occlusion culling entirely, as it would be a performance killer. So we do NOT do it :) Instead we rendered only the objects located in the node which changed to the visible state, plus its child nodes down to a predetermined number of levels (I believe it was 2 levels deeper). It sounds like this could generate lots of artifacts (missing objects), but in reality it works very well (if your spatial subdivision is realized in the right way).

 

4) There is a remaining issue with the above approach to occlusion-query-based culling. You will definitely get artifacts if the camera teleports. In this case the visibility cache is almost useless and you will get missing-object artifacts. We detected such situations (a sudden change of camera position / orientation) and let the rendering system process the frame in synchronous occlusion query mode (this means waiting for query results as you traverse the subdivision). The visibility cache got re-filled with correct info and the following frames were processed in normal (asynchronous & faster) mode.
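Tips 2 to 4 could be sketched roughly as follows. This is a simplified illustration, not the engine code described above: `VisibilityCache`, `CameraState`, `cameraTeleported` and the threshold parameters are all hypothetical names, time is passed in as seconds, and the hierarchy traversal itself is omitted.

```cpp
#include <cstdint>
#include <random>
#include <unordered_map>

// Hypothetical visibility cache keyed by persistent spatial-subdivision node IDs.
class VisibilityCache {
public:
    // baseSeconds + random(0, jitterSeconds) is how long a visible node is trusted.
    VisibilityCache(double baseSeconds, double jitterSeconds, unsigned seed = 1u)
        : base_(baseSeconds), jitter_(0.0, jitterSeconds), rng_(seed) {}

    // Record a query result obtained at time `now` (seconds).
    void update(std::uint32_t nodeId, bool visible, double now) {
        Entry& e = entries_[nodeId];
        e.visible = visible;
        // Visible nodes get a jittered grace period so that re-queries are
        // spread over several frames; hidden nodes get none.
        e.retestAt = visible ? now + base_ + jitter_(rng_) : now;
    }

    // True when an occlusion query should be issued for this node now:
    // unknown nodes, hidden nodes (every frame), and expired visible nodes.
    bool needsQuery(std::uint32_t nodeId, double now) const {
        auto it = entries_.find(nodeId);
        if (it == entries_.end()) return true;
        return !it->second.visible || now >= it->second.retestAt;
    }

    // Drop everything, e.g. after a camera teleport.
    void clear() { entries_.clear(); }

private:
    struct Entry { bool visible = false; double retestAt = 0.0; };
    double base_;
    std::uniform_real_distribution<double> jitter_;
    std::mt19937 rng_;
    std::unordered_map<std::uint32_t, Entry> entries_;
};

// Camera position plus unit-length forward vector (hypothetical layout).
struct CameraState { float px, py, pz, fx, fy, fz; };

// Flags a sudden jump in position or orientation between two frames.
// maxMove is the largest plausible per-frame movement; minForwardDot is the
// smallest acceptable cosine of the angle between the two forward vectors.
inline bool cameraTeleported(const CameraState& prev, const CameraState& cur,
                             float maxMove, float minForwardDot) {
    const float dx = cur.px - prev.px, dy = cur.py - prev.py, dz = cur.pz - prev.pz;
    const float dot = cur.fx * prev.fx + cur.fy * prev.fy + cur.fz * prev.fz;
    return dx * dx + dy * dy + dz * dz > maxMove * maxMove || dot < minForwardDot;
}
```

When `cameraTeleported` returns true, one option is to call `clear()` and process that frame with synchronous queries, so the cache is rebuilt before the asynchronous path resumes.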


Phillip,

I am not sure if this is a typo, but you have written:

 

"So occlusion queries return the number of pixels rejected by Z-testing"

 

It is actually the number of pixels which PASSED the test! (not rejected).
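A toy software depth buffer makes the "passed, at the time of the draw" semantics concrete. This is only an illustration: `DepthBuffer` and `drawSpan` are made-up names, the buffer is one-dimensional, and a less-than depth test is assumed.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Toy 1D depth buffer used only to illustrate query semantics; not a GPU API.
struct DepthBuffer {
    std::vector<float> depth;

    // Every sample starts at the far plane (infinitely far away).
    explicit DepthBuffer(std::size_t width)
        : depth(width, std::numeric_limits<float>::infinity()) {}

    // Draws the span [x0, x1) at constant depth z with a less-than depth test.
    // The return value is the number of samples that passed the test, which is
    // what an occlusion query wrapped around this single draw would report.
    std::size_t drawSpan(std::size_t x0, std::size_t x1, float z) {
        std::size_t passed = 0;
        for (std::size_t x = x0; x < x1 && x < depth.size(); ++x) {
            if (z < depth[x]) {
                depth[x] = z;
                ++passed;
            }
        }
        return passed;
    }
};
```

Drawing a far span first and a near span second reports a non-zero count for the far span even though it ends up completely covered. Drawing them near-first reports zero for the far span, which is why the order of draws (or a filled z-buffer) matters for the query result.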

 

I tried to explain how the system worked in the above post. No, it did not render all objects in the Z-pass; that would not work for massive scenes.

 

I am not sure I would suggest implementing an occlusion-query-based system for your case. In my explanations I skipped lots of nasty details, which always pop up when you start implementing it :) It is unfortunately a very complex task.

 

If you are sure you want occlusion queries for your terrain, maybe it would be useful to look into software-based occlusion culling. It does not have to solve the issue of interleaving scene rendering with issuing queries in an efficient way. I noticed that Intel has a free library for performing efficient occlusion culling:

 

https://github.com/GameTechDev/MaskedOcclusionCulling

http://www.highperformancegraphics.org/wp-content/uploads/2016/HPG2016_MSOC.pdf

 

 

While I am all for efficient visibility determination during rendering, I believe that for outdoor scenarios it is very important to have good LOD schemes. It is a nice coincidence that I also have an interest in rendering terrains with lots of trees! :)

 

I don't do any occlusion culling (yet), but I still get very good performance on a common PC.

 

This is a demo where millions of trees are rendered:

 

 

 

While you are flying above the terrain, it is the LOD system's job to deliver good performance. However, when you descend to the ground and walk through the forest, a good occlusion system would definitely help!

 

Cheers

Edited by born49


Hi born49

 

Yes - that was a typo.

 

Very nice demo - we are definitely enthused by the same things.

 

I've got my OQ inside my render loop now and it's happily eliminating whole rafts of forest that are occluded by mountains, so I'm happy that frame time is no longer being spent rendering overdrawn pixels.

 

For info: my trees have five LOD. There is a horizontal texture splat for really distant stuff (this is baked into the landscape tile drape texture), and two billboard levels: a single Y-axis billboard for distant trees, and an 8x8 texture atlas of the tree rotated around the Y-axis, taken during asset creation, for closer imposters. I lerp between two atlas subtextures depending on the viewer's angle, which means I can use asymmetric trees and imposters without too many artefacts.
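A minimal sketch of choosing the two atlas subtextures and the lerp weight from the viewer's angle. It assumes, purely for illustration, that the atlas cells are views taken at evenly spaced rotations around the Y-axis and are numbered in order of increasing angle; `ImposterBlend` and `selectImposterFrames` are hypothetical names.

```cpp
#include <cmath>

// The two atlas frames to sample and the lerp weight of the second one.
struct ImposterBlend {
    int frameA;
    int frameB;
    float weightB;
};

// toViewerX / toViewerZ: horizontal direction from the tree to the viewer
// (does not need to be normalised). frameCount: number of views in the
// atlas, e.g. 64 for an 8x8 atlas under the assumption stated above.
inline ImposterBlend selectImposterFrames(float toViewerX, float toViewerZ,
                                          int frameCount) {
    const float twoPi = 6.28318530718f;
    float angle = std::atan2(toViewerX, toViewerZ);  // range (-pi, pi]
    if (angle < 0.0f) angle += twoPi;                // range [0, 2*pi]
    const float f = angle / twoPi * static_cast<float>(frameCount);
    const float whole = std::floor(f);
    // The modulo wraps the last frame back round to the first one.
    const int a = static_cast<int>(whole) % frameCount;
    const int b = (a + 1) % frameCount;
    return {a, b, f - whole};
}
```

With an 8x8 atlas laid out row by row, frame `i` would sit at column `i % 8` and row `i / 8`; the shader would then sample both cells and blend them by `weightB`.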

 

I've then got three LODs of tree model, depending on distance from the viewer.

 

Out of interest, where do you source your tree/vegetation models from? I've been using some free stuff from Turbosquid but it's not really optimised for real-time rendering.

 

All this works OK but I still try to eliminate geometry wherever possible. I will add the "number of seconds plus random" as my timeout for objects becoming de-occluded from being occluded - it seems a sensible way of doing it.
