Best visibility culling algorithm for single room scene


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

17 replies to this topic

#1 user88   Members   -  Reputation: 268

Posted 14 May 2012 - 05:52 AM

Hi there,

I'm looking for the best visibility culling algorithm for a 3D scene that consists of a single room with a lot of objects inside. In rare cases the room may not be rectangular. The scene is static, but rebuilding it should not take a lot of time. The candidates I see are:

- Octree

- BSP

Please advise which one is the better choice.

#2 Krypt0n   Crossbones+   -  Reputation: 2358

Posted 14 May 2012 - 06:04 AM

Octree and BSP are scene partitioning structures; they do not do any culling by themselves.
What kind of culling are you looking for? Do you want to accelerate frustum culling? Occlusion? Light sources?
How big is the scene (object count, light count, triangle count), and what part is mainly limiting your performance (draw calls, vertex work, pixel work...)?

That would help us give you better advice.

#3 user88   Members   -  Reputation: 268

Posted 14 May 2012 - 06:23 AM

Hi Krypt0n,
I want to accelerate frustum culling. In common cases the scene is not too large: it contains 30 to 100 objects made up of about 200,000 triangles, and it has 1 to 10 light sources. Pixel work is the bottleneck.

------------------------

EDIT:
200,000 triangles: I mean the whole 3D scene has that many triangles.

Edited by user88, 14 May 2012 - 06:25 AM.


#4 Krypt0n   Crossbones+   -  Reputation: 2358

Posted 14 May 2012 - 06:32 AM

For 100 objects, you could simply put them into an array and test every object individually. The CPU time for those 100 frustum vs. object tests will not be noticeable.
However, if pixel work is the bottleneck, you need to reduce the number of shaded pixels.

1. you could sort all visible objects front to back to reduce overdraw
2. you could do a z-pass first and then draw with shading on
3. you could try to use occlusion culling to further reduce the object count and this way free some compute resources for the pixel work (although I doubt there is much occlusion in typical single rooms)
4. you could try to make your rendering deferred; that way every pixel would be shaded by every light source just once.
5. if your scene is really static, maybe a lightmap would give you even better quality while saving a lot of pixel computations.
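
A minimal sketch of option 1 (front-to-back sorting), in Python for brevity. The `SceneObject` type, its `center` field, and the function name are hypothetical and only illustrate the idea: order the already-visible objects by squared distance from the camera so that nearer geometry fills the depth buffer first and farther pixels fail the depth test before being shaded.

```python
from dataclasses import dataclass


@dataclass
class SceneObject:
    # Hypothetical object record; "center" is the world-space center
    # of the object's bounding volume.
    name: str
    center: tuple


def sort_front_to_back(objects, camera_pos):
    """Return the objects ordered nearest-first relative to the camera."""
    def dist_sq(obj):
        dx = obj.center[0] - camera_pos[0]
        dy = obj.center[1] - camera_pos[1]
        dz = obj.center[2] - camera_pos[2]
        # Squared distance is enough for ordering, so no sqrt is needed.
        return dx * dx + dy * dy + dz * dz
    return sorted(objects, key=dist_sq)


visible = [
    SceneObject("sofa", (0.0, 0.0, 8.0)),
    SceneObject("lamp", (1.0, 2.0, 3.0)),
    SceneObject("table", (0.0, 0.0, 5.0)),
]
draw_order = [o.name for o in sort_front_to_back(visible, (0.0, 0.0, 0.0))]
```

Sorting by bounding-volume center is only an approximation for large or interpenetrating objects, but for reducing overdraw an approximate order is usually sufficient.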

#5 user88   Members   -  Reputation: 268

Posted 14 May 2012 - 06:58 AM

Thanks for the answer, Krypt0n.

I'm currently implementing deferred rendering in my engine. Maybe you are right and that will be enough for good performance. But since there are a lot of code changes going on anyway, now is a good time to implement something else as well, and I think an octree or BSP might be useful in my case.

#6 solenoidz   Members   -  Reputation: 515

Posted 14 May 2012 - 07:06 AM

Krypt0n said:
3. you could try to use occlusion culling to further reduce the object count and this way free some compute resources for the pixel work (although I doubt there is much occlusion in typical single rooms)


What kind of occlusion culling would you suggest for a modern game, where objects can be dynamic and there can be hundreds of rooms in a level, each containing hundreds of objects?
I have heard that people use software rasterizers to build a software depth buffer and test the bounding boxes of rooms against it.
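
A rough sketch of the software depth buffer idea mentioned above, in Python. It is heavily simplified and rests on stated assumptions: occluders and test volumes are assumed to be already projected to screen-space rectangles (half-open, in pixels) with a depth in [0, 1], where smaller means closer. A real software occlusion culler rasterizes occluder triangles instead of rectangles. The `DepthBuffer` class and its method names are hypothetical.

```python
class DepthBuffer:
    """Low-resolution CPU depth buffer used only for visibility queries."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        # 1.0 is the far plane, so an empty buffer occludes nothing.
        self.depth = [[1.0] * width for _ in range(height)]

    def _clip(self, x0, y0, x1, y1):
        # Clamp a half-open rectangle [x0, x1) x [y0, y1) to the buffer.
        return max(x0, 0), max(y0, 0), min(x1, self.width), min(y1, self.height)

    def draw_occluder(self, x0, y0, x1, y1, farthest_depth):
        # Writing the occluder's farthest depth keeps the test conservative:
        # nothing is reported hidden unless it lies behind the whole occluder.
        x0, y0, x1, y1 = self._clip(x0, y0, x1, y1)
        for y in range(y0, y1):
            row = self.depth[y]
            for x in range(x0, x1):
                if farthest_depth < row[x]:
                    row[x] = farthest_depth

    def is_visible(self, x0, y0, x1, y1, nearest_depth):
        # The volume is potentially visible if, at any covered pixel, its
        # nearest depth is not behind what is already in the buffer.
        x0, y0, x1, y1 = self._clip(x0, y0, x1, y1)
        for y in range(y0, y1):
            row = self.depth[y]
            for x in range(x0, x1):
                if nearest_depth <= row[x]:
                    return True
        return False


buf = DepthBuffer(16, 8)
buf.draw_occluder(2, 1, 10, 7, 0.4)          # e.g. a wall close to the camera
hidden = buf.is_visible(3, 2, 6, 5, 0.7)      # fully behind the wall
partly = buf.is_visible(8, 2, 12, 5, 0.7)     # sticks out past the wall's edge
in_front = buf.is_visible(3, 2, 6, 5, 0.2)    # closer than the wall
```

In this sketch a rectangle that lies entirely off-screen is reported as not visible, which matches what frustum culling would conclude for it.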

#7 user88   Members   -  Reputation: 268

Posted 14 May 2012 - 07:42 AM

By the way, I have profiled the frame rendering in my 3D engine, and frustum visibility culling takes 15% of the full frame time. Does that mean I need to implement a more advanced visibility culling technique?

#8 bwhiting   Members   -  Reputation: 683

Posted 14 May 2012 - 08:17 AM

I have simple frustum culling (running in Flash) and it will cull away 10,000-20,000 objects in about 1ms... in a lower-level language that time should be 5-10 times smaller still!

Sounds suspect to me. Are you frustum culling individual triangles or something?

If that really is an issue for you, I recommend caching (it gave me a whopping speed increase). It's really simple to add: just store the ID of the plane that caused the frustum check to fail, then check that one first next time round... about 2 minutes of coding, and it can almost double your speed.
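
A minimal sketch of the plane caching trick described above, in Python. The names (`CullItem`, `last_failed_plane`, `is_visible_cached`) are hypothetical, and some conventions are assumed: each plane is a tuple `(nx, ny, nz, d)` with a unit normal pointing into the view volume, and objects are bounded by spheres. The example uses a box-shaped view volume purely to keep the numbers easy to follow.

```python
class CullItem:
    def __init__(self, center, radius):
        self.center = center
        self.radius = radius
        # Index of the plane that rejected this item on the previous test.
        self.last_failed_plane = 0


def signed_distance(plane, point):
    nx, ny, nz, d = plane
    return nx * point[0] + ny * point[1] + nz * point[2] + d


def is_visible_cached(item, planes):
    """Return (visible, planes_tested), starting at the cached plane."""
    count = len(planes)
    start = item.last_failed_plane
    for step in range(count):
        index = (start + step) % count
        if signed_distance(planes[index], item.center) < -item.radius:
            # Remember the rejecting plane so it is tried first next frame.
            item.last_failed_plane = index
            return False, step + 1
    return True, count


# Box-shaped view volume: -10 <= x, y <= 10 and 1 <= z <= 100.
planes = [
    (1.0, 0.0, 0.0, 10.0),    # left
    (-1.0, 0.0, 0.0, 10.0),   # right
    (0.0, 1.0, 0.0, 10.0),    # bottom
    (0.0, -1.0, 0.0, 10.0),   # top
    (0.0, 0.0, 1.0, -1.0),    # near
    (0.0, 0.0, -1.0, 100.0),  # far
]

behind_far_plane = CullItem((0.0, 0.0, 150.0), 1.0)
first = is_visible_cached(behind_far_plane, planes)   # walks all six planes
second = is_visible_cached(behind_far_plane, planes)  # rejected immediately
```

The saving relies on frame-to-frame coherence: an object rejected by one plane is likely to be rejected by the same plane on the next frame. Visible objects still pay for all six tests.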

With that in mind, for any scene comprising fewer than 10,000 objects or so, implementing a tree structure is not really worth it for me.

#9 user88   Members   -  Reputation: 268

Posted 14 May 2012 - 08:32 AM

I do simple frustum vs. bounding box culling. 10,000-20,000 objects in about 1ms is a good result. I need to check whether the application performance profiler is giving me correct results.
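
For reference, a brute-force frustum vs. axis-aligned bounding box pass over a flat array might look like the Python sketch below. This is not user88's actual code. It assumes planes stored as `(nx, ny, nz, d)` with normals pointing into the view volume, and it uses the common "positive vertex" test: for each plane, only the box corner farthest along the plane normal is checked. The function names and the box-shaped example volume are hypothetical.

```python
def aabb_outside_plane(box_min, box_max, plane):
    nx, ny, nz, d = plane
    # Pick the corner that lies farthest in the direction of the normal.
    px = box_max[0] if nx >= 0 else box_min[0]
    py = box_max[1] if ny >= 0 else box_min[1]
    pz = box_max[2] if nz >= 0 else box_min[2]
    # If even that corner is behind the plane, the whole box is outside.
    return nx * px + ny * py + nz * pz + d < 0


def cull_aabbs(boxes, planes):
    """Return indices of boxes that are not fully outside any plane."""
    return [
        i for i, (box_min, box_max) in enumerate(boxes)
        if not any(aabb_outside_plane(box_min, box_max, p) for p in planes)
    ]


# Box-shaped view volume: -10 <= x, y <= 10 and 1 <= z <= 100.
planes = [
    (1.0, 0.0, 0.0, 10.0), (-1.0, 0.0, 0.0, 10.0),
    (0.0, 1.0, 0.0, 10.0), (0.0, -1.0, 0.0, 10.0),
    (0.0, 0.0, 1.0, -1.0), (0.0, 0.0, -1.0, 100.0),
]
boxes = [
    ((-1.0, -1.0, 4.0), (1.0, 1.0, 6.0)),    # fully inside
    ((20.0, 0.0, 5.0), (22.0, 1.0, 6.0)),    # fully to the right
    ((9.0, 0.0, 5.0), (12.0, 1.0, 6.0)),     # straddles the right plane
]
visible_indices = cull_aabbs(boxes, planes)
```

The test is conservative: a box near a frustum corner can pass all six planes while still being outside, which only costs an unnecessary draw.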

Thanks for this suggestion:

bwhiting said:
With that in mind for any scene comprising of less then 10,000 objects or so its not really worth implementing any tree structure for me.



#10 bwhiting   Members   -  Reputation: 683

Posted 14 May 2012 - 08:55 AM

This probably won't help you much, but if you have your planes in an array, you can unroll the loop (as there are only ever 6). This can also save you a teeny bit of time, though it might not be noticeable depending on the platform.

Ah, you're checking bounding boxes... While spheres are not exactly the perfect bounding volume, they are by far the fastest to test against a frustum. You could easily go on to check a bounding box if the sphere intersects the frustum, but generally it's not needed. Spheres all the way.
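
A minimal sketch of the sphere test described above, in Python. It assumes planes of the form `(nx, ny, nz, d)` with unit-length normals pointing into the view volume; the function name and the three result labels are hypothetical. Returning a three-way result makes it easy to run the optional bounding box check only for spheres that straddle a plane.

```python
OUTSIDE, INTERSECTING, INSIDE = "outside", "intersecting", "inside"


def classify_sphere(center, radius, planes):
    """Classify a bounding sphere against a set of inward-facing planes."""
    result = INSIDE
    for nx, ny, nz, d in planes:
        dist = nx * center[0] + ny * center[1] + nz * center[2] + d
        if dist < -radius:
            # Entirely behind one plane: cull without testing the rest.
            return OUTSIDE
        if dist < radius:
            # The sphere crosses this plane; keep checking the others.
            result = INTERSECTING
    return result


# Box-shaped view volume: -10 <= x, y <= 10 and 1 <= z <= 100.
planes = [
    (1.0, 0.0, 0.0, 10.0), (-1.0, 0.0, 0.0, 10.0),
    (0.0, 1.0, 0.0, 10.0), (0.0, -1.0, 0.0, 10.0),
    (0.0, 0.0, 1.0, -1.0), (0.0, 0.0, -1.0, 100.0),
]
a = classify_sphere((0.0, 0.0, 50.0), 2.0, planes)    # well inside
b = classify_sphere((10.0, 0.0, 50.0), 2.0, planes)   # centered on the right plane
c = classify_sphere((15.0, 0.0, 50.0), 2.0, planes)   # beyond the right plane
```

Each plane costs one dot product and at most two comparisons, which is why the sphere test is cheaper than a box test. Only the `INTERSECTING` case would need a tighter follow-up test, if one is wanted at all.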

#11 user88   Members   -  Reputation: 268

Posted 14 May 2012 - 10:07 AM

Thanks for the advice. I will try checking spheres instead of boxes.

#12 Krypt0n   Crossbones+   -  Reputation: 2358

Posted 14 May 2012 - 11:47 AM


Krypt0n said:
3. you could try to use occlusion culling to further reduce the object count and this way free some compute resources for the pixel work (although I doubt there is much occlusion in typical single rooms)

solenoidz said:
What kind of occlusion culling are you suggesting in a modern world, where objects could be dynamic and there could be hundreds of rooms in a level, each containing hundreds of objects ?
I heard people are using software rasterizers to build a software depth buffer and test <bounding boxes of rooms> against it..

Well, I'm a bit biased, as you might see here: http://www.gamedev.net/page/community/iotd/index.html/_/one-billion-polys-r85
I would always use a software occlusion culler to get a good visibility set, as there is no latency, no pre-calculation, everything can be dynamic, etc.

But if you assume you really have hundreds of rooms and you're not making an earthquake simulator, it could be just as good to set up a portal system or calculate a room-to-room PVS.

user88 said:
By the way, i have profiled frame rendering process in my 3D engine and frustum visibility culling takes 15% from full frame time. Does it say that i need to implement any advanced visibility culling technique?

As bwhiting said, you can cull 100 times what you're currently processing with very low CPU usage. I'm afraid your benchmarking is somehow wrong. Are you maybe benchmarking in debug mode, or on a mobile device? Can you tell us how you benchmarked?
If you really want to see what's going on, install AMD CodeAnalyst and profile with it. It takes maybe an hour to get into, but it's quite easy to set up and it shows you the most expensive functions.

#13 user88   Members   -  Reputation: 268

Posted 15 May 2012 - 01:55 AM

Krypt0n said:
install AMD CodeAnalyst and profile with it

I'm not sure this profiler will work properly with my Intel CPU, but thanks for the advice.

I used ANTS Performance Profiler under the .NET Framework, with a release build. Because the stencil mirror technique is used, the number of frustum vs. bounding box checks could be more than 100, but not by much.

#14 swiftcoder   Senior Moderators   -  Reputation: 9739

Posted 15 May 2012 - 02:12 AM

user88 said:
200,000 triangles - i mean whole 3D scene has such count of triangles..

Is this for a mobile device?

Pretty much any desktop graphics card should be able to brute force 200,000 triangles at a decent framerate - have you considered just not culling at all, since your levels are so small?

Tristam MacDonald - Software Engineer @Amazon - [swiftcoding]


#15 Krypt0n   Crossbones+   -  Reputation: 2358

Posted 15 May 2012 - 02:32 AM


Krypt0n said:
install AMD CodeAnalyst and profile with it

user88 said:
not sure that this profiler will work with my Intel CPU properly, but thanks for advice.

It works fine with Intel for me.

user88 said:
I used ANTS Performance Profiler under .Net Framework, it was release. Because the stencil mirror technique was used the count of Frustum vs Bounding Box checking could be more than 100 times, but not too match..

.NET? You mean C#? I'm not sure CodeAnalyst works for that; I never used it for time-critical code.


user88 said:
200,000 triangles - i mean whole 3D scene has such count of triangles..

swiftcoder said:
Is this for a mobile device?

Pretty much any desktop graphics card should be able to brute force 200,000 triangles at a decent framerate - have you considered just not culling at all, since your levels are so small?

that might explain a bit:

user88 said:
... pixel work is bottleneck.


cheers

#16 user88   Members   -  Reputation: 268

Posted 15 May 2012 - 02:53 AM

user88, on 14 May 2012 - 08:23 AM, said:
200,000 triangles - i mean whole 3D scene has such count of triangles..

swiftcoder said:
Is this for a mobile device?

No, this is a PC application with 3D visualization.

Krypt0n said:
that might explain a bit:

user88, on 14 May 2012 - 08:23 AM, said:
... pixel work is bottleneck.

This application is meant to visualize nicely designed living room interiors. Because image quality is important, the pixel shaders are complicated. Currently forward rendering is used, and the pixel shader is the bottleneck.

I know that it is better to improve the pixel shader than any visibility culling algorithm. But after profiling I saw that visibility culling takes 15% of the frame, so I decided to improve that too. Now I suspect that the profiling results are wrong.

#17 jameszhao00   Members   -  Reputation: 271

Posted 15 May 2012 - 03:04 AM

How are you profiling your GPU time? QueryPerformanceCounter calls around DirectX calls?

Edited by jameszhao00, 15 May 2012 - 03:05 AM.


#18 user88   Members   -  Reputation: 268

Posted 15 May 2012 - 03:25 AM

jameszhao00 said:
How are you profiling your GPU time? QueryPerformanceCounters around DirectX calls?

I use NVIDIA PerfHUD.



