Radiosity: What Visibility Algorithm is Fastest?



#1 VladR   Members   -  Reputation: 722


Posted 24 May 2005 - 06:45 AM

I've dusted off my old radiosity renderer (progressive shooting) and started preparing it for lighting our game levels, but I've noticed that visibility determination takes 99% of the time. If I don't compute shadows at all, i.e. the visibility term in F_ij is always 1.0f, 10 passes take 0.315 seconds. With shadowing turned on they take 815 seconds - a ratio of about 2590:1.

Obviously I use just a simple brute-force algorithm that tests visibility against every patch in the scene (except the shooter and receiver, of course), because it was the easiest and fastest to write. Now that I'd like more speed, I'm willing to devote some time to a better and faster algorithm. Which one should I use?

For the ray-triangle intersection I'm using D3DXIntersectTri, because it's a lot faster than the two other methods I googled (I can't remember their names off the top of my head). I'm using quad patches consisting of two triangles, if that helps. Thanks for any ideas.

VladR    My 3rd person action RPG on GreenLight:    http://steamcommunity.com/sharedfiles/filedetails/?id=92951596

#2 moagstar   Members   -  Reputation: 259


Posted 24 May 2005 - 07:10 AM

The method that I use for speeding up visibility determination is simply to put all your triangles into an octree; then, when you are testing for an intersection, first test the ray against the octree. This way you can eliminate a lot of the triangles from the intersection test. There are probably other methods that will give you a bigger speed increase, but the octree method is simple to implement and works pretty well.

As for the intersection test between the ray and the triangle itself: if all you are interested in is visibility, then you can skip computing the intersection point (the uvw barycentric coordinates). It won't give you a huge speed increase, but it does save you a couple of multiplies and a division per ray. Basically you will have to implement the intersection code yourself, but you can use this as a starting point. Good luck!
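For reference, here is a minimal sketch of such a visibility-only test in the spirit of Möller-Trumbore; the Vec3 type and helpers are illustrative, not D3DX types:

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3  sub(const Vec3& a, const Vec3& b)   { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static float dot(const Vec3& a, const Vec3& b)   { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3  cross(const Vec3& a, const Vec3& b) { return { a.y * b.z - a.z * b.y,
                                                            a.z * b.x - a.x * b.z,
                                                            a.x * b.y - a.y * b.x }; }

// Boolean Moller-Trumbore style test: does the segment orig + t*dir
// (t in (0,1), dir = receiver - shooter) cross the triangle? We never
// return the hit point, only yes/no, and we reject as early as possible.
bool segmentHitsTriangle(const Vec3& orig, const Vec3& dir,
                         const Vec3& v0, const Vec3& v1, const Vec3& v2)
{
    const float EPS = 1e-6f;
    Vec3  e1  = sub(v1, v0);
    Vec3  e2  = sub(v2, v0);
    Vec3  p   = cross(dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < EPS) return false;          // segment parallel to triangle

    float invDet = 1.0f / det;
    Vec3  tv = sub(orig, v0);
    float u  = dot(tv, p) * invDet;
    if (u < 0.0f || u > 1.0f) return false;          // outside first barycentric bound

    Vec3  q = cross(tv, e1);
    float v = dot(dir, q) * invDet;
    if (v < 0.0f || u + v > 1.0f) return false;      // outside second bound

    float t = dot(e2, q) * invDet;
    return t > EPS && t < 1.0f - EPS;                // strictly between the two patches
}
```

With dir spanning the full shooter-to-receiver segment, the final t-range test also keeps the shooter and receiver geometry itself from registering as occluders.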



#3 Sunray   Members   -  Reputation: 188


Posted 24 May 2005 - 07:54 AM

BSP is far faster than octree for raycasting... I've been there. :)
Perl - Made by Idiots, Java - Made for Idiots, C++ - Envied by Idiots | http://sunray.cplusplus.se

#4 Yann L   Moderators   -  Reputation: 1798


Posted 24 May 2005 - 08:02 AM

Quote:
Original post by VladR
Obviously I use just a simple brute-force algorithm that tests visibility against every patch in the scene (except the shooter and receiver, of course), because it was the easiest and fastest to write. Now that I'd like more speed, I'm willing to devote some time to a better and faster algorithm. Which one should I use?

There are a lot of approaches you can use to accelerate that.

* Zero-overlap octrees will speed up your ray tracing a lot. Such a spatial partitioning scheme is absolutely vital for a tracing-based radiosity implementation. Tracing through the octree can be accelerated by recursive parametric methods if there is no overlap between the nodes. I use a slight variation of this algorithm, which makes the traversal extremely efficient. (Note that there is a typo in this paper at an important point that will make you go insane when trying to implement it - contact me if you opt to implement the algorithm.) A bare-bones occlusion query over such a tree is sketched below.

* The directional tracing pattern can be optimized. You don't need to test against every other patch in the scene; only the patches visible within the hemisphere over the shooter patch need to be considered. Even there, probabilistic sampling can reduce the number of rays to a minimum, for example QMC (Quasi-Monte-Carlo) methods.

* Use subpatches and unified shooters, both with adaptive subdivision. This means that instead of shooting from each patch separately, you shoot the unified energy of a set of patches, and instead of shooting to each and every patch, you shoot to groups of patches. This can be done recursively if you arrange the receivers in a quad-tree-like structure. Basically, you record the energy gradient within each level of such a patch group, and subdivide only if the gradient exceeds some threshold.

* You can use image-based methods to calculate the form factor, for example hemicubes. Sometimes this will speed up your calculations, sometimes not. Be aware that it can introduce artifacts into the solution. I personally don't like such approximations (I prefer raytracing-based FF determination), but it might be an option for you.

These are just a few techniques typically used in progressive refinement radiosity systems. Many more exist if you need even more performance, but implementation complexity increases steeply.
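To make the first point concrete, here is a bare-bones (non-parametric) occlusion query over a zero-overlap octree. The node layout and names are illustrative, and segmentHitsTriangle is a boolean test like the one sketched earlier in the thread; a real implementation would use the parametric traversal from the paper to visit only the children the ray actually crosses, in order:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3     { float x, y, z; };
struct Triangle { Vec3 v0, v1, v2; };

// Hypothetical node layout. Zero overlap: the eight child boxes exactly
// tile the parent, which is what parametric traversal schemes rely on.
struct OctreeNode {
    Vec3 boxMin, boxMax;
    OctreeNode* child[8];          // all null in a leaf
    std::vector<Triangle> tris;    // filled only in leaves
};

// Boolean segment-triangle test, e.g. the one sketched above.
bool segmentHitsTriangle(const Vec3&, const Vec3&,
                         const Vec3&, const Vec3&, const Vec3&);

// Clip the segment parameter range [t0,t1] against one slab of the box.
static bool clipSlab(float o, float d, float lo, float hi, float& t0, float& t1)
{
    if (std::fabs(d) < 1e-12f) return o >= lo && o <= hi;   // parallel: inside slab?
    float a = (lo - o) / d, b = (hi - o) / d;
    if (a > b) std::swap(a, b);
    t0 = std::max(t0, a);
    t1 = std::min(t1, b);
    return t0 <= t1;
}

static bool segmentOverlapsBox(const Vec3& o, const Vec3& d,
                               const Vec3& bmin, const Vec3& bmax)
{
    float t0 = 0.0f, t1 = 1.0f;    // parameter range of the shadow segment
    return clipSlab(o.x, d.x, bmin.x, bmax.x, t0, t1)
        && clipSlab(o.y, d.y, bmin.y, bmax.y, t0, t1)
        && clipSlab(o.z, d.z, bmin.z, bmax.z, t0, t1);
}

// Shadow query: any hit occludes, so we can stop at the very first one.
// Not needing the nearest intersection is what makes shadow rays cheaper
// than primary rays.
bool occluded(const OctreeNode* n, const Vec3& o, const Vec3& d)
{
    if (!n || !segmentOverlapsBox(o, d, n->boxMin, n->boxMax)) return false;
    for (const Triangle& t : n->tris)
        if (segmentHitsTriangle(o, d, t.v0, t.v1, t.v2)) return true;
    for (int i = 0; i < 8; ++i)
        if (occluded(n->child[i], o, d)) return true;
    return false;
}
```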

Quote:
Original post by Sunray
BSP is far faster than octree for raycasting... I've been there. :)

Never ever. Not if you exploit the parametric spatial coherency of an octree. Octrees are actually near-perfect candidates for shadow tracing.

#5 VladR   Members   -  Reputation: 722


Posted 24 May 2005 - 08:16 AM

Quote:
Original post by moagstar
The method that I use for speeding up visibility determination is simply to put all your triangles into an octree; then, when you are testing for an intersection, first test the ray against the octree. This way you can eliminate a lot of the triangles from the intersection test. There are probably other methods that will give you a bigger speed increase, but the octree method is simple to implement and works pretty well.


An octree is easy to implement (I've already implemented a quadtree somewhere else in my engine), but I know that there are faster ways than this. I haven't tried an octree, but if visibility determination were 10 times faster, it would be pretty much real-time for my purposes (the level editor, that is) and I'd be happy.

Quote:
Original post by moagstar
As for the intersection test between the ray and the triangle itself: if all you are interested in is visibility, then you can skip computing the intersection point (the uvw barycentric coordinates). It won't give you a huge speed increase, but it does save you a couple of multiplies and a division per ray. Basically you will have to implement the intersection code yourself, but you can use this as a starting point. Good luck!

Thanks for the link, but that algorithm was one of the two I tried, and both were much slower than D3DXIntersectTri (which is most probably optimized as hell). I'm not sure the omission of the intersection point calculation would help that much performance-wise (just a few dot products), but I might try it.



Quote:
Original post by Sunray
BSP is far faster than octree for raycasting... I've been there. :)
Hm, I'm not sure, but I remember a few debates/threads that argued the opposite (probably a matter of implementation).

Maybe I should have mentioned that my final scenes consist of about 50,000 - 200,000 patches (not the whole level, of course - I light only the nearest scene with radiosity). Surely some algorithms scale better with such numbers than others.

#6 Sunray   Members   -  Reputation: 188


Posted 24 May 2005 - 08:42 AM

Well, why does everybody persist (right word?) in using BSPs (kd-trees) for their raytracers then?

#7 VladR   Members   -  Reputation: 722


Posted 24 May 2005 - 09:23 AM

Quote:
Original post by Yann L
* Zero-overlap octrees will speed up your ray tracing a lot. Such a spatial partitioning scheme is absolutely vital for a tracing-based radiosity implementation. Tracing through the octree can be accelerated by recursive parametric methods if there is no overlap between the nodes. I use a slight variation of this algorithm, which makes the traversal extremely efficient. (Note that there is a typo in this paper at an important point that will make you go insane when trying to implement it - contact me if you opt to implement the algorithm.)

Thanks for the tip. I`ll take a look at the paper first.
BTW, how could the Visibility value (used in FormFactor equation) be other than 0.0f or 1.0f ? Because all I check is whether it`s visible or not. But how could I check it in more detail (other than shooting more rays into receiver patch) ? So, that for example, the visibility value would be some 0.35f ?


Quote:
Original post by Yann L
The directional tracing pattern can be optimized. You don't need to test against every other patch in the scene; only the patches visible within the hemisphere over the shooter patch need to be considered. Even there, probabilistic sampling can reduce the number of rays to a minimum, for example QMC (Quasi-Monte-Carlo) methods.

But what if I don't use the hemisphere for calculating form factors? How could the hemisphere help here in determining which patches are actually visible? By use of the octree?


Quote:
Original post by Yann L
Use subpatches and unified shooters, both with adaptive subdivision. This means that instead of shooting from each patch separately, you shoot the unified energy of a set of patches, and instead of shooting to each and every patch, you shoot to groups of patches. This can be done recursively if you arrange the receivers in a quad-tree-like structure. Basically, you record the energy gradient within each level of such a patch group, and subdivide only if the gradient exceeds some threshold.

I'm not sure I understand it correctly. Do you mean something like this: at the start of the level, each polygon (rectangle) is subdivided into, let's say, 10x10 patches, but instead of shooting each patch separately, I'd shoot a unified energy from each 2x2 group of patches, so the effect would be the same as if each rectangle were subdivided into 5x5 patches? And I could dynamically change the number of patches in each group, based on some threshold difference between neighbouring groups of patches?
Interesting idea! This might speed up the process several times!


Quote:
Original post by Yann L
You can use image-based methods to calculate the form factor, for example hemicubes. Sometimes this will speed up your calculations, sometimes not. Be aware that it can introduce artifacts into the solution. I personally don't like such approximations (I prefer raytracing-based FF determination), but it might be an option for you.

Do you mean something like Hugo Elias' method of rendering the scene from the point of view of the patch? If so, based on the old radiosity thread (started by Hellraizer), I decided it wasn't worth the trouble. Besides, the quality depends on the screen resolution, so if I wanted softer shadows I'd be out of luck even at 1600x1200. Currently I just change one parameter - the patch dimension - and the tessellator does the rest. And if I decide that some part of the scene needs more detail, I can also change it at run-time. That, I think, would be impossible with the above method, AFAIK.


Quote:
Original post by Yann L
These are just a few techniques typically used in progressive refinement radiosity systems. Many more exist if you need even more performance, but implementation complexity increases steeply.

I'd be glad if you could name a few (although currently I'm probably not going to spend more than a week on any complicated method). I'm sure others would profit from this knowledge as well (for the future).


One more thing. I'm currently using just the simple default exposure equation for converting radiosity values into RGB colour, but somehow I need to set the exposure parameter to a very high value (at least 1000, or more) to see anything. But I've read in many places that the exposure parameter is usually between 0.0f and 1.0f. You can't convert a value of e.g. 0.00000012f into the RGB range 0-255 with a low exposure value, though. Isn't that weird?
What's the next better method of colour conversion? I remember from that thread that you used 16 bands (because of the photorealism your customers needed). But is it possible to spot the difference with such a conversion method in regular game environments (dungeons and such)? If it is much different, it might be worth the trouble (BTW, how long do you estimate it would take to implement?).



#8 Yann L   Moderators   -  Reputation: 1798


Posted 24 May 2005 - 09:23 AM

Quote:
Original post by Sunray
Well, why does everybody persist (right word?) in using BSPs (kd-trees) for their raytracers then?

Define "everybody". In professional graphics software, binary AABB kd-trees are rarely used for shadow tracing purposes, due to their huge recursive depth complexity (and the overhead that comes with it). Octrees are much flatter (they require less recursion) while partitioning the same space. In other words, they offer the same spatial efficiency while being shallower. Read the paper I posted above; it has a direct performance comparison between parametric octree traversal and axis-aligned binary trees.


#9 Yann L   Moderators   -  Reputation: 1798


Posted 24 May 2005 - 09:45 AM

Quote:
Original post by VladR
Thanks for the tip, I'll take a look at the paper first.
BTW, how could the visibility value (used in the form factor equation) be anything other than 0.0f or 1.0f? All I check is whether the receiver is visible or not. But how could I check it in more detail (other than shooting more rays at the receiver patch), so that the visibility value could be, for example, 0.35f?

You mean H(ij)? It can become fractional with multisampling, i.e. shooting several rays between the same two patches (or patch groups). This gives you a kind of antialiasing, and becomes important when you want to create very high quality images.
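A sketch of what that multisampling could look like - the Patch layout, pointOnPatch and the occlusion query are stand-ins, not anyone's actual code:

```cpp
#include <cstdlib>

struct Vec3 { float x, y, z; };

// A quad patch spanned by an origin and two edge vectors.
struct Patch { Vec3 origin, edgeU, edgeV; };

struct OctreeNode;   // whatever acceleration structure the tracer uses
bool occluded(const OctreeNode* root, const Vec3& from, const Vec3& span);

static Vec3 pointOnPatch(const Patch& p, float s, float t)
{
    return { p.origin.x + s * p.edgeU.x + t * p.edgeV.x,
             p.origin.y + s * p.edgeU.y + t * p.edgeV.y,
             p.origin.z + s * p.edgeU.z + t * p.edgeV.z };
}

static float rnd() { return rand() / (float)RAND_MAX; }

// Fractional H_ij: fire n rays between random points on the two patches and
// return the fraction that get through. A half-shadowed pair then yields
// roughly 0.5 instead of the hard 0/1 of a single centre-to-centre ray.
float visibility(const OctreeNode* root, const Patch& a, const Patch& b, int n)
{
    int clear = 0;
    for (int i = 0; i < n; ++i) {
        Vec3 p = pointOnPatch(a, rnd(), rnd());
        Vec3 q = pointOnPatch(b, rnd(), rnd());
        Vec3 d = { q.x - p.x, q.y - p.y, q.z - p.z };
        if (!occluded(root, p, d)) ++clear;
    }
    return clear / (float)n;
}
```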

Quote:
Original post by VladR
But what if I don't use the hemisphere for calculating form factors? How could the hemisphere help here in determining which patches are actually visible? By use of the octree?

No, the octree just speeds up the raytracing process itself. You always use the hemisphere in the form factor calculation, even if you might not be aware of it :) It's simply encoded in the cos(theta_i) * cos(theta_j) term. By using probabilistic sampling (or other forms of path tracing, or even volume flux tracing), you can use that hemisphere directly. The basic idea of QMC sampling is as follows: instead of blindly shooting a ray at every possible other patch, you shoot rays in certain directions over the hemisphere and check whether they intersect another patch. The directions are not isotropic over the hemisphere, but selected by importance. This selection is done on probabilistic principles.
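As an illustration of the "selected by importance" part, here is one common QMC recipe - a Halton-style low-discrepancy sequence mapped to a cosine-weighted hemisphere. The specific sequence is an assumption for the sketch, not necessarily what any particular renderer uses:

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

// Radical-inverse (van der Corput) sequence: reflect the base-b digits of i
// around the radix point. The 2D points (radicalInverse(i,2),
// radicalInverse(i,3)) for i = 1,2,3,... cover the unit square far more
// evenly than rand() -- that even coverage is the "quasi" in Quasi-Monte-Carlo.
static float radicalInverse(unsigned i, unsigned base)
{
    float inv = 1.0f / base, f = inv, r = 0.0f;
    while (i) {
        r += (i % base) * f;
        i /= base;
        f *= inv;
    }
    return r;
}

// Map sample #i to a direction over the hemisphere spanned by the patch
// frame (tangent, bitangent, normal). The sqrt mapping makes the sample
// density proportional to cos(theta), so the directions that carry the most
// energy in the form factor are sampled most often -- importance sampling.
Vec3 cosineDirection(unsigned i, const Vec3& t, const Vec3& b, const Vec3& n)
{
    const float PI = 3.14159265f;
    float u1 = radicalInverse(i, 2);
    float u2 = radicalInverse(i, 3);
    float r   = std::sqrt(u1);
    float phi = 2.0f * PI * u2;
    float x = r * std::cos(phi);
    float y = r * std::sin(phi);
    float z = std::sqrt(std::max(0.0f, 1.0f - u1));   // = cos(theta)
    return { x * t.x + y * b.x + z * n.x,
             x * t.y + y * b.y + z * n.y,
             x * t.z + y * b.z + z * n.z };
}
```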

Quote:

I'm not sure I understand it correctly. Do you mean something like this: at the start of the level, each polygon (rectangle) is subdivided into, let's say, 10x10 patches, but instead of shooting each patch separately, I'd shoot a unified energy from each 2x2 group of patches, so the effect would be the same as if each rectangle were subdivided into 5x5 patches? And I could dynamically change the number of patches in each group, based on some threshold difference between neighbouring groups of patches?
Interesting idea! This might speed up the process several times!

Exactly. That's the idea of adaptive subdivision.
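Roughly, in code - gatherSample and subdivide are hypothetical placeholders for the renderer's own transfer and quadtree machinery:

```cpp
#include <algorithm>

// A receiver group in a quad-tree over a surface: it either takes energy as
// one unified patch or defers to its four children.
struct RecvNode {
    float energy;
    RecvNode* child[4];   // null until subdivided
};

float gatherSample(RecvNode* n, int quadrant);   // unified shot toward one quadrant
void  subdivide(RecvNode* n);                    // allocate the four children

void adaptiveGather(RecvNode* n, float threshold, int depthLeft)
{
    // Probe the transfer once per quadrant of this group.
    float e[4], lo = 1e30f, hi = -1e30f;
    for (int i = 0; i < 4; ++i) {
        e[i] = gatherSample(n, i);
        lo = std::min(lo, e[i]);
        hi = std::max(hi, e[i]);
    }
    if (depthLeft > 0 && hi - lo > threshold) {
        // Steep energy gradient (e.g. a shadow boundary crosses the group):
        // pay for the finer level, but only here.
        subdivide(n);
        for (int i = 0; i < 4; ++i)
            adaptiveGather(n->child[i], threshold, depthLeft - 1);
    } else {
        // Flat region: one unified value for the whole group is enough.
        n->energy = 0.25f * (e[0] + e[1] + e[2] + e[3]);
    }
}
```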

Quote:
Original post by VladR
Do you mean something like Hugo Elias' method of rendering the scene from the point of view of the patch?

Yep, although usually the method is a little more sophisticated than Hugo's hack. And usually you wouldn't use the screen resolution, but a high resolution offscreen buffer. For really high quality pictures this method is not yet suitable, as it requires floating point render targets - and those are slow. In the future, image-based methods may become more interesting.

Quote:
Original post by VladR
I'd be glad if you could name a few (although currently I'm probably not going to spend more than a week on any complicated method). I'm sure others would profit from this knowledge as well (for the future).

It wouldn't make much sense without knowing a lot more about the details of your implementation. In this older post I listed various papers on the subject. You might want to take a look at the one by Bernard Kwok. Even though it's a little dated, it contains an excellent overview of the basics and some advanced concepts.

Quote:

One more thing. I'm currently using just the simple default exposure equation for converting radiosity values into RGB colour, but somehow I need to set the exposure parameter to a very high value (at least 1000, or more) to see anything. But I've read in many places that the exposure parameter is usually between 0.0f and 1.0f.

That's wrong. Using the typical exponential exposure equation (1 - exp(-colour * exposure)), the exposure exponent is usually somewhere between 1 and 10, typically around 2 or 3.
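In code, the mapping is simply:

```cpp
#include <cmath>

// 1 - exp(-energy * exposure) compresses [0, inf) into [0, 1); with sensibly
// scaled input energies an exposure around 2-3 already gives a visible image.
unsigned char toDisplay(float energy, float exposure)
{
    float v = 1.0f - std::exp(-energy * exposure);
    return (unsigned char)(v * 255.0f + 0.5f);
}
```

If you need exposures in the thousands, the energies going in are almost certainly scaled far too low - see the numbers discussed further down the thread.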

Quote:

What's the next better method of colour conversion? I remember from that thread that you used 16 bands (because of the photorealism your customers needed). But is it possible to spot the difference with such a conversion method in regular game environments (dungeons and such)? If it is much different, it might be worth the trouble.

This does not impact the exposure. I implement my GI systems using an SPD (spectral power distribution) of usually 16 bands. But in the end the display needs RGB data, so this SPD is integrated back into an RGB triplet, and the well known exposure equation (and gamma correction) is applied. In your case, you would basically use a 3-band distribution (red, green and blue). Going to more selective wavelength bands is only useful for more precise light interaction between surfaces. It is required if you want perfectly photorealistic results (using spectral emission profiles for existing light sources, such as IESNA data), but it's not worth it for a game.
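A minimal sketch of that final integration step, with illustrative per-band weights (real ones would come from colour-matching functions):

```cpp
struct RGB { float r, g, b; };

// Collapse an n-band spectral power distribution to an RGB triplet by
// weighting each band's energy by its contribution to each channel.
// With 3 bands and identity weights this degenerates to plain R/G/B radiosity.
RGB integrateSPD(const float* band, const RGB* weight, int bands)
{
    RGB out = { 0.0f, 0.0f, 0.0f };
    for (int i = 0; i < bands; ++i) {
        out.r += band[i] * weight[i].r;
        out.g += band[i] * weight[i].g;
        out.b += band[i] * weight[i].b;
    }
    return out;   // exposure and gamma correction are applied to this triplet
}
```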

Quote:

(BTW, how long do you estimate it would take to implement it?).

Hard to say, it depends on how well you know the subject.

#10 davepermen   Members   -  Reputation: 1022


Posted 24 May 2005 - 10:57 AM

at least the ones that want to do realtime raytracing all use kd-trees, which are just another kind of BSP tree..

the stuff on www.openrt.de is all without any octree, and BSP only - with axis-aligned planes, of course..

the stuff we've seen on flipcode, too.. no octrees.
If that's not the help you're after then you're going to have to explain the problem better than what you have. - joanusdmentia

My Page davepermen.net | My Music on Bandcamp and on Soundcloud


#11 Yann L   Moderators   -  Reputation: 1798


Posted 24 May 2005 - 11:17 AM

Quote:
Original post by davepermen
at least the ones that want to do realtime raytracing all use kd-trees, which are just another kind of BSP tree..

the stuff on www.openrt.de is all without any octree, and BSP only - with axis-aligned planes, of course..

the stuff we've seen on flipcode, too.. no octrees.

Well, I don't know about their internal structures, but according to our tests (and we have done a lot of them, since performance was one of our topmost goals), tracing shadows (i.e. simple intersection tracing without needing to know the first intersection point) using octrees with optimized traversal is easily 50% faster than non-overlapping binary AABB trees (everything with hand-optimized SIMD ASM, of course, partially processing multiple rays simultaneously). Although it could depend on the type of geometry you use.


#12 ajas95   Members   -  Reputation: 759


Posted 24 May 2005 - 05:22 PM

Yann, while your tests showed a 50% gain for octrees over a hand-optimized kd-tree, you have to also realize that the heuristic for choosing the split-plane position of a kd node has proven to have dramatic effects on traversal time.

Plain vanilla SAH was an order of magnitude better than median splits, but beyond that there are other clever heuristics that offer even greater benefits in the quality of the tree: early isolation of empty space, double-splits, and even optimizing the structure of the tree with adaptive termination criteria.
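For readers who haven't met it: the SAH scores a candidate split by the chance that a random ray visits each side (proportional to surface area) times the work done there. A rough sketch:

```cpp
// Cost of splitting a kd node: one traversal step, plus each side's triangle
// count weighted by the probability (surface area ratio) that a ray entering
// the parent also enters that side. Evaluate this over the candidate planes,
// keep the cheapest, and make a leaf if no split beats the cost of not
// splitting at all.
float sahCost(float saLeft, float saRight, float saParent,
              int trisLeft, int trisRight,
              float costTraverse, float costIntersect)
{
    return costTraverse
         + costIntersect * (saLeft  / saParent) * trisLeft
         + costIntersect * (saRight / saParent) * trisRight;
}
```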

Now, maybe you were doing all this. But you talked only about optimizing traversal, when really that's just an incremental improvement, less significant than split-heuristic improvements.

And obviously, in-order traversal is one of the great things about kd-trees, so if you don't need that, then... I'm curious now. It seems like octree traversal would be very similar to kd-tree traversal, just combining three binary splits into one 8-way split step. Is it significantly different?

#13 VladR   Members   -  Reputation: 722


Posted 24 May 2005 - 09:15 PM

Quote:
Original post by Yann L
You mean H(ij)? It can become fractional with multisampling, i.e. shooting several rays between the same two patches (or patch groups). This gives you a kind of antialiasing, and becomes important when you want to create very high quality images.
The same antialiasing could also be achieved if the patch resolution were higher, right? But it would probably be a better idea to subdivide just the patches at the shadow boundary (using some threshold on the difference between two neighbouring patches), so that we don't lose precious time on non-shadowed patches, right?

Quote:
Original post by Yann L
No, the octree just speeds up the raytracing process itself. You always use the hemisphere in the form factor calculation, even if you might not be aware of it :) It's simply encoded in the cos(theta_i) * cos(theta_j) term.
Yeah, I had already forgotten that my classical equation, F_ij = (cos(theta_i) * cos(theta_j)) / (pi * r^2) * H_ij * dA_j, is derived from the hemisphere.
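Written out as code, that equation is a short function (normals assumed unit length; H_ij comes from the visibility test; the helper names are illustrative):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3  sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Point-to-point form factor between patch i and patch j:
//   F_ij = (cos(theta_i) * cos(theta_j)) / (pi * r^2) * H_ij * dA_j
// The two cosines are exactly the hemisphere term discussed above.
float formFactor(const Vec3& pi, const Vec3& ni,
                 const Vec3& pj, const Vec3& nj,
                 float Hij, float dAj)
{
    const float PI = 3.14159265f;
    Vec3 d = sub(pj, pi);                 // from i to j
    float r2 = dot(d, d);
    if (r2 < 1e-12f) return 0.0f;
    float invR = 1.0f / std::sqrt(r2);
    float cosI = dot(ni, d) * invR;       // angle at the shooter
    float cosJ = -dot(nj, d) * invR;      // angle at the receiver
    if (cosI <= 0.0f || cosJ <= 0.0f) return 0.0f;   // facing away: no transfer
    return (cosI * cosJ) / (PI * r2) * Hij * dAj;
}
```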

Quote:
Original post by Yann L
By using probabilistic sampling (or other forms of path tracing, or even volume flux tracing), you can use that hemisphere directly. The basic idea of QMC sampling is as follows: instead of blindly shooting a ray at every possible other patch, you shoot rays in certain directions over the hemisphere and check whether they intersect another patch. The directions are not isotropic over the hemisphere, but selected by importance. This selection is done on probabilistic principles.
Interesting idea - I'll google how it's done and what the theory behind it is.

Quote:
Original post by Yann L
It wouldn't make much sense without knowing a lot more about the details of your implementation. In this older post I listed various papers on the subject. You might want to take a look at the one by Bernard Kwok. Even though it's a little dated, it contains an excellent overview of the basics and some advanced concepts.
Well, I haven't had any luck googling that paper. Maybe someone has a direct link to it?
As for the details of my implementation, it's a basic progressive refinement shooting algorithm (a rough code skeleton follows the list):
1. Divide the scene (including lights) into patches (quads). I keep 3 energy values per patch (R, G, B), not limited to the range 0-255.
2. Assign energy to the lights (usually around 1000). With higher patch tessellation the overall brightness decreases, because I take the patch area into account in the form factor equation, so the smaller the patch, the less light gets distributed. But this is easily fixed with a higher exposure (at least I haven't yet noticed that a smaller patch with the same energy as a bigger one lowers the image quality).
3. Pick the patch with the highest undistributed energy [repeat for R/G/B].
4. Distribute it to all other patches in the scene [per R/G/B channel from the step above].
5. Repeat steps 3-4 until the largest remaining undistributed energy is below 1% (or some other constant) of the very first, brightest shooter.
6. Apply exposure.
7. Enjoy the rendering, or curse the wrongly set up lights.
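A rough skeleton of steps 3-5, with distributeEnergy standing in for the actual form factor / visibility / bookkeeping work:

```cpp
#include <algorithm>
#include <vector>

// Patch fields are placeholders for the renderer's own versions.
struct Patch {
    float unshot[3];     // undistributed R,G,B energy
    float total[3];      // accumulated radiosity
    float area;
};

// One shooter -> all receivers transfer (step 4): form factor, visibility
// and receiver bookkeeping live inside.
void distributeEnergy(Patch& shooter, std::vector<Patch>& patches);

void progressiveRefinement(std::vector<Patch>& patches, float stopFraction /* e.g. 0.01 */)
{
    // Remember the brightest initial shooter for the termination test (step 5).
    float first = 0.0f;
    for (const Patch& p : patches)
        first = std::max(first, p.unshot[0] + p.unshot[1] + p.unshot[2]);

    for (;;) {
        // Step 3: pick the patch with the most undistributed energy.
        Patch* shooter = nullptr;
        float best = 0.0f;
        for (Patch& p : patches) {
            float e = p.unshot[0] + p.unshot[1] + p.unshot[2];
            if (e > best) { best = e; shooter = &p; }
        }
        // Step 5: stop once the best shooter is below 1% of the first one.
        if (!shooter || best < stopFraction * first) break;
        // Step 4: shoot, then mark this patch's energy as spent.
        distributeEnergy(*shooter, patches);
        shooter->unshot[0] = shooter->unshot[1] = shooter->unshot[2] = 0.0f;
    }
}
```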

Quote:
Original post by Yann L
That's wrong. Using the typical exponential exposure equation (1 - exp(-colour * exposure)), the exposure exponent is usually somewhere between 1 and 10, typically around 2 or 3.
I use this exposure function : (1.0f - exp (-light * K)) * 255.0f
I can`t see anything if K is 1.0 or 2.0. Doeasn`t matter if it`s one pass or 1000 passes. But values of 1.000 start showing the picture. Maybe I should assign higher energy values to lights (currently about 1000), so that the patches receive more energy - what I mean is that I`m not sure, if something doesn`t get lost this way because of floating point precision. If lights had higher intensity (100.000 for example), final radiosity values should be also higher than mine (around 0.00000014534534f) and then I might not need exposure values of 1000, right ?

Quote:
Original post by Yann L
This does not impact the exposure. I implement my GI systems using an SPD (spectral power distribution) of usually 16 bands. But in the end the display needs RGB data, so this SPD is integrated back into an RGB triplet, and the well known exposure equation (and gamma correction) is applied. In your case, you would basically use a 3-band distribution (red, green and blue). Going to more selective wavelength bands is only useful for more precise light interaction between surfaces. It is required if you want perfectly photorealistic results (using spectral emission profiles for existing light sources, such as IESNA data), but it's not worth it for a game.
I see. If you say it's not worth it, I definitely believe you. Besides, one of the games we're working on needs excessively colourful scenes (many different bright lights) that have nothing to do with reality - they're just pretty to look at. Still, I'll google some papers on SPDs, in case I have time for this in the future.

EDIT: In that old thread with the rendering of the room with two lights, you mentioned something like this: "The algorithm used was multi-spectral PR with transfer tracking, using an initial HDRI shooting pass for the atmospheric light." Could you please explain what that initial HDRI shooting pass actually means? Is it some kind of light with a very high energy value?



