Max_Payne

Distributed Photon Mapping


I plan on programming a distributed photon mapper and I am wondering about the strategy to use. For one, it seems like it would be more complicated to program distributed photon mapping than distributed path tracing because of memory consumption. Obviously, unless I want each node to have tons of RAM, I can only trace a relatively small number of photons per node, say one million photons per node. However, just making each node render a part of the image seems inefficient. I am wondering if it would work if I simply had each node trace 1,000,000 photons, then rendered small chunks of the image, but had different nodes re-render the same sections and "average" the results. I am wondering whether, by averaging the results of tracing and rendering with low photon densities several times, I would obtain convergence towards the desired result. Any thoughts?

Guest Anonymous Poster
http://www.qarl.com/menu/class/cs323_fl98/mrmanual_2.0/

Quote:
Original post by Anonymous Poster
http://www.qarl.com/menu/class/cs323_fl98/mrmanual_2.0/


Is that supposed to be useful in answering my question in any way (if so, please specify how), or is it some kind of lame joke along the lines of "Don't waste your time programming something that has already been done"... Because if it is the latter, we might as well all sit on our asses and do nothing, eh?

Guest Anonymous Poster
It's more along the lines of "here is an example of a system that does *exactly* what you describe. The intro sheds some light on how to split the jobs across multiple CPUs and how the data is shared across machines. The API manual shows how the data is specified and how photon shaders interact with the scene, along with a description of the relevant data structures. Maybe you will find useful info in there".

Quote:
Original post by Anonymous Poster
It's more along the lines of "here is an example of a system that does *exactly* what you describe. The intro sheds some light on how to split the jobs across multiple CPUs and how the data is shared across machines. The API manual shows how the data is specified and how photon shaders interact with the scene, along with a description of the relevant data structures. Maybe you will find useful info in there".


Ahhh. That's clearer. Thank you.

Here's the strategy that I'd personally use in such a case:

- Appoint one node as a "master" node
- Master sends a copy of the scene geometry to each slave node
- The user requests n photons to be traced in the scene
- Each node then traces (n / number_of_nodes) photons via its normal strategy
- After all photons are traced, the result is sent to the master node
- Master node constructs final kd-tree and distributes it to each slave node
- Each node then traces its share of the camera rays
- As nodes finish they send their results to the master for compilation

That should strike a decent balance between simplicity and efficiency. You may be able to get a slightly faster result by allowing each slave node to construct its own kd-tree and then merging each tree on the master, but I suspect this would prove complicated and difficult to do quickly.
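For what it's worth, here's a minimal single-process sketch of that phase ordering. Everything in it is invented for illustration (the Photon/KdTree/Slave types are stubs, and the "network" is just function calls), but it shows how the three phases hand data around:

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Hypothetical stand-ins for a real renderer's data structures.
struct Photon { float pos[3], dir[3], power[3]; };
struct KdTree { std::vector<Photon> photons; };    // placeholder for the balanced photon map
struct Pixel  { float r, g, b; };

// One "slave" simulated in-process; a real version would run on another machine
// and the vectors below would travel over the network instead.
struct Slave {
    std::vector<Photon> tracePhotons(int n) {
        return std::vector<Photon>(n);               // stub: emit n photons from the lights
    }
    std::vector<Pixel> renderRows(int firstRow, int rowCount, int width, const KdTree&) {
        (void)firstRow;
        return std::vector<Pixel>(rowCount * width); // stub: trace camera rays, gather from map
    }
};

int main() {
    const int numSlaves = 4, totalPhotons = 1000000, width = 640, height = 480;
    std::vector<Slave> slaves(numSlaves);

    // Phase 1: each slave traces its share of the photons and "sends" them to the master.
    std::vector<Photon> allPhotons;
    for (Slave& s : slaves) {
        std::vector<Photon> part = s.tracePhotons(totalPhotons / numSlaves);
        allPhotons.insert(allPhotons.end(), part.begin(), part.end());
    }

    // Phase 2: the master builds the global kd-tree and broadcasts it back.
    KdTree map{allPhotons};                          // stub: would balance the tree here

    // Phase 3: each slave renders a horizontal band of the image using the shared map.
    std::vector<Pixel> frame(width * height);
    const int rowsPerSlave = height / numSlaves;
    for (int i = 0; i < numSlaves; ++i) {
        std::vector<Pixel> band = slaves[i].renderRows(i * rowsPerSlave, rowsPerSlave, width, map);
        std::copy(band.begin(), band.end(), frame.begin() + i * rowsPerSlave * width);
    }
    std::printf("rendered %zu pixels from %zu photons\n", frame.size(), map.photons.size());
}
```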


One thing to consider is that in the general case you cannot guarantee that all nodes will take the same amount of time to render. If you find that the master node spends a lot of time (i.e. more than a few hundred ms per phase) waiting on slaves to complete their jobs, you can try a couple of tricks. First, build the engine in asynchronous phases, so that each node can continue working on new rendering tasks even if some nodes are behind. This requires some planning and thought ahead of time, but for any non-trivial scene, or with different hardware in each node, it can be a major boost. The less time you spend idle waiting for network traffic, the better.
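To make that concrete, the simplest version of the idea is a shared work queue instead of a fixed split, so fast nodes just keep pulling tiles while slow ones fall behind harmlessly. A rough single-machine sketch (the tile count, worker count, and renderTile stub are all made up):

```cpp
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

// Toy stand-in for "render one tile of the image"; a real worker would trace rays here.
static void renderTile(int tile) { (void)tile; }

int main() {
    const int numTiles = 256, numWorkers = 4;
    std::atomic<int> nextTile{0};                 // shared counter acts as the work queue

    auto worker = [&](int id) {
        int done = 0;
        // Each worker claims the next unrendered tile; nobody sits idle waiting for a
        // fixed, pre-assigned share to be finished somewhere else.
        for (int t; (t = nextTile.fetch_add(1)) < numTiles; ++done)
            renderTile(t);
        std::printf("worker %d rendered %d tiles\n", id, done);
    };

    std::vector<std::thread> pool;
    for (int i = 0; i < numWorkers; ++i) pool.emplace_back(worker, i);
    for (std::thread& th : pool) th.join();
}
```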

The second trick you can do is to weight the amount of work you assign each node based on its past performance. For instance, if you notice that node 3 consistently takes 20% longer than nodes 1 and 2 to finish rendering the same number of rays, give it about 83% of the share that each of the other nodes gets (1 / 1.2 ≈ 0.83). You can track this per session or keep data in a persistent file over multiple renders to "teach" the system the most efficient way to distribute work. If you work with nontrivial scenes (i.e. taking several minutes to render) and you have machines with lots of different specs in the cluster, this can end up paying off quite a bit in the long run.
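The weighting itself is just "work proportional to measured speed". A tiny sketch with made-up timings:

```cpp
#include <cstdio>
#include <vector>

int main() {
    // Seconds each node took to finish the same benchmark batch (invented numbers;
    // node 3 is the one that runs 20% slower than the other two).
    std::vector<double> lastBatchSeconds = {10.0, 10.0, 12.0};
    const long totalRays = 3000000;

    // Speed is 1/time; hand each node a share of rays proportional to its speed,
    // so node 3 ends up with roughly 83% of what each faster node gets.
    double totalSpeed = 0.0;
    for (double t : lastBatchSeconds) totalSpeed += 1.0 / t;

    for (int i = 0; i < (int)lastBatchSeconds.size(); ++i) {
        const double share = (1.0 / lastBatchSeconds[i]) / totalSpeed;
        std::printf("node %d: %4.1f%% of the rays (%ld rays)\n",
                    i + 1, 100.0 * share, (long)(share * totalRays));
    }
}
```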


The problem with averaging results is that they may not converge to the correct solution. You will lose high-frequency details, and lighting boundaries will become very blurred. This is especially prohibitive if you want to render caustics, as the final result will not be sharp enough to look good. There is also a good chance that the averaging routine will "invent" incorrect illumination details by statistically overemphasizing/underemphasizing certain areas. This will lead to very bad results in most scenes. Obtaining a good result from several partial averages is a tricky procedure and requires very carefully designed kernel density estimators and filters, which take quite a bit of time and effort (not to mention arcane statistics and calculus knowledge) to perfect. You'll have a much easier time getting good results with a more fine-grained division of labor (i.e. give each node individual photons/rays to deal with).

Quote:
Original post by ApochPiQ
Here's the strategy that I'd personally use in such a case:

- Appoint one node as a "master" node
- Master sends a copy of the scene geometry to each slave node
- The user requests n photons to be traced in the scene
- Each node then traces (n / number_of_nodes) photons via its normal strategy
- After all photons are traced, the result is sent to the master node
- Master node constructs final kd-tree and distributes it to each slave node
- Each node then traces its share of the camera rays
- As nodes finish they send their results to the master for compilation

That should strike a decent balance between simplicity and efficiency. You may be able to get a slightly faster result by allowing each slave node to construct its own kd-tree and then merging each tree on the master, but I suspect this would prove complicated and difficult to do quickly.


What you proposed is exactly what I'm trying to avoid. Sending the geometry we don't really have a choice about; it has to be done. However, sending the result of the photon tracing seems very wasteful. Tracing photons is relatively fast, and the photon map can grow to enormous sizes.

For one million photons (a reasonable number), we are talking about 28-50 megs of data. This is obviously too large to be worth sending back and forth over a broadband connection, especially if each node traces that many photons. There is also another problem: not all nodes will necessarily have huge amounts of RAM. I would like my program to require no more than 256 MB of RAM, and using any kind of swap file is out of the question (performance would suffer massively).
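For reference, the low end of that figure lines up with the compact photon layout from Jensen's photon mapping book, assuming that is roughly the representation in mind; a quick size check:

```cpp
#include <cstdio>

// Compact photon layout along the lines of the one in Jensen's
// "Realistic Image Synthesis Using Photon Mapping" -- about 28 bytes per photon.
struct CompactPhoton {
    float         pos[3];      // 12 bytes: hit position
    short         plane;       //  2 bytes: kd-tree splitting plane
    unsigned char theta, phi;  //  2 bytes: quantized incoming direction
    float         power[3];    // 12 bytes: photon power (RGB)
};

int main() {
    const long n = 1000000;
    std::printf("%ld photons x %zu bytes = %.1f MB\n",
                n, sizeof(CompactPhoton), n * (double)sizeof(CompactPhoton) / (1 << 20));
}
```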

However, even with such tight requirements on RAM, I would like it to be possible to trace a billion photons. That is basically why I am asking. Let me reformulate the question another way; I think the approach would work, but it might seem awkward.

Suppose you have one computer running a simple two-pass photon mapper. This photon mapper traces 1 million photons in each photon-tracing pass and renders the entire frame buffer from the photon map. However, instead of directly writing an image file, it adds the contribution of each rendered frame buffer to a persistent frame buffer, so every rendering pass simply accumulates into an already existing buffer. I am wondering whether doing this photon-tracing/rendering cycle 100 times, with 1 million photons per pass, and averaging the result would be equivalent to doing one cycle that traces 100 million photons.
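A minimal sketch of that accumulation loop, assuming the existing two-pass mapper is hidden behind the two stub functions below:

```cpp
#include <cstdio>
#include <vector>

struct Pixel { float r = 0, g = 0, b = 0; };

// Placeholders for the existing two-pass photon mapper.
static void tracePhotonPass(int photonsPerPass) { (void)photonsPerPass; }
static std::vector<Pixel> renderFromPhotonMap(int width, int height) {
    return std::vector<Pixel>(width * height);    // stub: estimate radiance per pixel
}

int main() {
    const int width = 640, height = 480;
    const int passes = 100, photonsPerPass = 1000000;

    std::vector<Pixel> accum(width * height);     // persistent frame buffer

    for (int p = 0; p < passes; ++p) {
        tracePhotonPass(photonsPerPass);          // fresh 1M-photon map for this pass
        std::vector<Pixel> frame = renderFromPhotonMap(width, height);
        for (std::size_t i = 0; i < accum.size(); ++i) {   // add this pass's contribution
            accum[i].r += frame[i].r;
            accum[i].g += frame[i].g;
            accum[i].b += frame[i].b;
        }
    }
    for (Pixel& px : accum) { px.r /= passes; px.g /= passes; px.b /= passes; }  // average
    std::printf("averaged %d passes of %d photons each\n", passes, photonsPerPass);
}
```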

If there is indeed convergence by doing this, then I could trace between 500,000 and 2 million photons per node, and have lots of nodes work on the same image over the net, effectively tracing billions of photons across the whole scene. Individually, each work unit (a small rectangular portion of the render) might be blurry, but I am hoping that by averaging them all out, the result will converge towards "the solution".

Disclaimer: I've never written a photon mapping ray tracer (just a normal one), and I've never written a program that uses distributed computing principles, so this may be bollocks ;)

I think getting each slave to trace some photons is a good idea; how many per slave is the main issue. If you want to keep the data transfer small, trace smaller numbers at the slaves and have the master do a larger portion of the photon tracing, since it doesn't take an amazingly long time.

However, sending the entire photon map back to the clients is then a big problem, as you say, if it grows in size... plus you might have a high-resolution caustics map if you're implementing that, so even more data. I don't know whether your convergence idea would work; I'm not up on the theory.

I came up with a couple of other options, maybe:

Slaves could query the master when they need a photon map lookup. The master has to deal with a lot of requests this way, but at least you're only sending relevant data to the slave.
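If you went that route, the messages could be as small as a hit point plus a returned irradiance estimate. A rough sketch of what the request/response might look like (the struct layout and the batching are made up for illustration, not something from the thread):

```cpp
#include <cstdio>
#include <vector>

// Hypothetical wire format for the "slave asks master for a radiance estimate" idea.
struct LookupRequest  { float position[3], normal[3]; int maxPhotons; };
struct LookupResponse { float irradiance[3]; };

// Stand-in for the master's kd-tree query; a real one would gather the nearest photons.
static LookupResponse estimateIrradiance(const LookupRequest& req) {
    (void)req;
    return LookupResponse{{0.f, 0.f, 0.f}};
}

int main() {
    // A slave would collect the hit points it needs estimates for and send them in one
    // batched message, rather than one round trip per shading point.
    std::vector<LookupRequest> batch(64, LookupRequest{{0, 0, 0}, {0, 1, 0}, 50});

    std::vector<LookupResponse> replies;
    replies.reserve(batch.size());
    for (const LookupRequest& req : batch)        // master answers each query from its map
        replies.push_back(estimateIrradiance(req));

    std::printf("answered %zu lookups\n", replies.size());
}
```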

Slaves could (I'm not exactly sure how this would be done) compute as much of the ray tracing as possible, without the photon map. I don't know how well this would work, as it would require sending more data back to the master than just a single colour per pixel rendered, but it might work out.

-Mezz

I fail to see why the averaging thing wouldn't work. I just don't see why any of the problems mentioned would pop up.

Quote:
Original post by Eelco
I fail to see why the averaging thing wouldn't work. I just don't see why any of the problems mentioned would pop up.


Yeah. Nobody mentioned any problems. I'm just no expert at the mathematical theory behind this. I guess I will just try it and find out. The whole point of this strategy is to avoid sending the photon map at all, and do all the rendering on the client side.
