Acceleration structures for photon mapping on a GPU

Started by
6 comments, last by Pragma 12 years ago
I'm working with OpenCL and I've been trying to figure this out for a few days now and I've got nothing and don't quite understand how others do it. My main hang-up is this: if I store my photons like I do right now in a completely unsorted, 1-dimensional array (which is horribly slow) then I can define that array statically at program startup with something like SCENE_MAX_PHOTONS as the array size. That way it doesn't matter how dense the photons in specific areas are as long as the total number does not exceed the array limits. Now I want to implement some sort of acceleration structure (kd-tree, uniform grids, spatial hashing, etc.) but I don't see how with the static requirements in OpenCL this is possible at all. For example, if I had a uniform grid that I defined as having resolution 256x256x256, each grid cell would have to have a statically defined container for all the photons that land within it...but that means that there's an upper bound on how many photons can be in one area that isn't the same as the maximum number of photons allowed in the scene like it should be.

The same problem comes up with kd-trees when I try to think of them. Getting around the lack of recursion is tricky but not impossible for implementing one, but at the end of the day you're left with a leaf-node that's got to contain some of the photons, and that size has to be defined statically at the time that the kd-tree is built.

The only way I can think of to resolve this is to store photons as a linked-list structure at each leaf / grid cell / bucket. But there's no way that's going to be even remotely efficient on the GPU...right?
Advertisement
You can make a KD tree structure with fixed storage requirements if you use a balanced KD tree. This is well described in Henrik Wann Jensen's book (it's probably in his papers too if you don't have the book). The key is that when you split your photons, you always split on the median (so that there are an equal number of photons left and right branches of the tree). The resulting KD tree is a complete binary tree, and its structure is known completely beforehand. As a result, you don't need any variable size storage, or even pointers.
"Math is hard" -Barbie
But how does that eliminate the problem IN the leaf nodes of the tree? I know how much space the tree itself takes up, sure, but I don't know how many photons are contained between each leaf. Leaf A may have 2 photons stored within it, but leaf G may have 200. Since I can't allocate the memory dynamically within the kernel, I have to define something like LEAF_MAX_PHOTONS and use that in the node struct itself. If that max is 300 photons per leaf, and leaf L has 500 photons I'm losing information. Or am I not understanding your meaning?
You don't store the photons only in the leaf nodes. Every node of the tree IS a photon - i.e. each node of the tree has exactly one photon. The number of nodes in your tree is equal to SCENE_MAX_PHOTONS and you can store them in a preallocated array.
"Math is hard" -Barbie

You don't store the photons only in the leaf nodes. Every node of the tree IS a photon - i.e. each node of the tree has exactly one photon. The number of nodes in your tree is equal to SCENE_MAX_PHOTONS and you can store them in a preallocated array.


Okay I think I understand. It seems I was confusing kD trees with Octrees and hadn't realized that there's a difference in what nodes actually represent. Thanks!

I'm working on an implementation as you suggested right now, but I'm not entirely sure how to build the kD-Tree within OpenCL since I don't think there's atomic operations I can utilize to set left / right children =/
I'm not really familiar with OpenCL, but in my implementation I just sorted all the photons along one axis (say the x axis). Then the middle photon (call it n) becomes the root of the tree. You can then recurse, sorting 0..n-1 and n+1 .. N-1 along the next axis (in this case y). That way the whole KD Tree building algorithm reduces to sorting, which presumably has some fast OpenCL implementations.
"Math is hard" -Barbie

I'm not really familiar with OpenCL, but in my implementation I just sorted all the photons along one axis (say the x axis). Then the middle photon (call it n) becomes the root of the tree. You can then recurse, sorting 0..n-1 and n+1 .. N-1 along the next axis (in this case y). That way the whole KD Tree building algorithm reduces to sorting, which presumably has some fast OpenCL implementations.


Yeah, unfortunately sorting is pretty complex to implement on OpenCL (not impossible though) and recursion is impossible. You can simulate it with a stack, but that's the kind of stuff I was hoping to avoid. I'm actually looking toward http://www.peterw.eu/photon-flipping.pdf instead. On page 43 he details a spatial hashing solution that exchanges high memory usage for not having to sort at all. It's actually really clever and seems to cater to the kind of work I'm doing at the moment nicely. I might still go with a kD-tree solution however, I just haven't made up my mind yet ;)
I said you recurse on the two subtrees, but actually no recursion is necessary when building the tree, since you know in advance which regions you will have to sort at each level. Though I suppose you do have to do some kind of recursion (i.e. have a stack) when you traverse the tree. But usually you need some sort of local data structure for collecting the photons anyway, so I don't see why adding a small stack (of depth log(SCENE_MAX_PHOTONS)) would be difficult.
"Math is hard" -Barbie

This topic is closed to new replies.

Advertisement