Closest Point using KdTree

12 comments, last by pukas 18 years, 9 months ago
Hi! I'm wondering if there is an exact algorithm for finding the point in a kd-tree that is closest to any given query point. I have searched the internet for quite a while now, but I haven't really found anything useful!
If I remember right, the closest neighbor is either the parent or a leaf node.

As you increase your search radius, you just follow the tree further.

Try a search for "closest neighbor" + "kd-tree"
You may also want to read up on Jensen's radiance estimate from a photon map. The key algorithm finds the k nearest neighbours in a kd-tree. If you set k=1, you get what you want.

The basic idea (for k=1) is to keep a variable holding the smallest distance found so far. As you traverse the tree you update that variable, and you discard branches whose entire volume is farther away than the current smallest distance. The algorithm runs faster if you are careful to visit the branches in a reasonable order (closest first).
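The approach described above can be sketched as follows. This is an illustrative, hypothetical implementation (the node layout and names are my own, not code from Jensen's book): each node stores a point and a splitting axis, the best squared distance so far is threaded through the recursion, and the far side of a split is only visited when the splitting plane is closer than the best distance found so far.

```python
import math

class Node:
    def __init__(self, point, axis, left=None, right=None):
        self.point = point  # the point stored at this node
        self.axis = axis    # splitting dimension at this node
        self.left = left
        self.right = right

def nearest(node, query, best=None, best_d2=math.inf):
    """Return (best_point, squared_distance) for the nearest neighbour of `query`."""
    if node is None:
        return best, best_d2
    # Update the best-so-far with this node's own point.
    d2 = sum((a - b) ** 2 for a, b in zip(node.point, query))
    if d2 < best_d2:
        best, best_d2 = node.point, d2
    # Visit the side of the split containing the query first ("closest first").
    delta = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if delta < 0 else (node.right, node.left)
    best, best_d2 = nearest(near, query, best, best_d2)
    # Only descend into the far side if it could still hold a closer point.
    if delta * delta < best_d2:
        best, best_d2 = nearest(far, query, best, best_d2)
    return best, best_d2
```

Note that the culling test here is the plane-distance test discussed later in this thread; only the pruning criterion changes between the two variants being debated.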

Quote:Original post by alvaro
You may also want to read on Jensen's radiance estimate from a photon map. The key algorithm finds the k nearest neighbours in a kd-tree.


Note that the nearest-neighbour code given in Wann Jensen's photon mapping book is 'broken.' That is, it works, but it's an inefficient implementation. When trying to cull nodes that are farther away than the (k-th) smallest distance, he only considers the distance of the query point to the splitting plane of the current node. The correct approach is to consider the distance of the query point to the axis-aligned bounding box of the root of the subtree you're trying to cull. The latter has much better culling power.
Quote:Original post by Christer Ericson
Quote:Original post by alvaro
You may also want to read on Jensen's radiance estimate from a photon map. The key algorithm finds the k nearest neighbours in a kd-tree.


Note that the nearest-neighbour code given in Wann Jensen's photon mapping book is 'broken.' That is, it works, but it's an inefficient implementation. When trying to cull nodes that are farther away than the (k-th) smallest distance, he only considers the distance of the query point to the splitting plane of the current node. The correct approach is to consider the distance of the query point to the axis-aligned bounding box of the root of the subtree you're trying to cull. The latter has much better culling power.


How can Jensen's method be broken if it works? Also, what makes your heuristic better than his, when his works and leads to the same results?



Quote:How can Jensen's method be broken if it works
The quotation marks I put around 'broken' indicate that I used the word in a non-literal sense. My post stated in what sense it is broken: efficiency.

Quote:also, what makes your heuristic better than his, when his works and leads to the same results?
Jensen's criterion for culling subtrees is inefficient. To see the difference, consider an axis-aligned bounding box B describing the volume represented by some subtree of the k-d tree. For a query point P, it can happen that P lies far away from B while at the same time P lies near a plane that cuts through a face of B.

Well, Jensen's code culls subtrees based on a plane test, so in the latter situation it will descend into the subtree because P is near the plane. The alternative approach I'm describing (which isn't mine, btw; it's how you traditionally do nearest-neighbor queries in a k-d tree, as described by Friedman et al. in 1975) directly determines that P is far from the AABB and can therefore cull the subtree from further search without descending into it.

Jensen's code will ultimately give the same result as the traditional solution, but it needlessly descends into subtrees that could have been culled earlier (that is, Jensen's code will cull them eventually, just not right away), and is therefore slower.

I hope this makes the distinction clear.
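To make the distinction concrete, here is a small made-up numeric example (the box, splitting plane, and query point are hypothetical, chosen only to illustrate the situation described above): P lies far above box B but almost exactly on the x-coordinate of B's splitting plane, so the plane test cannot cull the subtree while the point-to-AABB test can.

```python
import math

# Subtree volume B: axis-aligned box [0,1] x [0,1]; its splitting plane is x = 0.5.
# Query point P lies far above B but almost on the plane's x-coordinate.
P = (0.55, 10.0)

# Plane-distance test (the criterion used in Jensen's code):
plane_d = abs(P[0] - 0.5)        # ~0.05: looks "close", so the subtree is visited

# Point-to-AABB distance: clamp P to the box, then measure.
lo, hi = (0.0, 0.0), (1.0, 1.0)
closest = tuple(min(max(p, l), h) for p, l, h in zip(P, lo, hi))
aabb_d = math.dist(P, closest)   # 9.0: clearly far, so the subtree can be culled

best_so_far = 1.0                # hypothetical current smallest distance
assert plane_d < best_so_far     # plane test fails to cull the subtree
assert aabb_d > best_so_far      # AABB test culls it immediately
```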
Quote:Original post by Christer Ericson
Jensen's criterion for culling subtrees is inefficient. To see the difference, consider an axis-aligned bounding box B describing the volume represented by some subtree of the k-d tree. For a query point P, it can happen that P lies far away from B while at the same time P lies near a plane that cuts through a face of B.

Well, Jensen's code culls subtrees based on a plane test, so in the latter situation it will descend into the subtree because P is near the plane. The alternative approach I'm describing (which isn't mine, btw; it's how you traditionally do nearest-neighbor queries in a k-d tree, as described by Friedman et al. in 1975) directly determines that P is far from the AABB and can therefore cull the subtree from further search without descending into it.

Jensen's code will ultimately give the same result as the traditional solution, but it needlessly descends into subtrees that could have been culled earlier (that is, Jensen's code will cull them eventually only not right away), and is therefore slower.

I hope this makes the distinction clear.

Actually, I believe Jensen's algorithm is more efficient.
The test for checking the distance to a plane is simpler and faster than the test against an AABB. Given that all internal spaces in a kd-tree are closed spaces, for each unique space there will be at most three tests; however, since these spaces are adjacent and share one plane, the average number of tests comes down to one per closed space, statistically.

In my opinion, your pseudo-proof of why Jensen's method is inefficient describes an approach that is not only the more inefficient one but also requires modifying the data structure. Here is why.
A kd-tree is a very lightweight structure: usually the only information is the plane and a pointer to the data; the nodes do not encode the AABB of each volume they bound.
So to check the distance to the AABB of a space, the AABBs would have to be added to the tree, increasing the size of the tree almost two-fold, if not more.
Notice that adding an AABB to each leaf of a kd-tree results in a contrived solution, because the leaves are open spaces, so some artificial boundary must be imposed.

Second, in a kd-tree each plane only has data on its two branches; therefore, for each visited node, two AABB tests have to be made. So the only benefit of the AABB test is to get an upper bound sooner, but that does not stop the search from reaching at least one path from the root to some leaf of the tree.
It is my experience that after the first leaf is reached, the upper bound is the same in more than 95% of cases whether you use Jensen's test or the AABB test (at least for trees of practical size, with more than a few dozen planes).
Then, after the first leaf is reached, in the backtracking phase the number of tests against other nodes is about the same; only in a very sparse tree, or with planes that happen to lie close to the point, do you get false positives signalling that they are closer than the upper bound, and those are quickly rejected by the point-point test.

So, no, your explanation does not make the distinction any clearer. Your proposal is the more inefficient one for kd-trees; your method is better for AABB trees, but the question asked about kd-trees.

Quote:Original post by Anonymous Poster
In my opinion, your pseudo-proof of why Jensen's method is inefficient describes an approach that is not only the more inefficient one but also requires modifying the data structure. Here is why.
I'm afraid the problem is that you do not understand the algorithm I'm describing. There is no modification of the k-d tree data structure. The trees are identical in both cases. Only the traversal code changes.

Quote:Notice that adding an AABB to each leaf of a kd-tree results in a contrived solution, because the leaves are open spaces, so some artificial boundary must be imposed.
No one is adding AABBs to nodes of the k-d tree; I do not know where you're getting that from. At no point did I claim or even suggest that.

The algorithm I'm talking about tracks the (one) AABB bounding the far node incrementally (inside the traversal code) as the tree is traversed. At each node visited one and only one side of the AABB is updated, which is a trivial low-cost operation. Similarly, the distance of the query point to the AABB can be maintained incrementally so you do not even need to perform a full point-AABB distance test at each node (which is incredibly cheap in full anyway, but just to spell it out to you).
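A sketch of that bound-tracking traversal, under my own assumptions about the node layout (this is illustrative, not Christer's or Friedman et al.'s actual code, and for brevity it recomputes the full point-to-box distance at each node rather than maintaining it incrementally): the cell bounds `lo`/`hi` are carried down the recursion, and at each node exactly one coordinate of one bound is clipped at the splitting plane. The tree itself stores no AABBs.

```python
import math

class Node:
    def __init__(self, point, axis, left=None, right=None):
        self.point, self.axis = point, axis
        self.left, self.right = left, right

def point_box_d2(q, lo, hi):
    """Squared distance from point q to the AABB [lo, hi]."""
    return sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, lo, hi))

def nearest_aabb(node, q, lo, hi, best=(None, math.inf)):
    # Cull: if the whole cell is farther than the best so far, stop here.
    if node is None or point_box_d2(q, lo, hi) >= best[1]:
        return best
    d2 = sum((a - b) ** 2 for a, b in zip(node.point, q))
    if d2 < best[1]:
        best = (node.point, d2)
    a, v = node.axis, node.point[node.axis]
    # Clip the cell at the splitting plane: only one side of the box changes.
    left_hi = hi[:a] + (v,) + hi[a + 1:]
    right_lo = lo[:a] + (v,) + lo[a + 1:]
    if q[a] < v:  # visit the near side first
        best = nearest_aabb(node.left, q, lo, left_hi, best)
        best = nearest_aabb(node.right, q, right_lo, hi, best)
    else:
        best = nearest_aabb(node.right, q, right_lo, hi, best)
        best = nearest_aabb(node.left, q, lo, left_hi, best)
    return best
```

The initial call passes the bounds of the whole scene (or effectively infinite bounds); no AABB is ever stored in a node, which is the point being made above.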

Quote:Actually I believe Jensen's algorithm is more efficient.
You can believe what you will, of course. However, I'm telling you, Mr. Anonymous, that the traversal method he presents is inefficient compared to the canonical approach, no beliefs involved just facts.

Here are two other descriptions of the algorithm that I posted to the gd-algorithms mailing list and comp.graphics.algorithms some time ago. Perhaps they will help you understand the difference between the two query algorithms:

http://article.gmane.org/gmane.games.devel.algorithms/14030/
http://groups-beta.google.com/group/comp.graphics.algorithms/msg/e7e19a62e851aa0d?dmode=source

Quote:Original post by Christer Ericson
You can believe what you will, of course. However, I'm telling you, Mr. Anonymous, that the traversal method he presents is inefficient compared to the canonical approach, no beliefs involved just facts.

Here are two other descriptions of the algorithm that I posted to the gd-algorithms mailing list and comp.graphics.algorithms some time ago. Perhaps they will help you understand the difference between the two query algorithms:

http://article.gmane.org/gmane.games.devel.algorithms/14030/
http://groups-beta.google.com/group/comp.graphics.algorithms/msg/e7e19a62e851aa0d?dmode=source


When I said "I believe", I was being polite. What I said is a fact: your method is the more inefficient one. I do not have to understand what you say, because you have not written anything new or original; everything you have written consists of ideas taken from other people, so all your stuff is like reading yesterday's newspaper. Furthermore, you are trashing the original creators of those ideas. Perhaps it is you who needs to come up with an original thought and stop proclaiming yourself the ultimate authority on collision.
Are we supposed to believe you based on two quotes of yourself? I could just as easily post source code for finding the closest distance in a kd-tree.
The fact is that what you propose is slow, inefficient, and offers no benefit whatsoever over the other method.
Another thing: I have absolutely no motivation to promote algorithm A or B, contrary to some people who will do anything to show that their method is the best and well explained in his or her book, never mind that the idea is not original.

This topic is closed to new replies.
