pukas

Closest Point using KdTree


Hi! I'm wondering if there is an exact algorithm for finding the closest point in a kd-tree to any given point. I have searched the internet for quite a while now, but I haven't really found anything useful!

If I remember right, the closest neighbor is either the parent or a leaf node.

As you increase your search radius, you just follow the tree further.

Try a search for "nearest neighbor" + "kd-tree".

You may also want to read about Jensen's radiance estimate from a photon map. The key algorithm finds the k nearest neighbours in a kd-tree. If you make k = 1, you get what you want.

The basic idea (for k = 1) is to keep a variable that stores the smallest distance found so far. As you traverse the tree, you update that variable and you discard branches where all the volume is farther away than the current smallest distance. The algorithm will work faster if you are careful to visit the branches in a reasonable order (closest first).
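Here's a minimal sketch of that idea (k = 1, three dimensions). The KdNode layout and the names below are illustrative assumptions, not taken from any particular library; this version prunes with the simple splitting-plane test:

```cpp
// Illustrative node layout; not from any particular library.
struct KdNode {
    float   point[3];  // the point stored at this node
    int     axis;      // splitting axis: 0, 1 or 2
    KdNode *left;      // subtree with coord[axis] <= point[axis]
    KdNode *right;     // subtree with coord[axis] >  point[axis]
};

static float Dist2(const float a[3], const float b[3]) {
    float dx = a[0] - b[0], dy = a[1] - b[1], dz = a[2] - b[2];
    return dx * dx + dy * dy + dz * dz;
}

// Keep the smallest squared distance found so far in bestDist2 and
// skip any branch whose entire half-space is farther away than that.
void Nearest(const KdNode *node, const float query[3],
             const KdNode *&best, float &bestDist2) {
    if (!node) return;

    float d2 = Dist2(node->point, query);
    if (d2 < bestDist2) { bestDist2 = d2; best = node; }

    // Signed distance from the query to this node's splitting plane.
    float delta = query[node->axis] - node->point[node->axis];

    // Visit the side containing the query first (closest first) ...
    const KdNode *nearSide = (delta <= 0.0f) ? node->left  : node->right;
    const KdNode *farSide  = (delta <= 0.0f) ? node->right : node->left;
    Nearest(nearSide, query, best, bestDist2);

    // ... and descend into the far side only if the splitting plane is
    // closer than the best distance found so far.
    if (delta * delta < bestDist2)
        Nearest(farSide, query, best, bestDist2);
}
```

Call it with best = nullptr and bestDist2 = FLT_MAX (from <cfloat>); when it returns, best points at the closest stored point.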

Quote:
Original post by alvaro
You may also want to read about Jensen's radiance estimate from a photon map. The key algorithm finds the k nearest neighbours in a kd-tree.


Note that the nearest-neighbour code given in Wann Jensen's photon mapping book is 'broken.' That is, it works, but it's an inefficient implementation. When trying to cull nodes that are farther away than the (k-th) smallest distance, he only considers the distance of the query point to the splitting plane of the current node. The correct approach is to consider the distance of the query point to the axis-aligned bounding box of the root of the subtree you're trying to cull. The latter has much better culling power.
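To make the two tests concrete, here is a hedged sketch (three dimensions, squared distances; the helper names are purely illustrative):

```cpp
// Squared distance from a query point q to the axis-aligned box
// [mn, mx]; zero if q lies inside the box.
float Dist2PointAabb(const float q[3], const float mn[3], const float mx[3]) {
    float d2 = 0.0f;
    for (int i = 0; i < 3; ++i) {
        if (q[i] < mn[i]) { float d = mn[i] - q[i]; d2 += d * d; }
        if (q[i] > mx[i]) { float d = q[i] - mx[i]; d2 += d * d; }
    }
    return d2;
}

// Plane-based cull: skips the far subtree only when the splitting
// plane itself is farther away than the best distance so far.
bool CullByPlane(float planeDelta, float bestDist2) {
    return planeDelta * planeDelta >= bestDist2;
}

// AABB-based cull: skips the subtree when its whole bounding volume is
// farther away; this rejects subtrees the plane test cannot.
bool CullByAabb(const float q[3], const float mn[3], const float mx[3],
                float bestDist2) {
    return Dist2PointAabb(q, mn, mx) >= bestDist2;
}
```

A query point lying near the splitting plane but far from the subtree's bounding box passes the plane test yet fails the box test, which is exactly the wasted descent described above.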

Guest Anonymous Poster
Quote:
Original post by Christer Ericson
Quote:
Original post by alvaro
You may also want to read about Jensen's radiance estimate from a photon map. The key algorithm finds the k nearest neighbours in a kd-tree.


Note that the nearest-neighbour code given in Wann Jensen's photon mapping book is 'broken.' That is, it works, but it's an inefficient implementation. When trying to cull nodes that are farther away than the (k-th) smallest distance, he only considers the distance of the query point to the splitting plane of the current node. The correct approach is to consider the distance of the query point to the axis-aligned bounding box of the root of the subtree you're trying to cull. The latter has much better culling power.


How can Jensen's method be 'broken' if it works? Also, what makes your heuristic better than his, when his works and leads to the same results?



Quote:
How can Jensen's method be 'broken' if it works?
The quotation marks I put around 'broken' indicate that I used the word in a non-literal sense. My post stated in what sense it is broken: efficiency.

Quote:
Also, what makes your heuristic better than his, when his works and leads to the same results?
Jensen's criterion for culling subtrees is inefficient. To see the difference, consider an axis-aligned bounding box B describing the volume represented by some subtree of the k-d tree. For a query point P, we can have that P lies far away from B, while at the same time P lies near a plane that cuts through a face of B.

Well, Jensen's code culls subtrees based on a plane test (i.e. the latter situation) and will therefore descend into the subtree because P is near the plane. The alternative approach I'm describing (which isn't mine, btw; it's how you traditionally do nearest-neighbor queries in a k-d tree, as described by Friedman et al. in 1975) directly determines that P is far from the AABB and can therefore cull the subtree from further search without descending into it.

Jensen's code will ultimately give the same result as the traditional solution, but it needlessly descends into subtrees that could have been culled earlier (that is, Jensen's code will cull them eventually, just not right away), and is therefore slower.

I hope this makes the distinction clear.

Guest Anonymous Poster
Quote:
Original post by Christer Ericson
Jensen's criterion for culling subtrees is inefficient. To see the difference, consider an axis-aligned bounding box B describing the volume represented by some subtree of the k-d tree. For a query point P, we can have that P lies far away from B, while at the same time P lies near a plane that cuts through a face of B.

Well, Jensen's code culls subtrees based on a plane test (i.e. the latter situation) and will therefore descend into the subtree because P is near the plane. The alternative approach I'm describing (which isn't mine, btw; it's how you traditionally do nearest-neighbor queries in a k-d tree, as described by Friedman et al. in 1975) directly determines that P is far from the AABB and can therefore cull the subtree from further search without descending into it.

Jensen's code will ultimately give the same result as the traditional solution, but it needlessly descends into subtrees that could have been culled earlier (that is, Jensen's code will cull them eventually, just not right away), and is therefore slower.

I hope this makes the distinction clear.

Actually I believe Jensen's algorithm is more efficient.
The test for checking the distance to a plane is simpler and faster than the test against an AABB. Given that all internal spaces in a kd-tree are closed spaces, for each unique space there will be at most three tests; however, since these spaces are adjacent and share one plane, the average number of tests comes down statistically to one per closed space.

In my opinion, your pseudo-proof of why Jensen's method is inefficient is not only the more inefficient approach, it also requires modification of the data structure. Here is why.
A kd-tree is a very lightweight structure; usually the only information stored is the plane and a pointer to the data, and the nodes do not encode the AABB of each volume they bound.
So to check the distance to the AABB of a space, the AABBs have to be added to the tree, almost doubling its size, if not more.
Notice that adding an AABB to each leaf of a kd-tree results in a contrived solution, because the leaves are open spaces, so some artificial boundary must be imposed.

Second, in a kd-tree each plane only has data on its two branches, therefore for each visited node two AABB tests have to be made. So the only benefit of the AABB test is to get an upper bound sooner, but that does not stop the search from following at least one path from the root to some leaf of the tree.
It is my experience that after the first leaf is reached, the upper bound is the same in more than 95% of cases whether you use Jensen's test or the AABB test (at least for trees of practical size, with more than a few dozen planes).
Then, after the first leaf is reached, in the backtracking phase the number of tests against other nodes is about the same; only very sparse trees, or planes that happen to be close to the point, give false positives by signalling that they are closer than the upper bound, but those are quickly rejected by the point-point test.

So no, your explanation does not make the distinction any clearer. Your proposal is the more inefficient one for kd-trees; your method is better for AABB trees, but the question asked about kd-trees.

Quote:
Original post by Anonymous Poster
In my opinion, your pseudo-proof of why Jensen's method is inefficient is not only the more inefficient approach, it also requires modification of the data structure. Here is why.
I'm afraid the problem is that you do not understand the algorithm I'm describing. There is no modification of the k-d tree data structure. The trees are identical in both cases. Only the traversal code changes.

Quote:
Notice that adding an AABB to each leaf of a kd-tree results in a contrived solution, because the leaves are open spaces, so some artificial boundary must be imposed.
No one is adding AABBs to nodes of the k-d tree; I do not know where you're getting that from. At no point did I claim or even suggest that.

The algorithm I'm talking about tracks the (one) AABB bounding the far node incrementally (inside the traversal code) as the tree is traversed. At each node visited one and only one side of the AABB is updated, which is a trivial low-cost operation. Similarly, the distance of the query point to the AABB can be maintained incrementally so you do not even need to perform a full point-AABB distance test at each node (which is incredibly cheap in full anyway, but just to spell it out to you).
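Here is a hedged sketch of that incremental bookkeeping, in the spirit of the traditional traversal (it reuses the illustrative KdNode and Dist2 from the sketch earlier in the thread; note that nothing extra is stored in the tree):

```cpp
// off[i]    = query's offset to the current subtree's box along axis i
//             (zero if the query lies inside the box's slab on that axis).
// offDist2  = sum of squares of off[] = squared distance to the box.
void NearestIncr(const KdNode *node, const float query[3],
                 float off[3], float offDist2,
                 const KdNode *&best, float &bestDist2) {
    // Cull: the whole subtree's volume is farther away than the best so far.
    if (!node || offDist2 >= bestDist2) return;

    float d2 = Dist2(node->point, query);
    if (d2 < bestDist2) { bestDist2 = d2; best = node; }

    int   a     = node->axis;
    float delta = query[a] - node->point[a];
    const KdNode *nearSide = (delta <= 0.0f) ? node->left  : node->right;
    const KdNode *farSide  = (delta <= 0.0f) ? node->right : node->left;

    // Near side: the box distance is unchanged.
    NearestIncr(nearSide, query, off, offDist2, best, bestDist2);

    // Far side: only one slab of its box differs from the current box,
    // so the squared box distance is updated in O(1).
    float oldOff   = off[a];
    float farDist2 = offDist2 - oldOff * oldOff + delta * delta;
    off[a] = delta;                  // only its square is ever used
    NearestIncr(farSide, query, off, farDist2, best, bestDist2);
    off[a] = oldOff;                 // restore for the caller
}
```

Starting the query with off = {0, 0, 0} and offDist2 = 0 assumes the query point lies inside the root's bounding volume; otherwise compute both against the root box first.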

Quote:
Actually I believe Jensen's algorithm is more efficient.
You can believe what you will, of course. However, I'm telling you, Mr. Anonymous, that the traversal method he presents is inefficient compared to the canonical approach; no beliefs involved, just facts.

Here are two other descriptions of the algorithm that I posted to the gd-algorithms mailing list and comp.graphics.algorithms some time ago. Perhaps they will help you understand the difference between the two query algorithms:

http://article.gmane.org/gmane.games.devel.algorithms/14030/
http://groups-beta.google.com/group/comp.graphics.algorithms/msg/e7e19a62e851aa0d?dmode=source

Guest Anonymous Poster
Quote:
Original post by Christer Ericson
You can believe what you will, of course. However, I'm telling you, Mr. Anonymous, that the traversal method he presents is inefficient compared to the canonical approach; no beliefs involved, just facts.

Here are two other descriptions of the algorithm that I posted to the gd-algorithms mailing list and comp.graphics.algorithms some time ago. Perhaps they will help you understand the difference between the two query algorithms:

http://article.gmane.org/gmane.games.devel.algorithms/14030/
http://groups-beta.google.com/group/comp.graphics.algorithms/msg/e7e19a62e851aa0d?dmode=source


When I said "I believe," I was being polite. What I said is a fact: your method is the more inefficient one. I do not have to understand what you say, because you have not written anything new or original; everything you have written consists of ideas taken from other people, so all your stuff is like reading yesterday's newspaper. Furthermore, you are trashing the original creators of those ideas. Perhaps it is you who needs to come up with an original thought and stop proclaiming yourself the ultimate authority on collision.
Are we supposed to believe you based on two quotes of yourself? I can easily post the source code for searching the closest distance in a kd-tree.
The fact is that what you propose is slow, inefficient, and offers no benefit whatsoever over the other method.

Guest Anonymous Poster
Another thing: I have absolutely no motivation to promote algorithm A or B, contrary to some people who will do anything to show that their method is the best and is well explained in his or her book, never mind that the idea is not original.


Quote:
Furthermore, you are trashing the original creators of those ideas.
That's quite funny, because you are the one trashing the original creators of the k-d tree and their original algorithm for nearest-neighbor search in it!

For the record, the k-d tree was first described by Jon Bentley in:

J. L. Bentley. Multidimensional binary search trees used for associative searching. Communications of the ACM, 18(9):509-517, September 1975.

In a second paper from 1975, Friedman, Bentley, and Finkel describe how to perform a nearest-neighbor query on a k-d tree. The technical report version of this paper is available here:

ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/75/482/CS-TR-75-482.pdf

An improved version of Friedman et al.'s code is given by Arya and Mount in their paper "Algorithms for Fast Vector Quantization", available here:

http://www.cs.umd.edu/~mount/Papers/DCC.pdf

The algorithm I'm talking about is this efficient nearest-neighbor algorithm, as described by the inventor of the k-d tree representation!

Now, in contrast, the algorithm Jensen gives in his photon mapping book for nearest-neighbor search on a k-d tree is a less sophisticated traversal algorithm that searches more nodes of the k-d tree than Friedman et al.'s algorithm. Jensen's approach is therefore trivially inefficient compared to the algorithm that more aggressively culls subtrees that cannot contain points nearer than the (k) best point(s) found so far.

I really have no interest in fueling a flamewar here; I'm just posting the facts, verifiable by anyone who cares to read the above papers and understand the algorithm. My assumption was that some people would be interested in learning that there are better approaches. However, if you want to remain unaware of better algorithms, that is your choice.

I've said what I had to say on the issue, and I will just conclude by saying that for small problems (i.e. a small number of points to search amongst) Jensen's approach is fine (just as any spatial partitioning algorithm would be). For large problems, however, Jensen's approach will be outperformed by the more sophisticated algorithm presented by Friedman et al. and by Arya and Mount.

Thanks Christer, I will read the two papers you posted; I think that's what I need. And one more thing: maybe if the "anonymous poster" would register, we could take your opinions more seriously...
/per larsson

