
Which spatial indexing structure do I need?


alessio_pi_    123
Hello,

I have a set of 3D points, and both the number of points and their locations can change dynamically. At every iteration I need to find the closest pair of points in the set. Checking every pair at every iteration is very expensive, so I thought of using a spatial index structure.
I need a spatial indexing structure that can dynamically update the point dataset, delete a point from the dataset, and perform a fast closest-point search.

Thanks in advance. Edited by alessio_pi_

Bacterius    13165
You mean, you need to find the two points A, B such that A and B are closest for any possible (A, B) pair? Do you just need to add and remove items, or is the [i]entire[/i] dataset changing occasionally? And how often? A self-balancing kd-tree was the first thought that crossed my mind, but if most of your dataset is continually changing, I don't think any spatial structure will do - in general, data needs to be constant for them to be of any use (because it takes time to build them, too). Edited by Bacterius

alessio_pi_    123
[quote name='Bacterius' timestamp='1354131738' post='5005046']
You mean, you need to find the two points A, B such that A and B are closest for any possible (A, B) pair? Do you just need to add and remove items, or is the [i]entire[/i] dataset changing occasionally? …
[/quote]

I need to check that the distance between each point and its nearest neighbour does not exceed a minimum distance. Maybe I should not check every point against all the others, but only against a part of the dataset.
At every iteration one of the following can happen:
1) one point is deleted from the dataset
2) one point is deleted and 1 new point is added
3) one point is deleted and 2 new points are added

I need a data structure that permits doing this fast.
My initial dataset has roughly 200 to 1000 3D points.

Thanks.

LorenzoGatti    4442
If I understand correctly, you need to validate all points at every attempted change: the maximum distance between each point and its closest neighbour cannot exceed a given threshold [i]d[/i]. Good news: it is quite a bit cheaper than maintaining or computing nearest neighbours. Is [i]d[/i] constant?

Assuming [i]d[/i] is constant, a uniform grid of cubes with edges of length [i]d[/i] (possibly compacted with hashing from many grid cubes to few buckets) storing the list of contained points should be a simple way to limit the cost of each update. After adding and removing points in the appropriate cells, the only points that might have lost a nearest neighbour within distance [i]d[/i] (possibly becoming invalid) are those in the cells from which you deleted points, and the 26 adjacent ones (take care to process each cell only once). For each point in those cells you need to find a close enough point, which requires you to search only 27 cells. Of course if a point has become invalid you can undo your additions and deletions.
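A minimal Python sketch of this scheme (all names, and the value of [i]d[/i], are illustrative; the threshold is assumed constant):

```python
from collections import defaultdict
from math import floor, dist

d = 1.0                   # validation threshold (example value, assumed constant)
grid = defaultdict(list)  # (i, j, k) cell index -> points stored in that cell

def cell(p):
    # Integer cell index of point p; floor handles negative coordinates.
    return tuple(floor(c / d) for c in p)

def insert(p):
    grid[cell(p)].append(p)

def remove(p):
    grid[cell(p)].remove(p)

def has_neighbor_within_d(p):
    # Search only the point's own cell and the 26 adjacent ones.
    ci, cj, ck = cell(p)
    for i in range(ci - 1, ci + 2):
        for j in range(cj - 1, cj + 2):
            for k in range(ck - 1, ck + 2):
                for q in grid.get((i, j, k), []):
                    if q is not p and dist(p, q) <= d:
                        return True
    return False
```

After an update, only the points in the cells you touched and their 26 neighbours need to go through `has_neighbor_within_d`.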

In addition to the basic spatial index, you can invest memory to search even less. Associate every point P with a list of the "dependent" points that are valid because a previous search has found P to be close enough to them: when you delete P, only the points in its list of dependents might have become invalid (if P was the only point that was close enough) and require a new search (which, if successful, will add them to the dependent lists of other points).
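The dependent-list bookkeeping could be sketched like this (the function names are mine, not part of any library):

```python
# supporter point -> points currently validated by that supporter
dependents = {}

def register_support(point, supporter):
    # Record that `point` is valid because `supporter` lies within d of it.
    dependents.setdefault(supporter, []).append(point)

def on_delete(supporter):
    # Only the deleted point's dependents need a new neighbour search;
    # everything else in the dataset is untouched.
    return dependents.pop(supporter, [])
```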

alessio_pi_    123
[quote name='LorenzoGatti' timestamp='1354180098' post='5005235']
Assuming [i]d[/i] is constant, a uniform grid of cubes with edges of length [i]d[/i] (possibly compacted with hashing from many grid cubes to few buckets) storing the list of contained points should be a simple way to limit the cost of each update. …
[/quote]

Yes, the threshold d can be fixed. I could build a uniform grid of cubes, but I don't understand how to update the grid and remove points from it.
If the cell edge has size d, to find the closest point with distance < d, should I first find the cell the point is in and then check the adjacent cells of the grid, or not?
What operations do insertion and deletion need? Find the right cell and remove or insert the point? What happens if a new point is inserted outside the bounding box of the initial grid?
How is it inserted?

LorenzoGatti    4442
[quote name='alessio_pi_' timestamp='1354187484' post='5005261']
Yes, the threshold d can be fixed. I could build a uniform grid of cubes, but I don't understand how to update the grid and remove points from it.[/quote]
Every bin contains a set of points, and each point is in exactly one bin of your grid. This set of points can be stored in a resizable list or array; to add a point to such a list you can simply append it, to delete you need a linear search (but there should be very few points in each cell).
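In Python terms, one bin could simply be a list (a sketch with illustrative values, not the poster's code):

```python
bin_points = []                              # one grid cell's contents
a, b = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
bin_points.append(a)                         # insertion: constant-time append
bin_points.append(b)
bin_points.remove(a)                         # deletion: linear search, cheap
                                             # when bins hold only a few points
```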
[quote]
if the cell edge has size d, to find the closest point with distance < d, should I first find the cell the point is in and then check the adjacent cells of the grid, or not?[/quote]There is no need to find the closest point, only any point that's closer than [i]d[/i]; sometimes you'll pick the closest point by chance, but it isn't significant.[quote]
What operations do insertion and deletion need? Find the right cell and remove or insert the point?[/quote]I don't understand what you mean. Finding the right cell is trivial; you only have to divide point coordinates by [i]d[/i] and round consistently (e.g. floor) to obtain the index of the cell in the grid that contains the point.[quote]What happens if a new point is inserted outside the bounding box of the initial grid?
How is it inserted?
[/quote]
If you need to support a huge range of point positions you need an infinite grid rather than a bounded one. There are two very similar simple approaches:[list]
[*]Wrapping with modular arithmetic: choose a grid height H, width W and depth D such that H*W*D (the total number of cells) is tolerable and hopefully larger than the size of a typical point set. Then, instead of directly using potentially huge and out of bounds integer grid indices i=floor(x/d), j=floor(y/d), k=floor(z/d), place point (x,y,z) in cell (i%W, j%H, k%D). This has the effect of mixing together points from "aliased" distant places, which will be occasionally tested and found useless, wasting a little time. Obviously, neighbour cell indices computed by adding and subtracting 1 might wrap around to the opposite side of the grid; this easy and robust management of boundaries could be reason enough to adopt this kind of scheme, even with a small domain.
[*]Arbitrary hashing: decide on a certain number N of bins, then map i,j,k to the 0...N-1 range with a hash function (which could be hard to choose correctly).
[/list] Edited by LorenzoGatti
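The modular-wrapping scheme in the first bullet might look like this in Python (W, H, D and d are example values, not prescribed ones):

```python
from math import floor

W, H, D = 64, 64, 64   # example grid dimensions; W*H*D cells in total
d = 1.0                # example cell edge length / distance threshold

def wrapped_cell(p):
    x, y, z = p
    i, j, k = floor(x / d), floor(y / d), floor(z / d)
    # Python's % yields a non-negative result for a positive modulus, so
    # points with negative or very large coordinates wrap onto the finite grid.
    return (i % W, j % H, k % D)
```

Note that in languages like C or C++, `%` can return negative results for negative operands, so the wrapping there needs an extra adjustment.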

