alessio_pi_ Posted November 28, 2012 (edited) Hello, I have a set of 3D points, and both the number of points and their locations can change dynamically. At every iteration I need to find the closest pair of points in the set. I thought of using a spatial index structure, because checking every pair of points at each iteration is very expensive. I need a spatial indexing structure that supports dynamically adding points to the dataset, deleting points from it, and a fast search for the closest point. Thanks in advance.
Bacterius Posted November 28, 2012 (edited) You mean, you need to find the two points A, B such that A and B are closest for any possible (A, B) pair? Do you just need to add and remove items, or is the [i]entire[/i] dataset changing occasionally? And how often? A self-balancing kd-tree was the first thought that crossed my mind, but if most of your dataset is continually changing, I don't think any spatial structure will do - in general, data needs to be constant for them to be of any use (because it takes time to build them, too).
alessio_pi_ Posted November 29, 2012 [quote name='Bacterius' timestamp='1354131738' post='5005046'] You mean, you need to find the two points A, B such that A and B are closest for any possible (A, B) pair? Do you just need to add and remove items, or is the [i]entire[/i] dataset changing occasionally? And how often? A self-balancing kd-tree was the first thought that crossed my mind, but if most of your dataset is continually changing, I don't think any spatial structure will do - in general, data needs to be constant for them to be of any use (because it takes time to build them, too). [/quote] I need to check that points facing other points don't exceed a minimum distance. Maybe I don't need to check every point against all the others, only against part of the dataset. At each iteration, one of the following can happen: 1) one point is deleted from the dataset; 2) one point is deleted and one new point is added; 3) one point is deleted and two new points are added. I need a data structure that makes these operations fast. My initial dataset has roughly 200 to 1000 3D points. Thanks.
LorenzoGatti Posted November 29, 2012 If I understand correctly, you need to validate all points at every attempted change: the maximum distance between each point and its closest neighbour cannot exceed a given threshold [i]d[/i]. Good news: it is quite a bit cheaper than maintaining or computing nearest neighbours. Is [i]d[/i] constant? Assuming [i]d[/i] is constant, a uniform grid of cubes with edges of length [i]d[/i] (possibly compacted with hashing from many grid cubes to few buckets) storing the list of contained points should be a simple way to limit the cost of each update. After adding and removing points in the appropriate cells, the only points that might have lost a nearest neighbour within distance [i]d[/i] (possibly becoming invalid) are those in the cells from which you deleted points, and the 26 adjacent ones (take care to process each cell only once). For each point in those cells you need to find a close enough point, which requires you to search only 27 cells. Of course if a point has become invalid you can undo your additions and deletions. In addition to the basic spatial index, you can invest memory to search even less. Associate every point P with a list of the "dependent" points that are valid because a previous search has found P to be close enough to them: when you delete P, only the points in its list of dependents might have become invalid (if P was the only point that was close enough) and require a new search (which, if successful, will add them to the dependent lists of other points).
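To make the grid idea above concrete, here is a minimal Python sketch, not the poster's own code. It assumes points are stored as `(x, y, z)` tuples, that "close enough" means a Euclidean distance of at most `d`, and that an unbounded dictionary of cells is acceptable; the class name `PointGrid` and its method names are hypothetical, chosen only for illustration. The "dependent points" refinement is left out.

```python
import math
from collections import defaultdict
from itertools import product


class PointGrid:
    """Uniform grid of cubes with edge d; each cell holds a short list of points."""

    def __init__(self, d):
        self.d = d
        self.cells = defaultdict(list)  # (i, j, k) -> list of (x, y, z) tuples

    def cell_of(self, p):
        # Divide each coordinate by d and round down consistently.
        return tuple(math.floor(c / self.d) for c in p)

    def add(self, p):
        self.cells[self.cell_of(p)].append(p)

    def remove(self, p):
        key = self.cell_of(p)
        self.cells[key].remove(p)  # linear search, but cells hold few points
        if not self.cells[key]:
            del self.cells[key]    # drop empty cells to keep the dict small

    def neighbourhood(self, key):
        # The cell itself plus its 26 adjacent cells.
        i, j, k = key
        for di, dj, dk in product((-1, 0, 1), repeat=3):
            yield (i + di, j + dj, k + dk)

    def has_close_point(self, p):
        # Any point within d is enough; it need not be the closest one.
        for key in self.neighbourhood(self.cell_of(p)):
            for q in self.cells.get(key, ()):
                if q is not p and math.dist(p, q) <= self.d:
                    return True
        return False

    def invalid_after_removal(self, p):
        """Remove p, then list nearby points left without a neighbour within d."""
        key = self.cell_of(p)
        self.remove(p)
        return [q
                for nk in self.neighbourhood(key)
                for q in self.cells.get(nk, ())
                if not self.has_close_point(q)]
```

If `invalid_after_removal` returns a non-empty list, the caller can undo the change by calling `add` again with the removed point. With 200 to 1000 points, each update touches only the handful of points in 27 cells instead of the whole dataset.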
alessio_pi_ Posted November 29, 2012 [quote name='LorenzoGatti' timestamp='1354180098' post='5005235'] If I understand correctly, you need to validate all points at every attempted change: the maximum distance between each point and its closest neighbour cannot exceed a given threshold [i]d[/i]. Good news: it is quite a bit cheaper than maintaining or computing nearest neighbours. Is [i]d[/i] constant? Assuming [i]d[/i] is constant, a uniform grid of cubes with edges of length [i]d[/i] (possibly compacted with hashing from many grid cubes to few buckets) storing the list of contained points should be a simple way to limit the cost of each update. After adding and removing points in the appropriate cells, the only points that might have lost a nearest neighbour within distance [i]d[/i] (possibly becoming invalid) are those in the cells from which you deleted points, and the 26 adjacent ones (take care to process each cell only once). For each point in those cells you need to find a close enough point, which requires you to search only 27 cells. Of course if a point has become invalid you can undo your additions and deletions. In addition to the basic spatial index, you can invest memory to search even less. Associate every point P with a list of the "dependent" points that are valid because a previous search has found P to be close enough to them: when you delete P, only the points in its list of dependents might have become invalid (if P was the only point that was close enough) and require a new search (which, if successful, will add them to the dependent lists of other points). [/quote] Yes, the threshold d can be fixed. I could build a uniform grid of cubes, but I don't understand how to update the grid and remove points from it. If the cells have edge length d, then after finding the cell that contains the point, should I also check the adjacent cells of the grid to find the closest point at distance <d, or not? What operations do insertion and deletion need? Finding the right cell and then removing or inserting the point? What happens if a newly inserted point lies outside the bounding box of the initial grid? How is it inserted?
LorenzoGatti Posted November 29, 2012 (edited)
[quote name='alessio_pi_' timestamp='1354187484' post='5005261'] Yes the threshold d can be fixed. I could build an uniform grid of cubes but I don't understand how update and remove point from the grid.[/quote] Every bin contains a set of points, and each point is in exactly one bin of your grid. This set of points can be stored in a resizable list or array; to add a point to such a list you can simply append it, to delete you need a linear search (but there should be very few points in each cell).
[quote] if the the edge cell have size d, before I find the cell where point is in and to find the closest point[/quote] There is no need to find the closest point, only any point that's closer than [i]d[/i]; sometimes you'll pick the closest point by chance, but it isn't significant.
[quote] with distance <d, I should check the adjacent cell of the grid o no? Insertion and deletion what's operation needs? Find the right cell and remove or insert new point? [/quote] I don't understand what you mean. Finding the right cell is trivial; you only have to divide point coordinates by [i]d[/i] and round consistently (e.g. floor) to obtain the index of the cell in the grid that contains the point.
[quote]What's happen if the new point inserted is out the bbox of the initial grid. How is it inserted? [/quote] If you need to support a huge range of point positions you need an infinite grid rather than a bounded one. There are two very similar simple approaches:
[list]
[*]Wrapping with modular arithmetic: choose a grid height H, width W and depth D such that H*W*D (the total number of cells) is tolerable and hopefully larger than the size of a typical point set. Then, instead of directly using potentially huge and out-of-bounds integer grid indices i=floor(x/d), j=floor(y/d), k=floor(z/d), place point (x,y,z) in cell (i%W, j%H, k%D). This has the effect of mixing together points from "aliased" distant places, which will occasionally be tested and found useless, wasting a little time. Obviously, neighbour cell indices computed by adding and subtracting 1 might wrap around to the opposite side of the grid; this easy and robust management of boundaries could be reason enough to adopt this kind of scheme, even with a small domain.
[*]Arbitrary hashing: decide on a certain number N of bins, then map i,j,k to the 0...N-1 range with a hash function (which could be hard to choose correctly).
[/list]
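The cell lookup and the two "infinite grid" variants described above could look roughly like this in Python. This is only a sketch: the function names are hypothetical, and the three multipliers in `hashed_bucket` are just one commonly seen choice of large primes for spatial hashing, not something the post prescribes.

```python
import math


def cell_index(p, d):
    """Integer grid indices (i, j, k) of point p for cell edge d."""
    return tuple(math.floor(c / d) for c in p)


def wrapped_cell(p, d, W, H, D):
    """Wrap the unbounded indices into a W x H x D grid."""
    i, j, k = cell_index(p, d)
    # Python's % returns a non-negative result for a positive modulus,
    # so negative indices wrap to the opposite side of the grid.
    return (i % W, j % H, k % D)


def hashed_bucket(p, d, n_bins):
    """Map the unbounded indices to a bucket number in 0 .. n_bins - 1."""
    i, j, k = cell_index(p, d)
    # Assumed hash: XOR of the indices multiplied by large primes.
    return ((i * 73856093) ^ (j * 19349663) ^ (k * 83492791)) % n_bins


print(wrapped_cell((-0.5, 2.5, 10.2), 1.0, 4, 4, 4))  # (3, 2, 2)
```

In C, C++ or Java the remainder of a negative index is negative, so the wrap there would need something like `((i % W) + W) % W`. In both variants a bucket can contain points from distant cells, so the real distance test against [i]d[/i] is still required. Choosing W, H and D of at least 3 presumably keeps the 27 neighbour cells from aliasing onto each other.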
alessio_pi_ Posted November 30, 2012 thanks Lorenzo.