
Which spatial indexing structure do I need?



#1 alessio_pi_   Members   -  Reputation: 112


Posted 28 November 2012 - 06:45 AM

Hello,

I have a set of 3D points, and the set can change dynamically in both size and point locations.
At every iteration I need to find the closest pair of points in the set. Checking every pair at every iteration is very expensive, so I thought of using a spatial index structure.
I need a spatial indexing structure that can dynamically update the point dataset, delete a point from the dataset, and quickly search for the closest point.

Thanks in advance.

Edited by alessio_pi_, 28 November 2012 - 10:51 AM.


#2 Bacterius   Crossbones+   -  Reputation: 8178


Posted 28 November 2012 - 01:42 PM

You mean, you need to find the two points A, B such that A and B are closest for any possible (A, B) pair? Do you just need to add and remove items, or is the entire dataset changing occasionally? And how often? A self-balancing kd-tree was the first thought that crossed my mind, but if most of your dataset is continually changing, I don't think any spatial structure will do - in general, data needs to be constant for them to be of any use (because it takes time to build them, too).

Edited by Bacterius, 28 November 2012 - 01:43 PM.

The slowsort algorithm is a perfect illustration of the multiply and surrender paradigm, which is perhaps the single most important paradigm in the development of reluctant algorithms. The basic multiply and surrender strategy consists in replacing the problem at hand by two or more subproblems, each slightly simpler than the original, and continue multiplying subproblems and subsubproblems recursively in this fashion as long as possible. At some point the subproblems will all become so simple that their solution can no longer be postponed, and we will have to surrender. Experience shows that, in most cases, by the time this point is reached the total work will be substantially higher than what could have been wasted by a more direct approach.

 

- Pessimal Algorithms and Simplexity Analysis


#3 alessio_pi_   Members   -  Reputation: 112


Posted 29 November 2012 - 02:17 AM

You mean, you need to find the two points A, B such that A and B are closest for any possible (A, B) pair? Do you just need to add and remove items, or is the entire dataset changing occasionally? And how often? A self-balancing kd-tree was the first thought that crossed my mind, but if most of your dataset is continually changing, I don't think any spatial structure will do - in general, data needs to be constant for them to be of any use (because it takes time to build them, too).


I need to check that each point doesn't exceed a minimum distance from the points next to it. So maybe I shouldn't check every point against all the others, but only against part of the dataset.
At every iteration one of these can happen:
1) one point is deleted from the dataset
2) one point is deleted and 1 new point is added
3) one point is deleted and 2 new points are added

I need a data structure that lets me do this fast.
My initial dataset can have roughly 200 to 1000 3D points.

Thanks.

#4 LorenzoGatti   Crossbones+   -  Reputation: 2525


Posted 29 November 2012 - 03:08 AM

If I understand correctly, you need to validate all points at every attempted change: the maximum distance between each point and its closest neighbour cannot exceed a given threshold d. Good news: it is quite a bit cheaper than maintaining or computing nearest neighbours. Is d constant?

Assuming d is constant, a uniform grid of cubes with edges of length d (possibly compacted with hashing from many grid cubes to few buckets) storing the list of contained points should be a simple way to limit the cost of each update. After adding and removing points in the appropriate cells, the only points that might have lost a nearest neighbour within distance d (possibly becoming invalid) are those in the cells from which you deleted points, and the 26 adjacent ones (take care to process each cell only once). For each point in those cells you need to find a close enough point, which requires you to search only 27 cells. Of course if a point has become invalid you can undo your additions and deletions.
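
In code, the grid and the 27-cell check could look something like the following minimal, untested sketch (all type and function names are my own, not from any library):

#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Point { float x, y, z; };

struct CellKey {
    int64_t i, j, k;
    bool operator==(const CellKey& o) const { return i == o.i && j == o.j && k == o.k; }
};

struct CellKeyHash {
    size_t operator()(const CellKey& c) const {
        // Any decent mixing of the three indices will do for a sketch.
        return size_t(c.i * 73856093 ^ c.j * 19349663 ^ c.k * 83492791);
    }
};

struct Grid {
    float d;  // cell edge length == the distance threshold d
    std::unordered_map<CellKey, std::vector<Point>, CellKeyHash> cells;

    CellKey keyOf(const Point& p) const {
        return { int64_t(std::floor(p.x / d)),
                 int64_t(std::floor(p.y / d)),
                 int64_t(std::floor(p.z / d)) };
    }

    // Does some *other* point lie within distance d of p?
    // Only p's cell and the 26 adjacent cells can contain one.
    bool hasNeighborWithin(const Point& p) const {
        CellKey c = keyOf(p);
        for (int64_t di = -1; di <= 1; ++di)
        for (int64_t dj = -1; dj <= 1; ++dj)
        for (int64_t dk = -1; dk <= 1; ++dk) {
            auto it = cells.find({ c.i + di, c.j + dj, c.k + dk });
            if (it == cells.end()) continue;
            for (const Point& q : it->second) {
                float dx = p.x - q.x, dy = p.y - q.y, dz = p.z - q.z;
                float r2 = dx * dx + dy * dy + dz * dz;
                // r2 > 0 skips p itself (and any exact duplicate of it).
                if (r2 > 0.0f && r2 <= d * d) return true;
            }
        }
        return false;
    }
};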

In addition to the basic spatial index, you can invest memory to search even less. Associate every point P with a list of the "dependent" points that are valid because a previous search has found P to be close enough to them: when you delete P, only the points in its list of dependents might have become invalid (if P was the only point that was close enough) and require a new search (which, if successful, will add them to the dependent lists of other points).
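
The dependent-list bookkeeping might be wired up like this (again only an illustrative sketch; the point ids and every name below are invented for the example):

#include <unordered_map>
#include <unordered_set>
#include <vector>

using PointId = int;

struct DependencyIndex {
    // supporter -> points whose validity rests on that supporter being within d
    std::unordered_map<PointId, std::unordered_set<PointId>> dependents;
    // point -> the supporter its last successful search found
    std::unordered_map<PointId, PointId> supporterOf;

    // A search found `supporter` within distance d of `point`.
    void recordSupport(PointId point, PointId supporter) {
        auto old = supporterOf.find(point);
        if (old != supporterOf.end())
            dependents[old->second].erase(point);  // detach from previous supporter
        supporterOf[point] = supporter;
        dependents[supporter].insert(point);
    }

    // On deleting `gone`, only its dependents may have become invalid;
    // they are the only points needing a fresh 27-cell search.
    std::vector<PointId> onDelete(PointId gone) {
        std::vector<PointId> toRecheck;
        auto it = dependents.find(gone);
        if (it != dependents.end()) {
            toRecheck.assign(it->second.begin(), it->second.end());
            dependents.erase(it);
        }
        supporterOf.erase(gone);
        return toRecheck;
    }
};
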
Produci, consuma, crepa

#5 alessio_pi_   Members   -  Reputation: 112


Posted 29 November 2012 - 05:11 AM

If I understand correctly, you need to validate all points at every attempted change: the maximum distance between each point and its closest neighbour cannot exceed a given threshold d. Good news: it is quite a bit cheaper than maintaining or computing nearest neighbours. Is d constant?

Assuming d is constant, a uniform grid of cubes with edges of length d (possibly compacted with hashing from many grid cubes to few buckets) storing the list of contained points should be a simple way to limit the cost of each update. After adding and removing points in the appropriate cells, the only points that might have lost a nearest neighbour within distance d (possibly becoming invalid) are those in the cells from which you deleted points, and the 26 adjacent ones (take care to process each cell only once). For each point in those cells you need to find a close enough point, which requires you to search only 27 cells. Of course if a point has become invalid you can undo your additions and deletions.

In addition to the basic spatial index, you can invest memory to search even less. Associate every point P with a list of the "dependent" points that are valid because a previous search has found P to be close enough to them: when you delete P, only the points in its list of dependents might have become invalid (if P was the only point that was close enough) and require a new search (which, if successful, will add them to the dependent lists of other points).


Yes, the threshold d can be fixed. I could build a uniform grid of cubes, but I don't understand how to update the grid and remove points from it.
If the cell edge has size d, to find the closest point within distance < d, should I first find the cell the point is in and then also check the adjacent cells of the grid?
What operations do insertion and deletion need? Find the right cell and remove or insert the point there? And what happens if a newly inserted point lies outside the bounding box of the initial grid? How is it inserted?

#6 LorenzoGatti   Crossbones+   -  Reputation: 2525


Posted 29 November 2012 - 11:30 AM

Yes, the threshold d can be fixed. I could build a uniform grid of cubes, but I don't understand how to update the grid and remove points from it.

Every bin contains a set of points, and each point is in exactly one bin of your grid. This set of points can be stored in a resizable list or array; to add a point to such a list you can simply append it, to delete you need a linear search (but there should be very few points in each cell).
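
For example, reusing the Point and Grid types from the sketch in post #4 (untested; swap-and-pop is just one reasonable way to keep the bins compact):

#include <vector>

// Append the point to its cell's list.
void addPoint(Grid& g, const Point& p) {
    g.cells[g.keyOf(p)].push_back(p);
}

// Linear search within the single cell that can hold p, then swap-and-pop.
// Assumes the exact stored coordinates are passed back in.
bool removePoint(Grid& g, const Point& p) {
    auto it = g.cells.find(g.keyOf(p));
    if (it == g.cells.end()) return false;
    std::vector<Point>& bin = it->second;
    for (size_t n = 0; n < bin.size(); ++n) {
        if (bin[n].x == p.x && bin[n].y == p.y && bin[n].z == p.z) {
            bin[n] = bin.back();                 // overwrite with the last entry...
            bin.pop_back();                      // ...and drop the tail: O(1) removal
            if (bin.empty()) g.cells.erase(it);  // don't keep empty cells around
            return true;
        }
    }
    return false;  // the point was not in the grid
}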

If the cell edge has size d, to find the closest point within distance < d, should I first find the cell the point is in and then also check the adjacent cells of the grid?

There is no need to find the closest point, only any point that's closer than d; sometimes you'll pick the closest point by chance, but it isn't significant.

What operations do insertion and deletion need? Find the right cell and remove or insert the point there?

I don't understand what you mean. Finding the right cell is trivial; you only have to divide point coordinates by d and round consistently (e.g. floor) to obtain the index of the cell in the grid that contains the point.
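
For instance (a one-liner sketch; std::floor rather than integer truncation keeps negative coordinates consistent):

#include <cmath>
#include <cstdint>

// Grid index along one axis for coordinate x with cell edge d.
// std::floor matters for negative values: x = -0.3, d = 1 must give
// cell -1, whereas plain integer truncation would wrongly give 0.
int64_t cellCoord(float x, float d) {
    return int64_t(std::floor(x / d));
}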

What happens if a newly inserted point lies outside the bounding box of the initial grid? How is it inserted?

If you need to support a huge range of point positions you need an infinite grid rather than a bounded one. There are two very similar simple approaches:
  • Wrapping with modular arithmetic (see the sketch after this list): choose a grid height H, width W and depth D such that H*W*D (the total number of cells) is tolerable and hopefully larger than the size of a typical point set. Then, instead of directly using potentially huge and out-of-bounds integer grid indices i=floor(x/d), j=floor(y/d), k=floor(z/d), place point (x,y,z) in cell (i%W, j%H, k%D). This has the effect of mixing together points from "aliased" distant places, which will occasionally be tested and found useless, wasting a little time. Obviously, neighbour cell indices computed by adding and subtracting 1 might wrap around to the opposite side of the grid; this easy and robust management of boundaries could be reason enough to adopt this kind of scheme, even with a small domain.
  • Arbitrary hashing: decide on a certain number N of bins, then map i,j,k to the 0...N-1 range with a hash function (which could be hard to choose correctly).
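
The wrapping variant might be sketched like this (the grid sizes and all names are arbitrary choices of mine; note that C++'s % can yield negative results, so the wrapped index has to be re-normalized):

#include <cmath>
#include <cstdint>

const int64_t W = 64, H = 64, D = 64;  // W*H*D = 262,144 cells; tune to taste

// Map any integer index into 0...n-1, also for negative i.
int64_t wrap(int64_t i, int64_t n) {
    int64_t m = i % n;
    return m < 0 ? m + n : m;  // e.g. -1 wraps to n-1, not to -1
}

// Flat bucket index for point (x, y, z) with cell edge d.
int64_t bucketOf(float x, float y, float z, float d) {
    int64_t i = int64_t(std::floor(x / d));
    int64_t j = int64_t(std::floor(y / d));
    int64_t k = int64_t(std::floor(z / d));
    return wrap(i, W) + W * (wrap(j, H) + H * wrap(k, D));
}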

Edited by LorenzoGatti, 29 November 2012 - 11:38 AM.

Produci, consuma, crepa

#7 alessio_pi_   Members   -  Reputation: 112


Posted 30 November 2012 - 06:39 AM

Thanks, Lorenzo.



