• Advertisement
Sign in to follow this  

Selecting evenly distributed samples in a 10-dimensional space?

This topic is 4327 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hi all, I'm sure this shouldn't be too hard, but for whatever reason my brain just isn't clicking today [headshake] I have a 10 component vector where each element is a floating point value ±X (X could be 1.0, 3.0, 10.0, 500.0 etc..) I want to pick N evenly distributed samples as a seed for a searching algorithm. The idea being that if they're evenly distributed then I should have a fairly unbiased search that can cover most (or all) of the search space. In context, I think my current randomly sampled approach introduces a bias if/when it clusters the initial samples too close together. Anyone got any ideas? Cheers, Jack

Share this post


Link to post
Share on other sites
Advertisement
Assume rand() is a function that generates uniform random numbers between zero and one. You want those numbers to fit into the range [-X,X], so use X*(1.0 - 2.0*rand()). Do that for each element in your array, using the appropriate X for each element.

Share this post


Link to post
Share on other sites
Quote:
Original post by jjd
Assume rand() is a function that generates uniform random numbers between zero and one. You want those numbers to fit into the range [-X,X], so use X*(1.0 - 2.0*rand()). Do that for each element in your array, using the appropriate X for each element.
Thats pretty much what I have already. I'm using a Mersenne Twister for PRNG's, so it should be uniformly distributed.

As an example of what I'm thinking...

If you had a two-dimensional search space and wanted 9 evenly distributed samples then you could easily work that out as being a 3x3 grid:

#-------#
| + + + |
| + + + |
| + + + |
#-------#


It's expanding that concept to N samples over a 10D space [smile]

Cheers,
Jack

Share this post


Link to post
Share on other sites
Quote:
Original post by jollyjeffers
Quote:
Original post by jjd
Assume rand() is a function that generates uniform random numbers between zero and one. You want those numbers to fit into the range [-X,X], so use X*(1.0 - 2.0*rand()). Do that for each element in your array, using the appropriate X for each element.
Thats pretty much what I have already. I'm using a Mersenne Twister for PRNG's, so it should be uniformly distributed.

As an example of what I'm thinking...

If you had a two-dimensional search space and wanted 9 evenly distributed samples then you could easily work that out as being a 3x3 grid:

#-------#
| + + + |
| + + + |
| + + + |
#-------#


It's expanding that concept to N samples over a 10D space [smile]

Cheers,
Jack



Ah, I see. Well, that's trickier [wink]

Perhaps you could use a Sobol sequence.

Share this post


Link to post
Share on other sites
Quote:
Original post by jjd
Ah, I see. Well, that's trickier [wink]
[headshake] Why doesn't that surprise me!

Although, based on your first reply, would it be a reasonable assertion that for good quality uniformly distributed numbers that you would get the pattern I desire?

I'm just wondering if using the Sobol Sequence (or whatever else) might actually be a whole lot of effort for minimal reward...

Cheers,
Jack

Share this post


Link to post
Share on other sites
A Sobol sequence would be sexy... [smile] but my first port of call would be the simple random search. Having said that, you might be able to improve on a simple random search like that by using adaptive monte carlo or markov-chain approaches. They're a bit more complicated, but not much. However, it depends upon whether you can get some kind of likelihood function to assist in the search, i.e. some measure of whether the thing you are searching for is likely to be closer or further away.

What are you trying to compute?

Share this post


Link to post
Share on other sites
Quote:
Original post by jollyjeffers
If you had a two-dimensional search space and wanted 9 evenly distributed samples then you could easily work that out as being a 3x3 grid:

#-------#
| + + + |
| + + + |
| + + + |
#-------#


It's expanding that concept to N samples over a 10D space [smile]


That concept in reverse is like saying "I want 3 samples on each axis, and to fill a 2-dimensional space". The number of samples you need is 3^2 = 9.

If you have a 10-dimensional space, and you want 3-samples on each axis, you'll need 3^10 = 59049 samples in total.

Going the other way, from a number of samples to where to place them, is obviously much harder, but the dificulty does not lie in the fact that you are dealing with a space of greater dimensionality. What would you do if, to copy your earlier example - "you had a two-dimensional search space and wanted 7 evenly distributed samples".

Put another way, is there any flexibility on the number of samples? If so, you could neep a number of "patterns" and choose that which had a number of samples closest to that requested. If you are wanting a solution that will work for any arbitrary number of samples, I beleive the problem is too general - your best bet will almost certainly be sticking to random samples.

Share this post


Link to post
Share on other sites
Quote:
Original post by jollyjeffers
In context, I think my current randomly sampled approach introduces a bias if/when it clusters the initial samples too close together.


If you are sampling an area for surface normal construction data, one method is to use a sphere of radius r in Rn as the guide for your sample point distribution guide.

The most basic sample would be a cross-shaped grid in Rn (the same as used for linear interpolation texture sampling).

Sample points in Rn (based at origin):

( 0, 0, 0, 0) // centre point
( 1, 0, 0, 0) // R1
(-1, 0, 0, 0)
( 0, 1, 0, 0) // R2
( 0, -1, 0, 0)
( 0, 0, 1, 0) // R3
( 0, 0, -1, 0)
( 0, 0, 0, 1) // R4
( 0, 0, 0, -1)

...

(0, 0, 0, 0, 0, 0, 0, 0, 0, 1) // R10
(0, 0, 0, 0, 0, 0, 0, 0, 0, -1)




Uniformly distributing points along a sphere surface / inside a sphere volume is a sticky problem which involves stochastic methods of creation.

An intermediate step between the cross structure and the stochastic methods would be to use the vertices from a unit icosahedron in Rn, radius scaled to r. The distribution is perfect.

To take the icosahedron from a 2D surface in R3 to a 3D space in R3, one can make a duplicate set of vertices and scale them to, say, r/2. Any number of layers (hulls) can be used.

Linear distribution of the hull radii isn't necessary either. A weighting function can be used. Where r <= 1.0, and 0.0 < scale_coeff <= 1.0; f(r, scale_coeff) = r*sqrt(scale_coeff)

[Edited by - taby on April 19, 2006 11:58:36 AM]

Share this post


Link to post
Share on other sites
Quote:
Original post by bakery2k1
...the dificulty does not lie in the fact that you are dealing with a space of greater dimensionality.


Conceptually I would agree, but performance-wise it is better to use random sampling that creating a grid-like pattern. This is the so-called "curse or dimensionality."

Share this post


Link to post
Share on other sites
Perhaps I'm missing something, but doesn't selecting N evenly distributed samples in a 10D space means that
n := ⌊N1/10
samples per dimension are available if all dimensions show the same range and N is a upper bound, or
n := ⌈N1/10
if N is an lower bound?

Thinking of each sample to "cover" the same area, the range
R
of each dimension is to be divided up into just n slices, so that each sample is in the center of that slice:
si := ( 0.5 + i ) R / n, 0 <= i < n

(That would become interesting if the ranges of each dimension are not the same...)

Quote:
Original post by jjd
Conceptually I would agree, but performance-wise it is better to use random sampling that creating a grid-like pattern. This is the so-called "curse or dimensionality."

I've understood the OP asking for a grid due to:
Quote:
Original post by jollyjeffers
The idea being that if they're evenly distributed then I should have a fairly unbiased search that can cover most (or all) of the search space.

In context, I think my current randomly sampled approach introduces a bias if/when it clusters the initial samples too close together.

Share this post


Link to post
Share on other sites
Quote:
Original post by haegarr
I've understood the OP asking for a grid...


He asked for evenly distributed samples. Given that he also mentioned random samples, I do not think he was restricting himself to a deterministic method.

Share this post


Link to post
Share on other sites
Quote:
Original post by jjd
Quote:
Original post by haegarr
I've understood the OP asking for a grid...


He asked for evenly distributed samples. Given that he also mentioned random samples, I do not think he was restricting himself to a deterministic method.

Okay, reading the OP another time gives you right.


I remember the days of programming a ray-tracer ... There was a method used to avoid lumping during (more or less) randomly sampling the scene. The basic method was to use a regular grid and to add some jitter to the sample positions. There was also an extension with a distance measure to guarantee a minimum distance. But AFAIK that method was already costly for 2 dimensions...

Share this post


Link to post
Share on other sites
Thanks for the many replies!

Quote:
it depends upon whether you can get some kind of likelihood function to assist in the search, i.e. some measure of whether the thing you are searching for is likely to be closer or further away.
Don't really have this information - I have my 10D vector and a f(x) function to evaluate it.

Quote:
What are you trying to compute?
Finding the global minimum of an arbitrary function. Nothing hugely exciting unfortunately [smile]

Quote:
If you are wanting a solution that will work for any arbitrary number of samples, I beleive the problem is too general - your best bet will almost certainly be sticking to random samples.
I think you might be right here. The number of samples is semi-variable... as in, it can be changed on different invokations, but once the algorithm starts it can't vary the number itself. Specifically, if its told to use 100 samples then it should use 100 - not 103 (etc..)

Quote:
He asked for evenly distributed samples. Given that he also mentioned random samples, I do not think he was restricting himself to a deterministic method.
My original post was possibly a bit misleading... I did want a deterministic method - my current approach is (theoretically) an evenly distributed random method.


To be honest, I'm not entirely sure what I wanted to start with [smile]

A problem I'm having is, when picking other parameters, each invokation of the algorithm is generating a new random distribution - thus the starting point is not the same. Whilst the results are roughly comparable, it'd be nice if they all started in the same state thus comparing apples-to-apples. This thought led me to devising an algorithm to distribute the samples correctly/evenly each time, and ended up leading to this thread...

Thanks for all your help so far - some interesting ideas. I think I'll just stick with my random sampling at the moment. Maybe I could improve on it, but it's not strictly "wrong" so I'll leave it for now...

Cheers,
Jack

Share this post


Link to post
Share on other sites
Quote:
Original post by haegarr
I remember the days of programming a ray-tracer ... There was a method used to avoid lumping during (more or less) randomly sampling the scene. The basic method was to use a regular grid and to add some jitter to the sample positions. There was also an extension with a distance measure to guarantee a minimum distance. But AFAIK that method was already costly for 2 dimensions...


I've used that trick too, but in many dimensions I believe it will be quite bad. However, I really like your original idea where you want to find a well distributed set of N points in the space. Perhaps finding the points that are maximally separated. Sounds tricky, but interesting.

Share this post


Link to post
Share on other sites
Quote:
Original post by jollyjeffers
Quote:
What are you trying to compute?
Finding the global minimum of an arbitrary function. Nothing hugely exciting unfortunately [smile]


In that case, if you are interested, I would suggest checking out the Metropolis-Hasting algorithm.

[Edit]
Metropolis-Hasting's is a Monte Carlo, Markov Chain method but it also falls under the name of simulated annealing. You can think of your f(x) as a likelihood function: The greater the value the more likely you have found the point you are looking for.

[Edit again] Err, for minima it should be "the lesser the value the more likely..."

Share this post


Link to post
Share on other sites
A hashing algorithm based on scene data could also be used to create the random generator seed, if one decides to rely on sample point sets generated per-frame. That way the same frame always generates the same sample point set.

There are a bunch of GPL MD5 C++ implementations out there.

To make the 128-bit MD5 hash key fit through a 32-bit random number generator, it can be chopped up into four 32-bit double words.

Feed the first dword in as the seed, then generate a random number A, repeat using each of the three remaining dwords, storing the random number in its respective random number variable B, C or D.

Calculate the arithmetic mean for A, B, C and D to obtain the final seed value for the frame.

Share this post


Link to post
Share on other sites
Quote:
Original post by jjd
[Edit]
Metropolis-Hasting's is a Monte Carlo, Markov Chain method but it also falls under the name of simulated annealing. You can think of your f(x) as a likelihood function: The greater the value the more likely you have found the point you are looking for.

[Edit again] Err, for minima it should be "the lesser the value the more likely..."


Good point, bringing up simulated annealing. I agree that the possibilities are there.

Relaxation techniques could also be used to numerically generate an approximately even distribution. :)

Fun!!!!!

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement