Sign in to follow this  
pinacolada

pseudorandom function that's 1-to-1?

Recommended Posts

Here's what I'm trying to do. I want an algorithm that can give me a random ordering of N elements. I want it to visit each element exactly once (in other words, it's 1-to-1). Now, this problem is easy if you make a big array that encodes the function. As in, you say "int func[N]; for (int i=0; i < N; i++) func[i] = i;", and then you shuffle the array elements using whatever pseudorandom function you want, and now "func" gives you a random 1-to-1 translation of your N elements. Another way to solve the problem is to somewhere keep track of which elements have been visited already. But I'd like to find a solution where you don't have to allocate memory to keep track of things. As in, both the above ideas require O(N) memory. I want a solution that uses O(1) memory. All I want is a plain old pseudorandom function, with the extra guarantee that it will spit out each number once and only once. Any ideas? My best attempt so far as been with modulo arithmetic. The formula "(i * p) mod N" (for i = 0..N) is guaranteed to spit out every value between 0 and N, as long as p and N are coprime. But the order it gives is not really random. With N = 20 and p = 37, I got this sequence: 0, 17, 14, 11, 8, 5, 2, 19, 16, 13, 10, 7, 4, 1, 18, 15, 12, 9, 6, 3. And it seems like no matter what value of p I pick, I get similarly linear-esque numbers.

Share this post


Link to post
Share on other sites
Perhaps I'm missing the point or the following is wrong in some other way, but couldn't you just do:

// The numbers to be shuffled
int numbers[N];

// Shuffle
for ( int i = 0; i < NUM_SHUFLE_STEPS; i++ )
{
int a = rand() % N;
int b = rand() % N;

int t = numbers[a];
numbers[a] = numbers[b];
numbers[b] = t;
}

Note that NUM_SHUFFLE_STEPS does not have to be equal to N. Also note that modulo is bad for rand() but it serves my example. I think you would have thought of this yourself, so I probably missed something.

Greetz,

Illco

Share this post


Link to post
Share on other sites
Your idea of using modulos with a coprime is a good one. The ordering issue becomes less a problem for certain well-chosen coprimes, but I'd suggest that you simply do multiple layers of your basic method. Given m coprimes of N named p1 .. pm (ideally, they should be relatively prime to each other, too), compute for each i0 = 0 .. N-1:

i1 = (i0 * p1) mod N
i2 = (i1 * p2) mod N
...
im = (im-1 * pm) mod N

and then your i0th index is im. The higher your m is, the more random it'll look.

This is just off the cuff, so I take no responsibility. Tell me if it works, tho.

Share this post


Link to post
Share on other sites
Quote:
Original post by Illco
Perhaps I'm missing the point or the following is wrong in some other way, but couldn't you just do:
....


Yeah, that would do the job. But my goal is to find a solution that does not require O(N) memory.

Share this post


Link to post
Share on other sites
Quote:
Original post by pinacolada
Yeah, that would do the job. But my goal is to find a solution that does not require O(N) memory.


It does not, does it? Yes for the numbers you want to permute, but no other.

Share this post


Link to post
Share on other sites
I don't think it's possible. You have to somehow store the set of numbers you haven't visited yet. Since the all the possible sets have equal probabilites, you can't really compress the set in anyway.
You could maybe start with a set of numbers you've visited and halfway through switch to a set of numbers that are left. That way you'd split the memory usage. Though that wouldn't really be worth the trouble. And it's still O(N).

Share this post


Link to post
Share on other sites
Quote:
Original post by FlowingOoze
I don't think it's possible. You have to somehow store the set of numbers you haven't visited yet.

Not really. The key is pseudorandomness: Given the seed and the current index, you can deterministically find the set of all values you've already visited. The trick is to be able to find an element not in that set that doesn't require exhaustively computing the set each time.

Share this post


Link to post
Share on other sites
If you are willing to spend time O(n) everytime you ask for an element, it's doable. Other than that, I can't think of a good solution. It seems to me that you are trying to generate a pseudo-random permutation, and the result has length N, so there's not much you can do about it.

Share this post


Link to post
Share on other sites
Quote:
Original post by Sneftel
Quote:
Original post by FlowingOoze
I don't think it's possible. You have to somehow store the set of numbers you haven't visited yet.

Not really. The key is pseudorandomness: Given the seed and the current index, you can deterministically find the set of all values you've already visited. The trick is to be able to find an element not in that set that doesn't require exhaustively computing the set each time.

Yes, I suppose that if the combinations are not equally probable, then you could represent the visited numbers in smaller space.
How about using huffman coding or arithmetic coding to store the set, if you only take a subset all of the sets. If you take logarithm of all the possible sets, then you'd need O(log(N)) to store a set.
How to pick the sets, is a problem though. You'd need to know what the sets are to encode a set.

EDIT: Oh, misread that quote. Sorry.

Share this post


Link to post
Share on other sites
Quote:
Original post by Sneftel
Your idea of using modulos with a coprime is a good one. The ordering issue becomes less a problem for certain well-chosen coprimes, but I'd suggest that you simply do multiple layers of your basic method. Given m coprimes of N named p1 .. pm (ideally, they should be relatively prime to each other, too), compute for each i0 = 0 .. N-1:

i1 = (i0 * p1) mod N
i2 = (i1 * p2) mod N
...
im = (im-1 * pm) mod N

and then your i0th index is im. The higher your m is, the more random it'll look.

This is just off the cuff, so I take no responsibility. Tell me if it works, tho.


Excellent idea. Unfortunatly I tried it out and didn't have too much success. :(

Here's the result of 37*i mod 20, for i = 0..20:
0, 17, 14, 11, 8, 5, 2, 19, 16, 13, 10, 7, 4, 1, 18, 15, 12, 9, 6, 3

Here's the result of 149 * (37*i mod 20) mod 20:
0, 13, 6, 19, 12, 5, 18, 11, 4, 17, 10, 3, 16, 9, 2, 15, 8, 1, 14, 7

Here's the result of 71 * (149 * (37*i mod 20) mod 20) mod 20:
0, 3, 6, 9, 12, 15, 18, 1, 4, 7, 10, 13, 16, 19, 2, 5, 8, 11, 14, 17

So that second sequence definitely started to look good, but then the 3rd operation actually made it far less random-looking. I think the problem is that every one of these operations is a linear-esque translation. And that if we keep piling on more and more linear-esque translations, the end result is still going to be linear-esque.

Share this post


Link to post
Share on other sites
Yeah. Hm. Drat.

What you want, essentially, is a way to compute the nth element of the mth permutation of a sequence with time complexity O(1) and storage complexity O(1) (or, at least, O(log |N|)). Your random seed is then m. Unfortunately, I don't know such a method offhand; there are methods for computing the entire permutation, but that doesn't really help you.

Share this post


Link to post
Share on other sites
The best I can think of would require O(log(N!)) storage (and you don't want to know the time complexity). I'm beginning to think this isn't actually possible for arbitrary N. Do you have an upper limit on N?

Share this post


Link to post
Share on other sites
Heh, well to be honest, I don't really "need" a solution for my current project. The problem came up while I was writing a blackjack game, so in my case N equals 52. So allocating the extra memory for this is nothing.

It's just that this problem comes up frequently enough, that I thought it would be fun to try to find a solution. You know, just to have another trick to put in the coding toolbox.

But now I'm curious to hear this solution that requires O(log(N!)) space! Wouldn't O(log(N!)) actually be bigger than O(N)?

Share this post


Link to post
Share on other sites
Basically the theory is that any permutation of N elements can be represented by a number representing which number it is in the permutation. So storage for that number is log(N!). N! for the size of the number, log for the number of bits. So for smallish N, it will take less storage than storing the N elements because the constant terms are nice. Given the number you can reconstruct which element comes next and the next storage number with some really ugly formulas involving factorials. And that's when the time term blows up.

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
Assuming the N elements are distinct, then it would take lg(N!) bits to enumerate each permutation.

Share this post


Link to post
Share on other sites
What you are looking for is the usual RNG with a smaller-than-usual range.

For example, here is an LCG, which is similar to your formula: yi = ( a * yi-1 + b ) % m. In your case, m = 20. You can try different values for a and b until you find a set that generates all 20 values exactly once.

Unfortunately, you probably want an algorithm would find suitable values for a and b given m, but I doubt it exists. On the other hand, there are other kinds of RNGs that might meet your requirements.

Share this post


Link to post
Share on other sites
People that point that the space requirement is O(log(N!)) are right. The issue is that if your unsigned ints only have 32 bits and you want to generate a permutation of more than 2^32 elements, it is not enough to store the permutation as `unsigned int perm[N];'. In practice, though, it doesn't matter, since you are unlikely to want a permutation of that many elements.

Let's see how fast log(N!) grows:
log(N!) = log(1*2*3*...*(N-1)*N)
= log(1)+log(2)+log(3)+...+log(N-1)+log(N)
~= Integral(log(x),x=1..N)
= N*log(N)-N

So O(log(N!)) = O(N*log(N)), which makes sense since we could store the permutation as `my_class_of_length_log2_N perm[N]'.

Share this post


Link to post
Share on other sites
Using Google a little, I think I found a solution to your problem. There are cryptographic encoding methods called "block ciphers", which encode any sequence of k-bits as another sequence of exactly k-bits. The most famous of these methods is DES.

Here's the algorithm you can use.

class PseudoRandomPermutation{
unsigned N;
BlockCipherKey key;
unsigned count;

public:
PseudoRandomPermutation(unsigned N,BlockCipherKey key):N(N), key(key), count(0){
}

unsigned next(){
unsigned result;

do{
result = BlockCipherEncode(count++,key);
}while(result>=N);

return result;
}
};





Of course, you should seed your algorithm with a random key.

You are basically just asking to encode 0,1,2,3,4,... and the results are going to be k-bit numbers that never repeat. If a result if larger than N, you discard it. You must pick a value of k such that log2(N)<k. If k is the smallest of those values, you will hve to discard less than half of the results, so your algorithm is still fast.


Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this