"Unique" random number generator

Started by
47 comments, last by random_thinker 18 years, 8 months ago
Let's say that I have a RNG that generates number between 0 and N - 1. Now I want M random numbers that are unique in the sense that the same number may not be used twice. I.e: N = 20, M = 5 Valid sequence: 10, 19, 3, 7, 8 Invalid sequence: 10, 19, 3, 10, 8 (Two tens not allowed). Is it possible to achieve this in linear time, O(M)? I've managed to do this in O(N + M) and O(M * M). I know there is a method that generates all 2^x numbers in a seamingly random order (in O(M)). This method isn't random enough for me (same sequency all the time) and I need to use a very good RNG (actually it's a real RNG not an algorithm).
Advertisement
The simplest and, I believe, fastest method is to generate an array with the numbers 0 to N-1 and shuffle it, then draw the first M elements.

Quote:I know there is a method that generates all 2^x numbers in a seamingly random order (in O(M)).


You can first randomly build a bit permutation table...
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." — Brian W. Kernighan
Quote:The simplest and, I believe, fastest method is to generate an array with the numbers 0 to N-1 and shuffle it, then draw the first M elements.

First filling the array is O(N).
The number of times you need to swap to get good randomness is 2N (in my experience) so this is still O(N).
Third getting the numbers is O(1).
So getting M numbers is O(N + M) which I already have a method for (which I'm quite sure is faster than the init/swap setup).

Quote:You can first randomly build a bit permutation table...

Sure can!
But building it is also O(N), so the total time is O(N + M).
Quote:The number of times you need to swap to get good randomness is 2N (in my experience) so this is still O(N).


How do you define "good randomness"? std::random_shuffle is generally implemented with only does one pass.

If N is much bigger than M, you can use a rejection method, with a hash table for the already-drawn lookup. Of course, big-O analysis gets a bit more complex (meaning, I don't care about doing the math right now).

Oh, and building a bit permutation table would be O(log N), not O(N).


FWIW, I don't think there exist any magic solution that's any faster than what you already have.
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." — Brian W. Kernighan
Quote:std::random_shuffle

Never used that one :)
But it's still O(N) to initiate the numbers.
Quote:If N is much bigger than M, you can use a rejection method

This is what my O(M * M) method was all about.

I should have mentioned that my M is around N/2 to N/5.
In my case the O(N + M) is alot better then the O(M * M) one.

Quote:FWIW, I don't think there exist any magic solution that's any faster than what you already have.

I really didn't think so either :) Just wanted to do a sanity check to see if I was missing something obvious.

Quote:Oh, and building a bit permutation table would be O(log N), not O(N).

Oh.. I was missing something here.
The method (I lost the name, anyone?) I talked about gives N numbers in the [0, N) intervall.
You pick a start value and then you get the next value from the current, ie: x = getNext(x);
The N'th number is always zero, the first may not be zero.
The only random about this method is chosing the intitial value.
I guess you could xor the current x with a "seed", is this what you mean?
I wonder how "random" this will get, quite interesting though.
In my case I can't use it since I need true random numbers.
If you're running on Windows you could use RtlGenRandom or CryptGenRandom to get cryptographically safe random numbers.

http://msdn.microsoft.com/library/en-us/seccrypto/security/rtlgenrandom.asp
http://msdn.microsoft.com/library/en-us/seccrypto/security/cryptgenrandom.asp
Quote:Original post by eq
The only random about this method is chosing the intitial value.

The only thing random about any PRNG is chosing the initial value. Unless you have a hardware generator, there's only so much you can hope for.

CM
Quote:Original post by eq
Let's say that I have a RNG that generates number between 0 and N - 1.
Now I want M random numbers that are unique in the sense that the same number may not be used twice.

I'm not sure what you're using the numbers for, but you should be aware that placing conditions like this actually decreases the randomness (in the mathematical sense) that you're getting. A malicious attacker could potentially exploit this fact to his advantage.

Quote:I know there is a method that generates all 2^x numbers in a seamingly random order (in O(M)). This method isn't random enough for me (same sequency all the time) and I need to use a very good RNG (actually it's a real RNG not an algorithm).

This is impossible. Deterministic machines (read: computers) are not capable of producing "real" RNGs without some kind of truly random input.
- k2"Choose a job you love, and you'll never have to work a day in your life." — Confucius"Logic will get you from A to B. Imagination will get you everywhere." — Albert Einstein"Money is the most egalitarian force in society. It confers power on whoever holds it." — Roger Starr{General Programming Forum FAQ} | {Blog/Journal} | {[email=kkaitan at gmail dot com]e-mail me[/email]} | {excellent webhosting}
If you use a sparse array I believe you can get the complexity down to O(M), but this will only be faster in the case where N is large and M is small, because of the increased overhead of the sparse array. Something like:
#include <algorithm>#include <deque>#include <iostream>#include <map>int random(int high){	return (std::rand() * high) / RAND_MAX;}std::deque< int > uniqueRandoms(int range, int count){	std::deque< int > results(count);	std::map< int, int > mapping;	for (int i = 0; i < count; ++i)	{		int value = random(range);		std::map< int, int >::iterator lookup1 = mapping.find(i);		std::map< int, int >::iterator lookup2 = mapping.find(value);		if (lookup1 == mapping.end())		{			if (lookup2 == mapping.end())			{				mapping.insert(std::make_pair(i, value));				mapping.insert(std::make_pair(value, i));			}			else			{				mapping.insert(std::make_pair(i, lookup2->second));				lookup2->second = i;			}		}		else		{			if (lookup2 == mapping.end())			{				mapping.insert(std::make_pair(value, lookup1->second));				lookup1->second = value;			}			else			{				std::swap(lookup1->second, lookup2->second);			}					}	}	std::map< int, int >::iterator it = mapping.begin();	for (int i = 0; i < count; ++i, ++it)	{		results = it->second;	}	return results;}int pass = 0;int fail = 0;void test(std::deque< int > results){	std::sort(results.begin(), results.end());	if (std::unique(results.begin(), results.end()) == results.end())	{		++pass;	}	else	{		++fail;	}}int main(){	for (unsigned int i = 0; i < 1000; ++i)	{		int outof = 1 + random(8000);		int count = random(outof);		std::deque< int > results = uniqueRandoms(outof, count);		test(results);	}	std::cout << "Passed: " << pass << "\nFailed: " << fail << '\n';}


Enigma
Quote:is impossible. Deterministic machines (read: computers) are not capable of producing "real" RNGs without some kind of truly random input.

This is why I wrote: "actually it's a real RNG not an algorithm".
I have hardware that detects radioactive decay and uses that as input (and some other stuff that uses thermal noise etc).

The question was not how to get good random number, that is solved.

The problem is to make sure that any random number within a range isn't used twice or more efficiently.

Quote:If you're running on Windows you could use RtlGenRandom or CryptGenRandom to get cryptographically safe random numbers.

This is what I use for testing (without the external hardware).
It's not a cryptographic application I'm working with though.

Still doesn't "solve" the problem, i.e generating M random numbers within N possible values in O(M) time, without ANY duplicates in M.

Fruny's method and mine solves the problem, but not in O(M) but rather in O(N + M).

Thanks for all suggestions anyway, but I believe that as Fruny said, it's impossible.
If you only needed a certain degree of "randomness" then a O(M) method might be possible using the *** algorithm, i.e:
unsigned int x = rand(1, N - 1);unsigned int seed = rand(0, N - 1);for (unsigned int m = 0; m < 10; ++ m){  x = getNext(x);  randomValue[m] = x ^ seed;}

This topic is closed to new replies.

Advertisement