Archived

This topic is now archived and is closed to further replies.

Generating a range of random integers

This topic is 5775 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

For that specific range, use the CRT rand() function like so:


(rand() % 90) + 10


That gives you 10-99, actually; modify as you see fit. For general ranges you could use a function similar to this:


int RandInRange( int min, int max )
{
// TODO: Assertions or checks for min < max, both > 0, etc.
return (rand() % (max - min)) + min;
}



--
Eric

Edited by - ekenslow on February 18, 2002 12:01:15 AM

Share this post


Link to post
Share on other sites
Eric's method reduces the randomness of the number returned. You'd probably only notice it if getting lots of random numbers. To get around that problem, do this:

any_type Value = ((rand()/(float)RAND_MAX)*(max - min))+min;


Edit: Left something out



Edited by - Null and Void on February 18, 2002 12:06:32 AM

Share this post


Link to post
Share on other sites
I''m tipsy from this wonderful Chimay Grande Reserve I just finished off, so I''ll point out (again) that using % with rand() won''t give you a nice distribution of random numbers - you should use the upper bits instead.

But of course, if your random numbers don''t matter too much, % is a perfectly fine way to go.

-scott

Share this post


Link to post
Share on other sites
You most certainly may recommend those - they are both great. As is Orval, Hapkin, Brigand, Piraat....oh the list is so long.

My favorite pasttime of late have been lambics, especially those from the Cantillon brewery. Last year, I was blessed to taste their apricot lambic on handpump in a local beer bar (where I will be headed tonight, as a matter of fact!). Pure enjoyment. (so sour, yet so good...)

Share this post


Link to post
Share on other sites
My assumption was that speed and simplicity were more critical than pure randomness. If randomness is critical, the CRT rand() function should be avoided anyway. See this thread for a discussion of alternate RNGs. Simply scaling the values returned from rand() is not sufficient, as rand() is just not that random to begin with. You''d be throwing away the advantages of speed and simplicity for no real gain.

In any case, I''m skeptical that using modulo on rand() reduces its randomness, although I''m not capable of proving it either way. Does anybody have a link to a good source about this?

--
Eric

Share this post


Link to post
Share on other sites
Firstly, random number generators are just patterns. A modulus operation is essentially another kind of pattern, and the two may coincide to produce a 3rd pattern. One example is that on some systems, random number generators tend to generate an odd number, then an even one, and so on. So rand() % 2 is often too predictable.

Secondly, unless the range is a multiple of the number you''re dividing by to get the remainder, then your result is biased a tiny amount towards the lower numbers. This is usually an insignificant amount, but that depends on how big your range is. If the max range of rand() was 32767 and you wanted a number between 1 and 20000, then rand() % 20000 is going to give you twice as much chance of scoring 10000 as it would of scoring 15000.

[ MSVC Fixes | STL | SDL | Game AI | Sockets | C++ Faq Lite | Boost ]

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
We dont use modulus because as many have said it alters the distribution. Instead you scale. If you need 0 to 20000 and you are getting 0 to 32767 then you multiply by 20000/32767 thus maintaining the distribution.

this works well:

rand()*(high-low+1)/RAND_MAX)+low

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
Scaling is also faster since you only need mul and bitshifting (assuming the random range is 2^n, which it always is). % uses div, unless you do it on constant numbers that are p=2^n in which case it''s optimized to x&(p-1)

Share this post


Link to post
Share on other sites
Er, maths IS an authoritative source. You don''t need a paper by Knuth to see that taking the remainder of a division is not going to give you an equal distribution unless you''re dividing by a factor of the maximum.

[ MSVC Fixes | STL | SDL | Game AI | Sockets | C++ Faq Lite | Boost ]

Share this post


Link to post
Share on other sites
Well, I just wrote a test program that compared scaling to modulo for getting 10,000,000 numbers in the range 0-20000, and except for tiny differences the results are the same either way. Tiny meaning thousandths of a percent maximum. In particular, the distribution at 5000, 10000, and 15000 was less than ten thousands of a percent different between the two methods. So much for the theory that modulo operations skew the result.

To summarize, using the modulo method with rand() is fine, provided that rand()''s randomness is good enough for your application. Scaling won''t buy you anything but floating-point headaches; integer operations are the way of speed and simplicity.

--
Eric

Share this post


Link to post
Share on other sites
Well, ekenslow, here''s your authoritative response:

"If you want to generate a random integer between 1 and 10, you should always do it by
  j=1+(int) (10.0*rand()/(RAND_MAX+1.0));  

and never by anything resembling
  j=1+((int) (1000000.0*rand()) % 10);  

(which uses lower-order bits)."
--Numerical Recipes in C: The Art of Scientific Computing (William H. Press, Brian P. Flannery, Saul A. Teukolsky, William T. Vetterling; New York: Cambridge University Press, 1990 (1st ed, p. 207))

HOWEVER, rand(), like a lot of functions, is implementation dependant, and some modern implementations (like the Linux C Library''s implementation) use a better algo. (it was this man page off of which I pulled the above quote.)

Besides, Belgian beer is a much more interesting topic.
-scott

Share this post


Link to post
Share on other sites
quote:
Original post by ekenslow
To summarize, using the modulo method with rand() is fine, provided that rand()'s randomness is good enough for your application. Scaling won't buy you anything but floating-point headaches; integer operations are the way of speed and simplicity.


OK, hate to bring cold hard facts into the middle of all this hand-waiving...oh, who am I kidding, I love it.

The period of the pseudo-random variable in my distribution is 2^32. How do I know? I wrote a program that did this:
holdrand = holdrand * 214013L + 2531011L; // ripped from my CRT
until holdrand = 0 again, incrementing a counter. My counter gets to UINT_MAX just as holdrand changes to 0 again. At this point, the pseudo-random number has run its sequence. The entire random number returned is given by this:
((holdrand = holdrand * 214013L + 2531011L) >> 16) & 0x7fff

So if your implementation's the same, you should get the same results.

I wrote the following program:
    
#include <cstdlib>
#include <climits>
#include <iostream>
#include <fstream>
#include <cassert>

using namespace std;

int main ()
{
srand (0);
unsigned long counts[(RAND_MAX+1)];
memset (counts, 0, sizeof (counts));
for (unsigned long i=0; i != ULONG_MAX; ++i)
{
++counts[rand()];
}

ofstream rpt ("report.txt");
for (i=0; i<RAND_MAX + 1; ++i)
{
rpt << i << ": " << (int) counts[i] << endl;
}
return 0;
}

This generates a report file with the frequency of every single number from 0 to RAND_MAX. Here's the abbrieviated output:
0: 131071
1: 131072
2: 131072
3: 131072
4: 131072
5: 131072
6: 131072
7: 131072
8: 131072
9: 131072
10: 131072
11: 131072
12: 131072
13: 131072
14: 131072
....should I countinue?...naw, let's skip to the end...
32757: 131072
32758: 131072
32759: 131072
32760: 131072
32761: 131072
32762: 131072
32763: 131072
32764: 131072
32765: 131072
32766: 131072
32767: 131072

So, if your range is something like, 0 - 0x6000, and you use modulo instead of a range multiplier, values of 0-0x1FFF will be TWICE as likely as values of 0x1FFF-6000. If you use the scaling strategy, individual numbers will be more likely than other numbers, but these peaks will be spread out over the range.

edit: miscalculated the aliasing error range.


Edited by - Stoffel on February 20, 2002 8:38:57 PM

Share this post


Link to post
Share on other sites
PS: Duvel rocks. There was another I liked while I was in Holland, but I can''t remember how to spell the name: La Sheuffe or something? I think it literally translates as "The Elf". Good stuff. Get you ''faced in no time at all.

Share this post


Link to post
Share on other sites
quote:
Original post by ekenslow
Well, I just wrote a test program that compared scaling to modulo for getting 10,000,000 numbers in the range 0-20000, and except for tiny differences the results are the same either way.

Find a system where RAND_MAX is 32767, and you will be very wrong. Or use a range that is a larger proportion of RAND_MAX, and the error will grow accordingly.


[ MSVC Fixes | STL | SDL | Game AI | Sockets | C++ Faq Lite | Boost ]

Share this post


Link to post
Share on other sites
I''m using VC6, where RAND_MAX is 32767. If we were talking a crypto application I wouldn''t argue this at all, but for a game situation, particularly in a case where you need numbers in a small range such as 0-100, modulo is fine and faster than scaling by a factor of 2.5 on my machine. Again, my assumption is that speed is key, as for a game application. Also again, if the exact distribution is that important you should probably be using a better RNG than CRT rand() anyway.

scaught, thank you for the reference.

--
Eric

Share this post


Link to post
Share on other sites
Another thing to consider (besides just distribution) is the pattern (for lack of a better word) in which the numbers come out - for example, a random number generator is inherently faulty if each number generated is greater than the last until it wraps around some point, no matter how ''random'' the numbers it actually generates are (example, 1, 3, 4, 10, 11, 16, 18...).

Our game, at one point (7yo codebases can suck sometimes), had someone''s half-assed implementation of what they thought was a LCS-based random number generator. While it provided sufficient distribution, it''s pattern of numbers caused certain logic based on multiple random numbers to trigger very similarly most of the time - an effect which caused our gameplay to severely suffer.

-scott

Share this post


Link to post
Share on other sites
If RAND_MAX is 32767, and your range is 20000, and using modulus you don''t find that 16000 occurs twice as frequently as 17000, then your program has a bug in it. It''s that simple.

[ MSVC Fixes | STL | SDL | Game AI | Sockets | C++ Faq Lite | Boost ]

Share this post


Link to post
Share on other sites
quote:
Original post by ekenslow
If we were talking a crypto application I wouldn''t argue this at all, but for a game situation, particularly in a case where you need numbers in a small range such as 0-100, modulo is fine and faster than scaling by a factor of 2.5 on my machine.


That''s certainly your decision to make. Ever played Diablo 2? For months, the best kind of weapon for a certain class (exceptional lance) would never appear in the game. Other instances have appeared over the past where certain monsters always dropped certain items, etc. Wanna take bets that it was careless implementation of the random number generator? 10,000 users playing your game for hours on end _will_ uncover any minute mistake, especially in random number generation.

quote:

Again, my assumption is that speed is key, as for a game application.


If you''ve profiled your game and found that the random number generator is a significant bottleneck, I will concede that point as well.

quote:

Also again, if the exact distribution is that important you should probably be using a better RNG than CRT rand() anyway.


See, here''s where these "rand isn''t good enough" posts get frustrating for me. (I know this isn''t a "rand isn''t good enough" post, but it''s been cross-linked with one.) When you go about proving that rand _DOES_ generate a perfectly uniform output, and that aliasing the output _WILL_ skew your probability density, people say you''re being too nitpicky. Chaps my hide.

quote:
Original post by scaught
Another thing to consider (besides just distribution) is the pattern (for lack of a better word) in which the numbers come out - for example, a random number generator is inherently faulty if each number generated is greater than the last until it wraps around some point, no matter how ''random'' the numbers it actually generates are (example, 1, 3, 4, 10, 11, 16, 18...).


What you''re talking about is the joint distribution between samples, or whether samples are independent from each other. One of the ways to evaluate this is the autocorrelation of a sequence, which is a plot of power verses lag. Assuming you have a sequence of N samples, lag 0 is the sum of the product of every sample:
x(0)*x(0) + x(1)*x(1) + ... + x(N)*x(N)
Lag 1 is the sum of all samples that are 1 unit apart:
x(0)*x(1) + x(1)*x(2) + ... + x(N-1)*x(N)
Lag 2 is the sum of all samples that are 2 units apart:
x(0)*x(2) + x(1)*x(3) + ... + x(N-2)*x(N)
etc., etc.

A random sequence of any zero-mean distribution should have a near-zero autocorrelation with a single spike at lag 0 (being the variance of the random process). (You could use this to evaluate rand since it has a known range by plotting the autocorrelation of (rand() - RAND_MAX/2), which would make it a zero-mean process.) A perfect Pseudo-random number generator would have peaks at lags corresponding to the sequence''s period (2^32 in rand ()''s case). I haven''t looked at rand''s autocorrelation, but I''m just mentioning it for those who haven''t heard.

In the case of the not-so-random function you mention, the autocorrelation would have been an increasing function with each lag--decidedly non-random.

Share this post


Link to post
Share on other sites
quote:
Original post by ekenslow
in a case where you need numbers in a small range such as 0-100, modulo is fine and faster than scaling by a factor of 2.5 on my machine.

Since when is modulo faster?

(1) rand()%100 : uses div, very slow
(2) rand()*100/(RAND_MAX+1) : uses mul and bitshifting, both quite fast operations

I did some tests too. I.e. running through a huge loop that uses rand() on each iteration, with either method (1) or (2)
I compiled on MSVC 6.0 with full optimizations, computer is Athlon 700 MHz.

(1) time elapsed: 7,3 seconds
(2) time elapsed: 2,0 seconds

So the modulo method is 3,6 times slower than the scaling method.

And if you do an inline function like
  inline int rnd(int a) { return rand()*a/(RAND_MAX+1); }  


it shouldn't be too awkward to use either (and it is still as fast).


Edited by - civguy on February 22, 2002 8:01:25 AM

Share this post


Link to post
Share on other sites
I''ve been using a version of the Mersenne Twister that I''ve wrapped up into a C++ class. I present it here as an alternative to rand(). I have been able to generate 10,000,000 random numbers in a range between 1 and 25 in about 0.28 seconds. That''s plenty fast and the distribution is completely satisfactory.

I''ve removed the comments from the original implementor as they
were very lengthy and you can find info for the MT anywhere on the web

Here''s the code;
      
// RANDOM.H - The interface for CRandomMT

#if !defined(RANDOM_GENERATOR_INCLUDED)
#define RANDOM_GENERATOR_INCLUDED

#if _MSC_VER > 1000
#pragma once
#endif // _MSC_VER > 1000


#include <Windows.h>
#include <math.h>
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

// uint32 must be an unsigned integer type capable of holding at least 32

// bits; exactly 32 should be fastest, but 64 is better on an Alpha with

// GCC at -O3 optimization so try your options and see what''s best for you

//


typedef unsigned long uint32;

const int N = 624; // I replaced the #defines for these with const values

const int M = 397;
const int K = 0x9908B0DFU;

#define hiBit(u) ((u) & 0x80000000U) // mask all but highest bit of u


#define loBit(u) ((u) & 0x00000001U) // mask all but lowest bit of u


#define loBits(u) ((u) & 0x7FFFFFFFU) // mask the highest bit of u


#define mixBits(u, v) (hiBit(u)|loBits(v)) // move hi bit of u to hi bit of v



class CRandomMT
{
uint32 state[N+1]; // state vector + 1 extra to not violate ANSI C

uint32 *next; // next random value is computed from here

int left; // can *next++ this many times before reloading


public:
CRandomMT();
~CRandomMT(){};
void SeedMT(uint32 seed);
int RandomMax(int max);
int RandomRange(int lo, int hi);
int RollDice(int face, int number_of_dice);

int D6(int die_count) { return RollDice(6,die_count); }
int D8(int die_count) { return RollDice(8,die_count); }
int D10(int die_count) { return RollDice(10,die_count); }
int D12(int die_count) { return RollDice(12,die_count); }
int D20(int die_count) { return RollDice(20,die_count); }
int D25(int die_count) { return RollDice(25,die_count); }


inline uint32 RandomMT(void);
private:
uint32 ReloadMT(void);
};

#endif //RANDOM_GENERATOR_INCLUDED



          
#include "Random.h"

CRandomMT::CRandomMT()
{
left = -1;
SeedMT(4357U);
}

void CRandomMT::SeedMT(uint32 seed)
{
register uint32 x = (seed | 1U) & 0xFFFFFFFFU, *s = state;
register int j;

for(left=0, *s++=x, j=N; --j;
*s++ = (x*=69069U) & 0xFFFFFFFFU);
}

uint32 CRandomMT::ReloadMT(void)
{
register uint32 *p0=state, *p2=state+2, *pM=state+M, s0, s1;
register int j;

if(left < -1)
SeedMT(4357U);

left=N-1, next=state+1;

for(s0=state[0], s1=state[1], j=N-M+1; --j; s0=s1, s1=*p2++)
*p0++ = *pM++ ^ (mixBits(s0, s1) >> 1) ^ (loBit(s1) ? K : 0U);

for(pM=state, j=M; --j; s0=s1, s1=*p2++)
*p0++ = *pM++ ^ (mixBits(s0, s1) >> 1) ^ (loBit(s1) ? K : 0U);

s1=state[0], *p0 = *pM ^ (mixBits(s0, s1) >> 1) ^ (loBit(s1) ? K : 0U);
s1 ^= (s1 >> 11);
s1 ^= (s1 << 7) & 0x9D2C5680U;
s1 ^= (s1 << 15) & 0xEFC60000U;
return(s1 ^ (s1 >> 18));
}


inline uint32 CRandomMT::RandomMT(void)
{
uint32 y;

if(--left < 0)
return(ReloadMT());

y = *next++;
y ^= (y >> 11);
y ^= (y << 7) & 0x9D2C5680U;
y ^= (y << 15) & 0xEFC60000U;
return(y ^ (y >> 18));
}


int CRandomMT::RandomMax(int max)
{
return((RandomMT()%max));
}


int CRandomMT::RandomRange(int lo, int hi)
{
// This is a recursive function

int num = abs(RandomMT());
int mod = (hi - lo + 1) + lo;
int rv = (num % mod);
if(rv == 0)
{
rv = RandomRange(lo,hi);
}
return(rv);
}

int CRandomMT::RollDice(int face, int number_of_dice)
{
int roll = 0;
for(int loop=0; loop < number_of_dice; loop++)
{
roll += (RandomRange(1,face));
}
return roll;
}


My contribution to the code was to wrap it up into a C++ class and add D##() functions to the interface.
Give it a try...

Dave "Dak Lozar" Loeser

here''s a test of the generator

         
#include "Random.h"

#define NUMBEROFROLLS 100000

int main(int argc, char* argv[])
{
CRandomMT mt;
printf("Generating %ld random numbers...\n",NUMBEROFROLLS*10);
DWORD start = GetTickCount();
for(int j=0; j<NUMBEROFROLLS; j++)
{
mt.D25(1);
mt.D25(1);
mt.D25(1);
mt.D25(1);
mt.D25(1);
mt.D25(1);
mt.D25(1);
mt.D25(1);
mt.D25(1);
mt.D25(1);
mt.D25(1);
//printf(" %lu%s", (unsigned long) mt.D25(1), (j%7)==6 ? "\n" : " ");

}

DWORD end = GetTickCount();
printf("%d random numbers generated in %f seconds\n",NUMBEROFROLLS*10,(double)(end-start)/1000);
return 0;
}


Dave "Dak Lozar" Loeser

Share this post


Link to post
Share on other sites