• Advertisement
Sign in to follow this  

relying on C/C++'s rand() function

This topic is 4329 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Afternoon all... I'm working on some AI-related search techniques (not hugely relevant though) that rely quite heavily on random numbers. Simple stuff like seeding lists of float's, picking a random sample from a list etc... Anyway, I can't (yet) find fault with my implementation - but the results when dumped to Excel seem to point the finger at the rand() function. I keep getting odd patterns where rand() will generate the same numbers repeatedly or the general distribution of randomness is a bit, well, un-random [lol] I don't know much about the stock rand() implementation (Visual C++ 2005), but I've always been working on the general assumption that its nothing special. Would it be a reasonable step to "blame" the PRNG? Either it's just crap and I need to roll my own (Mersenne Twister seems nice) or I possibly need to change my usage of it. My usage: At the start of each search with my algorithm I use srand( GetTickCount( ) ) to seed the generator. I then call rand() whenever I require a random number - probably 500x per iteration with 1000's of iterations. I use a simple (rand() % Max) to pick a random integer (e.g. when picking an index for a random sample from an array), and (static_cast< float >( rand() ) / static_cast< float >( RAND_MAX )) when I want a random value between 0.0 and 1.0 (often scaled to be a float within a given min/max range). I've tried various methods of srand() - never, once at application start, once at algorithm start, once every iteration, or every N iterations etc... I wanted to check with the infinite wisdom and knowledge of GDNet before I try implementing a new PRNG algorithm. Would be a waste of valuable time if my usage of rand() is just wrong or a different PRNG won't actually make any difference (etc..) Cheers, Jack

Share this post


Link to post
Share on other sites
Advertisement
Oh, one other thing...

Whilst running a forum search I came across a suggestion from Fruny about std::random_shuffle(). I'll see about trying that, but another comment in the same thread seemed to indicate that random_shuffle() would just use rand() internally - which strikes me as being no better than what I have. Or is it?

Cheers,
Jack

Share this post


Link to post
Share on other sites
The standard random function looks sth (correct me if I'm wrong):
I32 rand(void)
{
mSeed = (mSeed * 214013L + 2531011L) >> 16;
return mSeed & 0x7FFF;
}

Looks pretty simple - so I'm guessing it doesn't perform great in
terms of returning 'random' values...

The game programming gems series 1, 2 and 5 contain some article on rng,
maybe that will help...

Regards

Share this post


Link to post
Share on other sites
I had issues with rand(). It's slow, and if you're using it to create noise, it looks pretty un-random. I usually use a version of the Mersenne Twister which I found on the Internet somewhere.

Also, using modulo with rand() is A Bad Thing. The higher bits have more randomness to them, so something like int foo = (rand()&0x8000) >> 12; would be better.

Share this post


Link to post
Share on other sites
Thanks for the replies!

Quote:
The game programming gems series 1, 2 and 5 contain some article on rng,
maybe that will help...
Thanks for the tip, but I've left those books in storage at the moment [rolleyes]. Will have to settle for wikipedia for now!

Quote:
Also, using modulo with rand() is A Bad Thing. The higher bits have more randomness to them, so something like int foo = (rand()&0x8000) >> 12; would be better.
VERY interesting - very interesting indeed [grin]

Most of my arrays are between 10-250 in length, so my modulo will always be fairly small.

I think I'll go check out the Mersenne Twister and the non-modulus thing.

Cheers,
Jack

Share this post


Link to post
Share on other sites
random_shuffle() uses rand() by default, but you can pass it a custom function to use instead.
Boost has a slew of PRNGs, including a pair of Mersenne Twisters.

In either case, there is no need to actually roll your own. [smile]

Share this post


Link to post
Share on other sites
Quote:
Original post by Fruny
Boost has a slew of PRNGs, including a pair of Mersenne Twisters.
As stupid as it sounds, I've gotta stick to "vanilla" C/C++ for this particular prototype. Useful to note that Boost can help, but sadly not useful here [sad]

Quote:
Original post by Fruny
In either case, there is no need to actually roll your own. [smile]
I acquired this one (from here) in the end.

Initial results just by swapping like-for-like are substantially better: .

Cheers,
Jack

Share this post


Link to post
Share on other sites
Quote:
Original post by jollyjeffers
As stupid as it sounds, I've gotta stick to "vanilla" C/C++ for this particular prototype. Useful to note that Boost can help, but sadly not useful here [sad]


Then you need to get some better vanilla. See section 5.1. [grin]

OK, that's not helping, I know. [totally]

Share this post


Link to post
Share on other sites
Interesting... so I understand that linked document is a guideline/preview of what might be coming in a C++ compiler sometime in the future..?

I don't really keep up with the latest-n-greatest from the ISO commities [smile]

Cheers,
Jack

Share this post


Link to post
Share on other sites
Don't shy away from the simplicity of an LCG if it happens to suit your PRNG needs :)

Perhaps something simple like this:

long long iSeed;
long rand()
{
iSeed = (iSeed * 16807) & 0x0000FFFF;
return (long)iSeed;
}

A naive implementation off the top of my head, you might be able to use a regular 32-bit seed type if you know more about how the CPU handles overflow (the basic equation is Rn+1 = 16807*Rn mod (232 - 1)). If this doesn't work for you then you can stick with the Mersenne Twister, but this is simple enough to quickly substitute in and test.

Share this post


Link to post
Share on other sites
Quote:
Original post by Kitt3n
The standard random function looks sth (correct me if I'm wrong):


There is no standard implementation of rand(). The generators used can vary from junky pieces of... well, junk, to hooking into a hardware device that relies on quantum physics to produce truely random numbers.

I don't know what PRNG VS2k5 uses, but it is not unthinkable that it's implementation might be suspect. I consider boost a must, but if it really is out of your control, ripping apart one of the headers and removing the dependancies* until you par back to the basic functionality might work - plus, if you keep the same names, if your project manager ever changes his mind, dropping in the original version should get you all those features back at no additional cost.


* Note: I'm assuming there are dependancies, if there arn't it's even simpler.

Share this post


Link to post
Share on other sites
Quote:
Original post by MaulingMonkey
... to hooking into a hardware device that relies on quantum physics to produce truely random numbers.

I believe it is required that following a call to srand, rand produce the same sequence of values every time. I assume there's nothing stopping rand from being more powerful, but it can't go all the way to true randomness.

CM

Share this post


Link to post
Share on other sites
The way I understand random number implementations makes me suspect your multiple calls to srand(). The way rand() is implemented it gives you a series of (2^32)-1 psuedo-random numbers before beginning to repeat the sequence. The seed that is provided, call it n, will give you the nth term in this sequence and every subsequent call to rand() will return the (n + m)th term, m being the number of calls to rand(). So when you see repetition or non-random distribution it could be the following (just an example with small numbers):

seed = 1
sequence = 5, 3, 2, 9, 0, 10, 4, 7, ...

seed = 5
sequence = 9, 0, 10, 4, 7, ...

Notice with a seed of 5 you get the 5th term in the original sequence. I would suggest seeding once and then using rand() without re-seeding. If you need more than (2^32)-1 terms then you'll have to find something else. (or maybe it is 2^(32 - 1), I don't remember...)

Share this post


Link to post
Share on other sites
Quote:
Original post by CyberCowboy
The way I understand random number implementations makes me suspect your multiple calls to srand(). The way rand() is implemented it gives you a series of (2^32)-1 psuedo-random numbers before beginning to repeat the sequence. The seed that is provided, call it n, will give you the nth term in this sequence and every subsequent call to rand() will return the (n + m)th term, m being the number of calls to rand(). So when you see repetition or non-random distribution it could be the following (just an example with small numbers):

seed = 1
sequence = 5, 3, 2, 9, 0, 10, 4, 7, ...

seed = 5
sequence = 9, 0, 10, 4, 7, ...

Notice with a seed of 5 you get the 5th term in the original sequence. I would suggest seeding once and then using rand() without re-seeding. If you need more than (2^32)-1 terms then you'll have to find something else. (or maybe it is 2^(32 - 1), I don't remember...)

Almost everything you just said is wrong. Except the bit about only seeding once, that's sound advice.

Zipster: I recommend doing something to avoid the case where the user seeds with 0.

CM

Share this post


Link to post
Share on other sites
The check for zero would presumably come in the associated seeding function, and not in the actual generation function itself. Otherwise the LCG doesn't generate zeroes on its own. It also turns out that & 0x0000FFFF isn't the same as % (232 - 1), but that's what I was trying to do in spirit. So just replace it with the latter.

Share this post


Link to post
Share on other sites
For a proper LCG, you'd make some HUGE changes:
First, the function should not be solely multiplicative - it should have an additive term, as in "Seed = (Seed * X + Y) % M". This makes a longer cycle and makes 0 as good a seed as any other.
Next, the seed is not directly the return value of the generator - if you do that, your state can be no larger than the number you return, and that means you'll go through each number exactly once before the pattern repeats. Instead, return the seed shifted to the right, as in "return Seed >> 16"
Finally, let the computer do the modulus for you. It's not standard C/C++ afaik, but it'll work well enough on all the compilers I know of "Seed = Seed * X + Y" with an implied modulus equal to one more than the maximum integer.

Combine it all, and you get something like
DWORD Random(void)
{
Seed = Seed * 0x2109DEF1 + 0xF8B9;
return (Seed >> 16) & 0xFFFF;
}

I'd still suggest using the Mersenne Twister, though. It's far superior in every way except memory used (and the few KB it uses is negligable on a PC)

Share this post


Link to post
Share on other sites
Much of the non-randomness or repeating patterns can be attributed to calling srand more than once.
Basically you should only ever call it once, usually when the program first starts up. The only reason to ever call it again is if you want to repeat a previous series of random numbers over again. Otherwise you're not running for the full cycle length, but are instead jumping to somewhere in the middle each time. The effect of which is a reduction in the observed cycle length. Sometimes it will not affect the cycle length too much, but sometimes it will reduce it a heck of a lot![dead]

In other words, contrary to popular instincts, calling srand more often does not increase randomness, but actually decreases randomness!

What you are using it for does fall under the category of things where you do tend to want more randomness than rand typically offers. So you may also wish to try better alternatives suggested.

Another idea is to mix in some actual random data if you're comfortable with that approach. Xoring data from a network packet, or the time and value of the last key pressed are some such things that can also help.

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
Quote:
Original post by Evil Steve
I had issues with rand(). It's slow, and if you're using it to create noise, it looks pretty un-random. I usually use a version of the Mersenne Twister which I found on the Internet somewhere.

Also, using modulo with rand() is A Bad Thing. The higher bits have more randomness to them, so something like int foo = (rand()&0x8000) >> 12; would be better.
These remarks are completely implementation dependent and may or may not hold for any specific standard library implementation.

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
Although a non 2^n modulo is evil for any RNG because it introcudes bias to the distribution. Suppose your RNG generates numbers on range 0-3 with even distribution. Then you want numbers on range 0-2 and use myrand()%3. Then the probabilities for numbers 0,1 and 2 are 0.5, 0.25 and 0.25 respectively.

Share this post


Link to post
Share on other sites
I don't understand why it is evil.

I wrote a quick program which generated random numbers and held the results.

I used "rand()%RANGE" where RANGE is an integer I can assign at will. For a test where RANGE was 3 I received the following counts after 100,000,000 runs through a loop;

0 = 33335937
1 = 33339453
2 = 33324610

I certainly don't see a .50, .25, .25 distribution here. I also did RANGE as 10 and 20 and received similar equal distribution.

For my uses this is pretty good. What am I missing here?

Thanks,
Eugenek

Share this post


Link to post
Share on other sites
Quote:
Original post by Eugenek
What am I missing here?

Quote:
Original post by an anonymous poster
Suppose your RNG generates numbers on range 0-3 with even distribution.


Even presuming your standard rand() function produced an even distribution, the range is (0,RAND_MAX).

Share this post


Link to post
Share on other sites
Quote:
Original post by Eugenek
I certainly don't see a .50, .25, .25 distribution here. I also did RANGE as 10 and 20 and received similar equal distribution.

For my uses this is pretty good. What am I missing here?

What is RAND_MAX of your C compiler? The bigger RAND_MAX, the less pronounced are the effects of %, depending on the ratio between RAND_MAX and RANGE the distribution is more or less uneven.

Share this post


Link to post
Share on other sites
Thanks, I think I see it now.

RAND_MAX is 32767 on my compiler so I should be okay with small numbers. However, if I were going for numbers between 0 - 32000 it may not be okay.

Thanks again,
Eugenek

Share this post


Link to post
Share on other sites
Quote:
Original post by MaulingMonkey
Quote:
Original post by Kitt3n
The standard random function looks sth (correct me if I'm wrong):


There is no standard implementation of rand().


Sorry, bad choice of words - what I meant was 'the one which
is provided by default in visual studio 6, and probably something
simular is done in the 2003 and 2005 ide's '

Share this post


Link to post
Share on other sites
A simple layer on top of rand() can avoid common pitfalls. See analysis.


// rand() is poorly implemented (e.g. in VC7) and only returns < 16 bits;
// double that amount by concatenating 2 random numbers.
#if RAND_MAX < 65536
static const uint XRAND_MAX = (RAND_MAX+1)*(RAND_MAX+1) - 1;
static uint xrand()
{
return rand()*(RAND_MAX+1) + rand();
}
// rand() is already ok; no need to do anything.
#else
static const uint XRAND_MAX = RAND_MAX;
static uint xrand()
{
return rand();
}
#endif

uint rand(uint min_inclusive, uint max_exclusive)
{
const uint range = (max_exclusive-min_inclusive);
// huge interval or min >= max
if(range == 0 || range > XRAND_MAX)
{
WARN_ERR(ERR_INVALID_PARAM);
return 0;
}

const uint inv_range = XRAND_MAX / range;

// generate random number in [0, range)
uint x;
do
x = xrand();
while(x >= range * inv_range);
x /= inv_range;

x += min_inclusive;
debug_assert(x < max_exclusive);
return x;
}


static void test_rand()
{
// complain if huge interval or min > max
TEST(rand(1, 0) == 0);
TEST(rand(2, ~0u) == 0);

// returned number must be in [min, max)
for(int i = 0; i < 100; i++)
{
uint min = rand(), max = min+rand();
uint x = rand(min, max);
TEST(min <= x && x < max);
}

// make sure both possible values are hit
uint ones = 0, twos = 0;
for(int i = 0; i < 100; i++)
{
uint x = rand(1, 3);
// paranoia: don't use array (x might not be 1 or 2 - checked below)
if(x == 1) ones++; if(x == 2) twos++;
}
TEST(ones+twos == 100);
TEST(ones > 10 && twos > 10);
}



HTH

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement