Does C++ <random> lib require seeding?

Started by
8 comments, last by Khatharr 10 years, 7 months ago

I've implemented a function:


#include <random>

bool doesCustomerArrive(int custPerHour) {
  if (custPerHour > 60) { return true; }
  // Default-constructed engine: no seed is supplied here.
  static std::default_random_engine generator;
  float custPerMinute = static_cast<float>(custPerHour) / 60;
  std::bernoulli_distribution distribution(custPerMinute);
  return distribution(generator);
}

Which I'm using in a simulation. I iterate custPerHour, starting from 1, and run the thing 6000 times per iteration, doing some calculations until a specific result is met. My problem is that the program is producing the exact same unusual results every time it's run, which immediately makes me think that I've either failed to seed the randomizer or else just set this up wrong. I thought this new set of random functions didn't need seeding, and I don't see anything wrong with my algorithm. Where am I screwing up here?

Specifically, the problem results are:

39 CPH for 100 hours = 2016 customers (expected 3900)

40 CPH for 100 hours = 1986 customers (expected 4000)

The results are the same every time.

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

Check the docs, specifically the constructor for default_random_engine:

http://www.cplusplus.com/reference/random/linear_congruential_engine/linear_congruential_engine/

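You'll notice that it takes a seed parameter, and the default value of that parameter is a constant, 1u. (default_random_engine itself is an implementation-defined alias, so the exact engine and default seed can vary by standard library, but a default-constructed engine always starts from the same fixed state.)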
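
A minimal sketch of what that means in practice: two engines constructed without a seed start in the same state, so they produce the same outputs on every run. The helper names firstDraws and defaultEnginesAgree are hypothetical, used only for illustration.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: collect the first n raw outputs of an engine.
template <class Engine>
std::vector<typename Engine::result_type> firstDraws(Engine engine, std::size_t n) {
  std::vector<typename Engine::result_type> out;
  out.reserve(n);
  for (std::size_t i = 0; i < n; ++i) {
    out.push_back(engine());
  }
  return out;
}

// True when two independently default-constructed engines agree on their first n outputs.
bool defaultEnginesAgree(std::size_t n) {
  std::default_random_engine a;  // no seed given: the engine's default seed is used
  std::default_random_engine b;  // same default seed, so the same starting state
  return firstDraws(a, n) == firstDraws(b, n);
}
```

Calling defaultEnginesAgree(100) should return true, which is the behavior the original post ran into.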

SlimDX | Ventspace Blog | Twitter | Diverse teams make better games. I am currently hiring capable C++ engine developers in Baltimore, MD.

Crud. (Why the hell don't they make the default call a chrono function?)

Thanks, Promit.

Because in the vast majority of cases you do not want to initialize it with a value from chrono, but with a pre-existing value loaded from previously saved data. In addition, the motto of the language designers and library writers is "do as little as possible." Initializing with 1u is doing as little as possible, while calling a potentially expensive time function is not.

In time the project grows, the ignorance of its devs it shows, with many a convoluted function, it plunges into deep compunction, the price of failure is high, Washu's mirth is nigh.

Vast majority? I've only seen a limited number of cases where it made sense to use a specific seed. Why would specifically seeded sequences be the majority case?

More importantly, if we want to talk about conservative design, why does it have a default argument at all? How often do I want my seed value to be 1u?

(raging at my android osk right now omg)

Because reproducibility is desirable, while non-reproducibility is undesirable.

// See, for instance, http://openmd.org/?p=257

Think how hard debugging would be if you couldn't guarantee the same execution path of your program.

Besides, even if you were fine with non-reproducibility, seeding with time functions is often a bad idea -- at least, you'd need a high resolution timer.

As an alternative, you can do the following:

std::random_device rdev{}; // ASSUMPTION: your implementation / target platform actually guarantees this is random
std::default_random_engine e{rdev()}; // seed with the aforementioned source of randomness

For more, read this:

http://isocpp.org/files/papers/n3551.pdf
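
Applied to the function from the original post, the random_device seeding above might look like the sketch below. The names makeSeededEngine and countArrivals are hypothetical, and std::random_device is assumed to deliver real entropy on the target platform.

```cpp
#include <random>

// Hypothetical helper: build an engine seeded once from std::random_device.
// ASSUMPTION: random_device is non-deterministic on this platform.
std::default_random_engine makeSeededEngine() {
  std::random_device rdev;
  return std::default_random_engine{rdev()};
}

// Same logic as the original function, but the static engine is seeded on first use.
bool doesCustomerArrive(int custPerHour) {
  if (custPerHour > 60) { return true; }
  static std::default_random_engine generator = makeSeededEngine();
  std::bernoulli_distribution distribution(custPerHour / 60.0);
  return distribution(generator);
}

// Hypothetical driver: count arrivals over a number of simulated minutes.
int countArrivals(int custPerHour, int minutes) {
  int arrivals = 0;
  for (int i = 0; i < minutes; ++i) {
    if (doesCustomerArrive(custPerHour)) { ++arrivals; }
  }
  return arrivals;
}
```

Because the engine is static, it is seeded only once per program run; seeding on every call would throw away the engine's state each time.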

The motivation for writing <random> did not come from game developers (most of whom live in the NIH world anyway) but from the scientific and engineering crowds with their models and simulations. For these purposes it's very important to be able to reproduce exactly the same pseudorandom sequences for different runs as various other parameters are adjusted.

You'll notice that most of the engines in <random> require a seed vector with size greater than 1 (only the LCG can use a seed vector of size 1). The only way you can have a common constructor signature across all the engines is to have a zero-argument constructor, and that's gotta be a forwarding constructor with a default initializer value. For the LCG, the only reasonable numbers for a default are 0 and 1, and the committee chose 1. Note that the engines other than the LCG also have a fixed default seed value, and their full seed vector is generally a sequence generated from that value by a simple recurrence such as an LCG.

Remember, one of the guiding design principles behind C++ (including its standard library) is "pay only for what you use." If the default for constructing a PRNG were to involve system calls and waiting for hardware to respond after several context changes, and you weren't even going to use that value, it would be a blatant contravention of that fundamental design principle. So, if you want anything other than the default "do nothing" action, you need to do it yourself.
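
As a rough illustration of the reproducibility argument, the sketch below runs a small simulation from an explicit seed, so repeating a run with the same seed gives the same count while other parameters can be varied. The function name simulate and its parameters are hypothetical.

```cpp
#include <random>

// Hypothetical simulation: count successes of a Bernoulli trial over `trials` draws,
// using an engine constructed from an explicit, caller-supplied seed.
int simulate(unsigned seed, double probability, int trials) {
  std::mt19937 engine(seed);  // explicit seed makes the run repeatable
  std::bernoulli_distribution arrives(probability);
  int successes = 0;
  for (int i = 0; i < trials; ++i) {
    if (arrives(engine)) { ++successes; }
  }
  return successes;
}
```

Two calls such as simulate(42, 0.65, 6000) return the same count within one build. The exact count may differ between standard library implementations, since the distributions' algorithms are not fixed by the standard.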

Stephen M. Webb
Professional Free Software Developer

Ah, okay. That sort of makes sense. TBH it kind of sucks that they wouldn't consider both groups, though, and throw us a bone by way of a typedef or something.

Okay, I get the common signature part, but I don't get zero-argument part. Is there an engine that requires no seed, or are they just thinking that there may be one added in the future?

You have a constructor you can pass a chrono value to. Exactly what more do you need?

Some of the RNGs, such as the ranlux48 generator, have a default constructor. This constructor default-constructs the underlying ranlux48_base generator, which uses its default seed argument (default_seed, a fixed constant). That value is passed to an LCE, and the LCE's output is used to fill the initial state of ranlux48_base, which in turn is used by the ranlux48 generator.
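
One way to check this is to compare a default-constructed engine against one built explicitly from the default_seed constant. This is only a sketch, and the function names are hypothetical.

```cpp
#include <random>

// True if a default-constructed ranlux48 starts in the same state as one
// constructed explicitly from the base engine's default_seed constant.
bool ranluxDefaultMatchesDefaultSeed() {
  std::ranlux48 implicitSeed;
  std::ranlux48 explicitSeed(std::ranlux48_base::default_seed);
  return implicitSeed == explicitSeed;
}

// The same check for the Mersenne Twister, which has its own default_seed.
bool mersenneDefaultMatchesDefaultSeed() {
  std::mt19937 implicitSeed;
  std::mt19937 explicitSeed(std::mt19937::default_seed);
  return implicitSeed == explicitSeed;
}
```

Both functions should return true, since the zero-argument constructors simply fall back to that constant.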

So you've never wanted to be able to do replays? Or generate the same terrain more than once? Or load up save games and have them behave the same? There are plenty of cases where you want to be able to replicate the same random set of numbers in a game.
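
For the save-game case, one possible approach is to store the engine's state itself, since the standard engines can be written to and read from streams. The sketch below assumes std::mt19937; the names saveEngine and loadEngine are hypothetical.

```cpp
#include <random>
#include <sstream>
#include <string>

// Serialize the engine's full state to text (engines support operator<<).
std::string saveEngine(const std::mt19937& engine) {
  std::ostringstream out;
  out << engine;
  return out.str();
}

// Restore an engine from text produced by saveEngine.
std::mt19937 loadEngine(const std::string& saved) {
  std::istringstream in(saved);
  std::mt19937 engine;
  in >> engine;
  return engine;
}
```

An engine restored this way continues the sequence exactly where the saved one left off.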

Replays and terrain generation are the two cases I was thinking of. In both cases you're re-using a seed you got from an entropic or pseudo-entropic source, though.

Loading a save and getting identical behavior from the randomizer isn't something I like. It's not random and it encourages prng prediction cheats.
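
The replay and terrain pattern described here, where a seed is drawn once from an entropy source and then stored and reused, could be sketched as follows. The names pickSeed and generateHeights are hypothetical, and std::random_device is assumed to be non-deterministic on the target platform.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Draw a fresh seed once; the caller stores it (e.g. with the replay or map data).
std::uint32_t pickSeed() {
  std::random_device rdev;  // ASSUMPTION: non-deterministic on this platform
  return rdev();
}

// Hypothetical terrain generator: the same seed always yields the same heights.
std::vector<int> generateHeights(std::uint32_t seed, int count) {
  std::mt19937 engine(seed);
  std::vector<int> heights;
  heights.reserve(count);
  for (int i = 0; i < count; ++i) {
    heights.push_back(static_cast<int>(engine() % 256));  // raw engine output, reduced to 0..255
  }
  return heights;
}
```

Each new map gets a different seed from pickSeed, while regenerating a map from its stored seed reproduces it exactly.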

This topic is closed to new replies.