Random numbers with inclusive probability

Started by
8 comments, last by Zahlman 14 years, 6 months ago
Hi there, I already have a class and functions for generating a random number from a given range, say, 25 to 75 inclusive. All of the numbers in that range have an equal chance of being generated, as per normal. What I want is to have a way of shifting the probability in favor of one of the limits. Basically, I'd like to have some function like: RNGip( 25, 50, 75, 1 ); This function would generate a random number within the range of 25 to 75, except that as the numbers go up, they would be less likely to be generated. The 50 and 1 basically mean "25 is generated 50% of the time, 75 is generated 1% of the time, and all numbers in between progress from 25 to 75 smoothly in that fashion. Any have any ideas on how to do this quickly and without using any extra memory? Someone recommended that I store an array with all of the possible numbers in it but that's obviously got a lot of problems. Cheers.
Advertisement
What you currently have is called a "uniform distribution." What you want is some other distribution (lognormal? poisson? gamma?). First, decide what distribution you want. Then, grab a library that does the deed for you (for example: std::tr1::random from the C++ standard library or boost::random from boost). The library you use depends on your implementation language -- I only know C++ so that's all I can suggest.

It's as simple as that.

Stephen M. Webb
Professional Free Software Developer

First, two recommendations:
1) Make the interval inclusive on the left and exclusive on the right. That simplifies many, many things.
2) Enter probabilities as things that add up to 1, or otherwise don't normalize them at all. Percentages are arbitrary. If your users are used to percentages, transform to and from that when you have to do I/O.

Now for your question: It's unclear to me what distribution you expect to get. Saying that you want 25 with probability 0.5 and 75 with probability 0.01 is over-constraining for a simple solution like a geometric sequence. If you simply want 25 to be 50 times as likely as 75, then a geometric distribution would work well, but you'll get 25 a lot less often than half the time.

What exactly is it that you want?

[Edited by - alvaro on October 14, 2009 1:05:55 PM]
A triangluar distribution is very easy to make and could be what you are after. All you do is add or subtract two random numbers and BAM!
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
Quote:Original post by Sigvatr
This function would generate a random number within the range of 25 to 75, except that as the numbers go up, they would be less likely to be generated. The 50 and 1 basically mean "25 is generated 50% of the time, 75 is generated 1% of the time, and all numbers in between progress from 25 to 75 smoothly in that fashion.


Now that I think about your example more carefully, the only solution to this example is to give 25 probability 0.5 and all the other 50 numbers probability 0.01. Probably not really what you wanted...
Create your curve and map the random number range across it.

The curves you need are usually taught in the first or second year of college math, so many people on the board either don't know it yet or have forgotten the (simple) math involved.

Your description is a bit vague about what happens in the middle of your curve. It sounds like you want to map across a sigmoidal curve (shaped like an S and results in a bell distribution), although you might want an exponential curve (shaped like an L and results in the same) or some other distribution. The most basic sigmoid is 1/(1+exp(-n)). If you use it below, you will get a normal bell curve. It can be stretched / squashed / flipped / etc. by sticking a constants in the right places.


After you know your curve, EITHER convert your number generator to generate numbers within the function's domain, OR convert your function by the number generator's range. Either one is a simple multiply and add. Use the one that needs the least math at runtime, or the easiest one to write if you don't care. (Perhaps [-1.0,1.0], or [0.0,1.0], or [RAND_MIN, RAND_MAX], all depending on your preference.)

Be aware of the range of your number generator. A range of 100 means 0-99, not 0-100. Since you are looking for something inclusive of the end points you will need to account for this.

Generate your random number and multiply by the function above. The result will fit within your desired distribution. (In your case, 0-50 inclusive.)

Next, shift the result to the low end of your desired range. (In your case +25.)

Then it is done. :-)
Quote:Original post by frob
Your description is a bit vague about what happens in the middle of your curve. It sounds like you want to map across a sigmoidal curve (shaped like an S and results in a bell distribution), although you might want an exponential curve (shaped like an L and results in the same) or some other distribution. The most basic sigmoid is 1/(1+exp(-n)). If you use it below, you will get a normal bell curve.

What? What's the connection between 1/(1+exp(-n)) and the normal distribution? I think you might be thinking that erf and the logistic function are essentially the same shape, but this is not true.
Quote:Original post by alvaro
What? What's the connection between 1/(1+exp(-n)) and the normal distribution? I think you might be thinking that erf and the logistic function are essentially the same shape, but this is not true.
They are highly related.

Your number generator --- if it is good --- will have a linear probability density. In other words it has an equal chance of making any number.

You want to give it some other shape of probability density. You do this by mapping your linear probabilities over a nonlinear distribution function, giving a corresponding probability density.

Your post made it sound like you wanted a portion of a bell curve. The corresponding mapping function is a sigmoid.

Wikipedia pictures here.

The error function and the logistic function are both sigmoidal. In my post I suggested that you use some sigmoidal function that gives you your desired curve. The error function requires more math, but may fit your needs.
Quote:Original post by frob
Your post made it sound like you wanted a portion of a bell curve. The corresponding mapping function is a sigmoid.


Wait, what post are you talking about? Are you getting confused between the original poster and me?

Quote:Wikipedia pictures here.

The error function and the logistic function are both sigmoidal. In my post I suggested that you use some sigmoidal function that gives you your desired curve. The error function requires more math, but may fit your needs.

My objection was that you seem to claim that the density function corresponding to the cumulative probability function 1/(1+exp(-x)) is a bell curve ("The most basic sigmoid is 1/(1+exp(-n)). If you use it below, you will get a normal bell curve."). In that very Wikipedia article you'll see that "the bell curve" is a very specific shape, corresponding to some scaled version of exp(-x^2). This is not the same as the derivative of 1/(1+exp(-x)).
Quote:Original post by Sigvatr
This function would generate a random number within the range of 25 to 75, except that as the numbers go up, they would be less likely to be generated. The 50 and 1 basically mean "25 is generated 50% of the time, 75 is generated 1% of the time, and all numbers in between progress from 25 to 75 smoothly in that fashion.


That doesn't make sense; the percentages would add to more than 100. Of course I suppose you might mean that you just want 25 to appear 50 times as often as 75? :/

Anyway, this isn't a programming problem; it's a math problem.

This topic is closed to new replies.

Advertisement