
# Self-balancing random number generation


## Recommended Posts

##### Share on other sites
Hello. I think this is the sort of thing you are looking for...

First, let me describe some terms:

SUM: This term is equal to the sum of all the previous return values from the function. "True" is counted as +1, "False" is counted as -1. Ex: If the fn had returned 10 trues and 8 falses, then SUM would be +2.

C: C is a constant which will need to be tweaked in order to get it to fit perfectly. C is the cap for the difference in the number of true and false responses. For example, if C = 5, then there can be no greater than 5 more true responses than false responses (and vise versa) before the fn forces a response of the opposite type.

The function below will return a weighted probability that can be used as usual for determining a true or false. It needs to be given a probability (called x below) between 0 and 1 (inclusive). Here is the actual (simple) fn in pseudocode:

if (SUM >= 0)
    p = x - (SUM / C) * x
else
    p = x - (SUM / C) * (1 - x)
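For concreteness, the scheme above might be sketched in C++ like this. The function name, the use of `rand()`, and the clamping are illustrative choices, not part of the original post:

```cpp
#include <cstdlib>

// Running sum of past results: +1 for each true, -1 for each false.
static int g_sum = 0;
// The cap C on the true/false imbalance; needs tuning per application.
static const double kCap = 5.0;

// Returns true with a probability adjusted by the running imbalance.
// x is the requested base probability in [0, 1].
bool balancedRand(double x)
{
    double p;
    if (g_sum >= 0)
        p = x - (g_sum / kCap) * x;          // too many trues: lower p
    else
        p = x - (g_sum / kCap) * (1.0 - x);  // too many falses: raise p
    if (p < 0.0) p = 0.0;
    if (p > 1.0) p = 1.0;

    // rand() / (RAND_MAX + 1.0) gives a uniform value in [0, 1).
    bool result = (std::rand() / (RAND_MAX + 1.0)) < p;
    g_sum += result ? 1 : -1;
    return result;
}
```

Note that at SUM = C the adjusted probability hits 0, and at SUM = -C it hits 1, so the imbalance can never exceed the cap.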

I hope this helps. If it's not what you are looking for, let me know and I'll see what I can do.

Mike Melson

##### Share on other sites
What application do you have in mind for this?

If you really care about this kind of binary coin flipping example, you will be much better off by randomly sorting or permuting an equal number of true and false values.

##### Share on other sites
quote:
Original post by Kylotan

Imagine you are about to flip a coin 10 times. You expect 5 heads. So you flip the coin 9 times, and get 5 heads and 4 tails. Given a proper coin, the 10th flip has a 50% (p = 0.5) chance of being heads. This means that, once you've had 9 flips, you have a 50% chance of scoring 6 heads. However at the start, it was a 50% chance of scoring 5.

A small correction. The probability that in 10 flips you get 5 heads is not 0.5 but smaller. To solve such problems (k successes in N trials) you must use the Bernoulli scheme:

P(N,k) = N! / (k! * (N-k)!) * p^k * (1-p)^(N-k)

where N is the number of trials, k the number of successes, p the probability of success, and P(N,k) the probability that in N trials you have k successes.
So P(10,5) = (10! / (5! * 5!)) * 1/1024 = 252/1024
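The binomial formula above is easy to check numerically. A minimal sketch (the function name is illustrative):

```cpp
#include <cmath>

// Probability of exactly k successes in n independent trials,
// each succeeding with probability p (the binomial distribution).
double binomial(int n, int k, double p)
{
    // Compute n! / (k! * (n-k)!) iteratively to avoid overflowing factorials.
    double coeff = 1.0;
    for (int i = 1; i <= k; ++i)
        coeff = coeff * (n - k + i) / i;
    return coeff * std::pow(p, k) * std::pow(1.0 - p, n - k);
}
```

For a fair coin, `binomial(10, 5, 0.5)` gives 252/1024, about 0.246, matching the correction above.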
quote:

I have seen the term 'stochastic' used in some contexts similar to this, but have yet to find an explanation of the term that is both precise and yet not overly complex. Perhaps this is the solution to my problem, but I don't know.

The only meaning of "stochastic" I know of is stochastic processes. In that case, stochastic means random as opposed to deterministic. I don't think they can help you, but I could be wrong.
quote:

(For those who are interested, basically I just want this so that I can have a random number system in a game that allows for freak events, but takes corrective action to reduce the chance of an unlucky streak damaging a player's chances.)

Maybe you can try something like this: each time the "bad" event happens, decrease its base probability by some amount. After that, let it slowly grow back, so after some time it reaches its base probability again.
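That suggestion might be sketched like this. The struct name and the particular penalty/regrowth constants are illustrative, not from the original post:

```cpp
#include <algorithm>
#include <cstdlib>

// Each time the "bad" event fires, its probability is knocked down by
// `penalty`, then recovers by `regrowth` per call until it is back at
// the base rate.
struct BadEvent {
    double base;      // base probability, e.g. 0.10
    double current;   // current (possibly reduced) probability
    double penalty;   // amount subtracted when the event fires
    double regrowth;  // recovery per call toward the base rate

    bool roll()
    {
        bool happened = (std::rand() / (RAND_MAX + 1.0)) < current;
        if (happened)
            current = std::max(0.0, current - penalty);
        else
            current = std::min(base, current + regrowth);
        return happened;
    }
};
```

Because `current` never rises above `base`, the long-run frequency of the bad event stays at or below its base rate, and streaks of bad luck become progressively less likely.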

K.

##### Share on other sites
mmelson... that system seems close to what I am getting at. I was hoping there wouldn't be a magic constant that I would have to pick by trial and error, though. And I couldn't just add +1 or -1 to 'sum'; rather, it would be fractional parts of 1 indicating how far from the expected result the actual result was, if that makes sense. Ideally, I would have liked to eliminate any 'hard upper bound' on the function, which is what is represented by the constant. I know that if I remove the hard upper bound, then there is always a chance that the correction method I use will not kick in, but I guess I would just prefer the corrective element itself to be based on a probability than a certainty.

Anonymous... I thought I made myself clear in the original post... it would be working on a percentage system, where I say "ok, this action has a 37% chance of success" and I want the system to return true about 37 times when called 100 times. The only difference between what I want, and a standard pseudo random number generator, is that when I've called it 99 times and had 36 true results, I want a better than average chance of getting a 37th true result on that 100th call. Whereas a normal PRNG is stateless (as far as probabilities are concerned... I know they have internal state for their own purposes) and would just have a 37% chance of returning true on that 100th call, regardless of what had gone before.

##### Share on other sites
quote:
Original post by Grudzio
A small correction. The probability that in 10 flips you get 5 heads is not 0.5 but smaller. To solve such problems (k successes in N trials) you must use the Bernoulli scheme:

P(N,k) = N! / (k! * (N-k)!) * p^k * (1-p)^(N-k)

where N is the number of trials, k the number of successes, p the probability of success, and P(N,k) the probability that in N trials you have k successes.
So P(10,5) = (10! / (5! * 5!)) * 1/1024 = 252/1024

Yes, of course. I only understood 1% of what you just said, but I do know that what I said was wrong. What I should have said was more like "However, at the start, you had a higher chance of getting 5 heads than 6."

quote:
The only meaning of "stochastic" I know about are stochastic processes. In such case stochastic means random as opposite to deterministic. I dont think they can help you,but I can be wrong.

I read that stochastic processes were (sometimes) to do with some sort of probabilistic compensation for events that take a random amount of time and can only be measured once they complete. I figured this had a relevance to my system since you can't know how many heads or tails you will have got until you flip it 9 times, but you know how many you want after 10 flips. But I know I'm largely stumbling in the dark here.

##### Share on other sites
Well, how about replacing C with the term (NUM_TIMES + 1), where NUM_TIMES is (obviously) the total number of times that the function has been called? That way, there is always a positive probability that any given option will be selected (unless, of course, the given probabilities are 0 or 1, which still work correctly).

Also, I'm not sure why you would have to add a fractional value to SUM instead of +1 or -1. Maybe you could explain what sorts of return values the function would be spitting out. If they are just true/false values, then this system would work (I may just not be explaining what I mean well enough...).

Hope this helps.

Mike Melson

##### Share on other sites
The reason for the fractional values is that I don't want it 'overreacting'.

Scenario: I call RandPercent(99) 5 times. It returns true 5 times. If C was 5, then the next call to RandPercent(99) will return false. This is obviously erroneous behaviour. Scoring 5 trues in a row for a 99% query is less 'unbalanced' than 5 trues in a row for a 50% (coin-flipping) query. Therefore it should be 'tipping the scales' far less. With me?

##### Share on other sites
I'm not ready to stop pushing permutation yet.

Your chance is 37%. You generate a list of true/false values, and you set 37% of them to 'true'. Then you permute the list randomly. (Fairly easy and cheap to do.) Each call to get random returns the next value in the queue. When you hit the end, you have a choice of looping back to the start or re-permuting the list and starting over.

This also has the advantage of cutting down the number of potentially expensive calls to your system rand function, by essentially generating a look-up table of random values.

Want more general percentages? Permute the numbers from 0 to 99, then treat randvalue < chance as a true. You'll be true exactly chance out of 100 times.
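The permutation idea might look like this in C++ (the class name and use of `std::mt19937` are illustrative choices):

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Shuffle the numbers 0..99 and walk through them; a call with chance c
// succeeds when the drawn value is below c, so for a fixed c exactly c of
// every 100 calls succeed.
class PercentDeck {
    std::vector<int> deck_;
    std::size_t pos_ = 0;
    std::mt19937 rng_;
public:
    explicit PercentDeck(unsigned seed = 0) : deck_(100), rng_(seed)
    {
        std::iota(deck_.begin(), deck_.end(), 0);  // fill with 0..99
        std::shuffle(deck_.begin(), deck_.end(), rng_);
    }

    bool roll(int chance)  // chance in [0, 100]
    {
        if (pos_ == deck_.size()) {  // end of deck: re-permute and restart
            std::shuffle(deck_.begin(), deck_.end(), rng_);
            pos_ = 0;
        }
        return deck_[pos_++] < chance;
    }
};
```

Since every value 0..99 appears exactly once per pass, 100 calls at a fixed chance of 37 return true exactly 37 times.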


##### Share on other sites
This would involve having a state of about 100 bytes for every player (since each player would have his or her own PRNG state to keep the game balanced for them) and this could get unwieldy in a multiplayer game...

##### Share on other sites

Nah. With a little extra bookkeeping work, you can just keep one state list, and all you need per player is a one- or two-byte index into the list describing their current position. You're going to need that much state info per player no matter which scheme you use.

I'd probably set up a state list a few hundred bytes long and randomly scatter the players' starting points on the list as they join the game, and never recalculate. I doubt they'd notice that there's a pattern a few hundred positions long if they endlessly repeat the same action.

Actually, /I/ would probably just use rand, because I don't care about the observed probability matching the expected one under any sort of time bound.

##### Share on other sites
Hmm. It also occurs to me that rolling a rand function in this manner will produce a deterministic game, which is good for the ever cool instant replay or demo recording function.

##### Share on other sites
The problem with this system is that although it ensures you have a perfect spread of probabilities over the 100-call period, it still doesn't actually do any balancing. If the pattern of the calls (e.g. a sequence of RandPercent(20); RandPercent(80) is a low-high repeating sequence) happens to closely correlate (or negatively correlate) with the shuffled sequence of numbers from 1 to 100, then the results can be erroneous and the system does nothing to balance this. In this sort of situation, it's little better than just using rand(). You can guarantee a good and even distribution 'behind the scenes', but it's not guaranteeing a 'fair' deal for the game entity in question, which is what I'm aiming for.

##### Share on other sites
I'm not sure if this will give you the probability distribution you're looking for. I don't even know if it'll even guarantee that it'll come close. But it's a random thought I came up with, so it was worth a shot.

```cpp
void RandomChance(int percentage)
{
    static int difference = 0; // Remembers our bias: the number of true
                               // returns minus the number of false returns.

    if (difference == 0) // No bias yet
    {
        // Test the percentage directly against the random number,
        // then increment or decrement difference based on the outcome.
    }
    else if (difference < 0) // More false returns so far...
    {
        // Boost your probability score.
        percentage = 100 + (percentage / (difference - 1));
        // (NOTE: difference < 0, so the division above is < 0.)
        // (ALSO NOTE: The -1 is to correct for the case when
        //  difference == -1.)
        // Pretend you're repeatedly calling RandomChance(50). For each
        // successive false, your probability rises: first 50%, then 75%,
        // then 83.3%, then 87.5%, and upwards to 100%.
        // Test new percentage & update the difference variable.
    }
    else // More true returns so far...
    {
        // Lower the odds.
        percentage = percentage / (difference + 1);
        // (NOTE: The +1 is to correct for the case when
        //  difference == 1.)
        // Pretend you're repeatedly calling RandomChance(50). For each
        // successive true, your probability falls: first 50%, then 25%,
        // then 16.7%, then 12.5%, and downwards to 0%.
        // Test new percentage & update the difference variable.
    }
}
```

Will this work?

~ Dragonus
If F = ma, and Work = Fd, does that mean that Work = mad?

Edited by - Dragonus on August 17, 2001 4:54:11 PM

##### Share on other sites
If you "balance" out your random number, is it still random?

##### Share on other sites
Dragonus: that is very similar to mmelson's first reply, except that instead of a linear change in probability you have a gradually decreasing change in probability.

What all these solutions are missing is the point I made in the post beginning "The reason for the fractional values..." If you are only counting the difference between true and false returns, you are considering them to be equally weighted, which they are not. They can be weighted arbitrarily, depending on what kind of calls I am making to the generator. If I call RandPercent(90) and get 5 true answers, your system will make the 6th call have a (90 / (5 + 1)) == 15% chance of returning true. This is wrong: there should still be a >50% chance of returning true.

I figure that the algorithm is on the right lines, however. Instead of adding 1 to 'difference' for every true and subtracting 1 for every false, I think I need to add different values that reflect the likeliness of the past calls. For example, when I call RandPercent(90) and get true, that other 10% needs carrying over. Basically, I add less than 1 each time. Perhaps I add 0.2, or something, since the ratio of 0.2 to 10% is the same as 1 to 50%. But I'm not sure. Nor would I know how to prove its 'correctness' if I tried it.
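One interpretation of the fractional idea above: a true adds (1 - p) to the balance and a false subtracts p, so rare outcomes move the balance more than expected ones and the expected drift is zero. This is a sketch of that reading, not necessarily the exact weighting the poster had in mind; the function name and the gain constant are illustrative:

```cpp
#include <cstdlib>

// Fractional-surprise accumulator. kGain controls how strongly the running
// balance feeds back into the next probability (a tuning constant, much
// like C in the earlier scheme).
static double g_balance = 0.0;
static const double kGain = 0.2;

bool fractionalRand(double p)  // p in [0, 1]
{
    double adjusted = p - kGain * g_balance;  // push back against imbalance
    if (adjusted < 0.0) adjusted = 0.0;
    if (adjusted > 1.0) adjusted = 1.0;

    bool result = (std::rand() / (RAND_MAX + 1.0)) < adjusted;
    g_balance += result ? (1.0 - p) : -p;  // weight by how surprising it was
    return result;
}
```

Note how this addresses the RandPercent(99) complaint: five trues at p = 0.99 move the balance by only 0.05, barely changing the next call, while five trues at p = 0.5 move it by 2.5 and push back hard.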

Torn Space - no, it's not random any more. But then, they were only pseudo-random in the first place. I'm not really interested in how random they are, just that I get a stream of varying numbers which are biased towards 'proving the statistic' (e.g. 10% chance of success at a given activity) rather than each call being independent.

##### Share on other sites
This thread reminds me of a "Married with Children" episode in which Al Bundy had blind luck throughout. But all this luck made him terribly unhappy, because he knew he was only so lucky because something awful was going to happen to correct the odds. The show ended, of course, with every lucky thing that happened turning around (just like in your algorithm) and Al finally happy and relieved of all the pressure, saying "I'M FREE!!" to the officer arresting him (for owning a stolen car he won at a card game).

##### Share on other sites
Kylotan, I don't see how you expect any system to perform much better than a standard PRNG if you don't provide a "cap" to its behaviour. If you've got an open-ended system, any sufficiently dimensionally independent PRNG (and there are lots) is in fact pretty close to the definition of proving the statistic.

You've got to give a cap if you want the statistic proved to a much better degree. Perhaps a system size indicator would help, wherein small = 10 (the minimum for an integer percentage to be proven to the nearest 10th), medium = 100 (the minimum for an integer percentage proving), and large > 100 (can be proved to within one range level).

I have no idea what you would need this for, but I would suggest taking a long hard look at the design you're using if the events you wish to bias are severe enough to undermine the average random number generator.

ld

##### Share on other sites
Ok. Here is an attempt at a solution despite my trepidation as to its tractability. Basically, for each call to the function you add or subtract directly from the percentage.

```cpp
int NewRand(int pctg)
{
    static int trues = 0;      // total # of trues
    static int totalcalls = 0; // total # of function calls
    int newChance;             // adjusted value of percent chance
    int result;

    // Adjust the percentage to compensate for the difference between
    // actual and desired. As the difference approaches the order of the
    // percentage itself, the system approaches a 0% chance of returning
    // a true. Might as well increase the # of fn calls here, as well.
    // (The achieved ratio is scaled to 0-100 so it is comparable with pctg.)
    newChance = pctg - ((int)(100.0f * trues / (float)(++totalcalls)) - pctg);

    // Constrain the chance to 0 thru 100, obviously.
    if (newChance < 0) newChance = 0;
    else if (newChance > 100) newChance = 100;

    // GenRand is a generic PRNG on a percentage system.
    result = GenRand(newChance);

    // If we got a true, record it.
    if (result) ++trues;

    return result;
}
```

This is pretty close to what Mike Melson wrote earlier, but I came up with it independently and figured I'd put it down to make sure we're on the same page.

ld

Edited by - liquiddark on August 19, 2001 6:09:16 PM

##### Share on other sites
Okay, maybe I'm just not getting what you're REALLY wanting here...

We'll use your example from your reply to me. We keep calling RandPercent(90). Your first call results in a 90% true, 10% false return. I'm good with you on that part.

However, as you mentioned, let's pretend we get 5 true returns from this. You say the probability is still > 50%. What's your rationale for this, because I don't think I see it...

Unless it's this: at 90%, we get 9 true returns out of every 10. We've already got back 5, so that means 4 out of the next 5 must be true, thus the odds should be 80%?

Dragonus
Thought of the Moment
If F = ma, and Work = Fd, then does Work = mad?

##### Share on other sites
quote:
Original post by liquiddark
Kylotan, I don't see how you expect any system to perform much better than a standard PRNG if you don't provide a "cap" to its behaviour. If you've got an open-ended system, any sufficiently dimensionally independent PRNG (and there are lots) is in fact pretty close to the definition of proving the statistic.

Yeah, given enough samples, which it won't always get.

Think about the Taylor series or some other approximation method. The series doesn't know what the real answer you're looking for is, but you can keep applying it to get better approximations. That is what I figured could be done here: there could be a mechanism that moves in the direction I want it to. You couldn't guarantee it moves to the exact 'position', since there is no cap, but it should be able to move in the right direction. Do you see what I'm getting at?

I still don't think that algorithm takes enough of the data into account. I didn't understand the "newChance = pctg - ((int)((float)trues/(float)(++totalcalls)) - pctg))" - too much in one line for my little brain to follow. But it's only taking into account previous true/false scores, rather than previous expected scores. Maybe the next paragraph will clarify.

Dragonus: For that 6th call, after retrieving 5 trues, ideally the percentage should be 80%. Because you would expect 9 trues out of 10. A normal PRNG will have a percentage of 90%, since it ignores what's gone before. Your algorithm, however, has a 15% chance of returning true on that 6th call. Basically, it overcompensates, because it doesn't take into account that I called the function with a 90% parameter. Any function that only counts trues and falses returned is only accurate when always called with a 50% parameter, since it weights them equally.

##### Share on other sites
Well, empirically, the final equation you want will be logistic. After a bit of work on my TI-83, I figured out that a model of the form

tweaked = original / (1 + A * e^B)

is the type of equation you want, with an implicit logarithm attached. However, I did this based on X number of trials, and coming up with an all-case rule won't be too effective...

I also had some luck with this equation:
tweaked = (original - trueReturns) / (maxCalls - totalCalls)

Granted, this does tie it down to a specific number of calls. However, you can technically cheat and force your hand, so to speak. Let maxCalls = 100 (since our inputs are 0 to 100). Every time totalCalls == 100, reset totalCalls and trueReturns to 0.

The above equation GUARANTEES beyond all doubt (I think) that you will get the exact number of true returns for the number over 100 calls to the function. For example, let's deal with 90%. If trueReturns == 90, then the numerator == 0 and tweaked == 0. If falseReturns == 10 (e.g., totalCalls == trueReturns + 10), since maxCalls == original + 10, the 10's cancel and you're left with a solid probability of 100% over the calls up until 100.

You can set maxCalls as high as you want, and you can set it extremely high if you want it to be "infinite". The only problem that you might find in doing this is that you won't get the exact results you planned on, but theoretically, you should get pretty close. The reason for this is, when totalCalls << maxCalls, the tweaked probability doesn't vary greatly. Then again, we really don't want it to from the get-go, but towards the end, you won't see the massive probability jump to 0% or 100% like you'd expect if you knew how many calls you were going to make, which has both its negatives and positives.

I think the 2nd equation is the better of the two, even though it is tied down to the maxCalls variable, but, as I said, there are ways to get around it.
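The second equation can be sketched as follows. The class name is illustrative, and the scaling of the target for window sizes other than 100 is my extrapolation of the idea, not something stated above:

```cpp
#include <cstdlib>

// tweaked = (target - trueReturns) / (maxCalls - totalCalls),
// with the counters reset at the end of each window.
class WindowedRand {
    int maxCalls_;
    int trueReturns_ = 0;
    int totalCalls_ = 0;
public:
    explicit WindowedRand(int maxCalls = 100) : maxCalls_(maxCalls) {}

    bool roll(int original)  // original percentage in [0, 100]
    {
        if (totalCalls_ == maxCalls_) {  // end of window: start over
            totalCalls_ = 0;
            trueReturns_ = 0;
        }
        // original is "per 100 calls"; scale the target to the window size.
        double target = original * (maxCalls_ / 100.0);
        double tweaked = (target - trueReturns_) /
                         (double)(maxCalls_ - totalCalls_);
        if (tweaked < 0.0) tweaked = 0.0;
        if (tweaked > 1.0) tweaked = 1.0;

        bool result = (std::rand() / (RAND_MAX + 1.0)) < tweaked;
        ++totalCalls_;
        if (result) ++trueReturns_;
        return result;
    }
};
```

The guarantee claimed above does hold for a fixed percentage: once the needed trues are exhausted `tweaked` clamps to 0, and once the remaining calls only just suffice it clamps to 1, so each 100-call window yields exactly `original` trues.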

Dragonus
Thought of the Moment
If F = ma, and Work = Fd, then does Work = mad?

Edited by - Dragonus on August 20, 2001 4:07:51 PM

##### Share on other sites
quote:
Original post by Kylotan

I didn't understand the "newChance = pctg - ((int)((float)trues/(float)(++totalcalls)) - pctg))" - too much in one line for my little brain to follow.

Basically it says your new percentage is:
[the desired percentage] *minus* [the difference between] [the percentage achieved thus far] and [the desired percentage]

Thus, you get something like this:

Randomchance(80) = 1: next time the tweaked percentage is going to be .80 - (1/1 - .80) = .80 - .2 = .60.
Next generation:
RandomChance(80) = 1: next time the tweaked percentage is going to be .80 - (2/2 - .80) = .80 - .2 = .60

Actually, this scheme isn't all that great, really, since you're set on a "squeeze"-type solution.

quote:

Dragonus: For that 6th call, after retrieving 5 trues, ideally the percentage should be 80%. Because you would expect 9 trues out of 10. A normal PRNG will have a percentage of 90%, since it ignores what''s gone before. Your algorithm however has a 15% chance of returning true on that 6th call. Basically, it overcompensates, because it doesn''t take into account that I called the function with a 90% parameter. Any function that only counts trues and falses returned is only accurate when always called with a 50% parameter since it weights them equally.

In an open-ended system this is an unrealistic request. What is your chance of getting a false in six calls? Since you've got a 1/10 chance per go, it's about 6/10. Hence your adjusted percentage is 40%, right?

So I'll try a derivation of a method to use with this insight:

Firstly, I really need a generator object to talk about this intelligently. Assume the generator has an exposed interface with:

a Generate() function,
a SetWeightsAndValues(vector weights, vector vals)

internally, it should at least have:
an OriginalWeights[] vector
a Values[] vector
a NumberofGenerations[] vector to keep track of how many times this value has been generated.
a NumberofGenerates integer to keep track of how many times the Generate() function has been called altogether

Once the weights and values are set, the Generate() function is called. Add the OriginalWeight of amount(A) to the AdjustedWeight of amount(A). If the total POSITIVE (i.e., disregarding negative entries) AdjustedWeight provided > 100, normalize the POSITIVE ELEMENTS ONLY of the AdjustedWeights vector. Once a value is generated, subtract 100 from the AdjustedWeight of that value. Let's go back to trues and falses for a second to give an idea of what I'm thinking here:

say we call
SetWeightsAndValues({99,1}, {0,1})

we get:
```
value generated    new weights
1                  99 - 100 + 99 = 98      1 + 1 = 2
1                  98 - 100 + 99 = 97      2 + 1 = 3
1                  97 - 100 + 99 = 96      3 + 1 = 4
...
1                  55 - 100 + 99 = 54      45 + 1 = 46
0                  54 + 99 = 153           46 + 1 - 100 = -53
1                  153 - 100 + 99 = 152    -53 + 1 = -52
...
```

Don't know if this helps or not. I can prove that it functions very much as you request, and I believe the process extends to other vectorized number generators, but I don't have a clean proof.

ld