Proportional random numbers

Started by
10 comments, last by TechnoGoth 19 years, 11 months ago
I was thinking of how to generate random characters in my game, and came up with a great way of doing it using proportional random numbers. The idea is this: if you have a stat that can have a value between 0 and 100 where 50 is the average value for that stat, then the majority of values for that stat should be around 50. However if you generate a random value between 0 and 100 then all values are equally likely and thus there is no average value.

A proportional random number works by taking a base value and three random numbers to generate a value that is in proportion to the average value. First you generate a random number between 0 and 1 to get a sign value, either -1 or 1. Then you generate a range value by generating a random number between 0 and 14. If the random number is:

0-4 = 0
5-8 = 10
9-11 = 20
12-13 = 30
14 = 40

Lastly a modifier value is generated between 0 and 10. So the final value for the stat is this:

Stat value = base_value + ( sign * ( range_value + modifier_value ) )

In this way the stat's value is still generated randomly but the distribution of values remains proportional to the average value.

-----------------------------------------------------
"Fate and Destiny only give you the opportunity the rest you have to do on your own."
Current Design project: Ambitions Slave
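For readers who want to try the scheme from the opening post, here is a minimal Python sketch of it. The names RANGE_BY_ROLL and proportional_stat are made up for illustration, and it assumes all three random numbers are integers with inclusive bounds, which the post does not state outright.

```python
import random

# Range buckets from the post: a roll of 0-14 maps to a range value.
# Rolls 0-4 -> 0, 5-8 -> 10, 9-11 -> 20, 12-13 -> 30, 14 -> 40.
RANGE_BY_ROLL = [0] * 5 + [10] * 4 + [20] * 3 + [30] * 2 + [40]


def proportional_stat(base_value, rng=random):
    """Return base_value offset by a signed, bucketed random amount."""
    sign = rng.choice((-1, 1))                       # sign value: -1 or 1
    range_value = RANGE_BY_ROLL[rng.randint(0, 14)]  # weighted toward small ranges
    modifier_value = rng.randint(0, 10)              # inclusive on both ends
    return base_value + sign * (range_value + modifier_value)
```

With a base value of 50 the result always falls between 0 and 100, because the largest possible offset is 40 + 10 = 50.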
you could also sum several random numbers to get a nice bell curve, where most of the results are likely to be near the average.
--- krez ([email="krez_AT_optonline_DOT_net"]krez_AT_optonline_DOT_net[/email])
Or you could just make a look-up table where a single random number is looked up in a pre-calculated curve, that's adjusted to fit your game. But that would be too simple :-)
enum Bool { True, False, FileNotFound };
quote:
you could also sum several random numbers to get a nice bell curve, where most of the results are likely to be near the average


But summing several random numbers doesn't mean you're going to get a value in proportion to the predefined average. Especially if the average value is, say, 20 on a scale from 1 to 100.

quote:
Or you could just make a look-up table where a single random number is looked up in a pre-calculated curve, that's adjusted to fit your game. But that would be too simple :-)


A look-up table? That has to be the most extremely inefficient way to generate a proportional random number. Think about it: if you want to make the full range of values available from 1 to 100 and have a built-in curve, you have to have multiple entries in the table for every value; in the end you are making a table with several thousand entries.

-----------------------------------------------------
"Fate and Destiny only give you the opportunity the rest you have to do on your own."
Current Design project: Ambitions Slave
1' If your game can reasonably average at 20 and have a max of 100, any character with 100 in that stat would be majorly overpowered. Or if the importance of the stat decreases exponentially, in the end the difference between even 90 and 100 will have no considerable meaning. Either way, such a situation would be bad design.

2' Summing values to find averages on a bell curve can also be normalized to fit ANY scale. End of discussion for this one.

3' Even if you have a table with several thousand entries (say... 100x100), know how much memory it takes up? 10k. Wow.

4' Your technique would not apply to an average of 20 out of 100 values anyways.
quote:Original post by RuneLancer
1' If your game can reasonably average at 20 and have a max of 100, any character with 100 in that stat would be majorly overpowered. Or if the importance of the stat decreases exponentially, in the end the difference between even 90 and 100 will have no considerable meaning. Either way, such a situation would be bad design.


How would it be bad design? For instance, in an RPG, if the max level was 100 and the average NPC level was 20, how is that bad design? If the average was 50 that would be bad design, since it would mean that most people were halfway between a wimp and the most powerful person alive. In a real-life situation, you can think of it this way: there is a rather large difference between an average person's stamina and that of an Olympic marathon runner. In that case 20 might represent average stamina, 100 the max stamina the human body can have, and an Olympic marathon runner would probably be somewhere around 80 or above.


quote:
2' Summing values to find averages on a bell curve can also be normalized to fit ANY scale. End of discussion for this one.


Again this doesn't work well unless the average value lies in the middle of the scale.

quote:
3' Even if you have a table with several thousand entries (say... 100x100), know how much memory it takes up? 10k. Wow.


I don't know about you, but filling 10,000 values into a table by hand would be a serious waste of time. I can think of far more productive ways to spend my time. And if you had an algorithm to fill the table, why use the table and not the algorithm?

quote:
4' Your technique would not apply to an average of 20 out of 100 values anyways.


Umm, yes it does. The sign value is optional, and by changing the base value and range, it will work on any scale and average.




-----------------------------------------------------
"Fate and Destiny only give you the opportunity the rest you have to do on your own."
Current Design project: Ambitions Slave


[edited by - TechnoGoth on May 9, 2004 9:14:43 PM]
Why don't you use a shifted bell curve to generate values near some mean? For example, use this method as a utility function, and generate a number using newNumber = (meanValue + stdDeviation*nextGaussian()), then clamp it between your desired minimum and maximum values. If you clamp it this way, the mean doesn't have to be exactly halfway between the min and max (of course, it won't be the "real" mean if it's not exactly halfway between, but it should be very close).
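A rough Python equivalent of the shifted-and-clamped bell curve, as a sketch only. The post's nextGaussian() is replaced here by Python's random.gauss, and the function name clamped_gaussian and its parameters are illustrative assumptions.

```python
import random


def clamped_gaussian(mean, std_dev, lo, hi, rng=random):
    """Draw from a normal distribution, then clamp the draw into [lo, hi]."""
    value = rng.gauss(mean, std_dev)   # same idea as mean + std_dev * nextGaussian()
    return max(lo, min(hi, value))
```

Clamping piles every out-of-range draw onto lo or hi, so a standard deviation that is small relative to the distance from the mean to the nearer limit keeps those endpoint spikes rare.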

[edited by - Matei on May 11, 2004 7:54:55 PM]
quote:if you have a stat that can have a value between 0 and 100 where 50 is the average value for that stat, then the majority of values for that stat should be around 50.

I can't tell from this if you're saying that the majority of random values will naturally be close to the mean, or if you simply want them to be (as in a normal distribution). For example, if you have a uniform distribution with a min of 0 and a max of 100, the mean is 50, but there won't be clustering around the mean any more than around any other point (on average) because the values are uniformly distributed.

quote:However if you generate a random value between 0 and 100 then all values are equally likely and thus there is no average value.

Not quite right, at least not according to the definition of the average of a random distribution. What you defined (equally likely alternatives) is the uniform distribution. The average of the uniform distribution is (max+min)/2, or in your case 50. What you really meant, I think, is that no one value is more likely than the others.

As far as generating random numbers goes, there are lots of fairly simple methods for generating complex distributions in real time. The simplest approach for your purposes would be the following:

PRECOMPUTATIONAL STEPS: For each possible value (0 to 100) assign a probability (use a reference or another program or just pick numbers that look good). Since 50 is your mean and you want it to be the most likely, assign it the largest probability. Likewise 0 and 100 will be farthest away. Next create a lookup table that associates each number with the cumulative probability up to and including the probability of that point (for all points up to and including the current point, sum the probabilities to get a total).

When you want to generate a random number with this distribution, generate a uniform(0,1) random number called U using rand() or something like it. Then look that number up in the table, selecting the entry whose cumulative probability is closest to U's value while still exceeding it.

To shift the distribution without recomputing everything, just generate your random number, then add whatever offset you want to it. To scale it, generate a random number with your approximately normal distribution like I showed above, call it D, then multiply it by the scaling factor s. Then use the random number U to interpolate between s*D and s*D + s.

Maybe an example would help:

In this example I only show a few values because it would take too long to type 100 (since I'm on break):
Stat    Prob.
--------------
0        0.05
1        0.10
2        0.15
3        0.40   This is the most likely value.
4        0.15
5        0.10
6        0.05


The cumulative probability table is given below (this is our look-up table):

Stat    Cum. Prob.
------------------
0         0.05
1         0.15
2         0.30
3         0.70
4         0.85
5         0.95
6         1.00


Now generate a random number with rand(); it will be a real number between 0 and 1. Say it comes up as 0.3976892

We look that up in the table and find that the Stat with the smallest Cum. Prob. that exceeds 0.3976892 is 3 (with a cum. prob. of 0.7), so the Stat we generated is 3.

This is how you generate the numbers without shifting or scaling.
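The look-up described above can be sketched in Python as follows. The table values come straight from the example; the names STATS, PROBS, CUM_PROBS, lookup_stat and random_stat are illustrative, and using the bisect module for the search is my own choice rather than anything the post specifies.

```python
import bisect
import itertools
import random

STATS = [0, 1, 2, 3, 4, 5, 6]
PROBS = [0.05, 0.10, 0.15, 0.40, 0.15, 0.10, 0.05]

# Running totals of PROBS: about 0.05, 0.15, 0.30, 0.70, 0.85, 0.95, 1.00.
CUM_PROBS = list(itertools.accumulate(PROBS))


def lookup_stat(u):
    """Return the first stat whose cumulative probability exceeds u."""
    index = bisect.bisect_right(CUM_PROBS, u)
    return STATS[min(index, len(STATS) - 1)]   # guard against float round-off


def random_stat(rng=random):
    """Draw a uniform number in [0, 1) and look it up in the table."""
    return lookup_stat(rng.random())
```

Calling lookup_stat(0.3976892) returns 3, matching the worked example. The table only needs one entry per stat value, so a 0 to 100 stat needs 101 entries.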

Now if you want to shift them, suppose that you now want to generate Stats from 7 to 13 with a most likely value of 10. Just generate the Stat like before (so it's still 3), then add 7, which is the offset: New Stat = (New base - Old base + Old Stat).

Finally, if you want to scale the range, suppose now that instead of 0 to 6, Stats range from 0 to 24. We do the following:

First find the Stat (3 in this case). Then multiply it by the scale factor (in this case 4 since we're going from a range of 6 to 24). This gives us a Stat of 3x4 = 12. The problem is if we multiply every Stat in the table by 4 we get the following values:

0 4 8 12 16 20 24.

Clearly we want to be able to get 1 2 3 5 6 7 ... as well. So we use the rand() number to get that. Ours was 0.3976892. We use this to interpolate between the current Stat and the next highest scaled Stat. In our case the next highest one is 16. If you look at the original Cum. Prob. table, you'll notice that we first select 3 (12 scaled) at 0.3 cum. prob. and we first select 4 (16 scaled) at 0.7 cum. prob. So we set up the following equation to interpolate between 12 and 16 using the probability:

(0.3976892-0.3)/(0.7-0.3) = (New Stat - 12)/(16-12)

Solving for New Stat, we get New Stat = 12+(0.3976892-0.3)(16-12)/(0.7-0.3)

Thus New Stat = 12.976892. Round however you like, but I would probably round that up to 13.
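The scaling step can be sketched the same way. This is only an illustration of the interpolation worked through above; scaled_stat is a made-up name, and the cumulative table is copied from the example.

```python
import bisect

CUM_PROBS = [0.05, 0.15, 0.30, 0.70, 0.85, 0.95, 1.00]


def scaled_stat(u, scale):
    """Pick a stat from u, then interpolate toward the next scaled stat."""
    stat = min(bisect.bisect_right(CUM_PROBS, u), len(CUM_PROBS) - 1)
    lower = CUM_PROBS[stat - 1] if stat > 0 else 0.0   # where this stat is first selected
    upper = CUM_PROBS[stat]                            # where the next stat is first selected
    fraction = (u - lower) / (upper - lower)
    return scale * stat + fraction * scale
```

scaled_stat(0.3976892, 4) gives roughly 12.976892, as in the example. One thing the post does not address: for the top stat there is no next scaled Stat, so this sketch runs past 24 when U lands in the last bucket, and a caller would probably want to clamp or skip the interpolation there.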


Of course, realize that scaling like this results in a uniform distribution between the scaled points (between 0 4 8 12 ... in our example). If you want to get a perfect distribution for all scaled-up points, you will probably want to go with a continuous distribution and then just round up or down as desired.

For example, you could approximate the probability curve (say the bell curve) with interconnected line segments or even Bezier curves, and then generate random numbers according to that distribution. It's not that different from the discrete method I showed, but the math is slightly more complicated.

For more info on the subject, Google "Generating random variates." Also check out www.mathworld.com and look at various random distributions.


[edited by - bob_the_third on May 12, 2004 2:35:29 PM]

[edited by - bob_the_third on May 12, 2004 2:37:03 PM]

[edited by - bob_the_third on May 12, 2004 2:41:45 PM]

[edited by - bob_the_third on May 12, 2004 2:58:27 PM]
Um, I'm not sure you need to make things all that complicated. I recently finished a program that generates numbers based on an inputted range and the only reason it was as complex as it was is because I allowed the user to define any number of plot points, not just the maximum and minimum.

In order to give an accurate reply, I'd have to know a little more about what you're looking for, such as changes in variance, minimum & maximum values... So far all I have is: average = stat. The following is my attempt to work from there.

For symmetrical distributions, I think it's safe to say average = (max + min) / 2. Let's look at some quick ways to handle this. Note that I'm borrowing heavily from tabletop rpg systems here.

Dice Totals
Assume you generate x random numbers from a to b. This makes the minimum x * a and the maximum x * b. Given the above:
average = (max + min) / 2
stat = (x*a + x*b)/2
stat = x(a+b)/2

This can be quickly adapted by fixing all but one of these terms. The most obvious being b, the maximum per "roll".
2*stat = x(a+b)
(a+b) = 2*stat/x
b = 2*stat/x - a
For a real easy way, set a to 0 and x to 2. This gives you a simple b = stat. Just add up two random values from 0 to stat. The result will average at the stat value, with a minimum of 0 and maximum of 2*stat.
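A small Python sketch of the dice-total idea, using b = 2*stat/x - a from the algebra above. The function name dice_total and its parameter names are illustrative, and it assumes integer rolls and that 2*stat divides evenly by the number of rolls.

```python
import random


def dice_total(stat, rolls=2, low=0, rng=random):
    """Sum `rolls` uniform integers from low to b, where b = 2*stat/rolls - low."""
    high = 2 * stat // rolls - low     # integer division; assumes it divides evenly
    return sum(rng.randint(low, high) for _ in range(rolls))
```

With the defaults (two rolls starting at 0), dice_total(50) adds two values from 0 to 50, so results run from 0 to 100 and average 50.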

You could also leave x free, but this is more problematic. After all, you can generate random fractions, but fractional repetitions are a bit harder. If you want to try it though, the formula is: x = 2*stat/(a+b). The upside is that more repetitions make the skill more reliable as its level increases, due to a more center-heavy probability curve.

Variance Sum
This is equivalent to "average give or take variance". Basically you start with the base value, then add one random number and subtract another. Given that both random numbers are between a and b...
max = base + b - a, min = base + a - b
average = (max + min) / 2
average = (base+b-a + base+a-b)/2
average = 2*base/2
average = base
This means all you have to do is let result = stat + random(b) - random(b). This gives you a constant amount of variance regardless of skill level.
Note that you can also do this with more than 2 randoms. So long as you subtract as many rolls as you add, this will balance out. If you want an odd number of rolls, you'll have to use the following variant.
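The variance sum is short enough to sketch directly. Here random(b) is read as an integer from 0 to b inclusive, which is an assumption; variance_sum, spread and pairs are illustrative names.

```python
import random


def variance_sum(stat, spread, pairs=1, rng=random):
    """Return stat plus `pairs` rolls minus `pairs` rolls, each from 0 to spread."""
    added = sum(rng.randint(0, spread) for _ in range(pairs))
    removed = sum(rng.randint(0, spread) for _ in range(pairs))
    return stat + added - removed
```

With one pair the result stays within stat - spread and stat + spread, and the two rolls cancel on average, so the mean stays at stat.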

Assume the result equals the base, plus the sum of all random numbers (each of which with a range of a to b).
max = base + x*b, min = base + x*a
average = (max + min) / 2
average = (base+x*b + base+x*a) / 2

Let's solve for the base:
2*average = 2*base + x*b + x*a
2*base = 2*average - x(a+b)
base = stat - x(a+b)/2
If x,a, and b are constant, then this really just comes down to result = stat - constant + rolls.
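The odd-number-of-rolls variant, base = stat - x(a+b)/2, might look like this in Python. The name offset_rolls is made up, and integer rolls are assumed.

```python
import random


def offset_rolls(stat, rolls, low, high, rng=random):
    """Add `rolls` uniform integers to a base chosen so the average equals stat."""
    base = stat - rolls * (low + high) / 2          # the constant part of the formula
    return base + sum(rng.randint(low, high) for _ in range(rolls))
```

For example, three rolls of 1 to 6 around a stat of 20 give a base of 9.5 and results between 12.5 and 27.5; the base can be fractional, so round the result if the stat has to be an integer.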

Triple-point mapping
This is the most complex I'm going to go into here. Only use this one if you're looking to have a fixed range and variable average. These results are not symmetrical and will distort the bell curve.
Assume we want all results between a and b with a shifting midpoint. First we need a number between 0 and 2. You can get this either through one random number, or the sum of multiple numbers. The number of repetitions will determine your basic probability distribution. Now let's look at a little pseudo-code:

if (total > 1)   // high result
    result = stat + (b - stat)*(total - 1)
else             // low result
    result = a + (stat - a)*total

Basically, you just took a 0 to 2 probability curve and mapped it onto an a to b range. The problem is the scale varies depending on whether the total is above or below the expected average.
Here's a bit of proof that this makes the stat the average. For the total: average = (min + max)/2 = (0+2)/2 = 2/2 = 1. Let's look at both equations at the average total.
high:
result = stat + (b - stat)*(total - 1)
result = stat + (b - stat)*(1 - 1)
result = stat + (b - stat)*0
result = stat
low:
result = a + (stat - a)*total
result = a + (stat - a)*1
result = a + stat - a
result = stat
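A runnable Python version of the triple-point mapping, as a sketch. The helper names map_total and triple_point are illustrative, and building the 0 to 2 total by summing uniform floats and rescaling is one possible choice among those the post allows.

```python
import random


def map_total(total, stat, a, b):
    """Map a total in [0, 2] onto [a, b] so that a total of 1 lands on stat."""
    if total > 1:                                   # high result
        return stat + (b - stat) * (total - 1)
    return a + (stat - a) * total                   # low result


def triple_point(stat, a, b, rolls=2, rng=random):
    """Sum `rolls` uniform floats, rescale the sum to span 0 to 2, then map it."""
    total = sum(rng.random() for _ in range(rolls)) * 2 / rolls
    return map_total(total, stat, a, b)
```

Totals of 0, 1 and 2 map to a, stat and b respectively. Since the total falls below 1 about half the time, about half of the results fall below stat, even when stat sits far from the middle of the range.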

Topic Analysis
Let's assume your method would produce a symmetric distribution; this tells us that average = (min+max)/2.
min = base + -(range_max + mod_max)
min = base - (40 + 10)
min = base - 50
max = base + +(range_max + mod_max)
max = base + (40 + 10)
max = base + 50
This looks a lot like the "variance sum" mentioned above, with a range of 0-50 per randomizer. Try result = stat + random(50) - random(50) and you should get roughly the same results, but smoother and more quickly. If the curve is a bit off, you can try adding in a few more rolls. Unless you're really attached to that range-value table, the results should be a close enough match.

Summary
Getting your average to equal the stat isn't that hard; all it takes is a little algebra in the design phase. If anything, this problem needs more definition before it can be answered. For once, I have to agree with RuneLancer. You really can get pretty much any kind of distribution by just summing up some randoms and adjusting accordingly. If you want a few more ideas, just look into tabletop rpgs.
I thought it was clear, but it might be best to restate it more clearly.

Predefined Values:
Min
Max
Mean

where:
min < mean < max

Goal:

To generate x random numbers where the distribution of values in x, from min to max, is nonuniform, and where the number of values in x decreases proportionally to the value's distance from the mean.

So if we were to generate 100 random values with the following predefined data:
Min = 1
Max = 100
Mean = 50

Then the distribution would look something like this.

value | number of values
41-60 | 34
31-40, 61-70 | 27
21-30, 71-80 | 20
11-20, 81-90 | 13
1-10, 91-100 | 6
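One way to compare any generator against a target like the table above is to tally its output into the same bands. This is only a checking aid, not part of any method in the thread; band_of and tally_bands are made-up names, and the band edges follow the table (41-60 is band 0, then 31-40 and 61-70 is band 1, and so on).

```python
from collections import Counter


def band_of(value, mean=50, width=10):
    """Return 0 for the band nearest the mean, 1 for the next band out, etc."""
    d = value - mean
    return (d - 1) // width if d > 0 else (-d) // width


def tally_bands(values, mean=50, width=10):
    """Count how many values fall into each band around the mean."""
    return Counter(band_of(v, mean, width) for v in values)
```

Feeding it 100 generated values and reading off bands 0 through 4 gives counts that can be set beside 34, 27, 20, 13 and 6.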

Does that clear up what I'm trying to accomplish?

-----------------------------------------------------
"Fate and Destiny only give you the opportunity the rest you have to do on your own."
Current Design project: Ambitions Slave

