Need help with a fuzzy Scoring algorithm.

Started by
5 comments, last by Tutorial Doctor 9 years, 9 months ago

(I am on a tablet, I will format this when I am on my computer.)

First, I am wondering if my math is correct. Also, I need help interpreting the negative, large number in this example. I know that negative numbers mean that the value is not a member of the set. But what does the constant represent? What is its range? Here are my notes:

This algorithm gives a fuzzy score of a player based on his/her score on certain criteria, between an ideal maximum and minimum range of performance.

Scoring Algorithm: Quantifying the words, “Good Job!”

The steps (briefly):

1) Determine overall maximum score.
2) Divide scoring criteria into groups.
3) Determine the total weight of each group as a percent of the maximum.
4) Determine the individual weight of the elements in the group (the sum of the individual elements must equal the total weight of the group) as a percent of the total weight of the group.
5) Determine the ideal max, and minimum scores for each individual element in the group.
6) Get the performance of the player.
7) Find the fuzzy degree of the player's performance as a function of the maximum and minimum scores.

Fuzzy Formula:
degree = (value-minimum)/(maximum - minimum) * 100
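
A minimal Python sketch of the formula above, for anyone who wants to try it. The function name `fuzzy_degree` is made up for illustration, and the guard against an empty range is an added assumption, not something in my notes.

```python
def fuzzy_degree(value, minimum, maximum):
    """Return where value sits between minimum and maximum, in percent."""
    if maximum == minimum:
        # Assumption: an empty range has no meaningful degree.
        raise ValueError("maximum and minimum must differ")
    return (value - minimum) / (maximum - minimum) * 100


# Long distance speed from the example below: 25mph between 20mph and 30mph.
print(fuzzy_degree(25, 20, 30))  # 50.0
```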

Example:

Max score: 100

Speed: 33%

* long distance: 50%
* max(average) = 30mph
* min(average) = 20mph
* value(average) = 25mph
* long_degree = 50% good & 50% bad(where best is 30mph and worst is 20mph)

* short distance: 50%
* max(average) = 30mph
* min(average) = 20mph
* value(average) = 25mph
* short_degree = 50% good & 50% bad(where best is 30mph and worst is 20mph)

If a player is 50% good at long distance speed, then they are 50% bad at long distance speed. So, out of a possible 100% at long distance speed, they are only 50%. However, this only accounts for 50% of the total speed score.

How do I get the total speed score?

Out of a total 16.5 points in long distance speed, the player scored 50% of that (8.25).
Out of a total 16.5 points in short distance speed, the player scored 50% of that (8.25).

The total possible score for the group is 33 points.

The total in speed performance is 16.5 points.

The full formula:
group_element_score = (max_score * group_weight * individual_weight * degree)

long distance speed score = (100 * .33 * .5 * .5) = 8.25
short distance speed score = (100 * .33 * .5 * .5) = 8.25

Strength: 33%

* vertical: 50%
* max(average) = 5ft
* min(average) = 2ft
* value(average) = 3ft
* vertical_degree = 33% good & 67% bad

* horizontal: 50%
* max(average) = 8ft
* min(average) = 5ft
* value(average) = 7ft
* horizontal_degree = 66% good & 34% bad

vertical strength score = (100 * .33 * .5 * .33) = 5.445
horizontal strength score = (100 * .33 * .5 * .66) = 10.89

Stamina: 33%

* long term: 50%
* max(average) = 6hrs
* min(average) = 4hrs
* value(average) = 4.5hrs
* long_term_degree = 25% good & 75% bad

* short term: 50%
* max(average) = 1hrs
* min(average) = .5hrs
* value(average) = .95hrs
* short_term_degree = 90% good & 10% bad

long term stamina score = (100 * .33 * .5 * .25) = 4.125
short term stamina score = (100 * .33 * .5 * .90) = 14.85

total score = 51.81/100

This means that the overall performance of the player as compared to the ideal player skills (the max and minimum values are the ideal ranges of skill) is 51.81.
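
Here is a rough Python sketch of the whole example, in case it helps check the math. The names (`GROUPS`, `degree`, `element_score`, `total_score`) and the data layout are assumptions made for illustration. Degrees are kept in the 0 to 1 range here rather than as percentages.

```python
MAX_SCORE = 100

# group name -> (group_weight, [(individual_weight, minimum, maximum, value), ...])
GROUPS = {
    "speed": (0.33, [(0.5, 20, 30, 25), (0.5, 20, 30, 25)]),
    "strength": (0.33, [(0.5, 2, 5, 3), (0.5, 5, 8, 7)]),
    "stamina": (0.33, [(0.5, 4, 6, 4.5), (0.5, 0.5, 1, 0.95)]),
}


def degree(value, minimum, maximum):
    """Fuzzy degree as a fraction (0 at minimum, 1 at maximum)."""
    return (value - minimum) / (maximum - minimum)


def element_score(group_weight, individual_weight, deg):
    """group_element_score = max_score * group_weight * individual_weight * degree"""
    return MAX_SCORE * group_weight * individual_weight * deg


def total_score(groups):
    """Sum the element scores over every group."""
    total = 0.0
    for group_weight, elements in groups.values():
        for individual_weight, minimum, maximum, value in elements:
            deg = degree(value, minimum, maximum)
            total += element_score(group_weight, individual_weight, deg)
    return total


print(round(total_score(GROUPS), 3))
```

With unrounded degrees this comes out to about 51.975 rather than 51.81. The difference is only because the hand calculation truncates the two strength degrees to .33 and .66 instead of using 1/3 and 2/3.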

If “good” were in the range:

min = 90
max = 100

then the value 51.81 would certainly not be good.

degree = (value-minimum)/(maximum - minimum) * 100

performance degree = (51.81 - 90)/(100 - 90) * 100 = -381.9%

Notice that the value is negative. All negative degree values indicate that the score is not in the set of “good.” If the performance score were 90, notice the performance degree would be 0.

They call me the Tutorial Doctor.


This is simply (reverse) linear interpolation. It tells you where the input value stands relative to a minimum and a maximum (how far it is from them), taking into account the distance between the minimum and maximum, that is, the range (if the range is smaller, the input value will be more heavily penalized for being far away). An output value of 50% means the input value is "half a range" to the right of the minimum, i.e. halfway between the minimum and the maximum. An output value of -300% means the input value is "three ranges" to the left of the minimum, an output value of 250% means the input value is "two and a half ranges" to the right of the minimum, and so on, so that only output values in [0%, 100%] correspond to input values inside the [min, max] interval.


                       MIN                                      MAX               
                        +                                        +                
                        |                                        |                
                        |                                        |                
                        |                                        |                
+------+------------------------------------+--------------------------------+---+
       |                |                   |                    |           |    
       |                |                   |                    |           |    
       +                +                   +                    +           +    
     -40%               0%                 50%                 100%        125%   
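
To make the "ranges to the left or right of the minimum" reading concrete, here is a small Python sketch of the interpolation and its inverse. The function names `inverse_lerp` and `lerp` are common conventions assumed here, not something from the original post, and outputs are fractions rather than percentages.

```python
def inverse_lerp(value, minimum, maximum):
    """How many ranges value lies to the right of minimum (negative = left)."""
    return (value - minimum) / (maximum - minimum)


def lerp(t, minimum, maximum):
    """The value that lies t ranges to the right of minimum."""
    return minimum + t * (maximum - minimum)


# With "good" defined as [90, 100], a score of 51.81 is about
# 3.8 ranges to the left of the minimum.
print(round(inverse_lerp(51.81, 90, 100), 3))  # -3.819

# Going the other way: half a range to the right of the minimum.
print(lerp(0.5, 90, 100))  # 95.0
```

If you would rather treat everything outside [min, max] as simply 0 or 1, you can clamp the result of `inverse_lerp`; whether that is desirable depends on how you want to read out-of-range scores.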

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

You should probably use rank statistics rather than averages. Assuming raw scores are positive and better if higher:

  • normalized score = weight * (number of scores lower than the player's) / (total number of scores - 1).
    0 if the player is the worst of them all, maximum if he is the best, half maximum for a median (not average) performance. Suitable for distributions that are known to be skewed and clustered, e.g. batting average in a season of baseball.
  • normalized score = weight * (player score - minimum score) / (maximum score - minimum score)
    0 if the player is the worst of them all, maximum if he is the best, half maximum at the midpoint between minimum and maximum raw scores. Suitable if raw scores are quite linear but their range is unpredictable, e.g. number of goals in a season of football or number of yards gained in a season of American football.
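
A quick Python sketch of the two normalizations above, under the stated assumption that higher raw scores are better. The function names and the sample scores are invented for illustration.

```python
def rank_score(player_score, all_scores, weight=1.0):
    """Rank-based score: fraction of the other scores that are lower."""
    if len(all_scores) < 2:
        raise ValueError("need at least two scores to rank")
    lower = sum(1 for s in all_scores if s < player_score)
    return weight * lower / (len(all_scores) - 1)


def range_score(player_score, all_scores, weight=1.0):
    """Range-based score: position between the observed minimum and maximum."""
    lowest, highest = min(all_scores), max(all_scores)
    if highest == lowest:
        raise ValueError("all scores are identical")
    return weight * (player_score - lowest) / (highest - lowest)


# Made-up, skewed sample: one outlier far above the rest.
scores = [10, 12, 13, 15, 40]
print(rank_score(13, scores))             # 0.5
print(round(range_score(13, scores), 3))  # 0.1
```

The median player gets 0.5 from the rank version but only 0.1 from the range version, because the single outlier stretches the range. That is the reason to prefer ranks for skewed distributions.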

Omae Wa Mou Shindeiru

Yes Lorenzo. The actual range should be 0 to 1; I just multiplied by 100 for simplicity.

Only thing is, I want to keep the sets fuzzy, not crisp. This way the idea of "good" can be open for interpretation.

All of the players remain good, but to varying degrees. So, one player might be 67.56% good while another player might be 23% good (77% bad). So one player, according to one interpretation of good, might be good to a degree of .6756, while according to another interpretation, might be good to a degree of .75.

Then I can do a union or intersection to get the overall opinion of the player's skill level.

I use the same formula you use in your second example, but if I have a crisp boundary at a halfway mark, then a player 1% over that mark will be classified just like a player 2% over that mark.
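
For the union and intersection mentioned above, here is a minimal sketch. It assumes the standard max/min (Zadeh) operators, which is only one possible choice; the function names are made up for illustration.

```python
def fuzzy_union(a, b):
    """Degree to which at least one interpretation calls the player good."""
    return max(a, b)


def fuzzy_intersection(a, b):
    """Degree to which both interpretations call the player good."""
    return min(a, b)


def fuzzy_complement(a):
    """Degree of "bad" given a degree of "good"."""
    return 1.0 - a


# Two interpretations of "good" for the same player, degrees in [0, 1].
opinion_a, opinion_b = 0.6756, 0.75
print(fuzzy_union(opinion_a, opinion_b))         # 0.75
print(fuzzy_intersection(opinion_a, opinion_b))  # 0.6756
print(round(fuzzy_complement(0.23), 2))          # 0.77
```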

They call me the Tutorial Doctor.

@Bacterius

I should have graphed it. That photo looks like the beginning of a fuzzy set graph.

http://www.surgicalneurologyint.com/articles/2011/2/1/images/SurgNeurolInt_2011_2_1_24_77177_u1.jpg

They call me the Tutorial Doctor.

If you want fuzzy sets, you can choose some representative ranks and "blend" between them, respecting the rules for fuzzy membership functions (correct range and adding up to 1).

For example, assuming normalized scores s increase from 0 for the worst player to 1 for the best player, 0 is necessarily fully "bad", and 1 is fully "very good"; you can decide arbitrarily that 0.2 is "mediocre" and 0.7 is "good". Then, if 0.2<=s<=0.7 the player is (0.7-s)/(0.7-0.2) "mediocre", (s-0.2)/(0.7-0.2) "good", and not "bad" or "very good" at all.

Of course, these triangle-shaped functions can be replaced by other shapes, possibly with more than two sets for each score.
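
The blending described above can be sketched in Python roughly like this. The anchor positions 0, 0.2, 0.7 and 1 are the arbitrary ones from the example; the names `ANCHORS` and `memberships` are assumptions for illustration.

```python
# (normalized score, label) pairs, in increasing order of score.
ANCHORS = [(0.0, "bad"), (0.2, "mediocre"), (0.7, "good"), (1.0, "very good")]


def memberships(s):
    """Blend linearly between the two anchors surrounding s.

    At most two labels get a non-zero degree, and the degrees add up to 1.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must be a normalized score in [0, 1]")
    result = {label: 0.0 for _, label in ANCHORS}
    for (low, low_label), (high, high_label) in zip(ANCHORS, ANCHORS[1:]):
        if low <= s <= high:
            t = (s - low) / (high - low)
            result[low_label] = 1.0 - t
            result[high_label] = t
            break
    return result


# Halfway between "mediocre" (0.2) and "good" (0.7): roughly 0.5 of each.
print(memberships(0.45))
```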

I'm still unsure about the purpose of fuzzy sets and "opinions". Comparing performance between players is a valid indicator of what the player is good at without further elaboration; it would be enough, for example, to suggest training exercises or inform "adaptive difficulty" AI.

Omae Wa Mou Shindeiru

Great idea Lorenzo. And yes, I would like to use this to make suggestions for training exercises, or even for what type of tools are "most suitable." Basically it can be for recommendations, but it can also be used for what you said, "adaptive AI."

My long term goal is adaptive AI, where decisions are not made by probability, but by using fuzzy data.

For example, if a wall is "close" then the recommended action is to move away from it. But, if a wall is "close and approaching quickly", then the recommended action is to run away from it quickly.

It adds so much more realism to AI.
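
A very rough Python sketch of the wall example, just to show the shape of the idea. Every number in it (distances, speeds, the 0.5 cut-off) is a made-up assumption, "and" is taken to be the minimum of the two degrees, and picking an action by threshold is only one simple way to turn degrees into a decision.

```python
def ramp(value, low, high):
    """Linear degree: 0 at low, 1 at high, clamped to [0, 1]."""
    t = (value - low) / (high - low)
    return max(0.0, min(1.0, t))


def close(distance_m):
    # Assumption: fully "close" at 1 m or less, not close at all from 10 m.
    return 1.0 - ramp(distance_m, 1.0, 10.0)


def approaching_quickly(speed_mps):
    # Assumption: "quickly" grows from 0 m/s up to full membership at 5 m/s.
    return ramp(speed_mps, 0.0, 5.0)


def recommend(distance_m, speed_mps):
    """Pick an action from the two fuzzy rules in the post."""
    move_away = close(distance_m)
    # "close AND approaching quickly" as a fuzzy intersection (minimum).
    run_away = min(close(distance_m), approaching_quickly(speed_mps))
    if run_away >= 0.5:
        return "run away quickly"
    if move_away >= 0.5:
        return "move away"
    return "carry on"


print(recommend(2.0, 4.0))  # run away quickly
print(recommend(2.0, 0.5))  # move away
print(recommend(9.0, 4.0))  # carry on
```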

They call me the Tutorial Doctor.

