Well, now, we begin on our journey into the crazy world of genetic algorithms. What are they? They are a useful tool in optimization problems. For example, if I gave you the equation, y = sin(x)*cos(z)+ x^2*z^2 and said I want you to find the x and z that minimizes y. You could probably do it with calculus, or you could use a genetic algorithm. The usefulness of genetic algorithms is to find that elusive global maximum or minimum (many algorithms can't find one in very complex equations and get stuck in what is called a local minimum or maximum). While many genetic algorithms out there are very successful at doing this job, there is usually no guarantee you will find a global maximum or minimum... still...the chances are better.
Where would you use a genetic algorithm in a game. Well, I will present a very simple idea later on. We can use it to aid our neural networks (remember--minimizing the error) in finding the weights--but we wouldn't do that cause "it would be boring." The manufacturing scheduling problem is one that definitely uses this. Also, we can use it to impress our friends. We can also use it to write books or in journals. But I digress again... let's get started.
We shall use a traditional genetic algorithm (again just to get you started). What makes a genetic algorithm intelligent, well it's feedback. We start with an initial population of solutions (i.e. think people with possible answers) and we evaluate their answers. We then modify these solutions by mixing up the population's chromosomes (i.e. inbreeding) and come up with some children that hopefully have better genes than their parents to provide better solutions ("this corn is something special" -- Deliverance yet again). The circular process of evaluating a population and then stealing their children is what a GA is... but how do we evaluate and generate new children?
The first step in a GA is typically how should we represent our population. The most common way is to use a series of ones and zeros (bits). Why we do this will become apparrent during the sex stage. For example,
Suppose we were measuring stock market variables such as the temperatures in Peru and number of hot women in Wisconsin (yes there is a correlation between hot women in Wisconsin and the cost of cheese), and we want to use these measurements to tell us when we should buy cheddar cheese (if goat cheese from Peru is doing well, we're in trouble).
We are trying still to find our representation. Suppose we know the maximum temperature in Peru is 128 degrees and 0 is the lowest temeperature and the maximum number of hot women from Wisconsin is 512 and minimum is 0. Hey? This isn't a game... I know.. I'll do that later... first lets get some fundamentals!
Then we can represent temperatures as a string of 7 bits that range from:
Peru Temperature Representation
0 = 0000000
1 = 0000001
2 = 0000010
...
128 = 1111111
We can also do this with the hot women but with 9 bits that range from:
0 = 000000000
1 = 000000001
2 = 000000010
...
512 = 111111111
So we can combine these together into a 16 bit stream.
xxxxxxx|xxxxxxxxx First 7 bits are temperature in Peru and last 9 bits are hot women in Wisconsin.
This will represent our "population" and this is called a chromosome. Each bit is a gene.
============================================================
So now that we know how we are going to represent our population, we now need to know how to evaluate a chromosome (or member of the population). This is the cranky boss stage.
This is where we require some intelligence ourselves... how can we creatively make a evaluation function (also known as fitness function--how fit is a member of the population...) Well, we can think to ourselves and say well we want hot women... hot women are nice... so lets try to maximize that thing. Also, we want Peru goat cheese to be filled with bacteria so we would want high temperatures. But, lets say we did a lot of studying on the subject and found the risk associated with these variables can be measured through (from someone's Ph.D. thesis),
risk = abs((Temp)^2/(Women+1) - (Women^2)/(Temp+1))
Hmm you say, if the number of hot women increases, the effect on the risk from the temperature decreases? What gives? Hey, I didn't write that thesis. Let's just pretend OK. Anyway, that will be our fitness function, and obviously we like low risk so we will be minimizing this guy.
Let's maximize instead and take the recipricol of the above
confidence = 1/abs((Temp)^2/(Women+1) - (Women^2)/(Temp+1))
==============================================
So now we have our population representation and how to evaluate them. Now what? We start with a random population of chromosomes:
Let say we generated these:
0101000_000010000 <-- Jim (who is a nice guy, but kind of dense)
0100011_001000010 <-- Mike (he plays video games a lot while at work)
1000011_100010000 <-- Sam (doesn't like cheetoes, prefers dorritoes)
0001111_110000000 <-- George (plays guitar badly)
0111100_111110000 <-- Boris (disects animals when no one is watching)
0110100_000001100 <-- Mary (scared to death)
===============================================
Now we convert each of these guys/gal to their phenotype which is a fancy way of saying back to base 10 numbers.
Jim = [40, 16]
Mike = [35, 34]
Sam = [67, 272]
George = [15, 384]
Boris = [60, 496]
Mary = [52, 12]
Then we calculate the evaluation function for each (Yes, all these calculations are done in my mind, NOT!)
confidence = 1/abs((Temp^2)/(Women+1) - (Women^2)/(Temp+1))
Jim = 0.0114
Mike = 0.3448
Sam = .00093318
George = .00010851
Boris = 0.0649
Mary = 0.1205
=====================================================
Now we calculate the total "fitness" for this population and then we calculate the selection probability. Think natural selection for selection probability--who's gonna mate? For example:
Total Fitness = 0.0114 + 0.3448 + 0.00093318 + 0.00010851 + 0.0649 + 0.1205 = 0.5426
The selection probability (percentage in this case) for each chromosome is calculated by:
Jim.spmax = 0.0114/0.5426* 100% = 2.1%
Mike.spmax = 0.3448/0.5426 * 100% = 63.5%
Sam.sp = 0.00093318/0.5426 * 100% = 0.1720%
George.sp = 0.00010851/0.5426 * 100% = 0.02%
Boris.sp = 0.0649/0.5426 * 100% = 11.9%
Mary.sp = 0.1205/0.5426 * 100% = 22.2%
(this is turning into a murder of an example...Hehe)
==========================================================
Now we play, "bust a deal, face the wheel" -- Mad Max Beyond Thunderdome. This is the natural selection process. We create a roulette wheel based on the percentages we have calculated. Mike's got the best overall score for his numbers of temperature in Peru and hot women in Wisconsin, so he shouldn't have too much trouble finding a date. Time for another poorly drawn picture.
We are selecting chromosomes (people) now that will be having sex. So we spin the wheel this way:
1. We generate a random number, x, between 0 and 100%.
2. We calculate the cumulative probability:
if x < 2.1% then select Jim;
if (2.1% <= x < 63.8%+2.1% = 65.9%) then select Mike;
if (65.9% <= x < 65.9% + 0.1729% = 66.2%) then select Sam;
if (66.2% <= x < 66.2% + 0.02% = 66.22%) then select George;
if (66.22% <= x < 66.22% + 11.9% = 77.12%) then select Boris;
if (77.12% <= x <= 100%) then select Mary;
I gave Mary a little lee-way since she's the only woman (round off error).
So we spin it about 6 times (usually about the size of the initial population) and say we get these selections:
Jim 0101000_000010000
Mike 0100011_001000010
Mike 0100011_001000010
Mary 0110100_000001100
George 0001111_110000000
Mike 0100011_001000010
Whoa, George got in. The party will be at his place.
====================================================
Now comes the part where we use that bit representation. To generate the children (sex).
The first operation is crossover. This is when we select a bit position by random (1-16) and take all bits left of this position on one chromosome (person) and all bits right of this position from another chromosome (person) and generate a child. The chance for a crossover is set to 25% usually. So we go down our selection list and suppose we stop at Mike and he needs to crossover with someone (sex :0) because he hit the 25% by random number generation. Then we select another chromosome (person) by random and lets say it's Mary. We then randomize a crossover point, lets say it's 5. The we cross over from that point (Mike on left, Mary on right):
Mike's genes to crossover--> 01000
Mary's genes to crossover--> 00_000001100
child = 0100000_000001100
and then replace Mike with the generated child.
The next operation is called mutation. Basically, we go through our whole selected population one bit at a time and generate a random number. If this random number is less than or equal to 0.01 (1% chance) we flip the bit (0 if 1 previously, 1 if 0 previously). Since we have 16*6 bits = 96 bits, we would expect around 1 bit (if ever) to have a mutation. Suppose we are on the fourth bit from the left on Mary when we get the 1% chance from the random number. Then,
Mary 0110100_000001100 becomes 0111100_000001100.
=========================================================
So we may get these children after our crossover and mutation sex operations.
Jim, Jr. 0101000_000010000
Mike, Jr. 0100011_000010000
Mike, Jr. 0100011_001001010
Mary, Jr. 0110100_000001100
George, Jr. 0001111_110001100
Mike,Jr. 0100011_001000010
Can you tell who's doing who? After this we have generated the new children and start the process over until we see our total fitness or evaluation start to settle down.
===========================================================
So we run our genetic algorithm a couple 1000 iterations, and then we take the most fit chromosome and use it as the solution that maximizes the confidence in this case.
So if we found 50 degrees in Peru and 10 hot women to be the best, that's when we should buy cheddar.
Anyway, hope you get the idea.
Reiterating the steps:
1. Find a representation suitable for crossover and mutation (there are other operations out there also) and create an evaluation function (best to make it positive if ya can--and maximize)
2. Generate an initial population of chromosomes.
3. Convert these chromosomes to phenotypes and use the evaluation function to determine the fitness.
4. Calculate the cumulative selection probabilties.
5. Spin the roulette wheel to select the potiential mates.
6. Do crossover and mutation.
7. Goto 3... until solutions seem steady.
There are lots of other genetic algorithms out there. I will come back to a game example later.
Hope I didn't make too many mistakes!
Look at this... some marketing people are using GA's with their webpages. See how GA's get exploited for the purposes of "evil?"
Here is a more elaborate tutorial on GA's I found--click on the chapter numbers.
Here is a link to someone writing about games and genetic algorithms..he uses the word sex also... whaooooooo. He says the encoding of bits is the hardest part which I don't agree with. I say coming up with an evaluation part is the hardest part cause we have to make it intelligently.
Just keep in mind, people do things different ways and you may see a lot of genetic algorithms out there. For example, when I use genetic algorithms, I sometimes keep some of the good people in the population no matter what and don't let them get crossover or mutation. It depends on the problem and a tradeoff between how fast you want the solution and how good the solution will be.
So there you have it, a genetic algorithm. You start off with some people. They get evaluated. They generate new children which are hopefully superior. Then they get evaluated... etc. etc. And we stop when we feel the answers are sufficient.