Chromosome encoding

Started by
21 comments, last by edotorpedo 22 years, 3 months ago
Simple question My chromosome is made up of a string which can contain 4 different symbols. Should I encode these symbols by 00 01 10 11 or is it better like this: 0001 0010 0100 1000 (I think this approach is better, but I don''t know why) Thanks, Edo
Edo
Advertisement
Well, if you use integers instead, the second case make it easier to use bitmasks. Better still would be to use an enum - typically evaluated as an int but with the easier to remember name scheme
enum chromosomes { a = 2 b = 4 c = 8 d = 16}; 


The only key issue you need to consider is that if you choose a binary encoding (be it in integer form, or packed bit string form) you need to ensure that any binary string that can be generated via the genetic operators (applied to the population) decodes to a valid value of the appropriate state variable.

If you were to use the second option you listed for your encoding, you would need to restrict crossover to limited places in the chromosome (allele boundaries). You would also need to implement an unusual mutation operator that was more complex than a bit-flip.

You could use the first encoding in a packed bit string and use bit masks to perform crossover. Mutation would be a simple bit flip.

Cheers,

Timkin
Representation is KEY! A well designed model can outperform a GA even with a random search (ok, not always, but you get the point

If they''re four different concepts, then use one gene for each - single a bit will do fine. It''s the same variable with different values, then I find the integer approach is fine... as Timkin said, just make sure your crossover and mutation operators do actually search all the options!


Artificial Intelligence Depot - Maybe it''s not all about graphics...

Join us in Vienna for the nucl.ai Conference 2015, on July 20-22... Don't miss it!

What the hell is a chomosome? I know what one is in biology, but whats the application in programming...?
Jon
Thanks for all the replies.

I''m using this scheme to perform some kind of path finding (to solve slide puzzles). I therefore use a chromosome made up of 20 genes (if evaluated it can perform 20 moves). Using 0010,0001 etc, I neglect wrong genes and move on to the next. But if the solution requires only 5 moves to be made, wouldn''t the chromosome evolve to a solution in which the first 5 genes are the moves to a correct solution, and the other 15 genes are all incorrect (for example 1111) ?
Edo
jonpolly:

Chromosome is not really the correct terminology for AI, genotype is prefered...

It''s basically just an array storing lots of genes (data), that can be artificially evolved - using genetic algorithms.


Artificial Intelligence Depot - Maybe it''s not all about graphics...

Join us in Vienna for the nucl.ai Conference 2015, on July 20-22... Don't miss it!

You have a bit more than 4 bits - thats what makes life fun.

Your point is crossing to an arena which in the ''molecular computing soup'' is getting rather warm on - perhaps you read about it DNA computing.

For the moment, the concept as your encoding idea sound quite nifty and are well broadcasted. However life - often likes error more than codes. How would or could you program around it. And would you even want to perhaps you could program on genetic error.



JL
www.bionicbros.de
Just a note on terminology...

Early literature in the field of genetic computing made liberal use of the parallels between natural genetics and computational genetics when determining appropriate names for data structures and operators.

As in natural genetics, a chromosome in a GA is a collection of genes. The chromosome represents the genotype of the individual (its genetic make up) and when expressed in the environment (decoded) we obtain the phenotype (point in state space). Each gene, being a unit of genetic information, can take on one of a finite set of values, called alleles.

Some people - like myself - still tend to use such terminology when discussing genetic programming techniques, as it leaves little confusion for what is meant. Others though tend to shy away from such strong representational links between natural and artificial computation. I guess I still like to remain true to John Holland''s view of the field (he invented the canonical 3-operator genetic algorithm).

Timkin
quote:Original post by edotorpedo
Thanks for all the replies.

I'm using this scheme to perform some kind of path finding (to solve slide puzzles). I therefore use a chromosome made up of 20 genes (if evaluated it can perform 20 moves).


...so your chromosome represents a plan of action...

quote:Original post by edotorpedo
Using 0010,0001 etc, I neglect wrong genes and move on to the next.


...do you mean you ignore actions in the plan that cannot be done (like trying to slide a tile into a wall)?

quote:Original post by edotorpedo
But if the solution requires only 5 moves to be made, wouldn't the chromosome evolve to a solution in which the first 5 genes are the moves to a correct solution, and the other 15 genes are all incorrect (for example 1111) ?


Yes. In this case, you need to interpret your plan as being complete when after any step in the plan, the goal is found. Just perform a quick check on the state of the puzzle - after each action is performed - to ensure that it is, or is not, in the completed state.

Cheers,

Timkin

Edited by - Timkin on December 11, 2001 7:53:42 PM

This topic is closed to new replies.

Advertisement