Recursive Partitioning via GA??

Started by
8 comments, last by kirkd 22 years, 4 months ago
I'm working on development of classification trees and I'm hopeful that I can apply GA/EP techniques. If anyone has any suggestions or comments, I welcome them. Background: A classification tree is a simple classifier system built upon a binary tree. Let's say I have 100 objects of known type (apples, and oranges) and a number of variables which describe them (shape, color, taste, smell, etc.). At each node the classification tree will choose a variable and a split point such that compounds fall into one of two groups. Then the process repeats until you have a minimum number of compounds in a node or you have a perfect classification. EX: Node 1: Diameter <= 3 inches | _____Yes_|___No____ | | Node 2: 100% apples Weight < 2 g | ___Yes__|__No_____ | | 100% oranges 100% apples Then, if I took an unknown object and ran it through the tree, I would get a prediction as to what class if falls into. This is also called Recursive Partitioning. I should also note that my system is much more complex in that I'm trying to predict chemical compounds that bind to certain receptors in the body. I'm not actually comparing apples to oranges. 8^) -Kirk My tree at the top looks pretty bad on the standard view. The edit view is much cleaner. Edited by - KirkD on November 15, 2001 7:55:41 AM
Advertisement
Well, if you want to apply GA techniques to the classifier I would suggest you start by finding a way of defining the classifier tree in a genotype. You''ll need genetic operators to apply to the genotype, such as crossover, mutation, addition and deletion. These genetic operators should have a smooth genotype, phenotype mapping, i.e. a small change in the genotype causes a small change in the phenotype. Also, the mutation operators should tend towards small changes, so that each transition in genotype space moves the individual to a close point in the fitness landscape. If there is a definate upward curve you will hillclimb up it, however if your mutation operators are too large you''ll bypass possible directions of fitness increase. The problem in mutating a treelike structure (such as lisp or your example) is that any change can be catastrophic, cutting off an entire branch and moving the individual a long distance across a fitness landscape. This is generally a bad idea, as the more radical each mutation is, the more like random search your algorithm becomes.
Any addition operator should preferably produce a new branch (or whatever) with neutral fitness,i.e. making no difference to the individual''s current fitness. Then future mutation operators on the new branch tend it towards a more optimal solution. Throwing in a new branch with random values will most likely produce a decrease in fitness to your already (most likely) fairly locally optimal solution.

Of course these are all in my most humble opinion.

Mike
Mike,

Thanks for the input!! What you''ve said makes a lot of sense. My thoughts are to avoid crossover in which a pair of nodes are swapped across trees. I thought that the mutation operators that I would use could include changing the variable selected or the split point. Another might include collapsing a sub-tree and rebuilding it.

My thoughts thus far are to select the decision variable at each node randomly but use some less than random approach for deciding on the split point. This might reduce the negative impact.

Again, I truly appreciate your comments!!

-Kirk
Hey Kirkd, we''ll meet again

---
Allow me to clear my head for once...
Stop polluting the air!
---Allow me to clear my head for once...Stop polluting the air!
Are you stalking me??

-Kirk
LOL

No, I just seem to visit the same sites as you do.
I visit both Gen5 and GD since my goal in life is to get a job as a game AI developer.

---
Allow me to clear my head for once...
Stop polluting the air!
---Allow me to clear my head for once...Stop polluting the air!
Interesting.
So are you a bio-chemist ?


I came, I saw, I got programmers block.
~V''''lion
~V'lionBugle4d
quote:Original post by Airhead Zoom
LOL

No, I just seem to visit the same sites as you do.
I visit both Gen5 and GD since my goal in life is to get a job as a game AI developer.



Good luck! It''s perfectly doable, of course....I just find that there seems to be a lot of "churn" for most develoeprs in the AI field, mostly due to disatisfaction with how the last project or two have turned out. I do know of plenty of folks who are very happy, though, so it''ll probably work out.

Have you been to the GDC job fair yet? That''s a great place to interview everybody!




Ferretman

ferretman@gameai.com
www.gameai.com

From the High Mountains of Colorado

Ferretman
ferretman@gameai.com
From the High Mountains of Colorado
GameAI.Com

No, I don''t live in America and I''m only 18, so I don''t think that will be so easy.
But I''m teaching myself as much about AI as possible. Just haven''t got around to actually /program/ anything yet, though!
My study is eating up a lot of time and I have a project or two lying around which I want to finish before I start on game AI.

I''m taking it slowly. (I''m doing the first year of a 4 year AI study so I /can/ take it slowly )

But what do you mean with the projects that have turned out bad?

---
Allow me to clear my head for once...
Stop polluting the air!
---Allow me to clear my head for once...Stop polluting the air!
Vlion,

Actually my undergraduate degree is in human nutrition and exercise physiology, my PhD is in Pharmacology/Physiology, and I did post-doc work in Computational Chemistry. Now, I work for a compay called Accelrys (www.accelrys.com) as a Software Research Scientist in the computational toxicology group. Weird, eh?

-Kirk

This topic is closed to new replies.

Advertisement