Archived

This topic is now archived and is closed to further replies.

Kylotan

Neural nets plus Genetic algorithms


Can anyone explain how to combine these two and, more importantly, why it would be done? I got the impression that you can use a genetic algorithm to speed up the training of neural nets by breeding better weight sets. Is this right? Is taking some crossover of two neural nets that have each been through one epoch going to learn more quickly than going through two epochs with one net? [ MSVC Fixes | STL | SDL | Game AI | Sockets | C++ Faq Lite | Boost ]

Guest Anonymous Poster
I think this type of technique is useful when you know what the neural net has to do (measured by a fitness function) but it is difficult to work out the numerical error in the output (which is needed for conventional training techniques).


You would use a combination of NNs and GAs to evolve what is called 'artificial life'. You can use this technique whenever you know very little about the fitness function. For example, you might only know: live = good, die = bad.
A good example is the Quake bot. Each bot has its own neural net to determine its actions. At first these neural nets are random. You let a number of these bots battle for a while in some sort of level. After this you pick the best two bots (least damage taken, most kills), perform some sort of crossover/mutation on their neural nets, and let the bots battle again.

After a thousand generations or so you will have a bot that will shoot along the path you're walking and evade incoming missiles (well, at least it should).

This algorithm does, however, depend on numerous parameters.
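The generational loop described above can be sketched in a few lines. This is a hypothetical minimal version, not NeuralBot's actual code: each net is reduced to a flat weight vector, and `toy_fitness` is a made-up stand-in for a real battle score such as kills minus damage taken.

```python
import random

random.seed(7)  # deterministic for the sake of the example

def crossover(parent_a, parent_b):
    """Single-point crossover on two flat weight vectors."""
    point = random.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(weights, rate=0.05, scale=0.3):
    """Perturb each weight with small probability."""
    return [w + random.gauss(0, scale) if random.random() < rate else w
            for w in weights]

def evolve(fitness, n_weights=20, pop_size=10, generations=100):
    """Evaluate, keep the two fittest, refill the population by breeding."""
    population = [[random.uniform(-1, 1) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        best, second = ranked[0], ranked[1]
        population = [best, second] + [
            mutate(crossover(best, second)) for _ in range(pop_size - 2)]
    return max(population, key=fitness)

# Stand-in for the battle score: reward weight vectors close to zero.
toy_fitness = lambda w: -sum(x * x for x in w)
champion = evolve(toy_fitness)
```

Note that nothing here ever says what the net *should* have output; the loop only ranks whole behaviours, which is the point of the technique.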

Search for NeuralBot on Google.

Edo

My website is a good introduction to using GAs to optimize ANNs.

you can find it here:

http://www.btinternet.com/~fup/Stimulate.html

I've also just added a message board for discussion related to the tutorials.

quote:
Original post by edotorpedo
A good example is the Quake bot. Each bot has its own neural net to determine its actions. At first these neural nets are random. You let a number of these bots battle for a while in some sort of level. After this you pick the best two bots (least damage taken, most kills), perform some sort of crossover/mutation on their neural nets, and let the bots battle again.

OK, so it could technically accelerate the rate of learning this way. But I still wonder whether taking the crossover of two neural nets is going to give you better performance than just evolving a single neural net for twice as long. (In other words, is there any point doing this when you're not just arbitrarily assigning one NN to each entity?) Are there any examples of why this might be so?

quote:
Original post by Fup
My website is a good introduction to using GAs to optimize ANNs.

Why is it that nobody can seem to present a compelling example for the use of GAs? Not to criticise your site, as the tutorials themselves are very good... it's just that the 'sample application' for GAs is usually something very artificial, whereas the sample app for a NN is often something that makes sense, such as character recognition.


Guest Anonymous Poster
quote:
Original post by Kylotan
it's just that the 'sample application' for GAs is usually something very artificial


Actually, I have found that GAs are used in more real situations. I have a friend who worked for a company that shipped several manufactured goods. The executives hired some CS GA professionals and had them engineer a more efficient way to make, move, store, and sell their goods. The process involved too many variables to be handled using standard algorithms, so a GA was used, and it worked perfectly, creating a "specie" (a list of good settings for most of the variables in the system) that was 70% more effective than the system they were currently using. That seems pretty real to me.

GAs are often used in industry to solve large scale optimisation problems. Some examples: GAs have been used to schedule trains in many cities around the world (including here in Melbourne, Australia); for optimising traffic light timings for traffic density control; optimisation of flow control parameters for utility companies; and the list goes on.

Timkin

NN+GA is not necessarily anything to do with artificial life, and you can know as much or as little about the fitness function as you like: that doesn't matter.

GAs allow you to train the NN in a form of UNSUPERVISED learning: i.e. you don't know what the right behaviour is at each stage, but you can measure its result (short or long term). GAs are just an optimisation tool that allows this, among many other options.

Pure NN solutions are usually SUPERVISED: i.e. you know what the answer should be at each step -- that's the training data.
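The supervised/unsupervised contrast can be sketched as two hypothetical interfaces (a minimal illustration, not any particular library): the supervised update needs a known target for every example, while the GA-style step needs only a scalar fitness score for whole candidates.

```python
# Supervised (delta rule on a linear unit): a per-example target
# drives every weight update.
def supervised_update(weights, inputs, target, lr=0.1):
    prediction = sum(w * x for w, x in zip(weights, inputs))
    error = target - prediction            # requires the right answer
    return [w + lr * error * x for w, x in zip(weights, inputs)]

# Unsupervised, GA-style: only a scalar score of overall behaviour,
# with no per-step targets anywhere.
def ga_step(population, fitness, mutate):
    best = max(population, key=fitness)    # "how well did it do?"
    return [best] + [mutate(best) for _ in population[1:]]
```

The first function cannot even be called without a `target`; the second never sees one. That is the whole distinction.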




Artificial Intelligence Depot - Maybe it's not all about graphics...

Once you start using GAs you'll realize there are countless uses for them.


PS> How do you get the hyperlinks in your sigs, guys? (I do not know the mysteries of HTML.)

quote:
Original post by fup
How do you get the hyperlinks in your sigs, guys? (I do not know the mysteries of HTML.)


Try pasting this:

/// first attempt - drat that parser!

Second attempt:

Use the quote option to the far right on this post and then cut out the snippet of html that follows and paste it into your sig.




Stimulate





Edited by - lessbread on February 22, 2002 8:28:34 AM

I may be confused by what you're trying to say, but you seem to be suggesting that it might be better to do evolution without crossover. Can I ask exactly what you mean by evolution, then?

Cheers,

Mike

Guest Anonymous Poster
quote:
Original post by fup
Once you start using GAs you'll realize there are countless uses for them.


PS> How do you get the hyperlinks in your sigs, guys? (I do not know the mysteries of HTML.)


Why don't you just name some of those countless uses, then?

I wasn't implying GAs have no real-world uses - just that the tutorials tend to be contrived, meaning it becomes hard to see how you would use such a thing.

I just can't see why you'd train two NNs and then perform a GA crossover to improve the weights, when the same processing power could be used to train one NN for twice as long. I'm not being sceptical here, I just honestly can't see the application, except in situations where the GA is used to approximate biological genetics (as opposed to being used in a purely algorithmic fashion).

I don't really grasp the 'unsupervised' concept, I'm afraid... surely it is just a case of 'less supervised', since you cannot generate a fitness function without some method of measuring fitness?

Oh, and I have read all the articles on alexjc's and fup's sites, so directing me there for an explanation is not going to help much.

(I appreciate AI is not my forté, so please be gentle with the replies.)



Kylotan, you wouldn't train just two NNs with GAs; you would evolve a whole population of NNs.
Unsupervised means you would never (I don't know if this is strictly true) know which target pattern a NN should produce when presented with an input pattern. The actions of a NN evolved with a GA almost always depend on its environment.

GAs are called 'unsupervised' learning because the algorithm does not depend on example data. That doesn't mean you can't create some sort of fitness function; those are two different things you have to keep in mind.
For some input you would only know whether the output is 'good' or 'bad', but you wouldn't know what the output should be given only an input. There's a subtle difference here.

But I'm sure you will understand this once you fully understand both NNs and GAs. I knew about NNs and GAs, and fully understood the concept of combining the two after reading the article about the Quake NeuralBot.

I hope this will clear up a thing or two.

Edo

quote:
Original post by edotorpedo
Kylotan, you wouldn't train just two NNs with GAs; you would evolve a whole population of NNs.

OK, but that doesn't really answer my question. For 'two', you can substitute any number greater than one. Why have 20 bots learning and breeding when you could have one learning at 20 times the rate (given the same computational usage)? Is it to help overcome local minima or something? Or is it genuinely more efficient? Or is it made necessary, in that you can't use a NN on its own for this?

And is there any reason why you would use GAs with NNs if you weren't simulating a whole load of 'minds' in this way? That was kind of what I wanted to know in the first place.

quote:
But I'm sure you will understand this once you fully understand both NNs and GAs.

I have a decent understanding of NNs, I have a functional understanding of GAs (I know how to write one and how and why they work), but since I have little experience I am having trouble seeing the applicability.

I can see several places where I would use a NN, but not a GA, and especially not the two together, except for that one case we've covered. I was wondering if there were others.

From reading that NeuralBot site, it seems like he was using only a tiny fraction of a neural net's capability anyway, and doing all the learning work with the GA. Again, I am probably missing the point, but it seems like a waste.


quote:
Original post by Kylotan
Why have 20 bots learning and breeding when you could have one learning at 20 times the rate (given the same computational usage)?


That's about population size. It's a theoretical thing for the GA, so you can actually have only one agent, but many NN solutions within the GA's population. Big populations have many advantages -- mostly the diversity needed for an effective search. Personally, I have one bot in the game, but it has a virtual NN population of 81 inside it (embedded evolution).

quote:

I have a decent understanding of NNs, I have a functional understanding of GAs (I know how to write one and how and why they work), but since I have little experience I am having trouble seeing the applicability.


Again, think of the GA as an optimisation tool. No more, no less. Once you've modelled your problem for a NN, if you can't get per-instance training data, define a fitness function and optimise to that (a GA is one way of doing this).

quote:

From reading that Neuralbot site, it seems like he was using only a tiny fraction of a neural net''s capability anyway, and doing all the learning work with the GA. Again, I am probably missing the point, but it seems like a waste.


His model is very limited anyway: the way the inputs/outputs are chosen, the lack of modularity, the lack of internal state... these are all things that limit the bot. The choice of a GA is worthy (potentially allowing GasNets), but reinforcement learning might have been better, in my opinion.




Artificial Intelligence Depot - Maybe it's not all about graphics...

LessBread: thanks for the sig

Anon(lol):

quote:
Original post by Anonymous Poster


Why don't you just name some of those countless uses, then?




Because it didn't add to the discussion. GAs are essentially an optimization tool. They traverse search spaces very quickly, so they can be used in most situations where you have a whole load of parameters that need tweaking: the weights in neural nets, modelling racing lines and engine tuning in motor vehicles, protein modelling, stress calculations... honestly, the list is too big.

Kylotan:
quote:

I wasn't implying GAs have no real-world uses - just that the tutorials tend to be something contrived,



My GA tutorial is contrived because it was written solely as a prelude to the ANN tutorial. I didn't want to spend pages on a 'real' optimization problem because that's what I do in the ANN tutorial, and I believe the GA tutorial, though short, is adequate.

quote:

OK, but that doesn't really answer my question. For 'two', you can substitute any number greater than one. Why have 20 bots learning and breeding when you could have one learning at 20 times the rate (given the same computational usage)? Is it to help overcome local minima or something? Or is it genuinely more efficient? Or is it made necessary, in that you can't use a NN on its own for this?



Because if you don't use a GA, how are you going to train the net to do something like bot AI? You could *try* reinforcement learning, but I reckon that's going to prove fruitless except in specific cases. If you want to use a supervised method like backprop, then how are you going to generate relevant training sets? If you don't understand the distinction here then you do not understand what supervised and unsupervised learning actually mean.

I believe Alex has answered all your other queries.

I would strongly recommend coding your own GAs and ANNs, because that's the only way (unless you are very clever indeed) these concepts will 'click' into place. Trust me, try it and it'll answer all your questions.





Stimulate

Guest Anonymous Poster
>Why have 20 bots learning and breeding when you could have one learning at 20 times the rate (given the same computational usage)? Is it to help overcome local minima or something? Or is it genuinely more efficient? Or is it made necessary, in that you can't use a NN on its own for this?

Consider a GA as a method of space traversal. For simplicity, let us assume the NNs are single perceptrons; the space the GA would be traversing is therefore the space of all weight vectors. So you're asking why we shouldn't have 20 initial populations as opposed to one, and run the latter 20 times longer? Well, imagine yourself on a mountain range. Since there are several peaks it would take one person a very long time to find the tallest of them all, if you could at all (maybe throw in random restarts). Now imagine yourself on a mountain range with 19 friends. The group would have a better chance of finding the highest peak, since while some people would get stuck on lower peaks, others could continue on.

>I just can't see why you'd train 2 NNs and then perform a GA crossover to improve the weights, when the same processing power could be used to train that NN for twice as long.

The assumption made here is that a combination of the current best leads to the overall best, though I don't believe this assumption can be made as a general rule; it would be highly dependent on the space you were traversing. In the mountain-range example it assumes the highest peak is "in between" two other peaks, which obviously isn't always true. This assumption can be, and is, made inside the population, though: the population is assumed to span a relatively small portion of the space and not to represent two unrelated maxima.
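The mountain-range analogy is easy to make concrete. Below is a hypothetical sketch (plain hill climbing rather than a full GA, on a made-up one-dimensional "landscape"): twenty climbers started at random points find a high peak far more reliably than a lone climber, who usually gets stuck on whichever peak happens to be nearest.

```python
import math
import random

random.seed(3)  # deterministic for the sake of the example

def hill_climb(f, x, steps=200, step_size=0.05):
    """Greedy local search: accept a random nudge only if it improves f."""
    for _ in range(steps):
        candidate = x + random.uniform(-step_size, step_size)
        if f(candidate) > f(x):
            x = candidate
    return x

# A "mountain range" with several peaks; the global peak (f = 1) is at x = 0.
f = lambda x: math.cos(3 * x) * math.exp(-x * x / 8)

# One climber from one random start vs. twenty climbers in parallel.
lone = hill_climb(f, random.uniform(-5, 5))
crowd = max((hill_climb(f, random.uniform(-5, 5)) for _ in range(20)), key=f)
# `crowd` is very likely on or near a tall peak; `lone` often is not.
```

A GA's population plays the role of the twenty climbers, with crossover and selection sharing information between them on top of this.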

quote:
Original post by fup
My GA tutorial is contrived because it was written solely as a prelude to the ANN tutorial. I didn't want to spend pages on a 'real' optimization problem because that's what I do in the ANN tutorial, and I believe the GA tutorial, though short, is adequate.

Okay. I don't mean to criticise your tutorial at all, as it's very good and helpful. I guess I am just not very good at learning things unless I see a practical application, and mixing it with the NN stuff kind of blurred it for me personally.

quote:
Because if you don't use a GA, how are you going to train the net to do something like bot AI? You could *try* reinforcement learning, but I reckon that's going to prove fruitless except in specific cases.

I'll let you and Alex debate the merits of that.

quote:
If you want to use a supervised method like backprop, then how are you going to generate relevant training sets? If you don't understand the distinction here then you do not understand what supervised and unsupervised learning actually mean.

OK, so GAs are good in situations where you have no training set for a NN (e.g. learning to shoot at enemies who will appear randomly) but can still evaluate the 'fitness' (e.g. how often you hit, how much you miss by)?

quote:
I would strongly recommend coding your own GAs and ANNs, because that's the only way (unless you are very clever indeed) these concepts will 'click' into place. Trust me, try it and it'll answer all your questions.

Yeah - I just need to find uses for them, that's all (and hence this entire thread). Any vague ideas on how I could leverage either or both of them into a text-only MUD?

Thanks to all for your responses, by the way.


quote:
Original post by Kylotan
OK, so GAs are good in situations where you have no training set for a NN (e.g. learning to shoot at enemies who will appear randomly) but can still evaluate the 'fitness' (e.g. how often you hit, how much you miss by)?



What you basically just did was explain the difference between supervised and unsupervised learning in the form of a question. GAs are unsupervised; hence they require no training data. Supervised training methods like backprop require training data. In short, you've hit the nail on the head.

As for real-life applications, I'll be honest: I haven't tried to find any. The problem for me is that the architecture of a NN seems altogether too arbitrary. The architecture of a NN is dependent upon the application. If you don't understand the application beyond a simple fitness function, it seems that you're basically just hoping that the architecture you decided upon can classify things accurately. You can't use a perceptron to solve XOR. But if you didn't know the answers to the XOR problem, and could only say "right" or "wrong" for a given output (as a fitness function does), then how would you know that it isn't linearly separable and therefore cannot be solved by a perceptron? If you design a two-layer neural net to control your Quake bot and know nothing more than whether it lives or dies, how do you know if your architecture is capable of solving the problem? These are my questions, and I can't seem to answer them. The answer lies in evolving the architecture as well - and this is getting a bit over my head!!
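The XOR point is easy to demonstrate. Here is a hypothetical sketch of the classic perceptron learning rule: on AND (which is linearly separable) it converges to zero errors, while on XOR some error remains no matter how long it trains, because no single linear threshold can separate the classes.

```python
def train_perceptron(samples, epochs=100, lr=0.1):
    """Classic perceptron rule on binary-labelled 2-input samples.
    Returns the number of misclassified samples after training."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            w[0] += lr * (target - out) * x1
            w[1] += lr * (target - out) * x2
            b += lr * (target - out)
    return sum(1 for (x1, x2), t in samples
               if (1 if w[0] * x1 + w[1] * x2 + b > 0 else 0) != t)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

# AND is linearly separable, so training reaches zero errors;
# XOR is not, so at least one error always remains.
```

A fitness function over this perceptron's behaviour would happily report its XOR score forever without ever revealing that the architecture itself is the obstacle, which is exactly the worry raised above.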

...he dreams dreams dreams... a universe all to his own...
...unsupervised learning in the land of oz...

[Theory that Dreams are a conduit for unsupervised human learning.]

[Hugo Ferreira][Positronic Dreams][]
"Somewhere, something incredible is waiting to be known."
- Carl Edward Sagan (1934 - 1996)

quote:
Original post by TerranFury
If you design a 2-layer neural net to control your quake bot and know nothing more than whether it lives or dies, how do you know if your architecture is capable of solving the problem? These are my questions, and I can't seem to solve them.



There's a great thread on the AI Depot about this: evolution, simplicity, emergence and implicit fitness functions.

http://geneticalgorithms.ai-depot.com/GeneticAlgorithms/View.html?id=121

quote:
Original post by TerranFury
The answer lies in evolving the architecture as well - and this is getting a bit over my head!!


That's what I'm doing for my motion controllers: I'm evolving the NN's structure as well as the sensor parameters (distance, angle). That's a benefit of GAs: they can optimise pretty much anything you can model! If I wasn't able to do that, I'd use reinforcement learning. In retrospect, my higher-level AI research is leading towards that anyway, and contrary to fup, I'm a strong believer in its potential.



Artificial Intelligence Depot - Maybe it's not all about graphics...

Edited by - alexjc on February 25, 2002 6:41:17 AM

Yeah, once you've got a feel for the basics of neural nets, evolving their topology is the only way to go.

Alex: I don't think we disagree about reinforcement learning. I was just indicating that, like backprop, reinforcement learning has its limits. It's certainly very appropriate for the right sort of problem.

Kylotan:
quote:

Yeah - I just need to find uses for them, that's all (and hence this entire thread). Any vague ideas on how I could leverage either or both of them into a text-only MUD?



Well, I have no experience with MUDs whatsoever, so I can't advise you here. I do know, however, that ANNs are being used for all sorts of text analysis, some of which may be relevant to a MUD.




Stimulate

There seems little point in using a supervised error-reduction method such as backprop along with a genetic algorithm, for a very simple reason. Due to the quick convergence of a population at the earliest stages of evolution (in fact, early evolution is merely choosing the best initial start position out of the random initial population), the individuals will quickly attain a very similar set of weights. Back propagation on these nearly identical weights will push them all in the same direction (though not entirely identically, given some discrepancies between the exact values). Therefore backprop will direct the converged population in the same direction before evolution chooses the best of the population for breeding. This kinda suggests that the evolution is fairly pointless, as you're only testing many individuals that have all been adapted in the same way. Further, if the point of evolution is to find novel solutions because you cannot supervise their learning, then each step away from the supervised answer will be brought back by the error minimisation of backprop. That is, evolution steps away from the supposed optimum and adaptation brings it straight back.

In all, I'd say it was fairly pointless to do both at once.
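This damping effect can be illustrated with a hypothetical one-weight example: a shared backprop step is an affine pull toward the supervised optimum, so it shrinks exactly the spread that mutation just injected into the population.

```python
import statistics

def backprop_step(w, lr=0.5):
    """One gradient step of a 1-weight linear unit on the sample (x=1, t=1):
    squared error (t - w)^2, so the step pulls every w toward 1."""
    return w + lr * (1.0 - w)

# A converged GA population, freshly scattered by mutation...
population = [0.8, 1.3, 0.6, 1.1, 0.95]
before = statistics.pstdev(population)

# ...and pulled straight back together by one shared backprop step.
population = [backprop_step(w) for w in population]
after = statistics.pstdev(population)
# after is before * (1 - lr): half of the injected diversity is gone
# after a single step, and the rest follows geometrically.
```

So the two search processes fight each other: mutation spreads the population out, and each shared gradient step multiplies that spread back down by (1 - lr).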

Mike
