
Train a neural network with genetic algorithms


16 replies to this topic

#1 simshon   Members   -  Reputation: 100


Posted 04 October 2011 - 07:55 PM

Hi, I'm trying to program a racing game in which a computer-controlled car learns the track with a neural network.
I'm also trying to use a genetic algorithm to train the network so it learns better :)

I defined 150 steps and 100 cars per generation.
Each car (gene) stores the weights of the network as its data.

Each generation I keep the best genes of the generation (the best one, plus every gene whose fitness is at least 95% of the best).

The crossover method: each weight of the child is chosen from one of the "parents" (the best of the last generation).

The mutation method: select one of the weights and change it by +/-30% of its value.

Each generation has 150 steps; after each step I add to or subtract from each gene's fitness.
At the moment I just have a clear screen (half of the normal XNA screen).
If the car collides with a wall, I subtract 1 from the fitness.
If the car moves normally (without colliding), I add the ratio of the distance it moved to the maximum it could have moved;
say the car moved half a meter and its speed is one meter per step, then I add 0.5/1 =
moved/carSpeed.

Each car has four sensors that report the distance to each wall.
If the wall is out of sensor range the sensor reads -1;
otherwise it reads the distance to the wall divided by the maximum range, (width+height)/3.
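As a sketch, the sensor encoding just described could look like this in Python (the dimensions and names are illustrative, not from the actual XNA project):

```python
WIDTH, HEIGHT = 800, 240      # made-up play-area size, half an XNA screen

MAX_RANGE = (WIDTH + HEIGHT) / 3.0

def sensor_reading(distance_to_wall):
    """-1 when the wall is beyond sensor range, else the normalized distance."""
    if distance_to_wall > MAX_RANGE:
        return -1.0
    return distance_to_wall / MAX_RANGE
```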

As far as I can see, the cars learn to cover more and more area as the generations go by, but they can't learn to avoid the walls :S
Around generation 15 the best car of each generation starts touching the walls, and after that it just gets faster and faster while still hitting them :S

Is there something wrong with my general idea?
Thanks very much for your help :)


#2 Wooh   Members   -  Reputation: 652


Posted 04 October 2011 - 09:41 PM

Each car (gene) stores the weights of the network as its data.

I think you are confusing genes and chromosomes.

Each generation I keep the best genes of the generation (the best one, plus every gene whose fitness is at least 95% of the best).

Are these the cars that are used to generate the next generation? If very few or none of the other cars reach 95% fitness, you will get very low genetic diversity, which is bad. With something like tournament selection, all cars can still be chosen, but with different probabilities.
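Tournament selection, as suggested here, could be sketched like this (function and parameter names are illustrative):

```python
import random

def tournament_select(population, fitness, k=3, rng=random):
    """Pick k individuals at random and return the fittest of them.
    Weaker cars still get picked whenever no strong car is drawn,
    which keeps more genetic diversity than a hard 95%-of-best cutoff."""
    contenders = rng.sample(range(len(population)), k)
    winner = max(contenders, key=lambda i: fitness[i])
    return population[winner]
```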

The crossover method: each weight of the child is chosen from one of the "parents" (the best of the last generation).

I don't get this. Crossover is used to swap some genes between two chromosomes.

The mutation method: select one of the weights and change it by +/-30% of its value.

When you say +/-30%, do you mean in the range [-30%, 30%]? I'm not sure, but 30% sounds like a lot.

If the car collides with a wall, I subtract 1 from the fitness.

What happens if the car has been colliding since the last step but is not currently colliding? Will that still affect the fitness? If you get this wrong, it could explain why they don't learn to avoid walls.

If the car moves normally (without colliding), I add the ratio of the distance it moved to the maximum it could have moved;
say the car moved half a meter and its speed is one meter per step, then I add 0.5/1 =
moved/carSpeed.

Are you taking the road direction into account so that you don't reward movement in the wrong direction?

Each car has four sensors that report the distance to each wall.

Is there one sensor at the front, back, left and right of the car? I have no idea what your track looks like, but imagine the car driving on a straight road with a 90° turn later on. No sensor will be able to see the turn until the car is already in it. Could that be too late, so a crash is unavoidable? Maybe you need more sensors to learn better.

#3 simshon   Members   -  Reputation: 100


Posted 05 October 2011 - 07:21 AM

Each car (gene) stores the weights of the network as its data.

I think you are confusing genes and chromosomes.

Each generation I keep the best genes of the generation (the best one, plus every gene whose fitness is at least 95% of the best).

Are these the cars that are used to generate the next generation? If very few or none of the other cars reach 95% fitness, you will get very low genetic diversity, which is bad. With something like tournament selection, all cars can still be chosen, but with different probabilities.
Yes, all the 'best' cars are used for the next generation's crossover.

The crossover method: each weight of the child is chosen from one of the "parents" (the best of the last generation).

I don't get this. Crossover is used to swap some genes between two chromosomes.
If the program decides to do crossover for this car (the crossover rate is ~0.5), then for each weight in the network the new 'child' randomly picks which parent it "takes" that weight from.
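The scheme described here is per-weight uniform crossover. As an illustrative sketch (not the actual C# code):

```python
import random

def uniform_crossover(parent_a, parent_b, rng=random):
    """Build a child that takes each weight from a randomly chosen parent,
    as described above (per-weight uniform crossover)."""
    return [a if rng.random() < 0.5 else b
            for a, b in zip(parent_a, parent_b)]
```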

The mutation method: select one of the weights and change it by +/-30% of its value.

When you say +/-30%, do you mean in the range [-30%, 30%]? I'm not sure, but 30% sounds like a lot.
Yes. The program randomly decides (50/50) whether to add to or subtract from the weight, then I add/subtract random.NextDouble()*(val/3), which is actually up to 33% ^_^.
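That mutation could be sketched as follows (illustrative Python, not the actual C# code; the perturbation is up to 33%, matching random.NextDouble() * (val / 3)):

```python
import random

def mutate(weights, rng=random):
    """Pick one weight and perturb it by up to +/-33% of its value,
    mirroring the 50/50 add/subtract coin flip described in the post."""
    out = list(weights)
    i = rng.randrange(len(out))                 # one weight, as the OP does
    delta = rng.random() * (abs(out[i]) / 3.0)  # up to a third of the value
    out[i] += delta if rng.random() < 0.5 else -delta
    return out
```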

If the car collides with a wall, I subtract 1 from the fitness.

What happens if the car has been colliding since the last step but is not currently colliding? Will that still affect the fitness? If you get this wrong, it could explain why they don't learn to avoid walls.
That situation can't happen, because if the car collides I don't allow it to move into the wall, so from the step where the car first collides it will keep colliding until the end of the generation.
How could I improve this?

If the car moves normally (without colliding), I add the ratio of the distance it moved to the maximum it could have moved;
say the car moved half a meter and its speed is one meter per step, then I add 0.5/1 =
moved/carSpeed.

Are you taking the road direction into account so that you don't reward movement in the wrong direction?
At the moment I just have a blank screen, and I'm trying to 'teach' the cars to avoid the walls.

Each car has four sensors that report the distance to each wall.

Is there one sensor at the front, back, left and right of the car? I have no idea what your track looks like, but imagine the car driving on a straight road with a 90° turn later on. No sensor will be able to see the turn until the car is already in it. Could that be too late, so a crash is unavoidable? Maybe you need more sensors to learn better.

Good idea, I'll add more sensors at the front of the car.
Thanks for your help :)




#4 AntP   Members   -  Reputation: 143


Posted 06 October 2011 - 08:15 PM

That situation can't happen, because if the car collides I don't allow it to move into the wall, so from the step where the car first collides it will keep colliding until the end of the generation.
How could I improve this?



How about:

fitness = fitnessBeforeCollisions / timeInContactWithWalls
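One way to read this suggestion in code (the +1 in the denominator is an added guard against division by zero for collision-free runs, not part of the original formula):

```python
def adjusted_fitness(fitness_before_collisions, time_in_contact_with_walls):
    """The longer a car spends touching walls, the more its
    pre-collision fitness is discounted."""
    return fitness_before_collisions / (1.0 + time_in_contact_with_walls)
```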


#5 LorenzoGatti   Crossbones+   -  Reputation: 2773


Posted 07 October 2011 - 06:59 AM

I don't think applying genetic algorithms to rough drafts of car controllers (imperfect enough to crash into walls) is very productive. I'd try to get a pool of cars that can finish the race first, then apply genetic algorithms with lap time (on one track, or the total of several tracks) as fitness and eliminating mutants that crash without letting them perpetuate.

To obtain a pool of decent car controllers that finish the race, you could evaluate them on a long test track with a great variety of curve and straight sections and use length of the path before crashing (not speed!) as fitness: improved mutants would crash further along, until you get controllers that finish the race and you start making them fast.

Can you explain the car controller representation and the type of neural network you are using? Are you considering speed, acceleration, centrifugal forces etc. as inputs?
Produci, consuma, crepa

#6 IADaveMark   Moderators   -  Reputation: 2532


Posted 07 October 2011 - 07:52 AM

I agree that the mix-and-match approach seemed a bit odd. I can easily see the NN and GA working against each other... or at least being counter-productive because of the additional noise.

I'm curious why the OP chose to use NN/GA technology anyway. Are we trying to create a racing game or a machine learning test project?
Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC

Professional consultant on game AI, mathematical modeling, simulation modeling
Co-advisor of the GDC AI Summit
Co-founder of the AI Game Programmers Guild
Author of the book, Behavioral Mathematics for Game AI

Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play

"Reducing the world to mathematical equations!"

#7 Emergent   Members   -  Reputation: 971


Posted 08 October 2011 - 10:14 AM

I agree that the mix-and-match approach seemed a bit odd. I can easily see the NN and GA working against each other... or at least being counter-productive because of the additional noise.

I'm curious why the OP chose to use NN/GA technology anyway. Are we trying to create a racing game or a machine learning test project?


Wait, what's the problem, and why is this weird? ANNs are function approximators, GAs are optimizers, and using a GA to optimize the ANN weights is both totally reasonable and commonly done. Maybe a simpler linear architecture would be better, but that only really matters if you're trying to do something like approximate dynamic programming.

I like LorenzoGatti's suggestion to first get cars that can finish the race, and then to optimize lap times. I'm reminded of two-phase algorithms for convex optimization. Borrowing this idea, you'd think of your problem as minimizing lap time subject to the constraint that you finish the race. To do this, you'd start with a bunch of infeasible points, and, in phase 1, minimize constraint violation (say, maximize distance traveled around the track) until you have a large enough pool of feasible points; then you switch to minimizing lap time, keeping only feasible points from now on.

The same two-phase strategy can really be applied with whatever optimizer you want. Since we've got a nonconvex problem if we stick with ANNs, stochastic methods (like GAs) seem reasonable, though I might prefer the (simpler, IMO) cross-entropy method.
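A minimal sketch of the cross-entropy method on a toy 2-D objective (all names and hyperparameters here are illustrative, not a recommendation for the car problem):

```python
import random
import statistics

def cross_entropy_method(objective, dim, iters=40, pop=50, elite=10, rng=random):
    """Sample candidate weight vectors from a diagonal Gaussian, keep the
    elite fraction, refit the Gaussian to the elite, repeat."""
    mean = [0.0] * dim
    std = [1.0] * dim
    for _ in range(iters):
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(pop)]
        samples.sort(key=objective)            # lower objective is better
        elites = samples[:elite]
        mean = [statistics.mean(col) for col in zip(*elites)]
        std = [statistics.stdev(col) + 1e-6 for col in zip(*elites)]
    return mean

# Toy check: minimize squared distance to the point (2, -1).
best = cross_entropy_method(lambda w: (w[0] - 2) ** 2 + (w[1] + 1) ** 2, dim=2)
```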

All this said, if you're interested more in results than in having the computer "automatically" "learn" how to race, maybe you really just want to use some nice hand-made controllers. Maybe you can optimize them locally by tuning some weights with gradient descent.

#8 Franck Dernoncourt   Members   -  Reputation: 142


Posted 08 October 2011 - 12:38 PM

Having spent a few months this year training ANNs with GAs (aka neuroevolution; as a side note, we don't always use a GA as the evolutionary algorithm to evolve ANNs), I'd just like to quickly sum up the key benefits, so you can judge more accurately whether a GA is useful in your case (I only skimmed the thread, so I have no opinion):
  • structural evolution (i.e. you can evolve the ANN's structure, not just the weights)
  • multi-objective optimization
  • quite robust to local optima
  • easily multithreadable
The main downside is the computational cost (and potentially the memory cost if your individuals are obese).


#9 willh   Members   -  Reputation: 160


Posted 11 October 2011 - 09:18 AM

Using a GA or EP to train an ANN is fine and can work. For many problems back propagation is better, but for yours I think the method you chose is fine.

Chances are that you've found a local optimum where smashing into a wall is okay behavior. What you need to do is increase the smashing penalty. Go ahead and make it really high. To be a little nicer, you may want to calculate the force of the collision and use it so that head-on collisions are penalized more than glancing blows.

You might also want to try roulette-wheel or tournament selection. It gives your population a bit more leeway in terms of fitness strategies. You will always be guaranteed to keep the most fit member, but you won't always be wiping out slightly weaker members with functional clones of the top scorer.

Good luck, and please post some videos showing the progress of evolution :)
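Roulette-wheel (fitness-proportionate) selection could be sketched like this; shifting the fitness values to be positive is an added assumption, since the fitness described earlier in the thread can go negative:

```python
import random

def roulette_select(population, fitness, rng=random):
    """Each individual's chance of selection is proportional to its
    (shifted-positive) fitness, so weaker members survive occasionally."""
    low = min(fitness)
    weights = [f - low + 1e-9 for f in fitness]  # tiny epsilon keeps total > 0
    spin = rng.random() * sum(weights)
    acc = 0.0
    for individual, w in zip(population, weights):
        acc += w
        if spin <= acc:
            return individual
    return population[-1]                        # guard against float rounding
```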

#10 IADaveMark   Moderators   -  Reputation: 2532


Posted 11 October 2011 - 12:39 PM

I believe that was my issue with all of this... it seems that by combining the two, you are training your fitness function with a fitness function. Why not just make the original one more accurately descriptive in the first place?

#11 sooner123   Members   -  Reputation: 241


Posted 12 October 2011 - 11:18 AM

I believe that was my issue with all of this... it seems that by combining the two, you are training your fitness function with a fitness function. Why not just make the original one more accurately descriptive in the first place?


I disagree. Back propagation works about as well as permuting the weights genetically. That's all this guy is doing: treating the weights of the connections between the nodes as components of the chromosome. The only time the NN works against the GA is when not just the weights but also the structure of the NN is dynamic.

#12 Franck Dernoncourt   Members   -  Reputation: 142


Posted 12 October 2011 - 02:33 PM

Back propagation works about as well as permuting the weights genetically.

It probably depends on the problem you are trying to solve, and obviously on your GA choices (operators, etc.).

http://ieeexplore.ie...arnumber=938792 : "Training neural networks: backpropagation vs. genetic algorithms"
There are a number of problems associated with training neural networks with backpropagation algorithm. The algorithm scales exponentially with increased complexity of the problem. It is very often trapped in local minima, and is not robust to changes of network parameters such as number of hidden layer neurons and learning rate. The use of genetic algorithms is a recent trend, which is good at exploring a large and complex search space, to overcome such problems. In this paper a genetic algorithm is proposed for training feedforward neural networks and its performances is investigated. The results are analyzed and compared with those obtained by the backpropagation algorithm



The only time the NN works against the GA is when not just the weights but also the structure of the NN is dynamic.

Why would it work against it? As stated in my previous post, structural evolution is potentially one of the key benefits of training ANNs with a GA.

#13 willh   Members   -  Reputation: 160


Posted 13 October 2011 - 09:31 AM

The only time the NN works against the GA is when not just the weights but also the structure of the NN is dynamic.

Why would it work against it? As stated in my previous post, structural evolution is potentially one of the key benefits of training ANNs with a GA.


Totally agree. Start with the inputs feeding directly to the output and let the structure evolve on its own. That usually works faster than imposing a structure. I've used this method to build classifiers for both face detection and OCR.

That said, I want to take this opportunity to pee on the GA/ANN parade for a second.

GA/ANN can take an insanely long time to converge, and it's impossible to calculate ahead of time when convergence will occur. In some cases we are talking about days of processing without any guarantee of an acceptable solution.

Back propagation is usually much faster than GA/ANN, provided that you have training data. It's so much faster that you can often train a few hundred ANNs with different structures and different activation functions (picking the best performer when you're done) in a fraction of the time it would take to 'evolve' one. Below is a video I made demonstrating the back propagation learning algorithm in 'real time'. If this were done using a GA/EP/whatever, we would have started out with a single 'step' function within a few generations, and then waited a long time for any additional progress.

Back prop video


If you don't have training data (as in the case of the race car), then using a GA is definitely fine and dandy.

Now, all of that said, there is yet a better method: C&RT (Breiman, et al.). It's statistically sound, lightning fast, and the results are human-readable to the point that a 5-year-old could understand what the final algorithm is doing. There are no complicated activation functions or derivatives to worry about, no learning rates to twiddle with, inputs do not need to be normalized or altered in any way, and anyone who can use an 'if then' statement is already qualified to work with them.

If there is enough interest I can write something up about them and how to implement a simple homogeneity-based learning algorithm.

The reason I bring this up is that it would be much easier to hand-code a C&RTree to navigate a car around a track, and then let the machine learning tool of your choice improve on it.
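A hand-coded tree in that spirit might look like the following; the sensor convention follows the OP's (-1 means no wall in range, otherwise a normalized distance), but the thresholds and action names are made up:

```python
def steer(front, left, right):
    """Plain if/then rules over three range sensors: turn away from a
    close wall ahead, toward whichever side has more room."""
    if front != -1 and front < 0.3:              # wall close ahead
        if left == -1 or (right != -1 and right < left):
            return "steer_left"                  # more room on the left
        return "steer_right"
    return "straight"
```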

#14 IADaveMark   Moderators   -  Reputation: 2532


Posted 13 October 2011 - 09:50 AM


I believe that was my issue with all of this... it seems that by combining the two, you are training your fitness function with a fitness function. Why not just make the original one more accurately descriptive in the first place?


I disagree. Back propagation works about as well as permuting the weights genetically. That's all this guy is doing: treating the weights of the connections between the nodes as components of the chromosome. The only time the NN works against the GA is when not just the weights but also the structure of the NN is dynamic.

That wasn't what I meant. I was speaking to what one of the earlier posters had said: that it seems like you would be using one fitness function (that of the GA in this case) to discover the best fitness function for the other architecture (the ANN).




#15 J-dog   Members   -  Reputation: 120


Posted 20 October 2011 - 03:18 AM

Okay... I'm actually surprised to see this discussion go on to this depth. There are loads of ways to train NNs, and GAs are a great one... but why are you even doing this? Why not just neural nets? Why not only GAs?

I think the bottom line here is that these algorithms are just too complex for what you are trying to do with them. Even if your problem space weren't large and your heuristic (fitness function) were more specific (it seems rather vague), it would still take ages for this thing to run its course, and even THEN you wouldn't have any guarantee that your network is well trained, much less that these AI opponents are fun or worthwhile to race against.

I think your idea sounds cool, and if you said you were investigating NNs trained by GAs in this problem space, I would think it a fantastic idea. But since you say you wish to develop a game with this, I cannot help but think that you learned about these algorithms and are trying to apply them in a space where they do not belong.

#16 LorenzoGatti   Crossbones+   -  Reputation: 2773


Posted 20 October 2011 - 07:23 AM

I still don't know what sort of "neural network" is being used as a race car controller. What are its inputs (range, speed, acceleration sensors) and outputs (steering, throttle, brakes)? Does it hold some internal state? Is it really a neural network, or thinking of it as a generic concise representation of a lookup table would be more appropriate? What constraints (e.g. symmetry between left and right) can be imposed on its structure?

Without assurance that good car controllers are possible, I'm strongly inclined to attribute any problems with cars that keep crashing around to fundamentally inadequate architecture (mainly ignoring important inputs), not to insufficient or unsophisticated training.

#17 willh   Members   -  Reputation: 160


Posted 20 October 2011 - 11:49 AM

I still don't know what sort of "neural network" is being used as a race car controller. What are its inputs (range, speed, acceleration sensors) and outputs (steering, throttle, brakes)? Does it hold some internal state? Is it really a neural network, or thinking of it as a generic concise representation of a lookup table would be more appropriate? What constraints (e.g. symmetry between left and right) can be imposed on its structure?

Without assurance that good car controllers are possible, I'm strongly inclined to attribute any problems with cars that keep crashing around to fundamentally inadequate architecture (mainly ignoring important inputs), not to insufficient or unsophisticated training.



He specified the inputs: range to wall, and he has four of them. Range to wall is measured as a distance, I'm assuming, and not as 'time to collision'.

Good car controllers are definitely possible. Below are some decent ones.

a. This one made by evolving a neural network (video)
b. A prettier one using Neuroevolution
c. Even prettier one

It's not really that complicated a problem when there are no other cars on the track. With enough time the neural network would just memorize the optimal path.

I can't speak for the OP, but one reason to use a neural network (MLP) is that they can approximate any function 'as is' without special coding considerations. Changing the inputs/outputs doesn't mean re-writing the neural network code.
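A fixed-structure MLP forward pass really is that generic; the code below never changes when the weights (or the number of inputs) do, only the lists passed in. Shapes and names are illustrative:

```python
import math

def mlp_forward(inputs, w_hidden, w_out):
    """One hidden tanh layer, linear output; the GA would evolve the
    numbers in w_hidden and w_out, never this code."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)))
              for row in w_hidden]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
```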


He is definitely missing some inputs if he only has 4 direction sensors. The speed of the car would be an important one. :)



