Evolving PacMan AI

Started by
37 comments, last by kirkd 16 years, 1 month ago
For those who are interested in such things, I have just updated my Evolutionary PacMan project on SourceForge: http://sourceforge.net/projects/aipac/

The most significant change addresses instability issues: apparently many users had problems with the application shutting down unexpectedly. I have made a number of changes which should fix this. Please don't hesitate to let me know if problems of this sort still occur.

Two representation methods are in place:

Windowed, in which a window around PacMan constitutes the inputs. Each tile in the window (except under PacMan himself) has an input associated with dots/walls, power pellets, and ghosts. Users can select which inputs are used through the parameters file.

Global, in which details of the maze are used as inputs, e.g., PacMan's X and Y coordinates, ghost X and Y coordinates, ghost state, power pellet state, and the presence of walls or dots around PacMan. Again, the specific inputs used are selectable through the parameters file.

I hope this can be useful and/or entertaining.

-Kirk
Can one see PacMan playing at some point? Only the statistics get updated here. When does it stop so the game plays?
I like the Walrus best.
Thanks for your interest!

Yes, you can opt to have the PacMan run after each generation using the best neural net thus far. To get this to work, while one of the application's windows is activated (the PacMan game maze is probably the best), press the 'B' key. After the current evolutionary step, PacMan will become active. You can turn it off - go back to having PacMan not run - by pressing the 'B' key again.

BE PATIENT after you press the 'B' key as the evolutionary step will have to complete before PacMan runs and sometimes this will take a while depending on how many inputs you have and how large the population is. Also, if you get impatient and hit the 'B' key a second time, you will end up toggling the display off so you'll be back where you started. Just hit the 'B' key once.

Please let me know whether you get this working.

-Kirk


Alright, now I see it play. What exactly does PacMan learn here? I have the feeling that you measure fitness based on the number of pills he eats, and that the genome is a fixed succession of moves that leads PacMan to eat a certain number of pills. I mean, PacMan will always play the same game, won't he?
There's a lot to your question. Here's the detail:

A neural net controls PacMan's movements through the maze, and you can control the types of inputs going into the net. The default version associated with the executable has no ghosts and uses a window around PacMan with a radius of 4 tiles. This gives four tiles in each direction around him, plus the two tiles he occupies along each axis, or 4+4+2 (4 to the right, 4 to the left, 2 for PacMan) left/right and up/down. That's a 10x10 window, or 100 total tiles. The tiles that PacMan covers are removed from this set, leaving 96 inputs.

Each tile can have 3 possible inputs - 1 for ghosts (-1 if a baddie, 0 if no ghost there, and 1 if a blue one), 1 for a dot/wall (-1 if there's a wall, 0 if nothing there, and 1 if a dot is there), and 1 for a pellet (0 for no pellet, 1 for a pellet). The default version you have has no ghosts (you can control the number of ghosts in the input file), so you have dots/walls and pellets as inputs for each tile. The grand total is now 96 * 2 or 192 inputs.
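The per-tile encoding above can be sketched like this (a minimal illustration; the function and constant names are hypothetical, not taken from the project's source):

```python
# Sketch of the per-tile input encoding described above.
# All names here are hypothetical; the actual project code may differ.

WINDOW_RADIUS = 4
SIDE = 2 * WINDOW_RADIUS + 2   # 4 left + 4 right + 2 for PacMan = 10
TILES = SIDE * SIDE - 4        # 100 tiles minus the 4 PacMan covers = 96

def encode_tile(has_wall, has_dot, has_pellet):
    """Return the (dot/wall, pellet) input pair for one tile (no ghosts)."""
    dot_wall = -1.0 if has_wall else (1.0 if has_dot else 0.0)
    pellet = 1.0 if has_pellet else 0.0
    return dot_wall, pellet

# With ghosts disabled, each tile contributes 2 inputs:
INPUTS = TILES * 2             # 96 * 2 = 192
```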

At each step that PacMan takes, the neural network is evaluated. The net has 4 outputs, one for each direction (up, down, left, right). The output with the highest value gives the direction that PacMan takes.
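The highest-output-wins step is just an argmax over the four outputs; a quick sketch (names are made up for illustration):

```python
# Hypothetical sketch: pick PacMan's move from the net's 4 outputs.
DIRECTIONS = ["up", "down", "left", "right"]

def choose_direction(outputs):
    """outputs: list of 4 floats, one per direction; the highest wins."""
    best = max(range(4), key=lambda i: outputs[i])
    return DIRECTIONS[best]
```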

In the absence of ghosts, the game map is deterministic so yes, PacMan will take the same path when using the same neural network. The ghosts have a strong random component to their movements, so adding these will cause some random variation in PacMan's movement from run to run.

The Neural Nets are evolved using an Evolutionary Computation process. The fitness I'm using is the score that PacMan gets for one run through the maze. The time one run can take is limited (you can also control this in the parameters file). The goal is to maximize the score that PacMan gets.

I've found it most interesting to look at PacMan's navigation early in the process and compare that to what happens after it has run for a while. No surprise that early in the process PacMan's movements are more or less random - often he just drifts one direction or the other, hits a wall, and stops. Then after it has run for a while and his score is over 1000, his behavior is much more interesting. So far the maximum score I've seen (with no ghosts) is about 2600 out of a maximum of 2840.

I hope that answers your question - without putting you to sleep!

-Kirk



That's pretty nice. How well does PacMan play when there are ghosts? I'm editing the .param file to see how it goes...
Thank you! Tell your friends. (Get my rating up on SourceForge!) 8^)

What I've found so far is this:

Using the combined dot/wall input I described is MUCH better than having them as separate inputs. I'm sure this is because the network becomes much simpler and easier to deal with in a short time. I'm concerned, however, that this is a bit of cheating. Should I allow the evolution to learn this relationship on its own?

The best window radius I've found so far is 2. I've tried many sizes and this seems to perform very well with only a few inputs - 32 to be exact.
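The input counts quoted above follow directly from the window geometry; here is a small check (a hypothetical helper, not from the project):

```python
# Quick check of the input counts quoted above (hypothetical helper).
def window_inputs(radius, per_tile=1):
    side = 2 * radius + 2                  # tiles per axis, incl. PacMan's 2
    return (side * side - 4) * per_tile    # drop the 4 tiles PacMan covers

# radius 4 with two inputs per tile -> 192
# radius 2 with the dot/wall input only -> 32
```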

Adding the inputs for the PowerPellets and the Ghosts has not yet seemed to help much. My guess is that with either of these providing inputs, they are only turned "on" a very small amount of the time. For example, power pellet inputs are only equal to 1 when there's a pellet within PacMan's window, which is not very often. The result is that learning is very slow and limited in its impact. The same goes for ghosts.

That being said, when I have turned ghosts on and have dot/wall, power pellet, and ghost inputs, I've seen some interesting results. Many times, PacMan will just eat a power pellet and then wait for a ghost to bump into him. (The ghosts are rather stupid, so they don't avoid him like they perhaps should - I'm working to fix this.) Often, though, I've seen PacMan run back and forth in a corridor just below a power pellet waiting for a ghost to come by. When a ghost gets close, he grabs the power pellet. I've not yet seen him pursue the ghosts, though.

If you get any interesting results, let me know. If you can grab a movie of the screen (like I have done and uploaded onto SourceForge), please do. I'll load it up on the project page as well. Eventually I want to be able to write the neural nets to a file which can be loaded in again later. This would allow any interesting behaviors to be saved and shared.

-Kirk

This is a pretty nice project. Keep it up!
Thank you again. I truly appreciate the positive feedback.

Feel free to offer suggestions. If you have an idea of something you would like to see, let me know. I'm always open to suggestions and constructive criticism.

-Kirk
Have you seen this research into making a Ms. Pac-Man AI? I think they used simple handcrafted rules and searched for policies that apply those rules effectively. Maybe try replacing your outputs with rules, where the network chooses which rule to apply, and see how that works. Of course, the big question is how to automatically build effective rules given just the basic inputs and outputs.
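The suggestion above amounts to having the net select among a fixed rule set rather than raw directions; a minimal sketch, with the rule names invented purely as examples:

```python
# Hedged sketch of a rule-selection output layer, as suggested above.
# The rule names are invented examples, not from any actual project.
RULES = ["eat_nearest_dot", "flee_nearest_ghost",
         "grab_power_pellet", "chase_blue_ghost"]

def choose_rule(net_outputs):
    """net_outputs: one value per rule; apply the highest-scoring rule."""
    best = max(range(len(RULES)), key=lambda i: net_outputs[i])
    return RULES[best]
```

Each selected rule would then produce a concrete move (e.g., pathfind toward the nearest dot), so the evolution searches over rule policies instead of low-level directions.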

This topic is closed to new replies.
