Jump to content

  • Log In with Google      Sign In   
  • Create Account

We're offering banner ads on our site from just $5!

1. Details HERE. 2. GDNet+ Subscriptions HERE. 3. Ad upload HERE.


Evolving PacMan AI


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

  • You cannot reply to this topic
38 replies to this topic

#1 kirkd   Members   -  Reputation: 505

Like
0Likes
Like

Posted 29 January 2008 - 11:08 AM

For those who are interested in such things, I have just updated my Evolutionary PacMan project on SourceForge: http://sourceforge.net/projects/aipac/ The most significant change is addressing instability issues. Apparently many users had problems with the application shutting down unexpectedly. I have made a number of changes which should address this problem. Please don't hesitate to let me know if problems of this sort still occur. Two representation methods are in place: Windowed, in which a window around PacMan constitutes the inputs. Each tile in the window (except under PacMan himself) has an input associated with dots/walls, power pellets, and ghosts. Users can select which inputs are used through the parameters file. Global, in which details of the maze are used as inputs, e.g., PacMan's X and Y coordinates, ghost X and Y coordinates, ghost state, power pellet state, presence of walls or dots around PacMan. Again, the specifc inputs used are selectable through the parameters file. I hope this can be useful and/or entertaining. -Kirk

Sponsor:

#2 owl   Banned   -  Reputation: 364

Like
0Likes
Like

Posted 05 February 2008 - 09:13 PM

can one see the pacman playing at some point? only the statistics gets updated here, when does it stop and the game plays?

#3 kirkd   Members   -  Reputation: 505

Like
0Likes
Like

Posted 06 February 2008 - 04:25 AM

Thanks for your interest!

Yes, you can opt to have the PacMan run after each generation using the best neural net thus far. To get this to work, while one of the application's windows is activated (the PacMan game maze is probably the best), press the 'B' key. After the current evolutionary step, PacMan will become active. You can turn it off - go back to having PacMan not run - by pressing the 'B' key again.

BE PATIENT after you press the 'B' key as the evolutionary step will have to complete before PacMan runs and sometimes this will take a while depending on how many inputs you have and how large the population is. Also, if you get impatient and hit the 'B' key a second time, you will end up toggling the display off so you'll be back where you started. Just hit the 'B' key once.

Please let me know that you get this working.

-Kirk




#4 owl   Banned   -  Reputation: 364

Like
0Likes
Like

Posted 06 February 2008 - 04:42 AM

Alright, now I see it play. What does exactly the pac-man learns here? I've the feeling that you measure the fitness based on the amount of pills he eats, and that the genome is a fixed sucession of moves that leads the pacman to eat determined amount of pills. I mean, the pacman will always play the same game, isn't it?

#5 kirkd   Members   -  Reputation: 505

Like
0Likes
Like

Posted 06 February 2008 - 05:29 AM

There's a lot to your question. Here's the detail:

A neural net controls PacMan's movements through the maze and you can control the types of inputs going into the PacMan. The default version associated with the executable has no ghosts and uses a Window around PacMan with a radius of 4 tiles. This results in four tiles in each direction around him, plus the two tiles in each direction that he occupies, or 4+4+2 (4 to the right, 4 to the left, 2 for PacMan) left/right and up/down. This gives 100 total tiles. I also have the tiles that PacMan covers removed from this set, so you get 96 inputs.

Each tile can have 3 possible inputs - 1 for ghosts (-1 if a baddie, 0 if no ghost there, and 1 if a blue one), 1 for a dot/wall (-1 if there's a wall, 0 if nothing there, and 1 if a dot is there), and 1 for a pellet (0 for no pellet, 1 for a pellet). The default version you have has no ghosts (you can control the number of ghosts in the input file), so you have dots/walls and pellets as inputs for each tile. The grand total is now 96 * 2 or 192 inputs.

At each step that PacMan takes, the neural network is evaluated. The net has 4 outputs, one for each direction (up, down, left, right). The output with the highest value gives the direction that PacMan takes.

In the absence of ghosts, the game map is deterministic so yes, PacMan will take the same path when using the same neural network. The ghosts have a strong random component to their movements, so adding these will cause some random variation in PacMan's movement from run to run.

The Neural Nets are evolved using an Evolutionary Computation process. The fitness I'm using is the score that PacMan gets for one run through the maze. The time one run can take is limited (you can also control this in the parameters file). The goal is to maximize the score that PacMan gets.

I've found it most interesting to look at PacMan's navigation early in the process and compare that to what happens after it has run for a while. No surprise that early in the process PacMan's movements are more or less random - often he just drifts one direction or the other, hits a wall, and stops. Then after it has run for a while and his score is over 1000, his behavior is much more interesting. So far the maximum score I've seen (with no ghosts) is about 2600 out of a maximum of 2840.

I hope that answers your question - without putting you to sleep!

-Kirk





#6 owl   Banned   -  Reputation: 364

Like
0Likes
Like

Posted 06 February 2008 - 05:48 AM

That's pretty nice. How good does pac-man play when there are ghosts? I'm editing the .param file to see how it goes...

#7 kirkd   Members   -  Reputation: 505

Like
0Likes
Like

Posted 06 February 2008 - 06:19 AM

Thank you! Tell your friends. (Get my rating up on SourceForge!) 8^)

What I've found so far is this:

Using the combined dot/wall input I described is MUCH better than having them as separate inputs. I'm sure this is due to the fact that the network becomes much simpler and easier to deal with in a short time. I'm concerned, however, that this is a bit of cheating. Should I allow the evolution to learn this relationship on its own??

The best window radius I've found so far is 2. I've tried many sizes and this seems to perform very well with only a few inputs - 32 to be exact.

Adding the inputs for the PowerPellets and the Ghosts has not yet seemed to help much. My guess is that with either of these providing inputs, they are only turned "on" a very small amount of the time. For example, power pellet inputs are only equal to 1 when there's a pellet within PacMan's window, which is not very often. The result is that learning is very slow and limited in its impact. The same goes for ghosts.

That being said, when I have turned ghosts on and have dot/wall, power pellet, and ghost inputs, I've seen some interesting results. Many times, PacMan will just eat a power pellet and then wait for a ghost to bump into him. (The ghosts are rather stupid, so they don't avoid him like they perhaps should - I'm working to fix this.) Often, though, I've seen PacMan run back and forth in a corridor just below a power pellet waiting for a ghost to come by. When a ghost gets close, he grabs the power pellet. I've not yet seen him pursue the ghosts, though.

If you get any interesting results, let me know. If you can grab a movie of the screen (like I have done and uploaded onto SourceForge), please do. I'll load it up on the project page as well. Eventually I want to be able to write the neural nets to a file which can be loaded in again later. This would allow any interesting behaviors to be saved and shared.

-Kirk



#8 owl   Banned   -  Reputation: 364

Like
0Likes
Like

Posted 06 February 2008 - 06:23 AM

this is a pretty nice project. keep it up!

#9 kirkd   Members   -  Reputation: 505

Like
0Likes
Like

Posted 06 February 2008 - 06:27 AM

Thank you again. I truly appreciate the positive feedback.

Feel free to offer suggestions. If you have an idea of something you would like to see, let me know. I'm always open to suggestions and constructive criticism.

-Kirk


#10 Vorpy   Members   -  Reputation: 869

Like
0Likes
Like

Posted 06 February 2008 - 01:21 PM

Have you seen this research into making a ms pac man AI? I think they used simple handcrafted rules and searched for policies to apply those rules effectively. Maybe try replacing your outputs with rules and see how that works, where the network chooses which rule to apply. Of course, the big question is how to automatically build effective rules, given just the basic inputs and outputs.

#11 kirkd   Members   -  Reputation: 505

Like
0Likes
Like

Posted 07 February 2008 - 03:37 AM

I had not seen that one. There are a few other examples of learning PacMan and Ms. PacMan, but that's a new one. I briefly scanned the paper and the problem I have is that they have a set of hand coded rules from which the agent can choose. My hope was to supply raw game information without any influence on what the agent does with that information. I think their work is very good, but I would like to see more of an organic development of the rules - much as you mentioned.

I have thought about something more like Genetic Programming, but have just not had the time to implement it. Again, if the raw game information is supplied and subroutines are allowed to develop, it might serve as an organic rule development process. Then GP could develop the rule importance process.

Interesting paper.

-Kirk



#12 owl   Banned   -  Reputation: 364

Like
0Likes
Like

Posted 07 February 2008 - 06:43 AM

I think the pac-man should know the entire map and the position of every object all the time because that's what a human player knows when playing. But I guess that'd be a lot of input to process..

#13 kirkd   Members   -  Reputation: 505

Like
0Likes
Like

Posted 07 February 2008 - 08:58 AM

Yes, I've tried that version as well. Currently the small windowed version seems to perform best (as I mentioned above), but I've tried much larger windows and a full screen input. Like you mentioned the complexity of the neural net becomes huge and not much progress is made. Perhaps if I were to let the simulation run for a few weeks it might find some interesting behavior.

I've also tried a non-tile based input scheme. In the parameters file you can switch to a Global type in which there are only about 38 inputs maximum - PacMan X and Y, Ghost X and Y, Ghost state, Power pellet X and Y, Power Pellet state, the direction having the most dots, the number of dots remaining in each direction, walls. It also does reasonably well but I'm convinced the inputs aren't very good and could be greatly improved.

Right now I'm running the small window version (radius = 2) with 4 ghosts and inputs for ghosts, dots/walls, power pellets (96 total inputs). After running for almost 24 hours the score is around 1300. The most interesting behavior is PacMan follows a similar path for each run but if a ghost is in the way, he'll deviate that path a bit. He also hesitates to eat a Power Pellet until a ghost is nearby. He still doesn't pursue blue ghosts.

Any ideas for other input schemes?

-Kirk



#14 owl   Banned   -  Reputation: 364

Like
0Likes
Like

Posted 07 February 2008 - 04:28 PM

How are you calculating the "direction with the most dots"? Maybe giving scores to the tiles based on if there is a pill, a wall, emptiness or a ghost, dividing the map in sectors and taking into account the distance of pac.man to those areas could be a way to doit. You could score the tiles also with the times the pac-man passed through (say, negatively), so if the pac-man got stuck in some area, the function should make the pacman change the direction that's making it be stuck.

#15 tok_junior   Members   -  Reputation: 229

Like
0Likes
Like

Posted 07 February 2008 - 07:48 PM

I left it running overnight with all inputs enabled, 4 ghosts and a window-radius of 2. I achieved a best fitness of 3770, but more than 18000 generations without improvement. No wonder, since it moved to the upper left corner and got stuck.

#16 kirkd   Members   -  Reputation: 505

Like
0Likes
Like

Posted 08 February 2008 - 04:38 AM

Quote:
Original post by owl
How are you calculating the "direction with the most dots"? Maybe giving scores to the tiles based on if there is a pill, a wall, emptiness or a ghost, dividing the map in sectors and taking into account the distance of pac.man to those areas could be a way to doit. You could score the tiles also with the times the pac-man passed through (say, negatively), so if the pac-man got stuck in some area, the function should make the pacman change the direction that's making it be stuck.




For the Global version of the AI (non-windowed AI type), I have the following inputs:

CheckForWalls gives 4 inputs, one for each direction around PacMan. The input is 1 if there is a wall there, 0 if PacMan can move that direction

CheckDots gives 4 inputs (1 for each direction around PacMan), each with the total number of dots on that side of PacMan's current position. For this I look at the entire maze and count the number of remaining dots. My hope was that he would gravitate toward regions of high dot density.

MaxDotDirection gives 4 inputs, 1 of which will be set to 1 (others 0) corresponding to the direction in which most dots are present.

GhostX, Y and State - 3 inputs for each ghost

PowerUpX, Y and State - 3 inputs for each power pellet

PacManX, Y - hopefully to give a relative position to the ghosts and power pellets


Regarding PacMan getting stuck, I have in place a stall out timer which will stop the simulation early if his position doesn't change for a certain number of clock ticks. I also have another timer that will end the simulation early if his score does not increase for a certain number of ticks. This should prevent him getting stuck for too long in cycles. You can adjust both of these in the parameters file as well.

Scoring the tile based on PacMan passing through them is interesting. Essentially creating a deterrence to follow the same path multiple times. I'll have to think about this one and the scoring/sectoring idea. Hmmmm....

-Kirk




#17 kirkd   Members   -  Reputation: 505

Like
0Likes
Like

Posted 08 February 2008 - 04:43 AM

Quote:
Original post by tok_junior
I left it running overnight with all inputs enabled, 4 ghosts and a window-radius of 2. I achieved a best fitness of 3770, but more than 18000 generations without improvement. No wonder, since it moved to the upper left corner and got stuck.


3770 is quite good! Given that he drift to the upper left and gets stuck I'll bet that he goes to the Power Pellet, waits for the ghosts to come along, gets the Pellet, and then waits for ghosts to bump into him. I've seen this happen a lot.

One of the problems is with the ghost AI. I currently have it set up such that every time a ghost hits an intersection it has a random probability of moving toward PacMan or in a random direction. For the 4 ghosts the probability of moving toward PacMan is 20%, 40%, 60%, and 80%. When they are vulnerable (blue) they have the same probabilities of moving away from PacMan or in a random direction. This causes the first couple of ghost to be quite stupid and bump into PacMan a lot. I changed this in the code to be 50%, 60%, 70%, and 80% and this helps a little. I will probably put these probabilities into the parameters file, too, allowing users access to the ghost movements.

-Kirk



#18 kirkd   Members   -  Reputation: 505

Like
0Likes
Like

Posted 08 February 2008 - 05:44 AM

I just had an idea for a way to represent the game - please offer feedback if you have any.

Ideally, I would like to represent the entire game board. This provides all the information all the time, much like someone mentioned in a previous post. The current Window AI type only represents a small portion of the game board at a time so any planning or reaction is limited to the current window of interest. The upside of this method is that the number of inputs are relatively small and, most importantly I think, the world is represented relative to PacMan's location. The Global representation (PacMan X, Y, Ghost X, Y, and state, Pellet X, Y, and state) packs information about the entire game board into only 38 inputs, but the content is limited and it isn't truly relative to PacMan himself. Actually, PacMan's location is part of the input, so he would have to learn that this represents him.

So what I'm looking for is a compact set of inputs that can represent the entire game board relative to PacMan himself. How about a set of vectors? For each ghost, for example, we could have the angle and distance to the ghost from PacMan's current location. Same for the Power Pellets. Dots could be represented in the current window format (seems to work well and allows the inclusion of walls) or a vector from PacMan to the centroid of all current dots - I'd want to experiment with this one a bit. I've also struggled with the inclusion of fruit in the game and how to represent it. The windowed version wouldn't be very useful for this as it will represent another set of inputs that are set to zero the vast majority of the time (due to fruit only being present intermittently and likely not being in the window at all if it were present).

Comments??



#19 owl   Banned   -  Reputation: 364

Like
0Likes
Like

Posted 08 February 2008 - 08:12 AM

here is similiar project for you to look at for ideas.

#20 tok_junior   Members   -  Reputation: 229

Like
0Likes
Like

Posted 11 February 2008 - 09:07 PM

Quote:
Original post by kirkd

3770 is quite good! Given that he drift to the upper left and gets stuck I'll bet that he goes to the Power Pellet, waits for the ghosts to come along, gets the Pellet, and then waits for ghosts to bump into him. I've seen this happen a lot.

One of the problems is with the ghost AI. I currently have it set up such that every time a ghost hits an intersection it has a random probability of moving toward PacMan or in a random direction. For the 4 ghosts the probability of moving toward PacMan is 20%, 40%, 60%, and 80%. When they are vulnerable (blue) they have the same probabilities of moving away from PacMan or in a random direction. This causes the first couple of ghost to be quite stupid and bump into PacMan a lot. I changed this in the code to be 50%, 60%, 70%, and 80% and this helps a little. I will probably put these probabilities into the parameters file, too, allowing users access to the ghost movements.

-Kirk


Actually i got stuck just to the right of the pellet. When a ghost got near moving in Pacmans direction he started moving towards the dot, but a bit too late, so the ghost got him. And he repeated the same every time I saw it.




Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.



PARTNERS