Jump to content

  • Log In with Google      Sign In   
  • Create Account


Like
0Likes
Dislike

Application of Genetic Programming to the Snake Game

By Tobin Ehlis | Published Aug 09 2000 06:14 PM in Artificial Intelligence

snake function food set game runs board initial danger
If you find this article contains errors or problems rendering it unreadable (missing images or files, mangled code, improper text formatting, etc) please contact the editor so corrections can be made. Thank you for helping us improve this resource

Abstract: This paper describes the evolution of a genetic program to optimize a problem featuring task prioritization in a dynamic, randomly updated environment. The specific problem approached is the "snake game" in which a snake confined to a rectangular board attempts to avoid the walls and its own body while eating pieces of food. The problem is particularly interesting because as the snake eats the food, its body grows, causing the space through which the snake can navigate to become more confined. Furthermore, with each piece of food eaten, a new piece of food is generated in a random location in the playing field, adding an element of uncertainty to the program. This paper will focus on the development and analysis of a successful function set that will allow the evolution of a genetic program that causes the snake to eat the maximum possible pieces of food.


Introduction and Overview

Artificial intelligence (AI) techniques have been proven highly successful at the problems of navigation, task prioritization, and problem avoidance. Traditionally, humans have encoded rule-based AIs to create the behaviors necessary to allow an automaton to achieve a specific task or set of tasks. Genetic programming (GP), however, has been proven to allow a computer to create human-competitive results. Specifically, examples such as the wall-following robot (Koza 1992) and Pac Man® (Koza 1992) demonstrate the effectiveness of GP at evolving programs capable of navigation and task prioritization behaviors which are competitive with human-produced results.

In an original approach to demonstrating the effectiveness of GP at producing human-competitive results, this paper describes the evolution of a genetic program that can successfully achieve the maximum possible score in the "snake game." The problem posed by the snake game is of particular interest for two main reasons. First, the size and shape of the area through which the main game character, the snake, can move is constantly changing as the game progresses. Second, as the snake eats the single available piece of food on the game board, a new piece is generated in a random location. Because of these two factors, the snake game presents a unique challenge in the developing of a function and terminal set to allow GP to evolve an optimal solution that is generalized for successive runs of the snake game.

The "Background" section of this paper outlines the rules and discusses the specific details of the "snake game." Next, "Statement of the Problem" explains the problem being addressed by this paper. The "Methods" section provides the GP specifics of how the problem was approached. The "Results" section gives numerous examples of results produced by the GP runs along with a discussion and analysis of those results. The "Conclusion" section summarizes the ultimate results achieved by the paper. The "Future Work" section discusses potential for further study in line with the work discussed in this paper. Finally, the "References" section provides a bibliography for the paper.


Background

The "snake game" has been in existence for over a decade and seen incarnations on nearly every popular computing platform. The game begins with a snake having a fixed number of body segments confined to a rectangular board. With each time step that passes, the snake can either change direction to the right or left, or move forward. Hence the snake is always moving. Within the game board there is always one piece of food available. If the snake is able to maneuver its head onto the food, its tail will then grow by a single body segment and another piece of food will randomly appear in an open portion of the game board during the next time step. The game ends when the snake’s head advances into a game square that is filled with either a snake body segment, or a section of the wall surrounding the game board. From a task prioritization standpoint, then, the snake’s primary goal is to avoid running into an occupied square. To the extent that this first priority is being achieved, its second priority is to pursue the food.

Attached Image: image1.gif

The version of the game used for this paper, shown in figure 1, is a replica of the game as it currently exists on Nokia cell phones. In this version, which is available for play online at www.nokia.com/snake, the game board is made up of 220 total squares, 20 horizontal and 11 vertical, and the food begins in position (11,6) on the game board, represented by a diamond in the figure. The snake is initially made up of 9 body segments, occupying positions (1,11)-(9,11) on the board, with the head in position (9,11) and the snake moving to the right, represented by the arrow in the figure. The maximum number of pieces of food that can be eaten is the size of the game board minus the initial size of the snake. With the given parameters, then, this equates to 220-9=211 pieces of food. This is because with each piece of food eaten, the snake grows by a body segment, reducing the amount of free space in which it can move. Hence when it has eaten 211 pieces of food, its body will fill the entire game board, rendering any further movement impossible. One critical piece of information is whether or not it is even possible for the snake to eat the maximum amount of food. Indeed it is conceivable that after eating a certain amount of food, the snake will have grown so large that it restricts itself from access to a portion of the board. Upon close inspection, however, the reader will note that by tracing certain patterns repeatedly over the board, it is possible for the snake to cover every square exactly once and return to its initial position. One such pattern is shown in figure 2, which features a snake of 210 body segments about to eat the final piece of food. Hence by continually tracing the pattern shown, the snake can eat the maximum possible pieces of food.

In evolving a genetic program to successfully eat the maximum amount of food, a human competitive solution, in terms of score, will have been obtained. With that in mind, there are some important differences in the game when being played by a human as opposed to a computer-generated program.

Attached Image: image2.gif

For a human player, the fact that the snake is always moving adds an element of pressure, forcing him/her to make decisions in a timely manner. When using a computer to play the game, this is not a concern, as the computer will have the time between each move to parse through a program tree and determine the next move. The nearest equivalent to "pressure" for a computer is any limitation imposed on the size and depth of the genetic program’s function tree. These limitations restrict the possible number of decision trees that can be generated, thereby ensuring that the computer will have a finite amount of time in which to determine the next move for the snake. The particular function tree limitations imposed for this problem will be discussed in the following "methods" section.


Statement of the Problem

The fundamental problem of the snake game is to eat the maximum number of food pieces before "dying" by running into either a wall or a segment of the snake’s body. The problem being addressed in this paper is to provide a function and terminal set that will allow for the evolution of a GP that will maximize the number of food pieces eaten by the snake. The maximum goal for the particular configuration of the snake game used in this paper is 211 pieces of food.


Methods

Table 1 provides the tableau for the initial runs of the snake game. Following over twenty initial runs of the program, the maximum score that had been achieved was 123 hits. As it was apparent that a maximum solution would not be obtained using the initial function set, the function set was expanded to enhance the snake’s movement and environment sensing capabilities. For the remainder of the paper, any GP runs performed with the function and terminal sets given in Table 1 will be referred to as a run made with the "initial" function set. Any run made with the enhanced function set, which includes the complete initial function set as a subset, will be referred to as having been made with the "final" function set. A discussion of both the initial and final function sets follows.

Table 1. Tableau for Snake-Game Problem

<table width="543" border="1" cellpadding="7" cellspacing="1"><tbody><tr valign="top"><td width="23%">Objective:</td><td width="77%">Find a computer program that eats the maximum possible pieces of food.</td></tr><tr valign="top"><td>Terminal set:</td><td>(forward), (left), (right)</td></tr><tr valign="top"><td>Function set:</td><td>ifFoodAhead, ifDangerAhead, ifDangerRight, ifDangerLeft, progn2</td></tr><tr valign="top"><td>Fitness cases:</td><td>One fitness case.</td></tr><tr valign="top"><td>Raw Fitness:</td><td>Pieces of food eaten.</td></tr><tr valign="top"><td>Standardized fitness:</td><td>Maximum possible pieces of food eaten (211) minus the raw fitness.</td></tr><tr valign="top"><td>Hits:</td><td>Total pieces of food eaten during a run of the program, same as raw fitness.</td></tr><tr valign="top"><td>Wrapper:</td><td>None.</td></tr><tr valign="top"><td>Parameters:</td><td>M = 10000. G = 500.</td></tr><tr valign="top"><td>Success predicate:</td><td>A program scores 211 hits.</td></tr></tbody></table> Terminals: The terminal set chosen for the problem was right, left, and forward. Each terminal was a macro that would cause the snake to take the corresponding action during a time step as follows:

Right: the snake would change its current direction, making a move to the right

Left: the snake would change its current direction, making a move to the left

Forward: the snake would maintain its current direction, and move forward. This is the same as a no-op, as the snake must make a move during each time step.

These three terminals represent the minimal terminal set with which the snake can effectively navigate its surroundings. While some problems consisting of navigation in a two-dimensional grid can be successfully navigated by way of only one direction changing terminal, that is impractical for the snake game because the facts that the game board is enclosed and that the snake has an extended body that is impassible necessitate the ability for the snake to move in either direction in order to avoid death. More advance terminals, such as moving the snake along the shortest path to the food, were not implemented. Rather, the function set was constructed in such a manner that the GP could evolve the necessary capabilities to achieve the maximum score.

Functions: Initially the snake was given very limited functionality. One function gave it information about the location of the food, three other functions gave it information about any immediately accessible danger, and progn2 was provided as connective "glue" to allow a function tree to make multiple moves in a single pass. All functions were implemented as macros of arity two, and therefore would only execute one of their arguments depending on the current state of the game, except for progn2, which would execute both of its arguments. Even though no expressions evolved from this initial function and terminal set were able to achieve the optimum score of 211 pieces of food, this set served as a baseline by which to evaluate progress and determine enhancements the would lead to the eventual optimal solution. Following is a description of the initial function set:

ifFoodAhead: If there is food in line with the snake’s current direction, this function will execute its first argument, otherwise it will execute the second argument. This was the only initial function that gave the snake information beyond its immediate surroundings.

ifDangerAhead: If the game square immediately in front of the snake is occupied with either a snake body segment or the wall, this function will execute its first argument, otherwise it will execute its second argument.

ifDangerRight: If the game square immediately to the right of the snake is occupied with either a snake body segment or the wall, this function will execute its first argument, otherwise it will execute its second argument.

ifDangerLeft: If the game square immediately to the left of the snake is occupied with either a snake body segment or the wall, this function will execute its first argument, otherwise it will execute its second argument.

progn2: This is a connectivity function that will first execute its right argument, then its left. It is the only function that allows execution of more than one terminal in a single parse of the function tree. Although this function will always execute both of its arguments, it was necessary to implement it as a macro because of the way that the software used to make GP runs, Dave’s Genetic Programming in C (DGPC), evaluated functions vs. macros. To avoid unnecessary modification of DGPC, implementing progn2 as a macro proved the simplest option.

As mentioned previously, no GP runs performed with the initial function set were able to score greater than 123 hits. In order to increase the probability of evolving a function tree capable of achieving the maximum number of hits, the initial function set was enhanced. Functions were added to extend the snake’s capabilities for detecting food and danger, as well functions that were conditional on the snake’s current movement direction. Following is a discussion of the additional functions that, along with the initial function set, make up the final function set.

Additional Functions, all of arity 2:

ifDangerTwoAhead: If the game square two spaces immediately in front of the snake is occupied by either the wall or a segment of the snake’s body, this function will execute the first parameter, otherwise it will execute the second.

ifFoodUp: If the current piece of food on the board is closer to the top of the game board than the snake’s head, then the first parameter of this function will be executed, otherwise the second parameter will be executed.

ifFoodRight: If the current piece of food on the board is further to the right of the game board than the snake’s head, then the first parameter of this function will be executed, otherwise the second parameter will be executed.

ifMovingRight: If the snake is moving right, then the first parameter of this function will be executed, otherwise the second parameter will be executed.

ifMovingLeft: If the snake is moving left, then the first parameter of this function will be executed, otherwise the second parameter will be executed.

ifMovingUp: If the snake is moving upward, then the first parameter of this function will be executed, otherwise the second parameter will be executed.

ifMovingDown: If the snake is moving downward, then the first parameter of this function will be executed, otherwise the second parameter will be executed.

There are two characteristics of the final function set that should be given special attention. First, note that the "ifFoodUp" and "ifFoodRight" functions are direction independent, meaning that the direction in which the snake is moving has no impact on the function’s behavior. This is in contrast to the initial set of functions, such as "ifDangerAhead", in which the direction that the snake was traveling would have an impact on the return value of the function. The reason for the difference is to maintain simplicity in the function set. The snake can potentially be surrounded by danger, but there will only be one piece of food on the board at any one time. If the "ifDanger*" functions were direction-independent, then two significant complexities would be added to the problem.
  • An additional function would be required, as there would need to be one for all cardinal directions in order to account for all possible surrounding dangers. An added downfall of this complexity is that one of the "ifDanger*" functions will be virtually meaningless depending on the direction of snake’s travel, since the snake’s neck segment adjacent to the snake’s head is always an adjacent danger, although not one of any consequence to the snake, since it is unable to move back on itself.
  • Anytime an "ifDanger*" function was used, it would need the aid of a helper function, such as the new "ifMoving*" functions in order to make intelligent moves based on an assessment of the danger.
Taking the second complexity into account, the reader may now note that the same disadvantage is true of the two new functions, "ifFoodUp" and "ifFoodRight." Indeed this is true, but an important difference between the role of food and the role of danger in the game makes for a worthwhile tradeoff. The difference is that there will only be one piece of food on the board at any time. This allows the new "ifFood*" functions to serve as two functions each. To clarify, consider the ifFoodUp function. When not true, it is indicating that the food is either down, or on the same horizontal plane as the snake’s head. Now consider a hypothetical "ifDangerUp" function. If this function were not true, it would tell nothing about whether or not danger is down, because it can be anywhere simultaneously. Likewise is would not even tell whether existing danger that was "up" posed a immediate threat to the snake, as the further information of the snake’s current moving direction would need to be known, as discussed earlier. For the second special characteristic of the new functions, consider the new "ifMoving*" functions. These functions can be used as helper functions with the two new "ifFood*" functions to create beneficial schemata.

As an example of a beneficial schemata, consider "ifFoodUp(ifMovingRight(left, ifMovingUp(fwd, right))))", which will orient the snake to pursue food that is upward. As will be seen in the results section, not only does the GP learn how to use these functions in conjunction with the two new "ifFood*" functions, but they also prove useful in helping the snake discover patterns that greatly extend its life. Discussion of other schemata is given below in the description of schemata, and specific examples are given in the "Results" section.

Fitness Cases: For initial runs of the problem, only a single fitness case was used to determine the fitness for each individual. Because the food placement is random both during a single run, and from one run to another, occasionally individuals would score a number of hits because of fortuitous placement of the food, and not as much on the merit of their function tree.

To better ensure that the most successful individuals achieved high fitness measures primarily on the basis of their function tree, new GP runs were often made featuring a "primed" population in which the fitness was measured as the average of four runs of an individual. The procedure for this is as follows: once a run had completed without obtaining a solution, or if a run had stalled on a single individual for a large number (100 or more) of generations, a new run was begun with this final individual as one of the initial individuals. For this new run, however, the fitness was taken as the average fitness of an individual over four runs instead of merely a single run. The averaging of the fitness over four runs helped eliminate the possibility of an individual having a high fitness due simply to lucky placement of the food. Using this averaging method to determine fitness was only used in primed populations because it increased the time of a GP run fourfold. Furthermore, it was common for the generations that timed out to feature an individual who had scored a high fitness as a result of a lucky run. By beginning a new run with this individual in the initial population, it not only assured a more realistic fitness measure, but it introduced an entirely new mix of randomly generated schemata that could potentially benefit the stalled individual. Details of results produced by primed runs are given in the results section.

Fitness Measure: The fitness measure used is the maximum possible pieces of food eaten, 211, minus the actual number of pieces of food eaten. Furthermore, if the snake was unsuccessful at eating any food the fitness would be penalized by the number of board squares that it was from the food. This additional criterion was added to favor individuals who moved toward the food in early generations of snakes who were unable eat any food.

Parameters: Population was set to 10000. The maximum number of generations was set to 500. The size of a function tree was limited to 150 points. These parameters were chosen mainly based on available computer resources, covered in computer equipment and run-time explanation below.

Designating a result and criterion for terminating a run: The best of generation individual will be the one that is able to eat the most pieces of food. A run will end when one of three termination criteria are met:
  • The snake runs into a section of the game board occupied by a wall
  • The snake runs into a section of the game board occupied by a segment of the snake’s body
  • The number of moves made by the snake exceeds a set limit. This limit was set to 300, slightly larger than the size of the game board. This will prevent a snake from wandering aimlessly around a small portion of the board.
The reader may note that there is no termination criterion for the completely successful snake. That is because upon eating the final piece of food, the snake’s tail will grow onto its head, causing it to satisfy termination criteria 2 above. Hence even the optimal solution will end in death for the snake.

Crossover, mutation rates: Crossover of nodes was the primary genetic operator employed during the GP runs. The crossover fraction for leaves was set to .10; the crossover fraction for a node was set to .80; the mutation fraction was set to 0. Additionally, primed GP runs were used to improve genetic diversity, as described above in the description of fitness cases.

Computer equipment and run time: The majority of the computer runs were performed on a 550MHz Intel® Celeron Processor running Microsoft® Windows 98 SE Operating System. The software used was Version 2.0 of Dave’s Genetic Programming In C, and Microsoft® Visual C++ 5.0. In addition, a stand-alone simulation of the snake game was created that was able to read in the function trees produced by DGPC and graphically display a run of a particular function tree. This utility proved invaluable, as it provided a fast, visual method to determine the overall optimization strategy represented by the function tree. The alternative of hand-evaluating each function tree would have proven not only more time consuming, but much less conclusive. A complete run of 500 generations took around 20 hours to complete. Because of the length of time for each run, many runs were farmed out to separate computers, all with approximately equivalent computer power.

Schemata: Given the initial function set, there were a few highly desirable sub-tree schemata that could be produced. First, considering a minimal sub-tree of 3 points, any sub-tree that would evade impending danger by changing directions is certainly the key to survival of an individual. One such sub-tree is "ifDangerAhead(right, forward)." Secondly, a basic sub-tree that will avoid changing directions into impending danger is solely beneficial to an individual. One example is "ifDangerRight(forward, right)." The reader will note that anytime a change in direction is about to be undertaken, it would be wise to have such a check before making the move. Thirdly, a 3-pointed sub-tree that aims at pursuing the food, and modifying directions if no food is ahead, is required to give the individual more than a random opportunity to eat the food pieces. One such individual is "ifFoodAhead(forward, right)."

As explained previously, the "ifFoodAhead" function will return true for a piece of food any number of squares in front of the snake. Therefore, in addition to seeking the food, it would also be desirable for the individual to continually scan for impending danger while the food is being sought. Hence a final example of a desirable schemata is any combination of the above three examples that effectively combines the goals of each. For example, consider the following function tree of 7 points: ifFoodAhead(ifDangerAhead(right, forward), ifDangerRight(left, right)). This schema will cause the snake to pursue food ahead as long as no immediate threat is observed. If however, there is a threat or no food ahead, the sub-tree will cause the individual to change direction avoiding any observed danger, or pursuing a new vector to find food. Specific examples of the emergence of such schemata will be given in the results section.

In addition to the potential beneficial schemata, touched on above, there are also "detrimental" schemata. The detrimental schemata would be any function branch whose primary goal is to either seek danger or avoid food. Examples of detrimental schemata are essentially the converse of the previously outlined beneficial schemata, and their further consideration is left to the reader.

Certainly all schemata are not strictly beneficial or detrimental, and any such schemata will be called "neutral schemata." Consider, for example, the simple subtree "ifDangerRight(left, forward)." This function will turn left if danger is present to the right, and continue forward otherwise. This schema makes either a left or forward move without having any apparent knowledge of what lies in those directions. This could certainly prove to be detrimental, but the move to the left when danger is right is at least avoiding the danger to the right. Schemata such as this can actually prove beneficial when placed in the context of a complete function tree. An examination of actual schemata produced during the GP runs in question follows in the results section.


Results

As mentioned in the methods section, there were three types of GP runs made in an attempt to evolve a solution to the snake game: runs using the initial function set, the final function set, and primed runs, also using the final function set. The highest number of hits generated by a run using the initial function set was 123. Three separate solutions were generated using the final function set, although none of them were found to consistently generate a solution. The number of hits achieved by each solution depended on the placement of the food. It was not until the method of "priming" a run, described in the methods section, was used that a consistent solution was generated. Of ten primed runs, using various initial seeds, exactly five of them evolved a solution, all of which were consistent solutions over multiple runs. Comparatively, over twenty runs using the full function set were made, and only three of them produced solutions, none consistent.

Attached Image: Image3.gif

A summary of the overall results achieved in each type of run is given in figure 3. Each line on the graph is the average of ten runs. Note that the initial and final function sets produce a roughly equivalent maximum number of hits until about generation fifty. At this point the final function set continues to improve while the initial function set levels off. By generation 200, the initial function set has virtually no improvement, while the final function set continues improving past generation 400. Because the final function set is both more complete and larger, new and more successful individuals continue to evolve while individuals produced by the initial function set max out around 100 hits. Another feature to note in figure 3 is the impressive results achieved by the primed runs. All primed runs were begun with an individual from a final function set run who had achieved at least 150 hits. When taken as the average over four runs, however, these individuals are only able to achieve about 50 hits, as shown in the first generation of the primed runs. These individuals jumpstart the population to great success, and by generation 25 the maximum number of hits has more than tripled to around 160. By generation 150 the primed runs level off to about 200 hits. Following is an evaluation of some of the most prominent strategies evolved during the various GP runs. Specific examples of individuals from each type of run are presented and analyzed, and all function trees are reduced for simplicity sake.

Zig-zagger: One strategy that was prevalent in individuals across multiple runs is what will be referred to as the "zig-zagger." These individuals would trace the board diagonally in a stair-stepper pattern until they either reached a wall or had lined the direction of their movement up with the food. Upon reaching a wall, they would change their direction as if bouncing off of the wall, and continue diagonally tracing the board in a new direction. If they were successful in aligning their movement with the piece of food, they would typically head directly toward the food, perhaps avoiding danger depending on the particular individual. Variations in zig-zaggers occurred between which directions they would head when hitting the wall, how often they would seek the food, and how they would react in enclosed situations, such as corners or heading towards food that was blocked by their body. Obviously the more successful individuals evolved traits that allowed them to avoid danger in close quarters and dodge their body when it blocked progress toward the food. One example of a zig-zagger, who was able to score a maximum of 33 hits in one particular run, is given below:

(ifFoodAhead (ifDangerLeft (right )(ifDangerRight (1forward )(ifDangerAhead (left )( 1forward ))))
   (ifDangerAhead (ifDangerLeft (right )(left ))
      (ifDangerLeft (ifDangerRight (forward )( 2right ))
         (progn2 (left )(right )))))


Consider initially the rightmost sub-tree of the function tree, which is given on the last line as progn2 (left)(right). This is the branch executed initially and for the majority of this zig-zagger’s run. When executed repeatedly, this sub-tree will cause the snake to move left then right, progressing diagonally across the board. For this example, the sub-tree is executed whenever there is no food ahead of the snake’s line of movement, and there is no danger in front of or to the left of the snake’s head. This continuous zig-zagging motion allows the snake to examine successive rows or columns of the board in search of the food. Because both branches of the progn2 are executed before returning to the beginning of the function tree, however, the snake will only detect the food if the second argument of the progn2, right, leaves the snake’s head in line with the food.

Attached Image: image4.gif

Once the food is directly in line with the movement of the snake’s head, the left-hand sub-tree, given on the first line above, is executed. As noted with a "1" above, the snake will continue forward if there is no danger to the left, and either there is danger to the right, or there is no danger to the right or ahead. Unfortunately for the snake, if there is no danger to the left, but danger to the right and ahead, this function tree will lead it directly into the danger ahead, noted with the first "1" above. This is exactly what happened to the snake in figure 4, shown one time step before its demise after having eaten 24 pieces of food. This snake, whose head is in (14,5), began moving towards the food in (4,5) after having released from the wall.

Finally, note that when the "right" portion of the "progn2" sub-tree causes the snake to be either facing or next to a wall, the sub-trees on the second and third line above will be executed respectively. Further investigation reveals that each of these sub-trees will cause the snake to move away from the wall in a direction that avoids danger, even in corners. In this fashion, the snake appears to "bounce" from the walls and proceed to zig-zag in an alternate direction. Two examples of this are seen at positions (17,1) and (20,5) of figure 4. In both of these cases the snake made the right turn noted with a "2" above in order to avoid the wall.

Wall-slitherer: The strategy that scored the highest out of all individuals using the initial function and terminal set is what will be referred to as the "wall-slitherer." These individuals would follow along the wall, not simply moving forward, but rather slithering back and forth between the two squares closest to the wall. Once able to align its head with the food, the individual would move away from the wall in a straight line to obtain the food. Than, when the food was eaten, successful wall-slitherers would either double-back along their own body and head for the wall or head in a random direction toward a wall. Variations on wall-slitherers occurred in the direction they would take around the wall and when they would leave the wall to pursue the food. One highly successful wall-slitherer is shown below. This individual scored a maximum of 107 hits in one particular run, and an evaluation of its important characteristics follows:

(ifFoodAhead (ifDangerAhead (left )(forward ))
   (1ifDangerAhead (ifDangerRight (left )(progn2 (right )
         (ifFoodAhead (ifDangerRight (forward )(right )))(ifDangerRight (forward )(right ))))
      (2ifDangerRight (ifDangerLeft (forward )(left ))
         (3ifDangerLeft (right )
            (4progn2 (left )(ifFoodAhead (ifDangerLeft (right )(left ))(ifDangerRight (left )
                  ( 5progn2 (ifDangerAhead (right )(ifDangerLeft (right )(left )))
                  (ifDangerRight (forward )(right ))))))))))


In evaluating this individual, first consider the root, which consists of the "ifFoodAhead" function. For any case in which there is food ahead, the very simple left sub-tree is executed. This subtree simply checks for danger ahead and attempts to avoid it to the left if present, otherwise the snake will continue along its current movement path towards the food. While this sub-tree proves both simple and effective, the fact is clear that the individual spends the majority of its run without the food immediately ahead, which is handled by the much larger right-hand sub-tree.

While it appears much more complicated than the left-hand sub-tree, the fundamental strategy of the right-hand sub-tree is to avoid danger. This strategy is executed impressively by the three different "ifDanger*" functions, noted with 1, 2, and 3. These functions provide the roots for the three sub-trees along the right-hand side of the main function tree. The reader can verify that each of these three sub-trees contains schemata that are highly effective at avoiding any impending danger to the snake. Having already taken precautions to pursue food and avoid danger, the final sub-tree provides the snake with its wall-slithering motion, in which it spends the majority of its time.

The final sub-tree, noted with a "4" above, is rooted with a progn2. This indicates that multiple actions will be carried out every time this sub-tree is reached, which proves to be very frequently. Initially the branch will make a move to the left, which is already known to be safe. Following this move, if there is no food ahead and no food to the right, then the second progn2, noted with a "5", is reached, making for a total of three moves to be executed on this single pass of the function tree. This three-move sequence is both common and highly beneficial to the success of this wall-slitherer. In figure 5, note the snake’s body segment at (19,8). At this point in the past, the snake was facing downward and a new parse of the function tree was beginning. As no danger was immediately present, sub-tree 4 was reached and the snake turned left towards the wall. Needing to complete the second argument of the progn2, and with no food ahead or danger to the right, the progn2 at 5 was reached, which caused the snake to turn right twice, leading to the next parse of the function tree. Looking back in history through the illustration, note that the same pattern was carried out at points (19,6), (19,4), (19,2), and numerous other times in the brief potion of the snake’s run demonstrated here. This repeated slithering pattern served to maximize the amount of ground covered by the snake while minimizing the danger that its body would pose to itself. A time step prior to a fatal flaw in the snake’s movement, however, is illustrated.

Attached Image: image5.gif

As the food is in front of the snake’s head, the simple sub-tree on the left is entered. Since there is danger ahead of the snake, it will simply turn left. As shown in the illustration, this turn will lead to the snake’s death, as it hits it’s own body after having eaten 61 pieces of food. While it may seem surprising that this flaw in the left-hand function tree was not encountered sooner, the snake survives by keeping its body along the walls as much as possible. In the illustration, it is clear that the snake left the wall 57 time steps earlier in order to pursue a piece of food across the board. Once the food was eaten, the snake resumed its slither pattern clockwise around the edge of the board. Unfortunately, its body had grown so long that by the time its head was in line with the food at position (19,9) its body was still blocking its path to the food. The snakes evasion tactic of going to the left when danger is encountered with the food ahead had saved in previous similar situations because once a single successful left was made, the snake was no longer in line with the food and it would continue any necessary evasive maneuvers via the much more robust sub-trees, 1, 2, and 3. In this final, fatal case, however, the combination of the snake’s long body, its previous cross-board pursuit of the food, and the placement of the next piece of food three board squares off of the wall caused the evasive left to lead the snake directly into its own body.

Circler: After the function set was enhanced to include the further food-sensing capabilities of "ifFoodUp" and "ifFoodRight" as well as the four "ifMoving*" functions, a new strategy of behavior that evolved is what will be referred to as the "circler." These individuals would follow along the outside of the wall in a circular pattern and only leave the wall to get the food. Once they reached the food they would continue forward until they reached the wall, then they’d start to circle again. Typically they would only attempt to eat the food while moving in one particular direction. While similar to the wall-slitherer, they differ in two key ways. The first is that the circler will always remain directly next to the wall and not move back and forth like the wall-slitherer. The second is that the circler will typically only leave the wall while headed in one direction. Both of these differences are a direct result of the new functions. Before further discussion, consider the following circler, who scored a maximum of 80 hits over a single run:

(ifDangerAhead (ifDangerTwoAhead (ifDangerLeft (right )(left ))
   (ifFoodUp (ifDangerLeft (right )(2left ))(forward )))
      (ifMovingUp (ifDangerRight (ifFoodUp (forward )(3left ))(right ))
         (1forward )))


First note the leftmost branch of the function tree, in which the snake will primarily avoid danger to both the front and the left. Certainly the left-hand sub-tree, though simple, proves highly effective at achieving the snake’s primary goal of avoiding danger. Secondly, take note of the right-hand sub-tree, which is parsed whenever danger is not immediately ahead of the snake. If the snake is not moving upwards, it simply continues forward, which is already known to be a safe move. This proves to be the move that snake most commonly makes. If, however, the snake is moving upwards, and there is danger to the right, then it will turn left as soon as the food in no longer above it. The primary moves of this snake, then, are to continue forward around the outside of the board until either there is danger ahead and it turns left, or the snake is moving upwards and there is food to left, when it turns left. Note that these moves are marked 1, 2, 3 respectively in the function tree above. Hence when seen in action the snake will make a counterclockwise circular motion around the outside of the board with the top of the circle determined by the current piece of food.

Pattern Following Solution: As a final example of an evolved strategy, an individual that was able to score the maximum number of hits, 211, will be considered. All individuals who were able to score the maximum number of hits demonstrated some pattern similar to that shown in figure 2. All of these individuals took little to no consideration of where the food was on the board, but rather followed a set pattern that would cover the entire board, eventually causing them to eat the food. Furthermore, the pattern they followed would be continuous, meaning that their head would eventually reach its original starting position, allowing the pattern to continue indefinitely. One such pattern follower, produced in generation 27 of a "primed" run, is given below:

(ifDangerRight (ifDangerAhead (ifDangerTwoAhead (8left )(forward ))
      (ifMovingRight (6left )( 4,7forward )))
   (ifDangerAhead (ifDangerLeft (ifFoodUp (right )(right ))(ifDangerTwoAhead (left )(forward )))
      (ifMovingUp (ifDangerTwoAhead (ifFoodAhead (ifMovingRight (3right )( 9forward ))
         (ifMovingDown (left )( 2right )))
            (1progn2 (forward )(forward )))
               (ifMovingRight (right )(forward )))))


This individual followed a pattern exactly the same as that shown in figure 2. There were only a few minor deviations from the pattern that would occur during very infrequent states of the game board. Before considering any such deviations an examination of the major pattern following steps will be made.

The overall pattern followed by the individual above is as follows, with the movement steps noted by superscripts on the individual. To simplify the analysis consider that the snake has already eaten enough food to be as long as the board is high, 11 segments, and that the snake is currently moving upward with its head at position (2,10) of the board:
  • While moving upward, if there is not danger two ahead, move forward twice.
  • Once there is danger two ahead, turn right; snake now moving right one row from the top of the board.
  • Turn right again, to begin heading downward.
  • Continue moving downward until there is danger directly ahead.
  • Once there is danger ahead, turn left; snake now moving right at the bottom of the board.
  • Turn left again and return to step one until there is danger to the right of the snake.
  • Danger right indicated the final right-hand column, so the snake now moves up until danger is one ahead.
  • Once there is danger ahead, turn left to follow the top row of the board (4,7) while moving left; repeat this same step to move down the left-hand side of the board, and when the bottom of the board is reached, return to step 5.
While it is clear that by repeatedly following this pattern the snake will continually trace the whole board, causing it to eat at least one piece of food on each pass of the board, there is one notable exception from the pattern that is made whenever the food is in the top row of the board and the snake is moving upward toward it. In this rare case, when step 2 of the pattern is reached, rather than turning right, the snake will continue forward to eat the food, as noted with a "9" in the function tree. When this case occurs the snake will resume the pattern to the right following its consumption of the food. If, however, this case occurs too far to the right and the snake’s body is long enough, the snake can trap itself on the right side of the board, causing it to die. This is the only way that the way that the individual shown above will not successfully eat 211 pieces of food.


Conclusion

This paper has presented the development and evaluation of a function set capable of evolving an optimal solution to the snake game. An initial function set was presented and evaluated, but proved unsuccessful at evolving an optimal solution. The initial function set was then expanded upon to create the successful final function set, and consistently optimal solutions were generated using primed GP runs. A comparison was made of the results achieved by each function set, as well as by the primed GP runs. Examples of commonly evolved strategies were presented and evaluated, and a final analysis of a consistently successful optimal solution was given.


Future Work

The work presented in this paper provides innumerable opportunities for further investigation into the evolution of a task prioritization scheme within a dynamically changing, randomly updated environment. Specific to the snake problem, modifications can be made to create completely new and interesting problems, such as a non-rectangular game board, obstacles within the game board, or multiple pieces of food. Multiple snakes could be co-evolved to competitively pursue the food. The function set could be modified to feature enhanced detection capabilities and more advanced navigational options. The techniques used for navigating the snake could be generalized to apply to various other problems of interest. Possibilities include automated navigation of multiple robots through a crowded workspace, an automaton for tracking fleeing police suspects through harsh environments, or a control scheme for an exploratory vehicle seeking a particular goal on a harsh alien planet. The possibilities are only limited by the imagination.


References

Koza, John R. 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection. Cambridge, Massachusetts: The MIT Press.





Comments
Hi Tobin,

Well written article. I am curious though, on what actions are taken between generations. For example how many snakes are choosen (from the 10000) to go to the next round.

In particular, what happens to these snakes? Are they "mated" with other snakes? If so, how? eg the first snake's first branch is mixed with the seconds snakes second (and remaining) branch.

Also, is any mutation requiered? ie, any further randomness after the initial 10000 pool?

Any answers or references will be greatly appreciated.

Cryptic.

Note: Please offer only positive, constructive comments - we are looking to promote a positive atmosphere where collaboration is valued above all else.




PARTNERS