
Connect Four AI



#1 Jaap85   Members   -  Reputation: 241

Posted 18 March 2012 - 10:42 AM

Hello everybody,

For the past few weeks I have been studying AI, and this weekend I decided to make a start and develop my own really simple game AI for the well-known game of Connect Four (http://en.wikipedia.org/wiki/Connect_Four). The game is played on an 8x8 board.

This is my first AI for a two-player game (I created A* pathfinding for an NPC in another project, but that was something totally different).

I decided I want to learn to use the minimax algorithm. The first thing I did was write a pseudocode solution for it. It is posted below, and I hope someone will be able to give me some hints or tips on whether this code would work or not.

The algorithm is currently very simple and does the following:

- Think 4 moves ahead (2 AI moves and 2 human moves) and try to prevent the human from winning
- No heuristics are used for game states, just -1 for a human win and +1 for an AI win
- No pruning is used; all options are evaluated

Right now I am not really interested in improving the algorithm (that would be my next step); I am just curious whether this algorithm will work when I implement it in real (C#) code. I hope somebody would like to read my code and point out any mistakes, if there are any.

Finally, this is mostly a learning-by-doing thing for me. There is probably lots of code available online that is much better and can easily solve the problem, but in my opinion the best way for me to learn is to first try to build my own algorithm and only afterwards check what others did better. So links to complete solutions are not really what I am looking for.

Thank you all for reading!

// MINIMAX ALGORITHM
			// THINK 4 MOVES AHEAD
			//
			// 1. AI move
			// 2. Human move
			// 3. AI move
			// 4. Human move
			//
			// Possible different outcomes:
			// 8*8*8*8 = 4096 (minus games that finish earlier)
			//
			//  A human win scores -1
			//  An AI win scores +1
			//
			// -----------
			// PSEUDO CODE
			// -----------
			//
			//  create class GAMESTATE with properties:
			//	  int(8,8) SQUARES
			//	  GAMESTATE Parent
			//	  int column											// The move made to reach this state
			//	  int winvalue											// -1 human win, +1 AI win, 0 undecided
			//
			//  Create an array of GAMESTATE:
			//	  states(5,4096) = new GAMESTATE						// levels 0 to 4, up to 4096 states per level
			//
			//  ADD current state of board to states(0,0).
			//
			// --------------------------------------
			//  Add all possible states to the array
			// --------------------------------------
			//
			//  for x = 0 to 3											// 4 moves ahead: level x+1 is filled from level x
			//
			//	  for y = 0 to 4095										// check all states on this level
			//
			//		  if states(x,y) != Nothing and states(x,y).winvalue is not 1 or -1
			//
			//			  for z = 0 to 7								// check all columns
			//
			//				   if column z is not full
			//					  PLACE TOKEN in z
			//					  ADD STATE TO ARRAY:
			//						  states(x+1, y*8+z) = new GAMESTATE	// index y*8+z, so children of different parents never share a slot
			//							  squares(8,8) = copy of parent squares		// with for loops for each square
			//							  parent = states(x,y)
			//							  column = z
			//
			//					  if wincondition = true				// Check whether the new state is a final state (win situation)
			//						  Set winvalue of new state to +1 or -1
			//					  end if
			//
			//				   end if
			//
			//			  next
			//
			//		  end if
			//	  next
			//
			//  next
			//
			//  ----------------------------------------------------------------
			//	  Check all states from level 4 to 1 and move their values up
			//  ----------------------------------------------------------------
			//
			//
			//  for x = 4 to 1 step -1									// Start at the bottom of the tree and work up
			//
			//	  for y = 0 to 4095										// Check all states at this level
			//
			//		  if states(x,y) != Nothing
			//
			//			  if x is an even number						// Human player (minimizing) decided the move into this level
			//	
			//				  if states(x,y).winvalue < states(x,y).parent.winvalue
			//					  states(x,y).parent.winvalue = states(x,y).winvalue
			//				  end if
			//
			//			  else											// x is odd: AI player (maximizing) decided the move into this level
			//
			//				  if states(x,y).winvalue > states(x,y).parent.winvalue
			//					  states(x,y).parent.winvalue = states(x,y).winvalue
			//				  end if
			//
			//			  end if
			//
			//		  end if
			//
			//	  next
			//
			//  next
			//
			//  -------------------
			//   Determine AI move
			//  -------------------
			//
			//  int BestState = -2							  // Records the best value found so far
			//  int MoveColumn								  // Records the column that leads to the best value
			//
			//  for x = 0 to 7								  // Run through all 8 states on level 1
			//
			//	  if states(1,x) != Nothing and states(1,x).winvalue > BestState
			//		
			//		  BestState = states(1,x).winvalue		  // Update the best option value
			//		  MoveColumn = states(1,x).column		  // Update the column used to get there
			//
			//	  end if
			//
			//  next
			//
			//  return MoveColumn							  // Let's hope this is actually the best move!
			//
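In real C# code, the GAMESTATE class from the pseudocode might look something like this (just a sketch; names follow the pseudocode above):

			public class GAMESTATE
			{
				public int[,] SQUARES = new int[8, 8];	// 0 = empty, 1 = human token, 2 = AI token
				public GAMESTATE Parent;				// The state this one was reached from
				public int column;						// The move made to reach this state
				public int winvalue;					// -1 = human win, +1 = AI win, 0 = undecided
			}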


#2 Álvaro   Crossbones+   -  Reputation: 11859


Posted 18 March 2012 - 07:35 PM

Minimax is most easily implemented using recursive functions. If you adopt the convention that a positive score means good for the side to play, the code becomes much cleaner. Search the web for "negamax" and you'll find some pseudo-code. That's definitely how I would start.

#3 Jaap85   Members   -  Reputation: 241


Posted 20 May 2012 - 02:45 AM

Well, it took me some time, but I have finally created a recursive minimax algorithm (not yet negamax; I will try that later).

For reference, my entire code is below. I don't use a heuristic function yet; I just check whether a given gamestate is a win.

public AIMove RecursiveMiniMaxAI(int Depth, int MaxDepth, gamestate StateOfBoard, int Player)
		{
			//Depth is level of gamestate (initially 0)
			//Maxdepth is maximum search depth (set at creation of board)
			//StateofBoard is current gameboard
			//Player is the moving player. 1 = human, 2 = computer
			//Human Maximises, Computer Minimises
			//AI Move is the move to make
			Trace.WriteLine(Depth);
					  
			AIMove ReturnMove = new AIMove();
			//BestScore & Move
			int BestMove = -1;
			int BestScore;
			if (Player == 2) // Minimizing
				BestScore = 999;
			else
				BestScore = -999;
		  
			if (Depth == MaxDepth)
			//We are at the bottom of the search space and evaluate this level
			{
				ReturnMove.Score = EvaluateBoard(StateOfBoard, StateOfBoard.Column);
				ReturnMove.Move = StateOfBoard.Parent.Column;
			}
		  
			else
			{
				//Check for possible moves
				List<int> MoveList = new List<int>();
				MoveList = GenerateListOfMoves(StateOfBoard);
				if (MoveList.Count == 0)
				{
					ReturnMove.Score = EvaluateBoard(StateOfBoard, StateOfBoard.Column);
					ReturnMove.Move = StateOfBoard.Column;
				}
				else
				{
					if (Player == 2)  // Minimizing
					{
						//Perform minimax for each child
						for (int i = 0; i < MoveList.Count; i++)
						{
							//Make a move and create a new gamestate
							int j = GetStateNumber(Depth+1);
							AIStates[Depth + 1, j] = new gamestate(StateOfBoard, MoveList[i], size);
							//Place a token in the new Gamestate
							PlaceToken(AIStates[Depth + 1, j],MoveList[i], Player);
						  
							//Call minimax on the child
							ReturnMove = RecursiveMiniMaxAI(Depth + 1, MaxDepth, AIStates[Depth + 1, j], 1);
							//Evaluate return
							if (ReturnMove.Score < BestScore)
							{
								BestScore = ReturnMove.Score;
								BestMove = ReturnMove.Move;
							}
						}
					}
					else // Maximizing player
					{
						//Perform minimax for each child
						for (int k = 0; k < MoveList.Count; k++)
						{
							//Make a move and create a new gamestate
							int j = GetStateNumber(Depth + 1);
							AIStates[Depth + 1, j] = new gamestate(StateOfBoard, MoveList[k], size);
							//Place a token in the new Gamestate
							PlaceToken(AIStates[Depth + 1, j], MoveList[k], Player);
							//Call minimax on the child
							ReturnMove = RecursiveMiniMaxAI(Depth + 1, MaxDepth, AIStates[Depth + 1, j], 2);
							//Evaluate return
							if (ReturnMove.Score > BestScore)
							{
								BestScore = ReturnMove.Score;
								BestMove = ReturnMove.Move;
							}
						}
					}
					//The ReturnMove is updated here
					ReturnMove.Score = BestScore;
					ReturnMove.Move = BestMove;
				}
			}
			//string Data2;
			//Data2 = ReturnMove.Score.ToString();
			//Data2 += "/";
			//Data2 += ReturnMove.Move.ToString();
			//Trace.WriteLine("ReturnScore & Move: " + Data2 + " || At Depth " + Depth.ToString());
			return ReturnMove;
		}

The results I got out of this code are wrong, and I think I have managed to isolate the issue. When this piece of the function executes:

//Perform minimax for each child
						for (int i = 0; i < MoveList.Count; i++)
						{
							//Make a move and create a new gamestate
							int j = GetStateNumber(Depth+1);
							AIStates[Depth + 1, j] = new gamestate(StateOfBoard, MoveList[i], size);
							//Place a token in the new Gamestate
							PlaceToken(AIStates[Depth + 1, j],MoveList[i], Player);
						  
							//Call minimax on the child
							ReturnMove = RecursiveMiniMaxAI(Depth + 1, MaxDepth, AIStates[Depth + 1, j], 1);
							//Evaluate return
							if (ReturnMove.Score < BestScore)
							{
								BestScore = ReturnMove.Score;
								BestMove = ReturnMove.Move;
							}
						}

The for loop doesn't run multiple times. Even when MoveList.Count is 3, it only runs once: the function calls itself, and when control returns to the "Evaluate return" part, the loop doesn't continue with the next move in MoveList. The value of i is still 0, and MoveList.Count is still larger than zero.

If I remove the recursive call, the for loop does iterate as expected.

Does anybody know where I went wrong and how I could fix this?

Thank you very much in advance!

#4 Dragonsoulj   Crossbones+   -  Reputation: 2008


Posted 20 May 2012 - 02:51 AM

Sorry about the upvote. Accidentally clicked it. We'll just say it's for moving towards a solution. I'll look into this and post back here.

Edited by Dragonsoulj, 20 May 2012 - 02:53 AM.


#5 Jaap85   Members   -  Reputation: 241


Posted 20 May 2012 - 03:20 AM

Thanks!

A little update: if I change my code to the version below, it still doesn't work, but now the code never finishes (I guess it must be stuck in a loop somewhere).


						//Execute a minimax on each child
						//for (int i = 0; i < MoveList.Count; i++)
						foreach (int i in MoveList)
						{
							//Make a move and create a new gamestate
							int j = GetStateNumber(Depth+1);
							AIStates[Depth + 1, j] = new gamestate(StateOfBoard, MoveList[i], size);

							//Place a token in the new Gamestate
							PlaceToken(AIStates[Depth + 1, j],MoveList[i], Player);

							//Call minimax on the child
							ReturnMove = RecursiveMiniMaxAI(Depth + 1, MaxDepth, AIStates[Depth + 1, j], 1);

							//Evaluate return
							if (ReturnMove.Score < BestScore)
							{
								BestScore = ReturnMove.Score;
								BestMove = ReturnMove.Move;
							}
						}

Edited by Jaap85, 20 May 2012 - 11:11 AM.


#6 Storyyeller   Members   -  Reputation: 212


Posted 03 June 2012 - 12:37 AM

By the way, if you are looking for other algorithms, Connect 4 has been completely solved. You could just store a database of moves if you want.
I trust exceptions about as far as I can throw them.

#7 Álvaro   Crossbones+   -  Reputation: 11859


Posted 03 June 2012 - 06:34 AM

Storyyeller, on 03 June 2012, said:

	By the way, if you are looking for other algorithms, Connect 4 has been completely solved. You could just store a database of moves if you want.

And how do you suggest he compute that database of moves? Connect 4 [in its usual 7x6 size and a few others] was solved using minimax. Also, 8x8 Connect 4 is relatively popular and hasn't been solved yet.

#8 Jaap85   Members   -  Reputation: 241


Posted 05 June 2012 - 04:07 PM

Thanks for the suggestion, but indeed I think it might be a bit too advanced for me at this level.

Right now I am trying to understand the performance of my algorithm. When I run the minimax algorithm on a 6x6 board with search depth 4, it takes about 40 milliseconds. When I increase the search depth to 5, it takes about 950 milliseconds, and at search depth 6 the processing time grows to about 32 seconds.

Although I understand that processing time grows exponentially with the search depth, I don't understand why it grows this fast. I would expect it to grow by roughly the number of possible moves per turn, for example:

- Search depth 4 - processing time 40 ms
- Search depth 5 - expected processing time 40 + 40 * 6 = 280 ms, instead of the 935 ms measured

Does anybody have an idea why my processing time might be growing at its current rate?

PS: the total time doesn't really matter to me; I can always try to improve that later. Right now I am just trying to understand the algorithm.

#9 Álvaro   Crossbones+   -  Reputation: 11859


Posted 05 June 2012 - 09:10 PM

Instead of measuring the time of each search (although ultimately that is what matters), count how many nodes have been visited. The node count should behave as you expect. If that's not the case, you are probably doing something wrong. If your search times are growing faster than your node counts, you probably have some data structure in memory that is growing, and its performance is deteriorating with the increased size.
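
Counting nodes is a two-line change; something like this (a sketch, using a field next to your search function):

			long nodesVisited = 0;	// reset to 0 before each search

			// first statement inside the recursive search function:
			nodesVisited++;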

By the way, once you implement alpha-beta pruning with a decent move-ordering heuristic, the ratio of searching depth n+1 over depth n should get significantly smaller (ideally it would be the square root of the branching factor).
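
For reference, negamax with alpha-beta looks something like this in C# (a sketch only; GameOver, Evaluate, GenerateMoves, MakeMove and UndoMove stand in for your own routines, and scores are always from the point of view of the player to move):

			int NegamaxAlphaBeta(int depth, int alpha, int beta)
			{
				if (GameOver() || depth <= 0)
					return Evaluate();	// score for the player to move

				int best = -1000;
				foreach (int move in GenerateMoves())
				{
					MakeMove(move);
					int score = -NegamaxAlphaBeta(depth - 1, -beta, -alpha);
					UndoMove(move);

					if (score > best)
						best = score;
					if (best > alpha)
						alpha = best;
					if (alpha >= beta)
						break;			// cutoff: the opponent already has a better alternative
				}
				return best;
			}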

#10 Jaap85   Members   -  Reputation: 241


Posted 17 June 2012 - 12:01 AM

Thanks for your suggestion! This morning I implemented a node count as well, to see what is really going on in my searches. As I expected, the number of nodes grows at the expected rate (roughly a factor of 6 per extra level, matching the branching factor of a 6-wide board), but the search time grows much faster: the search speed drops from 65 nodes/ms at depth 3 to 1.7 nodes/ms at depth 6 (see the table below for details). I have absolutely no idea what causes this deterioration in search speed. Could anybody point me in the right direction? The only thing I can think of right now (after reading Álvaro's post above) is that the way in which I store the gamestates might not be optimal.

Thanks in advance for all replies!

Search depth 3 II Nodes searched 259   II Time spent 4 ms     II 65 nodes/ms
Search depth 4 II Nodes searched 1555  II Time spent 43 ms    II 36 nodes/ms
Search depth 5 II Nodes searched 9331  II Time spent 942 ms   II 10 nodes/ms
Search depth 6 II Nodes searched 55986 II Time spent 32688 ms II 1.7 nodes/ms

The gamestates are stored in an array which is created at runtime using this (not so elegant) code:

AIStates = new gamestate[SearchDepth + 1, Convert.ToInt32(Math.Pow(activeBoard.Width, SearchDepth))];

For example, with SearchDepth 4 and board width 6, this creates a [5, 6^4] array. During a search, each new gamestate is saved in AIStates under its search depth, in the first empty slot at that depth.

After each complete search, I reset all gamestates using this code:


public void ClearAIStates()
		{
			for (int x = 0; x < AIStates.GetLength(0); x++)
			{
				for (int y = 0; y < AIStates.GetLength(1); y++)
				{
					if (AIStates[x, y] != null)
					{
						AIStates[x, y] = null;
					}
					else
					{
						// States are filled from slot 0 without gaps, so the first
						// empty slot means the rest of this row is already empty.
						y = AIStates.GetLength(1); // exit the inner loop
					}
				}
			}
		}
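
(As an aside: assuming the array really is always filled from slot 0 without gaps, which the early exit above already relies on, I could probably also wipe the whole array in one call with Array.Clear(AIStates, 0, AIStates.Length), at the cost of always touching every slot.)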


#11 Álvaro   Crossbones+   -  Reputation: 11859


Posted 17 June 2012 - 05:37 AM

You don't need to use nearly as much memory as you are using. Actually, you basically only need one game state, on which you'll make and undo moves. It might be useful for you at this point to use a global game state, to guarantee you are not copying game states around. The negamax function (a version of minimax with the convention that positive score means the player to move is ahead; and forget about alpha-beta for now) looks something like this:

int negamax(int depth) {
  if (game_over())
    return game_score_from_the_point_of_view_of_player_to_move();
  if (depth <= 0)
    return static_evaluation_function_from_the_point_of_view_of_player_to_move();

  int move_list[7];
  int n_moves = generate_moves(move_list); // fills move_list with the legal columns
  int best_score = -INFINITY;
  for (int i = 0; i < n_moves; ++i) {
    make_move(move_list[i]);
    int score = -negamax(depth - 1);
    undo_move(move_list[i]);
    if (score > best_score)
      best_score = score;
  }

  return best_score;
}

Notice that this algorithm only uses O(depth) memory, instead of your current O(7^depth) memory usage.
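
Making and unmaking moves is cheap in Connect 4. A sketch in C#, assuming board[col, row] holds 0 for an empty square and height[col] counts the tokens already dropped in each column:

			int[,] board = new int[8, 8];	// 0 = empty, 1 = player one, 2 = player two
			int[] height = new int[8];		// tokens already dropped in each column

			void MakeMove(int col, int player)
			{
				board[col, height[col]] = player;	// drop a token on the first free row
				height[col]++;
			}

			void UndoMove(int col)
			{
				height[col]--;
				board[col, height[col]] = 0;		// take the same token back off
			}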

Edited by alvaro, 17 June 2012 - 05:37 AM.


#12 Jaap85   Members   -  Reputation: 241


Posted 17 June 2012 - 09:13 AM

Thanks a lot for the useful insight. I hadn't considered using only one gamestate with undo moves. I will definitely give it a try and see if this is a faster way to perform the search.

Update: Well, it turned out to be quite easy to implement. After editing my code, my program gave me the following output:

I visited 137257 nodes, of which 117649 are leaves, in 304 milliseconds, which is 452 nodes/ms

Almost 400 times as fast! Thanks a lot, it gave me great insight into how memory is being used.

Edited by Jaap85, 17 June 2012 - 09:47 AM.


#13 Jaap85   Members   -  Reputation: 241


Posted 23 June 2012 - 09:15 AM

The past few days I worked a bit more on my Connect Four game (AI players can now play each other a set number of games in a row, so I can test more easily). The next thing I want to dive into is iterative deepening.

Right now I just set my maximum search depth to 6, but this is an arbitrary number (it takes about 300 milliseconds per move on my machine). Towards the end of the game, when there are fewer choices, the thinking time decreases. In that case I would like the AI to try to think further ahead.

Of course it would be easy to hard-code this (for example, increase the search depth once a certain percentage of the board is filled), but I don't think that's the correct way. I don't know how to implement it in the search itself, though. Since negamax is depth-first, how can you determine how deep the AI has to search? If it were breadth-first search, it would be easier (check the first level; if there is time left, check the second level).

I would be happy to hear any ideas on this problem!

Edit: I read online that the best way to implement it is to do a search with depth 1 first, then depth 2, and so on, but this seems rather inefficient to me, because in the early stages of the game you have to search a lot more to get to the same result:

Search with iterative deepening
1. start at the top, search with depth 1
2. start at the top, search with depth 2
...
6. start at the top, search with depth 6
return

Search without iterative deepening
1. start at the top, search with depth 6
return

Or did I misinterpret something?

PS: I already implemented a simple version of alpha-beta pruning; this wasn't very hard in my opinion (unless I did it completely wrong).

Edited by Jaap85, 23 June 2012 - 09:21 AM.


#14 Álvaro   Crossbones+   -  Reputation: 11859


Posted 24 June 2012 - 05:57 AM

It is a common misconception that iterative deepening will be slower than doing a single search of the appropriate depth.

How long an alpha-beta search takes is heavily dependent on how well the moves are sorted. Iterative deepening allows you to improve move order in two ways:
  • Keep the list of moves at the root sorted from most promising to least promising: You do this by simply moving a move to the top of the list when it's found to be best during the search at the root.
  • Store the best move found at each position in a hash table, so that you can try it first whenever you encounter this position in the future.
Those two improvements make iterative deepening actually faster than searching depth 6 directly. There are a few other things that can be done to reuse information from previous searches: You can read about them here.
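
Also note that the repeated shallow searches are cheap: with branching factor b, searching depths 1 through N together costs only about b/(b-1) times as much as the depth-N search alone (roughly 17% extra for b = 7), and better move ordering easily wins that back. The driver loop itself is tiny; a sketch in C# (SearchRoot stands for whatever your depth-limited search entry point is, and timeBudgetMs is up to you):

			int IterativeDeepening(int maxDepth, long timeBudgetMs)
			{
				var clock = System.Diagnostics.Stopwatch.StartNew();
				int bestMove = -1;
				for (int depth = 1; depth <= maxDepth; depth++)
				{
					bestMove = SearchRoot(depth);	// complete search to this depth
					if (clock.ElapsedMilliseconds >= timeBudgetMs)
						break;						// out of time: keep the deepest finished result
				}
				return bestMove;
			}

(A real engine would also abort a search in progress when time runs out; for a first version, checking between iterations is enough.)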

#15 willh   Members   -  Reputation: 160


Posted 24 June 2012 - 09:11 AM

Alvaro gives good advice.

Keep in mind that your code spends 99% of its time looking at useless positions. Anything you can do to get plausible positions explored early will eliminate a whole lot of work for the algorithm (thanks to alpha-beta).

Iterative deepening and killer move heuristics can really speed things up substantially.
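
For example, the killer move heuristic just remembers, for each ply, a move that recently caused a cutoff and tries it first. A sketch in C# (killers is a hypothetical per-ply table; moves is your generated move list):

			int[] killers = new int[64];	// one remembered cutoff move per ply

			List<int> OrderMoves(List<int> moves, int ply)
			{
				if (moves.Remove(killers[ply]))		// if the remembered move is legal here...
					moves.Insert(0, killers[ply]);	// ...search it first
				return moves;
			}

			// ...and whenever a move causes a beta cutoff during the search:
			//     killers[ply] = move;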





