A tic-tac-toe superbot

Started by
25 comments, last by Solias 19 years, 2 months ago
Hi, Altough I know it has been done many times before, I'm trying to write a bot for tic-tac-toe that always does the best move. So far, I am only succeeding half. Take this situation:

 X | 1 | 2
---|---|---
 3 | O | 5
---|---|---
 X | 7 | 8
Circle may do a move now, and the computer is the circle. Now I wrote a method for the BoardState class that returns the percentage of wins as a float between 0 and 1. The most obvious move ofcourse should be 3. However, the computer takes 5. I stepped trough the code and let it checked the return values of 3 and 5: 3: 0.9 = 90% 5: 1.0 = 100% Is it a flaw in the algorithm, or a misimplementation in my code? Here is the bot code:

public void RequestMove(Game game, int id)
{
	Move bestMove = new Move();
	float bestVal = -1.0f;

	Move[] moves = game.State.GetPossibleMoves();

	foreach(Move move in moves)
	{
		float chance = game.State.Evaluate(move, game.GetPlayerName(this));
		if(chance<=bestVal)
			continue;

		bestVal = chance;
		bestMove = move;
	}

	Console.WriteLine("Bot makes move {0}", bestMove.X+bestMove.Y*3);
	game.DoMove(this, id, bestMove);
}


As you can see, the code loops trough all possibilities and takes the one with the highest win percentage. This part should be ok, but the following method returned those weird results from above:

public float Evaluate(Move move, PlayerName player)
{
	PlayerName winner = CheckWinner();
	if(winner!=PlayerName.Empty)
		return winner==player ? 1.0f : 0.0f;
	if(IsBoardFull())
		return 0.5f;

	BoardState board = new BoardState(this, move, player);
	
	winner = board.CheckWinner();
	if(winner!=PlayerName.Empty)
		return winner==player ? 1.0f : 0.0f;
	if(board.IsBoardFull())
		return 0.5f;

	Move[] moves = board.GetPossibleMoves();	

	float wins = 0.0f;
	foreach(Move movetry in moves)
		wins += board.Evaluate(movetry, player);

	return wins/moves.Length;
}


The first part checks for a win in the current state, and returns the win/loss/draw, the second part checks for the same in the board after the move, and if the next board declares no winner, all possible moves are evaluated. What's wrong with this?
Advertisement
Fixed it already, I think. The method only calculates for the current player, not taking in account the moves by the opponent. Now it does, but the it is beatable. Here is the winning tactic:

 0 | 1 | 2 ---|---|--- 3 | 4 | 5 ---|---|--- 6 | 7 | 8 Cross, please make your move: 6 0 | 1 | 2 ---|---|--- 3 | 4 | 5 ---|---|--- X | 7 | 8 Bot makes move 4 0 | 1 | 2 ---|---|--- 3 | O | 5 ---|---|--- X | 7 | 8 Cross, please make your move: 2 0 | 1 | X ---|---|--- 3 | O | 5 ---|---|--- X | 7 | 8 Bot makes move 8 0 | 1 | X ---|---|--- 3 | O | 5 ---|---|--- X | 7 | O Cross, please make your move: 0 X | 1 | X ---|---|--- 3 | O | 5 ---|---|--- X | 7 | O Bot makes move 3 X | 1 | X ---|---|--- O | O | 5 ---|---|--- X | 7 | O Cross, please make your move: 1 X | X | X ---|---|--- O | O | 5 ---|---|--- X | 7 | O Cross has won


It is indeed so, that when you calculate all possible moves, the middle one has a higher win percentage in the first move for the computer. However, it is a bad thing to do that, because as soon as the player X-es the opposite corner, the game is already finished.

What should I try to do now? Obviously, this way is not the way to go.
Implement alpha-beta, or make a table big enough to hold all the positions in the game and compute the game value of each node, as it is done with endgame tablebases in chess or checkers.
It was an idea worth exploring... :)

I tried the same thing for Connect-Four once, just for kicks.. One of the (many) problems with this method is that the computer had no knowledge of gauranteed wins, and would typically avoid them.

Take for instance a 2 move win-- if you screw up any of the two moves, the opponent wins, but as long as you play correctly the win is 100% certain. Looking at a win / loss ratio would actually have the computer avoid that situation, which is pure crazy! Coversly, the computer would persue it if the tables were turned.

Kudos for trying something different. As has been mentioned earlier, MinMax would give you the true 'super bot'.

Will
------------------http://www.nentari.com
Quote:Original post by Sijmen

... as soon as the player X-es the opposite corner, the game is already finished.


No, it is not. Play at any side, not a corner.

And, obviously, Alvaro is right.
_______________________The best conversation I had was over forty million years ago ... and that was with a coffee machine.
Try adapting your percentage calculation so that when you consider future moves consider best moves.

Basically, don't try all possible moves for the next possible move (of both the player and the AI) take the one with the highest win percentage (a bit of recursion there).

Basically the AI should be taking the best move given that its opponent will make the best possible move for himself.

So far your AI has been calculating its percentages with the assumption it is playing against a moron (i.e. under the assumption that it will pick a move at random) while that's not the case. (Mathematicly speaking you haven't modeled your probability space properly).

So, your AI is working OK now it is picking a move that is most likely to lead to a win (given that it is playing against a moron). The move it makes gives it the highest possibilty of win (unfortunately it also leaves a small chance of a loss, but hey, the opponent is a moron anyway). :)

Did you get that?



-----------------Always look on the bright side of Life!
Quote:Original post by A Guy from CRO
Did you get that?


Not entirely. Currently I'm trying every possible bot move, then every possible player move, and so forth - but what I understand of your explanation, is that I should /always/ anticipate on the best opponent move?

/edit:

Quote:Basically the AI should be taking the best move given that its opponent will make the best possible move for himself.


So I assume that is right. I'll try to implement it as soon as I boot into Windows again, thanks!
Quote:Original post by Sijmen
Not entirely. Currently I'm trying every possible bot move, then every possible player move, and so forth - but what I understand of your explanation, is that I should /always/ anticipate on the best opponent move?



That's exactly what I meant. I haven't tried it, just thought of it but mathematics tells me it should work.
-----------------Always look on the bright side of Life!
I've put the calculating bot in the fridge to rethink it a litle bit, and started work on a remembering bot (which is easy using hte Tic-Tac-Toe bot API I know have).

This bot checks for every move if he has done it before, and if so, what the win-loss rate is. When there is a move he didn't try before, he tries it, and if all moves are tried, he uses the one with the highest win ratio.

When the game ends, the bot checks if he has won. If so, he adds 1 to the win/loss ratio of all moves from the past game, otherwise he substracts 1. With 'move' I mean the state of the board after the move has been done.

But now I have this problem. When I first play agianst the bot, he uses a predictable pattern (he simply tries all the moves 1 after 1). However, since the bot is always beaten, he never builds up any winning states.

What should I do - should I be changing something in the 'remember' thing, or should I let him learn some tactics, by deliberately losing (as player) when the bot makes a good move?
You're making this way too hard for yourself. Your code shouldn't have to know anything about particular cases of where to go. For TTT use a simple minimax algorithm and pretty much all you have to do is tell it what a win and a loss look like. Then it will always pick the best move, guaranteed.
No more "Oops I missed the case where..."
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms

This topic is closed to new replies.

Advertisement