Jump to content
  • Advertisement

Archived

This topic is now archived and is closed to further replies.

Minimax Algorithm Tic Tac Toe Trying to Learn It

This topic is 5835 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I hope it is appropriate to post this much code. I am learning C++ along with AI (coming from VB). Please excuse the crudity of this code. All the code is mine except the "consider" and most of it has been altered. The "consider" function is one which I butchered from someone else on the net I found. What it is supposed to do is print: X XO (or wherever the next move is) XOX...etc. until a full board is setup. What it does is print whole lines and then tells me who won (over and over as it should here). It appears to be correct in its accuracy as to who won, but I believe it is still acting incorrectly. For example, it only examines 235K positions and other programs with no moves end up with 549K. This is not supposed to be a full fledged Tic Tac Toe game, just the parts to understand the minimax algorithm. And I do not understand the line: board = '' ''; after the consider recursive statement; perhaps, that is for another forum Also, where do I get the best move from. It goes through all these lines, but where it the move I need to store to play? Thanks for your help. #include <iostream> char board[9]; int total; int print_board(){ int i; for (i=0; i<10; i++){ std::cout << board[i]; } std::cout << "print board complete" << std::endl; return 0; } void clear_board(){ for (int i = 0; i<=8; i++){ board[i] = '' ''; } } isXWin(){ if (board[0] == ''X'' && board[1] == ''X'' && board[2] == ''X'') return 1; if (board[3] == ''X'' && board[4] == ''X'' && board[5] == ''X'') return 1; if (board[6] == ''X'' && board[7] == ''X'' && board[8] == ''X'') return 1; if (board[0] == ''X'' && board[3] == ''X'' && board[6] == ''X'') return 1; if (board[1] == ''X'' && board[4] == ''X'' && board[7] == ''X'') return 1; if (board[2] == ''X'' && board[5] == ''X'' && board[8] == ''X'') return 1; if (board[0] == ''X'' && board[4] == ''X'' && board[8] == ''X'') return 1; if (board[2] == ''X'' && board[4] == ''X'' && board[6] == ''X'') return 1; return 0; } isOWin(){ if (board[0] == ''O'' && board[1] == ''O'' && board[2] == ''O'') return 1; if (board[3] == ''O'' && board[4] == ''O'' && board[5] == ''O'') return 1; if (board[6] == ''O'' && board[7] == ''O'' && board[8] == ''O'') return 1; if (board[0] == ''O'' && board[3] == ''O'' && board[6] == ''O'') return 1; if (board[1] == ''O'' && board[4] == ''O'' && board[7] == ''O'') return 1; if (board[2] == ''O'' && board[5] == ''O'' && board[8] == ''O'') return 1; if (board[0] == ''O'' && board[4] == ''O'' && board[8] == ''O'') return 1; if (board[2] == ''O'' && board[4] == ''O'' && board[6] == ''O'') return 1; return 0; } int consider(char board[], int isPlayerX) { if (isXWin()) { std::cout << "X has won" << std::endl; return 1; } if (isOWin()) { std::cout << "O has won" << std::endl; return -1; } int value = -999999; // or some similarly invalid value for (int i = 0; i <= 8; i++) { if (board[i] == '' '') { board[i] = isPlayerX ? ''X'' : ''O''; print_board(); total++; int tval = consider(board, !isPlayerX); board[i] = '' ''; if ((value == -999999) || (isPlayerX && (tval > value)) || (!isPlayerX && (tval < value))) { value = tval; } } } if (value == -999999) /* no moves left */ return 0; return value; } main() { int value; int isPlayerX = 1; clear_board(); total=0; value = consider(board, isPlayerX); std::cout << "value is: " << value << std::endl; std::cout << "total count was: " << total << std::endl; print_board(); return 0; }

Share this post


Link to post
Share on other sites
Advertisement
There less than 20K possible positons in a 3x3 tic-tac-toe board so I don't see where you are getting the 235K/549K numbers from.

Also, seems you forgot to post the 'consider' function.

When posting code try use the source tags. See the FAQ link above.

[edited by - carrot on May 31, 2002 11:10:12 AM]

Share this post


Link to post
Share on other sites
I will look at the faq; I put the consider function in, it is:

int consider(char board[], int isPlayerX)

I got the 549k, because after playing ttt games that work, they all get 549k on the first move.

Share this post


Link to post
Share on other sites
quote:

What it does is print whole lines and then tells me who won (over and over as it should here).


But seeing who wins is not a real AI problem. It is easy.


quote:

examines 235K positions and other programs with no moves end up with 549K.



There are technically 365.880 states (9!) in TicTacToe. Really less if you consider simetry but this is not usually done since this problem is finite.

In a TicTacToe game you can use even deep-search or wide-search techniques because a PC has memory enough. Anyway, this is a minmax problem.

quote:

Also, where do I get the best move from. It goes through all these lines, but where it the move I need to store to play?
[quote]

Let's see the program:




        
isXWin(){

if (board[0] == 'X' && board[1] == 'X' && board[2] == 'X')
return 1;
if (board[3] == 'X' && board[4] == 'X' && board[5] == 'X')
return 1;
if (board[6] == 'X' && board[7] == 'X' && board[8] == 'X')
return 1;
if (board[0] == 'X' && board[3] == 'X' && board[6] == 'X')
return 1;
if (board[1] == 'X' && board[4] == 'X' && board[7] == 'X')
return 1;
if (board[2] == 'X' && board[5] == 'X' && board[8] == 'X')
return 1;
if (board[0] == 'X' && board[4] == 'X' && board[8] == 'X')
return 1;
if (board[2] == 'X' && board[4] == 'X' && board[6] == 'X')
return 1;
return 0;
}

isOWin(){
if (board[0] == 'O' && board[1] == 'O' && board[2] == 'O')
return 1;
if (board[3] == 'O' && board[4] == 'O' && board[5] == 'O')
return 1;
if (board[6] == 'O' && board[7] == 'O' && board[8] == 'O')
return 1;
if (board[0] == 'O' && board[3] == 'O' && board[6] == 'O')
return 1;
if (board[1] == 'O' && board[4] == 'O' && board[7] == 'O')
return 1;
if (board[2] == 'O' && board[5] == 'O' && board[8] == 'O')
return 1;
if (board[0] == 'O' && board[4] == 'O' && board[8] == 'O')
return 1;
if (board[2] == 'O' && board[4] == 'O' && board[6] == 'O')
return 1;
return 0;
}


I think this is a poor way to calculate if someone wins. It's better to iterate 3 times and calculate:

char SomeOneWon(char board[])
{
for (int i=0;i == board[i+1])&&(board == board[i+2])&&(board[i]!=' ')) return (board[i]);
if ((board[i] == board[i+3])&&(board[i] == board[i+6])&&(board[i]!=' ')) return (board[i]);
}
if ((board[0] == board[4])&&(board[0] == board[8])&&(board[4]!=' ')) return (board[4]);
if ((board[2] == board[4])&&(board[0] == board[6])&&(board[4]!=' ')) return (board[4]);
return ' ';
}

So this returns ' ', 'O' or 'X' depending if someone won or no one.

Let's go with consider:



  
int consider(char board[], int isPlayerX)
{

if (isXWin())
{ std::cout << "X has won" << std::endl;
return 1;
}
if (isOWin())
{ std::cout << "O has won" << std::endl;
return -1;
}
int value = -999999; // or some similarly invalid value


for (int i = 0; i <= 8; i++)
{
if (board[i] == ' ')
{
board[i] = isPlayerX ? 'X' : 'O';
print_board();
total++;
int tval = consider(board, !isPlayerX);

board[i] = ' ';

if ((value == -999999) || (isPlayerX && (tval > value)) ||
(!isPlayerX && (tval < value)))
{
value = tval;
}
}

}

if (value == -999999) /* no moves left */
return 0;
return value;
}


Before talking about how to calculate it, you asked about board=' '. Well: what minmax does is to calculate the 'best path', but assuming that your opponent IS DOING THE SAME. So in that loop:

if (board == ' ')
{
board[i] = isPlayerX ? 'X' : 'O';
print_board();
total++;
int tval = consider(board, !isPlayerX);
board[i] = ' ';
...

As you see it first sets 'x' or 'o' in an empty position. Really it is not putting that x or o there, only 'supossing what would happen if a x or o was there'. Thats why it puts an X or O, then 'considers' the next possibles moves (recursive) and then clears that position.

I think the code is confusing. Also, this algorithm is usually used with some initial position int the board, so the program calculates the best possible move. That's why the program fills the board and then clears it each time because in the calculation is supossed that you end winning or loosing. If a path leads to lose, that path gets no points for that player. See below.

This one seems to compete with itself. As the problem is FINITE, it can NEVER WIN, nor lose.

Before the explaining I will also point that this concrete problem can be solved with 3 lines of maths, but I lost that algorithm (which never understood then) and cannot review now that I understand more maths. Well, if anyone knows the 3-math-line-tictactoe-algorithm please email me.

MINIMAX:

Ok, thats how this works:

What we have:
- we are in a initial state (in this case, blank board but could be any)
- we have "rules" that determine who wins and the posible moves from a state, in tictactoe, any blank is a 'possible move'

Our TREE will have an initial node (blank), 9 nodes below, 8 nodes below each, 7 nodes below each... Imagine or draw it.

Theoric Process (later I explain the computational one):
- 1st: we exploit the tree to see which nodes are winners, this is a recursive process. In many games (almost all) you can't exploit all the tree so we give estimated values to each node in the point where we stop recursivity.
- 2nd: the max player moves
- 3rd: the min players moves (he needs to exploit too the tree to calculate his winner strategy)
- 4th: when done until the end, we label the leaf nodes, in the tictactoe case we can safely give infinite positive values to winner nodes and infinite positive values to looser nodes. This is impossible so any number will do it. Other aproximation better values to winning strategies if they are reached in less turns than other winning strategies. I really don't know where this is correct or not.
- 5th: now that we have a COST for each leaf, we can propagate the costs up to the root node (en each node you choose one value from its sons, if you are in a MAX level you pick positive costs, and if you are in MIN level you pick negative costs).

That's all. Remember that if you can't exploit the COMPLETE TREE you cannot be sure that the algorithm will win, it 's also important how you estimate costs when you are estimating them (with some A* algorithm). In ticatactoe case the problem is finite and you can exploit the complete tree, so the MAX player will always win. As this game can end up with tie, the computer will always tie when playing against itself.

GRAPHIC OF TIC TAC TOE TREE (for clarity):


  
max R

min R1 R2 ... R9

max R11 .. R18 R21 .. R28 ... R91 .. R98

min .... .... . . .. .... . .


Of course i'll will not draw complete cause are 362880 possibilities.

SIMPLE GRAPHIC OF MINIMAX PROBLEM:



        
max R
min R1 R2
max R11 R12 R21 R22 R23
min R111 R121 R122 R211 R212 R221 R222 R231 R232 R233

val: 10 -10 10 10 -10 10 -10 10 -10 10



(Note: not all leaf nodes must be at the same level, any player can win before that).

So, to propagate the results, imagine that the last level (leafs) is a min level. You need to propagate costs to the upper level, which would be a 'max' level. So you select the greater costs and lavel each node with the bigger cost from below. Now, the upper level will be a 'min' level, so for that you label each node with the lesser value from its sons.

In the end, you have a number associated to the root node (which is always max), imagine:



        
max 6
min 6 5 6
max 6 7 9 5 6 7
min 4 6 7 6 8 9 5 1 6 7


Note that the solution for the next move, calculated from this tree is a 6 (six). This means that the max player (our algorithm) should choose the move that has cost 6. In this case we have 2 paths with 6 so we must choose between them randomly (real random, programmers should do a random choose and not choose always the first solution for example, or the algorithm will be technically incorrect and mathematically not as good player as one who really does it randomly).

OK, EXPLAINED, and NOW?

Ok, so what should you program? Well, with recursivity you can avoid the tree in this case, you can do what your source code does, that is fill the board, calculate recursively and then clear that position so the loop can continue with the board intact.

I think this is not a good practice if you are trying to learn (though it seems a correct and clever optimization).

You should better make a tree structure (not binary tree!) which has COST and BOARD asociated to each node, and a function (CalculateNextMove) that fills all the sons from a node.

Once you have this, start with an empty board and try to calculate the next move, if you do this until there is a winner, you will have the computer playing against itself. You can do what you want, probably you will want to play against the computer too.

Each time you have to calculate a move, take the board's state as a starting node (root), make the tree below that board state, label each leaf with, for example, +10 and -10 (+10 if in that leaf the algorithm wins, and -10 if it loses), propagate the labels to the root, and select one of the paths labeled with +10 (winners). You can label ties with -10 or 0, as you want, think about it

Ok, it took me one hour, hope it helps you and if anyone has doubts, post them here i'll try to help.

Sorry for the highly-confusing-english




[edited by - jjmontes on May 31, 2002 12:52:46 PM]

Share this post


Link to post
Share on other sites
Sorry, here is the code:


  
#include <iostream>

char board[9];
int total;

int print_board(){
int i;
for (i=0; i<10; i++){
std::cout << board[i];
}
std::cout << "print board complete" << std::endl;
return 0;
}

void clear_board(){

for (int i = 0; i<=8; i++){
board[i] = '' '';
}

}

isXWin(){

if (board[0] == ''X'' && board[1] == ''X'' && board[2] == ''X'')
return 1;
if (board[3] == ''X'' && board[4] == ''X'' && board[5] == ''X'')
return 1;
if (board[6] == ''X'' && board[7] == ''X'' && board[8] == ''X'')
return 1;
if (board[0] == ''X'' && board[3] == ''X'' && board[6] == ''X'')
return 1;
if (board[1] == ''X'' && board[4] == ''X'' && board[7] == ''X'')
return 1;
if (board[2] == ''X'' && board[5] == ''X'' && board[8] == ''X'')
return 1;
if (board[0] == ''X'' && board[4] == ''X'' && board[8] == ''X'')
return 1;
if (board[2] == ''X'' && board[4] == ''X'' && board[6] == ''X'')
return 1;
return 0;
}

isOWin(){
if (board[0] == ''O'' && board[1] == ''O'' && board[2] == ''O'')
return 1;
if (board[3] == ''O'' && board[4] == ''O'' && board[5] == ''O'')
return 1;
if (board[6] == ''O'' && board[7] == ''O'' && board[8] == ''O'')
return 1;
if (board[0] == ''O'' && board[3] == ''O'' && board[6] == ''O'')
return 1;
if (board[1] == ''O'' && board[4] == ''O'' && board[7] == ''O'')
return 1;
if (board[2] == ''O'' && board[5] == ''O'' && board[8] == ''O'')
return 1;
if (board[0] == ''O'' && board[4] == ''O'' && board[8] == ''O'')
return 1;
if (board[2] == ''O'' && board[4] == ''O'' && board[6] == ''O'')
return 1;
return 0;
}
int consider(char board[], int isPlayerX)
{

if (isXWin())
{ std::cout << "X has won" << std::endl;
return 1;
}
if (isOWin())
{ std::cout << "O has won" << std::endl;
return -1;
}
int value = -999999; // or some similarly invalid value


for (int i = 0; i <= 8; i++)
{
if (board[i] == '' '')
{
board[i] = isPlayerX ? ''X'' : ''O'';
print_board();
total++;
int tval = consider(board, !isPlayerX);

board[i] = '' '';

if ((value == -999999) || (isPlayerX && (tval > value)) ||
(!isPlayerX && (tval < value)))
{
value = tval;
}
}

}

if (value == -999999) /* no moves left */
return 0;
return value;
}

main()
{
int value;
int isPlayerX = 1;
clear_board();
total=0;
value = consider(board, isPlayerX);
std::cout << "value is: " << value << std::endl;
std::cout << "total count was: " << total << std::endl;
print_board();
return 0;
}


Share this post


Link to post
Share on other sites
This is a basic programming question about this AI. Perhaps it was better suited to another forum, but I was unsure.

In the source below, how does recursion work? When it hits the "consider" call inside the for loop, does it call the consider function eight times? and then move to the next statement?

For example, when the function is first called, there are no moves on the board and the player is X. An ''X'' is assigned at board[0]. Then it moves to the recursion line and changes the player to ''O.'' Now, on the second time in the function (where O is the player) i is reassigned zero again and board[0] now equals ''X.'' So, how does the consider statement (inside; the recursion one) ever get executed more than once, because each time board[0] will already have a value in it. Thanks.



  

int value = -999999; // or some similarly invalid value


for (int i = 0; i <= 8; i++)
{
if (board[i] == '' '')
{
board[i] = isPlayerX ? ''X'' : ''O'';
//print_board();

total++;
int tval = consider(board, !isPlayerX);

board[i] = '' '';
std::cout << "board[" << i << "] is " << board[i] << std::endl;

if ((value == -999999) || (isPlayerX && (tval > value)) ||
(!isPlayerX && (tval < value)))
{
value = tval;
}
}

}

if (value == -999999) /* no moves left */
return 0;
return value;
}

Share this post


Link to post
Share on other sites

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

Participate in the game development conversation and more when you create an account on GameDev.net!

Sign me up!