AI and undoing moves

Started by
18 comments, last by cryo75 11 years, 1 month ago

Hi all,

I'm writing a board game and using Negamax for the AI. I'm applying an undo function instead of copying the board state, in order to speed up the search.

The evaluation function needs to consider the players' number of pieces that are on the board, those that still have to be placed and those that have been removed from the game. I can get this information every time but this means that I have to loop through the whole board and that would slow down search.

If I add variables and these are increased in makemove and decreased in undomove, it still doesn't work because of the recursive nature of the AI search.

So what is the best way to set these variables and sync them with the negmax search and the actual moves being done?

Thanks,

C

Advertisement

If I add variables and these are increased in makemove and decreased in undomove, it still doesn't work because of the recursive nature of the AI search.

You lost me there. That should work just fine. I do it all the time...

I have a variable called activePieces which returns the number of activepieces a player has.when I make a move that places a piece, I add 1 to it. In undomove I subtract 1. my negamax function is:


    private int negamax(IBoard board, int depth, int alpha, int beta, Move[] bestMove)
    {
        bestMove[0] = null;

        if (depth == 0)
            return board.getScore();

        if (board.getWon() != 0)
            return board.getScore();
        
        List<Move> bestMoves = null;
        List<Move> moves = board.getMoves();
        
        if (depth == maxDepth)
        {
            bestMoves = new ArrayList<Move>();
            bestMoves.clear();
            bestMoves.add(moves.get(0));
        }

        int minimax = -board.getMaxScoreValue();
        int val;

        for (Move move : moves)
        {
        	//Make the move
        	board.makeMove(move, true);

                val = -negamax(board, depth - 1, -beta, -alpha, dummyMove);
            
            //Set the move score
            move.setScore(val);
            
            //Undo the move
            board.undoMove(move);
            
            if (val > minimax)
                minimax = val;

            if (depth == maxDepth)
            {
                if (val > bestMoves.get(0).getScore())
                {
                    bestMoves.clear();
                    bestMoves.add(move);
                }
                else if (val == bestMoves.get(0).getScore())
                    bestMoves.add(move);
            }
            else
            {
            	//Alpha-Beta pruning
                if (val > alpha)
                    alpha = val;
                
                if (alpha >= beta)
                    return alpha;
            }
        }

        if (depth == maxDepth)
        {
        	bestMove[0] = bestMoves.get(0);
        	for (int i = 1; i < bestMoves.size(); i++)
        		if (bestMoves.get(i).getScore() > bestMove[0].getScore())
        			bestMove[0] = bestMoves.get(i);
        }

        return minimax;
    }

As you can see above, because undoMove is called after the recursive calls, the activepieces variable can go up to 8000 and above.

As you can see above, because undoMove is called after the recursive calls, the activepieces variable can go up to 8000 and above.

No, I can't see that above. If activePieces reaches a value that is too high, you probably have a bug somewhere, but this has nothing to do with the recursive structure of the algorithm.

Some problems with your code:
* You would be better off having a separate function to search the root and to search the other moves. Forcing them to be in the same function forces you to have awkward things in your code, like `dummyMove' and conditional statements.
* `move.setScore(val);' looks weird to me. You don't need to store a score with the move at all.
* You don't need `minimax' at all: `alpha' is all you need.
* You are not updating `alpha' when searching the root, but you should.
* You seem to be trying to make a list of all equally good moves, but it won't work: Alpha-beta pruning works in such a way that a returned value of alpha only means that the true minimax score is alpha or less, and a returned value of beta only means that the true minimax score is beta or more. This means that when a move returns a value equal to the best value so far, it only means that the move is not better than the other move; but it could be much much worse!
* I don't even know what the last loop in your function is supposed to do. That was a better place to put a comment than the ones you have. But my guess is that you don't need that code at all.

I tried to follow your suggestions, and this is what I came up with:


    public Move GetBestMove(IBoard board, int depth)
    {        
        //Changes in alpha/beta values propagates. Increases the chance of alpha-beta prunning
        int alpha = Integer.MIN_VALUE, beta = Integer.MAX_VALUE;
        int val, bestScore = 0;
        
        //The move that will be returned
        Move bestMove = null;      
        List<Move> moves = board.getMoves();
        
        for (Move move : moves)
        {
        	//Make the move
        	board.makeMove(move, true);

                val = -negamax(board, depth - 1, -beta, -alpha);
               
            //Undo the move
            board.undoMove(move);
            
            //Keep best move
            if (val > bestScore)
            {
                bestScore = val;
                bestMove = move;
            }
        }
		
		//Return the move
        return bestMove;
    }

    private int negamax(IBoard board, int depth, int alpha, int beta)
    {
        if (depth == 0)
            return board.getScore();

        if (board.getWon() != 0)
            return board.getScore();

        int val;

        List<Move> moves = board.getMoves();
        for (Move move : moves)
        {
        	//Make the move
        	board.makeMove(move, true);

                val = -negamax(board, depth - 1, -beta, -alpha);
               
            //Undo the move
            board.undoMove(move);

        	//Alpha-Beta pruning
            if (val > alpha)
                alpha = val;
            
            if (alpha >= beta)
                return alpha;
        }

        return alpha;
    }
}

Is this correct?

No, you still don't update alpha in the code that searches the root node. You don't need `bestScore' at all. Just use alpha for that.

Ok. So I replace bestScore with alpha. This should make the search correct now?

I wouldn't use Integer.MIN_VALUE and Integer.MAX_VALUE. I am not sure what those are in your environment, but often they will be something like -2^31 and +2^31-1. The problem is that -Integer.MIN_VALUE might be the same as Integer.MIN_VALUE, the way overflow typically works. In my most recent chess engine I use 8000 as infinity (I want to be able to represent a score in 14 bits to store it in the transposition tables).

Other than that, it looks correct to me, but I might have missed something.

It looks like the search is partially working but this is due to the fact that I'm still not ready with the evaluation function and move generation.

However, I come back to my first issue where I have a variable to hold the number of active pieces on the board for each player. Before the game starts, the variable is 0 for each player. After the first turn, the variable goes up to approx. 3000 for each player and it keeps growing after each turn.

In makeMove I'm doing +1 and in undomove I'm doing -1 so why is it increasing that much?

Do you know how to use a debugger? If not, this is the perfect opportunity.

This topic is closed to new replies.

Advertisement