AI and undoing moves

Recommended Posts

cryobuzz75    105

Hi all,

I'm writing a board game and using Negamax for the AI. I'm applying an undo function instead of copying the board state, in order to speed up the search.

The evaluation function needs to consider the players' number of pieces that are on the board, those that still have to be placed and those that have been removed from the game. I can get this information every time but this means that I have to loop through the whole board and that would slow down search.

If I add variables and these are increased in makemove and decreased in undomove, it still doesn't work because of the recursive nature of the AI search.

So what is the best way to set these variables and sync them with the negmax search and the actual moves being done?

Thanks,

C

Share on other sites
alvaro    21246

If I add variables and these are increased in makemove and decreased in undomove, it still doesn't work because of the recursive nature of the AI search.

You lost me there. That should work just fine. I do it all the time...

Share on other sites
cryobuzz75    105

I have a variable called activePieces which returns the number of activepieces a player has.when I make a move that places a piece, I add 1 to it. In undomove I subtract 1. my negamax function is:

    private int negamax(IBoard board, int depth, int alpha, int beta, Move[] bestMove)
{
bestMove[0] = null;

if (depth == 0)
return board.getScore();

if (board.getWon() != 0)
return board.getScore();

List<Move> bestMoves = null;
List<Move> moves = board.getMoves();

if (depth == maxDepth)
{
bestMoves = new ArrayList<Move>();
bestMoves.clear();
}

int minimax = -board.getMaxScoreValue();
int val;

for (Move move : moves)
{
//Make the move
board.makeMove(move, true);

val = -negamax(board, depth - 1, -beta, -alpha, dummyMove);

//Set the move score
move.setScore(val);

//Undo the move
board.undoMove(move);

if (val > minimax)
minimax = val;

if (depth == maxDepth)
{
if (val > bestMoves.get(0).getScore())
{
bestMoves.clear();
}
else if (val == bestMoves.get(0).getScore())
}
else
{
//Alpha-Beta pruning
if (val > alpha)
alpha = val;

if (alpha >= beta)
return alpha;
}
}

if (depth == maxDepth)
{
bestMove[0] = bestMoves.get(0);
for (int i = 1; i < bestMoves.size(); i++)
if (bestMoves.get(i).getScore() > bestMove[0].getScore())
bestMove[0] = bestMoves.get(i);
}

return minimax;
}


As you can see above, because undoMove is called after the recursive calls, the activepieces variable can go up to 8000 and above.

Share on other sites
alvaro    21246

As you can see above, because undoMove is called after the recursive calls, the activepieces variable can go up to 8000 and above.

No, I can't see that above. If activePieces reaches a value that is too high, you probably have a bug somewhere, but this has nothing to do with the recursive structure of the algorithm.

* You would be better off having a separate function to search the root and to search the other moves. Forcing them to be in the same function forces you to have awkward things in your code, like dummyMove' and conditional statements.
* move.setScore(val);' looks weird to me. You don't need to store a score with the move at all.
* You don't need minimax' at all: alpha' is all you need.
* You are not updating alpha' when searching the root, but you should.
* You seem to be trying to make a list of all equally good moves, but it won't work: Alpha-beta pruning works in such a way that a returned value of alpha only means that the true minimax score is alpha or less, and a returned value of beta only means that the true minimax score is beta or more. This means that when a move returns a value equal to the best value so far, it only means that the move is not better than the other move; but it could be much much worse!
* I don't even know what the last loop in your function is supposed to do. That was a better place to put a comment than the ones you have. But my guess is that you don't need that code at all.

Share on other sites
cryobuzz75    105

I tried to follow your suggestions, and this is what I came up with:

    public Move GetBestMove(IBoard board, int depth)
{
//Changes in alpha/beta values propagates. Increases the chance of alpha-beta prunning
int alpha = Integer.MIN_VALUE, beta = Integer.MAX_VALUE;
int val, bestScore = 0;

//The move that will be returned
Move bestMove = null;
List<Move> moves = board.getMoves();

for (Move move : moves)
{
//Make the move
board.makeMove(move, true);

val = -negamax(board, depth - 1, -beta, -alpha);

//Undo the move
board.undoMove(move);

//Keep best move
if (val > bestScore)
{
bestScore = val;
bestMove = move;
}
}

//Return the move
return bestMove;
}

private int negamax(IBoard board, int depth, int alpha, int beta)
{
if (depth == 0)
return board.getScore();

if (board.getWon() != 0)
return board.getScore();

int val;

List<Move> moves = board.getMoves();
for (Move move : moves)
{
//Make the move
board.makeMove(move, true);

val = -negamax(board, depth - 1, -beta, -alpha);

//Undo the move
board.undoMove(move);

//Alpha-Beta pruning
if (val > alpha)
alpha = val;

if (alpha >= beta)
return alpha;
}

return alpha;
}
}


Is this correct?

Share on other sites
alvaro    21246
No, you still don't update alpha in the code that searches the root node. You don't need bestScore' at all. Just use alpha for that.

Share on other sites
cryobuzz75    105

Ok. So I replace bestScore with alpha. This should make the search correct now?

Share on other sites
alvaro    21246
I wouldn't use Integer.MIN_VALUE and Integer.MAX_VALUE. I am not sure what those are in your environment, but often they will be something like -2^31 and +2^31-1. The problem is that -Integer.MIN_VALUE might be the same as Integer.MIN_VALUE, the way overflow typically works. In my most recent chess engine I use 8000 as infinity (I want to be able to represent a score in 14 bits to store it in the transposition tables).

Other than that, it looks correct to me, but I might have missed something.

Share on other sites
cryobuzz75    105

It looks like the search is partially working but this is due to the fact that I'm still not ready with the evaluation function and move generation.

However, I come back to my first issue where I have a variable to hold the number of active pieces on the board for each player. Before the game starts, the variable is 0 for each player. After the first turn, the variable goes up to approx. 3000 for each player and it keeps growing after each turn.

In makeMove I'm doing +1 and in undomove I'm doing -1 so why is it increasing that much?

Share on other sites
alvaro    21246
Do you know how to use a debugger? If not, this is the perfect opportunity.

Share on other sites

Does the object you pass to recursive functions have its own container of moves? Because you may be continually adding moves to the same object rather than making a copy of the object which would break the recusrion. You need to pass copies of objects to recursive functions, not references. I'd check that first... in the debugger, as Al says.

Share on other sites
alvaro    21246

Does the object you pass to recursive functions have its own container of moves? Because you may be continually adding moves to the same object rather than making a copy of the object which would break the recusrion. You need to pass copies of objects to recursive functions, not references. I'd check that first... in the debugger, as Al says.

Passing the board by reference and making and undoing moves the way his code does is very common. You can also extend the board to keep some precomputed information that is easy to update incrementally with each move. For instance, my chess engine has this:
void Board::place_piece(int square, int piece) {
a[square] = piece;
bb[piece] |= 1ul << square;
zobrist_key ^= zobrist_table[piece][square];
material_zobrist_key += zobrist_table[piece][0];
opening_material_score += opening_square_piece_table[piece][square];
endgame_material_score += endgame_square_piece_table[piece][square];
total_material += naive_material_value[piece];
}

void Board::remove_piece(int square, int piece) {
a[square] = Empty;
bb[piece] &= ~(1ul << square);
zobrist_key ^= zobrist_table[piece][square];
material_zobrist_key -= zobrist_table[piece][0];
opening_material_score -= opening_square_piece_table[piece][square];
endgame_material_score -= endgame_square_piece_table[piece][square];
total_material -= naive_material_value[piece];
}


a' is an array with the content of each square on the board, and bb' is a bitboard (a 64-bit number indicating where a piece type appears). The other five things are of the nature that we are discussing: They are kept incrementally as you make and undo moves. The code that makes and undoes moves uses the two functions above.

Share on other sites

Yeah, I get that... obviously you need to check that the undo stuff is correctly undoing the state though or else the container size explosion described is likely to be the result. Debugger-me-do...

Share on other sites
cryobuzz75    105

That's similar to what I'm doing:

	private void placePiece(int cell)
{
this.board.getCells()[cell] = playerType;
this.board.getActivePieces()[playerType]++;
}

private void undoPlacePiece(int cell)
{
this.board.getCells()[cell] = CellState.Empty;
this.board.getActivePieces()[playerType]--;
}


But the activepieces still increase out of proportion.

Share on other sites
alvaro    21246
How did it go with the debugger?

You can also write a simple test program that makes and undoes some moves, so you can see what's going on independently of the complicated alpha-beta search.

Share on other sites
cryo75    143

Hmm... although I could see the variables increasing and sometimes decreasing I couldn't fully follow. I'm suspecting that the variable goes out of sync because after every move and every undo I'm swapping the players (the current player on turn).

Share on other sites
DarrylStein    653

A good test for ensuring that your move/undo is working properly is to temporarily save (clone) the state before each application of the move function and then after the undo move function testing that the state is the same as your saved state.  Obviously this is a huge performance hit so enable it, do a test to a reasonable depth of search (so that this clone and test is executed on all searched nodes) and once satisfied disable it.  If there is a problem the comparison will pinpoint which part of the move/undo move is not performing correctly.

Share on other sites
cryo75    143

I decided to copy the board anyways instead of undoing moves for now. If I have perfomance issues with copying I'll have to revert to undoing moves.

Share on other sites
DarrylStein    653

It comes down to what your objective is for this implementation. If you are trying to make a strong AI try to tackle the move/undo problem before delving into any of the enhanced search features (PVS, History heuristics, killer moves, null move) as it will become much harder to make the change later.  Also your search depth will suffer greatly as copying is going to be your main bottleneck for the AI so you won't be generating nearly as many nodes compared to move/undo.

Share on other sites
cryo75    143

Yes I noticed that copying is a bottleneck. I'm currently searching at 5 ply but when I increase it to 7 it starts taking its time search. Maybe I should go back to make/undo.