Negascout behaves oddly

I have a negascout function within an iterative deepening with aspiration windows framework. The negascout function uses move ordering, transposition tables, killer moves and history heuristics. The game is Nine Men's Morris (yes... still working on it in my free time!).

The problem is that over 70% of the time, the AI is not that aggressive, i.e. when the AI has a chance to capture pieces, it doesn't.

This is the function:


	private int negaScout(int curDepth, int maxDepth, int alpha, int beta)
	{
		if (curDepth >= maxDepth || board.getWon() != 0)
			return board.getScore();

		//Check if the move is in the TT
		long key = board.getZobristKey();
		TTEntry entry = tt.probe(key, curDepth);
		if (entry != null)
		{
			switch (entry.boundType)
			{
				case TTEntry.BOUND_EXACT:
					return entry.score;

				case TTEntry.BOUND_ALPHA:
					alpha = Math.max(alpha, entry.score);
					break;

				case TTEntry.BOUND_BETA:
					beta = Math.min(beta, entry.score);
					break;
			}

			if (alpha >= beta)
				return entry.score;
		}

		int val = 0;
		int bestScore = -INFINITY;

		Move bestLocalMove = null;
		boolean foundPV = false;

		List<Move> moves = sortMoves(board, curDepth, false);
		for (int i = 0, n = moves.size(); i < n; i++)
		{
			Move move = moves.get(i);

			//PV has been found
			if (alpha < bestScore)
			{
				alpha = bestScore;
				foundPV = true;
			}

			board.make(move, true);

			if (foundPV)
			{
				//Zero window
				val = -negaScout(curDepth + 1, maxDepth, -alpha - 1, -alpha);
				if (val > alpha && val < beta)
					val = -negaScout(curDepth + 1, maxDepth, -beta, -alpha);
			}
			else
				val = -negaScout(curDepth + 1, maxDepth, -beta, -alpha);

			board.undo(move, true);

			//Alpha has improved
			if (val > bestScore)
			{
				bestScore = val;
				bestLocalMove = move;

				//Beta cut-off
				if (bestScore >= beta)
					break;
			}
		}

		//We have the current best move so far at the root
		if (curDepth == 0) bestMove = bestLocalMove;

		//Set TT entry flag
		byte flag = bestScore <= alpha ? TTEntry.BOUND_BETA :
		            bestScore >= beta ? TTEntry.BOUND_ALPHA : TTEntry.BOUND_EXACT;

		//Store the move in the TT
		tt.set(key, curDepth, bestScore, flag, bestLocalMove);

		if (bestLocalMove != null)
		{
			//Add to killer moves
			killers.add(bestLocalMove, curDepth);

			//Update history heuristics for non-captures
			if (bestLocalMove.cellRemove == -1)
				history[bestLocalMove.player][bestLocalMove.cellFrom + 1][bestLocalMove.cellTo + 1] += 1 << (maxDepth - curDepth);
		}

		return bestScore;
	}

This is the transposition table:


public class TranspositionTable
{
	private TTEntry[] tt;

	public TranspositionTable(int size)
	{
		tt = new TTEntry[size];
	}

	public TTEntry probe(long key, int depth)
	{
		TTEntry entry = tt[(int)(key % tt.length)];

		if (entry != null && entry.depth <= depth)
			return entry;

		return null;
	}

	public void set(long key, int depth, int val, byte boundType, Move move)
	{
		int pos = (int)(key % tt.length);
		tt[pos] = new TTEntry(key, depth, val, boundType, move);
	}
}

As you can see in the transposition table, I always do a replace in the 'set' function. However, when I probe, I only get the entry if the entry's depth is smaller than or equal to the depth required. By depth here, I mean the depth from the root node, so the smaller the depth, the nearer the entry is to the root move.

Is it possible that I'm probing/setting wrong values, or could it be the part of the code where the 'Alpha has improved' comment is?

Thanks,

Ivan

Does the bad behavior go away if you disable the transposition table?

I don't understand the logic of the depths you are using. One typically stores the "depth left", i.e. how many moves from the current position were analyzed in the search, and then an entry can be used if the depth stored is greater than or equal to the depth required. The distance from the root is a less important number: you may think they encode the same information because their sum is constant, but that goes out the window with iterative deepening or with extensions and reductions.
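
Here is a sketch of what I mean, with depth-left semantics (the names depthLeft and so on are just illustrative, not taken from your code; I'm also assuming TTEntry stores the full key, as your constructor suggests, so index collisions can be detected):

public TTEntry probe(long key, int depthLeft)
{
	TTEntry entry = tt[(int)Math.floorMod(key, (long)tt.length)];

	//Usable only if it belongs to this exact position and was searched
	//at least as deep below this node as we are about to search now
	if (entry != null && entry.key == key && entry.depthLeft >= depthLeft)
		return entry;

	return null;
}

public void set(long key, int depthLeft, int val, byte boundType, Move move)
{
	int pos = (int)Math.floorMod(key, (long)tt.length);
	tt[pos] = new TTEntry(key, depthLeft, val, boundType, move);
}

With that convention, the probe and set calls in your negaScout would pass maxDepth - curDepth instead of curDepth. (Math.floorMod also avoids a negative index when the Zobrist key happens to be negative, which key % tt.length does not.)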

I commented out the transposition table code. It still doesn't always play better. In some games the AI has a clear win on the next move, yet it keeps moving other, unnecessary pieces. This seems to happen more often in the endgame.

I use two depths because (as you can see from the code above) I do the following:

1. I check whether the current depth is the maximum depth (maximum being that of the current iteration of the iterative deepening) and, if it is, I exit the function, and

2. if the current depth == 0, then the search is at the root and I need to save the move too.

I don't know how I can eliminate one depth variable.

You should probably write a separate search function for the root, where you can implement iterative deepening, time control, returning a move, informing the user of the progress of the search...
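
Something like this, as a rough sketch (all the names here, timeIsUp() included, are made up to show the shape, not your actual API):

public Move searchRoot(int depthLimit)
{
	Move best = null;

	//Iterative deepening: one complete fixed-depth search per iteration
	for (int depth = 1; depth <= depthLimit; depth++)
	{
		Move bestThisIteration = null;
		int alpha = -INFINITY;

		for (Move move : sortMoves(board, 0, false))
		{
			board.make(move, true);
			int val = -negaScout(1, depth, -INFINITY, -alpha);
			board.undo(move, true);

			if (val > alpha)
			{
				alpha = val;
				bestThisIteration = move;
			}
		}

		//Only trust the result of a completed iteration
		best = bestThisIteration;

		//Time control belongs here, not inside negaScout
		//(a real engine would also abort mid-iteration)
		if (timeIsUp())
			break;
	}

	return best;
}

This also removes the need for the `if (curDepth == 0)` special case inside negaScout.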

I have nothing against keeping two depth variables (although my minimax searchers don't typically need both). What I was saying is that the depth stored in the hash table should be the depth left from here to the end of the tree, not the distance from the root.

EDIT: Oh, and you should definitely fix odd behaviors (like not winning right away when it is possible) before you add transposition tables, because transposition tables are hard to get right and make things harder to debug. You may need to make the winning score be something like (LARGE_NUMBER - length_of_the_game).
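
At the spot where you currently return board.getScore() for a won position, that could look something like this (WIN_SCORE is a made-up constant, much larger than any positional score, and I'm assuming getScore() is already relative to the side to move, as negamax requires):

if (board.getWon() != 0)
{
	int sign = (board.getScore() > 0) ? 1 : -1;
	//A win 2 plies from the root now beats the same win 5 plies away,
	//and a loss gets postponed as long as possible
	return sign * (WIN_SCORE - curDepth);
}
if (curDepth >= maxDepth)
	return board.getScore();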

What's the best way to fix the odd behaviors? Playing around with the values of the evaluation function?

Also, should the depth used for the killer moves be the depth left (like the TT) or the current depth?

The best way to fix the odd behaviors is to eliminate bugs, which you seem to have. The best way to do that is to start with as simple a program as possible and make sure that everything is working great. You then add sophistication one step at a time, making sure you didn't break anything.

It is very hard to find the bugs in a program that has lots of things coded in and where you have no idea which ones are correct and which ones aren't.

One thing to look for when the AI is not taking the immediate win is whether the evaluation function discriminates between depths when it finds a win. For instance, if one win is found at depth 2 and another at depth 5, it might choose the one at depth 5, which looks odd to a human, even though the win is still forced.
