Designing AI for Quoridor

Started by
34 comments, last by Wizuriel 11 years, 5 months ago

Are you pruning the search space using common sense assumptions? For example, taking advantage of left/right reflection symmetry, ignoring illegal wall combinations, ignoring wall combinations which will never improve the situation, etc? And of course there's applying heuristics so you search the more likely options first, e.g. putting a wall behind the opponent is unlikely to help your cause.


I'm trying to biggrin.png I do make sure all wall combinations are legal and my first children are always moving forward (followed by other movement options) since getting closer to the goal should always give a good score (I think).

As for applying common sense to try and prune more wall options, that I'm not sure how to do. I am going through every valid move, because I couldn't think of a logical way to try and find good spots for defensive walls to help you. I'm literally starting at square zero, checking what walls I can place there and if it is legal making it a new child and going to the next square. Also you might be surprised how much putting a wall behind your opponent can help

What I'm actually going try and see if it helps if make the computer just move on the first 3 turns. If they are both in the middle hopefully can prune some results faster as more walls are played instead of just movement. Yeah if no walls are placed and pawns just move you have 128 legal wall placements and 4 movement options.
Advertisement
Pruning the search space would best be done by biasing the search to include stuff that is close the the current player position and close to the goal. Summing 2 influence maps would give you a hierarchy of which places to test first.

Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play

"Reducing the world to mathematical equations!"

Just in case I'm misunderstanding the wall placement, the way I calculate the number is:

A line contains 9 possible wall positions.
There are 8 lines down (I assume the outer lines aren't included).
The same applies the other way.

This means 9 * 8 * 2 possible wall positions, which is 144. If you apply left/right symmetry, this cuts it down to 72 possible wall positions. This seems more manageable to me. Please correct me if I've misunderstood the rules. I also wonder whether checking if the wall positions are valid is slowing you down a lot, as it's in the inner loop. Try removing that check (apart from "is a wall already there") just to see if it is a large factor on speed.
PS, I wonder whether some results could be cached and re-used somehow. If they moved, then the walls are the same time as when you last calculated, and if they built a wall then their position is the same as when you last calculated.
The middle point of a wall can be placed in 8x8 different places, and then you pick whether the wall is vertical or horizontal, so 128 options. Left/right symmetry goes out the window after a few moves, so it's not terribly important. Checking if the wall position is valid is trivial and should be done.

Move ordering heuristics that should work well include killer heuristic and history heuristic; find information about them in chess programming websites.

You can implement a hash to identify positions that can be reached in different ways (but that probably won't matter until your depth goes beyond 4 ply).

You can apply reductions for unpromising moves (e.g., moving away from the goal, or placing a wall in such a way that it obstructs you more than it does the opponent).

It might take a little work, but you should be able to search significantly deeper than 3 moves.
Here's a thought. If you pre-process the board each turn to determine the number of moves to the goal from each square, after a move is made, you only need to determine what boxes are affected by the new move. If you can get from your current location to one of the unaffected boxes, you only need to add the number of moves from that box. That is, you don't need to pathfind all the way across the board for each test -- only to an unaffected square. The trick is, how do you determine which squares have been affected?

Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play

"Reducing the world to mathematical equations!"

Okay so just not sure if I'm doing this right then. I'm trying to use negamax with alpha beta pruning to find my move. What I have so far

getbestmove: This is the function I initially call to try and find my best move. I get all the possible moves based off the current players position. I than loop through them all and run them through the negamax function to evaluate how good of a placement they are.
[source lang="csharp"]
public static List<BoardView> GetBestMove(BoardView myview, int depth, int color)
{
var bestMoves = new List<BoardView>();

double highestScore = int.MinValue;
double tmpScore;
Boolean first = true;
List<BoardView> validmoves = new List<BoardView>();
//getting all valid children based off the current players position
validmoves.AddRange(myview.getChildren(color));
for (int x = 0; x < validmoves.Count; x++)
{
BoardView move = validmoves[x];
//doing minimax tree / negamax alpha beta pruning (tree?) for upcoming potential moves to evaluate how well this first move way
if (first)
{
tmpScore = NegaMax(move, int.MinValue, int.MaxValue, depth, -color);
first = false;
}
else
tmpScore = NegaMax(move, highestScore, int.MaxValue, depth, -color);
if (tmpScore > highestScore)
{
bestMoves.Clear();
bestMoves.Add(move);
highestScore = tmpScore;
}
else if (tmpScore == highestScore)
{
move.board.myscore = highestScore;
bestMoves.Add(move);
}
}
return bestMoves;
}
[/source]

[source lang="csharp"]
private static double NegaMax(BoardView myview, double alpha, double beta, int depth, int color)
{
if (depth <= 0 || myview.currentplayer.checkWin())
return myview.getScore();
List<BoardView> validmoves = new List<BoardView>();
validmoves.AddRange(myview.getChildren(color));
foreach (BoardView child in validmoves)
{
double temp = -NegaMax(child, -beta, -alpha, depth - 1, -color);
if (temp > alpha)
alpha = temp;
if (alpha >= beta)
break; // cut-off
}
return alpha;
}
[/source]

and my boardview class. This just pretty much holds the board and 2 players. The constructor takes either an old view or a board and 2 players. It than makes deep copies of the objects.

swap: just swaps the current and opponent references

getChildren: Returns a list of all possible plays based off the current players position on the board. It first checks for movement for the current player than wall positions. For wall positions it first checks if the current player is between the opponent and the goal and which direction the opponent is going. This should create 4 scenarions
i) Opponent is going towards start of the board and I'm between my opponent and the goal spaces. The wall logic will start placing walls behind me going from my position > 0 ; maxsquare > position
ii) Opponent is going towards start of board and I'm not between my opponent and the goal. Same thing as above but I start placing the walls behind the opponent
iii & iv) same thing but replace start of board with end of board and the logic is position > maxsquare; 0 > position

Getscore: I use an A* search to find the shortest path for me to a goal square and my opponent and his goal squares. The main weight is based off the distance to the goal with some weight on bad wall positions. While I don't think this equation is going get great defensive wall positions (see picture posted earlier in this thread). In theory moving away from the goal or bad wall placements should return bad scores resulting in those paths getting cut off.


[source lang="csharp"] public class BoardView
{
public Board board { get; set; }
public Player currentplayer { get; set; }
public Player opponent { get; set; }
//public BoardView parent { get; set; }
//makes a clone of an old view
public BoardView(BoardView oldview)
{
//parent = oldview;
board = new Board(oldview.board);
currentplayer = new Player(oldview.currentplayer);
opponent = new Player(oldview.opponent);
}
//makes a new board view by combining 2 players and a board
public BoardView(Board myboard, Player p1, Player p2)
{
board = new Board(myboard);
currentplayer = new Player(p1);
opponent = new Player(p2);
}
//swap the players
public void swap()
{
Player temp = currentplayer;
currentplayer = opponent;
opponent = temp;
}

/*
* returns a list of all avilable moves from a given view
* color = 1 = current player
* color = -1 = opponent
*/
public List<BoardView> getChildren(int color)
{
List<BoardView> children = new List<BoardView>();
//getting all movement views from the current player
GameSpace myspace = board.grid[currentplayer.position];
List<GameSpace> movement = new List<GameSpace>(AStar.neighborsAndJumps(myspace, board, opponent.symbol));
if (currentplayer.symbol != Global.getType(color))
{
myspace = board.grid[opponent.position];
movement = new List<GameSpace>(AStar.neighborsAndJumps(myspace, board, currentplayer.symbol));
}

foreach (GameSpace mymove in movement)
{
BoardView newReality = new BoardView(this);
if (newReality.currentplayer.symbol != Global.getType(color))
newReality.swap();
newReality.board.MovePawn(newReality.currentplayer.position, mymove.SQNum, newReality.currentplayer.symbol);
newReality.currentplayer.position = mymove.SQNum;
children.Add(newReality);
}
//if the current player has walls left, find all valid moves
if (currentplayer.numberwalls > 0)
{
/*
* 0 = horz walls
* 1 = vertical walls
*/
for (int y = 0; y < 2; y++)
{
//go through and try a wall at every valid position
for (int x = 0; x < board.grid.Count; x++)
{
int pos = 0;
int temp = opponent.position;
//finding where I should start the wall placement. If my goal[0] is > size (length of board) than it means I'm heading towards the back squares
if (currentplayer.goal[0] >= Global.SIZE)
{
/*
* opponent is moving to the front squares. Check if current player or opponent is closer to the goal.
* If I'm closer start placing walls behind me, if opponent is closer place walls behind opponent
*/
if (currentplayer.position < opponent.position)
temp = currentplayer.position;
/*
* starting at the far right of the row the opponent or current player are on (depending who is closer to the goal). Keep moving left checking each square for wall placement.
* Hopefully this will quickly find a good spot to place a wall that slows the opponent and not me
*/
pos = (temp / Global.SIZE) * Global.SIZE + (Global.SIZE - 1) - x;
if (pos < 0)
pos += board.grid.Count;
}
else
{
//opponent is moving to the back squares
if (currentplayer.position > opponent.position)
temp = currentplayer.position;
/*
* starting at the far left of the row the opponent or current player are on (depending who is closer to the goal). Keep moving right checking each square for wall placement.
* Hopefully this will quickly find a good spot to place a wall that slows the opponent and not me
*/
pos = (temp / Global.SIZE) * Global.SIZE + x;
if (pos >= board.grid.Count)
pos -= board.grid.Count;
}
BoardView newReality = new BoardView(this);
if (newReality.currentplayer.symbol != Global.getType(color))
newReality.swap();
if (y == 0)
{
//if true the wall has been placed and add it to the list of vald moves
if (newReality.board.SetHorzWall(pos, newReality.currentplayer, newReality.opponent))
{
children.Add(newReality);
}
}
else
{
//if true the wall has been placed and add it to the list of vald moves
if (newReality.board.SetVertWall(pos, newReality.currentplayer, newReality.opponent))
{
children.Add(newReality);
}
}
}
}
}
return children;
}
public double getScore()
{
double score = 0;
//finding my shortest path to a goal
List<GameSpace> mypath = new List<GameSpace>(board.findPath(true, currentplayer, opponent));
//finding opponents shortest path to a goal
List<GameSpace> opppath = new List<GameSpace>(board.findPath(true, opponent, currentplayer));
score = 100 * (opppath.Count - mypath.Count) + currentplayer.numberwalls - opponent.numberwalls;
return score;
} [/source]

Edit: For some reason my negamax function won't show right with the code tag sad.png

private static double NegaMax(BoardView myview, double alpha, double beta, int depth, int color)
{
if (depth <= 0 || myview.currentplayer.checkWin())
return myview.getScore();
List<BoardView> validmoves = new List<BoardView>();
validmoves.AddRange(myview.getChildren(color));
foreach (BoardView child in validmoves)
{
double temp = -NegaMax(child, -beta, -alpha, depth - 1, -color);
if (temp > alpha)
alpha = temp;
if (alpha >= beta)
break; // cut-off
}
return alpha;
}

edit 2: Found a bug when I was calling swap. Did more checking that swapping occurs at the right time. Man that getchild function is atrocious :S
Hmm, I think there's a lot of room for optimisation. One suggestion would be as follows:
- Have the player and opponent positions stored outside the board.
- Attach the previous shortest path to a board.
- This means that any time a player moves (rather than wall-building), your board (including pathing) doesn't change, so you don't need to instantiate a new board.
- Explore moves before walls to take maximum advantage of this.
- Exploring this way you only have to re-path if the opponent steps into your pre-calculated path, you step off your shortest path, or when a wall is built that blocks your path.

The majority of the branches are wall-building related, but pathfinding is an expensive part of the inner loop to reduce. Also I would imagine that most of the walls you create would not block your own path.
Huzzah just got the alpha beta working (at least to the best I can debug/see) biggrin.png I gave up on negamax and just used a more traditional alpha/beta look.

I don't think my program is slow because of the pathing. When I generate children each wall place requires pathing to be called a min of twice (once per each player till it can find a path to one of their goal spaces). My getchildren routine is doing that in a fraction of a second. I think my programs slow speed is just the amount of states it needs to explore and evaluate (also probably doesn't help for debugging it I kept links to each state to help debug, good think memory is cheap biggrin.png).

Unfortunately even if my opponent just moves that can also reflect my best position. It is possible because of wall placements that player 1 can't place a wall to block player 2 without also increasing player 1's path. After player 2 has moved x squares though it is possible that you can now place a wall to increase just player 2's time despite both players following their optimal path.

Though now that I wrote all that I do think is player 1 path is shorter than player 2's path as long as both just move I shouldn't have to reevaluate player 1's positions. Unfortunately would need to still do so for player 2 :S

edit: Updated code
[source lang="csharp"] public static List<BoardView> GetBestMove(BoardView myview, int depth, int color)
{
var bestMoves = new List<BoardView>();

double highestScore = int.MinValue;
double tmpScore;
List<BoardView> validmoves = new List<BoardView>();
//getting all valid children based off the current players position
validmoves.AddRange(myview.getChildren(color));
for (int x = 0; x < validmoves.Count; x++)
{
BoardView move = validmoves[x];
//doing minimax tree / negamax alpha beta pruning (tree?) for upcoming potential moves to evaluate how well this first move way
//tmpScore = NegaMax(move, int.MinValue, int.MaxValue, depth - 1, -color, color);
tmpScore = alphabeta(move, int.MinValue, int.MaxValue, depth - 1, -color, color);
if (tmpScore > highestScore)
{
bestMoves.Clear();
bestMoves.Add(move);
highestScore = tmpScore;
}
else if (tmpScore == highestScore)
{
move.board.myscore = highestScore;
bestMoves.Add(move);
}
}
return bestMoves;
}[/source]

Edit: Of course this one still doesn't show
private static double alphabeta(BoardView myview, double alpha, double beta, int depth, int color, int originalplayer)
{
if (depth <= 0 || myview.currentplayer.checkWin())
{
double score = myview.getScore(originalplayer);
return score;
}
if (color != originalplayer)
{
double minscore = beta;
List<BoardView> validmoves = new List<BoardView>();
validmoves.AddRange(myview.getChildren(color));
foreach (BoardView child in validmoves)
{
minscore = Math.Min(minscore, alphabeta(child, alpha, beta, depth-1, -color, originalplayer));
beta = minscore;
if (alpha >= beta)
return minscore;
}
return minscore ;
}
else
{
double maxscore = alpha;
List<BoardView> validmoves = new List<BoardView>();
validmoves.AddRange(myview.getChildren(color));
foreach (BoardView child in validmoves)
{
maxscore = Math.Max(maxscore, alphabeta(child, alpha, beta, depth-1, -color, originalplayer));
if (alpha >= beta)
{
return maxscore;
}
}
return maxscore;
}
}

[source lang="csharp"] public class BoardView
{
public Board board { get; set; }
public Player currentplayer { get; set; }
public Player opponent { get; set; }
public BoardView parent { get; set; }

//makes a clone of an old view
public BoardView(BoardView oldview)
{
parent = oldview;
board = new Board(oldview.board);
currentplayer = new Player(oldview.currentplayer);
opponent = new Player(oldview.opponent);
}

//makes a new board view by combining 2 players and a board
public BoardView(Board myboard, Player p1, Player p2)
{
board = new Board(myboard);
currentplayer = new Player(p1);
opponent = new Player(p2);
}

//swap the players
public void swap()
{
Player temp = currentplayer;
currentplayer = opponent;
opponent = temp;
}


/*
* returns a list of all avilable moves from a given view
* color = 1 = current player
* color = -1 = opponent
*/

public List<BoardView> getChildren(int color)
{
List<BoardView> children = new List<BoardView>();
Boolean swapped = false;
/*
* checking if the current player is what it should be according to the color call
* If color = 1 player = 1
* if color = -1 player = 2
* if I swap here it changes this level not the children, so will need to swap again later (since I just want the children to move opposite the parents, not the current level to swap
* who played)
*/

if (currentplayer.symbol != Global.getType(color))
{
swapped = true;
swap();
}

//getting all movement views from the current player
GameSpace myspace = board.grid[currentplayer.position];
List<GameSpace> movement = new List<GameSpace>(AStar.neighborsAndJumps(myspace, board, opponent.symbol));
foreach (GameSpace mymove in movement)
{
BoardView newReality = new BoardView(this);
newReality.board.MovePawn(newReality.currentplayer.position, mymove.SQNum, newReality.currentplayer.symbol);
newReality.currentplayer.position = mymove.SQNum;
children.Add(newReality);
}
//if the current player has walls left, find all valid moves
if (currentplayer.numberwalls > 0)
{
/*
* 0 = horz walls
* 1 = vertical walls
*/
for (int y = 0; y < 2; y++)
{
//go through and try a wall at every valid position
for (int x = 0; x < board.grid.Count; x++)
{
int pos = 0;
int temp = opponent.position;
//finding where I should start the wall placement. If my goal[0] is > size (length of board) than it means I'm heading towards the back squares
if (currentplayer.goal[0] >= Global.SIZE)
{
/*
* opponent is moving to the front squares. Check if current player or opponent is closer to the goal.
* If I'm closer start placing walls behind me, if opponent is closer place walls behind opponent
*/
if (currentplayer.position < opponent.position)
temp = currentplayer.position;
/*
* starting at the far right of the row the opponent or current player are on (depending who is closer to the goal). Keep moving left checking each square for wall placement.
* Hopefully this will quickly find a good spot to place a wall that slows the opponent and not me
*/
pos = (temp / Global.SIZE) * Global.SIZE + (Global.SIZE - 1) - x;

if (pos < 0)
pos += board.grid.Count;
}
else
{
//opponent is moving to the back squares
if (currentplayer.position > opponent.position)
temp = currentplayer.position;
/*
* starting at the far left of the row the opponent or current player are on (depending who is closer to the goal). Keep moving right checking each square for wall placement.
* Hopefully this will quickly find a good spot to place a wall that slows the opponent and not me
*/
pos = (temp / Global.SIZE) * Global.SIZE + x;
if (pos >= board.grid.Count)
pos -= board.grid.Count;
}

BoardView newReality = new BoardView(this);
if (y == 0)
{
//if true the wall has been placed and add it to the list of vald moves
if (newReality.board.SetHorzWall(pos, newReality.currentplayer, newReality.opponent))
{
children.Add(newReality);
}
}
else
{
//if true the wall has been placed and add it to the list of vald moves
if (newReality.board.SetVertWall(pos, newReality.currentplayer, newReality.opponent))
{
children.Add(newReality);
}
}
}
}
}
/*
* If I swapped above than undo the swap at the current level
*/
if (swapped)
swap();
return children;
}

public double getScore(int color)
{
bool swapped = false;
if (currentplayer.symbol != Global.getType(color))
{
swapped = true;
swap();
}
double score = 0;
//finding my shortest path to a goal
List<GameSpace> mypath = new List<GameSpace>(board.findPath(true, currentplayer, opponent));
//finding opponents shortest path to a goal
List<GameSpace> opppath = new List<GameSpace>(board.findPath(true, opponent, currentplayer));
score = 100 * (opppath.Count - 1 - mypath.Count) + currentplayer.numberwalls - opponent.numberwalls;
if (swapped)
swap();
return score;

}[/source]

edity 2: I'm not really understanding the killer heuristic, Right now I always start with movement towards the goal (followed by the rest since it is a short list) and for my wall placement I start by placing walls between my opponent and their goal in a way that won't block me. Is there anything else I can try with that?
I put together a C++ quoridor engine last night. It typically searches depth 4 in less than a second, and it already plays better than I do. I haven't done anything about move ordering (other than at the root, where I use iterative deepening), I haven't implemented transposition tables, and I am not reducing any moves yet. I haven't tried null-move pruning either, which I think might actually work for this game. So I expect I'll get a much better search with a little bit of effort.

For evaluation I am using something like
return 20*(his_distance_to_goal - my_distance_to_goal) + 10 + small_random_perturbation;

I'll post the program in a few days, when I am satisfied with it.

This topic is closed to new replies.

Advertisement