Designing AI for Quoridor

Started by
34 comments, last by Wizuriel 11 years, 5 months ago
I was thinking that this shouldn't be that big of a problem given a decent A* implementation. Brute forcing it shouldn't be that big of a deal. *shrug*

Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play

"Reducing the world to mathematical equations!"

Advertisement

I was thinking that this shouldn't be that big of a problem given a decent A* implementation. Brute forcing it shouldn't be that big of a deal. *shrug*


I am not sure what you have in mind. This is a two-player game, so some form of tree search is probably required to play it well. A* won't do much more than compute how far away you are from the goal right now, which I do many many times in a search.

Which reminds me... I didn't even use A* for pathfinding! I started with some dead simple find-all-distances-to-goal-and-pick-the-number-where-the-pawn-sits method, and I didn't bother substituting it by something faster. I could compute the distance-to-goal map for each player at the root (or at nodes with high enough depth left) and use them as the heuristic in A*. Since walls only ever get added, that is an admissible heuristic (i.e., an underestimate of the true distance). Maybe I'll try that tonight.
Now I'm really curious what I did wrong for mine to take hours to do depth 4 searching.

I've tried shrinking the board down to 5x5 with fewer walls to test it and I'm positive my algorithms are all working correctly (if not quickly).

In a nutshell what I do is send the current view to my function.
-Try to move pawn each direction (check if way is blocked by a wall, if not you can move, if the node your moving has the opponent I get his move directions and add then to yours (since opponent square won't see your piece as his it returns that your square is not valid so doesn't infinite loop).
-Try to place wall on each square
-When placing a wall I check that it doesn't hit another wall or go between a wall (My wall pieces have a start and end point making it pretty easy to tell the difference between 1 wall and between 2 walls which would be legal)
-If wall position doesn't hit a wall do A* search on both player and opponent. I do this until I find a path for each of them (so min of once per pawn, max of board length), if both still have exits the wall is good.

run all those children in the alpha beta function
Make sure you are not doing any dynamic memory allocation at all. I don't know how feasible this is with languages like C# and Java, but in C or C++ it's fairly straight forward.


-When placing a wall I check that it doesn't hit another wall or go between a wall (My wall pieces have a start and end point making it pretty easy to tell the difference between 1 wall and between 2 walls which would be legal)

Ooops! I guess I am implementing that rule wrong. I don't allow a wall that crosses something that blocks perpendicularly, whether it's a single wall or the place where two join. I don't think that should matter a whole lot, though. More work for tonight.

[quote name='IADaveMark' timestamp='1346852031' post='4976812']
I was thinking that this shouldn't be that big of a problem given a decent A* implementation. Brute forcing it shouldn't be that big of a deal. *shrug*


I am not sure what you have in mind. This is a two-player game, so some form of tree search is probably required to play it well. A* won't do much more than compute how far away you are from the goal right now, which I do many many times in a search.
[/quote]
Proper placement of walls is based on the combined ideas of

  • Lengthening the other player's path
  • Not lengthening your own
  • Not doing an illegal move that blocks a complete path

Obviously, the longer you can make his path without affecting your own, the better. This is where A* comes in. You want to know which of your potential wall placements will do the most damage without being illegal or making your own situation markedly worse. That is the fitness of your decision.

Now you can certainly try to do it all the way to the end (i.e. full tree search to leaf nodes) but the branching factor does get annoying. The good news is that there are a finite number of walls (20) so that while the maximum depth may be deeper than that (you don't always have to place a wall), there are only 20 plys where you really have significant branching because of a new wall.

Even searching a few layers in (which is what most human players would do anyway) will provide enough insight to know if/when/where to place walls.

Again, having a screaming A* routine expedites the search through the possibility space.

Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play

"Reducing the world to mathematical equations!"


Obviously, the longer you can make his path without affecting your own, the better. This is where A* comes in. You want to know which of your potential wall placements will do the most damage without being illegal or making your own situation markedly worse. That is the fitness of your decision.


Yes, that's effectively a depth-1 search in my program. It's already an interesting opponent, but depth-2 is probably the minimum level at which it is not very easy to beat.

I haven't profiled my program, but I think you are right: Having a really fast A* implementation will probably speed things up a lot.

By the way, the large branching factor is an issue, but it's not a deal breaker for alphabeta. In games like go the dealbreaker is that there is no obvious evaluation function to start with. Luckily, the difference in distance to goal between the players serves that role nicely.

Make sure you are not doing any dynamic memory allocation at all. I don't know how feasible this is with languages like C# and Java, but in C or C++ it's fairly straight forward.


Right now everything I do used dynamic memory allocation and truth be told my classes aren't memory efficient (especially for debugging I like tracking some extra information to make sure stuff is working), but would that really be the difference between you're alpha beta pruning taking seconds and mine taking hours?

Right now everything I do used dynamic memory allocation and truth be told my classes aren't memory efficient (especially for debugging I like tracking some extra information to make sure stuff is working), but would that really be the difference between you're alpha beta pruning taking seconds and mine taking hours?


Of course, you might have a bug somewhere. But needless memory allocation can easily result in a factor of 100 slowdown. The problem is that everything in an alpha-beta searcher is part of the "inner loop", in some sense. Keeping an eye on performance is very important in this context.

I'll add some node counts to my code so we can compare mine and yours and see if I am just getting more nodes per second. It could be that I am also visiting many fewer nodes than you are, which would probably indicate a problem with your alpha-beta implementation.

[quote name='Wizuriel' timestamp='1346870816' post='4976938']
Right now everything I do used dynamic memory allocation and truth be told my classes aren't memory efficient (especially for debugging I like tracking some extra information to make sure stuff is working), but would that really be the difference between you're alpha beta pruning taking seconds and mine taking hours?


Of course, you might have a bug somewhere. But needless memory allocation can easily result in a factor of 100 slowdown. The problem is that everything in an alpha-beta searcher is part of the "inner loop", in some sense. Keeping an eye on performance is very important in this context.

I'll add some node counts to my code so we can compare mine and yours and see if I am just getting more nodes per second. It could be that I am also visiting many fewer nodes than you are, which would probably indicate a problem with your alpha-beta implementation.
[/quote]

Would be appreciated.

Going off my code using standard board and players starting in the middle
P1) initial position has 131 children. Assume player 1 moves
P2) initial position (based off player 1 moving) has 131 children
P1) First child from the 131 children returns 132 more children
P2) P2's first child also has 132 children

This first branch assumes both players are just moving directly towards their desired goal

So yeah I should find a way that you can't move backwards for the children (by backwards I mean to a position already done by one of the earlier children). Easiest way I can think of would be to have each view have a link back to it's parents and make sure the new view is not equal to any of the parents. Would the extra memory usage be worth the less nodes to view?
One important thing that helps performance is that I don't make any copies of the board. There is a single board (I pass references to it around, but it could actually be made global), and the search function makes moves on it and then undoes them carefully. Here's my actual search function:
int negamax(Board &b, int alpha, int beta, int depth) {
int current_score = b.get_score();
if (depth <= 0 || abs(current_score) >= 400000)
return current_score;

Move moves[132];
int n_moves = b.generate_moves(moves);
int old_pawn_pos = b.get_pawn_pos(); // I need to remember where the pawn for the side to move was, so I can return it there if I move it.
for (int i=0; i<n_moves; ++i) {
b.make_move(moves);
int score = -negamax(b, -beta, -alpha, depth-1);
b.undo_move(moves, old_pawn_pos);
if (score > alpha) {
alpha = score;
if (score >= beta)
break;
}
}
return alpha;
}


Once I add move sorting and transposition tables it probably won't be as readable.

As you see, the array of moves is on the stack (moves are small) and I don't need to do any allocation anywhere.

This topic is closed to new replies.

Advertisement