Heuristic for minimax - Board Game 'Odd'


I am implementing an AI player for this board game.

I am using alpha-beta minimax with memory (MTD(f)) for the search, but I am having trouble finding a good heuristic for the evaluation function. Even if I use reinforcement learning instead, I still need a good set of features to represent board states, since the state space is very large. How should I approach this problem?

Thanks

That looks like a very tricky problem. I am not even certain minimax is the right approach; perhaps Monte Carlo methods would work better.

Are there people that are good at playing it? Any strategy guides?

If it was me I'd take the dumb approach and do this:

  1. If it's possible to add a stone to a bunch that turns it into a group and moves towards the player's goal (i.e. an odd number of groups for player 1, an even number for player 2), do it.
  2. Else, if it's possible to add a stone to a bunch that prevents it from becoming a group that would count towards the opponent's goal, do it.
  3. Else, just place a random stone and hope it helps (you may want to put it next to an already placed one if possible).

Yeah, that's really dumb, but I imagine it could get reasonably far, especially given how easy it is for the AI to check those first two conditions. You may want to keep this as an "easy" AI if you can find something better =P
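For what it's worth, here is a minimal Python sketch of those three rules. The board helpers (legal_moves, creates_group_for, adjacent_to_stone) are hypothetical placeholders for whatever your board representation actually provides:

```python
import random

def choose_move(board, player):
    # Rule-based move selection for Odd, following the three steps above.
    # Assumed (hypothetical) helpers:
    #   board.legal_moves() -> iterable of placements
    #   board.creates_group_for(move, p) -> True if the placement completes
    #       a group counting towards player p's goal
    #   board.adjacent_to_stone(move) -> True if the cell touches a stone
    moves = list(board.legal_moves())
    opponent = 3 - player  # players are numbered 1 and 2

    # 1. Complete a group that moves towards our own goal.
    for move in moves:
        if board.creates_group_for(move, player):
            return move

    # 2. Block a placement that would complete a group for the opponent.
    for move in moves:
        if board.creates_group_for(move, opponent):
            return move

    # 3. Otherwise prefer a cell next to an already placed stone.
    near = [m for m in moves if board.adjacent_to_stone(m)]
    return random.choice(near or moves)
```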

EDIT: fixed the second step (derp)

Don't pay much attention to "the hedgehog" in my nick, it's just because "Sik" was already taken =/ By the way, Sik is pronounced like seek, not like sick.

Wow, nice problem to solve. I think minimax will certainly be useful as you approach the end of the game, for two main reasons. First, the branching factor drops to numbers minimax can handle well. Second, there is an opportunity to use quiescence search: many forcing moves become apparent when there is a chance to block the merging (or the creation, in a limited space) of groups, which I assume becomes very important near the end of the game.
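To illustrate, here is a rough negamax-style quiescence sketch in Python; evaluate and board.forcing_moves are assumed, game-specific hooks (the forcing moves here would be those blocking/merging placements):

```python
def quiescence(board, alpha, beta, player, evaluate):
    # Extend the search past the nominal depth limit, but only along
    # 'forcing' moves, so the evaluation is taken from a quiet position.
    stand_pat = evaluate(board, player)
    if stand_pat >= beta:
        return beta
    alpha = max(alpha, stand_pat)

    for move in board.forcing_moves(player):  # assumed helper
        board.play(move)
        score = -quiescence(board, -beta, -alpha, 3 - player, evaluate)
        board.undo(move)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha
```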

I have one thought about this game that might be relevant: the game is played on a graph, in the sense that the geometry of the hexagonal pattern (e.g., what's aligned) does not matter; it could be played on any other graph just fine. Towards the end of the game, small groups that cannot grow because they are totally blocked by the other color are irrelevant and can be removed from the internal description of the board. Similarly, large groups are equivalent, whether they have 5 stones or 15. This allows you to simplify the graph structure towards the end of the game. There are more simplifications available, like identifying "pass" cells (empty cells where you can prove that it won't matter whether they end up being black or white) and ignoring what they attach to; I believe only the parity of the number of "pass" cells on the board really matters.
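A sketch of what that simplification might look like; the data representation, the group-size threshold, and the is_pass_cell predicate are all assumptions for illustration:

```python
GROUP_THRESHOLD = 5  # stones needed for a cluster to count as a group (assumed)

def is_pass_cell(cell, groups):
    # Proving that a cell's final color cannot matter is game-specific;
    # a conservative placeholder returns False (i.e., no simplification).
    return False

def simplify_position(groups, empty_cells):
    # `groups` maps group id -> (color, size, frozenset of adjacent empty
    # cells); `empty_cells` is the set of empty cells on the board.
    simplified = {}
    for gid, (color, size, liberties) in groups.items():
        if not liberties and size < GROUP_THRESHOLD:
            continue  # totally blocked and too small: can never become a group
        # Beyond the threshold, the exact size is irrelevant: cap it.
        simplified[gid] = (color, min(size, GROUP_THRESHOLD), liberties)

    # Keep only the parity of the 'pass' cells, per the argument above.
    pass_cells = [c for c in empty_cells if is_pass_cell(c, simplified)]
    return simplified, len(pass_cells) % 2
```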

If you explore a position using minimax and you get a winning or losing score, you can store the simplified graph of the position (or some hash of it) and its score. Over time you can build a database of endgame positions that way, and then you can query this database in the search. If the graphs simplify as much as I think they will, this should allow you to deduce the true score of positions from pretty early on in the game.
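A minimal version of such a database, assuming the simplified graph can be serialized deterministically (e.g., a dict with sortable keys, as in the sketch above):

```python
import hashlib
import pickle

endgame_db = {}  # canonical key -> proven score

def canonical_key(simplified_graph, parity):
    # Deterministic serialization of the simplified position.
    data = pickle.dumps((sorted(simplified_graph.items()), parity))
    return hashlib.sha256(data).hexdigest()

def store(simplified_graph, parity, score):
    endgame_db[canonical_key(simplified_graph, parity)] = score

def probe(simplified_graph, parity):
    # Returns a proven score if this simplified position was solved before.
    return endgame_db.get(canonical_key(simplified_graph, parity))
```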

Right now I am using iterative-deepening alpha-beta minimax with memory (MTD(f)), with Monte Carlo search as the evaluation function, and it is performing better than both my earlier minimax algorithm and a basic Monte Carlo search.

The algorithm performs like a Monte Carlo search in the beginning: the search depth stays in the range 1-4 (the branching factor is ~122, and evaluating leaf nodes with Monte Carlo search takes time). But since node evaluations are stored in a transposition table, they guide minimax later, when the search can go deep towards the end of the game. I am trying to find a good combination of Monte Carlo and minimax, especially since the current implementation suffers from the horizon effect and has a very small search depth at the beginning.
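For reference, the standard MTD(f) driver (following Plaat's formulation) looks like the sketch below; alpha_beta_with_memory stands in for an alpha-beta search that stores results in a transposition table:

```python
def mtdf(root, depth, first_guess, alpha_beta_with_memory):
    # Converge on the minimax value with a sequence of zero-window
    # alpha-beta searches that share a transposition table.
    g = first_guess
    lower, upper = float("-inf"), float("inf")
    while lower < upper:
        beta = g + 1 if g == lower else g
        g = alpha_beta_with_memory(root, beta - 1, beta, depth)
        if g < beta:
            upper = g  # failed low
        else:
            lower = g  # failed high
    return g

def iterative_deepening(root, max_depth, alpha_beta_with_memory):
    # Each depth reuses the previous value as the first guess.
    guess = 0
    for depth in range(1, max_depth + 1):
        guess = mtdf(root, depth, guess, alpha_beta_with_memory)
    return guess
```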

I am thinking about the following approaches to improve my player:

- Improve the evaluation function (Alvaro's suggestions sound reasonable) and maybe use quiescence search to mitigate the horizon effect.

- Implement my AI as a pure Monte Carlo search (its aheuristic nature might suit this game); see the sketch below.
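For the second option, a flat Monte Carlo evaluation is just averaged random playouts; the board helpers (is_terminal, legal_moves, play, undo, winner) are hypothetical:

```python
import random

def monte_carlo_value(board, player, playouts=200):
    # Play `playouts` random games to the end and return the fraction
    # won by `player`. Assumes board.play() handles whose turn it is.
    wins = 0
    for _ in range(playouts):
        history = []
        while not board.is_terminal():
            move = random.choice(list(board.legal_moves()))
            board.play(move)
            history.append(move)
        if board.winner() == player:
            wins += 1
        for move in reversed(history):  # restore the position
            board.undo(move)
    return wins / playouts
```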

