Turn based, complete information game AI

Started by
13 comments, last by alvaro 12 years, 1 month ago
I'm developing a game for android, and now i came to the AI part of it. I never wrote AI (previous application didn't need it), i tried to read what i found and thought it will be useful.

Since i'm developing for android, i'm forced to use java (yes, i can use the ndk, but it is far too much effort for the needed results), i'm memory bound and the less i use the cpu the better(i don't want my app to drain all the battery).

I'm first developing a 1 vs 1 situation, but i would like to improve it to be a X vs Y in future(not strictly related to this topic, but please keep this in mind when responding. I prefer expandable and maintainable code overs strict efficiency).

I have D = A + B + C entities(the real player controls A entities, the enemy controls B, and there are C unallocated entities). A player wins when all the opposite entities are killed, so even with 1 remaining entity, if the opposite player doesn't control any more, the player wins.

Since an entity can perform 2 actions, one on the enemy entities and one on the unallocated entities, so each choice/node has a branching of B + C
This leads to a big growth of a possible choices-tree.

In addition, in my game cyclic states can appear (cyclic state = you are in state A. You and the enemy do a some actions, and then you are again in state A). I would like to avoid this, even letting the AI do not-the-best-suited-move.

In addition i would like the AI not always to perform the best guessed action, but to randomly choose with a given distribution between the best e.g. 5 possible moves(obviously i need to find the 5 best possible moves, i can code the choice on my own smile.png ): predictability kills fun.

And the AI will have a maximum set of time to make its choice(that thus will probably be not the best one, but the better it can find in the give time).

I read about minmax and alpha-beta pruning, but i don't know if it can be the best algorithm for my case, if it is memory or processor hungry, if it can provide the best X moves and not only the best one, and it can be extended to work with more player..
In addition i don't know if it is heavier to reconstruct the tree each turn or to keep it in memory..

Useful resources, maybe with implementations details or code, would be really appreciated

A possible tree
http://www.pasteall.org/pic/show.php?id=25213
Advertisement

I read about minmax and alpha-beta pruning, but i don't know if it can be the best algorithm for my case, if it is memory or processor hungry, if it can provide the best X moves and not only the best one, and it can be extended to work with more player..
In addition i don't know if it is heavier to reconstruct the tree each turn or to keep it in memory..

Minimax is not memory hungry, although lots of memory can be used in the form of a transposition table to speed up the search (how effective this is depends a lot on the game). About CPU usage, the more you give it, the better it will play. But this will be the case pretty much for any algorithm you use.

One important thing to know about minimax is that you don't need to keep the tree in memory at all: You simply compute the moves from the current node and explore them one at a time, generating moves as you need them and forgetting them automatically when you return from your search function. If at all possible, you should have a single board on which you can perform moves and undo them as you search. The tree that you explore is only a mathematical construct, not an actual data structure.

About returning multiple moves, what you would probably do is add some random term to your evaluation function, which will provide variety to your bot's game.

Other than minimax, you should probably also look at Monte Carlo Tree Search (MCTS), which has been tremendously successful in computer go, compared to any other approach. MCTS can be applied to a much larger class of games, including games with dice, games with more than 2 players and games where players make simultaneous decisions. It is very different from minimax in nature, in particular because you don't need an evaluation function at all: You simply play games (known as "playouts") picking moves at random from the current position and accumulate statics on who wins. In order to make this basic idea work, you need to add some form of search (UCT is a good algorithm to start with) and you may have to bias the random moves in the playouts (at the very least so that games end in some reasonable amount of moves). Biasing the random moves in the playouts is the main way you can incorporate knowledge into your bot, while in minimax the evaluation function is where you do that.

There is probably less information online about MCTS than about minimax, since it is a much more recent method. However, it is so easy to get something basic working that I think I would start there. If you have trouble getting started, I can help you.
Thanks for the reply, i'll start searching papers and code for it!
I just need to clarify 1 point:
Should i provide a "database" with the history of the played match (playouts) with my application in order to not have a AI totally dumb the first few times the game will be played?

What a pity that multithreading in android is quite useles(very few devices have multiple core cpus), i know that Monte Carlo algorithms usually are good in mulithreading.

Additional informations: the board is not fixed. Neither the number of elements. It will change for each "level". In addition different entities may have different effects (for example an entities may do the double of the standard damage). Is MCTS still suited?
As I said, MCTS can handle a very large class of games. The only feature of a game that would make it very tricky to use MCTS is hidden information (i.e., cards).

Can you post the rules? Or perhaps a slightly simplified version of them, but such that the essence of the game is still captured? And a sample level?
Rules:

Player1 starts with X entities.
Player2 starts with Y entities.
Each entity control one and only one location.
There are Z empty locations.

In player1's turn, each of the X entities can choose one between:
-attack one of the Y entities(and thus make the enemy entities value increase by the attacking value)
-if its value %2 ==0 it can split itself into an empty location.

Then it is player2's turn, that can act like player one.

When the value of an entity is bigger than a given threshold, the entity dies and leave the controlled location empty.
When a player lose all of his entities, it dies. The last alive player wins

In the current level i'm working on i have 3 players: yellow, blue and green. But since yellow and blue are on ally, i think it is no different from having 2 players.

I can post you an example of a singe game 1vs1, with 2 starting entities each other and no starting empty post.

Here the threshold is 4. There is a sign near the player who have to take the turn. I didn't paint all the possible actions, only some to show how it works (actually, i also avoided to "conquer" the locations left empty by enemy entities, but this is a valid move)
http://www.pasteall.org/pic/25248
(in the top right branch, player1 will lose)

As you can see, often a good move is not to let the other split (so keeping it with a odd value)
I see. And what is the largest board configuration you have in mind?
Regardless of which technique you use, you'll need a class that represents the board, with methods like generate_moves and make_move. Did you write that already?

Once you have that, you can easily generate random playouts from the initial configuration. Check if those playouts finish in some reasonable time (in some games -like go- you need to disallow certain types of really bad moves for the playouts to even finish). If that's the case (and I suspect it will be), or if you can identify the really bad moves that make the playouts last forever, I think MCTS should work just fine.
i'll never go over the total 20 locations, but usally it will be around 10, not many more.

I didn't wrote it already, but i suspected that it would be needed.
I already have a class that describes the starting board, i create it after parsing my level with the parser i wrote, and i use it to populate the scene, so this is not a problem.
I can write a memory optimized one that allow to undo the moves if this is needed by the algorithm. But i didn't understand what you mean with generate_move.

For the infinite play moves i prefer the AI to make random decison rather that turn into a cyclic situation, so maybe is possible to check if the state was already visited, and if so, the algorithm chose a random move, or not-the-best-move, just to escape the loop.

So actually a "database/tree" of moves is needed? Maybe it is possible to collect the match played by the players, and let them upload them, then i can merge them and improve the ai for the next releases?

It is possible to set a difficulty variable( the more it is low, the more the algorithm choose random moves instead of good ones)?

Btw i'll read this papers as soon as i find time
http://www.mcts.ai/?q=mcts
http://teytaud.over-blog.com/article-35709049.html
http://www.ru.is/faculty/yngvi/pdf/WinandsBS08.pdf
You'll only need the ability to undo moves if you use minimax. For MCTS, making moves forward is enough. By `generate_moves' I mean a function that creates a list of the available moves in a particular position.

You will not need a database of moves, at least not to get the basic idea working. The UCT algorithm does grow a tree in memory; it's actually fairly easy to implement, but you don't even have to do that to get something working.

As a first version of your bot, make something that explores 1000 times each possible move, finishing the game by playing random moves from there. Then pick the move that won the most times. Once you have that working, you can improve it by using an algorithm called UCB (Upper Confidence Bounds) to explore promising moves more often. Then you can use UCB not only for the current position, but also for positions that have appeared several times in the search, and that's what UCT is (Upper Confidence bounds applied to Trees).

MCTS is a bit random by nature, and you can make it weaker and more random by playing fewer playouts. That's the easy part. smile.png
Ok, here what I understood.

1) the algorithm "start". Nothing is in memory and it doesn't "know" anything.
2) the algorithm is initialized :. the root node and the board (that currently represent the starting situation) are created.
3) the current node calls the board, that returns a list of all possible moves. For each move, the algorithm create a child of the current node.
4) a move/child node is chosen at random, the board is updated to represent that move.
5) repeat steps 3-4 until the board declare a winner.
6) go up the tree, from the winning step until the root, and update all the needed values of the node (from what i understood, wins and visits).
7) set the root node as current node and update the board to represent that state.
8) repeat 3-4-5-6-7 as many times as you wish (the more you repeat, the smarter the AI will be)

So the tree is created.
Now the player can make his move. The tree is traversed move by move(either the real player and the ai moves are taken in account, so that the current node always represent the current state), and the AI makes the right move by comparing the values of all the child nodes of the tree.

So, actually, all the teaching smartness is in the board, that should not chose moves at random but with a bit of heuristic and should not report all the possible moves, but only the "smart ones"(i'm not sure of this, since if the player will make a "stupid move", the tree will have a totally missing branch until a known situation is reached).

I understood this (and already implemented my board that does all is needed), but i'm not really able to join my knowledge with the code i find on the net.

For example this
http://www.mcts.ai/?q=code/simple_java
uses UCT (you can see uct value formula in select()). But i don't know where the board is saved, if it is needed to stay even after the tree is constructed or only when building it, where i should plug my board in that code, etc etc etc.
Long story short, i'm not sure of implementation details. Can you please provide me some code (even pseudo code), or some example? I found even this
http://senseis.xmp.net/?UCT%2FDiscussion
but they say the code is not complete/fully functional/take in account something different form the real UCT.

This topic is closed to new replies.

Advertisement