Game of The Generals AI



Hello.

We have a board game called Game of The Generals. It's played on a 9 x 8 board, and players are free to arrange their pieces on the lower three rows according to their strategy. Players are not allowed to see each other's pieces.

http://en.wikipedia....of_the_Generals

Writing an AI for this is difficult since, unlike chess, players can't see what the opponent's pieces are. One of the ideas I thought of was assigning probabilities to each enemy piece and adjusting them as pieces reveal themselves through the game (e.g. a piece eliminating a lower-ranking piece). The good thing about this game is that pieces can only move one square per turn, which should simplify the search tree.

Any tips?

Thanks in advance.

Share on other sites
Is it a bit like Stratego?

You can create random scenarios that are consistent with the information you have so far. Then consider what would happen after each one of your moves, probably using Monte Carlo playouts (look at how recent computer go programs are implemented).

That's the first approach I would try.

Share on other sites

Yes, it is a lot like Stratego, but much simpler. All pieces move only one square per turn, and all they do is outrank each other.

I'll attempt the method you suggested. But is Minimax advisable?

BTW, I'm implementing the game in C#.

Share on other sites
Quote: Original post by mudslinger
I'll attempt the method you suggested. But is Minimax advisable?

I wouldn't know how to use minimax in a game with partial information.

Share on other sites
Can anyone please explain Monte Carlo for me? The explanations on the internet aren't doing it for me.

Thanks.

Share on other sites
Quote: Original post by mudslinger
Can anyone please explain Monte Carlo for me? The explanations on the internet aren't doing it for me.

The first thing you need is a "policy". This is a very simple rule that allows you to pick a move in a given situation, with some randomness. It doesn't have to be particularly smart: for many games, picking a purely random move is a good start. You want to make sure that games end in a reasonable number of moves, so perhaps you should make it more likely for pieces to move towards opponent pieces, so they eventually annihilate each other.

Now, given a position on the board (with full knowledge of what's on both sides), you can play the rest of the game using the policy (this is called a "playout" or a "Monte Carlo simulation"). This is a random process, and each player will have a certain probability of winning. The hope is that the probability of winning this random process is a decent evaluation function for the board.
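As a rough sketch in Python (the poster is working in C#; the `legal_moves`, `apply`, and `winner` method names here are assumptions, not from this thread), a policy and a playout could look like this:

```python
import random

def random_policy(state):
    """The simplest possible policy: pick a uniformly random legal move."""
    return random.choice(state.legal_moves())

def playout(state, policy, max_moves=500):
    """Play the game to the end using `policy` for both sides.

    `state` is assumed to expose legal_moves(), apply(move), and
    winner() (None while the game is undecided).
    """
    for _ in range(max_moves):
        if state.winner() is not None:
            return state.winner()
        state.apply(policy(state))
    return None  # treat overlong games as a draw
```

Biasing the policy towards moves that approach enemy pieces is then just a matter of weighting the random choice instead of picking uniformly.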

So the first algorithm you can try is this:
For each move M {
    Do 10,000 times {
        Generate a plausible configuration for the pieces of the opponent.
        Play M.
        Run the rest of the game using the simple policy for both sides.
    }
    Count how often you won.
}
Pick the move that won the most.
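That loop, sketched in Python (the `sample_setup` and `simulate` helpers are placeholders you would supply; `simulate` plays one move against one sampled enemy setup and reports whether you won):

```python
def choose_move(my_moves, sample_setup, simulate, n_playouts=10_000):
    """Flat Monte Carlo: score every move against many sampled enemy setups.

    sample_setup() draws a plausible hidden configuration for the opponent;
    simulate(setup, move) runs a playout and returns True if we won.
    """
    wins = {m: 0 for m in my_moves}
    for move in my_moves:
        for _ in range(n_playouts):
            setup = sample_setup()
            if simulate(setup, move):
                wins[move] += 1
    # Pick the move that won the most.
    return max(wins, key=wins.get)
```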

There are several possible refinements. Off the top of my head:
(1) Use UCB1 to sample more promising moves more often.
(2) Make the policy smarter (always room for improvement here).
(3) Bias the plausible configurations using Bayes's formula on the history of the game (this seems hard).

I would probably implement only (1) and see if the program plays well enough to be fun.
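For refinement (1), UCB1 picks the move maximizing average win rate plus an exploration bonus, so promising moves get sampled more often without starving the rest. A minimal sketch (the `(wins, plays)` bookkeeping per move is my own framing):

```python
import math

def ucb1_select(stats, total, c=math.sqrt(2)):
    """Pick the move maximizing mean reward plus an exploration bonus.

    `stats` maps move -> (wins, plays); `total` is the number of
    playouts run so far.  Unplayed moves are sampled first.
    """
    best, best_score = None, -1.0
    for move, (wins, plays) in stats.items():
        if plays == 0:
            return move  # every move gets sampled at least once
        score = wins / plays + c * math.sqrt(math.log(total) / plays)
        if score > best_score:
            best, best_score = move, score
    return best
```

You would call this once per playout to decide which move to sample next, instead of giving every move the same 10,000 playouts.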

The quality of play will depend on how many simulations you run, which means that speed will likely be important. Perhaps C# is not the best choice for this type of approach, but I wouldn't worry about it for now: Implement it initially in whatever language you are most comfortable with, and you can always rewrite it in C or C++ later if you need the speed.

Share on other sites
Quote: Original post by alvaro
(1) Use UCB1 to sample more promising moves more often.

What is UCB1?

EDIT: I just finished writing the game with an AI that plays fully random moves.


Share on other sites
Quote: Original post by mudslinger
Quote: Original post by alvaro
(1) Use UCB1 to sample more promising moves more often.

What is UCB1?

How hard have you tried to find out on your own what it is?

Share on other sites
Quote: Original post by alvaro
How hard have you tried to find out on your own what it is?

It is one of those things that is surprisingly hard to google for - only one relevant entry on the first page [smile]

Share on other sites
I did find this:
http://eprints.pascal-network.org/archive/00002713/01/nips_exploration_exploitation.pdf

While I was trying to understand it, I hoped someone could give me a simpler explanation.

Will still try to understand it, though.

Share on other sites
Quote: Original post by swiftcoder
Quote: Original post by alvaro
How hard have you tried to find out on your own what it is?
It is one of those things that is surprisingly hard to google for - only one relevant entry on the first page [smile]

It gets far better if you search for "UCB1 monte carlo".

Share on other sites
My AI just started making smart moves (e.g. protecting the flag).

Share on other sites
Quote: Original post by mudslinger
My AI just started making smart moves (e.g. protecting the flag).

Did you do it similarly to what I suggested? Can you describe what you've done so far?

Share on other sites
It's exactly the algorithm you suggested. The playout is purely random, and I make some game configurations more likely based on the history of the game (but not with Bayes' formula).

It's terribly inefficient. With 5,000 playouts, it takes a few minutes to run. I think I can speed it up.

Share on other sites
Quote: Original post by mudslinger
It's exactly the algorithm you suggested. The playout is purely random, and I make some game configurations more likely based on the history of the game (but not with Bayes' formula).

It's terribly inefficient. With 5,000 playouts, it takes a few minutes to run. I think I can speed it up.

Yes, performance tends to be an issue with this approach. Time to optimize the code. There are also sometimes simple modifications to the playout policy that will make the playouts shorter or faster to evaluate, without biasing the results too badly.

Share on other sites
Going back to Bayes, I would think that you could profile all the pieces based on a sliding belief of what you think they might be. Obviously, everything is equal at the beginning. Once you know what something is, it not only affects that piece but the others as well, since the type of piece that is now known is taken off the potential list for the others. Over time, you will hone down what you think things might be. Additionally, you can nudge things based on the other player's behavior. If he starts moving a piece aggressively toward one of your pieces that HE knows, you can assume that it is a higher-ranking piece and slide your assumptions accordingly.

For strategy, you can then work out a shallow-depth min-max that is optimized using utility theory. That is, what are you going to gain/lose over the next few moves. This can support a higher-level strategy such as attrition, a directed attack at a place where you believe the enemy's flag to be, or defending a high-value area. There are other playbook entries that can be devised such as making a hole for a spy using other higher-ranking pieces, etc.

That said, I don't think Monte Carlo is even necessary.

Share on other sites
Quote: Original post by InnocuousFox
That said, I don't think Monte Carlo is even necessary.

This is the algorithm I might try next:

For each move M {
    Do 1,000 (or more) times {
        Generate a plausible configuration for the pieces of the opponent using Bayes.
        Run minimax on M.
        Count points for possible gains and losses for move M.
    }
    Sum points for move M.
}
Pick the move with the most points.

Thanks to everyone who's been replying.

Share on other sites
Hello again,

I'm kinda lost with probabilities. I don't know how to adjust the chances of a piece being a certain piece using Bayes'. Here are the simplified rules of the game:
Each player has:

amount  type
1       flag
6       privates
1       rank 1
1       rank 2
1       rank 3
1       rank 4
1       rank 5
1       rank 6
1       rank 7
1       rank 8
1       rank 9
1       rank 10
1       rank 11
1       rank 12
2       spies
---
21      total

(1) Higher rank takes out lower rank, privates and flags
(2) Spies take out all ranks and flags, except privates
(3) Privates take out spies and flag

This means every piece has a 1/21 chance of being a flag, 6/21 of being a private, 1/21 of being rank 1, ..., 2/21 chance of being a spy.
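A minimal sketch of that prior in Python (using exact fractions so the numbers match the counts above):

```python
from fractions import Fraction

# Prior probability that any unknown enemy piece is of a given type,
# straight from the piece counts: 1 flag, 6 privates, ranks 1-12, 2 spies.
PIECE_COUNTS = {"flag": 1, "private": 6, "spy": 2,
                **{f"rank {r}": 1 for r in range(1, 13)}}

TOTAL = sum(PIECE_COUNTS.values())  # 21 pieces per player

prior = {piece: Fraction(n, TOTAL) for piece, n in PIECE_COUNTS.items()}
```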

(Note: I'm speaking in the perspective of the AI)
My questions are:

(1) How do I adjust piece probabilities if my piece Rank N takes out an enemy piece with Rank M? (N < M, N is known)
(2) How do I adjust piece probabilities if an enemy piece Rank N takes out my piece with Rank M? (N < M, M is known)
(3) How do I adjust piece probabilities if my piece Rank N takes out an enemy piece with Rank N? (equal pieces, both are taken out)

It is obvious that if a piece takes out a spy, that piece must be a private. Also, there must always be a Flag.

What I've done so far is to set ranges on enemy pieces. That is, if a piece takes out my piece of rank N, that piece is sure to outrank N. Spies are counted, so that if my private takes out a spy, the number of spies my enemy has left decreases.
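That range-tracking could be sketched like this, using the thread's numeric convention that a lower number beats a higher one (rank N beats rank M when N < M), and ignoring the spy/private/flag exceptions for brevity. The `min`/`max` dict is my own bookkeeping, not the poster's code:

```python
def update_after_challenge(piece, my_rank, my_piece_won):
    """Tighten the [strongest, weakest] rank range kept for an enemy piece.

    `piece` is a dict with 'min' (strongest possible rank number) and
    'max' (weakest possible rank number).
    """
    if my_piece_won:
        # The enemy piece lost to my_rank, so it is weaker: rank > my_rank.
        piece["min"] = max(piece["min"], my_rank + 1)
    else:
        # The enemy piece beat my_rank, so it is stronger: rank < my_rank.
        piece["max"] = min(piece["max"], my_rank - 1)
    return piece
```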

Thank you.

Share on other sites
Remember that the Bayes information is only for the unknown pieces. When you discover a piece, then you decrease the "unknown" pool by 1 and the "piece X" pool by 1.

In your example, if you discover the single rank 6 piece, you now have 20 pieces unknown rather than 21. Therefore, the percentage of anything being a rank 6 is now 0%, but the odds of a given piece being something else increase slightly because the denominator is 20 rather than 21 (e.g. from 4.7% to 5.0%).

Now, if you are holding a piece of Rank n, and you want to know the odds of you winning, you simply total up the percentage chances of all the things that it would beat (i.e. rank > n). Obviously, include the flag in this list. The same can be said for calculating the other direction. If you put a piece at risk, what are the odds that the unknown piece next to it could capture it?
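That totaling step might look like this (a sketch under the thread's lower-number-is-stronger convention; integer keys stand for ranks, and the spy-vs-rank and private-vs-spy exceptions are simplified to "rank pieces beat privates and the flag, and lose to spies"):

```python
def win_probability(my_rank, enemy_piece_probs):
    """Odds that my rank-n piece wins a challenge against one unknown piece.

    `enemy_piece_probs` maps candidate identities ("flag", "private",
    "spy", or an integer rank) to probabilities for that piece.
    """
    total = 0.0
    for identity, p in enemy_piece_probs.items():
        beats = (identity in ("flag", "private")
                 or (isinstance(identity, int) and identity > my_rank))
        if beats:
            total += p
    return total
```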

The fun comes when you start to bias pieces for other reasons. For example, if the other player knows you have a Rank 3 piece that you are advancing and a particular piece of his is running away from it, it is more likely that this piece is Rank 4+. To get really fancy, it is actually more likely that this piece is rank 5 or 6 than it is 7+. After all, he would be more concerned with saving a 5 or 6 than he would a lesser rank. However, that gets all sorts of subjective at that point.

The point is, you can tweak the percentage chance of something manually and have it cascade to other pieces. The trick is to use a fully normalized range that encompasses every piece. As you increase the likelihood of Piece A being Rank N, it necessarily reduces the odds of the other pieces being Rank N and, therefore, increases the odds that the other pieces are NOT Rank N (i.e. all the others). It's a more subtle version of going from whatever % all the way to 1.0 (i.e. I know what that piece is).
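The nudge-and-renormalize step could be sketched like this (only the per-piece renormalization is shown; the full cascade across the whole belief table is more involved, and this dict layout is my own, not from the thread):

```python
def bias_piece(beliefs, piece_id, identity, factor):
    """Multiply one piece's probability of being `identity` by `factor`,
    then renormalize that piece's distribution so it sums to 1.

    `beliefs` maps piece id -> {identity: probability}.
    """
    dist = beliefs[piece_id]
    dist[identity] *= factor
    z = sum(dist.values())
    for k in dist:
        dist[k] /= z  # keep the distribution normalized
    return beliefs
```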

So, once you have all these percentages, you can seed all sorts of things... MinMax, Markov models, Monte Carlo, etc. to determine what the "best move" should be at any given point.

(I miss anything Alvaro?)

Share on other sites
Quote: Original post by InnocuousFox
(I miss anything Alvaro?)

There are two things that you might have missed. One is that in this game it seems to be the case that the rank of the enemy piece is not revealed after a challenge, so you generally don't learn that a piece is definitely of rank 6, only who won.

The bigger thing that you (and all of us in this thread so far) are missing is a concrete algorithm that can generate an unbiased random arrangement of the pieces that satisfies certain constraints, or that respects the probabilities that we have discovered for each piece.

In theory, you could simply generate completely unbiased random distributions for the initial configuration of the pieces and give them a weight that is proportional to the product of the conditional probability of every event that has happened from the beginning of the game. In particular, distributions that are inconsistent with the results of the challenges so far would get a weight of 0, so you can just discard them. If you have a fancier Bayesian model that tells you things like "the flag moving forward unprotected is a very low-probability event", the product of conditional probabilities will take care of it.

The problem with this simple and theoretically sound method is that after a few challenges, the probability of sampling something that gets a weight larger than 0 will be vanishingly small. I believe importance sampling ("Application to simulation" in particular) might be the kind of tool needed here, but I haven't actually used it for anything other than a couple of toy problems.

I'll think about the problem some more, but it's probably tricky.

Share on other sites
Quote: Original post by alvaro
There are two things that you might have missed. One is that in this game it seems to be the case that the rank of the enemy piece is not revealed after a challenge, so you generally don't learn that a piece is definitely of rank 6, only who won.

Oops... I was thinking 2-person Stratego like when I was a kid. The only thing you can do after a battle then is say "I know this piece is greater than N." Still pares things down a bit.

Share on other sites
Just throwing my $0.02 in.

I'm not so sure Bayes with a game tree would be of much help here, given that the probabilities of knowing a piece are so small: 0.04 * 0.04 = 0.0016. A 0.16% chance of knowing a piece beyond one move is too low. This would be like trying to play poker based on card distribution, which I think we all know doesn't work. :)

Is it fair to say that all you can know about each of the opponent's pieces is a minimum value and a maximum value?

If so, have you considered using a two phased approach?

1. Doing a 1-ply search of the game space
2. A logic routine that trumps decisions from 1.

1. Start the game by assuming every one of the opponent's pieces is the most powerful piece (min = MAX_VALUE, max = MAX_VALUE). Then do a shallow 1-ply search, looking at all of your move possibilities to depth 1, and count the total number of wins/losses for each move. The best move is the one that results in the highest ratio of wins/losses. As pieces are lost/won, you will know the maximum (for you winning) or minimum (for you losing) strength of a piece and can adjust your knowledge accordingly.

2. As you begin to know what the min/max values of the pieces are, a logic routine should be able to tell you exactly what moves (if any) will result in a win, which pieces of yours are doomed, etc... Basically your logic routine will be able to make decisions using hard-coded rules.

You could also extend the search to depth 2, 3, 4, 5, etc., based on how much knowledge you have about the board (and how much knowledge your opponent has about the board; this would require you to track both players' knowledge). So at the beginning you do a depth-1 search (50% knowledge). At 60% knowledge, do a depth-3 search. At 70%, depth 4, etc.
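A toy version of step 1's scoring (the `outcomes` helper, which counts wins and losses for a move over the possible identities of the piece involved, is an assumed placeholder):

```python
def best_move_1ply(moves, outcomes):
    """Shallow 1-ply search: pick the move with the best win/loss ratio.

    outcomes(move) returns a (wins, losses) pair counted over the
    possible identities of the enemy piece that move challenges.
    """
    def ratio(move):
        wins, losses = outcomes(move)
        return wins / (losses or 1)  # avoid dividing by zero
    return max(moves, key=ratio)
```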

Throw in an opening book and you're done! Makes me want to try this out now. lol. :)

Share on other sites
Quote: Original post by willh
Throw in an opening book and you're done! Makes me want to try this out now. lol. :)

Please, do try it out. :D

Share on other sites
I sent you a PM. This is a very interesting game. It's like a mix of Clue, Chess, and Poker.

I'm willing to set up and host a webservice to broker and manage games if anyone else is interested in getting involved. I'll make sure it is simple enough for HTTP POST or GET, so no SOAP libraries necessary.

The approach I'm taking is similar to what I mentioned earlier. A question about the rules:

Do you know if there is a rule for a forced draw? Toward the end game, if someone is trying to advance their flag, it seems possible that you could run into a situation where neither player can win as long as one player keeps moving in a certain pattern.

Share on other sites
Quote: Original post by willh
I'm not so sure Bayes with a game tree would be of much help here, given that the probabilities of knowing a piece are so small: 0.04 * 0.04 = 0.0016. A 0.16% chance of knowing a piece beyond one move is too low. This would be like trying to play poker based on card distribution, which I think we all know doesn't work.

However, we don't need to know exactly what the piece is. We need to be able to estimate win/lose/tie. By adding up the percentages above and below your own (known) piece, you can estimate that.