
AI for a simple shedding card game


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

12 replies to this topic

#1 mudslinger   Members   -  Reputation: 139

Posted 10 October 2012 - 08:03 AM

Hello,

We have this game with 4 players where each player takes turns discarding cards that must be higher than the ones previously discarded. The card combinations are based on poker hands. Discarded cards are visible to everyone. The last player with remaining cards loses. I have the gameplay coded, and the AI can already list the valid combinations in its hand.

I'm planning to use Monte Carlo playouts:
For each move M {
  Do 10,000 times {
    Generate a plausible configuration for the cards of the opponents
    Play M
    Run the rest of the game randomly (room for improvement here)
  }
  Count how often you won
}
Pick the move that won the most.
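That loop can be sketched in Python; `list_valid_moves`, `deal_plausible_hands`, and `random_playout` are hypothetical game-specific callbacks standing in for the actual gameplay code, not anything from the real codebase:

```python
def choose_move(my_hand, seen_cards, list_valid_moves, deal_plausible_hands,
                random_playout, n_playouts=10_000):
    """Flat Monte Carlo: sample every candidate move the same number of times."""
    wins = {}
    for move in list_valid_moves(my_hand):
        wins[move] = 0
        for _ in range(n_playouts):
            # Guess a plausible configuration of the hidden cards.
            opponents = deal_plausible_hands(my_hand, seen_cards)
            # Play the move, then finish the game with random play.
            if random_playout(move, my_hand, opponents):  # True if we won
                wins[move] += 1
    # Pick the move that won the most playouts.
    return max(wins, key=wins.get)
```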
What do you think?

Edited by mudslinger, 10 October 2012 - 08:07 AM.


#2 clb   Members   -  Reputation: 1780

Posted 10 October 2012 - 08:27 AM

Monte Carlo sounds like an excellent approach here, but unlike in your pseudocode, you need to be very smart about where you direct your playouts. A fixed "Do 10,000 times" loop in the code is very inefficient and wastes a lot of the playouts.

An important aspect of Monte Carlo is to construct/expand a game tree while the playouts are being played. You should never settle for just playing a fixed number of playouts at the root of the tree (or you can, but it won't be as effective). This machinery is what causes the result to converge to the best play as you increase the number of playouts.

You'll also want to estimate the uncertainty of a given branch. For a move M, if you play 1,000 random samples and always lose, versus 1,000 random samples that win 50% of the time, you know statistically that you should invest more in the latter case and probably never play the first move. This should guide the playouts to balance the search towards uncertain branches. UCT tree search is one of the recent key advances in the field. This is also an interesting read that gives some more references.
Me+PC=clb.demon.fi | C++ Math and Geometry library: MathGeoLib, test it live! | C++ Game Networking: kNet | 2D Bin Packing: RectangleBinPack | Use gcc/clang/emcc from VS: vs-tool | Resume+Portfolio | gfxapi, test it live!

#3 mudslinger   Members   -  Reputation: 139

Posted 10 October 2012 - 09:52 AM

I'm not that familiar with Monte Carlo, although I fully understand the pseudocode.

What do you mean by "construct/expand a game tree while the playouts are being played"? I take this to mean I should build a game tree and then randomly play out the leaves.

I'll be reading about UCT and Monte Carlo in general, I hope it will be fruitful.

Edited by mudslinger, 10 October 2012 - 09:55 AM.


#4 clb   Members   -  Reputation: 1780

Posted 11 October 2012 - 02:37 AM

Yeah, even with Monte Carlo, it's customary to maintain a game tree. Instead of exploring the tree fully as in traditional minimax, the tree is expanded only selectively, in the areas the Monte Carlo simulation directs it to. When the tree search hits a leaf node (the moves after that node are not tracked by the tree), the game is continued with random play, i.e. a random playout to the end of the game is performed.

#5 Álvaro   Crossbones+   -  Reputation: 12920

Posted 11 October 2012 - 03:09 AM

As clb points out, there is a large improvement available over sampling every move the same number of times. There are good moves and bad moves, and after sampling them a few thousand times you should have a pretty good idea of which are which; you can exploit this by sampling promising moves more often. This idea is formalized in the multi-armed bandit problem. There is an algorithm called UCB1 that is a good starting point.
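For reference, UCB1 assigns each move the value w_i/n_i + sqrt(2 ln N / n_i), where w_i and n_i are that move's wins and sample count and N is the total number of samples taken; you always sample the move with the highest value. A minimal sketch in Python (the names are illustrative, not from any particular library):

```python
import math

def ucb1_index(wins, plays, total_plays, c=math.sqrt(2)):
    """UCB1 value for one arm: average reward plus an exploration bonus.

    An arm that has never been played gets infinity, so every arm is
    tried at least once before the formula takes over."""
    if plays == 0:
        return float("inf")
    return wins / plays + c * math.sqrt(math.log(total_plays) / plays)

def select_arm(stats):
    """stats: list of (wins, plays) pairs; returns the index of the arm to sample next."""
    total = sum(plays for _, plays in stats)
    return max(range(len(stats)),
               key=lambda i: ucb1_index(stats[i][0], stats[i][1], total))
```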

The idea of building a tree is important for full-information games, but I don't think it will get you very far in a card game of this type, since a node depends not only on what the previous moves have been, but also on the plausible scenario generated.

#6 mudslinger   Members   -  Reputation: 139

Posted 11 October 2012 - 12:20 PM

This looks like a very good resource:
http://www.personeel...ressiveMCTS.pdf

As Álvaro pointed out, I believe building a deep tree will not be helpful, as the uncertainties will compound. Maybe trees aren't helpful at all here.

I'm doing reading on UCB1.

Edited by mudslinger, 11 October 2012 - 12:30 PM.


#7 Álvaro   Crossbones+   -  Reputation: 12920

Posted 11 October 2012 - 02:11 PM

UCB1 is basically formula (1.1) in that paper.

#8 mudslinger   Members   -  Reputation: 139

Posted 11 October 2012 - 02:37 PM

Quote (Álvaro): UCB1 is basically formula (1.1) in that paper.


Yes, I just realized how simple UCB1 is. I'm reading another paper/presentation: http://www.cs.bham.a...ctures/ucb1.pdf. I just made a simple C implementation of UCB1 using rand(). It does sample the more rewarding arms more often.

This is making more sense to me. I'll start coding again tomorrow.

EDIT: I just realized that the tree in the progressiveMCTS paper is not the game tree I thought it was. I thought it was something similar to a minimax tree, in which each layer is the turn of a different player. It is not: all of it is the turn of the current player. Is this thinking correct? I'm confused as to how this would work, though: all of the results are backpropagated to the root, and the root is always visited, so the root would always be chosen as the move.

EDIT2: It does say "the child that was visited the most". Again, more feelings of realization. Based on the simple expansion rule, since a leaf is expanded upon simulation, does this mean each leaf is simulated only once?

Then again, trees might not be worth it; it's probably better to just list every possible move and run UCB1 on them.

Edited by mudslinger, 11 October 2012 - 04:15 PM.


#9 Álvaro   Crossbones+   -  Reputation: 12920

Posted 11 October 2012 - 04:29 PM

Where do you use rand() in UCB1? The original algorithm is deterministic (which is kind of strange for a Monte Carlo technique, but the randomness comes from the results of the playouts).

Start by implementing UCB1 at the root. It's basically the same as the pseudo-code at the beginning of the thread, but instead of sampling each move the same number of times, you always sample the move that has the highest UCB1 value:
While there is time {
  Select the move M that maximizes the UCB1 formula
  Generate a plausible configuration for the cards of the opponents
  Play M
  Run the rest of the game randomly (room for improvement here)
  Record whether you won
}
Pick the move that was picked most often
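That loop can be sketched in Python, with hypothetical helpers `deal_plausible_hands` and `random_playout` standing in for the game code, and a fixed playout budget in place of a time limit:

```python
import math

def choose_move_ucb1(moves, deal_plausible_hands, random_playout, n_playouts=10_000):
    """UCB1 at the root: sample the most promising move instead of all moves equally."""
    wins = [0] * len(moves)
    plays = [0] * len(moves)
    for t in range(1, n_playouts + 1):
        def ucb1(i):
            # Unplayed moves get infinity, so each is tried at least once.
            if plays[i] == 0:
                return float("inf")
            return wins[i] / plays[i] + math.sqrt(2.0 * math.log(t) / plays[i])
        m = max(range(len(moves)), key=ucb1)
        opponents = deal_plausible_hands()        # plausible hidden-card configuration
        if random_playout(moves[m], opponents):   # True if our side won the playout
            wins[m] += 1
        plays[m] += 1
    # Pick the move that was sampled the most, not the one with the best average.
    return moves[max(range(len(moves)), key=lambda i: plays[i])]
```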

Don't worry about the tree part for now, but I'll be happy to explain it to you if after reading and thinking about it for a while it still doesn't make sense.

By the way, is this for some sort of simplified Tichu variant? I love Tichu!

#10 mudslinger   Members   -  Reputation: 139

Posted 11 October 2012 - 06:31 PM

I think I understand the tree part now. The expansion is not directly tied to the number of simulations; for example, I can run 1,000 simulations and then expand a branch, run another 1,000 simulations, then expand another branch. I don't think I can apply this to my game, though, since there are at most 13 moves at the start and the number drops quickly; I'd run out of moves for the tree. Your pseudocode is what I had in mind.

I'm not familiar with Tichu.

Here's my UCB1 code; it's a bit rough, I apologize.
http://pastebin.com/rddWUSzF

Edited by mudslinger, 13 October 2012 - 08:04 AM.


#11 Álvaro   Crossbones+   -  Reputation: 12920

Posted 11 October 2012 - 07:19 PM

The tree part of Monte Carlo Tree Search (or more precisely the UCT algorithm) is basically using the UCB1 mechanism not only at the root, but also at other nodes that you end up visiting many times.

The problem with this in the case of card games is that you will be dealing the cards you haven't seen randomly at each simulation, so there won't be many instances of visiting the same situation twice, because from the point of view of the next player, the situation is different after each plausible configuration is determined.
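The "plausible configuration" step Álvaro describes is often called determinization: deal the cards you haven't seen uniformly at random into the opponents' hands, respecting the known hand sizes. A sketch (the representation of cards and hands is illustrative):

```python
import random

def deal_plausible_hands(unseen_cards, hand_sizes, rng=random):
    """Randomly split the cards we haven't seen into opponent hands of known sizes."""
    assert sum(hand_sizes) == len(unseen_cards), "sizes must account for every unseen card"
    cards = list(unseen_cards)
    rng.shuffle(cards)
    hands, start = [], 0
    for size in hand_sizes:
        hands.append(cards[start:start + size])
        start += size
    return hands
```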

#12 slicer4ever   Crossbones+   -  Reputation: 3480

Posted 12 October 2012 - 11:54 AM

Hello,

We have this game with 4 players where each players take turns to discard cards that must be higher than the ones previously discarded. The card combinations are based on poker. Discarded cards are visible to everyone. The last player with remaining cards loses. I have code on the gameplay, and the AI has code that can list valid combinations from the cards that it has.


I'd like to toss out an idea, but before I can (it might be moot depending on the answers), I'd like a bit of clarification on the rules, if I understand them correctly:

You can discard any number of cards, as long as they form a valid poker combination, and it must be higher than the last combination discarded? Exactly how is that value calculated? Also, how many cards are players dealt at the start?

Edited by slicer4ever, 12 October 2012 - 11:56 AM.


#13 mudslinger   Members   -  Reputation: 139

Posted 03 November 2012 - 11:44 AM

I have more or less finished the game; it can be played here: http://pinoyai.com/pdai.php (Java is needed).
I also put the rules here: http://pinoyai.com/pdrules.php

Everything on the website is still beta; the game doesn't even have drag-and-drop. I was too lazy to write it.

Edited by mudslinger, 04 November 2012 - 01:58 PM.




