Jump to content
  • Advertisement
Sign in to follow this  
willh

Monte Carlo Tree Search

This topic is 2508 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

This post is related to a comment Alvaro made in a thread about AI in games of perfect information.

I'm interested in implementing Monte Carlo Tree search for a game that involves random outcomes. For some moves the outcome is fixed. For others there could be 6 or more different outcomes depending on some unknown value. The probabilities of each outcome occuring can change as the game progresses-- in this way it is similar to Poker.

I understand MCTS in terms of growing the tree for moves with a fixed outcome, but I'm confused about nodes that have a random outcome. Do I try to expand these nodes, or do I simply use the node to as a starting point for a number of simulated games?

If they are expanded, how? Is each possible outcome explored as a separate node? Are the probability distributions stored, and the node essentially 'randomized' each time it is visited (which would totally mess up the sanity of child nodes?) I can see the value in expanding them (to a degree) but it seems extermely risky given the nature of UCB-- it may attach priority to a very unlikely outcome.

Any help would be very much appreciated. Thanks!

Will

Share this post


Link to post
Share on other sites
Advertisement
You do expand those nodes as if the result of the die were a move by some player, but you have to make sure that instead of using UCB to select a "move" in future searches, you simply pick a random move at this type of node.

Does that clear up all your questions?

Share this post


Link to post
Share on other sites
Put another way, those branches of the tree represent "what could possibly happen based on the relative likelihood that they will happen."

Share this post


Link to post
Share on other sites
You do expand those nodes as if the result of the die were a move by some player, but you have to make sure that instead of using UCB to select a "move" in future searches, you simply pick a random move at this type of node. Does that clear up all your questions?


I think so.. So to clarify, I expand until I hit a 'chance' node. When I hit a chance node, I roll the dice, and expand as if that were the move. The next time I run in to that node, I roll the dice again.

Share this post


Link to post
Share on other sites
The way I have done it before, the board structure knows whose turn it is. I consider die rolls to be moves by a player called "Fortuna" (my wife is a classicist). The UCT tree expands a position where Fortuna is to move just as it does any other position. When I store the statistics for each move (wins and visits), the wins are stored from the point of view of the player to move. Since Fortuna doesn't care who wins, I always count the result of the playout as 0.5 from her point of view. The UCB rule will then always select a move among the ones that have been explored the least. I add a bit of randomness to the UCB score of each move so we don't always explore "1" first or have some other weird bias like that.

EDIT: Of course, in the non-UCT part of the playout, Fortuna picks random moves.

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

GameDev.net is your game development community. Create an account for your GameDev Portfolio and participate in the largest developer community in the games industry.

Sign me up!