Jump to content



Monte Carlo Tree Search

  • You cannot reply to this topic
5 replies to this topic

#1 willh   Members   -  Reputation: 118

Like
0Likes
Like

Posted 01 February 2012 - 11:33 AM

This post is related to a comment Alvaro made in a thread about AI in games of perfect information.

I'm interested in implementing Monte Carlo Tree search for a game that involves random outcomes. For some moves the outcome is fixed. For others there could be 6 or more different outcomes depending on some unknown value. The probabilities of each outcome occuring can change as the game progresses-- in this way it is similar to Poker.

I understand MCTS in terms of growing the tree for moves with a fixed outcome, but I'm confused about nodes that have a random outcome. Do I try to expand these nodes, or do I simply use the node to as a starting point for a number of simulated games?

If they are expanded, how? Is each possible outcome explored as a separate node? Are the probability distributions stored, and the node essentially 'randomized' each time it is visited (which would totally mess up the sanity of child nodes?) I can see the value in expanding them (to a degree) but it seems extermely risky given the nature of UCB-- it may attach priority to a very unlikely outcome.

Any help would be very much appreciated. Thanks!

Will

Ad:

#2 alvaro   Members   -  Reputation: 1689

Like
2Likes
Like

Posted 01 February 2012 - 01:11 PM

You do expand those nodes as if the result of the die were a move by some player, but you have to make sure that instead of using UCB to select a "move" in future searches, you simply pick a random move at this type of node.

Does that clear up all your questions?

#3 IADaveMark   Moderators   -  Reputation: 1247

Like
0Likes
Like

Posted 02 February 2012 - 08:34 AM

Put another way, those branches of the tree represent "what could possibly happen based on the relative likelihood that they will happen."
Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC

Professional consultant on game AI, mathematical modeling, simulation modeling
Co-advisor of the GDC AI Summit
Co-founder of the AI Game Programmers Guild
Author of the book, Behavioral Mathematics for Game AI


Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play

"Reducing the world to mathematical equations!"

#4 willh   Members   -  Reputation: 118

Like
0Likes
Like

Posted 02 February 2012 - 09:34 AM

View Postalvaro, on 01 February 2012 - 01:11 PM, said:

You do expand those nodes as if the result of the die were a move by some player, but you have to make sure that instead of using UCB to select a "move" in future searches, you simply pick a random move at this type of node. Does that clear up all your questions?

I think so.. So to clarify, I expand until I hit a 'chance' node. When I hit a chance node, I roll the dice, and expand as if that were the move. The next time I run in to that node, I roll the dice again.

#5 alvaro   Members   -  Reputation: 1689

Like
1Likes
Like

Posted 02 February 2012 - 09:53 AM

The way I have done it before, the board structure knows whose turn it is. I consider die rolls to be moves by a player called "Fortuna" (my wife is a classicist). The UCT tree expands a position where Fortuna is to move just as it does any other position. When I store the statistics for each move (wins and visits), the wins are stored from the point of view of the player to move. Since Fortuna doesn't care who wins, I always count the result of the playout as 0.5 from her point of view. The UCB rule will then always select a move among the ones that have been explored the least. I add a bit of randomness to the UCB score of each move so we don't always explore "1" first or have some other weird bias like that.

EDIT: Of course, in the non-UCT part of the playout, Fortuna picks random moves.

#6 willh   Members   -  Reputation: 118

Like
0Likes
Like

Posted 02 February 2012 - 11:03 PM

Thanks Alvaro. Very helpful.






We are working on generating results for this topic
PARTNERS