Jump to content

  • Log In with Google      Sign In   
  • Create Account

We're offering banner ads on our site from just $5!

1. Details HERE. 2. GDNet+ Subscriptions HERE. 3. Ad upload HERE.


Monte Carlo Tree Search


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

  • You cannot reply to this topic
5 replies to this topic

#1 willh   Members   -  Reputation: 160

Like
0Likes
Like

Posted 01 February 2012 - 11:33 AM

This post is related to a comment Alvaro made in a thread about AI in games of perfect information.

I'm interested in implementing Monte Carlo Tree search for a game that involves random outcomes. For some moves the outcome is fixed. For others there could be 6 or more different outcomes depending on some unknown value. The probabilities of each outcome occuring can change as the game progresses-- in this way it is similar to Poker.

I understand MCTS in terms of growing the tree for moves with a fixed outcome, but I'm confused about nodes that have a random outcome. Do I try to expand these nodes, or do I simply use the node to as a starting point for a number of simulated games?

If they are expanded, how? Is each possible outcome explored as a separate node? Are the probability distributions stored, and the node essentially 'randomized' each time it is visited (which would totally mess up the sanity of child nodes?) I can see the value in expanding them (to a degree) but it seems extermely risky given the nature of UCB-- it may attach priority to a very unlikely outcome.

Any help would be very much appreciated. Thanks!

Will

Sponsor:

#2 Álvaro   Crossbones+   -  Reputation: 13897

Like
2Likes
Like

Posted 01 February 2012 - 01:11 PM

You do expand those nodes as if the result of the die were a move by some player, but you have to make sure that instead of using UCB to select a "move" in future searches, you simply pick a random move at this type of node.

Does that clear up all your questions?

#3 IADaveMark   Moderators   -  Reputation: 2531

Like
0Likes
Like

Posted 02 February 2012 - 08:34 AM

Put another way, those branches of the tree represent "what could possibly happen based on the relative likelihood that they will happen."
Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC

Professional consultant on game AI, mathematical modeling, simulation modeling
Co-advisor of the GDC AI Summit
Co-founder of the AI Game Programmers Guild
Author of the book, Behavioral Mathematics for Game AI

Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play

"Reducing the world to mathematical equations!"

#4 willh   Members   -  Reputation: 160

Like
0Likes
Like

Posted 02 February 2012 - 09:34 AM

You do expand those nodes as if the result of the die were a move by some player, but you have to make sure that instead of using UCB to select a "move" in future searches, you simply pick a random move at this type of node. Does that clear up all your questions?


I think so.. So to clarify, I expand until I hit a 'chance' node. When I hit a chance node, I roll the dice, and expand as if that were the move. The next time I run in to that node, I roll the dice again.

#5 Álvaro   Crossbones+   -  Reputation: 13897

Like
1Likes
Like

Posted 02 February 2012 - 09:53 AM

The way I have done it before, the board structure knows whose turn it is. I consider die rolls to be moves by a player called "Fortuna" (my wife is a classicist). The UCT tree expands a position where Fortuna is to move just as it does any other position. When I store the statistics for each move (wins and visits), the wins are stored from the point of view of the player to move. Since Fortuna doesn't care who wins, I always count the result of the playout as 0.5 from her point of view. The UCB rule will then always select a move among the ones that have been explored the least. I add a bit of randomness to the UCB score of each move so we don't always explore "1" first or have some other weird bias like that.

EDIT: Of course, in the non-UCT part of the playout, Fortuna picks random moves.

#6 willh   Members   -  Reputation: 160

Like
0Likes
Like

Posted 02 February 2012 - 11:03 PM

Thanks Alvaro. Very helpful.




Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.



PARTNERS