MCTS with a Draw reward

Artificial Intelligence Programming

Started by mudslinger September 11, 2018 08:42 PM

1 comment, last by alvaro 5 years, 7 months ago

mudslinger

143

Author

September 11, 2018 08:42 PM

If I use MCTS with rewards of -1, 0, and 1 for a loss, draw, and win respectively, can I use the UCT formula as is?

uct = node.rewards/(node.visits+1.0) + explorationRate * sqrt(ln(node.parent.visits) / (node.visits+1.0))

Afterwards, do I still return the most-visited node as the best move?

alvaro

21,604

September 12, 2018 03:50 PM

That seems reasonable. You just need an estimate of the expected value of the reward distribution, and node.rewards/(node.visits+1) is a sensible one.
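As a rough illustration, here is how the formula from the question might look in Python. The `Node` class and the `exploration_rate` default of 1.4 are made up for this sketch and do not come from the thread. The `max(..., 1)` guard is also an addition, so that `log` is never called on a parent with zero visits.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node layout, using the field names from the question.
    rewards: float = 0.0   # sum of -1 / 0 / +1 results seen through this node
    visits: int = 0        # number of playouts that passed through this node
    parent: Optional["Node"] = None


def ucb1(node: Node, exploration_rate: float = 1.4) -> float:
    """Score a child node with the formula quoted in the question."""
    n = node.visits + 1.0
    # Estimate of the expected reward; a draw adds 0 to the sum but still
    # counts as a visit, so it pulls the mean toward 0.
    mean = node.rewards / n
    # Exploration bonus; the guard avoids log(0) for an unvisited parent.
    bonus = exploration_rate * math.sqrt(math.log(max(node.parent.visits, 1)) / n)
    return mean + bonus


root = Node(visits=10)
child = Node(rewards=3.0, visits=7, parent=root)  # e.g. 4 wins, 1 loss, 2 draws
score = ucb1(child)
```

Because of the +1 in the denominator, an unvisited child gets a finite score instead of a division by zero, so no special case is needed for it in this sketch.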

A minor matter of naming: I normally call that the "UCB1 formula", not the "UCT formula". UCT is the algorithm resulting from using the UCB1 formula at every node of an expanding tree.
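To make the distinction concrete, here is a minimal sketch of the UCT-style loop pieces: UCB1 applied at every node on the way down, results backed up, and the most-visited child picked at the end. All names here are hypothetical. The sketch also assumes a two-player game where each node stores rewards from the point of view of the player who moved into it, so the sign is flipped at each level on the way up. That convention is an assumption of this example, not something stated in the thread.

```python
import math


class Node:
    # Hypothetical tree node; field names follow the question.
    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.rewards = 0.0
        self.visits = 0
        if parent is not None:
            parent.children.append(self)


def ucb1(node, exploration_rate=1.4):
    n = node.visits + 1.0
    # max(..., 1) guards against log(0); this guard is an addition.
    return node.rewards / n + exploration_rate * math.sqrt(
        math.log(max(node.parent.visits, 1)) / n)


def select_leaf(root, exploration_rate=1.4):
    """Descend the tree, applying UCB1 at every node (this is the UCT part)."""
    node = root
    while node.children:
        node = max(node.children, key=lambda c: ucb1(c, exploration_rate))
    return node


def backpropagate(node, reward):
    """Back up a -1 / 0 / +1 result, flipping the sign at each level.

    Assumes `reward` is from the point of view of the player who moved
    into `node`. A draw (0) is unchanged by the sign flip.
    """
    while node is not None:
        node.visits += 1
        node.rewards += reward
        reward = -reward
        node = node.parent


def most_visited_child(root):
    """Final move choice asked about in the question: the most-visited child."""
    return max(root.children, key=lambda c: c.visits)


root = Node()
a, b = Node(root), Node(root)
backpropagate(a, 1)   # a win for the player who moved into a
backpropagate(b, 0)   # a draw
backpropagate(a, 1)
best = most_visited_child(root)
```

Expansion and playouts are left out on purpose; only the selection, backup, and final-choice steps that the thread discusses are shown.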
