

mudslinger

Posted 11 October 2012 - 04:15 PM

UCB1 is basically formula (1.1) in that paper.


Yes, I just realized how simple UCB1 is. I'm reading another paper/presentation: http://www.cs.bham.a...ctures/ucb1.pdf. I just made a simple C implementation of UCB1 with the rand() function, and it does pull the higher-paying arms more often.
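For concreteness, here is a minimal sketch of that kind of UCB1 loop - not the exact code I wrote. The four arms and their payout probabilities are made up for illustration, rewards are simulated with rand(), and each round the loop pulls the arm maximizing mean_j + sqrt(2*ln(n)/n_j):

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>

#define NUM_ARMS   4
#define NUM_PULLS  10000

/* Made-up payout probabilities for each arm. */
static const double payout[NUM_ARMS] = { 0.2, 0.4, 0.6, 0.8 };

/* Simulated Bernoulli reward using rand(). */
static int pull(int arm)
{
    return ((double)rand() / RAND_MAX) < payout[arm] ? 1 : 0;
}

int main(void)
{
    double total[NUM_ARMS] = { 0 };  /* summed rewards per arm    */
    int    count[NUM_ARMS] = { 0 };  /* times each arm was pulled */

    srand((unsigned)time(NULL));

    /* Play each arm once so the UCB1 term is defined for all of them. */
    for (int a = 0; a < NUM_ARMS; ++a) {
        total[a] += pull(a);
        count[a] += 1;
    }

    for (int t = NUM_ARMS; t < NUM_PULLS; ++t) {
        /* Pick the arm maximizing  mean + sqrt(2 ln t / n_a). */
        int    best  = 0;
        double bestv = -1.0;
        for (int a = 0; a < NUM_ARMS; ++a) {
            double ucb = total[a] / count[a]
                       + sqrt(2.0 * log((double)t) / count[a]);
            if (ucb > bestv) { bestv = ucb; best = a; }
        }
        total[best] += pull(best);
        count[best] += 1;
    }

    for (int a = 0; a < NUM_ARMS; ++a)
        printf("arm %d: pulled %d times, mean %.3f\n",
               a, count[a], total[a] / count[a]);
    return 0;
}

With these numbers the highest-payout arm ends up pulled far more often than the others, which is the behaviour I meant above.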

This is making more sense to me. I'll start coding again tomorrow.

EDIT: I just realized that the tree in the progressiveMCTS paper is not the game tree I thought it was. I assumed it was something like a min-max tree, in which each layer is a different player's turn. It is not - every layer is the current player's turn. Is this thinking correct? I'm confused about how this would work, though: all of the results are backpropagated to the root, and the root is always visited, so the root would always be chosen as the move.

EDIT2: It does say the child that was visited the most :P Again, more feelings of realization. Based on the simple rule that a leaf is expanded once it has been simulated, does this mean each leaf is simulated only once?
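To pin down what the two EDITs are getting at, here is a rough, self-contained C sketch of the MCTS loop as I currently understand it. The Node struct, the fixed branching factor, and the coin-flip simulate() are placeholders of my own, not anything from the paper, and it ignores whose turn it is at each node (a real two-player version would have to account for that when backpropagating results). It does show the two points, though: results are backpropagated through the root, but the move actually played is the root's most-visited child, never the root itself, and with this expansion rule a leaf is simulated exactly once before it gets children.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>

#define BRANCH      3      /* made up: pretend every position has 3 moves */
#define ITERATIONS  10000

typedef struct Node {
    struct Node *parent;
    struct Node *child[BRANCH];
    int          visits;
    double       wins;
    int          expanded;
} Node;

static Node *new_node(Node *parent)
{
    Node *n = calloc(1, sizeof *n);   /* zeroed: no visits, no children yet */
    n->parent = parent;
    return n;                         /* error checks and freeing omitted   */
}

/* UCB1 value of a child, from its parent's point of view. */
static double ucb1(const Node *parent, const Node *c)
{
    if (c->visits == 0) return 1e9;   /* force unvisited children first */
    return c->wins / c->visits
         + sqrt(2.0 * log((double)parent->visits) / c->visits);
}

/* Placeholder play-out: a coin flip stands in for a real random game. */
static double simulate(void) { return rand() % 2; }

static void mcts_iteration(Node *root)
{
    /* 1. Selection: walk down, always taking the child with the best UCB1. */
    Node *n = root;
    while (n->expanded) {
        Node *best = n->child[0];
        for (int i = 1; i < BRANCH; ++i)
            if (ucb1(n, n->child[i]) > ucb1(n, best))
                best = n->child[i];
        n = best;
    }

    /* 2. Expansion: a leaf only gets children after it has been visited
          once, so each leaf is simulated exactly once before it grows.  */
    if (n->visits > 0) {
        for (int i = 0; i < BRANCH; ++i)
            n->child[i] = new_node(n);
        n->expanded = 1;
        n = n->child[0];              /* simulate one of the new children */
    }

    /* 3. Simulation. */
    double result = simulate();

    /* 4. Backpropagation: update every node on the path back to the root.
          The root's statistics are updated too, but the root is never a
          candidate move - the move played is one of its children.        */
    for (; n != NULL; n = n->parent) {
        n->visits += 1;
        n->wins   += result;
    }
}

int main(void)
{
    srand((unsigned)time(NULL));
    Node *root = new_node(NULL);

    for (int i = 0; i < ITERATIONS; ++i)
        mcts_iteration(root);

    /* Final move choice: the most-visited child of the root. */
    int best = 0;
    for (int i = 1; i < BRANCH; ++i)
        if (root->child[i]->visits > root->child[best]->visits)
            best = i;
    printf("play move %d (visited %d times)\n",
           best, root->child[best]->visits);
    return 0;
}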

Then again, trees might not be worth it; it might be simpler to just list every possible move and run UCB1 on them.
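That flat version is just the bandit loop from the first sketch with one arm per legal move and a random play-out as the reward. Something like the fragment below, where apply_move(), undo_move(), and random_playout() are made-up stand-ins for whatever the game code provides - swap it in for pull() in the earlier sketch and set NUM_ARMS to the number of legal moves:

/* Flat UCB1 over moves: each legal move is an arm; one "pull" applies the
   move, runs a single random play-out to the end of the game, and undoes
   the move. The three helpers are stubs standing in for real game code. */
static void apply_move(int move)   { (void)move; }        /* stub */
static void undo_move(int move)    { (void)move; }        /* stub */
static int  random_playout(void)   { return rand() % 2; } /* stub: 1 = win */

static int pull(int move)
{
    apply_move(move);
    int won = random_playout();   /* reward: did the current player win? */
    undo_move(move);
    return won;
}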
