MTD(f) : how can I retrieve the best first move ?

Started by
8 comments, last by alvaro 18 years, 5 months ago
Hi, First of all, I tried posting on this thread (http://www.gamedev.net/community/forums/topic.asp?topic_id=270248) but couldn't for some reason... I've implemented MTD and it retrieves the correct value but I am wondering how to retrieve the actual path. From what I've read, the whole PV cannot be retrieved but all I really need is to know what the first best move is. Can anyone help me ? I've posted my complete code as well as a few graphs representing my search tree in the following forum : http://forums.topcoder.com/?module=Thread&threadID=507626&start=0 Thanks, Stephane.
Advertisement
Couldn't you follow the tree back up to the origin by creating a parent pointer in each node? Simply check if the parent was the origin and if so then take this node.

Hope this helps,
Cam
The function AlphaBetaWithMemory stores the results of the search in the transposition table. Have you tried recovering the move from there? I am not entirely sure this would work, but that's the first thing I would try.

hi,

thanks for those replies. since the last time i posted i have found a very useful resource which explains two ways which can be used to retrieve the pv (or just the best move) ... and those two ways are the ones alvaro and talyssan are describing !

bruce morland's programming topics

however, i am encountering another problem (should i use a different forum topic ?) : i now wish to retrieve not only the best move but, say, the 5 best moves for a given situation. i am trying to find a way that can enable several moves with almost the same outcome (in terms of value) to be retrieved so that the computer can choose the best move with a little randomness (so as to not repeat the same game over and over again).

i think it is best to use a new forum topic so please post your answers here instead.

thanks for your help,

stephane.
Stephane,
the methods to recover Principal Variation work only if you are NOT using Transposition Tables. For most games, getting good performance requires TTs.
I would be really glad to know if you find a method to get PV with TTs (I were not able to get them when working with alpha-beta) ... and of course glad to test your game when it's finished :)

Enric - Deep
_______________________The best conversation I had was over forty million years ago ... and that was with a coffee machine.
Enric,

First of all, thanks a lot for those files you sent me about TTs : I'm not done going through them but I noticed something which I'll need to look into soon, i.e. minimax for more than 2 players.

I'm not sure this is what you need since I haven't implemented it, but for what it's worth check out this link about collecting the PV.

Cheers,

Stephane.
Quote:Original post by DeepButi
Stephane,
the methods to recover Principal Variation work only if you are NOT using Transposition Tables. For most games, getting good performance requires TTs.
I would be really glad to know if you find a method to get PV with TTs (I were not able to get them when working with alpha-beta) ...


Well, I actually use transposition tables to recover the PV. The code that does that looks like this (literally):
void print_pv(Board &b, Move m, int max_length=20){  if(max_length<=0)    return;  b.do_move(m);  HashEntry hash_entry;  bool hash_hit=query_hash(b.hash_key(),hash_entry);  if(hash_hit){    std::cout << ' ' << hash_entry.move;    print_pv(b,hash_entry.move,max_length-1);  }  b.undo_move();}
Quote:Original post by sgalzin
however, i am encountering another problem (should i use a different forum topic ?) : i now wish to retrieve not only the best move but, say, the 5 best moves for a given situation. i am trying to find a way that can enable several moves with almost the same outcome (in terms of value) to be retrieved so that the computer can choose the best move with a little randomness (so as to not repeat the same game over and over again).


The only mini-max search that will reveal the best N moves is infact a real minimax() search.

AlphaBeta() for example only needs to prove that SOME variation exists which is inferior to the principle variation in order to stop searching a particular alternative. The reason for the cutoff might not be the best or worst reason for a cutoff, it is only known that it is a good enough reason.

Hence the true value of a give inferior alternative is not known.

Any attempt to show the true values of the alternatives will consume (quite a bit of) search time that could have been used to get a better value for the principle variation. This is unacceptable if your engine is intended to play as strong as possible but is perfectly acceptable if your engine is used for analysis/study. In that case I would simply do N searches where N is the number of moves off the root.

In the case of narrow or zero window alphabeta, you can use information from previous searches to enhance the speed of the current search by giving a good guess for the window.

- Rockoon (Joseph Koss)

Quote:Original post by alvaro
Quote:Original post by DeepButi
Stephane,
the methods to recover Principal Variation work only if you are NOT using Transposition Tables. For most games, getting good performance requires TTs.
I would be really glad to know if you find a method to get PV with TTs (I were not able to get them when working with alpha-beta) ...


Well, I actually use transposition tables to recover the PV. The code that does that looks like this (literally):
*** Source Snippet Removed ***

mmm ... trying to understand your code: you get the best move for each position from the TT, do it and search again. Right?

There is no security about the board position being on TT except if your TT is big enough to hold all your nodes. In my case, my worst test case examines 500 milion nodes and TT is not big enough to hold all of them. And in real games I've found far worst situations.

Thanks anyway.

_______________________The best conversation I had was over forty million years ago ... and that was with a coffee machine.
Quote:Original post by DeepButi
mmm ... trying to understand your code: you get the best move for each position from the TT, do it and search again. Right?

Basically. No big mistery.

Quote:Original post by DeepButi
There is no security about the board position being on TT except if your TT is big enough to hold all your nodes. In my case, my worst test case examines 500 milion nodes and TT is not big enough to hold all of them. And in real games I've found far worst situations.

Of course not all nodes will fit in the TT. That's why you carefully design a replacement mechanism that gives priority to nodes that have been searched to large depths, so at least the first few moves will be found in the TT.

This is sample output from my program:
info depth 1 seldepth 1 time 38 score cp -238 nodes 3 nps 78 pv a3a4info depth 1 seldepth 1 time 38 score cp -237 lowerbound nodes 6 nps 155 pv b2b3info depth 1 seldepth 1 time 38 score cp -233 nodes 9 nps 231 pv b2b3info depth 2 seldepth 2 time 39 score cp -256 nodes 263 nps 6716 pv b2b3 e6e5info depth 2 seldepth 2 time 39 score cp -255 lowerbound nodes 337 nps 8554 pv f2f4 c7f4info depth 2 seldepth 2 time 39 score cp -241 nodes 713 nps 17876 pv f2f4 a7a5info depth 2 seldepth 3 time 42 score cp -240 lowerbound nodes 2007 nps 47398 pv d1a4 c7h2info depth 2 seldepth 3 time 43 score cp -238 nodes 2697 nps 62416 pv d1a4 c7f4info depth 3 seldepth 3 time 44 score cp -238 nodes 3355 nps 76004 pv d1a4 c7f4 b4e7info depth 4 seldepth 5 time 47 score cp -238 nodes 5967 nps 125210 pv d1a4 c7f4 b4e7 d8e7info depth 5 seldepth 5 time 53 score cp -238 nodes 10479 nps 197003 pv d1a4 c7f4 b4e7 d8e7 a4d4info depth 6 seldepth 6 time 121 score cp -238 nodes 59247 nps 486361 pv d1a4 c7f4 b4e7 d8e7 a4d4 f4c1info depth 7 seldepth 7 time 334 score cp -245 nodes 226233 nps 676107 pv d1a4 c7f4 c1e1 b8b7 d3a6 d8d7 a6b5info depth 8 seldepth 9 time 1963 score cp -244 nodes 1460255 nps 743778 pv d1a4 c7f4 c1c2 b8b7 g2g3 f4e5 f2f4 c8d7info depth 8 seldepth 10 time 3775 score cp -243 lowerbound nodes 2826052 nps 748554 pv c1c7 d8c7 b4d6 c7b7 d6b8 b7b8 h1g1info depth 8 seldepth 10 time 3830 score cp -244 nodes 2857500 nps 745998 pv d1a4 c7f4 c1c2 b8b7 g2g3 f4e5 f2f4 c8d7info depth 9 seldepth 10 time 5381 score cp -242 nodes 3986534 nps 740749 pv d1a4 c7f4 c1c2 b8b7 g2g3 f4e5 f2f4 c8d7 a4a6info depth 9 seldepth 11 time 7867 score cp -241 lowerbound nodes 5895530 nps 749338 pv e4f6 g7f6 b4e7 d8e7 d1g4 g8h8 g4h4 h7h6 h4h6 c7h2info depth 9 seldepth 13 time 10551 score cp 18 nodes 7939928 nps 752472 pv e4f6 g8h8 f6h7 h8g8 h7f8 g8f8 d1h5 c7d6 h5h8 e7g8 c1c4info depth 10 seldepth 13 time 14793 score cp 147 nodes 11034609 nps 745903 pv e4f6 g8h8 f6h7 c7f4 h7f8 f4c1 d1c1 e6e5 c1g5 f7f6 f8g6info depth 11 seldepth 15 time 22626 score cp 109 nodes 16639098 nps 735376 pv e4f6 g8h8 f6h7 c7f4 h7f8 f4c1 d1c1 e7d5 f8h7 d5b4 a3b4 e6e5info depth 12 seldepth 17 time 58275 score cp 117 nodes 40395081 nps 693172 pv e4f6 g8h8 f6h7 c7f4 h7f8 f4c1 d1c1 e7d5 f8h7 d5b4 a3b4 e6e5

As you see, I get a pretty decent line, and this is with 32 MB of hash tables which are not very well written (they waste a lot of space).

If you want to be guaranteed that you will get the entire PV, follow Bruce Moreland's method. In any case, what you said about TT being incompatible with extracting the PV didn't make any sense.

This topic is closed to new replies.

Advertisement