crypto_rsa

Iterative deepening


Hi, I have a question about IDA (iterative deepening). My algorithm can search the game tree completely to, e.g., depth 5. While searching to depth 6 the time runs out; however, some "top-level" branches were investigated completely to depth 6. Is it possible to somehow use the information from them, or should I always stick with the result from the last completely searched depth (in this case 5)? What is the common practice? Deeper results contain more information about the current position, but they are not complete.

One would keep the moves at the root sorted, at least so the best move found so far is always the first one. Then you simply return the first move, regardless of whether the depth-6 search was completed or not.

Does that answer your question?

Quote:
Original post by alvaro
One would keep the moves at the root sorted, at least so the best move found so far is always the first one. Then you simply return the first move, regardless of whether the depth-6 search was completed or not.

Does that answer your question?


Well, my algorithm sorts the moves at the beginning of each iteration based on the results stored in a transposition table (which likely come from the last completed iteration), but the order based on deeper results may differ. Yes, the best move from the depth-5 search is the first one, but that does not guarantee it will also be the best one in the depth-6 search.

Quote:
Original post by crypto_rsa
Quote:
Original post by alvaro
One would keep the moves at the root sorted, at least so the best move found so far is always the first one. Then you simply return the first move, regardless of whether the depth-6 search was completed or not.

Does that answer your question?


Well, my algorithm sorts the moves at the beginning of each iteration based on the results stored in a transposition table (which likely come from the last completed iteration), but the order based on deeper results may differ.

Typically you wouldn't use the transposition table for this. I'll post some pseudo-code below.

Quote:
Yes, the best move from depth-5 search is the first one but that does not guarantee it will be the best one in depth-6 search as well.

Of course not. The point is that it is safe to interrupt depth 6 at any point, and you will get a move that is no worse than the move that was best at depth 5. If you don't get the same move as you got in depth 5, it's because the new move has a better score in depth 6.

Here's some pseudo-code (of the C++ variety):
Move think(Position P) {
    vector<Move> moves = generate_moves(P);
    for (int depth = 1; enough_time_left(); ++depth) {
        double best_score = -infinity;
        for (unsigned i = 0; i < moves.size(); ++i) {
            // Negamax convention: negate the child's score.
            double score = -alphabeta(apply_move(P, moves[i]), -infinity, -best_score, depth - 1);
            if (search_interrupted_because_of_timeout())
                break;
            if (score > best_score) {
                best_score = score;
                // Move the new best move to the beginning of the list
                std::rotate(moves.begin(), moves.begin() + i, moves.begin() + i + 1);
            }
        }
    }

    return moves[0];
}


Quote:
Original post by alvaro
Quote:
Original post by crypto_rsa
Quote:
Original post by alvaro
One would keep the moves at the root sorted, at least so the best move found so far is always the first one. Then you simply return the first move, regardless of whether the depth-6 search was completed or not.

Does that answer your question?


Well, my algorithm sorts the moves at the beginning of each iteration based on the results stored in a transposition table (which likely come from the last completed iteration), but the order based on deeper results may differ.

Typically you wouldn't use the transposition table for this. I'll post some pseudo-code below.


I use the TT for storing evaluations (and thus sorting the move list) in the whole game tree, not just at the root node.
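For illustration, the TT-based ordering I mean looks roughly like this (a self-contained toy sketch, not my actual code; `Move = int`, `hash_after`, and the table layout are placeholders):

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <vector>

using Move = int;
struct TTEntry { double score; int depth; };

std::unordered_map<std::uint64_t, TTEntry> tt;

// Toy stand-in for a Zobrist-style hash of the position reached by m.
std::uint64_t hash_after(std::uint64_t pos_hash, Move m) {
    return pos_hash ^ (std::uint64_t(m) * 0x9e3779b97f4a7c15ULL);
}

// Sort moves so those with the best stored scores (typically from the
// previous, shallower iteration) come first; unknown moves sink to the back.
void order_moves_by_tt(std::uint64_t pos_hash, std::vector<Move>& moves) {
    std::stable_sort(moves.begin(), moves.end(), [&](Move a, Move b) {
        auto ia = tt.find(hash_after(pos_hash, a));
        auto ib = tt.find(hash_after(pos_hash, b));
        double sa = (ia != tt.end()) ? ia->second.score : -1e9;
        double sb = (ib != tt.end()) ? ib->second.score : -1e9;
        return sa > sb;
    });
}
```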

Quote:

Quote:
Yes, the best move from depth-5 search is the first one but that does not guarantee it will be the best one in depth-6 search as well.

Of course not. The point is that it is safe to interrupt depth 6 at any point, and you will get a move that is no worse than the move that was best at depth 5. If you don't get the same move as you got in depth 5, it's because the new move has a better score in depth 6.

Here's some pseudo-code (of the C++ variety):
Move think(Position P) {
    vector<Move> moves = generate_moves(P);
    for (int depth = 1; enough_time_left(); ++depth) {
        double best_score = -infinity;
        for (unsigned i = 0; i < moves.size(); ++i) {
            // Negamax convention: negate the child's score.
            double score = -alphabeta(apply_move(P, moves[i]), -infinity, -best_score, depth - 1);
            if (search_interrupted_because_of_timeout())
                break;
            if (score > best_score) {
                best_score = score;
                // Move the new best move to the beginning of the list
                std::rotate(moves.begin(), moves.begin() + i, moves.begin() + i + 1);
            }
        }
    }

    return moves[0];
}


This is what I currently do in my algorithm, just written slightly differently. Basically, I store the best score (and the corresponding move) until time runs out. Thanks for confirming my thoughts :-)

Quote:
Original post by crypto_rsa
I use the TT for storing evaluations (and thus sorting the move list) in the whole game tree, not just at the root node.

This makes sense in all the other nodes, but at the root the order you get by using my pseudo-code is better. When a move has been the best candidate for a while and at a particular depth another move seems better, you will often see the old best move become the best again. The closer you have it to the beginning of the list, the faster your search will go.

Anyway, the only thing that is really important is that the first move you explore in depth 6 is the move that was the best move in depth 5, and you seem to be doing that.

Quote:

The point is that it is safe to interrupt depth 6 at any point, and you will get a move that is no worse than the move that was best at depth 5. If you don't get the same move as you got in depth 5, it's because the new move has a better score in depth 6.

Here's some pseudo-code (of the C++ variety):
Move think(Position P) {
    vector<Move> moves = generate_moves(P);
    for (int depth = 1; enough_time_left(); ++depth) {
        double best_score = -infinity;
        for (unsigned i = 0; i < moves.size(); ++i) {
            // Negamax convention: negate the child's score.
            double score = -alphabeta(apply_move(P, moves[i]), -infinity, -best_score, depth - 1);
            if (search_interrupted_because_of_timeout())
                break;
            if (score > best_score) {
                best_score = score;
                // Move the new best move to the beginning of the list
                std::rotate(moves.begin(), moves.begin() + i, moves.begin() + i + 1);
            }
        }
    }

    return moves[0];
}


One more thing. You seem to be using the best result from the previous iteration as the lower bound in the next iteration. Is this safe? I think the evaluation function would have to be monotone or something. You can get, e.g., evaluation 150 at depth 3 (for move A); then you sort the moves and begin searching to depth 4, but the correct (best) evaluation for that depth is 100 (move B). When using 150 as the lower bound, you won't discover move B as the best one. Or am I wrong?

Quote:
Original post by crypto_rsa
One more thing. You seem to be using the best result from the previous iteration as the lower bound in the next iteration. Is this safe? I think the evaluation function would have to be monotone or something. You can get, e.g., evaluation 150 at depth 3 (for move A); then you sort the moves and begin searching to depth 4, but the correct (best) evaluation for that depth is 100 (move B). When using 150 as the lower bound, you won't discover move B as the best one. Or am I wrong?

You didn't read my pseudo-code correctly. best_score is initialized to -infinity every time we start searching a new depth.
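That said, engines that do seed the next iteration with the previous score use "aspiration windows", which handle exactly the case you describe by re-searching when the score falls outside the window. A sketch in the same pseudo-code style as above (the window width is just a tuning guess; alphabeta and infinity are the same assumed helpers):

```cpp
double search_root(Position P, double prev_score, int depth) {
    double window = 50;                  // guess a narrow window around the old score
    double alpha = prev_score - window;
    double beta  = prev_score + window;
    double score = alphabeta(P, alpha, beta, depth);
    if (score <= alpha || score >= beta) {
        // The true score fell outside the guess (e.g. your 150 -> 100 case):
        // re-search with the full window so no move is missed.
        score = alphabeta(P, -infinity, infinity, depth);
    }
    return score;
}
```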

Quote:
Original post by alvaro
Quote:
Original post by crypto_rsa
One more thing. You seem to be using the best result from the previous iteration as the lower bound in the next iteration. Is this safe? I think the evaluation function would have to be monotone or something. You can get, e.g., evaluation 150 at depth 3 (for move A); then you sort the moves and begin searching to depth 4, but the correct (best) evaluation for that depth is 100 (move B). When using 150 as the lower bound, you won't discover move B as the best one. Or am I wrong?

You didn't read my pseudo-code correctly. best_score is initialized to -infinity every time we start searching a new depth.


Oh, sure. I am sorry.
