Alpha Beta, the next step??

Started by
14 comments, last by Ninj4D4n 21 years, 4 months ago
right, but with alpha-beta dont you sometimes cut off parts of the tree which are potentially better?


"There is no dark side of the moon really,
As a matter of fact, its all dark."
Advertisement
quote:Original post by DarkHamster
right, but with alpha-beta dont you sometimes cut off parts of the tree which are potentially better?


No, alpha-beta is guarenteed to return the exact same score that min-max would have returned. Alpha-beta only prunes a branch after it proves that the branch cannot produce anything better than what you already know you can get.

For example, you search the first move, and the search tells you that you can get 3 in a row if you play this first move. That''s pretty good for you. Then you search the second move, and during that search, alpha-beta realizes that your opponent can get 4 in a row if you play this second move. So even though it hasn''t searched that entire branch for the second move yet, it stops searching that entire branch and bails out, because no matter what else the search finds in that branch, the result is still going to be a win for your opponent.

So at this point, there is zero possibility that you are going to get anything better than a loss (4 in a row for your opponent), so you aren''t going to prune away anything meaningful. That''s why it''s safe to prune away branches wihtout searching them. If you continued to search the rest of that branch after you found out that your opponent could win, the search would basically be saying:

"Can I do better than a loss? No."
"Can I do better than a loss? No."
"Can I do better than a loss? No."
"Can I do better than a loss? No."
"Can I do better than a loss? No."
"Can I do better than a loss? No."
...

Over, and over, and over, until it searched the rest of that branch. Whenever you find out that the move you are searching is a loss, you know it can''t get any worse, so you can stop wasting time asking a question you already know the answer to. It works the same way with regular scores too.

For example, you search the entire first legal move and get a score of +2. Now if you play this first move, you know for sure that you can get a score of +2. So now you search the second move. You start searching, you try the second move, and you see that your opponent can play a move that will force a score of +1 (and the decision is up to your opponent now, because it''s his turn, so it''s out of your control now). So you''ve only looked at two moves here, and you know two things for sure:

1. If I play the first move I can get a score of +2
2. If I play the second move, my score is going to be AT MOST a +1 (it could get worse, you''ve only looked at two moves so far)

So, at this point, you can decide that the second move is trash and throw it away without even bothering to search the rest of it.
There''s a huge paper by Victor Allis about solving Connect 4, looking very deep into the subtelties of the game. I suggest you read that and improve your evaluation function as much as you can, because you''ll get _a lot_ more power from a good evaluation function than from 1 or 2 extra ply. Not to mention that a solid evaluation function allows deeper searches by itself.

Then, definitely go for iterative deepening, even without the best move optimisation or transposition table. As table columns fill up, or players are forced to avoid or make certain moves, the branch factor drops a lot, so you''ll be able to search deeper. If only one empty column remains (very frequent occurance in C4), you can do a 6 ply search with 6 moves. Fixed depth searches are plain bad.

I think this hasn''t been mentioned: move ordering is critical to the performance of alpha-beta.

For Connect4, I developed a variation on chess'' history heuristic. If you are using a hash table for transpositions (as you should), use the move from the hash table as your first try and sort the rest based on the history heuristic that I describe below.

Version 1:
Keep a table: unsigned hh[2][42];
The first index is side and the second is the square. After each alpha-beta loop, add 2^depth to the entry in the table that corresponds to the best move (the one with the best score, or the one that triggered a beta-cut). Before the alpha-beta loop, sort your moves based on the score they get in the table.

Version 2:
Keep two tables: unsigned hh[2][42], hh2[2][42][42];
hh is the same as in version 1. hh2 uses an additional index, which is the previous move (the opponent''s move), and is updated with the same formula (adding 2^depth to the best move). When sorting, combine the results from the two tables. I used hh[side][square]+42*hh2[side][square][previous_square].


But the most important part in a Connect4 game is evaluation. You should learn about the relative importance of threats on even and odd rows. I spend three months working on that with two other math students. I found similar results on the web recently:
http://www.pomakis.com/~pomakis/c4/expert_play.html


Post again if you need more help. Good luck!
alvaro, how does a simple alpha-beta search (no transposition table, no opening book) with a simple heuristic function (threat count) fare against the average human C4 player? I am unwilling to try to improve it anymore, unless I really have to.
Ok, i''ve really gotten some boots from some simple suggestions on the forum, but I still don''t get the whole hash table thing (at all). Where can i find some more explenation on that?? I''ve tried searching, but nothing special shows up.

--Ninj4D4n

This topic is closed to new replies.

Advertisement