• Advertisement
Sign in to follow this  

Killing bugs in Chess AI and correctly implementing alpha-beta and transpos. tables

This topic is 4470 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

For some time now, I've been implementing an AI for my 3d chess game, which is available with screenies at my webpage. After I implemented the first minmax version (static 3-ply), I noticed that the computer plays always the same game against itself, of course, because there is no randomness. So I took this as a 'bug-test' requirement. All the info I've read up on alpha-beta say that it is a conservative mathematical algorithm, so it cuts off only the paths that wouldn't be taken anyway. So wouldn't this mean that the AI should play the same with and without alpha-beta pruning? Ok, now I have minmax, alpha-beta and I'm now experimenting on good move ordering, iterative deepening and transposition tables. This is where my problems arise. Aren't all of these methods also 'conservative' (I don't know if this is the right term), so that they only speed up the search but the result should be the same with and without them? Because the problem is that with transposition tables, the AI starts behaving erratically. I'm not really sure how to store alpha & beta hashes in the table, so for now I just store exact hashes in it, i.e. I only hash positions which I've evaluated at depth=0. But still, the AI starts to make weird moves. Could this be due to collisions or something like that? Do you know any good articles that describe the use of transposition tables? Many sites do describe the alpha-beta algorithm in detail, but I haven't been able to find many on the actual implementation of transposition tables.

Share this post


Link to post
Share on other sites
Advertisement
This is a good website on chess programming:
http://www.brucemo.com/compchess/programming/

Changing the order of the moves is not "conservative" in the sense that the search can return some other move as best if two moves have the same score. Transposition tables are definitely not conservative, at least if you store bounds and if you use stored scores for searches of smaller depth. Read Bruce Moreland's page on "Search Instability".

However, your program shouldn't select bad moves because of search instability. Your problem lies somewhere else. I suggest you make a list of positions with clear best moves (chess problems would do) and use them to test changes to your program. When you introduce a big ugly bug, hopefully you'll see that your program doesn't solve the problems anymore.

Share this post


Link to post
Share on other sites
I've pretty much read through the website, but I think the bug lies somewhere in my alphabeta -function. Moreland's site describes Negamax implementation, while I use minmax, because it's easier for me to understand. That's why I have had to adapt a little, and I'm not sure if I got it right. I can post source code if necessary, but I doubt if you have time to look it trough.

Thanks for pointing out the situation about two move paths having the same score, I hadn't thought of that. I maintain a list of best moves so far, and while it seems that most of the moves are logical, the computer often makes a half-dubious move like trading its queen for a knight.

One thing that puzzles me, is that how do I really cope with has collisions? I have a hash table of about 400MB, which I index by hashTable[board->zobristKey % hashTableSize] like it is usually done.. Then I check that the key in the hash table is the same as the board's key, but what if they are the same and the positions are in fact different? I would then get the score from an incorrect position which would really mess up the AI (make it do bad moves).

For now I have disabled the tables, and I think the AI still makes some very stupid moves. (not obviously stupid, but semi-stupid :)) My evaluation function is quite simple, in addition to calculating material score it only penalizes unmoved knights, bishops and king's and queen's pawns, and gives bonuses to pawns nearing promotion. I wonder if this simplicity might be the biggest reason for the AI acting so dumb?

Share this post


Link to post
Share on other sites
My alpha-beta implementation is based on what I learned from here:
wikipedia
More specific, I think here's my problem:




1) If I understood right, (beta <= alpha) means cutoff? this is a 'bad' node, and won't be played? so why do we return alpha?
The parent node is a max node, so it will select
the biggest node given to it. Shouldn't we return beta (a bad score) so that the parent function won't select this node?

2) And shouldn't here be return alpha; instead?

Or am I just way off somewhere?

Share this post


Link to post
Share on other sites
AlphaBeta will not always play the same game as MinMax. This is because you can have several moves which are all equally as good.

AlphaBeta generally wants to do this:

-If my opponent already has a better move than the one I'm looking at, stop searching this node.


Go to:

http://www.cs.mcgill.ca/~cs251/OldCourses/1997/topic11/

for a decent description.


Transposition tables are very difficult to implement because of issues with bugs. I recommend you do the following:


Debug stage 1:

1. Take out the code that lets you return early due to move transposition.

2. Create a function called 'createHash', which will look at a board and generate a hash value for it.

3. Every time you see a move, call createHash and compare this value against the hash value you already have.

4. If there is a discrepency, find out why.



Debug stage 2:

1. Set a static search depth (4 ply?).

2. With move transposition turned off, play a test game and record every single move / response, and the scores, out to like 14 moves or so.

3. Turn on move transposition only for the last ply (the one where you would call your evaluation function).

4. Replay the same game-- the results should be identical, as move transposition is only removing a few evaluates.



Debug stage 3:

(You can use this one for every new feature you add)

1. Download some chess problems.. Things like 'mate in 4' or 'mate in 7'.

2. Make sure your program can solve them.


Best of luck,
Will

Share this post


Link to post
Share on other sites
Quote:
Original post by RPGeezus
http://www.cs.mcgill.ca/~cs251/OldCourses/1997/topic11/


been there, it's been a great help.

Quote:
Original post by RPGeezus
Debug stage 2:

1. Set a static search depth (4 ply?).

2. With move transposition turned off, play a test game and record every single move / response, and the scores, out to like 14 moves or so.

3. Turn on move transposition only for the last ply (the one where you would call your evaluation function).

4. Replay the same game-- the results should be identical, as move transposition is only removing a few evaluates.


I'm now doing exactly that, and the results are nowhere near identical.

Quote:
Original post by RPGeezus

Debug stage 3:
(You can use this one for every new feature you add)
1. Download some chess problems.. Things like 'mate in 4' or 'mate in 7'.
2. Make sure your program can solve them.


Hey, that's a GREAT suggestion! I think I'll try that some day.

Share this post


Link to post
Share on other sites
I have a hard time thinking of alpha-beta when it's not written in the negamax style. Anyway, the way I think about alpha-beta is that it can return three things:
- An exact score between alpha and beta,
- alpha or lower, which means that the score is alpha or lower,
- beta or higher, which means that the score is beta or higher.

I hope that helps in general.

Are you using a quiescence search at all? You know that without quiescence search your program will play horribly no matter what, right?

I didn't find hash tables very hard to implement. The problem you mention of using a score that was store for one position as valid for a different position can be avoided by storing the zobrist key in the table for verification. If you read Bruce Moreland's page carefully, you'll see that he does that.

If you want to send me code, I'll probably have time to look over it during the weekend.

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
Quote:
Original post by alvaro
I have a hard time thinking of alpha-beta when it's not written in the negamax style. Anyway, the way I think about alpha-beta is that it can return three things:
- An exact score between alpha and beta,
- alpha or lower, which means that the score is alpha or lower,
- beta or higher, which means that the score is beta or higher.

I hope that helps in general.


I changed my alphabeta function a little and I think I had a bug there, it might be solved now. I think the problem is that when I return a value from alphabeta function, the parent function can't tell if the score returned is the exact score or a upper or a lower bound. So if I call alphabeta with a value like a=1002 and it returns 1002, I can't tell if it is 'the real' computed result for this node, or if it just cut off (which it probably did). So for now I just select the best move with '>' and not '>=' because the later searches will return the same score if they got cut off and shouldn't be selected. As a result (I always order moves so that captures will be tried first) the AI is very keen on capturing enemy pieces, because it will select the first move, usually capture, rather than some other move which actually has the same score. Is this normal and how would I overcome this? I don't want the AI to always just trade pieces.

Another thing that I can't make a "score" list of the evaluated moves because the list would look something like this:

1204 e2-d3 <

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
Ok, achieved to mess up my post by putting tag signs there, doh me.

The point is that my 'best moves' -list ends up with the same score for the best move and several 'cut-off' moves. I can only tell from the list that which was the best move, but can't try out the 2nd best move for example (I'd like the AI to vary its play a bit)

The quiescence search I implemented wasn't very good because I didn't have the null-move option in it. I'll try that after I get the transposition tables to work.

I do store the zobrist keys in the hash table, what I meant to ask is that what if I get the same hash for two different positions? Can this happen or do I have a bug if it does?

I'll put something up by the weekend, so you can look what's going on.

Share this post


Link to post
Share on other sites
Quote:
Original post by Anonymous Poster
The point is that my 'best moves' -list ends up with the same score for the best move and several 'cut-off' moves. I can only tell from the list that which was the best move, but can't try out the 2nd best move for example (I'd like the AI to vary its play a bit)

Sure, alphabeta won't tell you anything about the second best move. If you want that, you need to change the logic of the search at the root a little. Anyway, if you only want to randomize your games a little, you can start by giving it a broad opening book. After that, if you ponder during the opponent's turn, variations in time control should be enough to not play the exact same game over and over again. You can also add a little random component to the evaluation function.

Quote:
The quiescence search I implemented wasn't very good because I didn't have the null-move option in it. I'll try that after I get the transposition tables to work.

Bad idea. Get your quiescence search to work as soon as possible. Everything else is secondary.

Quote:
I do store the zobrist keys in the hash table, what I meant to ask is that what if I get the same hash for two different positions? Can this happen or do I have a bug if it does?

I haven't made the computations in a while, but if you store a 64-bit key and the table you use to compute the Zobrist keys is reasonably random, you will probably never see a collision.

Share this post


Link to post
Share on other sites
Quote:
Original post by alvaro
Bad idea. Get your quiescence search to work as soon as possible. Everything else is secondary.

I haven't made the computations in a while, but if you store a 64-bit key and the table you use to compute the Zobrist keys is reasonably random, you will probably never see a collision.


Listen to Alvaro.

Share this post


Link to post
Share on other sites
On the other hand, I think I'll delay posting the sources for now. I still have a bug in my alpha-beta code but I think I can solve it on my own. Meanwhile, you can try the latest implementation at:
Chess v7 download
Chess homepage

Share this post


Link to post
Share on other sites
ai snippets
ok, here goes... It is not the full source, I tried to put the most relevant parts in it. Ask me if you want to look at any headers or something else. (Please don't whine about code convention and such :) )
I found a bug in the zobrist hash creation when making a move which was probably the cause of so many table collisions. I'll fix that some time soon.

The main point is that what should I do (in which order) to make the AI stronger:
-- quiescence searches?
-- transposition tables?
-- better move evaluation? (what's the most prominent thing I should change?)

Another thing I'm worried that I think it might not mate correctly or try to draw the game if it is losing. Are there any special things I should consider when detecting draw or mate? Is it common to use a 'lazy' mate detection so that we detect mate just by a move that captures the king?

Share this post


Link to post
Share on other sites
It seems that the AI prefers draw game to an obvious mate :( So that's one more bug I need to smash.

The final goal is though to make the AI 'fun' to play. I'm very poor at chess and I'd like to see the computer vary its play and try out different lines (which aren't total blunders of course) so the goal is to make it challenging. I guess this would best be achieved by first making it as strong as possible and then artificially reducing the strength.

Share this post


Link to post
Share on other sites
A few comments on your code:
* You probably shouldn't use bitboards for your first chess program. The way you are evaluating and generating moves is not making good use of the bitboards, and it would be faster and more clear if you had used a format like an array of 64, a 12x10 representation, 0x88 or 16x12.
* Instead of making a copy of the board in the search function, I recommend using a single board object on which you carefully do and undo moves. For this I keep a stack of all the moves made so far together with some information that you need to undo the moves (what were the castlings allowed before the move, was an en-passant capture possible, what piece was captured...).
* I find it easier to define a few funcions (place, remove, move) that operate on the board at the lowest level, and which update the board (bitboards or array), the piece list, the zobrist key, the material count... Basically anything that you want to keep updated incrementally while you do and undo moves.
* If you want to enable the part of SearchHash that you have dissabled (dealing with hash scores that are only bounds), you need to pass alpha and beta as references or pointers, so you can modify their values. If a lower bound is found, you can use it to improve (increase) the value of alpha to that bound; similarly, if an upeer bound is found, you can use it to improve (decrease) the value of beta. If the values of alpha and beta cross, you can return the hash score from MiniMax. Oh, and this should work just find regardless of whether it's white's turn or black's turn.
* RecordHash shouldn't write to the hash if what's already there has a higher depth, because with the current code you will be forgetting important information to learn unimportant information.
* Checking the clock in every node is too expensive. If you profile your program you'll see that you spend most of your time asking for the time. Instead, check it only every few thousand nodes. You can also poll the input when you check the time, to see if you got a command that you need to deal with (like the user asking to abort the search).
* You shouldn't use std::list to store moves, because allocating and deallocating memory is expensive. Instead, use a large array and use only the part of the array that you need.
* You can use the "lazy" mate detection, but I think you are complicating things too much. What I do is set the score in MiniMax to -Infinity instead of alpha (let's look at it from white's perspective to simplify things). At the end, score will still be -Infinity if no moves were legal, and then I decide between mate and stalemate at the end. I think setting the initial value of score to alpha or beta instead of -Infinity or +Infinity is the main problem with your code. Let me post some of my code:
  int eval=-1000000000;
Move m[256]; // I am not completely sure that 256 is enough, but
// I haven't thought much about it. According to
// http://www.xs4all.nl/~timkr/records/records.htm#Most%20possible%20moves
// 75 is the maximum for positions that happened in real games.
Move *best=m;
Move *end=generate_moves(b, m); // b is a reference to the board

// Main loop of the alphabeta function
for(Move *i=m;i!=end && !abort_search;++i){
b.do_move(*i);
if(is_legal(b))
eval=-alphabeta(b,-beta,-alpha,depth-1);
b.undo_move();
if(alpha<eval){
alpha=eval;
best=i;
if(alpha>=beta)
break;
}
}

// This detects mate or stale-mate
if(eval==-1000000000)
return in_check?-1000000000+b.move_number-move_number_at_root:0;


This is actual code. As you see, using NegaMax simplifies things tremendously. `b.move_number-move_number_at_root' is how far we are from the root, and that piece of code makes the program prefer shorter mates when it wins and longer mates when it loses. It also guarantees that a score of "mate" is different from -Infinity. Otherwise you would think that you are in stalemate a move leads to being mated.

I haven't checked you RootSearch function or the last thing that deals with threads (I don't think you need threads unless you are using multiple processors).

Share this post


Link to post
Share on other sites
Thanks a million times, alvaro. You're being most helpful. Rating += inf.

Another thing, can you tell about statistics of how a 'decent' chess program should perform? I get about 250knodes/sec, and can search to ply 5 or 6 in about 7 seconds now (which is the AI time limit to move, iterating depths from 2 to 6)

I'm using the threads because my 3D drawing code would otherwise stall when the computer is thinking.

Share this post


Link to post
Share on other sites
Quote:
Original post by clb
Another thing, can you tell about statistics of how a 'decent' chess program should perform? I get about 250knodes/sec, and can search to ply 5 or 6 in about 7 seconds now (which is the AI time limit to move, iterating depths from 2 to 6)

It's very hard to compare one program to another in nodes/s or depth, because these are meaningless numbers. Well, they are not completely meaningless; if you make a change to your code that gives you a 20% improvement in nodes/s, that can only be a good thing.

The program I am working on right now (which has a mindlessly simple evaluation function), runs at about 600Knodes/s and reaches depth 10 in about 7 seconds. But of course the depth you get to depends on how aggressively you prune your tree, and what extensions you have in your search. To give you a better idea of what my "depth 10" means, running with null-move forward pruning (R=2) and extending all checks and nothing else.

Quote:
I'm using the threads because my 3D drawing code would otherwise stall when the computer is thinking.

That's a very good reason to use threads. However, I would recommend running the chess engine as a separate process that knows nothing about graphics and just talks to the GUI using pipes. There are already two well-established protocols to communicate between the GUI and the engine: Winboard and UCI. I really like the design of UCI and that's what I have implemented in the program I am currenty working on. If you want to implement a GUI as well, that's fine, but Winboard or UCI compatibility will allow using your engine with other GUIs, matching your engine against other engines and playing on Internet chess clubs, like FICS and ICC.

Share this post


Link to post
Share on other sites
Ok, having done some work, here are the results so far:
Chess main page
Latest build

Quote:
Original post by alvaro
* You probably shouldn't use bitboards for your first chess program. The way you are evaluating and generating moves is not making good use of the bitboards, and it would be faster and more clear if you had used a format like an array of 64, a 12x10 representation, 0x88 or 16x12.


Well yes.. my move generation/evaluation isn't as effective as it could be.. I'm now thinking that might this be what's 'slowing me down' so much?

Quote:
Original post by alvaro
* Instead of making a copy of the board in the search function, I recommend using a single board object on which you carefully do and undo moves.


Done, I removed all std::lists and now use only statically allocated memory, which is somewhat faster. I save the board positions in a stack and the possible moves in a queue of some sort.

Quote:
Original post by alvaro
* You shouldn't use std::list to store moves, because allocating and deallocating memory is expensive. Instead, use a large array and use only the part of the array that you need.
* If you want to enable the part of SearchHash that you have dissabled (dealing with hash scores that are only bounds), you need to pass alpha and beta as references or pointers, so you can modify their values. If a lower bound is found, you can use it to improve (increase) the value of alpha to that bound; similarly, if an upeer bound is found, you can use it to improve (decrease) the value of beta. If the values of alpha and beta cross, you can return the hash score from MiniMax. Oh, and this should work just find regardless of whether it's white's turn or black's turn.
* RecordHash shouldn't write to the hash if what's already there has a higher depth, because with the current code you will be forgetting important information to learn unimportant information.
* Checking the clock in every node is too expensive. If you profile your program you'll see that you spend most of your time asking for the time. Instead, check it only every few thousand nodes. You can also poll the input when you check the time, to see if you got a command that you need to deal with (like the user asking to abort the search).


Fixed these.

Quote:
Original post by alvaro
* You can use the "lazy" mate detection, but I think you are complicating things too much.


I cleaned it up a little, but I think the AI still finds checkmates in a very weird way. I previously had a bug where it preferred draws over checkmates but it should be fixed now.

Two big problems still exist:

1) The search is too slow. I'm not sure how to achieve better results, by move ordering? by some optimizations?

2) Quiescence searches take too much time because of the amount of captures can be done. Is MVV/LVA the best way to address this problem?

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement