
# New Chess game tree

## Recommended Posts

I've recently completed the guts of a new chess engine that's been in the works for a few weeks. What makes it interesting is the way in which I generate moves: no bitboards, no extra data structures to track which squares are safe, etc. In fact, aside from the basic movement rules, there is no legal-move detection at move generation time. This makes it much faster than your _standard_ run-of-the-mill chess game tree (I mean this only in terms of valid move generation). I'm posting for two reasons. One is to see if anyone else has heard of or used this method in the past (are there better ways?). The other is to let people know that there is at least one good alternative to standard bitboard/legal-move detection.

Here is how the system works. Normally a chess program needs to generate a list of legal moves for a given position. An illegal move is one that would leave the king in check. Knowing whether a move is legal requires you to scan the board, figure out where each piece can attack, and determine whether one of those attacks lands on the king's square. Knowing this is critical.

"Scanning the board" is where we make our first change: we skip it. Why? Because most of the information we actually need will become available through the natural course of the program's execution. So for now we assume most moves are valid. This is a good assumption because, as it turns out, the only time it's necessary to verify a move is when it results in the capture of a king. Most moves DON'T result in a king capture, so checking their validity is unnecessary. And when more moves do result in a king capture, the tree is going to do lots of pruning anyway! This is how we save: by doing more processing than normal, but only when it needs doing. For example: I make move (A), then my opponent makes move (B) and captures my king. If B did not result in a capture, then there is no need to verify A. If it did, then we must do some extra work.
To accomplish this we need to add a check to our MinMax / AlphaBeta / whatever algorithm. As you enter the function, generate a list of moves (as you normally would). Then check each move to see if it results in a king capture (you would normally do this too, I hope). If a king capture occurs, call our new Verify function and pass it the PREVIOUS move (A), not the one you just generated (B). Verify does three things:

1. Using the board data you passed, construct a copy of the board previous to that (A-1). (Not a problem if you remember details about the last move in your gameboard structure.)

2. Using the new board (A-1), let the side that made the capture (white in our example) move again. If any of these moves results in the king's capture, then the black king must be in checkmate, so there is no need to do any more processing. Quit.

3. If no moves in step 2 resulted in a capture: using the new board (A-1), create a list of moves for the side that was captured (black). For each of these, create a list of moves for the side that did the capturing (white). If for every black move there is a resulting king capture by white, then the board is in a stalemate position. But if ONE or more of these results in a move where the king isn't captured, then move A is ILLEGAL, and we can stop all processing.

That's it. If coded properly it only does maximum work when a move results in a stalemate (very rare). The only time there is waste is when a king is captured. We end up exploring just one extra move (by move I mean move, not ply) in the main loop IF a capture occurs, and a maximum of two ply (minimum of one move) extra if the move is illegal or a stalemate. This is much faster than checking EVERY move to see if it's legal. There is lots of room for optimizing (like sorting moves during generation so king captures appear first). Please let me know what you think. I'm sure this has been tried before. It's really easy to implement.

Will
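To make the control flow concrete, here is a toy C++ sketch of the idea. All names here are hypothetical stand-ins, not the structures from the engine above: a `Node` plays the role of a position, and a real engine would generate positions on the fly instead of holding a prebuilt tree. The point is just how a king-capture reply turns into an "the previous move was illegal" signal that the parent uses to prune.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Toy stand-in for a position. Children are the positions after each
// pseudo-legal reply; no legality checks were done when building them.
struct Node {
    int eval;                    // static evaluation, from the side to move
    bool kingCaptureHere;        // the move creating this node took a king
    std::vector<Node> children;  // pseudo-legal replies
};

const int ILLEGAL = INT_MIN;     // sentinel: "the move into this node was illegal"

int search(const Node& n, int depth)
{
    if (depth == 0 || n.children.empty()) return n.eval;
    int best = INT_MIN + 1;
    for (const Node& child : n.children) {
        // A reply captures our king, so the move that created THIS node
        // was never legal. Report that instead of a score. This is the
        // point where the Verify step above would also distinguish
        // checkmate from stalemate from a plain illegal move.
        if (child.kingCaptureHere) return ILLEGAL;
        int s = search(child, depth - 1);
        if (s == ILLEGAL) continue;   // that move of ours was illegal; skip it
        if (-s > best) best = -s;     // negamax sign flip
    }
    return best;
}
```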

##### Share on other sites
First of all, congratulations on having a chess engine up and running. The hardest part of making a good chess program is getting started.

Now the bad news: You are not discovering any new alternatives to legal-move detection. Sorry.

Actually, it goes the other way around. Old programs (including mine) used to do it your way. But adding information about threats, or being able to find them fast (like using bitboards) turns out to be important for speeding up the evaluation. So if you have a non-trivial evaluation function, you have to do the work anyway. Legal-move detection is faster under those circumstances.

Also, there are situations where lots of moves are illegal (for instance, after a check). And trees in which that happens often are the most critical in a game, because they can lead to checkmate! If your program slows down at exactly the hottest moments in a game, you are doomed. At the very least, you should make a specific routine to avoid checks.

Your scheme is still usable. I would recommend that you make your move generation more or less independent of the rest of the game, so you can test what is fastest for your program.

##### Share on other sites
Thanks for the post. It was a lot of work getting this chess engine running. My wife thought I was losing it.

I was under the assumption that, even with bitboards, it is still necessary to recalculate which squares are under attack after every move. If so, that is almost the equivalent of searching an extra ply for every move (except illegal moves). Is this correct?

I didn't take into account implementing any kind of complicated evaluation. Terms that can be measured quickly (material advantage, pawn structure, and a few other rules) are not a problem, but I see what you're saying: in my engine there would need to be a lot of recalculation for more complex terms.

Thanks,
Will

##### Share on other sites
As far as I understand (I am pretty new to AI programming, so forgive me if my explanations are clueless...), with bitboards you don't need to recalculate the controlled squares (or legal moves) after each move as if you had no information at all about the position.
Let''s say for instance you moved a rook without capture.
You take your previous bitboards, remove from them the flags on squares indicating that they were controlled by the rook at its old position, and add flags on the new squares controlled by the rook. You also need to do a few extra things, of course, like extending the controlled squares of long-range pieces that were defending or attacking the rook, and, if you captured something, removing the flags for the squares controlled by the captured piece. You must also identify pins possibly created or destroyed by your rook move, and remove or add flags for the squares controlled by the unpinned or newly pinned piece. If your bitboard data includes xrays (for instance, if you have a rook on a1 and a rook on b1, c1 is controlled by the b1 rook and xrayed by the a1 rook through b1), the extension of lines and the pinning/unpinning are easily detected from your previous bitboard data.
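A rough C++ sketch of the "remove old attacks, add new ones" idea, on a standard 8x8 board for simplicity. All names are illustrative; `rookAttacks` is a deliberately naive loop (real engines use rotated bitboards or similar lookups), and note that the removal step is only correct when no other piece attacks the same squares, which is exactly why the xray and pin bookkeeping described above is needed in a real engine.

```cpp
#include <cassert>
#include <cstdint>

// Naive sliding-attack generator: walk each of the four rook
// directions until the board edge or the first blocker (inclusive).
uint64_t rookAttacks(int sq, uint64_t occupied)
{
    uint64_t attacks = 0;
    int r = sq / 8, f = sq % 8;
    const int dr[] = {1, -1, 0, 0}, df[] = {0, 0, 1, -1};
    for (int d = 0; d < 4; ++d) {
        int rr = r + dr[d], ff = f + df[d];
        while (rr >= 0 && rr < 8 && ff >= 0 && ff < 8) {
            uint64_t bit = 1ULL << (rr * 8 + ff);
            attacks |= bit;
            if (occupied & bit) break;   // sliding stops at the first blocker
            rr += dr[d]; ff += df[d];
        }
    }
    return attacks;
}

// Incremental update of one side's attack map after a rook moves from
// 'from' to 'to': clear the old attack set, shift the occupancy bit,
// then merge in the new attack set. (Single-rook illustration only.)
uint64_t updateAttacks(uint64_t attackMap, uint64_t occupied, int from, int to)
{
    attackMap &= ~rookAttacks(from, occupied);            // old squares out
    occupied = (occupied & ~(1ULL << from)) | (1ULL << to);
    attackMap |= rookAttacks(to, occupied);               // new squares in
    return attackMap;
}
```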
At least that's what I am doing for my shogi program (even though there are few long-range pieces in shogi), and I think it speeds up legal-move detection a lot without requiring really complex coding. The update of my bitboards is done with very few loops, and getting the legal moves just requires one loop over the 81 squares to find each piece that can go to that square (plus adding the piece-drop moves, but these don't exist in classic chess).

David Antonini

##### Share on other sites
Ugh... I was writing a long explanation of how to detect legal rook moves with a bitboard (without any loop, just a few binary rotations, |, & and ~) and the message posting window just closed for some reason....
Anyway, check this link; though less detailed (and not giving an algorithm for bitboard rotation), it explains it well too.
http://www.ast.cam.ac.uk/~cmf/chess/theory.html#bitboards

After rereading your post I understood better what you meant, though. You do not verify whether a move leaves the king en prise, but you still have to test collisions for long-range pieces when generating your move list (to avoid having rooks jump over pieces). So I would certainly advise you to use bitboards, as they speed up this task A LOT: no loop needed at all, except maybe to update the rotated boards after a move has been played. And even more importantly, everything is done with binary operations on 64-bit unsigned integers (which are extremely fast).
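For illustration, here is roughly how an occupancy-indexed rank table handles the collision problem (C++ sketch, illustrative names): for each rook file and each of the 256 occupancy patterns of a rank, precompute the attacked squares once. File and diagonal moves then reuse the same table after "rotating" the board so the line of interest becomes a rank.

```cpp
#include <cassert>
#include <cstdint>

// rankAttacks[file][occ] = squares a rook on 'file' attacks along its
// rank, given the rank's 8-bit occupancy pattern 'occ'. Filled once at
// startup; lookups afterwards need no loops at all.
static uint8_t rankAttacks[8][256];

void initRankAttacks()
{
    for (int file = 0; file < 8; ++file)
        for (int occ = 0; occ < 256; ++occ) {
            uint8_t att = 0;
            for (int f = file + 1; f < 8; ++f) {   // slide toward the h-file
                att |= 1 << f;
                if (occ & (1 << f)) break;         // stop at the first blocker
            }
            for (int f = file - 1; f >= 0; --f) {  // slide toward the a-file
                att |= 1 << f;
                if (occ & (1 << f)) break;
            }
            rankAttacks[file][occ] = att;
        }
}
```

Note the rook's own occupancy bit is irrelevant to the result, since both slides start from the adjacent square.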

About treating moves that leave the king en prise as legal: I think most top programs proceed that way already, except when verification is needed (i.e. in the quiescent search at the lowest parts of the tree). For the rest, they consider that the brute force gained by not having to detect absolute pins outweighs the cost of generating a few extra moves. So nothing new, I am afraid, sorry.

If you need more explanations on bitboards, you can mail me at sigloxx@cybercable.fr (I won't retry posting my explanations here just to see my message closed again).

David Antonini

##### Share on other sites
Maybe I'll go back and rewrite the move generator to use bitboards, just for the sake of comparison. It's not so much bitboards that I considered slow; it was the legal-move detection.

I was pretty sure this method had been used elsewhere, but like I said, I've been unable to find any documentation that describes it.

Thanks for the info,
Will

##### Share on other sites
David,

I believe that the amount of work you would have to do to take care of all of the special cases would take as much time as, or longer than, generating the moves from scratch.

I have fiddled around before with generating 100% legal moves (i.e. I tried to detect pins), and that alone was complicated enough to cause a slowdown. Detecting if a knight is pinned is easy. But what about when a bishop is pinned? It might still be able to move in two directions, so you still have some checking to do. It's not impossible, but it certainly seems much more bug-prone. Imagine if you overlooked some obscure case and your program played an illegal move in a crucial game...

I've recently experimented with some attack generation methods that might prove useful for this method, but I still think incremental move generation isn't going to bring great speed improvements. Even if it is an improvement, it probably won't be significant in the scope of the entire program. I think I'll play with it, though. I'd like to be able to generate 100% legal moves instead of having to check at the next ply whether a move was actually legal.

Contrary to what many people think, move generation is not where a good chess engine spends most of its time. Evaluation is usually where the majority of time is spent.

Russell

##### Share on other sites
Thanks for the advice, Russell.
I too gave up the idea of checking whether a move leaves the king en prise, so there's no use checking pins. The quiescent search will find king captures anyway.
And all the "legal moves" in a given position are actually generated by the evaluation function itself now (and are probably a small part of it, speed-wise), as my evaluation needs to know which squares are controlled by whom.
In fact, since I wanted to do this program in JavaScript, the BIG slowing factor seems to be array and object creation... Defining a new object within the search is out of the question, it seems, and I am going to have to use a lot of global variables (well, I already do; my program barely looks like a program now).

I wonder how IE stores user-defined objects in memory... I even had to give up the idea of pre-generating legal-move bitboards for all squares a rook or bishop might be on, as creating a few arrays of 9*512 objects of three 32-bit integers (bitboards for a 9x9 board in a language that doesn't support 64-bit unsigned integers) was making my computer swap to the hard drive... (at least with IE5.5; NS6 seems to use less memory to store objects).

So I just pre-generated them for one file, depending on the occupancy, and I do bitboard rotations during legal-move generation. This doesn't seem to slow me down too much compared to the time taken storing data in the tree. I wanted to store the new positions (an object with the bitboards for each piece and the bitboards for the white and black pieces) at each depth, so as not to have to "undo" moves. I guess I will have to undo moves after all.

Anyone interested can download the current sources at 9*9 board bitboards utilities and Main part. The main part is VERY chaotic at the moment (a mix of bitboard use and of my first try without bitboards), but you can find the current evaluation function there (GiveEvaluation()). It still obviously requires a lot of optimization (evaluating the starting position 1000 times takes over 30 seconds, when I hope to reach 1 or 2 seconds).

David Antonini

##### Share on other sites
Err... tired, 7am.
Here are the actual links :
bitboards
and

Main

##### Share on other sites
Another myth I think a lot of people believe is that bitboards are the most popular board representation. It's not true. Bitboards are only now becoming more popular; the 0x88 board representation is by far the most popular, I believe.

I use bitboards, though, and I don't know if using a non-compiled language for bitboards is a good idea. The reason bitboards are fast enough is that they can make use of fast "least significant 1 bit" functions built on the bsf/bsr ASM instructions. The language you use has to support inline assembly, or linking of external assembly. I think finding the location of bits in Java or JavaScript (or any other interpreted language) is going to be quite a bit slower than using bsf/bsr.

Russell

##### Share on other sites
Isn't it possible to get the least significant bit by doing b&(-b)? Though it's true this just clears the bitboard of all bits but the least significant one. Alas, it doesn't give me the shift needed to bring it to the first square, and JavaScript has no premade function to do this, I'm afraid (so my function to get the coordinates of the least significant bit is indeed very slow so far, with up to 18 conditional branches; I am looking for a way to get rid of it, maybe by indexing my squares with one-bit bitboards, but I'm afraid referencing a square without using an incremental index might then become very slow...). Anyway, I'll switch the program to C++ as soon as I can afford to buy it...
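For what it's worth, the b & (-b) isolation can be followed by six masked tests (a binary search on the bit position) to get the index, with no long if/else ladder. Here is a C++ sketch with a hypothetical function name; it is written as b & (~b + 1) to stay in unsigned arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Index (0..63) of the least significant set bit of b (b must be nonzero).
int lsbIndex(uint64_t b)
{
    b &= ~b + 1;                                   // keep only the lowest set bit
    int idx = 0;
    if (b & 0xFFFFFFFF00000000ULL) idx += 32;      // which 32-bit half?
    if (b & 0xFFFF0000FFFF0000ULL) idx += 16;      // which 16-bit quarter?
    if (b & 0xFF00FF00FF00FF00ULL) idx += 8;
    if (b & 0xF0F0F0F0F0F0F0F0ULL) idx += 4;
    if (b & 0xCCCCCCCCCCCCCCCCULL) idx += 2;
    if (b & 0xAAAAAAAAAAAAAAAAULL) idx += 1;
    return idx;
}
```

The masks work because after the isolation exactly one bit is set, so each test independently contributes one bit of the index.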
BTW Russell, I saw your functions to generate knight and rook moves in the general section and found the rank algorithm quite clever. I'll have to compare them speed-wise with the use of my pre-calculated moves depending on rank/file/diagonal occupancy. As I could generate those only for one rank because of memory problems, and thus need to rotate the board for rook files and bishop diagonals, yours might be much faster. Not going to be easy to adapt to my 9*9 board, though.

David Antonini

##### Share on other sites
There are free C/C++ compilers you can get if money is a problem.

I think you should be able to find the least significant bit in fewer than 18 branches. You could do something like:

    // This is a most significant bit function, but you get the idea
    // Untested code, in C/C++
    int msb(unsigned __int64 bits)
    {
        int result = 0;
        if (bits & 0xFFFFFFFF00000000)
        {
            bits >>= 32;
            result += 32;
        }
        if (bits & 0xFFFF0000)
        {
            bits >>= 16;
            result += 16;
        }
        if (bits & 0xFF00)
        {
            bits >>= 8;
            result += 8;
        }
        if (bits & 0xF0)
        {
            bits >>= 4;
            result += 4;
        }
        if (bits & 0xC)
        {
            bits >>= 2;
            result += 2;
        }
        if (bits & 0x2)
        {
            bits >>= 1;
            result += 1;
        }
        return result;
    }

I'm not sure if this is 100% correct, but you get the idea. You basically do a binary search on the 64-bit value: each if-statement tells you which half the most significant bit is in. You could also speed this up with a 256-entry lookup table, saving the last few if-statements by looking up the most significant bit of the remaining 8 bits in the table. That would be very fast if the 8-bit lookup table is in the cache (which it no doubt would be).
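Something like this, sketched in C++ with illustrative names (the table is filled once at startup):

```cpp
#include <cassert>
#include <cstdint>

static int msbTable[256];   // msbTable[x] = index of the highest set bit of x

void initMsbTable()
{
    msbTable[0] = 0;                  // by convention; 0 has no set bit
    for (int i = 1; i < 256; ++i) {
        int v = i, b = 0;
        while (v >>= 1) ++b;          // floor(log2(i))
        msbTable[i] = b;
    }
}

int msb(uint64_t bits)
{
    int result = 0;
    if (bits & 0xFFFFFFFF00000000ULL) { bits >>= 32; result += 32; }
    if (bits & 0xFFFF0000ULL)         { bits >>= 16; result += 16; }
    if (bits & 0xFF00ULL)             { bits >>= 8;  result += 8;  }
    return result + msbTable[bits];   // last 8 bits via one table probe
}
```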

Russell

##### Share on other sites
I had just changed my GetLSBShift function to use a dichotomic (binary search) method.
Alas, JavaScript is very surprising sometimes... The function seems even slower on average now. I guess the interpreter loses more time parsing the branching structure than actually testing the conditional branches...
Going to look for some C++ compilers tonight, while waiting to be able to get MSDev.

##### Share on other sites
If you're not using a compiled language, forget about speed.

If you're using JS, Java, or any other 'non-compiled' language, be careful with your memory. A bitboard set (with one bitboard per piece type) will actually take up more memory than an 8x8 array of bytes. If you've got to move this data around a lot, you could be wasting many cycles copying data.

Um, just as a reference, I'm getting about 350,000 nodes/sec on average on a PII-500. Sometimes it's up over 400,000, sometimes as low as 320,000. What is a normal benchmark? I know this is hard to tell because of different static evaluations, but there must be a ballpark figure somewhere.

Will

##### Share on other sites
As far as using a non-compiled language goes, you're right: don't worry about speed, because it's going to be slow. As for using a non-compiled language with bitboards... ouch, that's going to be _SLOW_. I think the only way bitboards are able to keep up on 32-bit machines is because of the bsf/bsr x86 instructions.

Regarding nodes per second, don't bother. Nodes per second don't mean anything when you're comparing different engines. Fritz gets about 350,000 nps on my PIII 733 and it's world-class Grandmaster level. I know of other programs that get ridiculously higher nps, but they're only rated around weak master level, many hundreds of ELO points below Fritz and GMs.

A brilliant program written in the slowest interpreted language could perform better than an absolute speed demon written in 100% assembler or highly optimized C. My experience has been that chess programmers put far too heavy an emphasis on how fast their program is. It's all about algorithms and searching smart, not raw speed.

Russell

##### Share on other sites
That's an interesting point. So a good static evaluation can really go a long way. And it's true, another 100,000 nodes/sec is really nothing; not even one extra ply. I just optimized the hell out of something a few days back, and it shows up as a 2-second saving on a search that normally takes 27 seconds. That doesn't translate into anything useful.

Maybe the reason I, and many others, focus so much on execution speed is because our knowledge of chess is somewhat limited.

I can play okay chess (win 50% of my games on InstantChess.com), but not good chess. At best my understanding of the game is superficial, so I focus on what I can: brute force.

Cheers,
Will

##### Share on other sites
Sigloxx,

I'm not too sure about JavaScript, but in Java I don't think there is a 'sign bit' (not in the C/C++ sense, anyway). I think Java may store information about the sign separately from the accessible part of the variable. This prevents wrap-arounds and other binary side effects most programmers don't like dealing with.

I ran into this while writing a Genetic Programming suite in Java. I had a set of opcodes defined, where each opcode was a bit pattern (optimized to increase the effectiveness of random-bit mutation). I kept running into problems with this, and found that Java was managing the 'sign' of my numbers for me! In the end I was forced to use 2-byte values just so I could manage all 8 bits of a single byte. I suspect that JavaScript may have the same 'feature'.

It's been a while since I've used Java, and it is quite possible that I was going about this entirely wrong.

Trying to manage your 'native' variables at a low level may be causing IE to do an enormous amount of work. If I were you, I would forget about using any 'native' binary-type operations. They don't really exist in Java/JS anyway.

Let me know how it turns out.

Will

##### Share on other sites
quote:
Original post by RPGeezus
So a good static evaluation can really go a long way.

A static evaluation function isn't the real key either. Forward pruning is what makes top programs really strong. Don't get me wrong, a static eval is very important, but that isn't what turns a good engine into a great one. You can double your nps and, if you're lucky, you'll sometimes get an extra ply. So that's no big deal. What gets you significant improvements is forward pruning, things like null-move. With null-move you DO get many extra plies, at the cost of a tiny, tiny, tiny reduction in the computer's razor-sharp tactical ability. So now your engine is using a Mach 3 razor instead of a straight-edge razor, but it gets several extra plies.

Null-move is just one of the forward pruning methods (as opposed to backward pruning, which is what alpha-beta does: it prunes AFTER it has searched forward. Forward pruning prunes before it searches, so you get much bigger savings). There are other forward pruning methods, but unfortunately they are used by top commercial programs like Fritz and Chess Tiger, and of course they aren't going to tell us what methods they use.
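For the curious, here is roughly where null-move slots into an alpha-beta search. This is C-style pseudocode with hypothetical helper names, not a compilable implementation; R is the depth reduction, typically 2 or 3.

```cpp
// C-style pseudocode sketch, not a working implementation.
int alphaBeta(Position& pos, int depth, int alpha, int beta)
{
    if (depth <= 0) return quiesce(pos, alpha, beta);

    // Null move: hand the turn to the opponent without moving. If a
    // REDUCED-depth search of that still fails high, our real moves are
    // almost certainly good enough too, so cut off immediately. Skip it
    // when in check (passing would be illegal), and beware that it can
    // misjudge zugzwang positions.
    const int R = 2;
    if (depth > R && !inCheck(pos)) {
        makeNullMove(pos);
        int score = -alphaBeta(pos, depth - 1 - R, -beta, -beta + 1);
        undoNullMove(pos);
        if (score >= beta) return beta;   // forward prune: subtree skipped entirely
    }

    for (Move m : generateMoves(pos)) {   // otherwise, normal alpha-beta
        makeMove(pos, m);
        int score = -alphaBeta(pos, depth - 1, -beta, -alpha);
        undoMove(pos, m);
        if (score >= beta) return beta;   // backward prune (normal cutoff)
        if (score > alpha) alpha = score;
    }
    return alpha;
}
```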

Russell

##### Share on other sites
I should have done a classic chess engine instead of a shogi one, I guess, as I have played chess a lot, with a FIDE rating close to 2200. I actually wanted to try it with a game I don't know well, in the hope that maybe my program would become stronger than me.
About the sign: JavaScript uses classic 32-bit signed integers as its only integer type, with the sign carried by the highest bit. I noticed nothing unusual as far as binary operations are concerned.
The speed is indeed awfully slow in JavaScript... No wonder the few chess JS scripts I found played so badly. But by using bitboards I do go roughly 10 times faster than with a classic 8*8 or 12*12 array representation like I was using at first. The key is that IE uses a LOT of memory to allocate a new object. My bitboards are objects of three integers (I use the first 27 bits of each to represent 9*3 squares). So I define the bitboards for my position once at initialisation, and during the search I never "copy" them; I use the main position data, as global variables, and play and unplay moves on it. But well, I still only reach ridiculous speeds... The main slowing factor is in fact the generation of the search tree (using an array): assigning a value to an element of a JavaScript array is slow.... Once the quiescent search is added, the program should play at my level, though, I hope (which is very weak, hehe).

I came to consult this forum tonight mainly because I noticed something which gave me an interesting idea, which is probably already used in many programs...
I noticed that when I have considered a move in a given position (let's say a move at ply 1) and generated all the possible answers at ply 2, if I then consider a new move at ply 1, the possible answers to that new move are most of the time very similar to the answers to the first move. So, though it may seem a bit tricky to implement, wouldn't it be quite fast, for a given ply, to just generate the answers to a null move, and then work out which of them became invalid and which new ones became possible after replacing the null move with the move you want to study?
In the initial position, for instance, the answers to any white move are exactly the same, so using this would speed up move generation a lot...
I suppose this is already used or has already been tried? It seems even more valid for shogi, where each camp often plays moves within its own camp which have absolutely no impact on the opponent's possible moves. But of course, determining which new moves became valid or invalid because a move was played, compared to the ones that were valid after a null move, isn't easy to code efficiently...
Anyway, this wouldn't help me much while my program is in JS, as it spends more time putting the 30 possible moves into my tree array than actually determining what they are....

David Antonini.

P.S.: I downloaded the freeware Borland C++ compiler. I will probably migrate my program to C++ once it's working, provided I have the time to.

##### Share on other sites
By the way, in JavaScript, not only is assigning a value to an array element slow; getting the value contained in an array element is also very slow...
For instance, I rewrote the getLSBshift function in a better way using dichotomy, and it got slightly faster on average than the previous version. I then tried using an array of 512 elements (for my 9*9-square shogi board) to get the shift on a rank immediately. Replacing my last 4 branches with ONE reference to this array made 500,000 calls to the function take exactly 2 seconds more (7 to 9 seconds for the branch algorithm, 9 to 11 for the one taking the value from a premade array). And of course the 7 to 9 seconds are spent mostly on getting the values contained in my bitboard, which is an object, not on the branches... If I replace the bitboard elements with 3 plain integers, the 500,000 calls don't even take one second (but not using an object or an array at all for the board representation would make the program hell to code, and for the search-tree representation I don't even have the choice).

##### Share on other sites
David,

I'm not sure if this is what you have in mind, but what you describe sounds like a transposition table to me. A transposition table keeps track of positions you have already looked at and saves their scores. So if you do a 12-ply search from move 1, then on the second move, when you reach a position you've already looked at, you get a "free" 12-ply search.

Basically the computer says, "Hey, I've already seen this position and its score was +1.5 (or whatever), so I don't have to do this long search."

For example, from the opening position in chess, the position after 1. Nf3 Nf6 2. Nc3 Nc6 and the position after 1. Nc3 Nc6 2. Nf3 Nf6 are the same, so if we already have the score for this position, there's no point in re-searching it the second time we see it.

If this is what you had in mind I can explain some more of the details of how you would implement it and give you some links. If not, explain a little more of what you had in mind.
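To make the mechanics concrete, here is a minimal C++ sketch of the probe/store logic, with illustrative names. A real entry would also record the best move and whether the score is exact or only an alpha/beta bound, and real engines key the table with Zobrist hashes.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

struct TTEntry { int depth; int score; };

class TransTable {
    std::unordered_map<uint64_t, TTEntry> table;
public:
    // A stored result is reusable only if it was searched at least as
    // deep as we now need -- then the search from here is "free".
    bool probe(uint64_t key, int depth, int& score) const {
        auto it = table.find(key);
        if (it == table.end() || it->second.depth < depth) return false;
        score = it->second.score;
        return true;
    }
    // Depth-preferred replacement: never overwrite a deeper result
    // with a shallower one.
    void store(uint64_t key, int depth, int score) {
        auto it = table.find(key);
        if (it == table.end() || it->second.depth <= depth)
            table[key] = {depth, score};
    }
};
```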

##### Share on other sites
No, I was already aware of transposition tables.
In the initial position, for instance:
At the very start of the search, let's say you want to make a complete min-max 2 plies deep, to get a first rough eval (is that iterative deepening? I don't remember what is what).

You are first going to generate all moves for white. You get a list of 20 moves.
You then play the first move (let''s say a3).
You are going to generate a list of all the possible black answers to a3.
You get 20 possible answers, which you generated by testing the legal moves of each piece. You study all these answers and now you move on to the second possible white move (let's say a4).
Here you don't really need to test each piece to see what moves it can make. You can just take the list of all possible moves in answer to a3, and in this case the answers to a4 will be exactly the same. You won't transpose into the same positions, but you don't need to consider each piece in order to know what moves it can make.

Later on, in the middlegame, this of course won't be the case (though in shogi it stays true for a long time, as there are few long-range pieces and pawn tension often appears late). But still, in answer to each move, the list of possible moves is pretty much the same. Some moves become impossible because the previous move removed the piece or closed some lines; other moves become possible because the piece that moved was blocking their lines and no longer is. But still (especially if you don't consider moves that leave the king en prise as illegal), the lists of black moves in answer to two different white moves will usually be very similar.

So the idea was, in any position where white is to play, to start by considering all black moves (by playing a null move, which you usually do anyway if you use the null-move heuristic). Then, when you consider a white move that you want to study deeper, you somehow determine which black pieces' move possibilities were affected by that white move. For instance, you have a bitboard with all black rook moves in answer to the null move; if the white move doesn't put a piece on one of the squares where that black rook could go, there is no need to check again where this rook can go, and you can just reuse the same move bitboard. Of course, the problem is more about determining easily what new moves become available than what moves got forbidden... but a priori this seems doable and might save time.
Is there anything like this implemented already in some programs?

##### Share on other sites
OK, I understand what you are trying to do. You're trying to do incremental move generation. People have tried it before, but never with great success. In fact, I've never heard of anyone who even got it to work successfully.

The problem is that you have to do a lot of extra work to determine which moves can stay the same and which moves you need to re-generate. The work you do to generate a legal move is really not that much, so if you spend much time at all testing which moves need re-generating, you probably aren't going to see any great gains here.

I think bitboards would make it easier, but still, by the time you do a handful of tests to see which moves you need to re-generate, and then re-generate a few of the moves, you will probably end up with little or no gain.

Also, keep in mind that move generation really isn't what makes a good chess program. If you took any chess program and made its move generation routines twice as fast, that would only make the program run maybe 10-20% faster overall (because most of the time is spent in evaluation and other stuff). Chances are you aren't going to get your move generation to be twice as fast; I'd estimate you'd be doing well to make it 10% faster, and that would only mean a few percent faster overall.

There are plenty of other things you can do to get faster before you try something like this. Think about it; this may be a total waste of your time. Implement other things that you know will help your program, such as writing it in a compiled language, maybe upgrading from alpha-beta to a PVS search, adding null-move and a transposition table (if you haven't), adding more knowledge to your program's static eval, etc. Those things will do more to help you out. Do what you KNOW will work first, then experiment when you run out of ideas that you know will work for sure. That's my advice, anyway.

Russell

##### Share on other sites
Thanks, Russell. I'll have to think about it more for shogi, though; the fact that there are really few long-range pieces might make it more interesting than in classic chess. But I'll follow your advice and first finish my JavaScript version, then try it in C++.
By the way, do you know any place on the net where I could find examples of how SEE quiescent search works? (For classic chess, that is; I can surely adapt it to shogi.)

David Antonini