Hashing connect -4 board (for Transposition table)..need some help :)

Started by
81 comments, last by Tom Sloper 8 years, 3 months ago

Hi all. I am sorry for the post flood i am doing here lately smile.png today's topic: Transposition table for my connect 4 game.
I am programming in java and when looking for a scheme to hash a board (array of 42 ints: 1 for white, 2 for black and 0 for empty) i first tried

to put it in a long number.. but java won't let me to have a long number with 42 digits (even the LONG type). It says that it is: out of range (??)

sooo.. I am now doing it another way: i am "casting" this 42 int array into a string instead of a number, that means everytime i want to hash a board, i pass through all the array indexes and if the current index in the array is 1 , i concatenate "1" to the string...and so on until i finish.
In the end i get something like this: "12200011"... etc'. and after i done with this operation ,i take this string and using the java Hash() function, i am giving it a hash number. this hash number is the key code for the board in my TT.
what do you think about this scheme? I read about zobrist keys but i just couldn't understand how exactly i should parse a board of 42 ints (1,2 or 0) into a number of 42 digits / charachters / bits... whatever.
I wish i could just do 12200001 etc', than it would be simple and unique, but i cant

I know that working with strings like i do takes a lot of time, and i am also not sure if the hahing is good / unique. I will be very very glad if any of you can help me thourgh it (cause i am a total noob smile.png ) and guide me on what is the best way to hash a board position.

Thanks in advance!

Advertisement

"java won't let me to have a long number with 42 digits (even the LONG type). "

A long is at most 8 bytes -- that's 64 bits. That means you can have 64 zeros or ones. However, you've stated that you need 0, 1 and 2. So we need a base 3 system. The problem with this is you'd need to take up 2 bits a piece to store 0, 1, or 2. So now you can only cram, at most, 32 bits in to a long type. The long just doesn't have enough memory to store the entire board.

There are many different solutions: For example: you could use a multidimensional array to simply store bytes. byte[][] ?

If you want a rather 'tricksy' solution, you could do what I once did in school:

Create a table that holds Player 1's pieces (0 = no piece, and 1 is a piece), and create another table for Player 2's pieces (0 = no piece, and 1 is a piece). Since we're now back to using base 2 system, we can hold all of player 1's pieces in a long, and all of player 2's pieces in a long.

THEN if we want to see if we have a valid move, we can go ahead and make the move on the respective player's long and simply do a logical AND compare - if you have two values overlapping the logical AND will return true (at which point, you need to undo the move so that the board is in a valid state).

If you just want to see where the next available space is on a board, then you will logically OR/XOR the two values and anywhere there is a zero is an empty space. I'll leave the finer shifting details to you (or if you make an effort I can lend a hand).

If you want to see if a player has won after their move, if you are feeling super duper excited, you could probably create a dictionary of winning states for the board and logically AND your board against these states (you could systematically remove boards that are definitely not going to happen in a game as well, so that as the game goes on, the number of boards checked for victory goes down).

As I said, there are lots of ways to do what you're asking - this is another one :D

Additionally, if you are not too comfortable with performing bitwise operations -- look in to Java's BitSet.

AfroFire | Brin"The only thing that interferes with my learning is my education."-Albert Einstein

thanks you for the answer, and although the bits operation stuff is a bit vague to me right now, i am trying to unserstand the first part of your solution.

So basically i need to do two bytes arrays (or one two dimentional) in length of 42 (for all the board squares) and each one of them will represent the pieces of one of the two players.(0 for no piece and 1 for a piece)

so for example, the byte array for the first player can look like this? 000000000000000000000000000000000000001000 here the first player just played his first move in the middle square at the buttom of the board. so now every board representation ,instead of one array of 42 int's (0,1 or 2) will be those two Bytes arrays?

and sorry for my ignorance, but even if i have those two bytes array for the players' pieces, how can i transpose each one of them to a Long type number?
and after i do that, how should i hash those two long numbers into one hash initeger so i can store in my TT? by the way,thanks for the java's Bitset tip, i think i will be using that :)



Actually you can hash a 7x6 Connect-4 board fairly simply in 63 bits, since you know that the empty holes must always appear above the pieces:

7*6 = 42 bits for the piece colours (0 = white or empty, 1 = black)

7*3 = 21 bits for the height of the columns (a three bit number containing a height of 0-6, 7 is invalid)

John Tromp taught me of a slight variation of what hovermonkey suggests: Reserve an extra bit per column and encode a position by marking the first empty spot in each column and each white disk. That will give you a 49-bit unique encoding of the position.

EDIT: Here's what the code would look like (in C++, because I don't know Java):


/*
Square numbering:
 
 6  13  20  27  34  41  48  <-- extra row to mark the first empty spot, even when the column is full
 5  12  19  26  33  40  47
 4  11  18  25  32  39  46
 3  10  17  24  31  38  45
 2   9  16  23  30  37  44
 1   8  15  22  29  36  43
 0   7  14  21  28  35  42
*/
 
typedef unsigned long long u64;

u64 const BOTTOM_ROW = (1ull<<0) | (1ull<<7) | (1ull<<14) | (1ull<<21) | (1ull<<28) | (1ull<<35) | (1ull<<42);

struct Position {
  u64 white;
  u64 black;
  int move_number;
};

u64 encode(Position const &p) {
  u64 occupied = p.white | p.black;
  return ((occupied << 1) | BOTTOM_ROW) ^ p.black;
}

FURTHER EDIT: Sorry I had to fix my sloppy code.

thanks for the great advices. I also read the john tromp's code but except for the first couple of lines i couldn't understand exactly what's going on.
let's say i want to implement what hovermonkey suggest.. how do i do that? what do you mean by 0 = white or empty? how can i tell the difference than?
and when you say 3 bit number containing a height, how does it look like?

again,I am very sorry for my ignorance but i am rather new java programmer and programmed only in OOP fashion up until now, i have no idea about bits and bits manipulation so i am getting a hard time understand this stuff. but i (kinda) got the basic idea , it's mostly the technical part that holds me back.

I think hovermonkey and alvaro are suggesting the same solution.

First: recognize that one byte is 8 bits. Each bit is a switch that is either on or off.


/*
Each COLUMN is a BYTE (8 bits). The last/first (you decide) 3 bits tell you where the empty spot is (this is the parenthesis number). Since you have 3 bits, you can count from 0-7. You need 0-4 to mark where the empty spot is in the rows below. Then you can use whatever information is left to tell you what colored piece is left in the last row (once you've filled up rows 0-4).
 
c5: (0)  (0)  (0)  (0)  (0)  (0)  (0)  <-- extra row to mark the first empty spot. 
c4:  0    0    0    0    0    0    0
c3:  0    0    0    0    0    0    0
c2:  0    0    0    0    0    0    0
c1:  0    0    0    0    0    0    0
c0:  0    0    0    0    0    0    0

Zero can be white or nothing because of the number in the top. If you KNOW where the last available spot is, then you know that any zeros below that point are white pieces and you know that anything above that is empty. 

For example, imagine this board state:

c5: (3)  (4)  (2)  (0)  (0)  (0)  (0)  <-- extra row to mark the first empty spot. 
c4:  0    0    0    0    0    0    0
c3:  0    0    0    0    0    0    0
c2:  1    0    0    0    0    0    0
c1:  1    0    1    0    0    0    0
c0:  0    0    1    0    0    0    0

This is a win for the white player - Can you see why? Look at the 2nd column (the one with (4) at the top.) The 4 tells us that the free spot on this column is at c4, which means c0-c3 are all white pieces since in this case 0 is white. 1 is black.

Is this clear?

*/

AfroFire | Brin"The only thing that interferes with my learning is my education."-Albert Einstein


I think hovermonkey and alvaro are suggesting the same solution.

hovermonkey's suggestion uses 63 bits, while mine uses 49 bits only, so they can't possible be the same.

What I suggested (again, John Tromp's idea, not mine) uses n_rows+1 bits per column. In each column the highest-set bit tells you how many pieces are in that column. The lower bits indicate whether those spaces are occupied by a white (1) or black (0) piece.

In hovermonkey's idea, the number of pieces in a column is encoded by a 3-bit number.

thanks for the further explanations, i think i am starting to get it, but i will read it a couple more times before i'll start to write some code.
and one more thing, let's say i am working with bit board and i want to hash a board position, so now instead of hashing it, the key will be the bitboard ,right?

thanks for the further explanations, i think i am starting to get it, but i will read it a couple more times before i'll start to write some code.
and one more thing, let's say i am working with bit board and i want to hash a board position, so now instead of hashing it, the key will be the bitboard ,right?

No, the bitboard-based representation most likely will consist of two bitboards, one per player, as I showed in the piece of code above (I would probably use an array of 2 bitboards instead of calling them `white' and `black'). The function to hash it could then be the one I posted called `encode'.

This topic is closed to new replies.

Advertisement