Started by Nov 04 2010 10:44 PM

7 replies to this topic

Posted 04 November 2010 - 10:44 PM

When coding negamax from the usual pseudocode, what should the initial call to the negamax function look like in a typical 'get_move' function?

is it

eval=-negamax(...);

or

eval=negamax(...);

My evaluation function returns -1000 for a win.

Now, when coding intermediate positional evaluations, I cannot use values less than -1000, and any intermediate value is greater than it (e.g. -950 > -1000). So the computer will prefer the -950 position over the -1000 one, which is the weaker move.

Also, when the sides switch, the computer will go for 1000 rather than 950, and so on.

So in this case only one of the two players will play the better moves, while the other will not play the best moves.

Can anyone show me the initial call to negamax? (I have read somewhere that it is written wrongly on Wikipedia.)

Also, some code showing a sample evaluation function and the values it returns would help.

Here is the reference that I read from:

http://www.fierz.ch/strategy1.htm#negamax

Posted 05 November 2010 - 01:05 AM

Quote:

Original post by mandar9589

When coding negamax from the usual pseudocode, what should the initial call to the negamax function look like in a typical 'get_move' function?

is it

eval=-negamax(...);

or

eval=negamax(...);

It is

eval=-negamax(...);

Quote:

My evaluation function returns -1000 for a win.

If what you mean is that your evaluation function returns -1000 when invoked on a position where the player to move has already lost, this is correct.

The code that searches from the current position on the board looks similar to negamax itself, but there are enough differences (iterative deepening, output, returning a move, time control...) that you should keep it as a separate function.

Posted 05 November 2010 - 01:25 AM

Well, I cannot agree. The initial call on the root node is simply negamax(...). Most problems with negamax notation arise because users don't understand the main idea behind it; they simply copy it into their code. I will try to explain it as clearly as possible:

1. In your search function you don't care who is really on move; it is always MAX to move.

2. In your evaluation function you do take care of who is on move: you return the score from the perspective of the side to move. If MAX is to move it is MAX_SCORE-MIN_SCORE; if MIN is to move it is MIN_SCORE-MAX_SCORE, where MAX_SCORE>0 and MIN_SCORE>0. The returned value is therefore always positive when the side to move is winning and negative when it is losing, no matter who is to move.

3. Because the evaluation is made on a successor, it is done from the opponent's perspective. We are interested in the best score from my perspective, which is minus the score of the successor (the opponent).

4. Win, loss and draw are handled within the search function (from the perspective of the MAX player).
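
The four points above can be sketched on a toy game. This is only a minimal illustration, not code from any engine: it assumes a simple subtraction game (players alternately take 1 to 3 stones, and whoever takes the last stone wins), and the names NimNegamax, WIN, negamax and evaluate are hypothetical.

```java
/** Negamax on a toy subtraction game (an assumption made for illustration). */
final class NimNegamax {
    /** Hypothetical magnitude of a decided game. */
    static final int WIN = 1000;

    /** Returns the score of the position from the point of view of the side to move. */
    static int negamax(int stones, int depth) {
        if (stones == 0) {
            // Point 4: the previous player took the last stone, so the side to move has lost.
            return -WIN;
        }
        if (depth == 0) {
            // Point 2: the static evaluation is also relative to the side to move.
            return evaluate(stones);
        }
        int best = -WIN; // no score can be lower than a loss
        for (int take = 1; take <= Math.min(3, stones); take++) {
            // Point 3: the child score is from the opponent's view, so negate it.
            int score = -negamax(stones - take, depth - 1);
            // Point 1: whoever is on move, the search always maximizes.
            best = Math.max(best, score);
        }
        return best;
    }

    /** Placeholder heuristic: 0 means "unknown" for the side to move. */
    static int evaluate(int stones) {
        return 0;
    }
}
```

With enough depth, a position with 4 stones scores -WIN for the side to move and a position with 5 stones scores +WIN, because taking one stone leaves the opponent with the lost 4-stone position.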

Posted 05 November 2010 - 02:35 AM

thanks for replying.

Actually, when I implemented this code in my Connect 4 game, I could only verify whether the game is lost, won, or still in progress.

eval=-1000; is the game-ending condition (where the game is strictly won by someone; the draw condition is excluded).

However, when I try to make the evaluation more sophisticated by adding three in a row, giving intermediate values (between 0 and -1000), the computer player plays weaker.

I played a match of 20 games between two computer programs differing only in their evaluations. Surprisingly, the program with only winning conditions beat the sophisticated evaluation (10-0 with 10 draws); the draws happened when the simple evaluation played as second player.

@BitSet:

Perhaps code will make my implementation simpler to understand.

Evaluation class:

public static int evaluate(char[][] posn, boolean white) {
    int eval = 0;
    // '1' = empty

    // Winning: check for a horizontal win
    for (row = 0; row < rows; row++) {
        for (column = 0; column < cols - 3; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row][column + 1]
                    && posn[row][column] == posn[row][column + 2]
                    && posn[row][column] == posn[row][column + 3]) {
                eval = fscore;
                return eval;
            }
        }
    }

    // check for a vertical win
    for (row = 0; row < rows - 3; row++) {
        for (column = 0; column < cols; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row + 1][column]
                    && posn[row][column] == posn[row + 2][column]
                    && posn[row][column] == posn[row + 3][column]) {
                eval = fscore;
                return eval;
            }
        }
    }

    // check for a diagonal win (positive slope)
    for (row = 0; row < rows - 3; row++) {
        for (column = 0; column < cols - 3; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row + 1][column + 1]
                    && posn[row][column] == posn[row + 2][column + 2]
                    && posn[row][column] == posn[row + 3][column + 3]) {
                eval = fscore;
                return eval;
            }
        }
    }

    // check for a diagonal win (negative slope)
    for (row = 3; row < rows; row++) {
        for (column = 0; column < cols - 3; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row - 1][column + 1]
                    && posn[row][column] == posn[row - 2][column + 2]
                    && posn[row][column] == posn[row - 3][column + 3]) {
                eval = fscore;
                return eval;
            }
        }
    }

    // 3 in a row: horizontal
    for (row = 0; row < rows; row++) {
        for (column = 0; column < cols - 2; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row][column + 1]
                    && posn[row][column] == posn[row][column + 2]) {
                eval = eval + _3inrow;
            }
        }
    }

    // 3 in a row: vertical
    for (row = 0; row < rows - 2; row++) {
        for (column = 0; column < cols; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row + 1][column]
                    && posn[row][column] == posn[row + 2][column]) {
                eval = eval + _3inrow;
            }
        }
    }

    // 3 in a row: diagonal (positive slope)
    for (row = 0; row < rows - 2; row++) {
        for (column = 0; column < cols - 2; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row + 1][column + 1]
                    && posn[row][column] == posn[row + 2][column + 2]) {
                eval = eval + _3inrow;
            }
        }
    }

    // 3 in a row: diagonal (negative slope)
    for (row = 2; row < rows; row++) {
        for (column = 0; column < cols - 2; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row - 1][column + 1]
                    && posn[row][column] == posn[row - 2][column + 2]) {
                eval = eval + _3inrow;
            }
        }
    }

    return eval;
}

So I made changes here,

public static int evaluate(char[][] posn, boolean white) {
    char color = white ? 'x' : 'o';
    int eval = 0;

    // Winning: check for a horizontal win
    for (row = 0; row < rows; row++) {
        for (column = 0; column < cols - 3; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row][column + 1]
                    && posn[row][column] == posn[row][column + 2]
                    && posn[row][column] == posn[row][column + 3]) {
                eval = fscore;
                if (color == 'x') {
                    return eval;
                }
                if (color == 'o') {
                    return -eval;
                }
            }
        }
    }

    // check for a vertical win
    for (row = 0; row < rows - 3; row++) {
        for (column = 0; column < cols; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row + 1][column]
                    && posn[row][column] == posn[row + 2][column]
                    && posn[row][column] == posn[row + 3][column]) {
                eval = fscore;
                if (color == 'x') {
                    return eval;
                }
                if (color == 'o') {
                    return -eval;
                }
            }
        }
    }

    // check for a diagonal win (positive slope)
    for (row = 0; row < rows - 3; row++) {
        for (column = 0; column < cols - 3; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row + 1][column + 1]
                    && posn[row][column] == posn[row + 2][column + 2]
                    && posn[row][column] == posn[row + 3][column + 3]) {
                eval = fscore;
                if (color == 'x') {
                    return eval;
                }
                if (color == 'o') {
                    return -eval;
                }
            }
        }
    }

    // check for a diagonal win (negative slope)
    for (row = 3; row < rows; row++) {
        for (column = 0; column < cols - 3; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row - 1][column + 1]
                    && posn[row][column] == posn[row - 2][column + 2]
                    && posn[row][column] == posn[row - 3][column + 3]) {
                eval = fscore;
                if (color == 'x') {
                    return eval;
                }
                if (color == 'o') {
                    return -eval;
                }
            }
        }
    }

    // 3 in a row: horizontal
    for (row = 0; row < rows; row++) {
        for (column = 0; column < cols - 2; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row][column + 1]
                    && posn[row][column] == posn[row][column + 2]) {
                eval = eval + _3inrow;
            }
        }
    }

    // 3 in a row: vertical
    for (row = 0; row < rows - 2; row++) {
        for (column = 0; column < cols; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row + 1][column]
                    && posn[row][column] == posn[row + 2][column]) {
                eval = eval + _3inrow;
            }
        }
    }

    // 3 in a row: diagonal (positive slope)
    for (row = 0; row < rows - 2; row++) {
        for (column = 0; column < cols - 2; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row + 1][column + 1]
                    && posn[row][column] == posn[row + 2][column + 2]) {
                eval = eval + _3inrow;
            }
        }
    }

    // 3 in a row: diagonal (negative slope)
    for (row = 2; row < rows; row++) {
        for (column = 0; column < cols - 2; column++) {
            if (posn[row][column] != '1'
                    && posn[row][column] == posn[row - 1][column + 1]
                    && posn[row][column] == posn[row - 2][column + 2]) {
                eval = eval + _3inrow;
            }
        }
    }

    //....

    if (color == 'o') {
        return -eval;
    }
    return eval;
}

So do your suggestion and my code match?

P.S.: Since I am only posting part of the methods, I removed all debugging statements to reduce the size of the post.

Posted 05 November 2010 - 02:40 AM

Quote:

Original post by BitSet

Well I cannot agree.

What is it you don't agree with?

Posted 05 November 2010 - 03:41 AM

Quote:

Original post by alvaro

Quote:

Original post by BitSet

Well I cannot agree.

What is it you don't agree with?

I don't agree that there is a minus in front of the negamax call at the root node, for example in an iterative deepening (ID) framework.

I'm talking about this (from TSCP 1.81):

for (i = 1; i <= max_depth; ++i) {
    follow_pv = TRUE;
    x = search(-10000, 10000, i);
    if (output == 1)
        printf("%3d %9d %5d ", i, nodes, x);
    else if (output == 2)
        printf("%d %d %d %d",
                i, x, (get_ms() - start_time) / 10, nodes);
    if (output) {
        for (j = 0; j < pv_length[0]; ++j)
            printf(" %s", move_str(pv[0][j].b));
        printf("\n");
        fflush(stdout);
    }
    if (x > 9000 || x < -9000)
        break;
}

The question was "Do we put a minus in front of the search call?" and the answer is that we don't. We do it here:

/* loop through the moves */
for (i = first_move[ply]; i < first_move[ply + 1]; ++i) {
    sort(i);
    if (!makemove(gen_dat[i].m.b))
        continue;
    f = TRUE;
    x = -search(-beta, -alpha, depth - 1);
    takeback();
    if (x > alpha) {
        /* this move caused a cutoff, so increase the history
           value so it gets ordered high next time we can
           search it */
        history[(int)gen_dat[i].m.b.from][(int)gen_dat[i].m.b.to] += depth;
        if (x >= beta)
            return beta;
        alpha = x;

        /* update the PV */
        pv[ply][ply] = gen_dat[i].m;
        for (j = ply + 1; j < pv_length[ply + 1]; ++j)
            pv[ply][j] = pv[ply + 1][j];
        pv_length[ply] = pv_length[ply + 1];
    }
}

This is what I explained in point 3 of my earlier post.

Quote:

So do your suggestion and my code match?

I don't think so. In the evaluation we count points for the MAX player (MAX_SCORE) and for the MIN player (MIN_SCORE). Depending on the side to move we return MAX_SCORE-MIN_SCORE (MAX to move) or MIN_SCORE-MAX_SCORE (MIN to move), where MIN_SCORE>0 and MAX_SCORE>0.

If you are still confused, download the source code of TSCP 1.81 and have a look at how it is done.
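
For Connect 4, a side-relative evaluation along these lines might look like the sketch below. It assumes the board encoding from the code posted earlier ('x', 'o', and '1' for empty, with white playing 'x') and a rectangular board. The class RelativeEval, the helper countThrees and the weight THREE_IN_ROW are made up for illustration, and only three-in-a-row patterns are scored.

```java
/** Sketch of an evaluation that scores each side separately, then takes the difference. */
final class RelativeEval {
    /** Hypothetical weight for one three-in-a-row pattern. */
    static final int THREE_IN_ROW = 10;

    /** Horizontal, vertical, and the two diagonal directions as (rowStep, colStep). */
    private static final int[][] DIRS = {{0, 1}, {1, 0}, {1, 1}, {-1, 1}};

    /** Counts the three-in-a-row patterns that belong to one color only. */
    static int countThrees(char[][] board, char color) {
        int count = 0;
        for (int r = 0; r < board.length; r++) {
            for (int c = 0; c < board[r].length; c++) {
                for (int[] d : DIRS) {
                    int r2 = r + 2 * d[0];
                    int c2 = c + 2 * d[1];
                    if (r2 < 0 || r2 >= board.length || c2 >= board[r].length) {
                        continue; // the pattern would leave the board
                    }
                    if (board[r][c] == color
                            && board[r + d[0]][c + d[1]] == color
                            && board[r2][c2] == color) {
                        count++;
                    }
                }
            }
        }
        return count;
    }

    /** Returns own score minus opponent score, from the view of the side to move. */
    static int evaluate(char[][] board, boolean whiteToMove) {
        int xScore = THREE_IN_ROW * countThrees(board, 'x');
        int oScore = THREE_IN_ROW * countThrees(board, 'o');
        return whiteToMove ? xScore - oScore : oScore - xScore;
    }
}
```

The difference from the posted code is that patterns are attributed to their owner before the sign is chosen, so a three-in-a-row for the opponent lowers the score instead of raising it.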

Posted 05 November 2010 - 06:57 AM

Quote:

Original post by BitSet

I don't agree that there is a minus in front of the negamax call at the root node, for example in an iterative deepening (ID) framework.

I'm talking about this (from TSCP 1.81):

*** Source Snippet Removed ***

Every alpha-beta search I've ever written had the loop over moves in the get_move function. I think TSCP is a bit of a toy and doesn't do some of the things you would normally want to do. For instance, whenever the program changes its mind and discovers a new move, I like it to print out the new principal variation (and all decent chess programs do this). I also want to keep the order of the moves from iteration to iteration as the depth increases.

So yes, the answer depends on whether you make a single call to negamax for the root (no minus sign) or if the function that searches the root has a loop over moves (requires minus sign).

EDIT: If you look at Crafty, the function called Iterate (which does the iterative deepening) calls SearchRoot without changing the sign, similarly to what TSCP does. SearchRoot then has a loop over moves and calls Search (the recursive function) for each one of them, with a minus sign. As long as you understand the logic, you can do it however you want.
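
The two root conventions can be put side by side in a small sketch. It is not taken from TSCP or Crafty: it assumes a toy subtraction game (take 1 to 3 stones, whoever takes the last stone wins), searches to full depth for brevity, and the names RootSearch, negamax and bestMove are hypothetical.

```java
/** Root handling sketch: a single root call versus a loop over root moves. */
final class RootSearch {
    /** Hypothetical magnitude of a decided game. */
    static final int WIN = 1000;

    /** Recursive search: score from the view of the side to move. */
    static int negamax(int stones) {
        if (stones == 0) {
            return -WIN; // the side to move has already lost
        }
        int best = -WIN;
        for (int take = 1; take <= Math.min(3, stones); take++) {
            best = Math.max(best, -negamax(stones - take));
        }
        return best;
    }

    /**
     * Root with its own loop over moves, so it can return a move.
     * Each call is made on a child position, hence the minus sign.
     */
    static int bestMove(int stones) {
        int bestTake = -1;
        int bestScore = Integer.MIN_VALUE;
        for (int take = 1; take <= Math.min(3, stones); take++) {
            int score = -negamax(stones - take);
            if (score > bestScore) {
                bestScore = score;
                bestTake = take;
            }
        }
        return bestTake;
    }
}
```

Calling negamax(7) directly on the root needs no minus sign and yields +WIN for the side to move, while bestMove(7) negates each child score inside its loop and picks the move that leaves 4 stones. Both follow the same rule: negate only when the score comes from a position where the opponent is to move.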

Posted 05 November 2010 - 11:08 AM

TSCP is quite good, but it cannot be compared to Crafty because Crafty is much more sophisticated. TSCP prints the PV on every iteration of the ID loop. Some engines, like Crafty, use a separate search function for the root node.