Help: Evaluation Function (Connect 4)

Let's start at the very beginning. An agent to play a game like this will generally follow one of three strategies:
(1) A list of conditionals, like "if you can win, do it; if you can block the opponent from winning, do it; if you can create a double threat for victory, do it", etc.
(2) Minimax search (with alpha-beta pruning and perhaps other improvements, like move-ordering heuristics, transposition tables and iterative deepening).
(3) Monte Carlo Tree Search (again with some improvements, like RAVE and so-called heavy playouts).

I don't recommend (1) if you want a strong agent. (2) is the standard approach for this game and it's the basis of every chess program out there, so there is a lot of good information available about it. (3) is a more recent development, which has been extremely successful in computer go, and can be applied to a class of games much larger than minimax (like games with randomness, games with simultaneous decisions and games with more than two players).

I recommend attempting (2). You can use a random number between -1 and 1 as a first attempt at an evaluation function. If the search is good enough, that's already a very tough player to beat. Once you get to that point, it would make sense to ask the question you are asking, and then you can re-read the answers above.
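For illustration, a first-attempt evaluation like that can be a one-liner (plain C; the function name is just illustrative):

#include <stdlib.h>

/* Placeholder evaluation: a uniform random score in [-1, 1]. */
double random_evaluation(void)
{
    return 2.0 * ((double) rand() / RAND_MAX) - 1.0;
}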

Here's some pseudocode for a version of minimax called NegaMax, which is how I think you should start. In NegaMax, the convention is that the score returned is always from the point of view of the player that is about to move.

double NegaMax(Position &position, int depth_left) {
  if (position.game_over())
    return position.final_score(); // final_score from the point of view of the player to move, of course.
  
  if (depth_left <= 0 && position.no_immediate_threat())
    return static_evaluation(position); // static_evaluation also follows the NegaMax sign convention.
  
  Move moves[7];
  int n_moves = position.generate_moves(moves);
  
  double best_score = -Infinity;
  for (int i = 0; i < n_moves; ++i) {
    position.make_move(moves[i]);
    double score = -NegaMax(position, depth_left - 1); // Notice the minus sign!
    position.undo_move(moves[i]);
    if (score > best_score)
      best_score = score;
  }  
  
  return best_score;
}
You should have a separate function (search_root would be a good name) that has a similar loop over available moves, but this one should return a move, not a score. It can also print some information about the results of the search, implement iterative deepening, etc.
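A minimal sketch of such a root search, assuming the same hypothetical Position/Move interface as the NegaMax pseudocode above (the name search_root and the fixed depth parameter are just illustrative):

Move search_root(Position &position, int depth) {
  Move moves[7];
  int n_moves = position.generate_moves(moves);

  Move best_move = moves[0];
  double best_score = -Infinity;
  for (int i = 0; i < n_moves; ++i) {
    position.make_move(moves[i]);
    double score = -NegaMax(position, depth - 1); // Same sign flip as inside NegaMax.
    position.undo_move(moves[i]);
    if (score > best_score) {
      best_score = score;
      best_move = moves[i]; // Remember the move itself, not just the score.
    }
  }
  return best_move;
}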

See if you can get that working.

At this point I need to see some serious effort on your part before I waste any more time on this.

Thanks Alvaro...I'm working on it right now.

I was trying to develop this and I got:


double evaluate(int player, int level, double alpha, double beta)
{
    int i;
    double goodness, best, maxab;
    
    if (poll_function && next_poll <= clock()) {
        next_poll += poll_interval;
        (*poll_function)();
    }

    if (level == depth)
        return goodness_of(player);
    else {
        /* Assume it is the other player's turn. */
        best = -(INT_MAX);
        maxab = alpha;
        for(i=0; i<size_x; i++) {
            if (current_state->board[drop_order[i]][size_y-1] != C4_NONE)
                continue; /* The column is full. */
            push_state();
            drop_piece(other(player), drop_order[i]);
            if (current_state->winner == other(player))
                goodness = -(goodness_of(player)) + (level - depth) * 0.01;
            else
                goodness = evaluate(other(player), level, -beta, -maxab);
            if (goodness > best) {
                best = goodness;
                if (best > maxab)
                    maxab = best;
            }
            pop_state();
        }

        /* What's good for the other player is bad for this one. */
        return -best;
    }
}

Is it something like that?

That code has some problems, but I really think you should start with NegaMax with no alpha-beta pruning, just because it's the easiest version of minimax to get right.

Also, I would not worry about polling for the time being.

Alright... I'll change it.

I did a function using alpha-beta, and my only problem now is my evaluation function... There are 3 agents (evaluation functions that return a double between -1 and 1):


double agent_random(Game_state *current_state, int player, int x, int y)
{
   return (double) rand() / (RAND_MAX / 2) - 1;
}

double agent_simple(Game_state *current_state, int player, int x, int y)
{
   double score = 0;
   int is_game_won = game_won(current_state);
   
   if (is_game_won == player)
      score = 1;
   else if (is_game_won == other(player))
      score = -1;   
   else 
   {
      score += 0.5 * agent_random(current_state, player, x, y);
      /* score += 0.3 * super_cool_heuristic_a(current_state, player, x, y) */ 
      /* score += 0.8 * super_cool_heuristic_b(current_state, player, x, y) */
   }
  
   return score;   
}

But my main agent is not defeating either of them... I put random to play against mine and it's so-so; I put simple against my main and it's not good... I know some strategies to win but I cannot put them in the code... I just did this:


double agent_s3450124(Game_state *current_state, int player, int x, int y){
   double score = 0;
   int is_game_won = game_won(current_state);

   if (is_game_won == player)
      score = 1;
   else if (is_game_won == other(player))
      score = -1;
   else if ((y == 2) || (y == 3) || (y == 4))
   {
      score = 0.7;
   }
   else
   {
      score += 0.8 * ((double) rand() / (RAND_MAX / 2) - 1);
   }

   return score;
}
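For what it's worth, here is one hypothetical shape the commented-out super_cool_heuristic slots in agent_simple could take: a center-control count scaled into [-1, 1]. It assumes the board, width and height fields used elsewhere in this thread, and that occupied board cells hold the player's number; both are guesses about this framework, so adjust to taste:

double heuristic_center_control(Game_state *current_state, int player, int x, int y)
{
   /* x and y (presumably the last move) are unused here. Count the
      player's pieces in the three middle columns, minus the opponent's
      pieces there, and scale the balance into [-1, 1]. */
   int width = current_state->width;
   int height = current_state->height;
   int mid = width / 2;
   int balance = 0;
   int col, row;

   for (col = mid - 1; col <= mid + 1; col++)
      for (row = 0; row < height; row++) {
         if (current_state->board[col][row] == player)
            balance++;
         else if (current_state->board[col][row] == other(player))
            balance--;
      }

   /* The window holds 3 * height cells, so the result stays in [-1, 1]. */
   return (double) balance / (3 * height);
}

It would be mixed in the same way as the commented-out lines, e.g. score += 0.3 * heuristic_center_control(current_state, player, x, y);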
There is no need to handle victory conditions in an evaluation function: The search should handle that.

What kinds of things can't you put into code?
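To make that concrete: in the NegaMax pseudocode earlier in the thread, game_over() is checked before the evaluation function is ever reached, so the evaluation only sees non-terminal positions. A sketch of agent_simple trimmed accordingly (same hypothetical weights as before):

double agent_simple(Game_state *current_state, int player, int x, int y)
{
   /* No game_won() check: the search returns the final score at
      terminal positions before this function is ever called. */
   double score = 0;
   score += 0.5 * agent_random(current_state, player, x, y);
   /* score += 0.3 * super_cool_heuristic_a(current_state, player, x, y); */
   /* score += 0.8 * super_cool_heuristic_b(current_state, player, x, y); */
   return score;
}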

I have a lot of files... It's a bit hard to explain how all of them work. My job is just to create a better agent than those... The other agents are working, but mine cannot beat either of them... I need to create a stronger one; that's my job...

[Screenshot of the game]

That's the game. We choose the "players", and mine is s3450124... so it's the function that needs to return a value between -1 and 1. I've posted my current code for my agent; sometimes it doesn't block the simple agent... for example, when I drop a piece in the center, my opponent drops just above me, and I drop to the left or right of my first drop... some strategies like that I couldn't get to work.

If you want to see all the files, I can show you for a better understanding...

thanks!


double test(Game_state *current_state, int player, int x, int y){
   int rightmostHeight = 0;
   int width = current_state->width;
   int height = current_state->height;

   /* Loop until we reach the top of the board or an empty square */
   while (rightmostHeight < height &&
          (current_state->board[width - 4][rightmostHeight] != C4_NONE))
   {
      rightmostHeight++;
   }
   if ((current_state->board[width - 4][rightmostHeight] == C4_NONE)) {
      while (rightmostHeight < height &&
             (current_state->board[width - 5][rightmostHeight] != C4_NONE))
      {
         rightmostHeight++;
      }
   } else if ((current_state->board[width - 5][rightmostHeight] == C4_NONE)) {
      while (rightmostHeight < height &&
             (current_state->board[width - 3][rightmostHeight] != C4_NONE))
      {
         rightmostHeight++;
      }
   } else {
      rightmostHeight += ((double) rand() / (RAND_MAX / 2) - 1);
   }
   return rightmostHeight;
}

double agent_s3450124(Game_state *current_state, int player, int x, int y){
   double score = 0;
   int is_game_won = game_won(current_state);

   if (is_game_won == player)
      score = 1;
   else if (is_game_won == other(player))
      score = -1;
   else {
      score += 0.1 * test(current_state, player, x, y);
   }
   return score;
}

This is my evaluation function (agent_s3450124()) at the moment...

I am still quite confused as to what you are doing. Is the main program calling the agent for each legal move on the current position and picking the maximum? Is that all it's doing? Or is it searching for a while using something like alpha-beta search and using those functions you keep posting as evaluation functions at the leaves of the tree?

By the way, if you go in the center, your opponent goes above you, and you go one step to one side, you just lost the game (if your opponent knows what he's doing, of course).

This topic is closed to new replies.
