Problem in C with NegaMax / Alpha-Beta (Connect 4)

Jaap klok · 2012-05-28T08:33:38

Hello, This might be a bit late, but i finally got my version af a Connect Four AI working (in C#). The link to my code is here: https://sourceforge.net/projects/connectfour85/ Maybe this can help you out a little bit. My code is currently very simple, without any heuristics or alpha-beta pruning, but it seems to work correctly!

Artificial Intelligence Programming

Started by Giustino May 23, 2012 06:49 PM

58 comments, last by Jaap85 11 years, 11 months ago

Giustino

107

Author

May 23, 2012 09:21 PM

Back sooner as i expected.

My IA_Evaluation function returns 0 like you asked.
But it's still the same problem.
I've more details: all the scores are equal to 0 and when there is a winning/losing move it's equal to 1000 (but the AI doesn't act as expected, she inserts coins in columns order in the loop). She fills column by column

Debug:





Test column 3

FREE column 3 / SCORE: 1000

BEST column 3 (score 1000)

Test column 2

FREE column 2 / SCORE: 1000

Test column 4

FREE column 4 / SCORE: 1000

Test column 1

FREE column 1 / SCORE: 1000

Test column 5

FREE column 5 / SCORE: 1000

Test column 0

FREE column 0 / SCORE: 1000

Test column 6

FREE column 6 / SCORE: 1000

when i'm at this point:

And before that, all scores to 0.

alvaro

21,604

May 23, 2012 10:37 PM

From that position, every move is a winning move. You didn't encourage your program to win in as few moves as possible. If you replace 1000 with 1000-number_of_moves_played it should do the right thing.

EDIT: Actually, that's wrong. Sorry. I don't have time to look at it now, but I will later.

Giustino

107

Author

May 24, 2012 12:07 AM

Ok no problem, thank you.

By the way, with my grid's verification method, i wouldn't be able to generate a score for coins alignment because it doesn't scan the entire grid, would i ?
But if it doesn't matter and if other parameters can be used (like the number of coins red/yellow), it's going to be ok.
I'd be grateful for a clarification about it.

alvaro

21,604

May 24, 2012 02:07 AM

I am back to thinking that there might not be anything wrong with your search. You make a depth-4 search after moving in the middle column and you see that you can win. All the other moves return scores of 1000 as well, but the way alpha-beta works, that only means that their scores are not higher than 1000. Then it plays in the center, which is a winning move.

Do you have another position where you think it's doing the wrong thing?

Giustino

107

Author

May 24, 2012 11:14 AM

For now, i don't

Because it plays column by column (problem with the evaluation)

alvaro

21,604

May 24, 2012 01:40 PM

For the evaluation function, you are going to have to identify where threats exist on the board. By "threat" I mean an empty place on the board that is aligned with three pieces of one player, so it the player were to play there he would win. A strong evaluation function needs to know the difference between threats that happen on even and odd rows (with the top row being even), and how multiple threats combine (e.g., player 1 wins with an odd threat, player 2 wins with an even threat, but an odd threat by player 1 beats any number of even threats by player 2). Read this or buy the book. You can also give small bonuses for things like controlling squares in the central column.

You also need to be able to search much much further than depth 4. Here are some things that can help:

You are currently using some simple heuristics to determine what order to use to explore moves in the search (3,4,2,5,1,0,6), but one can do much better than that by using statistics on what moves have produced beta cuts before. You can see how "history heuristic" is used in chess and adapt it for connect 4.
You then need transposition tables, because in this game it's very easy to reach the same position through different paths, and you want to reuse what you learned when you saw this position before. You can also use the hash tables to store the most promising move, to be searched first when you encounter this position again.
Instead of doing a fixed-depth search, you should use iterative deepening, where you search depth 1, depth 2, depth 3... until you run out of time. The early searches are not a waste of time because they fill out the hash tables and make the later searches faster. Iterative deepening also allows for proper time control.

That should keep you busy for a while.

Giustino

107

Author

May 24, 2012 03:13 PM

A strong evaluation function needs to know the difference between threats that happen on even and odd rows (with the top row being even), and how multiple threats combine (e.g., player 1 wins with an odd threat, player 2 wins with an even threat, but an odd threat by player 1 beats any number of even threats by player 2).

Sorry but i don't get it. Do you have an example for that "even/odd" thing ?
I've read your page but can't understand everything in your explanation.

Giustino

107

Author

May 24, 2012 03:51 PM

Should it be something like this ?
Knowing that "nbAlign" contains the max number of coins aligned for "joueur" in "colonne".



int IA_Evaluation(int colonne, int joueur, int nbAlign)

{

int score = 0;

if (nbAlign == 3)

{

  score = SCORE_3J;

}

else

{

  if (nbAlign == 2)

  {

   score = SCORE_2J;

  }

  else

  {

   if (nbAlign == 1)

   {

	score = SCORE_1J;

   }

  }

}

if (colonne == 2 || colonne == 3 || colonne == 4)

{

  score += SCORE_COL_MIL;

}

else

{

  score += SCORE_COL_AUT;

}

if (joueur == 0)

{

  return -score;

}

return score;

}

Because it's the same problem, almost everytime the same score for all columns.
So it fills column by column like before: 3-2-4-1-5-0-6

Giustino

107

Author

May 24, 2012 06:33 PM

Now, the whole thing works (i think, right now).

Except this problem:



Test column 3

FREE column 3 / SCORE: 1000

BEST column 3 (score 1000)

Test column 2

FREE column 2 / SCORE: 1000

Test column 4

FREE column 4 / SCORE: 1000

Test column 1

FREE column 1 / SCORE: 1

Test column 5

FREE column 5 / SCORE: 1000

Test column 0

FREE column 0 / SCORE: 1000

Test column 6

FREE column 6 / SCORE: 1000

And it should be the opposite

Everytime i can beat the AI, the "bad" column has a low score and the others have 1000.
Any idea where it comes from ?

Finally, the code above is a good result but it should take the MIN and not the MAX in that case.
What's the problem ?

alvaro

21,604

May 24, 2012 07:14 PM

I don't understand your description of the situation where you think there is a problem. You keep posting those blocks of output, but I don't really know what they mean, since the code you posted doesn't print anything.

If alpha-beta has found a move with a score of 1000, if the score of any other move is reported to be 1000 or less it just means that it is not more than 1000. So you should essentially ignore the exact value returned.

Finally, the code above is a good result but it should take the MIN and not the MAX in that case.
What's the problem ?

Ah, I see an error now:
scoreCoup = IA_NegaMaxAlphaBeta(4, -100000, 100000, grille, emplacements, colonne, emplacement, joueur);
should be
scoreCoup = -IA_NegaMaxAlphaBeta(4, -100000, -scoreMeilleur, grille, emplacements, colonne, emplacement, joueur);

EDIT: Actually, that is still not right. Let's try again:
scoreCoup = -IA_NegaMaxAlphaBeta(4, -100000, -scoreMeilleur, grille, emplacements, colonne, emplacement, joueur^1);

Problem in C with NegaMax / Alpha-Beta (Connect 4)

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Problem in C with NegaMax / Alpha-Beta (Connect 4)

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Reticulating splines