Jump to content

  • Log In with Google      Sign In   
  • Create Account


Problem in C with NegaMax / Alpha-Beta (Connect 4)


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

  • You cannot reply to this topic
59 replies to this topic

#1 Giustino   Members   -  Reputation: 105

Like
0Likes
Like

Posted 23 May 2012 - 12:49 PM

Hi everyone,

I'm facing a problem with negamax. Think it comes from my eval function but i'm not sure at all.
First, can you tell me if my negamax and his call are ok ?
Because i don't find any solution, even if i read a lot about it these days.

Before showing you the code, i have to tell you that my verification function is based on (x,y) position (correspond to the last coin inserted with X = column and Y = line). Is this a portable method with negamax/alpha-beta or do i have to check the entire grid ?

Here is my negamax function:
int IA_NegaMaxAlphaBeta(int profondeur, int alpha, int beta, int grille[LIGNES][COLONNES], int emplacements[COLONNES], int colonne, int emplacement, int joueur)
{
int x = COLONNES/2, y, i, k = 1;

if (profondeur == 0)
{
  return IA_Evaluation(grille, emplacements, colonne, emplacement, joueur);
}

if (check(grille, emplacements, colonne, emplacement, joueur) == 4)
{
  if (joueur == 0)
  {
   return -1000;
  }

  return 1000;
}

if (GrillePleine(emplacements))
{
  return 0;
}
for(i = 0; i < COLONNES && alpha < beta; i++)
{
  // EFFICACITE: Balayer les colonnes du milieu vers l'extérieur (alternativement gauche/droite)
  x += i*k;
  k = -k;
  if (emplacements[x] > -1)
  {
   y = InsererJeton(grille, emplacements, x, joueur);
   alpha = max(alpha, -IA_NegaMaxAlphaBeta(profondeur-1, -beta, -alpha, grille, emplacements, x, y, joueur^1));
   AnnulerJeton(grille, emplacements, y, x);
  }
}

return alpha;
}

Is there something wrong in it ?


And here comes my negamax call:
int x = COLONNES/2, y, i, k = 1, scoreCoup, scoreMeilleur = -1000000, col = -1;

for(i = 0; i < COLONNES; i++)
{
  // EFFICACITE: Balayer les colonnes du milieu vers l'extérieur (alternativement gauche/droite)
  x += i*k;
  k = -k;
  if (emplacements[x] > -1)
  {
   y = InsererJeton(grille, emplacements, x, joueur);
   scoreCoup = IA_NegaMaxAlphaBeta(4, -100000, 100000, grille, emplacements, colonne, emplacement, joueur);
   if (scoreCoup > scoreMeilleur)
   {
	scoreMeilleur = scoreCoup;
	col = x;
   }
   AnnulerJeton(grille, emplacements, y, x);
  }
}

If you want to see my eval function, just ask. I don't show it right now because there are maybe errors in my negamax.


Oh and by the way... Sorry for my english if i made mistakes ! I'm from Belgium Posted Image
Thanks in advance for your help ! Posted Image

Sponsor:

#2 Álvaro   Crossbones+   -  Reputation: 11943

Like
0Likes
Like

Posted 23 May 2012 - 01:17 PM

x += i*k;
  k = -k;
That's cute, but probably incorrect. You may want to add (i+1)*k to x instead.

Or even better, you may want to wait until you have a working search before you add any embelishments of this type.

EDIT: Nevermind, I now think that is correct. I still would take it out until you know your search is working.

Edited by alvaro, 23 May 2012 - 01:19 PM.


#3 Giustino   Members   -  Reputation: 105

Like
0Likes
Like

Posted 23 May 2012 - 01:34 PM

I totally agree with your point of view, but my grid (columns) is [0,6] so grid[7] is overflow.

See with my code:
- explore column 3
- explore column 2
- explore column 4
- explore column 1
- explore column 5
- explore column 0
- explore column 6

But if i add 1 to i like you said before, it's going to be wrong:


- explore column 4
- explore column 2
- explore column 5
- explore column 1
- explore column 6
- explore column 0
- explore column 7

Any other error except that one ? Posted Image

#4 Álvaro   Crossbones+   -  Reputation: 11943

Like
0Likes
Like

Posted 23 May 2012 - 01:41 PM

I can't find any flaws in your code. Perhaps you should try debugging a depth-1 search first and see how that goes.

#5 Giustino   Members   -  Reputation: 105

Like
0Likes
Like

Posted 23 May 2012 - 01:48 PM

Don't you think that the problem may come from my grid's verification method ?
Because it doesn't scan the entire grid, but just all the directions from the last coin inserted.
I can see there is a big problem with all of it but can't find it Posted Image

About my eval function, i know it's not good at all so it's maybe the problem. Anyway, do you want me to show that function ?

#6 Álvaro   Crossbones+   -  Reputation: 11943

Like
0Likes
Like

Posted 23 May 2012 - 01:52 PM

Almost regardless of what the evaluation function returns, your program should be able to find forced victories and avoid tactical losses. Looking only at the last toekn inserted should work just fine to check if the game was won by the player that just played.

But feel free to post the code and I'll take a look.

#7 Álvaro   Crossbones+   -  Reputation: 11943

Like
0Likes
Like

Posted 23 May 2012 - 01:53 PM

Also, could you describe what the actual problem is? Perhaps you can show us a position that you think shows the program is not working well.

#8 Giustino   Members   -  Reputation: 105

Like
0Likes
Like

Posted 23 May 2012 - 02:24 PM

If you don't see any problem in my negamax, it probably comes from the eval function.

Have a look (debug function):

Test column 3
Test column 2
Test column 4
Test column 1
FREE column 1 / SCORE: 1000
BEST column 1 (score 1000)
Test column 5
FREE column 5 / SCORE: 900
Test column 0
FREE column 0 / SCORE: 1000
Test column 6
FREE column 6 / SCORE: 1000

Seems to be OK except for scores so i guess it comes from eval.
In this example, it's not the case here, but almost every time i've the same score for all the columns... So it cannot work.

Here is my eval crap:

int IA_Evaluation(int grille[LIGNES][COLONNES], int emplacements[COLONNES], int colonne, int emplacement, int joueur)
{
int score = 0, flag = check(grille, emplacements, colonne, emplacement, joueur);

if (flag == 4)
{
if (joueur == 0)
{
return -1000;
}

return 1000;
}

if (flag == 3)
{
if (joueur == 0)
{
return -900;
}

return 900;
}

if (flag == 2)
{
if (joueur == 0)
{
return -600;
}

return 600;
}

if (flag == 1)
{
if (joueur == 0)
{
return -300;
}

return 300;
}

return 0;
}


#9 Álvaro   Crossbones+   -  Reputation: 11943

Like
0Likes
Like

Posted 23 May 2012 - 02:36 PM

While it is perfectly OK to use your `check' function to detect the victory condition of the game, it is not OK to use if for evaluation purposes. But I would just start with an evaluation function that returns 0 every time and see if you program can find quick tactical wins in easy cases.

And I would still like to see a position together with a description of what the program does and what you think it should do. It's hard to diagnose a disease without knowing any of the symptoms.

#10 Giustino   Members   -  Reputation: 105

Like
0Likes
Like

Posted 23 May 2012 - 02:49 PM

You're right. When i posted, i thought it was a problem with my negamax so i didn't continue with deep tests.
I'm going to have a closer look to it and test it just like you said.

I'll let you know as soon as possible (like tomorrow if i can).
Thank you

#11 Giustino   Members   -  Reputation: 105

Like
0Likes
Like

Posted 23 May 2012 - 03:21 PM

Back sooner as i expected.

My IA_Evaluation function returns 0 like you asked.
But it's still the same problem.
I've more details: all the scores are equal to 0 and when there is a winning/losing move it's equal to 1000 (but the AI doesn't act as expected, she inserts coins in columns order in the loop). She fills column by column Posted Image

Debug:

Test column 3
FREE column 3 / SCORE: 1000
BEST column 3 (score 1000)
Test column 2
FREE column 2 / SCORE: 1000
Test column 4
FREE column 4 / SCORE: 1000
Test column 1
FREE column 1 / SCORE: 1000
Test column 5
FREE column 5 / SCORE: 1000
Test column 0
FREE column 0 / SCORE: 1000
Test column 6
FREE column 6 / SCORE: 1000

when i'm at this point:
Posted Image

And before that, all scores to 0.

#12 Álvaro   Crossbones+   -  Reputation: 11943

Like
0Likes
Like

Posted 23 May 2012 - 04:37 PM

From that position, every move is a winning move. You didn't encourage your program to win in as few moves as possible. If you replace 1000 with 1000-number_of_moves_played it should do the right thing.

EDIT: Actually, that's wrong. Sorry. I don't have time to look at it now, but I will later.

Edited by alvaro, 23 May 2012 - 04:39 PM.


#13 Giustino   Members   -  Reputation: 105

Like
0Likes
Like

Posted 23 May 2012 - 06:07 PM

Ok no problem, thank you.

By the way, with my grid's verification method, i wouldn't be able to generate a score for coins alignment because it doesn't scan the entire grid, would i ?
But if it doesn't matter and if other parameters can be used (like the number of coins red/yellow), it's going to be ok.
I'd be grateful for a clarification about it.

#14 Álvaro   Crossbones+   -  Reputation: 11943

Like
0Likes
Like

Posted 23 May 2012 - 08:07 PM

I am back to thinking that there might not be anything wrong with your search. You make a depth-4 search after moving in the middle column and you see that you can win. All the other moves return scores of 1000 as well, but the way alpha-beta works, that only means that their scores are not higher than 1000. Then it plays in the center, which is a winning move.

Do you have another position where you think it's doing the wrong thing?

Edited by alvaro, 23 May 2012 - 08:08 PM.


#15 Giustino   Members   -  Reputation: 105

Like
0Likes
Like

Posted 24 May 2012 - 05:14 AM

For now, i don't Posted Image
Because it plays column by column (problem with the evaluation)

#16 Álvaro   Crossbones+   -  Reputation: 11943

Like
0Likes
Like

Posted 24 May 2012 - 07:40 AM

For the evaluation function, you are going to have to identify where threats exist on the board. By "threat" I mean an empty place on the board that is aligned with three pieces of one player, so it the player were to play there he would win. A strong evaluation function needs to know the difference between threats that happen on even and odd rows (with the top row being even), and how multiple threats combine (e.g., player 1 wins with an odd threat, player 2 wins with an even threat, but an odd threat by player 1 beats any number of even threats by player 2). Read this or buy the book. You can also give small bonuses for things like controlling squares in the central column.

You also need to be able to search much much further than depth 4. Here are some things that can help:
  • You are currently using some simple heuristics to determine what order to use to explore moves in the search (3,4,2,5,1,0,6), but one can do much better than that by using statistics on what moves have produced beta cuts before. You can see how "history heuristic" is used in chess and adapt it for connect 4.
  • You then need transposition tables, because in this game it's very easy to reach the same position through different paths, and you want to reuse what you learned when you saw this position before. You can also use the hash tables to store the most promising move, to be searched first when you encounter this position again.
  • Instead of doing a fixed-depth search, you should use iterative deepening, where you search depth 1, depth 2, depth 3... until you run out of time. The early searches are not a waste of time because they fill out the hash tables and make the later searches faster. Iterative deepening also allows for proper time control.
That should keep you busy for a while. :)

#17 Giustino   Members   -  Reputation: 105

Like
0Likes
Like

Posted 24 May 2012 - 09:13 AM

A strong evaluation function needs to know the difference between threats that happen on even and odd rows (with the top row being even), and how multiple threats combine (e.g., player 1 wins with an odd threat, player 2 wins with an even threat, but an odd threat by player 1 beats any number of even threats by player 2).


Sorry but i don't get it. Do you have an example for that "even/odd" thing ?
I've read your page but can't understand everything in your explanation.

#18 Giustino   Members   -  Reputation: 105

Like
0Likes
Like

Posted 24 May 2012 - 09:51 AM

Should it be something like this ?
Knowing that "nbAlign" contains the max number of coins aligned for "joueur" in "colonne".

int IA_Evaluation(int colonne, int joueur, int nbAlign)
{
int score = 0;
if (nbAlign == 3)
{
  score = SCORE_3J;
}
else
{
  if (nbAlign == 2)
  {
   score = SCORE_2J;
  }
  else
  {
   if (nbAlign == 1)
   {
	score = SCORE_1J;
   }
  }
}
if (colonne == 2 || colonne == 3 || colonne == 4)
{
  score += SCORE_COL_MIL;
}
else
{
  score += SCORE_COL_AUT;
}
if (joueur == 0)
{
  return -score;
}
return score;
}

Because it's the same problem, almost everytime the same score for all columns.
So it fills column by column like before: 3-2-4-1-5-0-6 Posted Image

Edited by Giustino, 24 May 2012 - 09:52 AM.


#19 Giustino   Members   -  Reputation: 105

Like
0Likes
Like

Posted 24 May 2012 - 12:33 PM

Now, the whole thing works (i think, right now).

Except this problem:
Test column 3
FREE column 3 / SCORE: 1000
BEST column 3 (score 1000)
Test column 2
FREE column 2 / SCORE: 1000
Test column 4
FREE column 4 / SCORE: 1000
Test column 1
FREE column 1 / SCORE: 1
Test column 5
FREE column 5 / SCORE: 1000
Test column 0
FREE column 0 / SCORE: 1000
Test column 6
FREE column 6 / SCORE: 1000

And it should be the opposite Posted Image
Everytime i can beat the AI, the "bad" column has a low score and the others have 1000.
Any idea where it comes from ?


Finally, the code above is a good result but it should take the MIN and not the MAX in that case.
What's the problem ?

Edited by Giustino, 24 May 2012 - 12:49 PM.


#20 Álvaro   Crossbones+   -  Reputation: 11943

Like
0Likes
Like

Posted 24 May 2012 - 01:14 PM

I don't understand your description of the situation where you think there is a problem. You keep posting those blocks of output, but I don't really know what they mean, since the code you posted doesn't print anything.

If alpha-beta has found a move with a score of 1000, if the score of any other move is reported to be 1000 or less it just means that it is not more than 1000. So you should essentially ignore the exact value returned.

Finally, the code above is a good result but it should take the MIN and not the MAX in that case.
What's the problem ?


Ah, I see an error now:
scoreCoup = IA_NegaMaxAlphaBeta(4, -100000, 100000, grille, emplacements, colonne, emplacement, joueur);
should be
scoreCoup = -IA_NegaMaxAlphaBeta(4, -100000, -scoreMeilleur, grille, emplacements, colonne, emplacement, joueur);

EDIT: Actually, that is still not right. Let's try again:
scoreCoup = -IA_NegaMaxAlphaBeta(4, -100000, -scoreMeilleur, grille, emplacements, colonne, emplacement, joueur^1);

Edited by alvaro, 24 May 2012 - 02:20 PM.





Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.



PARTNERS