Followers 0

# Problem in C with NegaMax / Alpha-Beta (Connect 4)

## 59 posts in this topic

Hi everyone,

I'm facing a problem with negamax. Think it comes from my eval function but i'm not sure at all.
First, can you tell me if my negamax and his call are ok ?
Because i don't find any solution, even if i read a lot about it these days.

Before showing you the code, i have to tell you that my verification function is based on (x,y) position (correspond to the last coin inserted with X = column and Y = line). Is this a portable method with negamax/alpha-beta or do i have to check the entire grid ?

Here is my negamax function:
[CODE]
int IA_NegaMaxAlphaBeta(int profondeur, int alpha, int beta, int grille[LIGNES][COLONNES], int emplacements[COLONNES], int colonne, int emplacement, int joueur)
{
int x = COLONNES/2, y, i, k = 1;

if (profondeur == 0)
{
return IA_Evaluation(grille, emplacements, colonne, emplacement, joueur);
}

if (check(grille, emplacements, colonne, emplacement, joueur) == 4)
{
if (joueur == 0)
{
return -1000;
}

return 1000;
}

if (GrillePleine(emplacements))
{
return 0;
}
for(i = 0; i < COLONNES && alpha < beta; i++)
{
// EFFICACITE: Balayer les colonnes du milieu vers l'extérieur (alternativement gauche/droite)
x += i*k;
k = -k;
if (emplacements[x] > -1)
{
y = InsererJeton(grille, emplacements, x, joueur);
alpha = max(alpha, -IA_NegaMaxAlphaBeta(profondeur-1, -beta, -alpha, grille, emplacements, x, y, joueur^1));
AnnulerJeton(grille, emplacements, y, x);
}
}

return alpha;
}
[/CODE]

Is there something wrong in it ?

And here comes my negamax call:
[CODE]
int x = COLONNES/2, y, i, k = 1, scoreCoup, scoreMeilleur = -1000000, col = -1;

for(i = 0; i < COLONNES; i++)
{
// EFFICACITE: Balayer les colonnes du milieu vers l'extérieur (alternativement gauche/droite)
x += i*k;
k = -k;
if (emplacements[x] > -1)
{
y = InsererJeton(grille, emplacements, x, joueur);
scoreCoup = IA_NegaMaxAlphaBeta(4, -100000, 100000, grille, emplacements, colonne, emplacement, joueur);
if (scoreCoup > scoreMeilleur)
{
scoreMeilleur = scoreCoup;
col = x;
}
AnnulerJeton(grille, emplacements, y, x);
}
}
[/CODE]

If you want to see my eval function, just ask. I don't show it right now because there are maybe errors in my negamax.

Oh and by the way... Sorry for my english if i made mistakes ! I'm from Belgium [img]http://public.gamedev.net//public/style_emoticons/default/laugh.png[/img]
0

##### Share on other sites
[code] x += i*k;
k = -k;[/code]
That's cute, but probably incorrect. You may want to add (i+1)*k to x instead.

Or even better, you may want to wait until you have a working search before you add any embelishments of this type.

EDIT: Nevermind, I now think that is correct. I still would take it out until you know your search is working. Edited by alvaro
0

##### Share on other sites
I totally agree with your point of view, but my grid (columns) is [0,6] so grid[7] is overflow.

See with my code:
[CODE]
- explore column 3
- explore column 2
- explore column 4
- explore column 1
- explore column 5
- explore column 0
- explore column 6
[/CODE]

But if i add 1 to i like you said before, it's going to be wrong:

[CODE]

- explore column 4
- explore column 2
- explore column 5
- explore column 1
- explore column 6
- explore column 0
- explore column 7
[/CODE]

Any other error except that one ? [img]http://public.gamedev.net//public/style_emoticons/default/sad.png[/img]
0

##### Share on other sites
I can't find any flaws in your code. Perhaps you should try debugging a depth-1 search first and see how that goes.
0

##### Share on other sites
Don't you think that the problem may come from my grid's verification method ?
Because it doesn't scan the entire grid, but just all the directions from the last coin inserted.
I can see there is a big problem with all of it but can't find it [img]http://public.gamedev.net//public/style_emoticons/default/mellow.png[/img]

About my eval function, i know it's not good at all so it's maybe the problem. Anyway, do you want me to show that function ?
0

##### Share on other sites
Almost regardless of what the evaluation function returns, your program should be able to find forced victories and avoid tactical losses. Looking only at the last toekn inserted should work just fine to check if the game was won by the player that just played.

But feel free to post the code and I'll take a look.
0

##### Share on other sites
Also, could you describe what the actual problem is? Perhaps you can show us a position that you think shows the program is not working well.
0

##### Share on other sites
If you don't see any problem in my negamax, it probably comes from the eval function.

Have a look (debug function):

[CODE]
Test column 3
Test column 2
Test column 4
Test column 1
FREE column 1 / SCORE: 1000
BEST column 1 (score 1000)
Test column 5
FREE column 5 / SCORE: 900
Test column 0
FREE column 0 / SCORE: 1000
Test column 6
FREE column 6 / SCORE: 1000
[/CODE]

Seems to be OK except for scores so i guess it comes from eval.
In this example, it's not the case here, but almost every time i've the same score for all the columns... So it cannot work.

Here is my eval crap:
[CODE]

int IA_Evaluation(int grille[LIGNES][COLONNES], int emplacements[COLONNES], int colonne, int emplacement, int joueur)
{
int score = 0, flag = check(grille, emplacements, colonne, emplacement, joueur);

if (flag == 4)
{
if (joueur == 0)
{
return -1000;
}

return 1000;
}

if (flag == 3)
{
if (joueur == 0)
{
return -900;
}

return 900;
}

if (flag == 2)
{
if (joueur == 0)
{
return -600;
}

return 600;
}

if (flag == 1)
{
if (joueur == 0)
{
return -300;
}

return 300;
}

return 0;
}
[/CODE]
0

##### Share on other sites
While it is perfectly OK to use your `check' function to detect the victory condition of the game, it is not OK to use if for evaluation purposes. But I would just start with an evaluation function that returns 0 every time and see if you program can find quick tactical wins in easy cases.

And I would still like to see a position together with a description of what the program does and what you think it should do. It's hard to diagnose a disease without knowing any of the symptoms.
0

##### Share on other sites
You're right. When i posted, i thought it was a problem with my negamax so i didn't continue with deep tests.
I'm going to have a closer look to it and test it just like you said.

I'll let you know as soon as possible (like tomorrow if i can).
Thank you
0

##### Share on other sites
Back sooner as i expected.

My IA_Evaluation function returns 0 like you asked.
But it's still the same problem.
I've more details: all the scores are equal to 0 and when there is a winning/losing move it's equal to 1000 (but the AI doesn't act as expected, she inserts coins in columns order in the loop). She fills column by column [img]http://public.gamedev.net//public/style_emoticons/default/sad.png[/img]

Debug:
[CODE]

Test column 3
FREE column 3 / SCORE: 1000
BEST column 3 (score 1000)
Test column 2
FREE column 2 / SCORE: 1000
Test column 4
FREE column 4 / SCORE: 1000
Test column 1
FREE column 1 / SCORE: 1000
Test column 5
FREE column 5 / SCORE: 1000
Test column 0
FREE column 0 / SCORE: 1000
Test column 6
FREE column 6 / SCORE: 1000
[/CODE]

when i'm at this point:
[img]http://img811.imageshack.us/img811/4597/24018014.png[/img]

And before that, all scores to 0.
0

##### Share on other sites
From that position, every move is a winning move. You didn't encourage your program to win in as few moves as possible. If you replace 1000 with 1000-number_of_moves_played it should do the right thing.

EDIT: Actually, that's wrong. Sorry. I don't have time to look at it now, but I will later. Edited by alvaro
0

##### Share on other sites
Ok no problem, thank you.

By the way, with my grid's verification method, i wouldn't be able to generate a score for coins alignment because it doesn't scan the entire grid, would i ?
But if it doesn't matter and if other parameters can be used (like the number of coins red/yellow), it's going to be ok.
I'd be grateful for a clarification about it.
0

##### Share on other sites
I am back to thinking that there might not be anything wrong with your search. You make a depth-4 search after moving in the middle column and you see that you can win. All the other moves return scores of 1000 as well, but the way alpha-beta works, that only means that their scores are not higher than 1000. Then it plays in the center, which is a winning move.

Do you have another position where you think it's doing the wrong thing? Edited by alvaro
0

##### Share on other sites
For now, i don't [img]http://public.gamedev.net//public/style_emoticons/default/wacko.png[/img]
Because it plays column by column (problem with the evaluation)
0

##### Share on other sites
For the evaluation function, you are going to have to identify where threats exist on the board. By "threat" I mean an empty place on the board that is aligned with three pieces of one player, so it the player were to play there he would win. A strong evaluation function needs to know the difference between threats that happen on even and odd rows (with the top row being even), and how multiple threats combine (e.g., player 1 wins with an odd threat, player 2 wins with an even threat, but an odd threat by player 1 beats any number of even threats by player 2). Read [url="http://homepages.cwi.nl/~tromp/c4.html"]this[/url] or buy [url="http://www.amazon.com/Complete-Book-CONNECT-History-Strategy/dp/1402756216"]the book[/url]. You can also give small bonuses for things like controlling squares in the central column.

You also need to be able to search much much further than depth 4. Here are some things that can help:[list]
[*]You are currently using some simple heuristics to determine what order to use to explore moves in the search (3,4,2,5,1,0,6), but one can do much better than that by using statistics on what moves have produced beta cuts before. You can see how "history heuristic" is used in chess and adapt it for connect 4.
[*]You then need transposition tables, because in this game it's very easy to reach the same position through different paths, and you want to reuse what you learned when you saw this position before. You can also use the hash tables to store the most promising move, to be searched first when you encounter this position again.
[*]Instead of doing a fixed-depth search, you should use iterative deepening, where you search depth 1, depth 2, depth 3... until you run out of time. The early searches are not a waste of time because they fill out the hash tables and make the later searches faster. Iterative deepening also allows for proper time control.
[/list]
That should keep you busy for a while.
0

##### Share on other sites
[quote name='alvaro' timestamp='1337866828' post='4942889']
A strong evaluation function needs to know the difference between threats that happen on even and odd rows (with the top row being even), and how multiple threats combine (e.g., player 1 wins with an odd threat, player 2 wins with an even threat, but an odd threat by player 1 beats any number of even threats by player 2).
[/quote]

Sorry but i don't get it. Do you have an example for that "even/odd" thing ?
0

##### Share on other sites
Should it be something like this ?
Knowing that "nbAlign" contains the max number of coins aligned for "joueur" in "colonne".

[CODE]
int IA_Evaluation(int colonne, int joueur, int nbAlign)
{
int score = 0;
if (nbAlign == 3)
{
score = SCORE_3J;
}
else
{
if (nbAlign == 2)
{
score = SCORE_2J;
}
else
{
if (nbAlign == 1)
{
score = SCORE_1J;
}
}
}
if (colonne == 2 || colonne == 3 || colonne == 4)
{
score += SCORE_COL_MIL;
}
else
{
score += SCORE_COL_AUT;
}
if (joueur == 0)
{
return -score;
}
return score;
}
[/CODE]

Because it's the same problem, almost everytime the same score for all columns.
So it fills column by column like before: 3-2-4-1-5-0-6 [img]http://public.gamedev.net//public/style_emoticons/default/sad.png[/img] Edited by Giustino
0

##### Share on other sites
Now, the whole thing works (i think, right now).

Except this problem:
[CODE]
Test column 3
FREE column 3 / SCORE: 1000
BEST column 3 (score 1000)
Test column 2
FREE column 2 / SCORE: 1000
Test column 4
FREE column 4 / SCORE: 1000
Test column 1
FREE column 1 / SCORE: 1
Test column 5
FREE column 5 / SCORE: 1000
Test column 0
FREE column 0 / SCORE: 1000
Test column 6
FREE column 6 / SCORE: 1000
[/CODE]

And it should be the opposite [img]http://public.gamedev.net//public/style_emoticons/default/sad.png[/img]
Everytime i can beat the AI, the "bad" column has a low score and the others have 1000.
Any idea where it comes from ?

Finally, the code above is a good result but it should take the MIN and not the MAX in that case.
What's the problem ? Edited by Giustino
0

##### Share on other sites
I don't understand your description of the situation where you think there is a problem. You keep posting those blocks of output, but I don't really know what they mean, since the code you posted doesn't print anything.

If alpha-beta has found a move with a score of 1000, if the score of any other move is reported to be 1000 or less it just means that it is not more than 1000. So you should essentially ignore the exact value returned.

[quote name='Giustino' timestamp='1337884435' post='4942970']
Finally, the code above is a good result but it should take the MIN and not the MAX in that case.
What's the problem ?
[/quote]

Ah, I see an error now:
[code]scoreCoup = IA_NegaMaxAlphaBeta(4, -100000, 100000, grille, emplacements, colonne, emplacement, joueur);[/code]
should be
[code]scoreCoup = -IA_NegaMaxAlphaBeta(4, -100000, -scoreMeilleur, grille, emplacements, colonne, emplacement, joueur);[/code]

EDIT: Actually, that is still not right. Let's try again:
[code]scoreCoup = -IA_NegaMaxAlphaBeta(4, -100000, -scoreMeilleur, grille, emplacements, colonne, emplacement, joueur^1);[/code] Edited by alvaro
0

##### Share on other sites
Ok i'll have a look.

For my example, here is a cleaner explanation:
[CODE]
Test column 3
FREE column 3 / SCORE: 1000
BEST column 3 (score 1000)
Test column 2
FREE column 2 / SCORE: 1000
Test column 4
FREE column 4 / SCORE: 1000
Test column 1
FREE column 1 / SCORE: 1000
Test column 5
FREE column 5 / SCORE: 1000
Test column 0
FREE column 0 / SCORE: 3
Test column 6
FREE column 6 / SCORE: 1000
[/CODE]

produced by:
[img]http://img715.imageshack.us/img715/4127/18786785.png[/img]

Can you see the problem ? It always happened in those cases.

This is weird...

Initial call for negamax:
When i changed the code you've shown above, goes wrong.
But like this, it works (except the problem i've told before):

[CODE]
y = InsererJeton(grille, emplacements, x, joueur^1);
scoreCoup = IA_NegaMaxAlphaBeta(4, -1000000000, 1000000000, grille, emplacements, colonne, emplacement, joueur);
[/CODE]

Is it normal that i've to write "!joueur" and not simply "joueur" ? Edited by Giustino
0

##### Share on other sites
I edited my previous post. It looks like your initial call to IA_NegaMaxAlphaBeta was pretty messed up. It should be virtually identical to the recursive call made inside of IA_NegaMaxAlphaBeta, except instead of beta you can use the fixed value 1000000000. Edited by alvaro
0

##### Share on other sites
Think i tried that too, but i'm going to test it and i'll tell you.

Yep i already tested it. makes things worse (let's just say... not better).
Where can be the problem ? [img]http://public.gamedev.net//public/style_emoticons/default/wacko.png[/img] Edited by Giustino
0

##### Share on other sites
I would change the depth from 4 to 1 or even 0 and try to use a debugger to see what's going on.
0

##### Share on other sites
Ok, it works better since i've changed something in my eval function.
But there is still the same problem... When it can loose, it doesn't block the player (because the score of that column is the lower -> OK for the MIN but should be the biggest value to make things right).
Do you understand what i mean ? It's the same thing i told you above, just look at the image and the debug code to help you if it's not clear in your mind.
I really don't know what could be crap in all of that [img]http://public.gamedev.net//public/style_emoticons/default/sad.png[/img]
0

## Create an account

Register a new account