ojjemojj

machine learning


Hi!

So, I've decided to program an Othello game. I have a set of features (corners, edges, diagonals) and now I want to estimate their weights. I'm trying to do this through linear regression. (I tried logistic regression too, but I'm experiencing the same issue.)

I've accumulated a database of labeled Othello positions, and I'm using linear regression to minimize the error function (the final disc difference in this case).

I have divided the training data based on the number of empty squares on the board, and I want to train separate weights for each stage of the game.
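
For concreteness, here is a minimal sketch of that staged fit in Python/NumPy. The `positions` list, the stage boundaries, and the feature extraction are all assumptions standing in for my actual setup:

```python
import numpy as np

N_STAGES = 6  # hypothetical: stage = empty_squares // 10

def stage_of(empty_squares):
    return min(empty_squares // 10, N_STAGES - 1)

def fit_stage_weights(positions):
    """Fit one weight vector per game stage with ordinary least squares.

    `positions` is assumed to be a list of
    (feature_vector, final_disc_diff, empty_squares) tuples.
    """
    buckets = [[] for _ in range(N_STAGES)]
    for features, disc_diff, empty in positions:
        buckets[stage_of(empty)].append((features, disc_diff))

    weights = []
    for bucket in buckets:  # assumes every stage has enough samples
        X = np.array([f for f, _ in bucket])       # one row of features per position
        y = np.array([d for _, d in bucket])       # target: final disc difference
        w, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares solution
        weights.append(w)
    return weights  # one weight vector per stage
```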

The problem I'm having is that important features like corners are way underestimated in the early stages, while in the end stages they are way overestimated.

I guess this is because of how regression analysis works: since the corners are usually not occupied until the end of the game, they are underrepresented in the early-stage data and given too low a weight.

I don't know how to counter this, but I guess I would have to somehow filter the training data to ensure enough feature variance, along the lines of the sketch below...
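
One hedged sketch of what such filtering might look like: oversample positions where the rare feature actually varies. The feature index and the 30% target ratio below are arbitrary illustrations, not recommendations:

```python
import random

def rebalance(positions, feature_index, target_ratio=0.3, seed=0):
    """Oversample positions where a rarely-active feature is nonzero.

    `positions` is the same (features, disc_diff, empty) list as in the
    regression sketch above; `feature_index` would point at e.g. the
    corner feature.
    """
    rng = random.Random(seed)
    active = [p for p in positions if p[0][feature_index] != 0]
    inactive = [p for p in positions if p[0][feature_index] == 0]
    if not active:
        return positions  # the feature never fires; nothing to rebalance
    # Duplicate active positions until they make up ~target_ratio of the data.
    need = int(target_ratio * len(inactive) / (1 - target_ratio))
    extra = [rng.choice(active) for _ in range(max(0, need - len(active)))]
    return active + extra + inactive
```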

The way it should be is that corners are given the highest values in the early stages of the game, and then, at the very end, all features should pretty much converge to the same values...

Any thoughts on this, on how I can train the weights properly and what the ideal training data should look like?

Thanks!

Feature weights need to change every turn to reflect shifting strategies; as you already discovered, there is no single best value.

You have 60 turns and therefore 60 different evaluation functions to learn, possibly further partitioned according to classes of board states (e.g. side positions are good, but if you already own a side, adding more discs to that side is far less useful than conquering a different side).

You can probably interpolate between the training results of a few representative turns, using data from another set of turns for cross-validation.
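
A minimal sketch of that interpolation, assuming weight vectors have already been fit at a handful of representative turns (the anchor turns named in the docstring are arbitrary examples):

```python
import numpy as np

def interpolated_weights(turn, anchor_turns, anchor_weights):
    """Linearly interpolate a weight vector for an arbitrary turn.

    anchor_turns:   sorted 1-D array of representative turns, e.g. [10, 25, 40, 55]
    anchor_weights: shape (len(anchor_turns), n_features); one fitted
                    weight vector per representative turn.
    """
    anchor_turns = np.asarray(anchor_turns)
    anchor_weights = np.asarray(anchor_weights)
    # Interpolate each feature's weight independently along the turn axis.
    return np.array([
        np.interp(turn, anchor_turns, anchor_weights[:, f])
        for f in range(anchor_weights.shape[1])
    ])
```

The held-out turns then serve as the cross-validation set: compare the interpolated vector for such a turn against a vector fit directly on that turn's data.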

What about taking a heat-map approach? Assign a weight to each square on the board based on the number of flips that placing a piece there will create. Those weights would be generated through your training. Your program will then view the board as a set of positions worth different historic values.
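
A rough sketch of building such a heat map from logged games; the (row, col, flips) move-log format is an assumption about what your engine records:

```python
import numpy as np

def build_heatmap(move_log, size=8):
    """Average historic flip count per square.

    `move_log` is assumed to be an iterable of (row, col, flips)
    records collected from training games.
    """
    totals = np.zeros((size, size))
    counts = np.zeros((size, size))
    for row, col, flips in move_log:
        totals[row, col] += flips
        counts[row, col] += 1
    # Squares never played on keep a weight of zero.
    return np.divide(totals, counts, out=np.zeros_like(totals), where=counts > 0)
```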

You can then use an N-turns-ahead depth search to derive the best move to make, based on the historic value of that path plus the current value. Although, to be honest, if you are going to take that approach you might as well scrap the machine learning and just look ahead N turns.
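
The N-turn lookahead itself is a standard negamax. This sketch takes the game rules as injected callables, since those depend on your engine (`legal_moves` and `apply_move` are hypothetical signatures), and plugs in whatever evaluation you settle on; passing is ignored for brevity:

```python
def negamax(state, depth, evaluate, legal_moves, apply_move):
    """Plain negamax without pruning; returns (score, best_move).

    `evaluate(state)` scores the state for the player to move;
    `legal_moves(state)` and `apply_move(state, move)` are assumed to
    be provided by the game engine.
    """
    moves = legal_moves(state)
    if depth == 0 or not moves:
        return evaluate(state), None
    best_score, best_move = float("-inf"), None
    for move in moves:
        child = apply_move(state, move)
        score, _ = negamax(child, depth - 1, evaluate, legal_moves, apply_move)
        score = -score  # the opponent's best outcome is our worst
        if score > best_score:
            best_score, best_move = score, move
    return best_score, best_move
```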

Another option would be to record plans, and train your AI to execute a series of moves that historically best achieve a set of goals. For instance, you might have the goal of capturing a corner, and record the series of moves that allows you to capture a corner in the fewest possible moves.

For coming up with your training data, I'd suggest using a few simple rule-based AI opponents for your training AI to play full games against. You could have one that always places a piece at random, one that always plays to capture the most pieces with each move, etc. Running your training AI for 100 games against each should give you a good basis.
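
Those two baseline opponents are only a few lines each. Here `moves` is assumed to be a list of (move, flip_count) pairs supplied by the engine:

```python
import random

_rng = random.Random(0)  # fixed seed so training runs are reproducible

def random_policy(moves):
    # Baseline 1: play a uniformly random legal move.
    move, _ = _rng.choice(moves)
    return move

def greedy_policy(moves):
    # Baseline 2: play the move that flips the most discs right now.
    move, _ = max(moves, key=lambda m: m[1])
    return move
```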

Thanks for the suggestions guys! Great ideas.

I've been experimenting with dividing the training data into stages, with each stage spanning 1 move, 2 moves, etc., up to 10 moves. They all yield somewhat different results, but unfortunately they all share the same problem of underestimating rarely seen features.

In a desperate attempt, I also created two additional AIs: one that plays completely randomly, and one based on the idea that every square on the board has a fixed value assigned to it. Then I played my strongest AI against these two for 10,000 games and used those as training games.

Not surprisingly, the feature weights depend completely on the training data, but I don't yet understand how. I get completely different weights when using strong-vs-random games compared to static-vs-random games, or when using only world-class tournament games.

Do you know of any guidelines, or what one should look for in the training data? I've been experimenting with different sample sizes too, and that drastically changes the weights... I need to find a balance in the data somehow: a good mix of well-played games and games where big mistakes were made, to get the variance needed, I suppose...

Does anyone know what a good ratio is between the number of features in the eval function and the size of the training set?

Thanks again!
