Wargames WOPR AI?

Started by
3 comments, last by Yakyb 12 years, 1 month ago
I bet most of you in here have seen the old movie "Wargames".

I'm wondering: how would you go about programming an AI like WOPR that learns from its own play?

Let's take chess as an example (which he loved). He would run simulations of chess games, and if he lost, he would learn from the mistake and improve the next time.
One of the many difficulties with this is how to represent the learned information.

For example, one learning approach suited to self-directed learning is the genetic algorithm. The results of genetic algorithms can be very unpredictable and impressive. The hidden "knowledge" the programmer provides is what the genetic code represents. If the programmer forgets to include a gene for moving the knight, the knight will never move. The problem may be much more subtle than that: the encoding may only allow a limited "memory" of past moves, may not capture that certain board positions are mirror images of each other, and so on. If the representation is wrong, there will be a strategy that always beats the computer, and it cannot learn to win.
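To make the representation point concrete, here is a minimal genetic-algorithm sketch (my own toy example, not anything from the thread): it evolves bitstrings toward an all-ones target. The fitness function, genome length, and all the parameters here are illustrative assumptions; the point is that the genome encoding fixes what can ever be learned — anything outside length-N bitstrings is simply unreachable.

```python
import random

N = 20            # genome length (illustrative)
POP = 30          # population size
GENERATIONS = 100
random.seed(0)    # deterministic for reproducibility

def fitness(genome):
    # Count of 1-bits; stands in for "games won" in a real evaluator.
    return sum(genome)

def mutate(genome, rate=0.05):
    # Flip each bit with a small probability.
    return [1 - g if random.random() < rate else g for g in genome]

def crossover(a, b):
    # Single-point crossover between two parent genomes.
    cut = random.randrange(1, N)
    return a[:cut] + b[cut:]

population = [[random.randint(0, 1) for _ in range(N)] for _ in range(POP)]
for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    parents = population[:POP // 2]      # truncation selection: keep the top half
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP - len(parents))]
    population = parents + children

best = max(population, key=fitness)
```

Note that nothing in this loop can ever produce a genome longer than N or containing anything but 0s and 1s — the same way a chess GA with no knight-move gene can never learn to move the knight.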

Similarly for any approach there may be a meta-game that the computer will not be able to represent/understand, such as the psychology of the opponent. A player may fall for a sucker punch after a few victories due to overconfidence. The player may assume the computer will attempt the same strategy again under the same circumstances. Extending a game arbitrarily may make the player tired and lose concentration. Or a technically inferior play may annoy the player and cause them to make mistakes later. If the system lacks a way to store information on past games, timing information, or a way to represent the player's expectations, it cannot learn the meta-game either.
Online learning algorithms can be updated continuously. They are useful for situations where the rules are constantly changing (e.g., spam filters).
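A toy sketch of the online-learning idea, using the spam-filter setting mentioned above (the vocabulary, messages, and bag-of-words featurization are all made-up illustrations): a perceptron that updates after every example, so it can track changing "rules" without retraining from scratch.

```python
def featurize(text, vocab):
    # Bag-of-words: 1.0 if the vocabulary word appears in the message.
    return [1.0 if w in text.split() else 0.0 for w in vocab]

class OnlinePerceptron:
    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def predict(self, x):
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score > 0 else 0      # 1 = spam, 0 = ham

    def update(self, x, label):
        # Mistake-driven: adjust weights only when the prediction is wrong.
        error = label - self.predict(x)
        if error:
            self.w = [wi + error * xi for wi, xi in zip(self.w, x)]
            self.b += error

vocab = ["free", "winner", "meeting", "report"]   # hypothetical vocabulary
model = OnlinePerceptron(len(vocab))
stream = [("free winner prize", 1), ("quarterly report meeting", 0),
          ("free meeting today", 0), ("winner winner free", 1)]
for text, label in stream:                        # one pass over the stream
    model.update(featurize(text, vocab), label)
```

Because each update touches only the current example, new kinds of spam can shift the weights as soon as they appear in the stream.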

Most ensemble methods can be updated "on the fly" by adding new models when enough data has been collected, and removing old ones that perform poorly against the new data.

Most probabilistic algorithms that use opponent models are technically learning as they play, since their strategy is based more on the opponent's behavior than on some fixed strategy. In this sense they are always learning/adapting.

It's not a solved problem, but there are lots of interesting attempts.
The problem with a lot of them is that it is often difficult to find out what went right or wrong. Using chess as an example, if you lose the game, which move caused it? It could have been a single move you made 20, 30, even 50 moves earlier that was the thing you want to avoid. This is the credit assignment problem: determining what you "learned" from that experience is a very difficult undertaking.
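One common answer to this credit-assignment problem is a temporal-difference-style update (my choice of technique — the post above names no specific method). Each position's value estimate is nudged toward the value of the position that followed it, so a loss many moves later gradually propagates blame back to the earlier moves. The positions and learning rate below are made-up illustrations.

```python
ALPHA = 0.3   # learning rate (illustrative)

def td_update(values, episode, outcome):
    """TD(0)-style sweep.

    values:  dict mapping position -> estimated value
    episode: positions visited, in order
    outcome: final result of the game (e.g. -1.0 for a loss)
    """
    # Each position's target is the value of its successor;
    # the last position's target is the game outcome itself.
    targets = [values.get(p, 0.0) for p in episode[1:]] + [outcome]
    for pos, target in zip(episode, targets):
        old = values.get(pos, 0.0)
        values[pos] = old + ALPHA * (target - old)
    return values

values = {}
# Replay the same losing game repeatedly; blame for the loss seeps
# backwards from the final blunder to the opening.
for _ in range(20):
    td_update(values, ["open", "mid", "blunder"], outcome=-1.0)
```

After a few replays the final position's value is strongly negative, and the earlier positions become negative too — without anyone having to decide explicitly which move "caused" the loss.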

Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play

"Reducing the world to mathematical equations!"

I have seen learning AI put into Rock Paper Scissors games, where the next move is chosen based on the options the user has previously selected.
I.e. the game keeps a history of moves made and might know that 60% of the time a user will follow rock, rock with paper.
Obviously this is a far easier problem than a game of chess.
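The frequency-counting idea above can be sketched in a few lines (conditioning on the player's previous two moves, as in the rock-rock example; the class and its details are my own illustration): the AI predicts the player's likeliest next throw and plays its counter.

```python
from collections import defaultdict, Counter

BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

class RPSPredictor:
    def __init__(self):
        # history[(prev2, prev1)] counts what the player threw next
        self.history = defaultdict(Counter)
        self.recent = []   # the player's last two moves

    def choose(self):
        key = tuple(self.recent)
        if len(self.recent) == 2 and self.history[key]:
            # Predict the player's likeliest follow-up and counter it.
            predicted = self.history[key].most_common(1)[0][0]
            return BEATS[predicted]
        return "rock"      # no data yet: arbitrary default

    def observe(self, player_move):
        # Record what followed the last two moves, then slide the window.
        if len(self.recent) == 2:
            self.history[tuple(self.recent)][player_move] += 1
        self.recent = (self.recent + [player_move])[-2:]

ai = RPSPredictor()
# The player habitually follows rock, rock with paper...
for move in ["rock", "rock", "paper", "rock", "rock", "paper", "rock", "rock"]:
    ai.observe(move)
# ...so after seeing rock, rock again, the AI counters the expected paper.
```

This is exactly the kind of opponent model that does not transfer to chess: the state here is just the last two moves, whereas chess would need the whole board plus the opponent's style.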

However, taking this concept to a game of chess, you would need to consider not only the current state of the board but also the opponent's style of play.

