DarrylStein

TD learning code example with function approximation

7 posts in this topic

I'm trying to implement a board game using features that are tuned using temporal difference learning. I've read quite a few descriptions of TD implementations but can't seem to find any clean code examples.

I'm specifically looking for C-like code (C# would be optimal, Java next best) that demonstrates TD learning with a function approximator (I'm not interested in Q-states).

Of particular interest is how the weights of the function are updated and what values of alpha (see the function approximation part of http://www.scholarpedia.org/article/Temporal_Difference_Learning) are reasonable; initial testing suggests less than 0.1. Also, does the value of alpha change over time?

Thanks
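(For reference, in the linear case the update being asked about is w_i <- w_i + alpha * delta * phi_i(s), where delta = reward + gamma * V(s') - V(s). Below is a minimal C# sketch of that TD(0) rule; the class and member names are illustrative assumptions rather than anything from an existing library, and the default alpha/gamma values are only plausible starting points, not recommendations.)

```csharp
// Minimal sketch of TD(0) with a linear function approximator V(s) = w · phi(s).
// All names here are illustrative; nothing comes from a specific library.
class LinearTdLearner
{
    private readonly double[] weights;   // one weight per feature
    private double alpha;                // learning rate; around 0.01-0.1 is a common starting range
    private readonly double gamma;       // discount factor; often 1.0 for episodic board games

    public LinearTdLearner(int featureCount, double alpha = 0.05, double gamma = 1.0)
    {
        weights = new double[featureCount];
        this.alpha = alpha;
        this.gamma = gamma;
    }

    // Value estimate: dot product of the weights and the feature vector phi(s).
    public double Evaluate(double[] phi)
    {
        double v = 0.0;
        for (int i = 0; i < weights.Length; i++)
            v += weights[i] * phi[i];
        return v;
    }

    // One TD(0) update after observing a transition s -> s' with reward r.
    // delta = r + gamma * V(s') - V(s); each weight moves along its own feature.
    public void Update(double[] phi, double[] phiNext, double reward, bool terminal)
    {
        double vNext = terminal ? 0.0 : Evaluate(phiNext);
        double delta = reward + gamma * vNext - Evaluate(phi);

        for (int i = 0; i < weights.Length; i++)
            weights[i] += alpha * delta * phi[i];
    }

    // Optionally shrink alpha as training progresses (a common heuristic, not required by TD itself).
    public void DecayAlpha(double factor) => alpha *= factor;
}
```

Typical usage would be to call Update once per observed move, passing the feature vectors of the positions before and after the move, with the game outcome supplied as the reward on the terminal transition.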

"I'm trying to implement a board game using features that are tuned using temporal difference learning. I've read quite a few descriptions of TD implementations but can't seem to find any clean code examples."

If you choose a peculiar framework, you have the burden of proving that it makes sense.

- Do your "features" really change over time, or is that a clumsy way to account for different strategies at different stages of the game, or just "evolution"?
- Why do you want to forget old examples with that alpha factor in the first place? If some examples are somehow worse than others, you want to give them less importance than the better examples, which has nothing to do with a meaningless sequential ordering.
- Usually, machine learning in board games is applied to finding one good position evaluation function that, given a game state, matches the outcome of an exhaustive simulation from that state. How does the peculiar sequential structure of TD learning fit such an inherently memoryless problem?

I'm not looking for a framework; I'm looking for example code implementing TD with function approximation. The tuning of the features is only done once; it is not an ongoing exercise. I will define many features, and I'm using TD to provide a weighting for each of them. Once the TD algorithm converges I could potentially hardcode those weights against each feature, and the TD code would then have done its job and not be needed again. Because there are potentially hundreds of features, I don't want to try to tune each of them by hand.

I already have a comprehensive implementation of adversarial search using all the usual suspects for lookahead (alpha-beta, principal variation, killer moves, transposition tables, etc.). The heuristic function that returns an evaluation of the position will be made up of the features and their weightings.
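For what it's worth, the hand-off described above (TD produces the weights once, and the search's evaluation then just computes the weighted sum) might look roughly like the sketch below. Board, IFeature, and the constructor arguments are placeholders assumed for illustration, not part of any existing codebase.

```csharp
// Placeholder game-state type assumed for illustration.
class Board { /* game-specific position representation */ }

// Each feature maps a position to a number (material difference, mobility, ...).
interface IFeature
{
    double Extract(Board position);
}

class Evaluator
{
    private readonly IFeature[] features;
    private readonly double[] weights;   // the values the TD run converged to

    public Evaluator(IFeature[] features, double[] weights)
    {
        this.features = features;
        this.weights = weights;
    }

    // Heuristic used at the leaves of the alpha-beta search: a weighted sum of the features.
    public double Evaluate(Board position)
    {
        double score = 0.0;
        for (int i = 0; i < features.Length; i++)
            score += weights[i] * features[i].Extract(position);
        return score;
    }
}
```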

Well, I'm arguing that TD learning doesn't seem a particularly suitable framework for your application. Since you don't seem too clueless about board game AI, is this some kind of homework in which you have to use TD learning? Or does TD learning make sense after all?

It isn't at all clear whether you have stumbling blocks in your implementation, or a complete implementation that performs badly.

The bigger picture is that I'm creating a general game-playing bot. Part of this is taking in GDL (game description language) and creating features from the ruleset using evolutionary strategies. Most of the features created from the ruleset are going to be junk, so there needs to be a mechanism for tuning them to see what each feature is worth towards the heuristic and whether it gets used or chucked. I've used genetic programming for this in the past and had very limited success with it. I've read a bit of literature (scientific journals) using TD to provide feature tuning and was hoping to emulate some of those techniques. I'm just a hobbyist who finds this stuff extremely interesting :)

Over the past few days I've managed a TD implementation using a linear combination of features and realised that I probably need to move to an ANN. As a simple example, take checkers and a single feature that returns the difference in the number of men you have: a linear evaluation doesn't really give a good representation of the feature (e.g. 12 vs 11 men would evaluate the same as 2 vs 1 men, even though the latter is a far more decisive advantage). I was looking for code examples to vet my understanding and to see how decaying learning rates and other issues (that I'm not aware of) are handled. I've implemented AI on quite a few board games successfully using adversarial search and hand-tuned evaluations, and am now stepping up to more complicated problems.
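On the decaying-learning-rate question, one common pattern is TD(lambda) with eligibility traces over the same linear combination of features, with alpha shrunk slowly over time. The sketch below is illustrative only: the names, the lambda value, and the 1/t-style decay schedule are assumptions, not recommendations from the literature mentioned above.

```csharp
using System;

// Sketch of TD(lambda) with eligibility traces over a linear feature combination,
// plus one simple (and entirely arbitrary) alpha decay schedule.
class TdLambdaLearner
{
    private readonly double[] weights;
    private readonly double[] traces;   // eligibility trace per weight
    private readonly double gamma;      // discount factor
    private readonly double lambda;     // trace decay; 0 reduces this to plain TD(0)
    private readonly double alpha0;     // initial learning rate
    private int updates;

    public TdLambdaLearner(int featureCount, double alpha0 = 0.1,
                           double gamma = 1.0, double lambda = 0.7)
    {
        weights = new double[featureCount];
        traces = new double[featureCount];
        this.alpha0 = alpha0;
        this.gamma = gamma;
        this.lambda = lambda;
    }

    public double Evaluate(double[] phi)
    {
        double v = 0.0;
        for (int i = 0; i < weights.Length; i++) v += weights[i] * phi[i];
        return v;
    }

    public void Update(double[] phi, double[] phiNext, double reward, bool terminal)
    {
        double delta = reward + (terminal ? 0.0 : gamma * Evaluate(phiNext)) - Evaluate(phi);
        double alpha = alpha0 / (1.0 + updates++ / 1000.0);   // illustrative 1/t-style decay

        for (int i = 0; i < weights.Length; i++)
        {
            traces[i] = gamma * lambda * traces[i] + phi[i];  // accumulating traces
            weights[i] += alpha * delta * traces[i];
        }
    }

    // Traces should be cleared at the start of each new game/episode.
    public void ResetTraces() => Array.Clear(traces, 0, traces.Length);
}
```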

Good luck, it sounds interesting, although it's outside my area of expertise. One possibility that may help with either a genetic algorithm or an NN is offering some "pre-chewed" metrics, e.g. have one input which is the difference, one which is the difference ratio, and one which is the log of the difference. That reduces the complexity that the NN or GA needs to represent, as the most appropriate metric will tend to be strengthened/selected for. I'm not sure, though, whether the extra inputs will introduce other difficulties.
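As a concrete illustration of those "pre-chewed" inputs (a hypothetical helper, with a guard against dividing by zero):

```csharp
using System;

// Derive several transformed material inputs from the raw piece counts and feed
// them all to the learner, letting the weights (or the NN/GA) pick the useful one.
static class MaterialFeatures
{
    public static double[] Derive(int myMen, int theirMen)
    {
        double diff = myMen - theirMen;
        double ratio = theirMen > 0 ? (double)myMen / theirMen : myMen; // avoid divide-by-zero
        double logDiff = Math.Sign(diff) * Math.Log(1.0 + Math.Abs(diff));
        return new[] { diff, ratio, logDiff };
    }
}
```

With these, 12 vs 11 and 2 vs 1 men both give a difference of 1 but ratios of roughly 1.09 and 2.0, so even a linear model can tell the two positions apart.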

Interesting suggestion, Jeff. The trade-off would be possibly more useful metrics versus simply more metrics: the more inputs there are, the longer the NN/GA should take to converge, but better metrics would speed up convergence. It would be an interesting test to see which option works better.

I've heard of people using that approach successfully for specific tasks such as handwriting recognition. I think success would partly come down to selecting potentially useful transformations and partly be a crapshoot. ;)

