SurfingNerd

Prediction with Neural Networks


Hello,

I am a neural network apprentice; I am learning about them at university.

Currently I am developing a turn-based RPG,

and I got stuck on decision making because of performance issues, as you can read in my previous post.

So I would love to give a solution with an ANN a try...

 

Example Situation

 

Let's assume a simple 1 vs 1 situation:

1 Healer (AI) vs 1 Tank (enemy)

On its turn, the Healer can either attack or heal.

That gives 4 possible actions:

a) Attack yourself (=healer)

b) Attack the Tank

c) Heal yourself

d) Heal the Tank

 

So decisions b) and c) are good ones, 

and a) and d) are stupid ones.

 

Scoring

 

I have written a scoring algorithm that determines how good your current state on the battlefield is, depending on current health and so on.

bad situation < 0 < good situation.

I am interested in the near-future situation: "How does my decision now affect the result within x (= 5?) turns?"

I can search a decision tree and calculate a score for each path in it.

This works, and I get good results. The only problem is that I would need an Earth Simulator 2 machine to get the scores within an acceptable time...
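
To make the cost concrete, here is a minimal sketch of the exhaustive path scoring described above. The stats, the `score` formula, and the assumption that the Tank simply hits the Healer every turn are all invented for illustration; they are not the actual game rules or scoring algorithm.

```python
from itertools import product

ACTIONS = ("attack_self", "attack_tank", "heal_self", "heal_tank")
HEALER_MAX_HP, TANK_MAX_HP = 30, 60          # hypothetical stats
HEALER_DMG, HEAL_AMOUNT, TANK_DMG = 5, 8, 6  # hypothetical stats


def apply_turn(state, action):
    """Apply the Healer's action, then let the Tank hit back (assumed enemy behaviour)."""
    healer_hp, tank_hp = state
    if healer_hp <= 0 or tank_hp <= 0:
        return state  # fight already decided, nothing changes
    if action == "attack_self":
        healer_hp -= HEALER_DMG
    elif action == "attack_tank":
        tank_hp -= HEALER_DMG
    elif action == "heal_self":
        healer_hp = min(HEALER_MAX_HP, healer_hp + HEAL_AMOUNT)
    else:  # heal_tank
        tank_hp = min(TANK_MAX_HP, tank_hp + HEAL_AMOUNT)
    if healer_hp > 0 and tank_hp > 0:
        healer_hp -= TANK_DMG
    return max(healer_hp, 0), max(tank_hp, 0)


def score(state):
    """Stand-in score: bad situation < 0 < good situation."""
    healer_hp, tank_hp = state
    return healer_hp / HEALER_MAX_HP - tank_hp / TANK_MAX_HP


def path_scores(state, first_action, depth):
    """Score every path of `depth` turns that starts with `first_action`."""
    scores = []
    for rest in product(ACTIONS, repeat=depth - 1):
        current = state
        for action in (first_action, *rest):
            current = apply_turn(current, action)
        scores.append(score(current))
    return scores


def average_score(state, first_action, depth):
    scores = path_scores(state, first_action, depth)
    return sum(scores) / len(scores)
```

With 4 actions and x = 5 this is only 4^4 = 256 paths per first action, but the count multiplies by the number of legal actions for every extra turn, unit, or skill, which is where the run time explodes.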

 

The Network

So here is where neural networks come into play:

I would feed a neural network the data of the current situation (current health, skills, active status effects such as poisoned, stunned, and so on) plus the planned action as input,

and I want to get a forecast of the score after x (= 5?) rounds as output.

 

The Training

I train the network with example situations.

For each one I calculate the outcomes of an action over x rounds and take the average of their scores.

I use this average as the training target for the same input I used in my calculation.

 

Execution

In the execution phase I know the 4 possible actions, so I ask the network 4 times for a prediction,

then I choose the action with the best predicted value.
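
A rough sketch of that execution step, assuming the network is available as any callable `predict(features) -> float`. The feature layout, the `encode` helper, and the one-hot action encoding are illustrative choices, not something fixed by the game.

```python
ACTIONS = ("attack_self", "attack_tank", "heal_self", "heal_tank")


def encode(state, action):
    """Build the network input: normalised state features plus a one-hot action."""
    healer_hp, healer_max, tank_hp, tank_max, poisoned, stunned = state
    features = [
        healer_hp / healer_max,
        tank_hp / tank_max,
        float(poisoned),
        float(stunned),
    ]
    # One slot per possible action; exactly one of them is 1.0.
    features += [1.0 if a == action else 0.0 for a in ACTIONS]
    return features


def choose_action(predict, state, actions=ACTIONS):
    """Query the predictor once per action and keep the best forecast."""
    return max(actions, key=lambda a: predict(encode(state, a)))
```

Any trained regression model could be plugged in as `predict`; for a quick check, a stub such as `lambda f: f[6]` (which only rewards the `heal_self` slot) makes `choose_action` return `"heal_self"`.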

 

Type of Networks

I am really new to ANNs and currently have no idea what kind of network is best suited to this problem.

 

 

What do you think? Is this goal possible to achieve, or is this complete nonsense?

Thanks for reading, Tom



What do you think? Is this goal possible to achieve, or is this complete nonsense?

 

possible, yes.

 

nonsense, no.

 

your scoring system is reminiscent of utility based systems.

 

me personally, i've always written the same basic kind of ai for everything, as i've never been able to find anything that could beat it.

 

how to describe it?

 

i guess technically it's a behavior tree.


Sorry, but when I read this it strikes me as someone who has just found a fancy electric screwdriver and is now asking how to drive in two nails with it. Also, why are you even allowing stupid moves like healing an enemy or attacking yourself when you could simplify the problem by forbidding them in the game rules?

Then you seem to be mixing up two things: deciding the next move and predicting the outcome after x rounds.

For the first, a simple rule-based system would be easy to make:

- always heal when an ally is at risk of dying (simple math with current HP, max HP and estimated damage/heal values)

- if there is nothing else to do, attack the enemy

For the second, you can just assume the AI always follows the two rules above and the enemy always attacks; then the problem reduces to a simple calculation.
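
As a sketch of both points, under made-up stats: `healer_policy` encodes the two rules, and `simulate` plays x rounds assuming the enemy always attacks. The "at risk of dying" test used here (current HP no higher than the expected hit) is one possible reading, not a prescribed formula.

```python
def healer_policy(healer_hp, expected_enemy_damage):
    """Rule 1: heal when at risk of dying. Rule 2: otherwise attack."""
    if healer_hp <= expected_enemy_damage:
        return "heal"
    return "attack"


def simulate(healer_hp, tank_hp, rounds,
             healer_max=30, healer_dmg=5, heal=8, tank_dmg=6):
    """Play `rounds` rounds: Healer follows the rules, Tank always attacks."""
    for _ in range(rounds):
        if healer_hp <= 0 or tank_hp <= 0:
            break  # fight is over
        if healer_policy(healer_hp, tank_dmg) == "heal":
            healer_hp = min(healer_max, healer_hp + heal)
        else:
            tank_hp -= healer_dmg
        if tank_hp > 0:
            healer_hp -= tank_dmg
    return max(healer_hp, 0), max(tank_hp, 0)


# Healer at 30 HP, Tank at 60 HP, five rounds ahead.
print(simulate(30, 60, 5))  # (8, 40)
```

The prediction costs one pass over x rounds instead of a full tree, because each side has exactly one move per turn.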


thx for the answers

 

@wintertime:

 

The reasons why I want a "learning AI" instead of the "rule-based AI" you mentioned

(of course, I very often think about switching back to rule-based...):

- I want to use the AI for balancing: it has to find tactics that I, as the game designer, did not think of.

- I study ANNs at university, so this is a good subject for my studies.

- I already tried a rule-based AI and came to the conclusion that it is good as long as the scenarios are simple.

- A rule-based AI means that with every new skill and status effect you have to redo all your rules, instead of just training the AI to adapt to the new rules.

 

The reason why these "stupid moves" are allowed is that there are situations where they aren't stupid:

- if you heal an enemy affected by Zombie, you deal damage

- you can nullify the "confuse" effect by attacking an ally.

Edited by SurfingNerd



- I want to use the AI for balancing: it has to find tactics that I, as the game designer, did not think of.

 

not really happening with a pure neural net. it learns what it learns, and that's it.  one AI difficulty level is the result.  only way to get multiple levels is to generate sub-par solutions by denying the NN some inputs.

 


- I study ANNs at university, so this is a good subject for my studies.

 

this is about the only valid reason i see to do it this way.

 


- I already tried a rule-based AI and came to the conclusion that it is good as long as the scenarios are simple.

 

i've been building complex simulation games for 25 years, and have yet to encounter the game rule system that my method can't handle (usually in a rather trivial manner). 

 


- A rule-based AI means that with every new skill and status effect you have to redo all your rules, instead of just training the AI to adapt to the new rules.

 

sounds like you want to set up auto-training, turn it on, and walk away, instead of coding rule based systems.  granted it can be less work, but "you get what you pay for". As a NN student, i'm sure you've heard the one about the NN used to ID tanks in recon photos. They got it all trained, only to discover it wasn't spotting tanks at all, just woods where tanks like to hide! <g>. can't recall if it was the US or Britain.

 

OTOH, NNs can come up with things you may not think of, which is the weakness of rule based systems, they're only guaranteed to work for situations the developer anticipated and coded for.

 

 


The reason why these "stupid moves" are allowed is that there are situations where they aren't stupid.

 

doesn't a neural net have to be taught everything though?  IE all the stuff it should not do, as well as what it should?  it should be capable of any response, good or bad, and then learn to only give good responses.

I would use MCTS. This is what my plan of attack would be (this plan serves as an informal introduction to what MCTS is, actually):
1) Come up with a simple randomized strategy that doesn't do things that are too stupid (like hurting yourself or healing an enemy in normal circumstances). We'll call this the "playout policy".
2) Make a simple AI that tries each of the possible actions and for each one of them runs a bunch of "playouts" where every agent uses the playout policy until the situation is resolved and a winner is declared. Pick the move with the highest fraction of victories, of course.
3) Refine strategy (2) by using a clever algorithm based on previous experience to sample promising moves more often than unpromising ones (see multi-armed bandit for details, in particular search the web for a strategy called UCB1).
4) Do the refinement introduced in (3) at other situations that arise more than once, not just the root.
5) Try to tweak the playout policy to improve the performance of the AI described in (4).
6) Try some refinements that have been successful in computer Go, like RAVE and creating fake statistics for new nodes in the tree (coming from heuristics).
7) Parallelize the search.

After each step I would ask myself if the performance is already good enough. If it is, stop there and be happy.
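
Steps 1 to 3 can be sketched roughly as follows for the 1 vs 1 example. The stats, the two-action playout policy (attack the Tank or heal yourself, chosen at random), and the turn cap are assumptions for illustration only; this samples at the root and does not build the tree from step 4.

```python
import math
import random

ACTIONS = ("attack", "heal")                 # playout policy skips the "stupid" moves
HEALER_MAX_HP = 30                           # hypothetical stats
HEALER_DMG, HEAL_AMOUNT, TANK_DMG = 5, 8, 6  # hypothetical stats


def apply_turn(state, action):
    """Healer acts, then the Tank (if still alive) hits the Healer."""
    healer_hp, tank_hp = state
    if action == "attack":
        tank_hp -= HEALER_DMG
    else:
        healer_hp = min(HEALER_MAX_HP, healer_hp + HEAL_AMOUNT)
    if tank_hp > 0:
        healer_hp -= TANK_DMG
    return healer_hp, tank_hp


def playout(state, rng, max_turns=200):
    """Step 1: random playout. Returns 1.0 if the Healer wins, else 0.0."""
    for _ in range(max_turns):
        healer_hp, tank_hp = state
        if tank_hp <= 0:
            return 1.0
        if healer_hp <= 0:
            return 0.0
        state = apply_turn(state, rng.choice(ACTIONS))
    return 0.0  # unresolved fights count as losses (assumption)


def ucb1(wins, visits, total_visits, c=math.sqrt(2)):
    """UCB1 value: observed win rate plus an exploration bonus."""
    return wins / visits + c * math.sqrt(math.log(total_visits) / visits)


def choose_action(state, n_playouts=2000, seed=0):
    """Steps 2 and 3: sample root moves with UCB1, return the best win rate."""
    rng = random.Random(seed)
    wins = {a: 0.0 for a in ACTIONS}
    visits = {a: 0 for a in ACTIONS}
    for _ in range(n_playouts):
        untried = [a for a in ACTIONS if visits[a] == 0]
        if untried:
            action = untried[0]  # try every move once first
        else:
            total = sum(visits.values())
            action = max(ACTIONS, key=lambda a: ucb1(wins[a], visits[a], total))
        wins[action] += playout(apply_turn(state, action), rng)
        visits[action] += 1
    best = max(ACTIONS, key=lambda a: wins[a] / visits[a])
    return best, visits
```

Calling `choose_action((30, 30))` returns the chosen move together with how often each move was sampled; promising moves end up with most of the playouts.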


What you started describing better fits the evaluation techniques of a planner, and would be much easier to develop and manipulate than a neural net.

 

Each decision option could be an expansion of the planner search space. The search can terminate when there are no more threats, or perhaps when there are no more threats AND the character's team is no longer taking damage, as that can have an effect on the plan. The g and h costs can be some variation of the score of your team's 'effectiveness'.

 

Calculating the effectiveness is where much of your work is focused. Some ideas, or perhaps a place to start, could be

effectiveness = total health + total dps

cost = enemy effectiveness - ally effectiveness

 

I say total in order to account for there being multiple members in each team.

I say total DPS for the same reason.
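
A small sketch of those two formulas for teams of any size. The `Unit` class and the example numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    health: float
    dps: float


def effectiveness(team):
    """effectiveness = total health + total dps, summed over the team."""
    return sum(unit.health for unit in team) + sum(unit.dps for unit in team)


def cost(allies, enemies):
    """cost = enemy effectiveness - ally effectiveness (lower is better for us)."""
    return effectiveness(enemies) - effectiveness(allies)


allies = [Unit(health=30, dps=5)]
enemies = [Unit(health=60, dps=6)]
before = cost(allies, enemies)   # 66 - 35 = 31
enemies[0].health -= 5           # the ally attacks for 5 damage
after = cost(allies, enemies)    # 61 - 35 = 26
```

An attack lowers the cost by reducing enemy health; a heal would lower it by raising ally health, so both kinds of action are compared on the same scale.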

 

Using a difference between the enemy and ally allows attacking, healing, buffing, and debuffing to have their effect on the plan. Debuffs, for example, would probably mostly affect the enemy team's DPS, either in the form of stat reduction, damage reduction, or maybe forcing the target to skip a turn, etc., which would provide momentary or time-based reductions in the enemy team's effectiveness. Vice versa for buffing your team. It's not always obvious even to human players that a long-term buff might be more beneficial to the team objectives than performing an attack or doing something else. This sort of planner would allow that to be accounted for, and would let you tweak the weights of different types of branches. You could add bias scalars that allow you to tweak certain encounters towards healing, or debuffing, etc., or even change the biases mid-combat.

 

But there are other reasons to calculate an effectiveness based on the entire team, namely the situations where a character has the option to heal or buff another character, or debuff an enemy. Healing or buffing another character would have a significant effect on these effectiveness calculations and would probably produce better action decisions. Damage/heal over time effects should probably be counted at their full positive or negative effect in the weighting, perhaps scaled to some extent based on how aggressively you want characters to make regenerative moves or heal themselves of damage over time effects.

 

Rather than planning from the perspective of an individual character, I would probably attempt a team plan. Each node of the graph would be a specific character's turn, both ally and enemy. You include the enemies' turns in order to simulate anticipation of enemy actions. Each edge out of a node is one of the possible actions for that character.

You could also hide certain action edges from the search until they are 'discovered', either by the enemy's first use of them or by some type of knowledge stat roll. This would give the simple appearance of learning. By default, only a very basic level of abilities would be assumed: fighters can attack at some average DPS suitable for their level, clerics can heal a reasonable amount based on their level, etc. The first time the fighter attacks, though, you could replace the 'assumed' fighter action with the real action, so that the plan is refined based on the strength of the abilities each player displays. If they use another attack, add that to the list of expansions in the search. Eventually one team will 'know' every action available to the enemy members, so its plan should make better decisions about who to attack first and so on.

Perhaps, depending on the difficulty, a certain number of the action links are unlocked automatically at the start or over time, so that a human can't game the AI as easily by saving their really heavy-hitting skills until late in the round. This would basically mimic the general flow of combat in D&D games, where players often do 'knowledge' checks as minor actions in order to discover more information about the enemy's state of health, abilities, etc.

 

Also for purposes of the search, you probably will want to give them explicit knowledge of the frequency that such attacks can be used or else the plan will bias heavily toward defeating characters with powerful attacks because they will be seen as more of a threat, even if those attacks can only be used once or infrequently.

 

This would be ideal to experiment with in the context of a turn-based game, where there is plenty of time to perform a complex search. Real-time games would have to be more selective and only update the plan infrequently, or time-slice the search to control the cost.

I see a basic problem with the plan of evaluating the game state a few turns later: it depends on what the opponents do (in the example, the enemy "Tank" can at the very least attack the Healer, attack someone else, or retreat). So you really need an AI for the Healer, the Tank and all other units. Evaluating a game tree doesn't seem particularly more expensive, considering that reasoning can prune a lot of bad actions and that most states can be folded back to the same state on the previous turn. Some game state information can be compressed to reduce the number of game states and get approximate results (e.g. distance between characters instead of grid positions).
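
One way the compression idea could look, as a sketch: grid positions are replaced by the distance between the two characters, so many concrete states share one key. The state layout and the choice of Manhattan distance are assumptions for illustration.

```python
def abstract_state(healer_pos, tank_pos, healer_hp, tank_hp):
    """Collapse grid positions into a single distance (Manhattan, by assumption)."""
    distance = abs(healer_pos[0] - tank_pos[0]) + abs(healer_pos[1] - tank_pos[1])
    return (distance, healer_hp, tank_hp)


# Two different board layouts that fold into the same abstract state.
a = abstract_state((0, 0), (2, 1), healer_hp=20, tank_hp=40)
b = abstract_state((5, 5), (4, 3), healer_hp=20, tank_hp=40)
```

Here `a` and `b` are both `(3, 20, 40)`, so a search that caches results by this key evaluates the position only once. The results become approximate wherever absolute position actually matters.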


Thank you very much for your answers!!

I didn't expect such great participation :)

I will now dig deeper into Monte Carlo tree search,

it seems to be a good algorithm.

Maybe I will still try out the ANN prediction method for study reasons.
