Turn based RPG AI : how can I handle risk / reward ?

Started by
31 comments, last by IADaveMark 8 years, 5 months ago

it might be time to consider a different system.


It is quite possible. I'm opened to any suggestion

https://en.wikipedia.org/wiki/Expert_system

any types of AI can be used to make the decisions in the expert system. use the best AI type at each decision point. that may be simple rule based behavior, or something more complex, such as output from a planner module. build it up as a hierarchy of expert systems/modules. one overall module, and separate modules that specialize in specific types of decisions (fight or flight ?, select weapon, who's the greatest threat, etc). then you just need an expert to "write the rules of behavior". as game dev, one would assume that "expert" would be you <g>.

i find this approach allows you to tackle the problem the way you would as a human player, and simply make the AI do what you would do if you were playing the AI side yourself. it doesn't limit you to the constraints and weaknesses of a given type of AI, as it uses a variety of types as needed. i find that behavior trees and hierarchical finite state machines seem to be the AI types which are most often the best choice for a given decision module in the system.

the weakness of this approach is that the AI will never be better than the expert programming it. not that big a deal really. you're building the game, so until it get into the hands of some rabid fanboy who spends six months playing it all the time, odds are you'll be the best player out there.

a quick example:

this is the basic AI for band members in caveman at the moment:

1. are there badguys nearby?

if badguys nearby:

1. if in a shelter (tent, etc), leave shelter.

2. if doing an action (eating, crafting , etc), stop the action.

3. determine target

4. if cornered (pinned between terrain and PCs/NPCs/badguys), attack

5. if in collision recovery, do collision recovery

6. if taking fire, respond to taking fire (attack if have target, else flee)

7. if have target within 20 feet: if target is stronger, flee - else attack.

8. if have orders, follow orders (attack, follow, move to, stand, flee, maintain distance, etc)

9. if half dead, flee

10. if have target (beyond 20 foot range): if target is stronger, flee - else attack.

11. if fatigued, rest.

12. if none of the above, stand (do nothing).

there's a similar set of rules for "no badguys nearby".

the game has a number of different types of AI:

bandmember (PCs under AI control)

hostile (hostile NPCs)

friendly (neutral NPCs)

companion (volunteer followers)

warrior (hired followers)

pet (domesticated wild animal followers)

predator (when hungry, eat carcasses. hunts weaker targets if no carcasses).

avian predator

defend (stand your ground)

avian defend

maintain distance (keep a safe distance from threats)

avian maintain distance

defend location (defend a specific location)

each has its own set of rules of behavior and its own expert system.

at the lower levels it has things like "graze" behavior (used by maintain distance AI and a few others) which uses a finite state machine to transition between flocking, wandering, migrating, and stand behaviors.

the "expert" part comes in when you write the code for "attack" or "respond to taking fire", etc. IE how to attack - with what weapons? or how best to respond to taking fire from a known or unknown source. then you simply use more little expert systems with their AI type of preference, that do nothing but decide how to attack or how to respond to taking fire, etc. eventually you end up with a hierarchy of expert systems, perhaps using a variety of AI types, with each little module an expert at making one decision, be it high level or low level.

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

Advertisement

For all intents and purposes, that's the same thing as a utility system. The advantage of the architecture in my lectures is that it is completely homogeneous and data-driven. You don't have to write separate AI modules for different types of decisions. It has most of the same power as an expert system but is far less cumbersome to develop and maintain.

Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play

"Reducing the world to mathematical equations!"

Risk/reward .... one fundamental way to make a situation where the risk/reward is clearer (logicwise) is to have a rule mechanism where once you get involved in a fight, you cannot leave it (gives more importance to maneuvering before engaging). or take some major defnsive penalty if you do.

--------------------------------------------[size="1"]Ratings are Opinion, not Fact

Dave Mark and co.'s utility system has the best level of abstraction that I have seen. Making it data driven and accessible for non-programmers is a big step forward.

Hello again smile.png

I've been working on that for the last couple of weeks and here is what happended :

My game engine overwall was slow, very slow. I spend a lot (too much) time optimizing every function that was taking too much time. The task was basically to optimize the whole game structure since 90% of my code related to battle was beeing called over and over...

My game engine performance improved vastly since last time but sadly I'm nowhere near the level of performance I'd like :/

I've removed all my utility based AI form my code and decided to give MCTS a try. Mixed results so far.

If I disregard some issue in certains browers (love you IE...), with a simple fight with a low number of skills/buffs/other costly mechanisms, I can run the MCTS process around 1500 time during my AI phase (1.5 seconds). That means 1500 node created, 1500 playout (limited to 30 turns), and 1500 evaluations.

If I skip the playout phase, I can process around 8.000 times my MCTS. On the contrary if I set it to do a maximum of 300 turns during the playout, it will process around 200 times my algorithm. (300 turns is about the number of turns that my test-battle takes to go to a win/loss situation)

So far it takes OK-ish decisons (using the 30-turns max playout) but still takes somme silly decisions sometimes.

I can't help to think that running 1500 times a MCTS algorithm is too low to have something decent. At the same time, i've optimized my code as best as I could and cannot find anywhere else to gain speed at the moment. Which is kinda of scary because I anticipate that under real circumstances, the AI will run even slower (having more things to check than with my test setup).

For now, one of the main issue besides speed of execution is that it has trouble handleing RNG.I've stopped using fixed random seed (it was way too time consuming to be viable). That means that if my AI test a move and roll a "miss" on the attack, it will likely discard the branch in favor of another attack that did not miss. The issue is not really that it selected that second attack but more that it avoided the first one based on a false idea of its sucess. it basically means that to be chosen, that attack have to pass a hit-check during the AI phase AND another one during the move processing for real. Hit chance double-dip = bad design here.

I've found a not-so-elegant solution that doesnt solve the issue entirely by at least diminish its impact : Only for the root of my tree (the unit I need to take a decision for in the end), I will add 2-3 times a node for each possible move. So if that move is a fireball, the AI will created 3 children resulting from the cast of a fireball on a target. If one hit and two misses for example, the AI will focus on the hit-branch for that spell and is more likely to select it in the end. At the same time, if my hit-chance was really low, I've got good chances that my AI will miss 3 times and doest not select the move.

All in all, it has many flaws (mainly the number of nodes added cannot be balanced properly vs the percentage of each RNG stat so it will still be RNG-dependant ... just a bit less that with only 1 node per move).

I'm open to any kind of better solution for this !

I think that with more processing power (for example if I was able to compute 15.000 moves per AI phase), the MCTS will likely be able to go much further down the tree and over the course of XXX turns, see that a fail for a move at depth 1 is not a big deal since everything after that can also fail or not and with a big number of turns evaluated, the hit/miss of each moves kinda averages down each branch.

EDIT: I was wrong here, giving it more time doesnt solve the underlying issue. I need to find a better way to solve the RNG nature of the moves.

One thing to note about my tree, is that since i know with certainty what moves the player will make, i've done a tree with only the monter's position in it. When creating a children node, I process turns untill a monster's unit should play. That allow me to have less node in my tree overall and go deeper/faster into the game.

I haven't tried yet to do a lot of optimization on the logic on the MCTS itself (like added domain knowledge and such) but I will come to it soon. A few questions I've got about that :

- it is worth it to add more than 1 children per expanding phase ?

- it is useful to do more than one evaluation per MCTS run ?

- should i initiliaze the number of win/simulation used during the selection process to something closer to the parent's value or keep 0/0 ?

I'm currenctly trying to find a way to prune entire branches based on game knowledge but so far I've failed to find easy to asses situation where I could prune a move with quasi-certainty.

i'm also thinking about using some faster-less accurate versions of the real one I use during the processing of a normal moves. This would allow me to gain speed, potentialy a lot by skipping some computation that have less effect on the end results) but it will comes with a potentialy huge loss of accuracy overall and may lead the AI to do stupid things. For example; if I decide to remove every life/mana leech computation for the AI process. It will be faster but will fail to properly evaluation situation where monsters or players heavily rely on leech.

I like the MCTS idea but my lack of processing power worries me a lot ^^

If you have ideas / suggestions / remarks / criticsm / questions, do not hesitate smile.png

EDIT : It turns out that even if I allow my MCTS to run for longer than planned, it still doesnt give me decent results. So i'm shifting my focus from game engine optimization to try and fix-up a MCTS version than can give decent results even if it is under a long time frame to start.

One of my main issue is how to handle the non-deterministic states of my moves. I've read a few papers about this but I'm still in the dark. Some suggest determinizing the game information and run a MCTS tree on that, repeat the process N times to cover a broad range of possible game states and use that information to take your final decision. In the end, it does multiply by a huge factor our computing time since we have to compute N times a MCTS tree instead of one. I cannot rely on that since over the course of a fight I've got thousands of RNG element : 2^1000 MCTS tree to compute where i already struggle with one is not an option :)

My idea of adding X children for the same move doesn't seems to be leading to a good answer either. It smooth the RNG curve a bit but can shift it in the opposite direction if the value of X is too big/small compared to the percentage of a particular RNG. And since I got multiple RNG par move (hit change, crit chance, percentage to proc something etc...) I cannot find a decent value of X that satisfies every cases. More of a badband-aid than anythign else.

Likewise adding 1 node per RNG tuple {hit or miss ,crit or not,proc1 or not,proc2 or not,etc...} for each move should cover every possible situations but has some heavy drawbacks : with 5 RNG mecanism only that means 2^5 node to consider for each move, it is way too much to compute. If we manage to create them all, we could assign them a probability ( linked to the probability of each RNG element in the node's tuple) and use that probability during our selection phase. This should work overall but be really hard on the cpu :/

I also cannot "merge" them in one single node since I've got no way of averaging the player/monsters stat's value accuractely based on two different game state and averaging the move's result during the move processing itself is doable but requieres a lot of simplifcation that are a pain to code and will hurt our accuracy really fast anyway.


If I disregard some issue in certains browers (love you IE...), with a simple fight with a low number of skills/buffs/other costly mechanisms, I can run the MCTS process around 1500 time during my AI phase (1.5 seconds). That means 1500 node created, 1500 playout (limited to 30 turns), and 1500 evaluations.

Do you really need to evaluate the game state for a full 30 turns? Do you really have combo chains that long -- where something you do *this* turn has a huge payoff 30 turns down the road? That seems like an excessively long planning horizon, and unlikely to pay off given the *huge* uncertainty of predicting what will happen 30 turns in advance. Combos should generally be 2 or 3 steps (eg. use a buf on myself, then de-buf the enemy, then attack). Have you tried setting the turn evaluation limit to something much smaller (like 4 or 5)? That would let you explore more possible actions in the same amount of time (more breadth instead of depth).

I can't help thinking you'd get better results if instead of simulating the results of the game, you simply give your AI a few simple heuristic rules. For example, you might use rules like these:

  • If I (or an ally) is badly injured and I have healing powers, then heal myself (or my ally)
  • If my enemy has fire resistance and I (or my allies) have fire-based powers, then use a de-buf power on my enemy to remove their fire resistance
  • Otherwise use the attack power that is likely to do the most damage

Those rules don't require simulating anything and should be quick to evaluate. If you can program enough of these simple heuristics then you'll get an AI that appears pretty smart even though it's not actually planning anything.


I can't help thinking you'd get better results if instead of simulating the results of the game, you simply give your AI a few simple heuristic rules. For example, you might use rules like these:
If I (or an ally) is badly injured and I have healing powers, then heal myself (or my ally)
If my enemy has fire resistance and I (or my allies) have fire-based powers, then use a de-buf power on my enemy to remove their fire resistance
Otherwise use the attack power that is likely to do the most damage
Those rules don't require simulating anything and should be quick to evaluate. If you can program enough of these simple heuristics then you'll get an AI that appears pretty smart even though it's not actually planning anything.

And this is utility-based AI.

Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play

"Reducing the world to mathematical equations!"

My browser killed my last answer, Ill try to make a quick recap :

I do not need a search of depth 30 smile.png A search of depth 6-7 would be enought I think.

EDIT : Since I've got 4 units taking turn to play, every 4 depth or so is like 1 ply for the unit I'm computing the initial move for. So in order to check combos over 4 turn, i'd need a depth of 16 O_o. God i'm far from that smile.png It seems to be what is holding me back the most, computing moves for a party of 4 instead of one unit just adds too much complexity to be handled properly.

I solved my RNG issue by using something close to expectimax with random sampling of my moves. I now generate 10 nodes per single move to account for the RNG.

It works decently so far but has killed my performance as you can imagine.

Another huge constraint I had to dealt with was the fact that each move cost a different amount of action point and in the end, while creating my tree, I end up with nodes at different game states. It is fine during the tree creation but when it is time to select my best move, it just doesnt work anymore. There is no realiable way to compare game states if they are at a completely different game turn. My solution was to create a tree with all leaves at around the same game turn. I do that by adding node on branches with a lower average turn number than average.

it works fine but it do flatten my tree a lot and prevent me from using MCTS effectively. I'm now using a simple MAX() evaluation on the tree's leaves to find my best move. It works but now I cannot go much further than depth 3.

Let's add to that that I havent found a reliable way to "prune" branch of my trees since I'd likely miss some good combos doing that.... I'll need to find some magic to be able to reach depth 5 at least (that is the minimum I am aiming for).

Edit: To add some fun, I'm now reaching some high memory usage with all the nodes I'm creating [around 10K] ;o) So I need to reduced my memory imprint and improve my computation speed at the same time... fun times ahead smile.png

And this is utility-based AI.

In almost every possible AI implementation for my problem, a good heuristic / utility / evalution function will be needed. I'm starting to feel that mine just doesn't cut it. I'm trying to find something suitable for the job but no luck so far.

I started with something easy : Health points. After all, it is what the battle is all about: kill or get killed if your HP gets to 0. This should probably be always one of the most important factor of the final scoring used.

Simple yes but effective meh. I quickly noticed that, if the AI was loosing overall [For every attack the AI makes, the opponent will strike back and strike harder than him], it would lead my AI to choose actions that would just delay it's death. It seems logical that the AI would like to die rather later than sooner. It did that by choosing fast actions that cost a low number of actions points and thus increasing the depth of the tree (decreasing its visiblity of the future :/).

To sum up, the AI will prefer to do:

A) FastAttack - (get hit) - FastAttack - (get hit) - FastAttack - (get hit) - FastAttack - (get hit)

than

B) SlowAttack - (get hit twice ) - SlowAttack (get hit twice) - SlowAttack (get hit twice) - SlowAttack (get hit twice) -

Both choices can be seen at at a depth of 4 (since the AI getting hit is just part of the environnement feedback resulting from its action and can be computed with certainty). Assuming 2 fast attack = 1 slow attack in terms of damage and that AI loose more hp when beeing hit that what is deals :

Initial score before A or B : 1

Score After A : 0.8

Score after B : 0.6

All that to illustrate my AI would choose to delay its death and that lead to poor move choice. Even if the slow attack had more than 2 times the damage of a fast attack, the AI would still choose the fast attack because it thinks it had lost less while in fact it is only due to its blindness to what happened later.

I tried to compute my tree by trying to keep an average of number of turn similar is all the leaves but it gave poor results since with 4 differents AI units playing a row and beeing stuck at a depth of 4, I'm often computing move that happend during the same game turn anyway.

I tried to add another factor by my AI utility/evaluation function : the ressources of its units (mainly action point and mana points) at a low weigth compared to HP. It hasnt really helped since figuring out a good weight for both hp/ressources has eluded me. That weigth would need to depend on too many variable to be reliable. [For now i'm not considering GA to tweak weigth since it would take me months of computing time to get a decent irations of my population ; anyway my algorithm is too far from "ok" to consider this atm]

To be honest, I've been at it for months now and feeling a little bumped about not finding a decent solution to all this :) But hey, i'm still working on it everyday, reading every article I can and trying to think about new ideas to solve this .... let's keep hope :)

Any input is appreciated :)

/my_diary end

In the few war games that I have played, winning is not about avoiding death, in my experience, but about delivering more damage to the opponent(s).
This is why numbers count, you can throw more bullets at the enemy than he can in the same time. As a result he dies faster.

This topic is closed to new replies.

Advertisement