Planner-based AI and action costs

Hello. I read Jeff Orkin's "Three States and a Plan: The A.I. of F.E.A.R." and found it easy enough to understand. However, he doesn't go into great detail about the cost of each action. I suspect that implementing the system such that the A.I. completes actions that satisfy the goal would be easy enough, but what about completing the best actions? He implies the use of constant values for the action costs, but I doubt that's sufficient in all cases (particularly for higher-level actions). For example, suppose you have a squad-level "flank" action. When is flanking better than just standing and shooting? It seems that you'd need a function that evaluates how dangerous the flank would be for the cost to mean anything. Has anyone here used procedural costs before? Also, since Orkin also skims over the goals and how they compete, what ideas do you have there? Thanks for any input.
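To make the question concrete, here's roughly the kind of procedural cost I'm imagining, as a C++ sketch (all names and constants are made up for illustration):

```cpp
// Hypothetical sketch of a procedural cost for a squad-level "flank"
// action. Lower cost means the planner prefers the action.
struct SquadContext {
    float exposureAlongRoute; // 0..1: fraction of the flank route in enemy line of fire
    float enemyAlertness;     // 0..1: how likely the enemy is to spot the maneuver
    float extraTravel;        // metres of extra movement the flank requires
};

struct FlankAction {
    float Cost(const SquadContext& ctx) const {
        const float baseCost = 5.0f;                                // constant part
        const float danger = ctx.exposureAlongRoute * ctx.enemyAlertness;
        return baseCost + 20.0f * danger + 0.05f * ctx.extraTravel; // procedural part
    }
};
```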

You can bet that the movies demonstrated in the F.E.A.R. presentation involved a lot of magic-number tweaking for useful results, probably simple stuff like timers limiting how often a given action can activate. A hierarchical behavior tree like Halo 2's has the same issue of having to balance the priorities of competing actions. IMO the F.E.A.R. presentation makes it sound overly simple, when the reality was probably not just creating an action and dropping it in. It was more like creating an action, dropping it in, and testing it quite a bit to get the priority evaluation to a level where it looks good. I'm personally using the Halo 2 system in my projects, as are many studios; it's explained decently on Gamasutra, with follow-up talks available from the latest GDC. It's an incredibly simple concept that ends up being very powerful and quick to plug behavior into.

Either way, though, there's no magic in figuring out your priority functions; you just have to test and come up with stuff that looks good. Often that comes from simple things like timers to limit the frequency, biasing a pathfinding query some distance off the optimal path to vary the paths returned, or weighting evasive actions against a time scalar since the last one, etc.
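For example, the timer trick looks something like this in practice (a minimal sketch; names and constants are the "magic numbers" you'd tweak):

```cpp
// Sketch: damp an action's priority for a while after it runs, so it
// doesn't fire constantly.
struct EvasiveAction {
    float basePriority = 0.8f;
    float lastUsedTime = -1e9f; // effectively "never used yet"
    float cooldown     = 6.0f;  // seconds until full priority returns

    float Priority(float now) const {
        const float sinceUse = now - lastUsedTime;
        const float scale = (sinceUse >= cooldown) ? 1.0f : sinceUse / cooldown;
        return basePriority * scale; // ramps back up over the cooldown window
    }

    void OnExecuted(float now) { lastUsedTime = now; }
};
```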

The way you do a behavior system like that is to attach a probability to each behavior that designers can tune. Each behavior also has its own binary heuristic: "Can I run right now?"

You evaluate all potential behaviors and collect a list of the ones that can run. After that, you choose one at random, weighted by probability.
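In code, the selection step might look something like this (a minimal C++ sketch; the structure names are just for illustration):

```cpp
#include <random>
#include <vector>

struct Behavior {
    float weight;      // designer-tuned probability weight
    bool (*canRun)();  // binary heuristic: "Can I run right now?"
    void (*run)();
};

// Filter behaviors by their "can run" check, then pick one at random,
// weighted by the designer-tuned probabilities.
void SelectAndRun(std::vector<Behavior>& behaviors, std::mt19937& rng) {
    std::vector<Behavior*> runnable;
    std::vector<float> weights;
    for (auto& b : behaviors) {
        if (b.canRun()) {
            runnable.push_back(&b);
            weights.push_back(b.weight);
        }
    }
    if (runnable.empty()) return;
    std::discrete_distribution<size_t> pick(weights.begin(), weights.end());
    runnable[pick(rng)]->run();
}
```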

Many of the flanks and such in F.E.A.R. were also controlled by scripts and carefully designed maps. If you place that AI in any random level layout, it fails pretty horribly. But that's fine. It's a great AI because:

1) design fully understands how it works
2) design can control it
3) it does what you expect it to do

-me

In my experience you'd be hard pressed to find an AI programmer who would give the designers that low a level of control. Designers are, more often than not, ignorant of or unable to understand how an AI works well enough to be tweaking low-level behavior priorities. In my experience the AI programmer comes up with what looks good, and you toss the designers some bones for very limited customization of some behaviors. I'd be very curious whether any studio at all gave the designers that degree of customization.

Quote:
Original post by DrEvil
In my experience you'd be hard pressed to find an AI programmer who would give the designers that low a level of control. Designers are, more often than not, ignorant of or unable to understand how an AI works well enough to be tweaking low-level behavior priorities. In my experience the AI programmer comes up with what looks good, and you toss the designers some bones for very limited customization of some behaviors. I'd be very curious whether any studio at all gave the designers that degree of customization.


So: I'm an AI programmer at a major studio, and that's more or less in the ballpark of how we do our AI. It all comes down to the experience and competency of your design team. Our lead designers are, respectively, a math major, a statistics major, and a history major. The former two are extremely technical, and one of them came from Bungie, where he worked on Halo (and that was how their AI worked).

The purpose of design (in my experience) is to design levels and encounters. If they cannot control the AI, how do they control the encounter?

I've seen other teams with the philosophy that designers get the bone, and IMHO their games suffered for it.

[EDIT: I should point out that only the lead designers are allowed to tweak the behaviors in the unit archetypes. The level designers place the units as-is. However, the unit designers certainly have script hooks to the AI so that they can do things like: "when player enters trigger A, send AI to Node group X", and such. That latter type of script, IIRC, is how a lot of the "flanks" in F.E.A.R. work.]

-me

That's fair. We have a philosophy of being a bit careful about how much power the designers get. We have a few much more technical designer types among our three teams, who get varying degrees of flexibility on their respective projects. On our project we're simplifying the interface to the AI quite a bit. Designers generally don't get to control lower-level behavior like when to take a knee, how often to move between cover, or when to combat roll; their control is at a somewhat higher level: go here, get in this vehicle, drive here, etc. They also get more control over how the AI uses weapons, by specifying a number of weapon properties and AI 'hints' dealing with them. Weapon usage is a big part of designing encounters; other than that, higher-level movement control seems to make up the bulk of the rest. There are a million ways to do this; ultimately you need to determine what degree of technical ability the designers can handle, preferably without them constantly bothering you with 'bugs' due to improperly set up control mechanisms.

Well, I need to design things such that there are no per-encounter or per-level scripts, so the AI must be versatile and require no extra human input.

Palidine, I was aware of the basics; I just doubt that constant costs for high-level actions (think squad or even platoon level) would work well. Constant values make sense only for actions that don't change under varying circumstances (for example, an NPC's reload speed will probably always be the same, so the "reload" action can have a constant cost).


If you expect that action cost is context dependent, then what attributes of the domain do you expect to affect your action cost? Answer that and you have a means of answering your original question.

Cheers,

Timkin

I've used procedural costs in my homebrew planning system; it was one of my improvements over Orkin's system.

First, it depends on what you want to minimize. Do you want to minimize the duration of the plan? Maximize the chances of success? Minimize the risk of failure? (They might not be complementary!)

Another thing you have to take into account is the order of actions. Unless you re-order the actions after planning, the costs of your actions may change the order in which they play out.
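To illustrate the first point: the same plan statistics can be scored in different ways depending on the objective, and the resulting orderings can disagree. A sketch (field names are hypothetical):

```cpp
struct PlanStats {
    float duration;    // estimated seconds to execute
    float successProb; // estimated probability the plan succeeds, 0..1
};

// Minimize duration: ignore everything but time.
float CostByTime(const PlanStats& p) { return p.duration; }

// Maximize chance of success: lower cost for more reliable plans.
float CostByReliability(const PlanStats& p) { return 1.0f - p.successProb; }

// Blend of the two; note a fast, risky plan and a slow, safe plan can
// swap ranks depending on timeWeight, which is why the objectives
// "might not be complementary".
float CostBlended(const PlanStats& p, float timeWeight) {
    return timeWeight * p.duration + (1.0f - timeWeight) * (1.0f - p.successProb);
}
```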

Timkin, I guess what makes it tricky is how abstract (or maybe just complex) the attributes are. For squad- or higher-level tactics, you'd have to consider the enemies' relative positions, firing range, numbers, defensive strength (cover, armour, etc.), and mobility. If you knew all this, you could make good decisions for almost any tactic; but I'm not sure how best to represent the data.
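One (purely speculative) representation I could imagine is a snapshot structure that the cost functions read from, with the raw attributes condensed into a few derived numbers:

```cpp
#include <vector>

struct EnemyEstimate {
    float distance;     // to squad centroid
    float weaponRange;
    float coverQuality; // 0 = exposed, 1 = fully covered
    float mobility;     // estimated speed; affects flank feasibility
};

struct TacticalSnapshot {
    std::vector<EnemyEstimate> enemies;
    int   friendlyCount;
    float friendlyCoverQuality;

    // Example derived attribute: a crude local force ratio that an
    // action cost function could consume instead of raw positions.
    float ForceRatio() const {
        return enemies.empty() ? 1e9f
                               : float(friendlyCount) / float(enemies.size());
    }
};
```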

Steadtler, I guess I'm trying to minimize casualties and, to a lesser extent, maximize the chance of success. Reordering actions sounds a bit complex. I think I'll just continually check the current results of the action and, if it's failing horribly, cancel it (which can trigger some relatively simple and immutable "regroup" goal).

Thanks for the help so far, everyone.

Hi,

I'm working on my master's thesis, which is about tracking and predicting the motion of objects. In my research I came across Bayesian networks. So-called decision graphs are an extension of them. The nice thing is that you can easily build those graphs (see the Wikipedia article on decision trees).
Might be worth looking into.

Quote:
Original post by Baiame
Timkin, I guess what makes it tricky is how abstract (or maybe just complex) the attributes are. For squad- or higher-level tactics, you'd have to consider the enemies' relative positions, firing range, numbers, defensive strength (cover, armour, etc.), and mobility. If you knew all this, you could make good decisions for almost any tactic; but I'm not sure how best to represent the data.


This is the heart of the problem for any real time decision system that is context aware. Essentially what you are asking for is a value function over (action,state) pairs (a cost function problem can be easily cast as a value function problem and vice-versa). Actually what you ideally want is a policy (a pairing of states and optimal actions) which covers the game state space. You could try learning such a value function or policy during a training period, but to be honest, I wouldn't go that way (at least not initially). This is an intractable problem in all but the simplest of games. This is why people go with heuristic solutions, or 'canned plans' devised during design time.

You could improve on this by using a plan library, with heuristics for choosing a particular plan in any given scenario. You can also replan by maintaining an estimate of which canned plan is best at any time, and always performing the best one.
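A minimal sketch of the plan-library idea, with all names hypothetical:

```cpp
#include <vector>

struct WorldState; // whatever state description the game exposes

struct CannedPlan {
    const char* name;
    float (*score)(const WorldState&); // cheap heuristic fit estimate; higher is better
    void (*execute)(const WorldState&);
};

// Call this periodically: re-scoring the library against the current
// state is what gives you replanning for free.
const CannedPlan* BestPlan(const std::vector<CannedPlan>& library,
                           const WorldState& state) {
    const CannedPlan* best = nullptr;
    float bestScore = -1e9f;
    for (const auto& plan : library) {
        const float s = plan.score(state);
        if (s > bestScore) { bestScore = s; best = &plan; }
    }
    return best;
}
```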


Quote:
I guess I'm trying to minimize casualties and, to a lesser extent, maximize the chance of success. Reordering actions sounds a bit complex. I think I'll just continually check the current results of the action and, if it's failing horribly, cancel it (which can trigger some relatively simple and immutable "regroup" goal).


You might find some benefit from reading some literature on partial order planning and schedule debugging.

Cheers,

Timkin

I was thinking yesterday that it must be possible to do GOAP backwards. That is, compare the AI agent's currently perceived state to the ideal (goal) state, and get a list of the compatible or necessary actions (based on their effects). Then you can sort them out by their preconditions and effects. Of course, there'd usually be a number of different ways of satisfying the goal, but at least you wouldn't need to search through the tens of thousands of possible states. Not that this is actually relevant to my question.

Now that I've thought about it a bit more, I see that extremely low-fidelity simulations should suffice for the action costs.
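For instance (just a sketch, and not something from Orkin's system), a Lanchester-style attrition estimate gives a casualty prediction in a few arithmetic steps, which could serve as a low-fidelity simulation for costing an engagement:

```cpp
#include <cmath>

// Estimated friendly casualties if `friendlies` fight `enemies`, with
// per-side effectiveness multipliers (cover, weapons, flanking bonus).
// Uses the Lanchester square law: side strength ~ effectiveness * numbers^2.
float EstimateFriendlyCasualties(float friendlies, float friendlyEff,
                                 float enemies,   float enemyEff) {
    const float us   = friendlyEff * friendlies * friendlies;
    const float them = enemyEff    * enemies    * enemies;
    if (them >= us) return friendlies; // expect to lose the whole squad
    // Survivors of the winning side under the square law.
    const float survivors = std::sqrt((us - them) / friendlyEff);
    return friendlies - survivors;
}
```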

Quote:
Original post by Baiame
I was thinking yesterday that it must be possible to do GOAP backwards. That is, compare the AI agent's currently perceived state to the ideal (goal) state, and get a list of the compatible or necessary actions (based on their effects). Then you can sort them out by their preconditions and effects.


If you use backtracking, as Orkin does for that matter, that's pretty much how GOAP works!

^ Really? Huh, didn't know that. Guess it makes sense though. Pathfinding through all the possible states would be excessive.

Quote:
Original post by Baiame
^ Really? Huh, didn't know that. Guess it makes sense though. Pathfinding through all the possible states would be excessive.


It would be! So basically, you find all unsolved goals, and you only apply actions that may solve one or more of the goals of the current state. After applying an action, you add its unfulfilled preconditions to the new goal list and voila! A new state. There are a few things you need to worry about, but that's the core of it.

For a F.E.A.R.-like game, in the most extreme situations you should not expand more than a few dozen states. In the best situations, you will expand only one state.
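In rough C++, the core regression step might look like this (a simplified illustration with symbolic facts, not Orkin's actual code):

```cpp
#include <set>
#include <string>

using Facts = std::set<std::string>;

struct Action {
    std::string name;
    Facts preconditions;
    Facts effects;
    float cost;
};

// An action is worth expanding only if it solves at least one open goal.
bool Relevant(const Facts& goals, const Action& action) {
    for (const auto& e : action.effects)
        if (goals.count(e)) return true;
    return false;
}

// Regress the goal set through an action: its effects discharge goals,
// and its preconditions not already true become new goals. (Checking
// preconditions against the current state is a simplification that
// assumes established facts persist.)
Facts Regress(const Facts& goals, const Action& action, const Facts& current) {
    Facts remaining;
    for (const auto& g : goals)
        if (!action.effects.count(g)) remaining.insert(g);
    for (const auto& p : action.preconditions)
        if (!current.count(p)) remaining.insert(p);
    return remaining; // empty set means the plan is complete
}
```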

I would strongly suggest looking at HTN planning. I still haven't had time to implement a full HTN system, but on paper it addresses many of these problems. Personally, I believe it is a more intuitive way to structure behaviors for designers, who in my experience think in behavior sequences anyway.
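For flavour, the core HTN idea is just compound tasks that decompose into ordered subtasks, which is the "behavior sequence" framing designers already think in. A hypothetical sketch:

```cpp
#include <string>
#include <vector>

struct Task {
    std::string name;
    bool primitive;             // primitive tasks execute directly
    std::vector<Task> subtasks; // ordered decomposition for compound tasks
};

// "Assault position" decomposes into a sequence a designer can read at a glance.
Task MakeAssaultTask() {
    return Task{"AssaultPosition", false, {
        Task{"SuppressTarget", true, {}},
        Task{"MoveToCover",    true, {}},
        Task{"ThrowGrenade",   true, {}},
        Task{"Advance",        true, {}},
    }};
}
```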

F.E.A.R.'s predecessors, NOLF2 and TRON 2.0, did have dynamic priority, and we actually expected to add it to F.E.A.R./Condemned. In practice, it was typically simpler to create multiple actions any time we needed something like this. Part of the reason was that getting the priorities ordered to match designer-specified behavior was hard enough in the first place; as soon as you start making those numbers fuzzy, you introduce more complexity. The one place we may have used it is pathing, as the plan duration varied significantly depending on how far the AI needed to travel.

The problem is that 'cost' is hard to calculate from the large number of factors present in a situation. Even if you have full knowledge of all the objects that will interact and their capabilities, you will likely still have uncertainty about what actions they will take (the potential tactics of the opponent become a factor). Cost itself may be several values whose significance varies with the situation and goals: using up resources to reach the goal may be acceptable in one case and less desirable when preserving resources is itself one of the goals. Taking damage (losing capability) or using up ammo is traded against gaining a 'useful' tactical position or reducing the opposition's capabilities. Time may also be a resource that can be lost (e.g. in a multi-scenario campaign).
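As a sketch of the 'several values' idea (names and weights purely illustrative): keep the cost components separate and let the current goal mix decide how much each one matters, so burning ammo is cheap when resupply is near and expensive late in a campaign.

```cpp
struct CostVector {
    float time;
    float ammo;
    float expectedDamage;
};

// Set by whatever owns the current goal mix; these shift as goals change.
struct GoalWeights {
    float time, ammo, damage;
};

float Scalarize(const CostVector& c, const GoalWeights& w) {
    return w.time * c.time + w.ammo * c.ammo + w.damage * c.expectedDamage;
}
```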

Risk is another consideration. It may take high risk (of not succeeding, of resources lost) to enable the possibility of success when less risky approaches have no chance of success.

Metrics to judge any situation systematically (for cost/risk/gains) can be quite complex and hard to boil down to simple evaluation functions that can be applied at runtime. Just judging the value of gaining a position on a map depends on your known capabilities and your enemy's, and on how the terrain configuration affects those capabilities.

A static situation can be pre-analysed extensively, but a (more) general solution is a major challenge even in a greatly simplified game world.

Expect a need for constant re-evaluation as the situation changes. It may also require tactics such as probing actions to try to gain information, or the cheap acquisition of a 'good' position that can be exploited. You can't really plan everything out ahead of time in a non-deterministic system. Complex plans fall apart very rapidly when key points are not achieved or get delayed. More versatile approaches, which inherently give more options for adapting to a changing situation, are usually more successful.
^ Thanks for those thoughts. Having several different cost values seems a good idea. I don't think I'll bother with "risk", as I don't see how I could implement it.
Quote:
Original post by BrianL
I would strongly suggest looking at HTN planning. I still haven't had time to implement a full HTN system, but on paper it addresses many of these problems. Personally, I believe it is a more intuitive way to structure behaviors for designers, who in my experience think in behavior sequences anyway.

Thanks for the suggestion; I'll look into it.

A brief outline of risk:

Risk can be considered in terms of the behaviour of an individual with regard to betting. Consider a simple lottery offering a probability p of winning L dollars and a probability 1-p of winning nothing. The expected reward from this lottery is pL dollars. Now, assume that I offer you pL dollars not to play the lottery. You are risk-neutral if you are indifferent between the lottery and my offer of pL. You are risk-taking if you would prefer the lottery, or would still prefer it even against a higher offer (>pL). Finally, you are risk-averse if you would always accept my offer, or even a lower offer (<pL), rather than play the lottery. To summarise: a risk-taking person will always gamble if the possible win is greater than the offer, a risk-averse person will always prefer the sure thing, and a risk-neutral person is in the middle.

So what does this mean for a goal-directed planning agent? It means that if the outcomes of actions are uncertain, a risk-taking agent will always choose actions that offer maximum utility (minimum cost); a risk-averse agent will choose actions with the highest probability of success; and a risk-neutral agent will choose actions that maximise expected utility (minimise expected cost).
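To make that concrete, here is a small C++ illustration of the three attitudes choosing among actions with a single uncertain success/failure outcome (all names are hypothetical):

```cpp
#include <cstddef>
#include <vector>

struct UncertainAction {
    float successProb;  // p
    float utilityOnWin; // L (utility if the action succeeds; 0 otherwise)
};

enum class RiskAttitude { Taking, Neutral, Averse };

std::size_t Choose(const std::vector<UncertainAction>& actions, RiskAttitude a) {
    std::size_t best = 0;
    float bestScore = -1e9f;
    for (std::size_t i = 0; i < actions.size(); ++i) {
        float score = 0.0f;
        switch (a) {
            case RiskAttitude::Taking:  // chase the maximum possible payoff
                score = actions[i].utilityOnWin; break;
            case RiskAttitude::Neutral: // maximise expected utility pL
                score = actions[i].successProb * actions[i].utilityOnWin; break;
            case RiskAttitude::Averse:  // prefer the surest thing
                score = actions[i].successProb; break;
        }
        if (score > bestScore) { bestScore = score; best = i; }
    }
    return best;
}
```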

Cheers,

Timkin

Quote:
It means that if the outcomes of actions are uncertain...

Action outcomes will be uncertain, but only insofar as the AI is ignorant of the environment (the game is "close enough" to deterministic). That's what I suspect is too hard to implement.

Quote:
Original post by Baiame
Quote:
It means that if the outcomes of actions are uncertain...

Action outcomes will be uncertain, but only insofar as the AI is ignorant of the environment (the game is "close enough" to deterministic). That's what I suspect is too hard to implement.


Determinism doesn't exclude uncertainty. Clearly it would be unreasonable to explicitly express the entire domain state or to exactly model the decision processes of the enemy. Using a truncated state description and approximate decision processes (which isn't too hard to do) introduces uncertainty and hence requires that we make decisions under uncertainty.

Cheers,

Timkin
