help me understand utility AI

Started by
34 comments, last by Alturis 7 years, 1 month ago

It seems like common practice is to come up with a utility function for every action... but that seems like a LOT of actions. Let alone if I start creating utility functions for "two-in-one" actions.

You can create functions like that, just as you can create unique classes for everything rather than using data variables, hard-code all your data, or write programs that avoid data-driven design entirely.

But most developers choose not to.

Most developers choose to create a small number of functions that operate on data, then allow other people (in this case game designers, level designers, and such) to modify the data. The simple function takes whatever data it is given and computes a score from the data.

The AI's job is to look at the top few scores, maybe just the single top score or maybe the top 1% or 5% of the scores, and use that as the decision.

You can, if you choose, write a unique utility function for every possible action in your game. I would advise against it, instead preferring the pattern of having everything driven by simple pieces of data: a collection of modifiers as data, plus a collection of the system's state as data, combined into a single scalar value. But even so, you can do it as a bunch of functions if you choose.
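To make that pattern concrete, here is a minimal sketch of one generic scoring function driven entirely by data. The `Consideration`/`Action`/`WorldState` names are my own illustration, not from any particular engine; the idea is just that designers edit the consideration lists, and the same code scores every action.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// One designer-editable piece of data: which world-state input to read,
// and how heavily it counts toward the score.
struct Consideration {
    std::string input;  // key into the world state, e.g. "hunger"
    float weight;       // designer-tuned multiplier
};

struct Action {
    std::string name;
    std::vector<Consideration> considerations;
};

// World state as data: named values, assumed normalized to [0, 1].
using WorldState = std::unordered_map<std::string, float>;

// The single generic function: a weighted sum of the state values
// this action's considerations name. No per-action code required.
float ScoreAction(const Action& action, const WorldState& state) {
    float score = 0.0f;
    for (const Consideration& c : action.considerations) {
        auto it = state.find(c.input);
        if (it != state.end())
            score += c.weight * it->second;
    }
    return score;
}

// The AI's decision: score every candidate action, take the highest.
const Action* PickBest(const std::vector<Action>& actions,
                       const WorldState& state) {
    const Action* best = nullptr;
    float bestScore = -1.0f;
    for (const Action& a : actions) {
        float s = ScoreAction(a, state);
        if (s > bestScore) { bestScore = s; best = &a; }
    }
    return best;
}
```

With this shape, adding a new action is a data change (a name plus a list of considerations), not a code change.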



Yeah, I think that makes more sense to me. Use utility to set goals, which drive a "family" of actions. Goals like "find food", "fall back", "rise in rank", and "take revenge".

  • "Find food" might simply eat what the NPC has, or hunt or forage, or even try to ration what food the NPC has left, depending on availability.
  • "Fall back" might just take cover or call for help if those options make sense, but full on retreat is always a last resort.
  • "Rise in rank" might challenge a weak peer in the hierarchy, or might build an army to ambush him, or might detour the NPC to train up until they're ready.
  • "Take revenge" might attack someone on the spot if no enemies/guards will stop them, but otherwise they might order an assassination if they have that power, or consider them an enemy and bide their time, or simply deal with it through insult or humiliation.

So how would you incorporate a potential decision into the rest in order to pick the one with the best score if you don't score it in the first place? You have to score them somehow. And the most relevant thing to do is to score them on what might be important regarding whether you would make that decision or not. That's pretty much the entire point of utility based AI.

... but then I'm sort of seeing the other point. Once I have determined a suitable goal, am I going to pick a suitable action for that goal from a behavior tree -- essentially hard coded logic? Or should I look at a more fluid series of utility scores? The latter makes sense: once the NPC decides their goal is to rise in rank, I do a more detailed utility scoring of the top two or three people they could challenge, and discover that one of them is also a hated enemy.

(I really appreciate all this guidance, btw.)

I guess my last question: is this sort of "two stage analysis" something that actually has logical problem solving value, leading to better decisions and reducing computational complexity? Or will implementing it be mathematically equivalent to just considering it all at once -- all the ways of finding food vs all the ways of taking revenge vs all the ways of rising in rank (etc)?

It is one function: you pass in data for the motives, and you pass in data for how each action satisfies those motives.

If your motives indicate that eating food satisfies the goal, food actions will score high. If your motives indicate that hunting satisfies the goal, hunting actions will score high. If your motives indicate that targeting a specific character for attack satisfies the goal, those actions will score high.

The videos linked to earlier explain it all well enough, and it is quite portable. Pass in the motives, pass in the current state, and the result is a number that says how strongly the two match. Take the highest score, or perhaps choose randomly among the top few highest scores.
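As a sketch of that last step (the names here are illustrative): score everything, then pick uniformly among the top-k instead of always taking the single best, so agents with similar state don't all behave identically.

```cpp
#include <algorithm>
#include <random>
#include <string>
#include <vector>

// A scored candidate: an action name and its computed utility.
struct Scored {
    std::string action;
    float utility;
};

// Sort by utility descending, then pick uniformly among the top k.
// Passing by value so the caller's list is left untouched.
std::string PickAmongTop(std::vector<Scored> scored, size_t k,
                         std::mt19937& rng) {
    if (scored.empty()) return "";
    std::sort(scored.begin(), scored.end(),
              [](const Scored& a, const Scored& b) {
                  return a.utility > b.utility;
              });
    size_t n = std::min(k, scored.size());
    std::uniform_int_distribution<size_t> pick(0, n - 1);
    return scored[pick(rng)].action;
}
```

With k = 1 this degenerates to "take the highest score"; k = 3 or so gives mild, cheap variety.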

Let's say I have a utility function called "challenge someone for rank". It's calculated as the largest expected value of victory against someone my rank or higher. If I reach a certain threshold, I trigger a behavior tree that selects between several actions.
The obvious action is "challenge the best option for victory".


You didn't understand my previous post. I thought I was quite specific in that you would not have different actions for different motivations. You have different actions and they each may satisfy any, some, or all of your motivations, and it's that which is represented in the utility function.

This is why your fear about having to maintain too many utility functions is misplaced; you don't need a different function for every combination of action and motivation. Each action has a function, and that includes all motivations or other influences on utility.

Note that 'utility function' uses the word function in the mathematical sense; i.e. it maps an input (the action and the current environment) to an output (a utility score). It doesn't mean function in the computer programming sense, where each one would need to be a separately coded function.

But the system would also consider actions that challenge the second best option under a few circumstances


No, it should consider them under ALL circumstances. The whole point is that the system considers all these possibilities, then picks the one that scores the highest.

Thanks guys... this is helping me to get past my intuitions.

Processing the problem myself, I'd want to break it apart into two stages (1: what's the goal that satisfies my main needs? 2: what's the action that satisfies the goal, and maybe even knocks off a few other needs in the process). But it seems like I could just flatten it and skip right to stage two: for every action I could take, what is each action's impact on my needs/goals?

Like I said, it seems like a lot of functions -- one for every action. But it seems like any method of problem solving would get to that level of detail eventually.

Your two-stage approach appears to want to pick a current high-level strategy, and then pick a low-level behaviour to meet that strategy. That's a reasonable approach, but it is not the way utility-based AI is intended to work, which is probably why there has been confusion. And we've shown examples where picking a goal first actually makes it harder to look intelligent, because sometimes one action satisfies 2 goals simultaneously, better than any action could satisfy 1 single goal.

Utility based AI is simply this: "Pick a behaviour for the character based on how much expected utility that behaviour will provide".

Two parts of the sentence can be implemented in different ways:

  • "pick" usually means "select the action with the highest expected utility" but it can be "select one of the top-ranked actions" or whatever.
  • "expected utility" usually means "run the state of the world through a few pre-determined functions - hand-selected for that action type - to return a value". But it could have some sort of planning stage if necessary. Usually it does not.

Usually it's sufficient to write one subroutine for each type of action, which returns the utility for it. Most games have fewer than 20 types of action so this is not hard to do. The data in each case may differ, however. For example, "Kill Donald" and "Kill Adam" are both the same type of action ("Kill") but the function would return different values based on the values associated with Donald and Adam. To continue the example from earlier:


// Returns the expected utility of killing this target,
// combining per-target sub-scores into one value.
float KillCharacterAction::EvaluateUtility(const Character& target)
{
    float statusScore = target.GetStatus();  // how important the target is
    float familyScore = target.GetFamily();  // family ties / feud value
    float overallUtility = statusScore + familyScore;
    return overallUtility;
}

Obviously it would be better if this was data-driven so that designers could edit it, including altering the way that the sub-scores are combined and scaled. But generally speaking each utility function can start out as simple as this, and you can have them all written in a day or so.
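As a sketch of what "data-driven" might look like here: the weights used to combine the sub-scores live in a table that could be loaded from a designer-edited file. The `Character` fields mirror the example above; the weight table itself is a hypothetical addition, not something from the original post.

```cpp
#include <string>
#include <unordered_map>

// Stand-in for the Character from the earlier example.
struct Character {
    float status = 0.0f;
    float family = 0.0f;
    float GetStatus() const { return status; }
    float GetFamily() const { return family; }
};

// Designer-edited weights, e.g. loaded from a config file at startup.
using Weights = std::unordered_map<std::string, float>;

// Same utility as the hard-coded version, but the combination is driven
// by the weight table: changing balance is a data edit, not a recompile.
float EvaluateKillUtility(const Character& target, const Weights& w) {
    auto get = [&](const char* key) {
        auto it = w.find(key);
        return it != w.end() ? it->second : 0.0f;
    };
    float utility = 0.0f;
    utility += get("status") * target.GetStatus();
    utility += get("family") * target.GetFamily();
    return utility;
}
```

A fuller version would also let designers choose response curves and how sub-scores combine (sum, product, max), but even this small step moves the tuning out of code.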

I guess this is the thing. We're not just talking about utility in a finite combat situation, but in a grand simulation sense. (Yeah, I know this is pretty ambitious -- Dwarf Fortress, The Sims, CK2, Prison Architect, and RimWorld as references.)

If I effectively multiply the number of actions by the number of characters, that's a LOT of actions.

I guess the best way to cull some of the actions is to only consider killing a finite list of "enemies". Or only consider killing the top 5 people I dislike. Similarly, with food I might consider only sources within range, plus whatever I have at home.

Even if there are 100 or more characters to consider killing/helping/bribing/whatever, that's still far fewer calculations than a typical physics engine is doing each frame; and you don't have to run this every frame.

You only need to evaluate it when an actor must find something to do. Usually that is infrequent; there may be many seconds between decisions, maybe 30 or 40 seconds, whatever.

When it must be done, you can limit it to a small amount of work per frame, perhaps comparing 1000 or 5000 actions if you have that many. That is likely more than enough to process the items in the world, but if you've got more available actions than that, you can either settle for one of the items already scored or continue processing the list on the next pass through the game loop, repeating until you've evaluated all your choices.

It can also be limited to processing a single actor per update to prevent a massive stall after a load. Instead of every actor doing all the work in that first instant, it may take a second or two before all the actors in the world have their activities selected.
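A minimal sketch of that budgeted, resumable evaluation (the class and its fields are illustrative): each call scores at most `budget` candidates and remembers where it left off, so the full list can be spread across several frames.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// One candidate action, already paired with its computed utility.
struct Candidate {
    int id;
    float utility;
};

// Evaluates a candidate list incrementally: each Update() processes at
// most `budget` entries, resuming on the next call until done.
class BudgetedChooser {
public:
    explicit BudgetedChooser(std::vector<Candidate> candidates)
        : candidates_(std::move(candidates)) {}

    // Returns true once every candidate has been considered.
    bool Update(size_t budget) {
        size_t end = std::min(next_ + budget, candidates_.size());
        for (; next_ < end; ++next_) {
            if (bestIndex_ == SIZE_MAX ||
                candidates_[next_].utility > candidates_[bestIndex_].utility)
                bestIndex_ = next_;
        }
        return next_ == candidates_.size();
    }

    // Best candidate seen so far; -1 if nothing evaluated yet.
    int BestId() const {
        return bestIndex_ == SIZE_MAX ? -1 : candidates_[bestIndex_].id;
    }

private:
    std::vector<Candidate> candidates_;
    size_t next_ = 0;
    size_t bestIndex_ = SIZE_MAX;
};
```

The same shape handles the "one actor per update" idea: keep a queue of actors and give each a `BudgetedChooser` when its turn comes around.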

Most games have fewer than 20 types of action so this is not hard to do.

Caveman has an unusually large number of possible situations it responds to, which is why I use a hierarchical approach to decision making: divide and conquer. Although in the end, they all result in some combo of turning, moving, and attacking.

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

This topic is closed to new replies.
