Animation states and blend trees

Started by
14 comments, last by RobM 9 years, 7 months ago

In a game with a complex animation system for a player, when using an animation blend tree, would one tree be used for all possible animations? Or would there be, for example, one blend tree for idling, one for walking/running, one for punching, kicking, or whatever moves there are? I've created a binary animation blend tree system which is pretty flexible but I'm not sure I should be making it so complex to contain all of my animations.

And also, how does the animation state machine relate to the blend tree?

Thanks


Generally you have a state machine with different states. Each state can be, for example, a single motion, another state machine, or a blend tree.

This way you can build hierarchical systems that are much clearer and easier to understand visually.

Typically you would see an Idle state, which can be a state machine that picks a random idle motion, for example. You can also make it a blend tree in which you place a state machine. The output of the state machine that picks the idle motion can then act as input to, say, a look-at node, which makes the character look at a given goal and so modifies the idle motions.

Then next to Idle there can also be some Moving state, which is most likely a blend tree in which you blend different animations together, based on speed, turn angle, etc.

So to answer your first question: no, in modern games you wouldn't really put everything in one blend tree or state. Technically you could, but in practice you would make it hierarchical, where a blend tree can contain state machines as well. Generally you have one root state machine in which you place things like Idle, Moving, Attacking, etc.
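As a rough illustration of that hierarchy, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than any particular engine's API: a pose is reduced to a dict of joint angles, and `Clip`, `Blend` and `StateMachine` are hypothetical names. The only point is that all three expose the same `evaluate` call, so any of them can be used as a state.

```python
from dataclasses import dataclass
from typing import Dict

Pose = Dict[str, float]  # hypothetical: joint name -> single rotation angle


def lerp_pose(a: Pose, b: Pose, t: float) -> Pose:
    """Linearly interpolate two poses that share the same joints."""
    return {joint: (1.0 - t) * a[joint] + t * b[joint] for joint in a}


@dataclass
class Clip:
    """Leaf state: a single motion, reduced here to one fixed pose."""
    pose: Pose

    def evaluate(self, params: Dict[str, float]) -> Pose:
        return self.pose


@dataclass
class Blend:
    """Minimal blend tree: mixes two child nodes by a named parameter."""
    a: object
    b: object
    param: str

    def evaluate(self, params: Dict[str, float]) -> Pose:
        t = min(max(params.get(self.param, 0.0), 0.0), 1.0)
        return lerp_pose(self.a.evaluate(params), self.b.evaluate(params), t)


@dataclass
class StateMachine:
    """A state machine is itself a node, so it can be nested as a state."""
    states: Dict[str, object]
    current: str

    def evaluate(self, params: Dict[str, float]) -> Pose:
        return self.states[self.current].evaluate(params)


# Root state machine holding a nested state machine (Idle) and a blend tree (Moving).
idle = StateMachine({"idle_1": Clip({"spine": 0.0}), "idle_2": Clip({"spine": 10.0})}, "idle_1")
moving = Blend(Clip({"spine": 20.0}), Clip({"spine": 40.0}), param="speed")
root = StateMachine({"Idle": idle, "Moving": moving}, current="Idle")

idle_pose = root.evaluate({})
root.current = "Moving"
moving_pose = root.evaluate({"speed": 0.5})
```

Transitions are left out here on purpose; the sketch only shows how nesting works.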

I hope that also answered your question about how state machines relate to blend trees. If not, please explain what exactly you mean in further detail :)

Thanks Buckshag.

The thinking behind my one animation blend tree is that I can conceivably have all my complex animations and blends in there. All I would need to give my player any animation is to supply some scalar values to the blend tree including blends between such and such or choices of this clip over that one, etc. I thought then I could have a separate state machine that contained the values I'd need to pass to the blend tree in order to get the relevant pose.

So in the blend tree you might have this (for my snowboarder character):

b1) IDLE
    b1) Jump/board on/off
        c1) Jump
        c2) Put board on
        c3) Take board off
    b2) Idle
        c1) Idle 1
        c2) Idle 2
        c3) Idle 3
b2) MOVE
    c1) board off
        c1) Motion
            b1) Walk
            b2) Run
        c2) Misc
            c1) Come to stop
            c2) Jump
    c2) board on
        b1) Motion
            b1) Upright stance
            b2) Medium stance
            b3) Crouched stance
        b2) Jump
            b1) Jump initiation
                c1) Ollie
                c2) Nollie
            b2) Grab
                c1) Indy
                c2) Tail
                c3) Nose
                c4) Mute
            b3) Style out (Depends on grab)
        b3) Land
        b4) Fall

Where an entry starting with b (e.g. b1)) is a blend node and c (e.g. c1)) is a choice node (choice node meaning out of these nodes only one clip can be played). So by passing in around 3-5 values into the blend tree, I can pretty much come up with any animation for my character.
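To make the blend/choice distinction concrete, here is a small Python sketch of a fragment of that outline. It is only an assumption about how such nodes might be written: poses are reduced to a list with one float, the parameter names (`crouch`, `grab`, `board_on_mode`) are invented for the example, and the stance blend is cut down to two inputs.

```python
from typing import Dict, List

Pose = List[float]  # hypothetical stand-in for a full skeleton pose
Params = Dict[str, float]


def clip(pose: Pose):
    """Leaf node: always outputs the same pose (a real clip would sample by time)."""
    return lambda params: pose


def choice(param: str, children):
    """Choice node: the parameter is an index; exactly one child is evaluated."""
    return lambda params: children[int(params[param])](params)


def blend(param: str, a, b):
    """Binary blend node: the parameter is a weight in [0, 1]."""
    def evaluate(params: Params) -> Pose:
        w = min(max(params[param], 0.0), 1.0)
        pa, pb = a(params), b(params)
        return [(1.0 - w) * x + w * y for x, y in zip(pa, pb)]
    return evaluate


# Fragment of the "MOVE / board on" branch: grabs are a choice, stances are a blend.
grab = choice("grab", [clip([0.0]), clip([1.0]), clip([2.0]), clip([3.0])])  # Indy, Tail, Nose, Mute
stance = blend("crouch", clip([10.0]), clip([20.0]))  # upright -> crouched
board_on = choice("board_on_mode", [stance, grab])

riding_pose = board_on({"board_on_mode": 0, "crouch": 0.25, "grab": 2})
grab_pose = board_on({"board_on_mode": 1, "crouch": 0.0, "grab": 2})
```

Three scalars are enough to address any leaf in this fragment, which is the property described above.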

Storing each set of values required for a particular animation in a state machine state would allow me to just use the state machine and then, depending on what state I'm in, play that particular animation.

Have I over-complicated the use of blend trees?

I just read some Unity documentation where they talk about animation states, transitions and blend trees. I think I've got a clearer picture now

So a state can be just an animation clip or the result from a blend tree and the transition between states can be a blend from one state to another. Does that sound about right? It seems like it fits my requirements at least

Yes, that sounds correct :)

A state can be anything that outputs a pose without needing a real input, other than some settings you configure for that state (like which motion to use).

So anything that 'generates' an output pose can act as a state. Instead of a predefined motion node you could also have, say, a procedurally generated motion node that does some physics-based walking motion.

These states can be connected with transitions. These transitions can have conditions on them, which define when transitions should be activated. For example if you have an idle state and a move state, you can make a transition from Idle to Move, and put some condition on it, which triggers when the character 'speed' parameter becomes bigger than zero.

So when your game passes a value of, say, 0.5 for speed, the state machine automatically makes the transition from idle to move. During the transition the output poses of both the idle and the move state are calculated, and a blend is done between them.
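A minimal Python sketch of that idle-to-move transition follows. It is an illustration under stated assumptions, not how any specific middleware does it: each state's pose is a single float, the cross-fade is linear, and `Transition` and `StateMachine` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Params = Dict[str, float]


@dataclass
class Transition:
    target: str
    condition: Callable[[Params], bool]
    duration: float  # seconds of cross-fade


class StateMachine:
    def __init__(self, poses, transitions, start):
        self.poses = poses              # state name -> pose value (a float here)
        self.transitions = transitions  # state name -> list of Transition
        self.current = start
        self.active: Optional[Transition] = None
        self.elapsed = 0.0

    def update(self, params: Params, dt: float) -> float:
        if self.active is None:
            # Conditions are checked against the parameters once per update.
            for t in self.transitions.get(self.current, []):
                if t.condition(params):
                    self.active, self.elapsed = t, 0.0
                    break
        if self.active is None:
            return self.poses[self.current]
        self.elapsed += dt
        w = min(self.elapsed / self.active.duration, 1.0)
        # Both states are evaluated and blended while the transition runs.
        pose = (1.0 - w) * self.poses[self.current] + w * self.poses[self.active.target]
        if w >= 1.0:
            self.current, self.active = self.active.target, None
        return pose


sm = StateMachine(
    poses={"idle": 0.0, "move": 10.0},
    transitions={"idle": [Transition("move", lambda p: p["speed"] > 0.0, 0.5)]},
    start="idle",
)
first = sm.update({"speed": 0.0}, 0.25)   # condition false, stays in idle
second = sm.update({"speed": 0.5}, 0.25)  # transition starts, halfway through the fade
third = sm.update({"speed": 0.5}, 0.25)   # fade completes, now in move
```

The game only writes `speed`; it never tells the state machine which state to enter.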

The move state could be a blend tree. Blend trees have some final node to which you connect. This final node then represents the output pose of the blend tree.

A state can also be another state machine. So the output of that state would be the output of the state machine it represents.

This way you can build hierarchies. I think Unity 5 will also support that. I think it's amazing that they didn't/don't support hierarchies yet, as without them it is almost useless or impossible to manage.

I have seen some clients of our animation middleware, EMotion FX, with graphs of over 1500 nodes. Imagine having to place all of that in one state machine. It would end up as one big spaghetti of transitions :)

Btw, to answer your question about whether you have over-complicated the use of blend trees: you seem to pick between different motions quite frequently.

In theory you wouldn't need a blend tree for that, but in practice a blend tree can do that as well.

You could either make that a state machine that picks the right motion or pose based on say what kind of stance you are in, or you could make a blend tree which has some blend node which takes multiple input motions, and your weight will represent which motion to pick. The blend node can automatically blend between them. Or you make a node that picks a given motion based on an input value, without blending.
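The two node types described above could look roughly like this in Python. This is a sketch under assumptions: poses are single floats, the weight runs from 0 to the number of inputs minus one, and `blend_n` and `select` are names made up for the example.

```python
def blend_n(poses, weight):
    """Blend node with several inputs: interpolate between the two adjacent inputs.

    Needs at least two inputs; weight runs from 0 to len(poses) - 1.
    """
    weight = min(max(weight, 0.0), len(poses) - 1)
    i = min(int(weight), len(poses) - 2)
    t = weight - i
    return (1.0 - t) * poses[i] + t * poses[i + 1]


def select(poses, weight):
    """Select node: pick the nearest input for the same weight, without blending."""
    clamped = min(max(weight, 0.0), len(poses) - 1)
    return poses[int(round(clamped))]


stances = [0.0, 10.0, 30.0]  # upright, medium, crouched (placeholder pose values)
blended = blend_n(stances, 1.5)
picked = select(stances, 1.2)
```

Stances suit the first node because intermediate poses make sense; grabs or jump types suit the second because a half-Indy, half-Tail pose does not.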

For your jump types you would most likely want a state machine rather than blend tree, as you are not likely to blend between different types of jumps.

Thanks for the explanation, really appreciate it. With what you've said I think I've got enough to go forward now.

I've been doing some work on the state machine tonight and I have a couple more questions...

Firstly, I obviously need some form of instance data with regard to my states, or at least the current state I'm in. For example, I have my player character and as an entity, he has a component which is a state machine. He is idling and so the current state in his state machine is the idle state. In that state, I'll need to have some record of the local time for the idle animation that is playing. That's all well and good, but if I have two players on the screen, would they each have their own state machine? I'm wondering whether I keep the data separate from the structure of the state machine, although I'm not really sure how that would work. A lot of the data in the state machine states would be common (fade time, fade type, animation clip name); it's only animation clip positions that would need to be stored, so copying the whole structure seems like a waste of memory. Also, if a state uses a blend tree, there may be lerp positions that need to be kept track of.

Secondly, from an animation perspective, I understand the states and transitions that can occur but what should control the state machine? I mean if it's entirely data-driven (which I intend it to be), some game logic somewhere has to know that when the player is in the idle state, he can walk or run (for example). I know the data-driven state machine can represent this easily, but how would the game logic control that? My initial thought was that the current state of each state machine would handle events so, for example, when the user presses the up button on the controller, that event is fed through to the player's state machine (it's a component after all which can receive events) and the state knows (through data) that in this instance, pressing up in the idle state transitions to the walk forward state.

So the state data might be something like (in very simplistic terms):

<state name="idle" animation="idle_anim">
    <transition name="idle-walk" type="fade" timeSpan="0.5" newState="walk" actionEventType="controller_up" velocity="controller_up_distance"/>
</state>
<state name="walk" animation="walk">
    <transition name="walk-idle" type="fade" timeSpan="0.5" newState="idle" actionEventType="controller_centre"/>
    <transition name="walk-run" type="fade" timeSpan="0.5" newState="run" actionEventType="controller_up,controller_trigger"/>
</state>

Is this how it is generally done? I can see it working this way and if any extra game logic is required, it can perhaps be hard-coded based on actionEventType. An example of this might be to add some IK for feet touching the ground when a player walks or runs.

Thanks

You do indeed have shared data and unique data per character, per anim graph node.

The play offset is unique, as it differs per character, while the transition time is shared. I do not clone the full state machines. That would also be possible, but it is less memory efficient because so much of the data is shared.
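One way to picture the shared/unique split is the Python sketch below. It is an assumption about layout, not a description of any specific middleware: `StateDef` holds the shared, read-only fields and `GraphInstance` holds only what differs per character, both names being hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class StateDef:
    """Shared, read-only description of a state (loaded once from data)."""
    name: str
    clip_name: str
    clip_length: float  # seconds
    fade_time: float


@dataclass
class GraphInstance:
    """Per-character data: only what differs between characters."""
    graph: Dict[str, StateDef]  # reference to the shared definitions, not a copy
    current: str
    play_offset: float = 0.0    # local time within the current clip

    def update(self, dt: float) -> None:
        length = self.graph[self.current].clip_length
        self.play_offset = (self.play_offset + dt) % length


shared = {"idle": StateDef("idle", "idle_anim", clip_length=2.0, fade_time=0.5)}
player_one = GraphInstance(shared, "idle")
player_two = GraphInstance(shared, "idle", play_offset=1.0)
player_one.update(0.5)
player_two.update(1.5)  # wraps around the 2.0 second clip
```

Blend weights for a blend tree state would sit next to `play_offset` in the instance, since they also differ per character.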

The game is setting parameter values, for example a "button_up_pressed" parameter could be a boolean. Then the state machine can make a transition when that button is pressed using some condition that checks for that boolean to become true. At least that is one way to do it.

Your XML would work to describe a simple state, although there can be more state types than just a single animation.

You also need a set of conditions on each transition, rather than just a single actionEventType condition, so I would make that part more flexible.
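As one possible shape for that, the Python sketch below parses a transition that carries several conditions. The `<condition>` element, its `param`/`op`/`value` attributes and the parameter names `speed` and `sprint` are all invented for this example; they are not part of the XML posted above.

```python
import operator
import xml.etree.ElementTree as ET

# Hypothetical schema: each transition holds any number of <condition> children.
DATA = """
<state name="walk" animation="walk">
  <transition name="walk-run" type="fade" timeSpan="0.5" newState="run">
    <condition param="speed" op="gt" value="0.6"/>
    <condition param="sprint" op="eq" value="1"/>
  </transition>
</state>
"""

OPS = {"gt": operator.gt, "lt": operator.lt, "eq": operator.eq}


def next_state(state_xml, params):
    """Return newState of the first transition whose conditions all hold, else None.

    Note that a transition without any conditions would always fire.
    """
    state = ET.fromstring(state_xml)
    for transition in state.findall("transition"):
        conditions = transition.findall("condition")
        if all(OPS[c.get("op")](params[c.get("param")], float(c.get("value")))
               for c in conditions):
            return transition.get("newState")
    return None


running = next_state(DATA, {"speed": 0.8, "sprint": 1.0})
walking = next_state(DATA, {"speed": 0.8, "sprint": 0.0})
```

In a real system the XML would be parsed once at load time rather than on every check.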



I also separate static and dynamic data. That is useful not only here but in general (i.e. also for skeletons).



As mentioned by Buckshag, the state machine as well as the various blend trees are controlled by a set of variables (these variables count as dynamic data, of course). Nothing inside the animation system should directly inspect input or observe events. Words like "button_up" in animation transition conditions hint at a design flaw.

Input should be decoupled because the animation system by itself is not necessarily player driven. The same animation system can perhaps be used to animate an NPC. What is often called a player controller inspects input and reacts by setting variables when a specific input situation is detected. These variables are an abstraction and have a meaning w.r.t. the character, e.g. they mean "walk forward with speed x", "jump", "pick up that object", and so on, but never "button x is pressed". This helps not only to keep the animation system clean, but also to have a dedicated place (outside the animation system, just to emphasize this once more) to deal with input configurations.
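A small Python sketch of that separation, with every name in it (`PadState`, `CharacterIntent`, `player_controller`, `npc_controller`) being hypothetical: both controllers fill in the same abstract variables, and only the player controller ever sees an input device.

```python
from dataclasses import dataclass


@dataclass
class PadState:
    """Hypothetical snapshot of already-remapped input."""
    left_stick_y: float = 0.0  # -1 .. 1
    jump_pressed: bool = False


@dataclass
class CharacterIntent:
    """Abstract variables the animation system reads; no button names here."""
    move_speed: float = 0.0
    wants_jump: bool = False


def player_controller(pad: PadState, max_speed: float) -> CharacterIntent:
    # Translate the input situation into what the character should do.
    forward = max(pad.left_stick_y, 0.0)
    return CharacterIntent(move_speed=forward * max_speed, wants_jump=pad.jump_pressed)


def npc_controller(distance_to_goal: float, max_speed: float) -> CharacterIntent:
    # An NPC fills in the same variables without any input device.
    return CharacterIntent(move_speed=max_speed if distance_to_goal > 1.0 else 0.0)


player_intent = player_controller(PadState(left_stick_y=0.5), max_speed=4.0)
npc_intent = npc_controller(distance_to_goal=10.0, max_speed=4.0)
```

The animation state machine would then put its transition conditions on `move_speed` and `wants_jump`, so the same graph can drive either kind of character.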

Regarding event observation (although not explicitly mentioned in the question): the animation system is called at one or more specific moments during each pass through the game loop. When the animation system's update() is called, it's the natural moment for state transitions to be checked. Not only is there no need to push events into the state machine, it is even counterproductive to do so.

Thanks again guys.

I was using "controller_up" as an example. I already have a decoupled architecture for input, but it's a very valid point.

Now the thing is, for my player animation design, pressing up on, say, the left thumbstick does a different movement or action depending on which state you're in. For example, when you're stood on your snowboard idling, pressing the left thumbstick forward makes the player shuffle along to start moving. When the player is jumping, pressing the left thumbstick makes him tilt forward, and so on.

So I have to have some game logic somewhere that has this information, whether it be data-loaded or script-based, but isn't a data-driven state machine the perfect place for this? So instead of passing an event as you've mentioned, the current state looks at its loaded data and checks the state of the [decoupled] input flags. (Let's assume that the player can configure their controller as they see fit and that I have decoupled the physical sticks and buttons from the internal representation)

So for example, in the idle state, we may have loaded up the following simplistic data:

<state name="idle_board_on">
<transition name="idle_board_on-shuffle_forward" triggeredBy="leftstick_up"/>
</state>
<state name="jump_board_on">
<transition name="jump_board_on-tilt_forward" triggeredBy="leftstick_up"/>
</state>

I've missed out the rest of the attributes for brevity but hopefully that makes sense?

If I need to decouple this from the state machine completely then it feels like it may be duplicating logic elsewhere?

This topic is closed to new replies.
