Skeletal Animation System

23 comments, last by Danicco 10 years ago

Hi, I'm coding the skeletal animation system for my engine, and I'm not really sure what I'm supposed to do or how to do it. I have a vague idea, but I can't put it into code in a way that I like. I'm having trouble with both the conceptual and the technical parts of it.

I have this structure:


class KeyFrame
{
    public:
        unsigned int keyFrame;
        Vector3 keyValues;
};

class Bone
{
    public:
        //Other stuff to deal with hierarchy

        vector<KeyFrame*> keyTranslates;
        vector<KeyFrame*> keyRotates;
};

class Track
{
    public:
        wstring trackName;
        bool trackLoop;
        unsigned int frameStart;
        unsigned int frameEnd;
};

class Skeleton
{
    public:
        vector<Track*> tracks;
        vector<Bone*> bones;

        //skeleton data here
        float boneTranslations[MAX_BONES * 3]; //vec3 trans
        float boneRotations[MAX_BONES * 4]; //quat rots
};

I'm already loading the data I want, so I have something like "Bone01: keyFrame 5 > TranslateX 0.5, keyFrame 10 > TranslateX -0.5", etc.

Each model has a pointer to a skeleton, which has a collection of bones, and I define tracks:


class Model
{
    public:
        Skeleton* skeleton;
        Track* currentTrack; //current playing track
};

//Game Start
Model* myModel = Resources.Load("myModelName");
myModel->skeleton = Resources.Load("mySkeletonName");

//Skeleton::AddTrack(wstring trackName, uint frameStart, uint frameEnd, bool trackLoop);
myModel->skeleton->AddTrack(L"WALK", 5, 10, true);

//Setting this model's current track as WALK
myModel->SetTrack(L"WALK");

So, first I think I need to change the "current track" to support multiple tracks to compose animations (such as WALK + ATTACK) right?

The problem I'm struggling with most is figuring out my "Update Skeleton" function:


void Model::Draw(Camera* camera, float& interpolation)
{
    Shader* shader = material->GetShader();

    skeleton->Update(currentFrame); //this (name matches Skeleton::Update defined below)

    meshSkin->Activate(shader);
    mesh->Render(shader);
    meshSkin->Deactivate(shader);
}

void Skeleton::Update(uint currentFrame)
{
    for(uint i = 0; i < bones.size(); i++)
    {
        //checking ALL translates for that bone
        for(uint j = 0; j < bones[i]->keyTranslates.size(); j++)
        {
            //what should I do here?
            //check if this keyFrame value is greater than or equal to currentFrame, then loop again
            //to search for the previousKeyFrame (if it has one), then interpolate between the two, 
            //then interpolate again to find the middle (currentFrame)?
            //And where/how does deltaTime fit into this? Maybe having a previousFrame as well?

            //Actually, scratch this, this seems bad to do at every render call, 
            //I'd like to avoid doing this if I can and "remember" the animation motion I'm currently at 
            //for that model so I can skip this operation.
            //...or a clever way to handle this
        }
        
        //Now check for rotations as well...
        for(uint j = 0; j < bones[i]->keyRotates.size(); j++)
        {}
    }
}

I'm trying to save these operations by remembering the motion I was in, but then I'd need to store in the model, for each bone, the previous keyFrame and the next keyFrame. I would only need to figure out the next motion when I reach the "nextKeyFrame". But this basically means having a copy of the skeleton's bones in the model... not sure if I should go this way.

I've been thinking about this since yesterday, but I can't figure out how I should proceed from here... any help?


You have an architectural problem: It is wrong in principle if a drawing routine has to update the model. When it's time to render, all animation, physics, collision resolution stuff has been done and a stable situation is reached. At best the skinning may be done then, although I would do so only if the skinning is on-the-fly on the GPU.

So, the solution to animating a skeleton (or any other animatable variable) is to have an animation sub-system. The sub-system manages all of the currently active animation tracks. A track is bound to (e.g.) the orientation part of a bone, another track is bound to the position part of the bone (or, if you use combined tracks, simply to the placement of a bone). When the runloop invokes the update() method of the animation sub-system, the active tracks are iterated, the respective key values are interpolated, and the result is written to the animated target.
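As a rough illustration of the "interpolate the key values" step, here is a minimal sketch that samples one track at a given time. It assumes keys are stored sorted by time and uses plain linear interpolation; Vec3Key, Lerp and SampleTrack are hypothetical names, not taken from any post in this thread. The sample keys mirror the "keyFrame 5 > TranslateX 0.5, keyFrame 10 > TranslateX -0.5" example from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical keyframe: a time stamp plus the value at that time.
struct Vec3Key { float time; Vec3 value; };

inline Vec3 Lerp(const Vec3& a, const Vec3& b, float t)
{
    return Vec3{ a.x + (b.x - a.x) * t,
                 a.y + (b.y - a.y) * t,
                 a.z + (b.z - a.z) * t };
}

// Samples a track whose keys are sorted by ascending time.
// Times outside the keyed range clamp to the first or last key.
inline Vec3 SampleTrack(const std::vector<Vec3Key>& keys, float time)
{
    assert(!keys.empty());
    if (time <= keys.front().time) return keys.front().value;
    if (time >= keys.back().time)  return keys.back().value;

    // Binary search for the first key strictly after 'time';
    // the key right before it is the last one at or before 'time'.
    auto next = std::upper_bound(keys.begin(), keys.end(), time,
        [](float t, const Vec3Key& k) { return t < k.time; });
    auto prev = next - 1;

    float t = (time - prev->time) / (next->time - prev->time);
    return Lerp(prev->value, next->value, t);
}
```

The binary search keeps the lookup cheap even without remembering anything between calls, which may already be enough to ease the worry about doing this every update.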

In practice this is a bit more complex, because of the need for animation blending and perhaps animation layering. (We have a thread here that discusses 2 possible technical solutions and some background.) In the end you need to compute a weighted average of position and orientation for each bone, where choosing the weights decides on blending or layering.
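The weighted average mentioned above could be sketched roughly as follows for a single bone. This is only one possible approach, not the method from the linked thread: positions are averaged by weight, and orientations are summed and renormalized (a normalized-lerp style approximation that behaves reasonably when the rotations are not too far apart). BoneBlend and its members are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Quat { float x, y, z, w; };

// Accumulates weighted contributions for one bone, then normalizes them.
struct BoneBlend
{
    Vec3  posSum{0.0f, 0.0f, 0.0f};
    Quat  rotSum{0.0f, 0.0f, 0.0f, 0.0f};
    float weightSum = 0.0f;

    void Add(const Vec3& p, const Quat& q, float weight)
    {
        assert(weight >= 0.0f);
        posSum.x += p.x * weight;
        posSum.y += p.y * weight;
        posSum.z += p.z * weight;

        // q and -q describe the same rotation; flip the sign if needed so
        // that all contributions lie in the same hemisphere before summing.
        float dot = rotSum.x * q.x + rotSum.y * q.y + rotSum.z * q.z + rotSum.w * q.w;
        float s = (dot < 0.0f) ? -weight : weight;
        rotSum.x += q.x * s;
        rotSum.y += q.y * s;
        rotSum.z += q.z * s;
        rotSum.w += q.w * s;

        weightSum += weight;
    }

    // Returns false if nothing contributed, so the caller can keep the rest pose.
    bool Resolve(Vec3& outPos, Quat& outRot) const
    {
        if (weightSum <= 0.0f) return false;
        float len = std::sqrt(rotSum.x * rotSum.x + rotSum.y * rotSum.y +
                              rotSum.z * rotSum.z + rotSum.w * rotSum.w);
        if (len <= 0.0f) return false;
        outPos = Vec3{ posSum.x / weightSum, posSum.y / weightSum, posSum.z / weightSum };
        outRot = Quat{ rotSum.x / len, rotSum.y / len, rotSum.z / len, rotSum.w / len };
        return true;
    }
};
```

With this shape, blending and layering differ only in how the caller picks the weights passed to Add.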

After all animation tracks are processed, each bone also has its current local transform set. The next step is then probably to compute their global transforms w.r.t. the model's space.

You still may want to implement an explicit Skeleton::update(…). Note that this gives you one place where the animation tracks for a particular skeleton are concentrated, but in the end it is important to respect the overall sequence of updates when processing the runloop. You may want to read this book excerpt over at Gamasutra about the need for a defined sequence of updates. Furthermore, with an animation sub-system that deals with animations in general, skeletons are no longer a special case, and you can use the system for animating other variables, too.


You have an architectural problem: It is wrong in principle if a drawing routine has to update the model.

Oh, but that's been done already. It's just that I want to use interpolation for the movements of the bones as well, which is why I put the update inside the Draw() function.

The currentFrame will be changed during the update, and I want to get the previousFrame > currentFrame interpolation during the draw. But I still haven't done the regular updates so I left the deltaTime/interpolation out and forgot to mention it, I'm sorry about that.


After all animation tracks are processed, each bone also has its current local transform set. The next step is then probably to compute their global transforms w.r.t. the model's space.

If I understood correctly, then I'm supposed to bind all affected bones to a track, then I can iterate only over the affected ones of the current tracks. That seems better than searching the whole skeleton again, I'll do that, thanks!

But still, there's the issue of figuring out where the model is at a given time. I was thinking of "saving" the current time on each bone. For example, if I have an animation called "WALK" that spans frames 5 to 10 and I'm currently at frame 6, I'd keep references to both the keyFrame 5 values and the keyFrame 10 values so I don't have to look them up again all the time.

But this idea crumbled when I realized that I can't "save" these references on the bones, because multiple models can share the same skeleton and be at different times. Then I went to the idea of having a reference to all bones and keyframes per model, but that didn't seem right since I'd have to allocate a new bone/keyFrame structure per Track set on the model...

I'm trying to figure out if there's some really clever way to deal with all of this, either with an algorithm that works out the current keyframe and interpolation values, or with a new or different structure that handles all this better.

And thanks for the link on blending, I'll definitely want to implement something like that and that'll be a good read.

You may not be able to view this article yet, but it might be some good info for you. It's currently in moderator review and should be "released" within a week or two.

In any case, one concept of an animation controller class is briefly as follows:

- The hierarchy maintains data for the character's rest (or pose) position.

- An animation controller maintains data for the character in an animated position.

- The animation controller "data" is comprised of sets of animations, each "animation" being a set of timed key frames associated with a name.

- The animation name does and must represent a bone name in the hierarchy. However, ...

- An animation controller knows nothing about a "hierarchy" and needs only information to ensure the names in the animation data match a list of names. That list of names happens to be a list of bone names from the hierarchy, but that isn't material.

- The animation data set has a period (length of time for the animation set) associated with it. E.g., a 3-second walk sequence. The animation set data and its period is a fixed set of data, and remains so for the life of the animation set.

- The animation controller maintains a set of "tracks," each track having a set of animations associated with it.

- The track maintains the "local" or "current" time of the animation set. That local time (normally) runs from 0 to the period of the associated animation set.

An animation controller may be set up to update either a single animation set, or blend two (or more) animation sets. Blending is the process of (for instance) transitioning from "walk" to "run."

The animation controller has (e.g.) an AdvanceTime( deltaTime ) function. That function checks all the tracks, checking whether they are enabled. For those tracks which are enabled, the track's current time is updated by deltaTime ( currentTime += deltaTime). After some checks on the currentTime of the track versus the period of the associated animation set, a routine is called which interpolates the animation set data appropriate for the current time, and stores the results in a buffer belonging to the track. The buffer is sized to hold interpolated key frame data for every name (in the hierarchy) for a single moment in time, the current time for the track.
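The time-advancing part of such an AdvanceTime( deltaTime ) function might look roughly like the sketch below. It only covers the per-track clock (enabled check, currentTime += deltaTime, and the check against the period); the interpolation into the track's buffer is left as a comment. ControllerTrack, its fields, and the loop/clamp policy are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical per-track state: the changing data lives here,
// while the animation set itself stays fixed.
struct ControllerTrack
{
    bool  enabled = false;
    bool  loop = true;
    float period = 1.0f;       // length of the associated animation set, in seconds
    float currentTime = 0.0f;  // local time, kept within [0, period]
};

struct AnimationController
{
    std::vector<ControllerTrack> tracks;

    void AdvanceTime(float deltaTime)
    {
        for (ControllerTrack& track : tracks)
        {
            if (!track.enabled) continue;
            assert(track.period > 0.0f);

            track.currentTime += deltaTime;
            if (track.loop)
            {
                // Wrap around the period; also handles negative deltaTime.
                track.currentTime = std::fmod(track.currentTime, track.period);
                if (track.currentTime < 0.0f) track.currentTime += track.period;
            }
            else
            {
                // One-shot animation: hold at the first or last frame.
                track.currentTime = std::max(0.0f, std::min(track.currentTime, track.period));
            }

            // Here the animation set would be interpolated at track.currentTime
            // and the result stored in the track's own buffer.
        }
    }
};
```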

When all tracks have been updated, if just a single track is enabled, the interpolated key frame data is used to calculate (for instance) the animation transform (matrix) for a bone and store the matrix in the appropriate place.

If a blend is in progress, then the data in the two appropriate track buffers is interpolated (or in some other way combined), and the animation transform calculated from that result.

EDIT: Note, fixed data is maintained by the hierarchy and the animation sets. Data which changes (times, matrices) is maintained separately.

Please don't PM me with questions. Post them in the forums for everyone's benefit, and I can embarrass myself publicly.

You don't forget how to play when you grow old; you grow old when you forget how to play.

Also helpful for you could be the tutorial I added in the graphics programming section a while ago:

http://www.gamedev.net/topic/654004-gpu-skinned-skeletal-animation-tutorial/

(Perhaps you have already stumbled across it.)


But still, there's the issue of figuring out where the model is at a given time. I was thinking of "saving" the current time on each bone. For example, if I have an animation called "WALK" that spans frames 5 to 10 and I'm currently at frame 6, I'd keep references to both the keyFrame 5 values and the keyFrame 10 values so I don't have to look them up again all the time.
But this idea crumbled when I realized that I can't "save" these references on the bones, because multiple models can share the same skeleton and be at different times. Then I went to the idea of having a reference to all bones and keyframes per model, but that didn't seem right since I'd have to allocate a new bone/keyFrame structure per Track set on the model...

I'll describe the procedure I meant with more details.

An animation like WALK is a resource of the game. It provides a couple of tracks. Each track has a type which defines what kind of variable can be driven. That type may be a vector3, which can hence drive a variable of type vector3. The type may be a quaternion; it may also be a bigger compound like an entire placement. Perhaps the type is tagged with a semantic like position, orientation, normal, and so on; this would give better control over what a track is good for. The animation itself has a duration. All of its tracks can deliver values for the entire duration. The animation is fixed over time; it has no update (remember that it is a resource). Having a RUN besides WALK means that there is a second animation resource.

The set of tracks in an animation is not necessarily dedicated to driving the placement of the bones of a skeleton. Instead, their type (and perhaps semantic) defines what they are able to drive on a more abstract level. Furthermore, there is not necessarily one track per variable exposed by a skeleton. It is totally fine if an animation drives only a subset of the available variables, simply because there are tracks for only a few of them. This is useful e.g. to have an influence on the legs and the hips only.

At runtime there is a PlayAnimation object. Whether this exists in a blend tree or elsewhere is outside the scope of this post. The PlayAnimation holds a reference to a single animation resource, plus the instance-specific parameters needed to request interpolated values from the tracks of the referred animation. It further has a binding structure that tells which track is bound to which variable (of a bone, in your case). That means that all things like animation start time, playback speed adaptation (needed for blending cyclic animations like WALK and RUN to FASTWALK), and so on are managed therein.

Updating the animation then means calling the PlayAnimation objects. Each one computes the normalized time from the current time, animation start, speed adaptation, and animation duration (and perhaps cycle count). It then iterates the bindings, requests each of the bound tracks to compute its value w.r.t. the normalized time, and transfers the result to the bound target variable. The transfer is not just a set. The thread linked to in a post above has details about this.
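The normalized-time computation could look something like this sketch. It assumes "normalized" means a value in [0, 1] over the animation's duration and that looping simply wraps; PlaybackParams and NormalizedTime are hypothetical names, and cycle counting is left out.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical instance-specific parameters held by a PlayAnimation object.
struct PlaybackParams
{
    float startTime;  // global time at which playback started, in seconds
    float speed;      // playback speed factor, 1.0 = authored speed
    float duration;   // duration of the animation resource, in seconds
    bool  looping;    // cyclic animations wrap, others hold the last frame
};

// Maps the global clock to a normalized animation time in [0, 1].
inline float NormalizedTime(const PlaybackParams& p, float now)
{
    assert(p.duration > 0.0f);
    float local = (now - p.startTime) * p.speed / p.duration;
    if (p.looping)
        return local - std::floor(local);  // wraps into [0, 1)
    if (local < 0.0f) return 0.0f;
    if (local > 1.0f) return 1.0f;
    return local;
}
```

Working in normalized time is what makes it straightforward to keep two cyclic animations of different durations in phase while blending them.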

So a particular animation exists once as a resource. At runtime there needs to be one instance of PlayAnimation per currently played animation. If using a blend tree, then all needed PlayAnimation objects always exist, namely as leaf nodes of the tree, but may be set inactive.

A skeleton as a prototype exists once as a resource, if you like. At runtime you need storage for the placements of the bones. This is written to during preparation and when the animation updates. Of course, the conversion from local to skeleton (or model) global transforms also happens here.
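To make the resource/instance split concrete, and to touch on the earlier question about "remembering" where each model is, here is one possible sketch. The key data is shared and never modified; each model owns only a small per-channel cursor that caches the last keyframe index it used. AnimationResource, AnimationInstance, FloatKey and Sample are all hypothetical names, and scalar channels are used just to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct FloatKey { float time; float value; };

// Shared, immutable resource data: one time-sorted key list per channel.
struct AnimationResource
{
    std::vector<std::vector<FloatKey>> channels;
};

// Per-model playback state. Only this small part is duplicated per model.
struct AnimationInstance
{
    const AnimationResource* resource;
    std::vector<std::size_t> cursor;  // per channel: cached index of the key at or before the last sampled time

    explicit AnimationInstance(const AnimationResource& r)
        : resource(&r), cursor(r.channels.size(), 0) {}

    float Sample(std::size_t channel, float time)
    {
        const std::vector<FloatKey>& keys = resource->channels[channel];
        assert(!keys.empty());
        std::size_t& i = cursor[channel];

        // Time moved backwards (for example the loop wrapped): restart the search.
        if (keys[i].time > time) i = 0;
        // Usually zero or one step: advance while the next key is still at or before 'time'.
        while (i + 1 < keys.size() && keys[i + 1].time <= time) ++i;

        // Before the first key or at/after the last key: clamp.
        if (i + 1 == keys.size() || time <= keys[i].time) return keys[i].value;

        const FloatKey& a = keys[i];
        const FloatKey& b = keys[i + 1];
        float t = (time - a.time) / (b.time - a.time);
        return a.value + (b.value - a.value) * t;
    }
};
```

Two models sharing one resource can then sit at different times without copying any bone or keyframe data; the per-model cost is one index per channel.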


Also helpful for you could be the tutorial I added in the graphics programming section a while ago:

I've done GPU skinning already, but it was kinda hard to find resources (I think I only found 2 or 3 explaining the whole process) so it's nice to see more resources about it coming up.

And your tutorial made me notice something I had completely forgotten about until now: I don't have any Bone "presence" in the scene. My bones don't have anything, actually, so they can't be used as elements to be parents or children. The transformations I have stored are local only because I thought "since its indices/weights are already defined, I only need that".

This changes a bunch of things for me since I was trying to have a single Skeleton instance for multiple models and that was restraining some options...

I was trying to have this Skeleton class, with bones only having their keyFrame values, and I would calculate their positions/rotations according to each model's animation tracks, then upload the data. Since the GPU is doing the skinning and all the indices/weights are already defined, that was the only thing I needed to do (change the bones, upload).

Really, thanks! I'd be really frustrated if I only realized this AFTER I finished this hahaha

My bones don't have anything, actually, so they can't be used as elements to be parents or children.

They should inherit from the same base class that gives the rest of your scene “scene graph” capabilities. A gun should be attachable to any joint as easily as it should to any game object or part of a game object.


Additionally, a keyframe is a time value and a single floating-point value pair. Not a vector. A track is a collection of keyframes. Since a keyframe can only modify a single value, you need 3 tracks to handle rotation, 3 more for translation, etc.
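A minimal sketch of that layout, assuming linear interpolation and keys sorted by time: each keyframe is a (time, float) pair, and each track drives exactly one float. ScalarKey, ScalarTrack and Apply are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One keyframe: a time and a single floating-point value.
struct ScalarKey { float time; float value; };

// One track: a collection of keyframes driving exactly one float.
struct ScalarTrack
{
    std::vector<ScalarKey> keys;   // sorted by ascending time
    float* target = nullptr;       // the single value this track modifies

    void Apply(float time) const
    {
        assert(target != nullptr && !keys.empty());
        if (time <= keys.front().time) { *target = keys.front().value; return; }
        if (time >= keys.back().time)  { *target = keys.back().value;  return; }

        // Find the first key at or after 'time'; the key before it is strictly earlier.
        std::size_t i = 1;
        while (keys[i].time < time) ++i;

        const ScalarKey& a = keys[i - 1];
        const ScalarKey& b = keys[i];
        float t = (time - a.time) / (b.time - a.time);
        *target = a.value + (b.value - a.value) * t;
    }
};
```

A bone's translation would then be driven by up to three such tracks (one each for X, Y and Z). A component that is never keyed simply has no track and costs nothing.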


L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

You have a Pose skeleton, for example a human skeleton. This should live outside the entity so that many humans in the scene can use the same Pose skeleton, which prevents memory duplication.

A Pose skeleton has an array of Pose bones. Each Pose bone has a matrix plus a reference to a parent Pose bone. Some systems also employ sibling bones, but you don't have to.

An animation track has an array of bone tracks. Each bone track has an array of key frames. For each bone track, find the two key frames around the current time and interpolate between them, then combine that result with the corresponding Pose bone in the Pose skeleton and save the animated bone to a new location. Now combine each animated bone with its parent animated bone. You combine from top to bottom, so you don't need another copy and can reuse the same memory location. The resulting animated skeleton is copied to the GPU and used for skinning.

The animated bone doesn't need a reference to the parent and can thus be represented as a single matrix, so you can have a std::vector<matrix>. After parent combination, this whole array can be copied directly to the GPU without the reference bytes tagging along.

Many edits now :)

You can have animation tracks that do not contain bone tracks for every bone. For example, you can have a track which only animates the legs. What I do is copy the Pose skeleton (only the matrix part, not the references) to a std::vector<matrix> AnimationSkeleton, then apply every interpolated matrix to AnimationSkeleton. Then you combine the whole array, using the references in the Pose skeleton. The matrix at index 0 is the root bone and so forth. A bone always has a higher index than all of its parent bones. That way you can traverse the array in one sweep.
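The one-sweep combination described above might be sketched like this. The matrix layout (row-major 4x4 with column vectors, so global = parentGlobal * local) is an assumption, as are the names Matrix, Multiply, Translation and CombineWithParents; the parent indices stand in for the references kept in the Pose skeleton.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout: row-major 4x4, column-vector convention.
using Matrix = std::array<float, 16>;

inline Matrix Multiply(const Matrix& a, const Matrix& b)
{
    Matrix r{};  // zero-initialized
    for (std::size_t row = 0; row < 4; ++row)
        for (std::size_t col = 0; col < 4; ++col)
            for (std::size_t k = 0; k < 4; ++k)
                r[row * 4 + col] += a[row * 4 + k] * b[k * 4 + col];
    return r;
}

// Helper for building simple test data: identity with a translation part.
inline Matrix Translation(float x, float y, float z)
{
    Matrix m{};
    m[0] = m[5] = m[10] = m[15] = 1.0f;
    m[3] = x; m[7] = y; m[11] = z;
    return m;
}

// bones holds local matrices on entry and model-space matrices on exit.
// parents[i] is the index of bone i's parent, or -1 for a root bone.
// Requires parents[i] < i, so every parent is already combined when its
// children are reached and the array can be updated in place in one sweep.
inline void CombineWithParents(std::vector<Matrix>& bones, const std::vector<int>& parents)
{
    assert(bones.size() == parents.size());
    for (std::size_t i = 0; i < bones.size(); ++i)
    {
        if (parents[i] < 0) continue;  // root: local transform is already global
        std::size_t p = static_cast<std::size_t>(parents[i]);
        assert(p < i);
        bones[i] = Multiply(bones[p], bones[i]);
    }
}
```

After the sweep, the vector contains only matrices and no parent references, so it can be uploaded as one contiguous block.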

Additionally, a keyframe is a time value and a single floating-point value pair. Not a vector. A track is a collection of keyframes. Since a keyframe can only modify a single value, you need 3 tracks to handle rotation, 3 more for translation, etc.

IMHO: Technically there is no reason why a keyframe cannot be a <time,placement> pair, for example. The constraints that are added by doing so are

a) position and orientation (or whatever is represented by the keyframe's value) are set as a whole; if one wants to animate them separately they need their own keyframes and hence tracks;

b) the keyframes count for the entire value, even if parts of the value (for example the position) would also exactly result from interpolation between the surrounding keyframes, so that for those parts the keyframe would be not needed.

