Skeletal Animation Optimization Tips and Tricks

Published February 20, 2015 by Peyman Massoudi, posted by Sector0
Skeletal animation plays an important role in video games, and recent games use many characters on screen at once. Processing a huge amount of animation can be computationally intensive and requires a lot of memory as well, so having multiple characters in a real-time scene calls for optimization. Many techniques can be used to optimize skeletal animation, and this article aims to address some of them. Some of the techniques covered here have plenty of details, so I only define them in a general way and introduce references for those who are eager to learn more. The article is divided into two main sections. The first section addresses optimization techniques that can be used inside an animation system. The second section takes the perspective of animation system users and describes techniques they can apply to use the animation system more efficiently. So if you are an animator or technical animator you can read the second section, and if you are a programmer who wants to implement an animation system you may read the first section. This article is not going to cover mesh skinning optimization; it only discusses skeletal animation optimization techniques. There are plenty of useful articles about mesh skinning around the web.

1. Skeletal Animation Optimization Techniques

I assume that most readers of this article know the basics of skeletal animation, so I'm not going to cover the basics here. To start, let's define a skeleton in character animation. A skeleton is an abstract model of a human or animal body in computer graphics. It is a tree data structure whose nodes are called bones or joints. Bones are just containers for transformations. Each skeletal animation consists of animation tracks, and each track holds the transformation info of a specific bone. A track is a sequence of keyframes, and a keyframe is the transformation of a bone at a specific time, measured from the beginning of the animation. Usually the keyframes are stored relative to a pose of the bone called the binding pose. These animation tracks and the skeletal representation can be optimized in different ways. In the following sections, I will introduce some of these techniques. As stated before, the techniques are described generally and this article is not going to dig into the details; each of them could be described in a separate article.
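To make the terminology concrete, here is a minimal data-layout sketch in C++. The struct names and fields are illustrative assumptions rather than a prescribed format:

```cpp
#include <vector>

// A minimal sketch of the data layout described above (names are illustrative).
// Each track animates one bone; each keyframe stores a time (seconds from the
// start of the clip) and a transform expressed relative to the bone's binding pose.
struct RotationKey    { float time; float quat[4]; };       // x, y, z, w
struct TranslationKey { float time; float pos[3]; };
struct ScaleKey       { float time; float scale[3]; };

struct AnimationTrack
{
    int boneIndex;                            // which bone this track drives
    std::vector<RotationKey>    rotations;
    std::vector<TranslationKey> translations; // often empty (see the next section)
    std::vector<ScaleKey>       scales;       // often empty, or a single float if uniform
};

struct AnimationClip
{
    float duration;
    std::vector<AnimationTrack> tracks;       // typically one per animated bone
};
```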

Optimizing Animation Tracks

An animation consists of animation tracks. Each animation track stores the animation of one bone. An animation track is a sequence of keyframes, where each keyframe contains translation, rotation or scale info. Animation tracks can be optimized easily from several angles.

First, note that most bones in a character animation do not need translation. For example, fingers and hands do not need to move relative to their parents; they only need to rotate. Usually the only bones that need translation are the root bone and props (weapons, shields and so on). The rest of the body does not translate, it only rotates. Also, realistic characters usually do not need to be scaled; scale is mostly applied to cartoony characters, and when animators do scale, they mostly use uniform scale rather than non-uniform scale. Based on this, we can remove the scale and translation keyframes from the tracks that do not need them. The animation tracks become lightweight and take less memory and computation. In addition, if we use uniform scale, each scale keyframe can store a single float instead of a Vector3.

Another very useful technique for optimizing animation tracks is animation compression. The most famous one is curve simplification, which you may also know as keyframe reduction. It removes keyframes from an animation track based on a user-defined error threshold, so consecutive keyframes that differ only slightly can be omitted. Curve simplification should be applied to translation, rotation and scale separately, because each has its own keyframes and value ranges, and their differences are computed differently. You may read this paper about curve simplification to find out more about it.

One other thing to consider is how you store rotation values in the rotation keyframes. Rotations are usually stored as unit quaternions because quaternions have some useful advantages over Euler angles. A quaternion has four elements, but for a unit quaternion the scalar part can be recovered from the vector part, so it can be stored with just three floats instead of four. See this post from my blog to find out how you can obtain the scalar part from the vector part.
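As a rough illustration of the three-float storage trick, here is a minimal sketch in C++. The struct and function names are hypothetical; the assumption is that quaternions are normalized and packed with a non-negative scalar part:

```cpp
#include <cmath>

// Assumption: the quaternion is normalized and packed with a non-negative scalar
// part (q and -q represent the same rotation, so negating before packing is safe).
struct PackedRotation { float x, y, z; };       // 12 bytes per rotation keyframe

PackedRotation PackUnitQuaternion(float x, float y, float z, float w)
{
    if (w < 0.0f) { x = -x; y = -y; z = -z; }   // keep w >= 0 so it can be rebuilt
    return { x, y, z };
}

void UnpackUnitQuaternion(const PackedRotation& p,
                          float& x, float& y, float& z, float& w)
{
    x = p.x; y = p.y; z = p.z;
    // For a unit quaternion x^2 + y^2 + z^2 + w^2 = 1, so:
    const float t = 1.0f - (x * x + y * y + z * z);
    w = (t > 0.0f) ? std::sqrt(t) : 0.0f;       // clamp to guard against rounding error
}
```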

Representation of a Skeleton in Memory

As mentioned in previous sections, a skeleton is a tree data structure. Because animation is a dynamic process, the bones may be accessed frequently while an animation is being processed. So a good technique is to keep the bones sequentially in memory; they should not be scattered across the heap, in order to preserve locality of reference. Allocating the bones sequentially is more cache-friendly for the CPU.
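Here is a minimal sketch of what such a layout might look like. The names are illustrative, and local transforms are simplified to 4x4 matrices to keep the example short (a real system would more likely keep separate translation/rotation/scale arrays):

```cpp
#include <cstdint>
#include <vector>

struct BoneTransform { float m[16]; };          // column-major 4x4 local transform

// r = a * b (column-major matrix multiply), used to walk down the hierarchy.
BoneTransform Concatenate(const BoneTransform& a, const BoneTransform& b)
{
    BoneTransform r{};
    for (int c = 0; c < 4; ++c)
        for (int row = 0; row < 4; ++row)
            for (int k = 0; k < 4; ++k)
                r.m[c * 4 + row] += a.m[k * 4 + row] * b.m[c * 4 + k];
    return r;
}

// All bones live in contiguous arrays, sorted so a parent always precedes its children.
struct Skeleton
{
    std::vector<int16_t>       parent;          // parent[i] < i, or -1 for the root
    std::vector<BoneTransform> local;           // animated local-space transforms
    std::vector<BoneTransform> modelSpace;      // results of the hierarchy walk
};

// Because parents precede children, one linear pass over the arrays resolves the
// whole hierarchy without chasing pointers through the heap.
void ComputeModelSpace(Skeleton& s)
{
    for (size_t i = 0; i < s.parent.size(); ++i)
    {
        const int16_t p = s.parent[i];
        s.modelSpace[i] = (p < 0) ? s.local[i]
                                  : Concatenate(s.modelSpace[p], s.local[i]);
    }
}
```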

Using SSE Instructions

To update a character animation, the system has to do lots of calculations, most of them based on linear algebra, which means most calculations involve vectors. For example, bones are constantly interpolated between two consecutive keyframes, so the system has to LERP between two translations and two scales and SLERP between two quaternion rotations. There may also be animation blending, which makes the system interpolate between two or more different animations based on their weights. LERP and SLERP are calculated with these equations respectively:

LERP(V1, V2, a) = (1 - a) * V1 + a * V2

SLERP(Q1, Q2, a) = sin((1 - a) * t) / sin(t) * Q1 + sin(a * t) / sin(t) * Q2

where t is the angle between Q1 and Q2 and a is the interpolation factor, a normalized value. These two equations are used constantly in keyframe interpolation and animation blending, and SSE instructions can help you evaluate them efficiently. I highly recommend checking the hkVector4f class from the Havok physics/animation SDK as a reference; it uses SSE instructions very well and is a very well-designed class. You can define your translation, scale and quaternion types similarly to hkVector4f. Note that if you use SSE instructions, the objects that use them have to be properly memory aligned, otherwise you will run into traps and exceptions. Also consider your target platform and check how well it supports this kind of instruction set.
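As an illustration, here is the LERP above written with SSE intrinsics. This is a minimal sketch, not Havok's hkVector4f; the type name is hypothetical, and it shows why alignment matters:

```cpp
#include <xmmintrin.h>   // SSE intrinsics

// The 16-byte alignment of Float4 matters: _mm_load_ps/_mm_store_ps fault on
// unaligned addresses, which is the "traps and exceptions" issue mentioned above.
struct alignas(16) Float4 { float v[4]; };

Float4 Lerp(const Float4& a, const Float4& b, float t)
{
    const __m128 va = _mm_load_ps(a.v);
    const __m128 vb = _mm_load_ps(b.v);
    const __m128 vt = _mm_set1_ps(t);
    // (1 - t) * a + t * b, computed on all four lanes at once
    const __m128 r  = _mm_add_ps(_mm_mul_ps(_mm_sub_ps(_mm_set1_ps(1.0f), vt), va),
                                 _mm_mul_ps(vt, vb));
    Float4 out;
    _mm_store_ps(out.v, r);
    return out;
}
```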

Multithreading the Animation Pipeline

Imagine you have a crowded scene full of NPCs, each with a bunch of skeletal animations; maybe a herd of bulls. The animation can take a lot of time to process, and this can be reduced significantly if the computation of the crowd is multithreaded, with each entity's animations computed on a different thread. Intel introduced a good solution to achieve this goal in this article. It defines a thread pool with worker threads, whose count should not exceed the number of CPU cores, otherwise application performance decreases. Each entity's animation and skinning calculation is treated as a job and placed in a job queue. Each job is picked up by a worker thread, and the main thread calls the render functions when the jobs are done. If you want to see this technique more in action, I suggest you have a look at the Havok animation/physics documentation and study the multithreading part of the animation section. To get the docs you have to download the whole SDK here. You will also find that Havok handles synchronous and asynchronous jobs there by defining different job types.
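A minimal sketch of the idea follows. Character and UpdateAnimation are placeholders, and a real engine would keep a persistent thread pool with a job queue (as the Intel article and the Havok SDK do) rather than spawning threads every frame as this simplified example does:

```cpp
#include <algorithm>
#include <thread>
#include <vector>

struct Character { /* skeleton, active clips, blend weights, ... */ };

void UpdateAnimation(Character& c, float dt)
{
    // sample keyframes, blend clips, build the model-space pose, etc.
    (void)c; (void)dt;
}

void UpdateAllCharacters(std::vector<Character>& characters, float dt)
{
    const unsigned workerCount =
        std::max(1u, std::thread::hardware_concurrency());  // stay within core count
    std::vector<std::thread> workers;
    workers.reserve(workerCount);

    for (unsigned w = 0; w < workerCount; ++w)
    {
        // each worker strides over the character list: w, w + N, w + 2N, ...
        workers.emplace_back([&characters, dt, w, workerCount]
        {
            for (size_t i = w; i < characters.size(); i += workerCount)
                UpdateAnimation(characters[i], dt);
        });
    }
    for (std::thread& t : workers)
        t.join();                      // main thread can render once all jobs finish
}
```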

Updating Animations

One important thing in an animation system is how you manage the update rate of a skeleton and its skinning data. Do we always need to update animations every frame? And if we do, do we need to update every bone every frame? This calls for a LOD manager for skeletal animation. The LOD manager decides whether to update the hierarchy or not, and it can consider different states of a character to decide on its update rate. Some cases worth considering are listed below (a small dirty-flag sketch follows the list):

1. The priority of the animated character: Some characters, like NPCs and crowds, do not have a very high priority, so you may not update them every frame. Most of the time they are not seen clearly, so skipping their update on some frames is acceptable.

2. Distance to camera: If the character is far from the camera, many of its movements cannot be seen, so why compute something that cannot be seen? Here we can define a skeleton map for the current skeleton, select the more important bones to update, and ignore the others. For example, when the character is far from the camera you don't need to update finger bones or the neck bone; you can update just the spine, head, arms and legs, which are the bones that can be seen from afar. This gives you a lightweight skeleton with many bones skipped. Don't forget that a human character's hands can have 28 finger bones, and spending 28 bones on such a small portion of the mesh is not very efficient.

3. Using dirty flags for bones: In many situations a bone's transformation does not change between two consecutive frames, for example because the animator didn't key that bone for several frames, or because curve simplification removed nearly identical consecutive keyframes. In these situations you don't need to recompute the bone in its local space. As you may know, bones are first computed in local space from the animation data and are then multiplied by their binding pose and parent transformation to place them in world or model space. Defining a dirty flag per bone lets you skip the local-space calculation when the bone has not changed between two consecutive frames; bones are recomputed in local space only when they are dirty.

4. Update only when they are going to be rendered: Imagine a scene in which some agents are chasing you and you run away from them. The agents are not in the camera frustum, but their AI controller is still following you. Should we update their skeletons while the player can't see them? Not in most cases. So you can skip updating skeletons that are outside the camera frustum. Both Unity3D and Unreal Engine 4 have this feature: they let you choose whether a skeleton and its skinned mesh should be updated when they are not in the camera frustum. You might still need to update off-screen skeletons in some cases, for example to shoot at a character's head that is not on camera, or to read root motion data for locomotion extraction, where you need the computed bone positions. In those situations you can force the skeleton to update manually or simply not use this technique.
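Here is a minimal sketch of per-bone dirty flags, with illustrative names not taken from any particular engine. Layers that write local transforms (clip sampling, IK, procedural rigs) mark a bone dirty only when they actually change it; the update pass then skips the local-space work for clean bones. Note that a bone's model-space transform still has to be refreshed if any ancestor changed, which the propagation pass handles:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

class PoseDirtyTracker
{
public:
    explicit PoseDirtyTracker(size_t boneCount) : m_dirty(boneCount, 1) {}

    void MarkDirty(size_t bone)              { m_dirty[bone] = 1; }
    bool NeedsLocalUpdate(size_t bone) const { return m_dirty[bone] != 0; }

    // parents[] must be ordered so parents come before children (parent[i] < i)
    void PropagateToChildren(const std::vector<int16_t>& parents)
    {
        for (size_t i = 0; i < parents.size(); ++i)
            if (parents[i] >= 0 && m_dirty[parents[i]])
                m_dirty[i] = 1;              // an ancestor moved, model space is stale
    }

    void ClearAll() { std::fill(m_dirty.begin(), m_dirty.end(), 0); }

private:
    std::vector<uint8_t> m_dirty;            // 1 = local transform changed this frame
};
```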

2. Optimized Usage of Animation Systems

So far, some techniques have been discussed for implementing an optimized animation system. As a user of an animation system, you should trust it and assume that it is well optimized and already implements many of the techniques described above, or even more. With that assumption, you can produce animations that work better with an optimized animation system. I'm going to address some of these practices here. This section is mostly aimed at animators and technical animators.

Do Not Move All Bones Always

As mentioned earlier, animation tracks can be optimized and their keyframes can be reduced easily. Knowing this, you can create animations that are more suitable for this kind of optimization. Do not scale or move bones if it is not necessary, and do not transform bones that cannot be seen. For example, while you are making a fast sword attack, not all of the facial bones can be seen, so you don't need to animate them all. In cutscenes, where you have a predefined camera, you know which bones are in the camera frustum; if the camera is zoomed in on your character's face, you don't need to move the fingers or hands. This saves your own time and lets the system save a lot of memory by simplifying the animation tracks for those bones.

One other important thing is duplicated consecutive keyframes, which occur frequently in the blocking phase of animation. For example, you key the fingers in frame 1, key them again in frame 15, and copy keyframe 15 to frame 30, so keyframes 15 and 30 are identical. But the default keyframe interpolation is set to keep the animation curves smooth, which means you may get extra motion between frames 15 and 30. Figure 1 shows a curve smoothed by the default interpolation.

[Figure 1: A smoothed animation curve]

As you can see in Figure 1, keyframes 2 and 3 are the same, yet there is extra motion between them. You might want this smoothness for many bones, so leave it be where you need it. But if you don't need it, make sure to set the interpolation between the two identical consecutive keyframes to linear, as shown in Figure 2. This lets the keyframe reduction algorithm drop the samples between them.

[Figure 2: Two linear consecutive keyframes]

You should consider this case for finger bones in particular, because fingers can account for up to 28 bones in a human skeleton: they cover a small portion of the body but take a lot of memory and calculation. In the previous example, if you make the two identical consecutive keyframes linear, there is no visual artifact on the finger bones and you can drop 28 * (30 - 15 + 1) keyframe samples, where 28 is the number of finger bones and 15 and 30 are the frames the animator keyed, with a sampling rate of one sample per frame. So by setting two identical consecutive keyframes to linear for the finger bones, you save a noticeable amount of memory. For one animation the saving is not huge, but it adds up when your game has many skeletal animations and characters.

Using Additive and Partial Animations instead of Full Body Animation

Animation blending comes in different techniques. Two that are very good in both functionality and performance are additive and partial animation blending. These two blending schemes are usually used for asynchronous animation events, for example when you are running and decide to shoot: the lower body continues to run while the upper body blends into the shoot animation.

Using additive and partial animations can help you get away with fewer animations. Let me describe this with an example. Imagine you have a locomotion animation controller that blends between three animations (walk, run and sprint) based on input speed, and you want to add a spine lean to this locomotion, so that when your character accelerates it leans forward for a period of time. One approach is to make three full-body animations (walk_lean_fwd, run_lean_fwd and sprint_lean_fwd) that blend synchronously with walk, run and sprint respectively, and drive the blend weight to achieve the lean. Now you have three extra full-body animations with several frames each, which means more keyframes, more memory and more calculation, and your blend tree becomes more complicated and higher dimensional. Now imagine you add six more animations to the locomotion system: two directional walks, two directional runs and two directional sprints, each blended with walk, run and sprint respectively. To keep the lean working you would have to add two directional walk_lean_fwd, two directional run_lean_fwd and two directional sprint_lean_fwd animations and blend them with the corresponding blend trees. The blend tree becomes high dimensional, needs too many full-body animations, too much memory and calculation, and it even becomes hard for the user to work with.

You can handle this situation much more easily with a lightweight additive animation. An additive animation is an animation that is added on top of the currently playing animations; it is usually the difference between two poses. The current animations are evaluated first, and then the additive is applied on top of the resulting transforms. Usually the additive animation is a single-frame animation that does not need to affect all body parts. In our example, the additive can be a single frame in which the spine bones are rotated forward, the head bone is rotated down and the arms are spread slightly. You add this animation on top of the current locomotion animations by manipulating its weight. You achieve the same result with a single-frame, half-body additive animation, with no need to produce separate full-body lean-forward animations. So using additive and partial animation blending can reduce your workload and help you achieve better performance very easily.
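As a rough sketch of what applying a weighted additive pose to one bone might look like, here is a minimal example. The types and functions are illustrative assumptions (real engines typically store the additive as the difference between an additive clip and a reference pose, and may use SLERP instead of the normalized lerp used here):

```cpp
#include <cmath>

struct Quat { float x, y, z, w; };   // unit quaternion
struct Vec3 { float x, y, z; };

// Standard quaternion product a * b.
Quat QuatMul(const Quat& a, const Quat& b)
{
    return { a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
             a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
             a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w,
             a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z };
}

// Scale the additive rotation toward identity by "weight" using a normalized
// lerp (nlerp), a cheap stand-in for SLERP at the small angles typical of leans.
Quat ScaleAdditive(const Quat& add, float weight)
{
    Quat q { add.x * weight, add.y * weight, add.z * weight,
             1.0f + (add.w - 1.0f) * weight };     // lerp from identity (0, 0, 0, 1)
    const float len = std::sqrt(q.x*q.x + q.y*q.y + q.z*q.z + q.w*q.w);
    return { q.x / len, q.y / len, q.z / len, q.w / len };
}

// Apply the weighted additive on top of one bone's local transform.
void ApplyAdditive(Quat& baseRot, Vec3& baseTrans,
                   const Quat& addRot, const Vec3& addTrans, float weight)
{
    baseRot      = QuatMul(baseRot, ScaleAdditive(addRot, weight));
    baseTrans.x += addTrans.x * weight;
    baseTrans.y += addTrans.y * weight;
    baseTrans.z += addTrans.z * weight;
}
```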

Using Motion Retargeting

A motion retargeting system promises to apply the same animation to different skeletons without visual artifacts. Using it, you can share animations between different characters: you make a walk for one specific character and reuse it for other characters as well. Motion retargeting saves memory by preventing animation duplication. Note, however, that a retargeting system has its own computations. It is not just basic skeletal animation; it needs other machinery such as scaling bone positions and root translation, limiting joints, mirroring animations and many other things. So you may save animation memory and animating time, but the system needs more computation. Usually this computation should not become a bottleneck in your game. Unity3D, Unreal Engine 4 and Havok animation all support motion retargeting. If you do not need to share animations between different skeletons, you don't need to use motion retargeting.

Conclusion

Optimization is always a serious part of video game development. Video games are soft real-time software, so they have to respond within a suitable time. Animation is always an important part of a video game, from different aspects such as visuals, controls, storytelling and gameplay. Having lots of character animation in a game can improve it significantly, but the system has to be capable of handling that much character animation. This article tried to address some of the techniques that matter for optimizing skeletal animation. Some of the techniques are highly detailed and were discussed only generally here. The techniques were reviewed from two perspectives: first, that of developers who want to create skeletal animation systems, and second, that of the users of animation systems.

Article Update Log

13 Feb 2015: Initial release
21 Feb 2015: Rearranged some parts.

Comments

All8Up

There are issues I have with this article, not that anything is wrong, just that it seems a bit misleading in a couple areas due to not being complete. The issues start with the description of animations as channels; while correct, that is generally the view from the DCC tool and definitely not the best representation for a game engine. The problem with channels is that during update you have 1-2k worth of channel data likely linear in memory and for each channel you are going to jump into the middle of that memory to read 2-4 pieces of data to compute a single quaternion or matrix depending on what you are doing. Over 30+ channels per animation you start exponentially slowing down as you destroy the cache in the CPU. This issue becomes even more major when you start distributing across multiple cores.

So, in the process of discussing animation optimization I would start by mentioning that no matter what you do there is going to be a balancing act between compression and performance. Channel based animation provides the best compression at the cost of pretty notable performance issues. Collapsing channels to full skeletal rig key frames costs compression since you can't adapt the individual curves to their needed details as easily, but performance is often nearly unbeatable.

Another item to consider is that very often, using piecewise linear approximations works for animation compression just as well as most curve fitting solutions. It uses more keys to get the same error rates but the per-key data requirement is reduced such that it generally comes out pretty similar in storage size. But, the final math used is greatly simplified and as such performance is gained. Again though, there is some balancing to be looked at here, really high quality low error rates (cinematics for instance) typically want the curves for C1 or C2 continuity reasons which linear starts failing to compress well.

Finally, when discussing curve compression you can't look at it as a single joint error rate. You need to take into account the full skeletal chain from the bone to any end points affected by the error. I.e. you might be able to compress channels on the shoulder exceptionally well but you start noticing the hand at the end of the chain under the shoulder jittering. You need to compute the error as the mean of all the chain end points in order to prevent hierarchical error accumulation from creating crappy looking animation.

Again though, the article is fine in general I'd just really like to see some of these points mentioned in order to make it clear that this is a fairly high level overview of a single direction of optimization and not a broad coverage of the subject.

February 23, 2015 01:27 PM
Sector0

Hi AllEightUp,

Thanks for your consideration. I'll try to answer some of the issues you mentioned.

There are issues I have with this article, not that anything is wrong, just that it seems a bit misleading in a couple areas due to not being complete

As I stated in the introduction of the article, the techniques are addressed generally here and I try to introduce references for those who are eager to learn the details. Covering the details of these techniques would need a separate heavyweight article.

The problem with channels is that during update you have 1-2k worth of channel data likely linear in memory and for each channel you are going to jump into the middle of that memory to read 2-4 pieces of data to compute a single quaternion or matrix depending on what you are doing. Over 30+ channels per animation you start exponentially slowing down as you destroy the cache in the CPU.

I've quoted a part of the article here about animation tracks:

"So a good technique is to keep the bones sequentially in the memory. They should not be separated because of the locality of the references."

It says the bones should be allocated linearly in memory, not the animation tracks. Very detailed characters have roughly 70 bones. The bones are usually accessed several times in the process of computing animation, skinning and post-processes like IK and blending between ragdoll and animation, so keeping the bones sequential can help achieve better performance. However, you're right about the animation tracks. Many animation tracks would not be accessed for a long period of time, so we can have a memory manager for loading and unloading them. The memory manager can be tightly coupled to the game's animation system. It can load the most recently used animations linearly in memory, or it can have a learning process that weights the motion graph transitions if the game uses similar transitional animation techniques, so the most recent animations (or parts of animations) remain resident in memory linearly.

For animation compression, the sample rate is usually high, so the curves can just use linear interpolation and the compression can be applied during the import phase, with the tracks constructed at import time.

Again though, there is some balancing to be looked at here, really high quality low error rates (cinematics for instance) typically want the curves for C1 or C2 continuity reasons which linear starts failing to compress well.

I mentioned that users should adjust the error rate, and if they need continuity they should leave those curves continuous. Users can adjust error rates to have both optimization and the beauty of the animation. Here is a quote from the article:

" As you can see in Figure1, the keyframe number 2 and 3 are the same. But there is an extra motion between them. You might need this smoothness for many bones so leave it be if you need it."

You need to take into account for the full skeletal chain from the bone to any end points effected by the error. I.e. you might be able to compress channels on the shoulder exceptionally well but you start noticing the hand at the end of the chain under the shoulder jittering

About the jittering: the jittering would not occur for the child bones if you simplify the parents' animation tracks, because the keyframes are always stored relative to each bone's binding pose and the binding pose does not change at run time. The binding pose is stored relative to its parent bone's space, so simplification should not affect the child bones' motion.

In some rare situations where you have translation tracks, the jittering will occur. For example, you have a weapon bone which is a child of the right hand, and the character passes it to the left hand in the animation for a short period of time. While the weapon bone is in the left hand, the compression can create jittering on it, because it is a child of the right hand. However, I haven't written anywhere in the article that you should apply the error rate separately for each bone. Here is a related quote from the article:

"The curve simplification should be applied for translation, rotation and scale separately because each of which has its own keyframes and different values. Also their difference is calculated differently"

So it says the scale, translation and rotation error rates should be specified differently. This doesn't mean that you should apply error rates to each bone separately; it can be done separately, but it could be hard for the user to manage.

Again, thanks for your critiques, and again, the techniques are discussed generally here. All of them have many details and tricks when you want to put them into action. Each of them should be considered separately in a different article, with its experimental results addressed as well :)

February 23, 2015 04:03 PM
All8Up

As I mentioned, I was just pointing out the areas which seemed unclear in the article during the initial reading. But in the case of jittering, either we are crossing wires or I believe you are wrong or missing the point: the specific error is not a problem, the accumulation of error is. A really bad example is one of the bosses in a game I did the animation systems for a couple of years back, which had well over 120 bones, most of which were in long tentacle-like spiky things which each had about 10 or so bones. The animation system had no bone translations, only rotational data in these spines, but the ends of the tentacles were jittering around like mad, crossing through each other and otherwise not looking like the fluid animations found in Maya. Basically the math works out very simply:

+ ------------------------- >.

Plus being the joint you are introducing a little rotational error to and the dot at the end is the expected end point. If the bone is 10 meters long (the general length of the spines in that character), and the rotation is off by just 1% due to error, the end point of the bone is sin(1degree)*10=~0.17 meters from where it is supposed to be in world space. Break that chain into multiple bones with each having a bit of error and while the math is inversely proportional to the distance between the joint being modified to the final end point, the grand total of error possible can leave the end point upwards of .4 meters from where it should actually be located which is exceptionally noticeable.

So, basically I always suggest anything that covers animation compression also points out that the error is cumulative down bone chains. You don't need to solve the issue in the article, I just believe it is always very important to point this out since it bites most folks at least once or twice when writing compression systems.

Again, i don't expect the article to provide solutions, I was just pointing out that I didn't believe it was particularly clear on the points and could be much better if it spelled things out more explicitly.

February 24, 2015 01:20 PM
Buckshag

If the tentacles were going so crazy that it didn't look like what you made in Maya then the compression settings were probably quite a bit too extreme :)

Generally motions look fine without taking the hierarchy into account when optimizing them using keyframe reduction or so. Depends a lot on the settings. But at decent optimization rates the motion should look just fine. Maybe a tiny bit of foot sliding will happen, but yeah you can improve that by taking parent transforms into account. Definitely good to point that out indeed.

Also most of the time error on the legs/feet will be more visually disturbing than on, say, the arms, because you will directly notice sliding feet while you might not notice that the hands are not at the very exact locations in, say, an idle motion.

There are a few other techniques you can apply as optimizations, such as motion and skeletal LOD. Also you can do some caching when doing keyframe lookups if they are not uniformly sampled. Also you can simplify certain calculations when no scale involved etc, although you kind of mentioned some of those in the "do not transform everything". You have to watch out that the cost of figuring out if something changed is not higher than actually just calculating it though.

Optimizations in animation graphs / blend trees / networks could also be added.

Maybe a next step could be to go into more detail of some of the things you mentioned, so people can see how to implement these.

Anyway, good job on the article :)

February 24, 2015 08:06 PM
All8Up

Buck, as mentioned, anything that is 10 meters long with even very minor errors will exhibit the issues no matter how low the compression is. As you say, a little foot sliding is acceptable for a character; now make the player a mouse that sees the foot from two inches, and that foot sliding is no longer acceptable. Everything depends on how you use the result; the source of the error is always there, it's simply something worth pointing out.

February 25, 2015 01:22 AM
Buckshag

Yes I know, but those 10-meter tentacles are 0.001% of the characters you have, an exception. Just wanted to point that out. Also I assume you have configurable compression settings, so you can always either disable compression on such a character or make it less aggressive.

February 25, 2015 02:15 AM
Buckeye

Overall, a good article with some good recommendations. A few comments for consideration:

"Usually the keyframes are stored relative to a pose of the bone named binding pose."

A matter of English language semantics: "stored relative" implies data being placed in memory somewhere with a known relationship to another location. It's not clear if that's your intent. If so, it's not clear what you mean by "relative to a pose." I'm guessing you intend something similar to "Keyframe transformations are commonly local transformations, such that a bone orientation in model space is determined by KeyFrameTransformation * BindPoseTransformation." You should clarify.

Section 1, Skeletal Animation Optimization Techniques, states "These animation tracks and skeletal representation can be optimized in different ways."

In the section "Optimizing Animation Tracks," you do not indicate what the *intent* of the optimization is. That is, there are various reasons for "optimizing" code and data - speed of access, speed of calculation, memory footprint, etc. It appears that section primarily discusses the memory footprint of each track. The intent of your article is to provide optimization "tips and tricks." Tell them explicitly what benefits and penalties result from each optimization technique. Otherwise your information begs the question "Why should I do that?"

By contrast, in the section "Representation of a Skeleton in Memory," you mention access which is "cache-friendly," implying improved speed of access, thus providing readers with a reason why they might consider your recommendations. Similarly, "SSE Instructions" mentions efficiency, and the "downside" that one must be mindful of alignment issues. Good stuff.

Figure 1 and Figure 2: please provide an image with higher contrast and/or wider lines. They appear (on my monitor) as little more than black rectangles. In addition, you mention "keyframes" which don't appear in the images. If you mean the dots on the curves - label them clearly. Make it easy for your readers to make a relationship between the word description and the visual aid.

In your introduction, you imply that Section 2 is targeted at animators, as opposed to programmers. Providing information to provide a basis for discussion between programmers and other project members is an excellent approach.

However, a minor point - Section 2 contains the phrase "... your blend tree gets more complicated and high dimensional ...," which may not mean anything to an animator. I.e., if an animator is given the task of creating partial animations with the caveat: "Don't complicate the blend tree!" I can imagine the animator's response being nothing more than a blank stare. If your intent is to speak to animators, use terms and descriptions that animators use.

February 27, 2015 06:34 PM
Sector0

Hi Buckeye

Many thanks for your attention. I'll try to update the article and fix the issues you mentioned. I will edit the unclear parts based on your comment.

February 27, 2015 08:20 PM
Sector0

Hi Buckshag,

Thanks for the comments and the cases you mentioned. All of them are true. BTW, I've been following EmotionFX news for several years. I haven't used it before, but I know it provides great run-time animation features. Hopefully I will use it in future projects.

February 27, 2015 08:31 PM
Matias Goldberg

Ugh. I reviewed as Incomplete / Unclear, though I was tempted to set "Extreme Poor Quality"; sadly there is no plain "Poor Quality" option.

The reviewer clearly has not enough experience to be talking about this field yet (optimizing skeletal animations).

1. I'd expect cache coherency to be a big part of the section, since it's the elephant in the room. Yet only a paragraph and a minor link were given. No source code examples either.

2. The published lerp formula is in its unoptimized form. It's nice to appreciate what is going on; but we're talking about optimizations here, and the optimized form is nowhere to be seen.

3. "Multithreading the Animation Pipeline" like in cache coherency, only two short parragraphs were spent. "Multithread your code". Gee, why didn't I think of that?

Multithreading is a hard topic. At least you link to an Intel article. Too bad it has broken images all over the place. But at least the source code can be downloaded.

I would've let it pass if it was one of the issues; but considering everything as a whole; I have to bring this up.

4. The article points to the source code of hkVector4f for which the author does not have legal rights to distribute. I'm sure it was well intended, but the link needs to be taken down.

5. "Distance to Camera" tip. So why should we just compute something that cannot be seen? Because animations may affect logic or physics, and hence still need to be computed. Because multiplayer games have more than "one camera". Because what happens offscreen is still relevant in some games (i.e. RTS games; deterministic lockstep based games).

Author fails to mention potential problems for this approach.

6. "Using Dirty Flags For Bones". Updating the bone's local transform is the cheapest part of the whole animation process. Keeping track of dirty bones will only add overhead that will counter any potential benefit.

7. "Update Just When They Are going To Be Rendered". This is very similar to "Distance to Camera" but at least some of the pitfalls are mentioned.

However frustum culling is vastly overestimated. On a typical game with shadows and/or realtime cubemaps, almost everything gets captured by one camera, be it the main camera, one of the hidden ones like the shadow mapping camera, or one of the 6 cameras needed by cubemapping (which cover a 360° FOV).

Implementing this technique means that, for every pass in the frame (instead of every frame) you need to iterate through all animations that passed culling, check if they're dirty, and update if so. You may as well update them all at once and be done.

If your game is open world with a huge world scenario, then rather than frustum culling, you'll be paging the animations in/out.

8. No mention of GPGPU animation at all. Animation is mostly a bandwidth intensive operation, and GPUs have a lot of bandwidth, which is why the most dramatic improvements are often seen in this field.

9. No mention of different ways to upload all matrices to the GPU. All bones? Only the ones that actually affect vertices? Upload per skeleton and let each draw using the same skeleton index it? Or (re)upload the bones for every draw?

Each option has its tradeoffs.

Overall the author covers a lot of topics only on the surface, then references external links and leaves the reader on their own to understand the topic (benefits, pitfalls, how it works, how to implement it for their own applications).

I'm sure the author will some day have gathered enough experience to write a marvelous article, as he seems to be on the right track.

But I can't approve this article today, as is.

February 28, 2015 05:19 AM
Sector0

I do agree with the previous comments that some parts are unclear, but as I mentioned several times in the article, this is a general purpose article. What you are expecting from me is to write a book instead of a general purpose article. I mentioned that each of these techniques should be treated as a separate article if someone is eager to learn the details.

Multithreading is a hard topic. At least you link to a an Intel article. Too bad it has broken images all over its place. But at least the source code can be downloaded.

I addressed Havok animation as well and linked to it. It has source code, examples and rich documentation. You can find what you need there. Havok has implemented the same technique more in action.

4. The article points to the source code of hkVector4f for which the author does not have legal rights to distribute. I'm sure it was well intended, but the link needs to be taken down.

The link belongs to projectanarchy. Havok came up with free tool sets for mobile developers. You can download it free there. The hkVector4f is located in the downloaded packages as well. I've put the link there so audiences can check it more easily. It's free for mobile development.

Distance to Camera" tip. So why should we just compute something that cannot be seen? Because animations may affect logic or physics, and hence still need to be computed. Because multiplayer games have more than "one camera". Because what happens offscreen is still relevant in some games (i.e. RTS games; deterministic lockstep based games).

Author fails to mention potential problems for this approach.

It seems that you haven't read the paragraph carefully. Here is a quote from article:

"Here we can define a skeleton map for our current skeleton and select more important bones to be updated and ignore the others. For example when you are far from the camera you don't need to update finger bones or neck bone. You can just update spines, head, arms and legs. These are the bones which can be seen from far"

Yeah you're right. You need some bones for logic, you have different cameras for multiplayer games and you need deterministic operations in some types of games. But do finger bones have something to do with this? Do facial bones have something to do with this? You have 28 bones for fingers, 15 bones for facial, 7 bones for ponytail, 5 bones for extra cloths! Do they have anything to do with logic? with physics? I mentioned you should have a skeleton map to keep the more important bones updated even if they are far from the camera. The important bones affect both visuals and logic. So it seems that you haven't worked with complex structured characters before, because in these kinds of characters many of the bones are just there for better visuals and have nothing to do with logic, and if they are not going to be seen, they should not be calculated.

"Using Dirty Flags For Bones". Updating the bone's local transform is the cheapest part of the whole animation process. Keeping track of dirty bones will only add overhead that will counter any potential benefit.

You might have complex blend trees in which you are blending more than 20 animations in a frame. There can be other procedural calculations like IK/FK blending, computing bone rotation limits in your run-time rig and many other things. They are first calculated in local or parent space, so having a dirty flag can help you a lot here. These calculations are heavy; they have a lot more overhead than checking dirty flags! So being able to skip these heavy calculations when they don't need to be updated is not a small saving at all! To produce realistic characters many of these techniques are needed, and managing their computation is not a cheap decision.

9. No mention of different ways to upload all matrices to the GPU. All bones? Only the ones that actually affect vertices? Upload per skeleton and let each draw using the same skeleton index it? Or (re)upload the bones for every draw?

Each option has its tradeoffs.

This part is related to mesh skinning and I mentioned in the introduction part that this article is not going to talk about mesh skinning. Here is a quote from article:

" This article is not going to talk about mesh skinning optimization and it is just going to talk about skeletal animation optimization techniques. There exists plenty of useful articles about mesh skinning around the web."

Overall, the expectation from you as a reviewer is to read the article more carefully. This one is a general purpose article. It tries to give the enthusiastic audiences some cues and introduce some techniques to them generally. I mentioned this several times in the article, so you should not expect details here. Covering the details of each technique would need a separate heavyweight article.

One article can be more general, like a survey, and another can be more expert, trying to cover all the details and experimental results. You should note that this article is general, and general articles like surveys have their own audience as well.

Ugh. I reviewed as Incomplete / Unclear. Though I was tempting to set "Extreme Poor Quality"; sadly there is no just "Poor Quality" option.

The reviewer clearly has not enough experience to be talking about this field yet (optimizing skeletal animations).

The first thing is that you haven't paid attention to the note at the end of the page, which states:

"Note: Please offer only positive, constructive comments - we are looking to promote a positive atmosphere where collaboration is valued above all else."

You haven't read the article with patience. The issues you mentioned are very good and useful but the article defined its scope in its different sections.

I really appreciate the comments from Buckeye and Buckshag because they read the article carefully and with patience; they made many valuable comments, raised great issues, and understood what the article is trying to say and to which audience. I'll try to update the article based on their comments.

You mentioned that I clearly do not have enough experience in this field. First I have to say that writing articles is not about promoting ourselves; it's about sharing information with other enthusiastic people. So forgive me if I mention some of my experience here. I've been in the gaming industry for almost 9 years. I've focused on animation in these years and have done research and development on many animation techniques. I've also been a user of different animation tools, technologies and engines in this time.

At the end, thanks for your comments. I will update some unclear parts of the article.

February 28, 2015 09:05 AM
Matias Goldberg

The link belongs to projectanarchy. Havok came up with free tool sets for mobile developers. You can download it free there. The hkVector4f is located in the downloaded packages as well. I've put the link there so audiences can check it more easily. It's free for mobile development.

Unfortunately being free does not equal it can be redistributed freely.
The license Havok grants is very permissive in the sense that allows commercial uses for free.

But they're reserving the rights on how the package is being distributed (which is via their site directly). Project Anarchy belongs to Havok, but the license header on the file makes me still dubious.

Do facial bones have something to do with this? You have 28 bones for fingers, 15 bones for facial, 7 bones for ponytail, 5 bones for extra cloths! Do they have anything to do with logic? with physics?

The question in that example is not why you are animating those bones, but rather why there are models other than major characters (protagonists, some NPCs, bosses, certain enemies) with detailed finger animation and facial animation, since such characters often do not exceed ~8.

Yes, you could be in a game with a huge world where eventually there could be hundreds of characters with detailed animations. But a system that pages them in/out based on distance to the player or the area they're in works at a much higher level than an animation system would.

If you give this responsibility to the animation system, it will start micromanaging bones based on visibility, and other unrelated components will end up doing the same.
These optimizations have to be worked out at a higher level.

A more important tip would be to describe how to prepare a budget guide for artists to follow so that you don't end up with a chaotic performance nightmare: bone count limits categorized by character relevance, max number of bones affecting a single vertex (e.g. major chars: up to 4 bones per vertex, unimportant chars: up to 2 bones per vertex, etc.), sampling framerate standards (you do cover this one though).

You might have complex blend trees in which you are blending more than 20 animations in a frame. There should be other procedural calculations like IK/FK blending, computing bone rotation limits in your run-time rig and many other things. They are firstly calculated in local or parent space. So having a dirty flag can help you a lot here. These calculations are heavy. They have lots of more overhead than checking for dirty flags! So ignoring these heavy calculations, when you don't need them to be updated is not cheap at all!

I was first going to refute your claim (blending 20 FK animations is cheap).
Then you threw in IK and constraints, and that's when it hit me: what we have in mind is completely different, and this is basically the same problem from my comment's closing: you're covering a lot of topics only on the surface, then referencing external links and leaving the reader on their own to understand the topic.
With no reference implementation, some code, or pictures to put it in context, it just ends up being confusing or even misleading. A lot of people will end up with a different idea of what you tried to say.

You mentioned that I clearly have not enough experience in this field. First I have to say that writing articles is not about promoting ourselves and it's about to share information with other enthusiastic people. So I'm sorry if I'm going to tell you some of my experiences here. I've been in the gaming industry for almost 9 years.

And I have no doubt about that. But it would be nice if you shared that experience with us. One thing is to say "don't update skeletons that aren't visible to the camera; trust me, it's a performance tip"; another thing is to say "When we were working on <xx, indie game, unreleased project, can't say name due to NDA, etc> we had hundreds of instances with 10 concurrent animations and 80 bones per instance. In retrospect we shouldn't have allowed our artists to have so many models with so many bones and animations. Even the trolls had 60 bones! Our framerate sucked at 60ms per frame, and we couldn't tell our art team to remake the animations.
So we moved the animation update loop after frustum culling. Our situation improved to 45ms; but it was still unacceptable. The first problem we saw is that shadow mapping cameras were including a lot of other instances that weren't really visible.
So we first made a distance to camera calculation, which was combined with frustum culling. An instance would have to be seen by a camera and be close enough to the main user's camera. After that our times went to 35ms. We then realized that all of our bones' transform were being processed by our complex IK animation system, so we wrote a system that would keep track of which bones needed to be updated...".

I made up that story and the numbers aren't real. But see the difference?
The hostility you sensed in my comment was because you were mentioning techniques, most of which are just common sense (avoid updating what isn't dirty or non-visible, use multithreading, use SSE, remove redundant keyframes, lossy-remove keyframes until the quality degradation stops being acceptable, prioritize & use different update frequencies), but weren't sharing any details like numbers to convince me it's worth it, or putting in some context to understand why it sometimes works and why it sometimes doesn't. Tell me problems I'd expect to encounter while implementing it myself.
The only time you do that well is when you talked about removing keyframes and motion retargeting.

There are many ways to keep track of bones' dirtiness. Some with higher granularity than others. Some methods much more cache friendlier than others. Just telling "keep track of dirty bones" can end up being a disaster if implemented by someone who is not yet familiar with performance sensitive code.

If you just throw me some tips and tell me to trust that they will always improve my framerate (which I already know they don't always do, or that the cost/benefit ratio is too high), I'm going to think you don't have enough experience in this particular subject (skeletal animation).


This part is related to mesh skinning and I mentioned in the introduction part that this article is not going to talk about mesh skinning. Here is a quote from article:

" This article is not going to talk about mesh skinning optimization and it is just going to talk about skeletal animation optimization techniques. There exists plenty of useful articles about mesh skinning around the web."

Overall, the expectation from you as a reviewer is to read the article more carefully. This one is a general purpose article. It tries to give the enthusiastic audiences some cues and introduce some techniques to them generally. I mentioned this several times in the article. So you should not expect details here.

Fair enough. That was a mistake on my behalf.
February 28, 2015 06:42 PM
TheChubu

Imma upvote y'all cuz all dis discussion is interesting :3

March 02, 2015 07:07 PM
LoneDwarf

I found the response by Matias to be really sad and somewhat of an attack, actually. The person put some time and effort into this article and I didn't think it was all that bad. Your response stinks of you being offended that someone would actually attempt to write about a topic that you feel you are more qualified to have written. It's the "I am the smartest guy in the room" mentality.

March 10, 2015 05:29 PM

