FBX SDK skinned animation


I agree that you want to be able to author a rotation pivot in *tools* (e.g. scene or cutscene editors). But why do you want this at run-time? Say we have just a rotation track, a rotation pivot track and a translation track which are animated. At export I can evaluate the transform and export a single rotation and translation track. At run-time I evaluate, blend and finally compose these back into a matrix for the renderer. Note that I am not exporting a matrix as you suggest in your post; I collapse all tracks into three (translation, rotation, scale). Are you arguing that the pose at each frame will be noticeably different from composing the matrix from each track at run-time? Run-time animation systems are usually designed for performance and memory. What you are describing seems more computationally and memory expensive than it needs to be. A final pose in a game is usually a combination (blend) of many active animations. Even if you don't hit the poses in each animation exactly, it will be hard to notice since there are so many animations layered on top of each other. From my experience, animations don't look good if they don't blend together well. These days this is usually addressed by animating the rig and then running IK solvers at run-time (e.g. Destiny does this). Can you elaborate on the problems you had with your system and what you think the reasons were?

 

EDIT:

I just gave it more thought. Mathematically, both methods create *exactly* the same matrices for a specific keyframe at run-time. Given whatever N tracks you use in your animation system, the following holds:

Track1 * Track2 * ... * TrackN = Local Transform = T * R * S

So both systems hit the *exact* same pose (within the usual floating point tolerance) at each keyframe. The question now is what happens between the keyframes when we interpolate. What we get will depend on many things. If your animator is using Euler keyframes for rotation and you export quaternions, the results will be different. If you don't fit the exact same curve as Maya (or any other tool), your result will be different. Good-looking animation is about hitting strong, expressive key poses, which my approach does, and that is what the animator is ultimately authoring. The more I think about it, the less I understand the benefit of what you are suggesting. What is the problem your system solves for a run-time animation system? Maybe with an example...
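
Roughly, sampling and composing the collapsed tracks between keyframes could look like this. This is only a minimal sketch: the Vector3/Quaternion/Matrix4 types and the Lerp/Slerp/Compose helpers are assumed to come from your math library, and the track layout is purely illustrative.


#include <vector>
#include <cstddef>

// Assumed math types and helpers (declarations only in this sketch).
struct Vector3    { float x, y, z; };
struct Quaternion { float x, y, z, w; };
struct Matrix4    { float m[ 16 ]; };

Vector3    Lerp ( const Vector3& a, const Vector3& b, float t );               // linear interpolation
Quaternion Slerp( const Quaternion& a, const Quaternion& b, float t );         // spherical interpolation
Matrix4    Compose( const Vector3& T, const Quaternion& R, const Vector3& S ); // builds T * R * S

// Collapsed per-bone tracks; all three tracks share the same key times here for simplicity.
struct BoneTrack
{
    std::vector< float >      Times;
    std::vector< Vector3 >    Translations;
    std::vector< Quaternion > Rotations;
    std::vector< Vector3 >    Scales;
};

// Sample the tracks at time t (clamped to the clip) and compose the local bone transform.
Matrix4 SampleLocalTransform( const BoneTrack& Track, float t )
{
    // Find the keyframe interval containing t (linear search for clarity).
    std::size_t i = 0;
    while ( i + 2 < Track.Times.size() && Track.Times[ i + 1 ] <= t )
        ++i;

    const float t0 = Track.Times[ i ];
    const float t1 = Track.Times[ i + 1 ];
    float u = ( t - t0 ) / ( t1 - t0 );
    u = u < 0.0f ? 0.0f : ( u > 1.0f ? 1.0f : u );

    const Vector3    T = Lerp ( Track.Translations[ i ], Track.Translations[ i + 1 ], u );
    const Quaternion R = Slerp( Track.Rotations[ i ],    Track.Rotations[ i + 1 ],    u );
    const Vector3    S = Lerp ( Track.Scales[ i ],       Track.Scales[ i + 1 ],       u );

    return Compose( T, R, S );
}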

 

 

23 hours ago, Dirk Gregorius said:

I think it will be much easier for the OP, as a first step, to simply iterate each frame over each bone in the skeleton and export the local transform, similar to what I have shown above for the skeleton.

Here is an example of how to read an animation from an FBX file. It is the simplest I can come up with to give the OP the basic idea. It will give you the frame rate and length of the animation clip, and the keyframes as an array of poses.



void RnAnimation::Read( FbxScene* Scene, FbxAnimStack* AnimStack, const std::vector< FbxNode* >& Nodes )
{
}
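
The body of Read() is empty above, so here is a rough sketch of what it could contain. The FBX calls used (SetCurrentAnimationStack, GetLocalTimeSpan, GetFrameCount, EvaluateLocalTransform) are real FBX SDK API; the RnKeyFrame struct and the mFrameRate/mDuration/mKeyFrames members are only assumptions about RnAnimation, not the original code.


// Illustrative keyframe storage; not the poster's actual types.
struct RnKeyFrame
{
    std::vector< FbxAMatrix > BoneLocalTransforms;   // one local transform per bone in Nodes
};

void RnAnimation::Read( FbxScene* Scene, FbxAnimStack* AnimStack, const std::vector< FbxNode* >& Nodes )
{
    // Make this the stack the evaluator samples from.
    Scene->SetCurrentAnimationStack( AnimStack );

    const FbxTime::EMode TimeMode   = Scene->GetGlobalSettings().GetTimeMode();
    const FbxTimeSpan    TimeSpan   = AnimStack->GetLocalTimeSpan();
    const FbxLongLong    FirstFrame = TimeSpan.GetStart().GetFrameCount( TimeMode );
    const FbxLongLong    LastFrame  = TimeSpan.GetStop().GetFrameCount( TimeMode );

    // mFrameRate, mDuration and mKeyFrames are assumed members of RnAnimation.
    mFrameRate = static_cast< float >( FbxTime::GetFrameRate( TimeMode ) );
    mDuration  = static_cast< float >( LastFrame - FirstFrame ) / mFrameRate;

    for ( FbxLongLong Frame = FirstFrame; Frame <= LastFrame; ++Frame )
    {
        FbxTime Time;
        Time.SetFrame( Frame, TimeMode );

        RnKeyFrame KeyFrame;
        for ( FbxNode* Node : Nodes )
        {
            // Local (relative to parent) bone transform at this frame.
            KeyFrame.BoneLocalTransforms.push_back( Node->EvaluateLocalTransform( Time ) );
        }
        mKeyFrames.push_back( KeyFrame );
    }
}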

 

Thank you for your feedback. I checked your results against my previous results from my first foray into FBX and they somewhat match, except that my old code is a lot more bloated. I notice you didn't mention anything about a bind pose, or comparing against a bind pose to actually get the mesh to move in the viewport, which brings me to some questions:

 

I assume I'll be recomposing the vector4 T/R values to get my model matrix at run-time... Looking at my XML file, how would I approach doing that when I have so much data? Do I need to compare against a bind pose? I wrote the decompT/R values per joint per frame into the XML. In my .abjmesh file I have pos/uv/tan/normal/boneIdx/boneWt information for each vertex. This example mesh rig animation is just half a torso rotating a shoulder up and then to the center, like in L. Spiro's example, but over a 50-frame duration animated at 24 fps.

 

To sync with the game loop I need the delta time per frame, like L. Spiro said, but how do I stay in sync with reading from my XML file if things get heavy and bogged down? Do you have any next steps after your previous example? I made this XML file with reference to the TLang1991 tutorial.

[Image: excerpt of the exported XML file showing the decompT/R values per joint per frame]

The bind pose is usually the transform of the exported skeleton. You can also get the bind pose from the FbxClusters, but it should only rarely be different from your skeleton pose in the model. So my two code samples will give you the pose of the exported skeleton and the animated poses. At run-time you can now create poses for the skeleton. As a first step I would simply animate and render the skeleton. That should be pretty simple, and it will give you confidence that your data is correct. The next step is then binding the skeleton to your mesh. This is called skinning. The skinning information is stored in so-called FbxClusters, which you get from the FbxMesh. The simplest approach to deform the mesh using skinning is something like this:


// Start with all skinned positions at zero; one entry per mesh vertex.
std::vector< Vector > VertexPositions( Mesh->GetVertexCount(), Vector::Zero );

for ( each Cluster )   // pseudocode: iterate over all clusters (bones) of the mesh
{
    // You need to associate clusters with bones (usually done by name, and then by index at run-time)
    Matrix Transform = Pose->GetBone( Cluster ) * Inverse( Skeleton->GetBindpose( Cluster ) );

    for ( int Index = 0; Index < Cluster->GetVertexCount(); ++Index )
    {
        int VertexIndex = Cluster->GetVertexIndex( Index );
        float VertexWeight = Cluster->GetVertexWeight( Index );

        // Accumulate this bone's weighted contribution to the deformed position.
        VertexPositions[ VertexIndex ] += VertexWeight * ( Transform * Mesh->GetVertexPosition( VertexIndex ) );
    }
}

 

For skinning you have two options. For each cluster (bone) you store the vertices and weights it influences. Games often do it the other way around: for each vertex you store the bones and weights. The latter is better for GPU skinning, though I don't know if this is still true now that we have all these different shader stages. Maybe someone else can answer this. Let me know if this helps. I can dig up some code using real FBX examples.
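
As a rough sketch, the per-vertex layout and the equivalent CPU loop could look like this. The 4-influence limit and all names besides the Vector/Matrix types from the snippet above are only illustrative.


// Per-vertex skinning data, built at export time by inverting the cluster data:
// for each vertex, collect the bones that influence it and their weights.
struct SkinnedVertex
{
    Vector Position;
    int    BoneIndices[ 4 ];   // indices into the bone/matrix palette
    float  BoneWeights[ 4 ];   // should sum to 1; unused slots get weight 0
};

// CPU version of the same deformation the cluster loop above performs.
// BonePalette[ i ] = Pose->GetBone( i ) * Inverse( Skeleton->GetBindpose( i ) )
Vector SkinPosition( const SkinnedVertex& V, const std::vector< Matrix >& BonePalette )
{
    Vector Skinned = Vector::Zero;
    for ( int i = 0; i < 4; ++i )
    {
        Skinned += V.BoneWeights[ i ] * ( BonePalette[ V.BoneIndices[ i ] ] * V.Position );
    }
    return Skinned;
}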

This here is a really good course on animation. It will explain many concepts in an accessible way.

https://cseweb.ucsd.edu/classes/wi18/cse169-a/

Also, the book 'Game Engine Architecture' has an entire chapter on animation that is very good and comprehensible:

https://www.gameenginebook.com/

 

Keyframe animation and skinning are actually two very simple problems. Unfortunately most of the resources and examples on this topic are very bloated and unnecessarily complicated. It is somewhat difficult to find good examples. 

 

28 minutes ago, Dirk Gregorius said:

The bind pose is usually the transform of the exported skeleton. You can also get the bind pose from the FbxClusters, but it should only rarely be different from your skeleton pose in the model. So my two code samples will give you the pose of the exported skeleton and the animated poses. At run-time you can now create poses for the skeleton. As a first step I would simply animate and render the skeleton. That should be pretty simple, and it will give you confidence that your data is correct.

 

So you mean creating a series of locators or small boxes with GL_LINES / GL_LINE_LOOP and applying the recomposed model matrix to them?

7 hours ago, Dirk Gregorius said:

I agree that you want to be able to author a rotation pivot in *tools* (e.g. scene or cutscene editors). But why do you want this at run-time?

Custom scene editors tend to aim at giving you basically a real-time "preview" of your result.  "WYSIWYG" editors such as those in Unreal Engine and Unity are examples of this.  You might be more used to editors in which you export your work and then run elsewhere, but the main purpose of a scene editor is to tie everything together after that step.  By that point, anything your run-time does not support is something your artists can't edit in the scene editor.  If you stick to tracks and expose as many properties as you can, then artists can easily continue working the way they are used to, and each property they can edit is one more level of expression your in-game scenes/characters can have.

Of course there is always a trade-off between features, memory, and run-time overhead, but one factor to remember before simply eliminating a property to save run-time costs is that dirty flags can allow you to skip a lot of work.  If your pivots and offsets are identity (meaning they generate an identity matrix), then you can entirely skip the parts where they are combined into the final result, and your cost becomes simply an if().  Checking for identity can be done just when those values change, and in my second post I explained a dirty-flag system to allow this.  That means a slight bit more work when they change, but we are definitely talking about things that almost never change here.  So simply handling these properties is almost free, and it saves you the trouble of revisiting the code two years later when an artist actually does want to change a pivot in real-time.
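
A minimal sketch of that idea (not the exact system from my earlier post): cache the composed transform behind a dirty flag and check the pivot for identity only when it changes, so the common case pays just the if().  IsIdentity()/Inverse() are assumed math helpers, and where exactly the pivot enters the product depends on your own conventions; it is shown here only to illustrate the skip.


// Hypothetical names; Matrix, IsIdentity() and Inverse() are assumed math-library helpers.
struct OrientationProps
{
    Matrix Translation, Rotation, Scale;
    Matrix RotationPivot;

    bool   PivotIsIdentity = true;   // recomputed only when the pivot changes
    bool   Dirty           = true;   // set whenever any property changes
    Matrix CachedLocal;

    void SetRotationPivot( const Matrix& Pivot )
    {
        RotationPivot   = Pivot;
        PivotIsIdentity = IsIdentity( Pivot );   // checked once, on change, not every frame
        Dirty           = true;
    }

    const Matrix& GetLocalTransform()
    {
        if ( Dirty )   // recompute only when something actually changed
        {
            CachedLocal = Translation * Rotation * Scale;
            if ( !PivotIsIdentity )   // the extra property is free for the common case
                CachedLocal = RotationPivot * CachedLocal * Inverse( RotationPivot );
            Dirty = false;
        }
        return CachedLocal;
    }
};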

7 hours ago, Dirk Gregorius said:

Note that I am not exporting a matrix as you suggest in your post; I collapse all tracks into three (translation, rotation, scale). Are you arguing that the pose at each frame will be noticeably different from composing the matrix from each track at run-time?

No, they are functionally the same, because I still have to decompose my matrix into exactly the same thing you get.  This is why I said they are basically both baked systems with only a slight difference.  This was my first engine and it was naturally poorly designed, so I had to decompose the matrices at run-time: I had idiotically made the matrix responsible for more than it should have been, trying to use it as the only representation of an object's orientation rather than as the product of combining those properties.

So here I am not complaining so much about the overhead of decomposing, because that was my own fault, but about the fact that a matrix is a lossy representation of the orientation.  Some matrices (which are perfectly valid in real games) do not decompose at all (https://computergraphics.stackexchange.com/questions/4491/detect-a-lossy-matrix-decomposition), and you also can't tell whether only one scale axis has been inverted or all three.  That's a real problem when you want to assign custom tracks to a single scale property, etc.

7 hours ago, Dirk Gregorius said:

Run-time animation systems are usually designed for performance and memory. What you are describing seems more computationally and memory expensive than it needs to be. A final pose in a game is usually a combination (blend) of many active animations.

Forward kinematics, inverse kinematics, and basic blending do account for the majority of all animation, but more and more forms of dynamic animation are gaining demand, and the last thing you want is to lose a contract because your engine can't handle them.  Note that IK automatically forces you to expose more properties than the baked values because you need constraints, etc.

7 hours ago, Dirk Gregorius said:

Good-looking animation is about hitting strong, expressive key poses, which my approach does, and that is what the animator is ultimately authoring. The more I think about it, the less I understand the benefit of what you are suggesting. What is the problem your system solves for a run-time animation system? Maybe with an example...

It's not related to accuracy at all.  They both express the animation the same, until you want to start taking control of the finer details in a more dynamic way, at which point baked animations start to show cracks.  If you wanted to change a pivot or offset at run-time, you simply can't with a baked system.
Why would anyone do this?

  1. It's not our job to decide when and why a game might want to do this; it is our job to empower the game to be able to do it.
  2. But an actual example could be a robot getting an arm dislocated in a fight, like bent metal, so that it now rotates around a slightly different point, or letting the player take a part off a machine and reattach it.  These are very reasonable demands these days.

My point is more about functionality, because the run-time overhead is nearly the same.  It looks as though you are spending a lot of time combining a bunch of matrices to get the final result, but most of them are identity and you can if() them out of the process in virtually all cases.  Plus, you are likely to implement a run-time track system anyway, and you will find that you are able to connect tracks to anything but rotation and scale (position is of course unchanged when stored in a matrix), which will hopefully leave you saying, "What the hell did I just do?"

 

All of this is coming from real-world examples.  You're going to make your own scene editor no matter what, so you can combine all the different elements from other tools into one, and you will necessarily end up with new properties that don't exist in Maya or whatever authoring tool you use.
At this timecode you can see a good example of a real-world application: https://youtu.be/_gEm_WjMy88?t=52
His left monitor is the in-house scene editor, where you can see trillions of custom properties you can apply to make game-specific adjustments that Maya (right monitor) can't (and, as per today's standards, it is WYSIWYG).

 

Here is an example of that same editor being used to adjust a more global scene: https://youtu.be/_gEm_WjMy88?t=118
And what are you going to use to animate the clouds (which are fully procedural and can't be animated in Maya)?  Tracks and curves!
https://www.youtube.com/watch?v=Y_0OCZC8TVY

The color changes in the clouds, the cloud swirls, the wind, the particles: everything is driven by tracks and curves.

This game is an extreme example to bring in here, but I ran into these exact issues very early in my first engine, as soon as I wanted to animate anything beyond models.  I shot myself in the foot, and there is no reason anyone else needs to lose a foot over the same issue.

 

 

 

2 hours ago, mrMatrix said:

To sync with the game loop I need the delta time per frame, like L. Spiro said, but how do I stay in sync with reading from my XML file if things get heavy and bogged down? Do you have any next steps after your previous example? I made this XML file with reference to the TLang1991 tutorial.

Everything in that XML file (which is still not an appropriate file format for models) should be loaded into memory at load-time.  Nothing here needs to be streamed in real-time.


L. Spiro


40 minutes ago, L. Spiro said:

 

Everything in that XML file (which is still not an appropriate file format for models) should be loaded into memory at load-time.  Nothing here needs to be streamed in real-time.


L. Spiro

In what file format should I save my translated vertex data and skeleton data, then, in your opinion? This is for my solo development at home as a hobby, not to sell a game at this time, but I would still like to know what the "right" format is.

For now, your format makes it easier to post the results you are getting, so it is actually useful at this stage.

But normally it will be your own binary format that you load yourself.  Text is both slower and larger.
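
As a minimal sketch, such a binary format can be just a small header followed by the raw arrays; the layout and the 'ABJM' magic here are invented purely for illustration.


#include <cstdint>
#include <fstream>
#include <vector>

// Invented layout: a small header followed by the raw arrays.
struct MeshFileHeader
{
    std::uint32_t Magic;        // e.g. 'ABJM', to sanity-check the file on load
    std::uint32_t Version;      // bump when the layout changes
    std::uint32_t FloatCount;   // number of floats in the interleaved vertex array
    std::uint32_t IndexCount;   // number of indices
};

bool WriteMesh( const char* Path,
                const std::vector< float >& VertexData,
                const std::vector< std::uint32_t >& Indices )
{
    std::ofstream File( Path, std::ios::binary );
    if ( !File )
        return false;

    MeshFileHeader Header = {};
    Header.Magic      = 0x4D4A4241u;   // reads as "ABJM" in little-endian byte order
    Header.Version    = 1;
    Header.FloatCount = static_cast< std::uint32_t >( VertexData.size() );
    Header.IndexCount = static_cast< std::uint32_t >( Indices.size() );

    File.write( reinterpret_cast< const char* >( &Header ), sizeof( Header ) );
    File.write( reinterpret_cast< const char* >( VertexData.data() ), VertexData.size() * sizeof( float ) );
    File.write( reinterpret_cast< const char* >( Indices.data() ), Indices.size() * sizeof( std::uint32_t ) );
    return File.good();
}


Loading is the mirror image: read and validate the header, then read the two arrays straight into vectors.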


L. Spiro


Quote

So you mean creating a series of locators or small boxes with GL_LINES / GL_LINE_LOOP and applying the recomposed model matrix to them?

Yes, e.g. simply evaluate an animation and then draw the pose. That means, for each bone, drawing a line from the bone's world position to its parent's world position. Once you have this working we can move on to skinning.
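
A rough sketch of that debug draw, assuming the bones are stored parent-before-child, ParentIndices[ Bone ] is -1 for the root, and DrawLine()/GetTranslation() are whatever debug-line and matrix helpers you already have:


// Turn the local pose into world transforms, then draw one line per bone to its parent.
void DrawPose( const std::vector< Matrix >& LocalPose, const std::vector< int >& ParentIndices )
{
    std::vector< Matrix > WorldPose( LocalPose.size() );

    for ( std::size_t Bone = 0; Bone < LocalPose.size(); ++Bone )
    {
        const int Parent = ParentIndices[ Bone ];
        WorldPose[ Bone ] = ( Parent < 0 ) ? LocalPose[ Bone ]
                                           : WorldPose[ Parent ] * LocalPose[ Bone ];
    }

    for ( std::size_t Bone = 0; Bone < WorldPose.size(); ++Bone )
    {
        const int Parent = ParentIndices[ Bone ];
        if ( Parent >= 0 )
        {
            // Bone world position to parent world position, e.g. as a GL_LINES pair.
            DrawLine( GetTranslation( WorldPose[ Bone ] ), GetTranslation( WorldPose[ Parent ] ) );
        }
    }
}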

Quote

In what file format should I save my translated vertex data and skeleton data, then, in your opinion? This is for my solo development at home as a hobby, not to sell a game at this time, but I would still like to know what the "right" format is.

Whatever format you are fine with. For your hobby project the first goal is that you can export models and animations from a DCC application and load them into your game.

@L. Spiro.

Thanks for sharing this information! I understand your points, though I am not sure I agree with everything. Of course one can always just implement the most general system as a solution to a problem, but without knowing what the real gameplay requirements were that needed to be addressed in your linked videos, it is difficult to have a discussion about implementation. There is never one answer to a problem.

As a side note, what I suggested is not necessarily how I would implement an animation system in a AAA engine. I chose a specific example which I found to be the best introduction to this topic for the OP.

I feel we are hijacking the OP's thread here, since we were talking about basic skeletal animation and not AAA production animation systems and engine integrations.

Quote

This is a typical example of people not understanding what they are doing. If you support non-uniform scale, the matrix cannot be decomposed into TRS anymore. You would need something like:


struct Transform
{
    Vector3    Translation;
    Quaternion Rotation;
    Matrix3    Stretch;
};

You can use polar decomposition, and then you can just linearly interpolate the stretch matrix. This is implemented sloppily in many engines out there. My favorite example is Unity with its lossyScale property. A positive example is Granny from RAD, which handles all of this properly.

This is a good example of what I meant in my earlier post. Of course you can have the most general system, but it usually casts a long shadow into other engine parts, and you need to understand the consequences. Many game engines still don't allow run-time bone scaling. A good compromise is uniform scale, in my opinion. I suffer in particular from bad decisions here, since writing rigid body transforms back into a badly defined bone hierarchy is a major pain in the butt :)
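
For illustration, interpolating such a transform could look like this; Lerp/Slerp are assumed math-library helpers, and the stretch matrix is blended component-wise as described above. This is only a sketch of the idea, not production code.


// Assumed helpers: Lerp for Vector3 and Matrix3 (component-wise), Slerp for Quaternion.
Transform Interpolate( const Transform& A, const Transform& B, float t )
{
    Transform Result;
    Result.Translation = Lerp ( A.Translation, B.Translation, t );
    Result.Rotation    = Slerp( A.Rotation,    B.Rotation,    t );
    Result.Stretch     = Lerp ( A.Stretch,     B.Stretch,     t );   // linear blend of the 3x3 stretch
    return Result;
}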

