FBX and Skinned Animation

Started by
22 comments, last by Buckeye 9 years, 8 months ago

Hi guys,

I'm trying to set up a skinned mesh keyframe animation system in my game. I create the mesh/bones in 3ds Max, export via FBX, then use the FBX SDK to get at the data to bring into my game.

I'm pretty sure what I need for a basic system is the following:

base data:
a) vertex positions of the mesh (in its bind pose)
b) bone weights for each vertex
c) bone hierarchy
d) transform of each bone in its bind pose

pose data:
a) for each pose, the local rotation of each bone (relative to its parent).

Then I can smoothly go from one pose to another using interpolation (which I have working.) I'm not concerned with timing yet, I'll leave that for later.

I don't know how the files should be set up. Right now I have my bind pose mesh and bones in one file. Do I make a new file for each pose (or poses in a looped animation) ? Or maybe the poses need to be in the same file as the bind pose because transforms are relative to it ? Do I store each pose in a keyframe ? I'm lost.

I'm still learning how to extract data from FBX files and that's confusing me as well. I have more questions but I'll leave it here for now.

Any help/insight would be great.


You may want to take a look at a couple of articles: one on skinned mesh animation, the other on an animation controller for a skinned mesh. The first article outlines the various structures needed for a skinned mesh hierarchy and how such a hierarchy can be constructed. The latter provides some information on animation data, its storage and use.

Please don't PM me with questions. Post them in the forums for everyone's benefit, and I can embarrass myself publicly.

You don't forget how to play when you grow old; you grow old when you forget how to play.

I think you are mixing up some FBX concepts with your data requirements. You might also be getting confused because FBX 2014 changed all the animation structures, so any tutorials or information you find via Google are likely using the older SDKs. I say this because the "pose" structure in FBX is not directly related to animation data. It is more of a placeholder structure used to describe a skin binding in the FBX file, but it is not required and, for instance, won't exist in files with rigid body animation. (NOTE: whether it exists seems pretty random, depending on the DCC tool and the data in the file.) In general, I suggest not thinking in terms of poses directly, as it will just confuse things. For the runtime animation side, I'd suggest thinking in terms of "keys" instead, to stay separated from FBX concepts.

Now, in terms of the data you want, it's pretty simple given your description of how you are storing things. If you have the inverse bind pose, you already have the data structure you need and simply need to decide what to do with it. The style of animation you seem to be shooting for is full-skeleton key frames, which is among the simplest to implement and has some benefits in various ways. All you need for this to work is the duration of the animation, the playback rate and a set of keys representing the animation. The keys are what you are calling poses, but they are not related to the FbxPose nodes. Anyway, assuming you have a concept of a matrix hierarchy which represents the bind pose, you would use the same matrix hierarchy to store the key frames. So you end up with the following:

skeleton
    MatrixHierarchy: bind pose

animation
    duration
    1..n MatrixHierarchy: key
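As a rough illustration, the outline above could map to C++ along these lines. This is only a sketch: the type names (Matrix4, MatrixHierarchy, Skeleton, Animation), the addKey helper and the choice of evenly spaced keys are all assumptions made for the example, not anything dictated by FBX or by the layout described above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One bone transform as a row-major 4x4 matrix (hypothetical type, identity by default).
struct Matrix4 {
    float m[16] = {1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1};
};

// A full set of bone transforms plus the parent links that make it a hierarchy.
struct MatrixHierarchy {
    std::vector<int>     parent;  // parent[i] = index of bone i's parent, -1 for the root
    std::vector<Matrix4> local;   // local[i]  = bone i's transform relative to its parent
};

struct Skeleton {
    std::vector<std::string> boneNames;
    MatrixHierarchy          bindPose;
};

struct Animation {
    float                        duration = 0.0f;  // seconds
    std::vector<MatrixHierarchy> keys;             // 1..n full-skeleton keys (assumed evenly spaced)
};

// Appends a key, checking that it has one transform per bone of the skeleton.
void addKey(Animation& animation, const Skeleton& skeleton, const MatrixHierarchy& key) {
    assert(key.local.size() == skeleton.bindPose.local.size());
    animation.keys.push_back(key);
}
```

Storing the parent links in every key is redundant; it is kept here only so that a key has literally the same shape as the bind pose. A real exporter would probably store the parent indices once on the skeleton.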

Getting the data out of FBX can be a chore. I won't go over the whole thing, I'll just mention some of the fun bits I ran into recently when switching to the 2014 SDK:

1. Make sure to tell FbxIOSettings to import animation. I spent a day thinking I had a bug, but I had simply forgotten to tell the damned thing to import the animation data. Without this set, the SDK lies to you, saying there *is* animation, and even allows you to iterate all the structures, get lengths and everything, but extracting animation just gives you the first frame repeatedly, without errors or other hints that the data wasn't loaded.

2. Even if you only have one FbxAnimStack, make sure to go get the evaluator from the fbx manager and set the context to the stack. Otherwise I found in some cases it was doing the wrong things.

3. Don't rely on any FbxPose objects existing. Between Max, Maya, Modo and others, it seems hit and miss whether they will be there, and in general you don't really need them anyway.

4. I occasionally got additive animation layers which didn't make much sense given the source art. I'd suggest running the layer collapse function right after import to remove any of them.

Anyway, good luck. Hopefully the little outline helps explain how you can store the data simply. The FBX SDK on the other hand is a box of crazy and drives me insane.

There were several topics on this.

I've tried to write an article, but because I'm a lazy bas*ard it isn't finished: http://www.gamedev.net/topic/646588-scene-graph-fbx-and-stuff/

The code is total crap (don't use it directly), but the basic idea is explained.

To obtain all the keyframes and transforms for a node :

1) I use the AnimLayer to obtain the keyframes for the node. There MUST be a better way to do this!

2) Use the scene evaluator to obtain the *flattened* node transform for each key frame.


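// for each key frame:
FbxAnimEvaluator* pEvaluator = m_pFbxScene->GetEvaluator();
FbxVector4 translation = pEvaluator->GetNodeLocalTranslation(CurrentNode, KeyFrameTime);
FbxVector4 rotation    = pEvaluator->GetNodeLocalRotation(CurrentNode, KeyFrameTime);
FbxVector4 scaling     = pEvaluator->GetNodeLocalScaling(CurrentNode, KeyFrameTime);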

PS:

As said above:


 The FBX SDK on the other hand is a box of crazy and drives me insane.


Hi guys, thanks for the help. Things are starting to become clear but I still don't understand a lot. I'm thinking in terms of keys now (I was indeed getting confused looking at FbxPose stuff.)

I'm getting my bind pose data like this (pseudo):

bone=mesh->GetCluster->GetLink
bone->GetTransformMatrix() and bone->GetTransformLinkMatrix()

...that seems to be working alright.

I don't understand the concept of FbxAnimStack and FbxAnimLayer. I think a stack is a collection of layers, and a layer is what used to be called a 'take' (which I think is like a key). I also don't understand what the FbxAnimEvaluator does.

I'm doing the modelling, rigging, and animating myself in 3ds Max. Let's say I want to do a simple two key idle animation for my soldier, I pose him at time=0 and at time=30. All I really need are the bone transforms for those two times (for my simple system.)

Can you put that in terms of FbxAnimStack, FbxAnimLayer, FbxAnimCurveNode, FbxAnimCurve, FbxAnimCurveKey ? I don't know where to look for the bone transforms at those two times.

Thanks again.

PS - Buckeye, your article on controllers is great, once I get the data from my FBX file I'll be referencing it a lot.
PPS - imoogiBG, I'm trying to go through your code but it's hard since some of it doesn't work in FBX 2015.1, especially the templated stuff.

I'm getting my bind pose data like this (pseudo):

bone=mesh->GetCluster->GetLink
bone->GetTransformMatrix() and bone->GetTransformLinkMatrix()

...that seems to be working alright.

The transform link is the way I do the bind pose matrix myself, so it looks correct to me.

I don't understand the concept of FbxAnimStack and FbxAnimLayer. I think a stack is a collection of layers, and a layer is what used to be called a 'take' (which I think is like a key). I also don't understand what the FbxAnimEvaluator does.

You can pretty much ignore the layers unless you intend to do some pretty advanced stuff, just collapse them at the start and you'll only have one layer per stack. (I don't remember the exact call, it's on the FbxAnimStack I believe.)

The FbxAnimStack is actually what used to be called a take as I remember it. For DCC tools which export multiple animations per FBX, you would have one of these stacks per animation. So, a file could have a walk stack, run stack, turn, jump etc. Maya seems to ignore this, I think Max will use them if you author the files in a specific manner.

Finally, the evaluator is basically an animation playback system for the content of the FBX file. If you set a stack as its current context, it will allow you to sample the scene and get the transforms, etc. from the scene at various times. This brings us to the rest:

I'm doing the modelling, rigging, and animating myself in 3ds Max. Let's say I want to do a simple two key idle animation for my soldier, I pose him at time=0 and at time=30. All I really need are the bone transforms for those two times (for my simple system.)

Can you put that in terms of FbxAnimStack, FbxAnimLayer, FbxAnimCurveNode, FbxAnimCurve, FbxAnimCurveKey ? I don't know where to look for the bone transforms at those two times.


The FBX SDK gives you a number of ways to get the data you might want. Unfortunately, due to differences between DCC tools (Max/Maya/etc.), you may not be able to get exactly the data you want. For instance, let's say you find the root bone and it has translation on it in the animation. You can access the transform in a number of ways. You can use the node's local transform properties (LclTranslation, LclRotation, LclScaling) and evaluate them at various times. Or you can call the node's evaluation functions with a time to get the FbxAMatrix. Or you can ask the evaluator to evaluate the node at a time. And finally, the most complicated version: you can get the curve nodes from the properties and look at the curves' keys.

Given all those options, you might think getting the curves would be the way to go. Unfortunately Maya, for instance, bakes the animation data to a set of keys which have nothing to do with the keys actually setup in Maya. The reason for this is that the curves Maya uses are not the same as those FBX supports. So, even if you get the curves directly, they may have hundreds of keys in them since they might have been baked.

What this means is that basically unless Max has curves supported by FBX, it may be baking them and you won't have a way to find what the original two *poses* in your terms were. Generally you will iterate through time and sample the scene at a fixed rate. Yup, it kinda sucks and generally you'll want to simplify the data after sampling.
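To make the "iterate through time and sample at a fixed rate" step concrete, here is a minimal sketch that only generates the sample times. It deliberately does not call the FBX SDK; the function name sampleTimes and its parameters are made up for the example. Each returned time would then be handed to whichever evaluation call you use.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the times (in seconds) at which to sample the scene: one sample per
// frame interval starting at `start`, plus a final sample exactly at `end`
// so the last pose is never lost.
std::vector<double> sampleTimes(double start, double end, double samplesPerSecond) {
    assert(samplesPerSecond > 0.0 && end >= start);
    const double epsilon = 1e-9;
    // Number of whole frame intervals that fit in [start, end].
    const long long wholeFrames =
        static_cast<long long>(std::floor((end - start) * samplesPerSecond + epsilon));
    std::vector<double> times;
    times.reserve(static_cast<std::size_t>(wholeFrames) + 2);
    for (long long i = 0; i <= wholeFrames; ++i) {
        // Multiply the index each time instead of accumulating a step, to avoid drift.
        times.push_back(start + static_cast<double>(i) / samplesPerSecond);
    }
    if (times.back() < end - epsilon) {
        times.push_back(end);
    }
    return times;
}
```

Computing each time from the frame index, rather than adding a step repeatedly, keeps rounding error from building up over long animations.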

Hopefully this gets you past the understanding issues for the moment. FBX is fun stuff, ain't it.

What this means is that basically unless Max has curves supported by FBX, it may be baking them and you won't have a way to find what the original two *poses* in your terms were. Generally you will iterate through time and sample the scene at a fixed rate. Yup, it kinda sucks and generally you'll want to simplify the data after sampling.

But that is what you want to do anyway.
There are several modes for interpolating between key frames and you want to reduce that down to all linear interpolations, because that is the only thing you should support at run-time.

You start by sampling the existing key frames into a set sorted by time, then you run over the animation and sample at fixed intervals of, say, 16.66667 milliseconds (60 samples per second of animation).
Then you eliminate keys that are redundant when you linearly interpolate between the keys on both sides of them.

	/**
	 * Loads data from an FBX animation curve.
	 *
	 * \param _pfacCurve The curve from which to load keyframes.
	 * \param _pfnNode The node affected by this track.
	 * \param _ui32Attribute The attribute of that node affected by this track.
	 * \return Returns true if there are no memory failures.
	 */
	LSBOOL LSE_CALL CAnimationTrack::Load( FbxAnimCurve * _pfacCurve, FbxNode * _pfnNode, LSUINT32 _ui32Attribute ) {
		SetAttributes( _pfnNode, _ui32Attribute );

		LSUINT32 ui32Total = _pfacCurve->KeyGetCount();
		static const FbxTime::EMode emModes[] = {
			FbxTime::eFrames96,
			FbxTime::eFrames60,
			FbxTime::eFrames48,
			FbxTime::eFrames30,
			FbxTime::eFrames24,
		};
		LSUINT32 ui32FrameMode = 0UL;
		if ( ui32Total ) {
			while ( ui32FrameMode < LSE_ELEMENTS( emModes ) ) {
				// Count how many entries we will add.
				FbxTime ftTotalTime = _pfacCurve->KeyGetTime( ui32Total - 1UL ) - _pfacCurve->KeyGetTime( 0UL );
				// We sample at emModes[ui32FrameMode] frames per second.
				FbxLongLong fllFrames = ftTotalTime.GetFrameCount( emModes[ui32FrameMode] );
				if ( ui32Total + fllFrames >= 0x0000000100000000ULL ) {
					// Too many frames!  Holy crazy!  Try to sample at the next-lower resolution.
					++ui32FrameMode;
					continue;
				}

				m_sKeyFrames.AllocateAtLeast( static_cast<LSUINT32>(ui32Total + fllFrames) );
				// Evaluate at the actual key times.
				int iIndex = 0;
				for ( LSUINT32 I = 0UL; I < ui32Total; ++I ) {
					if ( !SetKeyFrame( _pfacCurve->KeyGetTime( I ), _pfacCurve->Evaluate( _pfacCurve->KeyGetTime( I ), &iIndex ) ) ) {
						return false;
					}
				}
				// Extra evaluation between key times.
				iIndex = 0;
				FbxTime ftFrameTime;
				for ( FbxLongLong I = 0ULL; I < fllFrames; ++I ) {
					ftFrameTime.SetFrame( I, emModes[ui32FrameMode] );
					FbxTime ftCurTime = _pfacCurve->KeyGetTime( 0UL ) + ftFrameTime;
					if ( !SetKeyFrame( ftCurTime, _pfacCurve->Evaluate( ftCurTime, &iIndex ) ) ) {
						return false;
					}
				}


				// Now simplify.
				LSUINT32 ui32Eliminated = 0UL;
				for ( LSUINT32 ui32Start = 0UL; m_sKeyFrames.Length() >= 3UL && ui32Start < m_sKeyFrames.Length() - 2UL; ++ui32Start ) {
					const LSUINT32 ui32End = ui32Start + 2UL;
					while ( m_sKeyFrames.Length() >= 3UL && ui32Start < m_sKeyFrames.Length() - 2UL ) {
						// Try to remove the key between ui32Start and ui32End.
						LSDOUBLE dSpan = static_cast<LSDOUBLE>(m_sKeyFrames.GetByIndex( ui32End ).tTime.GetMilliSeconds()) - static_cast<LSDOUBLE>(m_sKeyFrames.GetByIndex( ui32Start ).tTime.GetMilliSeconds());
						LSDOUBLE dFrac = (static_cast<LSDOUBLE>(m_sKeyFrames.GetByIndex( ui32Start + 1UL ).tTime.GetMilliSeconds()) -
							static_cast<LSDOUBLE>(m_sKeyFrames.GetByIndex( ui32Start ).tTime.GetMilliSeconds())) / dSpan;
						// Interpolate by this much between the start and end keys.
						LSDOUBLE dInterp = (m_sKeyFrames.GetByIndex( ui32End ).fValue - m_sKeyFrames.GetByIndex( ui32Start ).fValue) * dFrac + m_sKeyFrames.GetByIndex( ui32Start ).fValue;
						LSDOUBLE dActual = m_sKeyFrames.GetByIndex( ui32Start + 1UL ).fValue;
						// Use fabs(): an integer abs() would truncate the difference.
						LSDOUBLE dDif = ::fabs( dInterp - dActual );
						if ( dDif < 0.05 ) {
							m_sKeyFrames.RemoveByIndex( ui32Start + 1UL );
							++ui32Eliminated;
						}
						else {
							// Move on to the next key frame and repeat.
							break;
						}
					}
				}
				::printf( "\tOriginal key frames: %u\r\n\tFinal total key frames: %u %f\r\n", ui32Total,
					m_sKeyFrames.Length(), m_sKeyFrames.Length() * 100.0f / static_cast<LSFLOAT>(ui32Total) );
				break;
			}
		}
		return ui32FrameMode < LSE_ELEMENTS( emModes );
	}


	/**
	 * Adds the given value at the given time to this track.
	 *
	 * \param _tTime The time of the keyframe to set.  If it already exists it will be overwritten.
	 * \param _fValue The value to set at the given time.
	 * \return Returns true if there was enough memory to complete the operation.
	 */
	LSE_INLINE LSBOOL LSE_CALL CAnimationTrack::SetKeyFrame( FbxTime _tTime, LSFLOAT _fValue ) {
		LSFC_KEY_FRAME kfInsertMe = {
			_tTime,
			_fValue
		};
		return m_sKeyFrames.Insert( kfInsertMe );
	}

		/** A key and value pair. */
		typedef struct LSFC_KEY_FRAME {
			/** The time of the keyframe. */
			FbxTime								tTime;

			/** The value at the time of the keyframe. */
			LSFLOAT								fValue;
		} * LPLSFC_KEY_FRAME, * const LPCLSFC_KEY_FRAME;

(This is not updated with the latest SDK.)


The algorithm for this is extremely trivial and the implementation is really just a handful of lines of code.
You will often end up with fewer key frames than there are in the animation as well, so even though you explode the number of key frames during the frequency sampling part, only the few key frames that matter actually remain, which means you may have even fewer than there were in the original Maya animation (but the same run-time result).
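For anyone who wants the reduction step without the engine-specific types, here is a stripped-down sketch of the same greedy idea using only the standard library. The Key struct, the reduceKeys name and the tolerance parameter are assumptions for this example; it is not a drop-in replacement for the code above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Key {
    double time;
    double value;
};

// Removes every key that linear interpolation between its two neighbours
// reproduces to within `tolerance`. Keys must be sorted by strictly increasing time.
void reduceKeys(std::vector<Key>& keys, double tolerance) {
    assert(tolerance >= 0.0);
    std::size_t start = 0;
    while (keys.size() >= 3 && start + 2 < keys.size()) {
        const Key& a = keys[start];
        const Key& b = keys[start + 1];  // candidate for removal
        const Key& c = keys[start + 2];
        const double frac = (b.time - a.time) / (c.time - a.time);
        const double interpolated = a.value + (c.value - a.value) * frac;
        if (std::fabs(interpolated - b.value) <= tolerance) {
            // Redundant: drop it and retry from the same start key.
            keys.erase(keys.begin() + static_cast<std::ptrdiff_t>(start) + 1);
        } else {
            ++start;  // b has to stay; move on to the next triple
        }
    }
}
```

Fed a straight ramp, this collapses the track to its two end keys; fed a "V" shape, it keeps the two ends plus the apex. Note that each test only looks at the current neighbours, so error can accumulate over a long run of removals. A stricter version would compare against the originally sampled values.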



The fact is you need to do key frame reduction no matter what file format you are using. The Autodesk FBX SDK just makes it easy by giving you an animation evaluator.

So, yes, keys are the way to go.
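Once a track has been reduced to keys meant for linear interpolation, the run-time side can be very small. The following is a sketch under assumed names (TrackKey, evaluateLinear) for a single scalar track; a real system would do the same per bone channel and would use a quaternion interpolation for rotations.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct TrackKey {
    double time;
    double value;
};

// Evaluates a track (keys sorted by strictly increasing time) at `time`
// using linear interpolation, clamping outside the keyed range.
double evaluateLinear(const std::vector<TrackKey>& keys, double time) {
    assert(!keys.empty());
    if (time <= keys.front().time) return keys.front().value;
    if (time >= keys.back().time)  return keys.back().value;
    // First key whose time is strictly greater than `time`.
    const auto next = std::upper_bound(
        keys.begin(), keys.end(), time,
        [](double t, const TrackKey& k) { return t < k.time; });
    const auto prev = next - 1;
    const double frac = (time - prev->time) / (next->time - prev->time);
    return prev->value + (next->value - prev->value) * frac;
}
```

The binary search keeps lookups cheap even when keys are unevenly spaced, which is what you get after reduction.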


L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid


AllEightUp, on 07 Jul 2014 - 10:21 PM, said:
What this means is that basically unless Max has curves supported by FBX, it may be baking them and you won't have a way to find what the original two *poses* in your terms were. Generally you will iterate through time and sample the scene at a fixed rate. Yup, it kinda sucks and generally you'll want to simplify the data after sampling.
But that is what you want to do anyway.
There are several modes for interpolating between key frames and you want to reduce that down to all linear interpolations, because that is the only thing you should support at run-time.

Yes, this is generally what you will want to do, and I didn't mean to suggest it was a useless thing. On the other hand, I would really like FBX to retain original key location information in the animation data so I can mark those keys as more important when evaluating key reductions. The added context can greatly help when trying to preserve C1 continuity, so things like fingers at the end of long bone chains don't jitter around so much. It generally works out best if you keep those keys in preference to others, since very often they are the extremes of the curves, and removing them can throw off other reductions and increase the positional/velocity errors to a very noticeable degree.

Hi guys,

I haven't been able to work on this again until this weekend. Thanks to your help I'm able to get the data from my FBX file but I'm still having difficulty getting it to work correctly. I think it's the offset (or inverse bind) transform that is tripping me up.

I created two bones, root and child, to test with. Here is how it looks in 3ds Max (plus my annotations):

[image: bones_zpsfe8f6b9d.jpg]

Here is the data I pulled from the FBX:

root_cluster->GetTransformLinkMatrix(lMatrix);
lMatrix.GetT() : (0, 0, 0)
lMatrix.GetR() : (0, -90.0, 0) // note: I swap Y and Z

child_cluster->GetTransformLinkMatrix(lMatrix);
lMatrix.GetT() : (1.0, 0, 0)
lMatrix.GetR() : (0, 0, 0) // note: I swap Y and Z

At this point I need to create the offset for each bone. I did it by hand because it seemed straightforward and I need to learn:

root: matrixRotationY(&root_bone.offset, halfPi); // the opposite of a -90 degree rotation, no translation

child: matrixTranslation(&child_bone.offset, -1.0, 0, 0); // the opposite of a (1.0, 0, 0) translation, no rotation

Is that correct? It's not coming out right, take a look:

[image: skin_zps4390b60a.jpg]

This is how it should look. It's the mesh rendered as is, without bones or transforms (the way it should look in its bind pose):

[image: model_zpse391d366.jpg]

I am using the identity for the local transforms (which should keep it in its bind pose). I do it like this:

 
// bind to world
matrixRotationY(&bindToWorld, -halfPi); // because the root is rotated -90 degrees according to FBX
 
// fill in the transform matrices
 
// root bone
parent=bindToWorld; // the root bone's parent is the bindToWorld
matrixIdentity(&local); // using identity as noted above
matrixMultiply(&transforms[0], &local, &parent);
 
// child bone
parent=transforms[0]; // the root is the child's parent
matrixIdentity(&local); // using identity as noted above
matrixMultiply(&transforms[1], &local, &parent);
 
// now finalize by incorporating the offset matrix
 
matrixMultiply(&transforms[0], &root_bone.offset, &transforms[0]); // transforms[0] = rootOffset * transforms[0]
matrixMultiply(&transforms[1], &child_bone.offset, &transforms[1]); // transforms[1] = childOffset * transforms[1]
 
 

Note that for this test I simulate vertex weighting, the big "bone" would have a weight of 1.0 to the root, and the smaller "bone" would have a weight of 1.0 to the child. Maybe I am doing that part wrong but I don't think so. It's a single mesh like a regular skin would be.

As you can see from the image something isn't correct. I feel like I am almost there.

Thanks again for all your help.

If I understand what you're trying to do, the offset matrix (used for skinning vertices, not display) is the inverse of the world transform for the child. Assuming left-to-right matrix multiplication is appropriate, the child's world transform is child_local_matrix * parent_world_matrix. It appears, in your case, the parent_world_transform = root_local_transform.

You've sort of got things backwards. The root local transform is the 90 deg rotation. The child local transform is the translation (1,0,0).

root_world_matrix = rotation by 90 deg;

child_local_matrix = translate by (1,0,0);

child_world = child_local_matrix * root_world_matrix;

child_offset = inverse(child_world);

The offset matrices are used in the vertex skinning process.

The "bind pose" for the child is simply its world matrix.
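The relationships above can be checked with a small self-contained sketch. Everything here is an assumption for illustration: a hand-rolled Mat4 type, a row-vector convention with translation in the last row (left-to-right multiplication, D3DX style), a -90 degree root rotation taken from the numbers posted earlier in the thread, and an inverse that is only valid for rotation plus translation (no scale).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

// Row-major 4x4, row-vector convention: v' = v * M, translation in the last row.
using Mat4 = std::array<std::array<double, 4>, 4>;

Mat4 identity() {
    Mat4 m{};
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0;
    return m;
}

Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

Mat4 rotationY(double radians) {
    Mat4 m = identity();
    const double c = std::cos(radians), s = std::sin(radians);
    m[0][0] = c;  m[0][2] = -s;
    m[2][0] = s;  m[2][2] = c;
    return m;
}

Mat4 translation(double x, double y, double z) {
    Mat4 m = identity();
    m[3][0] = x;  m[3][1] = y;  m[3][2] = z;
    return m;
}

// Inverse of a rigid transform (rotation + translation only, no scale).
Mat4 rigidInverse(const Mat4& m) {
    assert(m[0][3] == 0.0 && m[1][3] == 0.0 && m[2][3] == 0.0 && m[3][3] == 1.0);
    Mat4 r = identity();
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            r[i][j] = m[j][i];  // transpose the rotation block
    for (int j = 0; j < 3; ++j)  // new translation = -(old translation * R^T)
        r[3][j] = -(m[3][0] * r[0][j] + m[3][1] * r[1][j] + m[3][2] * r[2][j]);
    return r;
}

double maxAbsDifference(const Mat4& a, const Mat4& b) {
    double d = 0.0;
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            d = std::max(d, std::fabs(a[i][j] - b[i][j]));
    return d;
}

struct BindPose {
    Mat4 rootWorld, childWorld, rootOffset, childOffset;
};

BindPose buildBindPose() {
    const double halfPi = std::acos(0.0);
    BindPose b;
    b.rootWorld = rotationY(-halfPi);                    // the root has no parent
    const Mat4 childLocal = translation(1.0, 0.0, 0.0);
    b.childWorld  = multiply(childLocal, b.rootWorld);   // child_local * parent_world
    b.rootOffset  = rigidInverse(b.rootWorld);
    b.childOffset = rigidInverse(b.childWorld);
    return b;
}
```

The useful sanity check is that offset * world comes out as the identity for every bone while the skeleton is in its bind pose, so skinned vertices land exactly where the unskinned mesh has them. If that product is not the identity in your own code, the offset or the multiplication order is the first thing to look at.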

Take another look at the article on skinned mesh animation I posted a link to earlier. You don't need to use the hierarchy described, but read about the process of animation near the end of the article. It discusses what the offset matrix is, how to calculate it and how to use it.


This topic is closed to new replies.
