# How to Work with FBX SDK

By Tianyu Lang | Published Mar 03 2014 12:01 PM in Graphics Programming and Theory
Peer Reviewed by (NightCreature83, Buckeye, Dave Hunt)


I have wanted to make an FBX exporter that converts FBX files to my own format for a while. The process was not very smooth, mainly because the official FBX documentation is not very clear. In addition, since the FBX format is used by many applications beyond game engines, the sample code does not use the more common technical terms we use in game development.

I have searched almost every corner of the Internet to clarify things so that I could build a clear mapping from the FBX SDK's data to what I need in a game engine. Since I don't think anyone has ever posted a clear and thorough tutorial on converting FBX files to custom formats, I will do it here. I hope this helps people.

This tutorial is specifically about game engines: I will show you how to get the data you need for a game engine. For things like how to initialize the FBX SDK, please check the sample code yourself; the "ImportScene" sample is very useful in this respect.

If you have no knowledge of how skeletal animation works and what data you need to make it happen, please read Buckeye's article "Skinned Mesh Animation Using Matrices". It will be very helpful.

## Mesh Data (position, UV, normal, tangent, binormal)

The first thing you want to do is get the mesh data; it already feels pretty good when you can import your static mesh into your engine.
For clarity, I will first show you how I traverse the mesh in an FBX file. This gives you a top-down understanding of what you need to do to gather mesh data. You won't know yet what each function does specifically, but you should see that I am traversing the three vertices of each triangle in the mesh. I will come back to each function later.

Note that there is some code related to blending info for animation. You can ignore it for now; we will come back to it later.

    void FBXExporter::ProcessMesh(FbxNode* inNode)
    {
        FbxMesh* currMesh = inNode->GetMesh();

        mTriangleCount = currMesh->GetPolygonCount();
        int vertexCounter = 0;
        mTriangles.reserve(mTriangleCount);

        for (unsigned int i = 0; i < mTriangleCount; ++i)
        {
            XMFLOAT3 normal[3];
            XMFLOAT3 tangent[3];
            XMFLOAT3 binormal[3];
            XMFLOAT2 UV[3][2];
            Triangle currTriangle;
            mTriangles.push_back(currTriangle);

            for (unsigned int j = 0; j < 3; ++j)
            {
                int ctrlPointIndex = currMesh->GetPolygonVertex(i, j);
                CtrlPoint* currCtrlPoint = mControlPoints[ctrlPointIndex];

                // ReadNormal is shown later in this article;
                // tangent/binormal reads follow the same pattern
                ReadNormal(currMesh, ctrlPointIndex, vertexCounter, normal[j]);

                // We only have a diffuse texture
                for (int k = 0; k < 1; ++k)
                {
                    ReadUV(currMesh, ctrlPointIndex, currMesh->GetTextureUVIndex(i, j), k, UV[j][k]);
                }

                PNTIWVertex temp;
                temp.mPosition = currCtrlPoint->mPosition;
                temp.mNormal = normal[j];
                temp.mUV = UV[j][0];
                // Copy the blending info from each control point
                // (the loop variable is named b so it does not shadow the outer i)
                for (unsigned int b = 0; b < currCtrlPoint->mBlendingInfo.size(); ++b)
                {
                    VertexBlendingInfo currBlendingInfo;
                    currBlendingInfo.mBlendingIndex = currCtrlPoint->mBlendingInfo[b].mBlendingIndex;
                    currBlendingInfo.mBlendingWeight = currCtrlPoint->mBlendingInfo[b].mBlendingWeight;
                    temp.mVertexBlendingInfos.push_back(currBlendingInfo);
                }
                // Sort the blending info so that later we can remove
                // duplicated vertices
                temp.SortBlendingInfoByWeight();

                mVertices.push_back(temp);
                mTriangles.back().mIndices.push_back(vertexCounter);
                ++vertexCounter;
            }
        }

        // Now mControlPoints has served its purpose
        // We can free its memory
        for (auto itr = mControlPoints.begin(); itr != mControlPoints.end(); ++itr)
        {
            delete itr->second;
        }
        mControlPoints.clear();
    }


First, let me explain how FBX stores information about a mesh. In FBX there is the term "Control Point": basically, a control point is a physical vertex. For example, if you have a cube, you have 8 vertices; these 8 vertices are the only 8 control points in the FBX file. As a result, you can use "Vertex" and "Control Point" interchangeably if you want. The position information is stored in the control points.

The following code gets you the positions of all the vertices of your mesh:

    // inNode is the node in this FBX scene that contains the mesh,
    // which is why we can call inNode->GetMesh() on it
    void FBXExporter::ProcessControlPoints(FbxNode* inNode)
    {
        FbxMesh* currMesh = inNode->GetMesh();
        unsigned int ctrlPointCount = currMesh->GetControlPointsCount();
        for (unsigned int i = 0; i < ctrlPointCount; ++i)
        {
            CtrlPoint* currCtrlPoint = new CtrlPoint();
            XMFLOAT3 currPosition;
            currPosition.x = static_cast<float>(currMesh->GetControlPointAt(i).mData[0]);
            currPosition.y = static_cast<float>(currMesh->GetControlPointAt(i).mData[1]);
            currPosition.z = static_cast<float>(currMesh->GetControlPointAt(i).mData[2]);
            currCtrlPoint->mPosition = currPosition;
            mControlPoints[i] = currCtrlPoint;
        }
    }


Then you ask: how can I get the UVs, normals, tangents, and binormals? Well, think of a mesh like this for a moment: you have the body of the mesh, but this is only the geometry, the shape of it. The body does not carry any information about its surface; in other words, you have the shape, but no information about how the surface of that shape looks.

FBX introduces the notion of a "Layer", which covers the body of the mesh. It is as if you have a box and you wrap it with gift paper; the gift paper is the layer of the mesh in FBX. From the layer you can acquire the UVs, normals, tangents, and binormals.

However, you might already be asking: how do I relate the control points to the information in the layer? Well, this is a pretty tricky part, so let me show you some code and then explain it line by line. Without loss of generality, I will use the normal as an example:

Before we look at the function, let's go over its parameters:

- FbxMesh* inMesh: the mesh we are exporting
- int inCtrlPointIndex: the index of the control point. We need this because we want to relate the layer information to our vertices (control points)
- int inVertexCounter: the index of the current vertex we are processing
- XMFLOAT3& outNormal: the output. We pass by reference so that we can modify this variable inside the function and use it as our output

After seeing these parameters, you may ask: "Since you said control points are basically vertices in the FBX SDK, why do you have both inCtrlPointIndex and inVertexCounter? Aren't they the same thing?"

No, they are not the same. As I explained before, control points are physical vertices of your geometry. Let's use a quad as an example.
Given a quad (2 triangles), how many control points are there? The answer is 4, one for each corner.

But how many vertices are there in our triangle-based game engine?
The answer is 6, because we have 2 triangles and each triangle has 3 vertices: 2 * 3 = 6.

The main difference between the FBX SDK's control point and our vertex is that our vertex is "per-triangle" while the control point is not.
We will come back to this point when explaining the code below, so don't worry if you do not yet have a crystal-clear understanding of the distinction.
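To make the distinction concrete, here is a minimal, self-contained sketch (plain structs, not FBX SDK types; the names are hypothetical) that builds a quad the way an exporter sees it: 4 shared control points, but 6 per-triangle vertices referencing them.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical minimal types, just to illustrate the counting.
struct Float3 { float x, y, z; };

struct QuadMesh
{
    std::vector<Float3> controlPoints;          // shared positions ("control points")
    std::vector<std::array<int, 3>> triangles;  // each triangle indexes 3 control points
};

// A unit quad in the XY plane: 4 control points, 2 triangles.
QuadMesh MakeQuad()
{
    QuadMesh mesh;
    mesh.controlPoints = { {0,0,0}, {1,0,0}, {1,1,0}, {0,1,0} };
    mesh.triangles = { {0, 1, 2}, {0, 2, 3} };
    return mesh;
}

// Game-engine vertices are per-triangle: 3 per triangle.
std::size_t EngineVertexCount(const QuadMesh& mesh)
{
    return mesh.triangles.size() * 3;
}
```

Here the quad has 4 control points while EngineVertexCount reports 6, which is exactly the 4-versus-6 mismatch described above.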

One thing to keep in mind: outside of this function, we are using a loop to traverse all the vertices of all the triangles in this mesh. If that sounds unfamiliar, look back at the very top of this "Mesh Data" section. That is why we can have parameters like inCtrlPointIndex and inVertexCounter.

    void FBXExporter::ReadNormal(FbxMesh* inMesh, int inCtrlPointIndex, int inVertexCounter, XMFLOAT3& outNormal)
    {
        if (inMesh->GetElementNormalCount() < 1)
        {
            throw std::exception("Invalid Normal Number");
        }

        FbxGeometryElementNormal* vertexNormal = inMesh->GetElementNormal(0);
        switch (vertexNormal->GetMappingMode())
        {
        case FbxGeometryElement::eByControlPoint:
            switch (vertexNormal->GetReferenceMode())
            {
            case FbxGeometryElement::eDirect:
            {
                outNormal.x = static_cast<float>(vertexNormal->GetDirectArray().GetAt(inCtrlPointIndex).mData[0]);
                outNormal.y = static_cast<float>(vertexNormal->GetDirectArray().GetAt(inCtrlPointIndex).mData[1]);
                outNormal.z = static_cast<float>(vertexNormal->GetDirectArray().GetAt(inCtrlPointIndex).mData[2]);
            }
            break;

            case FbxGeometryElement::eIndexToDirect:
            {
                int index = vertexNormal->GetIndexArray().GetAt(inCtrlPointIndex);
                outNormal.x = static_cast<float>(vertexNormal->GetDirectArray().GetAt(index).mData[0]);
                outNormal.y = static_cast<float>(vertexNormal->GetDirectArray().GetAt(index).mData[1]);
                outNormal.z = static_cast<float>(vertexNormal->GetDirectArray().GetAt(index).mData[2]);
            }
            break;

            default:
                throw std::exception("Invalid Reference");
            }
            break;

        case FbxGeometryElement::eByPolygonVertex:
            switch (vertexNormal->GetReferenceMode())
            {
            case FbxGeometryElement::eDirect:
            {
                outNormal.x = static_cast<float>(vertexNormal->GetDirectArray().GetAt(inVertexCounter).mData[0]);
                outNormal.y = static_cast<float>(vertexNormal->GetDirectArray().GetAt(inVertexCounter).mData[1]);
                outNormal.z = static_cast<float>(vertexNormal->GetDirectArray().GetAt(inVertexCounter).mData[2]);
            }
            break;

            case FbxGeometryElement::eIndexToDirect:
            {
                int index = vertexNormal->GetIndexArray().GetAt(inVertexCounter);
                outNormal.x = static_cast<float>(vertexNormal->GetDirectArray().GetAt(index).mData[0]);
                outNormal.y = static_cast<float>(vertexNormal->GetDirectArray().GetAt(index).mData[1]);
                outNormal.z = static_cast<float>(vertexNormal->GetDirectArray().GetAt(index).mData[2]);
            }
            break;

            default:
                throw std::exception("Invalid Reference");
            }
            break;
        }
    }


Well, this is pretty long, but please don't be scared; it is actually very simple.

This line gets us the normal information in the layer:

    FbxGeometryElementNormal* vertexNormal = inMesh->GetElementNormal(0);

The first switch statement is on GetMappingMode(). For a game engine, I think we only need to worry about FbxGeometryElement::eByControlPoint and FbxGeometryElement::eByPolygonVertex. Let me explain the two modes. As I said, control points are basically vertices. However, there is a problem: although a cube has 8 control points, it needs more than 8 normals if you want the cube to look correct. The reason is that on a sharp edge we have to assign more than one normal to the same control point to guarantee the feeling of sharpness. This is where the game-engine concept of a vertex comes in: even though a corner of the cube has a single position, in a game engine you are very likely to end up with 3 vertices with the same position but 3 different normals.

As a result, FbxGeometryElement::eByControlPoint is for the case without sharp edges, where each control point has exactly one normal. FbxGeometryElement::eByPolygonVertex is for the case with sharp edges, where you need the normal of each vertex on each face because each face assigns a different normal to the same control point. In other words, eByControlPoint means we can pinpoint the normal of a control point by the index of the control point, while eByPolygonVertex means we can pinpoint the normal of a vertex on a face by the index of that face-vertex.

This is a more concrete example of the difference between the FBX SDK's control point and a game-engine vertex, and it is why, when discussing the parameters of this function, I said we have to pass in both inCtrlPointIndex and inVertexCounter: since we don't know in advance which one we will need, we'd better pass in both.

Now we have another switch statement nested inside, this time on GetReferenceMode(). This is an optimization FBX does, the same idea as an index buffer in computer graphics: you don't want to store the same Vector3 many times; instead, you refer to it by its index.

FbxGeometryElement::eDirect means you can look up the normal directly using the index of the control point or of the face-vertex.

FbxGeometryElement::eIndexToDirect means that the index of the control point or face-vertex only gives us an index pointing to the normal we want; we have to use that index to find the actual normal.

This line of code gives us the index we need:

    int index = vertexNormal->GetIndexArray().GetAt(inVertexCounter);

So these are the main steps to extract position and "layer" information of a mesh.
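To see the eDirect/eIndexToDirect distinction in miniature, here is a self-contained toy sketch (plain arrays standing in for GetDirectArray()/GetIndexArray(); the struct names are hypothetical, not FBX SDK types) that mirrors the lookup logic of ReadNormal above.

```cpp
#include <vector>

struct Float3 { float x, y, z; };

// A toy stand-in for an FBX layer element: a direct array of values,
// plus an optional index array (the eIndexToDirect case).
struct ToyLayerElement
{
    bool indexToDirect;            // false = eDirect, true = eIndexToDirect
    std::vector<Float3> direct;    // stand-in for GetDirectArray()
    std::vector<int> indices;      // stand-in for GetIndexArray()
};

// eDirect uses the incoming index directly; eIndexToDirect uses it
// to fetch a second index into the direct array.
Float3 Lookup(const ToyLayerElement& element, int incomingIndex)
{
    if (element.indexToDirect)
    {
        return element.direct[element.indices[incomingIndex]];
    }
    return element.direct[incomingIndex];
}
```

The payoff of eIndexToDirect is visible here: many face-vertices can share one entry in the direct array by pointing at the same index, exactly like an index buffer.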

Now we move onto animation and this is the hard part of FBX exporting.

## Animation Data

So let's think about what we need from FBX to make animation work in our renderer (game engine).

1. The skeleton hierarchy. Which joint is which joint's parent
2. For each vertex, we need 4 SkinningWeight-JointIndex pairs
3. The Bind pose matrix for each joint to calculate the inverse of global bind pose
4. The transformation matrix at time t so that we can transform our mesh to that pose to achieve animation

Getting the skeleton hierarchy is pretty easy: we perform a recursive depth-first search from the root node of the scene and go down the levels.
A node is the building block of an FBX scene. There are many nodes in an FBX file, and each type of node contains a particular type of information.
If a node is of skeleton type, we add it to our list of joints, and its index is simply the size of the list at that moment. Therefore, we can guarantee that the index of a parent is always less than that of its children. This is necessary if you want to store local transforms and compute the transformation of a child at time t manually. But if you use global transformations, as I do, you don't strictly need this property.

Note: if you are not familiar with the concept of depth-first search, please look it up first.

After reading about it, you may ask: "Why don't we need to keep track of visited nodes?"
The answer: the skeleton hierarchy is a tree, not a graph, so there are no cycles to guard against.

    void FBXExporter::ProcessSkeletonHierarchy(FbxNode* inRootNode)
    {
        for (int childIndex = 0; childIndex < inRootNode->GetChildCount(); ++childIndex)
        {
            FbxNode* currNode = inRootNode->GetChild(childIndex);
            ProcessSkeletonHierarchyRecursively(currNode, 0, 0, -1);
        }
    }

    // inDepth is not needed here; I used it for debugging and forgot to remove it
    void FBXExporter::ProcessSkeletonHierarchyRecursively(FbxNode* inNode, int inDepth, int myIndex, int inParentIndex)
    {
        if (inNode->GetNodeAttribute() && inNode->GetNodeAttribute()->GetAttributeType() == FbxNodeAttribute::eSkeleton)
        {
            Joint currJoint;
            currJoint.mParentIndex = inParentIndex;
            currJoint.mName = inNode->GetName();
            mSkeleton.mJoints.push_back(currJoint);
        }
        for (int i = 0; i < inNode->GetChildCount(); i++)
        {
            ProcessSkeletonHierarchyRecursively(inNode->GetChild(i), inDepth + 1, mSkeleton.mJoints.size(), myIndex);
        }
    }


Now we need to get the SkinningWeight-JointIndex pairs for each vertex. Unfortunately, my code is not very clean here, so the function below performs steps 2, 3, and 4 all at once. I will go over the code, so please be patient. The mixing is mainly because of how FBX stores its information: I would need to traverse the same data in multiple passes if I wanted to separate the code into distinct functions.

Before seeing any code, please let me explain the terms used in FBX SDK. This is the part where I think most people get confused because FBX SDK's keywords do not match ours (game developers).

In FBX there is a thing called a "Deformer". I see a deformer as a way to deform a mesh (in Maya you have skeletal deformers, but you can also have "constraints" that deform your mesh). You can think of the deformer as the entire skeleton of a mesh. Inside each deformer (usually a mesh has only one) you have "Clusters". Each cluster both is and is not a joint: you can see a cluster as a joint, but inside each cluster there is a "link", and this link is the real joint containing the information we need.

Now we delve into the code:

    void FBXExporter::ProcessJointsAndAnimations(FbxNode* inNode)
    {
        FbxMesh* currMesh = inNode->GetMesh();
        unsigned int numOfDeformers = currMesh->GetDeformerCount();
        // This geometry transform is something I cannot understand
        // I think it is from MotionBuilder
        // If you are using Maya for your models, 99% of the time this is
        // just an identity matrix
        // But I am taking it into account anyway
        FbxAMatrix geometryTransform = Utilities::GetGeometryTransformation(inNode);

        // A deformer is an FBX thing that contains some clusters
        // A cluster contains a link, which is basically a joint
        // Normally there is only one deformer in a mesh
        for (unsigned int deformerIndex = 0; deformerIndex < numOfDeformers; ++deformerIndex)
        {
            // There are many types of deformers in Maya
            // We are using only skins, so we check whether this is a skin
            FbxSkin* currSkin = reinterpret_cast<FbxSkin*>(currMesh->GetDeformer(deformerIndex, FbxDeformer::eSkin));
            if (!currSkin)
            {
                continue;
            }

            unsigned int numOfClusters = currSkin->GetClusterCount();
            for (unsigned int clusterIndex = 0; clusterIndex < numOfClusters; ++clusterIndex)
            {
                FbxCluster* currCluster = currSkin->GetCluster(clusterIndex);
                std::string currJointName = currCluster->GetLink()->GetName();
                unsigned int currJointIndex = FindJointIndexUsingName(currJointName);
                FbxAMatrix transformMatrix;
                FbxAMatrix transformLinkMatrix;
                FbxAMatrix globalBindposeInverseMatrix;

                currCluster->GetTransformMatrix(transformMatrix);	// The transformation of the mesh at binding time
                currCluster->GetTransformLinkMatrix(transformLinkMatrix);	// The transformation of the cluster (joint) at binding time from joint space to world space
                globalBindposeInverseMatrix = transformLinkMatrix.Inverse() * transformMatrix * geometryTransform;

                // Update the information in mSkeleton
                mSkeleton.mJoints[currJointIndex].mGlobalBindposeInverse = globalBindposeInverseMatrix;

                // Associate each joint with the control points it affects
                unsigned int numOfIndices = currCluster->GetControlPointIndicesCount();
                for (unsigned int i = 0; i < numOfIndices; ++i)
                {
                    BlendingIndexWeightPair currBlendingIndexWeightPair;
                    currBlendingIndexWeightPair.mBlendingIndex = currJointIndex;
                    currBlendingIndexWeightPair.mBlendingWeight = currCluster->GetControlPointWeights()[i];
                    mControlPoints[currCluster->GetControlPointIndices()[i]]->mBlendingInfo.push_back(currBlendingIndexWeightPair);
                }

                // Get animation information
                // Currently supports only one take
                FbxAnimStack* currAnimStack = mFBXScene->GetSrcObject<FbxAnimStack>(0);
                FbxString animStackName = currAnimStack->GetName();
                mAnimationName = animStackName.Buffer();
                FbxTakeInfo* takeInfo = mFBXScene->GetTakeInfo(animStackName);
                FbxTime start = takeInfo->mLocalTimeSpan.GetStart();
                FbxTime end = takeInfo->mLocalTimeSpan.GetStop();
                mAnimationLength = end.GetFrameCount(FbxTime::eFrames24) - start.GetFrameCount(FbxTime::eFrames24) + 1;
                Keyframe** currAnim = &mSkeleton.mJoints[currJointIndex].mAnimation;

                for (FbxLongLong i = start.GetFrameCount(FbxTime::eFrames24); i <= end.GetFrameCount(FbxTime::eFrames24); ++i)
                {
                    FbxTime currTime;
                    currTime.SetFrame(i, FbxTime::eFrames24);
                    *currAnim = new Keyframe();
                    (*currAnim)->mFrameNum = i;
                    FbxAMatrix currentTransformOffset = inNode->EvaluateGlobalTransform(currTime) * geometryTransform;
                    // Store the joint's global transform at this frame, relative to the mesh node
                    (*currAnim)->mGlobalTransform = currentTransformOffset.Inverse() * currCluster->GetLink()->EvaluateGlobalTransform(currTime);
                    currAnim = &((*currAnim)->mNext);
                }
            }
        }

        // Some of the control points have fewer than 4 joints
        // affecting them
        // For a normal renderer there are usually 4 joints,
        // so we pad with dummy joints if there aren't enough
        BlendingIndexWeightPair currBlendingIndexWeightPair;
        currBlendingIndexWeightPair.mBlendingIndex = 0;
        currBlendingIndexWeightPair.mBlendingWeight = 0;
        for (auto itr = mControlPoints.begin(); itr != mControlPoints.end(); ++itr)
        {
            // Note the bound is i < 4 so every control point ends up with exactly 4 pairs
            for (unsigned int i = itr->second->mBlendingInfo.size(); i < 4; ++i)
            {
                itr->second->mBlendingInfo.push_back(currBlendingIndexWeightPair);
            }
        }
    }


At the beginning I have this:

    // This geometry transform is something I cannot understand
    // I think it is from MotionBuilder
    // If you are using Maya for your models, 99% of the time this is
    // just an identity matrix
    // But I am taking it into account anyway
    FbxAMatrix geometryTransform = Utilities::GetGeometryTransformation(inNode);


Well, this is what I saw on the FBX SDK forum: the Autodesk people there told us we should take the "GeometricTransform" into account. In my experience, most of the time it is just an identity matrix. Anyway, to get this geometric transform, use this function:

    FbxAMatrix Utilities::GetGeometryTransformation(FbxNode* inNode)
    {
        if (!inNode)
        {
            throw std::exception("Null for mesh geometry");
        }

        const FbxVector4 lT = inNode->GetGeometricTranslation(FbxNode::eSourcePivot);
        const FbxVector4 lR = inNode->GetGeometricRotation(FbxNode::eSourcePivot);
        const FbxVector4 lS = inNode->GetGeometricScaling(FbxNode::eSourcePivot);

        return FbxAMatrix(lT, lR, lS);
    }


The most important thing in this code is how I get the inverse global bind pose of each joint. This part is very tricky and trips up many people, so I will explain it in detail.

    FbxAMatrix transformMatrix;
    FbxAMatrix transformLinkMatrix;
    FbxAMatrix globalBindposeInverseMatrix;

    currCluster->GetTransformMatrix(transformMatrix); // The transformation of the mesh at binding time
    currCluster->GetTransformLinkMatrix(transformLinkMatrix); // The transformation of the cluster (joint) at binding time from joint space to world space
    globalBindposeInverseMatrix = transformLinkMatrix.Inverse() * transformMatrix * geometryTransform;

    // Update the information in mSkeleton
    mSkeleton.mJoints[currJointIndex].mGlobalBindposeInverse = globalBindposeInverseMatrix;


Let's start with GetTransformMatrix. The transform matrix is actually a legacy thing: it is the global transform of the entire mesh at binding time, and all clusters have exactly the same transform matrix. This matrix would not be needed if your artists have good habits and "Freeze Transformations" on all channels of the model before rigging it. If they do freeze transformations, this matrix is just an identity matrix.

Now we go on to GetTransformLinkMatrix. This is the very essence of the animation exporting code. This is the transformation of the cluster (joint) at binding time from joint space to world space in Maya.

So now we are all set to get the inverse global bind pose of each joint. What we eventually want is the InverseOfGlobalBindPoseMatrix in:

    VertexAtTimeT = TransformationOfPoseAtTimeT * InverseOfGlobalBindPoseMatrix * VertexAtBindingTime

To get it, we compute: transformLinkMatrix.Inverse() * transformMatrix * geometryTransform
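To show how this exported data is meant to be consumed at runtime, here is a minimal sketch of the equation above using a bare-bones 4x4 matrix type (hypothetical; in a real engine you would use XMMATRIX or FbxAMatrix). A joint sits at (0, 1, 0) in bind pose and has moved to (0, 2, 0) at time t; a vertex riding on it follows.

```cpp
#include <array>

// A hypothetical bare-bones row-major 4x4 matrix, just enough to demonstrate
// VertexAtTimeT = PoseAtTimeT * InverseBindPose * VertexAtBindingTime.
struct Mat4
{
    std::array<float, 16> m;

    static Mat4 Translation(float x, float y, float z)
    {
        return Mat4{ { 1,0,0,x,  0,1,0,y,  0,0,1,z,  0,0,0,1 } };
    }
};

struct Float4 { float x, y, z, w; };

// Matrix * column vector.
Float4 Transform(const Mat4& a, const Float4& v)
{
    return {
        a.m[0]*v.x  + a.m[1]*v.y  + a.m[2]*v.z  + a.m[3]*v.w,
        a.m[4]*v.x  + a.m[5]*v.y  + a.m[6]*v.z  + a.m[7]*v.w,
        a.m[8]*v.x  + a.m[9]*v.y  + a.m[10]*v.z + a.m[11]*v.w,
        a.m[12]*v.x + a.m[13]*v.y + a.m[14]*v.z + a.m[15]*v.w };
}

Mat4 Multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
        {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k)
                sum += a.m[row * 4 + k] * b.m[k * 4 + col];
            r.m[row * 4 + col] = sum;
        }
    return r;
}

// Bind pose places the joint at (0,1,0); its inverse undoes that.
// The pose at time t places the joint at (0,2,0).
Float4 SkinOneVertex(const Float4& bindPoseVertex)
{
    Mat4 inverseBindPose = Mat4::Translation(0.0f, -1.0f, 0.0f);
    Mat4 poseAtTimeT     = Mat4::Translation(0.0f,  2.0f, 0.0f);
    Mat4 finalTransform  = Multiply(poseAtTimeT, inverseBindPose);
    return Transform(finalTransform, bindPoseVertex);
}
```

A vertex at (0, 1, 0) in bind pose, glued to the joint, ends up at (0, 2, 0) at time t: the inverse bind pose first brings the vertex into joint space, then the pose at time t moves it to where the joint is now.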

Now we are two steps away from animation: we need the SkinningWeight-JointIndex pairs for each vertex, and we need the transformations at different times in the animation.

Let's deal with SkinningWeight-JointIndex pair first.

In our game engine we have the relationship vertex -> 4 SkinningWeight-JointIndex pairs. In the FBX SDK, however, the relationship is inverted: each cluster has a list of all the control points (vertices) it affects, and by how much. The code below converts the relationship into the format we prefer. Recall that when I processed the control points, I stored them in a map keyed by their indices; this is where it pays off, because we can look up and update the control point a cluster affects in O(1).

    // Associate each joint with the control points it affects
    unsigned int numOfIndices = currCluster->GetControlPointIndicesCount();
    for (unsigned int i = 0; i < numOfIndices; ++i)
    {
        BlendingIndexWeightPair currBlendingIndexWeightPair;
        currBlendingIndexWeightPair.mBlendingIndex = currJointIndex;
        currBlendingIndexWeightPair.mBlendingWeight = currCluster->GetControlPointWeights()[i];
        mControlPoints[currCluster->GetControlPointIndices()[i]]->mBlendingInfo.push_back(currBlendingIndexWeightPair);
    }


Now we only need the last piece of the puzzle: the transformations at time t in the animation. Note that this part is something I did not do well; my approach is not very optimized, since I sample every frame. Ideally you would read the actual keys and interpolate between them, but I guess this is a trade-off between space and speed. I also did not sit down and study FBX's animation hierarchy: there are animation curves stored inside the FBX file, and with some work you can access them and get exactly what you need.

    // Get animation information
    // Currently supports only one take
    FbxAnimStack* currAnimStack = mFBXScene->GetSrcObject<FbxAnimStack>(0);
    FbxString animStackName = currAnimStack->GetName();
    mAnimationName = animStackName.Buffer();
    FbxTakeInfo* takeInfo = mFBXScene->GetTakeInfo(animStackName);
    FbxTime start = takeInfo->mLocalTimeSpan.GetStart();
    FbxTime end = takeInfo->mLocalTimeSpan.GetStop();
    mAnimationLength = end.GetFrameCount(FbxTime::eFrames24) - start.GetFrameCount(FbxTime::eFrames24) + 1;
    Keyframe** currAnim = &mSkeleton.mJoints[currJointIndex].mAnimation;

    for (FbxLongLong i = start.GetFrameCount(FbxTime::eFrames24); i <= end.GetFrameCount(FbxTime::eFrames24); ++i)
    {
        FbxTime currTime;
        currTime.SetFrame(i, FbxTime::eFrames24);
        *currAnim = new Keyframe();
        (*currAnim)->mFrameNum = i;
        FbxAMatrix currentTransformOffset = inNode->EvaluateGlobalTransform(currTime) * geometryTransform;
        // Store the joint's global transform at this frame, relative to the mesh node
        (*currAnim)->mGlobalTransform = currentTransformOffset.Inverse() * currCluster->GetLink()->EvaluateGlobalTransform(currTime);
        currAnim = &((*currAnim)->mNext);
    }


This part is pretty straightforward. The only thing to note is that Maya currently does not support multi-take animations (perhaps MotionBuilder does). I will decide whether to write about exporting materials based on how many people read this article, but it is pretty easy and can be learned from the "ImportScene" example.

## DirectX and OpenGL Conversions

My goal for this FBX exporter is to extract the data from an FBX file and output it in a custom format such that the reader's renderer can take the data and render it directly. No conversion is needed inside the renderer, because all the conversion work falls on the exporter itself.

Before I say anything, I need to clarify that my conversion method is only guaranteed to work if you make and export the model/animation in Maya using its default coordinate system (X-Right, Y-Up, Z-Out of Screen).

If you want to import your model/animation into OpenGL, you most likely do not need any extra conversion steps, because by default OpenGL uses the same right-handed coordinate system as Maya (X-Right, Y-Up, Z-Out of Screen). In the FBX SDK's "ViewScene" sample there is no conversion of the data, and it renders with OpenGL's default coordinate system, so take a look at that code if you run into trouble. However, if you specify your own coordinate system, some conversion might be needed.

Now for DirectX. From what I saw online, most problems come from people wanting to render FBX models/animations in DirectX, so if that is your case, you are very likely to need some conversions.

I will only address the case of a left-handed "X-Right, Y-Up, Z-Into Screen" coordinate system with back-face culling, because from the posts I have read, this is what most people use with DirectX. This does not mean anything in general; it is only an observation from my experience.

You need to do the following to convert the coordinates from the right-handed "X-Right, Y-Up, Z-Out Of Screen" to the left-handed "X-Right, Y-Up, Z-Into Screen" system:

- Position, Normal, Binormal, Tangent: negate the Z component of the vector
- UV: set V = 1.0f - V
- Vertex order of a triangle: change from Vertex0, Vertex1, Vertex2 to Vertex0, Vertex2, Vertex1 (basically invert the winding order)
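A minimal, self-contained sketch of those three rules (the vertex struct here is hypothetical, not my exporter's actual type):

```cpp
#include <array>

struct SimpleVertex
{
    float px, py, pz;   // position
    float nx, ny, nz;   // normal
    float u, v;         // UV
};

// Convert one vertex from Maya's right-handed system to the
// left-handed X-Right, Y-Up, Z-Into-Screen system.
SimpleVertex ConvertVertex(SimpleVertex in)
{
    in.pz = -in.pz;       // negate Z of position
    in.nz = -in.nz;       // negate Z of normal (same idea for tangent/binormal)
    in.v  = 1.0f - in.v;  // flip V
    return in;
}

// Invert the winding order of one triangle: 0,1,2 -> 0,2,1
std::array<int, 3> ConvertTriangle(const std::array<int, 3>& indices)
{
    return { indices[0], indices[2], indices[1] };
}
```

These two helpers are applied once at export time, so the renderer never has to know the data started out right-handed.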

Matrices:

1. Get the translation component of the matrix and negate its Z component
2. Get the rotation component of the matrix and negate its X and Y components
3. I think if you are using the XMMath library you don't need to take the transpose, but don't quote me on that

To use my conversion method, you need to decompose the matrix and change its translation, rotation, and scale components separately.
Fortunately, the FBX SDK provides ways to decompose a matrix as long as it is an FbxAMatrix (FBX affine matrix). The sample code below shows how:

    FbxAMatrix input; // Assume this matrix is the one to be converted
    FbxVector4 translation = input.GetT();
    FbxVector4 rotation = input.GetR();
    translation.Set(translation.mData[0], translation.mData[1], -translation.mData[2]); // Negate the Z of the translation component
    rotation.Set(-rotation.mData[0], -rotation.mData[1], rotation.mData[2]); // Negate the X and Y of the rotation component
    // These 2 lines finally set "input" to the converted result
    input.SetT(translation);
    input.SetR(rotation);


If your animation has scaling, you will need to figure out the required conversion yourself, since I have not encountered a case where scaling happens.

## Limitations and Beyond

This tutorial is only intended to get you started with the FBX SDK. I am quite a novice myself, so many of my techniques are probably inefficient. Here I will list the problems I think I have, so that readers can decide for themselves whether to use my techniques and what to be careful about.

1. The conversion method is only for models/animations exported from Maya with Maya's right-handed X-Right, Y-Up, Z-Out coordinate system. It is very likely that my conversion technique will NOT work with other modeling software (Blender, MotionBuilder, 3ds Max)

2. The way I extract animation is inefficient. I need to bake the animation before exporting it, and then I sample every frame at 24 frames/sec. This can lead to huge memory consumption. If you know how to work with keys instead of baked frames, please let me know by commenting below

3. My conversion method does not handle scaling in the animation. As you can see from my code, I never deal with scale component in the transformation matrix when I extract animation. As a result, you need to figure it out on your own if your animation has scaling in it.

4. In this tutorial I did not include the code to remove duplicated vertices, but in practice you will end up with a lot of duplicates if you export this way without optimization. In a comparison I did, an optimized export cut the file size by about 2/3. The reason you get duplicates is that when traversing each vertex of each triangle, the same control point with different normals is handled correctly, but the same control point with the same normal is counted more than once
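Here is a sketch of the deduplication pass being described, on a hypothetical minimal vertex type (my real PNTIWVertex compares more fields, including the sorted blending info): compare each expanded vertex against the unique ones found so far and remap the index buffer. A linear search is used for clarity; a hash map would be much faster on real meshes.

```cpp
#include <vector>

// Hypothetical minimal vertex; a real one would also compare UVs
// and the sorted blending info.
struct DedupVertex
{
    float px, py, pz;
    float nx, ny, nz;

    bool operator==(const DedupVertex& rhs) const
    {
        return px == rhs.px && py == rhs.py && pz == rhs.pz &&
               nx == rhs.nx && ny == rhs.ny && nz == rhs.nz;
    }
};

// Takes the expanded per-triangle vertex list (3 per triangle, with
// duplicates) and produces a unique vertex list plus an index buffer.
void RemoveDuplicates(const std::vector<DedupVertex>& expanded,
                      std::vector<DedupVertex>& uniqueVertices,
                      std::vector<unsigned int>& indexBuffer)
{
    uniqueVertices.clear();
    indexBuffer.clear();
    for (const DedupVertex& vertex : expanded)
    {
        // Search the unique list; "found" stays at size() if absent
        unsigned int found = static_cast<unsigned int>(uniqueVertices.size());
        for (unsigned int i = 0; i < uniqueVertices.size(); ++i)
        {
            if (uniqueVertices[i] == vertex)
            {
                found = i;
                break;
            }
        }
        if (found == uniqueVertices.size())
        {
            uniqueVertices.push_back(vertex);
        }
        indexBuffer.push_back(found);
    }
}
```

For a quad exported as 6 expanded vertices, this collapses them back to 4 unique vertices with a 6-entry index buffer, which is where the file-size savings come from.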

I am still quite a novice at the FBX SDK and at game programming in general. If you see any mistakes, or any room for improvement, please comment on this article and help me get better. I know there are a lot of pros on this forum; they just don't have enough time to write a step-by-step article like this.

## Conclusion

Well, the FBX SDK can be pretty nasty to work with. However, once you know what the data in FBX means, it is actually quite easy to use. I think this article is enough to get people started with the FBX SDK. Please leave a comment if you have any questions.

## Source Code

By request, I have decided to provide the GitHub repo, because some readers told me it would be clearer if they had access to my structs.
Here it is. Please be nice and do not mess up the repo.
Advice on my coding habits and efficiency is very welcome.

git@github.com:lang1991/FBXExporter.git

# Article Update Log

- 3.03.2014 - Added a section for limitations of this tutorial
- 3.03.2014 - Corrected misunderstanding about handedness of coordinate system
- 3.01.2014 - Added more explanations on coordinate conversions
- 3.01.2014 - Changed the order of some paragraphs to make the article more clear
- 2.19.2014 - First Version Submitted

Tianyu Lang
University of Southern California

It's worth mentioning that Open Asset Importer (assimp) is going to support FBX in the next release. It already supports tons of other formats, including COLLADA, 3DS, and many more.

Assimp has a much easier format to load and a simpler API.

That is pretty awesome. Wish there were some sort of standard...

Great article! I was one of those who got stuck on animations, and your explanations are quite handy. One thing I have to advise you: don't mess with the coordinate systems yourself; the FBX SDK can convert them for you:
```cpp
KFbxAxisSystem SceneAxisSystem = m_pScene->GetGlobalSettings().GetAxisSystem();
KFbxAxisSystem OurAxisSystem(KFbxAxisSystem::DirectX);
if (SceneAxisSystem != OurAxisSystem)
{
    OurAxisSystem.ConvertScene(m_pScene);
}
```

To get a better understanding of the FBX format, I suggest checking out their webinar series: http://download.autodesk.com/media/adn/FBXSDKwebcast_Recordings.zip
The last thing you might look into is https://github.com/shaderjp/FBXLoader - it is commented in Japanese, but the code itself is readable.

Hi Tapahob,

I tried your way of converting the coordinate system, but it does not seem to work very well. I remember reading on the FBX SDK forum that the ConvertScene function has a bug or something like that (at least in FBX SDK 2013). Also, I remember it only converts the parent node, not the children attached to it, so I did not use it.

But thanks for the advice and the links! I don't think I have ever looked at the links you gave me!

"argot" : The meaning of that word is more at a secret language specifically intended to exclude outsiders, or an informal slang used by groups. From the context, I assume you mean "jargon" or even "more common technical terms."

"...feels pretty damn good..." "pretty damn long" "how the hell" - No need for profanity in a technical document. That style of writing reflects on you as an author, and on the entire gamedev community.

FBXExporter::ReadNormal code: suggest you move the description of the parameters to a position before the code snippet to help the reader's understanding while the code is examined.

"XMFLOAT3& outNormal: the output. This is trivial to explain" - Explain it anyway. You may well have readers that are looking for just such an explanation. If it is, indeed, trivial, delete the line.

> int inCtrlPointIndex: the index of the Control Point...
> int inVertexCounter: this is the index of the current vertex that we are processing. This might be confusing. Ignore this for now.
> ...
> This is why in the above code I passed in both inCtrlPointIndex and inVertexCounter. Because we don't know which one we need to get the information we need, we better pass in both.

You use both inCtrlPointIndex and inVertexCounter in the paragraphs that follow without further definition, apparently relating them to control points and polygons. That sequence of paragraphs is confusing.

You may want to consider moving the code snippet for FBXExporter::ProcessMesh(FbxNode* inNode) to an earlier position in the code and note that you will describe parts of it later in the article, as you do with other parts: "We will come back to it later." It may provide a better context for the various indices you later describe. Just a suggestion.

You could add a link to (yeah, it's my own article) "Skinned Mesh Animation Using Matrices" for an overview of the skeleton hierarchy, nodes (frames, bones...) and animation data. That may provide a context for FBX nodes and your derivation of the bind pose ("offset") transforms, etc.

I'm vaguely familiar with the DFS concept, but it may help your readers if you simply expand the term, or delete the phrase "perform a recursive DFS from the root node" altogether, as you haven't introduced the concept of nodes and what the root node is.

"You know in Maya,.." I don't personally know, no. Maybe just "In Maya,.."

Just a guess: I'm not familiar with FBX or Maya. However, it may be that the Geometry Transform is the world transform for the hierarchy/mesh if they weren't modeled symmetric to the origin.

You may want to take a look at my previously mentioned Skinned Mesh article and my blog on "An Animation Controller" for an idea how keyframe interpolation can be implemented.

It's apparent you've done a lot of work regarding FBX import. Thumbs up for your effort. The quantity and quality of code snippets is very good and provides a base for others. It's apparent from gamedev and other internet posts that you're not the only one who finds FBX difficult to work with. Thank you for posting your work, particularly with notations regarding code which may just be "guesses" at how it's all supposed to work.

However, I'd like to see the article cleaned up a bit more, particularly in the area of describing terms and variables used in the code. I understand very well that you may not understand all the inner workings of the FBX SDK, few probably do, but I think you can explain a bit better how you use the variables.

You may want to include one or both of these links to provide a context for the skeletal hierarchy.

http://en.wikipedia.org/wiki/Depth-first_search

http://courses.cs.washington.edu/courses/cse326/03su/homework/hw3/dfs.html
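For readers unfamiliar with the term, the recursive DFS the article performs over the FBX node tree amounts to something like the following sketch (the `Node` struct and names here are hypothetical; in the FBX SDK, `FbxNode::GetChildCount()`/`GetChild()` play the role of the children list):

```cpp
#include <string>
#include <vector>

// Hypothetical scene-graph node. FbxNode plays this role in the FBX SDK.
struct Node
{
    std::string name;
    std::vector<const Node*> children;
};

// Depth-first search: process the current node, then recurse into each
// child, so every node is visited exactly once, parents before children.
void ProcessNodeDFS(const Node& node, std::vector<std::string>& visitOrder)
{
    visitOrder.push_back(node.name);         // process this node
    for (const Node* child : node.children)  // then recurse into children
        ProcessNodeDFS(*child, visitOrder);
}
```

Starting this at the scene's root node guarantees that a joint's parent has already been visited when the joint itself is processed, which is exactly what a skeleton hierarchy export needs.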

Personally, in order to learn how to use the FBX SDK, I took a provided sample that prints pretty much all the important information to the output, and studied its source code. That's a sort of reverse engineering, but I made my FBX importer (supporting skeletal animation) pretty quickly that way. You just need to prepare a simple scene that has the features important to you, export it to FBX, and import it into that SDK sample app.

> "argot" : The meaning of that word is more at a secret language specifically intended to exclude outsiders, or an informal slang used by groups. From the context, I assume you mean "jargon" or even "more common technical terms."

Yeah, I am not a native speaker, so I just used this word as I learned it several days ago from GRE vocabulary. Will change it.

> "...feels pretty damn good..." "pretty damn long" "how the hell" - No need for profanity in a technical document. That style of writing reflects on you as an author, and on the entire gamedev community.

Yeah. I need to be more professional! Definitely going to change it.

> FBXExporter::ReadNormal code: suggest you move the description of the parameters to a position before the code snippet to help the reader's understanding while the code is examined.

> "XMFLOAT3& outNormal: the output. This is trivial to explain" - Explain it anyway. You may well have readers that are looking for just such an explanation. If it is, indeed, trivial, delete the line.

> int inCtrlPointIndex: the index of the Control Point...
> int inVertexCounter: this is the index of the current vertex that we are processing. This might be confusing. Ignore this for now.
> ...
> This is why in the above code I passed in both inCtrlPointIndex and inVertexCounter. Because we don't know which one we need to get the information we need, we better pass in both.

> You use both inCtrlPointIndex and inVertexCounter in the paragraphs that follow without further definition, apparently relating them to control points and polygons. That sequence of paragraphs is confusing.

> You may want to consider moving the code snippet for FBXExporter::ProcessMesh(FbxNode* inNode) to an earlier position in the code and note that you will describe parts of it later in the article, as you do with other parts: "We will come back to it later." It may provide a better context for the various indices you later describe. Just a suggestion.

Yeah. Some of these are my concerns, too. I just got lazy when I wrote the article. I will make some structural changes so that the article reads smoothly and does not leave questions for the readers.

> You could add a link to (yeah, it's my own article) "Skinned Mesh Animation Using Matrices" for an overview of the skeleton hierarchy, nodes (frames, bones...) and animation data. That may provide a context for FBX nodes and your derivation of the bind pose ("offset") transforms, etc.

I read through your article and it is good stuff! Yeah, I think we should add links to each other's articles, because it is not cool if a reader knows how to do skeletal animation but cannot use their own models, or has models but does not know how to use that data.

> I'm vaguely familiar with the DFS concept, but it may help your readers if you simply expand the term, or delete the phrase "perform a recursive DFS from the root node" altogether, as you haven't introduced the concept of nodes and what the root node is.

Yeah. I will explain it more, since DFS is a very important algorithm in computer science.

> "You know in Maya,.." I don't personally know, no. Maybe just "In Maya,.."

Well, English is not my first language and I will change it.

> Just a guess: I'm not familiar with FBX or Maya. However, it may be that the Geometry Transform is the world transform for the hierarchy/mesh if they weren't modeled symmetric to the origin.

Yeah, I have the same kind of feeling. But I cannot pinpoint what GeometricTransform does, since when you get TransformMatrix, it is already the transform of the entire mesh. I will write more about it to clarify.

> You may want to take a look at my previously mentioned Skinned Mesh article and my blog on "An Animation Controller" for an idea how keyframe interpolation can be implemented.

I know how to do NLERP but not SLERP. I have not really done it yet, mainly because interpolating matrices is not ideal - too expensive. If we can interpolate vectors and quaternions, then I guess it is worth a shot? I will modify my exporter a bit so that instead of outputting the entire matrices, it outputs vectors and quaternions. This way interpolation will work well.

> However, I'd like to see the article cleaned up a bit more, particularly in the area of describing terms and variables used in the code. I understand very well that you may not understand all the inner workings of the FBX SDK, few probably do, but I think you can explain a bit better how you use the variables.

Yeah. I know at some point I need to "refactor" my article, because I slacked a bit when writing it.

Re: using the data in DirectX - Your suggestion for transforming between a right-handed coordinate system to a left-handed system is incorrect. The method you mention performs translations which merely create a mirror-image of the mesh in the right-hand system and make it renderable in DirectX by changing the culling. That does not change the coordinate system. With the method you suggest, I suspect you'll find that, for instance, a model which moves its right arm in OpenGL will move its left arm when rendered in DirectX.

The information you mentioned about transforming only the root node sounds like the correct approach. I import models from Blender (right-hand coordinate system) and apply a transformation (not a translation) only to the root node for use in DirectX.

EDIT: We apparently cross-posted as I see you are addressing previous comments. Excellent.

Regarding the transformation from right- to left-handed coordinate systems, the beauty of the hierarchical structure is that a transformation applied to the root frame is propagated to its children. The transformation I apply for Blender (right-handed, Z-up) to DirectX (left-handed, Y-up) is (in right-hand space):

- rotate by pi/2 (90 degrees) about the X-axis to bring the Y-axis up. Z is now pointing in the wrong direction.

- scale Z by -1
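Those two steps can be combined into one root matrix. A quick numeric sketch (plain row-major arrays and hypothetical helper names, column-vector convention) that checks the Blender up axis (0, 0, 1) indeed lands on the DirectX up axis (0, 1, 0):

```cpp
#include <cmath>

// out = a * b for 3x3 row-major matrices (column-vector convention)
void Mul3(const double a[3][3], const double b[3][3], double out[3][3])
{
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
        {
            out[r][c] = 0.0;
            for (int k = 0; k < 3; ++k)
                out[r][c] += a[r][k] * b[k][c];
        }
}

// out = m * v (transform a column vector)
void Transform(const double m[3][3], const double v[3], double out[3])
{
    for (int r = 0; r < 3; ++r)
        out[r] = m[r][0] * v[0] + m[r][1] * v[1] + m[r][2] * v[2];
}

// Builds the single root transform described above: rotate -pi/2 about X
// (brings the Z-up axis to Y-up), then scale Z by -1 (flips handedness).
// The function name is ours, not from Blender or DirectX.
void BuildZUpRightHandedToYUpLeftHanded(double out[3][3])
{
    const double kPi = 3.14159265358979323846;
    const double c = std::cos(-kPi / 2.0), s = std::sin(-kPi / 2.0);
    const double rotX[3][3]  = { {1, 0, 0}, {0, c, -s}, {0, s, c} };
    const double flipZ[3][3] = { {1, 0, 0}, {0, 1, 0}, {0, 0, -1} };
    Mul3(flipZ, rotX, out); // rotation is applied first, then the Z flip
}
```

Since this is a similarity-style root transform, applying it once to the root node propagates the conversion to every child, as described above.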

> because interpolating matrices is not ideal

Correct. Interpolation of matrices normally requires decomposition into rotation, scale and translate components. If the matrix involves non-linear scaling (shearing), the results of the decomposition will be indeterminate. Better to store animation information as (e.g.,) quaternions and vectors. For each time increment in the animation, NLERP (or SLERP) the quats, interpolate the vectors, and form a transformation from the result.

Note: For most animations, NLERP is probably sufficient as the changes in rotation are very small, and NLERP is faster than SLERP.
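For reference, a minimal sketch of both interpolators (the `Quat` struct is hypothetical; DirectXMath users would reach for `XMQuaternionNormalize`/`XMQuaternionSlerp` instead):

```cpp
#include <cmath>

// Minimal quaternion for illustration only.
struct Quat { double w, x, y, z; };

double Dot(const Quat& a, const Quat& b)
{
    return a.w * b.w + a.x * b.x + a.y * b.y + a.z * b.z;
}

Quat Normalize(const Quat& q)
{
    double len = std::sqrt(Dot(q, q));
    return { q.w / len, q.x / len, q.y / len, q.z / len };
}

// NLERP: linearly blend the components, then renormalize. Cheap, and a
// good approximation of SLERP when the rotation delta per key is small.
Quat Nlerp(const Quat& a, Quat b, double t)
{
    if (Dot(a, b) < 0.0) // take the short way around
        b = { -b.w, -b.x, -b.y, -b.z };
    Quat r = { a.w + (b.w - a.w) * t, a.x + (b.x - a.x) * t,
               a.y + (b.y - a.y) * t, a.z + (b.z - a.z) * t };
    return Normalize(r);
}

// SLERP: constant angular velocity, but needs trig per call.
Quat Slerp(const Quat& a, Quat b, double t)
{
    double d = Dot(a, b);
    if (d < 0.0) { b = { -b.w, -b.x, -b.y, -b.z }; d = -d; }
    if (d > 0.9995) return Nlerp(a, b, t); // nearly parallel quaternions
    double theta = std::acos(d);
    double sa = std::sin((1.0 - t) * theta) / std::sin(theta);
    double sb = std::sin(t * theta) / std::sin(theta);
    return { sa * a.w + sb * b.w, sa * a.x + sb * b.x,
             sa * a.y + sb * b.y, sa * a.z + sb * b.z };
}
```

At the midpoint of a small rotation the two produce nearly identical results, which is why NLERP usually suffices for animation keys.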


I tested the animation I exported and did not see the mirror problem you mentioned.

I think my way of converting the coordinate system is fine, but it does not really use a careful "linear algebra" approach to solve the problem.

The main reason is that I tried FBX's ConvertScene function but many people complain it has bugs for DirectX conversion so I avoided it.

I can do more research on it and figure out a better way to solve this conversion problem.

Thank you for your careful observation.

By the way, I changed the several things you mentioned that I need to change. Again, thank you for the advice.


Yeah. Ideally, I should only have the keys and interpolate between them.

What is your take on Speed VS Space?

Because right now, as you can see, I store every frame of an animation instead of just storing the keys and interpolating between them. This is memory-intensive, but it could save some processing power by skipping the interpolation.

What do you think is a good balance between space and speed? I don't really know how expensive interpolation can be if you need to do a lot of them.

Of course, if the expense is negligible, then I should definitely change the way I am doing it.

First, speed vs. space: my personal choice when it comes to real-time animation is normally in favor of speed. If the purpose for the animation is for game purposes, there's going to be plenty to do in between renders without spending time animating just one of the characters. With regard to memory usage, you can do a simple calculation to find out what the difference is between storing data in one format versus another. The difference between a matrix (16 floats) and a quaternion and 2 vectors (4 + 3 + 3 = 10 floats) is moot.

However, it's not necessarily a question of speed vs. space - as mentioned, decomposing a matrix isn't a good idea. If it means a bit more memory space and a bit more code to do something right, that's the only choice.

With regard to transforming the data from a right-handed system to a left-handed system, several more comments:

DirectX does not require or enforce the use of a right-handed or left-handed coordinate system. It doesn't require that any particular axis is "up." Further, back-face or front-face culling is a choice, not a requirement. In fact, several functions are provided explicitly for right-hand systems. How data is managed is just a matter of the correct application of mathematics.

It may be that a lot of programmers use a left-handed system with Z up, but that may not always be the case.

I would recommend that you either provide a sound mathematical method for conversion or provide a disclaimer that your article applies to right-handed systems, maybe even to just OpenGL. I certainly recommend against providing a method that you haven't thoroughly tested yourself. I think that's the simplest solution for that section of your article. Just mention that some DirectX applications may use left-handed systems and a proper method for use of FBX data will have to be determined by the user.

I make that suggestion for two reasons:

1. You haven't convinced me that you know the method is correct.

2. If I understand it right, your suggested method requires writing a routine to access and modify thousands of pieces of data, and that routine would be comprised of accessing a transformation, decomposing it, modifying the components, rebuilding the transformation and storing it again. If it were left up to me as a DirectX user, I would, instead, calculate a single transformation to apply to the root node.

That being said, if you have tested the conversion for a DirectX application using a left-handed, Y-up coordinate system, exactly the way you've described it, be more specific about how to "get" translations and rotation from a matrix. As mentioned above, if you mean decomposition, say that directly and mention the limitations. You should also address how to modify the scale and which axis or axes should be modified. I've created several models (not in FBX) that use scaling as part of the animation. If FBX does not support scaling, that's a moot point and should be stated.


> If your DirectX renderer uses DX's default settings and you use Left-Handed LookAt and Perspective Matrix, then you are using a "X-Right, Y-Up, Z-Into Screen" coordinate system.

That is incorrect.

Your effort to make your article useful to DirectX users is laudable, no doubt about that. However, it appears you're unfamiliar with DirectX and, as a result, you're making technically incorrect statements.

Also, as mentioned in previous posts, you do not justify your method of conversion which performs thousands of calculations, in lieu of a single transformation matrix. If that's because you haven't been able to get a "single transformation" to work, mention that in the article. You shouldn't ignore a method which may be much simpler if you don't have technical reasons to do so. You should really consider that FBX provides a function to do that transformation, and consider mentioning that, also, perhaps along with your personal experience with it. Unless you're sure of the technical knowledge of forum posters on FBX, just saying you "heard" there may be bugs doesn't really fit with the technical nature of the rest of your article.

If you feel you must provide a specific conversion method, perhaps you should just describe what the results of that method are, and let the user determine if that fits his circumstances.

> However, it appears you're unfamiliar with DirectX and, as a result, you're making technically incorrect statements.

Can you explain a little bit about DX's coordinate system? Because I feel I have some misunderstanding about this MSDN page:

http://msdn.microsoft.com/en-us/library/windows/desktop/bb324490(v=vs.85).aspx

It says "Microsoft Direct3D uses a left-handed coordinate system. If you are porting an application that is based on a right-handed coordinate system, you must make two changes to the data passed to Direct3D." And above those words, it shows a diagram of an "X-right, Y-up, Z-in" system.

> You should really consider that FBX provides a function to do that transformation, and consider mentioning that, also, perhaps along with your personal experience with it.

The link below shows why I do not use the ConvertScene function to perform the change. The staff member says, "Nope, we haven't spent time fixing the RHS to LHS in 2013.1, mainly because there is work-around for this." And my work-around is one of those work-arounds.

http://forums.autodesk.com/t5/FBX-SDK/Converting-scene-coordinate-system-and-other/td-p/4103701

> If that's because you haven't been able to get a "single transformation" to work, mention that in the article.

Well, yeah. I could not figure out the math that produces the same effect as "negating Z of the translation, negating X and Y of the rotation". If you can give me some hints, that would be really good, because that way I don't need to decompose matrices.

> If you feel you must provide a specific conversion method, perhaps you should just describe what the results of that method are, and let the user determine if that fits his circumstances.

Yeah, I feel I have to give a specific conversion method, because I see so many people on the Internet trying to figure out how to do the conversion, and they have no clue.

> "Before I say anything, I need to clarify that my way of conversion is only guaranteed to work if you make the model/animation in Maya and export the model/animation from Maya using its default coordinate system (X-Right, Y-Up, Z-Out Of Screen)."

> "If your animation has Scaling, you need to figure out yourself what conversion needs to be done since I have not encountered the case where Scaling happens."

With regard to Direct3D, the D3DX API provides a right- and left-handed version for every projection matrix function - D3DXMatrixPerspectiveLH/RH, D3DXMatrixOrthoLH/RH, etc., etc. Choosing back-face or front-face culling is provided for in some way in every Dx API I've used. I can create a LookAt matrix to look in any direction, with an orthogonal Up vector in any direction. I can set it up with the Z-axis pointing into the screen, out of the screen or any direction I choose. I can create models that are vertical along the X-axis if I like, and, by setting my view matrix with Up = the X-axis, it will render just fine. All that, however, has nothing to do with your article.

> if you want to import the model/animation into DirectX, you need to make some conversions. [emphasis mine]

The point I'm trying to make is that the data does NOT need to be modified to render it in either a right-handed or left-handed system. If your intent is to export modified data for the sole purpose of avoiding a single transformation at render time, among many transformations that will be done anyway, that's a different intent. But you're stating that it's necessary to modify the data to render it correctly. That is not correct.

Consider: if you want to animate an object to make it appear to be rotating, you do not modify the object data every render cycle. You set up a world matrix (a single transformation) for it that includes a rotation. You then send the unmodified object data through the pipeline where that transformation is applied, as well as other operations to transform it into clipped screen-space.

By the very same process, using the unmodified data you've provided from the FBX file in your article (very explicitly explained, I might add), you can render it in a left-handed coordinate space by applying a single transform which I described earlier. All that has to be done is to ensure that that coordinate system transformation is applied to the mesh, after it is animated but before it enters the pipeline. If the root node is setup with that single transformation as part of the data, that's all that's necessary.

During the animation process, each local transformation matrix will be multiplied by its parent's transformation, up through the hierarchy to the root node, where it will get multiplied by the root node's transformation matrix in any case.


I think I got your point.

Basically, I made the assumption that everyone is going to use the DirectXMath library, and that when they create look-at and projection matrices, they will use the provided functions to make a left-handed look-at matrix and a left-handed projection matrix. This way, it fits my description (correct?). Also, I made the assumption that everyone is using back-face culling in their renderer, and I need to explicitly say that I use back-face culling.

To make it clear, I need to say what specifications I used to render my scene in DirectX.  (This is what you want me to be clear about, correct?)

As for the matrix multiplication, it has something to do with my intent and the way I extract animation data.

Basically, my intent is that the reader should do no work in their renderer to display the animations correctly. In other words, I want all the work to fall on the exporter, because this is a tutorial on exporting FBX files to people's own custom formats. I want the data in the custom format to be ready to use.

Additionally, the targeted readers of this article are beginners so I don't want to make them modify the code in their renderer. That can be confusing to them.

The way I am exporting animation data now is: everything I export is in global space. So you could say the skeleton hierarchy is not that useful in my renderer, because I do not need to multiply up the skeleton hierarchy to calculate the final global transformation for each joint.

However, the reader can easily calculate the local transform themselves.

In this case, since I query FBXSDK for the global transformation of each joint and FBXSDK does not take into account the coordinate system conversion, I have to do the conversion myself.

I admit that I can think of a better way than decomposing the matrices.

But some kind of conversion has to be done in the exporter if I want to make the data in people's custom format ready-to-use.

I think what you want me to clarify is:

1. Tell my readers my intent(make the data ready-to-use in their renderer)

2. Tell them that my process makes the skeletal hierarchy less useful, because they can directly feed the shader the already-computed global transformations. (This has limitations: if they want to rotate a character by a certain degree according to the player's mouse input - like in Counter-Strike when your character looks around - this can cause problems, and I will mention this.)

3. Tell them the whole mechanism of getting the global transformations directly, and why the decomposition and conversion are therefore necessary. I will probably also mention how they can construct the local transformations from the global transformations.

Is this the correction/clarification that you want me to make?

Thank you very much for your corrections. I am quite a noob on renderers since I have not written my own from scratch, so your correction on DX's coordinate system is very valuable.

> If your DirectX renderer uses DX's default settings and you use Left-Handed LookAt and Perspective Matrix, then you are using a "X-Right, Y-Up, Z-Into Screen" coordinate system.

As mentioned before, that statement is incorrect. The mere use of a left-hand LookAt (view) matrix or a left-hand perspective projection does not guarantee that Z will be into the screen.

It would, perhaps, be much simpler for you to leave out all of the "Up," "Right," "In" and "Out" descriptions of the axes and talk only in terms of left- and right-handed coordinate systems.

It appears your routine transforms data from one coordinate system to another, and nothing more. Just say that without all the Ups and Rights and Ins and Outs.

"Many DirectX applications use a left-hand coordinate system. The following method converts the data from..." etc.

Separate the direction of the view from the handedness of the system. Maybe something like: "Following the conversion, a model created in FBX facing in such-and-such direction and 'up' in some-other-direction can then best be viewed in a left-handed coordinate system with the view facing this-way with 'up' that-way."

EDIT: Regarding the scaling issue which results from the use of decomposition, I still think you should be more descriptive about the limitations of your conversion method. Some modeler (who doesn't know programming) is going to model Pinocchio with a growing nose (by making the "nose bone" a lot longer while its width grows much less), and the programmer (who doesn't know how to model) is going to tell the modeler his model "doesn't work." Neither will know that there's non-uniform scaling involved. IMHO, a technical article should discuss the limitations as well as the advantages of whatever is being discussed.

You mention several limitations or requirements for the use of your code elsewhere in the article. Be as explicit with its use in that discussion also.


Ha. After some digging into the concept of "handedness", I think I finally know what you mean.

Basically, I think I will delete all my mentions of the "handedness" of my coordinate systems.

I am only going to describe a coordinate system in terms of where positive X is pointing, where positive Y is pointing, and where positive Z is pointing.

Because saying it this way will imply the handedness of the coordinate system and, at the same time, fix the technical inaccuracy you are talking about.

This is what you ultimately mean, right?

As for the limitations, I think I will aggregate all of them in one section called "Limitations and Beyond". That way, the reader can easily spot everything my method lacks and make the proper adaptations based on their own needs.

Again, thank you very much!

> I am only going to describe a coordinate system in terms of "where positive X is pointing to, where positive Y is pointing to, and where positive Z is pointing to". Because saying it this way will imply the handedness of the coordinate system and, at the same time, fix the technical inaccuracy you are talking about. This is what you ultimately mean, right?

Wrong. It's precisely the use of those terms that makes your statements incorrect.

> If your DirectX renderer uses DX's default settings and you use Left-Handed LookAt and Perspective Matrix, then you are using a "X-Right, Y-Up, Z-Into Screen" coordinate system.

That statement is incorrect as explained above (EDIT: and below)

Example: `D3DXMatrixLookAtLH( &viewMat, &D3DXVECTOR3( 0, 0, 0 ), &D3DXVECTOR3( 0, 1, 0 ), &D3DXVECTOR3( 0, 0, 1 ) );`

That's a left-handed LookAt matrix, just as you specify. It results in an eyepoint at the origin, looking into the screen along the +Y axis, with the Z-axis parallel with the screen, and pointing up.


Hmmmm. This makes me even more confused.

Once we decide the X and Y axes, the line along which Z lies is determined, and the "handedness" is decided by the direction of the Z axis.

So, if I only specify the direction of each axis, the reader should be able to deduce the handedness of the coordinate system I describe. And an "X-right, Y-up, Z-in" coordinate system is a left-handed coordinate system.

By the way,

> If your DirectX renderer uses DX's default settings and you use Left-Handed LookAt and Perspective Matrix, then you are using a "X-Right, Y-Up, Z-Into Screen" coordinate system.

This sentence is abandoned. I will not use it anymore because I am aware that it is wrong: "X-right, Y-up, Z-in" IS a left-handed coordinate system, but a left-handed coordinate system IS NOT necessarily "X-right, Y-up, Z-in".

I am confused because I thought the sentence in bold was what you meant, but it seems it is not.

Can you clear the confusion for me a little bit?

EDIT:

Basically I am not saying "Left-handed = X-right, Y-up, Z-in" anymore.

Equating handedness to any specific configuration of a coordinate system's axes is wrong.

As a result, I am only going to say:

"Under this specific configuration (X-right, Y-up, Z-in) in your renderer, if you follow my tutorial, the data will be ready to use."

Therefore,

> If your DirectX renderer uses DX's default settings and you use Left-Handed LookAt and Perspective Matrix, then you are using a "X-Right, Y-Up, Z-Into Screen" coordinate system.

I will delete this sentence.

And replace it with

"If your DirectX renderer has a X-right, Y-up, Z-in coordinate system, then you can follow my article without making any modification to the output data or your renderer."

I hope this clear something up.

> That statement is incorrect as explained above (EDIT: and below)
>
> Example: `D3DXMatrixLookAtLH( &viewMat, &D3DXVECTOR3( 0, 0, 0 ), &D3DXVECTOR3( 0, 1, 0 ), &D3DXVECTOR3( 0, 0, 1 ) );`
>
> That's a left-handed LookAt matrix, just as you specify. It results in an eyepoint at the origin, looking into the screen along the +Y axis, with the Z-axis parallel with the screen, and pointing up.

I think my response above addresses this point. Can you take a look? Thank you!

> Because "x-right, y-up and z-in" IS a left-handed coordinate system but a left-handed coordinate system IS NOT necessarily "x-right, y-up and z-in".

That statement is the whole idea! A left-handed system doesn't have to be rendered with X pointing to the right of the screen, or the Z axis oriented pointing inward when viewed on the screen.

It just occurred to me: are you trying to establish an alternate jargon for describing coordinate systems, and not refer to how axes appear on-screen? The terms up, right, and in are normally used to describe how the scene is actually rendered, i.e., how the axes are actually oriented with respect to the user sitting in front of a monitor.

The terms "left-handed" and "right-handed" are quite often used to make the distinction between DirectX and OpenGL. They are well-defined and accurately describe the situation. I'm afraid I don't understand your aversion to using technical terms in a technical article.

If your intent is to use those terms to generally describe coordinate systems, rather than using "left" and "right," I can't really object.


I don't really have an aversion towards technical terms.

The problem is that whenever I use DX, I use the same left-handed, X-right, Y-up, Z-in system.

This makes me think that everyone else does the same thing, which causes some confusion.

The reason I always try to specify X-right, Y-up, Z-out/in is that Maya uses an X-right, Y-up, Z-out system, and I want the reader to make connections between Maya, OpenGL, and DX.

Anyways, thank you very much for rectifying my misunderstanding.

I will definitely clarify the coordinate system configuration in my sentences.
