FBX SDK is a very flexible solution, but it has a really complicated API. For a start, I suggest you go with Assimp and *.DAE files. I'm suggesting this because the FBX format is designed as a complete scene file for storing work-in-progress data, rather than a finished, fully modeled scene. The Assimp library lets you skip all the details that FBX SDK introduces and that are unimportant when you are starting out.
The overall picture: there are a lot of different methods for animating a 3D scene/object.
If you look closely at 3ds Max/Maya, you will notice that each mesh is attached to a 'node'. Each node represents a location in space.
Those nodes are connected to form a 'hierarchy tree'. Each node may or may not have meshes attached to it. If a mesh is attached to a node, that means you must draw the attached mesh at the location of the node.
So let's imagine that we have a scene with 2 similar boxes and a sphere. Our scene would look like this.
Contained meshes : Box, Sphere.
RootNode(Transform:Z, no meshes are attached to that node)
|
|\------Body(Transform:TBody, AttachedMeshes Sphere)
|
|\------Feet0(Transform:TFeet0, AttachedMeshes Box)
|
\------Feet1(Transform: TFeet1, AttachedMeshes Box)
Basically, all nodes are children of the root node.
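To make the hierarchy concrete, here is a minimal sketch of the scene above in Python. The `Node` class and `collect_draw_calls` function are hypothetical names made up for this illustration (they are not Assimp or FBX SDK APIs), the transforms are reduced to plain translations, and the offsets are invented placeholder values.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    translation: tuple = (0.0, 0.0, 0.0)  # local offset relative to the parent (assumed values)
    meshes: list = field(default_factory=list)    # names of the meshes attached to this node
    children: list = field(default_factory=list)  # child nodes in the hierarchy


def collect_draw_calls(node, parent_pos=(0.0, 0.0, 0.0)):
    """Walk the tree and return (mesh name, world position) pairs."""
    # The world position is the parent's world position plus the local offset.
    world = tuple(p + t for p, t in zip(parent_pos, node.translation))
    calls = [(mesh, world) for mesh in node.meshes]
    for child in node.children:
        calls.extend(collect_draw_calls(child, world))
    return calls


# The example scene: a root with no meshes and three children.
root = Node("RootNode", children=[
    Node("Body", (0.0, 2.0, 0.0), ["Sphere"]),
    Node("Feet0", (-1.0, 0.0, 0.0), ["Box"]),
    Node("Feet1", (1.0, 0.0, 0.0), ["Box"]),
])

draw_calls = collect_draw_calls(root)
```

Note that the Box mesh is stored once and drawn twice, once per node it is attached to. A real scene would use full 4x4 matrices instead of translations, but the traversal has the same shape.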
So the first method is to animate the transforms. Imagine that our scene is a very simple character (the feet are represented by the 2 boxes, and the sphere is the body of the character).
So if we want to create a walking animation, we must add key frames to the transform parameter. A key frame is just a pair of (time in some unit (seconds, ticks, milliseconds), value).
If we want to represent the bouncy motion of the body, the key frames would look something like this:
Transform TBody :
Frame0( 0 second, translation(0, 0, 0) , rotation (...) scaling(...) ....)
Frame1( 0.5 second, translation(0, 5, 0) , rotation (...) scaling(...) ....)
Frame2( 1 second, translation(0, 0, 0) , rotation (...) scaling(...) ....)
// a similar thing for the legs ....
The only thing we must do to play the animation is evaluate the transform for the current time of the animation, draw the meshes using the newly evaluated transforms, advance the time, and repeat the process.
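The evaluate/advance/repeat loop can be sketched roughly like this. It uses the TBody key frames from above and covers only the translation part. Linear interpolation between key frames is an assumption of this sketch (real tools offer several interpolation modes, and rotations are usually handled differently), and `evaluate` and `play` are hypothetical helper names.

```python
from bisect import bisect_right

# (time in seconds, translation) pairs for the TBody track described above.
TBODY_KEYS = [
    (0.0, (0.0, 0.0, 0.0)),
    (0.5, (0.0, 5.0, 0.0)),
    (1.0, (0.0, 0.0, 0.0)),
]


def evaluate(keys, t):
    """Return the translation at time t, linearly interpolated between key frames."""
    times = [key[0] for key in keys]
    if t <= times[0]:
        return keys[0][1]   # before the first key: hold the first value
    if t >= times[-1]:
        return keys[-1][1]  # after the last key: hold the last value
    i = bisect_right(times, t)  # index of the first key whose time is > t
    (t0, v0), (t1, v1) = keys[i - 1], keys[i]
    alpha = (t - t0) / (t1 - t0)  # 0 at the previous key, 1 at the next one
    return tuple(a + (b - a) * alpha for a, b in zip(v0, v1))


def play(keys, duration, dt):
    """Evaluate the track at each time step; a renderer would draw here instead of recording."""
    frames = []
    t = 0.0
    while t <= duration:
        frames.append((t, evaluate(keys, t)))  # evaluate the transform for the current time
        t += dt                                # advance the time and repeat
    return frames


frames = play(TBODY_KEYS, duration=1.0, dt=0.25)
```

In a real application the time step would come from the frame timer, and the evaluated transform would be written back to the node before drawing.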
The second type of animation, which is closely related to node-based animation, is called skeletal animation. This animation type works by deforming the mesh. Basically, the artist creates a few nodes (called bones) that will be used to deform the mesh. Each bone has a list of vertices/indices that are influenced by that bone. To evaluate the animation, we again just need to re-evaluate the node transforms and then manually modify the vertex positions for each frame. You can search the web for more details about that method. Feel free to ask if you need more help.
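As a rough illustration of "modify the vertex positions", here is a small sketch of weighted skinning. Several things are assumptions made for this example: per-vertex weights (the usual way to blend between bones, though not spelled out above), bones that only translate, and the names `skin_vertices`, "lower" and "upper". The bone-to-vertex lists are also stored inverted here, as a per-vertex list of (bone, weight) pairs, because that is easier to loop over.

```python
def skin_vertices(rest_positions, influences, bone_offsets):
    """Blend each vertex between the positions its bones would move it to.

    rest_positions: vertex positions of the undeformed mesh
    influences:     per vertex, a list of (bone name, weight); weights sum to 1
    bone_offsets:   per bone, how far it has moved from its rest pose this frame
    """
    skinned = []
    for position, vertex_influences in zip(rest_positions, influences):
        moved = [0.0, 0.0, 0.0]
        for bone, weight in vertex_influences:
            offset = bone_offsets[bone]
            for axis in range(3):
                # Position the vertex would have if it followed this bone alone,
                # scaled by how strongly the bone influences it.
                moved[axis] += weight * (position[axis] + offset[axis])
        skinned.append(tuple(moved))
    return skinned


# Three vertices along a "limb" controlled by two hypothetical bones.
rest_positions = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0)]
influences = [
    [("lower", 1.0)],                 # follows the lower bone only
    [("lower", 0.5), ("upper", 0.5)], # blended half and half
    [("upper", 1.0)],                 # follows the upper bone only
]
# Evaluated for the current frame: the upper bone has moved 2 units along X.
bone_offsets = {"lower": (0.0, 0.0, 0.0), "upper": (2.0, 0.0, 0.0)}

skinned = skin_vertices(rest_positions, influences, bone_offsets)
```

The middle vertex ends up halfway between where the two bones would put it, which is what gives a smooth bend instead of a hard break. With full bone matrices the per-bone term becomes a matrix-times-vertex product, but the weighted sum stays the same.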
Keep in mind that there are many more methods for animating a model/scene, but this one is the most commonly used.