Is it possible to do Batch Rendering with 3D skeletal animation data?

Started by
7 comments, last by mmakrzem 9 years, 4 months ago

I've implemented a batch rendering technique that is described in detail here: http://www.gamedev.net/page/resources/_/technical/opengl/opengl-batch-rendering-r3900

I'd like to extend it now to support 3D skeletal animation data, but I'm not sure what is the best way to do that so that the bone transformation can happen on the GPU rather than on the CPU. The Batches as they are defined right now depend on a transform matrix but that means that if I try to render a human, each limb would go into its own Batch, which means I would not get any performance gain by using the BatchManager as described in the article above.

Can someone suggest how to do batch rendering that works with 3D skeletal animation data? Is that even possible?

Use an array of transforms.
Each vertex can store an index into that array - or more commonly 4 indices and 4 weights for smooth 'skinning' transitions at the elbows, etc...
Then to draw multiple characters in one batch, have each instance store an offset to add to each vertex's bone-index.

I'm a little confused about how this array would work. Let's say I am trying to animate a stick that is made up of 2 points. The first point is the feet, and the second point is the head. By default, the feet are at 0,0,0 and the head is at 0,1,0. Okay, in my game I have 3 sticks and each is animated in a different position and orientation. My batch would contain 6 vertices, 3 of which would be at 0,0,0 and 3 that would be at 0,1,0. If I want each stick to be animated and positioned correctly, I'd need three 4x3 transformation matrices to define how to move and rotate each of the sticks in 3D space. If I do what you propose, I'd have 3 indices per vertex, and an array containing three 4x3 matrices? This doesn't scale very well as a model will have 100's of vertices which means 100's of 4x3 matrices.

If I want each stick to be animated and positioned correctly, I'd need three 4x3 transformation matrices to define how to move and rotate each of the sticks in 3D space. If I do what you propose, I'd have 3 indices per vertex, and an array containing three 4x3 matrices? This doesn't scale very well as a model will have 100's of vertices which means 100's of 4x3 matrices.

I don't understand how you've come up with three in the bolded bit. Each stick only has two bones (head/feet), so each stick has two matrices. Each vertex also only has one bone-index because it's either connected to the head, or to the feet.


It doesn't matter how many vertices are in the feet/head. Per object, you have one 'feet' transform and one 'head' transform.

Transform buffer: {Head0, Feet0, Head1, Feet1, Head2, Feet2...}

Vertex Buffer if the head was made up of 2 verts and the feet also of two verts:


{
//stick 0's verts
  {pos={a,b,c},uv={d,e},bone={0/*aka Head0*/}},
  {pos={f,g,h},uv={i,j},bone={0/*aka Head0*/}},
  {pos={k,l,m},uv={n,o},bone={1/*aka Feet0*/}},
  {pos={p,q,r},uv={s,t},bone={1/*aka Feet0*/}},
//stick 1's verts
  {pos={u,v,w},uv={x,y},bone={2/*aka Head1*/}},
  {pos={z,A,B},uv={C,D},bone={2/*aka Head1*/}},
  {pos={E,F,G},uv={H,I},bone={3/*aka Feet1*/}},
...
}
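As a rough C++ sketch of how such a vertex buffer could be filled on the CPU: each instance copies the model's vertices and adds a per-instance offset to the bone index. The `Vertex` layout and the names `appendInstance` and `buildBatch` are made up for illustration; they are not taken from the BatchManager article.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical vertex layout: position, UV and a single bone index.
struct Vertex {
    float pos[3];
    float uv[2];
    std::uint32_t bone;  // index into the shared transform buffer
};

// Appends one instance of a mesh to the batch. The mesh stores bone indices
// local to the model (0 = head, 1 = feet); adding boneOffset turns them into
// indices into the shared transform buffer.
void appendInstance(std::vector<Vertex>& batch,
                    const std::vector<Vertex>& mesh,
                    std::uint32_t boneOffset) {
    for (Vertex v : mesh) {  // copy each vertex, then patch its bone index
        v.bone += boneOffset;
        batch.push_back(v);
    }
}

// Builds a batch holding instanceCount copies of the same mesh, assuming
// every instance owns bonesPerInstance consecutive transforms.
std::vector<Vertex> buildBatch(const std::vector<Vertex>& mesh,
                               std::uint32_t bonesPerInstance,
                               std::uint32_t instanceCount) {
    std::vector<Vertex> batch;
    batch.reserve(mesh.size() * instanceCount);
    for (std::uint32_t i = 0; i < instanceCount; ++i) {
        appendInstance(batch, mesh, i * bonesPerInstance);
    }
    return batch;
}
```

With a 4-vertex stick (2 head verts, 2 feet verts) and 3 instances, the batch holds 12 vertices whose bone indices run 0,0,1,1, 2,2,3,3, 4,4,5,5, matching the transform buffer order above.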

In the vertex shader, you then do something like:


  int boneIndex = vertex.bone;
  Vec4 transform0 = TransformBuffer.Load(boneIndex*3+0);//index*3 because we have 3 Vec4's per transform
  Vec4 transform1 = TransformBuffer.Load(boneIndex*3+1);
  Vec4 transform2 = TransformBuffer.Load(boneIndex*3+2);
  Mat4 transform = Mat4( transform0, transform1, transform2, Vec4(0,0,0,1) );//4th row is implicit, so only 3 rows are stored
  Vec3 worldPosition = mul(transform, Vec4(vertex.position,1) ).xyz;
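For checking the shader logic on the CPU, here is a small C++ sketch of the same lookup. It assumes the buffer stores each transform as three `Vec4` rows and that the vertex is treated as a column vector, which is how I read the pseudocode above; `skinPosition` and `dot4` are illustrative names only.

```cpp
#include <array>
#include <cstddef>
#include <vector>

using Vec3 = std::array<float, 3>;
using Vec4 = std::array<float, 4>;

// Dot product of two 4-component vectors.
float dot4(const Vec4& a, const Vec4& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

// CPU equivalent of the single-bone vertex shader: the buffer stores three
// Vec4 rows per bone (a 3x4 matrix); the fourth row (0,0,0,1) is implicit.
Vec3 skinPosition(const std::vector<Vec4>& transformBuffer,
                  std::size_t boneIndex,
                  const Vec3& position) {
    const Vec4 p = {position[0], position[1], position[2], 1.0f};
    const std::size_t base = boneIndex * 3;  // 3 rows per transform
    return {dot4(transformBuffer[base + 0], p),
            dot4(transformBuffer[base + 1], p),
            dot4(transformBuffer[base + 2], p)};
}
```

The translation of each bone sits in the fourth component of each row, so a row of (1,0,0,5) moves the vertex 5 units along x.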

Then as an extension to this, you can get "skinning" (soft transitions between bones) by using more than one bone index per vertex.
e.g. A vertex that's 75% controlled by the head bone, but 25% by the feet bone:
{pos={a,b,c},uv={d,e},bones={0/*aka Head0*/, 1/*aka Feet0*/}, weights={0.75,0.25}},

Then you write a VS that loads multiple bone indices and a blend weight for each one:


  int boneIndex0 = vertex.bones.x;
  Vec4 transform0_0 = TransformBuffer.Load(boneIndex0*3+0);
  Vec4 transform0_1 = TransformBuffer.Load(boneIndex0*3+1);
  Vec4 transform0_2 = TransformBuffer.Load(boneIndex0*3+2);

  int boneIndex1 = vertex.bones.y;
  Vec4 transform1_0 = TransformBuffer.Load(boneIndex1*3+0);
  Vec4 transform1_1 = TransformBuffer.Load(boneIndex1*3+1);
  Vec4 transform1_2 = TransformBuffer.Load(boneIndex1*3+2);

  Vec4 transform0 = transform0_0 * vertex.weights[0] + transform1_0 * vertex.weights[1];//weights[0] scales bone 0, weights[1] scales bone 1
  Vec4 transform1 = transform0_1 * vertex.weights[0] + transform1_1 * vertex.weights[1];
  Vec4 transform2 = transform0_2 * vertex.weights[0] + transform1_2 * vertex.weights[1];

  Mat4 transform = Mat4( transform0, transform1, transform2, Vec4(0,0,0,1) );

p.s. the above code does horrible linear blending of matrices, which doesn't produce very good quality. Often animation systems will use a quaternion + a vec3 scale + a vec3 position, blending them individually, and then using those blended results to construct a Mat4x4.
p.p.s. Half-Life 1 in 1998 was one of the first games I know of that pioneered "skinned animation" and it's been the de facto standard character animation technique ever since. It's common these days to have characters with, say, 10k verts and 50 bone matrices. Nextgen even more like 100k verts and 150 bone matrices.
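To illustrate the quaternion + scale + position idea from the p.s., here is a minimal C++ sketch. It is one possible approach, not what any particular engine does: rotations are blended with a normalized weighted sum (nlerp), scale and position are blended linearly, and the result is rebuilt into three matrix rows. `BonePose`, `blendPoses` and `toRows` are names made up for this example.

```cpp
#include <array>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Quat { float x, y, z, w; };
using Row = std::array<float, 4>;

// One bone pose in decomposed form: rotation, scale and position.
struct BonePose {
    Quat rotation;  // assumed to be a unit quaternion
    Vec3 scale;
    Vec3 position;
};

// Weighted sum of two vectors.
Vec3 mix(const Vec3& a, float wa, const Vec3& b, float wb) {
    return {a.x * wa + b.x * wb, a.y * wa + b.y * wb, a.z * wa + b.z * wb};
}

// Blends two poses component by component. The second quaternion is negated
// when needed so both lie in the same hemisphere, then the weighted sum is
// renormalized. Assumes the weighted sum is not the zero quaternion.
BonePose blendPoses(const BonePose& a, float wa, const BonePose& b, float wb) {
    const Quat& qa = a.rotation;
    const Quat& qb = b.rotation;
    const float d = qa.x * qb.x + qa.y * qb.y + qa.z * qb.z + qa.w * qb.w;
    const float s = (d < 0.0f) ? -wb : wb;
    Quat q = {qa.x * wa + qb.x * s, qa.y * wa + qb.y * s,
              qa.z * wa + qb.z * s, qa.w * wa + qb.w * s};
    const float len = std::sqrt(q.x * q.x + q.y * q.y + q.z * q.z + q.w * q.w);
    q = {q.x / len, q.y / len, q.z / len, q.w / len};
    return {q, mix(a.scale, wa, b.scale, wb),
            mix(a.position, wa, b.position, wb)};
}

// Rebuilds the three rows of the 3x4 matrix: rotation * scale in the first
// three columns, translation in the fourth.
std::array<Row, 3> toRows(const BonePose& p) {
    const Quat& q = p.rotation;
    const Vec3& s = p.scale;
    return {Row{(1.0f - 2.0f * (q.y * q.y + q.z * q.z)) * s.x,
                2.0f * (q.x * q.y - q.w * q.z) * s.y,
                2.0f * (q.x * q.z + q.w * q.y) * s.z, p.position.x},
            Row{2.0f * (q.x * q.y + q.w * q.z) * s.x,
                (1.0f - 2.0f * (q.x * q.x + q.z * q.z)) * s.y,
                2.0f * (q.y * q.z - q.w * q.x) * s.z, p.position.y},
            Row{2.0f * (q.x * q.z - q.w * q.y) * s.x,
                2.0f * (q.y * q.z + q.w * q.x) * s.y,
                (1.0f - 2.0f * (q.x * q.x + q.y * q.y)) * s.z, p.position.z}};
}
```

Blending an identity rotation with a 90 degree rotation about z at equal weights gives a 45 degree rotation, whereas blending the two matrices directly would also shrink the vertex towards the axis.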

Okay, I see what you have done here. The position for each stick's verts ( what you've called {a,b,c}, {f,g,h}, {k,l,m}, etc ) is calculated on the CPU. I was thinking this would be done in the shader.

Now about the TransformBuffer, that is just a *giant* array of values that contains all possible transformations to each bone, is that right? So if I have a model with 7 different animations that are possible, and the model has 10 bones in it, and each bone has 25 keys that define a particular animation sequence, then this TransformBuffer would contain 7x10x25x4x3 = 21,000 values?

How do I define and set a TransformBuffer in C++ so that it can be used as shown in your sample VS? Clearly the TransformBuffer variable is not a standard uniform because it would easily overflow the max uniform size set out by a graphics card.


a *giant* array of values that contains all possible transformations to each bone, is that right?

No. You would be rendering only 1 animation at a time, so the array would consist of 1 transformation per bone. You may want to look at this article regarding skinned mesh animation to give you a better idea how skinned mesh animation is done.
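To put rough numbers on that, here is a C++ sketch of packing only the current pose of each object into one flat float array each frame. The `Pose` type and `packTransformBuffer` are hypothetical; how each pose is sampled from its animation, and how the array is then handed to the GPU, is left out.

```cpp
#include <array>
#include <cstddef>
#include <vector>

using Row = std::array<float, 4>;
using BoneMatrix = std::array<Row, 3>;  // 3x4 transform: 12 floats per bone
// Hypothetical: the pose an object has right now, one matrix per bone,
// already sampled from whichever animation that object is playing.
using Pose = std::vector<BoneMatrix>;

// Flattens the current poses of all batched objects into one float array,
// object after object, bone after bone, row after row.
std::vector<float> packTransformBuffer(const std::vector<Pose>& poses) {
    std::vector<float> packed;
    for (const Pose& pose : poses) {
        for (const BoneMatrix& bone : pose) {
            for (const Row& row : bone) {
                packed.insert(packed.end(), row.begin(), row.end());
            }
        }
    }
    return packed;
}
```

For 3 objects with 10 bones each this comes to 3 x 10 x 12 = 360 floats per frame, regardless of how many animations or keys the model has.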

Also, if you're unfamiliar with creating and using constant buffers in a shader, I would strongly suggest you put off skinned mesh animation until later, as it's a rather complex process.

Please don't PM me with questions. Post them in the forums for everyone's benefit, and I can embarrass myself publicly.

You don't forget how to play when you grow old; you grow old when you forget how to play.

No. You would be rendering only 1 animation at a time, so the array would consist of 1 transformation per bone.

I realize that the standard way to animate would be to do one animation at a time, but here we are discussing batching multiple animated objects to render at once. Hodgman has been suggesting ways in which this could be done, but I'm not convinced that it is possible or a good idea yet.

Also, if you're unfamiliar with creating and using constant buffers in a shader, I would strongly suggest you put off skinned mesh animation until later, as it's a rather complex process.

I'm familiar with creating and using Uniforms but Hodgman's example looks to be using something different. Do you have a url that I can read more about what is going on here with the TransformBuffer?


standards way ... animate ... one animation at a time, but here we are discussing batching multiple animated objects to render at once.

Sounds like you're confusing the term "animation" with "animated object." Hodgman's example demonstrates how multiple objects could be batched, each object animated by its own single animation.

That is, any single vertex is processed using 1 or more matrices, each matrix indexed by bone id. Provided the indices for each vertex index the appropriate matrix, it doesn't matter which object that vertex "belongs" to.

Yes, there are limits to how many GPU registers can be used, and that depends on the shader model and the available hardware. But that limits the number of matrices, not the method.


Hodgman's example looks to be using something different.

I don't think so. His description:

Transform buffer: {Head0, Feet0, Head1, Feet1, Head2, Feet2...}

lists the matrices for object0's head bone transform, object0's feet bone transform, object1's head bone transform, etc. Object0's vertices would access the transforms using indices 0, 1, 2, etc., to calculate a transform. Object1's vertices would access transforms using an offset into the transform array - rather than starting at 0, 1, 2, etc., object1 would access the array using, e.g., 6, 7, 8, etc. It's the same principle**. Hodgman's code example demonstrates a way you could specify an offset into the transform buffer.

** EDIT: That is, the same principle as a single object using bone indices to access the appropriate bone transforms. For a single object, a vertex may be influenced by bones 4, 5, and 6. For multiple objects, that vertex may be still influenced by bones 4, 5 and 6, BUT will use indices 28, 29 and 30 into the array (24 + 4, 24 + 5 and 24 + 6) because transforms 0 through 23 are used by other objects.
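The offset arithmetic can be sketched in C++ as a running sum over the bone counts of the batched objects, so that objects with different skeletons can share one array. `computeBaseOffsets` and `globalBoneIndex` are illustrative names, not part of any existing API.

```cpp
#include <cstdint>
#include <vector>

// Returns, for each object, the index of its first transform in the shared
// array: the sum of the bone counts of all objects placed before it.
std::vector<std::uint32_t> computeBaseOffsets(
        const std::vector<std::uint32_t>& boneCounts) {
    std::vector<std::uint32_t> offsets;
    offsets.reserve(boneCounts.size());
    std::uint32_t next = 0;
    for (std::uint32_t count : boneCounts) {
        offsets.push_back(next);
        next += count;
    }
    return offsets;
}

// Maps a bone index local to one object onto the shared transform array.
std::uint32_t globalBoneIndex(std::uint32_t baseOffset,
                              std::uint32_t localBone) {
    return baseOffset + localBone;
}
```

The vertices keep their local bone indices; only the base offset differs per object, either baked into the vertex data or supplied per instance.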


okay, I think I got it.
