# OpenGL Is it possible to do Batch Rendering with 3D skeletal animation data?


## Recommended Posts

I've implemented a batch rendering technique that is described in detail here: http://www.gamedev.net/page/resources/_/technical/opengl/opengl-batch-rendering-r3900

I'd like to extend it now to support 3D skeletal animation data, but I'm not sure what is the best way to do that so that the bone transformation can happen on the GPU rather than on the CPU. The Batches as they are defined right now depend on a transform matrix but that means that if I try to render a human, each limb would go into its own Batch, which means I would not get any performance gain by using the BatchManager as described in the article above.

Can someone suggest how to do batch rendering that works with 3D skeletal animation data?  Is that even possible?

##### Share on other sites
Use an array of transforms.
Each vertex can store an index into that array - or more commonly 4 indices and 4 weights for smooth 'skinning' transitions at the elbows, etc...
Then to draw multiple characters in one batch, have each instance store an offset to add to each vertex's bone-index.
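
A minimal C++ sketch of the layout described above, offered as an illustration rather than the poster's actual code. `SkinnedVertex` and `appendToBatch` are hypothetical names, and the 16-bit index type is an assumption. Note that the reply suggests storing the offset per instance and adding it in the shader; this sketch instead bakes the offset into the vertices on the CPU, which is a simpler variant of the same idea.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical vertex layout: up to 4 bone influences per vertex.
struct SkinnedVertex {
    std::array<float, 3> position;
    std::array<std::uint16_t, 4> boneIndices;  // indices into the shared transform array
    std::array<float, 4> boneWeights;          // expected to sum to 1
};

// Appends one character's vertices to the batch, shifting every bone index by
// boneOffset so that it addresses that character's slice of the transform array.
void appendToBatch(std::vector<SkinnedVertex>& batch,
                   const std::vector<SkinnedVertex>& mesh,
                   std::uint16_t boneOffset) {
    for (SkinnedVertex v : mesh) {  // copy, so the source mesh stays untouched
        for (std::uint16_t& index : v.boneIndices) {
            assert(index <= 0xFFFF - boneOffset);  // guard against 16-bit overflow
            index = static_cast<std::uint16_t>(index + boneOffset);
        }
        batch.push_back(v);
    }
}
```

Appending the same mesh twice, with offsets 0 and 2, gives two characters in one vertex buffer whose vertices read from transforms 0..1 and 2..3 respectively.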

##### Share on other sites

I'm a little confused about how this array would work. Let's say I am trying to animate a stick that is made up of 2 points. The first point is the feet, and the second point is the head. By default, the feet are at 0,0,0 and the head is at 0,1,0. Okay, in my game I have 3 sticks and each is animated in a different position and orientation. My batch would contain 6 vertices, 3 of which would be at 0,0,0 and 3 that would be at 0,1,0. If I want each stick to be animated and positioned correctly, I'd need three 4x3 transformation matrices to define how to move and rotate each of the sticks in 3D space. If I do what you propose, I'd have 3 indices per vertex, and an array containing three 4x3 matrices? This doesn't scale very well as a model will have 100's of vertices which means 100's of 4x3 matrices.

##### Share on other sites

If I want each stick to be animated and positioned correctly, I'd need three 4x3 transformation matrices to define how to move and rotate each of the sticks in 3D space.  If I do what you propose, I'd have 3 indices per vertex, and an array containing three 4x3 matrices?  This doesn't scale very well as a model will have 100's of vertices which means 100's of 4x3 matrices.

I don't understand how you've come up with three in the bolded bit. Each stick only has two bones (head/feet), so each stick has two matrices. Each vertex also only has one bone-index because it's either connected to the head, or to the feet.

It doesn't matter how many vertices are in the feet/head. Per object, you have one 'feet' transform and one 'head' transform.

Vertex Buffer if the head was made up of 2 verts and the feet also of two verts:

{
//stick 0's verts
{pos={a,b,c},uv={d,e},bone={0/*aka Head0*/}},
{pos={f,g,h},uv={i,j},bone={0/*aka Head0*/}},
{pos={k,l,m},uv={n,o},bone={1/*aka Feet0*/}},
{pos={p,q,r},uv={s,t},bone={1/*aka Feet0*/}},
//stick 1's verts
{pos={E,F,G},uv={H,I},bone={3/*aka Feet1*/}},
...
}

In the vertex shader, you then do something like:

int boneIndex = vertex.bone;
//index*3 because we have 3 Vec4's per transform
Vec4 transform0 = TransformBuffer.Load(boneIndex*3+0);
Vec4 transform1 = TransformBuffer.Load(boneIndex*3+1);
Vec4 transform2 = TransformBuffer.Load(boneIndex*3+2);
Mat4 transform = Mat4( transform0, transform1, transform2, Vec4(0,0,0,1) );
Vec3 worldPosition = mul(transform, Vec4(vertex.position,1) ).xyz;

Then as an extension to this, you can get "skinning" (soft transitions between bones) by using more than one bone index per vertex.
e.g. a vertex that's 75% controlled by the head bone but 25% by the feet bone stores both bone indices, with weights of 0.75 and 0.25.

The VS then loads multiple bone indices and a blend weight for each one:

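int boneIndex0 = vertex.bones.x;
int boneIndex1 = vertex.bones.y;
Vec4 transform0_0 = TransformBuffer.Load(boneIndex0*3+0);//rows of the first bone's transform
Vec4 transform0_1 = TransformBuffer.Load(boneIndex0*3+1);
Vec4 transform0_2 = TransformBuffer.Load(boneIndex0*3+2);
Vec4 transform1_0 = TransformBuffer.Load(boneIndex1*3+0);//rows of the second bone's transform
Vec4 transform1_1 = TransformBuffer.Load(boneIndex1*3+1);
Vec4 transform1_2 = TransformBuffer.Load(boneIndex1*3+2);
Vec4 transform0 = transform0_0 * vertex.weights[0] + transform1_0 * vertex.weights[1];//blend row by row, one weight per bone
Vec4 transform1 = transform0_1 * vertex.weights[0] + transform1_1 * vertex.weights[1];
Vec4 transform2 = transform0_2 * vertex.weights[0] + transform1_2 * vertex.weights[1];
Mat4 transform = Mat4( transform0, transform1, transform2, Vec4(0,0,0,1) );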

p.s. the above code does horrible linear blending of matrices, which doesn't produce very good quality. Often animation systems will use a quaternion + a vec3 scale + a vec3 position, blending them individually, and then using those blended results to construct a Mat4x4.
p.p.s. Half-Life 1 in 1998 was one of the first games I know of that pioneered "skinned animation" and it's been the de facto standard character animation technique ever since. It's common these days to have characters with, say, 10k verts and 50 bone matrices. Next-gen even more, like 100k verts and 150 bone matrices.
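
As a CPU-side cross-check of the two-bone blend in the vertex shader above, here is a minimal C++ sketch. It assumes each transform is three rows of four floats with the translation in the fourth component of each row, which the thread does not actually specify; `Mat3x4`, `identity`, `blend` and `apply` are illustrative names only. It performs the same linear matrix blend the p.s. warns about, so it inherits the same quality caveat.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Vec3 = std::array<float, 3>;
using Vec4 = std::array<float, 4>;
using Mat3x4 = std::array<Vec4, 3>;  // assumed layout: 3 rows, translation in column 3

// Returns a transform that leaves positions unchanged.
Mat3x4 identity() {
    Mat3x4 m{};  // zero-initialised
    m[0][0] = m[1][1] = m[2][2] = 1.0f;
    return m;
}

// Linear (component-wise) blend of two bone transforms, one weight per bone.
Mat3x4 blend(const Mat3x4& a, float weightA, const Mat3x4& b, float weightB) {
    assert(std::fabs(weightA + weightB - 1.0f) < 1e-4f);  // weights should sum to 1
    Mat3x4 out{};
    for (std::size_t r = 0; r < 3; ++r) {
        for (std::size_t c = 0; c < 4; ++c) {
            out[r][c] = a[r][c] * weightA + b[r][c] * weightB;
        }
    }
    return out;
}

// Applies the transform to the homogeneous position (x, y, z, 1).
Vec3 apply(const Mat3x4& m, const Vec3& p) {
    Vec3 out{};
    for (std::size_t r = 0; r < 3; ++r) {
        out[r] = m[r][0] * p[0] + m[r][1] * p[1] + m[r][2] * p[2] + m[r][3];
    }
    return out;
}
```

With a head bone that translates by 4 units along x and an identity feet bone, a vertex weighted 0.75 / 0.25 ends up moved 3 units along x.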

Edited by Hodgman

##### Share on other sites

Okay, I see what you have done here.  The position for each stick's verts ( what you've called {a,b,c}, {f,g,h}, {k,l,m}, etc ) is calculated on the CPU.  I was thinking this would be done in the shader.

Now about the TransformBuffer, that is just a *giant* array of values that contains all possible transformations for each bone, is that right? So if I have a model with 7 different animations that are possible, and the model has 10 bones in it, and each bone has 25 keys that define a particular animation sequence, then this TransformBuffer would contain 7x10x25x4x3 = 21,000 values?

How do I define and set a TransformBuffer in C++ so that it can be used as shown in your sample VS?  Clearly the TransformBuffer variable is not a standard uniform because it would easily overflow the max uniform size set out by a graphics card.

##### Share on other sites

a *giant* array of values that contains all possible transformation to each bone, is that right?

No. You would be rendering only 1 animation at a time, so the array would consist of 1 transformation per bone. You may want to look at this article regarding skinned mesh animation to give you a better idea how skinned mesh animation is done.
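
To put numbers on "one transformation per bone": a batch of 3 objects with 10 bones each needs 3 x 10 x 12 = 360 floats per frame, not the 21,000 values estimated above. The C++ sketch below packs each object's current pose into one flat array. `Bone3x4` and `packTransforms` are hypothetical names, and the sketch assumes the animation system has already evaluated every bone's current transform for this frame. How the array reaches the shader is not covered in this thread; one option in OpenGL, stated here as an assumption, is a buffer texture (`glTexBuffer` with `GL_RGBA32F`, read in GLSL with `texelFetch`).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// One bone's current transform: 3 rows of 4 floats (assumed 4x3 layout).
using Bone3x4 = std::array<float, 12>;

// Concatenates the current pose of every object in the batch into one flat
// float array: object 0's bones first, then object 1's bones, and so on.
std::vector<float> packTransforms(const std::vector<std::vector<Bone3x4>>& objects) {
    std::vector<float> packed;
    for (const std::vector<Bone3x4>& bones : objects) {
        for (const Bone3x4& bone : bones) {
            packed.insert(packed.end(), bone.begin(), bone.end());
        }
    }
    assert(packed.size() % 12 == 0);  // always a whole number of transforms
    return packed;
}
```

The array is rebuilt (or updated in place) every frame from whichever animation each object is currently playing, so its size depends on the number of bones in the batch and not on the number of animations or keyframes.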

Also, if you're unfamiliar with creating and using constant buffers in a shader, I would strongly suggest you put off skinned mesh animation until later, as it's a rather complex process.

##### Share on other sites

No. You would be rendering only 1 animation at a time, so the array would consist of 1 transformation per bone.

I realize that the standard way to animate would be to do one animation at a time, but here we are discussing batching multiple animated objects to render at once. Hodgman has been suggesting ways in which this could be done, but I'm not convinced that it is possible or a good idea yet.

Also, if you're unfamiliar with creating and using constant buffers in a shader, I would strongly suggest you put off skinned mesh animation until later, as it's a rather complex process.

I'm familiar with creating and using Uniforms but Hodgman's example looks to be using something different. Do you have a URL where I can read more about what is going on here with the TransformBuffer?

##### Share on other sites

standard way ... animate ... one animation at a time, but here we are discussing batching multiple animated objects to render at once.

Sounds like you're confusing the term "animation" with "animated object." Hodgman's example demonstrates how multiple objects could be batched, each object animated by its own single animation.

That is, any single vertex is processed using 1 or more matrices, each matrix indexed by bone id. Provided the indices for each vertex index the appropriate matrix, it doesn't matter which object that vertex "belongs" to.

Yes, there are limits to how many GPU registers can be used, and that depends on the shader model and the available hardware. But that limits the number of matrices, not the method.

Hodgman's example looks to be using something different.

I don't think so. His description:

lists the matrices for object0's head bone transform, object0's feet bone transform, object1's head bone transform, etc. Object0's vertices would access the transforms using indices 0, 1, 2, etc., to calculate a transform. Object1's vertices would access transforms using an offset into the transform array - rather than starting at 0, 1, 2, etc., object1 would access the array using, e.g., 6, 7, 8, etc. It's the same principle**. Hodgman's code example demonstrates a way you could specify an offset into the transform buffer.

** EDIT: That is, the same principle as a single object using bone indices to access the appropriate bone transforms. For a single object, a vertex may be influenced by bones 4, 5, and 6. For multiple objects, that vertex may still be influenced by bones 4, 5 and 6, BUT will use indices 28, 29 and 30 into the array (an offset of 24 added to each) because transforms 0 through 23 are used by other objects.
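
The offset bookkeeping described here can be sketched in C++ as a running sum over the bone counts of the objects in the batch. `computeBaseOffsets` and `globalBoneIndex` are hypothetical helper names, not part of any library or of the batch-rendering article.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns, for each object in the batch, the index of its first transform in
// the shared transform array (a running sum of the preceding bone counts).
std::vector<std::size_t> computeBaseOffsets(const std::vector<std::size_t>& boneCounts) {
    std::vector<std::size_t> offsets;
    offsets.reserve(boneCounts.size());
    std::size_t next = 0;
    for (std::size_t count : boneCounts) {
        offsets.push_back(next);
        next += count;
    }
    return offsets;
}

// Maps an object's local bone index to its slot in the shared transform array.
std::size_t globalBoneIndex(const std::vector<std::size_t>& offsets,
                            std::size_t object, std::size_t localBone) {
    assert(object < offsets.size());
    return offsets[object] + localBone;
}
```

For example, if three earlier objects with 8 bones each occupy transforms 0 through 23, the fourth object's base offset is 24 and its local bone 4 maps to slot 28.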

Edited by Buckeye

##### Share on other sites

okay, I think I got it.
