Model animation - too many calculations


12 replies to this topic

#1 Milcho   Crossbones+   -  Reputation: 1175

Posted 04 February 2011 - 06:44 AM

Well, I've been working on animating a model based on a skeleton. I have a half-assed algorithm that auto-attaches vertices to bones. Each vertex is attached to anywhere from 1 to 3 bones. Each bone attachment also has an associated weight.

The problem is then with animating the model. I'm using a low-polygon model - something like 760 faces, with fewer than 2000 vertices.

I then have to calculate the new position of each vertex based on its bone attachments. For each bone, this consists of 3 multiplications, type double, to rotate to the correct position, 3 additions, again for doubles, to translate it to the correct position, and then a multiplication of that whole result by the weight assigned to that bone. At the end, I simply add up all the results for each bone, to get the new position.

The above method works, assuming the attachments are correct. The problem is that it seems to take TOO long to compute that for all vertices. I split the calculation up into 4 parts, where each update event does 1/4 of the vertices, and the model is updated only after all of them are calculated. That allowed me a bit over 60fps, but considering I'm only drawing one low poly model, that framerate is unacceptable. (For comparison, without calculating the vertex positions, I get a framerate of 900+.) I'm not sure where to optimize the calculations. They're all done on the CPU, which is a pretty big bottleneck, but I can't figure out how, if at all possible, to use the GPU for that.

Anyone have any experience programming model animation? Perhaps animating a model from a skeleton from an existing animation format? Any help at all would be greatly appreciated.



#2 AndyEsser   GDNet+   -  Reputation: 385

Posted 04 February 2011 - 06:52 AM

Only update vertex positions associated with bones that have moved?

Bear in mind that a Bone is effectively a transformation matrix, and your Graphics API (OpenGL/DirectX) will be doing these anyway when you set the position of the Vertex (glVertex...(), etc). Offloading the calculations to a Vertex Shader would be a good boost.

#3 Buckeye   Crossbones+   -  Reputation: 4422

Posted 04 February 2011 - 07:32 AM

As AndyEsser mentions, the GPU will do those calcs a lot faster in a shader. In addition, a change in framerate is not a good measure of performance. The actual time to do the calc is what you're concerned about.
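To put numbers on the framerate point: 60 fps is about 16.7 ms per frame and 900 fps is about 1.1 ms per frame, so framerates hide how much time a routine really takes. Below is a minimal sketch of timing a routine directly, assuming C++11 and `std::chrono::steady_clock`. The helper name `averageMicroseconds` is hypothetical and not part of anyone's code in this thread.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: returns the average wall-clock time of one call
// to `work`, in microseconds. `iterations` must be at least 1.
double averageMicroseconds(const std::function<void()>& work, int iterations) {
    assert(iterations >= 1);
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals

    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const auto stop = clock::now();

    // Convert the elapsed interval to microseconds as a double.
    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / iterations;
}
```

It could be called with a lambda that wraps the vertex update, for example `averageMicroseconds([&]{ updateVertices(); }, 100)`, where `updateVertices` stands in for whatever function does the skinning. Averaging over many calls smooths out timer resolution and scheduling noise.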

Also, it sounds like your weighting algorithm is suspect. As AndyEsser mentions, bones are really reference frames, commonly implemented as matrices, not separate rotations and translations.

In any case, a common approach (which can be implemented on the CPU or GPU):
// Given a vertex vin, 3 weights, 3 bone indices for that vertex, and an array of bone matrices..
vector3 vout = vector3(0,0,0);
for( int i=0; i<3; i++) vout += weight[i]*vectorMatrixMult( vin, boneMat[ boneIndex[i] ] );
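The pseudocode above could be fleshed out roughly as follows. This is only a sketch under stated assumptions: `Vec3`, `Mat3x4`, `SkinnedVertex`, `transformPoint` and `skinVertex` are illustrative names, the matrices are assumed to be 3x4 row-major affine transforms, and each bone matrix is assumed to already map a bind-pose position to its animated position.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Assumed layout: 3x4 affine matrix, row-major. Each row holds three
// rotation terms followed by one translation term.
using Mat3x4 = std::array<float, 12>;

// Applies rotation and translation to a point in one step.
Vec3 transformPoint(const Mat3x4& m, const Vec3& v) {
    return { m[0] * v.x + m[1] * v.y + m[2]  * v.z + m[3],
             m[4] * v.x + m[5] * v.y + m[6]  * v.z + m[7],
             m[8] * v.x + m[9] * v.y + m[10] * v.z + m[11] };
}

constexpr int kInfluences = 3;  // bones per vertex, as in the pseudocode

struct SkinnedVertex {
    Vec3 bindPosition;                      // position in the bind pose
    std::array<int, kInfluences> boneIndex; // indices into the bone array
    std::array<float, kInfluences> weight;  // expected to sum to 1
};

// Weighted sum of the vertex transformed by each influencing bone.
Vec3 skinVertex(const SkinnedVertex& v, const std::vector<Mat3x4>& boneMat) {
    Vec3 out{0.0f, 0.0f, 0.0f};
    for (int i = 0; i < kInfluences; ++i) {
        assert(v.boneIndex[i] >= 0 &&
               static_cast<std::size_t>(v.boneIndex[i]) < boneMat.size());
        const Vec3 p = transformPoint(boneMat[v.boneIndex[i]], v.bindPosition);
        out.x += v.weight[i] * p.x;
        out.y += v.weight[i] * p.y;
        out.z += v.weight[i] * p.z;
    }
    return out;
}
```

A vertex with fewer than three bones can pad the unused slots with any valid index and a weight of 0, which keeps the inner loop free of branches.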

Please don't PM me with questions. Post them in the forums for everyone's benefit, and I can embarrass myself publicly.


#4 Milcho   Crossbones+   -  Reputation: 1175

Posted 04 February 2011 - 07:56 AM

Well, that's unfortunate, because I have no experience with shaders.

I thought I was minimizing calculations by using my method instead of a matrix. What I have is a set of axes that gets rotated, and that determines the bone's orientation. When loading the model I store the original position of each vertex relative to every attached bone's axes. Then to get the vertex's transformed location I essentially have a vector: ( origRelVer_x * x_axis + origRelVer_y * y_axis + origRelVer_z * z_axis), where x_axis, y_axis and z_axis are the bone's orthogonal axis vectors. I thought this would be fewer calculations than a matrix multiplication, but then, it's all CPU. (Edit: now that I've looked at my code again, I realize each term is a scalar times a vector = 3 mults, so 9 multiplications in total + 2 vector additions, and that's before the other 3 additions for the correct position. So, yes, quite a bit more than I originally mentioned, and not actually an improvement over matrix multiplication.)
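For what it's worth, the axes formulation is the same arithmetic as a matrix multiply whose columns are the three axes. The sketch below shows the two side by side. `BoneFrame`, `transformByAxes` and `transformByMatrix` are made-up names for illustration, and the frame layout is an assumption about how the axes might be stored, not the actual code.

```cpp
#include <cassert>

struct Vec3 { double x, y, z; };

Vec3 operator*(double s, const Vec3& v) { return { s * v.x, s * v.y, s * v.z }; }
Vec3 operator+(const Vec3& a, const Vec3& b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }

// Hypothetical bone frame: three orthogonal axes plus the bone's origin.
struct BoneFrame { Vec3 xAxis, yAxis, zAxis, origin; };

// The axes form: each component of the relative position scales one axis.
// That is 9 multiplications, then vector additions for the sum and origin.
Vec3 transformByAxes(const BoneFrame& b, const Vec3& rel) {
    return rel.x * b.xAxis + rel.y * b.yAxis + rel.z * b.zAxis + b.origin;
}

// The matrix form: the axes become the columns of a 3x4 matrix and the
// origin becomes the last column. The operation count is identical.
Vec3 transformByMatrix(const BoneFrame& b, const Vec3& rel) {
    const double m[3][4] = {
        { b.xAxis.x, b.yAxis.x, b.zAxis.x, b.origin.x },
        { b.xAxis.y, b.yAxis.y, b.zAxis.y, b.origin.y },
        { b.xAxis.z, b.yAxis.z, b.zAxis.z, b.origin.z },
    };
    return { m[0][0] * rel.x + m[0][1] * rel.y + m[0][2] * rel.z + m[0][3],
             m[1][0] * rel.x + m[1][1] * rel.y + m[1][2] * rel.z + m[1][3],
             m[2][0] * rel.x + m[2][1] * rel.y + m[2][2] * rel.z + m[2][3] };
}
```

So storing the frame as a matrix costs nothing extra per vertex, and it has the practical advantage that the same data can be handed to a graphics API or shader unchanged.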

You're right that FPS isn't the best measure, especially since I average it over a second. The function clearly is slow though, so I need to improve it. Though from what it sounds like I might have to rework the way my bone rotation is stored.

I'm starting to wonder whether it's worth reinventing the wheel for the experience. I know this problem has been solved before, though I haven't found any very detailed descriptions of it.

#5 Buckeye   Crossbones+   -  Reputation: 4422

Posted 04 February 2011 - 08:18 AM

Depending on what API you're using, there are downloadable examples available. On my old machine ( ca. 2005 ), I have no problem (DirectX 9.0c) with CPU skinning/animation for a ~300 vertex model and 22 bones at ~90 FPS (I know, I know.. FPS isn't a good measure :) )

Also, if you have 700 faces and 2000 vertices, sounds like you're not using indexed vertices. If you use an indexed mesh, that might cut down the calc time quite a bit, too.

It obviously depends on the number of material changes that are made, as well. Combining multiple textures into one will reduce the time spent changing texture samplers.

EDIT: With regard to experience, depending on how much animation work you'll be doing in the future, understanding what it takes to do skinned animation might serve you well. However, downloading some code and examining it for a good understanding may be better than working out all the details yourself.

EDIT2:

Well, that's unfortunate, because I have no experience with shaders.


No time like the present. If you continue with game development, shaders are where you need to be anyway.


#6 Milcho   Crossbones+   -  Reputation: 1175

Posted 04 February 2011 - 08:53 AM

Yes, well, this was just an attempt at animation because I don't want to use a pre-made engine, and I thought I'd try to build just a fairly simple animated character. It really wasn't the best use of my time. And the animation was going so well until this point.

Also, about the face count and vertex count... hmm, I just realized. See, I wrote my own .obj loader, and the .obj format specifies a list of vertex positions, a separate list of normals, and a third list of texture coordinates. Then each face is a set of indices into each list. Because I'm storing the texture and normal data in my vertex struct, I had to read in all the lists, and then when reading in a face, create its vertices on the spot. I have a check to see if that particular vertex (combination of position, normal and texture) already exists, and if it does, link to that instead of creating a new one. But, as I just realized, I turned that off, because it lengthens loading time by about a second. I'm going to turn that check on, which does indeed greatly decrease the number of vertices (as most faces share them). I don't think it will have a huge impact, but it's worth a try. Thanks for reminding me!
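If the duplicate check is slow because it scans every existing vertex (an assumption, since the loader itself isn't shown), a keyed lookup is one way to keep it cheap. A minimal sketch, where `ObjCorner`, `IndexedMesh` and `buildIndexedMesh` are hypothetical names and each face corner is represented only by its three .obj indices:

```cpp
#include <array>
#include <cassert>
#include <map>
#include <vector>

// One face corner as written in an .obj face line:
// { position index, texture coordinate index, normal index }.
using ObjCorner = std::array<int, 3>;

struct IndexedMesh {
    std::vector<ObjCorner> uniqueCorners;  // one entry per distinct vertex
    std::vector<unsigned> indices;         // one entry per face corner
};

// Collapses repeated corners so that shared vertices are stored once.
// std::map gives O(log n) lookups instead of scanning the whole list.
IndexedMesh buildIndexedMesh(const std::vector<ObjCorner>& faceCorners) {
    IndexedMesh mesh;
    std::map<ObjCorner, unsigned> seen;  // corner -> index in uniqueCorners

    for (const ObjCorner& corner : faceCorners) {
        auto it = seen.find(corner);
        if (it == seen.end()) {
            // First time this combination appears: create a new vertex.
            const unsigned newIndex =
                static_cast<unsigned>(mesh.uniqueCorners.size());
            it = seen.emplace(corner, newIndex).first;
            mesh.uniqueCorners.push_back(corner);
        }
        mesh.indices.push_back(it->second);
    }

    assert(mesh.indices.size() == faceCorners.size());
    return mesh;
}
```

The real vertex data (position, normal, texture coordinate) would then be built once per entry in `uniqueCorners`, and skinning only has to touch those unique vertices.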



#7 haegarr   Crossbones+   -  Reputation: 4172

Posted 04 February 2011 - 09:05 AM

The following test runs in approx. 300 microseconds (debug) / 150 microseconds (release) on my 2.66 GHz i7 laptop, compiled with gcc 4.2; if I made no mistake, it uses 20 bones, 2000 vertices, and 3 bones per vertex:

float matrices[20*4*3];   // 20 bones, each a 3x4 row-major matrix
float vertices[2000*3];   // 2000 input positions (x, y, z)
float weights[6000];      // 3 weights per vertex
float results[2000*3];    // 2000 skinned positions

int matrixIndex = -1;
const float * vertex = vertices;
float * result = results;
const float * weight = weights;
for(unsigned int vertexIndex = 0; vertexIndex<2000; ++vertexIndex) {
    result[0] = result[1] = result[2] = 0.0f;
    for(unsigned int counter = 0; counter<3; ++counter) {
        // Test only: cycle through the bones instead of using real bone indices.
        matrixIndex = (matrixIndex+1)%20;
        const float * m = matrices+matrixIndex*4*3;
        // Indexed access is used because several *matrix++ in one expression
        // are unsequenced, which is undefined behaviour in C and C++.
        result[0] += *weight * (m[0] * vertex[0] + m[1] * vertex[1] + m[2]  * vertex[2] + m[3]);
        result[1] += *weight * (m[4] * vertex[0] + m[5] * vertex[1] + m[6]  * vertex[2] + m[7]);
        result[2] += *weight * (m[8] * vertex[0] + m[9] * vertex[1] + m[10] * vertex[2] + m[11]);
        ++weight;
    }
    vertex += 3;
    result += 3;
}
Using doubles instead, it takes approx. 330 microseconds (debug) / 190 microseconds (release).

#8 Milcho   Crossbones+   -  Reputation: 1175

Posted 04 February 2011 - 09:18 AM

I feel kind of stupid now. I was running in debug mode. I ran my algorithm in release mode, and even without splitting the calculations in 4 parts, I got a good speed on it. Yes, I realized that from the last post, thanks Haegarr.

Here's what my vertex update code looks like:

mesh_vertex *currVer;
	for (unsigned int i = start; i < end; i++)
	{	// for each vertex of the model
		
		Vec3f newPos(0.0, 0.0, 0.0), tPos;
		currVer = vertexList.at(i);
		for (int k = 0; k < vertexAttachedBones.at(i)->getNumBones(); k++)
		{	// for each of its attached bones
			tPos = getAbsolutePosition(
				vertexOriginalPosition.at(i).at(k), 
				vertexAttachedBones.at(i)->getBoneAt(k)
				);
			newPos = newPos + 
				     tPos * 
				     vertexAttachedBones.at(i)->getNormalizedWeightAt(k);
		}
		// update position
		currVer->x = newPos.x;
		currVer->y = newPos.y;
		currVer->z = newPos.z;
	}

Specifically, it runs in 1 millisecond or less (I used clock() to get the time). It could probably still use some improvement, but it's nowhere near as bad as I thought it was. Also, I know it's a bad measure, but I get ~650fps now. Much more acceptable.

Thanks for the help everyone.

edit: I have no clue why there's such a huge gap in performance between debug and release. Is it because I update everything with pointers? Not that it makes a big difference, since release is the mode to use for .. well, releasing.

#9 Milcho   Crossbones+   -  Reputation: 1175

Posted 04 February 2011 - 09:37 AM

Here's a screenshot of the animation in progress:
[Attached image: screen-01.jpg]
I still have some issues to work out with the automatic vertex-bone association, but it runs smoothly now!

Again, thanks for the help people.

#10 Buckeye   Crossbones+   -  Reputation: 4422

Posted 04 February 2011 - 09:51 AM

Debug compiles (depending on what API you're using) usually link in debug (vs. release) libraries. Those debug libraries commonly do a lot of error-checking ( that's what "debug" does for a living ) - looking for uninitialized variables, array indices out of bounds, etc., etc. Your experience isn't necessarily out of the ordinary. E.g., I get ratios of 200-400 ( 180 ms vs. 0.7 ms for one of my routines comes to mind ) between debug and release execution times.


#11 Milcho   Crossbones+   -  Reputation: 1175

Posted 04 February 2011 - 09:58 AM

Debug compiles (depending on what API you're using) usually link in debug (vs. release) libraries. Those debug libraries commonly do a lot of error-checking ( that's what "debug" does for a living ) - looking for uninitialized variables, array indices out of bounds, etc., etc. Your experience isn't necessarily out of the ordinary. E.g., I get ratios of 200-400 ( 180 ms vs. 0.7 ms for one of my routines comes to mind ) between debug and release execution times.


Yeah, I knew debug does a lot for checking for errors, hence why I usually run in debug mode when debugging (actually, I've noticed you can't rely on the debugger in release mode). I've just never had such a huge gap in performance, until now. That'll teach me to post on here with questions that could've been resolved by two mouse clicks. =)

#12 Milcho   Crossbones+   -  Reputation: 1175

Posted 04 February 2011 - 10:33 AM

Actually, since I'm not testing with a higher polygon count, does anyone know what a low- to medium- acceptable polygon count is for a human model? I'm no modeler, but I can just use subdivision to create a higher-resolution model to test.

#13 Buckeye   Crossbones+   -  Reputation: 4422

Posted 04 February 2011 - 11:52 AM

I don't know about "acceptable," but something a bit more complete than what you have in your posted image: maybe on the order of 2000-4000 faces (triangles), about as many vertices.




