Skinning in software

Started by
7 comments, last by Heodox 15 years, 10 months ago
I'm trying to implement model skinning in software with OpenGL (not using shaders). The animation works; the problem is speed. At first I drew directly with glBegin/glEnd and glVertex calls, which was slow (~10 fps). I have now switched to glDrawArrays and it is still pretty slow (~25 fps). The skin data is stored along with the vertex data in the standard shader skinning format (one bone_index vec4 and one weight_index vec4). What would be the best/fastest way of doing this?
Get a profiler and figure out where the bottleneck is. It's possible that your GPU is faster at transforming verts than your CPU is.

Why are you trying to do software skinning? How long was it taking the GPU to skin the verts? Was the bottleneck really the transform?
Quote:Original post by RDragon1
Get a profiler and figure out where the bottleneck is. It's possible that your GPU is faster at transforming verts than your CPU is.




I think the biggest bottleneck is that I calculate every vertex like this on the CPU:

v += mat[ind1] * original_v * weight1
v += mat[ind2] * original_v * weight2
v += mat[ind3] * original_v * weight3
v += mat[ind4] * original_v * weight4
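
For reference, the four-influence blend above can be written out as a plain C++ loop. This is only a minimal sketch: the type names (Vec3, Mat3x4, SkinnedVertex) and the function skinVertices are hypothetical, since the thread does not show the actual data structures, and it assumes affine bone matrices and weights that sum to 1.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; not taken from the original post.
struct Vec3 { float x, y, z; };

// Row-major affine bone matrix: columns 0-2 rotate/scale, column 3 translates.
struct Mat3x4 { float m[3][4]; };

struct SkinnedVertex {
    Vec3 position;                // bind-pose position
    std::array<int, 4> bone;      // bone indices
    std::array<float, 4> weight;  // blend weights, assumed to sum to 1
};

// Transform a point by an affine matrix (implicit w = 1).
inline Vec3 transformPoint(const Mat3x4& a, const Vec3& p) {
    return {
        a.m[0][0] * p.x + a.m[0][1] * p.y + a.m[0][2] * p.z + a.m[0][3],
        a.m[1][0] * p.x + a.m[1][1] * p.y + a.m[1][2] * p.z + a.m[1][3],
        a.m[2][0] * p.x + a.m[2][1] * p.y + a.m[2][2] * p.z + a.m[2][3]
    };
}

// v = sum over the four influences of (bone matrix * bind position) * weight.
void skinVertices(const std::vector<SkinnedVertex>& in,
                  const std::vector<Mat3x4>& bones,
                  std::vector<Vec3>& out) {
    out.resize(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const SkinnedVertex& v = in[i];
        Vec3 acc = {0.0f, 0.0f, 0.0f};
        for (int k = 0; k < 4; ++k) {
            const float w = v.weight[k];
            if (w == 0.0f) continue;  // skip unused influences
            const Vec3 p = transformPoint(bones[v.bone[k]], v.position);
            acc.x += p.x * w;
            acc.y += p.y * w;
            acc.z += p.z * w;
        }
        out[i] = acc;
    }
}
```

Writing into a preallocated output array and skipping zero weights avoids some per-vertex work, though how much that matters depends on the mesh.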

But if I remove this (so the model is not moving) I still get pretty bad results (200-400-600 fps, very jumpy). I also get very jumpy results with display lists: 200-400-800 fps.

Quote:Original post by RDragon1
Why are you trying to do software skinning? How long was it taking the GPU to skin the verts? Was the bottleneck really the transform?


I'm trying to add as much compatibility to the engine as I can (this would mainly be character animation without shaders). Skinning on the GPU is not implemented yet, but I know it will be fast (I did it on my last project).

I believe the bottleneck is mostly the transform, but I now also think something else is slowing it down.


Also, what profiler would you recommend? (I'm using Visual Studio 2008, if that is of any importance.)
You would need to optimize it with some intrinsics or inline assembly SSE.
Sig: http://glhlib.sourceforge.net
an open source GLU replacement library. Much more modern than GLU.
float matrix[16], inverse_matrix[16];
glhLoadIdentityf2(matrix);
glhTranslatef2(matrix, 0.0, 0.0, 5.0);
glhRotateAboutXf2(matrix, angleInRadians);
glhScalef2(matrix, 1.0, 1.0, -1.0);
glhQuickInvertMatrixf2(matrix, inverse_matrix);
glUniformMatrix4fv(uniformLocation1, 1, FALSE, matrix);
glUniformMatrix4fv(uniformLocation2, 1, FALSE, inverse_matrix);
Quote:Original post by V-man
You would need to optimize it with some intrinsics or inline assembly SSE.


It's unlikely to give you a huge boost to frame rates, though. Chances are the bottlenecks are 1) the amount of memory that needs to be read and written, and 2) the time it takes to upload the vertex array to the graphics card. SSE might give you slightly better performance, but simply using OpenMP and a #pragma omp parallel for would be a better idea. Better still would be to compress the vertex data to minimise the amount of data you need to read.
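
As a rough illustration of the compression idea, one possible scheme packs the four bone indices and four weights into one byte each, giving 8 bytes per vertex instead of 32 if they were stored as two float vec4s. This is a sketch under assumptions: the names PackedInfluences, packInfluences and unpackWeight are made up for the example, it assumes at most 256 bones, and it assumes the input weights sum to roughly 1.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical packed layout: 4 bone indices + 4 weights in 8 bytes.
struct PackedInfluences {
    std::uint8_t bone[4];    // assumes no more than 256 bones
    std::uint8_t weight[4];  // weight * 255, quantized so the four sum to 255
};

PackedInfluences packInfluences(const int bone[4], const float weight[4]) {
    PackedInfluences packed = {};
    int quantized[4];
    int total = 0;
    int largest = 0;
    for (int k = 0; k < 4; ++k) {
        const float w = std::min(std::max(weight[k], 0.0f), 1.0f);
        quantized[k] = static_cast<int>(std::lround(w * 255.0f));
        total += quantized[k];
        if (weight[k] > weight[largest]) largest = k;
        packed.bone[k] = static_cast<std::uint8_t>(bone[k]);
    }
    // Push the rounding error into the largest weight so the sum stays 255.
    quantized[largest] =
        std::min(std::max(quantized[largest] + (255 - total), 0), 255);
    for (int k = 0; k < 4; ++k) {
        packed.weight[k] = static_cast<std::uint8_t>(quantized[k]);
    }
    return packed;
}

// Recover an approximate float weight in [0, 1].
inline float unpackWeight(const PackedInfluences& p, int k) {
    return static_cast<float>(p.weight[k]) / 255.0f;
}
```

The weights lose some precision (steps of 1/255), which may or may not be visible depending on the model.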

Quote:
I believe the bottleneck is mostly the transform, but I now also think something else is slowing it down.


Software skinning shouldn't be that slow; I can do around 1 million verts at playable frame rates (though the code is admittedly heavily optimised and multi-threaded). I suspect you have something nasty going on, such as a floating-point exception or a division by zero. Try turning on all exceptions in your debugger and see if anything gets triggered. Alternatively, VTune would probably be able to identify the problem (probably the best tool for this job, assuming you have an Intel processor).
A quick form of *profiling* is to disable all drawing and check the CPU-only fps. That way you can quickly confirm whether it is a CPU bottleneck or a transfer/GPU bottleneck.
Quote:Original post by MaR_dev
A quick form of *profiling* is to disable all drawing and check the CPU-only fps. That way you can quickly confirm whether it is a CPU bottleneck or a transfer/GPU bottleneck.


The fps doesn't change if I turn off the draw calls, so it is purely a CPU bottleneck.

I've managed to do a couple of optimizations, and now the debug build runs at 45 fps and the release build at 350 fps.

One more big optimization I can see is compressing the vertex data (as suggested by RobTheBloke), but apart from that I can't see anything more I can do (the profiler showed me some stupid stuff I did, and that is now removed).

Quote:Original post by Heodox
The fps doesn't change if I turn off the draw calls, so it is purely a CPU bottleneck.


Disable all rendering (and buffer swapping), then you'll have a true indication of the performance. Don't forget that any SwapBuffers call takes a fairly long time to complete, probably longer than your skinning computation.
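
One way to get that number without a full profiler is to time the skinning pass on its own. The helper below is a minimal sketch using std::chrono (which needs a C++11 compiler, so newer than Visual Studio 2008); averageMilliseconds is a made-up name, and the callable you pass in stands for whatever function does your skinning.

```cpp
#include <chrono>

// Run `work` `iterations` times and return the average wall-clock time
// per call in milliseconds. `work` is any callable taking no arguments.
template <typename Work>
double averageMilliseconds(Work&& work, int iterations) {
    typedef std::chrono::steady_clock Clock;
    const Clock::time_point start = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed =
        Clock::now() - start;
    return elapsed.count() / iterations;
}
```

Called with a lambda that runs only the skinning loop (no draw calls, no SwapBuffers), this gives a per-frame CPU cost that can be compared against the frame time seen with rendering enabled.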

I'd also try experimenting with OpenMP to do something like:

#pragma omp parallel for
for(int i=0;i<num_verts;++i)
{
  v = 0;
  v += mat[ind1] * original_v * weight1
  v += mat[ind2] * original_v * weight2
  v += mat[ind3] * original_v * weight3
  v += mat[ind4] * original_v * weight4
}


For small data sets (i.e. if the skin/vtx data fits nicely in the CPU cache) you should see an improvement, assuming of course that you have a dual- or quad-core CPU. Make sure you enable OpenMP in your project settings.

If your mesh data is large, you won't see any difference and compression is your only option.
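
A slightly fuller sketch of the OpenMP loop is given below, with the accumulator declared inside the loop body so that each thread has its own copy. The types and the function name skinParallel are hypothetical, and the pragma has no effect (the loop simply runs serially) unless OpenMP is enabled, e.g. /openmp in Visual Studio or -fopenmp in GCC.

```cpp
#include <vector>

// Hypothetical types for illustration only.
struct Float3 { float x, y, z; };
struct Bone { float m[3][4]; };  // row-major affine matrix
struct Influences { int bone[4]; float weight[4]; };

void skinParallel(const std::vector<Float3>& bindPose,
                  const std::vector<Influences>& skin,
                  const std::vector<Bone>& bones,
                  std::vector<Float3>& out) {
    out.resize(bindPose.size());
    // Signed loop index: older OpenMP versions (2.0) require it.
    const int count = static_cast<int>(bindPose.size());

    #pragma omp parallel for
    for (int i = 0; i < count; ++i) {
        const Float3 p = bindPose[i];
        Float3 acc = {0.0f, 0.0f, 0.0f};  // local, so not shared between threads
        for (int k = 0; k < 4; ++k) {
            const float w = skin[i].weight[k];
            const Bone& b = bones[skin[i].bone[k]];
            acc.x += (b.m[0][0] * p.x + b.m[0][1] * p.y + b.m[0][2] * p.z + b.m[0][3]) * w;
            acc.y += (b.m[1][0] * p.x + b.m[1][1] * p.y + b.m[1][2] * p.z + b.m[1][3]) * w;
            acc.z += (b.m[2][0] * p.x + b.m[2][1] * p.y + b.m[2][2] * p.z + b.m[2][3]) * w;
        }
        out[i] = acc;  // each iteration writes only its own element
    }
}
```

Because every iteration reads shared data but writes only out[i], the loop needs no locking.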
Thanks for all the help; the skinning is much faster now. I'm now trying to implement SSE optimizations and have opened another thread about it. Here is the link:
http://www.gamedev.net/community/forums/topic.asp?topic_id=496728


This topic is closed to new replies.
