Slow hardware skinning

Well, I finally got skinning working and have been working on hardware skinning. After sorting out many, many bugs I've finally gotten it to work, but instead of a speed increase I'm seeing a huge speed decrease, from 100 FPS down to 60 FPS. I'm using a very simple shader that only calculates vertex positions and normals. No lighting is calculated and the colour is just set to white. I doubt the shader could be optimised much further. Anyway, here it is:

uniform mat4 boneMat[32];   // 32 bone matrices. A mat4x3 (3 vec4s per bone) would fit more bones, but ATI drivers can't handle it
attribute vec4 boneIds;     // Up to 4 bone indices per vertex
attribute vec4 boneWeights; // Up to 4 weights, one per bone

void main()
{
    vec3 final  = vec3(0.0);
    vec3 finalN = vec3(0.0);
    for(int i = 0; i < 4; ++i)
    {
        int boneID = int(boneIds[i]);

        mat4 boneMatrix = boneMat[boneID];

        final += (gl_Vertex * boneMatrix).xyz * boneWeights[i];

        // For normals, only the rotational part of the matrix should apply
        mat3 boneRotMatrix = mat3(boneMatrix); // extract the upper-left 3x3 (rotation) portion
        finalN += (gl_Normal * boneRotMatrix) * boneWeights[i];
    }

    gl_Position = gl_ModelViewProjectionMatrix * vec4(final, 1.0);

    gl_FrontColor = vec4(1.0);

    gl_FrontSecondaryColor = vec4(0.0);

    gl_TexCoord[0] = gl_MultiTexCoord0;
}

The only possibility I can think of is that I'm sending the vertex attributes through client-side vertex arrays. Otherwise I'd have to create two VBOs (one for indices, the other for weights) just for the vertex attributes, but I doubt that alone should be slowing it down this much. What else could be the reason for the slowdown?
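For reference, here's a rough, untested sketch of what a single-VBO setup for the two attributes might look like, so I wouldn't need two VBOs after all (locBoneIds/locBoneWeights stand for whatever glGetAttribLocation returned; the struct name is made up):

/* Rough sketch: both skinning attributes interleaved in one VBO.
   Assumes a GL 2.0 context and that skinAttribs/vertexCount exist. */
typedef struct {
    float boneIds[4];     /* bone indices stored as floats, read as a vec4 */
    float boneWeights[4]; /* one weight per bone */
} SkinAttribs;

GLuint skinVBO;
glGenBuffers(1, &skinVBO);
glBindBuffer(GL_ARRAY_BUFFER, skinVBO);
glBufferData(GL_ARRAY_BUFFER, vertexCount * sizeof(SkinAttribs), skinAttribs, GL_STATIC_DRAW);

glEnableVertexAttribArray(locBoneIds);
glVertexAttribPointer(locBoneIds, 4, GL_FLOAT, GL_FALSE, sizeof(SkinAttribs), (const GLvoid*)0);
glEnableVertexAttribArray(locBoneWeights);
glVertexAttribPointer(locBoneWeights, 4, GL_FLOAT, GL_FALSE, sizeof(SkinAttribs), (const GLvoid*)(4 * sizeof(float)));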
Don't worry, your fps always drops when your program starts rendering things.
CPU skinning gives me 100 FPS.
GPU skinning gives me 60 FPS.

Hardware skinning should be faster, but it's a lot slower. Many games use hardware skinning and they run pretty well even on slower CPUs.

Btw, I'm using a 1.66 GHz Core Duo with a Mobility Radeon X1600.
You could try using some profiling tools (like PIX for DirectX) to figure out why hardware skinning runs slower, but I have an idea why. What you have done here is move the work from the CPU to the GPU. If your original scene was already GPU limited, giving the GPU more work will only make it run slower. Hardware skinning helps to free up the CPU so it could be used for other tasks, like AI, physics, etc.

The vertex program looks simple, but it could be optimized:
- unroll the loop (maybe the compiler is already doing this, but you never know until you disassemble)
- try to get 4x3 matrices to work; this might decrease the number of instructions

Just disassemble the vertex program after each code change and see if it gets any better. Also, GPUs like interleaved data. Try packing positions, normals, bone weights and indices into a single VBO.
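Something like this for the interleaved layout (just a sketch; the struct and attribute location names are invented):

#include <stddef.h> /* offsetof */

/* One fully interleaved vertex: positions, normals, texcoords,
   weights and indices all pulled from a single VBO with one stride. */
typedef struct {
    float pos[3];
    float normal[3];
    float texcoord[2];
    float boneWeights[4];
    float boneIds[4];
} SkinnedVertex;

glBindBuffer(GL_ARRAY_BUFFER, meshVBO);
glVertexPointer(3, GL_FLOAT, sizeof(SkinnedVertex), (const GLvoid*)offsetof(SkinnedVertex, pos));
glNormalPointer(GL_FLOAT, sizeof(SkinnedVertex), (const GLvoid*)offsetof(SkinnedVertex, normal));
glTexCoordPointer(2, GL_FLOAT, sizeof(SkinnedVertex), (const GLvoid*)offsetof(SkinnedVertex, texcoord));
glVertexAttribPointer(locBoneWeights, 4, GL_FLOAT, GL_FALSE, sizeof(SkinnedVertex), (const GLvoid*)offsetof(SkinnedVertex, boneWeights));
glVertexAttribPointer(locBoneIds, 4, GL_FLOAT, GL_FALSE, sizeof(SkinnedVertex), (const GLvoid*)offsetof(SkinnedVertex, boneIds));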
deathkrush, PS3/Xbox360 Graphics Programmer, Mass Media. Completed Projects: Stuntman Ignition (PS3), Saints Row 2 (PS3), Darksiders (PS3, 360)
Maybe the ATI OpenGL drivers suck. Have you tried it in D3D?
Maybe ATI drivers do suck.

I don't have a D3D renderer and I'm not planning to write one in the near future.

My scene is extremely simple: just a few quads with fonts for the FPS/info display and one skinned mesh. The GPU is in no way taxed.

Previously I tried passing in the bone indices as floats, but it caused problems, probably because of floating-point inaccuracy: 4.0000 might become 3.9999 on the way to the GPU, which then truncates to 3 when cast to an integer in the shader.
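If I ever go back to float indices, rounding before the cast should guard against that (a sketch):

// Round instead of truncating, so 3.9999 still maps to bone 4
int boneID = int(boneIds[i] + 0.5);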

I was thinking it might have more to do with the GL side of things than with the shader itself. I'll continue tinkering with it and post any new findings.

Also, if anyone has implemented fast GPU skinning in GLSL, perhaps you could share your experience.
I'm gonna punt this over to the OpenGL forum and let them have a swing at it.
Make sure you update your drivers. There was a while when ATI drivers would fall back to software mode if you used indexing into a uniform array (which is what you are doing in the for loop).
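One way to spot that fallback (a sketch; the exact message text is driver-specific, so just eyeball the output) is to dump the program's info log after linking:

/* needs <stdio.h> and <stdlib.h>; 'program' is your linked program object */
GLint logLen = 0;
glGetProgramiv(program, GL_INFO_LOG_LENGTH, &logLen);
if (logLen > 1)
{
    char *log = (char*)malloc(logLen);
    glGetProgramInfoLog(program, logLen, NULL, log);
    printf("link log: %s\n", log); /* look for a warning about running in software */
    free(log);
}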
Sig: http://glhlib.sourceforge.net
an open source GLU replacement library. Much more modern than GLU.
float matrix[16], inverse_matrix[16];
glhLoadIdentityf2(matrix);
glhTranslatef2(matrix, 0.0, 0.0, 5.0);
glhRotateAboutXf2(matrix, angleInRadians);
glhScalef2(matrix, 1.0, 1.0, -1.0);
glhQuickInvertMatrixf2(matrix, inverse_matrix);
glUniformMatrix4fv(uniformLocation1, 1, GL_FALSE, matrix);
glUniformMatrix4fv(uniformLocation2, 1, GL_FALSE, inverse_matrix);
Maybe it is a matter of your profiling; FPS isn't the best profiling tool ;-)
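As a quick worked example of why: 100 FPS is 1000 ms / 100 = 10 ms per frame, while 60 FPS is 1000 ms / 60 ≈ 16.7 ms per frame. So the switch added roughly 6.7 ms per frame, and that millisecond delta is the number actually worth measuring and chasing.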

I think that at the current state of your project the GPU is most likely the bottleneck and the CPU is more or less idling. So, transferring work from your 'idling' CPU to your stressed GPU will decrease your FPS even if your new GPU skinning approach is much faster than your CPU approach.

Just think of HW skinning as 'freeing' your CPU of a lot of workload while 'burdening' your GPU with just a little more. Your game will benefit from this later on, when you start stressing the CPU with heavy game logic, physics, LOD management, netcode ...

--
Ashaman
Hmm, another question for those who write shaders: is it possible to write an equivalent shader in GLSL that performs as well as the fixed-function pipeline used by default when you don't attach any shaders?

The reason I ask is that I believe my current vertex shader is doing less work than the fixed-function pipeline (no lighting), so I don't see why it should run any slower.

Is the overhead of attaching/detaching program objects really that high? Also, if most games are CPU-limited, why does changing to a faster GPU speed up framerates? Any decent game should be coded to have the CPU and GPU running in parallel, so the faster GPU would still end up waiting for the CPU to finish its work.
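For the test I have in mind, something like this minimal vertex shader should do (a sketch; ftransform() is specified to produce the same position the fixed-function pipeline would):

// Minimal fixed-function-equivalent vertex shader, for benchmarking only
void main()
{
    gl_Position    = ftransform(); // matches fixed-function transform exactly
    gl_FrontColor  = gl_Color;
    gl_TexCoord[0] = gl_MultiTexCoord0;
}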

This topic is closed to new replies.
