Some advice on model rendering

7 comments, last by zedz 14 years, 10 months ago
Hello people. Some time ago I posted some questions on the same issue, but I keep bumping into new problems in this area. I'm a coder for a Half-Life mod, and I've been trying to increase performance in the mod. One of our main issues is the number of model polys that we must render, which can sometimes reach around 100k polys in the worst cases. Originally, HL does most of its transformations and light calculations on the CPU, but I've already ported the light calculations over to the GPU using GLSL.

Right now, I'm having trouble with the transformation matrices. Half-Life's transformation matrix for a bone is made up of 3x4 floats, which I extended to 4x4 floats so that it fits the GLSL matrix types (I looked, I can't use a 3x4 matrix in GLSL). The fourth float in each row is the position of the bone in world space. The shader that I use for studio model rendering already houses a lot of uniform variables, all of which are necessary for rendering. I know that my graphics card only supports 1024 uniform variables, so I can't send the whole set of bone matrices to OGL, which can be a maximum of 128 4x4 matrices. Does anyone have any suggestions on this matter? I don't think it's too healthy for performance to send a matrix down for each vertex I pass down.

Also, a second question: during my attempts, I tried sending at least one bone matrix down to GLSL, but the shader doesn't seem to be receiving anything. I get the uniform location properly, and I send my matrix down like this: glUniformMatrix4fvARB( loc_btrans, 1, GL_TRUE, m_pbonetransform[boneindex] );

I'd really appreciate any suggestions. Thanks.
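As an illustration of the 3x4 to 4x4 extension described in this post, here is a minimal C sketch. It assumes the bone transform is stored row-major, with the rotation in the left 3x3 and the translation in the fourth column; the function name `bone3x4_to_4x4` is made up for this example and is not part of the Half-Life SDK.

```c
#include <string.h>

/* Expand a row-major 3x4 bone transform (12 floats, translation in the
   fourth column of each row) into a row-major 4x4 matrix (16 floats).
   Assumption: this layout matches the bone matrices described above. */
void bone3x4_to_4x4(const float *in, float *out)
{
    memcpy(out, in, 12 * sizeof(float)); /* rows 0..2 are copied unchanged */
    out[12] = 0.0f;                      /* bottom row of an affine matrix */
    out[13] = 0.0f;
    out[14] = 0.0f;
    out[15] = 1.0f;
}
```

Because the result is row-major, it has to be uploaded with the transpose argument set to GL_TRUE, as in the call quoted above. On the second question, two things may be worth checking: glUniform* calls apply to the program object that is currently in use, and a location of -1 (which glGetUniformLocation returns when the compiler has removed an unused uniform) is silently ignored.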
Does anyone have any suggestions at least? I know that this has been done before, but I just don't know how. I'm only a beginner with GLSL in general, so I'd appreciate even the smallest piece of advice.

Thanks.
I don't want to "nay-say" what you're doing because it sounds fun.

But did you benchmark and performance test your modifications after moving the lighting to the GPU? There may have been a reason why they chose to do it on the CPU, and it might have been to free up enough space for the bone transforms in the shader.

GLSL tends to be very finicky. You may be having trouble transferring your bone data because of completely unrelated issues. GLSL programs only seem to function properly when the moons all align. Sometimes you just need to use different variable names, or other silly things, to make the compiler not ditch your uniforms. (I really hope I'm not the only one who believes this.)
Half-Life was doing all of its model rendering calculations on the CPU, so a prime problem was that you couldn't get more complex models in the game, or a lot of models either.

I couldn't move the bone matrices into the shader either. It's 2048 floats the way I do it, so I'm pretty sure I can't move my bone matrices to the shader. Instead, I guess what I could do is send the matrix to the shader for each vertex, but is this truly beneficial? I mean, it's 3x4 floats, which is a lot of data, especially for each vertex call when rendering a studio model.
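To put rough numbers on the budget problem, here is a small C sketch that estimates how many bone transforms fit into a uniform budget. The 1024 figure is the one quoted in this thread; the `reserved` parameter (uniforms already taken by lighting and so on) and the function name are hypothetical. The real limit can be queried with glGetIntegerv(GL_MAX_VERTEX_UNIFORM_COMPONENTS), and drivers may pack uniforms differently than this simple division assumes.

```c
/* Estimate how many bone transforms fit in a vertex-uniform budget.
   budget:          total float components available (1024 in the thread)
   reserved:        components used by other uniforms (hypothetical value)
   floats_per_bone: 16 for a 4x4 matrix, 12 for a 3x4 matrix
   Returns 0 when nothing fits or the arguments make no sense. */
int max_bones_in_budget(int budget, int reserved, int floats_per_bone)
{
    if (floats_per_bone <= 0 || reserved >= budget)
        return 0;
    return (budget - reserved) / floats_per_bone;
}
```

With nothing reserved, 1024 components hold 64 matrices of 16 floats or 85 of 12 floats, so the full set of 128 bones (128 x 16 = 2048 floats) does not fit either way.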

Before I moved the lighting calculations, around 60% of the performance loss came from the lighting calculations and 40% from the vertex transformations. A real drawback is that I have to rotate the normals too now, otherwise the shader can't work on them properly.
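For reference, rotating a normal by a bone transform can be sketched in C as below. This assumes a row-major 3x4 bone matrix that contains only rotation and translation (no scale or shear), so the upper-left 3x3 can be applied to the normal directly; the function name is made up for this example.

```c
/* Rotate a normal by the upper-left 3x3 of a row-major 3x4 bone matrix.
   The translation column (indices 3, 7 and 11) is deliberately skipped,
   because a direction must not be moved by the bone's position.
   Assumption: the matrix is a pure rotation plus translation.
   out must not point at the same memory as n. */
void rotate_normal_3x4(const float *m, const float *n, float *out)
{
    out[0] = m[0] * n[0] + m[1] * n[1] + m[2]  * n[2];
    out[1] = m[4] * n[0] + m[5] * n[1] + m[6]  * n[2];
    out[2] = m[8] * n[0] + m[9] * n[1] + m[10] * n[2];
}
```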

Also, I've been having serious issues with GLSL, so I guess it's time for me to switch to some other shading language. Which one is the best supported that you know of, and the least buggy?

I'm in real trouble here, and without fixing this I can't get past the performance issues.

Thanks.
Andrew.
I'd say if you don't want bugs, go with D3D and HLSL. Just a hunch though, I'm not entirely familiar with that toolset.

I suppose you could split your model into chunks based on the number of bones. Render each chunk along with its subset of bones.
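One way the chunking idea could look in C is sketched below. It is a simple greedy pass, not necessarily the best partition, and it assumes each triangle corner references exactly one bone index (no blending). The names `split_by_bones` and `MAX_BONES` are hypothetical; 128 is the bone limit mentioned earlier in the thread.

```c
#include <string.h>

#define MAX_BONES 128 /* bone limit quoted in the thread */

/* Greedy split: walk the triangles in order and start a new chunk whenever
   the bones a triangle needs would push the current chunk past max_palette.
   tri_bones holds 3 bone indices per triangle (one per corner).
   chunk_of_tri receives the chunk index assigned to each triangle.
   Returns the number of chunks, or -1 on max_palette < 3 or a bad index. */
int split_by_bones(const unsigned char *tri_bones, int tri_count,
                   int max_palette, int *chunk_of_tri)
{
    unsigned char used[MAX_BONES];
    int used_count = 0, chunk = 0, t, c, k;

    if (max_palette < 3)
        return -1;
    memset(used, 0, sizeof used);

    for (t = 0; t < tri_count; ++t) {
        const unsigned char *b = &tri_bones[t * 3];
        int needed = 0;

        /* Count distinct bones of this triangle not yet in the chunk. */
        for (c = 0; c < 3; ++c) {
            int dup = 0;
            if (b[c] >= MAX_BONES)
                return -1;
            for (k = 0; k < c; ++k)
                if (b[k] == b[c])
                    dup = 1;
            if (!used[b[c]] && !dup)
                ++needed;
        }

        /* Palette would overflow: close this chunk and open a new one. */
        if (used_count + needed > max_palette) {
            ++chunk;
            memset(used, 0, sizeof used);
            used_count = 0;
        }

        for (c = 0; c < 3; ++c) {
            if (!used[b[c]]) {
                used[b[c]] = 1;
                ++used_count;
            }
        }
        chunk_of_tri[t] = chunk;
    }
    return tri_count > 0 ? chunk + 1 : 0;
}
```

Each chunk would then be drawn with only its own subset of bone matrices uploaded, with the per-vertex bone indices remapped into that subset.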

Interesting metric you have. How much performance gain, overall, did you achieve with your modifications?
Well, my problem is that I can't go with D3D, because Half-Life's SDK doesn't give access to the D3D device pointer, and Half-Life in general doesn't support D3D as well as it does OpenGL. I could chop the model, but I don't think it would really be beneficial given how Half-Life sets its model structures up. I could give it a try I think, I just hope I don't break anything.

Would sending the bone matrices for each vertex call really be that slow?

Oh, and I didn't really check the performance gain, but it's visible. I had to switch to shaders anyway, because I wanted to introduce per-vertex lighting into the game at least. Calculating that on the CPU would've been too slow for me, and I hated the per-bone quality, because it was incorrect on most models, and especially on the props. I had no choice on this really, I had to improve lighting quality.
With a system that complex, changing one piece will require you to change pieces all over the place. I'm sure their developers settled on the solution they did due to past experience and optimizations for their game and hardware. If you've got different requirements then it's understandable to make modifications, but you're going to have to make them complete. If you need to chop models then you need to change the structures, etc.

I'm not sure how exactly you'd send one matrix per vertex, but it does sound painfully slow. Are you using immediate vertex mode in OGL? Is that why you say it's less compatible with D3D? It's hard for me to believe that Half-Life used immediate mode.

If I were you I'd completely replace the renderer with one that suits your needs. Since evidently the original did not, why use it?
Instead of uploading a 4x4 matrix, which consumes 4 registers, upload a quaternion (XYZW, 1 register) and an offset (XYZW, 1 register).
In the shader, you would build a 4x4 matrix.
It costs some cycles in the vertex stage but GPUs are fast enough these days.

Of course, it is still possible you will run out of registers. I think SM 2.0 GPUs support 256 or 512.
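The quaternion-plus-offset suggestion above can be sketched on the CPU side in C; the same arithmetic would run in the vertex shader. This is only a reference sketch under two assumptions: the quaternion is unit length and stored as (x, y, z, w), and the output is a row-major 3x4 matrix with the offset in the fourth column. The function name is made up for this example.

```c
/* Build a row-major 3x4 matrix from a unit quaternion q = (x, y, z, w)
   and an offset t = (tx, ty, tz).
   Assumption: q is normalized; a non-unit quaternion would add scaling. */
void quat_offset_to_3x4(const float *q, const float *t, float *m)
{
    float x = q[0], y = q[1], z = q[2], w = q[3];

    m[0]  = 1.0f - 2.0f * (y * y + z * z);
    m[1]  = 2.0f * (x * y - w * z);
    m[2]  = 2.0f * (x * z + w * y);
    m[3]  = t[0];

    m[4]  = 2.0f * (x * y + w * z);
    m[5]  = 1.0f - 2.0f * (x * x + z * z);
    m[6]  = 2.0f * (y * z - w * x);
    m[7]  = t[1];

    m[8]  = 2.0f * (x * z - w * y);
    m[9]  = 2.0f * (y * z + w * x);
    m[10] = 1.0f - 2.0f * (x * x + y * y);
    m[11] = t[2];
}
```

This only represents rotation plus translation, so it fits as long as the bones carry no scale or shear. At two registers per bone instead of four, 128 bones would need 256 registers.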

Doing skeletal animation on the CPU is easier, IMO.
Sig: http://glhlib.sourceforge.net
an open source GLU replacement library. Much more modern than GLU.
float matrix[16], inverse_matrix[16];
glhLoadIdentityf2(matrix);
glhTranslatef2(matrix, 0.0, 0.0, 5.0);
glhRotateAboutXf2(matrix, angleInRadians);
glhScalef2(matrix, 1.0, 1.0, -1.0);
glhQuickInvertMatrixf2(matrix, inverse_matrix);
glUniformMatrix4fv(uniformLocation1, 1, FALSE, matrix);
glUniformMatrix4fv(uniformLocation2, 1, FALSE, inverse_matrix);
What's wrong with doing the calcs on the CPU?
100k verts shouldn't be too much.
The failing is not with GLSL but with GPUs: when you're dealing with a lot of bones, a lot of GPUs can't really deal with that much info, as you've found out. Today's GPUs are much better than they used to be but are still nowhere near as flexible as CPUs.

BTW, FWIW, HL rendered models as a collection of strips + fans (like Quake 2).
Have you converted these to straight tris?
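Converting a strip or fan into plain triangles can be sketched in C as follows. The function emits local vertex indices (0 to vert_count - 1) within one strip or fan, following the OpenGL ordering rules so the winding stays consistent. The function name and the `is_fan` flag are assumptions made for this example; how the model format marks a primitive as strip or fan is not covered here.

```c
/* Expand one triangle strip or fan into a plain triangle list.
   vert_count: number of vertices in the strip or fan
   is_fan:     non-zero for a fan, zero for a strip
   out_idx:    receives 3 * (vert_count - 2) local vertex indices
   Returns the number of triangles written (0 if fewer than 3 vertices). */
int primitive_to_tris(int vert_count, int is_fan, int *out_idx)
{
    int k, tris = vert_count - 2;

    if (tris <= 0)
        return 0;

    for (k = 0; k < tris; ++k) {
        int *o = &out_idx[k * 3];
        if (is_fan) {
            /* Every fan triangle shares the first vertex. */
            o[0] = 0;     o[1] = k + 1; o[2] = k + 2;
        } else if (k & 1) {
            /* Odd strip triangles swap two corners to keep the winding. */
            o[0] = k + 1; o[1] = k;     o[2] = k + 2;
        } else {
            o[0] = k;     o[1] = k + 1; o[2] = k + 2;
        }
    }
    return tris;
}
```

A five-vertex strip yields the triangles (0,1,2), (2,1,3), (2,3,4); a five-vertex fan yields (0,1,2), (0,2,3), (0,3,4).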
