skinned mesh triangle count

14 comments, last by Norman Barrows 8 years, 10 months ago

if i'm shooting for about 100 skinned meshes onscreen at once, what sort of triangle count should i be looking at? less than 20K per character? less than 10K? 5K? 2K? 1500?

this is my first time doing skinned meshes, i'm not sure how low i have to go on the poly budget.

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php


5472. Not more, not less.

Assuming your intent is to maintain a reasonable frame rate, the poly count isn't nearly as important (if even relevant) as the time it takes to prepare for and perform the rendering. And that depends on the efficiency of the entire process, both on the CPU side as well as the GPU side. E.g., vertex skinning on the CPU isn't nearly as efficient as shader skinning can be. Updating animation parameters that haven't changed is a waste of time. Rendering two skinned meshes with separate draw calls isn't as efficient as rendering both with one draw call if they use the same parameters. A model that has only 2 influence bones per vertex, rendered with a shader that does calcs assuming only 2 weights, will be faster than a model with 4 weights per vertex.
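To illustrate the point about influence counts, here is a minimal CPU-side sketch of linear blend skinning for a single vertex. It is not taken from any poster's code: `Vec3`, `Bone`, `transformPoint`, and `skinVertex` are hypothetical names, and the 3x4 row-major bone layout is an assumption. The only thing it is meant to show is that the per-vertex work is one matrix transform plus one weighted add per influence, so a 2-influence path does roughly half the blending work of a 4-influence path.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical minimal types: a 3-component vector and a 3x4 affine bone
// matrix (row-major, translation in the fourth column).
struct Vec3 { float x, y, z; };
struct Bone { float m[3][4]; };

// Transform a point by one bone matrix.
inline Vec3 transformPoint(const Bone& b, const Vec3& p) {
    return {
        b.m[0][0] * p.x + b.m[0][1] * p.y + b.m[0][2] * p.z + b.m[0][3],
        b.m[1][0] * p.x + b.m[1][1] * p.y + b.m[1][2] * p.z + b.m[1][3],
        b.m[2][0] * p.x + b.m[2][1] * p.y + b.m[2][2] * p.z + b.m[2][3],
    };
}

// Linear blend skinning for one vertex with N influences.
// The loop runs N times, so cost grows with the influence count.
template <std::size_t N>
Vec3 skinVertex(const Vec3& p,
                const std::array<int, N>& boneIndex,
                const std::array<float, N>& weight,
                const Bone* palette, std::size_t paletteSize) {
    Vec3 out{0.0f, 0.0f, 0.0f};
    for (std::size_t i = 0; i < N; ++i) {
        assert(boneIndex[i] >= 0 &&
               static_cast<std::size_t>(boneIndex[i]) < paletteSize);
        const Vec3 t = transformPoint(palette[boneIndex[i]], p);
        out.x += weight[i] * t.x;
        out.y += weight[i] * t.y;
        out.z += weight[i] * t.z;
    }
    return out;
}
```

Instantiating `skinVertex<2>` versus `skinVertex<4>` is the CPU analogue of choosing a 2-weight or 4-weight shader variant. Whether the difference is measurable on a given GPU is something only profiling can tell.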

I would suggest you first get a working process running. Then, if needed, profile the process to determine where improvements could be made.

Please don't PM me with questions. Post them in the forums for everyone's benefit, and I can embarrass myself publicly.

You don't forget how to play when you grow old; you grow old when you forget how to play.


I would suggest you first get a working process running. Then, if needed, profile the process to determine where improvements could be made.

i've got all the code up and running.

it's just a test routine off the tools menu in Caveman 3.0, but it does the following:

loads and draws skinned meshes with multiple animations, and can change between animations. no animation blending yet.

supports all 5 or 6 skinning methods from tiny.cpp.

can set the texture for each mesh in a skinned mesh on the fly before drawing. textures can be pooled. materials can be pooled if desired.

can draw an object in relation to a bone (IE attach wpn to hand bone type stuff).

drawing multiple instances of a skinned mesh, each with its own animation controller, using a single shared skinned mesh and skeleton.

there's different versions of the code so you can load non-pooled textures and materials with the mesh and use them for drawing subsets, like a regular dx mesh,

or you can draw using the current texture and material for all meshes and subsets in the skinned mesh, or you can set textures for each mesh in the skinned mesh individually, then draw it using the current material.

the last thing i tested was using the in-game realtime 10 channel matrix editor to scale, rotate, and translate eyeballs to their correct position with respect to the head bone, while playing an animation where they don't move. this allows me to read off the srt values for use in a pre-defined "offset" matrix used to draw each eye. since the eyes are drawn as an attachment to a bone, i can change their texture, or make them look in any direction desired. to support visual tracking, the xr and yr angles from the lookat (or camera's forward) vector to the target's direction vector from the camera should be the rotations that need to be applied before the offset matrix, followed then by the combinedtransform matrix - i think. haven't tested it yet, but i think it's right, might need to negate the angles or something.
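The xr/yr angles described above could be computed along these lines. This is only a sketch under assumed conventions: a left-handed frame with +z forward and +y up, row-vector rotations, and a target direction already expressed in the head bone's local space. `GazeAngles` and `gazeAnglesFromDirection` are hypothetical names, not part of Caveman or the DirectX SDK, and as the post says, the signs may need negating for a particular rig.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper types for illustration only.
struct Vec3 { float x, y, z; };
struct GazeAngles { float xr, yr; };  // radians: xr about the x axis, yr about the y axis

// Assumes +z is "straight ahead" and +y is up, and that toTarget is
// expressed in the head bone's local space.
GazeAngles gazeAnglesFromDirection(const Vec3& toTarget) {
    const float horizontal =
        std::sqrt(toTarget.x * toTarget.x + toTarget.z * toTarget.z);
    assert(horizontal > 0.0f || toTarget.y != 0.0f);  // zero vector has no direction
    GazeAngles a;
    a.yr = std::atan2(toTarget.x, toTarget.z);   // 0 when the target is dead ahead
    a.xr = -std::atan2(toTarget.y, horizontal);  // negative xr looks up in this convention
    return a;
}
```

The resulting angles would feed two rotation matrices applied before the offset matrix, in the order the post proposes. That ordering is the poster's untested guess and is not verified here.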

about the only code left is a skinned mesh pool class, and a controller pool class. i've pseudo coded them, but was too lazy at 1am last night to look up syntax and var names, so it still needs about an hour's worth of work perhaps, at most.

so i'm more or less ready to go for it with a fully rigged and ready to rock model. the question is how many tris? i started with a beautiful mesh - excellent topology, easy to mod, looked great right out of the box. when i went to export with two animations, i thought blender had crashed. i tried exporting just the mesh. it took 10 minutes. turned out the mesh was over 100K tris. i decimated it to about 10K for testing, but the topology was ruined. about half way through testing, i switched to a nice topology mesh of about 30K tris. but this is probably still a bit big for 100 onscreen at once type of thing. a female will require 3 drawsubset calls - a male, two calls (no bra). i decided to implement bra and loincloth as separate meshes along with a head and body mesh, as the two or three meshes in a single skinned mesh. eyeballs and hair will add three more DIP calls per character, using static VBs and IBs. in my rigid body system, i'm using 14 (male) or 15 (female) static meshes to draw a character (if they're bald, subtract one hair mesh). some of them i know are perhaps unreasonably highpoly, 50K for a body, 20K for a head, that kind of thing. but they're all static. i was somewhat surprised to find little difference in FPS between various skinning methods in tiny.cpp sample running on my pc. something like 169 vs 175.



Then, if needed, profile the process to determine where improvements could be made.

perhaps a second test is called for. one that draws 100 characters on the screen. then just keep dropping the poly count until it runs fast enough.
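The "keep dropping the poly count until it runs fast enough" test could be organized roughly like this. It is a sketch, not the poster's code: `pickTriangleCount` is a hypothetical name, and `measureFrameMs` stands in for whatever routine loads the decimated mesh, draws the 100-character scene, and returns an averaged frame time.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Returns the first (largest) candidate triangle count whose measured frame
// time fits the budget, or -1 if none does.
// candidates must be sorted from highest to lowest triangle count.
// measureFrameMs is a hypothetical hook supplied by the test harness.
int pickTriangleCount(const std::vector<int>& candidates,
                      double budgetMs,
                      const std::function<double(int)>& measureFrameMs) {
    assert(budgetMs > 0.0);
    for (int tris : candidates) {
        if (measureFrameMs(tris) <= budgetMs) {
            return tris;  // highest count that is still fast enough
        }
    }
    return -1;  // even the smallest candidate was too slow
}
```

The budget should be the time left over after the rest of a worst-case scene has been drawn, not the whole frame, since the characters have to share the frame with terrain and everything else.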

i was hoping for a little more rule of thumb type guidance such as "if you're going for 100, definitely don't try over 2k".


I believe there's no one truth.
You have to set a goal for what you want to achieve. This might be a game where, in a given frame/view, you aim to have e.g. 20 skinned/animated meshes simultaneously.

Then you start building and when it's done you can start measuring.
It might be that you don't need to optimize anything, or that one of the steps in the pipeline slows the whole process down. Who knows.

Have fun and start optimizing when you know you have to.

Crealysm game & engine development: http://www.crealysm.com

Looking for a passionate, disciplined and structured producer? PM me

If they don't all need to be independently animated (like, if it's ok if they're all animated in unison) you could just calculate the bone info for one, and send it to the shader for all 100 of them. Or, if you need some variety, use some variation of this: say, animate 5 or 10, and send in one of those for each instance.
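A minimal sketch of that "animate a few, share among many" idea follows. The names `PosePoolPlan` and `planSharedPoses` are hypothetical, and the round-robin assignment is just one possible choice. The sketch only plans which shared pose each instance draws with; evaluating the animations and uploading bone matrices is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plan: which shared pose slot each instance uses, and how
// many skeletons actually need animating each frame.
struct PosePoolPlan {
    std::vector<std::size_t> slotOfInstance;
    std::size_t posesToAnimate;
};

// Assign instances to pose slots round-robin. With 100 instances and 10
// slots, only 10 skeletons are updated per frame instead of 100.
PosePoolPlan planSharedPoses(std::size_t instanceCount, std::size_t poseCount) {
    assert(poseCount > 0);
    PosePoolPlan plan;
    plan.posesToAnimate = std::min(instanceCount, poseCount);
    plan.slotOfInstance.reserve(instanceCount);
    for (std::size_t i = 0; i < instanceCount; ++i) {
        plan.slotOfInstance.push_back(i % plan.posesToAnimate);
    }
    return plan;
}
```

The trade-off is visible repetition: instances that share a slot move in lockstep, so this only suits crowds where that is acceptable.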

I know for my own implementations, it's the animation that really hogs the resources for high-polygonal models, not necessarily the drawing of them. This has been a pretty decent workaround for me.

Beginner here <- please take any opinions with grain of salt

cozzie's correct. There are too many things that can affect render time. Even if you posted all your code (please don't), you'd still have to be the one to determine performance for the meshes you use, with whatever you do to sort materials, or depth, etc., to set up your render queue, with whatever optimization of vertex order you've done, whether you've combined vertex buffers for small meshes, whether you use 2 bones/vertex, or 4, (as Misantes mentions) how efficiently your animation calcs are performed, ...

Understand - your post is essentially: "I have a program. How fast does it run?" The response to that is pretty much - "How big is a box?" ;)



You have to set a goal for what you want to achieve. This might be a game where, in a given frame/view, you aim to have e.g. 20 skinned/animated meshes simultaneously.

not 20, 100! not 100 active and 20 onscreen at once, 200-500 active and 100 onscreen at once! <g>


If they don't all need to be independently animated

nope, no cheating! <g>. 100 truly independent animated skinned meshes.


I know for my own implementations, it's the animation that really hogs the resources for high-polygonal models, not necessarily the drawing of them.

which resources precisely? are you saying that lots of models with lots of bones is an issue?

i got the test up and running last night. straight out of the box, 100 instances onscreen at once, 30K tris per instance, random animations, changing textures on individual meshes of a single skinned mesh on the fly. works like a charm. but not a lot of time left to draw anything else.

second test: 100 meshes at 3K tris each. screams along at max speed. didn't turn off vsync, so it maxed out at about 62fps. decimating to 3k tris loses a lot of detail around the eyes.

third test: 15K tris. decent speeds, but perhaps still a little slow when things get hot and heavy with dense terrain (~15K draw calls). i would analyze the frame times of a complex scene from a save game, and the frame times from the test, combined - to get a good worst case estimate.

4th test: 10K tris. decent speeds. a little lossy on mesh detail.

5th test: turn off changing textures twice for each mesh drawn. no difference. rasterizing tris, not changing textures, is the bottleneck.

all tests used fixed function pure device non-indexed hw vertex blending - no shaders, no effects. i've only measured a 2-3% difference in speed between all skinning methods, with non-indexed fixed function pure device hw vertex blending the slowest, and indexed HLSL the fastest. so HLSL vs HW vertex blending might increase the triangle budget for a given LOD by less than 5% at best. IE i can have an additional 750 tris on my 15K tri mesh and run at the same speed w/ HLSL - ooh and ahh! i'm SO impressed! yeah right.
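The arithmetic behind these tests is simple enough to write down. The sketch below uses hypothetical helper names and takes the post's own approximation that the triangle budget scales linearly with skinning speed. With 100 instances, the tested meshes work out to 3,000,000 triangles per frame at 30K, 1,500,000 at 15K, 1,000,000 at 10K, and 300,000 at 3K.

```cpp
#include <cassert>
#include <cstdint>

// Triangles submitted per frame for a crowd of identical characters.
std::uint64_t trianglesPerFrame(std::uint64_t instances,
                                std::uint64_t trisPerInstance) {
    return instances * trisPerInstance;
}

// Extra triangles per character affordable if a skinning method is
// gainPercent faster, using the linear approximation from the post
// (5% faster is treated as 5% more triangles).
std::uint64_t extraTriangles(std::uint64_t trisPerInstance,
                             std::uint64_t gainPercent) {
    assert(gainPercent <= 100);
    return trisPerInstance * gainPercent / 100;
}
```

For example, `extraTriangles(15000, 5)` gives the 750 triangles mentioned above. Note that the 3K test was capped by vsync, so its frame rate is a lower bound rather than a measurement of what that mesh size can do.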

conclusions:

i decided i'll go with something like a 15K tri mesh to start. and add a second level of LOD at 5K if needed.

i've discovered i can decimate the meshes of a rigged and animated character with no problems. so i'll rig and animate the original 30K mesh, then decimate to 15K for the game. later, i can decimate to 5K to implement LOD if desired. or use the original 30K decimated to 20K to step it up a notch. or even use the 30K, 10K and 5K meshes as a 3 level LOD strategy.
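A three-level LOD scheme like the 30K/10K/5K one mentioned above usually comes down to picking a mesh by distance from the camera. Here is a minimal sketch; `LodLevel` and `selectLod` are hypothetical names, and the distance thresholds in the usage note are made-up placeholders that would need tuning in the actual game.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct LodLevel {
    float maxDistance;  // use this level up to and including this distance
    int triangleCount;  // informational: size of the decimated mesh
};

// levels must be sorted by ascending maxDistance (most detailed first).
// Returns the index of the first level that covers the distance, or the
// last (coarsest) level if the distance is beyond all thresholds.
std::size_t selectLod(const std::vector<LodLevel>& levels, float distance) {
    assert(!levels.empty());
    for (std::size_t i = 0; i < levels.size(); ++i) {
        if (distance <= levels[i].maxDistance) {
            return i;
        }
    }
    return levels.size() - 1;
}
```

With levels `{20, 30000}, {60, 10000}, {200, 5000}` (distances are assumed values in arbitrary world units), a character 10 units away draws the 30K mesh and one 45 units away draws the 10K mesh. Since all levels are decimated from the same rigged mesh, they can share one skeleton and animation set.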


all tests used fixed function pure device non-indexed hw vertex blending - no shaders, no effects. i've only measured a 2-3% difference in speed between all skinning methods, with non-indexed fixed function pure device hw vertex blending the slowest, and indexed HLSL the fastest. so HLSL vs HW vertex blending might increase the triangle budget for a given LOD by less than 5% at best. IE i can have an additional 750 tris on my 15K tri mesh and run at the same speed w/ HLSL - ooh and ahh! i'm SO impressed! yeah right.


You do realise that 'fixed function' hardware hasn't existed in GPUs for a good 10 or more years, right?
EVERYTHING you do on a GPU since around 2002 has been using shaders under the hood; in fact that small performance difference could be due to you using indexed triangles instead of non-indexed triangles, which is a bit of a no-brainer.
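To make the indexed versus non-indexed point concrete, here is a minimal sketch that converts a plain triangle list into an indexed mesh by deduplicating positions. `Position`, `IndexedMesh`, and `buildIndexed` are hypothetical names; a real exporter would compare full vertices (normals, UVs, bone weights) and probably use a tolerance rather than exact float equality.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using Position = std::array<float, 3>;

struct IndexedMesh {
    std::vector<Position> vertices;      // unique vertices only
    std::vector<std::uint32_t> indices;  // three per triangle
};

// Build an indexed mesh from a non-indexed triangle list (3 entries per
// triangle). Vertices with exactly equal positions are merged.
IndexedMesh buildIndexed(const std::vector<Position>& triangleList) {
    assert(triangleList.size() % 3 == 0);
    IndexedMesh mesh;
    std::map<Position, std::uint32_t> seen;  // position -> index in mesh.vertices
    for (const Position& p : triangleList) {
        auto it = seen.find(p);
        if (it == seen.end()) {
            const auto index = static_cast<std::uint32_t>(mesh.vertices.size());
            it = seen.emplace(p, index).first;
            mesh.vertices.push_back(p);
        }
        mesh.indices.push_back(it->second);
    }
    return mesh;
}
```

A quad made of two triangles goes from 6 vertices to 4 vertices plus 6 indices. On a closed character mesh the ratio is typically much better, which means fewer vertices to skin when the vertex cache is hit.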

In short your sarcastic "I'm so impressed" is nothing more than a statement against your own misunderstanding of how to do things...

and you continue to ask bad questions; with no details about target platforms or performance levels this whole question and the performance metrics you are waffling about are meaningless... the fact you got anyone to reply to this thread to start with is amazing given the total lack of information.

This topic is closed to new replies.
