
# Optimizing Meshes


## Recommended Posts

Hi guys,

Without BLOWING my mind, can someone gently introduce me to how you go about optimizing a mesh when rendering? I've heard of octrees and quadtrees and all kinds of trees bar oak trees, and I was wondering if these can be applied to a model 5,000 faces in size, as that's how big my main character is. When I say main character I mean the one all humanoids will derive from, so there might be a few on screen. Damn, I'd only need 5 of them to be hitting 25,000+ faces per frame, yikes!

Clearly something else must happen. It mustn't be a simple case of chucking out a 5,000-face model all the time, and I'm not too comfortable modelling LODs, although I might just have to learn.

Is there a practical mathematical or code algorithm that can do this? Thanks ;o)

(Please try to keep it simple and always remember I am grateful for every reply).

##### Share on other sites
Well, other than the line:

    Mesh->OptimizeInplace(D3DXMESHOPT_ATTRSORT | D3DXMESHOPT_COMPACT | D3DXMESHOPT_VERTEXCACHE,
                          (DWORD*)MeshBuffer->GetBufferPointer(), 0, 0, 0);

My knowledge of how to optimize models is hazy. Having one mesh and lots of animation tracks (per instance) is a decent way of reducing memory consumption, but it does not lower the amount of data passed from CPU to GPU.

In DX10 I imagine you can pass the model in once, then pass in coordinates and an animation value and have the geometry shader "build" each instance on the GPU for you, cutting down the traffic between the two devices a lot. Of course, I have never done that, or I would be more specific and say for sure that you can.

I am aware that the vertex cache is a good way to increase rendering speed, according to the DX SDK, although I do not know how to use it for all mesh allocation, as the DX SDK tutorials haunt my nightmares with their convoluted code. All I can do to help is recommend you give OptimizeInplace a read over on MSDN. I will watch this thread to see if anyone gives better advice I can follow too :p

As for spatial-management trees, I think they come down to optimizing LOD calculations for terrain and for shadow/lighting calculations. The only other use you might be thinking of is a management system for skipping models and objects that are offscreen in all drawing calculations.
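To make that last idea concrete, here is a minimal sketch of an offscreen check: testing a model's bounding sphere against the six view-frustum planes. This is plain C++ with hypothetical names, and extracting the actual planes from your view/projection matrix is a separate step not shown here.

```cpp
#include <array>
#include <cassert>

// A plane in the form ax + by + cz + d = 0, normal pointing into the frustum.
struct Plane { float a, b, c, d; };

// Bounding sphere for a model.
struct Sphere { float x, y, z, radius; };

// Returns true if the sphere lies entirely outside any one frustum plane,
// i.e. the model can be skipped for this frame's draw calls.
bool IsCulled(const std::array<Plane, 6>& frustum, const Sphere& s)
{
    for (const Plane& p : frustum) {
        float dist = p.a * s.x + p.b * s.y + p.c * s.z + p.d;
        if (dist < -s.radius)
            return true; // completely behind this plane -> offscreen
    }
    return false; // touching or inside every plane -> must be drawn
}
```

A quadtree or octree would then let you cull whole groups of objects with one such test instead of testing every model individually.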

##### Share on other sites
Nice reply, thanks! It triggered some ideas, new and old. I'm using my own format, so sadly no access to the built-in DX mesh functions and such. I wonder if I can run something like that on a raw vertex buffer? Unlikely, probably.

I was thinking of uploading one set of vertices onto the GPU and then transforming them for each model that uses those vertices. So, for example, I upload the full set for the humanoid class. Then, when animating, I do the animation (vertex position changes) on the GPU. But I *somehow* keep an unaltered copy of the vertices on the GPU, so that each model applies its own transform to this 'reference' vertex list (am I talking carp again?) and only one set ever has to be copied across.
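A CPU-side sketch of that 'reference vertex list' idea, with hypothetical names (on real hardware the per-instance transform would live in shader constants, so the shared vertices are genuinely never touched):

```cpp
#include <vector>
#include <cassert>

struct Vec3 { float x, y, z; };

// One shared, never-modified vertex set for the whole humanoid class.
const std::vector<Vec3> kReferenceVerts = { {0,0,0}, {1,0,0}, {0,1,0} };

// Each instance only stores its own offset (a full 4x4 world matrix
// in practice); the reference vertices stay untouched.
struct Instance { Vec3 offset; };

// "Transform on draw": combine the shared data with per-instance state,
// leaving the reference list intact for the next instance.
std::vector<Vec3> BuildInstanceVerts(const Instance& inst)
{
    std::vector<Vec3> out;
    out.reserve(kReferenceVerts.size());
    for (const Vec3& v : kReferenceVerts)
        out.push_back({ v.x + inst.offset.x, v.y + inst.offset.y, v.z + inst.offset.z });
    return out;
}
```

The point of the sketch is that the vertex set is uploaded once and each model only contributes a tiny amount of per-instance state, which is exactly what hardware instancing formalises.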

Am I making enough sense for anyone to understand what I'm talking about?

##### Share on other sites
I am a sir. You are talking sense, but the idea of using the mesh from one model's data and drawing multiple instances of it would probably require a geometry shader (DX10+); I am not sure what the OpenGL equivalent is.

OK, so I have never done this before, but I know that if you have data in a vertex buffer of any description, you can basically transfer it around, so long as you're confident of your vertex definition (I don't recommend FVF). What I am trying to do, to clarify, is take a vertex buffer, dump its data into a mesh, optimise it, and then extract the vertex buffer again. This is my first ever attempt at brainstorming a solution, so don't hold me to this code, because I have never done this before.

    // This is by no account the correct way to do it, but if I were
    // trying, I would do something like this.
    ID3DXMesh* meshGeneric; // a mesh created with your vertex declaration
    // Lock the mesh's vertex buffer and copy your own vertex data into it.
    void* dest = 0;
    meshGeneric->LockVertexBuffer(0, &dest);
    memcpy(dest, yourVertexData, yourVertexDataSize);
    meshGeneric->UnlockVertexBuffer();
    // Run the optimisation, using the mesh's adjacency information.
    meshGeneric->OptimizeInplace(D3DXMESHOPT_ATTRSORT | D3DXMESHOPT_COMPACT | D3DXMESHOPT_VERTEXCACHE,
                                 (DWORD*)yourAdjacencyBuffer->GetBufferPointer(), 0, 0, 0);
    // Extract the optimised vertex buffer for your own use.
    IDirect3DVertexBuffer9* optimisedVB = 0;
    meshGeneric->GetVertexBuffer(&optimisedVB);
    // While I cannot be sure this is the right way to do it, I am sure
    // it is possible, and you are unlikely to out-optimise this process.

Not sure that's how you do it, but I am confident you could do it somehow. In retrospect, you might need lock and unlock calls around some of those steps.

##### Share on other sites
I'm sure I won't be the only one who will warn you about premature optimization, but mind the warning signs. If you don't know how much an optimization will help, then it's probably not worth doing at the moment.

5,000 faces is not much at all for remotely modern hardware. I've had 100 models (single texture, though for sure I'm not pixel-shader bound yet), each rendering 650 verts in 400 faces through 4 DIP calls, rendered from 2 cameras... and even with the 800 DIP calls per frame (no instancing) I had a very happy 60 fps with no optimizations at all, and my rig is from the pre-Vista days.

According to the research I've seen, which is by no means guaranteed to be correct, a SetTexture call (2,500-3,500 CPU cycles) or, worse, a SetStreamSource call (3,500-6,000) is likely to stall you faster than the 1,200-1,400 cycles for DrawIndexedPrimitive. So ordering your tris according to texture/shader will do wonders to speed things up.
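Taking those rough, hardware-dependent cycle figures at face value, the arithmetic for sorting by texture might look like this. A back-of-envelope sketch, not measured numbers:

```cpp
#include <cassert>

// Rough CPU cost per call, in cycles: mid-range of the figures quoted
// above. Real numbers vary by driver and hardware.
const long kSetTextureCycles = 3000;
const long kDrawCycles       = 1300;

// Cost of submitting `draws` batches when `textureSwitches` of them
// require a SetTexture call first.
long FrameCost(long draws, long textureSwitches)
{
    return draws * kDrawCycles + textureSwitches * kSetTextureCycles;
}
```

With 100 draws and 5 distinct textures, sorting drops the state-change cost from one SetTexture per draw to one per texture, which is where most of the saving comes from.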

With that being said, the goal is to keep data in the cache as long as possible. This means drawing nearby faces together. If the GPU has to pull 3 verts for a tri in your nose, then 3 more for a tri in your toes, and so on, it won't be long before it says "enough" and kills your framerate.

Look forward to seeing your work... I'm ready to tackle my own animation, but decided to stick around in GUI hell for a while...

##### Share on other sites
I can show 16 seconds of boredom in the video below of a repeated arm wave with my own mesh format:

It's just loaded out of a binary file with arrays corresponding to vertices/indices/normals/textures and the like laid out in a pre-defined way.

Every time I ask a question on here, a whole new universe opens up. It will take me a while to digest what you have both said, so I'll reply properly shortly.

**Update**

EnlightenedOne:

Hi, is that only for .x files, or can I use it on any vertex buffer? Maybe a dumb question, sorry. Very good reply! Lots of research needed there; thanks a lot for a great reply ;o)

##### Share on other sites
Quote:
 Original post by Burnt_Fyr: I'm sure I won't be the only one who will warn you about premature optimization, but mind the warning signs. If you don't know how much it will help things, then it's probably not worth it at the moment.

Ok thanks mate I'll keep that in mind!

Quote:
 Original post by Burnt_Fyr: 5000 faces is not much at all for remotely modern hardware... I've had 100 models (single texture, though for sure I'm not PS bound yet), rendering 650 verts in 400 faces through 4 DIP calls, rendered from 2 cameras... and even with the 800 DIP calls per frame (no instancing) I had a very happy 60 fps with no optimizations at all, and my rig is from the pre-Vista days...

Sounds good!

Quote:
 Original post by Burnt_Fyr: According to the research I've seen, which is by no means implied to be correct, a SetTexture (2500-3500 CPU cycles) or worse SetStreamSource (3500-6000) call is likely to stall you faster than 1200-1400 for DrawIndexedPrimitive. So having your tris ordered according to the texture/shader will do wonders to help speed things up.

So this means I need to keep SetStreamSource calls down, right? So can I use one SetStreamSource call and draw all my models from it, yet still animate them independently?

Quote:
 Original post by Burnt_Fyr: With that being said, the goal is to keep data in the cache as long as possible. This means drawing nearby faces together. If I have to pull 3 verts for a tri in your nose, and then 3 more for a tri in your toes, and so on, it won't be long before the GPU says enough and kills your framerate.

I didn't understand that bit. N00b time - what is the cache? Why does drawing verts in different places wreck the GPU's output?

Thanks ;o)

##### Share on other sites
The cache, as far as I understand it, is your RAM, although he might be referring to the tiny cache on the CPU. D3DX calls might look like they are specifically for the .x mesh format, but you use D3DXMATRIX and all sorts of other D3DX types whenever you're using DX. That mesh container, I believe, is just a generic form of storage for vertices which can hold any combination of vertex data you tell it to (to my knowledge). If that doesn't serve you true, MSDN it.

HOLD IT!

Your mesh looks nice, but when I realised that was wireframe I was a bit worried: why on earth do you need such a high-poly mesh? If you reduce the poly count to 2,000 you could probably get a near-identical looking character, provided you used a shader to do per-pixel lighting calculations, so the light gets distributed as it should rather than shading being done per vertex. It looks like you're using the fixed-function pipeline to draw that, and you have only upped the poly count to smooth the light/shadow appearance; am I wrong?

If not, you need to stop worrying about optimisation and go straight to here and assimilate some critical code. My capabilities would be dead in the water without these invaluable examples.

Go here and click on HLSL and start learning the beauty of what the GPU can do for you.

##### Share on other sites
When drawing a triangle, the relevant vertices are fetched from the vertex-buffer, and are processed with the vertex shader. The outputs from the vertex shader are stored in the vertex-cache (but it has limited space, so new results will overwrite old results).

If another triangle is drawn soon after, and shares some vertices with the previous triangles, then it won't have to run the vertex shader 3 times - instead it can fetch some of the results from the cache.
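A small sketch that simulates such a FIFO post-transform cache makes the effect of triangle order measurable. This is plain C++ with hypothetical names; real cache sizes and replacement policies vary by GPU, so treat the numbers as illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <vector>

// Count vertex-shader runs for an index buffer against a small FIFO
// post-transform cache (real caches hold roughly 16-32 results).
int ShaderRuns(const std::vector<int>& indices, size_t cacheSize)
{
    std::deque<int> cache;
    int runs = 0;
    for (int idx : indices) {
        if (std::find(cache.begin(), cache.end(), idx) != cache.end())
            continue;              // cache hit: reuse the stored result
        ++runs;                    // miss: the vertex shader runs again
        cache.push_back(idx);
        if (cache.size() > cacheSize)
            cache.pop_front();     // oldest result is evicted
    }
    return runs;
}
```

Two adjacent triangles drawn back to back share an edge, so their 6 indices cost only 4 shader runs; scatter unrelated triangles between them and the shared vertices get evicted before they can be reused, which is exactly the nose-then-toes problem described above.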

##### Share on other sites
Quote:
 Original post by EnlightenedOne: HOLD IT! Your mesh looks nice, but when I realised that was wireframe I was a bit worried: why on earth do you need such a high-poly mesh? If you reduce the poly count to 2000 you could probably get a near-identical looking character provided you used a shader to do per-pixel lighting calculations, so the light got distributed as it should rather than shading being done per vertex. It looks like you're using the fixed-function pipeline to draw that and you have only upped the poly count to smooth the light/shadow appearance, am I wrong?

Spot on. I am using the fixed-function pipeline: no shaders, no HLSL. The model was mainly just to demo my animation explorations. I've found a much more suitable mesh of only 2,000 polys for the body on TurboSquid; it will be integrated when I have some more cash.

I have used HLSL before but never got too deep into it, although doing the animation work on the GPU is something I consider a must before any release.

Thanks for a great reply; time to check out that link. This kind of stuff is just what I wanted to know, so I can get the nasty habits out of my head, and my code, early on ;o)

Thanks for your reply too Hodgman ;o)

**Edit**

If I seem a bit daft, it's because I'm quite the n00b about all this, to be truthful. I've bookmarked that page, EnlightenedOne; thanks a lot.
