Hardware Instancing - Optimization Query

Started by
8 comments, last by Gavin Williams 11 years, 5 months ago
Hi all,

I am writing a structural CAD/modelling application that makes extensive use of hardware instancing for rendering 3D models.

Basically, there are typically thousands of beams/girders to draw each frame. These beams are generally comprised of standard structural sections, but for the purposes of this post, let's say that all my beam sections are I-sections (for the non-structural-savvy folks, just picture a steel beam shaped like the capital letter 'I').

Now at first glance, hardware instancing may seem like a relatively straightforward choice when you are dealing with thousands of meshes that are geometrically very similar. As such, that's the approach I adopted. My application performs fairly well for the most part, but when I'm dealing with large models that have hundreds of different sections, I run into performance issues.

Basically, there are lots of different types of I-sections. Each section has differences in flange width and thickness, as well as web depth and thickness. I am having to create a new mesh for each different type of I-section, each frame. The reason I am doing it each frame is that I am concerned about the memory cost of storing hundreds of meshes, not to mention having to recreate them every time the graphics device needs to be reset. Having said that, I have a feeling that's the way I'm going to have to go, unless someone more knowledgeable than me can help me out with an alternative solution. Which brings me to my question...

Can you locally scale different components of a mesh? When I create my mesh, I'm basically retrieving the cross-section geometry data from a database and building the mesh from that. The mesh has a standard length of 1 metre. When it comes to rendering the meshes, I use a world transform to 'stretch' the mesh to the right length. If I can somehow do something similar on a local scale, I could adjust things like flange thickness, width, etc. without having to create a new mesh for each type of I-section. According to PIX, all my performance issues stem from the constant locking and unlocking of buffers when I'm creating my meshes each frame, which is very understandable!

Can anyone suggest a more efficient way to do what I want?

Thanks in advance.

Aaron.
If you know ahead of time which vertices will be scaled, you can add a vertex attribute that is either 1 or 0, with 1 meaning the vertex is affected by the scale parameter of that instance. Then in your shader you can apply the scaling selectively, using either an unscaled term (flag = 0) or a scaled term (flag = 1).

In this case you could apply the scale in one of two ways: either as a constant buffer parameter (which allows you to batch all similar meshes together) or as an instance-level attribute (which would let you batch all meshes that can use the base mesh as their representation).

In the latter case you will have to lock, update, and unlock the instance buffer, while in the former case you will just have to update the constant buffer for each batch.
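As a rough illustration of the flag idea, a vertex shader might look something like the sketch below. All the names here (the semantics, `gStretchScale`, the instance layout) are hypothetical, not anything from this thread:

```hlsl
cbuffer PerBatch : register(b0)
{
    float4x4 gViewProj;
    float    gStretchScale;   // per-batch scale (the constant-buffer variant)
};

struct VSInput
{
    float3 Pos       : POSITION;
    float  ScaleFlag : SCALEFLAG;   // 0 = unaffected, 1 = affected by the scale
    float4 WorldRow0 : WORLD0;      // per-instance world matrix rows
    float4 WorldRow1 : WORLD1;
    float4 WorldRow2 : WORLD2;
    float4 WorldRow3 : WORLD3;
};

float4 main(VSInput vin) : SV_POSITION
{
    // blend between the unscaled and scaled position using the per-vertex flag
    float scale    = lerp(1.0f, gStretchScale, vin.ScaleFlag);
    float3 localPos = vin.Pos * scale;

    float4x4 world = float4x4(vin.WorldRow0, vin.WorldRow1,
                              vin.WorldRow2, vin.WorldRow3);
    float4 worldPos = mul(float4(localPos, 1.0f), world);
    return mul(worldPos, gViewProj);
}
```

The `lerp` here is just a branch-free way of writing "0 term or 1 × scale term".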

I hope that helps!
Thanks for the reply Jason.

I think I understand what you're saying, although it may be hard to implement in my case. Basically, every vertex has to be scaled. Every type of I-section has a different depth, width, flange thickness, and web thickness. Scaling the height, width, and length of the mesh can be done easily with a scaling matrix; it's the flange and web thicknesses that I am struggling with.

So really I would have to do something like this:
1. Scale the mesh with a scaling matrix to give the correct height, width, and length.
2. Flag all vertices save for a few to not be affected by further scaling.
3. Apply an additional vertical scale to adjust flange thickness.
4. Flag all vertices save for a few to not be affected by further scaling.
5. Apply an additional horizontal scaling matrix to adjust web thickness.

Is this possible with your suggestion?
You could always add a multi-component vertex attribute that lets you perform only certain operations on certain vertices. Think of it as something like vertex skinning: you have a number of integer attributes that simply say which scalings affect each vertex. If I am understanding your scenario properly, I think this will work for you...

Can you post a pic of the various scaling operations, or is it sensitive info?
I'm not very familiar with skinning; I don't use it in my application. I think I know what you're saying, though. I am thinking of an integer vertex attribute that can be 0, 1, or 2: 0 indicating a vertex affected only by the general scaling, 1 indicating a relevant web vertex, and 2 indicating a relevant flange vertex. I would then pass two matrices as part of my instance data: one being the 'global' scaling matrix, and the other the 'local' scaling matrix. The shader will apply the global scaling matrix to all vertices flagged as 0. I can then make the local transformation matrix contain only a scaling in the x and y directions, so the shader applies the x scaling component to my vertices flagged as 1 (web), and the y scaling component to my vertices flagged as 2 (flange).

Is that essentially what you're saying?

I've attached a pic below showing the scaling operations. I'm a bit reluctant to post code, but I don't think the code would tell you anything anyway. I'm just passing a transformation matrix to my shader at the moment, on a per-instance basis.
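The 0/1/2 flag scheme described above might be sketched roughly like this. All names are hypothetical, and I've used per-instance scale vectors rather than full matrices to keep the sketch short:

```hlsl
struct VSInput
{
    float3 Pos         : POSITION;
    uint   ScaleFlag   : SCALEFLAG;  // 0 = global only, 1 = web, 2 = flange
    float4 GlobalScale : ISCALE0;    // per-instance: xyz overall scale
    float4 LocalScale  : ISCALE1;    // per-instance: x = web, y = flange
    // ...plus the per-instance world matrix, omitted for brevity
};

float3 ScaleVertex(VSInput vin)
{
    float3 p = vin.Pos * vin.GlobalScale.xyz;         // height/width/length
    if (vin.ScaleFlag == 1) p.x *= vin.LocalScale.x;  // web thickness
    if (vin.ScaleFlag == 2) p.y *= vin.LocalScale.y;  // flange thickness
    return p;
}
```

The scaled local position would then go through the usual world/view/projection transforms.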

isectionscaling.png

[quote]I am having to create a new mesh for each different type of I section, each frame. The reason i am doing it each frame is that i am concerned about the memory cost of storing hundreds of meshes, not to mention having to recreate them every time the graphics device needs to be reset.[/quote]


Have you considered that this is where you lose your performance (i.e. recreating the data every frame)? Your beam cross-section has 12 vertices (if the corners aren't rounded), so one beam element has 24 vertices, which is next to nothing. Even if your vertex size is 32 bytes and you have 1000 different beam meshes, the total is still less than 1 MB!

So, why not try to use static meshes, which you don't recreate every frame?

Otherwise, a beam is typically defined by a set of parameters (4 in your case?), so it's totally possible to define how each vertex is affected by those parameters. You'd probably need to give each vertex a set of weights that tell how each parameter affects its position.
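That per-vertex weight idea could be sketched as follows. This is only an illustration under my own assumptions; the attribute names and the choice of four parameters are hypothetical:

```hlsl
struct VSInput
{
    float3 Pos     : POSITION;
    float4 WeightX : WEIGHTX;   // how each parameter moves this vertex in x
    float4 WeightY : WEIGHTY;   // ...and in y
    float4 Params  : IPARAMS;   // per-instance: depth, width, tFlange, tWeb
};

float3 PositionFromParams(VSInput vin)
{
    // start from the base (unit) mesh and add a weighted sum of the
    // per-instance section parameters
    float3 p = vin.Pos;
    p.x += dot(vin.WeightX, vin.Params);
    p.y += dot(vin.WeightY, vin.Params);
    return p;
}
```

With the weights authored once per base mesh, one vertex buffer then serves every section size.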

Cheers!
You may be able to construct a girder in its entirety in a vertex shader, without using vertex buffers. If you pass the web, flange, etc. data in as per-instance data (e.g. read from a texture buffer), you can procedurally build a girder based on the SV_VertexID system-value semantic. Either that, or tag parts of a mesh using a vertex weight map and scale and extrude based on that.
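A sketch of that SV_VertexID approach, generating the 12 corners of the I-section outline directly from the instance's parameters. Everything here (`gSections`, `SectionParams`, the corner ordering) is illustrative, not code from this thread:

```hlsl
struct SectionParams
{
    float depth, width, tFlange, tWeb;   // per-instance section parameters
};

StructuredBuffer<SectionParams> gSections : register(t0);

cbuffer PerFrame : register(b0)
{
    float4x4 gViewProj;
};

float4 main(uint vid : SV_VertexID, uint iid : SV_InstanceID) : SV_POSITION
{
    SectionParams s = gSections[iid];

    float hw  = 0.5f * s.width;   // half flange width
    float hd  = 0.5f * s.depth;   // half section depth
    float hwb = 0.5f * s.tWeb;    // half web thickness
    float tf  = s.tFlange;        // flange thickness

    // the 12 cross-section corners, counter-clockwise
    float2 corners[12] =
    {
        float2( hw, -hd),      float2( hw, -hd + tf), float2( hwb, -hd + tf),
        float2( hwb, hd - tf), float2( hw,  hd - tf), float2( hw,   hd),
        float2(-hw,  hd),      float2(-hw,  hd - tf), float2(-hwb,  hd - tf),
        float2(-hwb, -hd + tf), float2(-hw, -hd + tf), float2(-hw,  -hd)
    };

    // duplicate the outline at both ends of the 1-metre extrusion
    float2 c = corners[vid % 12];
    float  z = (vid < 12) ? 0.0f : 1.0f;
    return mul(float4(c.x, c.y, z, 1.0f), gViewProj);
}
```

Turning these 24 outline vertices into triangles is omitted; in practice you would still draw with an index buffer, or expand triangles from SV_VertexID as well.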

[quote]You may be able to construct a girder in its entirety in a vertex shader, without using vertex buffers. If you pass the web, flange, etc. data in as per-instance data (e.g. read from a texture buffer), you can procedurally build a girder based on the SV_VertexID system-value semantic. Either that, or tag parts of a mesh using a vertex weight map and scale and extrude based on that.[/quote]


Yup, that's the way I'd do it. Strikes me that if the position component is based on a formula anyway, then moving that formula to the GPU is the way to go.

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

Thanks guys. Yeah, I had a think about it last night, and it seems logical to just have a base beam mesh, pass the actual cross-section geometry parameters to the shader, and let the shader do the work. That saves me having to create a mesh for each beam size, and also from recreating the meshes each frame.

@kuana - yes, I realize that's where the performance problems are stemming from, as I mentioned in my earlier post. Thanks for the response.
[quote]I am having to create a new mesh for each different type of I section, each frame. The reason i am doing it each frame is that i am concerned about the memory cost of storing hundreds of meshes, not to mention having to recreate them every time the graphics device needs to be reset.[/quote]
You shouldn't have to recreate your mesh data every frame anyway; hundreds of meshes might not take up that much memory if they aren't complex. I don't get why you are worried about that, nor why you are worried about device resets. If you need to rebuild your mesh data on a device reset, then so be it. That is a rare event, whereas you say you would rather rebuild your mesh data every frame to avoid having to do it occasionally. Your reasoning seems backwards to me.
