Meshes, Materials and Lights

Started by
9 comments, last by 0xffffffff 13 years, 3 months ago
Hi All,

So moving along with my engine I have Meshes, Materials and Lights. The problem I'm having now is moving from test demos to more real world applications.

When a mesh is created (either procedurally for a primitive or loaded through a model file), I parse the vertices and create a vertex buffer and index buffer for the mesh and upload both of those to the GPU.

I can then create materials (one instance of each possible type) and assign a pointer to a material to each mesh, so one material may be shared by many meshes. I sort the renderable meshes by assigned material, then for each mesh update its transforms, set the material's constants and variables on the GPU, and draw using the mesh's vertex and index buffers.

The mesh renders as expected.

So that's great but now I have three cases that I'm really not sure how to handle.

1. Mesh large enough that it is bigger than the view frustum.


In this case, since I'm simply culling based on which objects are in the view frustum, I'm still sending a lot of triangles to the GPU to be rendered even though they won't be seen. This seems really inefficient. I've thought about instead breaking meshes down into triangles and culling the individual triangles, but this would then require me to rebuild the vertex buffer and index buffer every frame with just the visible triangles, which is also really inefficient.

On the plus side, the triangle approach would let me batch all "like" triangles sharing a certain material into one giant vertex buffer, meaning fewer draw calls as opposed to a draw call per mesh.

Is there an optimal way of doing this? Or is it a pick-your-battles, live-with-the-cons kind of thing?

2. What if meshes have multiple materials?

For demo purposes, objects can be just one material, but realistically a character would have cloth clothing, skin, and metal buckles or armor. Should I have my 3D team treat each different "material" piece as a separate mesh that I then group? That seems like a lot of overhead.

Instead should my mesh have multiple vertex/index buffers and multiple materials that can be applied to it?

3. How can shaders handle multiple lights?

Again, for demo purposes it's trivial to create a light object and write a shader to show lighting. In the real world, there are multiple lights. Take a car mesh driving down a street: on one block it could be lit by street lights, oncoming headlights, and traffic lights, and on the next it's lit by one lonely street lamp.

How would I begin to architect a lighting model that can handle n lights? Or how would I even write my shaders to handle that? Would I need a shader for each possibility and then switch between them? Again, that seems very inefficient.




Any thoughts or ideas on any of the three questions would be appreciated.

1) I would worry less about things that 'seem inefficient' and more about things that you have definitely profiled and that are adversely impacting your performance more than is acceptable. The GPU is a screaming fast frustum culling machine, so I don't know why you would want to take this over on the CPU and check each triangle in a single-threaded loop. You definitely don't want to be rebuilding buffers per frame. If it does become a problem, you can break your mesh into multiple chunks and test those, but don't go testing each triangle individually, and don't rebuild buffers per frame.
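Since the mesh-vs-frustum test is the part worth doing on the CPU, here is a minimal sketch of the usual conservative AABB-vs-frustum-planes test you could run per mesh (or per chunk). The `Plane`/`AABB` types and the "inside = positive half-space" convention are illustrative assumptions, not any specific engine's API.

```cpp
#include <array>

// A plane n·p + d = 0; points with n·p + d >= 0 count as "inside".
struct Plane { float nx, ny, nz, d; };

// Axis-aligned bounding box for a mesh or a chunk of a mesh.
struct AABB { float min[3], max[3]; };

// Conservative AABB-vs-frustum test: reject a box only if it lies entirely
// behind one of the six frustum planes (the classic "p-vertex" test).
bool BoxIntersectsFrustum(const AABB& box, const std::array<Plane, 6>& frustum) {
    for (const Plane& pl : frustum) {
        // Pick the box corner farthest along the plane normal.
        const float px = pl.nx >= 0 ? box.max[0] : box.min[0];
        const float py = pl.ny >= 0 ? box.max[1] : box.min[1];
        const float pz = pl.nz >= 0 ? box.max[2] : box.min[2];
        if (pl.nx * px + pl.ny * py + pl.nz * pz + pl.d < 0)
            return false; // the whole box is behind this plane
    }
    return true; // not rejected; the box at least partially overlaps
}
```

A box that survives this test is submitted as-is; the GPU then clips whatever still falls outside the frustum.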

2) I think a typical solution might be to have a "SubMesh" class, of which a "Mesh" class contains several. I would treat a submesh as a 1-material 1-shader 1-pass group of triangles that you can render in a single batch. That way you can throw around your higher level models (e.g. a "soldier") without worrying about the internals, while a soldier would contain several submeshes ("helmet", "skin", "boots", "hair", etc). Also in this case your renderer could actually sort by submeshes, so if you're drawing 10 soldiers, it draws all the helmets without switching state, then all the boots, then the skins, etc.
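As a rough sketch of that layout (all names hypothetical), a SubMesh carries just enough to be sorted and drawn as one batch, and the renderer flattens the visible meshes into a material-sorted draw list:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical layout: a SubMesh is a 1-material batch; a Mesh owns several.
struct SubMesh {
    uint32_t materialId;  // which material/shader state this batch needs
    uint32_t indexOffset; // first index in the mesh's shared index buffer
    uint32_t indexCount;  // number of indices to draw
};

struct Mesh {
    std::vector<SubMesh> parts; // e.g. "helmet", "skin", "boots", ...
};

// Flatten every visible mesh into one draw list, then sort by material so the
// renderer sets each state once (all helmets, then all boots, and so on).
std::vector<SubMesh> BuildSortedDrawList(const std::vector<Mesh>& visible) {
    std::vector<SubMesh> draws;
    for (const Mesh& m : visible)
        for (const SubMesh& s : m.parts)
            draws.push_back(s);
    std::sort(draws.begin(), draws.end(),
              [](const SubMesh& a, const SubMesh& b) { return a.materialId < b.materialId; });
    return draws;
}
```

With the list sorted, the backend only needs to change material state when `materialId` differs from the previous draw.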

3) Really depends on your use case. You will need to pick a maximum number of lights that you want, and then you can just loop in the shader over the active lights, up to that maximum. If your lighting changes infrequently it may be a good idea to have separate shaders to avoid the dynamic loop overhead, but you're going to have to decide what you really need, and then profile your different options to get there, based on how many lights you want and how often you're going to be changing the number of lights.
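To make the "fixed maximum, variable count" idea concrete, here is a CPU-side analogue of the loop a forward shader would run per pixel. `MAX_LIGHTS`, the directional lights, and the diffuse-only model are illustrative assumptions, not anyone's actual shader:

```cpp
#include <algorithm>

// Illustrative cap on light count; the shader would declare an array this size
// and receive the actual active count as a constant each frame.
constexpr int MAX_LIGHTS = 8;

// A directional light: unit direction toward the light, plus a scalar intensity.
struct Light { float dir[3]; float intensity; };

// CPU-side analogue of the per-pixel forward-lighting loop: accumulate a
// simple N·L diffuse term over only the first `numActive` lights.
float ShadeDiffuse(const float normal[3], const Light lights[MAX_LIGHTS], int numActive) {
    float total = 0.0f;
    numActive = std::min(numActive, MAX_LIGHTS);
    for (int i = 0; i < numActive; ++i) {
        const float ndotl = normal[0] * lights[i].dir[0]
                          + normal[1] * lights[i].dir[1]
                          + normal[2] * lights[i].dir[2];
        total += std::max(ndotl, 0.0f) * lights[i].intensity; // back-facing lights add nothing
    }
    return total;
}
```

The shader version is the same loop with `numActive` supplied as a constant-buffer value; precompiled permutations replace the loop bound with a compile-time constant.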
My Projects:
Portfolio Map for Android - Free Visual Portfolio Tracker
Electron Flux for Android - Free Puzzle/Logic Game
1. Your scene granularity is always a trade-off. Splitting meshes gives you finer-grained culling, but at the same time makes batching more difficult. And batching is very important on the PC, where draw calls and state changes have a lot of driver overhead.

2. Usually multiple materials on a mesh are handled by having each material own a subset of primitives in the vertex/index buffer. So you would have one vertex/index buffer for the whole mesh, and maybe have indices 500-800 marked as a second material. Then you can just draw the two subsets separately. But there are lots of ways to handle it. Either way, your artists would probably like to avoid having to manually split meshes to assign different materials.
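A sketch of that subset idea (hypothetical names, no graphics API calls): per-triangle material ids that the asset pipeline has already sorted by material are collapsed into contiguous index ranges, and each range then becomes one DrawIndexed-style call with an offset and a count:

```cpp
#include <cstdint>
#include <vector>

// One contiguous slice of the mesh's shared index buffer, owned by a material.
struct MaterialRange {
    uint32_t materialId;
    uint32_t firstIndex; // offset into the shared index buffer
    uint32_t indexCount; // how many indices this material's draw covers
};

// Given per-triangle material ids (already sorted by material), build the
// contiguous ranges to hand to DrawIndexed-style calls (3 indices/triangle).
std::vector<MaterialRange> BuildRanges(const std::vector<uint32_t>& triMaterials) {
    std::vector<MaterialRange> ranges;
    for (uint32_t t = 0; t < triMaterials.size(); ++t) {
        if (ranges.empty() || ranges.back().materialId != triMaterials[t])
            ranges.push_back({triMaterials[t], t * 3, 0}); // start a new range
        ranges.back().indexCount += 3;
    }
    return ranges;
}
```

The mesh keeps a single vertex/index buffer; only the offset/count passed to each draw changes per material.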

3. This is a tough and complex problem. If you stick to forward rendering, your major problem is shader permutations. If you want to support a variable number of lights but don't want to suffer the overhead of multipass or dynamic branching/looping, then you need to pre-compile all possible permutations of the shader. The number of permutations can explode very quickly when combined with other shader features, and pretty soon your compile times get out of hand. Plus, performance might not be so great if you have tons of lights per pixel and can't eliminate overdraw. It also limits your granularity for applying a light, since you're limited to the level of a draw call.

Deferred rendering aims to solve these problems by decoupling lighting computations, but introduces its own slew of problems. You shouldn't have a hard time finding info if you do some research.
Cool, thanks for the responses. It's both as I expected and feared. I guess there wouldn't be much point if it were easy.

A couple of things though:

@karwosts
The GPU is a screaming fast frustum culling machine, so I don't know why you would want to take this over on the CPU and check each triangle in a single-threaded loop.

Are you implying that I do my mesh culling on the GPU in a shader? How would I go about doing that? I was under the impression that I would create an octree (or something similar) of my scene and each frame on the CPU, cull the scene to derive my visible set of meshes. Then send those meshes to the GPU to get drawn.

@MJP
And batching is very important on the PC, where draw calls and state changes have a lot of driver overhead.

Just to be clear here, if I organize by state (materials etc.) I'll have at least one draw call per mesh. That's still OK, correct? A sample flow would be:

• Set State to Cloth Material
  • Draw Mesh 1
  • Draw Mesh 6
  • Draw Mesh 7
• Set State to Metal Material
  • Draw Mesh 2
  • Draw Mesh 3
• Set State to Skin Material
  • Draw Mesh 5

With state changes being more expensive than the actual draws?

Thanks again for the replies guys. I believe I'll do a forward lighting renderer first and then move to deferred shading after.


Are you implying that I do my mesh culling on the GPU in a shader? How would I go about doing that? I was under the impression that I would create an octree (or something similar) of my scene and each frame on the CPU, cull the scene to derive my visible set of meshes. Then send those meshes to the GPU to get drawn.


Broad-based octree culling is good, because with one test you can prevent the GPU from having to frustum-test thousands of triangles. You, however, mentioned breaking your meshes down into individual triangles and frustum-testing them before adding them to a vertex buffer, which is too expensive. Start with mesh-based testing; if that is too slow, then identify the typical meshes that have this problem and break them down into chunks.

You don't have to do anything in a shader; frustum culling is still done in fixed hardware by the GPU, so you don't do anything related to culling in your shaders.
Ok perfect, we're aligned then. Thanks!


Just to be clear here, if I organize by state (materials etc.) I'll have at least one draw call per mesh. That's still OK, correct? A sample flow would be:

• Set State to Cloth Material
  • Draw Mesh 1
  • Draw Mesh 6
  • Draw Mesh 7
• Set State to Metal Material
  • Draw Mesh 2
  • Draw Mesh 3
• Set State to Skin Material
  • Draw Mesh 5

With state changes being more expensive than the actual draws?



In general, you want as few API calls as possible. This includes state changes, draw calls, setting render targets, etc. So grouping meshes by material is definitely a good idea. However, draw calls are typically the most expensive API call you can make, so you definitely want to do anything you can to reduce their number. Batching and instancing can help, where applicable.
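As one concrete way instancing cuts draw calls, here is a sketch (hypothetical types, no real graphics API) that coalesces draw requests sharing a mesh and material into single instanced calls, ordered material-major so state changes are also minimized:

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// A request to draw one mesh with one material this frame (hypothetical).
struct DrawRequest { uint32_t meshId; uint32_t materialId; };

// One coalesced call; a D3D/GL backend would map this to an instanced draw
// such as DrawIndexedInstanced / glDrawElementsInstanced.
struct DrawCall { uint32_t meshId; uint32_t materialId; uint32_t instanceCount; };

std::vector<DrawCall> CoalesceIntoInstances(const std::vector<DrawRequest>& requests) {
    // Key is (materialId, meshId) so iteration order is material-major:
    // all draws for one material are emitted before the state changes.
    std::map<std::pair<uint32_t, uint32_t>, uint32_t> counts;
    for (const DrawRequest& r : requests)
        ++counts[{r.materialId, r.meshId}];

    std::vector<DrawCall> calls;
    for (const auto& [key, n] : counts)
        calls.push_back({key.second, key.first, n});
    return calls;
}
```

Ten soldiers sharing one mesh and material collapse to a single call with `instanceCount == 10`; per-instance transforms would ride in a separate buffer.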
In order to avoid those draw calls, though, I'd have to dynamically create new vertex/index buffers and upload them, correct? We're going on the assumption that creating/uploading batched vertex buffers is faster than a million draw calls. Obviously it will depend on the scene, etc., but I just want to make sure I'm understanding you correctly.

So my modified flow from before would be:

• Set State to Cloth Material
  • Construct Vertex/Index Buffer and Upload to GPU
  • Draw Mesh 1, 6, and 7
• Set State to Metal Material
  • Construct Vertex/Index Buffer and Upload to GPU
  • Draw Mesh 2 and 3
• Set State to Skin Material
  • Construct Vertex/Index Buffer and Upload to GPU (here, since there's only one mesh, this step is overkill; simply drawing the one mesh is faster)
  • Draw Mesh 5
I am assuming that, above draw calls, the most expensive operations are buffer allocations. I would not create or populate a GPU buffer on the fly if you don't need to. Your previous workflow seemed ideal because it assumed that you pre-allocated the buffers.

With regard to lighting, you may want to divide your lighting requirements into static versus dynamic lighting. That is, static lighting (e.g., street lights) of static objects (e.g., buildings, lampposts) can be baked in a modeling program (e.g., using vertex lighting) and need not be done at runtime.

Please don't PM me with questions. Post them in the forums for everyone's benefit, and I can embarrass myself publicly.

You don't forget how to play when you grow old; you grow old when you forget how to play.

This topic is closed to new replies.
