sorting scene

Started by
15 comments, last by dmatter 16 years, 9 months ago
Hello! I'm wondering: how do you sort your scene so that alpha-blended geometry is drawn last, in order of distance from the camera? It would be easy if I rendered all the geometry in a loop where one iteration means one object, but that's not the way I do it. I have one big global vertex buffer that contains all the vertices in the scene, and one index buffer per collection of material properties (texture, alpha blend, color, etc.). All objects that share the same material properties share an index buffer.

I have a solution: for each index buffer/material, store a list that describes the data in the index buffer and how to render it. That is, solid objects that don't need sorting are one entry in that list, and objects that do need to be sorted have one entry per object. This minimizes the number of rendering calls: there will be one call for the solid objects and one per alpha-blended object.

Another solution: have two index buffers per material, one for the solid objects and one for objects that need sorting. Then, when the camera moves, I just sort the alpha-blended objects and write them to the second index buffer. This would also make it easier to implement LOD surfaces and billboards; they could have their own buffers.

What do you think of my ideas? Completely out of whack? Are there better ways?
Quote:Original post by patrrr
It would be easy if I rendered all the geometry in a loop, where one iteration means one object, but that's not the way I do it.

That's usually how it's done, so when you ask 'how do we do it', I think most of us probably do it this way.
We add objects in any order to a render queue, which is basically a big list (it could be an array, a linked list, or something more sophisticated; it's all good), then we sort all objects by their materials. After sorting, all the objects are grouped by material (if they share the same texture, they're close together), which means we don't need to switch back and forth needlessly between different colours/textures/alpha states. All the opaque objects end up grouped together at the top of the list, while the semi-transparent ones have sunk to the bottom in back-to-front order.
Now all we do is iterate the list from top to bottom and render each object.
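To make that concrete, here's a minimal sketch of such a sort key. The `RenderItem` struct and its fields are made up for illustration; a real engine would carry more state per item:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical render-queue entry; "materialId" and "distance" stand in
// for whatever your engine actually tracks per object.
struct RenderItem {
    int   materialId;
    bool  transparent;
    float distance;     // distance from the camera
};

// Sort predicate: opaque items first, grouped by material; transparent
// items sink to the bottom in back-to-front (far-to-near) order.
bool renderOrder(const RenderItem& a, const RenderItem& b)
{
    if (a.transparent != b.transparent)
        return !a.transparent;               // opaque before transparent
    if (!a.transparent)
        return a.materialId < b.materialId;  // group opaque by material
    return a.distance > b.distance;          // transparent: back to front
}

// Usage: std::sort(queue.begin(), queue.end(), renderOrder);
```

After one `std::sort` over the queue, iterating top to bottom gives exactly the order described above.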

Quote:I have one big global vertex buffer that contains all the vertices in the scene

That's OK, provided your scene fits into such a buffer; this is certainly an easy way of doing things [smile]

Quote:
and one index buffer per collection of material properties (like texture, alpha blend, color, etc.). All objects that share the same material properties share index buffer.

Can you elaborate on this?
Maybe it's just me, but I don't understand why.
Your colours, tex-coords, etc. are all stored with your vertices, right? Why do you need different index buffers?
Perhaps an example would help, if you can think of an easy one.

Without fully understanding your existing system its hard to compare both methods. I can say that *usually* (read: your mileage may vary) you want to avoid reading/writing to index/vertex buffers any more than you have to.

Quick question: does your system still work efficiently with animated meshes? Have you thought that far ahead?
It might be that you're over-demonising draw calls. Yes, the general consensus is to reduce them where possible, but there is a point where it goes too far. You're effectively moving the overhead to resource modification: in sorting your data you'll have to change the index buffer data, and in general resource modification is something you probably want to avoid. Even with a bounded-buffer approach you can introduce significant synchronization issues by modifying resources. The net result is that you don't really improve performance at all!

If I were you, I'd implement a sorting algorithm that can do front-to-back (f2b) and back-to-front (b2f) sorting and is temporally aware. This was one of the most significant wins in an application of mine a few years ago: the f2b/b2f order doesn't change significantly unless you have fast-moving objects or a fast-moving camera, so the order stays roughly the same from frame to frame.
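One simple way to exploit that frame-to-frame coherence is to keep last frame's ordered list and re-sort it with insertion sort, which approaches O(n) on nearly sorted input. A minimal sketch (the `Entry` struct is hypothetical):

```cpp
#include <vector>

// Hypothetical per-object entry; depth is the distance from the camera.
struct Entry { float depth; int objectId; };

// Back-to-front insertion sort. On a list that is already nearly in
// order from the previous frame, this does very little work per frame.
void resortBackToFront(std::vector<Entry>& items)
{
    for (std::size_t i = 1; i < items.size(); ++i) {
        Entry key = items[i];
        std::size_t j = i;
        while (j > 0 && items[j - 1].depth < key.depth) {
            items[j] = items[j - 1];   // shift nearer items down
            --j;
        }
        items[j] = key;
    }
}
```

The same routine with the comparison flipped gives the front-to-back order for opaque geometry.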

Render all opaque geometry in f2b (to take advantage of early-z) and then render all semi-transparent geometry in b2f. Stick with foreach(object){...} until your performance profiler tells you otherwise.

hth
Jack

<hr align="left" width="25%" />
Jack Hoxley <small>[</small><small> Forum FAQ | Revised FAQ | MVP Profile | Developer Journal ]</small>

Thanks for your reply!

To clarify, this is just for level rendering; it is not intended to be used for objects like power-ups and enemies. In a level there are lots of small objects all over the place which would result in lots of calls to DrawPrimitive, and I don't think that would do the performance any good.

Quote:Original post by dmatter
That's usually how it's done, so when you ask 'how do we do it', I think most of us probably do it this way.
We add objects in any order to a render queue, which is basically a big list (it could be an array, a linked list, or something more sophisticated; it's all good), then we sort all objects by their materials. After sorting, all the objects are grouped by material (if they share the same texture, they're close together), which means we don't need to switch back and forth needlessly between different colours/textures/alpha states. All the opaque objects end up grouped together at the top of the list, while the semi-transparent ones have sunk to the bottom in back-to-front order.
Now all we do is iterate the list from top to bottom and render each object.


That's what I'm going to do with larger objects that animate and move about.


This is how my system is initialized (cool pseudo-code):

    vertexbuffer vertices = file.ReadVertices();   // per vertex: xyz, rgba, st, normal
    object levelobjects[] = file.ReadObjects();    // per object: material reference, primitive indices
    material materials[] = file.ReadMaterials();   // per material: texture, color, alphablend
    renderable renderables[];

    foreach (mat in materials)
    {
        renderable newRend = new renderable(mat, vertices);  // one renderable per material, sharing the global vertex buffer
        renderables.add(newRend);
        scene.add(newRend);
    }

    foreach (obj in levelobjects)
    {
        renderable rend = renderables.getByMaterial(obj.material);
        rend.addIndices(obj.indices);
    }

    foreach (rend in renderables)
    {
        rend.indexbuffer = CreateIndexBuffer(rend.indices);
    }


Grouping indices per material makes rendering fast; I render all the objects that share a material in one call. There's no need for sorting at render time, as long as there are no alpha-blended objects.
Rendering is easy: just loop through all the renderables in the scene, select the vertex buffer and index buffer, and draw the primitives. I use index buffers because they save space and let me share vertices between objects and materials.
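The merge step in the initialization above can be sketched as a plain grouping function. This is an illustrative stand-in (the pair-of-material-and-indices representation is made up), not the poster's actual code:

```cpp
#include <map>
#include <utility>
#include <vector>

// Merge each object's indices into one list per material id, mirroring
// the "addIndices" loop in the pseudo-code: one eventual draw call per
// material instead of one per object.
std::map<int, std::vector<int>> groupByMaterial(
    const std::vector<std::pair<int, std::vector<int>>>& objects)
{
    std::map<int, std::vector<int>> buckets;
    for (const auto& obj : objects) {
        auto& dst = buckets[obj.first];                       // obj.first = materialId
        dst.insert(dst.end(), obj.second.begin(), obj.second.end());
    }
    return buckets;
}
```

Each resulting bucket would then be uploaded as one index buffer and drawn with a single call.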

I think this is a good method for rendering static level objects (just split the buffers if they get too big). Alpha-blended objects, LOD objects, and billboards are the exceptions, and those are the original question.
Another problem with my system: if you want to cull areas based on where the camera is positioned, you have to update all the buffers. This could probably be solved by storing index buffers per area per material. Well, I'm rambling; I hope I cleared it up a bit.
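That per-area-per-material idea can be sketched as a map keyed on the (area, material) pair, so culling skips whole entries instead of rewriting buffers. All names here are illustrative:

```cpp
#include <map>
#include <utility>
#include <vector>

// One index list per (areaId, materialId) pair; an out-of-view area is
// skipped entirely without touching any buffer data.
using AreaMaterialKey = std::pair<int, int>;   // (areaId, materialId)
using IndexLists = std::map<AreaMaterialKey, std::vector<int>>;

// Collect pointers to the index lists that need drawing this frame,
// given a per-area visibility flag.
std::vector<const std::vector<int>*> visibleBatches(
    const IndexLists& lists, const std::vector<bool>& areaVisible)
{
    std::vector<const std::vector<int>*> out;
    for (const auto& entry : lists)
        if (areaVisible[entry.first.first])    // test the area, not each object
            out.push_back(&entry.second);
    return out;
}
```

Each surviving list is still one draw call, so batching is preserved while culling stays cheap.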


Quote:Original post by jollyjeffers
Render all opaque geometry in f2b (to take advantage of early-z) and then render all semi-transparent geometry in b2f. Stick with foreach(object){...} until your performance profiler tells you otherwise.


I tried the foreach method at first, but it was too slow. That might have been because I had one index buffer per object, though; I'll have to do some more experimenting to see if that's the case.
Keep in mind that this is mostly static level data, and lumping all the static data together wouldn't be such a bad idea, would it? I could pick out the transparent objects and render them separately.

Thanks!
I've done some experimenting with the foreach method for static level objects, and the results: drawing 1800 small primitives per frame is more than 10 times slower than my method (drawing ~50 huge primitives), even if you sort objects by material when rendering.
In fact, sorting the objects by material seems to make no difference at all, probably because the 1800 DrawPrimitive calls make the app CPU-bound. (I've checked with a sampler; the app is using 100% CPU.)

Some numbers: The biggest primitive is 252 indices, the global index buffer is 61.68 KB, the global vertex buffer is 373.38 KB.

[Edited by - patrrr on July 26, 2007 6:25:25 PM]
With static geometry there is little excuse not to pre-process your geometry into suitably large, material-sorted batches, ideally using some sort of spatial partitioning.

Modern graphics cards require large batches to operate efficiently. Try not to starve your graphics card with each call to DrawPrimitive; 1800 tiny batches is certainly a good way to get unimpressive results.

By the way, 100% CPU usage according to the task manager isn't necessarily a bad thing; it's actually quite misleading.
Quote:Original post by dmatter
With static geometry there is little excuse not to pre-process your geometry into suitably large, material-sorted batches, ideally using some sort of spatial partitioning.


Exactly, and that's what my statistics say as well.
You say spatial partitioning; do you mean something like a grid or an octree? To implement that, would you let each node have a list of index buffers, where one buffer maps to one material?
That's how I'm interpreting "material-sorted batches", anyway: a batch is an index buffer, right? And you can't change material mid-batch.
Yes, you could have one fat vertex buffer and many batch-sized index buffers at the leaf nodes of your spatial partitioning structure (which may be an octree, but could also be a BSP tree or an ABT, for example). You would need an index buffer per material type.

I think you're more likely to get better performance with a vertex buffer per leaf node and individual index buffers per material, but the principle remains the same.
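A leaf under that scheme might look something like this. The buffer types are placeholders for whatever your graphics API provides, not a real implementation:

```cpp
#include <map>
#include <vector>

// Placeholders standing in for real API-side buffer objects.
using VertexData = std::vector<float>;  // would be a vertex buffer
using IndexData  = std::vector<int>;    // would be an index buffer

// One vertex buffer per leaf, plus one index buffer per material used
// in that leaf: rendering a visible leaf costs one vertex-buffer bind
// plus one draw call per material present.
struct LeafNode {
    VertexData vertices;              // geometry local to this leaf
    std::map<int, IndexData> batches; // materialId -> one draw call each
};
```

The per-leaf vertex buffer keeps buffers small enough to cull and update independently, at the cost of one extra bind per visible leaf.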
dmatter: "I think you're more likely to get better performance with a vertex buffer per leaf node and individual index buffers per material, but the principle remains the same."

If memory serves, the common wisdom about vertex buffers used to be that you should make fat buffers to minimize vertex buffer changes. That was years ago, though; have things changed since then?

-- Jani

This topic is closed to new replies.
