Sorting Objects

Started by
14 comments, last by Shael 12 years, 8 months ago

[quote name='AllEightUp' timestamp='1311255734' post='4838439']
[quote name='Hodgman' timestamp='1311212101' post='4838230']
Front-to-back sorting of opaques is designed to take advantage of the hierarchical z-buffer.
However, a common alternative [to front-to-back sorting] is the z-pre-pass, [which] gives you the benefits of the hierarchical z-buffer, without having to use a front-to-back ordering.
I tend to ignore the hierarchical z buffers lately as the z-prepass trumps it in both performance and added abilities.[/quote]Instead of sorting front-to-back to take advantage of the HZB, you can use a ZPP to take advantage of the HZB...
...but instead of either sorting or using a ZPP to exploit the HZB, you... use use a ZPP?
[/quote]

I should have probably stated this as more in terms that given my scenes are semi-ordered due to the culling system, a better sort may reduce a bit of overdraw but the CPU cost doesn't ballance out compared to the GPU gain. Again, this is a by product of my pipeline which is optimized for a specific type of scene in a style of game which is not likely going to be a genericly applicable result. But, still, given how fast you can do the ZPP anymore, leaving it with even a lot of overdraw may still be a win given a more simplistic pipeline on the CPU side. (NOTE: I do notice that ZPP on NVidia cards is notably slower than AMD, probably due to having fewer processors, I may find the sort is worth it when I do better testing on my NVidia cards.)
Advertisement

I should have probably stated this as more in terms that given my scenes are semi-ordered due to the culling system, a better sort may reduce a bit of overdraw but the CPU cost doesn't ...

In that case, one would do a single run of bubble sort per frame (yes, bubble sort, no joke). For "almost ordered", bubble sort works surprisingly good, and it's dead simple. A single pass per frame will eventually optimize your draw order over several frames at a fixed (basically zero) cost. There is no possible worst case like with any other sort, because you do exactly one run.

[quote name='AllEightUp' timestamp='1311338690' post='4838876']
I should have probably stated this as more in terms that given my scenes are semi-ordered due to the culling system, a better sort may reduce a bit of overdraw but the CPU cost doesn't ...

In that case, one would do a single run of bubble sort per frame (yes, bubble sort, no joke). For "almost ordered", bubble sort works surprisingly good, and it's dead simple. A single pass per frame will eventually optimize your draw order over several frames at a fixed (basically zero) cost. There is no possible worst case like with any other sort, because you do exactly one run.
[/quote]

That's actually what the culling system is doing, one pass of bubble sort per frame just to keep rough ordering. I do this due to the fact that the camera remains static for fairly long periods and as such, I can glob up lots of geometry for a quick cull test and only really have to deal with dynamic items for culling. It cuts down my frustum vs aabb/sphere tests by about 90% which while not a real performance problem it just free's up enough CPU to do some other things on low end boxes.

Usually you would treat opaque and transparent objects differently so it makes sense to have them in separate lists. As for sorting, you could use a sort key/hash which would allow you to sort by a number of properties (distance, shader/material, etc).


I am having some difficulty regarding the key sorted approach mentioned. I am trying to tie up rendering commands (set shader params, set render targets, etc) + draw calls (set pass, draw primitive) into a command queue, and sort that queue. This apparently means I do not want the commands and draw calls to go out of order, but only sort the draw calls based on certain properties. I thought this would be a convenient approach as I would not need to maintain a separate list of sorted primitives for each scene-cull, however after much considering as to how to go about creating the key for such a sparse-sort order I am thinking if having separate buckets is a better approach (less elements to sort, no complex key needed).

For Eg: The queue may look like:
* Set Render Target
* Set View Port
* Set Pass P1
* Set Shader Param T1
* Draw O1
* Set Shader Param T2
* Draw O2
* Set Pass P2
* ...

The problem is to keep the grouped draw calls and set calls in order, which requires me to use a grouping number in the key, something like:
[ layer | command | material id | distance | group number ]

I will have to think up something better however as the key is becoming very redundant.
What if everyone had a restart button behind their head ;P

This apparently means I do not want the commands and draw calls to go out of order, but only sort the draw calls based on certain properties. I thought this would be a convenient approach as I would not need to maintain a separate list of sorted primitives for each scene-cull, however after much considering as to how to go about creating the key for such a sparse-sort order I am thinking if having separate buckets is a better approach (less elements to sort, no complex key needed).

The problem is to keep the grouped draw calls and set calls in order, which requires me to use a grouping number in the key, something like:
[ layer | command | material id | distance | group number ]

I will have to think up something better however as the key is becoming very redundant.


Theres no difference between the grouping number that you just mentioned and the "complex key".

In an offline step, or even just when you construct a [material | draw command] batch, you can perform a generic hash of the contents of those members and that becomes your "material hash"

Have your distance calculated as an integer from either the nearplane or farplane based on whether you want front-to-back or back-to-front sorting for the element, and this is the "distance hash"

Bit pack the hashes together with the layer number, and you have your sorting key. Put the bits for the distance hash higher than the bits for the material hash to prioritise depth sorting to material sorting.

The buckets arent necissarily a bad idea either though, as they are effectively pre-sorting larger chunks of your rendering (unless you end up with your buckets being too many, and too small in size)

In my case i use a 32-bit hash which yeilds a blindingly quick radix sort
material sorted (opaques): [1bit layer = 0, 15bit MaterialHash, 16bit DepthHash]
depth sorted (everything else): [1bit layer = 1, 16bit DepthHash, 15bit MaterialHash]
Regarding the key/hash approach here: Order your graphics draw calls around!.

In particular the handling of materials. So let's just say you have 32 bits for your material ID. Then objects will be sorted by material and you can render many objects that share the same material without state changes. However a material may consists of several properties. For example:

1. Render state (for example, is the geometry two sided or not)
2. Shader programs
3. Input layout
4. Textures/constants

Say you had two materials out of N materials, but the two only differed by render state. It would be nice if the objects corresponding to these two materials were "next" to each other in the sorted list, as it would minimize state changes when going from one material to the next. But I don't think it is possible to do this without extra sorts or extra data (maintaining a large bitflag list with every material permutation to ensure they end up sorted like this).
-----Quat
There's no reason why the sort key can't be made up using the addresses of a render state, shader, input layout, and texture. Quoting Hodgman from the Frostbite thread he shows a simple example:



For sorting opaque objects, I create a hash from the address of the most expensive state-groups -- usually the material and shader ones. e.g. something like:u32 shader = (u32)pShaderState;
u32 material = (u32)pMaterialState;
u32 hash = ((shader^(shader<<16))&0xFFFF0000) | ((material^(material>>16))&0xFFFF);

[/quote]

This topic is closed to new replies.

Advertisement