Material and pass sorting - over my head!

Started by
18 comments, last by Dtag 19 years, 3 months ago
I've been working on a material/shader system, and trying to work out a sorting algorithm for material passes. Unfortunately I'm kind of stuck on this. Here's where I'm at: Set of materials: S = { M1, M2, ... Mm } For each material Mi, ordered set of passes: Ti = { MiP1, MiP2, ... MiPp } where p >= 1 but may be different for each material. Render process consists of ordered sequence of passes, eg Given S = { { M1P1 }, { M2P1, M2P2 }, { M3P1, M3P2 } } Possible render sequence: M2P1, M2P2, M3P1, M1P1, M3P2 Passes from different materials may appear in any order, but passes from within the same material must keep their relative ordering. Passes may be interleaved from different materials. Given a set S, how many possible render sequences are there? For bonus points, (and more importantly really,) how would you iterate them? I've been struggling with this problem for hours. Any help greatly appreciated :-) [Edited by - posit on January 17, 2005 1:36:20 PM]
Advertisement
Hey posit, I'm a little busy to discuss this right now, but I'll meet you later in IRC. Or you can get my IM off my site alternatively I'm connected to AfterNET right now if you want to PM. I think a real time conversation would be more appropriate for this topic. We can post a log of our conversation if others in the community request it. Any takers?
After some investigation, this looks very much like the Travelling Salesman Problem, but with a catch: passes within a material must remain in order. I don't yet know whether this will make things easier or harder.

Any thoughts? Anyone? Please?
Quote:Original post by posit
After some investigation, this looks very much like the Travelling Salesman Problem, but with a catch: passes within a material must remain in order. I don't yet know whether this will make things easier or harder.

Any thoughts? Anyone? Please?


I'd be interested in hearing others thoughts on this. The TSP problem has many applications in the world of software engineering. If we view the cities as the set of render states required to render a given pass. Then the cost (distance) to get from one set of states (city) to another could be used to heuristically calculate the optimal order of rendering. Questions? Comments? Concerns?
A diagram of an example having two materials, to help visualise the problem. M1 has 3 passes, M2 has 2 passes.


With the ordering constraint, it appears to be more of a shortest-path problem in a DAG, where the graph is an N-dimensional hypergrid, with N being the number of materials and the size of each dimension being the number of passes involved.
Assassin, aka RedBeard. andyc.org
Quote:I'd be interested in hearing others thoughts on this. The TSP problem has many applications in the world of software engineering. If we view the cities as the set of render states required to render a given pass. Then the cost (distance) to get from one set of states (city) to another could be used to heuristically calculate the optimal order of rendering. Questions? Comments? Concerns?

The system I'm planning to use to sort my render states (just a general theory at this point, I haven't looked at implementation), is where I don't try and consider everything at once. I add new items to the list one at a time, using a kind of tree structure, where a new item comes to rest in the heirachy when it reaches a leaf.

Ok, a little more detail. I prioritize certain render state changes, based on how much time a state change of that type will eat up, on average. This sort priority list might go something like this:

-Shader
-Texture
-Something else

I'd want to do some careful testing to decide what state changes I add to this list. I can't imagine I'd end up with more than 4 or 5. The further down the list a change is, the less you gain from sorting by it, the less likely you are to encounter cases where you can actually sort by it, and the longer it'll take you to sort.

Anyway, the sorter adds new items to the graph one at a time, traversing the graph to the branch where that renderable item belongs. When the heirachy is filled with all the items that are to be rendered, the sorter simply does a complete traversal of the heirachy, and sends the items off to the renderer to be drawn. Because all renderable items have been grouped together based on the most important common state properties however, the expensive state changes are minimized, based on the priority of the state change. To demonstrate, here's some test data:

Name	Shader	Texture	SomethingA: 	2	1	3B: 	2	2	1C: 	2	2	2D: 	3	1	3E: 	3	3	1F:	1	1	2G:	2	2	1


In this case, we have 6 renderable items that are to be sorted, based on three state changes of interest: Shader, Texture, and Something. Note that items like shader and texture now have numbers associated with them. Those numbers should probably be handles or referances to objects, but in the case that a string is of interest, a hash will be generated for that string so it can be quickly compared. Anyway, from that set of test data, here's how the structure would be generated:

		      Root		       |		-------+---------		|      |         \  Shader		1      2          3		|     / \        / \   		|    /   \      /   \  Texture		1   1     2     1   3		|   |    / \    |   |		|   |   /   \   |   |Something	2   3   1    2  3   1		F   A   BG   C  D   E


The internal grouping nodes represent a perticular state value. If a new asset is being added that doesn't have a matching grouping node for its state value, that grouping node is created, so when the first asset "A", was added to the list, three grouping nodes were created for each of the three states I'm sorting by. When B was added however, only two were created, because A shared the same shader, and had already created that grouping node. When sending data to the renderer, each branch is fully explored before returning to a higher level of the heirachy, so the topmost state change will change the absolute minumum number of times.


This is just the theoretical idea. I haven't given all that much thought to the data structures I'll have to employ to allow fast comparison of state values to grouping nodes, and I haven't run tests to measure the efficiency of this structure, but I think it has promise. It may be more of a hinderance than a help when there are just a few objects, but when there are a few thousand renderables being thrown at it, it should really provide a boost, as long as the time required to add new items can be minimized.

EDIT: Note that this doesn't solve the TSP problem to find the absolute minimum number of state changes. If state changes of both priorities 2 and 3 changing at once is more expensive than a single priority 1 state change, it won't handle it. This is slightly different to the TSP problem though, because in TSP, a change in any direction is equivelant. In this case, some are more expensive, hence the priority ordering. Also, in this, you don't have to end up back where you started.
Nemesis2k2 brings out some interesting points, unfortunately I'm still at work and don't have time to write a real response. I'll respond again in a few hours.
Quote:Original post by Nemesis2k2

Okay, I honestly wanted to reply to this with a long drawn out explanation, but unfortunately I'm swamped at work and I keep telling myself "I'll just get this last feature in before I go home". I've been saying that for over 5 hours now, and I'm still not ready to go home. But I said I'd respond again so I will, it'll just be a bit shorter and less explanatory.

Briefly: The cost of inserting every object into that tree every frame is more than likely going to outweigh any benefit you'd get from the sorting. Insertion into the tree gets more expensive as the tree grows, so the more objects you have in your world the longer it takes to build the tree. You'd probably be better off doing a quick or merge sort on a list of all visible objects. Alternatively, you could rely on temporal coherence and possibly be better off with an insertion sort.

The TSP problem is a unique way of looking at this issue, but I'm not completely convinced as to it's practicality on modern hardware. As far as optimization is concerned you're probably better off sorting a simple list of visible objects as was stated above. This method would be especially efficient if you took advantage of the state capturing features of your preferred graphics APIs.
Just thought I'd point out that it doesn't actually need to be a tree structure in memory. It can be easily implemented using a radix sort, by treating each component as a 'digit'.

-bodisiw
-bodisiw

This topic is closed to new replies.

Advertisement