Sign in to follow this  

Sorting draw calls when stages are executed multiple times

This topic is 1734 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Many game engines use a draw call sorting approach to create render queues (like this, Bitsquid also uses a similar system).


This works well when each stage (gbuffer pass, lighting) is only executed one time since you just have to set some bits in the sort key that identify the pass.


The problem appears when you have passes that might be executed multiple times like shadow mapping (and real time reflection maps) because it needs to be executed for each light on screen that casts shadows. There is a limited number of bits that identity each stage so you (probably) cannot simply assign each light a unique stage number.


I currently don't include stage bits in the sort key instead I have multiple "buckets" each bucket associated with a stage and render items are put in the buckets. Then the stages are executed in the defined order by rendering the render items inside each bucket, this way I can render the shadow map bucket multiple times.


In the end of the article I linked in the beginning of the topic the author wrote that this multiple buckets approach where draw calls are sorted inside the buckets was used in the PS2-era and nowadays draw calls from multiple stages are sorted together.


What am I missing?

Share this post

Link to post
Share on other sites

I also use the multiple buckets approach. Sorting 5 lists is faster than sorting a single list that's 5 times bigger.
AFAIK, you can't use a single list (with a single stage identifier) for repeated stages like shadow mapping, because each light/shadow-stage will have different culling results. You also probably don't want to share items between the shadow list and the g-buffer/forward list, because objects that are 3-draw calls in the latter (3 materials) can be just one draw call in the former (z-only pass disregards material).

In that article, he's talking about what they happened to do on their PS2 games, not necessarily saying that one approach is PS2 era and one is PS3 era. Also, IIRC, GoW didn't use deferred shading, and made little use of shadow mapping - they merge many lights together into a single light for forward shading, which seems incompatible with having shadow maps per light.

Edited by Hodgman

Share this post

Link to post
Share on other sites

I figured I'd jump in on this thread as I have a similar question.


I'm working on a data-driven renderer (similar to Horde) where you can define a rendering pipeline via a script.




a) a set of visible objects determined from the culling phase




b) a pipeline can have multiple stages each possibly requiring the objects sorted differently (back-to-front, front-to-back, material/state changes, etc)


What is the best way to perform sorting?


To me it'd seem that each stage with a sort option specified would have to loop through each object/drawcall/renderitem and generate a new sort key before sorting. That or each object would need to be filled with all necessary information so that custom sort functions could be written and choose what info to sort on instead of just doing the simple default integer sort using a key value.


Maybe I'm going about this completely wrong. Some insight would be great!

Share this post

Link to post
Share on other sites

Firstly I define a render single queue object (CRenderQueue) which does an index sort on the objects inside it (not sorting the render-queue items but the indices to each item in the queue).  These are maintained across frames to take advantages of temporal coherence (which is further exploited by using an insertion sort).

2 of these combine to make a CRenderQueueSet object in which one render queue is for opaque objects and the other for alpha-enabled objects.


In a basic scene with shadow mapping there are 3 render-pass types: Ambient/non-shadow-casting lights, shadow-map generation, and applying shadowed lights to objects (one extra pass per shadow).


The basic rule of thumb is that sorting needs to be done once per cull phase.

For the ambient and shadow-application phases the same objects will be drawn, so culling only happens once between the two stages, and thus sorting only happens once.

Then for each shadow you apply to objects, different objects will be culled on a per-light basis, so sorting also happens once for each light.




After culling (for any phase), the remaining objects are passed a CRenderQueueSet and so they can themselves to either the opaque render queue or the alpha-enabled render queue.

When submitting data to a render queue the objects need to know how to create a render-queue item, which requires that they know what type of render is being made (normal render or shadow-map-creation render, etc.)

This information is passed when the model is asked to submit itself to the render queue and it uses it in a simple switch case which selects the appropriate shader for that type of render and supplies that, along with other necessary information, to the render-queue item for the render queue(s).



A keen observer will then note that if they are only added to the render queue and sorted once, their shaders will change between the ambient pass and the shadow-map-application pass(es).  Since shaders are very important for sorting, shouldn’t they be resorted?

This is true, but they will all change in the same way.  That is, all shaders that had hemisphere lighting enabled and neither lights nor shadow-mapping enabled for the ambient pass will be permutated into the same shader when all of these are disabled or enabled for the later shadow-map-application pass(es).  In other words, they all changed their shaders, but have remained implicitly sorted anyway.



In all cases, each render-queue remains across frames.  Its clear function simply sets m_ui32Total to 0UL.  The sorted indices stay sorted.

After adding back all the objects for the next frame, the objects will likely still be rendered in the same order as last frame and the insertion sort will quickly verify that.



L. Spiro

Edited by L. Spiro

Share this post

Link to post
Share on other sites

How do you generate the indices that point to each render item? I'm assuming you would need to have some way to uniquely identify render items?


If we look at a Light Pre-Pass rendering pipeline, the first stage would require the queue to be sorted by depth and the material stage sorted by material/shader/state BUT both stages utilize the same cull results (unless my understanding is incorrect). How is this handled? We have two different sort orders that use the same set of culled objects.

Share this post

Link to post
Share on other sites
Sign in to follow this