Multithreaded renderer & command ordering

4 comments, last by alkisbkn 7 years, 2 months ago

This is a question that perhaps a seasoned engine programmer could answer for me. In a single-threaded renderer, a possible approach would be something like this:

- Collect visible drawable objects

- Create DrawCall instances, containing everything that the renderer needs to visually describe this object, including a sort key.

- Sort those instances based on the sort key

- Go through them all and render them

Roughly...
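To make the steps above concrete, here is a minimal sketch of that single-threaded flow. The `Drawable` and `DrawCall` structs, the precomputed `visible` flag, and the idea of "rendering" by returning mesh ids in submission order are all assumptions made for illustration; a real renderer would carry buffers, pipeline state and constants instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical scene object with a precomputed visibility flag.
struct Drawable {
    int meshId;
    std::uint64_t sortKey;
    bool visible;
};

// Hypothetical draw-call record: what the renderer needs, plus a sort key.
struct DrawCall {
    std::uint64_t sortKey;  // assumed to pack pass/material/depth into one integer
    int meshId;             // stand-in for buffers, pipeline state, constants, ...
};

// Steps 1 and 2: collect visible objects and create DrawCall instances.
std::vector<DrawCall> collectDrawCalls(const std::vector<Drawable>& scene) {
    std::vector<DrawCall> calls;
    for (const Drawable& d : scene) {
        if (d.visible) calls.push_back({d.sortKey, d.meshId});
    }
    return calls;
}

// Steps 3 and 4: sort by key, then "render" in that order.
// Rendering is simulated by returning the mesh ids in submission order.
std::vector<int> sortAndRender(std::vector<DrawCall> calls) {
    std::stable_sort(calls.begin(), calls.end(),
                     [](const DrawCall& a, const DrawCall& b) {
                         return a.sortKey < b.sortKey;
                     });
    std::vector<int> submitted;
    for (const DrawCall& c : calls) submitted.push_back(c.meshId);
    return submitted;
}
```

Calling `sortAndRender(collectDrawCalls(scene))` gives the draw order for one frame; everything happens on one thread, so the order is simply the sorted order.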

However, with a command-list approach (D3D12/Vulkan), I can see some problems. We want to record command lists from multiple threads; let's say we have a task scheduler that we feed rendering tasks, and a pool of command lists from which each task takes the next available one. So far so good: we can record our commands. But how would the draw calls be ordered?

Thanks!

You can have the main thread actually submit the completed command lists to the queue (or a dedicated job to submit them to the queue).
It's the same as any other data dependency within your job system. You split a task up into an array of jobs (so it can be processed by many threads), but jobs can depend on earlier jobs and be forced to wait until those earlier jobs have completed before executing.
For example, a job graph where three threads do the scene traversal/culling, one thread does the sorting, two threads convert draw-call instances into command lists, and one thread submits those command lists to the queue in the appropriate order might look like:

[Collect Visible Objects Job 1] [Collect Visible Objects Job 2] [Collect Visible Objects Job 3]
                              \                |                /
                              [           Sorting Job           ]
                                 /                           \
           [Draw into command list  Job 1]          [Draw into command list  Job 2]
                                    \                     /
                                    [ Submit to Queue Job ]
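
The graph above could be sketched roughly as follows, using `std::async` futures as a stand-in for a real job system. Everything here is an assumption for illustration: `CommandList` is just a vector of mesh ids rather than a D3D12/Vulkan object, a negative `meshId` means "culled", and `renderFrame` returns what the queue would receive instead of calling a graphics API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <vector>

struct DrawCall { std::uint64_t sortKey; int meshId; };
using CommandList = std::vector<int>;  // stand-in for a recorded command list

// Cull job: keeps the objects of one scene slice that pass a fake visibility test.
std::vector<DrawCall> cullJob(const std::vector<DrawCall>& slice) {
    std::vector<DrawCall> visible;
    for (const DrawCall& d : slice)
        if (d.meshId >= 0) visible.push_back(d);  // negative id = culled (assumption)
    return visible;
}

// Record job: turns one contiguous range of the sorted array into a command list.
CommandList recordJob(const std::vector<DrawCall>& sorted,
                      std::size_t begin, std::size_t end) {
    CommandList list;
    for (std::size_t i = begin; i < end; ++i) list.push_back(sorted[i].meshId);
    return list;
}

// Runs the whole graph and returns mesh ids in the order the queue receives them.
std::vector<int> renderFrame(const std::vector<std::vector<DrawCall>>& slices,
                             std::size_t recordJobs) {
    assert(recordJobs > 0);

    // Stage 1: cull jobs run in parallel, one per slice.
    std::vector<std::future<std::vector<DrawCall>>> culls;
    for (const auto& s : slices)
        culls.push_back(std::async(std::launch::async, [&s] { return cullJob(s); }));

    // Stage 2: the sort job depends on every cull job, so it waits on all of them.
    std::vector<DrawCall> sorted;
    for (auto& f : culls) {
        std::vector<DrawCall> part = f.get();
        sorted.insert(sorted.end(), part.begin(), part.end());
    }
    std::stable_sort(sorted.begin(), sorted.end(),
                     [](const DrawCall& a, const DrawCall& b) {
                         return a.sortKey < b.sortKey;
                     });

    // Stage 3: record jobs run in parallel over contiguous ranges of the sorted array.
    std::vector<std::future<CommandList>> records;
    for (std::size_t j = 0; j < recordJobs; ++j) {
        std::size_t begin = sorted.size() * j / recordJobs;
        std::size_t end = sorted.size() * (j + 1) / recordJobs;
        records.push_back(std::async(std::launch::async, [&sorted, begin, end] {
            return recordJob(sorted, begin, end);
        }));
    }

    // Stage 4: submit in job-index order, whatever order the jobs finished in.
    std::vector<int> queue;
    for (auto& f : records) {
        CommandList list = f.get();
        queue.insert(queue.end(), list.begin(), list.end());
    }
    return queue;
}
```

The point of the sketch is that ordering comes from the submit stage walking the command lists by index, not from when each recording thread happens to finish.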

I'd just break up the rendering into jobs per render target - generally you'll have the main scene and N shadow maps, maybe a reflection, maybe a cube map to render. Each one of these has its own Cull, Sort, Draw that is fairly independent. Each can be its own job in the thread system and they just need to be linked together at the end so all the dependent shadow maps / textures are available at the right time.

If they need to be linked throughout the main scene render (i.e. shadow map reuse), you just do some up front work to know what the command buffers / render target textures are so you can reference them as needed before they are fully generated and without syncing.

Hi Hogdman,

Thanks for your reply. That is how I do the culling and sorting at the moment. DrawCalls are created before the Sort job, with 1 mesh = 1 task, if that makes sense.

Would it be correct to say that it'd be the Sort job's job to cut the sorted DrawCall array into pieces and create N BuildCmdList tasks (where N = the maximum number of command lists)? That way I could guarantee the order of the command list building.
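
A rough sketch of that idea, under some assumptions: `splitForCommandLists` and `recordAndSubmit` are hypothetical names, draw calls are reduced to plain ints, and recording is simulated on one thread in reverse order to mimic tasks finishing out of order. Each BuildCmdList task would own one contiguous range and one slot, so submitting the slots by index restores the sorted order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Range { std::size_t begin; std::size_t end; };  // half-open [begin, end)

// Cuts [0, drawCallCount) into at most maxCommandLists contiguous ranges.
// Fewer ranges are produced when there are fewer draw calls than lists.
std::vector<Range> splitForCommandLists(std::size_t drawCallCount,
                                        std::size_t maxCommandLists) {
    assert(maxCommandLists > 0);
    std::size_t n = std::min(drawCallCount, maxCommandLists);
    std::vector<Range> ranges;
    for (std::size_t i = 0; i < n; ++i)
        ranges.push_back({drawCallCount * i / n, drawCallCount * (i + 1) / n});
    return ranges;
}

// Simulates N BuildCmdList tasks: each task writes only into its own slot.
std::vector<int> recordAndSubmit(const std::vector<int>& sortedDrawCalls,
                                 std::size_t maxCommandLists) {
    std::vector<Range> ranges =
        splitForCommandLists(sortedDrawCalls.size(), maxCommandLists);
    std::vector<std::vector<int>> slots(ranges.size());

    // Record the last range first to mimic out-of-order task completion.
    for (std::size_t i = ranges.size(); i-- > 0;) {
        for (std::size_t k = ranges[i].begin; k < ranges[i].end; ++k)
            slots[i].push_back(sortedDrawCalls[k]);
    }

    // Submit by slot index, which is the order of the ranges in the sorted array.
    std::vector<int> submitted;
    for (const std::vector<int>& slot : slots)
        submitted.insert(submitted.end(), slot.begin(), slot.end());
    return submitted;
}
```

With 10 draw calls and 4 command lists this produces the ranges [0,2), [2,5), [5,7) and [7,10), and the submitted sequence matches the sorted input regardless of recording order.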

@Dukus

That is how I have structured my renderer (or well, structuring it now :P). It is based on an article from Blizzard. However, 1 task per scene-view will not parallelise very well; that is why I am trying to add subtasks to each scene-view rendering process.

If you need to break the rendering up into more jobs, here are a few ideas.

For culling, if you have a spatial subdivision (quadtree/octree) you can create a job per node of a certain size, say anything over N meters becomes a separate job.

For sorting, if you know certain things have certain properties and rendering order, such as opaque, transparent, z sorted, etc, you can bin them together at cull time and then sort them separately as their own job.
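
As a sketch of the binning idea: draw calls are dropped into a bin as they pass culling, and each bin is then sorted on its own, so each sort could be a separate job. The two-bin split, the `Pass` enum, and the choice of front-to-back for opaque and back-to-front for transparent are assumptions for illustration (a common convention, not something stated in this thread). The sorts are called one after the other here to keep the example single-threaded.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

enum class Pass { Opaque, Transparent };

struct DrawCall { Pass pass; float depth; int meshId; };

struct Bins {
    std::vector<DrawCall> opaque;
    std::vector<DrawCall> transparent;
};

// Done at cull time: route each visible draw call into the bin for its pass.
Bins binAtCullTime(const std::vector<DrawCall>& visible) {
    Bins bins;
    for (const DrawCall& d : visible)
        (d.pass == Pass::Opaque ? bins.opaque : bins.transparent).push_back(d);
    return bins;
}

// Each of these touches only its own bin, so they could run as independent jobs.
void sortOpaque(std::vector<DrawCall>& bin) {  // front to back (assumed convention)
    std::sort(bin.begin(), bin.end(),
              [](const DrawCall& a, const DrawCall& b) { return a.depth < b.depth; });
}

void sortTransparent(std::vector<DrawCall>& bin) {  // back to front (assumed convention)
    std::sort(bin.begin(), bin.end(),
              [](const DrawCall& a, const DrawCall& b) { return a.depth > b.depth; });
}

// Final draw order: the whole opaque bin first, then the transparent bin.
std::vector<int> drawOrder(Bins bins) {
    sortOpaque(bins.opaque);
    sortTransparent(bins.transparent);
    std::vector<int> order;
    for (const DrawCall& d : bins.opaque) order.push_back(d.meshId);
    for (const DrawCall& d : bins.transparent) order.push_back(d.meshId);
    return order;
}
```

Because the bins never share elements, the per-bin sorts need no synchronization beyond waiting for culling to finish.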

For drawing, you can certainly take the sorted lists and break them up into N jobs. I'd just make sure that N is big enough that you don't take a GPU hit from having to set all render state at the beginning of each sub list, since you don't know how render state will end up between each small command buffer that results.
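
To get a feel for that trade-off, here is a small sketch that counts how many times render state would be set for a given number of sub-lists. It assumes render state can be reduced to a single integer id per draw call and that draw calls are already sorted; `countStateSets` is a hypothetical helper, not part of any API. Within a list, state is set only when it changes; at the start of each list it is always set, because nothing is known about what came before.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts state-set commands when the sorted draw calls are split into
// listCount contiguous command lists.
std::size_t countStateSets(const std::vector<int>& sortedStateIds,
                           std::size_t listCount) {
    assert(listCount > 0);
    std::size_t sets = 0;
    for (std::size_t j = 0; j < listCount; ++j) {
        std::size_t begin = sortedStateIds.size() * j / listCount;
        std::size_t end = sortedStateIds.size() * (j + 1) / listCount;
        for (std::size_t i = begin; i < end; ++i) {
            // First draw of a list: prior state is unknown, so always set it.
            // Otherwise set it only when it differs from the previous draw.
            if (i == begin || sortedStateIds[i] != sortedStateIds[i - 1]) ++sets;
        }
    }
    return sets;
}
```

For eight draw calls using two states (four each), one list needs 2 state sets, four lists need 4, and eight lists need 8, which is the overhead the paragraph above warns about when the sub-lists get too small.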

Hi Dukus,

Those are some good points, thanks. I have a flat scenegraph at the moment, but intend to move to a more spatial approach soon.

For sort bins, are you suggesting that I have a render queue per property batch, let's say, and sort the objects that are binned in it using their sort key? That may parallelise better, as at the moment I am sorting all the objects in one job.
