D3D11 Deferred Contexts
Posted by Jason Z, 23 July 2010
So, all the major surgery has been completed, and now that things are back up and running nicely I wanted to discuss deferred contexts just a little bit. Between the DXSDK docs, conference presentations, and the various other posts that Google churned up, there is surprisingly little information on how to actually use deferred contexts in D3D11. As it turns out, the best reference I could find was actually the multithreading sample in the DXSDK itself...
From my small amount of research, there are generally two different ways to benefit from multithreading with D3D11: resource creation is one, and parallel submission of rendering operations is the other. The first is fairly simple, since the device in D3D11 is free-threaded - you more or less just create whatever threads are needed, and they can create resources without worrying about manually mutexing the device or anything like that. The second is a bit more tricky, and is where I think the actual treasure is buried.
A deferred context is basically identical to the immediate context (the only visible difference is the return value of the GetType() method), except that the deferred version doesn't actually submit state changes and execution requests to the GPU - instead it builds up a command list, which can then be executed on the immediate context. At first this sounds almost like double work, but in theory executing a command list should be faster than making the individual function calls, since everything is already formatted properly. As long as your rendering operations are parallelizable, you just fire up a few threads, have each one use its own deferred context to generate a command list, and then have the immediate context execute the command lists in the order the frame requires. Sounds great so far...
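To make that flow concrete, here is a minimal sketch of the record-then-execute pattern in portable C++. It does not call D3D11 at all: DeferredRecorder, ImmediateSubmitter, and RenderFrame are hypothetical stand-ins I made up for illustration, and the "GPU" is just a vector of strings. In real D3D11 the corresponding calls would be ID3D11Device::CreateDeferredContext, ID3D11DeviceContext::FinishCommandList, and ID3D11DeviceContext::ExecuteCommandList.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a command list: a sequence of recorded operations
// that write into a log playing the role of the GPU.
using Command = std::function<void(std::vector<std::string>&)>;
using CommandList = std::vector<Command>;

// Plays the role of a deferred context: calls are recorded, not submitted.
class DeferredRecorder {
public:
    void Draw(const std::string& what) {
        pending_.push_back([what](std::vector<std::string>& gpuLog) {
            gpuLog.push_back(what);
        });
    }

    // Loosely analogous to FinishCommandList: hand over what was recorded
    // and leave the recorder empty for reuse.
    CommandList Finish() {
        CommandList out;
        out.swap(pending_);
        return out;
    }

private:
    CommandList pending_;
};

// Plays the role of the immediate context: the only object that touches the "GPU".
class ImmediateSubmitter {
public:
    void Execute(const CommandList& list) {
        for (const Command& command : list) command(log_);
    }
    const std::vector<std::string>& Log() const { return log_; }

private:
    std::vector<std::string> log_;
};

// One worker thread per pass records its own list; the lists are then
// executed on the calling thread in pass order.
std::vector<std::string> RenderFrame(std::size_t passCount) {
    std::vector<CommandList> lists(passCount);
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < passCount; ++i) {
        workers.emplace_back([i, &lists] {
            DeferredRecorder recorder;  // each thread owns its recorder
            recorder.Draw("pass " + std::to_string(i) + " draw A");
            recorder.Draw("pass " + std::to_string(i) + " draw B");
            lists[i] = recorder.Finish();  // each thread writes only its own slot
        });
    }
    for (std::thread& worker : workers) worker.join();

    ImmediateSubmitter immediate;
    for (const CommandList& list : lists) immediate.Execute(list);
    return immediate.Log();
}
```

The point of the sketch is the ordering guarantee: the worker threads may finish recording in any order, but because each list lands in its own slot and the slots are executed sequentially, the submitted work always comes out in pass order.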
I have modified Hieroglyph 3 to allow a 'PipelineManagerDX11' object to be passed to any method call that requires access to the pipeline. This cleverly remains incognito as to which type of context it houses - the object itself doesn't even know (except for the return value of the GetType() method). The beauty of this abstraction is that your rendering system can very easily decide at startup whether it will utilize deferred contexts (e.g. on a 3 or 4 core processor) or not (e.g. on a 1 or 2 core processor). A render view is handed a pipeline object, and executes its rendering pass on that pipeline - either it takes effect immediately, or it is cached into a command list and fired off later on.
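A rough sketch of that abstraction, again in portable C++ rather than real D3D11. Every name here (Context, ImmediateContext, DeferredContext, PipelineManager, ExecuteRenderView, ShouldUseDeferredContexts) is hypothetical and is not the actual Hieroglyph 3 code; the 3-core threshold simply mirrors the example numbers above, and treating an unknown core count as "don't use deferred contexts" is my own assumption.

```cpp
#include <string>
#include <thread>
#include <vector>

enum class ContextType { Immediate, Deferred };

using GpuLog = std::vector<std::string>;  // stand-in for work reaching the GPU

// Common interface, mirroring how both D3D11 context types share one interface.
class Context {
public:
    virtual ~Context() = default;
    virtual ContextType GetType() const = 0;
    virtual void Draw(const std::string& what) = 0;
};

class ImmediateContext : public Context {
public:
    explicit ImmediateContext(GpuLog& gpu) : gpu_(gpu) {}
    ContextType GetType() const override { return ContextType::Immediate; }
    void Draw(const std::string& what) override { gpu_.push_back(what); }
    // Replays a previously recorded command list onto the "GPU".
    void Execute(const std::vector<std::string>& commandList) {
        gpu_.insert(gpu_.end(), commandList.begin(), commandList.end());
    }

private:
    GpuLog& gpu_;
};

class DeferredContext : public Context {
public:
    ContextType GetType() const override { return ContextType::Deferred; }
    void Draw(const std::string& what) override { recorded_.push_back(what); }
    // Hands over the recorded commands and resets the recorder.
    std::vector<std::string> Finish() {
        std::vector<std::string> out;
        out.swap(recorded_);
        return out;
    }

private:
    std::vector<std::string> recorded_;
};

// Hypothetical pipeline wrapper: forwards calls without caring which kind
// of context sits behind it.
class PipelineManager {
public:
    explicit PipelineManager(Context& context) : context_(&context) {}
    void Draw(const std::string& what) { context_->Draw(what); }
    ContextType GetType() const { return context_->GetType(); }

private:
    Context* context_;
};

// A render view only ever sees the wrapper.
void ExecuteRenderView(PipelineManager& pipeline) {
    pipeline.Draw("clear targets");
    pipeline.Draw("draw scene");
}

// Assumed startup policy: deferred contexts on 3 or more hardware threads.
// hardware_concurrency() may return 0 when the count is unknown, which
// falls through to the immediate path here.
bool ShouldUseDeferredContexts(unsigned hardwareThreads) {
    return hardwareThreads >= 3;
}

bool ShouldUseDeferredContexts() {
    return ShouldUseDeferredContexts(std::thread::hardware_concurrency());
}
```

With this shape, ExecuteRenderView is written once; whether its draws hit the "GPU" right away or sit in a list until Execute is called depends only on which context the pipeline object was constructed with.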
With this change completed, I am now at the point where I am successfully utilizing deferred contexts to generate one RenderView's worth of commands into a command list. This is all single-threaded action, but it proves out that the machinery is functional before I make the step into multithreaded rendering. Getting here included caching all of the render views needed in a scene instead of recursively processing them all. They are still processed in depth-first order, but are now queued into the renderer before execution - this is the precursor to handing these packages off to worker threads to build the command lists.
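The queueing step could look something like the sketch below. The RenderView struct, QueueDepthFirst, and ProcessQueue are hypothetical names for illustration, and I am assuming here that "depth first" means a view's dependent views (say, a shadow map or reflection pass) are queued before the view that consumes them; the real engine's traversal may differ.

```cpp
#include <string>
#include <vector>

// Hypothetical render view: a name plus the views it needs rendered first.
struct RenderView {
    std::string name;
    std::vector<RenderView> dependencies;
};

// Flattens the view tree into a queue, depth first, so that every
// dependency appears before the view that relies on it (assumed ordering).
void QueueDepthFirst(const RenderView& view,
                     std::vector<const RenderView*>& queue) {
    for (const RenderView& dependency : view.dependencies) {
        QueueDepthFirst(dependency, queue);
    }
    queue.push_back(&view);
}

// Single-threaded processing of the queue. Each entry is an independent
// package of work, which is what would later be handed to a worker thread.
std::vector<std::string> ProcessQueue(
    const std::vector<const RenderView*>& queue) {
    std::vector<std::string> commandLists;
    for (const RenderView* view : queue) {
        commandLists.push_back("commands for " + view->name);
    }
    return commandLists;
}
```

Separating "decide the order" from "do the work" is what makes the later multithreaded step possible: the queue fixes the execution order up front, so the per-view work can be generated in parallel without changing the result.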
So now I must make the leap to multithreading, and then perform some testing on various scene types to find out what types of parallelization make the most sense and what is effective in speeding things up... It should be an interesting series of tests, so keep your eyes open for some further updates shortly!