Thoughts on Direct3D 12
At the BUILD 2014 conference, Max McMullen provided an overview of some of the changes coming in Direct3D 12. In case you missed it, take a look at it here. In general, I really enjoy checking out the API changes that are made with each iteration of D3D, so I wanted to take a short break from my WPF activities to consider how the new (preliminary) designs might impact my own projects.
Less Is More
When you take a look at the overall changes that are being discussed, you end up with less overhead but more responsibility - so less is more really does apply in this case. Most or all of the changes are designed to simplify the work of the driver and runtime at the expense of the application having to ensure that resources remain coherent while they are being used by the pipeline. This type of trade-off can be a double-edged sword, since it can require more work on your side to ensure that your program is correct. However, there have been a number of hints about significant tooling support - so I am initially encouraged that this is being considered by the D3D team.
My initial feeling when I saw all of these changes is that they are in fact quite reasonable. When I consider how I would modify Hieroglyph 3 to accommodate these changes, I don't see a major tear-up. Each change seems fairly well contained on the API side, and Max provided a pretty good rationale for why each change was needed. Here are the major areas of change that I noted, and some comments on how they fit with Hieroglyph 3.
Pipeline State Objects
The jump from D3D9 to D3D10/11 essentially saw a grouping of the various pipeline states into immutable objects. The concept was to reduce the number of times that you have to touch the API in order to set a pipeline up for a draw call. It sounds like D3D12 will take this all the way, and make most of the non-resource pipeline state into a single big state object - the Pipeline State Object (or PSO, as it is sure to be called...). This seems like an evolutionary step to me, continuing what was already started in the transition to D3D10.
Hieroglyph 3 already emulates this concept with the RenderEffectDX11 class, which encapsulates the pipeline state to be set when drawing a particular object. Each object can have its own state, and replacing this with a PSO will be fairly simple. Most likely the PSOs can be created centrally in a cache and handed out to whichever RenderEffectDX11 instance happens to match the same state. If none match, then we create a new entry in the PSO cache. Since the states are immutable, we don't have to worry about modifications, and the runtime objects' lifetimes can be managed centrally in the cache. If this makes the system faster, I'm all for it!
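To make the cache idea a bit more concrete, here is a minimal sketch of how such a lookup could work. Everything in it is an assumption for illustration: PipelineStateDesc, PipelineState and PipelineStateCache are hypothetical names, the description is reduced to a few integer fields, and none of it is actual Direct3D 12 or Hieroglyph 3 code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <initializer_list>
#include <memory>
#include <unordered_map>

// Hypothetical, simplified description of the non-resource pipeline state.
// A real description would carry shaders, blend state, and so on.
struct PipelineStateDesc {
    std::uint32_t vertexShaderId = 0;
    std::uint32_t pixelShaderId = 0;
    std::uint32_t blendMode = 0;
    std::uint32_t depthMode = 0;

    bool operator==(const PipelineStateDesc& o) const {
        return vertexShaderId == o.vertexShaderId &&
               pixelShaderId == o.pixelShaderId &&
               blendMode == o.blendMode &&
               depthMode == o.depthMode;
    }
};

// Combine the fields into one hash so the description can be used as a map key.
struct PipelineStateDescHash {
    std::size_t operator()(const PipelineStateDesc& d) const {
        std::size_t h = 0;
        for (std::uint32_t v : {d.vertexShaderId, d.pixelShaderId, d.blendMode, d.depthMode}) {
            h ^= std::hash<std::uint32_t>{}(v) + 0x9e3779b9u + (h << 6) + (h >> 2);
        }
        return h;
    }
};

// Stand-in for the immutable runtime object; never modified after creation.
struct PipelineState {
    const PipelineStateDesc desc;
    explicit PipelineState(const PipelineStateDesc& d) : desc(d) {}
};

class PipelineStateCache {
public:
    // Return the shared state matching desc, creating it on first request.
    std::shared_ptr<const PipelineState> Get(const PipelineStateDesc& desc) {
        auto it = m_states.find(desc);
        if (it != m_states.end()) return it->second;
        auto created = std::make_shared<const PipelineState>(desc);
        m_states.emplace(desc, created);
        return created;
    }

    std::size_t Size() const { return m_states.size(); }

private:
    std::unordered_map<PipelineStateDesc, std::shared_ptr<const PipelineState>,
                       PipelineStateDescHash> m_states;
};

// Usage sketch: two effects that request identical state share one object.
inline void DemoSharedState() {
    PipelineStateCache cache;
    PipelineStateDesc opaque;
    opaque.vertexShaderId = 1;
    opaque.pixelShaderId = 2;
    auto first = cache.Get(opaque);
    auto second = cache.Get(opaque);
    assert(first == second);
    assert(cache.Size() == 1);
}
```

Keying the cache on the full description keeps the lookup independent of which effect asks for the state, and the shared pointers let the cache own the lifetime centrally. Whether hashing a real PSO description is cheap enough to do per object is something only profiling would show.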
Resource Hazard Management
Instead of the runtime actively watching for your resources to be bound as either an input or an output (but never both simultaneously), Direct3D 12 will use an explicit resource barrier that lets you indicate when a resource is transitioning from one to the other. I have actually run into problems with the way that Direct3D 11 handles this hazard management before, so this is a welcome change.
For example, in the MirrorMirror sample I do a multi-pass rendering sequence that generates an environment map for each reflective object, followed by the final rendering pass where the reflective objects use the environment maps as shader resources. When you go to do the final rendering pass, you either have to set the output merger state or the pipeline state first. If you bind the pipeline state first, then the environment map gets bound to the pixel shader with a shader resource view. However, from the previous pass the environment map is still bound to the output merger with a render target view - so the runtime unbinds the oldest setting and issues a warning. If you set the states in the opposite order, then you get the same situation on the next frame when you try to bind the render target view for output.
This essentially forces you to either ignore the warning (and just take whatever performance hit it gives you) or explicitly clear one of the states before configuring for the next rendering pass. Neither of these ever seemed like a good option - but in D3D12 I will have the ability to explicitly tell the runtime what I am doing. I like this change.
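Here is a rough sketch of what explicit transitions could look like for the environment map sequence described above. It is a toy model under my own assumptions: ResourceUsage, Texture, Barrier and CommandRecorder are hypothetical names, and the state check throws an exception where the real API has not yet been published.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical usage states for a texture; not the actual Direct3D 12 enum.
enum class ResourceUsage { RenderTarget, ShaderResource };

struct Texture {
    ResourceUsage usage = ResourceUsage::RenderTarget;
};

// One recorded transition, as the application would hand it to the runtime.
struct Barrier {
    Texture* resource;
    ResourceUsage before;
    ResourceUsage after;
};

class CommandRecorder {
public:
    // Explicitly state that a resource switches from one usage to another.
    void Transition(Texture& tex, ResourceUsage after) {
        if (tex.usage == after) return;  // already in the requested state
        m_barriers.push_back(Barrier{&tex, tex.usage, after});
        tex.usage = after;
    }

    // Binding is only valid when the resource is in the matching state.
    void BindAsRenderTarget(const Texture& tex) const {
        Require(tex, ResourceUsage::RenderTarget);
    }
    void BindAsShaderResource(const Texture& tex) const {
        Require(tex, ResourceUsage::ShaderResource);
    }

    const std::vector<Barrier>& Barriers() const { return m_barriers; }

private:
    static void Require(const Texture& tex, ResourceUsage expected) {
        if (tex.usage != expected) {
            throw std::logic_error("resource bound in the wrong usage state");
        }
    }

    std::vector<Barrier> m_barriers;
};

// Two frames of the sequence: render into the map, then sample from it.
inline std::size_t RenderTwoFrames() {
    CommandRecorder cmd;
    Texture envMap;
    for (int frame = 0; frame < 2; ++frame) {
        cmd.Transition(envMap, ResourceUsage::RenderTarget);
        cmd.BindAsRenderTarget(envMap);
        cmd.Transition(envMap, ResourceUsage::ShaderResource);
        cmd.BindAsShaderResource(envMap);
    }
    // The map ends every frame readable by shaders.
    assert(envMap.usage == ResourceUsage::ShaderResource);
    return cmd.Barriers().size();
}
```

The point of the sketch is that the order of the bind calls no longer matters: the application states the transition once, and nothing has to be silently unbound on its behalf.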
Descriptor Heaps and Tables
The next change to consider is how resources are bound to the pipeline. Direct3D 12 introduces the use of Descriptor Heaps and Tables, which sound like simple user mode PODs to point to resources. This moves the previous runtime calls for binding resources (mostly) out of the runtime and into the application code, which again should be a good thing.
In Hieroglyph 3, I currently use pipeline state monitors to manage the arrays of resource bindings at each corresponding pipeline stage. This is done mostly to prevent redundant state change calls, but this could easily be updated to accommodate the flexible descriptors. I'm more or less already managing a list of the resources that I want bound at draw time, so converting to using descriptors should be fairly easy. It will be interesting to try out different mechanisms here to see what gives better performance - i.e. should I keep one huge descriptor heap for all resources, or should I manage smaller ones for each object, or somewhere in between?
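As a thought experiment for the "one huge heap" option, the sketch below treats a descriptor heap as a plain array of PODs and a descriptor table as an offset and count into that array. The Descriptor fields, the class names and the append-only allocation are all assumptions of mine based on the talk, not the real API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical POD descriptor: just enough data to locate and interpret a resource.
struct Descriptor {
    std::uint64_t gpuAddress;
    std::uint32_t format;
};

// A table is a contiguous range inside a heap: an offset plus a count.
struct DescriptorTable {
    std::size_t offset;
    std::size_t count;
};

class DescriptorHeap {
public:
    explicit DescriptorHeap(std::size_t capacity) : m_capacity(capacity) {
        m_slots.reserve(capacity);
    }

    // Copy a group of descriptors into the heap and return the range they occupy.
    DescriptorTable Allocate(const std::vector<Descriptor>& descriptors) {
        assert(m_slots.size() + descriptors.size() <= m_capacity);
        DescriptorTable table{m_slots.size(), descriptors.size()};
        m_slots.insert(m_slots.end(), descriptors.begin(), descriptors.end());
        return table;
    }

    // Look up one descriptor through a table, as a draw call would.
    const Descriptor& At(const DescriptorTable& table, std::size_t index) const {
        assert(index < table.count);
        return m_slots[table.offset + index];
    }

    std::size_t Used() const { return m_slots.size(); }

private:
    std::size_t m_capacity;
    std::vector<Descriptor> m_slots;
};
```

With this layout each object only needs to remember its DescriptorTable, which is two integers, while the heap itself is shared. The per-object alternative would give every object its own small DescriptorHeap instead, trading fewer shared allocations for more heap switches.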
Bye Bye Immediate Context
The final major change that I noted is the removal of the immediate context. This one actually seems the least intrusive to me, due to the nature of the deferred context to immediate context relationship in the existing D3D11 API. Essentially both of these beasts use the same COM interface, but deferred contexts are used to generate command lists while immediate contexts consume them. This seems like a small distinction, but in reality you have to design your system so that it knows which context is the immediate one (or else you can't ever actually draw anything) and which are deferred. So they are the same interface only in theory...
In Hieroglyph 3, I would use deferred contexts to do a single rendering pass and generate a command list object. After all of these command lists were generated, I batched them up and fed them to the immediate context. The proposed changes in D3D12 are actually not all that different - they replace the immediate context with a Command Queue which more closely represents what is really going on under the covers with the GPU and driver. Porting to use such a command queue should be fairly easy (you just feed it command lists, same as the immediate context), but updating to take advantage of the new monitoring of the command queue will be an interesting challenge.
There was also a Command Bundle concept introduced, which is essentially a mini-command list. These are expected to speed up the time it takes to generate GPU commands to match a particular sequence of API calls by caching those calls into a Command Bundle. This will introduce another challenging element into the system - how big or small should the command bundles be? When should you be using a command list instead of a command bundle? Most likely only profiling will tell, but it should be an interesting challenge to squeeze as much performance as possible out of the GPU, driver, and your application.
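To picture how lists, bundles and the queue relate, here is a small model of the flow: bundles are recorded once, replayed inside command lists, and the lists are consumed by the queue in submission order. All of the types are hypothetical stand-ins, commands are simply functions that append to a log, and the sketch only models ordering and reuse, not the driver-side caching that is supposed to make bundles fast.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a GPU command; here it just appends to a log.
using Command = std::function<void(std::vector<std::string>&)>;

// A bundle is a small, reusable sequence of commands recorded once.
struct CommandBundle {
    std::vector<Command> commands;
};

// A command list is recorded independently, for example one per render pass.
class CommandList {
public:
    void Record(Command c) { m_commands.push_back(std::move(c)); }

    // Replay a previously recorded bundle inside this list.
    void ExecuteBundle(const CommandBundle& bundle) {
        assert(!bundle.commands.empty());  // an empty bundle is likely a recording mistake
        m_commands.insert(m_commands.end(),
                          bundle.commands.begin(), bundle.commands.end());
    }

    const std::vector<Command>& Commands() const { return m_commands; }

private:
    std::vector<Command> m_commands;
};

// The queue consumes finished lists strictly in submission order.
class CommandQueue {
public:
    void Submit(CommandList list) { m_pending.push_back(std::move(list)); }

    // Run everything that was submitted and return the resulting log.
    std::vector<std::string> Flush() {
        std::vector<std::string> log;
        for (const CommandList& list : m_pending) {
            for (const Command& c : list.Commands()) c(log);
        }
        m_pending.clear();
        return log;
    }

private:
    std::vector<CommandList> m_pending;
};

// Helper that makes a command which logs a label when executed.
inline Command Log(std::string label) {
    return [label](std::vector<std::string>& log) { log.push_back(label); };
}
```

In this model the same "set state, draw" bundle can be replayed in both the environment map pass and the final pass, which is the kind of reuse that makes the bundle size question interesting.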
So those are my thoughts about Direct3D 12. Overall I am overtly positive about the performance benefits and the expected amount of additional effort it will require. There aren't any major show-stoppers that I can see, but of course it is still early days and the API can still change or introduce new elements before it is released.
I would be interested to hear if anyone else has considered this or found a particular piece of the talk interesting or if you see any issues with it. Now is the time to give feedback to the Direct3D team - so speak up and start the discussion!