Design advice for a front-end for modern graphics APIs


I am looking to replace my current DX11-only renderer with something better, and to make it easier to support multiple graphics APIs I am writing a common "front-end" API for the graphics APIs I want to support. The renderer is layered and looks a bit like this (I assume this is a fairly common way to organize things; a rough code sketch follows the list):

  1. First there is the high-level renderer, i.e. the API that the rest of the game communicates with. It contains concepts such as scene-graphs, meshes, materials, cameras, lights etc.
  2. Below the high-level renderer sits the front-end API for the actual graphics APIs we want to target. This is an API that contains (or can emulate) all important features of the graphics APIs, such as devices, textures, shaders, buffers (vb, ib, cb), pipeline state etc.
  3. Below the front-end API are the actual graphics APIs (DX11, DX12, OpenGL, Vulkan, Metal, libGCM etc). These are loaded as plugins and can be switched at runtime.
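For illustration, here is a minimal C++ sketch of how those layer boundaries could look. All names here are hypothetical, not from any particular engine:

```cpp
// Hypothetical sketch of the layer boundaries; all names are illustrative.
#include <memory>

// Layer 2: front-end API - opaque handles plus an abstract device interface.
struct TextureHandle { unsigned id; };
struct BufferHandle  { unsigned id; };

class IRenderDevice {
public:
    virtual ~IRenderDevice() = default;
    virtual TextureHandle CreateTexture(const struct TextureDesc&) = 0;
    virtual BufferHandle  CreateBuffer(const struct BufferDesc&) = 0;
    virtual void          Present() = 0;
};

// Layer 3: each back-end (DX11, DX12, Vulkan, ...) lives in its own plugin
// and exposes a single factory function with C linkage.
extern "C" IRenderDevice* CreateRenderDevice();

// Layer 1: the high-level renderer only ever talks to IRenderDevice.
class HighLevelRenderer {
public:
    explicit HighLevelRenderer(std::unique_ptr<IRenderDevice> device)
        : m_device(std::move(device)) {}
private:
    std::unique_ptr<IRenderDevice> m_device;
};
```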

I started writing the front-end API with the mindset that I'll target DX11 first and perhaps add DX12 and Vulkan support later. However, this seems to be a very bad idea, especially since I have the rare opportunity to rewrite my whole renderer without having to worry about shipping a game right now. Most people seem to agree that it is better to write the front-end to look like the modern APIs and then emulate (or in some cases simply ignore) the modern-only features on the older APIs.

My question is this: which features of the modern APIs should I expose through the front-end API? I would like to make reasonably good use of DX12 and Vulkan, so consider those the main back-ends for now. In my current version of the API I have already moved all state into a PSO-like object, which will be the only way to set state, even on older APIs. However, after looking into DX12/Vulkan a bit more (note that I still only have a few hours' worth of experience with either), it seems that there are other new object types that ideally should be exposed through the front-end, such as command lists, queues, fences, barriers, semaphores, descriptors and descriptor sets, plus various pools. What about these? Anything else? Does it make sense to try to wrap them up as they are, and can DX12's concepts be mapped to Vulkan's concepts, or do I have to abstract some of these into completely new concepts?
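For reference, the kind of PSO-like front-end object described above might look something like this. This is a hedged sketch; every name and field is made up:

```cpp
// Illustrative only: one possible shape for a front-end pipeline-state object
// that bakes all render state together, even when the back-end is DX11/GL.
#include <cstdint>

enum class BlendMode : uint8_t { Opaque, AlphaBlend, Additive };
enum class CullMode  : uint8_t { None, Back, Front };
enum class DepthTest : uint8_t { Disabled, Less, LessEqual, Always };

struct PipelineStateDesc {
    uint32_t  vertexShader  = 0;     // handle into a shader table
    uint32_t  pixelShader   = 0;
    BlendMode blend         = BlendMode::Opaque;
    CullMode  cull          = CullMode::Back;
    DepthTest depthTest     = DepthTest::LessEqual;
    bool      depthWrite    = true;
    uint32_t  vertexLayout  = 0;     // handle to an input-layout description
    uint32_t  renderTargetFormats = 0; // packed formats, needed by DX12/Vulkan PSOs
};

struct PipelineStateHandle { uint32_t id; };

// The back-end either compiles this into a real DX12/Vulkan PSO up front,
// or stores the individual states and applies them at draw time on DX11/GL.
PipelineStateHandle CreatePipelineState(const PipelineStateDesc& desc);
```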

Thanks for your time!


Yeah your three-level organization matches up with what I do too. My front-end advice is here.

Honestly, most of the new features aren't actually required by your high-level layer 99% of the time, so many of them can be kept purely as internal implementation details.

I use a D11-style state-setting API when constructing my draw-items (i.e. depth-stencil and blend modes are set individually, not as a PSO), but all state is baked into a PSO as part of that draw-item creation process. This makes the draw-item creation API friendly to users, but the execution of draw-items is still stupidly fast due to all the precomputation.
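A rough illustration of that draw-item flow (the names are invented for this sketch and are not taken from any particular engine):

```cpp
// Sketch: the user sets state D11-style on a builder, and Compile() bakes
// everything into an immutable, cheap-to-submit draw item (which can wrap a
// real PSO on DX12/Vulkan).
#include <cstdint>
#include <vector>

struct DrawItem;  // opaque, owned by the back-end

class DrawItemBuilder {
public:
    DrawItemBuilder& SetVertexShader(uint32_t handle);
    DrawItemBuilder& SetPixelShader(uint32_t handle);
    DrawItemBuilder& SetBlendMode(uint32_t mode);        // set individually...
    DrawItemBuilder& SetDepthStencilMode(uint32_t mode); // ...not as a PSO
    DrawItemBuilder& SetVertexBuffer(uint32_t handle);
    DrawItemBuilder& SetIndexBuffer(uint32_t handle);

    // All the expensive work (PSO lookup/creation, descriptor baking,
    // validation) happens once, here, not every frame.
    const DrawItem* Compile();
};

// Submission is then just a tight loop over precompiled items.
void Submit(const std::vector<const DrawItem*>& items);
```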

D3D11 already has the deferred context, which maps to modern command buffers. GL has some command buffer extensions. In D3D9/GL you can emulate command buffers yourself - obviously with worse performance characteristics than the real thing! So I expose command buffers in my front-end, but also a capability variable indicating whether they're the real deal (D12/Vulkan), semi-real (D11) or emulated. If the high-level wants to use a command buffer to move an entire large bit of rendering work to another thread (e.g. something that does its own computations as well as generating low-level commands), then even emulated command buffers are useful, as they let you move all those non-low-level computations onto another thread. If the high-level simply wants to process 1000 draw-items as fast as possible by splitting that low-level-only work over several threads, then emulated command buffers are not helpful. This makes for a uniform API, but the high level is responsible for choosing to use the feature or not, based on the way in which it intends to use that feature and the reported performance characteristics.
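A minimal sketch of how that capability report could be exposed (enum and field names are illustrative only):

```cpp
// The high level queries this to decide whether recording on another thread
// is actually worthwhile.
#include <cstdint>

enum class CommandBufferSupport : uint8_t {
    Emulated,   // D3D9 / core GL: commands recorded into our own memory, replayed on the submit thread
    SemiNative, // D3D11 deferred contexts: real API objects, but the driver still serializes some work
    Native,     // DX12 / Vulkan: fully native, recording scales across threads
};

struct DeviceCaps {
    CommandBufferSupport commandBuffers = CommandBufferSupport::Emulated;
};

// Example policy the high level might apply:
//   if (caps.commandBuffers == CommandBufferSupport::Native)
//       split the 1000 draw-items across N worker threads;
//   else
//       only move the heavy work (culling, sorting, draw-item building) off-thread.
```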

Resource heaps in D12/Vulkan can be kept entirely as an internal detail without doing you too much harm. If you do want to make full use of them, you can expose them to the high level in a way such that resources with common lifetimes are known to the back end. E.g. instead of having the high-level create 10 textures that all have the same lifetime (loaded at the start of a level, unloaded at the end) via 10 individual function calls, if you make a resource creation API where the high-level can complete that task with a single function call, then the back end can easily and invisibly put them all into a shared heap and track them as a single allocation (which simplifies your residency management).
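A sketch of what such a lifetime-grouped creation call could look like (hypothetical names; the point is only that the back-end sees the whole group at once):

```cpp
// Creating a group of resources in one call lets the back-end place them in
// one DX12/Vulkan heap and track a single lifetime for all of them.
#include <vector>

struct TextureDesc         { unsigned width, height, format, mipCount; };
struct TextureHandle       { unsigned id; };
struct ResourceGroupHandle { unsigned id; };

// Instead of 10 CreateTexture calls, the level loader passes all descriptors
// at once; the returned handles stay valid until the group is destroyed.
ResourceGroupHandle CreateTextureGroup(const std::vector<TextureDesc>& descs,
                                       std::vector<TextureHandle>& outHandles);

// One call releases the whole heap / allocation when the level unloads.
void DestroyResourceGroup(ResourceGroupHandle group);
```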

Descriptors again can be hidden internally without hurting you too much -- dynamic descriptor management via a ring buffer, if done well, is still faster than D11's binding model :)
I abstract away reusable/static descriptor sets by exposing a resource binding system that's a slightly modified version of D11's... In D9 we bound individual constants (uniforms) to shaders, and then in D11 we bound constant buffers (UBOs) to shaders instead. I do the same thing for textures -- instead of binding individual textures to a shader, I only allow the user to bind "texture lists", which are collections of texture bindings. In HLSL this maps to a contiguous range of t# registers in the shader, on D11's C++ side it maps to a single call to *SSetShaderResources, and in D12/Vulkan it maps to a reusable descriptor table.
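A sketch of that "texture list" idea, with invented names:

```cpp
// The shader declares a contiguous range of t# registers, and the front-end
// binds the whole list in one call.
#include <cstdint>
#include <vector>

struct TextureHandle     { uint32_t id; };
struct TextureListHandle { uint32_t id; };

// Create once; the order must match the register range declared in the shader,
// e.g. Texture2D g_Albedo : register(t0); Texture2D g_Normal : register(t1);
TextureListHandle CreateTextureList(const std::vector<TextureHandle>& textures);

// Binding maps to one *SSetShaderResources call on D11, or to setting a
// pre-built descriptor table on D12 / a descriptor set on Vulkan.
void BindTextureList(uint32_t slot, TextureListHandle list);
```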

Internally you'll use fences/etc. to manage internal descriptor ring buffers and upload ring buffers, but the high-level code doesn't have much use for them. However, you've been able to implement fences since D9 -- they are not a new feature, so it's entirely possible to expose them in your front-end. Personally I have some capability flags that specify whether a back-end allows the CPU/GPU to signal a fence, and whether it allows the CPU/GPU to wait on a fence (4 bits). Every API lets the GPU do the signalling and the CPU do the waiting, which is what you need to build a safe CPU->GPU upload ring buffer, but the new APIs allow all 4 communication options. I expose them in my front-end, but the high-level hasn't actually used them yet.
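Those four capability bits could be exposed along these lines (names are illustrative):

```cpp
#include <cstdint>

enum FenceCaps : uint32_t {
    FenceCap_CpuSignal = 1u << 0, // CPU can signal a fence
    FenceCap_GpuSignal = 1u << 1, // GPU can signal a fence (available everywhere)
    FenceCap_CpuWait   = 1u << 2, // CPU can wait on a fence (available everywhere)
    FenceCap_GpuWait   = 1u << 3, // GPU can wait on a fence (modern APIs only)
};

// An upload ring buffer only needs GPU-signal + CPU-wait, which even D3D9-era
// APIs can provide (e.g. via event queries), so it works on every back-end:
constexpr uint32_t kUploadRingRequirement = FenceCap_GpuSignal | FenceCap_CpuWait;
```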

I dealt with barriers as an internal detail, then exposed them to the high level... but then decided the impact on the high-level was too great and annoying, so I went back to implementing them internally. However, I left behind a "hint" system, where the high-level can optionally inform the back-end about optimal points in the command buffer for transitions to take place.
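One possible shape for such a hint API (purely illustrative):

```cpp
// The high level tells the back-end that a resource is about to change usage;
// the back-end is free to place the real barrier there, or to ignore the hint
// entirely on D11/GL.
#include <cstdint>

struct TextureHandle { uint32_t id; };

enum class ResourceUsageHint : uint8_t {
    RenderTarget,
    ShaderRead,
    CopySource,
    CopyDest,
};

// Optional call; correctness never depends on it, only barrier placement.
void HintResourceTransition(TextureHandle texture, ResourceUsageHint nextUsage);
```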

D12 and Vulkan mostly match up to each other when it comes to features. D12 smooths over a few things at a higher level than Vulkan though, plus Vulkan has its whole weird render pass system where you declare the upcoming render target bindings ahead of time, which is important knowledge for any GPU with dedicated render-target memory (i.e. tile-based GPUs)...

Thanks Hodgman for the detailed response! Incidentally I found an interesting video on Vulkan yesterday at:

Especially the first and last talks are very interesting. Around the four-minute mark there is a slide on what Xenko's renderer exposes. It seems they went with exposing descriptor sets too, but they don't really explain why beyond "you can't get around it", and they chose the Vulkan approach to descriptor sets rather than the DX12 approach (which I cannot comment on at all yet). I also very much like the PSO approach even on older hardware, so I think I am going to make PSOs first-class objects in the front-end too.

It seems they went with exposing descriptor sets too, but they don't really explain why beyond "you can't get around it"
Yeah, "we can't get around [exposing these as first-class concepts]" is an interesting throw-away remark... I guess it's in the context of wanting to redesign as a "next gen" API.

I mentioned above a way to get the benefits of PSOs without making them first class -- a stateless API built around "draw items" is analogous to PSOs, but doesn't have to expose a PSO-like interface. You can expose a D3D11- or even D3D9-style interface to the draw-item compiler.

To implement a D3D11-style resource binding model, you can create a non-shader-visible descriptor heap and pre-create all your resource-view objects on it (similar to pre-creating D3D11 resource-view objects). Then at draw-submission time (or draw-item creation time), you can copy the sparse collection of views from that non-shader-visible heap into a contiguous table in a shader-visible heap. You manage those tables with a ring buffer so that you can dynamically create them every frame. Doing it this way won't get you the full benefits of letting the user create static/reusable descriptor tables, but it will still be faster than D3D11/GL :)
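A simplified D3D12 sketch of that copy-into-a-ring scheme. Heap creation, error handling, and the fencing that protects the ring from being overwritten while the GPU still reads it are all omitted:

```cpp
// Source views live in a non-shader-visible heap; at draw time a contiguous
// table is built in a shader-visible ring-buffered heap.
#include <d3d12.h>
#include <vector>

struct DescriptorRing {
    ID3D12Device*         device    = nullptr;
    ID3D12DescriptorHeap* gpuHeap   = nullptr; // shader-visible, used as a ring
    UINT                  capacity  = 0;
    UINT                  nextIndex = 0;       // reset/fenced per frame (not shown)
    UINT                  stride    = 0;       // from GetDescriptorHandleIncrementSize(CBV_SRV_UAV)

    // Copies the source descriptors into a contiguous table and returns the
    // GPU handle to bind via SetGraphicsRootDescriptorTable.
    D3D12_GPU_DESCRIPTOR_HANDLE BuildTable(const std::vector<D3D12_CPU_DESCRIPTOR_HANDLE>& srcViews)
    {
        const UINT first = nextIndex;
        nextIndex += (UINT)srcViews.size();    // real code must handle wrap-around + GPU fencing

        D3D12_CPU_DESCRIPTOR_HANDLE dst = gpuHeap->GetCPUDescriptorHandleForHeapStart();
        dst.ptr += SIZE_T(first) * stride;
        for (const D3D12_CPU_DESCRIPTOR_HANDLE& src : srcViews) {
            device->CopyDescriptorsSimple(1, dst, src, D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV);
            dst.ptr += stride;
        }

        D3D12_GPU_DESCRIPTOR_HANDLE table = gpuHeap->GetGPUDescriptorHandleForHeapStart();
        table.ptr += UINT64(first) * stride;
        return table;
    }
};
```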

I actually support both -- my "resource lists" are a simpler abstraction than the entire descriptor model (which is quite low-level and exposes many tricky details, such as CPU<->GPU data synchronisation, to its users...), and at creation time my users can specify whether they want an immutable one or a mutable one. Immutable ones can be mapped to pre-created/reusable descriptor tables, while mutable ones use something closer to the D3D11 emulation system described above, where the tables are dynamically constructed at draw submission time.
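That immutable/mutable split could surface as a simple creation flag, e.g. (invented names):

```cpp
#include <cstdint>
#include <vector>

struct TextureHandle      { uint32_t id; };
struct ResourceListHandle { uint32_t id; };

enum class ResourceListUsage : uint8_t {
    Immutable, // contents fixed at creation; the back-end may bake a descriptor table once
    Mutable,   // contents may change; the back-end rebuilds the table when it is bound
};

ResourceListHandle CreateResourceList(const std::vector<TextureHandle>& textures,
                                      ResourceListUsage usage);
```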

I wonder how hard it would be to just use Vulkan, and then make a Vulkan -> D3D12, or Vulkan -> D3D11, etc... implementation. I wonder if anyone's working on that...

I wonder how hard it would be to just use Vulkan, and then make a Vulkan -> D3D12, or Vulkan -> D3D11, etc... implementation. I wonder if anyone's working on that...

If your interface is well designed it would be trivial. In the engines I've written, it would just require writing either a plugin or a new set of .h and .cpp files that implement the new API in wrappers, and away you go.

"Those who would give up essential liberty to purchase a little temporary safety deserve neither liberty nor safety." --Benjamin Franklin

I wonder how hard it would be to just use Vulkan, and then make a Vulkan -> D3D12, or Vulkan -> D3D11, etc... implementation. I wonder if anyone's working on that...

If your interface is well designed it would be trivial. In the engines I've written, it would just require writing either a plugin or a new set of .h and .cpp files that implement the new API in wrappers, and away you go.
I think he means emulating the Vulkan API as is, on top of other APIs.
Sure, it's possible, but performance would not be great. Real ports to each API are best for performance.

Looking into the future, implementing old APIs on top of Vulkan is totally feasible though, and would be a cheap way to port, say, D3D11 games to a new platform.

