Seabolt

Vulkan What are your opinions on DX12/Vulkan/Mantle?


Apparently the Mantle spec documents will be made public very soon, which will serve as a draft/preview of the Vulkan docs that will come later.

I'm extremely happy with what we've heard about Vulkan so far. Supporting it in my engine is going to be extremely easy.

However, supporting it in other engines may be a royal pain.
e.g. If you've got an engine that's based around the D3D9 API, then your D3D11 port is going to be very complex.
However, if your engine is based around the D3D11 API, then your D3D9 port is going to be very simple.

Likewise for this new generation of APIs -- if you're focusing too heavily on current generation thinking, then forward-porting will be painful.

In general, implementing new philosophies using old APIs is easy, but implementing old philosophies on new APIs is hard.

 

In my engine, I'm already largely using the Vulkan/D3D12 philosophy, so porting to them will be easy.
I also support D3D9-11 / GL2-4 - and the code to implement these "new" ideas on these "old" APIs is actually fairly simple - so I'd be brave enough to say that it is possible to have a very efficient engine design that works equally well on every API - the key is to base it around these modern philosophies though!
Personally, my engine's cross-platform rendering layer is based on a mixture of Mantle and D3D11 ideas.

I've made my API stateless, where every "DrawItem" must contain a complete pipeline state (blend/depth/raster/shader programs/etc.) and all resource bindings required by those programs - however, the way these states/bindings are described (in client/user code) is very similar to the D3D11 model.
DrawItems can/should be prepared ahead of time and reused, though you can create them every frame if you want... When creating a DrawItem, you need to specify which "RenderPass" it will be used for, which specifies the render-target format(s), etc.

On older APIs, this lets you create your own compact data structures containing all the data required to make the D3D/GL API calls for that draw-call.
On newer APIs, this lets you actually pre-compile the native GPU commands!

 

You'll notice that in the Vulkan slides released so far, when you create a command buffer, you're forced to specify which queue you promise to use when submitting it later. Different queues may exist on different GPUs -- e.g. if you've got an NVidia and an Intel GPU present. The requirement to specify a queue ahead of time means that you're actually specifying a particular GPU ahead of time, which means the Vulkan drivers can convert your commands to that GPU's actual native instruction set ahead of time!

In either case, submitting a pre-prepared DrawItem to a context/command-buffer is very simple/efficient.
As a bonus, you sidestep all the bugs involved in state-machine graphics APIs.

 

That sounds extremely interesting. Could you give a concrete example of what the descriptions in a DrawItem look like? What is the granularity of a DrawItem? Is it a per-mesh kind of thing, or more like a "one draw item for every material type" kind of thing, where you then draw every mesh that uses that material with a single DrawItem?


Can I say something I do not like (DX related)? The "new" feature levels, especially 12.1.

 

Starting from 10.1, Microsoft introduced the concept of the "feature level", a nice and smart way to collect together hundreds of caps-bits and thousands of related permutations into a single, unique decree. With feature levels you can target older hardware with the latest runtime available. Microsoft did not completely remove caps-bits for optional features, but their number dropped dramatically, by something like two orders of magnitude. Even with Direct3D 11.2 the number of caps-bits remained relatively small, although they could have added a new feature level - call it feature level 11.2 - with all the new optional features and tier 1 of tiled resources; never mind, that's not a big deal after all - complaints should be focused on the OS support since D3D 11.1.

Since the new API is focused mostly on the programming model, new caps-bits and tier collections were expected with Direct3D 12, and Microsoft did a good job of dramatically reducing the complexity of the different hardware-capability permutations. The new caps-bits and tiers of DX12 are not a big issue. At GDC15 they also announced two "new" feature levels (~14:00): feature level 12.0 and feature level 12.1. While feature level 12.0 looks reasonable (all GCN 1.1/1.2 and Maxwell 2.0 should support it - I don't know about the first generation of Maxwell), feature level 12.1 adds only mandatory support for ROVs (OK) and tier 1 of conservative rasterization (the most useless tier!).

I will not go into explicit details (detailed information should still be under NDA), but the second feature level looks tailor-made for one particular piece of hardware (guess which!). Moreover, FL 12.1 does not require some really interesting features (a greater conservative rasterization tier, volume tiled resources, and even resource binding tier 3) that you could expect to be mandatory in future hardware. In substance, FL 12.1 really breaks the concept of a feature level in my view, which was a sort of "barrier" that defined new capabilities for upcoming hardware.

Edited by Alessio1989


 

That sounds extremely interesting. Could you give a concrete example of what the descriptions in a DrawItem look like? What is the granularity of a DrawItem? Is it a per-mesh kind of thing, or more like a "one draw item for every material type" kind of thing, where you then draw every mesh that uses that material with a single DrawItem?

My DrawItem corresponds to one glDraw* / Draw* call, plus all the state that needs to be set immediately prior to the draw.
One model will usually have one DrawItem per sub-mesh (where a sub-mesh is a portion of that model that uses a particular material), per pass (where a pass is e.g. drawing to the g-buffer, drawing to a shadow-map, forward rendering, etc.). When drawing a model, it will find all the DrawItems for the current pass and push them into a render list, which can then be sorted.
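The render-list idea above can be sketched like this - a minimal, hypothetical version (none of these names are from the poster's engine) where each pass gathers DrawItem pointers tagged with a packed sort key and sorts them before submission:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct DrawItem;  // full pipeline state + bindings, as described below

struct RenderListEntry {
    uint64_t sortKey;       // e.g. material/shader bits in the high bits, depth in the low bits
    const DrawItem* item;
};

struct RenderList {
    std::vector<RenderListEntry> entries;

    void push(uint64_t key, const DrawItem* item) {
        entries.push_back({key, item});
    }

    void sortAndSubmit() {
        // Sorting by a single integer key groups draws by state (fewer state
        // changes) and/or orders them front-to-back (less overdraw).
        std::sort(entries.begin(), entries.end(),
                  [](const RenderListEntry& a, const RenderListEntry& b) {
                      return a.sortKey < b.sortKey;
                  });
        // for (auto& e : entries) submit(*e.item);  // submission elided
    }
};
```

How the sort key is packed (material vs. depth priority) is a per-pass policy decision; opaque passes typically sort by state first, transparent passes by depth.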

A DrawItem which contains the full pipeline state, the resource bindings, and the draw-call parameters could look like this in a naive D3D11 implementation:

#include <d3d11.h>
#include <tuple>
#include <utility>
#include <vector>
using std::pair; using std::tuple; using std::vector;

struct DrawItem
{
  //pipeline state:
  ID3D11PixelShader* ps;
  ID3D11VertexShader* vs;
  ID3D11BlendState* blend;
  ID3D11DepthStencilState* depth;
  ID3D11RasterizerState* raster;
  D3D11_RECT* scissor;
  //input assembler state:
  D3D11_PRIMITIVE_TOPOLOGY primitive;
  ID3D11InputLayout* inputLayout;
  ID3D11Buffer* indexBuffer;
  vector<tuple<int/*slot*/, ID3D11Buffer*, unsigned/*stride*/, unsigned/*offset*/>> vertexBuffers;
  //resource bindings:
  vector<pair<int/*slot*/, ID3D11Buffer*>> cbuffers;
  vector<pair<int/*slot*/, ID3D11SamplerState*>> samplers;
  vector<pair<int/*slot*/, ID3D11ShaderResourceView*>> textures;
  //draw call parameters:
  int numVerts, numInstances, indexBufferOffset, vertexBufferOffset;
};

That structure is extremely unoptimized though. It's a base size of ~116 bytes, plus the memory used by the vectors, which could be ~1KiB!

I'd aim to compress them down to 28-100 bytes in a single contiguous allocation, e.g. by using ID's instead of pointers, by grouping objects together (e.g. referencing a PS+VS program pair, instead of referencing each individually), and by using variable length arrays built into that structure instead of vectors.
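Those compression ideas could look something like the following - a hedged sketch, with entirely made-up field names: small integer IDs index into engine-side tables instead of holding pointers, the VS+PS pair becomes one program ID, and per-byte counts describe variable-length slot/ID arrays that live in the same contiguous allocation right after the header:

```cpp
#include <cstdint>

struct CompactDrawItem {
    // Pipeline state as 16-bit handles into engine-side lookup tables:
    uint16_t programId;       // VS+PS pair referenced together
    uint16_t blendId, depthId, rasterId, inputLayoutId;
    uint16_t indexBufferId;
    uint8_t  primitive;
    // Counts for the variable-length arrays that follow this header
    // in the same contiguous allocation (no separate vector allocations):
    uint8_t  numVertexBuffers, numCBuffers, numSamplers, numTextures;
    // Draw-call parameters:
    uint32_t numVerts, numInstances;
    uint32_t indexBufferOffset, vertexBufferOffset;
    // ...followed in memory by packed slot/ID pairs for the bindings.
};

// The fixed header stays small; the binding tail only costs what each draw uses.
static_assert(sizeof(CompactDrawItem) <= 40, "header should stay compact");
```

Compared with the naive struct above, this trades pointer convenience for cache-friendly iteration: the render list can walk thousands of these per frame with far fewer cache misses and zero per-item heap allocations.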

When porting to Mantle/Vulkan/D3D12, that "pipeline state" section all gets replaced with a single object and the "input assembler" / "resource bindings" sections get replaced by a descriptor. Alternatively, these new APIs also allow for a DrawItem to be completely replaced by a very small native command buffer!

 

There's a million ways to structure a renderer, but this is the design I ended up with, which I personally find very simple to implement on / port to every platform.

 

 

Thanks a lot for that description. I must say it sounds very elegant. It's almost like a functional programming approach to draw call submission, along with its disadvantages and advantages. 


 

There is something I don't really understand in Vulkan/DX12: the "descriptor" object. Apparently it acts as a GPU-readable data chunk that holds texture pointer/size/layout and sampler info, but I don't understand how the descriptor set/pool concept works - this sounds a lot like an array of bindless texture handles to me.

Without going into detail: it's because only AMD & NVIDIA cards support bindless textures in their hardware; there's one major desktop vendor that doesn't support it even though it's DX11 hardware. Also bear in mind that both Vulkan & DX12 want to support mobile hardware as well.
You will have to give the API a table of textures based on frequency of updates: One blob of textures for those that change per material, one blob of textures for those that rarely change (e.g. environment maps), and another blob of textures that don't change (e.g. shadow maps).
It's very analogous to how we have been doing constant buffers with shaders (provide different buffers based on frequency of update).
And you put those blobs into a bigger blob and tell the API "I want to render with this big blob which is a collection of blobs of textures"; so the API can translate this very well to all sorts of hardware (mobile, Intel on desktop, and bindless like AMD's and NVIDIA's).

If all hardware were bindless, this set/pool wouldn't be needed, because you could change any one texture anywhere with minimal GPU overhead, like you do in OpenGL 4 with the bindless texture extensions.
Nonetheless, the descriptor pool/set is also useful for non-texture stuff (e.g. anything that requires binding, like constant buffers). It is quite generic.
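The "blob of blobs" grouping described above can be modelled in a few lines - this is a conceptual sketch only (the type and enum names are invented, not Vulkan/DX12 API), showing texture handles grouped into sets by update frequency, with one table per draw pointing at all of its sets:

```cpp
#include <cstdint>
#include <vector>

// How often the contents of a set of bindings change:
enum class UpdateFrequency { PerFrame, PerMaterial, Static };

// One "blob": a group of texture handles that change together.
struct DescriptorSet {
    UpdateFrequency frequency;
    std::vector<uint32_t> textureHandles;  // engine-side handles, not GPU pointers
};

// The "big blob": the collection of sets a draw renders with.
struct DescriptorTable {
    std::vector<const DescriptorSet*> sets;
};
```

On bindless hardware a driver can flatten such a table into arrays of GPU handles; on slot-based hardware (desktop or mobile) it can walk the sets and issue conventional per-slot binds - which is why the API forces the grouping on everyone.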

 

 

Thanks.
I think it also makes sparse textures available? At least at the tier level requested by arb_sparse_texture (i.e. without the shader function returning residency state).


 

 


Thanks.
I think it also makes sparse textures available? At least at the tier level requested by arb_sparse_texture (i.e. without the shader function returning residency state).

 

 

On DirectX 12, for feature level 11/11.1 GPUs, support for tier 1 of tiled resources (sparse textures) is still optional. In that GPU range, even if their architecture should support tier 1 of tiled resources, there are some GPUs (low/low-mid end, desktop and mobile) that do not support it (e.g. driver support of tiled resources is still disabled on AMD HD 7700 Mobile GPUs). The same should apply to OGL/Vulkan.

Edited by Alessio1989
