DX12 Is there a way to query max root signature size?

I have read that the max root signature size is 64 DWORDs. In practice it is apparently 64 on NVIDIA and 13 on AMD, but I got this info from a person working at AMD, and I can't do this for every vendor out there :P. There has to be a better way to query the max root signature size through the API without needing to contact someone working at each company.

So is there a way to find out the actual max root signature size for the active GPU?

Thank you

The root signature is an abstraction; the size limit is there to prevent abuse. Underneath, that is not how things work, and there is no limit of 13 on AMD whatsoever.

What the AMD guy may have told you is that they store data in up to 13 reserved SGPRs, and if that is not enough, they generate an extended user data buffer in memory and store a pointer to it in 2 SGPRs. The consequence is that some of your constants or descriptors are accessed with an extra indirection.

99% of the time, the indirection is free, because all of these are loaded with scalar (SGPR) load instructions, and the latency of those loads hides well behind other instructions.

If you want to know how a shader really accesses your data, you can take a look at the ISA by capturing a frame in PIX.

Edited by galop1n

It's 64 on everything.

AMD GPUs have 16 DWORDs of space in hardware available to implement the concept of the root signature. Apparently the driver eats up 3 for itself, leaving you 13. If your root parameters fit into this space, then hopefully the driver will place them directly into these HW registers... otherwise the driver will allocate a temporary table, put some of your root parameters into that table, and place a pointer to the table into a HW register.

However, to further complicate things, you don't know what the actual sizes of the different root parameters are either! D3D12 says that:
Descriptor tables cost 1 DWORD each.
Root constants cost 1 DWORD each, since they are 32-bit values.
Root descriptors (64-bit GPU virtual addresses) cost 2 DWORDs each.

And this is true regarding D3D12's 64 DWORD limit... but might not match the hardware. Perhaps a root descriptor is a different size depending on whether it's a texture or a buffer? Perhaps they're 1 DWORD, or 4 DWORDs? Perhaps the underlying hardware doesn't even have root parameter registers at all -- maybe it's all traditional style resource binding slots?
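To make the accounting above concrete, here is a minimal sketch that adds up the cost of a root signature under D3D12's own rules (1 DWORD per descriptor table, 1 DWORD per 32-bit root constant, 2 DWORDs per root descriptor) and checks it against the 64 DWORD limit. The types and function names (`RootParam`, `costInDwords`, `fitsInRootSignature`) are hypothetical stand-ins, not part of the D3D12 API, and the sketch says nothing about what the hardware does underneath.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the three kinds of root parameter discussed above.
enum class RootParamKind { DescriptorTable, RootConstants, RootDescriptor };

struct RootParam {
    RootParamKind kind;
    std::uint32_t num32BitValues; // only used for RootConstants
};

// D3D12's documented root signature limit, in DWORDs.
constexpr std::uint32_t kMaxRootSignatureDwords = 64;

// Cost of one parameter according to D3D12's abstraction (not the hardware).
inline std::uint32_t costInDwords(const RootParam& p) {
    switch (p.kind) {
        case RootParamKind::DescriptorTable: return 1;
        case RootParamKind::RootConstants:   return p.num32BitValues;
        case RootParamKind::RootDescriptor:  return 2;
    }
    return 0; // unreachable for valid enum values
}

// Sum of the costs of all parameters in the root signature.
inline std::uint32_t totalCostInDwords(const std::vector<RootParam>& params) {
    std::uint32_t total = 0;
    for (const RootParam& p : params) {
        total += costInDwords(p);
    }
    return total;
}

// True if the layout stays within the 64 DWORD limit.
inline bool fitsInRootSignature(const std::vector<RootParam>& params) {
    return totalCostInDwords(params) <= kMaxRootSignatureDwords;
}
```

For example, one descriptor table, four root constants and one root descriptor come to 1 + 4 + 2 = 7 DWORDs, well under the limit, while 33 root descriptors (66 DWORDs) would not fit.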

If you want to write cross platform stuff, just follow the rules of D3D12's abstraction... because there is no reliable way to get answers to these questions from the underlying hardware.

11 minutes ago, Hodgman said:

If you want to write cross platform stuff, just follow the rules of D3D12's abstraction...

Sorry, but what are these rules?

Does it mean that I just use root descriptors for constantly changing CBVs and use descriptor tables (sets) for all other resources?

Edited by mark_braga

1 hour ago, mark_braga said:

Sorry, but what are these rules?

That the root parameter block size is 64 DWORDs on every GPU, a table takes up 1 DWORD, a root-constant takes up 1 DWORD per element, and a root descriptor takes up 2 DWORDs.

If you really want a particular parameter to go into a HW register instead of being spilled to memory, put it earlier in the root signature.
e.g. Let's say that AMD can support 6 root descriptors in hardware. If you use 12 root descriptors, the first 6 will go into HW registers (fast for the shader to read, fast for the driver to update), and the other 6 will spill into memory (a very slight penalty for the shader setup, more cost for the driver to update). So, if some of those descriptors change frequently, you should put the frequently-changing ones first in the root signature, and the rest afterwards. No matter the actual HW root register size, this will ensure that your most dynamic resources are the most likely to go into HW registers.
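The ordering advice can be sketched as a small helper that sorts root parameters by how often they change and then estimates how many leading parameters would fit in a given hardware budget. Everything here is an assumption for illustration: the `RootParamInfo` struct, the per-frame update estimate, and the idea that a driver fills registers strictly in declaration order and spills the rest. Real drivers may behave differently, and the budget is not something the API reports.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one root parameter, kept by the application.
struct RootParamInfo {
    std::string name;              // label for debugging only
    std::uint32_t sizeInDwords;    // cost under D3D12's rules (1 or 2, or N constants)
    std::uint32_t updatesPerFrame; // the application's own estimate
};

// Put the most frequently updated parameters first.
// stable_sort keeps the original relative order of parameters with equal frequency.
inline void orderByUpdateFrequency(std::vector<RootParamInfo>& params) {
    std::stable_sort(params.begin(), params.end(),
                     [](const RootParamInfo& a, const RootParamInfo& b) {
                         return a.updatesPerFrame > b.updatesPerFrame;
                     });
}

// Count how many leading parameters fit in an ASSUMED hardware budget.
// Parameters after the first one that does not fit are treated as spilled.
inline std::size_t countParamsInRegisters(const std::vector<RootParamInfo>& params,
                                          std::uint32_t hwBudgetDwords) {
    std::uint32_t used = 0;
    std::size_t count = 0;
    for (const RootParamInfo& p : params) {
        if (used + p.sizeInDwords > hwBudgetDwords) {
            break;
        }
        used += p.sizeInDwords;
        ++count;
    }
    return count;
}
```

With twelve 2-DWORD root descriptors and the 13 DWORD figure quoted earlier in the thread, `countParamsInRegisters` reports that 6 parameters fit, which matches the example above. Since the real budget is unknown, the sort is the part worth keeping; the count is only a way to reason about a guess.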

After that, if you want to micro-optimize for different GPUs... yeah, there is no official API to help with this. You're down to asking individual vendors for advice and applying that advice for every different GPU model out there...

Right now AMD suggests putting your frequently changing stuff first, while NVIDIA says last. I would say neither matters much: unless you have a very unusual case for the bound resources and access pattern, the difference in performance on the CPU and the GPU is at the level of white noise.

I would even suggest caring less still, because the trend for rendering large amounts of data is to go bindless, which adds an explicit indirection in the shader from material/object properties to the actual descriptors. The cost is again negligible in practice and is compensated for by the shrinking amount of work the command processor has to do. You also get the possibility of preparing many of your bindings in advance in immutable buffers and reusing them from frame to frame, instead of spending time flushing everything every frame.
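As a rough CPU-side model of that bindless indirection, the sketch below keeps one large descriptor array and a table of material records that store indices into it. It only illustrates the lookup a shader would perform; `Descriptor`, `MaterialRecord`, `BindlessScene` and `resolveAlbedo` are hypothetical names, and no D3D12 calls are involved.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for a real descriptor; here just an opaque 64-bit value.
struct Descriptor {
    std::uint64_t gpuAddress;
};

// Per-material record holding indices into the descriptor array
// rather than the descriptors themselves.
struct MaterialRecord {
    std::uint32_t albedoIndex;
    std::uint32_t normalIndex;
};

struct BindlessScene {
    std::vector<Descriptor> descriptorHeap; // one large table, bound once
    std::vector<MaterialRecord> materials;  // built ahead of time, reused across frames
};

// The extra indirection: material id -> index -> descriptor.
// at() throws std::out_of_range on a bad id or index.
inline const Descriptor& resolveAlbedo(const BindlessScene& scene,
                                       std::uint32_t materialId) {
    const MaterialRecord& material = scene.materials.at(materialId);
    return scene.descriptorHeap.at(material.albedoIndex);
}
```

In this model the per-draw data shrinks to a single material id, and the material table does not need to be rebuilt each frame unless the materials themselves change.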
