
DX12 - Documentation / Tutorials?


Recommended Posts

Haven't heard of any releases for DX12 yet; the hints point to an early-access preview later this year. You can apply for the early access from a link down the page here. That same page gives a good overview of what is changing - I think the main thing is a lower level of hardware abstraction. It looks like it builds upon the approach used in DX11 (e.g. it still supports command lists and so on), so I think you will be safe starting with DX 11.1 or 11.2.

Edited by spazzarama


Closest thing to documentation right now: http://channel9.msdn.com/Events/Build/2014/3-564

(Rant: And knowing Microsoft, that's probably the only "documentation" we will ever have for a long time. With DX 10 & 11 I got the feeling they either didn't finish the documentation, or they did finish it but didn't provide it for free. I hear some guy at Futuremark (3DMark) already had access to the API to write a DX12 demo - I fear they, like other companies, are probably already working directly with MS to figure things out, and we average Joes will again be left in the dark.)

Edited by tonemgub


If you're a professional developer you can apply for their early access program, which has documentation/samples/SDK available. If not...then you'll unfortunately have to wait until the public release.


what kind of documentation are you searching for?


Sorry, should've been more specific. I'm referring to documentation on the binary format that would allow you to produce/consume compiled shaders, like you can with SM1-3, without having to pass through Microsoft DLLs or HLSL. Consider projects like MojoShader that could use this to decompile SM4/5 code to GLSL when porting software, or a possible Linux D3D11 driver that would need to compile SM4/5 bytecode into Gallium IR and eventually GPU machine code.

There's also no way with SM4/5 to write assembly and compile it, which is a pain for various tools that don't want to work through HLSL or the HLSL compiler.


So is DirectX 12 going to be like DX10, where they get you started and then just drop it in no time and replace it with 11? wtf. Is it going to be like that?


 

 

D3D12 will be the same, except it will perform much better (D3D11 deferred contexts do not actually provide good performance increases in practice... or that is the excuse of AMD and Intel, which do not support driver command lists).

Fixed.

 

AMD supports them on Mantle and multiple game console APIs. It's a back-end D3D (Microsoft code) issue, forcing a single thread in the kernel-mode driver to be responsible for kickoff. The D3D12 presentations have pointed out this flaw themselves.

 

 
I know that D3D11 command lists are far from perfect, but AMD was the first IHV to sell DX11 GPUs (the Radeon HD 5000 series), claiming "multi-threading support" as one of the big features of their graphics cards.
 
Here is what AMD proclaims:
 
http://www.amd.com/en-us/products/graphics/desktop/5000/5970
 

  • Full DirectX® 11 support
    • Shader Model 5.0
    • DirectCompute 11
    • Programmable hardware tessellation unit
    • Accelerated multi-threading
    • HDR texture compression
    • Order-independent transparency

 

They also claimed the same thing with DX 11.1 GPUs when WDDM 1.2 drivers came out. 

 

Yes, their driver is itself "multi-threaded" (I remember a few years ago it scaled well on two cores, with half the CPU driver overhead), and you can always use deferred contexts in different "app threads" (though since they are emulated by the D3D runtime, they just add more CPU overhead), but that's not the same thing.
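
(For the record, whether the driver exposes native command lists is just a feature query away - a rough sketch, assuming you already have an ID3D11Device:)

#include <d3d11.h>

// Rough sketch: query the driver's threading caps on an existing device.
// DriverCommandLists == FALSE means deferred contexts / command lists are
// emulated by the D3D runtime rather than built natively by the driver.
bool DriverSupportsCommandLists(ID3D11Device* device)
{
    D3D11_FEATURE_DATA_THREADING caps = {};
    HRESULT hr = device->CheckFeatureSupport(D3D11_FEATURE_THREADING,
                                             &caps, sizeof(caps));
    return SUCCEEDED(hr) && caps.DriverCommandLists == TRUE;
}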

 

Graphics Mafia... ehm, NVIDIA supports driver command lists, and where used correctly they work just fine (a big example: Civilization V). Yes, they also "cheat" consumers on feature level 11.1 support (as AMD "cheated" consumers - and developers! - on tier 2 tiled resources support), and they really like to break compatibility with old applications and games (especially old OpenGL games), but those are other stories.

Edited by Alessio1989


 

what kind of documentation are you searching for?


Sorry, should've been more specific. I'm referring to documentation on the binary format that would allow you to produce/consume compiled shaders, like you can with SM1-3, without having to pass through Microsoft DLLs or HLSL. Consider projects like MojoShader that could use this to decompile SM4/5 code to GLSL when porting software, or a possible Linux D3D11 driver that would need to compile SM4/5 bytecode into Gallium IR and eventually GPU machine code.

There's also no way with SM4/5 to write assembly and compile it, which is a pain for various tools that don't want to work through HLSL or the HLSL compiler.

 

 

I'm not sure what the actual problem you have here is.  It's an ID3DBlob.

 

If you want to load a precompiled shader, it's as simple as (and I'll even do it in C, just to prove the point) fopen, fread and a bunch of ftell calls to get the file size. Similarly, to save one it's fopen and fwrite.
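
Something like this (just a sketch with no real error handling; the resulting buffer and size are what you'd pass as the bytecode pointer/length when creating the shader object):

#include <stdio.h>
#include <stdlib.h>

// Sketch: load a precompiled shader blob from disk. The returned pointer and
// size are what you pass as pShaderBytecode / BytecodeLength when creating the
// shader object. The caller frees the buffer.
void* LoadShaderBlob(const char* path, size_t* outSize)
{
    FILE* f = fopen(path, "rb");
    if (!f) return NULL;

    fseek(f, 0, SEEK_END);
    long size = ftell(f);            // file size via ftell, as described above
    fseek(f, 0, SEEK_SET);

    void* data = (size > 0) ? malloc((size_t)size) : NULL;
    if (data && fread(data, 1, (size_t)size, f) != (size_t)size)
    {
        free(data);
        data = NULL;
    }
    fclose(f);

    if (data && outSize) *outSize = (size_t)size;
    return data;
}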

 

Unless you're looking for something else that Microsoft actually has no obligation whatsoever to give you, that is...


The DX12 overview indicates that the "unlimited memory" offered by the managed pool will be replaceable with custom memory management.

 

Say your typical low-end graphics card has 512MB - 1GB of memory. Is it realistic to say that the total data required to draw a complete frame is 2GB? Would that mean that the GPU memory would have to be refreshed 2-5+ times every frame?

 

Do I need to start batching based on buffer sizes? 


Is it realistic to say that the total data required to draw a complete frame is 2GB

Unless you have another idea, this is completely unrealistic. The amount of data required for one frame should be somewhere in the order of megabytes...  And DX11.2 minimizes the memory requirement with "tiled resources".

 

 


would that mean that the GPU memory would have to be refreshed 2-5+ times every frame?

This is not the case, but even if it were... The article is not very clear on this. It says that the driver will tell the operating system to copy resources into GPU memory (from system memory) as required, but only the application can free those resources, once all of the queued commands using them have been processed by the GPU.

It's not clear whether resources can also be released (from GPU memory, by the OS) during the processing of already-queued commands, to make room for the next 512MB (or 1GB, or whatever size) of your 2GB of data. My guess is that this is not possible: it would imply that the application's "swap resource" request could somehow be plugged into the driver/GPU's command queue to release unused resource memory mid-frame, which is probably not possible, since (also according to the article) the application has to wait for all of the queued commands in a frame to be executed before it knows which resources are no longer needed. Also, "the game already knows that a sequence of rendering commands refers to a set of resources" - this implies that the application (and certainly not the OS) can only change resource residency in between frames (sequences of rendering commands), not during a single frame.

Also, DX12 is only a driver/application-side improvement over DX11. Adding memory management capabilities to the GPU itself would also require a hardware-side redesign.

 

 


Do I need to start batching based on buffer sizes?

If you think that you'll need to use 2GB (or more than the recommended/available resource limits) of data per frame, then yes. Otherwise, no.

Edited by tonemgub


Thanks, Hodgman! Really good explanation!

 

 

 


Quote

Also, DX12 is only a driver/application-side improvement over DX11. Adding memory management capabilities to the GPU itself would also require a hardware-side redesign.

This kind of memory management is already required in order to implement the existing D3D runtime - pretending that the managed pool can be of unlimited size requires that the runtime can submit partial command buffers and page resources in and out of GPU-RAM during a frame.

What I meant to point out by that (and this was the main conclusion I reached with my train of thought) was that the CPU is still the one doing the heavy lifting when it comes to memory management. But now that I think about it, I guess it makes no difference - the main bottleneck is having to do an extra "memcpy" when there's not enough video memory.

 

As for your explanation of how DMA could be used to make this work, would that method also have to be used when, for example, all of a larger-than-video-memory resource is accessed by the GPU in the same, single shader invocation? Or would that shader invocation (somehow) have to be broken up across the subsequently generated command lists? Does that mean that the DirectX pipeline is also virtualized on the CPU?

 

Anyway, I think the main question that must be answered here is whether the resource limits imposed by DX11 will go away in DX12. Yes, theoretically (and perhaps even practically) the CPU and GPU could be programmed to work together to provide virtually unlimited memory, but will this really be the case with DX12? From what I can tell, the article implies that the "unlimited memory" has to be implemented as "custom memory management" done by the application - not the runtime, nor the driver or GPU. This probably also means that it will be the application's job to split the processing of the large data/resources into multiple command lists, and I don't think the application will be allowed to use that DMA-based synchronisation method (or trick?) that you explained.

Edit: Wait. That's how tiled resources already work. Never mind... :)

Edited by tonemgub


It's always been the case that you shouldn't use more memory than the GPU actually has, because it results in terrible performance. So, assuming that you've always followed this advice, you don't have to do much work in the future 

 

So in essence, a game rated to require a minimum of 512MB VRAM (does DX have memory requirements?) never uses more than that for any single frame/scene?

 

You would think that AAA games that require tens of gigabytes of disk space would at some point use more memory in a scene than what is available on the GPU. Is this just artist trickery to keep any scene below the rated GPU memory?

Edited by Tispe



Tiled resources tie in with the virtual address space stuff. Say you've got a texture that exists in an allocation from pointer 0x10000 to 0x90000 (a 512KB range) -- you can think of this allocation as being made up of 8 individual 64KB pages.
Tiled resources are a fancy way of saying that the entire range of this allocation doesn't necessarily need to be 'mapped', i.e. doesn't have to actually translate to a physical allocation.
It's possible that 0x10000 - 0x20000 is actually backed by physical memory, but 0x20000 - 0x90000 aren't actually valid pointers (much like a null pointer), and don't correspond to any physical location.
This isn't actually new stuff -- at the OS level, allocating a range of the virtual address space (allocating yourself a new pointer value) is actually a separate operation from allocating some physical memory and then creating a link between the two. The new part that makes this extremely useful is a new bit of shader hardware -- when a shader tries to sample a texel from this texture, it now gets an additional return value indicating whether the texture fetch actually succeeded or not (i.e. whether the resource pointer was actually valid or not). With older hardware, fetching from an invalid resource pointer would just crash (like it does on the CPU), but now we get error flags.
 
This means you can create absolutely huge resources, but then, on the granularity of 64KB pages, determine whether those pages are physically allocated or not. You can use this so that the application can appear simple and just use huge textures, while the engine/driver/whatever intelligently allocates/deallocates parts of those textures as required.
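
In API terms, that maps roughly onto the D3D11.2 tiled resources calls below (a sketch only - it assumes you've already obtained an ID3D11Device2/ID3D11DeviceContext2 and verified tiled-resource support; the sizes are illustrative):

// Sketch: reserve a huge texture with no physical backing, then back just one
// 64KB tile of it from a tile pool.
ID3D11Texture2D* tiledTex = nullptr;
D3D11_TEXTURE2D_DESC texDesc = {};
texDesc.Width            = 16384;                     // 1GB of address space...
texDesc.Height           = 16384;                     // ...but nothing committed yet
texDesc.MipLevels        = 1;
texDesc.ArraySize        = 1;
texDesc.Format           = DXGI_FORMAT_R8G8B8A8_UNORM;
texDesc.SampleDesc.Count = 1;
texDesc.Usage            = D3D11_USAGE_DEFAULT;
texDesc.BindFlags        = D3D11_BIND_SHADER_RESOURCE;
texDesc.MiscFlags        = D3D11_RESOURCE_MISC_TILED; // virtual allocation only
device2->CreateTexture2D(&texDesc, nullptr, &tiledTex);

// The tile pool is the physical memory that tiles get mapped to (multiple of 64KB).
ID3D11Buffer* tilePool = nullptr;
D3D11_BUFFER_DESC poolDesc = {};
poolDesc.ByteWidth = 256 * 64 * 1024;                 // 16MB of physical tiles
poolDesc.Usage     = D3D11_USAGE_DEFAULT;
poolDesc.MiscFlags = D3D11_RESOURCE_MISC_TILE_POOL;
device2->CreateBuffer(&poolDesc, nullptr, &tilePool);

// Map the texture's first 64KB tile (tile coordinate 0,0) to the pool's first tile.
D3D11_TILED_RESOURCE_COORDINATE coord = {};           // x = 0, y = 0, mip 0
D3D11_TILE_REGION_SIZE region = {};
region.NumTiles = 1;
UINT rangeFlag      = 0;                              // 0 = map the range to the pool
UINT poolStartTile  = 0;
UINT rangeTileCount = 1;
context2->UpdateTileMappings(tiledTex, 1, &coord, &region, tilePool,
                             1, &rangeFlag, &poolStartTile, &rangeTileCount, 0);

// On Tier 2 hardware, sampling outside the mapped tile returns the
// 'fetch failed' status instead of crashing.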

 

So what you are saying is that we CAN have 2GB+ of game resources allocated in GPU VRAM using virtual addresses just fine, but only when we need them do we have to tell the driver to actually page things into VRAM?

 

Assume now that a modern computer has at least 16GB of system memory, and a game has 8GB of resources while the GPU has 2GB of VRAM. In this situation a DX12 game would just copy all game data from disk to system memory (8GB), then allocate those 8GB of resources in VRAM, even though the physical limit is 2GB. Command queues would then tell the driver which parts of those 8GB to page in and out? But is that not just what the managed pool does anyway?


It's pretty much VirtualAlloc & VirtualFree for the GPU.

You still have to manage the memory manually yourself, flagging pages as appropriate and loading/inflating data from disk/RAM to VRAM as needed.
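
The CPU-side equivalent, for comparison (a sketch using the Win32 virtual memory API - reserving address space and committing physical pages to it are separate steps):

#include <windows.h>
#include <string.h>

int main()
{
    // Sketch: reserve 1GB of address space (no physical memory behind it yet),
    // then commit physical pages for just the first 64KB of that range.
    void* base = VirtualAlloc(nullptr, 1ull << 30, MEM_RESERVE, PAGE_NOACCESS);
    if (!base) return 1;
    void* first = VirtualAlloc(base, 64 * 1024, MEM_COMMIT, PAGE_READWRITE);
    if (!first) return 1;

    memset(first, 0, 64 * 1024);        // fine: these pages are committed
    // memset(base, 0, 1ull << 30);     // would fault: the rest is only reserved

    VirtualFree(base, 0, MEM_RELEASE);  // releases the entire reservation
    return 0;
}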

 

Virtual Textures were available in hardware on 3DLabs cards 10 years ago, long before "Mega Textures" ever existed...

 

Doing it manually allows you to predict what you'll need and keep things compressed in RAM; that's not the case for the managed pool, which requires you to upload everything in its final format and lets the system page it in/out on use [which is too late to avoid a performance hit].

Edited by Ingenu

But is that not just what the managed pool does anyway?

 

The managed pool needs to swap in and out entire textures. Say you've got a 2048x2048 texture but your draw call is only going to reference a small portion of it. With the managed pool the entire texture needs to be swapped in for this to happen. With proper virtualization of textures only the small portion that is being used (in practice it will probably be a little bigger, on the order of a 64KB tile) will get swapped in. That's an efficiency win straight away.
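
To put rough numbers on that (assuming an uncompressed RGBA8 texture, for which the standard tile shape is 128x128 texels):

#include <cstddef>

// Back-of-the-envelope: what gets made resident for one draw that touches a
// tiny corner of a 2048x2048 RGBA8 texture.
constexpr std::size_t wholeTexture = 2048 * 2048 * 4; // managed pool: ~16MB
constexpr std::size_t oneTile      = 128  * 128  * 4; // one tile: 64KB
static_assert(wholeTexture == 256 * oneTile, "the whole texture is 256 tiles");

That's the difference between paging in roughly 16MB and paging in 64KB for the same draw call.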
