

DX12 - Documentation / Tutorials?


#1 Tispe   Members   -  Reputation: 1038


Posted 03 September 2014 - 04:13 AM

Hi

 

 

I wanted to know if anyone here has had the chance to use/look at the Direct3D 12 API?

 

How different is it from DX9, DX11? Is it a whole paradigm shift?

 

Is it the basics: "create buffer -> copy data to buffer -> Draw from buffer" kind of thing?

 

I'm mostly just familiar with DX9 and want to catch up with DX12. Should I learn DX11 while waiting for docs/tutorials?

 

Cheers!

 




#2 spazzarama   Members   -  Reputation: 778


Posted 03 September 2014 - 04:29 AM

Haven't heard of any releases for DX12 yet; the hints are for an early access preview later this year. You can apply for the early access from a link down the page here. That same page gives a good overview of what is changing - I think the main thing is a lower level of hardware abstraction. It looks like it builds upon the approach used in DX11 (e.g. it still supports command lists and so on), so I think you will be safe to start with DX 11.1 or 11.2.


Edited by spazzarama, 03 September 2014 - 04:31 AM.

Justin Stenning | Blog | Book - Direct3D Rendering Cookbook (using C# and SharpDX)

Projects: Direct3D Hook, EasyHookAfterglow, C#raft

@spazzarama

 

#3 tonemgub   Members   -  Reputation: 1143


Posted 03 September 2014 - 07:02 AM

Closest thing to documentation right now: http://channel9.msdn.com/Events/Build/2014/3-564

(Rant: And knowing Microsoft, that's probably the only "documentation" we will ever have for a long time. With DX 10 & 11 I got the feeling they either didn't finish the documentation, or they did finish it but didn't provide it for free. I hear some guy at Futuremark (3DMark) already had access to the API to write a DX12 demo - I fear they, like other companies, are probably already working directly with MS to figure things out, and we average Joes will again be left in the dark.)


Edited by tonemgub, 03 September 2014 - 07:04 AM.


#4 Hodgman   Moderators   -  Reputation: 30882


Posted 03 September 2014 - 08:17 AM

It's not going to be out for quite a while, and no one is yet sure whether it will be compatible with older GPUs (feature levels) or older OSes (worst case is Win9+)... so I would definitely recommend learning D3D11 in the meantime.

 

The D3D9 -> D3D11 jump will be similar to the D3D11 -> D3D12 jump, so this will still be a helpful learning exercise for you.

 

In D3D9, you have a lot of render-states, set with SetRenderState / SetSamplerState / etc... In D3D11, these have been replaced with a much smaller set of immutable state objects.

D3D12 will likely go further in the same direction, with an even smaller set of immutable state objects.
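To make that concrete, here is a minimal sketch (my own illustration, not taken from the SDK docs) of the same alpha-blend setup done both ways; the helper function names are made up:

#include <d3d9.h>
#include <d3d11.h>

// D3D9: individual render states, set one at a time and mutable at any point.
void EnableAlphaBlendD3D9(IDirect3DDevice9* dev)
{
    dev->SetRenderState(D3DRS_ALPHABLENDENABLE, TRUE);
    dev->SetRenderState(D3DRS_SRCBLEND, D3DBLEND_SRCALPHA);
    dev->SetRenderState(D3DRS_DESTBLEND, D3DBLEND_INVSRCALPHA);
}

// D3D11: the same states are baked into a single immutable blend-state object,
// created once up front and then bound as a whole.
ID3D11BlendState* CreateAlphaBlendStateD3D11(ID3D11Device* dev)
{
    D3D11_BLEND_DESC desc = {};
    desc.RenderTarget[0].BlendEnable           = TRUE;
    desc.RenderTarget[0].SrcBlend              = D3D11_BLEND_SRC_ALPHA;
    desc.RenderTarget[0].DestBlend             = D3D11_BLEND_INV_SRC_ALPHA;
    desc.RenderTarget[0].BlendOp               = D3D11_BLEND_OP_ADD;
    desc.RenderTarget[0].SrcBlendAlpha         = D3D11_BLEND_ONE;
    desc.RenderTarget[0].DestBlendAlpha        = D3D11_BLEND_ZERO;
    desc.RenderTarget[0].BlendOpAlpha          = D3D11_BLEND_OP_ADD;
    desc.RenderTarget[0].RenderTargetWriteMask = D3D11_COLOR_WRITE_ENABLE_ALL;

    ID3D11BlendState* state = nullptr;
    dev->CreateBlendState(&desc, &state);   // error handling omitted
    return state;
}

void BindAlphaBlendStateD3D11(ID3D11DeviceContext* ctx, ID3D11BlendState* state)
{
    const float blendFactor[4] = { 1.0f, 1.0f, 1.0f, 1.0f };
    ctx->OMSetBlendState(state, blendFactor, 0xFFFFFFFF);
}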

 

D3D11 is also multi-thread friendly by design (so you can get used to using 'immediate' and 'deferred' contexts now), whereas D3D9 only provides a single thread/context.

D3D12 will be the same, except it will perform much better (D3D11 deferred contexts do not actually provide good performance increases in practice).
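A rough sketch of that workflow (my own illustration, assuming you already have a valid device and immediate context; error handling omitted):

#include <d3d11.h>

// Worker thread: record commands into a deferred context, then bake them into
// a command list that can be handed back to the main thread.
ID3D11CommandList* RecordSceneOnWorkerThread(ID3D11Device* device)
{
    ID3D11DeviceContext* deferred = nullptr;
    device->CreateDeferredContext(0, &deferred);

    // ... issue state changes and draw calls on 'deferred' exactly as you
    //     would on the immediate context ...

    ID3D11CommandList* commandList = nullptr;
    deferred->FinishCommandList(FALSE, &commandList);
    deferred->Release();
    return commandList;
}

// Main thread: only the immediate context actually submits work to the GPU.
void SubmitOnMainThread(ID3D11DeviceContext* immediate, ID3D11CommandList* commandList)
{
    immediate->ExecuteCommandList(commandList, FALSE);
    commandList->Release();
}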

 

The way that you bind resources to shaders will be completely different in D3D12 than in earlier D3D versions... The closest publicly available API at the moment is GL with all the "bindless" extensions...

However, the new D3D12 model of binding resources is more flexible than the current one, which means it's possible to implement D3D9/11-style resource binding in this new API -- you just have the option to do it other ways as well.
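Purely as an illustration of what that flexibility might look like - D3D12 was not public at the time of writing, so every type and function below is invented, not real API - a D3D11-style "slot" can simply become an index into a flat table of POD descriptors:

#include <cstdint>
#include <vector>

struct Descriptor            // imagined POD view object holding a raw GPU address
{
    uint64_t gpuAddress;
    uint32_t format;
    uint32_t sizeOrFlags;
};

struct DescriptorTable       // imagined contiguous block that shaders index into
{
    std::vector<Descriptor> entries;
};

// Emulating D3D9/11-style binding: "bind to slot N" just writes entry N of the table.
void BindTextureToSlot(DescriptorTable& table, uint32_t slot, const Descriptor& view)
{
    if (table.entries.size() <= slot)
        table.entries.resize(slot + 1);
    table.entries[slot] = view;      // a plain, memcpy-able write - no COM calls involved
}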



#5 SeanMiddleditch   Members   -  Reputation: 6338


Posted 03 September 2014 - 02:55 PM

(Rant: And knowing Microsoft, probably that's the only "documentation" we will ever have for a long time. With DX 10 & 11 I got the feeling they either didn't finish the documentation, or they did finish it but didn't provide it for free.


A problem they had is that the DirectX group did not generate any profit (they didn't "sell" anything) so they had a pretty low budget, leading to an inability to afford dedicated documentation people. This is also why SM4/5 aren't publicly documented.

Hopefully with all the restructuring lately the accountants will be forced to recognize that DX is a core part of the Microsoft development ecosystem and is important to Microsoft's long-term health.

#6 MJP   Moderators   -  Reputation: 11567


Posted 03 September 2014 - 04:43 PM

If you're a professional developer you can apply for their early access program, which has documentation/samples/SDK available. If not...then you'll unfortunately have to wait until the public release.



#7 Alessio1989   Members   -  Reputation: 2055


Posted 03 September 2014 - 05:19 PM

The first Windows 9 preview should be public at the end of this month, I read, so it's perfectly plausible that we will see a first Windows SDK 9 preview at the same time. It should be out more or less in Q1 2015 (coinciding with VS14, probably a little later).
 
So, yes, we're all waiting for the next Windows SDK, hoping at least a little of the DX12 libraries will come to Windows 8.1 (capped at feature level 11.1?).
 

D3D12 will be the same, except it will perform much better (D3D11 deferred contexts do not actually provide good performance increases in practice... or this is the excuse of AMD and Intel, which do not support driver command lists).

 
Fixed
 

.... This is also why SM4/5 aren't publicly documented....

 
http://msdn.microsoft.com/library/bb509657.aspx http://msdn.microsoft.com/en-us/library/bb943998.aspx
http://msdn.microsoft.com/library/ff471356.aspx http://msdn.microsoft.com/en-us/library/hh447232.aspx
 
What kind of documentation are you searching for?


Edited by Alessio1989, 03 September 2014 - 05:30 PM.

"Software does not run in a magical fairy aether powered by the fevered dreams of CS PhDs"


#8 Hodgman   Moderators   -  Reputation: 30882


Posted 03 September 2014 - 07:12 PM

D3D12 will be the same, except it will perform much better (D3D11 deferred contexts do not actually provide good performance increases in practice... or this is the excuse of AMD and Intel, which do not support driver command lists).

Fixed
AMD supports them on Mantle and multiple game console APIs. It's a back-end D3D (Microsoft code) issue, forcing a single thread in the kernel-mode driver to be responsible for kickoff. The D3D12 presentations have pointed out this flaw themselves.

#9 SeanMiddleditch   Members   -  Reputation: 6338


Posted 03 September 2014 - 08:19 PM

what kind of documentation are you searching for?


Sorry, should've been more specific. I'm referring to documentation on the binary format, to allow you to produce/consume compiled shaders like you can with SM1-3 without having to pass through Microsoft DLLs or HLSL. Consider projects like MojoShader, which could make use of this functionality to decompile SM4/5 code to GLSL when porting software, or a possible Linux D3D11 driver that would need to be able to compile compiled SM4/5 code into Gallium IR and eventually GPU machine code.

There's also no way with SM4/5 to write assembly and compile it, which is a pain for various tools that don't want to work through HLSL or the HLSL compiler.

#10 ankhd   Members   -  Reputation: 1317


Posted 03 September 2014 - 08:52 PM

So is DirectX 12 going to be like DX10, where they get you started and then just drop it in no time and replace it with 11? Is it going to be like that?



#11 MJP   Moderators   -  Reputation: 11567


Posted 03 September 2014 - 11:39 PM

 

 

D3D12 will be the same, except it will perform much better (D3D11 deferred contexts do not actually provide good performance increases in practice... or this is the excuse of AMD and Intel, which do not support driver command lists).

Fixed
AMD supports them on Mantle and multiple game console APIs. It's a back-end D3D (Microsoft code) issue, forcing a single thread in the kernel-mode driver to be responsible for kickoff. The D3D12 presentations have pointed out this flaw themselves.

 

 

Indeed. There are also potential issues resulting from the implicit synchronization and abstracted memory-management model used by D3D11 resources. D3D12 gives you much more manual control over memory and synchronization, which saves the driver from having to jump through crazy hoops when generating command buffers on multiple threads.


Edited by MJP, 03 September 2014 - 11:46 PM.


#12 Alessio1989   Members   -  Reputation: 2055


Posted 04 September 2014 - 05:56 AM

 

 

D3D12 will be the same, except it will perform much better (D3D11 deferred contexts do not actually provide good performance increases in practice... or this is the excuse of AMD and Intel, which do not support driver command lists).

Fixed

 

AMD supports them on Mantle and multiple game console APIs. It's a back-end D3D (Microsoft code) issue, forcing a single thread in the kernel-mode driver to be responsible for kickoff. The D3D12 presentations have pointed out this flaw themselves.

 

 
I know that D3D11 command lists are far from perfect, but AMD was the first IHV to sell DX11 GPUs (the Radeon HD 5000 series), claiming "multi-threading support" as one of the big features of their graphics cards.
 
Here is what AMD proclaims:
 
http://www.amd.com/en-us/products/graphics/desktop/5000/5970
 

  • Full DirectX® 11 support
    • Shader Model 5.0
    • DirectCompute 11
    • Programmable hardware tessellation unit
    • Accelerated multi-threading
    • HDR texture compression
    • Order-independent transparency

 

They also claimed the same thing with DX 11.1 GPUs when WDDM 1.2 drivers came out. 

 

Yes, their driver is itself "multi-threaded" (I remember a few years ago it scaled well on two cores, with half the CPU driver overhead), and you can always use deferred contexts in different "app threads" (since they are emulated by the D3D runtime - more CPU overhead, yeah!), but that's not the same thing.

 

The Graphics Mafia... ehm, NVIDIA supports driver command lists, and where they are used correctly they work just fine (a big example: Civilization 5). Yes, they also "cheat" consumers on feature level 11.1 support (as AMD "cheated" consumers... and developers! on tier-2 tiled resources support), and they really like to break compatibility with old applications and games (especially old OpenGL games), but those are other stories.


Edited by Alessio1989, 04 September 2014 - 04:25 PM.

"Software does not run in a magical fairy aether powered by the fevered dreams of CS PhDs"


#13 mhagain   Crossbones+   -  Reputation: 8134


Posted 04 September 2014 - 07:01 AM

 

what kind of documentation are you searching for?


Sorry, should've been more specific. I'm referring to documentation on the binary format, to allow you to produce/consume compiled shaders like you can with SM1-3 without having to pass through Microsoft DLLs or HLSL. Consider projects like MojoShader, which could make use of this functionality to decompile SM4/5 code to GLSL when porting software, or a possible Linux D3D11 driver that would need to be able to compile compiled SM4/5 code into Gallium IR and eventually GPU machine code.

There's also no way with SM4/5 to write assembly and compile it, which is a pain for various tools that don't want to work through HLSL or the HLSL compiler.

 

 

I'm not sure what the actual problem you have here is.  It's an ID3DBlob.

 

If you want to load a precompiled shader, it's as simple as (and I'll even do it in C, just to prove the point) fopen, fread and a couple of fseek/ftell calls to get the file size. Similarly, to save one it's fopen and fwrite.
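For illustration, here is a minimal sketch of that (plain C-style code that also compiles as C++; it assumes the blob on disk came out of fxc/D3DCompile, and error handling is mostly skipped):

#include <stdio.h>
#include <stdlib.h>
#include <d3d11.h>

ID3D11PixelShader* LoadPrecompiledPixelShader(ID3D11Device* device, const char* path)
{
    FILE* f = fopen(path, "rb");
    if (!f) return NULL;

    fseek(f, 0, SEEK_END);
    long size = ftell(f);            // the file size is the size of the compiled bytecode
    fseek(f, 0, SEEK_SET);

    void* bytecode = malloc(size);
    fread(bytecode, 1, size, f);
    fclose(f);

    // Hand the raw blob straight to the runtime - no knowledge of its internals needed.
    ID3D11PixelShader* shader = NULL;
    device->CreatePixelShader(bytecode, (SIZE_T)size, NULL, &shader);

    free(bytecode);
    return shader;
}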

 

Unless you're looking for something else that Microsoft actually have no obligation whatsoever to give you, that is.....


It appears that the gentleman thought C++ was extremely difficult and he was overjoyed that the machine was absorbing it; he understood that good C++ is difficult but the best C++ is well-nigh unintelligible.


#14 iedoc   Members   -  Reputation: 1041


Posted 04 September 2014 - 12:33 PM

I had an urge to start a post with a title saying "Can't wait for DX12! PLEASE HELP!"


Braynzar Soft - DirectX Lessons & Game Programming Resources!

#15 MJP   Moderators   -  Reputation: 11567


Posted 04 September 2014 - 02:50 PM

 

 

what kind of documentation are you searching for?


Sorry, should've been more specific. I'm referring to documentation on the binary format, to allow you to produce/consume compiled shaders like you can with SM1-3 without having to pass through Microsoft DLLs or HLSL. Consider projects like MojoShader, which could make use of this functionality to decompile SM4/5 code to GLSL when porting software, or a possible Linux D3D11 driver that would need to be able to compile compiled SM4/5 code into Gallium IR and eventually GPU machine code.

There's also no way with SM4/5 to write assembly and compile it, which is a pain for various tools that don't want to work through HLSL or the HLSL compiler.

 

 

I'm not sure what the actual problem you have here is.  It's an ID3DBlob.

 

If you want to load a precompiled shader, it's as simple as (and I'll even do it in C, just to prove the point) fopen, fread and a bunch of ftell calls to get the file size.  Similarly to save one it's fopen and fwrite.

 

Unless you're looking for something else that Microsoft actually have no obligation whatsoever to give you, that is.....

 

 

He's specifically talking about documentation of the final bytecode format contained in that blob, so that people could write their own compilers or assemblers without having to go through d3dcompiler_*.dll, as well as being able to disassemble a bytecode stream without that same DLL. That information (along with the full, complete D3D specification) is only available to driver developers.



#16 wil_   Members   -  Reputation: 310


Posted 04 September 2014 - 07:28 PM

I'm not sure if the link has already been provided in this discussion, but here it is in case you missed it:

 

"Direct3D 12 Overview" articles, a series of articles by Intel Engineer Michael Coppock, about the API design changes introduced with D3D12.

While it doesn't provide direct documentation of the API, it still contains a fair amount of interesting information about how you will use D3D12:

 

https://software.intel.com/en-us/blogs/author/1048217



#17 Tispe   Members   -  Reputation: 1038


Posted 08 September 2014 - 01:57 AM

The DX12 overview indicates that the "unlimited memory" the managed pool offers will be replaced with custom memory management.

 

Say your typical low-end graphics card has 512MB - 1GB of memory. Is it realistic to say that the total data required to draw a complete frame is 2GB, and would that mean that the GPU memory would have to be refreshed 2-5+ times every frame?

 

Do I need to start batching based on buffer sizes? 



#18 tonemgub   Members   -  Reputation: 1143


Posted 08 September 2014 - 07:23 AM


Is it realistic to say that the total data required to draw a complete frame is 2GB

Unless you have another idea, this is completely unrealistic. The amount of data required for one frame should be somewhere on the order of megabytes... And DX11.2 minimizes the memory requirement with "tiled resources".

 

 


would that mean that the GPU memory would have to be refreshed 2-5+ times every frame?

This is not the case, but even if it was... The article is not very clear on this. It says that the driver will tell the operating system to copy resources into GPU memory (from system memory) as required, but only the application can free those resources, once all of the queued commands using those resources have been processed by the GPU.

It's not clear whether the resources can also be released (from GPU memory, by the OS) during the processing of already-queued commands, to make room for the next 512MB (or 1GB, or whatever size) of your 2GB of data, but my guess is that this is not possible. It would imply that the application's "swap resource" request could somehow be plugged into the driver/GPU's queue of commands, to release unused resource memory in the meantime - which is probably not possible, since (also according to the article) the application has to wait for all of the queued commands in a frame to be executed before it knows which resources are no longer needed.

Also, "the game already knows that a sequence of rendering commands refers to a set of resources" - this implies that the application (not even the OS) can only change resource residency in between frames (sequences of rendering commands), not during a single frame. And DX12 is only a driver/application-side improvement over DX11; adding memory management capabilities to the GPU itself would also require a hardware-side redesign.

 

 


Do I need to start batching based on buffer sizes?

If you think that you'll need to use 2GB (or more than the recommended/available resource limits) of data per frame, then yes. Otherwise, no.


Edited by tonemgub, 08 September 2014 - 07:28 AM.


#19 Hodgman   Moderators   -  Reputation: 30882


Posted 12 September 2014 - 08:29 AM

Say your typical low end graphics card has 512MB - 1GB of memory. Is it realistic to say that the total data required to draw a complete frame is 2GB, would that mean that the GPU memory would have to be refreshed 2-5+ times every frame?
Do I need to start batching based on buffer sizes?

It's always been the case that you shouldn't use more memory than the GPU actually has, because it results in terrible performance. So, assuming that you've always followed this advice, you don't have to do much work in the future.
 

The article is not very clear on this. It says that the driver will tell the operating system to copy resources into GPU memory (from system memory) as required, but only the application can free those resources once all of the queued commands using those resources have been processed by the GPU. It's not clear if the resources can also be released (from GPU memory, by the OS) during the processing of already queued commands, to make room for the next 512MB (or 1GB, or whatever size) of your 2GB data. But my guess is that this is not possible. This would imply that the application's "swap resource" request could somehow be plugged-into the driver/GPU's queue of commands, to release unused resource memory intermediately, which is probably not possible, since (also according to the article), the application has to wait for all of the queued commands in a frame to be executed, before it knows which resources are no longer needed. Also, "the game already knows that a sequence of rendering commands refers to a set of resources" - this also implies that the application (not even the OS) can only change resource residency in-between frames (sequence of rendering commands), not during a single frame.

If D3D12 is going down the same path as the other low-level APIs, then resources as we know them in D3D won't really exist any more.
 
A resource such as a texture ceases to exist. Instead, you just get a form of malloc/free to use as you will. You can malloc/free memory whenever you want, but freeing memory too soon (while command lists referencing that memory are still in flight) will be undefined behavior (logged by the debug runtime, likely to cause corruption in the regular runtime).
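A common way to honour that rule is to defer frees behind a fence that the GPU signals when it finishes a frame. The sketch below is purely illustrative - the gpu* calls are stand-in stubs, not a real API:

#include <cstdint>
#include <vector>

static void     gpuFree(void* /*allocation*/) {}                  // placeholder for the real GPU free
static uint64_t gpuLastCompletedFenceValue()  { return 0; }       // placeholder: last fence value the GPU has passed

struct PendingFree { void* allocation; uint64_t fence; };
static std::vector<PendingFree> g_pendingFrees;

// Call this instead of freeing immediately: remember which fence must pass first.
void DeferredFree(void* allocation, uint64_t fenceSignaledAfterLastUse)
{
    g_pendingFrees.push_back({ allocation, fenceSignaledAfterLastUse });
}

// Call once per frame: only free what the GPU is provably finished with.
void FlushDeferredFrees()
{
    const uint64_t completed = gpuLastCompletedFenceValue();
    std::vector<PendingFree> stillPending;
    for (const PendingFree& p : g_pendingFrees)
    {
        if (p.fence <= completed)
            gpuFree(p.allocation);   // safe: no in-flight command list references it
        else
            stillPending.push_back(p);
    }
    g_pendingFrees.swap(stillPending);
}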
 
Resource-views stay pretty much as-is, but instead of creating a resource-view that points to a resource, they just have a raw pointer inside them, which points somewhere into one of the GPU-malloc allocations that you've previously made. These resource-view objects will hopefully be POD blobs instead of COM objects, which can easily be copied around into your descriptor tables. These 'view' structures are in native-to-the-GPU formats, and will be read directly as-is by your shader programs executing on the GPU.
 
This is basically what's going on already inside D3D, but it's hidden behind a thick layer of COM abstraction.
At the moment, the driver/runtime has to track which "resources" are used by a command list, and from there figure out which range of memory addresses are used by a command list.
The command list and this list of memory ranges are passed down to the Windows display manager, which is responsible for virtualizing the GPU and sharing it between processes. It stores this info in a system-wide queue, and eventually gets around to ensuring that your range of (virtual) memory addresses is actually resident in (physical) GPU-RAM and correctly mapped, and then it submits the command list.
 
At the moment, it's up to D3D to internally keep track of how many megabytes of memory is required by a command-list (how many megabytes of resources are referenced by that command list). Currently, D3D is likely ending your internal command-list early when it detects you're using too much memory, submitting this partial command-list, and then starting a new command list for the rest of the frame.
 

Also, DX12 is only a driver/application-side improvement over DX11. Adding memory management capabilities to the GPU itself would also require a hardware-side redesign.

This kind of memory management is already required in order to implement the existing D3D runtime - pretending that the managed pool can be of unlimited size requires that the runtime can submit partial command buffers and page resources in and out of GPU-RAM during a frame.
 
There are already lots of HW features available to allow this.
Both the CPU and the GPU use virtual-addressing, where the value of a pointer doesn't necessarily correspond to a physical address in RAM.
Generally, most pointers (i.e. virtual addresses) we use on the CPU are mapped to physical "main RAM", but pointers can also be mapped to IO devices, or other bits of RAM, such as RAM that's physically on the GPU.
The most basic system is then for us to use an event, such that when the GPU executes that event command, it writes a '1' into an area of memory that we've previously initialized with a zero. The CPU can submit the command buffer containing this event command, and then poll that memory location until it contains a '1', indicating the GPU has completed the commands preceding the event. The CPU can then map physical GPU memory into the CPU's address space, and memcpy new data into it.
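In rough sketch form (every name below is a placeholder for whatever the real label/mapping mechanism turns out to be), that flow looks like:

#include <cstdint>
#include <cstring>

// Placeholder: pretend this is a CPU-visible word that the GPU writes a 1 into
// when it reaches the event command at the end of the command buffer.
static volatile uint32_t g_gpuEventLabel = 0;

void UploadWhenGpuIsDone(void* mappedGpuMemory,   // GPU-RAM mapped into the CPU's address space
                         const void* newData,
                         std::size_t byteCount)
{
    // Spin until the GPU signals that the preceding commands have completed.
    while (g_gpuEventLabel == 0)
    {
        // In a real engine you'd yield or do other work here instead of busy-waiting.
    }

    // Now the CPU can safely overwrite memory the GPU was previously reading.
    std::memcpy(mappedGpuMemory, newData, byteCount);

    g_gpuEventLabel = 0;             // re-arm the label for the next frame's event
}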
This is a slow approach though - it requires the CPU to waste time doing memcpys... and worse, memcpy'ing from the CPU into GPU-RAM is much slower than memcpy'ing to regular RAM!
 
Another approach is to get the GPU to do the memcpy. Instead, you map some CPU-side physical memory into the GPU's virtual address space, and at the end of the command buffer, insert a dispatch command that launches a compute shader that just reads from the CPU-side pointer and writes to a GPU-side pointer.
This frees up the CPU, but wastes precious GPU-compute time on something as basic as a memcpy. On that note - yep, the GPU can read from CPU-RAM at any time really - you could just leave your textures and vertex buffers in CPU-RAM if you liked... but performance would be much worse... Also, on Windows, you can't have too much CPU-RAM mapped into GPU address space at any one time or you degrade system-wide performance (as it requires pinning the CPU-side pages / marking them as unable to be paged out).
Even older GPUs that don't have compute capabilities will have a mechanism to implement this technique -- it's just easier to explain if we talk about a memcpy compute shader.
 
Lastly, modern GPUs have dedicated "DMA units", which are basically just asynchronous memcpy queues. As well as your regular command buffer full of draw/dispatch commands, you can have a companion buffer that contains DMA commands. You can insert a DMA command that says to memcpy some data from CPU-RAM to GPU-RAM, but before it, insert a command that says "wait until the word at address blah changes from a 0 to a 1". We can also put an event at the end of our drawing command buffer, like in the first example, which lets the DMA queue know that it's time to proceed. This can be an amazing solution, as it has zero impact on the CPU or GPU!
 
Instead of D3D just doing all this magically internally, if you want to use an excessive amount of memory in your app (more than the GPU can handle), then in D3D12 it's going to be up to you to implement this crap...

If you don't want to use an excessive amount of RAM, the only thing you'll have to do is generate, while you're submitting draw-calls into your command buffer, a list of the resources that are going to be used by that command buffer, so you can inform Windows which bits of GPU-RAM are potentially going to be read/written by it.
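That bookkeeping could be as simple as the following sketch (hypothetical types - just to show the idea of building a residency set while recording):

#include <cstdint>
#include <unordered_set>
#include <vector>

using GpuAllocationId = uint64_t;        // stand-in for whatever handle the real API uses

struct CommandBufferRecording
{
    std::vector<uint32_t>               commands;         // the recorded draw/state commands (opaque here)
    std::unordered_set<GpuAllocationId> usedAllocations;  // residency set for this command buffer
};

void RecordDraw(CommandBufferRecording& cb,
                GpuAllocationId vertexBufferMem,
                GpuAllocationId textureMem)
{
    // ... append the actual draw command to cb.commands ...
    cb.usedAllocations.insert(vertexBufferMem);
    cb.usedAllocations.insert(textureMem);
}

// At submit time, cb.usedAllocations is what you'd hand to the OS/driver so it can
// make those regions of GPU-RAM resident before the GPU starts executing.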



#20 tonemgub   Members   -  Reputation: 1143


Posted 12 September 2014 - 02:50 PM

Thanks, Hodgman! Really good explanation!

 

 

 



Also, DX12 is only a driver/application-side improvement over DX11. Adding memory management capabilities to the GPU itself would also require a hardware-side redesign.

This kind of memory management is already required in order to implement the existing D3D runtime - pretending that the managed pool can be of unlimited size requires that the runtime can submit partial command buffers and page resources in and out of GPU-RAM during a frame.

What I meant to point out by that (and this was the main conclusion I reached with my train of thought) was that the CPU is still the one doing the heavy lifting when it comes to memory management. But now that I think about it, I guess it makes no difference - the main bottleneck is having to do an extra "memcpy" when there's not enough video memory.

 

As for your explanation of how DMA could be used to make this work: would that method also have to be used - for example - when all of a larger-than-video-memory resource is being accessed by the GPU in the same, single shader invocation? Or would that shader invocation (somehow) have to be broken up into the subsequently generated command lists? Does that mean that the DirectX pipeline is also virtualized on the CPU?

 

Anyway, I think the main question that must be answered here is whether the resource limits imposed by DX11 will go away in DX12. Yes, theoretically (and perhaps even practically) the CPU & GPU could be programmed to work together to provide virtually unlimited memory, but will this really be the case with DX12? From what I can tell, that article implies that the "unlimited memory" has to be implemented as "custom memory management" done by the application - not the runtime, nor the driver or GPU. This probably also means that it will be the application's job to split the processing of the large data/resources into multiple command lists, and I don't think the application will be allowed to use that DMA-based synchronisation method (or trick?) that you explained.

Edit: Wait. That's how tiled resources already work. Never mind... :)


Edited by tonemgub, 12 September 2014 - 03:18 PM.





