DX12 - Documentation / Tutorials?

Started by
33 comments, last by Alessio1989 9 years, 6 months ago
  • But if you never use more than the physically available VRAM in a scene, why would you ever need to page data in and out during that scene?
  • Isn't it only when new things enter the scene and old things leave it that we have to move them in and out of GPU memory?
  • Does this memory management stuff really increase framerate? Or is it just loading times that get better?
  • Also, what data is actually "paged out"? Unless a compute shader makes some data for the CPU, what is there to "copy out"?

1) On an embedded system / game console -- yep, if you stay below the limits, nothing will be paged in/out.
On a desktop PC, you're not the only user of the GPU. Every single other process that's running is likely using the GPU as well -- Windows is using the GPU to draw the desktop, etc. So, you're actually using a "virtual GPU" -- Windows lets you think you're the only one using, and behind the scenes the OS has a fancy manager that combines all the "virtual GPU" objects together and lets them all share a single physical GPU. At any time, Windows might have to evict some of your data from VRAM so that it can put some it's own data (or data from another App) in there instead.

2/4) Ignoring the above situation, where on PC we're using a virtualized GPU shared with many processes... it's only when a new resource is created, or:
When the total resources used by your game exceed the available VRAM *and* the set of resources referenced by this frame's command buffer contains resources that aren't currently in VRAM -- in this situation, those resources need to be moved into VRAM. To make space for them, we probably have to kick some other resources out of VRAM first -- so we have to find some resources that are not in the current frame's set, and memcpy them back to main RAM.
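The eviction step described above can be sketched in a few lines. This is a toy model, not D3D/Windows internals -- `ResourceId`, `PickEvictions`, and the parameter names are all made up for illustration. It picks residents that aren't referenced by the current frame until enough bytes have been freed:

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical handle type standing in for a GPU resource.
using ResourceId = int;

// Pick resources to evict: anything resident in VRAM but NOT referenced by
// this frame's command buffer, until we've freed at least bytesNeeded.
std::vector<ResourceId> PickEvictions(
    const std::set<ResourceId>& resident,                      // currently in VRAM
    const std::set<ResourceId>& frameSet,                      // needed this frame
    const std::vector<std::pair<ResourceId, size_t>>& sizes,   // id -> byte size
    size_t bytesNeeded)
{
    std::vector<ResourceId> evict;
    size_t freed = 0;
    for (const auto& [id, size] : sizes) {
        if (freed >= bytesNeeded) break;
        if (resident.count(id) && !frameSet.count(id)) {
            evict.push_back(id);    // in a real system: memcpy back to main RAM
            freed += size;
        }
    }
    return evict;
}
```

A real memory manager would also rank eviction candidates (e.g. least recently used first) rather than taking them in arbitrary order.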

The data that's paged in and out of VRAM could be anything -- the things paged in are whatever is required for this frame, and the things paged out could be anything else currently present in VRAM.

3) The stuff in (1)/(2) is just about implementing virtual memory on the GPU. It's completely unrelated to performance; it's just a convenience feature of modern computers that allows developers to not care how much physical RAM actually exists.
e.g. on a PC that only has 1GB of physical RAM, you can still write a program containing the statement: new char[2ULL*1024*1024*1024] (allocate 2GB), and it will work, thanks to virtual memory. (Note the ULL suffix -- written with plain int literals, 2*1024*1024*1024 overflows a 32-bit int before the allocation even happens.)

Windows (and hence Direct3D/OpenGL) do the same thing on the GPU. If you go over the physical limits, they continue to work, because you're using virtual memory and virtual devices.

On D3D11 and earlier, the stuff in (1)/(2) is completely automatic -- as it's building a command buffer, every time you bind a texture/buffer/etc., D3D internally adds it to a Set<ID3D11Resource*>. When submitting the command list, D3D passes this set of resources down to Windows, so the OS knows which virtual allocations have to be physically present that frame.
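The bookkeeping D3D11 does internally looks roughly like the sketch below -- a guess at the pattern, not actual driver code, with `Resource`, `Bind`, and `Submit` as illustrative stand-ins:

```cpp
#include <set>

struct Resource {};   // stand-in for ID3D11Resource

// Mimics what D3D11 does behind your back: every Bind() adds the resource
// to a per-command-list set, which is handed to the OS at submit time so it
// knows what must be physically resident for this frame.
class CommandListRecorder {
public:
    void Bind(Resource* r) { referenced_.insert(r); }

    std::set<Resource*> Submit() {
        std::set<Resource*> out;
        out.swap(referenced_);   // hand off the set, reset for the next frame
        return out;
    }

private:
    std::set<Resource*> referenced_;
};
```

Binding the same resource twice only records it once (it's a set), which is exactly the behaviour you'd want when building this yourself for D3D12.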

In D3D12, all these nice automatic features are being removed, so it's going to be our responsibility to build this set of required resources ourselves.

This is getting confusing though, because there are actually two things being discussed here. Above is how virtual VRAM works. On D3D11 it's automatic; on D3D12 we'll just have to create the Set<ID3D11Resource*> ourselves (or more like struct VramRegion { size_t offset; size_t size; }; Set<VramRegion> ...;).

Now forget about virtual memory. Ideally, we ignore the fact that we're on Windows (using a virtual GPU) and we try to ensure that we don't use more RAM than is physically available. In this situation, virtual memory isn't a concern. This is the situation that console game developers are in (or full-screen games on PC with Windows GPU compositing disabled and no background processes using the GPU).

All of the new custom memory management stuff means that you don't have to rely on virtual memory. If the GPU only has 1GB of physical RAM, you can allocate 1GB of RAM and not one byte more. If you want to move stuff in and out of RAM, then instead of relying on Windows to do it, you can implement your own schemes.

It's extremely common for games to implement texture-streaming systems -- a level might have 2GB of texture data, but the GPU budget for textures is only 256MB. As you walk around the level, the game engine changes the resolution of different textures, streaming new data from disk to VRAM depending on which part of the level you're in.
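A common heuristic for deciding which resolution to stream is to derive a target mip level from camera distance. This is one illustrative formula, not *the* algorithm any particular engine uses -- `TargetMipLevel` and its parameters are made up here:

```cpp
#include <algorithm>
#include <cmath>

// Pick a target mip level from camera distance: objects within
// fullResDistance get mip 0 (full res); each doubling of distance beyond
// that drops one mip level, so distant textures stay cheap in VRAM.
int TargetMipLevel(float distance, float fullResDistance, int mipCount)
{
    if (distance <= fullResDistance)
        return 0;                               // full resolution
    int mip = static_cast<int>(std::log2(distance / fullResDistance)) + 1;
    return std::min(mip, mipCount - 1);         // clamp to the smallest mip
}
```

The streaming system then loads or evicts mip levels until every texture's resident resolution matches its target, spending the 256MB budget on whatever is closest to the camera.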

Same with vertex data and model LODs, etc., etc...

With manual memory management, this kind of stuff is much easier to implement, as well as being more efficient...

e.g. Normally, you create a texture and then start drawing objects with it. Internally, the driver has to insert "Wait" commands before your draw calls, which will stall the GPU if the texture transfer hasn't completed yet... This really isn't what the game engine wants.

Now, the game engine can explicitly tell the driver that it wants to asynchronously begin transferring the texture, and it would like to receive a notification when this transfer is complete. The engine can then draw objects with a low-res texture in the meantime, and then switch to drawing with the high-res texture once it's actually been transferred. This removes the potential GPU stalls.
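The low-res-then-promote pattern above can be modelled as a tiny state machine driven by a fence value. This is a sketch of the idea only -- `StreamingTexture`, `Pick`, and the plain `Texture` struct are stand-ins, not the D3D12 fence API:

```cpp
#include <cstdint>

struct Texture {};   // stand-in for a GPU texture

// Tracks one texture being streamed in. The async copy was submitted with
// an instruction to signal the fence with copyFenceValue when it completes.
struct StreamingTexture {
    Texture* lowRes = nullptr;     // always resident fallback
    Texture* highRes = nullptr;    // being transferred asynchronously
    uint64_t copyFenceValue = 0;   // fence value signalled when the copy is done
    bool highResResident = false;

    // Called each frame with the fence's last completed value: keep drawing
    // low-res until the copy has finished, then switch permanently.
    Texture* Pick(uint64_t completedFenceValue) {
        if (!highResResident && completedFenceValue >= copyFenceValue)
            highResResident = true;            // transfer complete -- promote
        return highResResident ? highRes : lowRes;
    }
};
```

No draw call ever has to wait on the transfer, which is exactly how this scheme removes the potential GPU stalls.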

e.g. #2 -- with regular ID3D11Textures etc., you have no control over where in memory your textures get allocated. With manual management, the game engine can pre-reserve a huge block to use for texture streaming. When textures are no longer needed, those areas of the block can be marked as 'free', and asynchronous background transfers can then be used to implement compaction-based defragmentation of VRAM. A large amount of memory is usually lost (wasted) to fragmentation -- having control over how things are allocated, and being able to write your own defragmenter, lets you reclaim this waste and effectively have more available RAM.
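Compaction-based defragmentation boils down to sliding the live allocations down to the start of the heap so all the free holes merge into one contiguous run. A toy sketch of that bookkeeping, assuming a made-up `Block` record (in a real engine each `b.offset = cursor` would be an asynchronous VRAM-to-VRAM copy, not just a field write):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One allocation in the pre-reserved streaming block.
struct Block { size_t offset; size_t size; bool live; };

// Slide live blocks down to the start of the heap and drop the dead ones.
// Returns the offset where the single contiguous free region now begins.
size_t Compact(std::vector<Block>& blocks)
{
    size_t cursor = 0;
    for (auto& b : blocks) {
        if (!b.live) continue;
        b.offset = cursor;   // real engine: schedule an async GPU copy here
        cursor += b.size;
    }
    blocks.erase(std::remove_if(blocks.begin(), blocks.end(),
                                [](const Block& b) { return !b.live; }),
                 blocks.end());
    return cursor;           // everything from here to the end is free
}
```

Because the copies are asynchronous background transfers, the engine can defragment gradually over many frames without hitching.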


Let's see if I got this right. So the behind-the-scenes paging in and out, based on the Windows virtual-GPU sharing stuff, is not something DX12 game developers will see (if Windows wants to evict something for another process)?

Instead, DX12 game developers can predict which textures they'll want to draw soon and start streaming those textures in beforehand. But DX12 does not guarantee that a texture currently in use won't be tossed out of VRAM by Windows? Which is transparent to the programmer (goes on in the background)? In which case the GPU is stalled until the resource is paged back in?

So we can expect better performance, given that Windows isn't paging things in and out in the background?

Pretty much yes.

In fullscreen mode you may expect the rules to be bent in your favour, so things won't be swapped out of VRAM.

-* So many things to do, so little time to spend. *-

I suppose by now you all heard about Windows 10 going to be released soon?

Apparently DX 12 will ship with it. :-)

http://blogs.msdn.com/b/directx/archive/2014/10/01/directx-12-and-windows-10.aspx

Actually, DX12 with the new Windows SDK is in closed beta -.-

"Recursion is the first step towards madness." - "Skegg?ld, Skálm?ld, Skildir ro Klofnir!"
Direct3D 12 quick reference: https://github.com/alessiot89/D3D12QuickRef/

