
DX12 Unbind UAV, then write to it - DX 11 vs 12

I can see that in DX11, when a CS or PS writes to a UAV and I bind a null ID3D11UnorderedAccessView into that UAV slot, the GPU doesn't hang and the writes are silently dropped. I hope I'm not dreaming.

With DX12, I can't seem to emulate this. I reckon it's impossible. The shader just reads the UAV's descriptor (from a register/offset based on the root signature layout) and does an "image_store" at some offset from the base address. If it's unmapped, bang, we're dead. I tried zeroing out that UAV's range in the GPU-visible descriptor table, with the same result. Such an all-zero UAV descriptor doesn't seem very legit anyway, so that's expected.

Am I right? How does DX11 survive this? Does it silently patch the shader or what? Thanks, .P

Edited by pcmaster

How did you zero it out? Did you call ZeroMemory (or similar) on the CPU descriptor handle? That's... not the right way to do it. What you're looking for is a null view. You can call CreateUnorderedAccessView with a null resource, but you must pass a valid view desc. That'll set up the descriptor such that reads return 0 (or last-written data, depending on architecture) and writes are dropped.

Hi, SoL! I was looking exactly for this, is it actually documented anywhere? :) I'm on an unnamed architecture where I could do a memset... as I say it didn't seem very legit. I'm just trying what you propose.

It does work!

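// Null UAV: no resource, but a valid desc so the descriptor type is known.
D3D12_UNORDERED_ACCESS_VIEW_DESC dummyUavDesc = {};
dummyUavDesc.Format = DXGI_FORMAT_R8G8B8A8_SNORM;
dummyUavDesc.ViewDimension = D3D12_UAV_DIMENSION_TEXTURE3D;
dummyUavDesc.Texture3D.MipSlice = 0;
dummyUavDesc.Texture3D.FirstWSlice = 0;
dummyUavDesc.Texture3D.WSize = 2048;
// cpuHandle: destination D3D12_CPU_DESCRIPTOR_HANDLE for the descriptor.
pD3D12Device->CreateUnorderedAccessView(nullptr, nullptr, &dummyUavDesc, cpuHandle);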

CreateUnorderedAccessView writes all zeroes to the memory designated by cpuHandle. CopyDescriptors() then copies those zeroes correctly into the contiguous GPU-visible descriptor table, and the GPU recognises this. All cool. Thank you, SoL!


Edited by pcmaster

On 11/28/2017 at 8:02 AM, pcmaster said:

Hi, SoL! I was looking exactly for this, is it actually documented anywhere? I'm on an unnamed architecture where I could do a memset... as I say it didn't seem very legit. I'm just trying what you propose.

It's mentioned in the docs for CreateUnorderedAccessView:

At least one of pResource or pDesc must be provided. A null pResource is used to initialize a null descriptor, which guarantees D3D11-like null binding behavior (reading 0s, writes are discarded), but must have a valid pDesc in order to determine the descriptor type.

It's also mentioned here in the programming guide.

One thing to watch out for: there's no way to have a null descriptor for a UAV or SRV that's bound as a root SRV/UAV parameter. In this case there's really no descriptor (you're just passing a GPU pointer to the buffer data), so you forgo any bounds checking on reads or writes. Just like raw pointer access on the CPU, reading or writing out of bounds will result in undefined behavior.
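
To make that distinction concrete, here is a toy CPU-side model in C++. It is not D3D12 code: ToyDescriptor, descriptorStore, descriptorLoad and rawStore are hypothetical names invented for this sketch. It only mimics the contract described above, assuming a descriptor carries enough information to validate an access while a root SRV/UAV is just an address.

```cpp
#include <cstddef>
#include <cstdint>

// Toy stand-in for a descriptor: where the data lives and how many
// elements it has. A default-constructed one plays the null descriptor.
struct ToyDescriptor {
    std::uint32_t* base = nullptr;
    std::size_t elementCount = 0;
};

// Descriptor-style write: dropped when the view is null or the index is
// out of range. Returns true only if the write actually landed.
inline bool descriptorStore(const ToyDescriptor& view, std::size_t index,
                            std::uint32_t value) {
    if (view.base == nullptr || index >= view.elementCount) {
        return false;  // silently discarded, like a write to a null UAV
    }
    view.base[index] = value;
    return true;
}

// Descriptor-style read: returns 0 for a null view or an out-of-range index.
inline std::uint32_t descriptorLoad(const ToyDescriptor& view,
                                    std::size_t index) {
    if (view.base == nullptr || index >= view.elementCount) {
        return 0;
    }
    return view.base[index];
}

// Root-descriptor-style write: only a pointer, so there is nothing to
// validate against. The caller must guarantee base and index are valid;
// anything else is undefined behavior.
inline void rawStore(std::uint32_t* base, std::size_t index,
                     std::uint32_t value) {
    base[index] = value;
}
```

In this model, a default-constructed ToyDescriptor behaves like the null UAV from earlier in the thread (stores are dropped, loads give 0), whereas rawStore has no safe "unbound" state at all. In practice that suggests putting anything that may need to be unbound into a descriptor table rather than a root SRV/UAV.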
