  • Similar Content

    • By Jason Smith
      While working on a project using D3D12 I was getting an exception being thrown while trying to get a D3D12_CPU_DESCRIPTOR_HANDLE. The project is using plain C so it uses the COBJMACROS. The following application replicates the problem happening in the project.
      #define COBJMACROS
      #pragma warning(push, 3)
      #include <Windows.h>
      #include <d3d12.h>
      #include <dxgi1_4.h>
      #pragma warning(pop)

      IDXGIFactory4 *factory;
      ID3D12Device *device;
      ID3D12DescriptorHeap *rtv_heap;

      int WINAPI wWinMain(HINSTANCE hinst, HINSTANCE pinst, PWSTR cline, int cshow)
      {
          (hinst), (pinst), (cline), (cshow);
          HRESULT hr = CreateDXGIFactory1(&IID_IDXGIFactory4, (void **)&factory);
          hr = D3D12CreateDevice(0, D3D_FEATURE_LEVEL_11_0, &IID_ID3D12Device, &device);
          D3D12_DESCRIPTOR_HEAP_DESC desc;
          desc.NumDescriptors = 1;
          desc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_RTV;
          desc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE;
          desc.NodeMask = 0;
          hr = ID3D12Device_CreateDescriptorHeap(device, &desc, &IID_ID3D12DescriptorHeap, (void **)&rtv_heap);
          D3D12_CPU_DESCRIPTOR_HANDLE rtv = ID3D12DescriptorHeap_GetCPUDescriptorHandleForHeapStart(rtv_heap);
          (rtv);
      }

      The call to ID3D12DescriptorHeap_GetCPUDescriptorHandleForHeapStart throws an exception. Stepping into the disassembly for ID3D12DescriptorHeap_GetCPUDescriptorHandleForHeapStart shows that the error occurs on the instruction
      mov  qword ptr [rdx],rax
      which seems odd since rdx doesn't appear to be used. Any help would be greatly appreciated. Thank you.
       
    • By lubbe75
      As far as I understand there is no real random or noise function in HLSL. 
      I have a big water polygon, and I'd like to fake water wave normals in my pixel shader. I know it's not efficient and the standard way is really to use a pre-calculated noise texture, but anyway...
      Does anyone have any quick and dirty HLSL shader code that fakes water normals, and that doesn't look too repetitious? 
    • By turanszkij
      Hi,
      I finally managed to get the DX11-emulating Vulkan device working, but everything is flipped vertically now because Vulkan has a different clip space. What are the best practices out there to keep these implementations consistent? I tried using a vertically flipped viewport, and while it works on an Nvidia 1050, the Vulkan debug layer is throwing error messages that this is not supported in the spec, so it might not work on others. There is also the possibility to flip the clip space position Y coordinate before writing it out in the vertex shader, but that requires changing and recompiling every shader. I could also bake it into the camera projection matrices, though I want to avoid that because then I need to track down every place in the engine where I upload matrices... Any chance of an easy extension or something? If not, I will probably go with changing the vertex shaders.
    • By NikiTo
      Some people say "discard" has no positive effect on optimization. Other people say it will at least spare the texture fetches.
       
      if (color.A < 0.1f)
      {
          //discard;
          clip(-1);
      }
      // tons of reads of textures following here
      // and loops too
      Some people say that "discard" will only mask out the output of the pixel shader, while still evaluating all the statements after the "discard" instruction.

      MSN>
      discard: Do not output the result of the current pixel.
      clip: Discards the current pixel..
      <MSN

      As usual it is unclear, but it suggests that "clip" could discard the whole pixel (maybe stopping execution too).

      I think that, at least for thermal and energy consumption reasons, the GPU should not evaluate the statements after "discard", but some people on the internet say that the GPU computes the statements anyway. What I am more worried about are the texture fetches after discard/clip.

      (what if after discard, I have an expensive branch decision that makes the approved cheap branch neighbor pixels stall for nothing? this is crazy)
    • By NikiTo
      I have a problem. My shaders are huge, meaning they contain a lot of code. Many of my pixels should be completely discarded. I could put a comparison and a discard at the very beginning of the shader, but as far as I understand, the discard statement does not save any workload, as the pixel has to stall until the long, huge neighboring shaders complete.
      Initially I wanted to use stencil to discard pixels before the execution flow enters the shader, even before the GPU allocates resources for this shader, avoiding stalls in pixel shader execution. I had assumed that depth/stencil discards pixels before the pixel shader, but I see now that it happens in the very last stage, the Output Merger. It seems extremely inefficient to render a little mirror in a scene with a big viewport that way. Why did they put the stencil test in the output merger anyway? Handling of stencil is so limited compared to other resources. Do people use stencil functionality at all for games, or do they prefer discard/clip?

      Will the GPU stall the pixel if I issue a discard at the very beginning of the pixel shader, or will the GPU already start using the freed-up resources to render another pixel?



       

DX12 [D3D12] Barriers are really necessary?

This topic is 404 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Hi.

I am learning the barrier stuff now.

I have a simple question.

I understand the theory of barriers.

But my D3D12 program (and the MS sample program too) runs fine WITHOUT barrier commands.

It seems like the D3D12 driver doesn't care about barriers.

Why does D3D12 work fine without barriers?

The following code, of course, works fine.

g_pCommandList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(g_pRenderTargets[g_FrameIndex], D3D12_RESOURCE_STATE_PRESENT, D3D12_RESOURCE_STATE_RENDER_TARGET));

(Draw operations)

g_pCommandList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(g_pRenderTargets[g_FrameIndex], D3D12_RESOURCE_STATE_RENDER_TARGET, D3D12_RESOURCE_STATE_PRESENT));

And the code below works fine TOO (barrier commands are commented out).

//g_pCommandList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(g_pRenderTargets[g_FrameIndex], D3D12_RESOURCE_STATE_PRESENT, D3D12_RESOURCE_STATE_RENDER_TARGET));

(Draw operations)

//g_pCommandList->ResourceBarrier(1, &CD3DX12_RESOURCE_BARRIER::Transition(g_pRenderTargets[g_FrameIndex], D3D12_RESOURCE_STATE_RENDER_TARGET, D3D12_RESOURCE_STATE_PRESENT));

Why?

Edited by Shigeo.K


I did not mean that I do not want to deal with barriers. Actually, the opposite (that's why I posted this thread).

You are in the world of undefined behaviors if you do not use proper barriers.

Yes this is what I thought.

transition from render target to texture and vice versa are super important

Following your advice, I will test some texture transitions and see what happens if I don't put in barrier commands.

Thanks.

AAA games, I agree in some ways.

But there are developers like me who want to get maximum performance for their games, not only enormous-budget games.

You might need to know that.


But there are developers like me who want to get maximum performance for their games, not only enormous-budget games.

You might need to know that.

 

This is a misconception. Yes, in some cases you can get better performance than DX11, but unless you have a crazy load on your CPU and GPU, the chances that you will see even a marginal difference are quite low. Against that, you have to weigh months of complex development and intense, hardcore low-level debugging. That has a price!

I do not know your game, but many smaller productions can easily run at >100 fps, even on moderate hardware, with DX11 if properly done. DX12 is mostly a CPU-side improvement to start with, while modest productions usually hit the GPU bottleneck first. On the GPU, you are looking at the quarter or half millisecond of implicit barriers in DX11 that could be avoided on DX12, or at best 1 or 2 ms (as observed on console renderers) if you make great use of async compute. Let me tell you, async compute is not the miraculous thing AMD tries to sell you,


Shigeo.K, out of curiosity, what video card are you doing this on?

Also, is it possible the driver is putting the barrier in for him? I know it's not supposed to, for the sake of thinness, but I was wondering.


what video card are you doing this on?

I am using very common hardware: a GeForce GTX 1070 (driver version 21.21.13.7653, 2016.12.29) and a Core i7 6700K.

I have tested texture transitions without putting in barriers.

D3D12 still works fine.

Maybe we will need barriers in the near future. Not now.

Of course, we can't disrespect barriers. Barriers exist because D3D12 wants us to use them, obviously.

I am just a little surprised that I can do almost everything WITHOUT barriers for now (on my hardware).

is it possible the driver is putting the barrier in for him?

Sorry, I can't understand what this means.

This is a misconception, yes, in some cases

I think I am a somewhat peculiar developer.

I am a one-man developer; I have passed Steam Greenlight, and my game is in the Steam store.

a crazy load on your cpu and gpu

My game is a game that puts a crazy load on the CPU and the GPU.

 

rare synchronization artifacts that only show up on a particular GPU or driver revision.

Yes, it seems so. That is what I was thinking.

Anyway, thanks to you guys, I have gained a lot of confidence.

Thank you.


What is that game?

 

Examples of transitions that are really necessary:

RTV to SRV: Fast clear elimination and a color ROP cache flush, because the texture read will go through the L2 cache. The same applies if the transition is to the PRESENT state.

DSV to SRV: Depth decompression (too bad we do not have access to the ZTile data on PC :( )

UAV to SRV: Some caches need to be flushed, or you may read outdated, incorrect values.

UAV to UAV: Not a transition barrier, but it is needed to serialize two draws/dispatches, or they may overlap. If there is no dependency, great, it is not needed; if there is one and the barrier is missing, well, it is a hazard factory. Some cache flushes will happen too, for the same reason as UAV to SRV.

To indirect argument buffer: This one needs even more flushing than normal, or your indirect draws won't see the proper values.

From copy dest to something: Obviously, to wait for the copy to complete.

Aliasing transition: Probably no work, but the barrier comes with a requirement to issue a clear, which implies work.

Those are the major cases of real work on transitions. Anyway, because you care, you run with the debug layer and GPU-based validation, so even if the barriers were not useful, you do not want thousands of warnings a frame hiding the real errors :)

What is that game?

http://store.steampowered.com/app/418050/?snr=1_7_7_151_150_1

I didn't come here to advertise my game, so do not buy from the above link :P

 

Examples of transitions that are really necessary:

Of course they are.

Needless to say, we need transitions.

But I don't need barriers (on my hardware, for now, in my simple tests).

I just want to make it clear that I am not saying I do not need the transition itself.

On my PC, the program works fine without barriers, but I know I need barriers for many reasons.

Edited by Shigeo.K
