  • Similar Content

    • By Jason Smith
      While working on a project using D3D12, I was getting an exception thrown while trying to get a D3D12_CPU_DESCRIPTOR_HANDLE. The project is written in plain C, so it uses the COBJMACROS. The following application reproduces the problem.
      #define COBJMACROS
      #pragma warning(push, 3)
      #include <Windows.h>
      #include <d3d12.h>
      #include <dxgi1_4.h>
      #pragma warning(pop)

      IDXGIFactory4 *factory;
      ID3D12Device *device;
      ID3D12DescriptorHeap *rtv_heap;

      int WINAPI wWinMain(HINSTANCE hinst, HINSTANCE pinst, PWSTR cline, int cshow)
      {
          (hinst), (pinst), (cline), (cshow);

          HRESULT hr = CreateDXGIFactory1(&IID_IDXGIFactory4, (void **)&factory);
          hr = D3D12CreateDevice(0, D3D_FEATURE_LEVEL_11_0, &IID_ID3D12Device, (void **)&device);

          D3D12_DESCRIPTOR_HEAP_DESC desc;
          desc.NumDescriptors = 1;
          desc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_RTV;
          desc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE;
          desc.NodeMask = 0;
          hr = ID3D12Device_CreateDescriptorHeap(device, &desc, &IID_ID3D12DescriptorHeap, (void **)&rtv_heap);

          D3D12_CPU_DESCRIPTOR_HANDLE rtv = ID3D12DescriptorHeap_GetCPUDescriptorHandleForHeapStart(rtv_heap);
          (rtv);
      }

      The call to ID3D12DescriptorHeap_GetCPUDescriptorHandleForHeapStart throws an exception. Stepping into the disassembly for ID3D12DescriptorHeap_GetCPUDescriptorHandleForHeapStart shows that the error occurs on the instruction
      mov  qword ptr [rdx],rax
      which seems odd since rdx doesn't appear to be used. Any help would be greatly appreciated. Thank you.
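      For what it's worth, that faulting instruction is consistent with a calling-convention mismatch: on x64, MSVC returns a struct from a C++ member function through a hidden pointer passed in rdx, while some d3d12.h versions declare the C binding of this method as returning D3D12_CPU_DESCRIPTOR_HANDLE directly, so the implementation writes through whatever happens to be in rdx. A hedged sketch of a workaround under that assumption (the GetHandleFn name is illustrative):

      /* Hypothetical workaround, assuming the C binding omits the hidden
         return-value pointer (passed in rdx, with "this" in rcx): call
         through the vtable with the output pointer made explicit. */
      typedef D3D12_CPU_DESCRIPTOR_HANDLE *(STDMETHODCALLTYPE *GetHandleFn)(
          ID3D12DescriptorHeap *self, D3D12_CPU_DESCRIPTOR_HANDLE *out);

      D3D12_CPU_DESCRIPTOR_HANDLE rtv;
      ((GetHandleFn)rtv_heap->lpVtbl->GetCPUDescriptorHandleForHeapStart)(rtv_heap, &rtv);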
       
    • By lubbe75
      As far as I understand, there is no real random or noise function in HLSL.
      I have a big water polygon, and I'd like to fake water-wave normals in my pixel shader. I know it's not efficient and the standard way is really to use a pre-calculated noise texture, but anyway...
      Does anyone have any quick and dirty HLSL shader code that fakes water normals and doesn't look too repetitive?
    • By turanszkij
      Hi,
      I finally managed to get the DX11-emulating Vulkan device working, but everything is flipped vertically now because Vulkan has a different clip space. What are the best practices out there to keep these implementations consistent? I tried using a vertically flipped viewport, and while it works on an Nvidia 1050, the Vulkan debug layer is throwing errors saying this is not supported by the spec, so it might not work on other hardware. There is also the possibility of flipping the clip-space Y coordinate before writing it out from the vertex shader, but that requires changing and recompiling every shader. I could also bake it into the camera projection matrices, though I want to avoid that because then I'd need to track down everywhere the engine uploads matrices... Any chance of an easy extension or something? If not, I will probably go with changing the vertex shaders.
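      For reference, a minimal C sketch of the flipped-viewport approach. A negative viewport height is only valid with the VK_KHR_maintenance1 extension enabled (it became core in Vulkan 1.1), which would explain the validation errors when it is missing; width, height and command_buffer are placeholders:

      /* Y-flipped viewport; requires VK_KHR_maintenance1 (core in 1.1). */
      VkViewport viewport = {
          .x        = 0.0f,
          .y        = (float)height,   /* origin moved to the bottom edge */
          .width    = (float)width,
          .height   = -(float)height,  /* negative height flips Y */
          .minDepth = 0.0f,
          .maxDepth = 1.0f,
      };
      vkCmdSetViewport(command_buffer, 0, 1, &viewport);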
    • By NikiTo
      Some people say "discard" has no positive effect on optimization. Other people say it will at least spare the texture fetches.
       
      if (color.A < 0.1f)
      {
          //discard;
          clip(-1);
      }
      // tons of reads of textures following here
      // and loops too
      Some people say that "discard" will only mask out the output of the pixel shader, while still evaluating all the statements after the "discard" instruction.

      From MSDN:
      discard: Do not output the result of the current pixel.
      clip: Discards the current pixel.

      As usual it is unclear, but it suggests that "clip" could discard the whole pixel (maybe stopping execution too).

      I think that, at least for thermal and energy-consumption reasons, the GPU should not evaluate the statements after "discard", but some people on the internet say the GPU computes the statements anyway. What I am more worried about are the texture fetches after discard/clip.

      (What if, after the discard, I have an expensive branch decision that makes the neighboring pixels which took the approved cheap branch stall for nothing? This is crazy.)
    • By NikiTo
      I have a problem. My shaders are huge, in the sense that they contain a lot of code. Many of my pixels should be completely discarded. I could use a comparison and discard at the very beginning of the shader, but as far as I understand, the discard statement does not save workload at all, since the pixel has to stall until its long-running neighbor shaders complete.
      Initially I wanted to use the stencil to discard pixels before the execution flow even enters the shader, before the GPU distributes/allocates resources for it, avoiding a stall of the pixel-shader execution flow. I assumed that depth/stencil discards pixels before the pixel shader, but I see now that it happens in the very last Output Merger stage. It seems extremely inefficient to render a little mirror in a scene with a big viewport that way. Why did they put the stencil test in the Output Merger anyway? Handling of the stencil is so limited compared to other resources. Do people use stencil functionality at all in games, or do they prefer discard/clip?

      Will the GPU stall the pixel if I issue a discard at the very beginning of the pixel shader, or will it already start using the freed-up resources to render another pixel?!
DX12 [D3D12] Number of queued frames and swapchain buffers


Recommended Posts

Hi community

 

I read the following in an NVIDIA article about DirectX 12 recommendations:

https://developer.nvidia.com/dx12-dos-and-donts

 

"Don’t forget that there's a per swap-chain limit of 3 queued frames before DXGI will start to block in Present()"

 

I was doing something similar in my architecture (basically, I match the number of queued frames to the number of swap chain buffers).

 

Should the number of queued frames be the same as the number of swap chain buffers? I think the answer is YES: if you send 3 groups of command lists that output to 3 different final render targets, then there should be a correspondence between your final render targets and the swap chain buffers, because otherwise you can end up with barrier problems at Present (you could be trying to present a resource that is about to be used as a render target by a command list the GPU has not yet processed).
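For concreteness, a minimal sketch (plain C with COBJMACROS) of the setup being described; factory, command_queue, hwnd and the 1280x720 size are placeholders:

    /* 3-buffer FLIP_DISCARD swap chain whose BufferCount matches the
       number of queued frames in the architecture described above. */
    DXGI_SWAP_CHAIN_DESC1 sc_desc = {0};
    sc_desc.Width = 1280;
    sc_desc.Height = 720;
    sc_desc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
    sc_desc.SampleDesc.Count = 1;
    sc_desc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
    sc_desc.BufferCount = 3;
    sc_desc.SwapEffect = DXGI_SWAP_EFFECT_FLIP_DISCARD;

    IDXGISwapChain1 *swapchain;
    HRESULT hr = IDXGIFactory4_CreateSwapChainForHwnd(
        factory,
        (IUnknown *)command_queue, /* for D3D12 this is the queue, not the device */
        hwnd, &sc_desc, NULL, NULL, &swapchain);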

 

Am I missing something?

 

Thanks!

 

 


At a high level, no, it's not really that relevant. Since the frames are (probably) all submitted to the same command queue, there's no GPU parallelism going on here, so the number of buffers doesn't really matter.

 

For a full answer on why you might want more buffers, check out this other topic:

http://www.gamedev.net/topic/679050-how-come-changing-dxgi-swap-chain-descbuffercount-has-no-effect/

I just read your post. Thanks for the information.

The reason why I am using 3 buffers in the swap chain instead of 2 is that, as I am queuing 3 frames, I was getting errors because DirectX complained about a write operation on the currently displayed back buffer. I think that is something you mentioned in that post.


You should have one more buffer than the number of frames that you queue, since one of the buffers is the one currently being presented. So for a 3-buffer swap chain, you should queue up to 2 frames. That being said, queuing 2 frames should be more than enough (you want to keep your GPU busy, but also keep latency reasonable).

As for that error: if you request the right back buffer each frame, it should not happen.
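A minimal sketch of that scheme (plain C with COBJMACROS; queue, fence and fence_event are assumed to have been created elsewhere, and the names are illustrative):

    /* At most MAX_QUEUED_FRAMES = 2 frames in flight for a 3-buffer
       swap chain: signal the fence at the end of each frame, and wait
       before reusing a frame slot's resources. */
    #define MAX_QUEUED_FRAMES 2

    static UINT64 frame_fence_values[MAX_QUEUED_FRAMES]; /* zero-initialized */
    static UINT64 next_fence_value = 1;
    static UINT   frame_slot = 0;

    void end_frame(ID3D12CommandQueue *queue, ID3D12Fence *fence, HANDLE fence_event)
    {
        /* Mark the end of this frame's GPU work. */
        frame_fence_values[frame_slot] = next_fence_value;
        ID3D12CommandQueue_Signal(queue, fence, next_fence_value);
        ++next_fence_value;

        /* Advance to the next slot; before reusing its command
           allocators, wait until the GPU has finished the frame
           that last used it. */
        frame_slot = (frame_slot + 1) % MAX_QUEUED_FRAMES;
        if (ID3D12Fence_GetCompletedValue(fence) < frame_fence_values[frame_slot])
        {
            ID3D12Fence_SetEventOnCompletion(fence, frame_fence_values[frame_slot], fence_event);
            WaitForSingleObject(fence_event, INFINITE);
        }
    }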


The only reason for more buffers is to queue completed frames, and the only time you can queue completed frames is when using VSync. If you're submitting all of your frames serially, then buffer 0 goes off screen at the same time buffer 1 goes on screen, and frame 2 can start writing to buffer 0 with no delay.

 

Though if you're using FLIP_SEQUENTIAL/FLIP_DISCARD (which you are since you're asking about D3D12), using sync interval 0, and not using the ALLOW_TEARING flag, then it makes sense to have more buffers, since you want to be able to queue completed frames here too.
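To make the sync-interval part concrete: it is not something you set at swap-chain creation, but the first argument of each Present() call (a one-line sketch, with swapchain assumed to be your swap chain):

    /* 0 = present without waiting for vblank; 1 = wait for one vblank. */
    HRESULT hr = IDXGISwapChain1_Present(swapchain, 0, 0);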

@Sergio J. De Los Santos:
I will check your recommendation of having N swap chain buffers and N - 1 queued frames in my code. Currently both have the same number and I did not see any problems in the app, but what you say makes sense.

Why would the latency change according to the number of queued frames?

@Jesse Natalie:
Where do you set the sync interval to 0? When you create the swap chain?

Ohh, you are right. And yes, I use exactly the configuration you described, so I will try your suggestion and Sergio J. De Los Santos' suggestion too.

Where did you learn about this stuff, and about the material in the post you recommended I read (in your first answer to this topic)?
