
DX12 [D3D12] The number of rendered objects and the number of constant buffer views


Hi everyone,
The sample of the SDK renders a single triangle per frame.
I want to render multiple objects per frame (of course, it is fine if there is only one vertex buffer).

Up to and including Dx11, going from rendering a single object to rendering multiple objects was simply a matter of increasing the loop counter.
In Dx12, however, it seems that this is no longer the case.
Through trial and error, I was finally able to figure out how to do it!


However, I have concerns about whether it is a suitable solution.

For now, I am implementing multiple renderings by creating multiple CBVs.
For example, if I want to render 5 objects, I create 5 CBVs, one for each.

****Initializing Phase****
for (int i = 0; i < 5; i++)
{
	// One CBV per object, each pointing at its own CBSize-byte slice of the buffer
	cbvDesc.BufferLocation = g_constantBuffer->GetGPUVirtualAddress() + i * CBSize;
	D3D12_CPU_DESCRIPTOR_HANDLE cHandle = g_pCbvHeap->GetCPUDescriptorHandleForHeapStart();
	cHandle.ptr += i * Stride;
	g_pDevice->CreateConstantBufferView(&cbvDesc, cHandle);
}

****Draw Phase****

//Update constant buffer
for (int i = 0; i < 5; i++)
{
	// 256 here has to match CBSize used in the initializing phase
	char* ptr = reinterpret_cast<char*>(g_pCbvDataBegin) + 256 * i;
	(some matrix operation...)

	g_constantBufferData.mWVP = mWVP;
	memcpy(ptr, &g_constantBufferData, sizeof(g_constantBufferData));
}

for (int i = 0; i < 5; i++)
{
	// Offset from the heap start on every iteration; += on the shared handle would accumulate
	D3D12_GPU_DESCRIPTOR_HANDLE gHandle = cbvSrvUavDescHeap;
	gHandle.ptr += i * size;
	g_pCommandList->SetGraphicsRootDescriptorTable(0, gHandle);
	g_pCommandList->DrawInstanced(3, 1, 0, 0);
}


Something about this solution feels ‘off’, so I wanted to see what you all think.
Is this the right way to handle this problem?


Best regards,
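
The offset arithmetic in the snippets above can be sketched on its own, without any D3D12 objects. This is only an illustration: AlignUp, SlotOffset and DescriptorAt are hypothetical helper names, not part of the API. The one hard fact used is that constant buffer data must be placed on 256-byte boundaries (D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT).

```cpp
#include <cassert>
#include <cstdint>

// Same value as D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT.
constexpr std::uint64_t kCbAlignment = 256;

// Rounds value up to the next multiple of alignment (alignment must be a power of two).
constexpr std::uint64_t AlignUp(std::uint64_t value, std::uint64_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// Byte offset of object index's constants inside one big upload buffer.
// Hypothetical helper: each object gets a slot padded to 256 bytes.
constexpr std::uint64_t SlotOffset(std::uint64_t index, std::uint64_t dataSize) {
    return index * AlignUp(dataSize, kCbAlignment);
}

// Address of descriptor index, always computed from the heap start so that
// nothing accumulates between loop iterations. increment stands for the value
// returned by ID3D12Device::GetDescriptorHandleIncrementSize.
constexpr std::uint64_t DescriptorAt(std::uint64_t heapStart,
                                     std::uint64_t index,
                                     std::uint64_t increment) {
    return heapStart + index * increment;
}
```

With a 64-byte matrix per object, SlotOffset(i, 64) gives 0, 256, 512 and so on, which is the same stride as the hard-coded 256 in the update loop.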

D3D12 does not handle lifetime for you. Any memory you initialize from the CPU needs to survive until the GPU is done with it; use fences after your command lists to track completion.

Usually, you allocate a memory block and advance through it, allocating in ring-buffer fashion any time you need to store something (constants, geometry, captain age).
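
A rough sketch of that ring-buffer idea, with the GPU side simulated by plain integers. RingAllocator and its methods are hypothetical names for illustration; in a real renderer the fence values would come from ID3D12CommandQueue::Signal and ID3D12Fence::GetCompletedValue, and the returned offsets would be added to the mapped CPU pointer and the GPU virtual address of one large upload buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <optional>

class RingAllocator {
public:
    explicit RingAllocator(std::uint64_t capacity) : capacity_(capacity) {}

    // Returns a byte offset into the buffer, or nullopt if the GPU has not
    // released enough space yet.
    std::optional<std::uint64_t> Allocate(std::uint64_t size) {
        if (size == 0 || size > capacity_) return std::nullopt;
        if (used_ == 0) head_ = 0;                 // buffer is empty, restart at the front
        std::uint64_t offset = head_;
        std::uint64_t padding = 0;
        if (head_ + size > capacity_) {            // does not fit at the end, wrap around
            padding = capacity_ - head_;
            offset = 0;
        }
        if (used_ + padding + size > capacity_) return std::nullopt;
        head_ = offset + size;
        used_ += padding + size;
        pending_ += padding + size;
        return offset;
    }

    // Call after submitting a command list: everything allocated since the
    // last call is tied to fenceValue.
    void EndFrame(std::uint64_t fenceValue) {
        frames_.push_back({fenceValue, pending_});
        pending_ = 0;
    }

    // Call with the fence's completed value to release finished frames.
    void Retire(std::uint64_t completedFence) {
        while (!frames_.empty() && frames_.front().fence <= completedFence) {
            used_ -= frames_.front().bytes;
            frames_.pop_front();
        }
    }

private:
    struct Frame { std::uint64_t fence; std::uint64_t bytes; };
    std::uint64_t capacity_;
    std::uint64_t head_ = 0;     // next free byte
    std::uint64_t used_ = 0;     // bytes the GPU may still be reading
    std::uint64_t pending_ = 0;  // bytes allocated since the last EndFrame
    std::deque<Frame> frames_;
};
```

When Allocate fails, the caller either waits on the fence and retries or grows the buffer; which one is right depends on the application.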

D3D11 drivers handled dynamic buffers through a process known as "versioning". Basically you see 1 buffer at an API level, but behind the scenes the driver switches to a new buffer allocation every time that you call Map. This allows the driver to fill the GPU-visible memory a frame or more before the GPU actually executes it (somewhat similar to how you're doing it in D3D12), while presenting a user-facing API that makes it look as though the buffer update happens synchronously with the draw calls or dispatches. In practice doing this was rather tricky for drivers, and added a lot of overhead. It essentially combined the typical problems of a memory allocator with the additional complexities of tracking whether the GPU was still reading from a particular piece of memory. It also required tracking tons of state behind the scenes, since you could Map a buffer that was already bound to the pipeline and the driver had to handle that transparently.
As galop1n already explained, you're totally on your own with this in D3D12. If you want similar facilities to what you had in D3D11 you'll need to implement them yourself. The good news is that you can create simple layers that are exactly tailored for your use case, which can let you do things much more efficiently than a D3D11 driver could. For instance, in our engine we support the idea of "temporary" dynamic buffers. The contents of these buffers don't persist from one frame to another, which means that the backing buffer memory can be sub-allocated from a larger buffer that's tied to a single GPU frame. So at the end of every frame we can swap buffers and say "the entire contents of this buffer are no longer being used, and we don't care because it was temporary". This makes versioning as cheap as incrementing a pointer, as long as we're willing to dedicate enough memory to hold 2x the number of temporary allocations. Alternatively you can use a ring buffer as galop1n suggested, and move the low watermark forward when a fence is signaled. This could allow you to be more efficient with your memory usage, while making the versioning process a bit more complicated. Really my point is that there are many approaches that you could take, so you should experiment to find something that works for your use cases.
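
As an illustration of the "temporary" buffer scheme described above, here is a minimal sketch that uses two plain CPU vectors in place of real upload heaps. FrameTempAllocator and TempAllocation are hypothetical names, and this is not the engine code mentioned in the post. It also assumes the caller has already waited on a fence so that the GPU is finished with the buffer being reused.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct TempAllocation {
    std::byte* cpu;           // where the data was copied
    std::size_t offset;       // add to the buffer's GPU virtual address
    std::size_t bufferIndex;  // which of the two per-frame buffers was used
};

class FrameTempAllocator {
public:
    explicit FrameTempAllocator(std::size_t bytesPerFrame) {
        for (auto& buffer : buffers_) buffer.resize(bytesPerFrame);
    }

    // Swap to the other buffer and forget everything that was in it.
    // Assumption: the GPU has finished the frame that last used that buffer.
    void BeginFrame() {
        current_ = (current_ + 1) % buffers_.size();
        used_ = 0;
    }

    // Copies size bytes into the current buffer and bumps the offset,
    // padding each allocation to 256 bytes as constant buffers require.
    TempAllocation Allocate(const void* data, std::size_t size) {
        const std::size_t aligned = (size + 255) & ~std::size_t{255};
        assert(used_ + aligned <= buffers_[current_].size());
        std::byte* dst = buffers_[current_].data() + used_;
        std::memcpy(dst, data, size);
        TempAllocation result{dst, used_, current_};
        used_ += aligned;
        return result;
    }

private:
    std::array<std::vector<std::byte>, 2> buffers_;
    std::size_t current_ = 0;
    std::size_t used_ = 0;
};
```

The 2x memory cost is the price of never tracking individual allocations: a whole buffer is released at once by resetting a single offset.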
Also, I wanted to point out that root constant buffer views can make allocating temporary constant buffers very easy:
void* tempCBMemCPUAddr = BigTempBufferCPUAddr + BigTempBufferUsed;
uint64_t tempCBMemGPUAddr = BigTempBufferGPUAddr + BigTempBufferUsed;
BigTempBufferUsed += cbSize;
memcpy(tempCBMemCPUAddr, cbData, cbSize);
cmdList->SetGraphicsRootConstantBufferView(cbRootParam, tempCBMemGPUAddr);
Edited by MJP

"you see 1 buffer at an API level"

That clears up everything. Thanks!


"D3D12 does not handle lifetime for you."

And thank you for the confirmation.

I was confident that my procedure was one of several correct approaches.
I also realized that there are several other approaches to achieve the same result.

By increasing the degree of freedom of the API, our responsibility has increased too, but programmers have more options than with Dx11. As an API user, I think that this is a good trend.

I don’t entirely understand the “temporary buffer” you mentioned, but I will do my best to figure it out.
I’m sure it is a useful technique.

Thank you for showing me the snippet of the source code.
I will do my best to understand it.

It also seems that I misunderstood the concept, and the thread title is wrong as well.
What I actually prepare is multiple CB memory blocks, not multiple CBVs.

Edited by Shigeo.K
