• ### Similar Content

• While working on a project using D3D12, I was getting an exception thrown while trying to get a D3D12_CPU_DESCRIPTOR_HANDLE. The project is written in plain C, so it uses the COBJMACROS. The following application replicates the problem happening in the project.

```c
#define COBJMACROS
#pragma warning(push, 3)
#include <Windows.h>
#include <d3d12.h>
#include <dxgi1_4.h>
#pragma warning(pop)

IDXGIFactory4 *factory;
ID3D12Device *device;
ID3D12DescriptorHeap *rtv_heap;

int WINAPI wWinMain(HINSTANCE hinst, HINSTANCE pinst, PWSTR cline, int cshow)
{
    (hinst), (pinst), (cline), (cshow);

    HRESULT hr = CreateDXGIFactory1(&IID_IDXGIFactory4, (void **)&factory);
    hr = D3D12CreateDevice(0, D3D_FEATURE_LEVEL_11_0, &IID_ID3D12Device, (void **)&device);

    D3D12_DESCRIPTOR_HEAP_DESC desc;
    desc.NumDescriptors = 1;
    desc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_RTV;
    desc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE;
    desc.NodeMask = 0;
    hr = ID3D12Device_CreateDescriptorHeap(device, &desc, &IID_ID3D12DescriptorHeap, (void **)&rtv_heap);

    D3D12_CPU_DESCRIPTOR_HANDLE rtv = ID3D12DescriptorHeap_GetCPUDescriptorHandleForHeapStart(rtv_heap);
    (rtv);
}
```

The call to `ID3D12DescriptorHeap_GetCPUDescriptorHandleForHeapStart` throws an exception. Stepping into the disassembly for `ID3D12DescriptorHeap_GetCPUDescriptorHandleForHeapStart` shows that the error occurs on the instruction

```
mov  qword ptr [rdx],rax
```

which seems odd since rdx doesn't appear to be set up. Any help would be greatly appreciated. Thank you.

• By lubbe75
As far as I understand, there is no real random or noise function in HLSL.
I have a big water polygon, and I'd like to fake water wave normals in my pixel shader. I know it's not efficient and the standard way is to use a pre-calculated noise texture, but anyway...
Does anyone have quick-and-dirty HLSL shader code that fakes water normals and doesn't look too repetitive?

• Hi,
I finally managed to get the DX11-emulating Vulkan device working, but everything is flipped vertically now because Vulkan has a different clip space. What are the best practices out there to keep these implementations consistent? I tried using a vertically flipped viewport, and while it works on an Nvidia 1050, the Vulkan debug layer is throwing error messages that this is not supported by the spec, so it might not work on others. There is also the possibility to flip the clip-space Y coordinate before writing it out from the vertex shader, but that requires changing and recompiling every shader. I could also bake it into the camera projection matrices, though I want to avoid that because then I need to track down everywhere in the engine where I upload matrices... Any chance of an easy extension or something? If not, I will probably go with changing the vertex shaders.
• By NikiTo
Some people say "discard" has no positive effect on optimization. Other people say it will at least spare the texture fetches.

```hlsl
if (color.A < 0.1f)
{
    //discard;
    clip(-1);
}
// tons of reads of textures following here
// and loops too
```
Some people say that "discard" will only mask out the output of the pixel shader, while still evaluating all the statements after the "discard" instruction.

> MSDN: "discard: Do not output the result of the current pixel."

As usual it is unclear, but it suggests that "clip" could discard the whole pixel (maybe stopping execution too).

I think that, at least for thermal and energy-consumption reasons, the GPU should not evaluate the statements after "discard", but some people on the internet say the GPU computes the statements anyway. What I am more worried about are the texture fetches after discard/clip.

(What if, after discard, I have an expensive branch decision that makes the approved cheap-branch neighbor pixels stall for nothing? This is crazy.)
• By NikiTo
I have a problem. My shaders are huge, meaning they have a lot of code inside. Many of my pixels should be completely discarded. I could use a comparison and discard at the very beginning of the shader, but as far as I understand, the discard statement does not save workload at all, as the pixel has to stall until the long, huge neighbor shaders complete.
Initially I wanted to use stencil to discard pixels before the execution flow enters the shader, even before the GPU distributes/allocates resources for this shader, avoiding stalls in the pixel shader execution flow, because initially I assumed that Depth/Stencil discards pixels before the pixel shader; but I see now that it happens inside the very last Output Merger stage. It seems extremely inefficient to render a little mirror in a scene with a big viewport that way. Why did they put the stencil test in the Output Merger anyway? Handling of Stencil is so limited compared to other resources. Do people use Stencil functionality at all for games, or do they prefer discard/clip?

Will the GPU stall the pixel if I issue a discard at the very beginning of the pixel shader, or will the GPU immediately start using the freed-up resources to render another pixel?!?!

# DX12 Reading from the CPU

## Recommended Posts

Hello,

I'm working on a system based on Structured Buffers; the idea is that they can be used on the GPU as UAV/SRV and then the data can be read back on the CPU. This system is used to do some queries in my application.

First, I have these resources:

+ Default Resource: This will be the resource used by the GPU.

+ Upload Resource: In case I want to upload data from the CPU, I'll use this as an intermediate buffer.

+ Double-Buffered ReadBack Resources: I have two ReadBack buffers to copy data from the GPU to the CPU.

Let me show some code:

```cpp
const void* OpenReadBuffer()
{
    HRESULT res = S_FALSE;

    if (mFirstUse)
        mFirstUse = false;
    else
        mCbFence.WaitUntilCompleted(renderPlatform);

    // Map from the last frame
}

{
}

{
    // Schedule a copy for the next frame

    mCbFence.Signal(renderPlatform);

    // Swap it!
}
```

This is how I create the different resources:

```cpp
// Default
D3D12_RESOURCE_FLAGS bufferFlags = computable ? D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS : D3D12_RESOURCE_FLAG_NONE;
res = device->CreateCommittedResource
(
    &CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT),
    D3D12_HEAP_FLAG_NONE,
    &CD3DX12_RESOURCE_DESC::Buffer(mTotalSize, bufferFlags),
    nullptr,
    IID_PPV_ARGS(&mBufferDefault)
);

// Upload
res = device->CreateCommittedResource
(
    D3D12_HEAP_FLAG_NONE,
    &CD3DX12_RESOURCE_DESC::Buffer(mTotalSize),
    nullptr,
);

// ReadBack
res = device->CreateCommittedResource
(
    D3D12_HEAP_FLAG_NONE,
    &CD3DX12_RESOURCE_DESC::Buffer(mTotalSize),
    D3D12_RESOURCE_STATE_COPY_DEST,
    nullptr,
);
```

I'm using a Fence to be 100% sure that it is synchronised; I could have more than 2 buffers, but at the moment I would like to keep it simple.

The values that I get from OpenReadBuffer() are all 0.

If I debug it, it looks like the Read Back Buffers have some valid data.

What could be the issue?

Thanks!

Edited by piluve

##### Share on other sites

If you want to specify that you'll read the entire contents of the buffer when calling Map, then you can pass NULL as the "pReadRange" parameter. Passing a range where End <= Begin means that you won't be reading any data, which isn't what you want:

"This indicates the region the CPU might read, and the coordinates are subresource-relative. A null pointer indicates the entire subresource might be read by the CPU. It is valid to specify the CPU won't read any data by passing a range where End is less than or equal to Begin."

##### Share on other sites

MJP is "right", but try not to use nullptr for the range, because the debug layer warns that mapping a full resource may be a performance issue and an unwanted action. You can just pass { 0, size } to mute the message.
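Putting both suggestions together, mapping the readback buffer might look like the sketch below. The member names (`mBufferReadBack`, `mCpuCopy`) are mine, chosen to match the style of the snippets above; this is an illustration, not the poster's actual code:

```cpp
// Sketch: map the readback buffer, declaring the range we actually intend to read.
D3D12_RANGE readRange = { 0, mTotalSize };   // CPU will read the whole buffer
void* data = nullptr;
HRESULT hr = mBufferReadBack->Map(0, &readRange, &data);
if (SUCCEEDED(hr))
{
    memcpy(mCpuCopy, data, mTotalSize);      // hypothetical CPU-side copy destination
    D3D12_RANGE writtenRange = { 0, 0 };     // we wrote nothing through this mapping
    mBufferReadBack->Unmap(0, &writtenRange);
}
```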

##### Share on other sites

@galop1n and @MJP are correct in that the range is wrong... but that's not your problem. That only matters on systems which don't have cache coherency, which I can pretty much guarantee you're not using.

Since you claim that "while debugging, the data has valid values," I take that to mean that if you use breakpoints and inspect the data, you see correct contents, but if you let the app run normally, you don't. That sounds to me like your synchronization isn't actually working correctly, and when your breakpoint hits, the GPU continues executing and fills in the memory you're inspecting by the time you look at it.
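For reference, the usual way to make the CPU wait until the GPU copy has really finished before mapping looks roughly like this. The counter and event names are mine; presumably the poster's `mCbFence` wrapper does something equivalent internally:

```cpp
// After executing the command list that records the copy into the readback buffer:
const UINT64 fenceValue = ++mLastSignaled;               // hypothetical monotonically increasing counter
queue->Signal(fence, fenceValue);                        // GPU signals the fence when the copy is done

// Later, before mapping the readback buffer for this frame:
if (fence->GetCompletedValue() < fenceValue)
{
    fence->SetEventOnCompletion(fenceValue, fenceEvent); // fenceEvent created earlier with CreateEvent
    WaitForSingleObject(fenceEvent, INFINITE);           // block the CPU until the GPU reaches the signal
}
// Only now is it safe to Map and read the data.
```

If the wait is skipped, or waits on the wrong fence value, a breakpoint can easily hide the race: by the time you inspect memory in the debugger, the GPU has caught up and filled it in.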

##### Share on other sites

Hello @MJP, @galop1n, totally agree; now that you mention it, I almost always use the same {0, 0} range. I will change it just to be nice to the API.

Hi @SoldierOfLight, by debugging I was talking about a GPU debugger (like the Visual Studio graphics debugger); I can see that the ReadBack buffers had data, but then it's all 0 on the CPU side.

EDIT

I would like to point something out: I've been checking it, and it looks like in most cases this readback system works just fine, but I have one case where it returns invalid data. Could the problem be somewhere else, like the state of the resources? The thing is, why do I see valid data with the debugger?

Edited by piluve

##### Share on other sites

What OS version are you on? I'm currently aware of a bug where mapping a resource can cause it to incorrectly return 0s on some Windows 10 Insider builds. If you map the resource earlier and leave it mapped, does the problem go away?

##### Share on other sites

@SoldierOfLight I'm using Windows 10 Pro (with the latest patch, I guess). I tried mapping the ReadBack buffers once and leaving them mapped, but I get the same results :C. I'll dig around the case that is failing and try to find out why.