
DX12: array of textures with different resolutions, and dynamic indexing into them in a shader


Hi Guys,

IIRC, before DX12 we could create a texture array (all textures had to have the same format and the same resolution) and dynamically index into it in a shader. But it was impossible to dynamically index a 'texture array' whose textures have different sizes (in fact, you can't even create a texture array with mixed resolutions).

 

Now, from what I know, it seems that with the DX12 root signature and resource heap model we can dynamically index a 'texture array' whose textures differ in size and format (same channel count, same data type), right? Here is a link that talks about it: https://msdn.microsoft.com/en-us/library/windows/desktop/mt186614(v=vs.85).aspx

 

However, the example at the above link only uses textures with the same format and resolution, and it doesn't mention whether an array of textures with different resolutions is supported. So I think it's better to ask here before I write my test code (it would be frustrating to spend half an hour coding only to learn that indexing into textures of different sizes is not supported, when you guys already know the answer).

 

Also, if it is supported, is there anything I should be aware of?

 

Thanks in advance~


The example does use textures of the same resolution, but there is indeed no reason that they need to have the same Width, Height, Format or Mip Count. So long as they are all 2D textures in the array, that's fine.

Depending on how many textures you want bound at once, be aware that you may be excluding Resource Binding Tier 1 hardware:

https://msdn.microsoft.com/en-gb/library/windows/desktop/dn899127(v=vs.85).aspx

Note that in order to get truly non-uniform resource indexing you need to tell HLSL + the compiler that the index is non-uniform using the "NonUniformResourceIndex" intrinsic. Failing to do this will likely result in the index from the first thread of the wave deciding which texture to sample from.

https://msdn.microsoft.com/en-us/library/windows/desktop/dn899207(v=vs.85).aspx
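
To make the "first thread of the wave decides" failure mode concrete, here is a small CPU-side model in C++. It is only a sketch under stated assumptions: a texture is reduced to a single int, a wave is just a vector of per-lane indices, and the function names are made up for illustration. It is not how any particular driver is implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// CPU-side model of one wave sampling from an array of textures.
// Each "texture" is reduced to a single int so results are easy to compare.

// What the shader author intended: every lane uses its own index.
std::vector<int> samplePerLane(const std::vector<int>& textures,
                               const std::vector<std::uint32_t>& laneIndex) {
    std::vector<int> out;
    out.reserve(laneIndex.size());
    for (std::uint32_t idx : laneIndex) {
        out.push_back(textures.at(idx));
    }
    return out;
}

// What may happen when the index is assumed to be uniform: the descriptor is
// fetched once, using the first lane's index, and shared by the whole wave.
std::vector<int> sampleAssumingUniform(const std::vector<int>& textures,
                                       const std::vector<std::uint32_t>& laneIndex) {
    assert(!laneIndex.empty());
    const int shared = textures.at(laneIndex.front());
    return std::vector<int>(laneIndex.size(), shared);
}
```

With textures {10, 20, 30} and lane indices {2, 0, 2, 1}, the first function gives each lane its own texture, while the second hands every lane the value of texture 2, because that is what the first lane asked for.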


Excluding Resource Binding Tier 1 hardware is an easy call; only a few older-generation Intel integrated GPUs are Tier 1 :)

Yes, Tier 2 can dynamically index textures and buffers, a technique known as bindless.

 

The non-uniform intrinsic is useful on AMD and does nothing on NVIDIA. BUT you do not want to use it, as it can generate very ugly and inefficient shaders; on AMD it is better to submit draws in a way that lets you "uniformize" the index. Shader Model 6 will also help in that regard, with an explicit way to read the first lane.


Note that in order to get truly non-uniform resource indexing you need to tell HLSL + the compiler that the index is non-uniform using the "NonUniformResourceIndex" intrinsic. Failing to do this will likely result in the index from the first thread of the wave deciding which texture to sample from.

Is that the same intrinsic used when you want to use dynamic indexing with instancing? Or am I thinking of something else? Also, do you need to use an intrinsic when using dynamic indexing with draw indirect?


I think you are confusing it with SV_InstanceID, which is a system-value semantic.

The bindless intrinsic is used like this:

Texture2D<float4> diffuse_[] : register(t1);

uint someIndex = foo(); // coming from something like an interpolator or whatever
Texture2D diffuse = diffuse_[NonUniformResourceIndex(someIndex)];

//then using diffuse

Without the intrinsic, which is more like a tag, some hardware (AMD, obviously) will produce bogus results. Pixels are gathered into waves and a texture descriptor is loaded into scalar registers, so if someIndex is divergent, lanes that asked for different textures end up sharing one descriptor.

 

NonUniformResourceIndex informs the compiler and driver of that, and the driver will generate a loop with clever masking that processes one group of threads per distinct value of someIndex. In simple cases the overhead may not be significant, but if you multiply the divergent indices, it will certainly bloat your shader :)
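
The loop described above can be modelled on the CPU. The C++ below is a rough sketch, not real driver output: the WaterfallResult struct and waterfallSample function are hypothetical names, a texture is reduced to one int, and a wave is a vector of per-lane indices. Each pass takes the index of the first still-active lane, does one "scalar" load for it, and retires every lane that wanted the same index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct WaterfallResult {
    std::vector<int> perLane;   // value each lane ends up with
    std::size_t iterations;     // number of passes over the wave
};

// Model of a masked loop over one wave: one pass per distinct index value.
WaterfallResult waterfallSample(const std::vector<int>& textures,
                                const std::vector<std::uint32_t>& laneIndex) {
    WaterfallResult result{std::vector<int>(laneIndex.size(), 0), 0};
    std::vector<bool> done(laneIndex.size(), false);
    std::size_t remaining = laneIndex.size();

    while (remaining > 0) {
        // Find the first lane that has not been served yet.
        std::size_t first = 0;
        while (done[first]) {
            ++first;
        }
        // Its index becomes the uniform index for this pass.
        const std::uint32_t uniformIndex = laneIndex[first];
        const int value = textures.at(uniformIndex);  // one scalar descriptor load

        // Serve and retire every active lane that asked for the same index.
        for (std::size_t lane = first; lane < laneIndex.size(); ++lane) {
            if (!done[lane] && laneIndex[lane] == uniformIndex) {
                result.perLane[lane] = value;
                done[lane] = true;
                --remaining;
            }
        }
        ++result.iterations;
    }
    return result;
}
```

The iteration count equals the number of distinct indices in the wave, which is where the cost comes from: a fully uniform wave takes one pass, while a wave in which every lane wants a different texture takes one pass per lane.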


I think you are confusing with sv_instanceId that is a system semantic.
No. IIRC, one of the MS DX12 videos on YouTube mentions explicitly that if you want to do dynamic indexing with instancing you need to do something 'special'. I don't remember exactly what it is, but I remember them saying that if the index to the texture changes within a single draw call, then you have to do something for it to work correctly.


 


 

OK, so yes, that is NonUniformResourceIndex :)


If you'd like to see a more complete example of dynamic indexing, you can check out the deferred texturing demo that I made a while ago. It uses dynamic indexing in both the forward and deferred rendering paths to sample material textures, as well as to sample decal textures. There's also an experimental branch where I use bindless techniques throughout the entire rendering framework. Basically, all SRVs are persistently allocated from a global descriptor heap, and every shader accesses them using 32-bit indices. However, I should warn you that there may be a few bugs on this branch that I haven't fixed yet, and there are also a few issues with dynamic buffers that I have to clean up.

Edited by MJP
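
For readers wondering what "persistently allocated from a global descriptor heap" with "32-bit indices" might look like on the CPU side, here is a minimal C++ sketch. It is not taken from the demo mentioned above: DescriptorIndexAllocator, kInvalidDescriptorIndex and descriptorAddress are hypothetical names, and the heap start and increment size are plain integers standing in for the values that GetCPUDescriptorHandleForHeapStart and GetDescriptorHandleIncrementSize would return in real D3D12 code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returned when the heap has no free slot left (hypothetical sentinel).
constexpr std::uint32_t kInvalidDescriptorIndex = 0xFFFFFFFFu;

// Hands out stable 32-bit slots in one large descriptor heap and recycles
// slots that have been released.
class DescriptorIndexAllocator {
public:
    explicit DescriptorIndexAllocator(std::uint32_t capacity)
        : capacity_(capacity) {}

    std::uint32_t allocate() {
        if (!freeList_.empty()) {
            const std::uint32_t index = freeList_.back();
            freeList_.pop_back();
            return index;
        }
        if (next_ < capacity_) {
            return next_++;
        }
        return kInvalidDescriptorIndex;
    }

    void release(std::uint32_t index) {
        assert(index < next_);  // only slots that were handed out can be freed
        freeList_.push_back(index);
    }

private:
    std::uint32_t capacity_;
    std::uint32_t next_ = 0;
    std::vector<std::uint32_t> freeList_;
};

// Usual descriptor arithmetic: heap start + index * increment size.
std::uint64_t descriptorAddress(std::uint64_t heapStart,
                                std::uint32_t index,
                                std::uint32_t incrementSize) {
    return heapStart + static_cast<std::uint64_t>(index) * incrementSize;
}
```

The index returned by allocate() is the value a shader would receive (for example through a constant buffer) and use to index an unbounded texture array, while descriptorAddress gives the location where the matching SRV would be written on the CPU.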
