_void_

Members
  • Content count

    112
  • Joined

  • Last visited

Community Reputation

864 Good

About _void_

  • Rank
    Member

Personal Information

  • Interests
    Education
    Programming
  1. @ajmiles @SoldierOfLight Thank you guys! I guess I managed to confuse everyone :-)
  2. Yeah, I can convert the texture to the DXGI_FORMAT_R8G8B8A8_UNORM_SRGB format. I thought maybe there was a way to work around this using a shader resource view that I do not know about :-)
  3. Hello guys, I have a texture of format DXGI_FORMAT_B8G8R8A8_UNORM_SRGB. Is there a way to create a shader resource view for the texture so that I could read it as RGBA from the shader instead of reading it specifically as BGRA? I would like all the textures to be read as RGBA (see the component-mapping sketch after this list). Tx
  4. @ajmiles Thank you for the info! Can you please rephrase the statement? I do not really grasp what you mean here.
  5. Hello guys, I am wondering why the D3D12 resource size has type UINT64 while the resource view size is limited to UINT32.

    typedef struct D3D12_RESOURCE_DESC {
        …
        UINT64 Width;
        …
    } D3D12_RESOURCE_DESC;

    A vertex buffer view can be described in UINT32 types.

    typedef struct D3D12_VERTEX_BUFFER_VIEW {
        D3D12_GPU_VIRTUAL_ADDRESS BufferLocation;
        UINT SizeInBytes;
        UINT StrideInBytes;
    } D3D12_VERTEX_BUFFER_VIEW;

    For a buffer we can specify the offset of the first element as UINT64, but the buffer view itself is still defined in UINT32 terms.

    typedef struct D3D12_BUFFER_SRV {
        UINT64 FirstElement;
        UINT NumElements;
        UINT StructureByteStride;
        D3D12_BUFFER_SRV_FLAGS Flags;
    } D3D12_BUFFER_SRV;

    Does it really mean that we can create, for instance, a structured buffer of floats with MAX_UINT64 elements (MAX_UINT64 * sizeof(float) in byte size) but are not able to create a shader resource view which encloses it completely, since we are limited by the UINT range? (A sketch of describing a sub-range of a large buffer via FirstElement follows the list below.) Is there a specific reason for this? HLSL is restricted to UINT32 values, so calling GetDimensions() on a resource of UINT64 size would not be able to produce valid values. I guess that could be one of the reasons. Thanks!
  6. @ajmiles Now I understand :-) I am using MAX_INT as the meshType for those pixels on the screen where geometry is missing. I was hoping that writes outside the array boundaries at index MAX_INT would be ignored, but in this case the compiler just optimized the code away. I have added an explicit check against MAX_INT and everything works now. Thanks a million!

    if (meshType != MAX_INT)
    {
        InterlockedMin(g_ScreenMinPoints[meshType].x, globalThreadId.x);
        InterlockedMin(g_ScreenMinPoints[meshType].y, globalThreadId.y);
        InterlockedMax(g_ScreenMaxPoints[meshType].x, globalThreadId.x);
        InterlockedMax(g_ScreenMaxPoints[meshType].y, globalThreadId.y);
    }
  7. @ajmiles Updated shader

    struct AppData
    {
        float4x4 viewMatrix;
        float4x4 viewInvMatrix;
        float4x4 projMatrix;
        float4x4 projInvMatrix;
        float4x4 viewProjMatrix;
        float4x4 viewProjInvMatrix;
        float4x4 prevViewProjMatrix;
        float4x4 prevViewProjInvMatrix;
        float4x4 notUsed1;
        float4 cameraWorldSpacePos;
        float4 cameraWorldFrustumPlanes[6];
        float cameraNearPlane;
        float cameraFarPlane;
        float2 notUsed2;
        uint2 screenSize;
        float2 rcpScreenSize;
        uint2 screenHalfSize;
        float2 rcpScreenHalfSize;
        uint2 screenQuarterSize;
        float2 rcpScreenQuarterSize;
        float4 sunWorldSpaceDir;
        float4 sunLightColor;
        float4 notUsed3[15];
    };

    #define NUM_MESH_TYPES 1
    #define NUM_THREADS_X 16
    #define NUM_THREADS_Y 16

    groupshared uint2 g_ScreenMinPoints[NUM_MESH_TYPES];
    groupshared uint2 g_ScreenMaxPoints[NUM_MESH_TYPES];

    #define NUM_THREADS_PER_GROUP (NUM_THREADS_X * NUM_THREADS_Y)

    cbuffer AppDataBuffer : register(b0)
    {
        AppData g_AppData;
    }

    RWStructuredBuffer<uint2> g_ShadingRectangleMinPointBuffer : register(u0);
    RWStructuredBuffer<uint2> g_ShadingRectangleMaxPointBuffer : register(u1);
    Texture2D<uint> g_MaterialIDTexture : register(t0);
    Buffer<uint> g_MeshTypePerMaterialIDBuffer : register(t1);

    [numthreads(NUM_THREADS_X, NUM_THREADS_Y, 1)]
    void main(uint3 globalThreadId : SV_DispatchThreadID, uint localThreadIndex : SV_GroupIndex)
    {
        for (uint index = localThreadIndex; index < NUM_MESH_TYPES; index += NUM_THREADS_PER_GROUP)
        {
            g_ScreenMinPoints[index] = uint2(0xffffffff, 0xffffffff);
            g_ScreenMaxPoints[index] = uint2(0, 0);
        }
        GroupMemoryBarrierWithGroupSync();

        if ((globalThreadId.x < g_AppData.screenSize.x) && (globalThreadId.y < g_AppData.screenSize.y))
        {
            uint materialID = g_MaterialIDTexture[globalThreadId.xy];
            uint meshType = g_MeshTypePerMaterialIDBuffer[materialID];

            InterlockedMin(g_ScreenMinPoints[meshType].x, globalThreadId.x);
            InterlockedMin(g_ScreenMinPoints[meshType].y, globalThreadId.y);
            InterlockedMax(g_ScreenMaxPoints[meshType].x, globalThreadId.x);
            InterlockedMax(g_ScreenMaxPoints[meshType].y, globalThreadId.y);
        }
        GroupMemoryBarrierWithGroupSync();

        for (uint index = localThreadIndex; index < NUM_MESH_TYPES; index += NUM_THREADS_PER_GROUP)
        {
            InterlockedMin(g_ShadingRectangleMinPointBuffer[index].x, g_ScreenMinPoints[index].x);
            InterlockedMin(g_ShadingRectangleMinPointBuffer[index].y, g_ScreenMinPoints[index].y);
            InterlockedMax(g_ShadingRectangleMaxPointBuffer[index].x, g_ScreenMaxPoints[index].x);
            InterlockedMax(g_ShadingRectangleMaxPointBuffer[index].y, g_ScreenMaxPoints[index].y);
        }
    }
  8. Hello guys, I am working on implementing a deferred texturing technique. I have a screen-space material ID texture from the render G-Buffer pass, for which I would like to calculate screen-space rectangles encompassing the material IDs used for the same mesh type. By mesh type I refer to one pipeline state object permutation used for the G-Buffer shading pass. Those screen-space rectangles are later used to shade the G-Buffer based on mesh type, as described in the Deferred+ article from GPU Zen. My compute shader pass for calculating the encompassing rectangles does not produce the expected results. I did some debugging with PIX and I can see that PIX for some reason does not show g_MaterialIDTexture and g_MeshTypePerMaterialIDBuffer in the list of bound resources. When I step through the shader code with the debugger, the reads of g_MaterialIDTexture and g_MeshTypePerMaterialIDBuffer are skipped. You can see the shader below.

    groupshared uint2 g_ScreenMinPoints[NUM_MESH_TYPES];
    groupshared uint2 g_ScreenMaxPoints[NUM_MESH_TYPES];

    #define NUM_THREADS_PER_GROUP (NUM_THREADS_X * NUM_THREADS_Y)

    cbuffer AppDataBuffer : register(b0)
    {
        AppData g_AppData;
    }

    RWStructuredBuffer<uint2> g_ShadingRectangleMinPointBuffer : register(u0);
    RWStructuredBuffer<uint2> g_ShadingRectangleMaxPointBuffer : register(u1);
    Texture2D<uint> g_MaterialIDTexture : register(t0);
    Buffer<uint> g_MeshTypePerMaterialIDBuffer : register(t1);

    [numthreads(NUM_THREADS_X, NUM_THREADS_Y, 1)]
    void Main(uint3 globalThreadId : SV_DispatchThreadID, uint localThreadIndex : SV_GroupIndex)
    {
        for (uint index = localThreadIndex; index < NUM_MESH_TYPES; index += NUM_THREADS_PER_GROUP)
        {
            g_ScreenMinPoints[index] = uint2(0xffffffff, 0xffffffff);
            g_ScreenMaxPoints[index] = uint2(0, 0);
        }
        GroupMemoryBarrierWithGroupSync();

        if ((globalThreadId.x < g_AppData.screenSize.x) && (globalThreadId.y < g_AppData.screenSize.y))
        {
            uint materialID = g_MaterialIDTexture[globalThreadId.xy];
            uint meshType = g_MeshTypePerMaterialIDBuffer[materialID];

            InterlockedMin(g_ScreenMinPoints[meshType].x, globalThreadId.x);
            InterlockedMin(g_ScreenMinPoints[meshType].y, globalThreadId.y);
            InterlockedMax(g_ScreenMaxPoints[meshType].x, globalThreadId.x);
            InterlockedMax(g_ScreenMaxPoints[meshType].y, globalThreadId.y);
        }
        GroupMemoryBarrierWithGroupSync();

        for (uint index = localThreadIndex; index < NUM_MESH_TYPES; index += NUM_THREADS_PER_GROUP)
        {
            InterlockedMin(g_ShadingRectangleMinPointBuffer[index].x, g_ScreenMinPoints[index].x);
            InterlockedMin(g_ShadingRectangleMinPointBuffer[index].y, g_ScreenMinPoints[index].y);
            InterlockedMax(g_ShadingRectangleMaxPointBuffer[index].x, g_ScreenMaxPoints[index].x);
            InterlockedMax(g_ShadingRectangleMaxPointBuffer[index].y, g_ScreenMaxPoints[index].y);
        }
    }

    I checked the DXBC output and it does not include them either.
    //
    // Generated by Microsoft (R) HLSL Shader Compiler 10.1
    //
    // Buffer Definitions:
    //
    // cbuffer AppDataBuffer
    // {
    //   struct AppData
    //   {
    //       float4x4 viewMatrix;                 // Offset:   0
    //       float4x4 viewInvMatrix;              // Offset:  64
    //       float4x4 projMatrix;                 // Offset: 128
    //       float4x4 projInvMatrix;              // Offset: 192
    //       float4x4 viewProjMatrix;             // Offset: 256
    //       float4x4 viewProjInvMatrix;          // Offset: 320
    //       float4x4 prevViewProjMatrix;         // Offset: 384
    //       float4x4 prevViewProjInvMatrix;      // Offset: 448
    //       float4x4 notUsed1;                   // Offset: 512
    //       float4 cameraWorldSpacePos;          // Offset: 576
    //       float4 cameraWorldFrustumPlanes[6];  // Offset: 592
    //       float cameraNearPlane;               // Offset: 688
    //       float cameraFarPlane;                // Offset: 692
    //       float2 notUsed2;                     // Offset: 696
    //       uint2 screenSize;                    // Offset: 704
    //       float2 rcpScreenSize;                // Offset: 712
    //       uint2 screenHalfSize;                // Offset: 720
    //       float2 rcpScreenHalfSize;            // Offset: 728
    //       uint2 screenQuarterSize;             // Offset: 736
    //       float2 rcpScreenQuarterSize;         // Offset: 744
    //       float4 sunWorldSpaceDir;             // Offset: 752
    //       float4 sunLightColor;                // Offset: 768
    //       float4 notUsed3[15];                 // Offset: 784
    //   } g_AppData;                             // Offset:   0 Size: 1024
    // }
    //
    // Resource bind info for g_ShadingRectangleMinPointBuffer
    // {
    //   uint2 $Element;                          // Offset:   0 Size:    8
    // }
    //
    // Resource bind info for g_ShadingRectangleMaxPointBuffer
    // {
    //   uint2 $Element;                          // Offset:   0 Size:    8
    // }
    //
    // Resource Bindings:
    //
    // Name                             Type    Format  Dim  HLSL Bind  Count
    // -------------------------------- ------- ------- ---- ---------- -----
    // g_ShadingRectangleMinPointBuffer UAV     struct  r/w  u0         1
    // g_ShadingRectangleMaxPointBuffer UAV     struct  r/w  u1         1
    // AppDataBuffer                    cbuffer NA      NA   cb0        1
    //
    // Input signature:
    //
    // Name                 Index   Mask Register SysValue  Format   Used
    // -------------------- ----- ------ -------- -------- ------- ------
    // no Input
    //
    // Output signature:
    //
    // Name                 Index   Mask Register SysValue  Format   Used
    // -------------------- ----- ------ -------- -------- ------- ------
    // no Output

    0x00000000: cs_5_0
    0x00000008: dcl_globalFlags refactoringAllowed | skipOptimization
    0x0000000C: dcl_constantbuffer CB0[45], immediateIndexed
    0x0000001C: dcl_uav_structured u0, 8
    0x0000002C: dcl_uav_structured u1, 8
    0x0000003C: dcl_input vThreadIDInGroupFlattened
    0x00000044: dcl_input vThreadID.xy
    0x0000004C: dcl_temps 2
    0x00000054: dcl_tgsm_structured g0, 8, 1
    0x00000068: dcl_tgsm_structured g1, 8, 1
    0x0000007C: dcl_thread_group 16, 16, 1

    // Initial variable locations:
    //   vThreadID.x <- globalThreadId.x; vThreadID.y <- globalThreadId.y; vThreadID.z <- globalThreadId.z;
    //   vThreadIDInGroupFlattened.x <- localThreadIndex

    #line 22 "D:\GitHub\RenderSDK\Samples\Bin\DynamicGI\Shaders\CalcShadingRectanglesCS.hlsl"
     0 0x0000008C: mov r0.x, vThreadIDInGroupFlattened.x
     1 0x0000009C: mov r0.y, r0.x
     2 0x000000B0: loop
     3 0x000000B4: mov r0.z, l(1)
     4 0x000000C8: ult r0.z, r0.y, r0.z
     5 0x000000E4: breakc_z r0.z
    #line 24
     6 0x000000F0: store_structured g0.x, l(0), l(0), l(-1)
     7 0x00000114: store_structured g0.x, l(0), l(4), l(-1)
    #line 25
     8 0x00000138: mov r0.zw, l(0,0,0,0)
     9 0x00000158: store_structured g1.x, l(0), l(0), r0.z
    10 0x0000017C: store_structured g1.x, l(0), l(4), r0.w
    #line 26
    11 0x000001A0: mov r0.z, l(256)
    12 0x000001B4: iadd r0.y, r0.z, r0.y
    13 0x000001D0: endloop
    #line 27
    14 0x000001D4: sync_g_t
    #line 29
    15 0x000001D8: ult r0.x, vThreadID.x, cb0[44].x
    16 0x000001F4: ult r0.y, vThreadID.y, cb0[44].y
    17 0x00000210: and r0.x, r0.y, r0.x
    18 0x0000022C: if_nz r0.x
    #line 34
    19 0x00000238: atomic_umin g0, l(0, 0, 0, 0), vThreadID.x
    #line 35
    20 0x0000025C: atomic_umin g0, l(0, 4, 0, 0), vThreadID.y
    #line 37
    21 0x00000280: atomic_umax g1, l(0, 0, 0, 0), vThreadID.x
    #line 38
    22 0x000002A4: atomic_umax g1, l(0, 4, 0, 0), vThreadID.y
    #line 39
    23 0x000002C8: endif
    #line 40
    24 0x000002CC: sync_g_t
    #line 42
    25 0x000002D0: mov r0.x, vThreadIDInGroupFlattened.x   // r0.x <- index
    26 0x000002E0: mov r1.x, r0.x                          // r1.x <- index
    27 0x000002F4: loop
    28 0x000002F8: mov r0.y, l(1)
    29 0x0000030C: ult r0.y, r1.x, r0.y
    30 0x00000328: breakc_z r0.y
    #line 44
    31 0x00000334: ld_structured r0.y, l(0), l(0), g0.xxxx
    32 0x00000358: mov r1.y, l(0)
    33 0x0000036C: atomic_umin u0, r1.xyxx, r0.y
    #line 45
    34 0x00000388: ld_structured r0.y, l(0), l(4), g0.xxxx
    35 0x000003AC: mov r1.z, l(4)
    36 0x000003C0: atomic_umin u0, r1.xzxx, r0.y
    #line 47
    37 0x000003DC: ld_structured r0.y, l(0), l(0), g1.xxxx
    38 0x00000400: atomic_umax u1, r1.xyxx, r0.y
    #line 48
    39 0x0000041C: ld_structured r0.y, l(0), l(4), g1.xxxx
    40 0x00000440: atomic_umax u1, r1.xzxx, r0.y
    #line 49
    41 0x0000045C: mov r0.y, l(256)
    42 0x00000470: iadd r1.x, r0.y, r1.x
    43 0x0000048C: endloop
    #line 50
    44 0x00000490: ret
    // Approximately 45 instruction slots used

    It looks like the compiler optimizes them away, but I do not understand why. Any ideas? :-) Thanks,
  9. DX12 Resource synchronization

    @Infinisearch @SoldierOfLight Great! Thank you for clarifying things!
  10. Hi guys! I have to run a custom pixel shader to clear an RTV (draw call 1). After that I run another pixel shader on that RTV to render objects (draw call 2). Do I need to insert a resource barrier between the two draw calls, provided that the RTV is already in the render target state? The same question applies to a UAV: I run a custom compute shader to clear the UAV and then run another compute shader to do the actual calculations. Do I need a resource barrier between these two dispatches (see the UAV barrier sketch after this list)? Thanks!
  11. Hi guys, I would like to double-check with you whether I am correct. I have a buffer of unsigned integers of format DXGI_FORMAT_R16_UINT which I would like to read from the shader. Since there is no dedicated 16-bit unsigned int type in HLSL, I guess we should go with the HLSL uint (32-bit) type.

    Buffer<uint> g_InputBuffer : register(t0); // DXGI_FORMAT_R16_UINT

    In this case, will each element of the input buffer be automatically converted into a corresponding uint (32-bit) element? And vice versa, if I want to output to a buffer of type DXGI_FORMAT_R16_UINT, I guess the HLSL uint will be automatically converted to a 16-bit unsigned value?

    RWBuffer<uint> g_OutputBuffer : register(u0); // DXGI_FORMAT_R16_UINT
    uint value = ...
    g_OutputBuffer[outIndex] = value;

    (A sketch of the corresponding typed-buffer view creation follows the list below.) Thanks!
  12. @galop1n Yep, you are right. I was using Sample instead of SampleLevel; the issue has been solved. Do you know if MinMax filtering is supported by default in D3D12, and how to check whether it is supported otherwise? Thanks!
  13. Hello guys, I would like to use MinMax filtering (D3D12_FILTER_MAXIMUM_MIN_MAG_MIP_POINT) in a compute shader. I am trying to compile the compute shader as cs_5_0 and encounter an error: "error X4532: cannot map expression to cs_5_0 instruction set". I tried to compile the shader in cs_6_0 mode and got "unrecognized compiler target cs_6_0". I do not really understand the error, as cs_6_0 is supposed to be supported. According to MSDN, D3D12_FILTER_MAXIMUM_MIN_MAG_MIP_POINT should "Fetch the same set of texels as D3D12_FILTER_MIN_MAG_MIP_POINT and instead of filtering them return the maximum of the texels. Texels that are weighted 0 during filtering aren't counted towards the maximum. You can query support for this filter type from the MinMaxFiltering member in the D3D11_FEATURE_DATA_D3D11_OPTIONS1 structure". I am not sure this documentation applies, as it is talking about Direct3D 11, and D3D12_FEATURE_DATA_D3D12_OPTIONS does not seem to provide this kind of check (see the feature-support sketch after this list). The Direct3D device is created with feature level D3D_FEATURE_LEVEL_12_0 and I am using VS 2015. Thanks!
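
Regarding the view question in item 3: D3D12 SRVs expose a Shader4ComponentMapping field that controls which source component each shader channel reads. The sketch below only illustrates that mechanism under assumed, hypothetical names (device, texture, srvHandle); it is not a claim that any remapping is actually required for DXGI_FORMAT_B8G8R8A8_UNORM_SRGB, which is exactly the confusion resolved in the thread.

    #include <d3d12.h>

    // Sketch: create an SRV for a B8G8R8A8_SRGB texture. Shader4ComponentMapping
    // controls which source component feeds each shader component (.r/.g/.b/.a).
    void CreateTextureSRV(ID3D12Device* device, ID3D12Resource* texture,
                          D3D12_CPU_DESCRIPTOR_HANDLE srvHandle)
    {
        D3D12_SHADER_RESOURCE_VIEW_DESC desc = {};
        desc.Format = DXGI_FORMAT_B8G8R8A8_UNORM_SRGB;
        desc.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE2D;

        // Identity mapping: shader components 0..3 read source components 0..3.
        desc.Shader4ComponentMapping = D3D12_DEFAULT_SHADER_4_COMPONENT_MAPPING;
        // A reordered mapping could be encoded instead, e.g.:
        // desc.Shader4ComponentMapping = D3D12_ENCODE_SHADER_4_COMPONENT_MAPPING(2, 1, 0, 3);

        desc.Texture2D.MostDetailedMip = 0;
        desc.Texture2D.MipLevels = 1;
        desc.Texture2D.PlaneSlice = 0;
        desc.Texture2D.ResourceMinLODClamp = 0.0f;

        device->CreateShaderResourceView(texture, &desc, srvHandle);
    }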
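
For item 5, the UINT64/UINT split means a single buffer view is limited to a 32-bit element count, while FirstElement can still address far into a larger resource, so multiple views can window different ranges of the same buffer. A minimal sketch under those assumptions, with hypothetical device, buffer and descriptor handle names:

    #include <d3d12.h>
    #include <cstdint>

    // Sketch: describe one window into a buffer whose total element count may
    // exceed the 32-bit range. The start offset is 64-bit, the window size is 32-bit.
    void CreateStructuredBufferWindowSRV(ID3D12Device* device, ID3D12Resource* bigBuffer,
                                         uint64_t firstElement, uint32_t numElements,
                                         D3D12_CPU_DESCRIPTOR_HANDLE srvHandle)
    {
        D3D12_SHADER_RESOURCE_VIEW_DESC desc = {};
        desc.Format = DXGI_FORMAT_UNKNOWN;               // structured buffers use UNKNOWN
        desc.ViewDimension = D3D12_SRV_DIMENSION_BUFFER;
        desc.Shader4ComponentMapping = D3D12_DEFAULT_SHADER_4_COMPONENT_MAPPING;
        desc.Buffer.FirstElement = firstElement;         // UINT64: can start beyond the 32-bit range
        desc.Buffer.NumElements = numElements;           // UINT: the window itself stays within 32 bits
        desc.Buffer.StructureByteStride = sizeof(float);
        desc.Buffer.Flags = D3D12_BUFFER_SRV_FLAG_NONE;

        device->CreateShaderResourceView(bigBuffer, &desc, srvHandle);
    }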
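
For the compute half of item 10, the relevant barrier is a UAV barrier rather than a state transition, since the resource stays in the unordered access state; it makes the first dispatch's writes visible to the second. A minimal sketch, with hypothetical command list, pipeline state and resource names (root signature and resource bindings assumed to be set elsewhere):

    #include <d3d12.h>

    // Sketch: a UAV barrier between two dispatches that touch the same UAV,
    // so the second dispatch sees the writes of the first one.
    void ClearThenCompute(ID3D12GraphicsCommandList* cmdList,
                          ID3D12PipelineState* clearPSO,
                          ID3D12PipelineState* computePSO,
                          ID3D12Resource* uavResource,
                          UINT groupsX, UINT groupsY)
    {
        cmdList->SetPipelineState(clearPSO);
        cmdList->Dispatch(groupsX, groupsY, 1);   // clears the UAV

        D3D12_RESOURCE_BARRIER barrier = {};
        barrier.Type = D3D12_RESOURCE_BARRIER_TYPE_UAV;
        barrier.Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE;
        barrier.UAV.pResource = uavResource;      // nullptr would order all pending UAV accesses
        cmdList->ResourceBarrier(1, &barrier);

        cmdList->SetPipelineState(computePSO);
        cmdList->Dispatch(groupsX, groupsY, 1);   // consumes the cleared data
    }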
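
For item 11, on the API side the 16-bit/32-bit conversion of a typed buffer is driven by the DXGI format given in the view description, while the HLSL side simply declares Buffer<uint>. A minimal sketch of such an SRV, with hypothetical names:

    #include <d3d12.h>

    // Sketch: a typed buffer SRV with a 16-bit UINT format. In the shader it is
    // declared as Buffer<uint>; each 16-bit element is returned as a 32-bit uint.
    void CreateR16UintBufferSRV(ID3D12Device* device, ID3D12Resource* buffer,
                                UINT numElements, D3D12_CPU_DESCRIPTOR_HANDLE srvHandle)
    {
        D3D12_SHADER_RESOURCE_VIEW_DESC desc = {};
        desc.Format = DXGI_FORMAT_R16_UINT;              // drives the per-element conversion
        desc.ViewDimension = D3D12_SRV_DIMENSION_BUFFER;
        desc.Shader4ComponentMapping = D3D12_DEFAULT_SHADER_4_COMPONENT_MAPPING;
        desc.Buffer.FirstElement = 0;
        desc.Buffer.NumElements = numElements;
        desc.Buffer.StructureByteStride = 0;             // 0 for typed (non-structured) buffers
        desc.Buffer.Flags = D3D12_BUFFER_SRV_FLAG_NONE;

        device->CreateShaderResourceView(buffer, &desc, srvHandle);
    }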
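
For item 13, D3D12_FEATURE_DATA_D3D12_OPTIONS indeed has no dedicated MinMaxFiltering member. One hedged approach, based on the assumption that tiled resources tier 2 (which feature level 12_0 hardware is required to support) carries minimum/maximum reduction filtering with it, is to query the tiled resources tier:

    #include <d3d12.h>

    // Sketch (assumption): tiled resources tier 2 is used here as the indicator for
    // min/max reduction filtering support, since D3D12_FEATURE_DATA_D3D12_OPTIONS
    // has no dedicated MinMaxFiltering field.
    bool SupportsMinMaxFiltering(ID3D12Device* device)
    {
        D3D12_FEATURE_DATA_D3D12_OPTIONS options = {};
        if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS,
                                               &options, sizeof(options))))
            return false;

        return options.TiledResourcesTier >= D3D12_TILED_RESOURCES_TIER_2;
    }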