Showing results for tags 'DX12'.

Found 281 results

  1. Hello! I would like to introduce Diligent Engine, a project I've been working on recently. Diligent Engine is a lightweight, cross-platform abstraction layer between the application and the platform-specific graphics API. Its main goal is to take advantage of next-generation APIs such as Direct3D12 and Vulkan, while at the same time providing support for older platforms via Direct3D11, OpenGL and OpenGLES. Diligent Engine exposes a common front end for all supported platforms and provides interoperability with the underlying native API. A shader source code converter allows shaders authored in HLSL to be translated to GLSL and used on all platforms. Diligent Engine supports integration with Unity and is designed to be used as a graphics subsystem in a standalone game engine, a Unity native plugin, or any other 3D application. It is distributed under the Apache 2.0 license and is free to use. Full source code is available for download on GitHub.

     Features:
     • True cross-platform
       • Exact same client code for all supported platforms and rendering backends
         • No #if defined(_WIN32) ... #elif defined(LINUX) ... #elif defined(ANDROID) ...
         • No #if defined(D3D11) ... #elif defined(D3D12) ... #elif defined(OPENGL) ...
       • Exact same HLSL shaders run on all platforms and all backends
     • Modular design
       • Components are clearly separated logically and physically and can be used as needed
         • Only take what you need for your project (don't want to keep samples and tutorials in your codebase? Simply remove the Samples submodule. Only need core functionality? Use only the Core submodule.)
       • No 15,000-line source files
     • Clear object-based interface
     • No global states
     • Key graphics features:
       • Automatic shader resource binding designed to leverage the next-generation rendering APIs
       • Multithreaded command buffer generation
         • 50,000 draw calls at 300 fps with the D3D12 backend
       • Descriptor, memory and resource state management
     • Modern C++ features to make the code fast and reliable

     The following platforms and low-level APIs are currently supported:
     • Windows Desktop: Direct3D11, Direct3D12, OpenGL
     • Universal Windows: Direct3D11, Direct3D12
     • Linux: OpenGL
     • Android: OpenGLES
     • MacOS: OpenGL
     • iOS: OpenGLES

     API Basics

     Initialization

     The engine can perform initialization of the API or attach to an already existing D3D11/D3D12 device or OpenGL/GLES context. For instance, the following code shows how the engine can be initialized in D3D12 mode:

         #include "RenderDeviceFactoryD3D12.h"
         using namespace Diligent;

         // ...
         GetEngineFactoryD3D12Type GetEngineFactoryD3D12 = nullptr;
         // Load the DLL and import the GetEngineFactoryD3D12() function
         LoadGraphicsEngineD3D12(GetEngineFactoryD3D12);
         auto *pFactoryD3D12 = GetEngineFactoryD3D12();

         EngineD3D12Attribs EngD3D12Attribs;
         EngD3D12Attribs.CPUDescriptorHeapAllocationSize[0] = 1024;
         EngD3D12Attribs.CPUDescriptorHeapAllocationSize[1] = 32;
         EngD3D12Attribs.CPUDescriptorHeapAllocationSize[2] = 16;
         EngD3D12Attribs.CPUDescriptorHeapAllocationSize[3] = 16;
         EngD3D12Attribs.NumCommandsToFlushCmdList = 64;

         RefCntAutoPtr<IRenderDevice> pRenderDevice;
         RefCntAutoPtr<IDeviceContext> pImmediateContext;
         SwapChainDesc SwapChainDesc;
         RefCntAutoPtr<ISwapChain> pSwapChain;
         pFactoryD3D12->CreateDeviceAndContextsD3D12(EngD3D12Attribs, &pRenderDevice, &pImmediateContext, 0);
         pFactoryD3D12->CreateSwapChainD3D12(pRenderDevice, pImmediateContext, SwapChainDesc, hWnd, &pSwapChain);

     Creating Resources

     Device resources are created by the render device.
     The two main resource types are buffers, which represent linear memory, and textures, which use memory layouts optimized for fast filtering. To create a buffer, populate the BufferDesc structure and call IRenderDevice::CreateBuffer(). The following code creates a uniform (constant) buffer:

         BufferDesc BuffDesc;
         BuffDesc.Name = "Uniform buffer";
         BuffDesc.BindFlags = BIND_UNIFORM_BUFFER;
         BuffDesc.Usage = USAGE_DYNAMIC;
         BuffDesc.uiSizeInBytes = sizeof(ShaderConstants);
         BuffDesc.CPUAccessFlags = CPU_ACCESS_WRITE;
         m_pDevice->CreateBuffer(BuffDesc, BufferData(), &m_pConstantBuffer);

     Similarly, to create a texture, populate the TextureDesc structure and call IRenderDevice::CreateTexture(), as in the following example:

         TextureDesc TexDesc;
         TexDesc.Name = "My texture 2D";
         TexDesc.Type = TEXTURE_TYPE_2D;
         TexDesc.Width = 1024;
         TexDesc.Height = 1024;
         TexDesc.Format = TEX_FORMAT_RGBA8_UNORM;
         TexDesc.Usage = USAGE_DEFAULT;
         TexDesc.BindFlags = BIND_SHADER_RESOURCE | BIND_RENDER_TARGET | BIND_UNORDERED_ACCESS;
         m_pRenderDevice->CreateTexture(TexDesc, TextureData(), &m_pTestTex);

     Initializing Pipeline State

     Diligent Engine follows the Direct3D12 style of configuring the graphics/compute pipeline: one big Pipeline State Object (PSO) encompasses all required states (all shader stages, input layout description, depth-stencil, rasterizer and blend state descriptions, etc.).

     Creating Shaders

     To create a shader, populate the ShaderCreationAttribs structure. An important member is ShaderCreationAttribs::SourceLanguage. The following are valid values for this member:
     • SHADER_SOURCE_LANGUAGE_DEFAULT - The shader source format matches the underlying graphics API: HLSL for D3D11 or D3D12 mode, and GLSL for OpenGL and OpenGLES modes.
     • SHADER_SOURCE_LANGUAGE_HLSL - The shader source is in HLSL. For OpenGL and OpenGLES modes, the source code will be converted to GLSL. See the shader converter for details.
     • SHADER_SOURCE_LANGUAGE_GLSL - The shader source is in GLSL. There is currently no GLSL-to-HLSL converter.

     To allow grouping of resources based on the expected frequency of change, Diligent Engine introduces a classification of shader variables:
     • Static variables (SHADER_VARIABLE_TYPE_STATIC) are expected to be set only once. They may not be changed once a resource is bound to the variable. Such variables are intended to hold global constants such as camera-attribute or global light-attribute constant buffers.
     • Mutable variables (SHADER_VARIABLE_TYPE_MUTABLE) define resources that are expected to change at per-material frequency. Examples include diffuse textures, normal maps, etc.
     • Dynamic variables (SHADER_VARIABLE_TYPE_DYNAMIC) are expected to change frequently and randomly.

     This post describes the resource binding model in Diligent Engine.
     The following is an example of shader initialization:

         ShaderCreationAttribs Attrs;
         Attrs.Desc.Name = "MyPixelShader";
         Attrs.FilePath = "MyShaderFile.fx";
         Attrs.SearchDirectories = "shaders;shaders\\inc;";
         Attrs.EntryPoint = "MyPixelShader";
         Attrs.Desc.ShaderType = SHADER_TYPE_PIXEL;
         Attrs.SourceLanguage = SHADER_SOURCE_LANGUAGE_HLSL;

         BasicShaderSourceStreamFactory BasicSSSFactory(Attrs.SearchDirectories);
         Attrs.pShaderSourceStreamFactory = &BasicSSSFactory;

         ShaderVariableDesc ShaderVars[] =
         {
             {"g_StaticTexture",  SHADER_VARIABLE_TYPE_STATIC},
             {"g_MutableTexture", SHADER_VARIABLE_TYPE_MUTABLE},
             {"g_DynamicTexture", SHADER_VARIABLE_TYPE_DYNAMIC}
         };
         Attrs.Desc.VariableDesc = ShaderVars;
         Attrs.Desc.NumVariables = _countof(ShaderVars);
         Attrs.Desc.DefaultVariableType = SHADER_VARIABLE_TYPE_STATIC;

         StaticSamplerDesc StaticSampler;
         StaticSampler.Desc.MinFilter = FILTER_TYPE_LINEAR;
         StaticSampler.Desc.MagFilter = FILTER_TYPE_LINEAR;
         StaticSampler.Desc.MipFilter = FILTER_TYPE_LINEAR;
         StaticSampler.TextureName = "g_MutableTexture";
         Attrs.Desc.NumStaticSamplers = 1;
         Attrs.Desc.StaticSamplers = &StaticSampler;

         ShaderMacroHelper Macros;
         Macros.AddShaderMacro("USE_SHADOWS", 1);
         Macros.AddShaderMacro("NUM_SHADOW_SAMPLES", 4);
         Macros.Finalize();
         Attrs.Macros = Macros;

         RefCntAutoPtr<IShader> pShader;
         m_pDevice->CreateShader(Attrs, &pShader);

     Creating the Pipeline State Object

     To create a pipeline state object, define an instance of the PipelineStateDesc structure. The structure defines pipeline specifics such as whether the pipeline is a compute pipeline, the number and formats of render targets, and the depth-stencil format:

         // This is a graphics pipeline
         PSODesc.IsComputePipeline = false;
         PSODesc.GraphicsPipeline.NumRenderTargets = 1;
         PSODesc.GraphicsPipeline.RTVFormats[0] = TEX_FORMAT_RGBA8_UNORM_SRGB;
         PSODesc.GraphicsPipeline.DSVFormat = TEX_FORMAT_D32_FLOAT;

     The structure also defines the depth-stencil, rasterizer, blend state, input layout and other parameters. For instance, rasterizer state can be defined as in the code snippet below:

         // Init rasterizer state
         RasterizerStateDesc &RasterizerDesc = PSODesc.GraphicsPipeline.RasterizerDesc;
         RasterizerDesc.FillMode = FILL_MODE_SOLID;
         RasterizerDesc.CullMode = CULL_MODE_NONE;
         RasterizerDesc.FrontCounterClockwise = True;
         RasterizerDesc.ScissorEnable = True;
         //RasterizerDesc.MultisampleEnable = false; // do not allow MSAA (fonts would be degraded)
         RasterizerDesc.AntialiasedLineEnable = False;

     When all fields are populated, call IRenderDevice::CreatePipelineState() to create the PSO:

         m_pDev->CreatePipelineState(PSODesc, &m_pPSO);

     Binding Shader Resources

     Shader resource binding in Diligent Engine is based on grouping variables into three groups (static, mutable and dynamic). Static variables are expected to be set only once and may not be changed once a resource is bound. Such variables are intended to hold global constants such as camera-attribute or global light-attribute constant buffers.
     Static variables are bound directly to the shader object:

         PixelShader->GetShaderVariable("g_tex2DShadowMap")->Set(pShadowMapSRV);

     Mutable and dynamic variables are bound via a new object called a Shader Resource Binding (SRB), which is created by the pipeline state:

         m_pPSO->CreateShaderResourceBinding(&m_pSRB);

     Dynamic and mutable resources are then bound through the SRB object:

         m_pSRB->GetVariable(SHADER_TYPE_VERTEX, "tex2DDiffuse")->Set(pDiffuseTexSRV);
         m_pSRB->GetVariable(SHADER_TYPE_VERTEX, "cbRandomAttribs")->Set(pRandomAttrsCB);

     The difference between mutable and dynamic resources is that mutable ones can only be set once per instance of a shader resource binding, while dynamic resources can be set multiple times. It is important to set the variable type properly, as it may affect performance: static variables are generally the most efficient, followed by mutable; dynamic variables are the most expensive from a performance point of view. This post explains shader resource binding in more detail.

     Setting the Pipeline State and Invoking a Draw Command

     Before any draw command can be invoked, all required vertex and index buffers, as well as the pipeline state, should be bound to the device context:

         // Clear render target
         const float zero[4] = {0, 0, 0, 0};
         m_pContext->ClearRenderTarget(nullptr, zero);

         // Set vertex and index buffers
         IBuffer *buffer[] = {m_pVertexBuffer};
         Uint32 offsets[] = {0};
         Uint32 strides[] = {sizeof(MyVertex)};
         m_pContext->SetVertexBuffers(0, 1, buffer, strides, offsets, SET_VERTEX_BUFFERS_FLAG_RESET);
         m_pContext->SetIndexBuffer(m_pIndexBuffer, 0);
         m_pContext->SetPipelineState(m_pPSO);

     All shader resources must also be committed to the device context:

         m_pContext->CommitShaderResources(m_pSRB, COMMIT_SHADER_RESOURCES_FLAG_TRANSITION_RESOURCES);

     When all required states and resources are bound, IDeviceContext::Draw() can be used to execute a draw command, or IDeviceContext::DispatchCompute() to execute a compute command. Note that a graphics pipeline must be bound for a draw command and a compute pipeline for a dispatch command. Draw() takes a DrawAttribs structure as an argument. The structure members define all attributes required to perform the command (primitive topology, number of vertices or indices, whether the draw call is indexed, instanced, indirect, etc.). For example:

         DrawAttribs attrs;
         attrs.IsIndexed = true;
         attrs.IndexType = VT_UINT16;
         attrs.NumIndices = 36;
         attrs.Topology = PRIMITIVE_TOPOLOGY_TRIANGLE_LIST;
         pContext->Draw(attrs);

     Tutorials and Samples

     The GitHub repository contains a number of tutorials and sample applications that demonstrate the API usage.
     • Tutorial 01 - Hello Triangle: shows how to render a simple triangle using the Diligent Engine API.
     • Tutorial 02 - Cube: demonstrates how to render an actual 3D object, a cube; shows how to load shaders from files and create and use vertex, index and uniform buffers.
     • Tutorial 03 - Texturing: demonstrates how to apply a texture to a 3D object; shows how to load a texture from file, create a shader resource binding object, and sample a texture in the shader.
     • Tutorial 04 - Instancing: demonstrates how to use instancing to render multiple copies of one object, using a unique transformation matrix for every copy.
     • Tutorial 05 - Texture Array: demonstrates how to combine instancing with texture arrays to use a unique texture for every instance.
     • Tutorial 06 - Multithreading: shows how to generate command lists in parallel from multiple threads.
     • Tutorial 07 - Geometry Shader: shows how to use a geometry shader to render a smooth wireframe.
     • Tutorial 08 - Tessellation: shows how to use hardware tessellation to implement a simple adaptive terrain rendering algorithm.
     • Tutorial 09 - Quads: shows how to render multiple 2D quads, frequently switching textures and blend modes.
     • The AntTweakBar sample demonstrates how to use the AntTweakBar library to create a simple user interface.
     • The Atmospheric Scattering sample is a more advanced example. It demonstrates how Diligent Engine can be used to implement various rendering tasks: loading textures from files, using complex shaders, rendering to textures, using compute shaders and unordered access views, etc.

     The repository also includes an Asteroids performance benchmark based on the demo developed by Intel. It renders 50,000 unique textured asteroids and lets you compare the performance of the D3D11 and D3D12 implementations. Every asteroid is a combination of one of 1000 unique meshes and one of 10 unique textures.

     Integration with Unity

     Diligent Engine supports integration with Unity through the Unity low-level native plugin interface. The engine relies on Native API Interoperability to attach to the graphics API initialized by Unity. After the Diligent Engine device and context are created, they can be used as usual to create resources and issue rendering commands. The GhostCubePlugin sample shows how Diligent Engine can be used to render a ghost cube that is only visible as a reflection in a mirror.
  2. Hi everyone :) I'm trying to implement SSAO with D3D12 (using the implementation found on learnopengl.com: https://learnopengl.com/Advanced-Lighting/SSAO), but I seem to have a performance problem... Here is part of the code of the SSAO pixel shader:

         Texture2D PositionMap : register(t0);
         Texture2D NormalMap   : register(t1);
         Texture2D NoiseMap    : register(t2);
         SamplerState s1       : register(s0);

         // I hard-coded the variables just for the test
         const static int kernel_size = 64;
         const static float2 noise_scale = float2(632.0 / 4.0, 449.0 / 4.0);
         const static float radius = 0.5;
         const static float bias = 0.025;

         cbuffer ssao_cbuf : register(b0)
         {
             float4x4 gProjectionMatrix;
             float3 SSAO_SampleKernel[64];
         }

         float main(VS_OUTPUT input) : SV_TARGET
         {
             [....]
             float occlusion = 0.0;
             for (int i = 0; i < kernel_size; i++)
             {
                 float3 ksample = mul(TBN, SSAO_SampleKernel[i]);
                 ksample = pos + ksample * radius;

                 float4 offset = float4(ksample, 1.0);
                 offset = mul(gProjectionMatrix, offset);
                 offset.xyz /= offset.w;
                 offset.xyz = offset.xyz * 0.5 + 0.5;

                 float sampleDepth = PositionMap.Sample(s1, offset.xy).z;
                 float rangeCheck = smoothstep(0.0, 1.0, radius / abs(pos.z - sampleDepth));
                 occlusion += (sampleDepth >= ksample.z + bias ? 1.0 : 0.0) * rangeCheck;
             }
             [....]
         }

     The problem is this for loop. When I run it, it takes around 140 ms to draw the frame (a simple torus knot...) on a GTX 770. Without the loop, it's 5 ms. Running it without the PositionMap sampling and the matrix multiplication takes around 25 ms. I understand that matrix multiplication and sampling are "expensive", but I don't think that's enough to justify the sluggish drawing time. I assume the shader code from the tutorial works, so unless I've done something terribly stupid that I don't see, I suppose my problem comes from something I did wrong with D3D12 that I'm not aware of (I just started learning D3D12).

     Both PositionMap and NormalMap are render targets from the g-buffer. For each one I created two descriptor heaps, one of type D3D12_DESCRIPTOR_HEAP_TYPE_RTV and one of type D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV, and called both CreateRenderTargetView and CreateShaderResourceView. The NoiseMap only has one descriptor heap of type D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV. Before calling DrawIndexedInstanced for the SSAO pass, I copy the relevant descriptors to a descriptor heap that I then bind, like so:

         CD3DX12_CPU_DESCRIPTOR_HANDLE ssao_heap_hdl(_pSSAOPassDesciptorHeap->GetCPUDescriptorHandleForHeapStart());
         device->CopyDescriptorsSimple(1, ssao_heap_hdl,
             _gBuffer.PositionMap().GetDescriptorHeap()->GetCPUDescriptorHandleForHeapStart(),
             D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV);
         ssao_heap_hdl.Offset(CBV_descriptor_inc_size);
         device->CopyDescriptorsSimple(1, ssao_heap_hdl,
             _gBuffer.NormalMap().GetDescriptorHeap()->GetCPUDescriptorHandleForHeapStart(),
             D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV);
         ssao_heap_hdl.Offset(CBV_descriptor_inc_size);
         device->CopyDescriptorsSimple(1, ssao_heap_hdl,
             _ssaoPass.GetNoiseTexture().GetDescriptorHeap()->GetCPUDescriptorHandleForHeapStart(),
             D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV);

         ID3D12DescriptorHeap* descriptor_heaps[] = { _pSSAOPassDesciptorHeap };
         pCommandList->SetDescriptorHeaps(1, descriptor_heaps);
         pCommandList->SetGraphicsRootDescriptorTable(0, _pSSAOPassDesciptorHeap->GetGPUDescriptorHandleForHeapStart());
         pCommandList->SetGraphicsRootConstantBufferView(1, _cBuffSamplesKernel[0].GetVirtualAddress());

     Debug and Release builds give me the same results, as do shader compilation flags with and without optimization.
     So, does anyone see something weird in my code that would cause the slowness? By the way, when I run the pixel shader in the graphics debugger, this line:

         offset.xyz /= offset.w;

     does not seem to produce the expected results. The two rows in the following table are the values in the debugger before and after the execution of that line of code:

         Name     Value                                                                    Type
         offset   x = -1.631761000, y = 1.522913000, z = 2.634875000, w = 2.634875000     float4
         offset   x = -0.619293700, y = 0.577983000, z = 2.634875000, w = 2.634875000     float4

     So X and Y are okay, but not Z. Please tell me if you need more info/code. Thank you for your help!
  3. I want to stress my application by rendering lots of polygons. I use DX12 (SharpDX) and I upload all vertices to GPU memory via an upload buffer. In normal cases everything works fine, but when I try to create a large upload buffer (CreateCommittedResource) I get a non-informative exception from SharpDX, and the device is removed. GetDeviceRemovedReason (DeviceRemovedReason in SharpDX) returns 0x887A0020: DXGI_ERROR_DRIVER_INTERNAL_ERROR. I'm guessing it's all because it can't find a consecutive chunk of memory big enough to create the buffer (I have 6 GB on the card and I'm currently trying to allocate a buffer smaller than 1 GB). So how do I deal with this? I guess I could create several smaller buffers instead, but I can't just put the CreateCommittedResource call in a try-catch block: the exception is serious enough to remove my device, so any further attempts will fail after the first try. Can I somehow know beforehand what size is OK to allocate?
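     A minimal C++ sketch of the several-smaller-buffers fallback mentioned above (SharpDX exposes the same calls); the chunk size, helper name and use of the d3dx12.h helpers are illustrative assumptions, not code from the post:

         // Hypothetical sketch: split one large upload allocation into fixed-size chunks.
         #include <vector>
         #include <wrl/client.h>
         #include <d3d12.h>
         #include "d3dx12.h" // CD3DX12 helpers

         using Microsoft::WRL::ComPtr;

         std::vector<ComPtr<ID3D12Resource>> CreateChunkedUploadBuffers(
             ID3D12Device* device, UINT64 totalBytes, UINT64 chunkBytes = 64ull * 1024 * 1024)
         {
             std::vector<ComPtr<ID3D12Resource>> chunks;
             for (UINT64 offset = 0; offset < totalBytes; offset += chunkBytes)
             {
                 const UINT64 size = (totalBytes - offset < chunkBytes) ? (totalBytes - offset) : chunkBytes;
                 ComPtr<ID3D12Resource> buffer;
                 CD3DX12_HEAP_PROPERTIES heapProps(D3D12_HEAP_TYPE_UPLOAD);
                 CD3DX12_RESOURCE_DESC desc = CD3DX12_RESOURCE_DESC::Buffer(size);
                 HRESULT hr = device->CreateCommittedResource(
                     &heapProps, D3D12_HEAP_FLAG_NONE, &desc,
                     D3D12_RESOURCE_STATE_GENERIC_READ, nullptr, IID_PPV_ARGS(&buffer));
                 if (FAILED(hr))
                     break; // report the failed chunk instead of allocating one huge region
                 chunks.push_back(buffer);
             }
             return chunks;
         }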
  4. I have a problem synchronizing data between shared resources. On the input I receive a D3D11 2D texture, which itself is enabled for sharing and has D3D11_RESOURCE_MISC_SHARED_NTHANDLE in its description. Having a D3D12 device created on the same adapter, I open the resource through a shared handle:

         const CComQIPtr<IDXGIResource1> pDxgiResource1 = pTexture; // <<--- Input texture on 11 device
         HANDLE hTexture;
         pDxgiResource1->CreateSharedHandle(NULL, GENERIC_ALL, NULL, &hTexture);
         CComPtr<ID3D12Resource> pResource; // <<--- Counterpart resource on 12 device
         pDevice->OpenSharedHandle(hTexture, __uuidof(ID3D12Resource), (VOID**)&pResource);

     I tend to keep the mapping between the 11 texture and the 12 resource as they are re-filled with data, but in the context of the current problem it does not matter whether I reuse the mapping or do OpenSharedHandle on every iteration. Further on, I have a command list on the 12 device where I use the 12 resource (pResource) as a copy source; it is an argument in further CopyResource or CopyTextureRegion calls. I don't have any resource barriers in the command list (and my attempts to use any don't change the behavior). My problem is that I can't get the data synchronized: sometimes, and especially initially, the resource has the correct data, but further iterations have issues such as the resource holding stale/old data. I tried to flush the immediate context on the 11 device to make sure that preceding commands are completed. I tried to insert resource barriers at the beginning of the command list to possibly make sure that the source resource has time to receive the correct data. At the same time, I have other code paths which don't do the OpenSharedHandle mapping and instead do additional texture copying and mapping between the original 11 device and an 11on12 device, and the code, including the rest of the logic, works well there. This makes me think that I fail to synchronize the data in the step mentioned above, though I am lost as to how exactly to synchronize outside of the command list. I originally thought that the 12 resource had an IDXGIKeyedMutex implementation, which is the case with sharing-enabled 11 textures, but I don't have the IDXGIKeyedMutex, and I don't see what the D3D12 equivalent is, if any. Could you please advise where to look to fix the sync?
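     For reference, a minimal sketch of one commonly used approach to this situation: a fence can be shared between the 11 and 12 devices (ID3D11Device5::CreateFence with D3D11_FENCE_FLAG_SHARED, opened on the 12 device via ID3D12Device::OpenSharedHandle), so the 12 queue waits until the 11 side has finished producing the texture. All names and the single global fence value are illustrative assumptions:

         // Hypothetical sketch: cross-API fence sync between the 11 producer and the
         // 12 consumer (requires ID3D11Device5 / ID3D11DeviceContext4 interfaces).
         #include <d3d11_4.h>
         #include <d3d12.h>
         #include <atlbase.h> // CComPtr, as in the post

         CComPtr<ID3D11Fence> pFence11;
         CComPtr<ID3D12Fence> pFence12;
         UINT64 FenceValue = 0;

         void CreateSharedFence(ID3D11Device5* pDevice11, ID3D12Device* pDevice12)
         {
             // One-time setup: create the fence on 11, open it on 12.
             pDevice11->CreateFence(0, D3D11_FENCE_FLAG_SHARED, __uuidof(ID3D11Fence), (void**)&pFence11);
             HANDLE hFence = NULL;
             pFence11->CreateSharedHandle(NULL, GENERIC_ALL, NULL, &hFence);
             pDevice12->OpenSharedHandle(hFence, __uuidof(ID3D12Fence), (void**)&pFence12);
             CloseHandle(hFence);
         }

         void SignalAfterProduce(ID3D11DeviceContext4* pContext11)
         {
             // After the 11 side finishes writing the shared texture each iteration:
             pContext11->Signal(pFence11, ++FenceValue);
             pContext11->Flush();
         }

         void WaitBeforeConsume(ID3D12CommandQueue* pQueue12)
         {
             // GPU-side wait, queued before executing the list that copies from pResource:
             pQueue12->Wait(pFence12, FenceValue);
         }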
  5. Recently I read that the APIs fake some behaviors, giving the user false impressions. I assume Shader Model 6 issues the wave instructions to the hardware for real, not faking them. Is Shader Model 6 mature enough? Can I expect the same level of optimization from Model 6 as from Model 5? Should I expect more bugs from 6 than from 5? Would the manufacturers' extensions produce better overall code than Model 6 because, let's say, they know their own hardware better? What would you prefer to use for your project: Shader Model 6 or the GCN shader extensions for DirectX? Which of them is easier to set up and use in Visual Studio (practically)?
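     For concreteness, a tiny HLSL sketch of the SM6 wave intrinsics in question (a compute shader that sums a value across each wave); the buffers and the reduction itself are illustrative, and it targets cs_6_0 with dxc:

         // Minimal SM6 wave-intrinsic example; buffer names are illustrative assumptions.
         RWStructuredBuffer<float> Values   : register(u0);
         RWStructuredBuffer<float> WaveSums : register(u1);

         [numthreads(64, 1, 1)]
         void main(uint3 dtid : SV_DispatchThreadID)
         {
             float v = Values[dtid.x];
             // Sums v across all active lanes of the wave, no shared memory needed.
             float sum = WaveActiveSum(v);
             // The first lane of each wave writes the result.
             if (WaveIsFirstLane())
                 WaveSums[dtid.x / WaveGetLaneCount()] = sum;
         }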
  6. I am trying to get the DirectX Control Panel to let me do something like changing the break severity, but everything is greyed out. Is there any way I can make the DirectX Control Panel work? Here is a screenshot of the control panel.
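     As an aside, break-on-severity can also be configured programmatically through the debug layer's info queue rather than the control panel; a minimal C++ sketch (assumes the debug layer is enabled, error handling omitted):

         // Hypothetical sketch: set debug-layer break severity in code.
         #include <d3d12.h>
         #include <d3d12sdklayers.h>
         #include <wrl/client.h>

         using Microsoft::WRL::ComPtr;

         void ConfigureBreakSeverity(ID3D12Device* device)
         {
             ComPtr<ID3D12InfoQueue> infoQueue;
             if (SUCCEEDED(device->QueryInterface(IID_PPV_ARGS(&infoQueue))))
             {
                 // Break into the debugger on corruption and error messages.
                 infoQueue->SetBreakOnSeverity(D3D12_MESSAGE_SEVERITY_CORRUPTION, TRUE);
                 infoQueue->SetBreakOnSeverity(D3D12_MESSAGE_SEVERITY_ERROR, TRUE);
             }
         }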
  7. I seem to remember seeing a version of the DirectX 11 SDK that was implemented on top of DirectX 12 on the Microsoft website, but I can't seem to find it anymore. Does anyone else remember ever seeing this project, or was it some kind of fever dream I had? It would be a nice tool for slowly porting my massive amount of DirectX 11 code to 12 over time.
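     This sounds like it might be the D3D11On12 layer (Direct3D 11 implemented on top of Direct3D 12, intended for incremental porting). If so, a minimal creation sketch, assuming an existing D3D12 device and command queue; names are illustrative:

         // Hypothetical sketch: layer a D3D11 device over an existing D3D12 device
         // via D3D11On12CreateDevice (d3d11on12.h).
         #include <d3d11on12.h>
         #include <wrl/client.h>

         using Microsoft::WRL::ComPtr;

         void CreateD3D11On12(ID3D12Device* device12, ID3D12CommandQueue* queue12)
         {
             ComPtr<ID3D11Device> device11;
             ComPtr<ID3D11DeviceContext> context11;
             IUnknown* queues[] = { queue12 };
             D3D11On12CreateDevice(
                 device12, D3D11_CREATE_DEVICE_BGRA_SUPPORT,
                 nullptr, 0,   // default feature levels
                 queues, 1, 0, // command queue(s), node mask
                 &device11, &context11, nullptr);
         }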
  8. In my shader code, I need to determine to which AppendStructuredBuffer the data should be appended, and there are more than 30 AppendStructuredBuffers. Is declaring 30+ AppendStructuredBuffers going to overwhelm the shader? Buffer descriptors should consume SGPRs. Is there some other way to distribute the output over multiple AppendStructuredBuffers? Is emulating the push/pop functionality with one single byte-address buffer worth it (see the sketch below)? Wouldn't it be much slower than using AppendStructuredBuffer?
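     A minimal HLSL sketch of the single byte-address-buffer emulation mentioned above: one RWByteAddressBuffer partitioned into per-stream regions, with counters advanced by InterlockedAdd. The layout, stream count and capacities are illustrative assumptions:

         // Hypothetical sketch: emulate N append buffers inside one byte-address buffer.
         // Layout assumption: all per-stream counters first, then fixed-size data regions.
         #define NUM_STREAMS 32
         #define REGION_CAPACITY 4096 // elements per region, illustrative

         RWByteAddressBuffer Streams : register(u0);

         // First NUM_STREAMS * 4 bytes hold the counters; data regions follow.
         static const uint DataBase = NUM_STREAMS * 4;

         void AppendToStream(uint stream, float4 value)
         {
             uint index;
             // Atomically reserve a slot in this stream's region.
             Streams.InterlockedAdd(stream * 4, 1, index);
             if (index < REGION_CAPACITY)
             {
                 uint addr = DataBase + (stream * REGION_CAPACITY + index) * 16; // 16 bytes per float4
                 Streams.Store4(addr, asuint(value));
             }
         }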
  9. I am rendering a large number of objects for a simulation. Each object has instance data, and the size of the instance data times the number of objects is greater than 4 GB. CreateCommittedResource is giving me E_OUTOFMEMORY ("Ran out of memory"). My PC has 128 GB (only about 8% used prior to testing this), and I am running the DirectX app as x64. (I'm creating a CPU-side resource, so GPU RAM doesn't matter here, but I'm using Titan X cards if that's a question.) Simplified test code that recreates the issue (inserted into Microsoft's D3D12HelloWorld):

         unsigned long long int siz = pow(2, 32) + 1024;

         D3D12_FEATURE_DATA_D3D12_OPTIONS options; // MaxGPUVirtualAddressBitsPerResource = 40
         m_device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS, &options, sizeof(options));

         HRESULT oops = m_device->CreateCommittedResource(
             &CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_UPLOAD),
             D3D12_HEAP_FLAG_NONE,
             &CD3DX12_RESOURCE_DESC::Buffer(siz),
             D3D12_RESOURCE_STATE_GENERIC_READ,
             nullptr,
             IID_PPV_ARGS(&m_vertexBuffer));
         if (oops != S_OK)
         {
             printf("Uh Oh");
         }

     I tried enabling "above 4G" in the BIOS, which didn't do anything. I also tested using malloc to allocate a >4 GB array; that worked in the app without issue. Are there more options or build settings that need to be set? (Using Visual Studio 2015.) Other approaches to solving this are welcome too. I thought about splitting the set of items to render into a couple of sets, each smaller than 4 GB, but I would rather have one set of objects. Thank you.
  10. Hey guys! I am not sure how to specify the array slice for the GatherRed function on a Texture2DArray in HLSL. According to MSDN, "location" is one float value. Is it a 3-component float, with the 3rd component being the array slice? Thanks!
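     For what it's worth, Texture2DArray.GatherRed takes a float3 location, with the array slice in the third component; a tiny HLSL sketch (names are illustrative):

         // Texture2DArray.GatherRed: xy = UV, z = array slice.
         Texture2DArray Tex : register(t0);
         SamplerState   Samp : register(s0);

         float4 GatherSlice(float2 uv, uint slice)
         {
             // Returns the red component of the four texels bilinear filtering would use.
             return Tex.GatherRed(Samp, float3(uv, slice));
         }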
  11. I have a WinForms project that uses SharpDX (DirectX 12). The SharpDX library provides a RenderForm (based on a System.Windows.Forms.Form). Now I need to convert the project to WPF instead. What is the best way to do this? I have seen someone pointing to a library, SharpDX.WPF at CodePlex, but according to their info it only provides support up to DX11. (Sorry if this has been asked before; the search function seems to be down at the moment.)
  12. Would it be a problem to create ~50 uninitialized arrays of ~300,000 cells each in HLSL and then use them for my algorithm (which is what I currently do in C++, where I had stack overflow problems because of the large arrays)? It is something internal to the shader: the shader will create the arrays at the beginning, use them, and not need them anymore. It doesn't take data for the arrays from the outside world, and it doesn't give data from the arrays back to the outside world either; nothing is shared. My question is not very specific; it is about memory-consumption considerations when writing shaders in general, because my algorithm still has to be polished. I will leave the writing of the HLSL for when the algorithm is totally finished and working (because I expect writing HLSL to be just as unpleasant as GLSL). Still, it is useful for me to know beforehand what problems to consider.
  13. Hi. I wanted to experiment with D3D12 development and decided to run some tutorials: Microsoft DirectX-Graphics-Samples, Braynzar Soft, 3dgep... Whatever sample I run, I get the same crash. The whole initialization process goes well, with no errors and OK return codes, but as soon as the Present method is invoked on the swap chain, I encounter a crash with the following call stack: https://drive.google.com/open?id=10pdbqYEeRTZA5E6Jm7U5Dobpn-KE9uOg

     The crash is an access violation on a null pointer (with an offset of 0x80). I'm working on a notebook, a Toshiba Qosmio X870 with two GPUs: an integrated Intel HD 4000 and a dedicated NVIDIA GTX 670M (Fermi-based). The HD 4000 is DX11 only, and as far as I understand the GTX 670M is DX12 with feature level 11_0. I checked that the right adapter is chosen by the sample, and when the D3D12 device is requested in the sample with an 11_0 feature level, it is created with no problem. Same for all the required interfaces (swap chain, command queue...). I tried a lot of things to solve the problem or get some info: forcing the notebook to always use the NVIDIA GPU, disabling the debug layer, asking for a different feature level (by the way, 11_0 is the only one that allows me to create the device; any other feature level fails at device creation)... I have the latest NVIDIA drivers (391.35), the latest Windows 10 SDK (10.0.17134.0), and I'm working under Visual Studio 2017 Community. Thanks to anybody who can help me find the problem...
  14. Hi guys! In a lot of samples found on the internet, when people initialize D3D12_SHADER_RESOURCE_VIEW_DESC with a resource array size of 1, they normally set its dimension to Texture2D. If the array size is greater than 1, they use the Texture2DArray dimension, for example. If I declare the SRV in the shader as Texture2DArray but create the SRV as Texture2D (the array has only 1 texture), following the same principle as above, would this be OK? I guess this should work as long as I am using array index 0 to access my texture? Thanks!
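     For reference, the matching setup (a Texture2DArray view dimension with ArraySize = 1, addressed with index 0 in the shader) looks like this; a minimal C++ sketch with illustrative format and names:

         // Hypothetical sketch: describe a one-slice texture as a Texture2DArray SRV.
         #include <d3d12.h>

         void CreateArraySrvForSingleSlice(ID3D12Device* device, ID3D12Resource* texture,
                                           D3D12_CPU_DESCRIPTOR_HANDLE dest)
         {
             D3D12_SHADER_RESOURCE_VIEW_DESC desc = {};
             desc.Format = DXGI_FORMAT_R8G8B8A8_UNORM; // illustrative
             desc.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE2DARRAY;
             desc.Shader4ComponentMapping = D3D12_DEFAULT_SHADER_4_COMPONENT_MAPPING;
             desc.Texture2DArray.MostDetailedMip = 0;
             desc.Texture2DArray.MipLevels = 1;
             desc.Texture2DArray.FirstArraySlice = 0;
             desc.Texture2DArray.ArraySize = 1; // single slice, index 0 in the shader
             device->CreateShaderResourceView(texture, &desc, dest);
         }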
  15. Hey! What is the recommended upper count for commands to record in a command list bundle? According to MSDN it is supposed to be a small number, but it does not elaborate on the actual number. I am deciding whether I should pre-record commands in the command buffer and use ExecuteIndirect, or maybe use bundles instead. The number of commands to record in my case could vary greatly. Thanks!
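     For context, a minimal C++ sketch of recording a small bundle and replaying it from a direct command list; names are illustrative:

         // Hypothetical sketch: record a few commands into a bundle, replay with ExecuteBundle.
         #include <d3d12.h>
         #include <wrl/client.h>

         using Microsoft::WRL::ComPtr;

         ComPtr<ID3D12GraphicsCommandList> RecordBundle(
             ID3D12Device* device, ID3D12CommandAllocator* bundleAllocator, // TYPE_BUNDLE allocator
             ID3D12PipelineState* pso, D3D12_VERTEX_BUFFER_VIEW vbv)
         {
             ComPtr<ID3D12GraphicsCommandList> bundle;
             device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_BUNDLE,
                                       bundleAllocator, pso, IID_PPV_ARGS(&bundle));
             // A handful of draw-related commands, per the "keep bundles small" guidance.
             bundle->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
             bundle->IASetVertexBuffers(0, 1, &vbv);
             bundle->DrawInstanced(3, 1, 0, 0);
             bundle->Close();
             return bundle;
         }

         // Later, inside the frame's direct command list:
         //     directList->ExecuteBundle(bundle.Get());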
  16. Hi, I finally managed to get the DX11-emulating Vulkan device working, but everything is flipped vertically now because Vulkan has a different clip space. What are the best practices out there to keep these implementations consistent? I tried using a vertically flipped viewport, and while it works on an Nvidia 1050, the Vulkan debug layer is throwing error messages that this is not supported in the spec, so it might not work on other devices. There is also the possibility of flipping the clip-space Y coordinate of the position before writing it out from the vertex shader, but that requires changing and recompiling every shader. I could also bake it into the camera projection matrices, though I want to avoid that because then I would need to track down every place in the whole engine where I upload matrices... Any chance of an easy extension or something? If not, I will probably go with changing the vertex shaders.
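     For reference, a minimal sketch of the flipped-viewport approach: a negative viewport height is legal when VK_KHR_maintenance1 is enabled (promoted to core in Vulkan 1.1), and without that extension the validation layers flag exactly this kind of error:

         // Hypothetical sketch: flip the viewport by giving it a negative height.
         // Requires VK_KHR_maintenance1 (or Vulkan >= 1.1) to be valid.
         #include <vulkan/vulkan.h>

         void SetFlippedViewport(VkCommandBuffer cmd, float width, float height)
         {
             VkViewport viewport{};
             viewport.x = 0.0f;
             viewport.y = height;       // start at the bottom...
             viewport.width = width;
             viewport.height = -height; // ...and flip upward
             viewport.minDepth = 0.0f;
             viewport.maxDepth = 1.0f;
             vkCmdSetViewport(cmd, 0, 1, &viewport);
         }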
  17. While working on a project using D3D12, I was getting an exception thrown while trying to get a D3D12_CPU_DESCRIPTOR_HANDLE. The project uses plain C, so it uses the COBJMACROS. The following application replicates the problem happening in the project:

         #define COBJMACROS
         #pragma warning(push, 3)
         #include <Windows.h>
         #include <d3d12.h>
         #include <dxgi1_4.h>
         #pragma warning(pop)

         IDXGIFactory4 *factory;
         ID3D12Device *device;
         ID3D12DescriptorHeap *rtv_heap;

         int WINAPI wWinMain(HINSTANCE hinst, HINSTANCE pinst, PWSTR cline, int cshow)
         {
             (hinst), (pinst), (cline), (cshow);

             HRESULT hr = CreateDXGIFactory1(&IID_IDXGIFactory4, (void **)&factory);
             hr = D3D12CreateDevice(0, D3D_FEATURE_LEVEL_11_0, &IID_ID3D12Device, (void **)&device);

             D3D12_DESCRIPTOR_HEAP_DESC desc;
             desc.NumDescriptors = 1;
             desc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_RTV;
             desc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE;
             desc.NodeMask = 0;
             hr = ID3D12Device_CreateDescriptorHeap(device, &desc, &IID_ID3D12DescriptorHeap, (void **)&rtv_heap);

             D3D12_CPU_DESCRIPTOR_HANDLE rtv = ID3D12DescriptorHeap_GetCPUDescriptorHandleForHeapStart(rtv_heap);
             (rtv);
         }

     The call to ID3D12DescriptorHeap_GetCPUDescriptorHandleForHeapStart throws an exception. Stepping into the disassembly for ID3D12DescriptorHeap_GetCPUDescriptorHandleForHeapStart shows that the error occurs on the instruction mov qword ptr [rdx],rax, which seems odd since rdx doesn't appear to be set. Any help would be greatly appreciated. Thank you.
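     If this is the known quirk in some d3d12.h C bindings, where GetCPUDescriptorHandleForHeapStart is declared as returning the struct by value while the actual ABI returns it through a hidden pointer parameter (which would fit mov qword ptr [rdx],rax storing through a garbage pointer), one workaround that has been used is to call the vtable entry with the corrected signature; a hedged C sketch:

         /* Hypothetical workaround sketch, assuming the struct-return ABI mismatch in
          * older d3d12.h C bindings: call the vtable function with the hidden
          * return-value pointer passed explicitly. */
         D3D12_CPU_DESCRIPTOR_HANDLE rtv;
         typedef void (STDMETHODCALLTYPE *GetHandleFix)(
             ID3D12DescriptorHeap *heap, D3D12_CPU_DESCRIPTOR_HANDLE *out);
         ((GetHandleFix)rtv_heap->lpVtbl->GetCPUDescriptorHandleForHeapStart)(rtv_heap, &rtv);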
  18. As far as I understand, there is no real random or noise function in HLSL. I have a big water polygon, and I'd like to fake water-wave normals in my pixel shader. I know it's not efficient, and the standard way is really to use a pre-calculated noise texture, but anyway... Does anyone have any quick-and-dirty HLSL shader code that fakes water normals and doesn't look too repetitious?
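     Not authoritative, but a small HLSL sketch of the usual quick-and-dirty approach: a hash-based value noise, two scrolling octaves, and a normal built from finite differences of the resulting height field (all constants are arbitrary tuning values, and 'time' is assumed to come from a constant buffer):

         // Quick-and-dirty sketch: fake water normals from hash-based value noise.
         float hash(float2 p)
         {
             // Classic sin-dot hash; cheap, but shows some patterns up close.
             return frac(sin(dot(p, float2(12.9898, 78.233))) * 43758.5453);
         }

         float valueNoise(float2 p)
         {
             float2 i = floor(p);
             float2 f = frac(p);
             f = f * f * (3.0 - 2.0 * f); // smooth interpolation
             return lerp(lerp(hash(i),                hash(i + float2(1, 0)), f.x),
                         lerp(hash(i + float2(0, 1)), hash(i + float2(1, 1)), f.x), f.y);
         }

         float waveHeight(float2 p, float time)
         {
             // Two scrolling octaves to reduce visible repetition.
             return       valueNoise(p * 0.8 + float2(time * 0.05, time * 0.02))
                  + 0.5 * valueNoise(p * 1.7 - float2(time * 0.03, time * 0.06));
         }

         float3 FakeWaterNormal(float2 worldXZ, float time)
         {
             const float e = 0.05; // finite-difference step
             float hx = waveHeight(worldXZ + float2(e, 0), time) - waveHeight(worldXZ - float2(e, 0), time);
             float hz = waveHeight(worldXZ + float2(0, e), time) - waveHeight(worldXZ - float2(0, e), time);
             // Normal from the height-field gradient; 0.25 controls apparent steepness.
             return normalize(float3(-hx, 0.25, -hz));
         }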
  19. Some people say "discard" has no positive effect on optimization; other people say it will at least spare the texture fetches.

         if (color.A < 0.1f)
         {
             //discard;
             clip(-1);
         }
         // tons of texture reads following here
         // and loops too

     Some people say that "discard" only masks out the output of the pixel shader, while still evaluating all the statements after the "discard" instruction. MSDN says: "discard: Do not output the result of the current pixel. clip: Discards the current pixel." As usual it is unclear, but it suggests that "clip" could discard the whole pixel (maybe stopping execution too). I think that, at least for thermal and energy-consumption reasons, the GPU should not evaluate the statements after "discard", but some people on the internet say the GPU computes the statements anyway. What I am more worried about are the texture fetches after discard/clip. (What if, after discard, I have an expensive branch decision that makes the approved cheap-branch neighbor pixels stall for nothing? This is crazy.)
  20. I have a problem. My shaders are huge, in the sense that they have a lot of code inside them. Many of my pixels should be completely discarded. I could use a comparison and discard at the very beginning of the shader, but as far as I understand, the discard statement does not save workload at all, as the pixel has to stall until its long-running neighbor shaders complete. Initially I wanted to use stencil to discard pixels before the execution flow enters the shader, even before the GPU distributes/allocates resources for the shader, avoiding stalls in the pixel shader execution flow, because I had assumed that the depth/stencil test discards pixels before the pixel shader; but I see now that it happens inside the very last output-merger stage. It seems extremely inefficient to render a little mirror in a scene with a big viewport that way. Why did they put the stencil test in the output merger anyway? Handling of stencil is so limited compared to other resources. Do people use the stencil functionality at all for games, or do they prefer discard/clip? Will the GPU stall the pixel if I issue a discard at the very beginning of the pixel shader, or will the GPU already start using the freed-up resources to render another pixel?!
  21. I'm wondering when upload buffers are copied into GPU memory. Basically, I want to pool buffers, and I want to know when I can reuse them and write new data into them.
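     A minimal C++ sketch of the usual fence-based answer: an upload buffer's contents are read when the GPU executes the submitted commands, so a buffer can be rewritten once a fence signaled after ExecuteCommandLists has completed. Names are illustrative, and a real pool would track one fence value per buffer:

         // Hypothetical sketch: gate upload-buffer reuse on a fence so a buffer is
         // only rewritten after the GPU has finished the commands that read from it.
         #include <Windows.h>
         #include <d3d12.h>

         void SubmitAndWaitForUpload(ID3D12CommandQueue* queue, ID3D12CommandList* const* lists,
                                     UINT count, ID3D12Fence* fence, UINT64& fenceValue, HANDLE event)
         {
             queue->ExecuteCommandLists(count, lists);
             queue->Signal(fence, ++fenceValue);

             if (fence->GetCompletedValue() < fenceValue)
             {
                 // Block until the GPU passes the signal; a real pool would instead
                 // recycle only those buffers whose fence values have completed.
                 fence->SetEventOnCompletion(fenceValue, event);
                 WaitForSingleObject(event, INFINITE);
             }
             // Safe to Map/write the upload buffer again from here.
         }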
  22. Can I extend a vertex buffer in DirectX 12 (actually SharpDX with DX12)? I am already set up and happily rendering along when I suddenly realise I need more vertices to render. I could of course scrap the vertex buffer and create a new, larger one, but the problem is that I have already discarded my vertex data on the CPU side to save memory. So... I could create another vertex buffer instead and make sure to use both, one after the other, when rendering my frames. That would be OK, but what if I need to do this extension 1000 times? I would end up with 1001 vertex buffers, and I guess that would really kill performance. What if I could simply extend my vertex buffer? Is there a way? Let me guess the answer here: I'm guessing there isn't. But maybe there is a way to copy the older vertex buffer into a new, larger one? I have already used an upload buffer and copied it to my vertex buffer. Can I simply use the same technique for copying my old vertex buffer into a new one? Does anyone have example code where this is being done (see the sketch below)? Well, if I can't extend my vertex buffer in place, I guess this is what I will try instead. Have a great weekend!
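     A minimal C++ sketch of that copy-to-a-larger-buffer idea on the D3D12 side (SharpDX exposes the same calls): create the bigger buffer, CopyBufferRegion the old contents into it, and transition resource states around the copy. Names and the assumed initial states are illustrative:

         // Hypothetical sketch: grow a vertex buffer by copying the old one into a
         // larger default-heap buffer on the GPU with CopyBufferRegion.
         #include <d3d12.h>
         #include "d3dx12.h" // CD3DX12 helpers

         void GrowVertexBuffer(ID3D12Device* device, ID3D12GraphicsCommandList* cmd,
                               ID3D12Resource* oldVB, UINT64 oldSize, UINT64 newSize,
                               ID3D12Resource** newVB)
         {
             CD3DX12_HEAP_PROPERTIES heapProps(D3D12_HEAP_TYPE_DEFAULT);
             CD3DX12_RESOURCE_DESC desc = CD3DX12_RESOURCE_DESC::Buffer(newSize);
             device->CreateCommittedResource(&heapProps, D3D12_HEAP_FLAG_NONE, &desc,
                                             D3D12_RESOURCE_STATE_COPY_DEST, nullptr,
                                             IID_PPV_ARGS(newVB));

             // Old buffer: vertex-buffer state -> copy source.
             CD3DX12_RESOURCE_BARRIER toSrc = CD3DX12_RESOURCE_BARRIER::Transition(
                 oldVB, D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER, D3D12_RESOURCE_STATE_COPY_SOURCE);
             cmd->ResourceBarrier(1, &toSrc);

             cmd->CopyBufferRegion(*newVB, 0, oldVB, 0, oldSize);

             // New buffer: copy dest -> vertex-buffer state, ready for IASetVertexBuffers.
             CD3DX12_RESOURCE_BARRIER toVB = CD3DX12_RESOURCE_BARRIER::Transition(
                 *newVB, D3D12_RESOURCE_STATE_COPY_DEST, D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER);
             cmd->ResourceBarrier(1, &toVB);
             // Release the old buffer once the GPU has finished this copy.
         }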
  23. AMD forces me to use MipLevels in order to be able to read from a heap previously used as an RTV. Intel's integrated GPU works fine with MipLevels = 1 inside the D3D12_RESOURCE_DESC; for AMD I have to set it to 0 (or 2). MSDN says 0 means the maximum number of levels. With MipLevels = 1, AMD renders fine to the RTV, but reading from the RTV shows the image reordered. Is setting MipLevels to something other than 1 going to cost me too much memory or execution time during rendering to RTVs? Because I really don't need mipmaps at all (not for 99% of my app). (I use the same 2D D3D12_RESOURCE_DESC for both the SRV and the RTV sharing the same heap. Using 1 for MipLevels in that D3D12_RESOURCE_DESC gives me results like in the photos attached below. Using 0 or 2 makes AMD read fine from the RTV. I wish I could sort this out somehow, but in the last two days I've tried almost everything, and this is the only way it works on my machine.)
  24. Finally the ray-tracing geekiness starts: https://blogs.msdn.microsoft.com/directx/2018/03/19/announcing-microsoft-directx-raytracing/ Let's collect some interesting articles. I'll start with: https://www.remedygames.com/experiments-with-directx-raytracing-in-remedys-northlight-engine/
  25. Hi guys, I am implementing a parallel prefix sum with DirectCompute, using the GPU Gems 3 article as a reference for the CUDA implementation. In the article, the authors add logic to handle shared-memory bank conflicts. Mark Harris, one of the authors, later claimed on Stack Overflow that you no longer need to handle bank conflicts explicitly in CUDA. How about DirectCompute? Do you need to manage this yourself? Is there a difference between the D3D10/D3D11/D3D12 versions? Thanks!
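     For reference, a small HLSL sketch of the padding trick from the GPU Gems 3 article, ported to groupshared memory in a compute shader; the scan phases themselves are elided, the bank constants match the article's CUDA version, and whether current hardware still needs the padding is exactly the open question:

         // Sketch of GPU Gems 3 bank-conflict padding applied to HLSL groupshared memory.
         #define NUM_BANKS 32
         #define LOG_NUM_BANKS 5
         #define CONFLICT_FREE_OFFSET(n) ((n) >> LOG_NUM_BANKS)

         #define N 512
         groupshared float temp[N + (N >> LOG_NUM_BANKS)]; // padded scratch for the scan

         StructuredBuffer<float>   Input  : register(t0);
         RWStructuredBuffer<float> Output : register(u0);

         [numthreads(N / 2, 1, 1)]
         void main(uint tid : SV_GroupThreadID)
         {
             // Each thread loads two elements at padded addresses.
             uint ai = tid;
             uint bi = tid + N / 2;
             temp[ai + CONFLICT_FREE_OFFSET(ai)] = Input[ai];
             temp[bi + CONFLICT_FREE_OFFSET(bi)] = Input[bi];
             GroupMemoryBarrierWithGroupSync();

             // ... up-sweep / down-sweep phases of the scan would operate on the
             // same padded indices, i.e. temp[i + CONFLICT_FREE_OFFSET(i)] ...

             Output[ai] = temp[ai + CONFLICT_FREE_OFFSET(ai)];
             Output[bi] = temp[bi + CONFLICT_FREE_OFFSET(bi)];
         }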