Search the Community

Showing results for tags 'Vulkan'.



Found 149 results

  1. BearishSun

    bs::framework v1.1 released

    The first major update for bs::framework has launched. The update includes a brand-new particle system, decals, various renderer enhancements and over 150 other new additions and changes! For more information, check out the detailed release notes. General information: bs::framework is a C++ game development framework that aims to provide all the low-level systems and features you need to develop games, tools or engines. It was built from the ground up to replace older similar libraries. It provides a modern API through C++14, extensive documentation and a cleaner, more extensible design, while also focusing on high performance with its heavily multi-threaded core and use of modern technologies such as Vulkan and physically based rendering. www.bsframework.io
  3. L. Spiro

    Home: https://www.khronos.org/vulkan/
    SDK: http://lunarg.com/vulkan-sdk/
    AMD drivers: http://gpuopen.com/gaming-product/vulkan/ (Vulkan support is now part of AMD’s official drivers, so simply getting the latest drivers for your card should give you Vulkan support.)
    NVIDIA drivers: https://developer.nvidia.com/vulkan-driver (Vulkan support is likewise part of NVIDIA’s official drivers, so the latest drivers for your card should give you Vulkan support.)
    Intel drivers: http://blogs.intel.com/evangelists/2016/02/16/intel-open-source-graphics-drivers-now-support-vulkan/
    Quick reference: https://www.khronos.org/registry/vulkan/specs/1.0/refguide/Vulkan-1.0-web.pdf
    References:
    • https://www.khronos.org/registry/vulkan/specs/1.0/apispec.html
    • https://matthewwellings.com/blog/the-new-vulkan-coordinate-system/
    GLSL-to-SPIR-V: https://github.com/KhronosGroup/glslang
    Sample code:
    • https://github.com/LunarG/VulkanSamples
    • https://github.com/SaschaWillems/Vulkan
    • https://github.com/nvpro-samples
    • https://github.com/nvpro-samples/gl_vk_chopper
    • https://github.com/nvpro-samples/gl_vk_threaded_cadscene
    • https://github.com/nvpro-samples/gl_vk_bk3dthreaded
    • https://github.com/nvpro-samples/gl_vk_supersampled
    • https://github.com/McNopper/Vulkan
    • https://github.com/GPUOpen-LibrariesAndSDKs/HelloVulkan
    C++:
    • https://github.com/nvpro-pipeline/vkcpp
    • https://developer.nvidia.com/open-source-vulkan-c-api
    Getting started:
    • https://vulkan-tutorial.com/
    • https://renderdoc.org/vulkan-in-30-minutes.html
    • https://www.khronos.org/news/events/vulkan-webinar
    • https://developer.nvidia.com/engaging-voyage-vulkan
    • https://developer.nvidia.com/vulkan-shader-resource-binding
    • https://developer.nvidia.com/vulkan-memory-management
    • https://developer.nvidia.com/opengl-vulkan
    • https://github.com/vinjn/awesome-vulkan
    Videos: https://www.youtube.com/playlist?list=PLYO7XTAX41FPg08uM_bgPE9HLgDAyzDaZ
    Utilities:
    • https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator (AMD memory allocator.)
    • https://github.com/GPUOpen-LibrariesAndSDKs/Anvil (AMD miniature Vulkan engine/framework.)
  4. Introduction

    Explicit resource state management and synchronization is one of the main advantages, and one of the main challenges, that modern graphics APIs such as Direct3D12 and Vulkan offer application developers. It makes rendering command recording very efficient, but getting state management right is a challenging problem. This article explains why explicit state management is important and introduces a solution implemented in Diligent Engine, a modern cross-platform low-level graphics library. Diligent Engine has Direct3D11, Direct3D12, OpenGL/GLES and Vulkan backends and supports Windows Desktop, Universal Windows, Linux, Android, Mac and iOS platforms. Its full source code is available on GitHub and is free to use.

    Synchronization in Next-Gen APIs

    Modern graphics applications can best be described as client-server systems where the CPU is a client that records rendering commands and puts them into queue(s), and the GPU is a server that asynchronously pulls commands from the queue(s) and processes them. As a result, commands are not executed immediately when the CPU issues them, but rather some time later (typically one to two frames) when the GPU gets to the corresponding point in the queue. Besides that, GPU architecture is very different from that of a CPU because of the kind of problems GPUs are designed to handle. While CPUs are great at running algorithms with lots of flow-control constructs (branches, loops, etc.), such as handling events in an application input loop, GPUs are more efficient at crunching numbers by executing the same computation thousands or even millions of times. Of course, there is a little oversimplification in that statement, as modern CPUs also have wide SIMD (single instruction, multiple data) units that let them perform such computations efficiently as well. Still, GPUs are at least an order of magnitude faster at these kinds of problems.

    The main challenge that both CPUs and GPUs need to solve is memory latency. CPUs are out-of-order machines with beefy cores and large caches that use fancy prefetching and branch-prediction circuitry to make sure that data is available when a core actually needs it. GPUs, in contrast, are in-order beasts with small caches, thousands of tiny cores and very deep pipelines. They don't use branch prediction or prefetching, but instead maintain tens of thousands of threads in flight and are capable of switching between threads instantaneously: when one group of threads waits on a memory request, the GPU can simply switch to another group, provided it has enough work.

    When programming a CPU (by which I mean an x86 CPU; things may be a little more involved for ARM), the hardware does a lot of things that we usually take for granted. For instance, after one core has written something to a memory address, we know that another core can immediately read the same memory. The cache line containing the data will need to do a little bit of travelling through the CPU, but eventually another core will get the correct piece of information with no extra effort from the application. GPUs, in contrast, give very few such guarantees. In many cases, you cannot expect that a write is visible to subsequent reads unless special care is taken by the application. Besides that, the data may need to be converted from one form to another before it can be consumed by the next step. A few examples where explicit synchronization may be required:

    • After data has been written to a texture or a buffer through an unordered access view (UAV in Direct3D) or an image (in Vulkan/OpenGL terminology), the GPU may need to wait until all writes are complete and flush the caches to memory before the same texture or buffer can be read by another shader.
    • After a shadow-map rendering command is executed, the GPU may need to wait until rasterization and all writes are complete, flush the caches and change the texture layout to a format optimized for sampling before that shadow map can be used in a lighting shader.
    • If the CPU needs to read data previously written by the GPU, it may need to invalidate that memory region to make sure the caches see the updated bytes.

    These are just a few examples of synchronization dependencies that a GPU needs to resolve. Traditionally, all these problems were handled by the API/driver and were hidden from the developer. Old-school implicit APIs such as Direct3D11 and OpenGL/GLES work that way. This approach, while convenient from a developer's point of view, has major limitations that result in suboptimal performance. First, a driver or API does not know what the developer's intent is and always has to assume the worst-case scenario to guarantee correctness. For instance, if one shader writes to one region of a UAV, but the next shader reads from another region, the driver must always insert a barrier to guarantee that all writes are complete and visible, because it simply can't know that the regions do not overlap and the barrier is not really necessary.

    The biggest problem, though, is that this approach makes parallel command recording almost useless. Consider a scenario where one thread records commands to render a shadow map, while a second thread simultaneously records commands to use this shadow map in a forward rendering pass. The first thread needs the shadow map to be in a depth-stencil-writable state, while the second thread needs it in a shader-readable state. The problem is that the second thread does not know what the original state of the shadow map is. So when the application submits the second command buffer for execution, the API needs to find out what the actual state of the shadow-map texture is and patch the command buffer with the right state transition. It needs to do this not only for our shadow map but for every other resource that the command list may use. This is a significant serialization bottleneck, and there was no way to solve it in the old APIs.

    The solution to these problems is given by the next-generation APIs (Direct3D12 and Vulkan), which make all resource transitions explicit. It is now up to the application to track the states of all resources and ensure that all required barriers/transitions are executed. In the example above, the application will know that when the shadow map is used in the forward pass it will be in the depth-stencil-writable state, so the barrier can be inserted right away, without waiting for the first command buffer to be recorded or submitted. The downside is that the application is now responsible for tracking all resource states, which can be a significant burden. Let's now take a closer look at how synchronization is implemented in Vulkan and Direct3D12.

    Synchronization in Vulkan

    Vulkan enables very fine-grained control over synchronization operations and provides tools to individually tweak the following aspects:

    • Execution dependencies, i.e. which set of operations must be completed before another set of operations can begin.
    • Memory dependencies, i.e. which memory writes must be made available to subsequent reads.
    • Layout transitions, i.e. what texture memory layout transformations must be performed, if any.

    Execution dependencies are expressed as dependencies between pipeline stages, which naturally map to the traditional GPU pipeline. The type of memory access is defined by the VkAccessFlagBits enum. Certain access types are only valid for specific pipeline stages.
    All valid combinations are listed in Section 6.1.3 of the Vulkan spec and are also given in the following table:

    | Access flag (VK_ACCESS_)           | Pipeline stages (VK_PIPELINE_STAGE_)                | Access type description
    |------------------------------------|-----------------------------------------------------|----------------------------------------------------------------
    | INDIRECT_COMMAND_READ_BIT          | DRAW_INDIRECT_BIT                                   | Read access to indirect draw/dispatch command data stored in a buffer
    | INDEX_READ_BIT                     | VERTEX_INPUT_BIT                                    | Read access to an index buffer
    | VERTEX_ATTRIBUTE_READ_BIT          | VERTEX_INPUT_BIT                                    | Read access to a vertex buffer
    | UNIFORM_READ_BIT                   | ANY_SHADER_BIT                                      | Read access to a uniform (constant) buffer
    | SHADER_READ_BIT                    | ANY_SHADER_BIT                                      | Read access to a storage buffer (buffer UAV), uniform texel buffer (buffer SRV), sampled image (texture SRV) or storage image (texture UAV)
    | SHADER_WRITE_BIT                   | ANY_SHADER_BIT                                      | Write access to a storage buffer (buffer UAV) or storage image (texture UAV)
    | INPUT_ATTACHMENT_READ_BIT          | FRAGMENT_SHADER_BIT                                 | Read access to an input attachment (render target) during fragment shading
    | COLOR_ATTACHMENT_READ_BIT          | COLOR_ATTACHMENT_OUTPUT_BIT                         | Read access to a color attachment (render target), such as via blending or logic operations
    | COLOR_ATTACHMENT_WRITE_BIT         | COLOR_ATTACHMENT_OUTPUT_BIT                         | Write access to a color attachment (render target) during a render pass or via certain operations such as blending
    | DEPTH_STENCIL_ATTACHMENT_READ_BIT  | EARLY_FRAGMENT_TESTS_BIT or LATE_FRAGMENT_TESTS_BIT | Read access to a depth/stencil buffer via depth/stencil operations
    | DEPTH_STENCIL_ATTACHMENT_WRITE_BIT | EARLY_FRAGMENT_TESTS_BIT or LATE_FRAGMENT_TESTS_BIT | Write access to a depth/stencil buffer via depth/stencil operations
    | TRANSFER_READ_BIT                  | TRANSFER_BIT                                        | Read access to an image (texture) or buffer in a copy operation
    | TRANSFER_WRITE_BIT                 | TRANSFER_BIT                                        | Write access to an image (texture) or buffer in a clear or copy operation
    | HOST_READ_BIT                      | HOST_BIT                                            | Read access by the host
    | HOST_WRITE_BIT                     | HOST_BIT                                            | Write access by the host

    Table 1. Valid combinations of access flags and pipeline stages. ANY_SHADER_BIT stands for VERTEX_SHADER_BIT, TESSELLATION_CONTROL_SHADER_BIT, TESSELLATION_EVALUATION_SHADER_BIT, GEOMETRY_SHADER_BIT, FRAGMENT_SHADER_BIT or COMPUTE_SHADER_BIT.

    As you can see, most access flags correspond 1:1 to a pipeline stage. For example, quite naturally, vertex indices can only be read at the vertex input stage, while the final color can only be written at the color attachment (render target in Direct3D12 terminology) output stage. For certain access types, you can precisely specify which stage will use that access type. Most importantly, for shader reads (such as texture sampling), writes (UAV/image stores) and uniform buffer access, it is possible to tell the system exactly which shader stages will be using that access type. For depth-stencil read/write access, it is possible to distinguish whether the access happens at the early or the late fragment test stage. Quite honestly, I can't really come up with examples where this flexibility may be useful and result in a measurable performance improvement. Note that it is against the spec to specify an access flag for a stage that does not support that type of access (such as depth-stencil write access for the vertex shader stage).

    An application may use these tools to very precisely specify dependencies between stages. For example, it may request that writes to a buffer from the vertex shader stage are made available to reads from the fragment shader in a subsequent draw call. An advantage here is that since the dependency starts at the fragment shader stage, the driver will not need to synchronize the execution of the vertex shader stage, potentially saving some GPU cycles. For image (texture) resources, a synchronization barrier also defines layout transitions, i.e. potential data reorganization that the GPU may need to perform to support the requested access type.
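    The near-1:1 pairing in Table 1 can be captured in code. The sketch below is a minimal, self-contained illustration, not real Vulkan code: the enum names mirror the table, but the bit values are arbitrary stand-ins for VkAccessFlagBits/VkPipelineStageFlagBits, and ValidStagesForAccess is a hypothetical helper of the kind a validation layer might use to reject an access flag specified for an unsupported stage.

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Illustrative stand-ins for VkAccessFlagBits / VkPipelineStageFlagBits.
    // Bit values here are arbitrary; only the pairing mirrors Table 1.
    enum AccessFlag : uint32_t {
        ACCESS_INDEX_READ             = 1u << 0,
        ACCESS_VERTEX_ATTRIBUTE_READ  = 1u << 1,
        ACCESS_UNIFORM_READ           = 1u << 2,
        ACCESS_COLOR_ATTACHMENT_WRITE = 1u << 3,
        ACCESS_TRANSFER_READ          = 1u << 4,
    };

    enum StageFlag : uint32_t {
        STAGE_VERTEX_INPUT            = 1u << 0,
        STAGE_ANY_SHADER              = 1u << 1, // any of the shader stages
        STAGE_COLOR_ATTACHMENT_OUTPUT = 1u << 2,
        STAGE_TRANSFER                = 1u << 3,
    };

    // Returns the pipeline stages at which the given access type is valid,
    // following the pairing in Table 1 (0 = no valid stage).
    uint32_t ValidStagesForAccess(AccessFlag access) {
        switch (access) {
            case ACCESS_INDEX_READ:             return STAGE_VERTEX_INPUT;
            case ACCESS_VERTEX_ATTRIBUTE_READ:  return STAGE_VERTEX_INPUT;
            case ACCESS_UNIFORM_READ:           return STAGE_ANY_SHADER;
            case ACCESS_COLOR_ATTACHMENT_WRITE: return STAGE_COLOR_ATTACHMENT_OUTPUT;
            case ACCESS_TRANSFER_READ:          return STAGE_TRANSFER;
        }
        return 0;
    }

    int main() {
        // A barrier declaring index-buffer reads at the transfer stage
        // would violate Table 1 and can be caught like this:
        assert((ValidStagesForAccess(ACCESS_INDEX_READ) & STAGE_TRANSFER) == 0);
        assert(ValidStagesForAccess(ACCESS_UNIFORM_READ) == STAGE_ANY_SHADER);
        return 0;
    }
    ```

    The switch makes the "uniquely defined" nature of most rows explicit: for every access flag there is exactly one legal stage mask.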
    Section 11.4 of the Vulkan spec describes the available layouts and how they must be used. Since every layout can only be used at certain pipeline stages (for example, the color-attachment-optimal layout can only be used by the color attachment read/write stage), and every pipeline stage allows only a few access types, we can list all allowed access flags for every layout, as presented in the table below:

    | Image layout (VK_IMAGE_LAYOUT_)  | Access (VK_ACCESS_)                                                   | Description
    |----------------------------------|-----------------------------------------------------------------------|----------------------------------------------------
    | UNDEFINED                        | n/a                                                                   | Can only be used as the initial layout when creating an image or as the old layout in an image transition. When transitioning out of this layout, the contents of the image are not preserved.
    | GENERAL                          | Any                                                                   | Supports all types of device access.
    | COLOR_ATTACHMENT_OPTIMAL         | COLOR_ATTACHMENT_READ_BIT, COLOR_ATTACHMENT_WRITE_BIT                 | Must only be used as a color attachment.
    | DEPTH_STENCIL_ATTACHMENT_OPTIMAL | DEPTH_STENCIL_ATTACHMENT_READ_BIT, DEPTH_STENCIL_ATTACHMENT_WRITE_BIT | Must only be used as a depth-stencil attachment.
    | DEPTH_STENCIL_READ_ONLY_OPTIMAL  | DEPTH_STENCIL_ATTACHMENT_READ_BIT, SHADER_READ_BIT                    | Must only be used as a read-only depth-stencil attachment or as a read-only image in a shader.
    | SHADER_READ_ONLY_OPTIMAL         | SHADER_READ_BIT                                                       | Must only be used as a read-only image in a shader (sampled image or input attachment).
    | TRANSFER_SRC_OPTIMAL             | TRANSFER_READ_BIT                                                     | Must only be used as the source of transfer (copy) commands.
    | TRANSFER_DST_OPTIMAL             | TRANSFER_WRITE_BIT                                                    | Must only be used as the destination of transfer (copy and clear) commands.
    | PREINITIALIZED                   | n/a                                                                   | Can only be used as the initial layout when creating an image or as the old layout in an image transition. When transitioning out of this layout, the contents of the image are preserved, as opposed to the UNDEFINED layout.

    Table 2. Image layouts and allowed access flags.
    As with access flags and pipeline stages, there is very little freedom in combining image layouts and access flags. As a result, image layouts, access flags and pipeline stages in many cases form uniquely defined triplets.

    Note that Vulkan also exposes another form of synchronization called render passes and subpasses. The main purpose of render passes is to provide implicit synchronization guarantees, so that an application does not need to insert a barrier after every single rendering command (such as a draw or clear). Render passes also allow expressing the same dependencies in a form that may be leveraged by the driver (especially on GPUs that use tiled deferred rendering architectures) for more efficient rendering. A full discussion of render passes is out of the scope of this post.

    Synchronization in Direct3D12

    Synchronization tools in Direct3D12 are not as expressive as in Vulkan, but they are also not as intricate. With the exception of the UAV barriers described below, Direct3D12 does not distinguish between execution barriers and memory barriers, and operates with resource states (see Table 3).

    | Resource state (D3D12_RESOURCE_STATE_) | Description
    |----------------------------------------|-------------------------------------------------------
    | VERTEX_AND_CONSTANT_BUFFER             | The resource is used as a vertex or constant buffer.
    | INDEX_BUFFER                           | The resource is used as an index buffer.
    | RENDER_TARGET                          | The resource is used as a render target.
    | UNORDERED_ACCESS                       | The resource is used for unordered access via an unordered access view (UAV).
    | DEPTH_WRITE                            | The resource is used in a writable depth-stencil view or a clear command.
    | DEPTH_READ                             | The resource is used in a read-only depth-stencil view.
    | NON_PIXEL_SHADER_RESOURCE              | The resource is accessed via a shader resource view in any shader stage other than the pixel shader.
    | PIXEL_SHADER_RESOURCE                  | The resource is accessed via a shader resource view in the pixel shader.
    | INDIRECT_ARGUMENT                      | The resource is used as the source of indirect arguments for an indirect draw or dispatch command.
    | COPY_DEST                              | The resource is used as the copy destination in a copy command.
    | COPY_SOURCE                            | The resource is used as the copy source in a copy command.

    Table 3. Most commonly used resource states in Direct3D12.

    Direct3D12 defines three resource barrier types:

    • A state transition barrier defines a transition from one resource state listed in Table 3 to another. This type of barrier maps to a Vulkan barrier when the old and new access flags and/or image layouts are not the same.
    • A UAV barrier is an execution-plus-memory barrier in Vulkan terminology. It does not change the state (layout), but instead indicates that all UAV accesses (reads or writes) to a particular resource must complete before any future UAV accesses (reads or writes) can begin.
    • An aliasing barrier indicates a usage transition between two resources that are backed by the same memory, and is out of the scope of this article.

    Resource state management in Diligent Engine

    The purpose of Diligent Engine is to provide an efficient cross-platform low-level graphics API that is convenient to use, but at the same time is flexible enough not to limit applications in expressing their intent. Before version 2.4, the ability of the application to control resource state transitions was very limited. Version 2.4 made resource state transitions explicit and introduced two ways to manage the states. The first one is fully automatic, where the engine internally keeps track of the states and performs the necessary transitions. The second one is manual and completely driven by the application.

    Automatic State Management

    Every command that may potentially perform state transitions uses one of the following state transition modes:

    • RESOURCE_STATE_TRANSITION_MODE_NONE - Perform no state transitions and no state validation.
    • RESOURCE_STATE_TRANSITION_MODE_TRANSITION - Transition resources to the states required by the command.
    • RESOURCE_STATE_TRANSITION_MODE_VERIFY - Do not transition, but verify that the states are correct.

    The code snippet below gives an example of a sequence of typical rendering commands in Diligent Engine 2.4:

        // Clear the back buffer
        const float ClearColor[] = {0.350f, 0.350f, 0.350f, 1.0f};
        m_pImmediateContext->ClearRenderTarget(nullptr, ClearColor, RESOURCE_STATE_TRANSITION_MODE_TRANSITION);
        m_pImmediateContext->ClearDepthStencil(nullptr, CLEAR_DEPTH_FLAG, 1.f, 0, RESOURCE_STATE_TRANSITION_MODE_TRANSITION);

        // Bind vertex and index buffers
        Uint32   offset   = 0;
        IBuffer* pBuffs[] = {m_CubeVertexBuffer};
        m_pImmediateContext->SetVertexBuffers(0, 1, pBuffs, &offset, RESOURCE_STATE_TRANSITION_MODE_TRANSITION, SET_VERTEX_BUFFERS_FLAG_RESET);
        m_pImmediateContext->SetIndexBuffer(m_CubeIndexBuffer, 0, RESOURCE_STATE_TRANSITION_MODE_TRANSITION);

        // Set the pipeline state
        m_pImmediateContext->SetPipelineState(m_pPSO);

        // Commit shader resources
        m_pImmediateContext->CommitShaderResources(m_pSRB, RESOURCE_STATE_TRANSITION_MODE_TRANSITION);

        DrawAttribs DrawAttrs;
        DrawAttrs.IsIndexed  = true;
        DrawAttrs.IndexType  = VT_UINT32; // Index type
        DrawAttrs.NumIndices = 36;
        // Verify the state of vertex and index buffers
        DrawAttrs.Flags = DRAW_FLAG_VERIFY_STATES;
        m_pImmediateContext->Draw(DrawAttrs);

    Automatic state management is useful in many scenarios, especially when porting old applications to the Diligent API. It has the following limitations, though:

    • The state is tracked for the whole resource only. Individual mip levels and/or texture array slices cannot be transitioned.
    • The state is a global resource property. Every device context that uses a resource sees the same state.
    • Automatic state transitions are not thread-safe. Any operation that uses RESOURCE_STATE_TRANSITION_MODE_TRANSITION requires that no other thread access the states of the same resources simultaneously.
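    The automatic mode can be pictured as a per-resource state cache consulted on every command. The following is a simplified, hypothetical sketch (TrackedResource and TransitionIfNeeded are illustrative names, not Diligent API): a barrier is recorded only when the cached state differs from the required one, and the whole resource shares a single cached state, which is why per-mip transitions are impossible in this mode.

    ```cpp
    #include <cassert>
    #include <string>
    #include <vector>

    // Illustrative resource states, loosely mirroring RESOURCE_STATE_* values.
    enum class State { Unknown, RenderTarget, ShaderResource, CopyDest };

    // Hypothetical whole-resource state record (no per-mip granularity).
    struct TrackedResource {
        std::string name;
        State state = State::Unknown;
    };

    // Records a barrier only when the cached state differs from the required
    // one, then updates the cache - the essence of the automatic mode.
    void TransitionIfNeeded(TrackedResource& res, State required,
                            std::vector<std::string>& barrierLog) {
        if (res.state == required)
            return; // already in the right state: no barrier needed
        barrierLog.push_back(res.name);
        res.state = required;
    }

    int main() {
        TrackedResource shadowMap{"ShadowMap", State::RenderTarget};
        std::vector<std::string> barrierLog;

        TransitionIfNeeded(shadowMap, State::ShaderResource, barrierLog); // barrier
        TransitionIfNeeded(shadowMap, State::ShaderResource, barrierLog); // no-op

        assert(barrierLog.size() == 1);
        assert(shadowMap.state == State::ShaderResource);
        return 0;
    }
    ```

    Because the cache is shared by every device context, two threads calling TransitionIfNeeded on the same resource would race, which is exactly the thread-safety limitation listed above.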
    Explicit State Management

    As we discussed above, there is no way to efficiently solve the resource state management problem in a fully automated manner, so Diligent Engine does not try to outsmart the industry and instead makes state transitions part of the API. It introduces a set of states that mostly map to Direct3D12 resource states, as we believe this method is expressive enough and much clearer than Vulkan's approach. If an application needs very fine-grained control, it can use native API interoperability to directly insert Vulkan barriers into a command buffer. The list of states defined by Diligent Engine, as well as their mapping to Direct3D12 and Vulkan, is given in Table 4 below.

    | Diligent state    | Direct3D12 state           | Vulkan image layout              | Vulkan access type
    | (RESOURCE_STATE_) | (D3D12_RESOURCE_STATE_)    | (VK_IMAGE_LAYOUT_)               | (VK_ACCESS_)
    |-------------------|----------------------------|----------------------------------|-----------------------------------
    | UNKNOWN           | n/a                        | n/a                              | n/a
    | UNDEFINED         | COMMON                     | UNDEFINED                        | 0
    | VERTEX_BUFFER     | VERTEX_AND_CONSTANT_BUFFER | n/a                              | VERTEX_ATTRIBUTE_READ_BIT
    | CONSTANT_BUFFER   | VERTEX_AND_CONSTANT_BUFFER | n/a                              | UNIFORM_READ_BIT
    | INDEX_BUFFER      | INDEX_BUFFER               | n/a                              | INDEX_READ_BIT
    | RENDER_TARGET     | RENDER_TARGET              | COLOR_ATTACHMENT_OPTIMAL         | COLOR_ATTACHMENT_READ_BIT, COLOR_ATTACHMENT_WRITE_BIT
    | UNORDERED_ACCESS  | UNORDERED_ACCESS           | GENERAL                          | SHADER_READ_BIT, SHADER_WRITE_BIT
    | DEPTH_READ        | DEPTH_READ                 | DEPTH_STENCIL_READ_ONLY_OPTIMAL  | DEPTH_STENCIL_ATTACHMENT_READ_BIT
    | DEPTH_WRITE       | DEPTH_WRITE                | DEPTH_STENCIL_ATTACHMENT_OPTIMAL | DEPTH_STENCIL_ATTACHMENT_READ_BIT, DEPTH_STENCIL_ATTACHMENT_WRITE_BIT
    | SHADER_RESOURCE   | NON_PIXEL_SHADER_RESOURCE, | SHADER_READ_ONLY_OPTIMAL         | SHADER_READ_BIT
    |                   | PIXEL_SHADER_RESOURCE      |                                  |
    | INDIRECT_ARGUMENT | INDIRECT_ARGUMENT          | n/a                              | INDIRECT_COMMAND_READ_BIT
    | COPY_DEST         | COPY_DEST                  | TRANSFER_DST_OPTIMAL             | TRANSFER_WRITE_BIT
    | COPY_SOURCE       | COPY_SOURCE                | TRANSFER_SRC_OPTIMAL             | TRANSFER_READ_BIT
    | PRESENT           | PRESENT                    | PRESENT_SRC_KHR                  | MEMORY_READ_BIT

    Table 4. Mapping between Diligent resource states, Direct3D12 states, Vulkan image layouts and access flags.

    Diligent resource states map almost exactly 1:1 to Direct3D12 resource states. The only real difference is that in Diligent, the SHADER_RESOURCE state maps to the union of the NON_PIXEL_SHADER_RESOURCE and PIXEL_SHADER_RESOURCE states, which does not seem to be a real issue. Compared to Vulkan, resource states in Diligent are a little more general; specifically:

    • The RENDER_TARGET state always defines a writable render target (it sets both the COLOR_ATTACHMENT_READ_BIT and COLOR_ATTACHMENT_WRITE_BIT access flags).
    • The UNORDERED_ACCESS state always defines a writable storage image/storage buffer (it sets both the SHADER_READ_BIT and SHADER_WRITE_BIT access flags).
    • Transitions to and out of the CONSTANT_BUFFER, UNORDERED_ACCESS and SHADER_RESOURCE states always set all applicable pipeline stage flags, as given by Table 1.

    None of these limitations seems to cause any measurable performance degradation. Again, if an application really needs to specify a more precise barrier, it can rely on native API interoperability.

    Note that Diligent defines both UNKNOWN and UNDEFINED states, which have very different meanings. UNKNOWN means that the state is not known to the engine and that the application manually manages the state of this resource. UNDEFINED means that the state is known to the engine but is undefined from the point of view of the underlying API; this state has well-defined counterparts in Direct3D12 and Vulkan.
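    Table 4 is mechanical enough to express as a lookup. The sketch below is a self-contained illustration of a few rows for image states; the enums are stand-ins for the three state vocabularies, and MapState is a hypothetical helper, not Diligent's actual implementation.

    ```cpp
    #include <cassert>

    // Stand-ins for the three state vocabularies in Table 4.
    enum class DiligentState { RenderTarget, DepthWrite, ShaderResource, CopySource };
    enum class VkLayout      { ColorAttachmentOptimal, DepthStencilAttachmentOptimal,
                               ShaderReadOnlyOptimal, TransferSrcOptimal };
    enum class D3D12State    { RenderTarget, DepthWrite, AnyShaderResource, CopySource };

    struct Mapping { D3D12State d3d12; VkLayout vk; };

    // One row of Table 4 per case: a single Diligent state determines both
    // the Direct3D12 state and the Vulkan image layout.
    Mapping MapState(DiligentState s) {
        switch (s) {
            case DiligentState::RenderTarget:
                return {D3D12State::RenderTarget, VkLayout::ColorAttachmentOptimal};
            case DiligentState::DepthWrite:
                return {D3D12State::DepthWrite, VkLayout::DepthStencilAttachmentOptimal};
            case DiligentState::ShaderResource:
                // Union of NON_PIXEL_SHADER_RESOURCE and PIXEL_SHADER_RESOURCE.
                return {D3D12State::AnyShaderResource, VkLayout::ShaderReadOnlyOptimal};
            case DiligentState::CopySource:
                return {D3D12State::CopySource, VkLayout::TransferSrcOptimal};
        }
        return {D3D12State::RenderTarget, VkLayout::ColorAttachmentOptimal};
    }

    int main() {
        assert(MapState(DiligentState::ShaderResource).vk == VkLayout::ShaderReadOnlyOptimal);
        assert(MapState(DiligentState::DepthWrite).d3d12 == D3D12State::DepthWrite);
        return 0;
    }
    ```

    The point of the design is visible in the signature: the engine-facing state is a single value, and the per-backend details (states, layouts, access masks) fall out of the table.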
    Explicit resource state transitions in Diligent Engine are performed with the help of the IDeviceContext::TransitionResourceStates() method, which takes an array of StateTransitionDesc structures:

        void IDeviceContext::TransitionResourceStates(Uint32 BarrierCount, StateTransitionDesc* pResourceBarriers);

    Every element in the array defines the resource to transition (a texture or a buffer), the old state, the new state and, for a texture resource, the range of mip levels and array slices:

        struct StateTransitionDesc
        {
            ITexture*      pTexture            = nullptr;
            IBuffer*       pBuffer             = nullptr;
            Uint32         FirstMipLevel       = 0;
            Uint32         MipLevelsCount      = 0;
            Uint32         FirstArraySlice     = 0;
            Uint32         ArraySliceCount     = 0;
            RESOURCE_STATE OldState            = RESOURCE_STATE_UNKNOWN;
            RESOURCE_STATE NewState            = RESOURCE_STATE_UNKNOWN;
            bool           UpdateResourceState = false;
        };

    If the state of the resource is known to the engine, the OldState member can be set to UNKNOWN, in which case the engine will use the state stored in the resource. If the state is not known to the engine, OldState must not be UNKNOWN. NewState can never be UNKNOWN. An important member is the UpdateResourceState flag: if it is set to true, the engine will set the state of the resource to the value given by NewState; otherwise, the state will remain unchanged.

    Switching between explicit and automatic state management

    Diligent Engine provides tools for switching between, and mixing, automatic and manual state management. Both the ITexture and IBuffer interfaces expose SetState() and GetState() methods that allow an application to get and set the resource state. When the state of a resource is set to UNKNOWN, this resource will be ignored by all methods that use the RESOURCE_STATE_TRANSITION_MODE_TRANSITION mode. State transitions will still be performed for all resources whose states are known. An application can thus mix automatic and manual state management by setting the state of manually managed resources to UNKNOWN.
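    The OldState/UpdateResourceState rules described above can be sketched as follows. This is a hypothetical mock, not the engine's code: MockResource and ApplyTransition merely stand in for the per-barrier logic of TransitionResourceStates.

    ```cpp
    #include <cassert>

    enum class RState { Unknown, RenderTarget, ShaderResource };

    // Mock of a resource that carries an engine-tracked state.
    struct MockResource { RState state = RState::Unknown; };

    // Mirrors the rules described in the text:
    //  - OldState == Unknown -> take the state tracked inside the resource
    //    (only legal if the resource state is actually known to the engine);
    //  - UpdateResourceState -> write NewState back into the resource.
    RState ApplyTransition(MockResource& res, RState oldState, RState newState,
                           bool updateResourceState) {
        RState effectiveOld = (oldState == RState::Unknown) ? res.state : oldState;
        assert(effectiveOld != RState::Unknown); // OldState must be resolvable
        assert(newState != RState::Unknown);     // NewState can never be UNKNOWN
        if (updateResourceState)
            res.state = newState;
        return effectiveOld;
    }

    int main() {
        MockResource tex{RState::RenderTarget};

        // OldState = Unknown: the engine-tracked state is used instead.
        RState old = ApplyTransition(tex, RState::Unknown, RState::ShaderResource, true);
        assert(old == RState::RenderTarget);
        assert(tex.state == RState::ShaderResource); // updated: flag was true

        // With UpdateResourceState = false the tracked state is left untouched.
        ApplyTransition(tex, RState::ShaderResource, RState::RenderTarget, false);
        assert(tex.state == RState::ShaderResource);
        return 0;
    }
    ```

    Leaving UpdateResourceState false is what makes recording transitions of the same resource from multiple threads safe, as discussed in the multithreading section.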
If an application wants to hand over state management back to the system, it can use SetState() method to set the resource state. Alternatively, it can set UpdateResourceState flag to true, which will have the same effect. Multithreaded Safety As we discussed above, the main advantage of manual resource state management is the ability to record rendering commands in parallel. As resource states are tracked globally in Diligent Engine, the following precautions must be taken: Recording state transitions of the same resource in multiple threads simultaneously with IDeviceContext::TransitionResourceStates() is safe as long as UpdateResourceState flag is set to false. Any thread that uses RESOURCE_STATE_TRANSITION_MODE_TRANSITION mode with any method must be the only thread accessing resources that may be transitioned. This also applies to IDeviceContext::TransitionShaderResources() method. If a thread uses RESOURCE_STATE_TRANSITION_MODE_VERIFY mode with any method (which is recommended whenever possible), no other thread should alter the states of the same resources. Discussion Diligent Engine adopts D3D11-style API with immediate and deferred contexts to record rendering commands. Since it is well known that deferred contexts did not work well in Direct3D11, a natural question one may ask is why they work in Diligent. And the answer is because of the explicit state transition control. While in Direct3D11, resource state management was always automatic, Diligent gives the application direct control of how resource states must be handled by every operation. At the same time, device contexts incorporate dynamic memory, descriptor management and other tasks that need to be handled by a thread that records rendering commands. Conclusion Explicit resource state management system introduced in Diligent Engine v2.4 combines flexibility, efficiency and convenience to use. 
An application may rely on automatic resource state management in typical rendering scenarios and switch to manual mode when the engine does not have enough knowledge to manage the states optimally, or when automatic management is not possible, as in the case of multithreaded rendering command recording. At the moment, Diligent Engine supports only one command queue, exposed as a single immediate context. One of the next steps is to expose multiple command queues through multiple immediate contexts, as well as primitives to synchronize execution between queues, enabling async compute and other advanced rendering techniques.
5. Diligent Engine is a modern cross-platform low-level graphics framework. The latest release enables Vulkan on macOS (via MoltenVK). The full list of supported platforms and APIs is as follows:

- Win32 (Windows desktop): Direct3D11, Direct3D12, OpenGL4.2+, Vulkan
- Universal Windows: Direct3D11, Direct3D12
- Linux: OpenGL4.2+, Vulkan
- Android: OpenGLES3.0+
- macOS: OpenGL4.1, Vulkan
- iOS: OpenGLES3.0

MinGW build support, split barriers, and other improvements are also in the new release. Check it out on GitHub.
6. Hello everyone. I am currently looking at remaking our rendering back-end from the ground up. The goal is to multi-thread rendering and to move to 2nd-generation APIs. So I have been looking at those APIs (mostly Vulkan and somewhat DX12 so far, which are quite similar) and I think I have a decent understanding of how they work. To give the big picture: the front-end of the rendering system is responsible for implementing the rendering logic of different parts of the scene, like terrain (LoD, view-frustum culling), and produces rendering commands for the back-end in the form of state objects and resources to bind. These objects are created using builder objects and can be built on multiple threads, since they don't contain any GPU objects. When they are submitted to the back-end, the rendering thread just sets the states and performs the draw calls. This design looks very friendly to 2nd-generation APIs at first glance. I did take a look at the DOOM 3 BFG Vulkan renderer, and what it does is use an array of "frame" objects (one for each image of the swapchain). Each has a command buffer, and when drawing commands are submitted, it waits on the fence of the current frame (which most of the time will already be signaled), records the command buffer on the presentation thread, and just ping-pongs the two command buffers from frame to frame. It's easy, but it doesn't leverage the API's capabilities. My idea is to use a similar frame mechanism for the back-end while building command buffers on other threads. The builder objects could be used by the front-end (taking care to make the render pass concept a first-class citizen in the builder API). The builder could make Vulkan objects directly and produce opaque objects that the rendering thread would only have to submit. At this point in my reflection, the main issue I'm facing is the management of the command buffer life cycle, since command buffers are allocated from command buffer pools. The rendering thread could reset a command buffer (by resetting the pool) and then hand it back to a queue where the front-end could get it, but that would require the pool to be submitted along with the command buffer, and it would also require more synchronisation.
7. After reading Vulkan tutorials, there are still many unanswered questions. How can I update blending in an already created pipeline? Should I delete the old pipeline and create a new one? Updating blend modes is very important for making graphical effects. Pipelines have dynamic states, but blending is not one of them.
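For context on the question above: in core Vulkan (as of the era of this post), blend state is indeed baked into the pipeline at creation time, so the usual approach is to create one pipeline per blend mode up front and bind the one you need with vkCmdBindPipeline per draw. A hedged configuration fragment showing a standard alpha-blend attachment state, as it would be fed to VkPipelineColorBlendStateCreateInfo:

```cpp
// Sketch only: classic alpha blending, fixed at pipeline creation.
// To "change" the blend mode you would build a second pipeline with a
// different attachment state and bind whichever one the draw needs.
VkPipelineColorBlendAttachmentState blendAttachment = {};
blendAttachment.blendEnable         = VK_TRUE;
blendAttachment.srcColorBlendFactor = VK_BLEND_FACTOR_SRC_ALPHA;
blendAttachment.dstColorBlendFactor = VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA;
blendAttachment.colorBlendOp        = VK_BLEND_OP_ADD;
blendAttachment.srcAlphaBlendFactor = VK_BLEND_FACTOR_ONE;
blendAttachment.dstAlphaBlendFactor = VK_BLEND_FACTOR_ZERO;
blendAttachment.alphaBlendOp        = VK_BLEND_OP_ADD;
blendAttachment.colorWriteMask      = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT |
                                      VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT;
```

Since pipelines with identical shaders and differing blend state are cheap to create from a VkPipelineCache, a small per-blend-mode pipeline table is the common design.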
8. Hi. I've just graduated with a B.Sc. in Computer Science. Being the founder of a game dev studio is my dream. I am a huge Linux enthusiast, hence I decided to learn technology which allows developing games (actually a graphics engine) under a Linux environment. I'm looking for people who are determined to develop their Vulkan programming skills. Together we will create a project on GitHub and develop some simple project. Write a post below if you're interested.
9. It has been 2 days of figuring out what's wrong, but still no success. Here's the code:

```cpp
VkPipelineLayoutCreateInfo pipelineLayoutCreateInfo = {};
pipelineLayoutCreateInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;

VkPipelineLayout pipelineLayout;
vkRes = vkCreatePipelineLayout(device.GetVkDevice(), &pipelineLayoutCreateInfo, nullptr, &pipelineLayout);
if (vkRes != VK_SUCCESS)
{
    std::cout << "Failed to create pipelineLayout \n";
    return -1;
}

VkPipelineColorBlendAttachmentState colorBlendAttachmentState = {};
colorBlendAttachmentState.colorWriteMask = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT |
                                           VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT;

VkPipelineColorBlendStateCreateInfo colorBlendStates = {};
colorBlendStates.sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO;
colorBlendStates.attachmentCount = 1;
colorBlendStates.pAttachments = &colorBlendAttachmentState;

VkVertexInputBindingDescription vertexInputBindingDesc = {};
vertexInputBindingDesc.binding = 0;
vertexInputBindingDesc.inputRate = VK_VERTEX_INPUT_RATE_VERTEX;
vertexInputBindingDesc.stride = sizeof(float) * 6;

VkVertexInputAttributeDescription vertexInputAttribDescs[2] = {};
vertexInputAttribDescs[0].binding = 0;
vertexInputAttribDescs[0].format = VK_FORMAT_R32G32B32_SFLOAT;
vertexInputAttribDescs[0].location = 0;
vertexInputAttribDescs[0].offset = 0;
vertexInputAttribDescs[1].binding = 0;
vertexInputAttribDescs[1].format = VK_FORMAT_R32G32B32_SFLOAT;
vertexInputAttribDescs[1].location = 1;
vertexInputAttribDescs[1].offset = sizeof(float) * 3;

VkPipelineVertexInputStateCreateInfo vertexInputState = {};
vertexInputState.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO;
vertexInputState.vertexBindingDescriptionCount = 1;
vertexInputState.pVertexBindingDescriptions = &vertexInputBindingDesc;
vertexInputState.vertexAttributeDescriptionCount = 2;
vertexInputState.pVertexAttributeDescriptions = vertexInputAttribDescs;

VkPipelineViewportStateCreateInfo viewportState = {};
viewportState.sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO;
viewportState.viewportCount = 1;
viewportState.scissorCount = 1;

VkPipelineInputAssemblyStateCreateInfo inputAssemblyState = {};
inputAssemblyState.sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO;
inputAssemblyState.topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST;

VkPipelineRasterizationStateCreateInfo rasterizerState = {};
rasterizerState.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO;
rasterizerState.cullMode = VK_CULL_MODE_NONE;
rasterizerState.polygonMode = VK_POLYGON_MODE_FILL;
rasterizerState.lineWidth = 1.0f;
rasterizerState.frontFace = VK_FRONT_FACE_CLOCKWISE;

VkPipelineMultisampleStateCreateInfo multisampleState = {};
multisampleState.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO;
multisampleState.rasterizationSamples = VK_SAMPLE_COUNT_8_BIT;

VkDynamicState dynamicState[2] = { VK_DYNAMIC_STATE_VIEWPORT, VK_DYNAMIC_STATE_SCISSOR };
VkPipelineDynamicStateCreateInfo dynamicPipelineState = {};
dynamicPipelineState.sType = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO;
dynamicPipelineState.dynamicStateCount = 2;
dynamicPipelineState.pDynamicStates = dynamicState;

VkShaderModule shaders[2];

std::ifstream inputFile("Shaders/color.vert.spv");
inputFile.seekg(0, std::ios_base::end);
UINT64 fileSize = inputFile.tellg();
inputFile.seekg(0, std::ios_base::beg);
std::vector<char> byteCode(fileSize);
inputFile.read(byteCode.data(), fileSize);
inputFile.close();

VkShaderModuleCreateInfo shaderModuleCreateInfo = {};
shaderModuleCreateInfo.codeSize = fileSize;
shaderModuleCreateInfo.pCode = (UINT*)byteCode.data();
vkRes = vkCreateShaderModule(device.GetVkDevice(), &shaderModuleCreateInfo, nullptr, &shaders[0]);
if (vkRes != VK_SUCCESS)
{
    std::cout << "Failed to create shader\n";
    return -1;
}

inputFile.open("Shaders/color.frag.spv");
inputFile.seekg(0, std::ios_base::end);
fileSize = inputFile.tellg();
inputFile.seekg(0, std::ios_base::beg);
byteCode.resize(fileSize);
inputFile.read(byteCode.data(), fileSize);
inputFile.close();

shaderModuleCreateInfo.codeSize = fileSize;
shaderModuleCreateInfo.pCode = (UINT*)byteCode.data();
vkRes = vkCreateShaderModule(device.GetVkDevice(), &shaderModuleCreateInfo, nullptr, &shaders[1]);
if (vkRes != VK_SUCCESS)
{
    std::cout << "Failed to create shader\n";
    return -1;
}

VkPipelineShaderStageCreateInfo shaderStages[2];
shaderStages[0] = {};
shaderStages[0].sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
shaderStages[0].module = shaders[0];
shaderStages[0].pName = "main";
shaderStages[0].stage = VK_SHADER_STAGE_VERTEX_BIT;
shaderStages[1] = {};
shaderStages[1].sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
shaderStages[1].module = shaders[1];
shaderStages[1].pName = "main";
shaderStages[1].stage = VK_SHADER_STAGE_FRAGMENT_BIT;

VkGraphicsPipelineCreateInfo graphicsPipelineCreateInfo[1] = {};
graphicsPipelineCreateInfo[0].sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO;
graphicsPipelineCreateInfo[0].layout = pipelineLayout;
graphicsPipelineCreateInfo[0].pColorBlendState = &colorBlendStates;
graphicsPipelineCreateInfo[0].stageCount = 2;
graphicsPipelineCreateInfo[0].pStages = shaderStages;
graphicsPipelineCreateInfo[0].pDynamicState = &dynamicPipelineState;
graphicsPipelineCreateInfo[0].pInputAssemblyState = &inputAssemblyState;
graphicsPipelineCreateInfo[0].pRasterizationState = &rasterizerState;
graphicsPipelineCreateInfo[0].pVertexInputState = &vertexInputState;
graphicsPipelineCreateInfo[0].pMultisampleState = &multisampleState;
graphicsPipelineCreateInfo[0].renderPass = renderPass;
graphicsPipelineCreateInfo[0].pViewportState = &viewportState;

VkPipeline pipeline;
vkRes = vkCreateGraphicsPipelines(device.GetVkDevice(), VK_NULL_HANDLE, 1, graphicsPipelineCreateInfo, nullptr, &pipeline);
if (vkRes != VK_SUCCESS)
{
    std::cout << "Failed to create pipeline\n";
    return -1;
}
```
10. Hello there! So I've followed along with the Vulkan Tutorial here https://vulkan-tutorial.com/ and I've finished it except for the multisampling section, and I definitely feel like I've learned a lot. However, it sort of contains everything in one monolithic class. While it has a section for rendering "models", it doesn't actually do that; it's for rendering "meshes". If there's a good guide that starts at the end of this tutorial and explains how to break this up into more manageable pieces, that'd be great; that's what I'm looking for. So if you don't feel like reading further, that's the main gist of it. But I'll go over what I'm thinking about here. For me, a mesh is something that comprises a single vertex buffer and optionally an index buffer, with one material per mesh. A model is comprised of multiple meshes. Each material would likely contain descriptor set information that forwards the samplers and other data needed by the shader to the shader. Each shader would be a static instance that's set up once and is updated through UBOs (uniform buffers) and push constants (though I haven't learned about those yet). Meshes would contain a command buffer (primary or secondary?) and a command pool, and the drawing commands would be set up for that mesh. Then I suppose I'd want to submit each command buffer in a single vkQueueSubmit() call. Or maybe I'd have a single command buffer and pool for all meshes? Where I'm a bit hung up is the UBO stuff. All the drawing commands are set up in advance in the command buffers, but the MVP matrix, for example, could of course change every frame per mesh (per model, maybe?). How would I go about updating UBOs per mesh object? Is that something that I could map to memory with Vulkan and then update with some sort of command in the command buffer? The last thing I notice that I'm worried about are the clear values.
```cpp
VkRenderPassBeginInfo renderPassInfo = {};
// Other render pass code here

// Clear values
std::array<VkClearValue, 2> clearValues;
clearValues[0].color        = { 0.0f, 0.0f, 0.0f, 1.0f };
clearValues[1].depthStencil = { 1.0f, 0 };
renderPassInfo.clearValueCount = static_cast<uint32_t>(clearValues.size());
renderPassInfo.pClearValues    = clearValues.data();
```

The Vulkan Tutorial does it something like that. I'm thinking that there's one VkRenderPass object per shader. In OpenGL you usually just set glClearColor once and you're done with it. Why would I do this for each render pass? Does that make sense? Am I missing something here? Anyway, any help is greatly appreciated!
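On the per-mesh UBO question above, one common pattern (not from the tutorial itself) is to put all per-mesh blocks in a single persistently mapped uniform buffer, bind it as VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC, and pass a per-draw byte offset to vkCmdBindDescriptorSets. Each offset must be a multiple of the device's minUniformBufferOffsetAlignment. A small sketch of the offset arithmetic; the 256-byte alignment used in the usage note is only an example value, in practice it is queried from VkPhysicalDeviceLimits:

```cpp
#include <cassert>
#include <cstddef>

// Round a per-mesh uniform block size up to the device's
// minUniformBufferOffsetAlignment (assumed to be a power of two).
std::size_t AlignedBlockSize(std::size_t blockSize, std::size_t minAlignment)
{
    return (blockSize + minAlignment - 1) & ~(minAlignment - 1);
}

// Byte offset of mesh `meshIndex` inside one big per-frame uniform buffer.
std::size_t DynamicOffsetForMesh(std::size_t meshIndex,
                                 std::size_t blockSize,
                                 std::size_t minAlignment)
{
    return meshIndex * AlignedBlockSize(blockSize, minAlignment);
}
```

Each frame the CPU memcpys the matrices into the mapped buffer at these offsets; the command buffer itself never changes, only the dynamic offset passed at bind time does, so pre-recorded command buffers keep working.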
  11. komires

    Matali Physics 4.3 Released

We are pleased to announce the release of Matali Physics 4.3. The latest version introduces significant changes in support for DirectX 12 and Vulkan. The introduced changes put the use of DirectX 12 and Vulkan on a par with DirectX 11 and OpenGL respectively, significantly reducing the costs associated with applying low-level graphics APIs. From version 4.3, we recommend using DirectX 12 and Vulkan in projects that are developed in the Matali Physics environment.

What is Matali Physics? Matali Physics is an advanced, multi-platform, high-performance 3D physics engine intended for games, virtual reality, and physics-based simulations. Matali Physics and its add-ons form a physics environment which provides complex physical simulation and physics-based modeling of objects both real and imagined.

Main benefits of using Matali Physics:

- Stable, high-performance solution supplied together with a rich set of add-ons for all major mobile and desktop platforms (both 32- and 64-bit)
- Advanced samples ready to use in your own games
- New features on request
- Dedicated technical support
- Regular updates and fixes

You can find out more information on www.mataliphysics.com
13. I'm creating a 2D game engine using Vulkan. I've been looking at how to draw different textures (each GameObject can contain its own texture, which can be different from the others). In OpenGL you call glBindTexture, and in Vulkan I have seen people say that you can create a descriptor for each texture and call vkCmdBindDescriptorSets for each one. But I have read that doing this has a high cost. The way I'm doing it is to use only one descriptor for the sampler2D and a vector of VkDescriptorImageInfo, where I add a VkDescriptorImageInfo for each texture and assign the vector to pImageInfo:

```cpp
VkWriteDescriptorSet samplerDescriptorSet;
samplerDescriptorSet.pNext = NULL;
samplerDescriptorSet.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
samplerDescriptorSet.dstSet = descriptorSets[i];
samplerDescriptorSet.dstBinding = 1;
samplerDescriptorSet.dstArrayElement = 0;
samplerDescriptorSet.descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
samplerDescriptorSet.descriptorCount = static_cast<uint32_t>(samplerDescriptors.size());
samplerDescriptorSet.pImageInfo = samplerDescriptors.data(); // samplerDescriptors is the vector
```

Using this, I can skip creating and binding a descriptor for each texture, but now I need an array of samplers in the fragment shader. I can't use sampler2DArray because each texture has a different size, so I decided to use an array of sampler2Ds (sampler2D textures[n]). The problem with this is that I don't want to set a maximum number of textures. I found a way to declare it dynamically using:

```glsl
#extension GL_EXT_nonuniform_qualifier : enable
layout(binding = 1) uniform sampler2D texSampler[];
```

I have never used this before and don't know whether it is efficient or not. Anyway, there is still a problem with this. Now I need to set the descriptor count when I create the descriptor set layout, and again, I don't want to set a maximum number:

```cpp
VkDescriptorSetLayoutBinding samplerLayoutBinding = {};
samplerLayoutBinding.binding = 1;
samplerLayoutBinding.descriptorCount = 999999; // <<<< HERE
samplerLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
samplerLayoutBinding.pImmutableSamplers = nullptr;
samplerLayoutBinding.stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT;
```

Having said that, how can I solve this? Or what is the correct way to do this efficiently? If you need more information, just ask. Thanks in advance!
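One direction worth noting for the unsized-array problem above is VK_EXT_descriptor_indexing (the same extension that backs GL_EXT_nonuniform_qualifier): a binding can be declared with an upper bound and flagged as variable-count, with the real count supplied at descriptor set allocation time. A hedged configuration sketch; maxSupportedTextures is a hypothetical value derived from the device's descriptor limits, not a magic 999999:

```cpp
// Sketch: variable-count, partially-bound combined image sampler binding
// using VK_EXT_descriptor_indexing.
VkDescriptorBindingFlagsEXT bindingFlags =
    VK_DESCRIPTOR_BINDING_VARIABLE_DESCRIPTOR_COUNT_BIT_EXT |
    VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT;

VkDescriptorSetLayoutBindingFlagsCreateInfoEXT bindingFlagsInfo = {};
bindingFlagsInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO_EXT;
bindingFlagsInfo.bindingCount  = 1;
bindingFlagsInfo.pBindingFlags = &bindingFlags;

VkDescriptorSetLayoutBinding samplerBinding = {};
samplerBinding.binding         = 1;
samplerBinding.descriptorCount = maxSupportedTextures; // upper bound from device limits (assumed variable)
samplerBinding.descriptorType  = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
samplerBinding.stageFlags      = VK_SHADER_STAGE_FRAGMENT_BIT;

VkDescriptorSetLayoutCreateInfo layoutInfo = {};
layoutInfo.sType        = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
layoutInfo.pNext        = &bindingFlagsInfo;
layoutInfo.bindingCount = 1;
layoutInfo.pBindings    = &samplerBinding;
// At allocation time, VkDescriptorSetVariableDescriptorCountAllocateInfoEXT
// supplies the actual number of textures for this set.
```

The extension must be enabled at device creation and is not universally supported, so a fixed-size fallback path is still advisable.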
14. I was looking at some of Sascha Willems' examples, specifically `multithreading.cpp`, and was surprised to see that he is creating a secondary command buffer per object. I'm curious to know whether this is a fairly standard approach. Also, it made me wonder how expensive it is to bind the same pipeline (Phong) once per secondary command buffer: once these buffers have been executed (concatenated into a primary command buffer), you effectively have a pipeline bind per object. Given that pipelines are immutable, is it fairly cheap after the first Phong pipeline is bound, so that subsequent Phong binds don't really impact things much? https://github.com/SaschaWillems/Vulkan/blob/master/examples/multithreading/multithreading.cpp Thanks
15. I am wondering if it would be viable to move the SH coefficient calculation to a compute shader instead of doing it on the CPU, which for our engine requires a readback of the cube map texture. I am not entirely sure how to go about this, since it will be hard to parallelize: each thread will be writing to all of the coefficients. A lame implementation would be to have one thread running the entire shader, but I think that's going into TDR territory. Currently I am generating an irradiance map, but I am planning to switch to storing it as spherical harmonics because of the smaller footprint. Does anyone have any ideas on how we can move this to the GPU, or is it just not a viable option?
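A note on the parallelization concern in the post above: the usual GPU approach is to have each thread project a few texels into its own local set of 9 (band 0-2) SH coefficients, then combine the partial sums with a tree reduction in shared memory, so no two threads ever write the same coefficient at the same time. A CPU-side sketch of just the reduction pattern (a GLSL compute shader would do the same over local invocation IDs with barriers between strides; the power-of-two size is an assumption of this sketch):

```cpp
#include <array>
#include <cassert>
#include <vector>

// Each "thread" holds a partial sum of the 9 SH coefficients (a single
// color channel, for simplicity).
using SH9 = std::array<float, 9>;

// Tree reduction: at each step, slot i accumulates slot i+stride.
// This mirrors the shared-memory reduction a compute shader would perform,
// halving the number of active threads each iteration.
SH9 ReduceSH(std::vector<SH9> partial)
{
    for (std::size_t stride = partial.size() / 2; stride > 0; stride /= 2)
        for (std::size_t i = 0; i < stride; ++i)          // "active threads"
            for (int c = 0; c < 9; ++c)
                partial[i][c] += partial[i + stride][c];
    return partial[0]; // slot 0 ends up with the full projection
}
```

A second dispatch (or one workgroup) can then reduce the per-workgroup results the same way, which keeps every pass well under TDR limits.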
16. Hi, I have a C++ Vulkan-based project using the Qt framework. QVulkanInstance and QVulkanWindow do a lot of things for me, like validation etc., but because Vulkan is such a low-level API, I can't figure out how to troubleshoot Vulkan errors. I am trying to render terrain using tessellation shaders, learning from Sascha Willems' tutorial for tessellation rendering. I think I am setting some value for the render pass wrong in MapTile.cpp, but I am unable to find which one, because I don't know how to troubleshoot it.

What's the problem? The app freezes on the second end-draw call.

Why? QVulkanWindow: Device lost

Validation layers debug output:

```
qt.vulkan: Vulkan init (vulkan-1.dll)
qt.vulkan: Supported Vulkan instance layers: QVector(QVulkanLayer("VK_LAYER_NV_optimus" 1 1.1.84 "NVIDIA Optimus layer"), QVulkanLayer("VK_LAYER_RENDERDOC_Capture" 0 1.0.0 "Debugging capture layer for RenderDoc"), QVulkanLayer("VK_LAYER_VALVE_steam_overlay" 1 1.1.73 "Steam Overlay Layer"), QVulkanLayer("VK_LAYER_LUNARG_standard_validation" 1 1.0.82 "LunarG Standard Validation Layer"))
qt.vulkan: Supported Vulkan instance extensions: QVector(QVulkanExtension("VK_KHR_device_group_creation" 1), QVulkanExtension("VK_KHR_external_fence_capabilities" 1), QVulkanExtension("VK_KHR_external_memory_capabilities" 1), QVulkanExtension("VK_KHR_external_semaphore_capabilities" 1), QVulkanExtension("VK_KHR_get_physical_device_properties2" 1), QVulkanExtension("VK_KHR_get_surface_capabilities2" 1), QVulkanExtension("VK_KHR_surface" 25), QVulkanExtension("VK_KHR_win32_surface" 6), QVulkanExtension("VK_EXT_debug_report" 9), QVulkanExtension("VK_EXT_swapchain_colorspace" 3), QVulkanExtension("VK_NV_external_memory_capabilities" 1), QVulkanExtension("VK_EXT_debug_utils" 1))
qt.vulkan: Enabling Vulkan instance layers: ("VK_LAYER_LUNARG_standard_validation")
qt.vulkan: Enabling Vulkan instance extensions: ("VK_EXT_debug_report", "VK_KHR_surface", "VK_KHR_win32_surface")
qt.vulkan: QVulkanWindow init
qt.vulkan: 1 physical devices
qt.vulkan: Physical device [0]: name 'GeForce GT 650M' version 416.64.0
qt.vulkan: Using physical device [0]
qt.vulkan: queue family 0: flags=0xf count=16 supportsPresent=1
qt.vulkan: queue family 1: flags=0x4 count=1 supportsPresent=0
qt.vulkan: Using queue families: graphics = 0 present = 0
qt.vulkan: Supported device extensions: QVector(QVulkanExtension("VK_KHR_8bit_storage" 1), QVulkanExtension("VK_KHR_16bit_storage" 1), QVulkanExtension("VK_KHR_bind_memory2" 1), QVulkanExtension("VK_KHR_create_renderpass2" 1), QVulkanExtension("VK_KHR_dedicated_allocation" 3), QVulkanExtension("VK_KHR_descriptor_update_template" 1), QVulkanExtension("VK_KHR_device_group" 3), QVulkanExtension("VK_KHR_draw_indirect_count" 1), QVulkanExtension("VK_KHR_driver_properties" 1), QVulkanExtension("VK_KHR_external_fence" 1), QVulkanExtension("VK_KHR_external_fence_win32" 1), QVulkanExtension("VK_KHR_external_memory" 1), QVulkanExtension("VK_KHR_external_memory_win32" 1), QVulkanExtension("VK_KHR_external_semaphore" 1), QVulkanExtension("VK_KHR_external_semaphore_win32" 1), QVulkanExtension("VK_KHR_get_memory_requirements2" 1), QVulkanExtension("VK_KHR_image_format_list" 1), QVulkanExtension("VK_KHR_maintenance1" 2), QVulkanExtension("VK_KHR_maintenance2" 1), QVulkanExtension("VK_KHR_maintenance3" 1), QVulkanExtension("VK_KHR_multiview" 1), QVulkanExtension("VK_KHR_push_descriptor" 2), QVulkanExtension("VK_KHR_relaxed_block_layout" 1), QVulkanExtension("VK_KHR_sampler_mirror_clamp_to_edge" 1), QVulkanExtension("VK_KHR_sampler_ycbcr_conversion" 1), QVulkanExtension("VK_KHR_shader_draw_parameters" 1), QVulkanExtension("VK_KHR_storage_buffer_storage_class" 1), QVulkanExtension("VK_KHR_swapchain" 70), QVulkanExtension("VK_KHR_variable_pointers" 1), QVulkanExtension("VK_KHR_win32_keyed_mutex" 1), QVulkanExtension("VK_EXT_conditional_rendering" 1), QVulkanExtension("VK_EXT_depth_range_unrestricted" 1), QVulkanExtension("VK_EXT_descriptor_indexing" 2), QVulkanExtension("VK_EXT_discard_rectangles" 1), QVulkanExtension("VK_EXT_hdr_metadata" 1), QVulkanExtension("VK_EXT_inline_uniform_block" 1), QVulkanExtension("VK_EXT_shader_subgroup_ballot" 1), QVulkanExtension("VK_EXT_shader_subgroup_vote" 1), QVulkanExtension("VK_EXT_vertex_attribute_divisor" 3), QVulkanExtension("VK_NV_dedicated_allocation" 1), QVulkanExtension("VK_NV_device_diagnostic_checkpoints" 2), QVulkanExtension("VK_NV_external_memory" 1), QVulkanExtension("VK_NV_external_memory_win32" 1), QVulkanExtension("VK_NV_shader_subgroup_partitioned" 1), QVulkanExtension("VK_NV_win32_keyed_mutex" 1), QVulkanExtension("VK_NVX_device_generated_commands" 3), QVulkanExtension("VK_NVX_multiview_per_view_attributes" 1))
qt.vulkan: Enabling device extensions: QVector(VK_KHR_swapchain)
qt.vulkan: memtype 0: flags=0x0
qt.vulkan: memtype 1: flags=0x0
qt.vulkan: memtype 2: flags=0x0
qt.vulkan: memtype 3: flags=0x0
qt.vulkan: memtype 4: flags=0x0
qt.vulkan: memtype 5: flags=0x0
qt.vulkan: memtype 6: flags=0x0
qt.vulkan: memtype 7: flags=0x1
qt.vulkan: memtype 8: flags=0x1
qt.vulkan: memtype 9: flags=0x6
qt.vulkan: memtype 10: flags=0xe
qt.vulkan: Picked memtype 10 for host visible memory
qt.vulkan: Picked memtype 7 for device local memory
qt.vulkan: Color format: 44 Depth-stencil format: 129
qt.vulkan: Creating new swap chain of 2 buffers, size 600x370
qt.vulkan: Actual swap chain buffer count: 2 (supportsReadback=1)
qt.vulkan: Allocating 1027072 bytes for transient image (memtype 8)
qt.vulkan: Creating new swap chain of 2 buffers, size 600x368
qt.vulkan: Releasing swapchain
qt.vulkan: Actual swap chain buffer count: 2 (supportsReadback=1)
qt.vulkan: Allocating 1027072 bytes for transient image (memtype 8)
QVulkanWindow: Device lost
qt.vulkan: Releasing all resources due to device lost
qt.vulkan: Releasing swapchain
```

I am not so sure this debug output helps somehow :(( I don't want you to debug it for me; I just want to learn how I should debug it and find where the problem is located. Could you give me a guide, please?

Source code
Source code rendering just a few vertices (working)

The differences between the links are:
- Moved from Qt math libraries to glm
- Moved from QImage to gli for the Texture class
- Added tessellation shaders
- Disabled window sampling
- Rendering terrain using a heightmap and a texture array (added normals and UV)

Thanks
17. Folks, I tried to google Vulkan and SDL2, but the results did not help much; I only found stand-alone or GLFW tutorials. Does anyone have a good tutorial about Vulkan with SDL2? Thanks
18. Would anyone be able to point me to an object-picking example for Vulkan that uses the following approach: render each object with a different color to a texture in a separate render pass, then read those texture pixels back to get the color value. I'm very new to Vulkan, I'm starting to port my OpenGL apps over, and I use this technique a lot.
19. Hello, I am a university student. This year I am going to write a bachelor's thesis about a Vulkan app that renders terrain for real places based on, e.g., Google Maps data. I played World of Warcraft for 10 years, and I did some research about its terrain rendering. It renders the map as a grid of tiles. Each tile had 4 available textures to paint with (now 8, since the Warlords of Draenor expansion). However, I found an issue with this implementation: gaps between tiles. Is there any technique which solves this problem? I read on Stack Overflow that people find the only solution is using a smoothing tool and fixing it manually. Main question: is this terrain rendering technique obsolete? Is there any new technique that replaces this method of rendering a large map as small tiles? Should I try to implement terrain rendering as a grid of tiles, or should I use some modern technique (and can you tell me which is modern for you)? If I should implement terrain as one large map to prevent gaps between tiles, how are textures applied to such a large map? Thanks for any advice.
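A general observation on the gap question above (a sketch of the common fix, not WoW's actual implementation): cracks usually appear when adjacent tiles compute their edge vertices independently, so floating-point results differ. If each tile of N quads per side uses N+1 vertices per side and every edge vertex is derived from a single shared global grid index, the borders of neighboring tiles are bit-identical and no gaps can open. A tiny illustration of that invariant, with hypothetical tile/vertex indexing:

```cpp
#include <cassert>

// World-space X coordinate of vertex `vx` of tile `tx`, for tiles with
// `quads` quads per side and `quadSize` world units per quad. Because the
// result is computed from one shared global index, the right edge of tile
// tx and the left edge of tile tx+1 produce bit-identical positions.
float TileVertexX(int tx, int vx, int quads, float quadSize)
{
    int globalIndex = tx * quads + vx; // shared global grid index
    return globalIndex * quadSize;
}
```

Sampling heights from the heightmap with the same global index keeps the Y coordinates matched as well; tile-based terrain is still widely used, it just needs this shared-edge discipline (or stitching/skirts when neighboring tiles differ in LOD).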
20. I'm writing a rendering system for our in-house game engine. My idea at first was to include only a Vulkan backend, but then Apple refused to port Vulkan to macOS, and Microsoft released their DXR raytracing for DirectX 12. There is still Radeon Rays for Vulkan, but DXR is directly integrated with the graphics API. So we were thinking of a multiple-backend rendering system, with Vulkan for Windows and Linux, DirectX 12 for Windows, and Metal 2 for macOS. But this system would lead to an incredible amount of code to write compared to a single API, so my questions are: Should we stick to Vulkan and maybe use a translation layer like MoltenVK to port it to macOS? Is it worth it to write the multiple-API renderer? Should we write a different renderer for each platform and then ship separate executables? (Sorry for possibly bad English 😁)
21. I have been coding since the '90s, and have released a few minor but successful 3D engines over the years (Vivid3D/Trinity3D/Aurora). You've probably not heard of them, but I am quite skilled in this area. So upon the announcement of RTX cards and their features, I was highly motivated to create a modern 3D engine in C++ (Visual Studio 2017) using Vulkan. At the moment I have a GTX 1060, which is very fast and more than enough to build the base engine. In a few months I'll be getting an RTX 2070 to implement raytracing into the engine. The engine has only been in development for a week or so, but it already has a basic structure using classes. It will be an easy-to-use 3D engine, with support for model imports using Assimp. So my point is, I am looking for any other Vulkan coders who might be interested in helping develop the engine. The code is on GitHub (open), but I can make it private if we decide to commercialize the engine to make money and support further development. I want to make a Deus Ex-like mini-game to test and promote the engine, so you can help with that too, even if you are a 3D artist and interested, because I am just a coder at the moment. So yeah, if anyone is interested, please drop me an email or reply here, and I'll add you to the GitHub project (I'll need your GitHub username). Note: a C# wrapper is planned also, so if you are a C# coder and would like to help with that (demos, wrapper, etc.), that would be very cool. My email is antonyrwells@outlook.com. Thank you.
22. Folks continue to tell me that Vulkan is a viable general-purpose replacement for OpenGL, and with the impending demise of OpenGL on Mac/iOS, I figured it's time to take the plunge... On the surface this looks like going back to the pain of old-school AMD/NVidia/Intel OpenGL driver hell. What I'm trying to get a grasp of is where the major portability pitfalls are up front, and what my hardware test matrix is going to be like.

The validation layers seem useful. Do they work sufficiently well that a program which validates is guaranteed to at least run on another vendor's drivers (setting aside performance differences)? I assume I'm going to need to abstract across the various queue configurations, i.e. single queue on Intel, graphics+transfer on NVidia, graphics+compute+transfer on AMD? That seems fairly straightforward to wrap with a framegraph and bake the framegraph down to the available queues. Memory allocation seems like it's going to be a pain. Obviously there are some big memory-scaling knobs like render target resolution, enabling/disabling post-process effects, asset resolution, etc. But at some point someone has to play Tetris with the available memory on a particular GPU, and I don't really want to shove that off into manual configuration. Any pointers for techniques to deal with this in a sane manner? Any other major pitfalls I'm liable to run into when trying to write Vulkan code to run across multiple vendors/hardware targets?

As for the test matrix, I assume I'm going to need to test one each of recent AMD/NVidia/Intel cards, plus MoltenVK for Apple. Are the differences between subsequent architectures large enough that I need to test multiple generations of cards from the same vendor? How bad is the driver situation in the Android chipset space?
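On the queue-configuration point in the post above, the abstraction can be as simple as "prefer a dedicated family, otherwise fall back to the most general one". A self-contained sketch of that selection logic over mocked queue-family flag bitmasks (the constants are local stand-ins for VkQueueFlagBits, so the example runs without a Vulkan device):

```cpp
#include <cstdint>
#include <vector>

// Local stand-ins for VkQueueFlagBits so the sketch is self-contained.
constexpr uint32_t QUEUE_GRAPHICS = 0x1;
constexpr uint32_t QUEUE_COMPUTE  = 0x2;
constexpr uint32_t QUEUE_TRANSFER = 0x4;

// Prefer a family that supports `wanted` but none of the `avoid` bits
// (i.e. a dedicated family); otherwise fall back to any family that
// supports `wanted`; return -1 if no family qualifies.
int PickQueueFamily(const std::vector<uint32_t>& familyFlags,
                    uint32_t wanted, uint32_t avoid)
{
    int fallback = -1;
    for (std::size_t i = 0; i < familyFlags.size(); ++i) {
        if ((familyFlags[i] & wanted) != wanted) continue;
        if ((familyFlags[i] & avoid) == 0) return static_cast<int>(i);
        if (fallback < 0) fallback = static_cast<int>(i);
    }
    return fallback;
}
```

On an NVidia-like layout {GRAPHICS|COMPUTE|TRANSFER, TRANSFER} this picks the dedicated family 1 for transfer work, while on a single-family Intel-like device everything collapses onto family 0, which is exactly the behavior a framegraph baking step wants.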
23. Hi, I'm searching for a good implementation of Vulkan in C#. It doesn't need to be the fastest, as it's for an editor, but I need the most complete and most active one. I saw a few, but with no updates in the past 5 months, and I don't want to adopt a "dead" library, as I will be using it for the next 3-4 years. Any ideas? Thanks
24. Hi everyone, I think my question boils down to "How do I feed shaders?" I was wondering what good strategies there are for storing mesh transformation data (world matrices) to then be used in the shader for transforming vertices, with performance being the priority. And I'm talking about a game scenario where there are quite a lot of both moving entities and static ones, which aren't repeated enough to be worth instanced drawing. So far I've only tried these naive methods. DX11: store the transforms of ALL entities in one constant buffer (and give each entity an index into the buffer for later modification), or store ONE transform in a constant buffer and change it to the entity's transform before each draw call. Vulkan: use push constants to send the entity's transform to the shader before each draw call, and maybe use a separate device-local uniform buffer for static entities? The same question applies to lights. Any suggestions?
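A common middle ground between the two naive options above (one giant constant buffer versus one update per draw) is a per-frame linear allocator over a single persistently mapped buffer: each frame, every rendered entity bump-allocates an aligned slot, the CPU writes its matrix there, and the draw receives only the byte offset (as a dynamic uniform buffer offset in Vulkan/DX12-style APIs). A sketch of just the allocator logic; the class name and the 256-byte alignment in the usage note are illustrative, with the real alignment coming from device limits:

```cpp
#include <cstddef>

// Per-frame linear (bump) allocator over one big mapped transform buffer.
// Allocate() returns an aligned byte offset for one entity's matrix data;
// Reset() rewinds the allocator at the start of the next frame (once the
// GPU is done with it, e.g. after the frame fence signals).
class TransformAllocator
{
public:
    TransformAllocator(std::size_t capacity, std::size_t alignment)
        : capacity_(capacity), alignment_(alignment) {}

    // Returns the aligned offset for `size` bytes, or SIZE_MAX equivalent
    // when the buffer is exhausted for this frame.
    std::size_t Allocate(std::size_t size)
    {
        std::size_t offset = (head_ + alignment_ - 1) & ~(alignment_ - 1);
        if (offset + size > capacity_) return static_cast<std::size_t>(-1);
        head_ = offset + size;
        return offset;
    }

    void Reset() { head_ = 0; }

private:
    std::size_t capacity_;
    std::size_t alignment_;
    std::size_t head_ = 0;
};
```

Static entities can instead be written once into a separate device-local buffer and referenced by a stable offset, which matches the split the post proposes.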
25. I'm writing a small 3D Vulkan game engine using C++. I'm working in a team, and the other members know almost nothing about C++. About three years ago I found this programming language called D, which seems very interesting, as it's very similar to C++. My idea was to implement core systems like rendering, math, serialization and so on using C++, and then wrap it all with a D framework that is easier to use and less complicated. Is it worth it, or should I stick only to C++? Does it have lower performance compared to a pure C++ application?