
Vulkan New API Rendering Architecture


So I'm currently updating my rendering architecture to gain more performance from the newer APIs (DX12/Vulkan) while still supporting D3D11, and I wanted to get some advice on which architecture to use. As far as I know there are two main architectures.

The first uses a single main thread. This thread performs gameplay logic using a task system and, once that is complete, performs visibility and draw-call logic using the task system and submits commands back on the main thread. As far as I know, the benefit of this approach is reduced input latency, but a consequence is that you have to wait for the rendering tasks to complete before you can start game logic again.

The second uses a main thread and a render thread. After gameplay logic is computed, the main thread syncs data with the rendering thread. The rendering thread computes visibility, draw calls, and command buffer generation using the task system and submits command lists on the rendering thread. A benefit of this approach is that it does not block the computation of gameplay logic, but it creates a frame of latency.

Using a gameplay / render thread doesn't add any latency as the critical path for a frame remains completely unchanged - it's still update + render.
Say update and render cost 4ms each. Each frame takes 8ms to compute, regardless of whether you're using one thread or two.
With the two-threads plus pipelining solution, you start a new frame once every 4ms, so your framerate doubles, but latency is still 8ms per frame :o

Two threads is a bad architecture though because it doesn't scale. Let's say that a job system gives a 4x speed boost on a 4-core CPU.
The single-threaded version now takes 2ms per frame and has 4x the original framerate.

However, you can have your cake and eat it too. Add the job system to the two-threads pipelined version and it's now starting a new frame once every 1ms for 8x the original framerate... However, because you're actively using two threads instead of one, let's say the job system now only gives a 2x speed boost instead of 4x: that means a frame takes 4ms total but a new one is started once every 2ms, so it actually does end up with the same framerate as the single-thread plus jobs version, but double the latency :o

In the real world, a lot of code remains serial and doesn't end up fully utilising the job system though, so personally I do still use the two-threads pipelined plus job system model. I also place my "per system" threads into the job system's thread pool, e.g. on a 4-core CPU, I have one jobs+gameplay thread, one jobs+render thread, and two jobs-only threads.

The main difference with the new APIs is that the generation of command buffers can benefit from threading/jobs.
You can generate command buffers in jobs/threads on D3D11, but you don't gain any performance by doing so, so there's little point. In D3D12, I've found that you need to be recording a few thousand draw calls at a time to see any benefit from a job system... So in my engine, when I'm about to record a command buffer, I check whether the backend reports that it supports fast threaded command buffers (i.e. is D3D12/Vulkan) and whether the draw-item count is over 1000, and then either record the commands immediately on the render thread, or spawn several jobs.
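That heuristic can be sketched like so (the threshold, chunk size, and every name here are illustrative, not a real engine's API; `recordRange` stands in for actual command-buffer recording, and `std::async` stands in for the job system):

```cpp
#include <algorithm>
#include <atomic>
#include <future>
#include <vector>

struct DrawItem { int id = 0; };

// Stand-in for recording one range of draws into a command buffer;
// here it just counts draws so the sketch is self-contained.
std::atomic<int> g_recorded{0};
void recordRange(const std::vector<DrawItem>& items, size_t begin, size_t end) {
    for (size_t i = begin; i < end; ++i) g_recorded.fetch_add(1);
}

// Decide between inline and jobbed recording, mirroring the heuristic above.
void recordCommands(bool fastThreadedBackend, const std::vector<DrawItem>& items) {
    const size_t kThreshold = 1000; // below this, job overhead outweighs the gain
    if (!fastThreadedBackend || items.size() < kThreshold) {
        recordRange(items, 0, items.size()); // record inline on the render thread
        return;
    }
    // Split into chunks and record each chunk in its own job; each job would
    // target its own command buffer, which are then submitted in order.
    const size_t kChunk = 256; // draws per job; tune per platform
    std::vector<std::future<void>> jobs;
    for (size_t begin = 0; begin < items.size(); begin += kChunk) {
        const size_t end = std::min(begin + kChunk, items.size());
        jobs.push_back(std::async(std::launch::async, recordRange,
                                  std::cref(items), begin, end));
    }
    for (auto& j : jobs) j.get(); // wait before submitting the buffers
}
```

On D3D11 the `fastThreadedBackend` flag would simply always be false, so everything records inline and the same front-end code runs on both API generations.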

As for the rendering architecture itself, a more interesting question to me is whether to make a state-machine renderer like the underlying APIs, or a stateless renderer like bgfx or mine :D
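For anyone unfamiliar with the distinction: in a stateless design each draw item carries all the state it needs, instead of relying on whatever was left bound by the previous call, so the renderer is free to sort items however it likes. A minimal sketch (the handle types and fields are illustrative, not from bgfx or any particular engine):

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <vector>

// Illustrative handle types; a real engine would use generational indices.
using ShaderHandle  = uint16_t;
using TextureHandle = uint16_t;
using BufferHandle  = uint32_t;

// One self-contained draw item: no state leaks between items, so the
// backend can reorder them freely.
struct DrawItem {
    uint64_t sortKey = 0;  // e.g. pass | material | depth bits packed together
    ShaderHandle shader = 0;
    std::array<TextureHandle, 4> textures{};
    BufferHandle vertexBuffer = 0;
    uint32_t firstVertex = 0, vertexCount = 0;
};

// Because items are self-contained, the renderer can sort the whole frame
// by key and then submit linearly with minimal redundant state changes.
void sortFrame(std::vector<DrawItem>& frame) {
    std::sort(frame.begin(), frame.end(),
              [](const DrawItem& a, const DrawItem& b) { return a.sortKey < b.sortKey; });
}
```

The state-machine alternative instead exposes `setShader`/`setTexture`/`draw` calls that mutate hidden current state, which maps directly onto D3D11 but makes reordering and threaded recording harder.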


Thank you again for another response! I think I may have described my pattern wrong. In the main thread / render thread system, both threads are using a job system to distribute tasks, so my approach is similar to yours (one game thread, one render thread, and the rest job threads). The "only main thread" method I was also talking about is based on the umbra-ignite-2015-jrmy-virga-dishonored- talk, where they queue the game logic tasks and render tasks from the main thread.

I guess it makes sense that it would not increase latency; I had just read that companies like id Software, Arkane, and a few other studios gave that as the reason they chose the main thread / task system model.


