
Vulkan render pass interdependencies?


Hi,

In Vulkan you have render passes, where you specify which attachments to render to and which to read from, and subpasses within the render pass, which can depend on each other. If one subpass needs to finish before another can begin, you specify that with a subpass dependency.
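To make that concrete, a dependency between two subpasses looks roughly like the snippet below; the stage and access masks are just an assumed example for a "subpass 0 writes a color attachment, subpass 1 reads it as an input attachment" case, not something from my engine.

// Hedged illustration: subpass 1 must wait for subpass 0's color writes
// before reading the result as an input attachment.
VkSubpassDependency dep{};
dep.srcSubpass      = 0;                                             // producer subpass
dep.dstSubpass      = 1;                                             // consumer subpass
dep.srcStageMask    = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
dep.dstStageMask    = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT;
dep.srcAccessMask   = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
dep.dstAccessMask   = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT;
dep.dependencyFlags = VK_DEPENDENCY_BY_REGION_BIT;                   // per-region dependency is enough here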

In my engine I don't currently use subpasses, as the concept of a "render pass" translates roughly to setting and clearing a render target followed by a number of draw calls in DirectX, and there isn't really a good way to model subpasses in DX. Because of this, in Vulkan my frame mostly consists of a number of render passes, each with a single subpass.

My question is: do I have to specify dependencies between the render passes, or is that only needed when you have multiple subpasses?

In the Vulkan Programming Guide, chapter 13, it says: "In the example renderpass we set up in Chapter 7, we used a single subpass with no dependencies and a single set of outputs.", which suggests that you only need dependencies between subpasses, not between render passes. However, the (excellent) tutorials at vulkan-tutorial.com have you create a subpass dependency to the "external subpass" in the chapter on "Rendering and presentation", under "Subpass dependencies": https://vulkan-tutorial.com/Drawing_a_triangle/Drawing/Rendering_and_presentation even though they only use one render pass with a single subpass.

So, in short: if I have render pass A, with a single subpass, rendering to an attachment, and render pass B, also with a single subpass, rendering to that same attachment, do I have to specify subpass dependencies between the two subpasses of the render passes in order to make render pass A finish before B can begin, or is that handled implicitly by the fact that they belong to different render passes?

Thanks!



The short answer is that you are likely to have problems and will need to do something to introduce dependencies between render passes.  The longer answer is that this is very driver specific and you might get away with it, at least for a while.  The catch here is that, unlike DX12, Vulkan does do a little driver-level work to help you out, work which you usually end up duplicating yourself in DX12 anyway.  Basically, though, if you issue 10 render passes, the Vulkan driver does not have to heed the order you sent them to the queue unless you have supplied some form of dependency information or explicit synchronization.  Vulkan is allowed to, and probably will, issue the render passes out of order, in parallel, or completely ass backwards depending on the drivers involved.  Obviously this means that the draw commands associated with each begin/next subpass can execute out of order as well.
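To make "explicit synchronization" concrete, here is a rough sketch of one option: a barrier recorded between vkCmdEndRenderPass for pass A and vkCmdBeginRenderPass for pass B. The image handle and the stage/access masks are assumptions for a "pass A writes a color attachment, pass B renders to it again" case, not code from my engine.

#include <vulkan/vulkan.h>

// Hedged sketch: make pass A's color writes available before pass B touches the
// same attachment. 'cmd' is the command buffer, 'sharedImage' the attachment image.
void insertPassToPassBarrier(VkCommandBuffer cmd, VkImage sharedImage)
{
    VkImageMemoryBarrier barrier{};
    barrier.sType               = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
    barrier.srcAccessMask       = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;      // writes from pass A
    barrier.dstAccessMask       = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT |
                                  VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;      // use in pass B
    barrier.oldLayout           = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
    barrier.newLayout           = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
    barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    barrier.image               = sharedImage;
    barrier.subresourceRange    = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };

    vkCmdPipelineBarrier(cmd,
                         VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,      // after A's color output
                         VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,      // before B's color output
                         0, 0, nullptr, 0, nullptr, 1, &barrier);
}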

When I implemented my wrapper, I ended up duplicating the entire concept of render/sub passes in DX12; it was the most appropriate solution I could find that let me solve the various issues on the three primary rendering APIs I wanted to support.  The primary reason for putting the work into this was exactly the problem you are asking about.  At least in my case, once I really dug into and understood render passes, I realized I had basically been doing the same things already, just scattered through my rendering code.  Pushing it down into an abstraction cleaned things up quite a lot and made for a much cleaner API.  Additionally, at a later time it will make it considerably easier to optimize for better parallelism and GPU utilization, since all the transitions and issuance are in one place that I can get clever with, without having to reorganize large swaths of code.

So, yup, I'd suggest you consider implementing the subpass portion because it has a lot of benefits and solves the problem you are asking about.


Interesting. I wonder if you could force render passes into a specific execution order (i.e. the order they are recorded into the command buffer) by using only VK_SUBPASS_EXTERNAL as srcSubpass and dstSubpass.

EDIT: And by that I mean that the vulkan-tutorial.com article seems to suggest that you can refer to a subpass of a previous/subsequent render pass using VK_SUBPASS_EXTERNAL, although I am a bit unsure if that is what it really means.

Of course, it wouldn't be utilizing the GPU to its fullest but it would ensure correct behavior...



BTW, did you manage to hide away all resource barriers and transitions inside your render pass system, or do you still have to issue barrier/transition commands on your command buffers at the rendering-code level?


1 hour ago, GuyWithBeard said:

Interesting. I wonder if you could force render passes into a specific execution order (i.e. the order they are recorded into the command buffer) by using only VK_SUBPASS_EXTERNAL as srcSubpass and dstSubpass.

Yes you can.  This was actually the first approach I tried, but it is very bad for the GPU in that it severely under-utilizes parallelism.  As a quick and dirty 'get it running' solution it is fine, though.  If you read that tutorial again, you might catch that they talk about there being implicit 'subpasses' in each render pass; think of it as 'pre', 'your pass', 'post'.  By setting dependencies on those hidden subpasses you can control render-pass-to-render-pass dependencies, or at least ordering.
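Roughly, that means giving each single-subpass render pass external dependencies like the ones below. This is only a sketch of the idea: the stage and access masks assume color-attachment work on both sides and would need to match what the neighbouring passes actually do, and colorAttachmentDesc/subpassDesc are assumed to be set up elsewhere.

// Hedged sketch: order a single-subpass render pass against work submitted
// before ('pre') and after ('post') it, via VK_SUBPASS_EXTERNAL dependencies.
VkSubpassDependency deps[2]{};

// 'pre': wait for earlier external color writes before subpass 0 starts.
deps[0].srcSubpass    = VK_SUBPASS_EXTERNAL;
deps[0].dstSubpass    = 0;
deps[0].srcStageMask  = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
deps[0].dstStageMask  = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
deps[0].srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
deps[0].dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;

// 'post': make subpass 0's writes visible to later external work.
deps[1].srcSubpass    = 0;
deps[1].dstSubpass    = VK_SUBPASS_EXTERNAL;
deps[1].srcStageMask  = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
deps[1].dstStageMask  = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
deps[1].srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
deps[1].dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;

VkRenderPassCreateInfo rpInfo{};
rpInfo.sType           = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
rpInfo.attachmentCount = 1;
rpInfo.pAttachments    = &colorAttachmentDesc;   // assumed: the shared color attachment description
rpInfo.subpassCount    = 1;
rpInfo.pSubpasses      = &subpassDesc;           // assumed: the single subpass description
rpInfo.dependencyCount = 2;
rpInfo.pDependencies   = deps;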

1 hour ago, GuyWithBeard said:

BTW, did you manage to hide away all resource barriers and transitions inside your render pass system, or do you still have to issue barrier/transition commands on your command buffers at the rendering-code level?

Nope, I still expose quite a bit of that for things not directly within a render pass, since it is common to all three APIs, or a null op where Metal does some of it for you.  The most specific case is the initial upload of a texture, where you still have to copy the data into CPU-visible memory, then do the transitions and copy it over to GPU-only memory.  While I have considered hiding all these items, the goal is to keep the adapter as close to the underlying APIs as possible and maintain the free-threaded, externally synchronized model of DX12/Vulkan and, to a lesser degree, Metal.  Hiding transitions and such would mean building more thread handling and synchronization into that low level than I desire.  This would be detrimental, since I'm generating command buffers from 32 threads (Threadripper is fun), which would really suck if the lowest-level adapter introduced CPU-level synchronization just to automate hiding a couple of barrier and transition calls.
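For reference, that upload sequence looks roughly like this in Vulkan. It's only a sketch with assumed handles ('staging' is a host-visible buffer already filled with pixel data, 'texture' a device-local image created with transfer-dst and sampled usage), not my actual adapter code.

#include <vulkan/vulkan.h>

// Hedged sketch: transition, copy from the CPU-visible staging buffer, transition again.
void uploadTexture(VkCommandBuffer cmd, VkBuffer staging, VkImage texture,
                   uint32_t width, uint32_t height)
{
    // 1) UNDEFINED -> TRANSFER_DST_OPTIMAL so the image can receive the copy.
    VkImageMemoryBarrier toTransfer{};
    toTransfer.sType               = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
    toTransfer.srcAccessMask       = 0;
    toTransfer.dstAccessMask       = VK_ACCESS_TRANSFER_WRITE_BIT;
    toTransfer.oldLayout           = VK_IMAGE_LAYOUT_UNDEFINED;
    toTransfer.newLayout           = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
    toTransfer.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    toTransfer.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    toTransfer.image               = texture;
    toTransfer.subresourceRange    = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };
    vkCmdPipelineBarrier(cmd, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT,
                         0, 0, nullptr, 0, nullptr, 1, &toTransfer);

    // 2) Copy the pixel data from the staging buffer into the GPU-only image.
    VkBufferImageCopy region{};
    region.imageSubresource = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 0, 1 };
    region.imageExtent      = { width, height, 1 };
    vkCmdCopyBufferToImage(cmd, staging, texture, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &region);

    // 3) TRANSFER_DST_OPTIMAL -> SHADER_READ_ONLY_OPTIMAL so shaders can sample it.
    VkImageMemoryBarrier toShader = toTransfer;
    toShader.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
    toShader.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
    toShader.oldLayout     = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
    toShader.newLayout     = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
    vkCmdPipelineBarrier(cmd, VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
                         0, 0, nullptr, 0, nullptr, 1, &toShader);
}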

Long story short, the middle-layer 'rendering engine' will probably hide all those details.  I just haven't really bothered much and hand-code a lot of it, since I'm more focused on game systems than on rendering right now.

1 hour ago, Hiwas said:

Yes you can. 

Cool, thanks. I am gonna go with that for now, because I am in the middle of implementing a large feature and now is not the right time to rewrite my render pass system.



