
Vulkan: Confused with Vulkan subpass dependency


I am looking at the SaschaWillems subpass example to get some insight into subpass dependencies, but it's hard to understand what's going on without any comments. Also, there is not a lot of documentation on subpass dependencies overall.

Looking at the code, I can see that the user specifies the src subpass, dst subpass, src state, and dst state, but there is no mention of which resource the dependency is on. Is a subpass dependency like a pipeline barrier? If yes, how does it issue the barrier? Is the pipeline barrier issued on all attachments in the subpass with the given src and dst access flags? Any explanation would clear up a lot of my doubts about subpass dependencies.

Thank you

Edited by mark_braga


I recently wrote an abstraction for this mechanism so my graphics API would not be D3D12 specific. Given that, I can only really describe this from the point of view of writing the code, but since things seem to be working, I believe the details I figured out are pretty close to accurate.

First off, you need to look at the three related info structures again, since they most certainly do tell you exactly which images are being referenced; it is just a bit indirect. Basically, the render pass info structure holds an array of all images used in the overall pass, and subpasses reference these images via 0-based indexing.

As to the behavior, at the start and end of each subpass the API issues an image transition barrier, if needed, to put the attachment in the requested layout. So, for instance, if you were doing a post-processing blur, you might end up with the following chain of events:

NextSubPass
Transition attachment 0 to writable
.. Draw your scene
NextSubPass
Transition attachment 0 to readable
Transition attachment 1 to writable
.. Draw post processing quad to run vertical blur with input attachment 0 and output attachment 1
NextSubPass
Transition attachment 0 to writable
Transition attachment 1 to readable
.. Draw post processing quad to run horizontal blur with input attachment 1 and output attachment 0

So the attachments involved ping-pong between readable and writable as required for the post-processing to occur.
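The chain of events above can be sketched as a small toy model. This is only an illustration under stated assumptions: `Subpass`, `Access`, and `transitionsFor` are hypothetical names made up for this sketch, not Vulkan types, and the model tracks nothing but which render-pass attachment indices each subpass reads and writes.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical toy model, not Vulkan types: each subpass lists the
// render-pass attachment indices it reads (inputs) and writes (outputs).
struct Subpass {
    std::vector<uint32_t> inputs;
    std::vector<uint32_t> outputs;
};

enum class Access { Undefined, Readable, Writable };

// Walks the subpasses in order and returns, per subpass, a readable list of
// the transitions needed on entry. `state` remembers the access each
// attachment was last left in, so unchanged attachments produce no entry.
std::vector<std::vector<std::string>> transitionsFor(
    const std::vector<Subpass>& subpasses, uint32_t attachmentCount)
{
    std::vector<Access> state(attachmentCount, Access::Undefined);
    std::vector<std::vector<std::string>> result;

    for (const Subpass& sp : subpasses) {
        std::vector<std::string> steps;
        for (uint32_t idx : sp.inputs) {
            if (state[idx] != Access::Readable) {
                steps.push_back("attachment " + std::to_string(idx) + " -> readable");
                state[idx] = Access::Readable;
            }
        }
        for (uint32_t idx : sp.outputs) {
            if (state[idx] != Access::Writable) {
                steps.push_back("attachment " + std::to_string(idx) + " -> writable");
                state[idx] = Access::Writable;
            }
        }
        result.push_back(steps);
    }
    return result;
}
```

Feeding it the scene pass (writes 0), the vertical blur (reads 0, writes 1), and the horizontal blur (reads 1, writes 0) yields the same set of transitions as listed above, although inputs are always reported before outputs within a subpass.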

Hopefully this makes sense and helps you out. I had to look at those structures quite a few times until I figured out the details. The structures themselves are pretty simple; it's just the relationships that are hard to see until you try and fail a couple of times to get the correct behavior.


Thanks for the explanation.

1 hour ago, Hiwas said:

NextSubPass
Transition attachment 0 to writable
.. Draw your scene
NextSubPass
Transition attachment 0 to readable
Transition attachment 1 to writable
.. Draw post processing quad to run vertical blur with input attachment 0 and output attachment 1

Here, are you talking about the attachment in the subpass or in the render pass? (Is attachment 0 relative to pColorAttachments in the subpass or to pAttachments in the render pass?)


In the subpass descriptions you have arrays of VkAttachmentReference, each of which is a uint and a layout. The uint is the 0-based index into the VkRenderPassCreateInfo structure's pAttachments array, where you listed all of the attachments for the render pass. So, effectively, what I'm saying with those is:

// Assume pRenderPass and pSubPass point to the respective Vk structures
// and i is the index of the input attachment we are interested in.
theImageWeWantToMessWith = pRenderPass->pAttachments[ pSubPass->pInputAttachments[i].attachment ];

That is effectively what is going on behind the scenes to figure out which image to call memory barriers on.
So, when I said attachment 0 and 1, I was talking about the index into the VkRenderPassCreateInfo structure's pAttachments array.  Note that render pass info does not separate inputs/outputs etc, it just takes one big list, only subpasses care about usage.
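To make the indexing concrete, here is a minimal sketch of that lookup. The structs are hypothetical stand-ins that only mirror the shape of VkAttachmentReference, VkSubpassDescription, and VkRenderPassCreateInfo (the real ones use raw pointers plus counts and carry more fields), and `resolveInputs` is a made-up helper name.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins, not the real Vulkan structures.
struct AttachmentReference {
    uint32_t attachment;   // 0-based index into RenderPassInfo::attachments
    std::string layout;    // placeholder for the VkImageLayout value
};

struct SubpassDescription {
    std::vector<AttachmentReference> inputAttachments;
    std::vector<AttachmentReference> colorAttachments;
};

struct RenderPassInfo {
    std::vector<std::string> attachments;        // one flat list, no input/output split
    std::vector<SubpassDescription> subpasses;   // only subpasses care about usage
};

// Resolves every input attachment reference of one subpass to the
// render-pass-level attachment it points at. Throws std::out_of_range
// (via at()) if an index is invalid.
std::vector<std::string> resolveInputs(const RenderPassInfo& pass, uint32_t subpassIndex)
{
    std::vector<std::string> resolved;
    for (const AttachmentReference& ref : pass.subpasses.at(subpassIndex).inputAttachments) {
        resolved.push_back(pass.attachments.at(ref.attachment));
    }
    return resolved;
}
```

With a render pass listing two attachments and a second subpass whose input reference holds index 0, the helper returns the first entry of the flat list, which is the same attachment the first subpass wrote as its color output.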

Hope that clarifies things.


So how is the image barrier issued? Is the logic something like this:

for (uint32_t i = 0; i < dependencyCount; ++i)
{
    if (pDependencies[i].srcSubpass == currentSubpass)
    {
        for (uint32_t att = 0; att < pRenderPass->attachmentCount; ++att)
        {
            // Pseudocode: an access mask on the attachment is hypothetical here;
            // VkAttachmentDescription has no such member.
            if (pRenderPass->pAttachments[att].srcAccessMask == pDependencies[i].srcAccessMask)
            {
                // transition the attachment to pDependencies[i].dstAccessMask?
            }
        }
    }
}

 


In a general way, that is fairly close to a very simplistic solution. Unfortunately, at this level it is really all about how clever the drivers get when they solve the path through the DAG generated by the subpasses. They could do the very simplistic thing and just issue a vkCmdPipelineBarrier with top and bottom of pipe flags set between subpasses with dependencies, or they could look at the subpass attachments in detail and figure out a more refined approach. Since this is all just a state transition chain, building a simple DAG allows for a much more optimized approach to issuing a mix of pipeline and memory barriers.
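As a rough sketch of the simplistic end of that spectrum, the toy code below merges every dependency that targets a subpass into one coarse barrier. It is not how any particular driver works. `SubpassDependency` is a hypothetical stand-in that mirrors a few fields of VkSubpassDependency with plain integers as masks, and `Barrier` and `barriersBefore` are names invented for this example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in mirroring a few fields of VkSubpassDependency.
// Real code would use VkAccessFlags and also carry pipeline stage masks.
struct SubpassDependency {
    uint32_t srcSubpass;
    uint32_t dstSubpass;
    uint32_t srcAccessMask;
    uint32_t dstAccessMask;
};

// One barrier a toy "driver" would record before a subpass starts.
struct Barrier {
    uint32_t srcAccessMask;
    uint32_t dstAccessMask;
};

// Simplistic strategy: on entering `dstSubpass`, OR together all
// dependencies that target it into a single coarse barrier. Returns an
// empty vector when nothing targets the subpass, so no barrier is recorded.
std::vector<Barrier> barriersBefore(const std::vector<SubpassDependency>& deps,
                                    uint32_t dstSubpass)
{
    Barrier merged{0, 0};
    bool any = false;
    for (const SubpassDependency& dep : deps) {
        if (dep.dstSubpass == dstSubpass) {
            merged.srcAccessMask |= dep.srcAccessMask;
            merged.dstAccessMask |= dep.dstAccessMask;
            any = true;
        }
    }
    if (!any) {
        return {};
    }
    return {merged};
}
```

A more refined approach would keep the dependencies separate and order them along the subpass DAG instead of collapsing them, which is where the driver cleverness mentioned above comes in.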

I can't find the article I remember that describes some of this, but this one may be of interest, as it is related: https://gpuopen.com/vulkan-barriers-explained/

