Need a barrier after RTV ClearRenderTargetView and before actual rendering?

7 comments, last by galop1n 7 years, 4 months ago

Hey Guys,

a quick question:

Do we need to insert some kind of barrier between an RTV clear and the actual rendering to the same RTV?

It seems we don't need this barrier, but after struggling with a lot of resource state tracking issues in DX12, I tend to overreact whenever two GPU tasks work on the same memory address...

If we don't need this barrier, how can we be sure that by the time I actually write to a pixel, the clear operation on that same pixel has finished?

Thanks in advance

No, you don't need a barrier in the case of a clear followed by a draw. Draws implicitly have ordering guarantees with regard to render target read/write operations: if you issue Draw A before Draw B, then the writes (and blending operations) of Draw B are guaranteed to happen after the writes of Draw A are completed.

Note that these guarantees only apply to render target operations, and not to the shaders themselves or any UAV reads/writes. The pixel shaders can and will execute in any order, and typically the hardware will have some sort of mechanism for ensuring that the RT writes get put in the correct order even though the shaders themselves did not execute in draw order. This is why Rasterizer Ordered Views were added for D3D12, since they let you ensure that writes to UAVs happen in draw order.

There is no need for a barrier between a clear and a render. In fact, if your resource is not in the render target state, the debug layer is likely to yell at you for the clear. You will of course need a barrier from RTV to SRV later; this will instruct the driver to perform fast clear elimination. As a side note, be sure to specify the proper clear color at resource creation so that the fast clear can be enabled.
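
To make the command order concrete, here is a minimal sketch written as a toy model in plain C++, not as real D3D12 code. `MockTexture`, `MockCommandList`, `RecordFrame`, and the `State` enum are hypothetical names invented for this illustration; the comments name the real D3D12 calls and states they stand in for. The validation only loosely imitates what the debug layer checks.

```cpp
#include <array>
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-ins for D3D12_RESOURCE_STATE_RENDER_TARGET and
// D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE.
enum class State { RenderTarget, PixelShaderResource };

using Color = std::array<float, 4>;

// Hypothetical texture: remembers its current state and the clear color given
// at creation (the role D3D12_CLEAR_VALUE plays when creating a real resource).
struct MockTexture {
    State state;
    Color optimizedClear;
};

// Hypothetical command list that logs commands and rejects state misuse.
struct MockCommandList {
    std::vector<std::string> log;

    // Stand-in for ResourceBarrier with a transition barrier.
    void Transition(MockTexture& t, State before, State after) {
        if (t.state != before) throw std::logic_error("barrier 'before' state mismatch");
        t.state = after;
        log.push_back("barrier");
    }

    // Stand-in for ClearRenderTargetView. Returns true when the color matches
    // the creation-time clear value, i.e. when a fast clear is plausible.
    bool Clear(MockTexture& t, const Color& c) {
        if (t.state != State::RenderTarget) throw std::logic_error("clear outside RT state");
        log.push_back("clear");
        return c == t.optimizedClear;
    }

    // Stand-in for a draw call that writes to the render target.
    void Draw(MockTexture& t) {
        if (t.state != State::RenderTarget) throw std::logic_error("draw outside RT state");
        log.push_back("draw");
    }
};

// Clear, then draw with no barrier in between, then a single transition
// before the texture is sampled as an SRV.
inline std::vector<std::string> RecordFrame(MockTexture& rt, MockCommandList& cmd) {
    assert(rt.state == State::RenderTarget);  // precondition of this sketch
    cmd.Clear(rt, rt.optimizedClear);
    cmd.Draw(rt);
    cmd.Transition(rt, State::RenderTarget, State::PixelShaderResource);
    return cmd.log;
}
```

Calling `RecordFrame` on a texture that starts in the render target state logs clear, draw, barrier in that order. Trying to clear again after the transition throws, which is roughly the situation the debug layer complains about.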

Thanks. Just curious how the GPU achieves that ordering of RT writes. Does it give each RT write a draw ID and block out-of-order RT writes (and so possibly block the related PS threads)?


"... later, a barrier from RTV to SRV, this will instruct the driver to perform fast clear elimination."

Thanks for the reply, but this sentence confuses me. Why is there a fast clear elimination when we transition the resource from RTV to SRV? What does this clear elimination do?

Thanks

For the case of traditional "immediate mode" GPUs (the kind you find in the discrete video cards used by laptops and desktops), the magic happens in the ROPs. The ROPs are the bit of hardware that handles memory access to the render targets, and they're capable of sorting their inputs by draw order to ensure that the writes happen in the correct order. See this for more info: https://fgiesen.wordpress.com/2011/07/12/a-trip-through-the-graphics-pipeline-2011-part-9/
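
As a rough illustration of "sorting inputs by draw order", here is a small CPU-side sketch. It is only a model under stated assumptions: the `Fragment` struct, its `drawId` tag, and `ResolveInDrawOrder` are hypothetical, and real ROPs track ordering in hardware-specific ways rather than sorting one big list. The point is that shading may finish in any order while the render target writes are still applied in submission order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One shaded fragment arriving at the toy ROP. `drawId` is a hypothetical tag
// that records the order in which the draws were submitted.
struct Fragment {
    uint32_t drawId;
    uint32_t pixel;
    uint32_t color;
};

// Applies fragments to the render target in submission order, regardless of
// the order in which shading finished. The write is a plain overwrite, so the
// final color depends on the order of application.
inline std::vector<uint32_t> ResolveInDrawOrder(std::vector<Fragment> arrived,
                                                uint32_t pixelCount) {
    // Stable sort keeps the arrival order of fragments within the same draw.
    std::stable_sort(arrived.begin(), arrived.end(),
                     [](const Fragment& a, const Fragment& b) {
                         return a.drawId < b.drawId;
                     });
    std::vector<uint32_t> target(pixelCount, 0);
    for (const Fragment& f : arrived) {
        assert(f.pixel < pixelCount);
        target[f.pixel] = f.color;
    }
    return target;
}
```

If a fragment from draw 1 arrives before a fragment from draw 0 for the same pixel, the pixel still ends up with draw 1's color, because draw 0's write is applied first.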

The fast clear mechanism is a GPU optimisation. The GPU splits your surface into small tiles and keeps a small block of memory for their status. When you clear, only the status of each tile is marked as cleared; the surface itself is not touched. When you render, touched tiles clear themselves (if not fully covered). Then, once you are done, the GPU has to clear the remaining tiles. Hopefully there are not many, as you have covered most of the surface, and so you save on bandwidth.

This is why you provide a clear color at resource creation. Usually, the fast clear will only work with that color.

That kind of system exists for color compression and depth buffer optimisation too. That is why resource barriers are important: they let the driver know when to perform these actions.
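
The tile bookkeeping described above can be sketched as a toy CPU model. Everything here is an assumption made for illustration: the `FastClearSurface` class, its method names, the tile layout, and the write counter are hypothetical, and the "fully covered tile" shortcut is not modelled. Actual hardware and drivers differ in the details.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy render target with one "still cleared" flag per tile.
class FastClearSurface {
public:
    FastClearSurface(std::size_t tileCount, std::size_t pixelsPerTile)
        : pixelsPerTile_(pixelsPerTile),
          pixels_(tileCount * pixelsPerTile, 0),
          tileIsCleared_(tileCount, false) {}

    // Fast clear: only the per-tile metadata changes, no pixel is written.
    void Clear(uint32_t color) {
        clearColor_ = color;
        tileIsCleared_.assign(tileIsCleared_.size(), true);
    }

    // A draw touching one pixel: a still-cleared tile is first filled with
    // the clear color, then the pixel itself is written.
    void WritePixel(std::size_t index, uint32_t color) {
        assert(index < pixels_.size());
        ExpandTile(index / pixelsPerTile_);
        pixels_[index] = color;
        ++pixelWrites_;
    }

    // What an RTV-to-SRV barrier triggers in this model: every tile that is
    // still only marked as cleared gets its pixels written for real.
    void EliminateFastClear() {
        for (std::size_t t = 0; t < tileIsCleared_.size(); ++t) ExpandTile(t);
    }

    uint32_t Read(std::size_t index) const { return pixels_[index]; }
    std::size_t PixelWrites() const { return pixelWrites_; }

private:
    void ExpandTile(std::size_t tile) {
        if (!tileIsCleared_[tile]) return;
        for (std::size_t i = 0; i < pixelsPerTile_; ++i) {
            pixels_[tile * pixelsPerTile_ + i] = clearColor_;
        }
        pixelWrites_ += pixelsPerTile_;
        tileIsCleared_[tile] = false;
    }

    std::size_t pixelsPerTile_;
    std::vector<uint32_t> pixels_;
    std::vector<bool> tileIsCleared_;
    uint32_t clearColor_ = 0;
    std::size_t pixelWrites_ = 0;
};
```

In this model the clear itself costs no pixel writes. Until `EliminateFastClear` runs, untouched tiles still hold stale memory, which is why something has to resolve them before the surface is read as a texture.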

Thanks MJP, that link you posted is super useful :)

And thanks galop1n for elaborating on the RT clear. HW vendors are doing crazy things in a black box... I wish I knew more...

AMD has lots of hardware documentation available if you really want to get into some of the low-level details of their GPU's: http://developer.amd.com/resources/developer-guides-manuals/ (scroll down to "Instruction Set Architecture (ISA) Documents" and "Open GPU Documentation"). Intel also has a ton of docs available: https://01.org/linuxgraphics/documentation

And NVIDIA keeps everything secret :(
