Any tutorial on Multipass Rendering?

Started by
10 comments, last by swiftcoder 3 years, 2 months ago

Are there any good tutorials on multi-pass rendering?

I would consider it a bonus if they had some code (C/C++) as well.



I'm assuming you are working with a lower-level API?

There are many, though “good” can be subjective. I assume when you say “multi-pass” you are referring to the general process of rendering some geometry into an FBO/RT/DSV, then transitioning that resource into something else (perhaps something that can be sampled in a shader)?

OGLDev has some examples of how you might accomplish this, and the process isn't terribly different between OpenGL and D3D11.

http://ogldev.atspace.co.uk/www/tutorial23/tutorial23.html - Demonstrates rendering the scene from the perspective of a directional light into a depth buffer, which is later used as a shader resource for shadow mapping in a subsequent pass.

http://ogldev.atspace.co.uk/www/tutorial35/tutorial35.html - Demonstrates a pass over the scene geometry that stores normals/albedo/specular into textures, decoupling the light calculations from that geometry in a subsequent pass.

https://developer.nvidia.com/gpugems/gpugems3/part-iv-image-effects/chapter-23-high-speed-screen-particles - Demonstrates a particle optimization: downsample your depth buffer to a lower resolution, render your particles to a low-resolution RT in a separate pass (testing against that downsampled depth buffer), and blend the result back in.

Let me know if this helps, or not.

Yes I'm on DirectX.

I do know about multipass rendering, but everything I could gather and implement came from reading discussions, forum threads, etc. I have yet to read an “official” tutorial, guide, or thesis on it, so I'm wondering if one exists that might help solidify my knowledge a bit.

Thanks.


Ah, well, for simple cases the process generally looks like this (for D3D11):

  • Create an ID3D11Texture2D* with a D3D11_TEXTURE2D_DESC whose bind flags are something like D3D11_BIND_RENDER_TARGET | D3D11_BIND_SHADER_RESOURCE; for a depth stencil you'd want D3D11_BIND_DEPTH_STENCIL | D3D11_BIND_SHADER_RESOURCE (note that to also create a shader resource view from a depth buffer, the texture needs a typeless format such as DXGI_FORMAT_R24G8_TYPELESS)
  • Create a RenderTargetView/DepthStencilView from that texture resource by filling out a D3D11_RENDER_TARGET_VIEW_DESC/D3D11_DEPTH_STENCIL_VIEW_DESC and calling the CreateRenderTargetView/CreateDepthStencilView methods on the device (not the device context)
  • Set it via the OMSetRenderTargets method on the device context
  • Do your relevant drawing
  • Transition back to the primary RenderTargetView associated with the textures on your swap chain with the same OMSetRenderTargets call; when the time comes, you can use the texture you created in the first step as a shader input
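A minimal sketch of those steps in C++ (assumptions: you already have a valid ID3D11Device* device, ID3D11DeviceContext* context, a backBufferRTV from the swap chain, and a depthStencilView; all error checking omitted):

```cpp
// Hedged sketch, not a complete program. Variables device, context,
// backBufferRTV, and depthStencilView are assumed to exist already.
D3D11_TEXTURE2D_DESC texDesc = {};
texDesc.Width            = 1280;
texDesc.Height           = 720;
texDesc.MipLevels        = 1;
texDesc.ArraySize        = 1;
texDesc.Format           = DXGI_FORMAT_R8G8B8A8_UNORM;
texDesc.SampleDesc.Count = 1;
texDesc.Usage            = D3D11_USAGE_DEFAULT;
texDesc.BindFlags        = D3D11_BIND_RENDER_TARGET | D3D11_BIND_SHADER_RESOURCE;

ID3D11Texture2D*          offscreenTex = nullptr;
ID3D11RenderTargetView*   offscreenRTV = nullptr;
ID3D11ShaderResourceView* offscreenSRV = nullptr;

// Views are created on the device, not the device context.
device->CreateTexture2D(&texDesc, nullptr, &offscreenTex);
device->CreateRenderTargetView(offscreenTex, nullptr, &offscreenRTV);
device->CreateShaderResourceView(offscreenTex, nullptr, &offscreenSRV);

// Pass 1: draw into the offscreen target.
context->OMSetRenderTargets(1, &offscreenRTV, depthStencilView);
// ... issue draw calls ...

// Pass 2: switch back to the swap chain's render target and sample the
// offscreen texture in the pixel shader (slot 0 here is arbitrary).
context->OMSetRenderTargets(1, &backBufferRTV, depthStencilView);
context->PSSetShaderResources(0, 1, &offscreenSRV);
// ... issue draw calls that sample the texture ...

// Before rendering into offscreenTex again next frame, unbind the SRV so
// the resource is never bound as both input and output at the same time.
ID3D11ShaderResourceView* nullSRV = nullptr;
context->PSSetShaderResources(0, 1, &nullSRV);
```

D3D11 will itself unbind conflicting views and warn in the debug layer, but unbinding explicitly keeps the intent clear.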

Rastertek and Braynzarsoft have tutorials with more of a D3D focus.

At least for d3d11, the process isn't too different from the hoops you need to jump through to perform the bare minimum setup to get something on screen, aside from a few extra steps, i.e., explicitly having to create the backing Texture2D resource with the appropriate bind flags instead of just grabbing it from the swap chain.

http://www.rastertek.com/dx11tut22.html

https://www.braynzarsoft.net/viewquestion/q33611-directx-11

But let me know if that's of some help. I assume you are using DirectX 11.

[EDIT] - I should probably expand on this, as I believe you might also be asking what this looks like in an engine or rendering framework, and how to manage it in a sensible way. The way I've done it before is essentially a vector of Pass objects; each pass describes its input/output resources, what will be drawn in that pass, the render targets it will use, their dependencies, etc.

I like this approach as it offers a good deal of flexibility. Another approach I've tried is an opaque world renderer that hides most notions of passes and manages all the passes involving lighting/shadowing/screen-space work behind the scenes, configured with calls like AddModel(DrawItem)/WriteToShadowMap(DrawItem)/AddPointLight(…), where a DrawItem represents all the state (shaders, blend states, rasterizer states, buffers) needed for a particular draw call. It only somewhat exposes a pass system for the post-process stuff, which the world renderer processes last. But I'm sure other people could weigh in on this, as I'm far more of a hobbyist than a professional.
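A rough sketch of that first approach, the vector of Pass objects (all names here are hypothetical; a real Pass would also carry shaders, render state, target bindings, and draw items):

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical Pass: names the resources it reads and writes, and knows
// how to issue its own draw calls.
struct Pass {
    std::string name;
    std::vector<std::string> inputs;   // resources read (sampled) this pass
    std::vector<std::string> outputs;  // render targets / buffers written
    std::function<void()> execute;     // binds state and issues the draws
};

class Renderer {
public:
    void addPass(Pass p) { passes_.push_back(std::move(p)); }

    // Simplest scheme: run the passes in the order they were added.
    void renderFrame() {
        for (Pass& p : passes_)
            p.execute();
    }

private:
    std::vector<Pass> passes_;
};
```

The inputs/outputs aren't consulted by renderFrame() in this sketch, but they are exactly what a dependency sort or a validation step would operate on later.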

This is a pretty good read:

https://www.slideshare.net/DICEStudio/framegraph-extensible-rendering-architecture-in-frostbite

@markypooch Thanks for elaborating. I have read most of the resources you linked. Good stuff.

"Each pass describes its input/output resources": I'm doing something similar. You mean that a Pass might depend on another Pass? For example, Pass1 outputs some resource (say, a render-target texture) that is used as input by Pass2, which reads the texture, processes it further, and outputs to the final back buffer or to some Pass3. Input and Output are generic classes, and each Pass has two vectors, one for each. Is that what you're saying?


@Key_C0de More or less, yeah. Like that Frostbite deck I linked shows, there are numerous ways to approach this problem, ranging from the simple to the fairly complex, and it just depends on what you need out of it.

@markypooch That's cool. I've also been working on this; I read about it elsewhere and it's the design I settled on. It's good to find someone else who adheres to it.

The problem I have with it, though, is that some passes need to be executed before others. I suppose the Input/Output takes care of these dependencies, but I need to go further into development to clear up some of my doubts. I want a way for this to be multithreaded (separate threads recording draw calls independently), then put everything into a render queue. Finally, once everything's built, the main thread dispatches it all to the GPU.


Key_C0de said:
The problem I have with it though is that some passes need to be executed before others. I suppose the Input/Output takes care of these dependencies, but I need to go further into development of it to clear some of my doubts.

If you think about the inputs/outputs, they form a directed graph. And that means you can run a topological sort to determine the correct rendering order.

Once it's ordered by dependency, parallelising the result is also straightforward: loop through the list in order, dispatching any passes whose dependencies are all ready, until all passes are complete.
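For illustration, here's one way that topological sort over pass inputs/outputs could look in C++, using Kahn's algorithm (the Pass struct and resource names are hypothetical, not tied to any particular engine):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical Pass type: each pass names the resources it reads and writes.
struct Pass {
    std::string name;
    std::vector<std::string> inputs;
    std::vector<std::string> outputs;
};

// Kahn's algorithm over the implicit graph: pass B depends on pass A when B
// reads a resource that A writes. Returns pass names in an order where every
// producer runs before its consumers.
std::vector<std::string> sortPasses(const std::vector<Pass>& passes) {
    // Resource name -> index of the pass that writes it
    // (last writer wins, which is good enough for a sketch).
    std::map<std::string, std::size_t> producer;
    for (std::size_t i = 0; i < passes.size(); ++i)
        for (const std::string& out : passes[i].outputs)
            producer[out] = i;

    // For each pass, collect the distinct passes it waits on.
    std::vector<std::set<std::size_t>> deps(passes.size());
    std::vector<std::size_t> indegree(passes.size(), 0);
    for (std::size_t i = 0; i < passes.size(); ++i)
        for (const std::string& in : passes[i].inputs) {
            auto it = producer.find(in);
            // Ignore self-edges (a pass that reads what it also writes).
            if (it != producer.end() && it->second != i &&
                deps[i].insert(it->second).second)
                ++indegree[i];
        }

    std::vector<std::size_t> ready;
    for (std::size_t i = 0; i < passes.size(); ++i)
        if (indegree[i] == 0) ready.push_back(i);

    std::vector<std::string> order;
    while (!ready.empty()) {
        std::size_t cur = ready.back();
        ready.pop_back();
        order.push_back(passes[cur].name);
        for (std::size_t i = 0; i < passes.size(); ++i)
            if (deps[i].count(cur) && --indegree[i] == 0)
                ready.push_back(i);
    }
    return order;  // order.size() < passes.size() would indicate a cycle
}
```

With a shadow pass writing shadow_map, an opaque pass reading shadow_map and writing hdr, and a tonemap pass reading hdr, this yields shadow, opaque, tonemap regardless of the order the passes were declared in.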

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

@swiftcoder Thanks and Happy new year 2021.

So you are saying I should organize the Inputs/Outputs (under a common class, say GraphNode) and let them be the Renderer's graph nodes. The Renderer then contains a Graph data structure, and when I have to render I run a topological sort to determine what needs to run first, based on the dependency relationships.

The thing is that in a Pass's execute() the draw calls are dispatched (not by any action of the Input/Output classes; those exist just to set dependencies between Passes). Therefore I have to figure out how to sort these Passes (which are owned by the Renderer) based on the topological sort of the Input/Output graph nodes. Once the Passes are sorted, I can call execute() on them one at a time, which dispatches the draw calls to the GPU.

I hope you're following. Maybe I'm just thinking out loud here, because that's how I've organized my codebase and I want an evaluation.

Thus: “How to sort the Passes”?

Maybe I should run the topological sort on the Passes themselves, based on some criteria involving their Inputs/Outputs. It's not clear to me yet how to do that automatically. The plot thickens…

EDIT

Should I even bother sorting the Passes though (in my small game engine)?

Just execute them at creation/setup order I guess..


EDIT

Should I even bother sorting the Passes though (in my small game engine)?

Just execute them at creation/setup order I guess..

That's what I did, and so far it works well: SSAO, depth pre-pass, tiled Forward+ lighting, CSM, FXAA, debug layers, etc., and I haven't yet had a need for some sort (hah!) of sorting. I also realized that with inputs/outputs spanning multiple passes (the same resource used in one pass, then reused or added to in another), it's hard to determine at which point a given resource contains what some other pass expects. Probably doable, but I kept it simpler: just linear execution of passes. Not making Frostbite after all, and from what I saw their render graph is a big mess, so I prefer this simplicity ;P

I really suggest starting small; once you actually run into the need for sorting and rearranging your passes, you can add it easily. I find it more useful to be able to rearrange them at runtime via my config (to test various paths and effects), and I also value the consistency of a data-driven renderer setup: everything comes from that config, even the compute buffers and their structure. It really opens up tons of possibilities to expand your rendering, and it makes it actually fun.

I can elaborate a bit on how this is structured in code, but it's really just common sense: passes (of different fixed types: scene render, cascaded shadow map, blur, blit, compute) plus output/input vectors. There is some magic involved in setting the resources up properly (sometimes deferred setup is needed) and in making all resources react to window resizes, changing their sizes, etc. But it's definitely worth it in the long run.

Here is the definition of my sample render graph, all serially executed. Still WIP, heavily based on the Stingray renderer config with some Frostbite mixed in. It still needs lots of features: static and dynamic branching (disabling certain passes and having any dependent passes gracefully handle that without crashing), flexible pass parameters that can be set via the config (currently they're hardcoded), some kind of modules containing passes that can be reused multiple times, etc.

passes: [
   # depth pre-pass and population of structure buffer
   {
       name: depth_prepass_structure,
       type: scene,
       outputs: [ structure_buffer, depth_stencil_buffer ],
       clear: [ depth, color ],
       render_state: opaque,
       queues: main_scene,
       context: depth_and_structure
   },

   {
       name: shadow_map,
       type: cascaded_shadow_map,
       outputs: [ shadow_map ],
       clear: [ depth ],
       queues: main_scene,
       render_state: opaque_shadowmap,
       context: cascaded_shadow_map
   },

   {
       name: ssao_buffer,
       type: screen_space,
       clear: [
           color,
           depth
       ],      
       shader: ssao,
       render_state: fullscreen_noblend,
       inputs: [
           structure_buffer,
           ssao_rotations
       ],
       outputs: [ occlussion_buffer ]
   },

   {
       name: ssao_buffer_blur,
       type: screen_space,
       clear: [ color, depth ],
       shader: ssao_blur,
       render_state: fullscreen_noblend,
       inputs: [ structure_buffer, occlussion_buffer ],
       outputs: [ occlussion_buffer_blurred ]
   },

   {
       name: forward_light_culling,
       type: compute,
       shader: light_culling,
       workgroupsx: "(screen_width + tile_size - 1) / tile_size",
       workgroupsy: "(screen_height + tile_size - 1) / tile_size",
       write_buffers: [ visible_light_indices_buffer ],
       inputs: [ depth_stencil_buffer ]
   },

   {
       name: forward_light_culling_debug,
       type: screen_space,
       clear: [ color ],
       render_state: fullscreen_noblend,
       shader: light_culling_debug,
       outputs: [ debug_color ] 
    },

   {
       name: opaque_geometry,
       type: scene,
       clear: [ color, depth ],
       render_state: opaque,
       inputs: [ shadow_map, occlussion_buffer_blurred ],
       outputs: default_framebuffer,
       queues: [ scene_opaque, scene_alpha_tested ]
   },
 
   {
       name: hdr_light_accum,
       type: scene,
       clear: [ color ],
       render_state: opaque_nodepthwrite,
       inputs: [ occlussion_buffer_blurred, shadow_map ],
       outputs: [ hdr, hdr_brightness, depth_stencil_buffer, main_color ],        
       queues: main_scene
   },

   {
       name: bloom,
       type: blur,
       repeats: 10,
       render_state: fullscreen_noblend,
       inputs: [ hdr_brightness, hdr_blur ],
       outputs: [ hdr_brightness, hdr_blur ],
       shader: blur_gaussian
   },      

   {
       name: hdr_tone_mapping,
       type: screen_space,
       clear: [ color ],
       render_state: fullscreen_noblend,
       inputs: [ hdr, hdr_brightness ],
       outputs: [ base ],
       shader: tone_mapping        
   },    

   {
       name: skybox,
       type: screen_space,
       render_state: fullscreen_depthtest,
       inputs: [ skybox ],        
       outputs: [ base, depth_stencil_buffer ],
       shader: skybox    
   },      
         
   {
       name: fxaa_filter,
       type: screen_space,
       clear: [ color ],
       inputs: [ base ],
       outputs: [ base_2 ],        
       render_state: fullscreen_noblend,
       shader: filter_fxaa
   },

   {
       name: debug_draw,
       type: scene,
       outputs: [ base_2, depth_stencil_buffer ],
       render_state: debug_depth_blend,
       queues: debug
   },           

   {
       name: debug_draw_no_depth,
       type: scene,
       outputs: [ base_2, depth_stencil_buffer ],
       render_state: debug_no_depth_blend,
       queues: debug_nodepth
   },  
   
   {
       name: blit_main_color,
       type: blit,
       clear: [ color, depth ],
       inputs: [ base_2 ],
       outputs: default_framebuffer,
       render_state: opaque
   },
 
   {
       name: imgui,
       type: overlay,
       outputs: default_framebuffer,
       render_state: overlay,
       queues: imgui
   },

   {
       name: game_ui,
       type: overlay,
       outputs: default_framebuffer,
       render_state: overlay,
       queues: [ game_ui ]
   }
]


Where are we and when are we and who are we?
How many people in how many places at how many times?

This topic is closed to new replies.
