MRT and Anti-aliasing

Graphics and GPU Programming

Started by Medo Mex September 05, 2013 10:19 PM

7 comments, last by Tispe 10 years, 7 months ago

Medo Mex

891

Author

September 05, 2013 10:19 PM

I have two render targets, 0 = color and 1 = depth.

Now I'm trying to get anti-aliasing to work, since it's not working anymore.

I read that I could use IDirect3DDevice9::StretchRect to get anti-aliasing to work. How do I use the function correctly?

I get a black screen if I enable anti-aliasing.

Tispe

1,468

September 06, 2013 05:44 AM

This is just a guess, but if you render to a surface with 4x the area (double the width and height) and then downsample it using StretchRect, it gives the effect of AA.
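To make the downsampling idea concrete, here is a minimal CPU-side sketch in C++ of a 2x2 box filter over a grayscale image rendered at double width and height. It only illustrates the averaging that gives the AA effect; it is not what StretchRect or the driver does internally, and the function name `downsample2x` is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Average each 2x2 block of a double-resolution grayscale image into one
// output pixel. Hypothetical helper for illustration; assumes even dimensions.
std::vector<std::uint8_t> downsample2x(const std::vector<std::uint8_t>& src,
                                       int srcWidth, int srcHeight) {
    assert(srcWidth % 2 == 0 && srcHeight % 2 == 0);
    const int dstWidth = srcWidth / 2;
    const int dstHeight = srcHeight / 2;
    std::vector<std::uint8_t> dst(static_cast<std::size_t>(dstWidth) * dstHeight);
    for (int y = 0; y < dstHeight; ++y) {
        for (int x = 0; x < dstWidth; ++x) {
            const int sx = 2 * x;  // top-left corner of the 2x2 source block
            const int sy = 2 * y;
            const int sum = src[sy * srcWidth + sx] +
                            src[sy * srcWidth + sx + 1] +
                            src[(sy + 1) * srcWidth + sx] +
                            src[(sy + 1) * srcWidth + sx + 1];
            // Integer average of the four samples, rounded to nearest.
            dst[y * dstWidth + x] = static_cast<std::uint8_t>((sum + 2) / 4);
        }
    }
    return dst;
}
```

A pixel whose four source samples straddle a hard edge ends up with an intermediate value, which is the softened edge that anti-aliasing is after.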

Hodgman

52,717

September 06, 2013 06:30 AM

Unfortunately, D3D9 doesn't allow MSAA targets to be used with MRT:

Multiple Render Targets (Direct3D 9)
No antialiasing is supported.

If you want to support this, you can emulate MRT with multiple passes -- first render the scene to your first target, then render the scene again to your second target.

. 22 Racing Series .

Medo Mex

891

Author

September 06, 2013 07:13 PM

@Hodgman: Hmm.. don't you think that will lower the performance?

What do modern games do?

@Tispe: Which is better for performance? The technique that you mentioned or rendering the scene twice?

DwarvesH

510

September 06, 2013 09:04 PM

Wasn't the problem with DirectX 9 that while you could create all your targets with some MSAA technique, because of the way you did lighting and accumulated all the targets with the lighting the results were not correct? And DirectX 10.1 gave you access to all the samples in a multisampled target and also gave you access to the depth buffer in a similar manner?

Unfortunately, D3D9 doesn't allow MSAA targets to be used with MRT:

Multiple Render Targets (Direct3D 9)

No antialiasing is supported.

If you want to support this, you can emulate MRT with multiple passes -- first render the scene to your first target, then render the scene again to your second target.

Would this look correct? Rendering the albedo, normal and specular targets all separately, with MSAA on and combining them in the final shader would give artifact-less and correct rendering? I'm asking just about the correctness. I know this would be slow, because you need to render the whole thing 3 times into separate multi-sampled targets and resolve each to a texture before combination.

@Hodgman: Hmm.. don't you think that will lower the performance?

What do modern games do?

@Tispe: Which is better for performance? The technique that you mentioned or rendering the scene twice?

I'm not using deferred rendering, so I have full access to MSAA (but not CSAA, damn you XNA), but I also optionally support post-processing AA. I have FXAA support, which is incredibly fast but in my opinion gives very blurry results. And I have support for SMAA, which gives extremely good results on still images, sometimes even better than 8x MSAA, but has more temporal aliasing. It is a lot slower than FXAA, but faster than 8x MSAA. The good thing is that the cost per frame is constant.

Both are very easy to implement. If you know how to render a full-screen quad and do something simple like bloom, you can easily take a reference sample of FXAA or SMAA and integrate it into your engine within minutes.

There is also a matter of preference. I can't stand the blurriness of FXAA, but others don't notice it. For a test, if you have an NVIDIA card, try out 32x CSAA with an ultra-setting SMAA. You may really like the look, with both techniques rarely building up temporal aliasing constructively. Just don't expect high framerates :).

My blegh: http://dwarvesh.blogspot.com

Hodgman

52,717

September 08, 2013 03:05 AM

@Hodgman: Hmm.. don't you think that will lower the performance?

Yes, quite possibly. But your current version doesn't even work, so I'd value bad performance higher than no performance.

However, in some cases, this may actually increase performance. Drawing the scene twice, where the first time you only draw depth, and the second time you use your real pixel shaders, is known as a "z pre pass", "depth pre pass", "zpp", etc.

Doom 3 chose to do this on purpose, because it lets the GPU take full advantage of the depth buffer, to avoid overdraw.

Say you've got a camera looking through 3 walls:
Cam -> |A| |B| |C|
If you draw C, then B, then A, then you're running 3 different pixel shaders, even though only the last one (A) counts -- it overwrites the previous results. That's "overdraw".

If you draw the whole scene's depth buffer first, then in the second pass B & C will be skipped, because they fail the depth test.
In scenes that have a lot of overdraw, a ZPP may actually improve performance.
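The three-wall example can be simulated for a single pixel. The sketch below is a simplified model, assuming the depth test rejects a fragment before its pixel shader runs and that every wall covers the pixel; `Wall` and `countShaderRuns` are hypothetical names used only here.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical scene element: one wall covering the pixel at a given depth.
struct Wall {
    char name;
    float depth;  // smaller means closer to the camera
};

// Counts how often the expensive pixel shader runs for one pixel when the
// walls are drawn in the given order with a less-or-equal depth test.
// With prePass == true, depth is laid down first without any shading.
int countShaderRuns(const std::vector<Wall>& drawOrder, bool prePass) {
    assert(!drawOrder.empty());
    float depthBuffer = std::numeric_limits<float>::infinity();
    if (prePass) {
        for (const Wall& w : drawOrder) {
            if (w.depth < depthBuffer) depthBuffer = w.depth;  // depth only
        }
    }
    int shaderRuns = 0;
    for (const Wall& w : drawOrder) {
        if (w.depth <= depthBuffer) {  // depth test passes
            depthBuffer = w.depth;
            ++shaderRuns;              // the pixel shader executes
        }
    }
    return shaderRuns;
}
```

Drawing C, then B, then A shades the pixel three times without the pre-pass and once with it. Drawing front to back (A, B, C) also shades it once, which is why sorting opaque geometry by distance is the usual alternative to a pre-pass.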

What do modern games do?

Use DX11, where you can use MSAA textures and MRT at the same time.

Another solution though would be to use a post-process anti-aliasing solution, like FXAA instead of MSAA, as mentioned above.

Wasn't the problem with DirectX 9 that while you could create all your targets with some MSAA technique, because of the way you did lighting and accumulated all the targets with the lighting the results were not correct? And DirectX 10.1 gave you access to all the samples in a multisampled target and also gave you access to the depth buffer in a similar manner?

Yes, D3D9 gives no way to access the sub-samples (besides averaging them all together in a standard resolve step), so even if you do manage to render out an MSAA G-buffer, you can't make use of it.

Would this look correct? Rendering the albedo, normal and specular targets all separately, with MSAA on and....

Nope. Medo3337 isn't rendering a G-buffer. He's just using forward rendering, but also outputting depth to a colour texture.

The averaging of depth will also be wrong, but probably still close enough to correct for most purposes.
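As a small illustration of why averaged depth is "wrong", the C++ sketch below applies a plain average to the sub-samples of one pixel, which is how the standard resolve is described above. The sample values and the name `resolveAverage` are made up for the example.

```cpp
#include <cassert>
#include <vector>

// Plain average of all sub-samples in one pixel. Hypothetical helper that
// stands in for a standard MSAA resolve of a target holding depth values.
float resolveAverage(const std::vector<float>& samples) {
    assert(!samples.empty());
    float sum = 0.0f;
    for (float s : samples) sum += s;
    return sum / static_cast<float>(samples.size());
}
```

For an edge pixel with two samples on a near surface at depth 0.25 and two on a far surface at depth 0.75, the resolved value is 0.5, a depth at which neither surface exists. For interior pixels all samples agree, so the average is exact, which is why the result is usually close enough.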

. 22 Racing Series .

DwarvesH

510

September 10, 2013 10:18 AM

I've heard of Z-pass before. I actually read about it. To quote directly:

Double-Speed Z-Only and Stencil Rendering

All GeForce Series GPUs (FX and later) render at double speed when rendering only depth or stencil values. To enable this special rendering mode, you must follow the following rules:

• Color writes are disabled
• Texkill has not been applied to any fragments (clip, discard)
• Depth replace (oDepth, texm3x2depth, texdepth) has not been applied to any fragments
• Alpha test is disabled
• No color key is used in any of the active textures

See section 6.4.1 for information on NULL render targets with double speed Z.

3.6.2. Z-cull Optimization

Z-cull optimization improves performance by avoiding the rendering of occluded surfaces. If the occluded surfaces have expensive shaders applied to them, z-cull can save a large amount of computation time. See section 4.8 for a discussion on Z-cull and how to best use it.

3.6.3. Lay Down Depth First (“Z-only rendering”)

The best way to take advantage of the two aforementioned performance features is to “lay down depth first.” By this, we mean that you should use double-speed depth rendering to draw your scene (without shading) as a first pass. This then establishes the closest surfaces to the viewer. Now you can render the scene again, but with full shading. Z-cull will automatically cull out fragments that aren't visible, meaning that you save on shading computations.

Laying down depth first requires its own render pass, but can be a performance win if many occluded surfaces have expensive shading applied to them. Double-speed rendering is less efficient as triangles get small. And, small triangles can reduce z-cull efficiency.

But I'm not sure exactly what I should do to benefit 100% from this technique and the "double speed" Z rendering, or how to satisfy all those points.
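One way to read the five quoted rules is as a checklist that the depth-only pass has to satisfy. The C++ sketch below encodes them as a predicate; the struct and its field names are hypothetical and are not part of any D3D or NVIDIA API, they only summarize state the application already knows about its own pass.

```cpp
#include <cassert>

// Hypothetical summary of a depth-only pass, one flag per quoted rule.
struct DepthPassState {
    bool colorWritesEnabled;
    bool shaderKillsFragments;  // texkill / clip / discard
    bool shaderWritesDepth;     // oDepth, texm3x2depth, texdepth
    bool alphaTestEnabled;
    bool textureUsesColorKey;
};

// True when none of the five quoted conditions is violated.
bool qualifiesForDoubleSpeedZ(const DepthPassState& s) {
    return !s.colorWritesEnabled &&
           !s.shaderKillsFragments &&
           !s.shaderWritesDepth &&
           !s.alphaTestEnabled &&
           !s.textureUsesColorKey;
}
```

In D3D9 terms, the first and fourth flags correspond to setting the D3DRS_COLORWRITEENABLE render state to 0 and D3DRS_ALPHATESTENABLE to FALSE for the depth pass. The shader-related flags depend on what the depth-pass pixel shader does, so alpha-tested geometry would need to be drawn outside the double-speed pass.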

Then there is this section that makes things even more confusing:

ZCULL and EarlyZ: Coarse and Fine-grained Z and Stencil Culling

NVIDIA GeForce 6 series and later GPUs can perform a coarse level Z and Stencil culling. Thanks to this optimization large blocks of pixels will not be scheduled for pixel shading if they are determined to be definitely occluded. In addition, GeForce 8 series and later GPUs can also perform fine-grained Z and Stencil culling, which allow the GPU to skip the shading of occluded pixels. These hardware optimizations are automatically enabled when possible, so they are mostly transparent to developers. However, it is good to know when they cannot be enabled or when they can underperform to ensure that you are taking advantage of them.

Coarse Z/Stencil culling (also known as ZCULL) will not be able to cull any pixels in the following cases:

1. If you don’t use Clears (instead of fullscreen quads that write depth) to clear the depth-stencil buffer.
2. If the pixel shader writes depth.
3. If you change the direction of the depth test while writing depth. ZCULL will not cull any pixels until the next depth buffer Clear.
4. If stencil writes are enabled while doing stencil testing (no stencil culling)
5. On GeForce 8 series, if the DepthStencilView has Texture2D[MS]Array dimension

Also note that ZCULL will perform less efficiently in the following circumstances:

1. If the depth buffer was written using a different depth test direction than that used for testing
2. If the depth of the scene contains a lot of high frequency information (i.e.: the depth varies a lot within a few pixels)
3. If you allocate too many large depth buffers.
4. If using DXGI_FORMAT_D32_FLOAT format

Similarly, fine-grained Z/Stencil culling (also known as EarlyZ) is disabled in the following cases:

1. If the pixel shader outputs depth
2. If the pixel shader uses the .z component of an input attribute with the SV_Position semantic (only on GeForce 8 series in D3D10)
3. If Depth or Stencil writes are enabled, or Occlusion Queries are enabled, and one of the following is true:
• Alpha-test is enabled
• Pixel Shader kills pixels (clip(), texkill, discard)
• Alpha To Coverage is enabled
• SampleMask is not 0xFFFFFFFF (SampleMask is set in D3D10 using OMSetBlendState and in D3D9 setting the D3DRS_MULTISAMPLEMASK renderstate)
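The EarlyZ conditions quoted above combine an "or" with an "and", which is easy to misread. The following C++ sketch spells out rules 1 and 3 as a predicate. Rule 2 is hardware-specific and left out, and the struct and its field names are hypothetical, not taken from any API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of the state that the quoted EarlyZ rules mention.
struct PixelPassState {
    bool shaderOutputsDepth;
    bool depthOrStencilWrites;
    bool occlusionQueryActive;
    bool alphaTestEnabled;
    bool shaderKillsPixels;   // clip(), texkill, discard
    bool alphaToCoverage;
    std::uint32_t sampleMask;
};

// True when rule 1 or rule 3 of the quoted list disables EarlyZ.
bool earlyZDisabled(const PixelPassState& s) {
    if (s.shaderOutputsDepth) return true;  // rule 1
    const bool coverageCanChange = s.alphaTestEnabled ||
                                   s.shaderKillsPixels ||
                                   s.alphaToCoverage ||
                                   s.sampleMask != 0xFFFFFFFFu;
    // Rule 3: only a problem while depth/stencil writes or queries are active.
    return (s.depthOrStencilWrites || s.occlusionQueryActive) && coverageCanChange;
}
```

Read this way, alpha-tested geometry only loses EarlyZ while depth or stencil writes (or occlusion queries) are active. After a depth pre-pass, the shading pass could turn depth writes off and keep fine-grained culling, under the assumptions of this simplified model.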

My blegh: http://dwarvesh.blogspot.com

Medo Mex

891

Author

September 12, 2013 11:38 PM

I think I could use the approach Tispe suggested for now: make the backbuffer larger than the scene and then resize it using device->StretchRect() to get an anti-aliasing effect.

Now how do I use device->StretchRect() correctly to do that?

Tispe

1,468

September 13, 2013 10:47 AM


device->StretchRect(pSourceTextureSurface, NULL, pBackBufferSurface, NULL, D3DTEXF_LINEAR);

You can render to a texture with double the resolution, then get the texture's surface and pass it as pSourceTextureSurface. Remember to Release() the surfaces to decrease the reference count.

Edit: Perhaps just use a plain surface created with device->CreateOffscreenPlainSurface(). D3DPOOL_DEFAULT is the appropriate pool for use with IDirect3DDevice9::StretchRect. This surface must then be released when resetting the device.
