Cypher19

Deferred shading and hardware antialiasing, living happily together *Updated*

I'm sure most of you reading this know that deferred shading and antialiasing basically don't go together. Well, I've managed to crack this nut. Previously, the only kind of "antialiasing" people could do with DS was a simple blur along the edges. In my opinion, that method sucks, because it doesn't use any knowledge of the sub-pixel data, like hardware AA does. I've devised a method that allows for a blur that uses hardware AA data, demonstrated below:

No AA: [image hosted by Photobucket.com]
Regular hardware AA (6 sample stochastic on an ATi card): [image hosted by Photobucket.com]
Deferred AA: [image hosted by Photobucket.com]
(click the images for uncropped versions)

Anyways, as you can see, the two are virtually indistinguishable from each other (I almost cannot tell the difference in motion), and performance is nearly the same too. The algorithm is basically a weighted 2x1 blur, but because of extra calculations and an extra (cheap) pass it's a tad more costly. Two images are needed before the process takes place: an AA'd image, and a non-AA'd image using the same data; the latter you'd already have when doing deferred shading, of course. On my Radeon X800Pro, performance dropped from ~198fps to ~183fps [1], without any lighting calculations. I was only testing with diffuse texture data, as the above images show, but because it's a post-process blur it does work with lighting.

Now for the meat of it: The algorithm works by first taking an AA'd image, and then a non-AA'd image of the same data, preferably something that is likely to have aliasing in it, such as normal data. The AA'd and non-AA'd images CAN be different formats (e.g. my demo uses X8R8G8B8 for the AA'd normals, and A16B16G16R16F for the non-AA'd normals which I use for lighting) so long as the data can be compared. Other effective edge-finding data can be used as well, such as position. I did not experiment with that because edges like concave walls won't get AA'd, and because position needs the high-precision data that FP16 buffers provide, so it can't be AA'd.

After getting the data, the antialiasing postprocess takes place. First, two samples are taken: one from the AA'd image, and one from the non-AA'd image with a ~1 texel offset. The difference between them is found, its length is taken, and, just for safety's sake, the final value is saturated (I found that the values could go outside the 0..1 boundary and cause some aliasing). This gives us the weight, a, to use so that we can interpolate between two values from the final scene (each offset from the pixel in question):
Col = a*FinalNonAA[0] + (1-a)*FinalNonAA[1]
And that's it! See [3] for the full HLSL pixel shader code. One *slight* disadvantage of this, though, is that polygons smaller than a pixel don't get AA'd properly, and they just end up looking as if they weren't AA'd at all.

Anyways, here are some results from the demo, with lighting: With AA 1, Without AA 1, With AA 2, Without AA 2, With AA 3, Without AA 3.

And, lastly, if you have a card that supports PS2.0 (and 3 simultaneous render targets) you can check out the demo here: http://www.eng.uwaterloo.ca/~dcrooks/DeferredAA.rar

Controls are E/S/D/F to move around (FPS-style with strafing), B to turn off AA (this only affects image quality; the AA blur isn't actually turned off, the weighting is just set to some constant value), N to stop the lights, Up/Down to brighten or darken the lights, and Left/Right to increase or decrease the number of lights.

*NOTE* Make sure that your driver settings (e.g. ATi's Catalyst Control Center) are set to either No AA or application managed. The AA seems to turn right off if you try to force it. By default, the app will try to get your max AA level (either 8x, 6x, or 4x).

1: After writing this out, I just remembered that the implementation that gives those framerates doesn't do the 2 final scene texture samples, since in this case the final scene was just the diffuse data sampled earlier. Perf shouldn't drop too much further though.
2: I'll edit this topic later to include the logic behind this idea.
3:
// NonAASampler, AASampler, ScreenSampler and WeightFactor are declared outside this function.
float4 DeferredAAPS(float2 Tex:TEXCOORD0, uniform float Width, uniform float Height):COLOR0
{
	// Three quarters of a texel in each direction.
	float2 Offset = float2(1.0f/Width, 1.0f/Height)*0.75f;

	// Non-AA'd edge data is sampled with the offset, AA'd edge data at the pixel itself.
	float3 NonAA = abs(normalize(tex2D(NonAASampler, Tex-Offset).rgb));
	float3 AAResult = tex2D(AASampler, Tex).rgb;

	// The weightfactor is set in the main program so I can turn it on and off.
	float Weight = saturate(length(AAResult-NonAA))*WeightFactor;

	// Blend the two offset samples of the final scene.
	float4 Col;
	Col.rgb =	tex2D(ScreenSampler, Tex+Offset).rgb * Weight+
				tex2D(ScreenSampler, Tex-Offset).rgb * (1-Weight);
	Col.a = 1;

	return Col;
}
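
For anyone who wants to poke at the weighting outside of a shader, here is a minimal CPU-side sketch of the per-pixel maths in Python. It is not the demo's code: the function name `deferred_aa_pixel` and its parameters are made up for illustration, the texture fetches are replaced by plain RGB tuples, and the `abs(normalize(...))` step the shader applies to the non-AA'd sample is left out to keep it short.

```python
import math

def saturate(x):
    """Clamp to [0, 1], like the HLSL saturate() intrinsic."""
    return max(0.0, min(1.0, x))

def deferred_aa_pixel(aa_edge, non_aa_edge, scene_a, scene_b, weight_factor=1.0):
    """Blend two final-scene samples using the AA'd vs non-AA'd difference.

    aa_edge     -- RGB from the AA'd edge-finding image at this pixel
    non_aa_edge -- RGB from the non-AA'd image, sampled with the small offset
    scene_a     -- final-scene RGB sampled at Tex + Offset
    scene_b     -- final-scene RGB sampled at Tex - Offset
    (All four names are hypothetical stand-ins for the shader's tex2D fetches.)
    """
    diff = [a - n for a, n in zip(aa_edge, non_aa_edge)]
    weight = saturate(math.sqrt(sum(d * d for d in diff))) * weight_factor
    # Same interpolation as the shader: a * Weight + b * (1 - Weight).
    return tuple(a * weight + b * (1.0 - weight) for a, b in zip(scene_a, scene_b))
```

When the AA'd and non-AA'd data agree, the weight is zero and the pixel simply takes the `Tex - Offset` scene sample; the further apart they are, the more of the `Tex + Offset` sample is mixed in.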
[Edited by - Cypher19 on October 6, 2005 8:50:58 AM]

Hi,

Interesting (and a bit weird;)) idea for solving AA in deferred shading, but it's difficult to tell much from the screenies you provided. You could also try doing different non-AA texture look-ups - taking just top-left and bottom-right obviously ignores a lot of cases.

Quote:
Original post by MickeyMouse
Hi,

Interesting (and a bit weird;)) idea for solving AA in deferred shading, but it's difficult to tell much from the screenies you provided. You could also try doing different non-AA texture look-ups - taking just top-left and bottom-right obviously ignores a lot of cases.


Well, I was concerned about that as well, but once I saw the results with just two samples, I felt that the quality was more than good enough, so I didn't bother with four (which is what I was initially going to do, but I was scared as hell that I wouldn't be able to find the four weights from each sample that I needed).

One thing that I think you're missing is the fact that you need HDR buffers to store the per-pixel data in (at least the per-pixel normals).

You can't render AA'd to any HDR format at the moment, which makes it impossible to perform the first step in your solution:
Quote:
The algorithm works by first taking an AA'd image


Maybe you could use LDR buffers for the normal, if you don't need specular reflections, but IMO specular is crucial to get nice lighting.
Specular doesn't work with quantized normals: when you take your calculated specular intensity and raise it to the power of your glossiness, the quantization error gives you severe banding artifacts.
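
To put a rough number on the banding, here is a small Python sketch. It rests on assumptions that are not from this thread: the normal is stored in 8-bit channels with the usual `n * 0.5 + 0.5` packing and is not renormalized after unpacking, the half vector points straight along +Z, and the glossiness is 64. The helper names are hypothetical.

```python
import math

def quantize_unorm8(c):
    """Round-trip a [-1, 1] component through an 8-bit channel."""
    return round((c * 0.5 + 0.5) * 255.0) / 255.0 * 2.0 - 1.0

def specular(normal, half_vec, glossiness):
    """(N.H)^glossiness with N.H clamped at zero."""
    n_dot_h = max(0.0, sum(n * h for n, h in zip(normal, half_vec)))
    return n_dot_h ** glossiness

def count_bands(steps=1000, max_angle=0.2, glossiness=64.0):
    """Sweep a normal over a small arc and count distinct specular values."""
    half_vec = (0.0, 0.0, 1.0)  # assumed half vector, for illustration only
    exact, quantized = set(), set()
    for i in range(steps):
        theta = max_angle * i / (steps - 1)
        n = (math.sin(theta), 0.0, math.cos(theta))
        q = tuple(quantize_unorm8(c) for c in n)
        exact.add(specular(n, half_vec, glossiness))
        quantized.add(specular(q, half_vec, glossiness))
    return len(exact), len(quantized)
```

Calling `count_bands()` returns the number of distinct highlight intensities with full-precision normals and with 8-bit normals over the same arc. Under these assumptions the 8-bit count collapses to a handful of steps, which is what shows up on screen as bands.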

I've just changed from a deferred renderer to a "create-zillions-of-pixel-shaders" renderer, to allow multisampling when not using HDR (I've got some projects that need HDR, others that don't).

If you find a good (effective) way to get multi sampling when writing to HDR targets I'd be interested.

I've always been a fan of deferred rendering since it's so simple to change lighting model and add new object shaders, but my clients keep asking me why my stuff looks so jagged compared to other renderers.

Just my 2c.

Quote:
Original post by eq
One thing that I think you're missing is the fact that you need HDR buffers to store the per-pixel data in (at least the per-pixel normals).

You can't render AA'd to any HDR format at the moment, which makes it impossible to perform the first step in your solution:
Quote:
The algorithm works by first taking an AA'd image


Maybe you could use LDR buffers for the normal, if you don't need specular reflections, but IMO specular is crucial to get nice lighting.
Specular doesn't work with quantized normals: when you take your calculated specular intensity and raise it to the power of your glossiness, the quantization error gives you severe banding artifacts.

I've just changed from a deferred renderer to a "create-zillions-of-pixel-shaders" renderer, to allow multisampling when not using HDR (I've got some projects that need HDR, others that don't).

If you find a good (effective) way to get multi sampling when writing to HDR targets I'd be interested.

I've always been a fan of deferred rendering since it's so simple to change lighting model and add new object shaders, but my clients keep asking me why my stuff looks so jagged compared to other renderers.

Just my 2c.


*scratches head* Um, yeah, but I'm using diffuse texture data right now, not normals. I was merely suggesting using other things like normal and position for better edge finding. And the AA'd normal data could be A8R8G8B8, with the non-AA'd data as A16B16G16R16F. The data should be similar, just more precise in the latter format.

Also, one thing to point out is that since it's just the weightings that are calculated, you could easily adapt this to an HDR solution.

Hmm I don't get it.
In my renderer, the geometry writes the following:
Buffer0 = Albedo.rgb, SelfIllumination
Buffer1 = Normal.xyz, SpecularIntensity
Buffer2 = World.xyz, Glossiness

Then I do a full screen pass, that converts Normal + World to:
Buffer3 = Reflection.xyz, ZBufferDepth

I then use these buffers to accumulate lights into another buffer.
Buffer4 = Light.rgb

And finally combine them into one HDR buffer using:
Buffer5 = Albedo * Light * (1 - SelfIllumination) + Albedo * SelfIllumination

Then I have a HDR to LDR step, using a tone-mapper, adaptive luminosity and bloom filtering.

Assume that I store the result of Buffer5 in an LDR format instead, skipping the whole HDR to LDR step.
I fail to see how your method would help me get the same results in Buffer5 as if I'd done it all with pixel shaders and multisampling enabled!?

I'd really like to know if this is possible with only one extra AA'd geometry pass!?

Edit: I can see that this could work somewhat OK with only one buffer, but if you need to re-render the geometry into several buffers again, it seems like it's going to get too slow.

Quote:
Original post by Code-R
forgive my ignorance, but even in LDR, how would you get the AA'd image in the first place?


IDirect3DDevice9::StretchRect lets people copy multisampled data to a non-MS'd RT.

Quote:
Hmm I don't get it.
In my renderer, the geometry writes the following:
Buffer0 = Albedo.rgb, SelfIllumination
Buffer1 = Normal.xyz, SpecularIntensity
Buffer2 = World.xyz, Glossiness

Then I do a full screen pass, that converts Normal + World to:
Buffer3 = Reflection.xyz, ZBufferDepth

I then use these buffers to accumulate lights into another buffer.
Buffer4 = Light.rgb

And finally combine them into one HDR buffer using:
Buffer5 = Albedo * Light * (1 - SelfIllumination) + Albedo * SelfIllumination

Then I have a HDR to LDR step, using a tone-mapper, adaptive luminosity and bloom filtering.

Assume that I store the result of Buffer5 in an LDR format instead, skipping the whole HDR to LDR step.
I fail to see how your method would help me get the same results in Buffer5 as if I'd done it all with pixel shaders and multisampling enabled!?

I'd really like to know if this is possible with only one extra AA'd geometry pass!?

Edit: I can see that this could work somewhat OK with only one buffer, but if you need to re-render the geometry into several buffers again, it seems like it's going to get too slow.


Before you create Buffer0-2, render the albedo (and, if you want, self-illumination, though that's probably unnecessary) to the backbuffer, where it'll be antialiased, and StretchRect it to Buffer(-1). Then, as a part of (or after) the HDR to LDR step, apply Deferred AA like a blur filter (because that's what it is), using Buffer(-1) as the AA'd data, Buffer0 as the non-AA'd data, and your near-final result as the final scene.

So you say that you could:

// Calc linear blend
LinearBlendFactor = (AlbedoAA - Albedo[1]) / (Albedo[0] - Albedo[1])

// Calc final albedo (why not use AlbedoAA directly?)
FinalAlbedo = lerp(Albedo[1], Albedo[0], LinearBlendFactor)

// Calc final normal
FinalNormal = lerp(Normal[1], Normal[0], LinearBlendFactor)

I don't think that you can use the same linear blend factor for the different buffers?!
The difference in color doesn't say anything about the difference in normal or any other component, or?

Ex.

Multi-sampled pixels (3 fragments each):
Channel 0: (9 0 9) (3 7 2) => 6 4 as averaged single samples
Channel 1: (0 9 0) (2 8 2) => 3 4 as averaged single samples

Single-sampled buffer (using the centre fragment):
Channel 0: 0 7
Channel 1: 9 8

Blend factor using buffer 0 = (6 - 0) / (7 - 0) = 6/7
Blend factor using buffer 1 = (3 - 9) / (8 - 9) = 6
The blend factors aren't even close, since there's no correlation between the buffers.

After blend using factor 0:
Channel 0: 6 (always the same as the averaged multi-sampled value)
Channel 1: ~8.1 (should have been 3)

After blend using factor 1:
Channel 0: 42 (should have been 6)
Channel 1: 3 (always the same as the averaged multi-sampled value)

It's quite obvious that this works "perfectly" for one buffer (but then again this is quite useless, why not use the AA'd buffer directly).
Since there's NO correlation between the data in the different buffers, one AA'd buffer isn't enough, you'd need an AA'd sample for every buffer/channel (and then again this is useless since you could use the AA'd buffer directly).
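
The arithmetic in the example above is easy to re-run. The following Python sketch only re-computes the numbers from this post; `blend_factor` and `lerp` are hypothetical helper names, and the indexing follows the worked example (factor measured from the first single sample towards the second).

```python
def blend_factor(aa, s0, s1):
    """Return t such that s0 + t * (s1 - s0) == aa (assumes s0 != s1)."""
    return (aa - s0) / (s1 - s0)

def lerp(s0, s1, t):
    """Linear interpolation from s0 towards s1."""
    return s0 + t * (s1 - s0)

# Values from the worked example: averaged multi-sampled value of the
# first pixel, and the pair of single (centre-fragment) samples, per channel.
aa = {0: 6.0, 1: 3.0}
single = {0: (0.0, 7.0), 1: (9.0, 8.0)}

# One blend factor per channel.
factors = {ch: blend_factor(aa[ch], *single[ch]) for ch in aa}

# Apply each factor to each channel; key is (factor source, target channel).
cross = {(f, ch): lerp(*single[ch], factors[f]) for f in factors for ch in single}

for (f, ch), value in sorted(cross.items()):
    print(f"factor from channel {f} applied to channel {ch}: {value:.3f} (wanted {aa[ch]})")
```

Each factor reproduces the averaged value only on the channel it was derived from; on the other channel it lands far from the target, which is the point being made about uncorrelated buffers.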

Quote:
It's quite obvious that this works "perfectly" for one buffer (but then again this is quite useless, why not use the AA'd buffer directly).
Since there's NO correlation between the data in the different buffers, one AA'd buffer isn't enough, you'd need an AA'd sample for every buffer/channel (and then again this is useless since you could use the AA'd buffer directly).


The core idea was to, at best, estimate what the total AA contribution is to a pixel. Once I got lighting back in (which was an hour or so ago) I noticed that the current method was insufficient. I'm currently retooling it so that edge detection and weighting work properly.

Let me know if you find something that does work; no one would be happier than I.
