The best SSAO I've seen

Quote:Original post by B_old
Are you saying that offsetting the dot product by a fixed amount should have the same effect as offsetting the occluded point depending on its distance from the eye? Or should the offset also be relative? Currently I have no access to my development stuff during weekends, but I will give this a try tomorrow.



If you subtract an amount from the dot result, it has the same effect as what you're doing, yes. Take a look at the new version I posted; the bias is implemented that way. Tinker around with it and you'll see how it works.
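In code, that amounts to a one-line change inside the occlusion function (a sketch; doAmbientOcclusion is named later in this thread, but the body and parameter names here are assumed, not verbatim):

    // v: normalized vector from the shaded pixel to the occluder, d: distance
    // between them. Subtracting g_bias from the dot plays the role of the
    // fixed offset discussed above.
    float occ = max(0.0, dot(norm, v) - g_bias) * (1.0 / (1.0 + d));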

Quote:Original post by B_old
So 360 / samples is correct?


No. For it to be correct, you would need to calculate the angle subtended by a circular arc of fixed length and change the rotation angle based on that. That would make the samples evenly spaced.
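A sketch of what that could look like when generating a spiral of sample offsets (illustrative only, not the project's actual pattern; numSamples and maxRadius are assumed constants). For a circle of radius r, an arc of length s subtends an angle of s / r radians, so the rotation step has to shrink as the radius grows:

    float angle  = 0.0;
    float arcLen = 0.35; // desired spacing between consecutive samples (made up)
    for (int i = 0; i < numSamples; i++)
    {
        float  r      = maxRadius * (i + 1.0) / numSamples;  // radius grows outward
        float2 offset = r * float2(cos(angle), sin(angle));  // current sample
        // ... sample the buffers at uv + offset ...
        angle += arcLen / r;  // constant arc length => evenly spaced samples
    }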


Quote:Original post by B_old
Actually I have another question about this, because I don't really understand what's going on. Most SSAO shaders I've seen shoot rays in a sphere around the occluded point, reversing the sign for rays that point away from the occluded point's normal, so in the end you have a hemisphere of rays. But you seem to be shooting rays all around. Why does it work?


All the info you need to calculate occlusion is in the buffers. Sampling around a sphere works when you think more in 3D, but SSAO is really just an advanced edge-enhancing effect, a 2D filter. You need to know the relative positions and normals of pixels: if a pixel is near another one and they're facing each other, they are occluding each other; otherwise they're not. My method works by just sampling around the current pixel and deciding how much each sample occludes it. Since I'm using positions and normals instead of only depth, I have more information available, so I don't need to sample in 3D and project, or anything like that.
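A minimal sketch of that idea, assuming view-space position and normal buffers are available (getPosition, getNormal and sampleOffsets are illustrative names, not the project's actual code):

    // Sample neighbouring pixels in 2D; use their reconstructed positions and
    // the current pixel's normal to decide how much each one occludes it.
    float computeOcclusion(float2 uv, float2 sampleOffsets[8],
                           float sampleRadius, float bias)
    {
        float3 p = getPosition(uv); // view-space position of current pixel
        float3 n = getNormal(uv);   // view-space normal of current pixel

        float occlusion = 0.0;
        for (int i = 0; i < 8; i++)
        {
            float3 q    = getPosition(uv + sampleOffsets[i] * sampleRadius);
            float3 diff = q - p;
            float  d    = length(diff);
            float3 v    = diff / d;
            // Near and facing this pixel => strong occlusion; far or behind => none
            occlusion += max(0.0, dot(n, v) - bias) * (1.0 / (1.0 + d));
        }
        return occlusion / 8.0;
    }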

When implementing the blur in the new version, I realized it's not that different from a bilateral filter, in the sense that you are taking samples around a pixel and using their normal/position differences to calculate something: in the case of the bilateral blur it is a blur weight; in my case it is occlusion.

Because of this, and the existence of pseudo-separable bilateral filters, I still hope that my SSAO can be made separable as well. No luck yet :D.
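The parallel is easy to see with the two per-sample terms side by side (a sketch with illustrative names; sigma is the bilateral range parameter, v and d are as in the occlusion term above):

    // Both filters map a per-sample difference to a scalar: a weight for the
    // bilateral blur, an occlusion amount for the SSAO.
    float dz     = centerDepth - sampleDepth;
    float weight = exp(-(dz * dz) / (2.0 * sigma * sigma));        // bilateral blur
    float occ    = max(0.0, dot(n, v) - bias) * (1.0 / (1.0 + d)); // this SSAO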

EDIT: I've made a pair of diagrams to try to explain it better. I'm not very good at drawing, but maybe they will clarify things a bit:

Traditional occlusion traces rays in a hemisphere, projects to screen and compares depths:


My SSAO (point-disk now... I don't know what to call it anymore) uses the distance between the occluder and the current pixel (d), and the angle between the normal (N) of the current pixel and the vector from the pixel to the occluder (V), to calculate an occlusion value:

In this image you can see that, if you move the occludee in the direction of the normal, the angle between V and N increases. That is the same as subtracting the "bias" value from the dot product, which is what I do. In the old version I used the occluder's normal too, but it is not really necessary.
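To make that equivalence concrete, a small derivation (p is the occludee, q the occluder, $d = \lVert \mathbf{q}-\mathbf{p} \rVert$): moving the occludee by a small step $\epsilon$ along N gives

    $$\mathbf{V}' = \frac{\mathbf{q}-(\mathbf{p}+\epsilon\mathbf{N})}{\lVert \mathbf{q}-(\mathbf{p}+\epsilon\mathbf{N}) \rVert}, \qquad \mathbf{N}\cdot\mathbf{V}' \approx \mathbf{N}\cdot\mathbf{V} - \frac{\epsilon}{d} \quad (\epsilon \ll d)$$

so a fixed bias b subtracted from dot(N, V) behaves like a normal offset $\epsilon = b\,d$ that grows with the occluder distance, which also answers the earlier question about whether the offset should be relative.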

[Edited by - ArKano22 on March 14, 2010 9:36:29 AM]
After having so many issues, I went over everything again and reworked the shader into its simplest form, which doesn't use normals from the geometry; it just samples from a unit sphere and reflects the samples using randomised normals from a texture. This is the result:

new result

A couple of questions, though:

What are your thoughts on how it looks? Correct/Wrong?

What's the best way to blend it with the scene? I tried multiplying it with the material diffuse and lighting, but the effect is very, very subtle. Do I need to somehow darken the SSAO result?

Cheers!

Quote:Original post by Shael
After having so many issues, I went over everything again and reworked the shader into its simplest form, which doesn't use normals from the geometry; it just samples from a unit sphere and reflects the samples using randomised normals from a texture. This is the result:

A couple of questions, though:

What are your thoughts on how it looks? Correct/Wrong?

What's the best way to blend it with the scene? I tried multiplying it with the material diffuse and lighting, but the effect is very, very subtle. Do I need to somehow darken the SSAO result?

Cheers!


Hi!
I think it looks OK, but there's too much haloing. Your occlusion function needs to be scaled down, so that small depth differences have more effect and big differences have none at all. This will increase the contrast of the image and possibly create some self-occlusion artifacts, but the overall quality will be better.

Try subtracting the AO from the ambient light. If you can't (you have ambient and diffuse together in a buffer, or lighting is calculated elsewhere), try multiplying the resulting occlusion by some factor.
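A sketch of both options, with ao meaning an accessibility value in [0, 1] (1 = unoccluded); all variable names here are illustrative, not from the posted code:

    // (a) ambient stored separately: subtract the occlusion from it
    float3 ambientTerm = max(ambientColor - (1.0 - ao), 0.0);
    float3 colorA      = albedo * (ambientTerm + diffuseLight);

    // (b) lighting already combined in one buffer: strengthen the AO first,
    //     otherwise multiplying it in looks as subtle as described above
    float  aoStrong = pow(ao, g_aoPower); // g_aoPower > 1 darkens the result
    float3 colorB   = litColor * aoStrong;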
Thanks for the reply. I tried scaling down the distance and sample radius, but then I barely get any shading; the room turns white. For that screenshot above, I had the distance scale set to 200 and the sample radius to 0.9. If I lower the distance scale and raise the radius, it goes almost white.

Much appreciated if you could take a quick look at my code and see if you notice anything I've overlooked.

sampler2D DepthSampler = sampler_state
{
    Texture   = <DepthBuffer>;
    MinFilter = LINEAR;
    MipFilter = NONE;
    MagFilter = LINEAR;
    AddressU  = CLAMP;
    AddressV  = CLAMP;
};

sampler2D NormalSampler = sampler_state
{
    Texture   = <NormalBuffer>;
    MinFilter = LINEAR;
    MipFilter = NONE;
    MagFilter = LINEAR;
    AddressU  = CLAMP;
    AddressV  = CLAMP;
};

sampler2D RandomSampler = sampler_state
{
    Texture   = <RandomTexture>;
    MinFilter = LINEAR;
    MipFilter = NONE;
    MagFilter = LINEAR;
    AddressU  = WRAP;
    AddressV  = WRAP;
};

//-----------------------------------------------------------------------------
// Name: SSAO_VS
// Type: Vertex shader
// Desc: Adjust half pixel and pass texcoord as well as frustum corner to PS
//       in order to re-construct view-space position
//-----------------------------------------------------------------------------
struct OutputVS
{
    float4 Position   : POSITION0;
    float2 UV         : TEXCOORD0;
    float3 FrustumRay : TEXCOORD1;
};

OutputVS SSAO_VS(float4 pos : POSITION0, float4 tex : TEXCOORD0)
{
    OutputVS Out = (OutputVS)0;

    Out.Position   = pos;
    Out.UV         = tex;
    Out.FrustumRay = FSQ_GetFrustumRay(tex);

    return Out;
}

// Sample kernel: 16 directions inside the unit sphere
float4 samples[16] =
{
    float4( 0.355512,  -0.709318,  -0.102371,  0.0),
    float4( 0.534186,   0.71511,   -0.115167,  0.0),
    float4(-0.87866,    0.157139,  -0.115167,  0.0),
    float4( 0.140679,  -0.475516,  -0.0639818, 0.0),
    float4(-0.0796121,  0.158842,  -0.677075,  0.0),
    float4(-0.0759516, -0.101676,  -0.483625,  0.0),
    float4( 0.12493,   -0.0223423, -0.483625,  0.0),
    float4(-0.0720074,  0.243395,  -0.967251,  0.0),
    float4(-0.207641,   0.414286,   0.187755,  0.0),
    float4(-0.277332,  -0.371262,   0.187755,  0.0),
    float4( 0.63864,   -0.114214,   0.262857,  0.0),
    float4(-0.184051,   0.622119,   0.262857,  0.0),
    float4( 0.110007,  -0.219486,   0.435574,  0.0),
    float4( 0.235085,   0.314707,   0.696918,  0.0),
    float4(-0.290012,   0.0518654,  0.522688,  0.0),
    float4( 0.0975089, -0.329594,   0.609803,  0.0)
};

float g_fSampleRad = 0.9;
float g_fScale = 200.0;
//float g_OccludeDist = 7.5;

//-----------------------------------------------------------------------------
// Name: SSAO_PS
// Type: Pixel shader
// Desc: Calculate screen space ambient occlusion of the scene
//-----------------------------------------------------------------------------
float4 SSAO_PS(OutputVS IN) : COLOR0
{
    float2 nUV   = IN.UV;
    float  Depth = tex2D(DepthSampler, nUV).r;

    // Random per-pixel vector used to reflect the sample kernel
    // (the geometry normal buffer is not used in this version)
    float3 Norm = 2.0f * tex2D(RandomSampler, nUV * 100).rgb - 1.0f;

    // Reconstruct the view-space position from the frustum ray and linear depth
    float3 pixelPosEyeSpace = IN.FrustumRay * Depth;

//  return float4(Depth, 1.0f, 1.0f, 1.0f);

    float result = 0.0f;
    for (int i = 0; i < 16; i++)
    {
        // Determine the eye-space and clip-space locations of our current sample point
        float4 samplePointEyeSpace  = float4(pixelPosEyeSpace + reflect(samples[i].rgb, Norm) * g_fSampleRad, 1.0f);
        float4 samplePointClipSpace = mul(samplePointEyeSpace, mProj);

        // Determine the texture coordinate of our current sample point
        float2 sampleTexCoord = 0.5f * (samplePointClipSpace.xy / samplePointClipSpace.w) + float2(0.5f, 0.5f);

        // Flip around the y-coordinate and offset by half a pixel
        sampleTexCoord.y = 1.0f - sampleTexCoord.y;
        float2 offset = 0.5f / float2(fViewportWidth, fViewportHeight);
        sampleTexCoord -= offset;

        // Read the depth of our sample point from the depth buffer
        float sampleDepth = tex2D(DepthSampler, sampleTexCoord).r;

        // Compute our occlusion factor
        float occlusionFactor = g_fScale * max(Depth - sampleDepth, 0.0f);
        result += 1.0f / (1.0f + occlusionFactor * occlusionFactor);
    }

    result = saturate(result / 16);

    return float4(result, result, result, result);
}
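Applying the earlier advice about scaling the occlusion function to the loop above, one option is a crude range check (a sketch; g_maxDiff is an illustrative new parameter, not part of the posted project):

    // Reject huge depth gaps so distant geometry behind a silhouette doesn't
    // darken the background (the haloing mentioned above), and raise g_fScale
    // so that small gaps count for more.
    float diff = max(Depth - sampleDepth, 0.0f);
    if (diff > g_maxDiff)
        result += 1.0f; // occluder is too far away: contributes no occlusion
    else
    {
        float occlusionFactor = g_fScale * diff;
        result += 1.0f / (1.0f + occlusionFactor * occlusionFactor);
    }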


Thanks!
Hi all,
I didn't want to post before tomorrow, because I only had a little time to test the new shader yesterday, but here are a few additions to Shael's post.

When trying the new shader with similar (the same) constants as the "old" one, there was virtually no visible occlusion for me either. I haven't had time to tinker with the values much yet.
I noticed that the new shader runs slower?! Without the backface addition and with only 16 samples, it seems to run slower than the "old" shader with 20 samples. I can't explain it; I'll have to look into it more.
Something that seems to work really well for me is using the new doAmbientOcclusion() in the "old" shader. It still looks great and runs faster. I just had to lower g_scale considerably, but apart from that it seems fine.

[Edited by - B_old on March 16, 2010 5:20:59 AM]
Quote:Original post by B_old
Hi all,
I didn't want to post before tomorrow, because I only had a little time to test the new shader yesterday, but here are a few additions to Shael's post.

When trying the new shader with similar (the same) constants as the "old" one, there was virtually no visible occlusion for me either. I haven't had time to tinker with the values much yet.
I noticed that the new shader runs slower?! Without the backface addition and with only 16 samples, it seems to run slower than the "old" shader with 20 samples. I can't explain it; I'll have to look into it more.
Something that seems to work really well for me is using the new doAmbientOcclusion() in the "old" shader. It still looks great and runs faster. I just had to lower g_scale considerably, but apart from that it seems fine.



Hi :)
That's weird. For me, the 16-sample version of the new shader runs much faster than the "old" version O_o. It is true that the scale of the new parameters is different from the old version, so the same values will not give the same result. However, I don't know why it would run slower when it has fewer operations to do... If you happen to find something, please tell!

EDIT: Maybe it has something to do with the sampling pattern; the old one used a much more regular one, and that might benefit from the cache...
EDIT2: I accidentally changed an 8.0 to an 80.0 in the version I uploaded. In both "for" loops, change the 80 to an 8 and you should gain at least 20 fps. I will re-upload a corrected version.

[Edited by - ArKano22 on March 16, 2010 9:42:57 AM]
Quote:Original post by ArKano22
Hi :)
That's weird. For me, the 16-sample version of the new shader runs much faster than the "old" version O_o. It is true that the scale of the new parameters is different from the old version, so the same values will not give the same result. However, I don't know why it would run slower when it has fewer operations to do... If you happen to find something, please tell!

EDIT: Maybe it has something to do with the sampling pattern; the old one used a much more regular one, and that might benefit from the cache...
EDIT2: I accidentally changed an 8.0 to an 80.0 in the version I uploaded. In both "for" loops, change the 80 to an 8 and you should gain at least 20 fps. I will re-upload a corrected version.


Changing the 80.0 to 8.0 did speed things up by about 20 fps :D. Also, for me, it seems dependent on the sampling radius: 0.01 is faster than 0.1, which is faster than 1, but it goes back to roughly the same speed as 0.01 at 19, though it looks bad. (Good thing I like the results of 0.01 the best.)

However, I still have a problem when I am perpendicular to a triangle (it's easiest for me to reproduce by looking directly down at one): the occlusion seems to mess up. I know it's partly depth related, because when you change your depth relative to the triangle, it flickers. Any thoughts?

[Edited by - AgentSnoop on March 16, 2010 1:33:24 PM]
Sorry to be a pain in the ass. I've finally got the first RM project's code working outside of RM, but the only way to do it was to use exactly the same render-target setup. I cannot for the life of me get it to work using R32F for the linear depth and A8R8G8B8 for the normals, and then reconstructing the position using a frustum ray.

I've checked my code over and over and I can't see what is wrong. The positions and normals look the same when output to the screen as when I output them in RM. It's driving me insane.
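For reference, the usual frustum-ray reconstruction with an R32F linear depth buffer looks roughly like this (a sketch assuming depth was written as view-space z divided by the far-clip distance, which matches the FrustumRay * Depth line in the shader above; g_FarPlaneCorners and cornerIndex are illustrative names):

    // Vertex shader: pass the view-space ray through this vertex's far-plane
    // corner; the rasterizer interpolates it across the fullscreen quad.
    Out.FrustumRay = g_FarPlaneCorners[cornerIndex].xyz; // z equals farClip

    // Pixel shader: scaling the interpolated ray by the stored linear depth
    // (viewZ / farClip) recovers the view-space position.
    float  linearDepth = tex2D(DepthSampler, uv).r;
    float3 viewPos     = IN.FrustumRay * linearDepth;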

Could you set up either a small DX application or another RM project that uses this kind of render-target setup with position reconstruction, to see if you can get it working?

Thanks a lot!
I've changed the sampling method again; this time I think it is as fast as it gets:
RM Project
(the last link I posted is now the same as this one, so no one downloads the bugged version)

I found that the coherency between samples has a huge impact on performance, probably due to the GPU caching pixels near each sample. If you use too random a sampling pattern, you cause lots of cache misses and the fps drops. Now, on the Hebe scene, I get 190-200 fps with front and back faces, and over 300 fps with only front faces.
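One common way to keep the pattern coherent is to tile a small random texture across the screen so the rotation repeats over a small pixel block (an illustrative sketch assuming a 4x4 RandomTexture and reusing names from Shael's shader above; not necessarily what the project does):

    // The rotation pattern repeats every 4x4 pixels (the sampler uses WRAP),
    // so adjacent pixels reuse nearby sample positions and the cache stays warm.
    float2 rotUV   = IN.UV * float2(fViewportWidth, fViewportHeight) / 4.0;
    float3 rotNorm = 2.0 * tex2D(RandomSampler, rotUV).rgb - 1.0;
    float3 offset  = reflect(samples[i].rgb, rotNorm) * g_fSampleRad;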

It also depends on the distance to the object: objects near the camera use bigger occlusion radii, which are also slower to compute because the samples end up too far apart from each other. Someone (too lazy to search for the reply, sorry) noticed this and asked why bigger radii affected the speed; that's the answer.

Shael, I'll try to get a version with position reconstruction working. However, I can't promise anything; I don't have any DX framework around and I'm not really a DX guy, so I'd probably do it in RenderMonkey.
Quote:Original post by Shael
Sorry to be a pain in the ass. I've finally got the first RM project's code working outside of RM, but the only way to do it was to use exactly the same render-target setup. I cannot for the life of me get it to work using R32F for the linear depth and A8R8G8B8 for the normals, and then reconstructing the position using a frustum ray.

I've checked my code over and over and I can't see what is wrong. The positions and normals look the same when output to the screen as when I output them in RM. It's driving me insane.

Could you set up either a small DX application or another RM project that uses this kind of render-target setup with position reconstruction, to see if you can get it working?

Thanks a lot!


I'm using that kind of setup currently. If you post your code, I can try plugging it in to see what I get, and see if there are any major differences between the two.
