The best SSAO I've seen

Quote:Original post by Prune
Quote:Original post by ArKano22
int iterations = lerp(5.0,1.0,p.z/g_far_clip);

vec[] in your code is of size 4. Wouldn't a case such as an orthographic projection with a near plane at 0 or negative cause a shader runtime error, since vec[4] is accessed when nearby geometry crosses depth 0?


Yep, that's a bug, and a very nasty one. Change the 5.0 to 4.0 and that's it.
Quote:Original post by ArKano22
Yep, that's a bug, and a very nasty one. Change the 5.0 to 4.0 and that's it.


That isn't right though. Changing it to 4.0 means you will never reach the maximum depth because of the for-loop condition "j < iterations". The value should still be 5.0 so that j is able to reach a value of 4. That, or change the condition :)
Quote:Original post by Shael
That isn't right though. Changing it to 4.0 means you will never reach the maximum depth because of the for-loop condition "j < iterations". The value should still be 5.0 so that j is able to reach a value of 4. That, or change the condition :)


You're right :). That's why I used 5.0, but then I forgot why xDDD. Thanks for correcting me.
Quote:Original post by Shael
That isn't right though. Changing it to 4.0 means you will never reach the maximum depth because of the for-loop condition "j < iterations". The value should still be 5.0 so that j is able to reach a value of 4. That, or change the condition :)

In GLSL I get:
0(64) : error C5025: lvalue in array access too complex
If j reaches a value of 4, it accesses vec[4], which is out of bounds, since vec has only the elements vec[0], vec[1], vec[2], and vec[3].
Also note that using a float vs. an int for iterations makes a difference due to the rounding mode.

If you try
int iterations = 5.0;
you will get a black screen.
It works with
int iterations = 4.9;
since the default rounding is truncation. So something like:
int iterations = lerp(4.999, 1.0, p.z / g_far_clip);
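In GLSL the float-to-int conversion has to be written explicitly, but the truncation behaviour is the same. A tiny sketch of the values involved (the comments assume the loop reads vec[j], as discussed above):

// Conversion to int truncates toward zero:
int a = int(5.0);                   // 5 -> "j < 5" lets j reach 4, so vec[4] is read (out of bounds)
int b = int(4.9);                   // 4 -> "j < 4" keeps j at 3 or below; vec[3] is the last read
int c = int(mix(4.999, 1.0, 0.0));  // 4 even at the near end of the lerp (mix() is GLSL's lerp)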


[Edited by - Prune on March 24, 2010 6:36:46 PM]
"But who prays for Satan? Who, in eighteen centuries, has had the common humanity to pray for the one sinner that needed it most?" --Mark Twain

~~~~~~~~~~~~~~~Looking for a high-performance, easy to use, and lightweight math library? http://www.cmldev.net/ (note: I'm not associated with that project; just a user)
Hmm, maybe you're right. Though I tried using 4.0 and the LOD doesn't seem to work correctly: near and far objects have really bad quality, whereas with 5.0 it seems to work.

EDIT: I'm using HLSL and I don't get any error like you get with GLSL.
For me it works perfectly with 5.0; with 4.0, near objects have less quality than they should.
Guys, that's why I wrote 4.999 ;)

There's no quality decrease in my testing vs 5.

To see the problem with 5, do this:

int iterations = 5; (result: black screen -- and I'm using ArKano22's HLSL version, not my GLSL translation)

The problem will not be seen with a perspective projection, since there the lerp parameter is always greater than zero, but this is not true in general, such as when you have an ortho projection. 4.999 works in both cases, and there's no visible difference in the first case.
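If you want to be extra safe for the orthographic case, another option (just a sketch, using the same variable names as the original shader) is to clamp the interpolation parameter itself:

// Clamp the depth ratio to [0, 1] so depths at or behind the near plane
// (possible with an ortho projection) can't push iterations past 4:
float t = clamp(p.z / g_far_clip, 0.0, 1.0);
int iterations = int(mix(4.999, 1.0, t)); // always truncates to 1..4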
"But who prays for Satan? Who, in eighteen centuries, has had the common humanity to pray for the one sinner that needed it most?" --Mark Twain

~~~~~~~~~~~~~~~Looking for a high-performance, easy to use, and lightweight math library? http://www.cmldev.net/ (note: I'm not associated with that project; just a user)
So I was working on my own SSAO method, and I read this thread. Then I realized I could improve mine in a number of ways. In particular, I based my sampling pattern on ArKano22's and used the notion of occluding discs. However, I use occluding spheres whose radius is limited in such a way as to prevent the self-occlusion of planes. I require this because I'm doing a pool table, and I need interacting spheres and flat surfaces to look correct. Incidentally, it also diminishes the banding from under-sampling during extreme close-ups.

I also work directly with the non-linear depth buffer. I have an efficient way of reconstructing the world space position from the non-linear depth, presuming a symmetric projection frustum (a centred near plane).

I have yet to code an edge-aware blur, so please excuse the noise. The essential code follows the images.




All together now:


Reconstructing world space position:
// Darryl Barnhart
// www.dbarnhart.ca

// Ray from the eye through this pixel, at unit distance along -z.
vec3 viewDelta(vec2 screenSpaceCoords)
{
	screenSpaceCoords = 2.0 * screenSpaceCoords - vec2(1.0);
	return vec3(screenSpaceCoords / vec2(projection[0][0], projection[1][1]), -1.0);
}

// Recover the (negative) view-space z from the non-linear depth buffer value.
float worldDepth(float nonlinearDepth)
{
	return projection[3][2] / (1.0 - 2.0 * nonlinearDepth - projection[2][2]);
}

vec3 worldPosition(vec2 coords, float nonlinearDepth)
{
	return viewDelta(coords) * -worldDepth(nonlinearDepth);
}

Note that -z is the forward direction.
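For anyone wondering where the worldDepth() expression comes from, here is the derivation as I understand it, assuming the standard OpenGL perspective projection with the default depth range [0, 1] and GLSL's projection[column][row] indexing. Writing $C = \texttt{projection[2][2]}$ and $D = \texttt{projection[3][2]}$:

$$z_{clip} = C\,z_{view} + D, \qquad w_{clip} = -z_{view}, \qquad d = \tfrac{1}{2}\,\frac{z_{clip}}{w_{clip}} + \tfrac{1}{2}.$$

Solving for the view-space depth:

$$z_{view} = \frac{D}{-(2d - 1) - C} = \frac{D}{1 - 2d - C},$$

which is exactly what worldDepth() returns.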

The SSAO:
// Darryl Barnhart
// www.dbarnhart.ca

const float sampleRadius = 0.011;

float sample(vec2 coords, float offsetFactor, vec3 basePosition, vec3 baseNormal)
{
	float nonlinearDepth = texture2D(sceneDepth, coords).r;
	if (nonlinearDepth == 1.0) return 0.0;

	vec3 position = worldPosition(coords, nonlinearDepth);
	vec3 displacement = position - basePosition;
	float distanceSqr = dot(displacement, displacement);
	float distance = sqrt(distanceSqr);
	vec3 direction = displacement / distance;

	float radiusFactor = dot(direction, baseNormal);
	if (radiusFactor < 0.001) return 0.0; // Ignore samples behind the surface

	float radius = radiusFactor * sampleRadius * offsetFactor;
	float occlusion = 1.0 - distance * inversesqrt(distanceSqr + radius * radius);
	return occlusion;
}

// Take four samples, rotated by 90 degrees each.
float fourSamples(vec2 baseCoords, vec2 baseOffset, float offsetFactor, vec3 basePosition, vec3 baseNormal)
{
	baseOffset *= offsetFactor;
	float result = 0.0;
	result += sample(baseCoords + vec2( baseOffset.x,  baseOffset.y), offsetFactor, basePosition, baseNormal);
	result += sample(baseCoords + vec2( baseOffset.y, -baseOffset.x), offsetFactor, basePosition, baseNormal);
	result += sample(baseCoords + vec2(-baseOffset.x, -baseOffset.y), offsetFactor, basePosition, baseNormal);
	result += sample(baseCoords + vec2(-baseOffset.y,  baseOffset.x), offsetFactor, basePosition, baseNormal);
	return result;
}

void main()
{
	vec2 coords = gl_FragCoord.xy * screenSizeInv;
	float nonlinearDepth = texture2D(sceneDepth, coords).r;
	vec3 position = worldPosition(coords, nonlinearDepth);
	vec3 normal = normalize(texture2D(sceneNormals, coords).xyz);

	vec2 randomDirection = normalize(2.0 * texture2D(randomTexture, coords / screenSizeInv.x / 64.0).xy - vec2(1.0));
	vec2 axisOffset = randomDirection * sampleRadius / -position.z;
	vec2 angleOffset = vec2(axisOffset.x - axisOffset.y, axisOffset.x + axisOffset.y) * (sqrt(2.0) / 2.0);

	float occlusion = 0.0;
	occlusion += fourSamples(coords, angleOffset, 0.25, position, normal);
	occlusion += fourSamples(coords, axisOffset,  0.5,  position, normal);
	occlusion += fourSamples(coords, angleOffset, 0.75, position, normal);
	occlusion += fourSamples(coords, axisOffset,  1.0,  position, normal);

	gl_FragColor.r = clamp(1.0 - occlusion / (occlusion + 1.0), 0.0, 1.0);
}


I still have one issue that you can see in the screenshots. When looking directly into very narrow crevices, like those between the ball and the table, the occlusion lightens at the narrowest part. I think that's what's referred to as "halo", and as best I can tell, nothing can be done about it except more sampling.

Anyways, I have another idea for SSAO that I want to try. A lot of the slowness comes from reading far-away parts of the depth texture, which is not cache friendly. Also, proper ambient occlusion should take into account the entire scene, so an alleyway off a street becomes darker. I figured you could build a mipmap pyramid of the depth and normals, compute SSAO starting somewhere near the top, then progressively refine it by using the lower-resolution, more global AO levels to modulate the higher-resolution levels. You only need to sample the neighbouring pixels on each pass, so the cache will like it, and both global and local occlusion are captured. I doubt it would require any blur afterwards either. Still, I'm not sure how fast it would be in the end.
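To make that a bit more concrete, here's a very rough sketch of what a single refinement pass might look like. Nothing below is implemented or tested; the texture names, texelSize, and the crude localOcclusion() helper are all just placeholders for the idea:

// One pass of the hypothetical pyramid refinement: the coarser level's AO is
// upsampled and modulated by a small-radius occlusion term computed from the
// immediate neighbours at the current level.
uniform sampler2D coarseAO;    // AO from the next-lower-resolution level
uniform sampler2D levelDepth;  // linear depth at the current level
uniform vec2 texelSize;        // 1.0 / resolution of the current level

// Crude local term: a neighbour closer to the camera than the centre occludes it a bit.
float localOcclusion(vec2 coords)
{
	float centre = texture2D(levelDepth, coords).r;
	float occ = 0.0;
	occ += clamp(centre - texture2D(levelDepth, coords + vec2( texelSize.x, 0.0)).r, 0.0, 1.0);
	occ += clamp(centre - texture2D(levelDepth, coords + vec2(-texelSize.x, 0.0)).r, 0.0, 1.0);
	occ += clamp(centre - texture2D(levelDepth, coords + vec2(0.0,  texelSize.y)).r, 0.0, 1.0);
	occ += clamp(centre - texture2D(levelDepth, coords + vec2(0.0, -texelSize.y)).r, 0.0, 1.0);
	return occ * 0.25;
}

void main()
{
	vec2 coords = gl_FragCoord.xy * texelSize;
	float globalAO = texture2D(coarseAO, coords).r; // low-frequency, more global occlusion
	float localAO  = 1.0 - localOcclusion(coords);  // high-frequency detail from this level
	gl_FragColor.r = globalAO * localAO;            // each level darkens the one below it
}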

Any thoughts?

Darryl Barnhart
www.dbarnhart.ca
@DudeMiester

Wow, there's some good SSAO going on there.

To reduce haloing, you can use a technique I call depth extrapolation, which works especially well for planar surfaces, so your pool scene is the best case possible:
http://www.gamedev.net/community/forums/topic.asp?topic_id=550699&whichpage=1?
It is basically guessing a second layer of depth based on known samples, without the need to resort to depth peeling.
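Just to give a rough flavour of the "guess a hidden layer" idea (this is only an illustration grafted onto the sample() function posted above, not the implementation from the linked thread, and the rejection threshold is made up): when a fetched point turns out to be a foreground occluder far in front of the receiver, you can substitute a depth extrapolated from the receiver's own tangent plane instead of darkening against the occluder.

// Purely illustrative -- not the method from the linked thread.
const float rejectDistance = 0.05; // hypothetical threshold, scene-dependent

// Intersect the eye ray through 'coords' with the receiver's tangent plane,
// approximating what might lie behind a foreground occluder.
// (No guard against a plane seen nearly edge-on.)
vec3 guessedHiddenPosition(vec2 coords, vec3 basePosition, vec3 baseNormal)
{
	vec3 ray = viewDelta(coords); // from the reconstruction code above
	float t = dot(basePosition, baseNormal) / dot(ray, baseNormal);
	return ray * t;
}

// ...then in sample(), after computing 'position' (view space, -z forward):
// if (position.z - basePosition.z > rejectDistance) // fetched point is well in front of the receiver
//     position = guessedHiddenPosition(coords, basePosition, baseNormal);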

About using the mipmap pyramid: it could work well, but I guess it would suffer from a lot of "popping", especially for the low-frequency components of the AO. However, it is worth giving it a try. By the way, making the SSAO pseudo-separable to speed up the sampling phase (which I tried for a long time) ended up looking like s***.

I also tried to make the SSAO adaptive, using the derivative of a mipmapped normal buffer to concentrate sampling where the normals change. But it also looked like s*** :D.

[Edited by - ArKano22 on March 25, 2010 1:03:44 PM]
I just noticed there's an error in my code. Here's a new and simplified version:

float sample(vec2 coords, float offsetFactor, vec3 basePosition, vec3 baseNormal)
{
	float nonlinearDepth = texture2D(sceneDepth, coords).r;
	if (nonlinearDepth == 1.0) return 0.0; // Not necessary, but can improve performance

	vec3 position = worldPosition(coords, nonlinearDepth);
	vec3 displacement = position - basePosition;

	// Multiply the maximum by a value less than 1 to smooth the AO more.
	float radius = clamp(dot(displacement, baseNormal), 0.0, sampleRadius * offsetFactor);
	float distanceSqr = dot(displacement, displacement);
	float occlusion = 1.0 - sqrt(distanceSqr / (distanceSqr + radius * radius));
	return occlusion;
}


With this correction, the haloing underneath the spheres is much reduced. Enough for my purposes, anyways.



Still, I was thinking about your halo reduction method, but because I'm using spheres, I imagine it won't work so well. However, I'm thinking that if you store a curvature factor at each pixel, you can reasonably approximate the closest point to the occluded pixel. You could use a geometry shader with adjacency to compute the approximate curvature at each vertex, which is then interpolated. I'm thinking 1/radius of a sphere approximating the local curvature, which can be set to 0 for planes and made negative for indents.

Darryl Barnhart
www.dbarnhart.ca

[Edited by - DudeMiester on March 25, 2010 10:44:31 PM]

