GameDev.Net Discussion Forums Image of the Day  SSAO
 SSAO



Raw output of the SSAO trickery.

I coded my own interpretation of Crytek's celebrated paragraph titled "Screen-Space Ambient Occlusion" (I lost the link to the pdf).

The paragraph explains a technique to approximate ambient occlusion in real time, in a very attractive way that in principle doesn't depend on scene complexity. However, the 18-line text is quite ambiguous (intentionally?), so you have to guess a few things if you want to implement it. The image above is the result of my guesses so far:

1. Compute the z value to the near clipping plane for every fragment. This z value is in eye space, so it is linear (no 1/z "distortions") and can be computed per vertex and written to a regular float16 texture (not a depth texture). Note that I use OpenGL, so these z values are negative. I also output the eye-space normal, which is not necessary in principle.
2. In a following pass, a full-screen quad reads these z values and recovers the 3D xyz position in eye space for that point with one mul and one div: "vec3 eyepos = eyeray * fragmentZ / eyeray.z;", where fragmentZ comes from sampling the eye-linear z value. For that, an eye-space ray is created at the quad's vertices (in the vertex shader) and interpolated down to the fragments.
3. Generate a few points (6 in my case) in a sphere around that eyepos, so in eye space. In my case these points are "inside" the sphere, not just on its surface, something that is not clear in Crytek's article. I use their trick of random reflection to dither the otherwise ugly banding effects; that part is clear in the paper.
4. Project these points from eye to clip space with the standard gl_ProjectionMatrix, then do the w division and the 0.5 scale and offset to convert from NDC to texture space.
5. Sample the eye-linear z texture again at each point, and calculate the z difference between it and the reference fragmentZ value.
6. Now comes the tricky part. We have to accumulate a "blocking" factor depending on this z difference (zd). I used "bloq += step(0.0, zd) / (1.0 + zd*zd);". The step prevents pixels in the back from occluding pixels in front. Then you get some extra undesired halo effect. I remove it by attenuating the blocking factor with an approximation of the distance between the occluder and the occludee, i.e. the bigger the zd, the less it blocks. 1/zd^2 is a physically correct falloff. This attenuation is also not mentioned in the paper.
7. This still gives an ugly noisy image. It is not mentioned at all in the paper, but my big guess is that they apply a blur here. So I did. Since we have the z-buffer (and eye-space normals!) I can avoid blurring across object boundaries by doing z comparisons at each sampling point of the blur kernel, plus a dot product between the normals. I implemented a one-pass 32-point 2D blur; a two-pass Gaussian blur should work much better, I suppose.
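
Steps 2 and 4 of the list above can be checked on the CPU. The following is only a minimal Python sketch, not the shader from the post: the function names are made up for illustration, and the projection matrix is assumed to be a standard OpenGL perspective matrix of the kind gluPerspective builds, standing in for gl_ProjectionMatrix.

```python
import math

def perspective(fovy_deg, aspect, near, far):
    """OpenGL-style perspective matrix as a list of rows (assumed stand-in
    for gl_ProjectionMatrix)."""
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def reconstruct_eye_pos(eye_ray, fragment_z):
    """Step 2: eyepos = eyeray * fragmentZ / eyeray.z.
    fragment_z is the stored eye-space z (negative in OpenGL)."""
    s = fragment_z / eye_ray[2]
    return (eye_ray[0] * s, eye_ray[1] * s, eye_ray[2] * s)

def eye_to_texcoord(proj, p):
    """Step 4: eye -> clip, divide by w, then map NDC [-1, 1] to [0, 1]."""
    x, y, z = p
    clip = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in proj]
    ndc_x = clip[0] / clip[3]
    ndc_y = clip[1] / clip[3]
    return (0.5 * ndc_x + 0.5, 0.5 * ndc_y + 0.5)

# A ray through the screen centre must land back on the centre of the texture.
proj = perspective(90.0, 1.0, 0.1, 100.0)
centre = reconstruct_eye_pos((0.0, 0.0, -1.0), -5.0)
print(centre, eye_to_texcoord(proj, centre))
```

Projecting a reconstructed point straight back is a cheap sanity check: it should return the texture coordinate of the fragment it came from, before any sphere offset is added.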

Still not perfect, but it's getting somewhere. There are a lot of parameters to tune as usual: radius of the sphere, number of sampling points, distribution of the points, blocker attenuation function, blur kernel, contrast adjustments...

[Edited by - Inigo Quilez on September 6, 2007 6:52:45 PM]


I'm curious how the algorithm behaves in a particular case, but I can't think of a quick way of summing it up, so let me explain: Imagine you're staring at a couch that nearly but not quite butts up against the wall behind it. You're staring at the wall flat-on, and thus the front of the couch. The region around the back edge of the couch, where it meets the wall, should have some level of shadowing to indicate the closeness of the surfaces. Now clearly, your implementation handles that.

However, what does it do if you were to bring the camera up against the wall and look straight down at the gap between wall and couch - and close enough that the gap is of reasonable size. Does your algorithm still shadow the wall at all, in that case?


very interesting. are you going to provide a video or even a binary?


A similar technique can be found in this (http://www.cs.utexas.edu/~perumaal/) paper. I mixed Crytek's and the one of that paper a bit too: http://www.rgba.org/iq/trastero/ssao3.jpg, and I still need a blur to remove noise. So I'm doing something wrong probably.


Good job. I've got a few suggestions though:

1) Try adding the diffuse and specular components and take some comparison shots with and without the ambient occlusion term so the difference would be more obvious.

2) Any info on the performance characteristics of your implementation?

Keep us informed :)


Thanks very much for the lengthy description of your approach.
I've been trying an implementation as well, but didn't succeed in getting rid of many artifacts.
You've given me food for thought :)


Quote:
Original post by Inigo Quilez

It's a really interesting picture, but it seems to me that this method
works correctly only if ALL the nearest faces are visible at the
same time (or I don't quite understand the method).
Maybe it's just reinventing the wheel (I have never read or thought about SSAO before),
but you could try the following instead.
Three steps are needed:
1) Render all meshes to a texture: the RGB part is filled as for normal maps,
the alpha part is filled with range values, culling mode CW.
2) Same as 1), but with culling mode CCW (back-face info).
3) Render the final SSAO texture using the textures from steps 1 and 2 and some kind of NxN filter (with gaps),
calculating each component of the sum as a function of the dot product and the range between opposite pixels in the filter cell.

[Edited by - Krokhin on September 7, 2007 4:12:08 AM]


hi. I took three screenshots:

no ssao:
noisy ssao:
blured ssao:

There is still work to do on the blocking function. My colleague Flavien Brebion is also trying it himself and got some nice blocking functions; we are still searching for the perfect balance between halos (white ones at the moment, which is not that bad), number of samples and noise. He actually found that with 32 samples on the sphere you almost don't need blurring, while I was using 6 jittered samples, and then you obviously need some.

The method doesn't work of course when you have objects outside the viewport area or they project to zero pixels since they don't get registered on the zbuffer. It's also a quite local ambient occlusion of course, but hey, it's better than nothing!


Wow, looks great. Is there any way to get the HLSL code? I really need it for my project. I want to integrate it with Quest3D.
Thanks.

My online portfolio website : http://www.ali-rahimi.net


Must say it looks really great!

I have one comment though. It seems like the edges of the boxes are brighter than the faces, but I don't think they should be? Maybe you should clamp the value somehow?

Like, if a hemisphere is "free" then let it have full brightness.


Hi Inigo,

Thanks for the inspired post!

I am getting very nice results with the suggested correct 1/(r*r) falloff, so your concept of the blocking function is just fine. However, instead of using the step() function I do this:

zd = max(zd, 0);
occlusion += 1.0 / (1 + zd*zd);
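
To see how this variant relates to the step() version from the first post, both can be evaluated side by side. This is just a Python sketch of the two formulas as written in the thread; the function names are made up, and which sign of zd means "in front" depends on a convention the posts do not spell out.

```python
def step(edge, x):
    """GLSL step(): 0.0 if x < edge, else 1.0."""
    return 0.0 if x < edge else 1.0

def block_step(zd):
    """First post: bloq += step(0.0, zd) / (1.0 + zd*zd)."""
    return step(0.0, zd) / (1.0 + zd * zd)

def block_max(zd):
    """This post: zd = max(zd, 0); occlusion += 1.0 / (1 + zd*zd)."""
    zd = max(zd, 0.0)
    return 1.0 / (1.0 + zd * zd)

# Print both contributions for a few z differences.
for zd in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(zd, block_step(zd), block_max(zd))
```

The two agree for zd >= 0. For negative zd the step() form contributes nothing, while the max() form contributes a full 1.0, so they are not interchangeable without also looking at how zd is signed.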

If noise is an issue even with large numbers of samples, don't forget that loading the sphere point table with a GOOD distribution of vectors makes a huge difference. Mostly it is important to subdivide the solid angles evenly. Carefully randomizing the lengths of the vectors (approx in the 0.1-1.0 range) will make the ambient shading look much more natural as well.

Regards,
Darren


(Edit to say this local occlusion effect really shines with detailed models.)



@dgrantkp :: right now samples are coming from a simple generate-uniform-points-in-cube-and-reject-if-not-in-ball method. If you could provide a set of good points it would be super great.

@the_laze :: yes, I think one can get rid of them with clamping or something. I would say Crytek has the same effect though, at least judging by the picture they added to the article.

@rahimi :: hey man, I'm a fan of your work. For the shader, well, I think it's more or less clear how to get it if you follow the description. If you cannot make it work I can try to help (by mail), but I'm still figuring out myself how to do it properly.

@Ashkan :: performance pretty much depends on how big your sampling sphere/disk is (because of texture cache behaviour). In my case I get between 30 and 120 fps depending on how bad/good the settings are (I'm changing them all the time!). I wonder if I should sort my sampling points from left to right and top to bottom, so they project to the screen in a more coherent way and the texture cache works better. Anyway, my implementation is far from the best, and I'm sure a proper implementation will definitely not impact the framerate of any application. Otherwise they would not use it in a game...
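
The generate-uniform-points-in-cube-and-reject-if-not-in-ball method mentioned in the reply to dgrantkp can be sketched as follows. This is a minimal Python illustration under assumptions: the function name, the fixed seed and the default count are made up, and only the count of 6 comes from the first post.

```python
import random

def sample_points_in_ball(count, seed=0):
    """Rejection sampling: draw uniform points in the cube [-1, 1]^3 and
    keep only those that fall inside the unit ball."""
    rng = random.Random(seed)  # fixed seed so the table is reproducible
    points = []
    while len(points) < count:
        p = (rng.uniform(-1.0, 1.0),
             rng.uniform(-1.0, 1.0),
             rng.uniform(-1.0, 1.0))
        if p[0] * p[0] + p[1] * p[1] + p[2] * p[2] <= 1.0:
            points.append(p)
    return points

# Six offsets, to be scaled by the sphere radius and added to eyepos.
for p in sample_points_in_ball(6):
    print(p)
```

Points produced this way are uniform inside the ball but not stratified, which fits the remark that a better-distributed set would be welcome.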


I'm just using a const array of 14 vectors. The first 6 are +/- x, y, z and the last 8 are the unit cube corners normalized. Then to soften the shading pre-multiply these vectors with some moderately random lengths.
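
A table like that can be generated offline. The sketch below is a Python illustration under assumptions: the function name and seed are made up, the cube corners are taken to be (+/-1, +/-1, +/-1), and the 0.1-1.0 length range is borrowed from the earlier post about randomizing vector lengths.

```python
import itertools
import math
import random

def build_kernel(seed=0, min_len=0.1, max_len=1.0):
    """14 directions: +/- x, y, z plus the 8 normalized cube corners,
    each pre-multiplied by a random length in [min_len, max_len]."""
    axes = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0),
            (0.0, 1.0, 0.0), (0.0, -1.0, 0.0),
            (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
    inv = 1.0 / math.sqrt(3.0)  # normalizes (+/-1, +/-1, +/-1)
    corners = [(sx * inv, sy * inv, sz * inv)
               for sx, sy, sz in itertools.product((-1.0, 1.0), repeat=3)]
    rng = random.Random(seed)
    kernel = []
    for d in axes + corners:
        length = rng.uniform(min_len, max_len)
        kernel.append((d[0] * length, d[1] * length, d[2] * length))
    return kernel

# Print the table so it can be pasted into a shader as a const array.
for v in build_kernel():
    print("vec3(%.4f, %.4f, %.4f)," % v)
```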

About the halo that appears: It is the effect of samples on the same planar surface in the scene adding to occlusion because the blocking function does not factor in the relative position of the occluder. (For instance, think about what you get when zd=0, the trivial example.) One quick and dirty way to counteract this is to bias the zfragment, but it comes at the cost of new artifacts.

I leave the math involved in solving the relative position problem as an exercise for now. ;)


[Edited by - dgrantkp on September 10, 2007 3:08:23 AM]


I spent a few hours this weekend trying to implement the effect. So far I only distribute samples based on a 2D pattern. No reprojection.
I still have much to do and fine-tune. I got a lot of artifacts, some of which I removed by using step(0.0001, zd) instead of step(0.0, zd).
Right now I render the scene to an RGBA32F buffer and store the depth in the alpha channel; unfortunately that means I read 16 bytes
each time I sample the depth buffer in the last fullscreen pass. I would like to use an RGBA8 and a LUMINANCE32F buffer at the same time to
render scene color and depth, but OpenGL FBOs don't allow that. Does Direct3D? I will try to copy the result to a LUMINANCE32F buffer
in my next coding session. What do you use?
I think it would be better with a less steep blocking function. Has anyone tried anything smoother? Has anyone tried taking the normal
into account (one should probably distribute the samples differently based on the normal)?

I would be very interested in taking this further. Anyone who wants to discuss this subject can contact me at:
msn and mail: david.k.olsson at gmail.com
icq: 18045544



Quote:
Original post by Zelcious
Right now I render the scene to an RGBA32F buffer and store the depth in the alpha channel; unfortunately that means I read 16 bytes each time I sample the depth buffer in the last fullscreen pass. I would like to use an RGBA8 and a LUMINANCE32F buffer at the same time to render scene color and depth, but OpenGL FBOs don't allow that.


One solution is to render to two RGBA8 buffers. In the second one, you can encode the depth from a float to RGBA8, and when reading it back in the post-process step, decode it from an RGBA8 to a float.

This is the code in GLSL to do that:

/// Packing a [0-1] float value into a 4D vector where each component will be an 8-bit integer
vec4 packFloatToVec4i(const float value)
{
	const vec4 bitSh = vec4(256.0 * 256.0 * 256.0, 256.0 * 256.0, 256.0, 1.0);
	const vec4 bitMsk = vec4(0.0, 1.0 / 256.0, 1.0 / 256.0, 1.0 / 256.0);
	vec4 res = fract(value * bitSh);
	res -= res.xxyz * bitMsk;
	return res;
}

/// Unpacking a [0-1] float value from a 4D vector where each component was an 8-bit integer
float unpackFloatFromVec4i(const vec4 value)
{
	const vec4 bitSh = vec4(1.0 / (256.0 * 256.0 * 256.0), 1.0 / (256.0 * 256.0), 1.0 / 256.0, 1.0);
	return(dot(value, bitSh));
}
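
The packing arithmetic can be tried outside a shader. Below is a Python port of the two functions above, meant only as a sketch: the Python names are made up, and quantize8 is an assumption about how an RGBA8 target stores each channel (rounded to the nearest multiple of 1/255), which the post does not describe.

```python
import math

def fract(x):
    """GLSL fract(): x - floor(x)."""
    return x - math.floor(x)

BIT_SHIFT = (256.0 ** 3, 256.0 ** 2, 256.0, 1.0)

def pack_float_to_vec4(value):
    """Port of packFloatToVec4i."""
    r = [fract(value * s) for s in BIT_SHIFT]
    # res -= res.xxyz * bitMsk, with the right-hand side read before the write
    return (r[0],
            r[1] - r[0] / 256.0,
            r[2] - r[1] / 256.0,
            r[3] - r[2] / 256.0)

def unpack_float_from_vec4(v):
    """Port of unpackFloatFromVec4i: dot(value, bitSh)."""
    return (v[0] / 256.0 ** 3 + v[1] / 256.0 ** 2 + v[2] / 256.0 + v[3])

def quantize8(v):
    """ASSUMPTION: each channel is stored as round(c * 255) / 255."""
    return tuple(round(c * 255.0) / 255.0 for c in v)

for value in (0.1, 0.3, 0.75):
    exact = unpack_float_from_vec4(pack_float_to_vec4(value))
    stored = unpack_float_from_vec4(quantize8(pack_float_to_vec4(value)))
    print(value, exact, stored)
```

Without quantization the round trip is exact up to floating-point error. With the assumed 8-bit rounding the recovered value matches only approximately in this sketch, so it is worth measuring the error on the target hardware. Note also that an input of exactly 1.0 comes back as 0.0, because fract(1.0) is 0.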


Alternatively, you can use two RGBA16F buffers, and output the linear depth instead (depth=(dist-znear)/(zfar-znear)). The results look all right even though it's only a 16-bit float. You can output the eye-space normals into the 3 remaining components if you want to tweak the ambient-occlusion formula by taking normals into account.
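
To get a feeling for what a 16-bit float does to that linear depth, the formula can be evaluated on the CPU and rounded to half precision. This is a Python sketch with made-up function names and made-up near/far values; it uses the standard struct module's 'e' (IEEE half) format to emulate the storage.

```python
import struct

def linear_depth(dist, znear, zfar):
    """depth = (dist - znear) / (zfar - znear), 0 at the near plane and 1 at the far plane."""
    return (dist - znear) / (zfar - znear)

def to_float16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack('e', struct.pack('e', x))[0]

# Hypothetical clip planes, only for illustration.
znear, zfar = 0.1, 100.0
for dist in (0.5, 5.0, 30.0, 90.0):
    d = linear_depth(dist, znear, zfar)
    print(dist, d, to_float16(d), abs(to_float16(d) - d))
```

The printed error grows with the stored value, since half floats have fewer representable values near 1.0 than near 0.0.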

Y.



these are very interesting results.
here's the original paper, btw:-
Finding next gen: CryEngine 2


@Zelcious :: I also tried the 2D version. In that case you had better "push" the blocker pixels a bit in the opposite direction of the surface normal (in eye space); that's why I decided to output eye-space normals in the first place. That removes a lot of acne.

These normals can be encoded in two floats by the way, since z is always pointing towards the viewer. You save one channel in a typical RGBA16F texture, but I don't know what you could store there. Maybe an object-space calculated AO (=global) to be blended later with the screen-space (=local) one??
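
The two-float normal encoding can be sketched like this. It is a Python illustration with made-up function names, and it relies on the assumption stated in the post that the eye-space normal is unit length with a non-negative z (pointing towards the viewer).

```python
import math

def encode_normal(n):
    """Keep only x and y of a unit eye-space normal."""
    return (n[0], n[1])

def decode_normal(xy):
    """Rebuild z from x and y, ASSUMING z >= 0 (normal faces the viewer)."""
    x, y = xy
    # max() guards against a slightly negative value caused by rounding
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return (x, y, z)

n = (0.6, 0.0, 0.8)
print(decode_normal(encode_normal(n)))
```

A normal with negative z would decode to its mirror image, so the scheme only holds as long as that assumption does.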


Quick post:

Copying the alpha channel of the RGBA32F to a R32F texture before doing the 8 sample SSAO increased my fps from 25 to 33 (1920x1200).
The code needs much more optimization. It's the thrashing of the cache that is the big problem.


Quote:
Original post by Zelcious
I would like to use RGBA8 and LUMINANCE32F buffer at same time to
render scene color and depth.


I use R32F for depth in D3D (Radeon X1650), but luminance formats are not valid render targets.



Very well done! What is the performance hit of this technique?


Ah, I made a small 4k demo using these SSAO experiments. I removed the dithering to save some bytes ;)

http://www.rgba.org/iq/demoscene/productions/productions.htm

(first one)




Really nice demo you have there :D

Over the last few weeks I've tried to implement this effect myself in my spare time, with some good results, at least as long as you don't mind the weird bands on the lower half of the screen...

Here's a screenshot:
[image: screenshot hosted on ImageShack]

For the sample points I use 10 constant vectors of magnitude [0.1, 1.0] reflected on a random normal fetched from a texture.
Does anyone have ideas about this glitch and how to get rid of it?
I've run out of ideas.


To the post above, try using a higher precision depth buffer. My banding artifacts went away when I stored depth as 32F instead of 16F.

I sent Inigo an email about this, but maybe somebody else has an idea. I'm unsure exactly how this algorithm works when we're basing everything on z difference and not taking into account the surface normals. Example: if I am looking at a flat surface (and no nearby occluders) straight on, the z difference at each sample on the inside of the surface will be 0, or close to 0. When I'm at an angle to the surface the z difference at each sample will change (we could get anything from -2 to 2, say). So we have two problems: flat surfaces self-occlude, which is not what we want, and surfaces will change color as we change our viewing angle because of the change in z difference at each sample (the more straight on we are to a surface, the darker it will get due to smaller z differences).

Inigo's screenshots look like he doesn't have the problem of the color changing as the viewing angle changes, at least (though I do see some noise on the flat faces). What am I missing here?

Sam McGrath



Quote:

I sent Inigo an email about this, but maybe somebody else has an idea. I'm unsure exactly how this algorithm works when we're basing everything on z difference and not taking into account the surface normals. Example: if I am looking at a flat surface (and no nearby occluders) straight on, the z difference at each sample on the inside of the surface will be 0, or close to 0. When I'm at an angle to the surface the z difference at each sample will change (we could get anything from -2 to 2, say). So we have two problems: flat surfaces self-occlude, which is not what we want, and surfaces will change color as we change our viewing angle because of the change in z difference at each sample (the more straight on we are to a surface, the darker it will get due to smaller z differences).

Inigo's screenshots look like he doesn't have the problem of the color changing as the viewing angle changes, at least (though I do see some noise on the flat faces). What am I missing here?

Sam McGrath


I've managed to remove the self-occlusion and view-dependent occlusion problems by using per-pixel normals to create sample points only on a hemisphere, then using the sample depth in place of the pixel depth in the occlusion test.
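
The post does not say how the hemisphere samples are built. One common way, shown here purely as an assumption and not as the poster's method, is to flip any sample that points away from the per-pixel normal. The Python names are made up for illustration.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def to_hemisphere(sample, normal):
    """Mirror the sample through the origin if it lies below the surface,
    so every offset ends up in the hemisphere around the normal."""
    if dot(sample, normal) < 0.0:
        return (-sample[0], -sample[1], -sample[2])
    return sample

normal = (0.0, 0.0, 1.0)
for s in ((0.3, -0.2, -0.5), (0.1, 0.2, 0.3)):
    print(s, to_hemisphere(s, normal))
```

Samples restricted this way never start below a flat surface, which fits the reported disappearance of self-occlusion on flat faces.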

Quote:

To the post above, try using a higher precision depth buffer. My banding artifacts went away when I stored depth as 32F instead of 16F.


I've tried using a 32F RT but with no results, thanks for the advice though.

Anyway, I found a workaround for this problem: as I enlarge the rendering window the bands fade away, and using a 1024x768 resolution removes them completely.

Screenshot (SSAO raw output):
[image: screenshot hosted on ImageShack]

The architecture is the well-known sponza atrium, the shape on the upper part of the screen is the back of Blender's Suzanne monkey model.


Hi, I added few more screenshots here: http://www.gamedev.net/community/forums/topic.asp?topic_id=469773 (bottom of the thread).

There are two ways to do SSAO as I see it. One is to create the occluder points from the z-buffer and compute the occlusion they cast on the point being shaded, as in "Hardware Accelerated Ambient Occlusion Techniques on GPUs" by Perumaal Shanmugam and Okan Arikan; the other is to generate 3D points regardless of the z-buffer and do some heuristic blocking computations, à la Crytek.

I first implemented the first one, and indeed I had some ugly acne that I could remove by considering the eye-space normals and moving the blockers a bit towards the eye by a small epsilon.

I quickly changed to the second method that is less physically accurate, but faster.

I made the following provisional page http://www.rgba.org/iq/computer/articles/ssao/ssao.htm, that I plan to improve when I have the chance to test the technique on more models and optimize it so it runs on ps2.0



All times are ET (US)
