Hello,
I've somewhat successfully implemented the SSAO-approach featured in this article: http://www.gamedev.net/page/resources/_/technical/graphics-programming-and-theory/a-simple-and-practical-approach-to-ssao-r2753. Here is the shader:
float4 cSize : register(c0); // xy = screen, zw = 1 / random texture
float4 cParams : register(c1); // scale, bias, intensity, sample rad
// currently passed in: { 0.1f, 0.045f, 1.0f, 0.5f };
float4x4 cView : register(c2);
sampler cNormalSampler : register(s0);
sampler cPositionSampler : register(s1);
sampler cNoiseSampler : register(s2);
struct PS_INPUT
{
    float4 vPos : POSITION0;
    float2 vTex0 : TEXCOORD0;
};
float3 getPosition(in float2 uv)
{
    return mul( tex2D(cPositionSampler, uv).xyz, cView) ;
}
float3 getNormal(in float2 uv)
{
    return normalize( mul(tex2D(cNormalSampler, uv).xyz * 2.0f - 1.0f, cView) );
}
float2 getRandom(in float2 uv)
{
    const float2 random_size = 1/64.0f;
    return normalize(tex2D(cNoiseSampler, cSize.xy * uv * cSize.zw).xy * 2.0f - 1.0f);
}
float doAmbientOcclusion(in float2 tcoord,in float2 uv, in float3 p, in float3 cnorm)
{
    float3 diff = getPosition(tcoord + uv) - p;
    const float3 v = normalize(diff);
    const float d = length(diff)*cParams.x;
    return max(0.0, dot(cnorm, v)- cParams.y)* (1.0 / (1.0 + d) ) * cParams.z;
}
float4 mainPS(PS_INPUT i) : COLOR0
{
    const float2 vec[4] = {float2(1,0),float2(-1,0),
                           float2(0,1),float2(0,-1)};
    float3 p = getPosition(i.vTex0);
    float3 n = getNormal(i.vTex0);
    float2 rand = getRandom(i.vTex0);
    float ao = 0.0f;
    float rad = cParams.w/ p.z;
    //**SSAO Calculation**//
    int iterations = 4;
    for (int j = 0; j < iterations; ++j)
    {
        float2 coord1 = reflect(vec[j], rand)*rad;
        float2 coord2 = float2(coord1.x*0.707 - coord1.y*0.707,
                               coord1.x*0.707 + coord1.y*0.707);
        ao += doAmbientOcclusion(i.vTex0, coord1*0.25, p, n);
        ao += doAmbientOcclusion(i.vTex0, coord2*0.5, p, n);
        ao += doAmbientOcclusion(i.vTex0, coord1*0.75, p, n);
        ao += doAmbientOcclusion(i.vTex0, coord2, p, n);
    }
    ao/=(float)iterations*4.0;
    //**END**//
    return 1.0-ao;
}
However, I'm having multiple issues:
- First of all, the performance hit is horrible. In an empty scene, the game drops from 600 to 200 FPS. That's roughly 3.3 ms per frame just for running SSAO on an empty scene, which still seems somewhat reasonable for this effect. However, as soon as I load a bigger scene, the frame rate drops to no more than 20-30 FPS. Without SSAO, that scene renders at about 200 FPS, so the SSAO pass alone costs roughly 30 to 45 ms. I can add 20-30 deferred lights before I even come close to that value. See the attachment for what the scene looks like. If I zoom out so that the scene still covers the whole screen, I get about 100 FPS. Why is that? What makes such a huge difference in performance between a near object and a far one in a deferred renderer, where the SSAO pass works on a texture of the same size anyway? And, more importantly, how can I solve this? I'm running on a GeForce GTX 560 Ti, so it should at least run at 60 FPS.
- Note how in the screenshot there is a subtle black line in the middle of the scene. If I zoom out, this line becomes white, and it rotates as I rotate the scene. Where does this come from? Is there any way to resolve that?
- The article says that to get the position and normal from world space into view space, I should multiply by the view matrix. However, if I pass in the view matrix as cView, the ambient occlusion doesn't rotate correctly as I rotate the scene; it seems to be fixed to a certain side of the model. It's hard to describe: imagine a model that, seen from one side, is occluded in some way (far too dark, though), and seen from the other side shows no occlusion at all. Passing in the view-projection matrix fixes this, but I suspect there is some issue in my code. Is the code right, or do you see something suspicious?
Hopefully someone has an idea, especially about the performance part...