IBL Problem with consistency using GGX / Anisotropy

21 comments, last by Digitalfragment 9 years, 11 months ago

Hey guys, I'm currently in the process of building a new material/shading pipeline and I've come across a specific problem.

I've switched from a Blinn-Phong to a GGX-based shading model that also supports anisotropy. Until now I've been using the modified AMD CubeMapGen from Sébastien Lagarde's blog (http://seblagarde.wordpress.com/2012/06/10/amd-cubemapgen-for-physically-based-rendering/) to prefilter radiance environment maps and store each filter size in the mipmap levels. The problem is that when switching to GGX, or even an anisotropic highlight, the specular just doesn't fit anymore (at all).

So the question is: how do you get the environment map to be consistent with your shading model?

@MJP I know you guys also use anisotropic and cloth shading models; how do you handle indirect reflections / environment maps?


The modified AMD CubeMapGen generates cubemaps using the Phong shading model. For more complicated models, check out the Black Ops 2 and UE4 presentations from SIGGRAPH 2013:

http://blog.selfshadow.com/publications/s2013-shading-course/lazarov/s2013_pbs_black_ops_2_notes.pdf

http://blog.selfshadow.com/publications/s2013-shading-course/karis/s2013_pbs_epic_notes_v2.pdf

Do you know of a book or paper that explains how this works?
I know about radiometry, but the only thing I know about this process of filtering an environment map is the general idea of applying a BRDF convolution to the image.

Also, what about more complex BRDFs for clothing (Ashikhmin) or anisotropic GGX?
Do you just use isotropic reflections as an approximation?

The best you can do with a pre-convolved cubemap is to integrate the environment with an isotropic distribution to get the reflection when V == N, which is the head-on angle. It will give incorrect results as you get to grazing angles, so you won't get that long, vertical "streaky" look that's characteristic of microfacet specular models. If you apply Fresnel to the cubemap result you can also get reflections with rather high intensity, so you have to pursue approximations like the ones proposed in those course notes in order to keep the Fresnel from blowing out. It's possible to approximate the streaky reflections with multiple samples from the cubemap if you're willing to take the hit, and you can also take multiple samples along the tangent direction to approximate anisotropic reflections.
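
For illustration, here's a minimal sketch of that tangent-direction trick (a hedged example rather than anyone's production code; the sample count and anisotropy scale are made-up parameters):

float3 ApproxAnisoReflection(TextureCube envMap, SamplerState samp,
                             float3 R, float3 T, float anisotropy, float mip)
{
    // Hypothetical sketch: average several prefiltered-cubemap samples spread
    // along the tangent T to smear the reflection anisotropically.
    const int kSamples = 5; // illustrative value
    float3 sum = 0;
    for (int i = 0; i < kSamples; ++i)
    {
        // Offset in [-1, 1] along the tangent, scaled by the anisotropy amount.
        float t = (i / (kSamples - 1.0)) * 2.0 - 1.0;
        float3 dir = normalize(R + T * (t * anisotropy));
        sum += envMap.SampleLevel(samp, dir, mip).rgb;
    }
    return sum / kSamples;
}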

For our cloth BRDF we have a duplicate set of our cubemaps that are convolved with the inverted Gaussian distribution used in that BRDF. It's just like the GGX cubemaps in that it gets you the correct result when V == N, but incorrect results at grazing angles.

In layman's terms, you need to treat the cubemap as a lookup table and precompute some data into it from a source cubemap. Let's start with simple Lambert diffuse to see how it works.

For Lambert diffuse we want to precompute lighting per normal direction and store it in a cubemap. To do that we need to solve a simple integral:

$$E(n) = \int_{\Omega} L(l)\,(n \cdot l)\,d\omega_l$$

Which in our case means: for every texel of the destination cubemap (i.e. for every normal direction n), calculate a value E by iterating over all source cubemap texels and summing their contributions, where contribution = SourceCubemap(l) * (n dot l) * SourceCubemapTexelSolidAngle. This operation is called the cosine filter in AMD CubeMapGen.
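
A minimal sketch of that summation (my own illustration of the idea, not CubeMapGen's actual source; the buffers holding each source texel's direction, radiance, and solid angle are assumptions):

float3 CosineFilter(float3 n, uint texelCount,
                    StructuredBuffer<float3> srcDir,
                    StructuredBuffer<float3> srcRadiance,
                    StructuredBuffer<float> srcSolidAngle)
{
    // For one destination texel with normal n: sum the radiance of every
    // source texel l, weighted by (n . l) and that texel's solid angle.
    float3 E = 0;
    for (uint i = 0; i < texelCount; ++i)
    {
        float nDotL = saturate(dot(n, normalize(srcDir[i])));
        E += srcRadiance[i] * nDotL * srcSolidAngle[i];
    }
    // E is irradiance; divide by pi at lookup time for the Lambert BRDF.
    return E;
}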

For the simple Phong model (no Fresnel etc.) we can precompute the reflection in a similar way and store it in a destination cubemap. Unfortunately, even after just adding Fresnel, or when using Blinn-Phong, there are too many input variables and the exact results don't fit into a single cubemap. At that point you have to start approximating, use more storage, or take more than one sample.

Okay, I've been looking over the source code of CubeMapGen, and the only line that resembles the distribution function resides in a function called "ProcessFilterExtents" and looks like the following:


// Here we decide if we use a Phong/Blinn model or a Phong/Blinn BRDF.
// The Phong/Blinn BRDF is just the Phong/Blinn model multiplied by the cosine
// from Lambert's law, so adding one to the specular power does the trick.
weight *= pow(tapDotProd, (a_SpecularPower + (float32)IsPhongBRDF));

So if I understand correctly, the loop that encloses this line "integrates" over the hemisphere. Is tapDotProd something like R dot V? How do they handle the Blinn case with N dot H? I'm not seeing it anywhere. tapDotProd seems to be calculated as the dot product between the current cubemap pixel position and its center tap position. Is the specular power the only thing defining the distribution function here? Can you explain this approximation of the direction vector to me?

edit: I think I get it. V is a known variable to us here, since it is just every direction we loop over in the cubemap, and R would be the lookup vector into this cubemap image, right?
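
If so, what it effectively computes per destination texel would be something like this (my own paraphrase, not the actual source; the src* buffers are stand-ins for the source cubemap texel data):

float3 PhongPrefilter(float3 R, uint texelCount,
                      StructuredBuffer<float3> srcDir,
                      StructuredBuffer<float3> srcRadiance,
                      StructuredBuffer<float> srcSolidAngle,
                      float specularPower, bool isPhongBRDF)
{
    // R is the destination texel's direction, acting as both the reflection
    // and lookup vector; every source texel is treated as a light direction L.
    float3 sum = 0;
    float totalWeight = 0;
    for (uint i = 0; i < texelCount; ++i)
    {
        float tapDotProd = saturate(dot(R, normalize(srcDir[i]))); // R . L
        // Adding 1 to the power folds in the (n . l) cosine from Lambert's law.
        float weight = pow(tapDotProd, specularPower + (isPhongBRDF ? 1.0 : 0.0))
                     * srcSolidAngle[i];
        sum += srcRadiance[i] * weight;
        totalWeight += weight;
    }
    return sum / totalWeight;
}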

edit2: After looking at Brian Karis's code samples, it occurs to me that he is doing this on the GPU in a pixel/compute shader, and that I might just do that too :)

To prefilter the envmap he's got the following code:


float3 PrefilterEnvMap(float Roughness, float3 R)
{
    float3 N = R;
    float3 V = R;

    float3 PrefilteredColor = 0;
    float TotalWeight = 0; // missing in the original notes; must be declared and zeroed
    const uint NumSamples = 1024;
    for (uint i = 0; i < NumSamples; i++)
    {
        float2 Xi = Hammersley(i, NumSamples);
        float3 H = ImportanceSampleGGX(Xi, Roughness, N);
        float3 L = 2 * dot(V, H) * H - V;
        float NoL = saturate(dot(N, L));

        if (NoL > 0)
        {
            PrefilteredColor += EnvMap.SampleLevel(EnvMapSampler, L, 0).rgb * NoL;
            TotalWeight += NoL;
        }
    }

    return PrefilteredColor / TotalWeight;
}


I've got a few questions about this...

1. What does the function Hammersley do?

2. He's sampling the environment map here... is that a TextureCube? Or is this function being run for each cube face as a Texture2D?

3. The input to this is the reflection vector R. How would it be calculated in this context? I imagine similarly to the direction vector in the AMD CubeMapGen?


1. Hammersley generates pseudo-random, fairly well-spaced 2D coordinates, which the GGX importance-sampling function then gathers into the region that's going to contribute the most for the given roughness.

2. It's a cubemap; for a rough surface an entire hemisphere is required. The nice thing about using a cubemap as an input is that it's easy to render one in realtime.

3. The function is run for every pixel of a cubemap rendertarget; it's convolving the environment for all directions.

Here's my attempt at implementing this entire process as presented by Epic at SIGGRAPH last year (if anyone can point out errors, that would be awesome).

It uses SV_VERTEXID to generate a fullscreen triangle, and a GS with SV_RENDERTARGETARRAYINDEX to output to all 6 faces of a cubemap rendertarget simultaneously.


 
struct vs_out
{
    float4 pos : SV_POSITION;
    float2 uv : TEXCOORD0;
};

void vs_main(out vs_out o, uint id : SV_VERTEXID)
{
    o.uv = float2((id << 1) & 2, id & 2);
    o.pos = float4(o.uv * float2(2,-2) + float2(-1,1), 0, 1);
    //o.uv = (o.pos.xy * float2(0.5,-0.5) + 0.5) * 4;
    //o.uv.y = 1 - o.uv.y;
}

struct ps_in
{
    float4 pos : SV_POSITION;
    float3 nrm : TEXCOORD0;
    uint face : SV_RENDERTARGETARRAYINDEX;
};

float3 UvAndIndexToBoxCoord(float2 uv, uint face)
{
    float3 n = float3(0,0,0);
    float3 t = float3(0,0,0);

    if (face == 0) // posx (red)
    {
        n = float3(1,0,0);
        t = float3(0,1,0);
    }
    else if (face == 1) // negx (cyan)
    {
        n = float3(-1,0,0);
        t = float3(0,1,0);
    }
    else if (face == 2) // posy (green)
    {
        n = float3(0,-1,0);
        t = float3(0,0,-1);
    }
    else if (face == 3) // negy (magenta)
    {
        n = float3(0,1,0);
        t = float3(0,0,1);
    }
    else if (face == 4) // posz (blue)
    {
        n = float3(0,0,-1);
        t = float3(0,1,0);
    }
    else // if (face == 5) // negz (yellow)
    {
        n = float3(0,0,1);
        t = float3(0,1,0);
    }

    float3 x = cross(n, t);

    uv = uv * 2 - 1;

    n = n + t*uv.y + x*uv.x;
    n.y *= -1;
    n.z *= -1;
    return n;
}

[maxvertexcount(18)]
void gs_main(triangle vs_out input[3], inout TriangleStream<ps_in> output)
{
    for (int f = 0; f < 6; ++f)
    {
        for (int v = 0; v < 3; ++v)
        {
            ps_in o;
            o.pos = input[v].pos;
            o.nrm = UvAndIndexToBoxCoord(input[v].uv, f);
            o.face = f;
            output.Append(o);
        }
        output.RestartStrip();
    }
}
 
SamplerState g_samCube
{
    Filter = MIN_MAG_MIP_LINEAR;
    AddressU = Clamp;
    AddressV = Clamp;
};
TextureCube g_txEnvMap : register(t0);
 
cbuffer mip : register(b0)
{
    float g_CubeSize;
    float g_CubeLod;
    float g_CubeLodCount;
};
 
 
// http://holger.dammertz.org/stuff/notes_HammersleyOnHemisphere.html
float radicalInverse_VdC(uint bits) {
     bits = (bits << 16u) | (bits >> 16u);
     bits = ((bits & 0x55555555u) << 1u) | ((bits & 0xAAAAAAAAu) >> 1u);
     bits = ((bits & 0x33333333u) << 2u) | ((bits & 0xCCCCCCCCu) >> 2u);
     bits = ((bits & 0x0F0F0F0Fu) << 4u) | ((bits & 0xF0F0F0F0u) >> 4u);
     bits = ((bits & 0x00FF00FFu) << 8u) | ((bits & 0xFF00FF00u) >> 8u);
     return float(bits) * 2.3283064365386963e-10; // / 0x100000000
 }
 // http://holger.dammertz.org/stuff/notes_HammersleyOnHemisphere.html
 float2 Hammersley(uint i, uint N) {
     return float2(float(i)/float(N), radicalInverse_VdC(i));
 }
 
 static const float PI = 3.1415926535897932384626433832795;
 
// Image-Based Lighting
// http://www.unrealengine.com/files/downloads/2013SiggraphPresentationsNotes.pdf
float3 ImportanceSampleGGX( float2 Xi, float Roughness, float3 N )
{
    float a = Roughness * Roughness;
    float Phi = 2 * PI * Xi.x;
    float CosTheta = sqrt( (1 - Xi.y) / ( 1 + (a*a - 1) * Xi.y ) );
    float SinTheta = sqrt( 1 - CosTheta * CosTheta );
    float3 H;
    H.x = SinTheta * cos( Phi );
    H.y = SinTheta * sin( Phi );
    H.z = CosTheta;
    float3 UpVector = abs(N.z) < 0.999 ? float3(0,0,1) : float3(1,0,0);
    float3 TangentX = normalize( cross( UpVector, N ) );
    float3 TangentY = cross( N, TangentX );
    // Tangent to world space
    return TangentX * H.x + TangentY * H.y + N * H.z;
}
 
// M matrix, for encoding
const static float3x3 M = float3x3(
    0.2209, 0.3390, 0.4184,
    0.1138, 0.6780, 0.7319,
    0.0102, 0.1130, 0.2969);
 
// Inverse M matrix, for decoding
const static float3x3 InverseM = float3x3(
    6.0013,    -2.700,    -1.7995,
    -1.332,    3.1029,    -5.7720,
    .3007,    -1.088,    5.6268);    
 
float4 LogLuvEncode(in float3 vRGB)
{
    float4 vResult;
    float3 Xp_Y_XYZp = mul(vRGB, M);
    Xp_Y_XYZp = max(Xp_Y_XYZp, float3(1e-6, 1e-6, 1e-6));
    vResult.xy = Xp_Y_XYZp.xy / Xp_Y_XYZp.z;
    float Le = 2 * log2(Xp_Y_XYZp.y) + 127;
    vResult.w = frac(Le);
    vResult.z = (Le - (floor(vResult.w*255.0f))/255.0f)/255.0f;
    return vResult;
}
 
float3 LogLuvDecode(in float4 vLogLuv)
{
    float Le = vLogLuv.z * 255 + vLogLuv.w;
    float3 Xp_Y_XYZp;
    Xp_Y_XYZp.y = exp2((Le - 127) / 2);
    Xp_Y_XYZp.z = Xp_Y_XYZp.y / vLogLuv.y;
    Xp_Y_XYZp.x = vLogLuv.x * Xp_Y_XYZp.z;
    float3 vRGB = mul(Xp_Y_XYZp, InverseM);
    return max(vRGB, 0);
}
 
// Ignacio Castano via http://the-witness.net/news/2012/02/seamless-cube-map-filtering/
float3 fix_cube_lookup_for_lod(float3 v, float cube_size, float lod)
{
    float M = max(max(abs(v.x), abs(v.y)), abs(v.z));
    float scale = 1 - exp2(lod) / cube_size;
    if (abs(v.x) != M) v.x *= scale;
    if (abs(v.y) != M) v.y *= scale;
    if (abs(v.z) != M) v.z *= scale;
    return v;
}
 
// Pre-Filtered Environment Map
// http://www.unrealengine.com/files/downloads/2013SiggraphPresentationsNotes.pdf
float4 ps_main(in ps_in i) : SV_TARGET0
{
    float3 N = fix_cube_lookup_for_lod(normalize(i.nrm), g_CubeSize, g_CubeLod);
    float Roughness = (float)g_CubeLod / (float)(g_CubeLodCount - 1);

    float4 totalRadiance = float4(0,0,0,0);

    const uint C = 1024;
    [loop]
    for (uint j = 0; j < C; ++j)
    {
        float2 Xi = Hammersley(j, C);
        float3 H = ImportanceSampleGGX( Xi, Roughness, N );
        float3 L = 2 * dot( N, H ) * H - N;
        float nDotL = saturate(dot(L, N));
        [branch]
        if (nDotL > 0)
        {
            float4 pointRadiance = g_txEnvMap.SampleLevel( g_samCube, L, 0 );
            totalRadiance.rgb += pointRadiance.rgb * nDotL;
            totalRadiance.w += nDotL;
        }
    }

    return LogLuvEncode(totalRadiance.rgb / totalRadiance.w);
}
 

First of all, that is one awesome code sample (the links are nice too!), I can't thank you enough! This clears up a lot of things for me :)

Still got some questions on that:

1. Why encode to LogLuv and not just write to an FP16 target?

2. I haven't used the geometry shader duplication method for rendering a cubemap before. Could you explain the VS and GS a little? Or do you have a link to an article that explains them?

3. Could I also generate a diffuse irradiance environment map using the above sample by removing the importance sampling bit and just accumulating nDotL?

Update: I just finished implementing a test that applies it to a single mip level based on your example, but I'm a little confused about how the Roughness is calculated.

For example: is your first LOD level 0 or 1? If you only had an LOD count of 1, wouldn't that give you a divide by zero? How many LODs do you use? I assume it ranges from 0 (perfect mirror) to 1 (highest roughness)? Should the first mip level be the original cubemap (i.e. no convolution = perfect mirror)?

Shouldn't this line be: float Roughness = (float)g_CubeLod / (float)(g_CubeLodCount); ?

*Bump*

I feel like I'm getting a little closer, but there's still something wrong, so I made a quick video showing what the filtered mip levels look like:

Update: Fixed the lookup vector. CubeSize has to be the size of the original cubemap (not each mip level's size, which is what I had before). I still have the problem with the randomly distributed dots all over it.

Does anyone have an idea?

Thanks for the comment on the code sample, I'm glad it was helpful.
1. The LogLuv encoding was due to Maya 2012 not wanting to load fp16 DDS files, and it was faster to implement the encode/decode than to work out Maya. (If anyone out there knows how to get Maya to load floating-point DDS files correctly without destroying mips too, let me know!)
- The quality difference between LogLuv in an RGBA8 target and the unencoded float16 wasn't bad either, and the bandwidth improvement of 32bpp vs 64bpp is quite dramatic when still needing to target the DX9 generation of consoles.
2. The VS/GS trick is fun. I'll break it down into small chunks:
In DirectX 11 you can render without supplying vertex data. Instead, SV_VERTEXID is automatically supplied, equal to the index of the vertex being generated. So in C++ I simply bind the shaders and call Draw(3,0) to draw a fullscreen triangle. See this link for more info on system-generated semantics:
- The reason to use a triangle and not a quad is to avoid the redundant processing of the pixel quads that run along the edge shared by the two triangles of a quad.
The GS pass then takes this triangle and generates 6 triangles, assigning each to an individual texture in the texture array using SV_RENDERTARGETARRAYINDEX. This allows the C++ code to create a single ID3D11RenderTargetView for the cubemap per mip level, instead of creating an RTV for each individual face of each mip level. The GS cubemap trick was in one of the DirectX SDK samples; I can't remember which one though.
The code in the GS that works out the corner direction of the cubemap box was completely trial and error :)
An extra note on the shader: the seamless-cube-filtering concept isn't necessary when the generated file is going to be used on DX11, only on DX9, as on the older generation cubemap texture filtering did not blend between cubemap faces (for example, when sampling the 1x1 mip you would only ever get a flat colour).
3. Yes, definitely. I hadn't gotten around to doing a diffuse irradiance map and instead hacked it to use the last mip of the probe with the vertex normal, which for our case seems good enough for now and saves generating double the number of probes at runtime.
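
A hedged sketch of how that could slot into the shader above, swapping ImportanceSampleGGX for a cosine-weighted hemisphere sample (untested, so treat it as a starting point rather than a drop-in):

float3 ImportanceSampleCosine(float2 Xi, float3 N)
{
    // Cosine-weighted hemisphere sample around N. Because the pdf already
    // contains the (n . l) factor, you would accumulate the samples directly
    // (no extra nDotL weight) and divide by the sample count.
    float Phi = 2 * PI * Xi.x;
    float CosTheta = sqrt(1 - Xi.y);
    float SinTheta = sqrt(Xi.y);
    float3 L;
    L.x = SinTheta * cos(Phi);
    L.y = SinTheta * sin(Phi);
    L.z = CosTheta;
    // Same tangent-to-world transform as ImportanceSampleGGX above.
    float3 UpVector = abs(N.z) < 0.999 ? float3(0,0,1) : float3(1,0,0);
    float3 TangentX = normalize(cross(UpVector, N));
    float3 TangentY = cross(N, TangentX);
    return TangentX * L.x + TangentY * L.y + N * L.z;
}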
On the LOD question: the first LOD is 0, which yields a roughness of 0 (perfect mirror), and the (count - 1) is so that the last generated mip yields a roughness of 1 (Lambert diffuse). Yes, an LOD count of 1 would cause a divide by zero, but that's fine, because a single mip level can't be both a roughness of 0 and a roughness of 1! Also, don't generate mipmaps all the way down to 1x1, as you do need a little more precision at a roughness of 1.
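
At runtime the lookup is then just the inverse of that mapping; a minimal sketch (assuming the same g_CubeLodCount convention and the LogLuv encoding from the shader above):

float3 SamplePrefilteredEnv(TextureCube envMap, SamplerState samp,
                            float3 R, float roughness, float lodCount)
{
    // Mip 0 = mirror, mip (lodCount - 1) = roughness 1, matching ps_main.
    float mip = roughness * (lodCount - 1);
    // Note: hardware filtering of LogLuv-encoded RGBA8 data is only
    // approximate, since the encoding isn't linear.
    return LogLuvDecode(envMap.SampleLevel(samp, R, mip));
}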
Regarding the high-frequency dots visible in your video: I can't say I ever got that problem, but I was using outdoor environment maps, which tend to be a lot smoother and more consistent in lighting values than that Grace Cathedral map. What sampler state are you using to sample your source cubemap with?
