Member Since 19 Jun 2011
Offline Last Active Nov 05 2013 09:48 PM

#5106881 Metallic in UE 4

Posted by CryZe on 04 November 2013 - 04:49 AM

The specular reflectance of non-metallic materials is achromatic most of the time, so their reflections aren't colored in any way. In their implementation, the more metallic the material is, the more its reflections are tinted by the diffuse color. Also, the more metallic the material is, the more strongly it reflects light in a specular way.

#5057197 Rendering & Saving Environment Maps

Posted by CryZe on 27 April 2013 - 04:36 AM

The way it works with the geometry shader is that you project each vertex inside the geometry shader onto the different sides and then output the resulting vertices with an SV_RenderTargetArrayIndex semantic, which indicates which side the projected vertex you are outputting is on.

#5051834 Oren-Nayar with Blinn-Phong Specular

Posted by CryZe on 10 April 2013 - 09:04 AM


If you want to check your assumptions without computing the integrals by hand, you can always resort to numeric integration using Monte Carlo sampling. I'm pretty sure Boost comes with a random number generator that produces evenly distributed random directions.

No need for random directions - just use spherical coordinates (and if your BRDF is isotropic, you only need the elevation). Every time you use dot(L, N) in your shader, use cos(theta), and so on. And then you can just integrate it numerically in Mathematica or something.


I don't have a source for this, but I *think* using this approach actually introduces a bias towards the poles. Assuming an unstratified sampling pattern.

If you keep the following in mind, you can use it without any problem:
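
The formula that originally followed this sentence is not preserved here. Assuming it refers to the solid-angle measure dω = sin(θ) dθ dφ (an assumption on my part, though it is the usual cause of the pole bias mentioned above), the following Python sketch integrates an isotropic integrand over the elevation angle only. The function names are hypothetical. As a sanity check it integrates a Lambert BRDF times cos(θ), which should return the albedo.

```python
import math

def integrate_hemisphere(integrand, steps=2000):
    """Midpoint-rule integral of an isotropic integrand over the hemisphere.

    integrand(theta) is evaluated at the elevation angle theta (radians).
    sin(theta) is the solid-angle weight that removes the bias towards the
    pole; the azimuth integral only contributes a constant factor of 2*pi.
    """
    d_theta = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        total += integrand(theta) * math.sin(theta) * d_theta
    return 2.0 * math.pi * total

ALBEDO = 0.8  # arbitrary illustrative value

def lambert_times_cos(theta):
    # Lambert BRDF (albedo / pi) multiplied by dot(N, L) = cos(theta)
    return (ALBEDO / math.pi) * math.cos(theta)

reflected_fraction = integrate_hemisphere(lambert_times_cos)
```

Dropping the `math.sin(theta)` factor would overweight directions near the pole, which is presumably the bias the quoted post is worried about.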

#5036140 Your preferred or desired BRDF?

Posted by CryZe on 24 February 2013 - 12:14 PM

Edit: So I suppose it would make sense to remove the final NdotL, since this shader represents only the BRDF and not the final pixel color. Presumably BRDF Explorer is automatically multiplying the result by NdotL.


Yep, the BRDF needs both the PI and the division by NDotL, while the implementation in a standard shader doesn't need those.

#5030667 Cook-Torrance / BRDF General

Posted by CryZe on 10 February 2013 - 06:30 AM

I've approximated the diffuse transmittance integral and created a BRDF which is pretty lightweight but also pretty physically accurate. Use this instead of Lambert if you want to have proper energy conservation. It's based on GGX roughness though, so you might need to convert your roughness to GGX roughness:




It's actually just a single MAD instruction per light if you implement it, the rest can be done on a per pixel basis.

#5028381 Cook-Torrance / BRDF General

Posted by CryZe on 03 February 2013 - 11:22 AM

float geo_a = (2.0 * NdotH * NdotV) / VdotH;
float geo_b = (2.0 * NdotH * NdotL) / VdotH;
float G = min(1.0, max(0, min(geo_a, geo_b)));

You can improve this part of the code this way:
float g_min = min(NdotV, NdotL);
float G = saturate(2 * NdotH * g_min / VdotH);
Also, don't ever use max(0, dot(a, b)). Instead use saturate(dot(a, b)) which compiles into a single instruction.
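
To check that the shorter form really matches the original, here is a small Python sketch that compares both versions over a grid of dot products. It only covers the case where all dot products are positive (light and viewer above the surface); with negative inputs the two forms can disagree before clamping, so treat the equivalence as holding under that assumption. `saturate` is reimplemented here because Python has no such built-in.

```python
def saturate(x):
    # Same behavior as the HLSL intrinsic: clamp to [0, 1]
    return min(1.0, max(0.0, x))

def g_original(n_dot_h, n_dot_v, n_dot_l, v_dot_h):
    geo_a = (2.0 * n_dot_h * n_dot_v) / v_dot_h
    geo_b = (2.0 * n_dot_h * n_dot_l) / v_dot_h
    return min(1.0, max(0.0, min(geo_a, geo_b)))

def g_simplified(n_dot_h, n_dot_v, n_dot_l, v_dot_h):
    # The common factor 2 * NdotH / VdotH is positive here, so the min
    # can be taken over NdotV and NdotL before multiplying.
    g_min = min(n_dot_v, n_dot_l)
    return saturate(2.0 * n_dot_h * g_min / v_dot_h)

samples = [0.1 * i for i in range(1, 11)]  # 0.1 .. 1.0
max_difference = max(
    abs(g_original(h, v, l, d) - g_simplified(h, v, l, d))
    for h in samples for v in samples for l in samples for d in samples
)
```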

Isn't real-time graphics programming all about approximations?

It is, but is using the complement of Fspecular actually a good one? I don't think so (unless you're using Fdiffuse).

I think someone should approximate a diffuse BRDF using the equation I posted in my post above. That would be a way better approximation.

To get back to the original topic:


The last one is the correct Cook-Torrance microfacet model. Sometimes you find (ns + 2) / (2 * pi) or (ns + 2) / (8 * pi) as the normalization factor for Blinn-Phong. The second one is already pre-multiplied with the 1/4 while the first one is the distribution function for the microfacet model.

And this one is the correct Beckmann NDF:


I wouldn't recommend the Beckmann NDF though. It's pretty damn slow in comparison to other NDFs because of 2 reciprocals and the exponential function. (Y u no use GGX xD)

This is the BRDF I'm using:


I'm using GGX as the distribution function, Schlick's approximation of Fresnel as the Fresnel term, and Walter's geometric term for the GGX distribution function.


I color-coded everything for implementation details. The grey parts are just parts of the BRDF and don't need to be implemented. The green parts can be calculated once for every pixel. And the red parts are the only parts that actually need to be calculated for every light.

#5028211 Cook-Torrance / BRDF General

Posted by CryZe on 02 February 2013 - 08:09 PM

Tiago, that's an approximation, even with the real Fresnel equations, but is not really correct.

Fresnel is dependent on the microfacets oriented towards the halfway vector, but diffuse actually is all the light that is not being reflected, not being absorbed and scattered back out, independent of the microfacets oriented towards the halfway vector. So simply the complement of a single fresnel value won't do it. You would need to solve the integral over all the microfacet orientations with a modified microfacet model:


Your approximation might actually be worse than not having a factor for diffuse at all. If anything I'd use the macro surface normal instead of the halfway vector (just for diffuse though, for specular you should use the microfacet normal):


#5027777 Shading metals

Posted by CryZe on 01 February 2013 - 02:07 AM

Oh wait, you want to do metals with the Fresnel equations? If that's the case, your formula from the other thread won't work. Metals usually have complex refractive indices (complex numbers). Fresnel's equations work with complex numbers though, your implementation just doesn't. You need complex multiplication, complex addition and the absolute value (which you didn't implement) needs to work with complex numbers as well. Also, since metals have chromatic reflections, you would have to calculate your Fresnel term for all 3 color channels. I'd use Schlick's approximation, reduce most of it to constant time, and reduce other parts of the formula to scalar calculations, while only the necessary parts get calculated for all color channels.


Here's approximately what that code should look like. You should probably calculate the constant part per vertex or per draw call on the CPU, if possible:


float2 f0CmplxRed = cmplxDiv(cmplxSub(n1Red, n2Red), cmplxAdd(n1Red, n2Red));
float2 f0CmplxGreen = cmplxDiv(cmplxSub(n1Green, n2Green), cmplxAdd(n1Green, n2Green));
float2 f0CmplxBlue = cmplxDiv(cmplxSub(n1Blue, n2Blue), cmplxAdd(n1Blue, n2Blue));

float3 f0Sqrt = 0;
f0Sqrt.r = cmplxAbs(f0CmplxRed);
f0Sqrt.g = cmplxAbs(f0CmplxGreen);
f0Sqrt.b = cmplxAbs(f0CmplxBlue);

float3 f0 = f0Sqrt * f0Sqrt;
float3 cf0 = 1 - f0;

foreach (light)
	float factor = pow(1 - dot(L, H), 5);
	float3 fresnel = f0 + cf0 * factor;
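
The same idea can be tried out on the CPU. The Python sketch below uses the built-in complex type in place of the `cmplx*` helpers. The per-channel refractive indices are made-up placeholders for illustration, not measured data for any real metal, and the function names are hypothetical.

```python
def f0_from_ior(n1, n2):
    """Reflectance at normal incidence; n1 and n2 may be complex."""
    r = (n1 - n2) / (n1 + n2)
    return abs(r) ** 2  # |r|^2, matches cmplxAbs followed by squaring

def schlick(f0, l_dot_h):
    # Schlick's approximation; (1 - f0) would be hoisted out of the light loop
    return f0 + (1.0 - f0) * (1.0 - l_dot_h) ** 5

N_AIR = 1.0
# Hypothetical complex IOR per color channel (placeholder values only)
N_METAL_RGB = (0.2 + 3.5j, 0.4 + 2.5j, 1.4 + 1.9j)

# Constant part: computed once, not per light
f0_rgb = tuple(f0_from_ior(N_AIR, n) for n in N_METAL_RGB)

# Per-light part: one scalar factor shared by all three channels
l_dot_h = 0.7
fresnel_rgb = tuple(schlick(f0, l_dot_h) for f0 in f0_rgb)
```

With a purely real index such as 1.5 the same function returns the familiar 4% for glass, which is a quick way to validate the complex path.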



#5027503 Fresnel equation

Posted by CryZe on 31 January 2013 - 03:35 AM

I think the most important thing in optimizing a BRDF is that you reduce linear time to constant time. Some parts of the BRDF don't need to be calculated per light. Just take a look at your version of Schlick's Fresnel:

foreach (light)
	float f0Sqrt = (n1 - n2) / (n1 + n2);
	float f0 = f0Sqrt * f0Sqrt;
	float fresnel = f0 + (1 - f0) * pow(1 - dot(L, H), 5);


If you implement it this way, you cut the per-light code to almost half of its instructions:

float f0Sqrt = (n1 - n2) / (n1 + n2);
float f0 = f0Sqrt * f0Sqrt;
float cf0 = 1 - f0;

foreach (light)
	float fresnel = f0 + cf0 * pow(1 - dot(L, H), 5);


I've reduced my BRDF this way. This is also the reason I'm using this reduced version of Schlick's Fresnel (yes, I'm using the refractive indices as well) in my BRDF: the full Fresnel equations just can't be reduced this way and are way too expensive in comparison. That's also the reason why I prefer the GGX NDF over any other NDF. It's pretty damn physically accurate and can be reduced to just a few instructions. Actually it's probably even faster than Blinn-Phong.

float roughnessSqr = roughness * roughness;
float numerator = roughnessSqr / PI;
float roughnessSqrSub1 = roughnessSqr - 1;

foreach (light)
	float NDotH = dot(N, H);
	float NDotHSqr = NDotH * NDotH;
	float denominatorSqrt = NDotHSqr * roughnessSqrSub1 + 1;
	float denominator = denominatorSqrt * denominatorSqrt;
	float ggx = numerator / denominator;
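
As a sanity check of the reduced form, the Python sketch below compares it against the textbook GGX expression and verifies numerically that the distribution is normalized, i.e. that the integral of D(h) * cos(θ) over the hemisphere is 1. It assumes, like the code above, that `roughness` is the GGX roughness whose square appears in the formula. The function names are hypothetical.

```python
import math

def ggx_direct(n_dot_h, roughness):
    # Textbook form: a^2 / (pi * ((N.H)^2 * (a^2 - 1) + 1)^2)
    a2 = roughness * roughness
    d = n_dot_h * n_dot_h * (a2 - 1.0) + 1.0
    return a2 / (math.pi * d * d)

def make_ggx(roughness):
    # Constant part, evaluated once per pixel
    roughness_sqr = roughness * roughness
    numerator = roughness_sqr / math.pi
    roughness_sqr_sub1 = roughness_sqr - 1.0

    def per_light(n_dot_h):
        # Per-light part: a handful of multiply-adds and one division
        denominator_sqrt = n_dot_h * n_dot_h * roughness_sqr_sub1 + 1.0
        return numerator / (denominator_sqrt * denominator_sqrt)

    return per_light

def projected_area_integral(ndf, steps=20000):
    # Midpoint rule for the integral of D(h) * cos(theta) over the hemisphere
    d_theta = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        total += ndf(math.cos(theta)) * math.cos(theta) * math.sin(theta) * d_theta
    return 2.0 * math.pi * total

normalization = {r: projected_area_integral(make_ggx(r)) for r in (0.1, 0.5, 0.9)}
```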

#5027170 Fresnel equation

Posted by CryZe on 30 January 2013 - 07:47 AM

You implemented the formula for s-polarized light, but you want the formula for non-polarized light, which is simply R = (Rs + Rp) / 2.
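
A minimal Python sketch of the averaged form, assuming real (dielectric) refractive indices and an incident angle given by its cosine. The function name and the total internal reflection handling are my own additions for illustration.

```python
import math

def fresnel_unpolarized(cos_i, n1, n2):
    """Reflectance for non-polarized light at a dielectric interface."""
    # Snell's law gives the transmitted angle
    sin_t_sqr = (n1 / n2) ** 2 * (1.0 - cos_i * cos_i)
    if sin_t_sqr > 1.0:
        return 1.0  # total internal reflection
    cos_t = math.sqrt(1.0 - sin_t_sqr)

    # s-polarized and p-polarized reflectance
    rs = ((n1 * cos_i - n2 * cos_t) / (n1 * cos_i + n2 * cos_t)) ** 2
    rp = ((n1 * cos_t - n2 * cos_i) / (n1 * cos_t + n2 * cos_i)) ** 2

    # Non-polarized light is an equal mix of both
    return 0.5 * (rs + rp)

normal_incidence = fresnel_unpolarized(1.0, 1.0, 1.5)   # air to glass
grazing_incidence = fresnel_unpolarized(0.0, 1.0, 1.5)
```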

#5027137 Realtime non-raytraced curved mirrors

Posted by CryZe on 30 January 2013 - 04:03 AM

You could try billboard reflections, such as those from the Unreal Engine Samaritan Demo. You simply approximate the environment using textured planes and perform texture fetches at the ray-plane intersections. Maybe not only texture fetches, but actual "pixel shading" of this geometry too. Do this for all the planes and output the color of the nearest sample. This should work pretty well if done correctly. If you combine it with screen space reflections and cube mapping, you can get almost perfect reflections, with billboard reflections as the first fallback and cube mapping as the second fallback. You simply fade in the billboard reflections where no sufficient screen space information is available, and also fade in cube mapping where no ray-plane intersections are found. You might also want to use a signed distance field of the geometry to check whether the billboards are occluded. They did that in the Samaritan Demo to prevent bright billboards from leaking through buildings in the reflections.

Update: I have an even more crazy idea. You could also apply not only bump mapping to these billboards, but also parallax occlusion mapping. This way, you would have reflections that are not completely flat. But this would probably immensely reduce your performance.
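
The core step, intersecting the reflection ray with every billboard and keeping the nearest hit, could be sketched like this in Python. The billboard layout (center, normal, two in-plane axes, half extents) and all names are assumptions for illustration, not how the Samaritan demo stores its data. The returned (u, v) pair is where the texture fetch would happen.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def intersect_billboard(origin, direction, board):
    """Return (t, u, v) for a hit on the bounded plane, or None."""
    denom = dot(direction, board["normal"])
    if abs(denom) < 1e-8:
        return None  # ray is parallel to the plane
    t = dot(sub(board["center"], origin), board["normal"]) / denom
    if t <= 0.0:
        return None  # plane is behind the ray origin
    hit = tuple(origin[i] + t * direction[i] for i in range(3))
    rel = sub(hit, board["center"])
    x = dot(rel, board["u_axis"])
    y = dot(rel, board["v_axis"])
    if abs(x) > board["half_w"] or abs(y) > board["half_h"]:
        return None  # outside the textured quad
    return t, 0.5 + 0.5 * x / board["half_w"], 0.5 + 0.5 * y / board["half_h"]

def nearest_hit(origin, direction, boards):
    """Return (board index, (t, u, v)) of the closest hit, or None."""
    best = None
    for index, board in enumerate(boards):
        hit = intersect_billboard(origin, direction, board)
        if hit is not None and (best is None or hit[0] < best[1][0]):
            best = (index, hit)
    return best

def make_board(z):
    # Axis-aligned 4x4 quad facing the origin, placed at depth z
    return {"center": (0.0, 0.0, z), "normal": (0.0, 0.0, -1.0),
            "u_axis": (1.0, 0.0, 0.0), "v_axis": (0.0, 1.0, 0.0),
            "half_w": 2.0, "half_h": 2.0}

boards = [make_board(10.0), make_board(5.0)]
result = nearest_hit((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), boards)
```

When `nearest_hit` returns `None`, that is the case where the cube map fallback would be faded in.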

#5025085 In-engine environment map probes questions

Posted by CryZe on 24 January 2013 - 06:46 AM

No need for that. Even though every material reflects light, there's no need to waste computation time calculating reflections for rough materials because the reflection won't be noticeable.

So rough gold would be completely black? I don't know, roughness doesn't change anything about the reflectivity of the material, just the "blurriness" of the reflection. So on metals or at grazing angles you'll lose an incredible amount of lighting if you simply turn it off for rough materials. If anything, I'd approximate rough reflections with a diffuse term.

It's common to store a glossiness value in the range [0,1] in textures and then convert it to specular power using a function like this:
float specPower = pow(2.0f, 13*glossiness);

Oh, my bad. I thought he was talking about the specular power used in the BRDF. :D

So would it be correct to multiply this by the ambient occlusion ?

Ambient occlusion is low-quality shadowing information. Use it on anything non-directional, where you don't have better visibility information. So don't apply it to reflections or any point / spot / directional lights. Just to irradiance or other kinds of ambient terms, like diffuse sky illumination. Basically every source of light which comes from more than just a single direction (this includes area lights) could be multiplied with ambient occlusion. The more it spans the hemisphere over the surface, the more it's suited for ambient occlusion. Environment maps might span the whole hemisphere over the surface, but when used for reflections, only a single direction is used, so you shouldn't apply ambient occlusion to it. If the environment map is used for diffuse image based lighting, samples over the whole hemisphere are used and thus ambient occlusion should be applied. For rough reflections you might want to blend in ambient occlusion to some degree though.
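
Put together, the rule of thumb might look like the following Python sketch. All inputs are scalar intensities for simplicity. Blending the occlusion into reflections linearly by roughness is only one possible reading of "to some degree" and is an assumption, as are the function and parameter names.

```python
def shade(direct, ambient_diffuse, reflection, ao, reflection_roughness=0.0):
    """Combine lighting terms, applying ambient occlusion only where it fits.

    direct:          point / spot / directional lighting (never occluded by AO)
    ambient_diffuse: irradiance or diffuse image based lighting (fully occluded)
    reflection:      environment map reflection (occluded only when rough)
    ao:              1.0 = unoccluded, 0.0 = fully occluded
    """
    # lerp(1, ao, roughness): mirror-like reflections ignore AO entirely
    reflection_ao = 1.0 + (ao - 1.0) * reflection_roughness
    return direct + ambient_diffuse * ao + reflection * reflection_ao

sharp = shade(1.0, 0.5, 0.2, ao=0.5, reflection_roughness=0.0)
rough = shade(1.0, 0.5, 0.2, ao=0.5, reflection_roughness=1.0)
```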

And how about shadows ? Do I multiply or add them to the environment map color ? Or not at all ?

I'm not quite sure which shadows you're talking about. But basically you are "tracing" a reflection ray and the environment map helps you find the intersection. So there's no need for any kind of shadow map.

But even with a fresnel reflectance color of 0.028f (for skin) I'm seeing reflections even when not at glancing angle on the skin. Do you think the color of the env map is too bright ?

Either it's actually too strong or the reflection is not rough enough. Hard to tell without seeing the result.

#5025037 In-engine environment map probes questions

Posted by CryZe on 24 January 2013 - 02:50 AM

Basically a material that is highly glossy (let's say = 1.0f) would have strong reflection intensity and a material that is very rough (= 0.2f) would then have very low reflection intensity. Does that make sense, or would it be "physically incorrect" somehow ?

A rough material reflects just as much light as a smooth one. The only difference is that on the rough material the reflections are scattered into all directions instead of being bundled into a single direction. That gives the illusion of a highly smooth material being more reflective, but that's not true, it's just more or less "binary" as to how it reflects light (completely into one direction, nothing into another direction).

The only parameters needed to describe reflection are parameters to describe the material's microfacet orientations and positions (usually a single parameter called "Roughness" or "Glossiness") and the index of refraction of the material. The IOR is completely responsible for how much light is reflected from a microfacet, which can be calculated using Fresnel's equations. The easier and more common way to store this IOR is to calculate how much light would be reflected at microfacet normal incidence (aka specular intensity) and use this later on with Schlick's approximation of Fresnel's equations, which approximates them without the need for actual IOR values.

Each microfacet is a perfect mirror so light coming from one direction will be reflected to exactly one other direction. So how each microfacet is oriented doesn't affect how much light is reflected, only where it's being reflected to.

So overall it doesn't matter how rough your material is, the only thing that is actually responsible for how much is reflected, is the index of refraction.

You could actually derive a diffuse model from this microfacet model as well:
Simply multiply lambert with the integral over all the microfacet orientations of the percentage of microfacets being oriented into this direction multiplied with the percentage of microfacets actually visible from the light source multiplied with the complement of how much light is reflected from these microfacets (because the non-reflected rest of the light is getting scattered into the material and thus might get scattered out again into random directions as diffuse lighting if it's not absorbed).

Also I'm not quite sure what those numbers are (0.2 being rough and 1.0 being glossy). In Phong and Blinn-Phong you have exponents ranging from 1 to infinity where higher numbers represent a smoother surface and in NDFs like Beckmann or GGX you have factors ranging from 0 to 1 where higher numbers represent a rougher surface. You are either using a pretty weird NDF / BRDF or you misunderstand something.

Also would I add the reflection to everything ? Or exclude it on materials like skin ?

Everything has an index of refraction and thus everything must be reflective unless it has the same index of refraction as air (or whatever material the camera is in). Skin for example is pretty much as reflective as water is (the index of refraction doesn't differ that much), but is pretty rough in comparison to water.
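
To put numbers on that comparison, the reflectance at normal incidence follows directly from the two indices of refraction. The Python sketch below uses 1.33 for water and assumes roughly 1.4 for skin; the skin value is an assumption chosen for illustration, and the function name is hypothetical.

```python
def normal_incidence_reflectance(n_material, n_outside=1.0):
    """Fraction of light reflected at normal incidence (real IORs only)."""
    return ((n_material - n_outside) / (n_material + n_outside)) ** 2

f0_water = normal_incidence_reflectance(1.33)
f0_skin = normal_incidence_reflectance(1.4)   # assumed IOR for skin
f0_same = normal_incidence_reflectance(1.0)   # same IOR as the surroundings
```

Under these assumptions water reflects about 2% and skin about 2.8% at normal incidence, while a material with the same index as its surroundings reflects nothing.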

metallic reflection ? How is it different ? Do I just output specular and ignore diffuse completely ? You're saying the color of the spheres in that screenshot comes from the specular color ?

Metals have indices of refraction in the complex plane. If you calculate how much light is reflected using Fresnel's equations, you get pretty high percentages. These are usually around 95% instead of about 4% for dielectric materials. Also, most metals, unlike dielectric materials, have highly varying indices of refraction in the visible light spectrum. So green light might get reflected less than red light, for example. Copper and gold are good examples of materials with varying indices of refraction in the visible light spectrum. Also, metals are highly absorptive. The already low amount of light that is not reflected and gets scattered into the material gets absorbed so fast that pretty much no light gets scattered out again as diffuse lighting. Keep in mind that there is no difference between subsurface scattering and diffuse lighting.

#5012788 simple compute shader question

Posted by CryZe on 20 December 2012 - 07:36 AM

You can use groupshared variables to make them synchronized

No he can't. The keyword groupshared allows multiple threads inside a thread group to share data, not between the different thread groups. It's this way, because the thread groups might get executed by different streaming multiprocessors, while a single thread group is executed on a single streaming multiprocessor, where all the threads can use specialized on-chip memory to efficiently share data.

It actually depends on your implementation. A normal shader for blurring doesn't really require synchronization between the individual threads, since every thread just needs to gather its data. A better implementation might be to use groupshared memory as some kind of cache for the row of the texture, so that every thread only needs to perform a single texture fetch. Since groupshared memory is only available inside a thread group, you can only use up to 1024 threads. The only solution would be to convert your algorithm into a more iterative algorithm. Just work on 2 pixels per thread and use a groupshared array with 2048 elements. This works and is not even slower, since a thread group is not what the actual hardware executes in parallel. The driver splits your thread group into units of 32 or 64 threads called warps or wavefronts that get executed iteratively (clarification: all the threads of a warp get executed in parallel, but the different warps get executed iteratively). So 2 thread groups of 1024 would be executed as 32 wavefronts or 64 warps in an iterative manner. My solution of just a single thread group and 1024 threads per thread group gets converted into just 16 wavefronts or 32 warps, but they all do twice the amount of work. So in the end your algorithm is just as parallel as it would be if you used 2 thread groups. As long as a thread group consists of at least 8 warps (recommendation of NVIDIA) you can always remove some of your parallelization without a decrease in performance. As far as I understand NVIDIA's Kepler architecture, they now begin to execute multiple warps in parallel (6, if I'm correct), so this solution might not be the best for the future. But it's as good as it can get with DirectX 11 unfortunately.

If you actually wanted to work on more than 2048 pixels, you would need to use register memory to cache your pixels, since you would need more than the maximum of 32 KB of groupshared memory. Let's say you wanted to work with 4096 pixels. You could store 4 pixels per thread inside its registers and always expose 2 of them in the groupshared memory. You just need to synchronize the threads and always expose the pixels you want to access from other threads. Groupshared memory is just a way to share data. Register memory is way larger than just 32 KB.
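
The arithmetic behind these numbers is easy to reproduce. The Python sketch below counts warps (32 threads) and wavefronts (64 threads) for both setups and checks the groupshared budget. The 16 bytes per pixel (one float4) is an assumption about the pixel format; it is what makes 2048 elements fill exactly 32 KB.

```python
WARP_SIZE = 32        # NVIDIA
WAVEFRONT_SIZE = 64   # AMD

def scheduling_units(thread_groups, threads_per_group, unit_size):
    # Ceiling division: a partially filled warp still occupies a whole one
    units_per_group = -(-threads_per_group // unit_size)
    return thread_groups * units_per_group

# Two thread groups of 1024 threads, one pixel per thread
two_groups = (scheduling_units(2, 1024, WARP_SIZE),
              scheduling_units(2, 1024, WAVEFRONT_SIZE))

# One thread group of 1024 threads, two pixels per thread
one_group = (scheduling_units(1, 1024, WARP_SIZE),
             scheduling_units(1, 1024, WAVEFRONT_SIZE))

BYTES_PER_PIXEL = 16  # assumption: one float4 per cached pixel
groupshared_bytes = 2048 * BYTES_PER_PIXEL
```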

#5008240 SSAO and skybox artifact

Posted by CryZe on 07 December 2012 - 03:18 PM

ALL code is executed, including ALL branches, all function calls, etc. (...) This is how all graphics cards work, AMD, NVIDIA, etc.

I'm not quite sure what you base your information on, but almost all graphics cards from the last 3 or 4 years work this way. Here's a quote from NVIDIA:

Any flow control instruction (if, switch, do, for, while) can significantly affect the instruction throughput by causing threads of the same warp to diverge; that is, to follow different execution paths. If this happens, the different execution paths must be serialized, since all of the threads of a warp share a program counter; this increases the total number of instructions executed for this warp. When all the different execution paths have completed, the threads converge back to the same execution path.

To obtain best performance in cases where the control flow depends on the thread ID, the controlling condition should be written so as to minimize the number of divergent warps. This is possible because the distribution of the warps across the block is deterministic as mentioned in SIMT Architecture of the CUDA C Programming Guide. A trivial example is when the controlling condition depends only on (threadIdx / WSIZE) where WSIZE is the warp size. In this case, no warp diverges because the controlling condition is perfectly aligned with the warps.

The different execution paths only get serialized when threads inside a warp actually diverge into different branches.
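
A toy cost model makes the difference visible. In the Python sketch below, a warp pays for a branch only if at least one of its threads takes it, so a warp pays for both branches only when its threads disagree. The instruction costs are arbitrary illustrative numbers and real hardware scheduling is far more involved, so treat this purely as an illustration of the quoted rule.

```python
WSIZE = 32  # warp size

def warp_cost(thread_ids, condition, cost_if, cost_else):
    # A path is executed by the warp if any thread in it takes that path
    takes_if = any(condition(t) for t in thread_ids)
    takes_else = any(not condition(t) for t in thread_ids)
    return cost_if * takes_if + cost_else * takes_else

def block_cost(block_size, condition, cost_if, cost_else, wsize=WSIZE):
    total = 0
    for start in range(0, block_size, wsize):
        warp = range(start, min(start + wsize, block_size))
        total += warp_cost(warp, condition, cost_if, cost_else)
    return total

# Condition aligned with the warps: depends only on threadIdx / WSIZE
aligned = block_cost(128, lambda t: (t // WSIZE) % 2 == 0, 10, 10)

# Condition that splits every warp: even threads vs. odd threads
divergent = block_cost(128, lambda t: t % 2 == 0, 10, 10)
```

In this model the aligned condition costs 40 units for four warps, while the per-thread condition costs 80 because every warp runs both paths one after the other.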