
# Digitalfragment

Member Since 29 Aug 2002
Offline Last Active Today, 04:42 PM

### #5231184 Plane and AABB intersection tests

Posted by on 27 May 2015 - 12:12 AM

It's the distance from the origin (0,0,0) to the plane, measured in the direction of the plane's normal.

So, assuming you are facing down +Z (0,0,1), the normal for the near plane would be +Z (0,0,1), while the normal for the far plane would be -Z (0,0,-1), so that both point inward. As you move forward or backward along the Z axis, those normals don't change, but the D parameter does.

Completely ignore the fact that the coordinates are 3D and think of values along a ruler. If you have a section of a ruler, then from the left side to the right side you are incrementing values (1), and from the right to the left you are decrementing values (-1). The distance along the ruler, multiplied by either 1 or -1 depending on which side of the section you are on, is the D value.
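To make the ruler idea concrete, here is a rough C++ sketch. It assumes the convention that a point p lies on the plane when dot(n, p) == d; other codebases store the negated value, so check yours. The names `makePlane`, `signedDistance` and `betweenNearAndFar` are made up for this example.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Convention assumed here: a point p lies on the plane when dot(n, p) == d.
struct Plane { Vec3 n; float d; };

float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// D is the position of the plane measured along its own normal.
Plane makePlane(const Vec3& normal, const Vec3& pointOnPlane) {
    return Plane{ normal, dot(normal, pointOnPlane) };
}

// Positive on the side the normal points toward, negative behind it.
float signedDistance(const Plane& plane, const Vec3& p) {
    return dot(plane.n, p) - plane.d;
}

// Camera looking down +Z: the near normal is +Z, the far normal is -Z,
// so both point into the visible slab. Only D changes as the camera moves.
bool betweenNearAndFar(float cameraZ, float nearDist, float farDist, const Vec3& p) {
    assert(nearDist < farDist);
    Plane nearPlane = makePlane(Vec3{0, 0, 1},  Vec3{0, 0, cameraZ + nearDist});
    Plane farPlane  = makePlane(Vec3{0, 0, -1}, Vec3{0, 0, cameraZ + farDist});
    return signedDistance(nearPlane, p) >= 0.0f && signedDistance(farPlane, p) >= 0.0f;
}
```

With the camera at z = 10, near = 1 and far = 100, the near plane gets D = 11 and the far plane gets D = -110: same ruler, opposite direction of counting.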

### #5230931 Plane and AABB intersection tests

Posted by on 25 May 2015 - 05:00 PM

As a quick guess, the winding order of the points you have used for the frustum planes is inconsistent. Check that all plane normals point either inward on the volume or outward (depending on how you like to test inside/outside).
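A quick way to see why winding matters, sketched in C++ under the assumption that planes are built from three points with a cross product and stored as dot(n, p) == d. `planeFromPoints` and `allNormalsFaceInward` are hypothetical helper names, not from the original thread.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };
struct Plane { Vec3 n; float d; };  // assumed convention: dot(n, p) == d on the plane

Vec3 sub(const Vec3& a, const Vec3& b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// The normal direction depends on the order of a, b, c:
// swapping any two points flips it.
Plane planeFromPoints(const Vec3& a, const Vec3& b, const Vec3& c) {
    Vec3 n = cross(sub(b, a), sub(c, a));
    float len = std::sqrt(dot(n, n));
    assert(len > 0.0f);  // the three points must not be collinear
    n = Vec3{n.x / len, n.y / len, n.z / len};
    return Plane{n, dot(n, a)};
}

// Sanity check: a point known to be inside the volume (for example the
// frustum centre) must be on the positive side of every inward-facing plane.
bool allNormalsFaceInward(const std::vector<Plane>& planes, const Vec3& interiorPoint) {
    for (const Plane& p : planes) {
        if (dot(p.n, interiorPoint) - p.d < 0.0f) return false;
    }
    return true;
}
```

Running something like `allNormalsFaceInward` once at frustum-build time catches a single flipped plane before it shows up as objects popping in and out.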

### #5182768 frac(x) results in strange artifacts when x is close to zero

Posted by on 24 September 2014 - 06:57 PM

Fwiw, you should never avoid using mips without good reason. Mips are there largely to avoid thrashing the cache, and therefore improve performance. You can avoid the problem Erik described by calculating the derivatives yourself, prior to the frac() call, and supplying them to the texture lookup via tex2Dgrad. This lets the hardware use the mips that it would have used had the frac() call not been there. Something along the lines of this:

```
float2 uv = worldPos.xz / 4; // <- swizzle elements however you need, instead of doing your math separately on x and z!
float alpha = tex2Dgrad(_Grid, frac(uv), ddx(uv), ddy(uv)).a;
```

Also, it's perfectly acceptable to use texture coordinates that are outside of 0 to 1. You can assign sampler addressing modes to let the hardware automatically repeat the texture, or clamp it, in either dimension. That avoids the need for the tex2Dgrad & frac completely.

EDIT: sorry, I missed the part where Erik already pointed out the usage of the wrap mode. I'll leave my comment about tex2Dgrad though, as it's handy for similar situations that can't be fixed via repeat (such as when tiling within 0 to 1).
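For anyone wondering why frac() upsets mip selection at all, here is a small CPU-side C++ sketch. It is a simplified one-dimensional model (the real hardware formula uses both axes and both derivatives), and `fracPart` and `mipLevel` are names invented for the illustration.

```cpp
#include <cmath>

// Same behaviour as HLSL frac() for the values used here.
float fracPart(float v) { return v - std::floor(v); }

// Simplified 1D model of mip selection: the level is log2 of how many
// texels one pixel step covers, clamped at the top mip.
float mipLevel(float uvDeltaPerPixel, float textureSize) {
    float texelsPerPixel = std::fabs(uvDeltaPerPixel) * textureSize;
    return std::fmax(0.0f, std::log2(texelsPerPixel));
}
```

Take two neighbouring pixels with uv = 0.999 and uv = 1.001 on a 256 texel texture. The difference of the raw uv is 0.002, about half a texel, so mip 0 is chosen. The difference of frac(uv) is roughly -0.998, nearly the whole texture, so the model picks a level close to 8 for that one pixel quad, which is the line artifact. Passing ddx(uv) and ddy(uv) of the unwrapped coordinate sidesteps that.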

### #5176327 Isn't "delete" so pesky to type? Wish there was a better way?

Posted by on 26 August 2014 - 06:43 PM

Oh you people.

For anyone who is legitimately confused, have fun researching C++'s comma operator.  And yes, the comma operator can be overloaded; that might blow your mind.  In fact, I bet with a templated overload of the comma operator, plus an appropriately designed class with an overloaded delete operator to represent a collection of deletable pointers, one could make ApochPiQ's code actual behave exactly as presented.  Frightening.  Too bad it would lose the 3x performance gain, heh!

This is true. And for a few seconds after reading your post I seriously considered it. Then I realized that I can implement it more easily and more safely through variadic templates!

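```
// Base case: a single pointer.
template <typename T> void d(T* t) { delete t; }
// Recursive case: delete the first pointer, then recurse on the rest.
// Taking U*... (rather than U&...) restricts the pack to pointers and also accepts temporaries.
template <typename T, typename... U> void d(T* t, U*... u) { delete t; d(u...); }

d(foo, bar, baz);
```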

Not sure if i feel dirty or not...
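If a C++17 compiler is available, the recursion can probably be collapsed into a fold over the comma operator. This is only a sketch of that alternative, not what was posted in the thread; `Tracked` is a throwaway type added purely so the effect can be observed.

```cpp
#include <cassert>

// Hypothetical helper type that counts live instances.
struct Tracked {
    static inline int alive = 0;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};

// C++17 unary right fold over the comma operator: expands to
// (delete p0, (delete p1, (delete p2))). Accepts any mix of pointer types.
template <typename... T>
void d(T*... ptrs) {
    (delete ptrs, ...);
}
```

Usage is the same as before, e.g. `d(foo, bar, baz);`. Both versions share the usual caveat that the caller's pointers are left dangling afterwards.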

### #5149973 IBL Problem with consistency using GGX / Anisotropy

Posted by on 27 April 2014 - 05:37 PM

Depends on which way around you're looking at it. In my case, I'm outputting a cubemap that's sampled by normal, so the direction to the pixel that I'm outputting is N (which is the direction returned by UvAndIndexToBoxCoord), and L is the direction of the pixel in the *source* cubemap being sampled. In my specular sample, L was calculated using importance sampling around N, but for a complete diffuse convolve you would need to sample the entire hemisphere around N.

For this reason, I'd still suggest using Hammersley to generate a set of points, but don't use the GGX distribution; instead use a hemispherical distribution (or just use GGX with a roughness of 1). Those are the L vectors that you sample the source with.
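As a CPU-side illustration, the hemispherical variant could look roughly like the C++ below. The cosine-weighted mapping is my assumption of a sensible choice for a Lambert convolve (a uniform hemisphere with an explicit N.L weight would also work), and the function names are invented for the sketch. The tangent frame construction mirrors the one in the shader.

```cpp
#include <cmath>
#include <cstdint>

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
Vec3 normalize(const Vec3& v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return Vec3{v.x / len, v.y / len, v.z / len};
}

// Van der Corput radical inverse: mirrors the bits of the index around the binary point.
float radicalInverse(std::uint32_t bits) {
    bits = (bits << 16u) | (bits >> 16u);
    bits = ((bits & 0x55555555u) << 1u) | ((bits & 0xAAAAAAAAu) >> 1u);
    bits = ((bits & 0x33333333u) << 2u) | ((bits & 0xCCCCCCCCu) >> 2u);
    bits = ((bits & 0x0F0F0F0Fu) << 4u) | ((bits & 0xF0F0F0F0u) >> 4u);
    bits = ((bits & 0x00FF00FFu) << 8u) | ((bits & 0xFF00FF00u) >> 8u);
    return float(double(bits) * 2.3283064365386963e-10);  // divide by 2^32
}

Vec2 hammersley(std::uint32_t i, std::uint32_t count) {
    return Vec2{float(i) / float(count), radicalInverse(i)};
}

// Cosine-weighted direction in the hemisphere around the unit vector N (assumed mapping).
Vec3 sampleHemisphere(const Vec2& xi, const Vec3& N) {
    const float kPi = 3.14159265358979f;
    float phi = 2.0f * kPi * xi.x;
    float cosTheta = std::sqrt(1.0f - xi.y);
    float sinTheta = std::sqrt(xi.y);
    Vec3 h{sinTheta * std::cos(phi), sinTheta * std::sin(phi), cosTheta};

    // Tangent frame, built the same way as in ImportanceSampleGGX.
    Vec3 up = std::fabs(N.z) < 0.999f ? Vec3{0, 0, 1} : Vec3{1, 0, 0};
    Vec3 tx = normalize(cross(up, N));
    Vec3 ty = cross(N, tx);
    return Vec3{tx.x * h.x + ty.x * h.y + N.x * h.z,
                tx.y * h.x + ty.y * h.y + N.y * h.z,
                tx.z * h.x + ty.z * h.y + N.z * h.z};
}
```

Each returned vector is an L to sample the source cubemap with; with a cosine-weighted distribution the samples can simply be averaged.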

### #5149963 IBL Problem with consistency using GGX / Anisotropy

Posted by on 27 April 2014 - 04:57 PM

The thing that I can't seem to figure out is how this direction of the pixel (L) is calculated ?

That's the purpose of UvAndIndexToBoxCoord in my shaders: it takes the uv coordinate and cubemap face index and returns the direction for it.

edit: missed this question:

P.S. Did you implement the second part of the equation that models the environment brdf (GGX)? Since this also doesn't seem to work out as it should (see my second last post). The resulting texture looks like this: https://www.dropbox.com/s/waya105re6ls4vl/shot_140427_163858.png (as you can see the green channel is just 0)

I did have a similar issue when trying to replicate Epic's model. I can't remember exactly what it was, but here is my current sample for producing that texture:
```
struct vs_out
{
float4 pos : SV_POSITION;
float2 uv : TEXCOORD0;
};

void vs_main(out vs_out o, uint id : SV_VERTEXID)
{
o.uv = float2((id << 1) & 2, id & 2);
o.pos = float4(o.uv * float2(2,-2) + float2(-1,1), 0, 1);
//o.uv = (o.pos.xy * float2(0.5,-0.5) + 0.5) * 4;
//o.uv.y = 1 - o.uv.y;
}

struct ps_in
{
float4 pos : SV_POSITION;
float2 uv : TEXCOORD0;
};

// http://holger.dammertz.org/stuff/notes_HammersleyOnHemisphere.html
// Van der Corput radical inverse: mirrors the bits of the index around the binary point.
float radicalInverse_VdC(uint bits) {
bits = (bits << 16u) | (bits >> 16u);
bits = ((bits & 0x55555555u) << 1u) | ((bits & 0xAAAAAAAAu) >> 1u);
bits = ((bits & 0x33333333u) << 2u) | ((bits & 0xCCCCCCCCu) >> 2u);
bits = ((bits & 0x0F0F0F0Fu) << 4u) | ((bits & 0xF0F0F0F0u) >> 4u);
bits = ((bits & 0x00FF00FFu) << 8u) | ((bits & 0xFF00FF00u) >> 8u);
return float(bits) * 2.3283064365386963e-10; // / 0x100000000
}
// http://holger.dammertz.org/stuff/notes_HammersleyOnHemisphere.html
// i-th point of an N-point Hammersley set in [0,1)^2
float2 Hammersley(uint i, uint N) {
return float2(float(i)/float(N), radicalInverse_VdC(i));
}

static const float PI = 3.1415926535897932384626433832795;

// Image-Based Lighting
float3 ImportanceSampleGGX( float2 Xi, float Roughness, float3 N )
{
float a = Roughness * Roughness;
float Phi = 2 * PI * Xi.x;
float CosTheta = sqrt( (1 - Xi.y) / ( 1 + (a*a - 1) * Xi.y ) );
float SinTheta = sqrt( 1 - CosTheta * CosTheta );
float3 H;
H.x = SinTheta * cos( Phi );
H.y = SinTheta * sin( Phi );
H.z = CosTheta;
float3 UpVector = abs(N.z) < 0.999 ? float3(0,0,1) : float3(1,0,0);
float3 TangentX = normalize( cross( UpVector, N ) );
float3 TangentY = cross( N, TangentX );
// Tangent to world space
return TangentX * H.x + TangentY * H.y + N * H.z;
}

// http://graphicrants.blogspot.com.au/2013/08/specular-brdf-reference.html
float GGX(float nDotV, float a)
{
float aa = a*a;
float oneMinusAa = 1 - aa;
float nDotV2 = 2 * nDotV;
float root = aa + oneMinusAa * nDotV * nDotV;
return nDotV2 / (nDotV + sqrt(root));
}

// http://graphicrants.blogspot.com.au/2013/08/specular-brdf-reference.html
float G_Smith(float a, float nDotV, float nDotL)
{
return GGX(nDotL,a) * GGX(nDotV,a);
}

// Environment BRDF
float2 IntegrateBRDF( float Roughness, float NoV )
{
float3 V;
V.x = sqrt( 1.0f - NoV * NoV ); // sin
V.y = 0;
V.z = NoV; // cos
float A = 0;
float B = 0;
const uint NumSamples = 1024;
[loop]
for( uint i = 0; i < NumSamples; i++ )
{
float2 Xi = Hammersley( i, NumSamples );
float3 H = ImportanceSampleGGX( Xi, Roughness, float3(0,0,1) );
float3 L = 2 * dot( V, H ) * H - V;
float NoL = saturate( L.z );
float NoH = saturate( H.z );
float VoH = saturate( dot( V, H ) );
[branch]
if( NoL > 0 )
{
float G = G_Smith( Roughness, NoV, NoL );
float G_Vis = G * VoH / (NoH * NoV);
float Fc = pow( 1 - VoH, 5 );
A += (1 - Fc) * G_Vis;
B += Fc * G_Vis;
}
}
return float2( A, B ) / NumSamples;
}

// Environment BRDF
float4 ps_main(in ps_in i) : SV_TARGET0
{
float2 uv = i.uv;
float nDotV = uv.x;
float Roughness = uv.y;

float2 integral = IntegrateBRDF(Roughness, nDotV);

return float4(integral, 0, 1);
}

```
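When a channel of that lookup texture comes out as zero, it can help to run the same integral on the CPU and compare a few texels. Below is a rough C++ port of IntegrateBRDF for that purpose. It mirrors the shader as posted, including passing Roughness straight into the Smith term, and the lower-case function names are mine rather than from the shader.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

float saturate(float v) { return std::min(1.0f, std::max(0.0f, v)); }

float radicalInverse(std::uint32_t bits) {
    bits = (bits << 16u) | (bits >> 16u);
    bits = ((bits & 0x55555555u) << 1u) | ((bits & 0xAAAAAAAAu) >> 1u);
    bits = ((bits & 0x33333333u) << 2u) | ((bits & 0xCCCCCCCCu) >> 2u);
    bits = ((bits & 0x0F0F0F0Fu) << 4u) | ((bits & 0xF0F0F0F0u) >> 4u);
    bits = ((bits & 0x00FF00FFu) << 8u) | ((bits & 0xFF00FF00u) >> 8u);
    return float(double(bits) * 2.3283064365386963e-10);  // divide by 2^32
}

// Same expression as GGX() in the shader: one Smith G1 term.
float smithG1(float nDotX, float a) {
    float aa = a * a;
    return 2.0f * nDotX / (nDotX + std::sqrt(aa + (1.0f - aa) * nDotX * nDotX));
}

// Returns (scale, bias) for the Fresnel term, as IntegrateBRDF does.
Vec2 integrateBRDF(float roughness, float NoV, std::uint32_t numSamples = 1024) {
    assert(NoV > 0.0f && NoV <= 1.0f && numSamples > 0);
    const float kPi = 3.14159265358979f;
    Vec3 V{std::sqrt(1.0f - NoV * NoV), 0.0f, NoV};
    float a = roughness * roughness;
    float A = 0.0f, B = 0.0f;
    for (std::uint32_t i = 0; i < numSamples; ++i) {
        float u = float(i) / float(numSamples);
        float v = radicalInverse(i);
        float phi = 2.0f * kPi * u;
        float cosTheta = std::sqrt((1.0f - v) / (1.0f + (a * a - 1.0f) * v));
        float sinTheta = std::sqrt(std::max(0.0f, 1.0f - cosTheta * cosTheta));
        // N is fixed at +Z, so the shader's tangent frame reduces to (hy, -hx, hz).
        Vec3 H{sinTheta * std::sin(phi), -sinTheta * std::cos(phi), cosTheta};
        float vDotH = V.x * H.x + V.y * H.y + V.z * H.z;
        Vec3 L{2.0f * vDotH * H.x - V.x, 2.0f * vDotH * H.y - V.y, 2.0f * vDotH * H.z - V.z};
        float NoL = saturate(L.z);
        float NoH = saturate(H.z);
        float VoH = saturate(vDotH);
        if (NoL > 0.0f) {
            float G = smithG1(NoL, roughness) * smithG1(NoV, roughness);
            float G_Vis = G * VoH / (NoH * NoV);
            float Fc = std::pow(1.0f - VoH, 5.0f);
            A += (1.0f - Fc) * G_Vis;
            B += Fc * G_Vis;
        }
    }
    return Vec2{A / float(numSamples), B / float(numSamples)};
}
```

A handy spot check: for a perfect mirror looked at head on, `integrateBRDF(0.0f, 1.0f)`, every sample reflects straight back, so the scale should come out as 1 and the bias as 0. If the GPU texture disagrees with the CPU numbers at the same (NoV, Roughness), the problem is more likely in the render target format or the uv mapping than in the maths.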

### #5149249 Screenshot of your biggest success/ tech demo

Posted by on 24 April 2014 - 06:20 PM

I did the material/lighting shaders and post processing on this:
http://i.imgur.com/HPZmJ.jpg

Likewise on the previous screenshot; i worked on the custom character creation tech, cutscene system, environment shaders, and degradation effects.

### #5149243 IBL Problem with consistency using GGX / Anisotropy

Posted by on 24 April 2014 - 05:24 PM

Thanks for the comment on the code sample, i'm glad it was helpful.

1. The logLUV encoding was due to Maya 2012 not wanting to load fp16 dds files, and it was faster to implement the encode/decode than to work out Maya. (If anyone out there knows how to get Maya to load floating point dds files correctly without destroying mips too, let me know!)

- The quality difference between the logLUV in an RGBA8 target and the unencoded float16 wasn't bad either, and the bandwidth improvement of using 32bpp vs 64bpp is quite dramatic when still needing to target the DX9 generation of consoles.

2. The VS/GS trick is fun. I'll break it down into small chunks:

In DirectX 11, you can render without supplying vertex data. Instead, SV_VERTEXID is automatically supplied and is equal to the index of the vertex being generated. So in C++ I simply bind the shaders and call Draw(3,0) to draw a fullscreen triangle. See this link for more info on system-generated semantics:

- The reason to use a triangle and not a quad is to avoid the redundant processing of the pixel quads that run along the edge shared by the 2 triangles in the quad.
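To see what the vertex shader's bit tricks actually produce, the same arithmetic can be run on the CPU. This C++ sketch just replays the two lines of vs_main for ids 0, 1 and 2; `fullscreenVertex` and `FullscreenVertex` are names made up for the example.

```cpp
struct FullscreenVertex { float u, v, x, y; };

// Same arithmetic as vs_main: uv from the vertex id, then clip-space xy from uv.
FullscreenVertex fullscreenVertex(unsigned id) {
    float u = float((id << 1) & 2u);
    float v = float(id & 2u);
    return FullscreenVertex{u, v, u * 2.0f - 1.0f, v * -2.0f + 1.0f};
}
```

The three ids give clip-space corners (-1, 1), (3, 1) and (-1, -3) with uvs (0, 0), (2, 0) and (0, 2): one oversized triangle whose visible part is exactly the [-1, 1] square with uv running 0 to 1 across it. The rest is clipped away for free.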

The GS pass then takes this triangle and generates 6 triangles, assigning each one to an individual texture in the texture array using SV_RENDERTARGETARRAYINDEX. This allows the C++ code to create a single ID3D11RenderTargetView for the cubemap per mip level, instead of creating an RTV for each individual face of each mip level. The GS cubemap trick was in one of the DirectX SDK samples, I can't remember which one though.
The code in the GS to work out the corner direction of the cubemap box was completely trial and error.

An extra note on the shader: the seamless-cube-filtering concept isn't necessary when the generated file is going to be used on DX11, only on DX9, as on the older generation cubemap texture filtering did not blend between cubemap faces (for example, when sampling the 1x1x1 mip, you would only ever get a flat colour).

3. Yes, definitely. I hadn't gotten around to doing a diffuse irradiance map and instead hacked it to use the last mip of the probe with the vertex normal, which for our case seems good enough for now and saves generating double the number of probes at runtime.

On the question about the LOD: the first is 0, which yields a roughness of 0 (perfect mirror), and the (count-1) is so that the last generated mip map yields a roughness of 1 (lambert diffuse). Yes, a count of 1 would cause a divide by zero, but that's fine because a single mipmap level can't be both a roughness of 0 and a roughness of 1! Also, don't generate mipmaps all the way down to a 1x1, as you do need a little more precision at a roughness of 1.
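The mip to roughness mapping is small enough to sketch in C++. The cut-off size of 8 texels in the usage note is only an illustrative assumption, and both helper names are invented here.

```cpp
#include <cassert>

// Roughness assigned to a given mip: 0 for the first, 1 for the last generated one.
float roughnessForMip(unsigned lod, unsigned lodCount) {
    assert(lodCount > 1 && lod < lodCount);  // a single mip cannot span 0..1
    return float(lod) / float(lodCount - 1);
}

// Number of mips when the chain stops at smallestSize instead of 1x1.
unsigned mipCountDownTo(unsigned cubeSize, unsigned smallestSize) {
    assert(smallestSize >= 1 && cubeSize >= smallestSize);
    unsigned count = 1;
    while (cubeSize / 2 >= smallestSize) {
        cubeSize /= 2;
        ++count;
    }
    return count;
}
```

For a 256 texel cube stopped at 8 texels, `mipCountDownTo(256, 8)` gives 6 mips (256, 128, 64, 32, 16, 8), and `roughnessForMip` then steps through 0, 0.2, 0.4, 0.6, 0.8 and 1.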

Regarding the high frequency dots visible in your video, I can't say I ever got that problem, but I was using outdoor environment maps, which tend to be a lot smoother and more consistent in lighting values than that Grace Cathedral map. What sampler state are you using to sample your source cubemap with?

### #5147569 IBL Problem with consistency using GGX / Anisotropy

Posted by on 17 April 2014 - 12:56 AM

1. What does the function hammersley do ?
2. He's sampling the environment map here...is that a TextureCube ? Or is this function being run for each cube face as a Texture2D ?
3. The input to this is the reflection vector R. How would it be calculated in this context ? I imagine similar to the direction vector in the AMD cubemapgen ?

1. Hammersley generates pseudo-random, fairly well spaced 2D coordinates, which the GGX importance sample function then gathers into the region that's going to contribute the most for the given roughness.

2. It's a cubemap; for a rough surface an entire hemisphere is required. The nice thing about using a cubemap as an input is that it's easy to render one in realtime.

3. The function is run for every pixel of a cubemap rendertarget; it's convolving the environment for all directions.

Here's my attempt at implementing this entire process as presented by Epic at Siggraph last year (if anyone can point out errors, that would be awesome).

It uses SV_VERTEXID to generate a fullscreen triangle, and a GS with SV_RENDERTARGETARRAYINDEX to output to all 6 faces of a cubemap rendertarget simultaneously.

```
struct vs_out
{
float4 pos : SV_POSITION;
float2 uv : TEXCOORD0;
};

void vs_main(out vs_out o, uint id : SV_VERTEXID)
{
o.uv = float2((id << 1) & 2, id & 2);
o.pos = float4(o.uv * float2(2,-2) + float2(-1,1), 0, 1);
//o.uv = (o.pos.xy * float2(0.5,-0.5) + 0.5) * 4;
//o.uv.y = 1 - o.uv.y;
}

struct ps_in
{
float4 pos : SV_POSITION;
float3 nrm : TEXCOORD0;
uint face : SV_RENDERTARGETARRAYINDEX;
};

float3 UvAndIndexToBoxCoord(float2 uv, uint face)
{
float3 n = float3(0,0,0);
float3 t = float3(0,0,0);

if (face == 0) // posx (red)
{
n = float3(1,0,0);
t = float3(0,1,0);
}
else if (face == 1) // negx (cyan)
{
n = float3(-1,0,0);
t = float3(0,1,0);
}
else if (face == 2) // posy (green)
{
n = float3(0,-1,0);
t = float3(0,0,-1);
}
else if (face == 3) // negy (magenta)
{
n = float3(0,1,0);
t = float3(0,0,1);
}
else if (face == 4) // posz (blue)
{
n = float3(0,0,-1);
t = float3(0,1,0);
}
else // if (i.face == 5) // negz (yellow)
{
n = float3(0,0,1);
t = float3(0,1,0);
}

float3 x = cross(n, t);

uv = uv * 2 - 1;

n = n + t*uv.y + x*uv.x;
n.y *= -1;
n.z *= -1;
return n;
}

[maxvertexcount(18)]
void gs_main(triangle vs_out input[3], inout TriangleStream<ps_in> output)
{
for( int f = 0; f < 6; ++f )
{
for( int v = 0; v < 3; ++v )
{
ps_in o;
o.pos = input[v].pos;
o.nrm = UvAndIndexToBoxCoord(input[v].uv, f);
o.face = f;
output.Append(o);
}
output.RestartStrip();
}
}

SamplerState g_samCube
{
Filter = MIN_MAG_MIP_LINEAR;
};
TextureCube g_txEnvMap : register(t0);

cbuffer mip : register(b0)
{
float g_CubeSize;
float g_CubeLod;
float g_CubeLodCount;
};

// http://holger.dammertz.org/stuff/notes_HammersleyOnHemisphere.html
// Van der Corput radical inverse: mirrors the bits of the index around the binary point.
float radicalInverse_VdC(uint bits) {
bits = (bits << 16u) | (bits >> 16u);
bits = ((bits & 0x55555555u) << 1u) | ((bits & 0xAAAAAAAAu) >> 1u);
bits = ((bits & 0x33333333u) << 2u) | ((bits & 0xCCCCCCCCu) >> 2u);
bits = ((bits & 0x0F0F0F0Fu) << 4u) | ((bits & 0xF0F0F0F0u) >> 4u);
bits = ((bits & 0x00FF00FFu) << 8u) | ((bits & 0xFF00FF00u) >> 8u);
return float(bits) * 2.3283064365386963e-10; // / 0x100000000
}
// http://holger.dammertz.org/stuff/notes_HammersleyOnHemisphere.html
// i-th point of an N-point Hammersley set in [0,1)^2
float2 Hammersley(uint i, uint N) {
return float2(float(i)/float(N), radicalInverse_VdC(i));
}

static const float PI = 3.1415926535897932384626433832795;

// Image-Based Lighting
float3 ImportanceSampleGGX( float2 Xi, float Roughness, float3 N )
{
float a = Roughness * Roughness;
float Phi = 2 * PI * Xi.x;
float CosTheta = sqrt( (1 - Xi.y) / ( 1 + (a*a - 1) * Xi.y ) );
float SinTheta = sqrt( 1 - CosTheta * CosTheta );
float3 H;
H.x = SinTheta * cos( Phi );
H.y = SinTheta * sin( Phi );
H.z = CosTheta;
float3 UpVector = abs(N.z) < 0.999 ? float3(0,0,1) : float3(1,0,0);
float3 TangentX = normalize( cross( UpVector, N ) );
float3 TangentY = cross( N, TangentX );
// Tangent to world space
return TangentX * H.x + TangentY * H.y + N * H.z;
}

// M matrix, for encoding
const static float3x3 M = float3x3(
0.2209, 0.3390, 0.4184,
0.1138, 0.6780, 0.7319,
0.0102, 0.1130, 0.2969);

// Inverse M matrix, for decoding
const static float3x3 InverseM = float3x3(
6.0013,    -2.700,    -1.7995,
-1.332,    3.1029,    -5.7720,
.3007,    -1.088,    5.6268);

float4 LogLuvEncode(in float3 vRGB)
{
float4 vResult;
float3 Xp_Y_XYZp = mul(vRGB, M);
Xp_Y_XYZp = max(Xp_Y_XYZp, float3(1e-6, 1e-6, 1e-6));
vResult.xy = Xp_Y_XYZp.xy / Xp_Y_XYZp.z;
float Le = 2 * log2(Xp_Y_XYZp.y) + 127;
vResult.w = frac(Le);
vResult.z = (Le - (floor(vResult.w*255.0f))/255.0f)/255.0f;
return vResult;
}

float3 LogLuvDecode(in float4 vLogLuv)
{
float Le = vLogLuv.z * 255 + vLogLuv.w;
float3 Xp_Y_XYZp;
Xp_Y_XYZp.y = exp2((Le - 127) / 2);
Xp_Y_XYZp.z = Xp_Y_XYZp.y / vLogLuv.y;
Xp_Y_XYZp.x = vLogLuv.x * Xp_Y_XYZp.z;
float3 vRGB = mul(Xp_Y_XYZp, InverseM);
return max(vRGB, 0);
}

// Ignacio Castano via http://the-witness.net/news/2012/02/seamless-cube-map-filtering/
float3 fix_cube_lookup_for_lod(float3 v, float cube_size, float lod)
{
float M = max(max(abs(v.x), abs(v.y)), abs(v.z));
float scale = 1 - exp2(lod) / cube_size;
if (abs(v.x) != M) v.x *= scale;
if (abs(v.y) != M) v.y *= scale;
if (abs(v.z) != M) v.z *= scale;
return v;
}

// Pre-Filtered Environment Map
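float4 ps_main(in ps_in i) : SV_TARGET0
{
float3 N = fix_cube_lookup_for_lod(normalize(i.nrm), g_CubeSize, g_CubeLod);
float Roughness = (float)g_CubeLod / (float)(g_CubeLodCount-1);

// nDotL-weighted accumulation, following the pre-filter loop presented by Epic
float4 totalRadiance = float4(0,0,0,0);

const uint C = 1024;
[loop]
for (uint j = 0; j < C; ++j)
{
float2 Xi = Hammersley(j,C);
float3 H = ImportanceSampleGGX( Xi, Roughness, N );
float3 L = 2 * dot( N, H ) * H - N;
float nDotL = saturate(dot(L, N));
[branch]
if (nDotL > 0)
{
float4 pointRadiance = (g_txEnvMap.SampleLevel( g_samCube, L, 0 ));
totalRadiance.rgb += pointRadiance.rgb * nDotL;
totalRadiance.w += nDotL;
}
}

// Normalise by the total weight, then pack into RGBA8 as LogLuv
return LogLuvEncode(totalRadiance.rgb / max(totalRadiance.w, 1e-6));
}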

```

Posted by on 10 December 2013 - 05:39 PM

Blizzard & AMD published a whitepaper that covers many graphical aspects of Starcraft II:
http://developer.amd.com/wordpress/media/2012/10/S2008-Filion-McNaughton-StarCraftII.pdf

### #5115787 Manually creating textures in Direct3D11

Posted by on 09 December 2013 - 07:48 PM

The ID3D11Device::CreateTexture2D function has all of the functionality in there.
http://msdn.microsoft.com/en-us/library/windows/desktop/ff476521(v=vs.85).aspx

In particular, the source data is passed into CreateTexture2D via the D3D11_SUBRESOURCE_DATA structure as described in the MSDN page linked above.

How to handle mipmaps is also described in that page, under Remarks.

Note that png/jpeg are not native formats, and so cannot be used directly. They would have to be decompressed into a standard format, and then, if you choose, recompressed into any of the BC formats etc.
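When supplying a full mip chain, CreateTexture2D expects an array with one D3D11_SUBRESOURCE_DATA entry per mip, each with its own pSysMem and SysMemPitch. The C++ below is a portable sketch of just the size and pitch bookkeeping for an uncompressed format, so it deliberately avoids the D3D headers; `MipLayout` is a hypothetical struct whose last two fields mirror the real SysMemPitch and SysMemSlicePitch members.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the layout half of D3D11_SUBRESOURCE_DATA.
struct MipLayout {
    unsigned width;
    unsigned height;
    unsigned sysMemPitch;       // bytes from one row to the next
    unsigned sysMemSlicePitch;  // bytes in the whole level (only meaningful for 3D textures)
};

// One entry per mip level, halving each dimension (never below 1) down to 1x1.
// Assumes a tightly packed, uncompressed format such as 4 bytes per pixel RGBA8.
std::vector<MipLayout> buildMipLayouts(unsigned width, unsigned height, unsigned bytesPerPixel) {
    assert(width > 0 && height > 0 && bytesPerPixel > 0);
    std::vector<MipLayout> mips;
    for (;;) {
        unsigned pitch = width * bytesPerPixel;
        mips.push_back(MipLayout{width, height, pitch, pitch * height});
        if (width == 1 && height == 1) break;
        width = std::max(1u, width / 2);
        height = std::max(1u, height / 2);
    }
    return mips;
}
```

In real code each entry would be copied into a D3D11_SUBRESOURCE_DATA together with a pointer to that level's pixels, and the number of entries would go into D3D11_TEXTURE2D_DESC::MipLevels. Block-compressed (BC) formats compute pitch per 4x4 block instead, which this sketch does not cover.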

### #5115458 HLSL keywords in inout

Posted by on 08 December 2013 - 04:17 PM

'in' IIRC, denotes a vertex input going into a main function in CG.

'out' acts just like 'out' in C#, no value is passed into the function, but the function must write a value to it, which then gets passed back to the calling function.

'inout' acts like 'ref' in C# as you guessed, the argument gets passed both ways, allowing the function to read and modify the value.
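For readers coming from C++ rather than C#, the nearest analogy is probably a reference parameter. The sketch below is only an analogy: HLSL copies the value in and out around the call, whereas a C++ reference aliases the caller's variable directly. The function and parameter names are invented.

```cpp
// 'out'-style: the incoming value of result is ignored; the function must write it.
void scaleBy2(float value, float& outResult) {
    outResult = value * 2.0f;
}

// 'inout'-style: the function both reads and updates the caller's value.
void accumulate(float& inoutTotal, float amount) {
    inoutTotal += amount;
}
```

So after `float total = 1.0f; accumulate(total, 2.5f);` the caller sees 3.5, and `scaleBy2(4.0f, result)` overwrites whatever `result` held before.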

### #5112577 Difference Tiled and Clustered shading

Posted by on 27 November 2013 - 04:16 PM

Essentially yes. Though IIRC, the author of the original clustered shading whitepaper talks about adding in several more dimensions, by also grouping the shading by similar normal angles.
So it's (x/y) -> (position x/ position y/ position z/ normal x / normal z)

### #5109069 How many influancing weights?

Posted by on 13 November 2013 - 05:30 PM

DX11, PS4 and XBone games are trending towards 8 weights for the face

I wonder how they weight the mesh then. Do they use some procedural try&see tecnique? Considering other technologic facts, wheather the weights are dynamic or constant. I think they will go for per vertex animation in near future.

Check out the presentation "Ryse: Son of Rome - Defining the next gen"
http://www.crytek.com/cryengine/presentations

They do both skinning and vertex animation. Both have pros & cons, so take the best of both.

### #5108357 Voxel Cone Tracing Experiment - Part 2 Progress

Posted by on 10 November 2013 - 04:42 PM

Obviously your profiler is broken somehow, as I doubt your experiment manages to hold ever increasing data in the same exact amount of ram.

Actually I'm using the task manager to get the amount of ram that my application is using.

Sounds like you hit your video card's memory limit and the drivers are now using system memory, which is also why your frame rate tanks. Task Manager only shows system memory usage, not the memory internal to the video card.
