Digitalfragment

Member Since 29 Aug 2002
Offline Last Active Aug 28 2014 07:37 PM

#5176327 Isn't "delete" so pesky to type? Wish there was a better way?

Posted by Digitalfragment on 26 August 2014 - 06:43 PM

Oh you people.

 

For anyone who is legitimately confused, have fun researching C++'s comma operator. And yes, the comma operator can be overloaded; that might blow your mind. In fact, I bet with a templated overload of the comma operator, plus an appropriately designed class with an overloaded delete operator to represent a collection of deletable pointers, one could make ApochPiQ's code actually behave exactly as presented. Frightening. Too bad it would lose the 3x performance gain, heh!

This is true. And for a few seconds after reading your post I seriously considered it. Then I realized that I can implement it more easily and safely through variadic templates!

 

// Recursively delete an arbitrary list of pointers.
template <typename T> void d(T* t) { delete t; }
template <typename T, typename... U> void d(T* t, U*... u) { delete t; d(u...); }

d(foo, bar, baz);

 

Not sure if I feel dirty or not...
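
For anyone wanting to try it, here's a self-contained C++ sketch of the above; Foo, Bar and Baz are hypothetical stand-ins for whatever you're allocating.

#include <cstdio>

struct Foo { ~Foo() { std::puts("~Foo"); } };
struct Bar { ~Bar() { std::puts("~Bar"); } };
struct Baz { ~Baz() { std::puts("~Baz"); } };

template <typename T> void d(T* t) { delete t; }
template <typename T, typename... U> void d(T* t, U*... u) { delete t; d(u...); }

int main()
{
    Foo* foo = new Foo;
    Bar* bar = new Bar;
    Baz* baz = new Baz;
    d(foo, bar, baz); // prints ~Foo, ~Bar, ~Baz in that order
    return 0;
}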




#5149973 IBL Problem with consistency using GGX / Anisotropy

Posted by Digitalfragment on 27 April 2014 - 05:37 PM

Depends on which way around you're looking at it. In my case, I'm outputting a cubemap that's sampled by normal, so the direction to the pixel that I'm outputting is N (which is the "UvAndIndexToBoxCoord" wrapped direction), and so L is the direction of the pixel in the *source* cubemap being sampled. In my specular sample, L was calculated using importance sampling around N, but for a complete diffuse convolve you would need to sample the entire hemisphere around N.

For this reason, I'd still suggest using Hammersley to generate a set of points, but don't use the GGX distribution; instead use a hemispherical distribution (or just use GGX but with a roughness of 1). Those are the L vectors that you sample the source with, as in the sketch below.
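
To be concrete, a hedged C++ sketch of that hemispherical distribution (the shader version is analogous); rotate the result into N's tangent frame the same way the ImportanceSampleGGX function in my sample does:

#include <cmath>

struct float3 { float x, y, z; };

// Map a Hammersley point (xi1, xi2) to a direction on the +Z hemisphere.
float3 SampleHemisphereUniform(float xi1, float xi2)
{
    const float PI = 3.14159265358979f;
    float phi = 2.0f * PI * xi1;
    float cosTheta = xi2;               // uniform over the hemisphere
    //float cosTheta = std::sqrt(xi2);  // cosine-weighted alternative
    float sinTheta = std::sqrt(1.0f - cosTheta * cosTheta);
    return { sinTheta * std::cos(phi), sinTheta * std::sin(phi), cosTheta };
}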




#5149963 IBL Problem with consistency using GGX / Anisotropy

Posted by Digitalfragment on 27 April 2014 - 04:57 PM


The thing that I can't seem to figure out is how this direction of the pixel (L) is calculated?

 

That's the purpose of UvAndIndexToBoxCoord in my shaders: it takes the UV coordinate and cubemap face index and returns the direction for it.

edit: missed this question:


P.S. Did you implement the second part of the equation that models the environment BRDF (GGX)? Since this also doesn't seem to work out as it should (see my second-to-last post). The resulting texture looks like this: https://www.dropbox.com/s/waya105re6ls4vl/shot_140427_163858.png (as you can see the green channel is just 0)

I did have a similar issue when trying to replicate Epic's model. I can't remember exactly what it was, but here is my current sample for producing that texture:
 
struct vs_out
{
    float4 pos : SV_POSITION;
    float2 uv : TEXCOORD0;
};

void vs_main(out vs_out o, uint id : SV_VERTEXID)
{
    o.uv = float2((id << 1) & 2, id & 2);
    o.pos = float4(o.uv * float2(2,-2) + float2(-1,1), 0, 1);
    //o.uv = (o.pos.xy * float2(0.5,-0.5) + 0.5) * 4;
    //o.uv.y = 1 - o.uv.y;
}

struct ps_in
{
    float4 pos : SV_POSITION;
    float2 uv : TEXCOORD0;
};

// http://holger.dammertz.org/stuff/notes_HammersleyOnHemisphere.html
float radicalInverse_VdC(uint bits)
{
    bits = (bits << 16u) | (bits >> 16u);
    bits = ((bits & 0x55555555u) << 1u) | ((bits & 0xAAAAAAAAu) >> 1u);
    bits = ((bits & 0x33333333u) << 2u) | ((bits & 0xCCCCCCCCu) >> 2u);
    bits = ((bits & 0x0F0F0F0Fu) << 4u) | ((bits & 0xF0F0F0F0u) >> 4u);
    bits = ((bits & 0x00FF00FFu) << 8u) | ((bits & 0xFF00FF00u) >> 8u);
    return float(bits) * 2.3283064365386963e-10; // / 0x100000000
}

// http://holger.dammertz.org/stuff/notes_HammersleyOnHemisphere.html
float2 Hammersley(uint i, uint N)
{
    return float2(float(i)/float(N), radicalInverse_VdC(i));
}

static const float PI = 3.1415926535897932384626433832795;

// Image-Based Lighting
// http://www.unrealengine.com/files/downloads/2013SiggraphPresentationsNotes.pdf
float3 ImportanceSampleGGX( float2 Xi, float Roughness, float3 N )
{
    float a = Roughness * Roughness;
    float Phi = 2 * PI * Xi.x;
    float CosTheta = sqrt( (1 - Xi.y) / ( 1 + (a*a - 1) * Xi.y ) );
    float SinTheta = sqrt( 1 - CosTheta * CosTheta );
    float3 H;
    H.x = SinTheta * cos( Phi );
    H.y = SinTheta * sin( Phi );
    H.z = CosTheta;
    float3 UpVector = abs(N.z) < 0.999 ? float3(0,0,1) : float3(1,0,0);
    float3 TangentX = normalize( cross( UpVector, N ) );
    float3 TangentY = cross( N, TangentX );
    // Tangent to world space
    return TangentX * H.x + TangentY * H.y + N * H.z;
}

// http://graphicrants.blogspot.com.au/2013/08/specular-brdf-reference.html
float GGX(float nDotV, float a)
{
    float aa = a*a;
    float oneMinusAa = 1 - aa;
    float nDotV2 = 2 * nDotV;
    float root = aa + oneMinusAa * nDotV * nDotV;
    return nDotV2 / (nDotV + sqrt(root));
}

// http://graphicrants.blogspot.com.au/2013/08/specular-brdf-reference.html
float G_Smith(float a, float nDotV, float nDotL)
{
    return GGX(nDotL,a) * GGX(nDotV,a);
}

// Environment BRDF
// http://www.unrealengine.com/files/downloads/2013SiggraphPresentationsNotes.pdf
float2 IntegrateBRDF( float Roughness, float NoV )
{
    float3 V;
    V.x = sqrt( 1.0f - NoV * NoV ); // sin
    V.y = 0;
    V.z = NoV; // cos
    float A = 0;
    float B = 0;
    const uint NumSamples = 1024;
    [loop]
    for( uint i = 0; i < NumSamples; i++ )
    {
        float2 Xi = Hammersley( i, NumSamples );
        float3 H = ImportanceSampleGGX( Xi, Roughness, float3(0,0,1) );
        float3 L = 2 * dot( V, H ) * H - V;
        float NoL = saturate( L.z );
        float NoH = saturate( H.z );
        float VoH = saturate( dot( V, H ) );
        [branch]
        if( NoL > 0 )
        {
            float G = G_Smith( Roughness, NoV, NoL );
            float G_Vis = G * VoH / (NoH * NoV);
            float Fc = pow( 1 - VoH, 5 );
            A += (1 - Fc) * G_Vis;
            B += Fc * G_Vis;
        }
    }
    return float2( A, B ) / NumSamples;
}

// Environment BRDF
// http://www.unrealengine.com/files/downloads/2013SiggraphPresentationsNotes.pdf
float4 ps_main(in ps_in i) : SV_TARGET0
{
    float2 uv = i.uv;
    float nDotV = uv.x;
    float Roughness = uv.y;

    float2 integral = IntegrateBRDF(Roughness, nDotV);

    return float4(integral, 0, 1);
}
 



#5149249 Screenshot of your biggest success/ tech demo

Posted by Digitalfragment on 24 April 2014 - 06:20 PM

I did the material/lighting shaders and post processing on this:
http://i.imgur.com/HPZmJ.jpg

Likewise on the previous screenshot; I worked on the custom character creation tech, cutscene system, environment shaders, and degradation effects.




#5149243 IBL Problem with consistency using GGX / Anisotropy

Posted by Digitalfragment on 24 April 2014 - 05:24 PM

Thanks for the comment on the code sample, I'm glad it was helpful.
 
1. The LogLuv encoding was due to Maya 2012 not wanting to load fp16 DDS files, and it was faster to implement the encode/decode than to work around Maya. (If anyone out there knows how to get Maya to load floating-point DDS files correctly without destroying mips too, let me know!)
 
- The quality difference between the LogLuv in an RGBA8 target and the unencoded float16 wasn't bad either, and the bandwidth improvement of 32bpp vs 64bpp is quite dramatic when still needing to target the DX9 generation of consoles.
 
2. The VS/GS trick is fun. I'll break it down into small chunks:
 
In DirectX 11, you can render without supplying any vertex data. Instead, SV_VERTEXID is automatically supplied, equal to the index of the vertex being generated. So in C++ I simply bind the shaders and call Draw(3, 0) to draw a fullscreen triangle (see the sketch after the next note). The MSDN documentation on system-generated semantics has more info.
 
 
- The reason to use a triangle and not a quad is to avoid the redundant processing of the pixel quads that run along the edge shared by the two triangles in the quad.
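
A rough sketch of that C++ side (the shader objects are assumed to be created elsewhere):

#include <d3d11.h>

void DrawFullscreenTriangle(ID3D11DeviceContext* context,
                            ID3D11VertexShader* vs, ID3D11PixelShader* ps)
{
    // No input layout and no vertex/index buffers; SV_VERTEXID is the only
    // per-vertex input the VS receives.
    context->IASetInputLayout(nullptr);
    context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
    context->VSSetShader(vs, nullptr, 0);
    context->PSSetShader(ps, nullptr, 0);
    context->Draw(3, 0); // three generated vertices = one fullscreen triangle
}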
 
The GS pass then takes this triangle and generates 6 triangles, assigning each to an individual texture in the texture array using SV_RENDERTARGETARRAYINDEX. This allows the C++ code to create a single ID3D11RenderTargetView for the cubemap per mip level, instead of creating an RTV for each individual face of each mip level, roughly as sketched below. The GS cubemap trick was in one of the DirectX SDK samples; I can't remember which one though.
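
Something along these lines for the single-RTV-per-mip setup (a sketch, not lifted from my actual code):

#include <d3d11.h>

HRESULT CreateCubemapMipRTV(ID3D11Device* device, ID3D11Texture2D* cubeTex,
                            UINT mip, DXGI_FORMAT format,
                            ID3D11RenderTargetView** outRtv)
{
    D3D11_RENDER_TARGET_VIEW_DESC desc = {};
    desc.Format = format;
    desc.ViewDimension = D3D11_RTV_DIMENSION_TEXTURE2DARRAY;
    desc.Texture2DArray.MipSlice = mip;
    desc.Texture2DArray.FirstArraySlice = 0;
    desc.Texture2DArray.ArraySize = 6; // all six faces in the one view
    return device->CreateRenderTargetView(cubeTex, &desc, outRtv);
}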
The code in the GS to work out the corner direction of the cubemap box was completely trial and error.
 
An extra note on the shader: the seamless-cube-filtering concept isn't necessary when the generated file is going to be used on DX11, only on DX9, as on the older generation cubemap texture filtering did not blend between cubemap faces (for example, when sampling the 1x1 mip you would only ever get a flat colour).
 
3. Yes, definitely. I hadn't gotten around to doing a diffuse irradiance map and instead hacked it to use the last mip of the probe with the vertex normal - which for our case seems good enough for now and saves generating double the number of probes at runtime.
 
 
On the question about the LOD: the first is 0, which yields a roughness of 0 (perfect mirror), and (count-1) is so that the last generated mip yields a roughness of 1 (Lambert diffuse). Yes, a single mip level would cause a divide by zero - but that's fine, because a single mip level can't be both a roughness of 0 and a roughness of 1! Also, don't generate mipmaps all the way down to 1x1, as you do need a little more precision at a roughness of 1.
 
Regarding the high-frequency dots visible in your video, I can't say I ever got that problem - but I was using outdoor environment maps that tend to be a lot smoother and more consistent in lighting values compared to that Grace Cathedral map. What sampler state are you using to sample your source cubemap with?



#5147569 IBL Problem with consistency using GGX / Anisotropy

Posted by Digitalfragment on 17 April 2014 - 12:56 AM

1. What does the function hammersley do?
2. He's sampling the environment map here... is that a TextureCube? Or is this function being run for each cube face as a Texture2D?
3. The input to this is the reflection vector R. How would it be calculated in this context? I imagine similar to the direction vector in the AMD CubeMapGen?

 

 

1. Hammersley generates pseudo-random, fairly well-spaced 2D coordinates, which the GGX importance-sampling function then gathers into the region that's going to contribute the most for the given roughness.
 

2. It's a cubemap; for a rough surface an entire hemisphere is required. The nice thing about using a cubemap as an input is that it's easy to render one in real time.

3. The function is run for every pixel of a cubemap rendertarget; it's convolving the environment for all directions.

Here's my attempt at implementing this entire process as presented by Epic at SIGGRAPH last year (if anyone can point out errors, that would be awesome).

It uses SV_VERTEXID to generate a fullscreen triangle, and a GS with SV_RENDERTARGETARRAYINDEX to output to all 6 faces of a cubemap rendertarget simultaneously.
 

 
struct vs_out
{
    float4 pos : SV_POSITION;
    float2 uv : TEXCOORD0;
};

void vs_main(out vs_out o, uint id : SV_VERTEXID)
{
    o.uv = float2((id << 1) & 2, id & 2);
    o.pos = float4(o.uv * float2(2,-2) + float2(-1,1), 0, 1);
    //o.uv = (o.pos.xy * float2(0.5,-0.5) + 0.5) * 4;
    //o.uv.y = 1 - o.uv.y;
}

struct ps_in
{
    float4 pos : SV_POSITION;
    float3 nrm : TEXCOORD0;
    uint face : SV_RENDERTARGETARRAYINDEX;
};

float3 UvAndIndexToBoxCoord(float2 uv, uint face)
{
    float3 n = float3(0,0,0);
    float3 t = float3(0,0,0);

    if (face == 0) // posx (red)
    {
        n = float3(1,0,0);
        t = float3(0,1,0);
    }
    else if (face == 1) // negx (cyan)
    {
        n = float3(-1,0,0);
        t = float3(0,1,0);
    }
    else if (face == 2) // posy (green)
    {
        n = float3(0,-1,0);
        t = float3(0,0,-1);
    }
    else if (face == 3) // negy (magenta)
    {
        n = float3(0,1,0);
        t = float3(0,0,1);
    }
    else if (face == 4) // posz (blue)
    {
        n = float3(0,0,-1);
        t = float3(0,1,0);
    }
    else // if (i.face == 5) // negz (yellow)
    {
        n = float3(0,0,1);
        t = float3(0,1,0);
    }

    float3 x = cross(n, t);

    uv = uv * 2 - 1;

    n = n + t*uv.y + x*uv.x;
    n.y *= -1;
    n.z *= -1;
    return n;
}

[maxvertexcount(18)]
void gs_main(triangle vs_out input[3], inout TriangleStream<ps_in> output)
{
    for( int f = 0; f < 6; ++f )
    {
        for( int v = 0; v < 3; ++v )
        {
            ps_in o;
            o.pos = input[v].pos;
            o.nrm = UvAndIndexToBoxCoord(input[v].uv, f);
            o.face = f;
            output.Append(o);
        }
        output.RestartStrip();
    }
}

SamplerState g_samCube
{
    Filter = MIN_MAG_MIP_LINEAR;
    AddressU = Clamp;
    AddressV = Clamp;
};
TextureCube g_txEnvMap : register(t0);

cbuffer mip : register(b0)
{
    float g_CubeSize;
    float g_CubeLod;
    float g_CubeLodCount;
};

// http://holger.dammertz.org/stuff/notes_HammersleyOnHemisphere.html
float radicalInverse_VdC(uint bits)
{
    bits = (bits << 16u) | (bits >> 16u);
    bits = ((bits & 0x55555555u) << 1u) | ((bits & 0xAAAAAAAAu) >> 1u);
    bits = ((bits & 0x33333333u) << 2u) | ((bits & 0xCCCCCCCCu) >> 2u);
    bits = ((bits & 0x0F0F0F0Fu) << 4u) | ((bits & 0xF0F0F0F0u) >> 4u);
    bits = ((bits & 0x00FF00FFu) << 8u) | ((bits & 0xFF00FF00u) >> 8u);
    return float(bits) * 2.3283064365386963e-10; // / 0x100000000
}

// http://holger.dammertz.org/stuff/notes_HammersleyOnHemisphere.html
float2 Hammersley(uint i, uint N)
{
    return float2(float(i)/float(N), radicalInverse_VdC(i));
}

static const float PI = 3.1415926535897932384626433832795;

// Image-Based Lighting
// http://www.unrealengine.com/files/downloads/2013SiggraphPresentationsNotes.pdf
float3 ImportanceSampleGGX( float2 Xi, float Roughness, float3 N )
{
    float a = Roughness * Roughness;
    float Phi = 2 * PI * Xi.x;
    float CosTheta = sqrt( (1 - Xi.y) / ( 1 + (a*a - 1) * Xi.y ) );
    float SinTheta = sqrt( 1 - CosTheta * CosTheta );
    float3 H;
    H.x = SinTheta * cos( Phi );
    H.y = SinTheta * sin( Phi );
    H.z = CosTheta;
    float3 UpVector = abs(N.z) < 0.999 ? float3(0,0,1) : float3(1,0,0);
    float3 TangentX = normalize( cross( UpVector, N ) );
    float3 TangentY = cross( N, TangentX );
    // Tangent to world space
    return TangentX * H.x + TangentY * H.y + N * H.z;
}

// M matrix, for encoding
const static float3x3 M = float3x3(
    0.2209, 0.3390, 0.4184,
    0.1138, 0.6780, 0.7319,
    0.0102, 0.1130, 0.2969);

// Inverse M matrix, for decoding
const static float3x3 InverseM = float3x3(
    6.0013, -2.700, -1.7995,
    -1.332, 3.1029, -5.7720,
    0.3007, -1.088, 5.6268);

float4 LogLuvEncode(in float3 vRGB)
{
    float4 vResult;
    float3 Xp_Y_XYZp = mul(vRGB, M);
    Xp_Y_XYZp = max(Xp_Y_XYZp, float3(1e-6, 1e-6, 1e-6));
    vResult.xy = Xp_Y_XYZp.xy / Xp_Y_XYZp.z;
    float Le = 2 * log2(Xp_Y_XYZp.y) + 127;
    vResult.w = frac(Le);
    vResult.z = (Le - (floor(vResult.w*255.0f))/255.0f)/255.0f;
    return vResult;
}

float3 LogLuvDecode(in float4 vLogLuv)
{
    float Le = vLogLuv.z * 255 + vLogLuv.w;
    float3 Xp_Y_XYZp;
    Xp_Y_XYZp.y = exp2((Le - 127) / 2);
    Xp_Y_XYZp.z = Xp_Y_XYZp.y / vLogLuv.y;
    Xp_Y_XYZp.x = vLogLuv.x * Xp_Y_XYZp.z;
    float3 vRGB = mul(Xp_Y_XYZp, InverseM);
    return max(vRGB, 0);
}

// Ignacio Castano via http://the-witness.net/news/2012/02/seamless-cube-map-filtering/
float3 fix_cube_lookup_for_lod(float3 v, float cube_size, float lod)
{
    float M = max(max(abs(v.x), abs(v.y)), abs(v.z));
    float scale = 1 - exp2(lod) / cube_size;
    if (abs(v.x) != M) v.x *= scale;
    if (abs(v.y) != M) v.y *= scale;
    if (abs(v.z) != M) v.z *= scale;
    return v;
}

// Pre-Filtered Environment Map
// http://www.unrealengine.com/files/downloads/2013SiggraphPresentationsNotes.pdf
float4 ps_main(in ps_in i) : SV_TARGET0
{
    float3 N = fix_cube_lookup_for_lod(normalize(i.nrm), g_CubeSize, g_CubeLod);
    float Roughness = (float)g_CubeLod / (float)(g_CubeLodCount-1);

    float4 totalRadiance = float4(0,0,0,0);

    const uint C = 1024;
    [loop]
    for (uint j = 0; j < C; ++j)
    {
        float2 Xi = Hammersley(j,C);
        float3 H = ImportanceSampleGGX( Xi, Roughness, N );
        float3 L = 2 * dot( N, H ) * H - N;
        float nDotL = saturate(dot(L, N));
        [branch]
        if (nDotL > 0)
        {
            float4 pointRadiance = (g_txEnvMap.SampleLevel( g_samCube, L, 0 ));
            totalRadiance.rgb += pointRadiance.rgb * nDotL;
            totalRadiance.w += nDotL;
        }
    }

    return LogLuvEncode(totalRadiance.rgb / totalRadiance.w);
}
 



#5116045 [Terrain-RTS map] Advanced shadings for better detail!

Posted by Digitalfragment on 10 December 2013 - 05:39 PM

Blizzard & AMD published a whitepaper that covers many graphical aspects of Starcraft II:
http://developer.amd.com/wordpress/media/2012/10/S2008-Filion-McNaughton-StarCraftII.pdf




#5115787 Manually creating textures in Direct3D11

Posted by Digitalfragment on 09 December 2013 - 07:48 PM

The ID3D11Device::CreateTexture2D function has all of the functionality you need.
http://msdn.microsoft.com/en-us/library/windows/desktop/ff476521(v=vs.85).aspx

 

In particular, the source data is passed into CreateTexture2D via the D3D11_SUBRESOURCE_DATA structure as described in the MSDN page linked above.

How to handle mipmaps is also described in that page, under Remarks.

Note that PNG/JPEG are not native formats, and so cannot be used directly. They would have to be decoded into a standard format and then, if you chose, recompressed into one of the BC formats, etc. A sketch of the basic path is below.
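
Something like this, assuming you've already decoded to RGBA8 and only want a single mip level (CreateTexture2D takes one D3D11_SUBRESOURCE_DATA entry per mip otherwise):

#include <d3d11.h>

HRESULT CreateTextureFromPixels(ID3D11Device* device, const void* pixels,
                                UINT width, UINT height,
                                ID3D11Texture2D** outTex)
{
    D3D11_TEXTURE2D_DESC desc = {};
    desc.Width = width;
    desc.Height = height;
    desc.MipLevels = 1; // one D3D11_SUBRESOURCE_DATA entry per mip level
    desc.ArraySize = 1;
    desc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
    desc.SampleDesc.Count = 1;
    desc.Usage = D3D11_USAGE_IMMUTABLE;
    desc.BindFlags = D3D11_BIND_SHADER_RESOURCE;

    D3D11_SUBRESOURCE_DATA initData = {};
    initData.pSysMem = pixels;
    initData.SysMemPitch = width * 4; // bytes per row of RGBA8

    return device->CreateTexture2D(&desc, &initData, outTex);
}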




#5115458 HLSL keywords in inout

Posted by Digitalfragment on 08 December 2013 - 04:17 PM

'in', IIRC, denotes a vertex input going into a main function in Cg.

'out' acts just like 'out' in C#, no value is passed into the function, but the function must write a value to it, which then gets passed back to the calling function.

'inout' acts like 'ref' in C# as you guessed, the argument gets passed both ways, allowing the function to read and modify the value.




#5112577 Difference Tiled and Clustered shading

Posted by Digitalfragment on 27 November 2013 - 04:16 PM

Essentially, yes. Though IIRC, the author of the original clustered shading whitepaper talks about adding several more dimensions, by also bucketing the shading by similar normal angles.
So it's (x, y) -> (position x, position y, position z, normal x, normal z), along the lines of the sketch below.
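
As a rough illustration only (the tile size, slice count and normal binning here are made up, not taken from the paper):

#include <cmath>
#include <cstdint>

// Build a cluster key from screen position, view-space depth and a coarsely
// quantized normal. Exponential depth slicing keeps clusters roughly cubical.
uint32_t ClusterKey(uint32_t px, uint32_t py, float viewDepth,
                    float nx, float ny, float nz, float zNear, float zFar)
{
    const uint32_t tileSize = 64;    // pixels per cluster in x/y
    const uint32_t depthSlices = 16; // exponential slices in z
    float sliceF = std::log(viewDepth / zNear) / std::log(zFar / zNear);
    uint32_t slice = (uint32_t)(sliceF * depthSlices);
    // Quantize the normal to its dominant axis: 6 bins.
    float ax = std::fabs(nx), ay = std::fabs(ny), az = std::fabs(nz);
    uint32_t nBin;
    if (ax >= ay && ax >= az) nBin = nx > 0 ? 0u : 1u;
    else if (ay >= az)        nBin = ny > 0 ? 2u : 3u;
    else                      nBin = nz > 0 ? 4u : 5u;
    return (px / tileSize) | ((py / tileSize) << 8) | (slice << 16) | (nBin << 24);
}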




#5109069 How many influancing weights?

Posted by Digitalfragment on 13 November 2013 - 05:30 PM

 

DX11, PS4 and XBone games are trending towards 8 weights for the face

I wonder how they weight the mesh then. Do they use some procedural try-and-see technique? Considering other technological factors, whether the weights are dynamic or constant. I think they will go for per-vertex animation in the near future.

 

Check out the presentation "Ryse: Son of Rome - Defining the next gen"
http://www.crytek.com/cryengine/presentations

They do both skinning and vertex animation. Both have pros & cons, so take the best of both.




#5108357 Voxel Cone Tracing Experiment - Part 2 Progress

Posted by Digitalfragment on 10 November 2013 - 04:42 PM

 


Obviously your profiler is broken somehow, as I doubt your experiment manages to hold ever-increasing data in the same exact amount of RAM.

 

Actually, I'm using the Task Manager to get the amount of RAM that my application is using.

 

Sounds like you hit your video card's memory limit and the drivers are now spilling into system memory - which is also why your frame rate tanks. Task Manager only shows system memory usage, not the memory internal to the video card.




#5108356 How to compute the bounding volume for an animated (skinned) mesh ?

Posted by Digitalfragment on 10 November 2013 - 04:37 PM

Calculate a bounding box for each joint in the body in the orientation of that joint, based only on the vertices that are skinned to that joint.

Then at runtime, project the joint-oriented bounding boxes into world space, taking the min and max over all of those boxes, as in the sketch below.
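
A minimal C++ sketch of that runtime step (the math types and TransformPoint are stand-ins for your engine's own):

#include <algorithm>
#include <cfloat>

struct float3 { float x, y, z; };
struct AABB { float3 mn, mx; };
struct Matrix44 { float m[4][4]; };

float3 TransformPoint(const Matrix44& m, const float3& p); // engine-provided

AABB WorldBoundsFromJoints(const AABB* jointBounds, const Matrix44* jointWorld,
                           int jointCount)
{
    AABB r = { { FLT_MAX, FLT_MAX, FLT_MAX }, { -FLT_MAX, -FLT_MAX, -FLT_MAX } };
    for (int j = 0; j < jointCount; ++j)
    {
        const AABB& b = jointBounds[j];
        for (int c = 0; c < 8; ++c) // the 8 corners of the joint-space box
        {
            float3 p = { (c & 1) ? b.mx.x : b.mn.x,
                         (c & 2) ? b.mx.y : b.mn.y,
                         (c & 4) ? b.mx.z : b.mn.z };
            p = TransformPoint(jointWorld[j], p);
            r.mn.x = std::min(r.mn.x, p.x); r.mx.x = std::max(r.mx.x, p.x);
            r.mn.y = std::min(r.mn.y, p.y); r.mx.y = std::max(r.mx.y, p.y);
            r.mn.z = std::min(r.mn.z, p.z); r.mx.z = std::max(r.mx.z, p.z);
        }
    }
    return r;
}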

 

There's no real reason to have an artist make the bounding volumes.




#5108353 How many influancing weights?

Posted by Digitalfragment on 10 November 2013 - 04:33 PM

DX11, PS4 and XBone games are trending towards 8 weights for the face, but staying with 4 for everywhere else.




#5104209 Spherical Harminics: Need Help :-(

Posted by Digitalfragment on 24 October 2013 - 04:58 PM


 

This is what I don't understand from your reply: is it necessary to render the cube maps? I mean, I understand that SH will help me "compress" the cube maps (because storing them for, let's say, a 32x32x32 grid of probes would require too much memory)... and then look them up using SH... but wouldn't it be too expensive to create those cube maps in real time? I'm using RSM to store virtual point lights of the scene... is it possible to store incoming radiance into SH without using cube maps?

Most video cards can render the scene into six 128x128 textures pretty damn quickly, given that most video cards can render complex scenes at resolutions of 1280x720 and up without struggling. You would then also perform the integration of the cubemap in a shader, instead of pulling the rendered textures down from the GPU and performing it on the CPU. This yields a table of colours in a very small texture (usually 9 pixels wide) which can then be used in your shaders at runtime. You don't need to recreate your SH probes every frame either, so the cost is quite small.

The "Stupid Spherical Harmonics Tricks" paper also has functions for projecting directional, point and spot lights directly onto the SH coefficient table; however, the cubemap integration is much more practical as you can trivially incorporate area and sky lights. The sketch below shows the shape of that projection.
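
A hedged CPU-side C++ sketch of accumulating samples into the 9-coefficient table (the constants are the standard band 0-2 real SH basis values; iterating the cubemap texels and computing their solid-angle weights is left out):

struct float3 { float x, y, z; };

// Evaluate the 9 real SH basis functions for a unit direction d.
void EvalSH9(const float3& d, float out[9])
{
    out[0] = 0.282095f;                             // Y(0, 0)
    out[1] = 0.488603f * d.y;                       // Y(1,-1)
    out[2] = 0.488603f * d.z;                       // Y(1, 0)
    out[3] = 0.488603f * d.x;                       // Y(1, 1)
    out[4] = 1.092548f * d.x * d.y;                 // Y(2,-2)
    out[5] = 1.092548f * d.y * d.z;                 // Y(2,-1)
    out[6] = 0.315392f * (3.0f * d.z * d.z - 1.0f); // Y(2, 0)
    out[7] = 1.092548f * d.x * d.z;                 // Y(2, 1)
    out[8] = 0.546274f * (d.x * d.x - d.y * d.y);   // Y(2, 2)
}

// Accumulate one cubemap texel: dir is the texel's direction, radiance its
// colour, weight its solid angle. Normalize by the total weight afterwards.
void AccumulateSH9(const float3& dir, const float3& radiance, float weight,
                   float3 coeffs[9])
{
    float sh[9];
    EvalSH9(dir, sh);
    for (int i = 0; i < 9; ++i)
    {
        coeffs[i].x += radiance.x * sh[i] * weight;
        coeffs[i].y += radiance.y * sh[i] * weight;
        coeffs[i].z += radiance.z * sh[i] * weight;
    }
}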





