Help with GPU Pro 5 Hi-Z Screen Space Reflections

Bruce Wilkie · 2015-02-05T01:38:43

Hi there! I'm trying to implement chapter 4 of the Lighting and Shading section in GPU Pro 5: basically, how to optimize my screen space reflections using a mip-mapped Z buffer to quickly converge on the intersection point of my reflection ray. Sadly, the author wasn't allowed to release the code/demo he talks about in the article, so I've had to work out most of the shader myself. I'm close, but there's one thing I don't get: if you always start at a lower mip in the Hi-Z buffer, won't your starting ray depth often (always?) be _behind_ (greater Z than) what you read from the Hi-Z buffer? After all, you build the Hi-Z buffer by taking the min(...) of the more detailed mip. If you've implemented that chapter, or even just read and understood it, or have used Hi-Z tracing before, I'd be curious to hear your thoughts. thx!

Started by Bruzer100 July 12, 2014 01:54 PM

77 comments, last by WFP 10 years ago

LukasBanana

November 04, 2014 04:45 PM

Ok I think I finally understood how the "intersectCellBoundary" function works.

But could it be that crossStep and crossOffset should be computed after the ray direction has been 'normalized'?

I mean instead of this:


// get the cell cross direction and a small offset to enter the next cell when doing cell crossing
float2 crossStep = float2(v.x >= 0.0f ? 1.0f : -1.0f, v.y >= 0.0f ? 1.0f : -1.0f);
float2 crossOffset = float2(crossStep.xy * HIZ_CROSS_EPSILON.xy);
crossStep.xy = saturate(crossStep.xy);

// scale vector such that z is 1.0f (maximum depth)
float3 d = v.xyz / v.z;

Maybe this?


// scale vector such that z is 1.0f (maximum depth)
float3 d = v.xyz / v.z;

// get the cell cross direction and a small offset to enter the next cell when doing cell crossing
float2 crossStep = float2(d.x >= 0.0f ? 1.0f : -1.0f, d.y >= 0.0f ? 1.0f : -1.0f);
float2 crossOffset = float2(crossStep.xy * HIZ_CROSS_EPSILON.xy);
crossStep.xy = saturate(crossStep.xy);

Because when v.z is negative, the signs of d.x and d.y will be flipped.

I still don't get correct results, but it seems more logical to me right now.
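
To make the sign flip concrete, here is a minimal CPU-side sketch in C++. The names `Float3`, `crossSign`, and `scaleToUnitZ` are made up for illustration and only mirror the two shader lines in question. It shows that the two orderings agree when v.z is positive and disagree when v.z is negative; it does not settle which ordering the trace actually needs.

```cpp
#include <array>

// Hypothetical CPU-side stand-in for the shader's float3.
struct Float3 { float x, y, z; };

// Mirrors `float2(v.x >= 0.0f ? 1.0f : -1.0f, v.y >= 0.0f ? 1.0f : -1.0f)`.
inline std::array<float, 2> crossSign(const Float3& v) {
    return { v.x >= 0.0f ? 1.0f : -1.0f, v.y >= 0.0f ? 1.0f : -1.0f };
}

// Mirrors `float3 d = v.xyz / v.z;` (assumes v.z != 0).
inline Float3 scaleToUnitZ(const Float3& v) {
    return { v.x / v.z, v.y / v.z, 1.0f };
}
```

For example, with v = (0.3, -0.2, -0.5), `crossSign(v)` gives (+1, -1), while `crossSign(scaleToUnitZ(v))` gives (-1, +1).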

UPDATE:

I finally found a problem: Currently the ray-marching through the Hi-Z buffer does not work correctly for NPOT (non-power-of-two) resolutions.

With the ray-marching visualization I noticed that the 'cells' were not located correctly.

As you can see in the screenshot below, I'm temporarily using a resolution of 512x512, where the cells are correct:

[Screenshot: ray-march visualization at 512x512]

I tried to modify the "GetCellCount" function to always have even 'counts':


vec2 GetCellCount(vec2 size, float level)
{
    return floor(size / (level > 0.0 ? exp2(level) : 1.0));
//  return       size / (level > 0.0 ? exp2(level) : 1.0) ;
}

But that didn't work.

And I still get wrong results when the ray direction's Z is negative. But stay tuned :-)

WFP

November 04, 2014 09:21 PM

Hey Lukas,

Regarding the NPOT restriction, this was a problem we worked through earlier in the thread. I'm traveling right now, so I don't have time to read back through or do a big post here, but read back over the thread, because I'm almost positive I posted a solution for that somewhere in there. If I remember correctly, I changed the way I was doing offsets while building out the various buffers. Let me know if you get it fixed. If you haven't solved it by then, I'll re-read the thread when I get some time and try to find the exact post.

Edit: Look at post #24. I've tested that on both an NVIDIA card and a laptop ATI card, so it should work regardless of which you use.

Thanks,
WFP

LukasBanana

November 05, 2014 05:47 PM

I used positive offsets for the Hi-Z Buffer generation from the very beginning of my implementation.

This is the relevant part of my GLSL shader's main function that generates the Hi-Z maps:


/* Fetch texels from current MIP LOD */
ivec2 coord = ivec2(gl_FragCoord.xy) * ivec2(2);

vec4 texels[2];
texels[0].rg = texelFetch(tex, coord, 0).rg;
texels[0].ba = texelFetch(tex, coord + ivec2(1, 0), 0).rg;
texels[1].rg = texelFetch(tex, coord + ivec2(0, 1), 0).rg;
texels[1].ba = texelFetch(tex, coord + ivec2(1, 1), 0).rg;

/* Down-sample texels */
float minZ = min(min(texels[0].r, texels[0].b), min(texels[1].r, texels[1].b));
float maxZ = max(max(texels[0].g, texels[0].a), max(texels[1].g, texels[1].a));

BTW: I don't pass the current MIP level or any other data via constant buffer.

I just bind the current MIP level to render in and clamp the MIP level to sample from:


glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_BASE_LEVEL, mipLevel - 1);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAX_LEVEL, mipLevel - 1);

But that's not the solution for my problem.

I think the relation between the 'cells' and the 'texels', sampled from the Hi-Z buffer, is currently not correct for NPOT resolutions.

I presume the "intersectCellBoundary" function fails in this case when I pass the "cellCount" parameter.

WilliamHamilton

November 14, 2014 08:38 PM

Hello,

It's great to find other programmers with the same issues.

I'm currently working on a 100% reflective surface, and I now get results similar to @WFP's:

http://www.gamedev.net/topic/658702-help-with-gpu-pro-5-hi-z-screen-space-reflections/#entry5173175

Stairs, noise... but I can't fix them the way you did.

I also find a lot of contradictions in the original chapter, for example:

The pseudocode on p. 163 uses both the min plane and the boundary plane, but the implementation on p. 174 only calls a function named "getMinDepthPlane".

The pseudocode tests both the min and boundary planes without any ++ or -- on the mip level, which is not what the implementation on p. 174 does.

On p. 160 he talks about the importance of perspective interpolation, so intersectDepthPlane should be more complex than a simple return o + d*t; we need perspective interpolation on the depth buffer.

Another open question: do you know how we can test whether we are in one of the "failure cases"?

http://bartwronski.com/2014/01/25/the-future-of-screenspace-reflections/

For failure case 1 it's pretty easy, just test the UV, but how can we test for cases 2 and 3?

Would it be possible for someone to share the full code for hiZTracing? And/or how to compute the Hi-Z parameters?

Thanks

LukasBanana

November 15, 2014 09:34 PM

@WilliamHamilton:

On p. 160 he talks about the importance of perspective interpolation, so intersectDepthPlane should be more complex than a simple return o + d*t; we need perspective interpolation on the depth buffer.

Yes, this function really is that simple. He explains that we can linearly interpolate the post-projected Z, because we interpolate it in (post-projected) screen space.

Figure 4.10 on p.160 is helpful here.
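
A quick numeric check of that claim, as a hedged CPU-side sketch in C++: it projects two view-space points with a D3D-style depth mapping, then compares "interpolate in view space, then project" against "project, then interpolate linearly in screen space". The projection parameters (`focal`, `nearZ`, `farZ`) and all function names are assumptions for illustration, not taken from the book.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// Hypothetical D3D-style perspective projection of a view-space point:
// returns (x/z, y/z scaled by focal, depth in [0,1] with 0 = near, 1 = far).
inline Vec3 project(const Vec3& p, double focal, double nearZ, double farZ) {
    const double a = farZ / (farZ - nearZ);
    const double b = -nearZ * farZ / (farZ - nearZ);
    return { focal * p.x / p.z, focal * p.y / p.z, a + b / p.z };
}

// Plain component-wise linear interpolation.
inline Vec3 lerp3(const Vec3& a, const Vec3& b, double t) {
    return { a.x + (b.x - a.x) * t, a.y + (b.y - a.y) * t, a.z + (b.z - a.z) * t };
}

// View-space parameter on the segment p0->p1 whose projection sits at
// screen-space fraction t (standard perspective-correct interpolation).
inline double viewParamForScreenParam(double t, double z0, double z1) {
    return t * z0 / (t * z0 + (1.0 - t) * z1);
}
```

Under these assumptions, `project(lerp3(p0, p1, viewParamForScreenParam(t, p0.z, p1.z)), ...)` and `lerp3(project(p0, ...), project(p1, ...), t)` agree in all three components, which is why a plain `o + d*t` is enough once o and d live in post-projection screen space.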

Would it be possible for someone to share the full code for hiZTracing? And/or how to compute the Hi-Z parameters?

I can't do that right now, sorry. It's part of my bachelor thesis. When my work is done (expected next year), it will possibly be open source.

@all:

I finally resolved the staircase artifacts with NPOT (non-power-of-two) resolutions!

Thanks WFP, your hint about the Hi-Z MIP generation was very helpful :-), but I have a slightly different solution.

Here I made a sketch for a single Hi-Z MIP generation pass (example shows generation of the MIP size 2x2 from the previous size 5x4):

[Image: Hi-Z MIP Generation.jpg]

Instead of always sampling with the offsets (0, 0), (1, 0), (0, 1), (1, 1), I use an offset which is either 1 or 2.

Pseudocode:


tex.Load(coord + int2(0, 0));
tex.Load(coord + int2(offset.x, 0));
tex.Load(coord + int2(0, offset.y));
tex.Load(coord + int2(offset.x, offset.y));

As you can see I don't use "SampleLevel" (or "textureLodOffset" in GLSL), instead I use "Load" (or "texelFetch" in GLSL).

On the C++ side I update this offset before every MIP-Level generation pass:


// Check for width and height if it's even or odd.
constantBuffer.offset.x = (upperMIPLevelSize.width  % 2 == 0 ? 1 : 2);
constantBuffer.offset.y = (upperMIPLevelSize.height % 2 == 0 ? 1 : 2);
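
For anyone who wants to try this on the CPU first, here is a rough C++ sketch of one min-depth pass with the even/odd offsets. It is not the actual shader: `downsampleMin` is a made-up name, the destination size is assumed to be floor(source / 2), `coord` is assumed to be twice the destination texel (as in the GLSL posted earlier), and out-of-range reads are clamped here.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical CPU mirror of one Hi-Z min pass. `src` is row-major, srcW x srcH.
std::vector<float> downsampleMin(const std::vector<float>& src, int srcW, int srcH) {
    // Assumed mip size rule: halve and round down, never below 1.
    const int dstW = std::max(srcW / 2, 1);
    const int dstH = std::max(srcH / 2, 1);
    // Offset is 1 for an even source dimension and 2 for an odd one, as in the post.
    const int offX = (srcW % 2 == 0) ? 1 : 2;
    const int offY = (srcH % 2 == 0) ? 1 : 2;

    // Clamped texel read (an assumption; it stands in for tex.Load).
    auto load = [&](int x, int y) {
        x = std::min(x, srcW - 1);
        y = std::min(y, srcH - 1);
        return src[y * srcW + x];
    };

    std::vector<float> dst(dstW * dstH);
    for (int y = 0; y < dstH; ++y) {
        for (int x = 0; x < dstW; ++x) {
            const int cx = x * 2;
            const int cy = y * 2;
            const float a = load(cx, cy);
            const float b = load(cx + offX, cy);
            const float c = load(cx, cy + offY);
            const float d = load(cx + offX, cy + offY);
            dst[y * dstW + x] = std::min(std::min(a, b), std::min(c, d));
        }
    }
    return dst;
}
```

For the 5x4 example this yields a 2x2 result whose taps reach the last source column. Note that in this four-tap form an offset of 2 does not read the texel in between, so it may be worth checking whether that matters for your scenes.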

This works pretty well for me. Here is a comparison screenshot of my previous and current solutions:

[Screenshot: comparison of the incorrect (previous) and correct (current) results]

This is at a resolution of 800x600, but I also tested totally odd sizes like 812x617.

Hope this helps someone :-)

Greetings,

Lukas

WFP

November 17, 2014 03:22 AM

Hi WilliamHamilton,

Towards the end of the chapter, the author actually gives a few hints and examples to handle failure-type cases. Here's the code I currently have in to fade rays as they become unusable, which is basically right from the book. It could use a little touching up, but should get you on the right track.


	// fade pointing towards camera
	float3 reflectVS = reflect(toPositionVS, normalVS);
	float fadeOnMirror = dot(reflectVS, toPositionVS);
	
	// fade rays close to screen edge
	float2 boundary = abs(raySS.xy - float2(0.5f, 0.5f)) * 2.0f;

	const float FADE_START = cb_fadeStart;
	const float FADE_END = cb_fadeEnd;
	float fadeOnBorder = 1.0f - saturate((boundary.x - FADE_START) / (FADE_END - FADE_START));
	fadeOnBorder *= 1.0f - saturate((boundary.y - FADE_START) / (FADE_END - FADE_START));

	// fade on distance traveled
	// needs to be updated to use the cb_maximumDistance as part of its calculation
	float traveled = distance(raySS.xy, positionSS.xy);
	float fadeOnTravel = 1.0f - saturate((traveled - FADE_START) / (FADE_END - FADE_START));

	float totalFade = fadeOnMirror * fadeOnBorder * fadeOnTravel;
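
As a sanity check of the border term only, here is a small C++ sketch of the same arithmetic. The `saturate` helper and the `fadeOnBorder` function are CPU-side stand-ins written for illustration; the fade start and end values used in the example are assumptions, not the values from the book or from my constant buffer.

```cpp
#include <algorithm>
#include <cmath>

// CPU stand-in for HLSL saturate(): clamp to [0, 1].
inline float saturate(float x) { return std::min(std::max(x, 0.0f), 1.0f); }

// u, v are screen-space UVs in [0, 1]; fadeStart < fadeEnd are expressed in
// "distance from screen center" units where 0 is the center and 1 is the edge.
inline float fadeOnBorder(float u, float v, float fadeStart, float fadeEnd) {
    const float bx = std::fabs(u - 0.5f) * 2.0f;
    const float by = std::fabs(v - 0.5f) * 2.0f;
    float fade = 1.0f - saturate((bx - fadeStart) / (fadeEnd - fadeStart));
    fade *= 1.0f - saturate((by - fadeStart) / (fadeEnd - fadeStart));
    return fade;
}
```

With a fade start of 0.5 and a fade end of 1.0, a hit at the screen center keeps full weight, a hit at u = 0.875 is faded to half, and a hit on the right edge is faded out completely.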

If you'd like help with other parts of your implementation, please post it here and I'll take a look at it when I get a moment to see if anything jumps out at me.

LukasBanana,

Glad to see you were able to resolve your step artifacts. Nice job!

Thanks,

WFP

WilliamHamilton

November 17, 2014 05:58 PM

I'm more interested in your implementation of hiZTrace.

Here is my implementation (most of the code is a mix based on what you guys posted):

mainPS:


    // Get Data
    float4 normal = float4( NormalTexture.Sample( NormalTextureSamp, uv ).xyz,             1.f );

    // Early discard: Remove the sky and surfaces without normals
    if ( ( dot( normal, float3( 1.f, 1.f, 1.f ) ) ) <= 0.f )
        return 0;

    float4 color  = ColorTextureTexture.Sample( ColorTextureSamp, uv );
    float4 depth  = float4( DepthTexture.Sample( DepthTextureSamp, uv ).xxx,               1.f );
    float4 vis    = float4( MaterialTexture.Sample( MaterialTextureSamp, uv ).xyz,         1.f );
    float4 hiZ    = float4( HiZDepthTexture.Sample( HiZDepthTextureSamp, uv ).xyz,         1.f );

    // World Normal To ViewSpace Normal
    float3 viewNormal = normalize( mul( NormalFromGNormal( normal.xyz ), ( float3x3 )ViewMatrix ) );

    float usedDepth = depth.z; // ( min, max, depth )

    // UV Space To Clip Space
    float4 projPos = float4( UVToClipXY( uv.xy ), usedDepth, 1.0 );

    // Clip Space to View Space
    projPos = mul( projPos, InvProjectionMatrix );

    float3 viewPos = projPos.xyz/projPos.w;

    float3 viewDir = normalize( viewPos );

    float3 viewReflect = normalize( reflect( viewDir, viewNormal ) );

    // Clip Plane result
    float4 screenReflectPos = mul( float4( viewPos + viewReflect, 1.0 ), ProjectionMatrix );

    /*
    if ( abs( screenReflectPos.w ) < 0.001f )
        return 0;
    */

    screenReflectPos.xyz /= screenReflectPos.w;

    // Back to UV Space
    screenReflectPos.xy = ClipXYToUV( screenReflectPos.xy );

    float3 screenPos = float3( uv, usedDepth );
    // Operation on UV Space
    float3 screenReflect = normalize( screenReflectPos.xyz - screenPos );

    if ( screenReflect.z < 0.f )
        return 0.f;

    float4 newUV = hiZTrace( screenPos, screenReflect );

    if ( newUV.x < 0.f || newUV.y < 0.f || newUV.x > 1.f || newUV.y > 1.f )
        return 0.f;

    /*
    float4 foundPos = float4( float2( -1.0, 1.0 ) + newUV.xy*float2( 2.0, -2.0 ), newUV.z, 1.0 );
    foundPos = mul( foundPos, InvProjectionMatrix );
    float3 viewFoundPos = foundPos.xyz/foundPos.w;

    float4 foundPosD = float4( float2( -1.0, 1.0 ) + newUV.xy*float2( 2.0, -2.0 ), depth.x, 1.0 );
    foundPosD = mul( foundPos, InvProjectionMatrix );
    float3 viewFoundPosD = foundPosD.xyz/foundPosD.w;
    */

    float4 reflectedColor = ColorTextureTexture.Sample( ColorTextureTextureSamp, newUV.xy );

    return reflectedColor;

And my HiTrace implementation:


float3 intersectDepthPlane( float3 o, float3 d, float t )
{
    return o + d*t;
}

float2 getCell(float2 ray, float2 cellCount)
{
    return floor( ray*cellCount );
}

float3 intersectCellBoundary(float3 o, float3 d, float2 cellIndex, float2 cellCount, float2 crossStep, float2 crossOffset)
{
    float2 index = cellIndex + crossStep;
    index /= cellCount;
    index += crossOffset;
    float2 delta = index - o.xy;
    delta /= d.xy;
    float t = min( delta.x, delta.y );
    return intersectDepthPlane( o, d, t );
}

float2 getCellCount(float level, float rootLevel)
{
    float2 div = ( level == 0.0f ? 1.0f : exp2( level ) );
    return floor( ViewportSize.xy/div );
}

float3 getMinimumDepthPlane( float2 ray, float level, float rootLevel )
{
    [branch]
    if ( level == 0 )
        return DepthTexture.tex.SampleLevel( DepthTexture.samp, ray.xy, 0.f ).xxx;
    else
        return HiZDepthTexture.tex.SampleLevel( HiZDepthTexture.samp, ray.xy, level ).xyz;
}

bool crossedCellBoundary( float2 cellIdxOne, float2 cellIdxTwo )
{
    return floor( cellIdxOne.x ) != floor( cellIdxTwo.x ) || floor( cellIdxOne.y ) != floor( cellIdxTwo.y );
}

float4 hiZTrace( float3 p, float3 v )
{
    const float rootLevel = HIZ_MAX_LEVEL - 1.0f;

    float level = HIZ_START_LEVEL;

    uint iterations = 0u;

    float3 d = v.xyz / v.z;

    float2 crossStep   = float2( d.x >= 0.f ? 1.0f : -1.0f, d.y >= 0.f ? 1.0f : -1.0f );
    float2 crossOffset = float2( crossStep.xy*HIZ_CROSS_EPSILON.xy );
    crossStep.xy       = saturate( crossStep.xy );

    float4 ray = float4( p.xyz, 0.f );

    float3 o = intersectDepthPlane( p, d, -p.z );

    float2 hiZSize = getCellCount( level, rootLevel );
    float2 rayCell = getCell( ray.xy, hiZSize );

    ray.xyz = intersectCellBoundary( o, d, rayCell.xy, hiZSize.xy, crossStep.xy, crossOffset.xy );

    float cumul = 0.f;

    [loop]
    while ( level >= HIZ_START_LEVEL && iterations < MAX_ITERATIONS )
    {
        crossStep    = float2( d.x >= 0.f ? 1.0f : -1.0f, d.y >= 0.f ? 1.0f : -1.0f );
        crossOffset  = float2( crossStep.xy*HIZ_CROSS_EPSILON.xy );
        crossStep.xy = saturate( crossStep.xy );

        const float2 cellCount  = getCellCount( level, rootLevel );
        const float2 oldCellIdx = getCell( ray.xy, cellCount );

        float3 zMinMax0 = getMinimumDepthPlane( ray.xy, level, rootLevel );
        //float3 zMinMax1 = getMinimumDepthPlane( ray.xy, min( level + 1, HIZ_MAX_LEVEL ), rootLevel );

        //float3 zClosest = min( zMinMax0, zMinMax1 );
        float3 zClosest = zMinMax0;

        float3 tmpRay = intersectDepthPlane( o, d, max( ray.z, zClosest.z ) );
        // get the new cell number as well
        const float2 newCellIdx = getCell( tmpRay.xy, cellCount );

        [branch]
        if ( crossedCellBoundary( oldCellIdx, newCellIdx ) )
        {
            tmpRay = intersectCellBoundary( o, d, oldCellIdx, cellCount.xy, crossStep.xy, crossOffset.xy );
            level = min( HIZ_MAX_LEVEL, level + 1.f );
        }
        else
        {
            level = max( level - 1.f, 0 );
        }

        ray.xyz = tmpRay.xyz;

        ++iterations;
    }

    return float4( ray.xyz, iterations );
}
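
To step through a single cell crossing by hand, the three small helpers above can be mirrored on the CPU. This is only a sketch in C++ with made-up `Float2`/`Float3` structs; it copies the arithmetic of `getCell`, `intersectDepthPlane`, and `intersectCellBoundary` as posted and does not guard against d.x or d.y being zero.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical CPU stand-ins for the shader vector types.
struct Float2 { float x, y; };
struct Float3 { float x, y, z; };

inline Float3 intersectDepthPlane(const Float3& o, const Float3& d, float t) {
    return { o.x + d.x * t, o.y + d.y * t, o.z + d.z * t };
}

inline Float2 getCell(const Float2& ray, const Float2& cellCount) {
    return { std::floor(ray.x * cellCount.x), std::floor(ray.y * cellCount.y) };
}

inline Float3 intersectCellBoundary(const Float3& o, const Float3& d,
                                    const Float2& cellIndex, const Float2& cellCount,
                                    const Float2& crossStep, const Float2& crossOffset) {
    // Boundary of the current cell in the ray direction, nudged into the next cell.
    const float ix = (cellIndex.x + crossStep.x) / cellCount.x + crossOffset.x;
    const float iy = (cellIndex.y + crossStep.y) / cellCount.y + crossOffset.y;
    // Parameter at which each axis reaches its boundary; take the nearer one.
    const float tx = (ix - o.x) / d.x;
    const float ty = (iy - o.y) / d.y;
    return intersectDepthPlane(o, d, std::min(tx, ty));
}
```

For example, with o = (0.1, 0.1, 0), d = (1, 0.5, 1), a 4x4 grid, crossStep = (1, 1), and crossOffset = (0.001, 0.001), the ray starts in cell (0, 0) and lands at roughly (0.251, 0.1755, 0.151), which is just inside cell (1, 0).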

I tried with a power-of-two back buffer, but I still get noise or stairs depending on the number of steps...

If you spot something wrong in my code, I'd be happy to hear it :)

I'm still working on the ray tracing part, without blurring...

Thanks

WilliamHamilton

November 17, 2014 06:10 PM

If you can't share your hiZTrace, could you share a screenshot of each step to compare?

viewDir, screenReflect...? In image space?

WFP

November 18, 2014 12:56 AM

Hi WilliamHamilton,

There are a few noticeable differences I see in your implementation that I do not have in mine.

First, I let getCellCount return whatever the value is without taking the floor() of it. When I add floor like your implementation, I get a few block/stair artifacts in certain situations. I think it's okay to let it return the real value, including fractional parts.
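
To see how the two variants differ at NPOT sizes, here is a tiny C++ sketch. The function names are made up, and it only compares the numbers; it does not prove which variant matches a given mip chain.

```cpp
#include <cmath>

// size / 2^level, keeping the fractional part.
inline float cellCountExact(float size, int level) {
    return size / std::exp2(static_cast<float>(level));
}

// Same value with floor() applied, as in the posted getCellCount.
inline float cellCountFloored(float size, int level) {
    return std::floor(cellCountExact(size, level));
}

// Cell that a [0, 1] coordinate falls into, as in getCell: floor(ray * cellCount).
inline float cellIndex(float coord, float cellCount) {
    return std::floor(coord * cellCount);
}
```

For a height of 600 at level 4, the exact count is 37.5 and the floored count is 37, so a ray at v = 0.99 lands in cell 37 with one and cell 36 with the other. At a power-of-two size such as 512 the two variants agree.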

For getMinimumDepthPlane, I'm not sure why you are returning a float3. At most, I think you would want a float2 (x for min and y for max from the hiz-buffer). In the code you posted, you're using the z-component of that value, so what are you storing there? If you look at the implementations for constructing the HiZ Buffer in this thread, I think you'll find that you really only need two values.

In crossedCellBoundary, I don't use floor() for any of my comparison values and it seems to work fine. It may not be an error, but you can probably remove it - test with and without to see what works.

You can also move your crossStep and crossOffset calculations out of the loop, as what you posted will generate the same values for them every iteration.

What values are you using for HIZ_START_LEVEL, HIZ_STOP_LEVEL, HIZ_MAX_LEVEL, HIZ_CROSS_EPSILON, and MAX_ITERATIONS?

Thanks,

WFP

WilliamHamilton

November 18, 2014 05:14 PM

Hello, thanks for your answer,

@WFP:

Indeed, removing the floor from my GetCell fixes the stairs issue.

GetMinDepthPlane gives me the min, the max, and the depth at offset (0,0). I use that last value for debugging and testing. I now test with the min and max instead, and I get the same result with:

float3 tmpRay = intersectDepthPlane( o, d, max( ray.z, zMinMax.y ) ); // (or zMinMax0.x)

So without the floor we compare the floats directly, and that works :) OK!

For now, Start and Stop are 1 or 2. CrossEpsilon is like yours, I think: PixelSize.xy*exp2( HIZ_START_LEVEL + 1.f );

And MAX_ITERATIONS is, sadly, between 512 and 1024.

I tried to do what you did here:

http://www.gamedev.net/topic/658702-help-with-gpu-pro-5-hi-z-screen-space-reflections/page-3#entry5187536

to get the "fix" for the "A"-shaped feet. I fetch the pixels directly with Load, but that breaks the reflection and gives me a lot of self-collision. When I use:

float3 tmpRay = intersectDepthPlane( o, d, max( ray.z, max( zMinMax.x, zMinMax.y ) ) ); // First Min to Max

instead, my reflection comes back (but with more artifacts than before). It's possible my Z isn't like yours: I have 0 at near and 1 at far... (Sorry, I can't share screenshots.)

Switching to Load instead of SampleLevel acts like a floor, doesn't it?

Last question :)

About the fading: I'm not sure about fading on travel distance. A ray can travel any distance it wants, for example to reflect the sky, so I'm not sure it's good information for removing the "ray hit" of the "projection shadow".

I tried the iteration count, and ray.z versus the real depth... I haven't found a good criterion for that yet.

Thanks

William
