# Help with GPU Pro 5 Hi-Z Screen Space Reflections

## Recommended Posts

Hi Bruzer100,

I was also preparing to start implementing something similar, and was disappointed to find that the code was unavailable, especially since several places in the chapter specifically tell the reader to consult the source.  I'm sure that when I start my implementation in the next few days I'll run into issues similar to the ones you've come across, so if you wouldn't mind sharing what hurdles you've had to work around that weren't called out, or even sharing your implementation, it would be highly appreciated.  I've bookmarked this thread so that once I get started I can add anything I find to be valuable, especially if it was left out of the book's chapter.

Thanks,

WFP

##### Share on other sites

Hi guys,

Same boat as you. Are you doing the ray marching in screen or view space?

On page 174, it's not clear to me how the function intersectDepthPlane is supposed to work:

```hlsl
float3 o = intersectDepthPlane(p.xy, d.xy, -p.z);
```

Is this a typo, or am I missing something? Should it be 'p.xyz', since it should be re-projecting the point onto the near plane (o.z = 0)? The method can't assume the point returned is always at z = 0, since it's also used to calculate the tmpRay position during the ray march (which needs to keep track of the .z component).

I would expect the method to look like this: (?)

```hlsl
float3 intersectDepthPlane(float3 p, float2 d, float z)
{
    return p + float3(d, 1) * z;
}
```

I'm not sure how intersectCellBoundary works either, with its crossStep and crossOffset... (why are these two helper variables needed? Why saturate the cross direction?)
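For what it's worth, here's my best guess at reconstructing it from the chapter's description - the parameter names follow the pseudocode, but the body is pure guesswork on my part:

```hlsl
// Guesswork reconstruction of intersectCellBoundary - NOT the author's code.
// crossStep is saturate(sign(d.xy)): 1 when the ray moves in the positive
// direction along an axis, 0 when negative, so it selects the cell edge
// the ray will exit through.
float3 intersectCellBoundary(float3 p, float3 d, float2 cellIndex, float2 cellCount,
                             float2 crossStep, float2 crossOffset)
{
    // uv position of the exit edge of the current cell along each axis
    float2 cellBoundary = (cellIndex + crossStep) / cellCount;

    // crossOffset nudges the point slightly past the boundary so the next
    // cell lookup doesn't land back inside the same cell
    cellBoundary += crossOffset;

    // distance along the ray to each boundary plane; step to the nearer one
    float2 t = (cellBoundary - p.xy) / d.xy;
    return p + d * min(t.x, t.y);
}
```

If that's roughly right, the saturate would just be a branchless way of picking the near or far edge of the cell depending on the ray's direction - but I'd love confirmation from someone who has it working.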

Cheers!

Jp

##### Share on other sites

Also:

```hlsl
static const float2 hiZSize = cb_screenSize; // not sure if correct - this is mip level 0 size
```

I was also wondering about this. To me it would make sense for this to be the size of the mip level we are starting the ray march from, since it is used to do the first cell boundary test - but the name of the variable seems to imply otherwise :) Not sure either. Feels great to have other people to chat about this :D

Jp

##### Share on other sites
Hey Jp,

Great to see you're making some good headway on this. I'm looking forward to seeing how your translation to HLSL works out.

I realize I forgot to answer a question of yours yesterday, but I think you figured out the answer anyway - I am doing the ray marching in screen space.

Regarding the min-max tracing, what the author means is that when you're creating your hi-z buffer, you store not only the minimum depth value [min(min(value.x, value.y), min(value.z, value.w))] but also the maximum [max(max(value.x, value.y), max(value.z, value.w))]. What this gives you is a better estimate of the depth of the object at the pixel you're currently processing. You can use this in the ray-tracing pass to walk behind an object - if the current ray depth (from O + D * t) is outside the range [min, max] at that pixel, you know it's not intersecting and can continue marching along the ray without further processing at the current position. I don't have this in my implementation yet, as I'm just trying to get the basics working first.
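To illustrate what I mean (just a sketch of my understanding - the buffer layout and names here are mine, not the book's):

```hlsl
// Sketch only: classify a ray sample against a min/max hi-z cell.
// Assumes the hi-z buffer stores (minZ, maxZ) in its red/green channels.
Texture2D<float2> hiZBuffer : register(t0);
SamplerState sampPointClamp : register(s0);

bool isPlausibleHit(float2 rayUV, float rayZ, float mipLevel)
{
    float2 minMaxZ = hiZBuffer.SampleLevel(sampPointClamp, rayUV, mipLevel).rg;

    // A hit only makes sense if the ray depth falls inside the cell's
    // [min, max] range; if rayZ > maxZ the ray has passed behind the
    // occluder, so the march should continue instead of stopping.
    return rayZ >= minMaxZ.x && rayZ <= minMaxZ.y;
}
```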

That's a good idea you had concerning the hiZSize, and this evening when I get back home (mine is a hobby project at the moment, so I work on it in my free time), I will try setting it to something like hiZSize = cb_screenSize / exp2(HI_Z_START_LEVEL). One of my issues could very well be that I'm not taking a large enough step away from the starting point to begin with.

Glad to have another person to bounce ideas back and forth with!

-WFP


##### Share on other sites

I tried using different cell sizes (mip levels) to offset the initial ray starting point.  Even going two levels down wasn't enough.  In the end, I ended up biasing, much like with shadow maps, to clean up the initial self-intersections.  It means the reflections aren't perfectly lined up, but you can't tell once a blur is applied.

I think you are having issues with not rejecting the "wrong" ray hits.  This wasn't talked about in the article, or maybe I missed it, but the hi-z trace can return a screen space position that is incorrect for the reflection.  Think of a ray going behind an object floating above the ground.  The Z position of the ray will be far behind the Z position of the object, but the hi-z trace will return the screen space position where the ray and the object first overlap.  In this case, you need to recognize that it's a nonsensical intersection and keep tracing.  This is also where the max part of the min/max buffer comes in handy, since you will now be "behind" the object - a case the implementation in the book doesn't cover.

The author says he handles all of that in the implementation he never shows us.  :/  IMO it's kind of an incomplete article, and I'm disappointed the GPU Pro editors decided to include it knowing it was written against source code that couldn't be released.  Seems a bit disingenuous.

Also, I _had_ to apply temporal supersampling and temporal filtering to my results to get the effect to shipping quality.  The temporal stability of the technique is very poor - you'll see shimmering / aliasing on any real-world scene with depth complexity to it.

##### Share on other sites

Hi,

I've spent just a moment this evening so far working some more on this, and it does seem that changing the hiZSize to what was mentioned above helps out some.  It is now defined as

```hlsl
static const float2 hiZSize = cb_screenSize / exp2(HIZ_START_LEVEL);
```

It still needs some refinement, but I feel confident that either a small linear search or a bias like Bruzer mentioned will help.
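For reference, the kind of small linear search I have in mind would look something like this (a rough sketch with names of my own, not anything from the book):

```hlsl
// Rough sketch of a short linear refinement after the hi-z march stops.
// Assumes the hi-z buffer stores min depth in its red channel; all names
// here are mine.
Texture2D<float2> hiZBuffer : register(t0);
SamplerState sampPointClamp : register(s0);

float3 linearRefine(float3 p, float3 d, float stepSize, int numSteps)
{
    for (int i = 0; i < numSteps; ++i)
    {
        float sceneZ = hiZBuffer.SampleLevel(sampPointClamp, p.xy, 0.0f).r;
        if (p.z >= sceneZ)
            break;          // crossed the surface: p is close enough to the hit
        p += d * stepSize;  // otherwise keep stepping at the finest level
    }
    return p;
}
```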

The main issues I'm facing now are these stair-step-like artifacts that show up.  I've attached links to two images showing what I'm talking about.

Any idea what causes artifacts like this?  Bruzer mentioned some stair-like artifacts he was seeing due to using a floor() command where he didn't need one, but I only have the one there to make sure the cell index is an integer, and removing it doesn't fix the artifacts.

I'm almost wondering if I'm doing something incorrectly in the "setup" code - that is, the code leading up to the hiZTrace() call where I get the position and ray direction.  Maybe someone could give it a once-over to see if I've missed something there?

Thanks,

WFP

https://www.dropbox.com/s/3rby7uwi2vugnw0/screenshot_4.png

https://www.dropbox.com/s/hs6ygesn1ez7sw4/screenshot_4_annotated.png

##### Share on other sites

@WFP, looking good :D This seems much better! Yes, a small linear search at the end (I guess 4 taps for the missing first 2 mip levels) will probably help. What resolution are you rendering at? Power of 2? The hi-z pass I wrote this afternoon doesn't currently support buffers that aren't powers of 2 (it incorrectly gathers the mins and maxes). If some cells have incorrect min/max info, you might get some ray misses. I didn't get very far with the ray march today though - will continue tomorrow. If I had to bet, I'd say your screen space pos and dir are correct; I don't think you'd get any good reflections otherwise.

@Bruzer, I sort of agree about the article. It really doesn't stand on its feet without the code. For example, the article describes in great detail how to calculate a reflection vector in screen space - diagrams of a reflected ray, a description of what a reflection is - and then, when it gets to the actual reflection algorithm: "to understand the algorithm, look at the diagram on page blah". There's a whole lot of subtleties captured neither by the pseudocode, nor the diagram, nor the code snippet.

I also looked at the simple linear ray march mini-program and wondered how it could handle the case where a ray goes "under" an object (the code as-is would conclude a reflection is good as soon as the ray reaches anything in front of it)...

And yes, about the temporal filtering - I was anticipating the reflection pass being unstable :( How many history buffers did you keep?

Will post results as soon as I have anything.

Cheers,

Jp

##### Share on other sites

Hey Jp,

I'm rendering at 1536x864 which I know is a bit of an unconventional resolution, but I use it to help ensure that my scenes and effects can be rendered without imposing limitations on windowed client dimensions.  The artifacts I mentioned above do still show up if I use more traditional dimensions like 1280x720 or 1920x1080.  I'm glad you mentioned using a power of 2 texture, though, because when I tested tonight by setting the output to 1024x512 and 1024x1024, the stair-like artifacts did seem to be alleviated, though some artifacts remain.  I wonder if I need to revisit the way I'm building my hi-z buffer due to not using power of two textures.  I will look into that tomorrow when I get some time and see if that helps out.

Thanks,

WFP

##### Share on other sites

Actually, I got a little extra time this evening to work on it and wanted to update with a screenshot of running at 1024x512.  The power of two texture is clearly helping the stepping artifacts (I tried on several other power of 2 combinations, as well) so now I need to see what I can do to adapt that to non-power of two resolutions.  Any ideas?

The interlaced lines I'm confident can be repaired by updating the epsilon.  I haven't tested it much in my code yet, but I'm guessing that moving it to something like the below will help a lot.

```hlsl
static const float2 HIZ_CROSS_EPSILON = float2(texelWidth, texelHeight) * exp2(HIZ_START_LEVEL);
```


There are a few other artifacts that appear under things like spheres and character arms that I think I can solve by using the min and max depth in combination with one another to walk behind objects - these are examples of the nonsensical intersections Bruzer mentioned.

Anyway, I'm calling it a night right now, but will be working more on it as soon as I get a chance.

-WFP

Screenshot at 1024x512:

https://www.dropbox.com/s/3uvq0mrczps6vc0/screenshot_5.png

##### Share on other sites

Only some minor updates to provide at the moment.  I've been able to confirm a few suspicions from my previous posts.

The first is that using power of two textures removes the stair-step artifacts.

The second is that setting HIZ_CROSS_EPSILON to what I mentioned in the post above did indeed remove the interlaced lines.  I also found through testing, though, that I could set HIZ_START_LEVEL and HIZ_STOP_LEVEL to 0.0f and leave the epsilon at the texel sizes, and that also removed the interlaced lines.  With either of these setups the results were dramatically better, and the only noticeable artifact in the ray-tracing portion is the nonsensical-intersection issue that can be solved by properly using the min/max buffer.  Here's what I landed on for HIZ_CROSS_EPSILON; it works well with both start/stop levels I've tested (2.0f and 0.0f).

```hlsl
static const float2 HIZ_CROSS_EPSILON = float2(texelWidth, texelHeight) * exp2(HIZ_START_LEVEL + 1.0f);
```


I've included another screenshot to show the same scene (a stack of boxes - i.e., wonderful programmer art) with the interlaced lines gone.

If anyone has any ideas to get rid of my power of two texture size constraints that would be most appreciated.  The only thing I can think of right now is copying the necessary resources to a power of two texture before starting the effect, but I feel like that's bound to introduce its own set of problems, especially from copying the depth buffer to a different size.  Any other ideas?

Screenshot at 1024x512:

https://www.dropbox.com/s/7qfs04tvx8vpf09/screenshot_6.png


Edited by WFP

##### Share on other sites

OK, so I've found one way to address the power-of-two issue, but I'm still very open to suggestions if anyone has any.  I've updated my getMinimumDepthPlane method (below) to always sample level 0.0f (the full-resolution mip) at the texture coordinates provided.  This does seem to fix the stair-like artifacts, but it doesn't sit well with me: if this were the correct solution, why would the author have passed in level and rootLevel in the first place?  Anyway, the ray-tracing steps work fairly well now (I still need to address the other artifacts using the min/max values) at any resolution, and stepping through in the VS2013 graphics debugger shows that it converges on a solution (for most pixels tested) in about 13 iterations - far fewer than my 64-iteration limit.

```hlsl
float getMinimumDepthPlane(float2 ray, float level, float rootLevel)
{
    // not sure why we need level and rootLevel for this - for textures
    // that are non-power-of-two, always sampling 0.0f works better
    return hiZBuffer.SampleLevel(sampPointClamp, ray.xy, 0.0f).r;
}
```


Screenshot at 1536x864:

https://www.dropbox.com/s/eo2wkiz87bswgz5/screenshot_7.png

##### Share on other sites

You might have some issues with that once your scene has a bit more depth complexity.  You are basically ignoring most of the depth values in a cell, which, at a coarser mip level, could be quite a lot of them.

##### Share on other sites
Yep, and that's exactly why it sits so uneasily with me. I just haven't been able to get rid of those stepping artifacts any other way yet. I'll keep at it and see if anything else gets rid of them.

##### Share on other sites

Just wanted to check in on this thread.  I still haven't gotten anywhere removing the stair-like artifacts without forcing the mip level to 0 (which we know is wrong).  I tried using a trilinear sampler instead of the point sampler, but as I expected, all that did was turn the stair artifacts into slopes - they still noticeably exist.

@jgrenier Have you had any time to port your code to HLSL and if so have you had any luck with it or experienced artifacts similar to what I'm seeing?

@Bruzer100 Could you tell us about the samplers you used during the different steps of building out your hi-z, convolution, and integration buffers?  I'm using a point sampler for everything but the cone-tracing step (which in my code is currently disabled), and am wondering if perhaps I'm using an incorrect addressing mode or border mode (I currently use clamped borders).

Thanks,

WFP

##### Share on other sites

Quick question regarding the visibility buffer: isn't there a "- minZ" missing on page 173? If this is supposed to be the percentage of empty volume in a cell, it doesn't make sense to me that we do the integration with the fine values directly, i.e. the integration should be:

```hlsl
float4 integration = (fineZ.xyzw - minZ) * abs(coarseVolume) * visibility.xyzw;
```

Or am I missing something?

##### Share on other sites

Even figure 4.9 (page 159) doesn't really make sense to me either. To me it looks like MIP-1 should have visibilities calculated as [25%, 100%] (since 1/4 of the volume over the first two MIP-0 cells is empty). I feel like I'm missing something here :(

##### Share on other sites

Hey Jp,

Regarding your first question - honestly, I'm not sure.  I seem to get "better" results when I use the code presented in the book for the visibility pass (although I do include a divide-by-zero check on that first division).  That said, I currently have the cone-tracing part of the technique disabled, as mentioned in one of my comments above, since I'm still a ways from figuring out the ray-marching part.  When I do enable it, the results aren't even to the point where I think it would be useful to post an image of them, so there's a lot of work left to do - but I haven't been spending much time or energy on it because of the issues I've been having getting the ray marching to work.

As for your second question, I think the book's diagram is correct, if a little confusing to look at.  The first four bars represent mip level 0 and all have 100% visibility.  The next two bars, the grey and the white, represent mip level 1, which is obtained by accounting for the two nearest bars - halving the resolution of the first mip level (in the actual implementation this is four values instead of the two shown in the book).  This gives the 50% and 100% values as shown.  Along the same lines, the final blackish bar is the combination of the mip level 1 values into mip level 2.  When going down a level in the visibility buffer mip chain, a value can stay the same as or decrease from the value before it, but never increase.

If I've missed something or misunderstood your question, let me know and I'll try to update my explanation. :)

-WFP

##### Share on other sites

If we adjust the algorithm described on page 173 to work with the 2D case, here's what I get for the first cell of MIP-1:

```
fineZ.x = 100
fineZ.y = 50
minZ = 50
maxZ = 100
coarseVolume = 1 / (100 - 50) = 1/50
visibility.x = 1
visibility.y = 1
integration = [100, 50] * 1/50 * [1, 1] = [100/50, 50/50] = [2, 1]
coarseIntegration = dot(0.5, [2, 1]) = 1.5
```

Which is not a percentage of visibility but a ratio between the fineZ values and the (maxZ - minZ) difference.

Thinking about it, if the output is really meant to be "the percentage of empty voxel volume relative to the total volume of the cells", then (I think) we should calculate the integration value as:

```hlsl
float4 integration = (fineZ.xyzw / maxZ) * visibility.xyzw;
```

That way, the 2D example would give:

```
fineZ.x = 100
fineZ.y = 50
maxZ = 100
visibility.x = 1
visibility.y = 1
integration = [100, 50] / 100 * [1, 1] = [100/100, 50/100] = [1, 0.5]
coarseIntegration = dot(0.5, [1, 0.5]) = 0.75
```

Which seems correct, i.e. 25% of the cell is occluded. Then I get the following results for the following cases:

Does this make sense?

Edited by jgrenier

##### Share on other sites

Hey Jp,

Thanks for the explanation - seeing it drawn out clarified a lot of what you were getting at.  I definitely think you're onto something, and your output looks in line with what I would expect from the visibility buffer.  Whenever I get to the cone-tracing step, you can bet I'll try out what you've got above and see where that puts things.  Thanks for the update!

-WFP

##### Share on other sites

Hi everyone. I'm now stuck with the cone pass.

From what I understand, the cone tracing consists of sampling circles along the cone (from the reflection point to the reflection incident point, i.e. from the big circles to the small circles) and approximating the integral by weighting each sample appropriately. The article says something like "we intersect each sphere with the hi-z structure and weight them by finding out how much they are in front of, in the middle of, or behind the coarse depth cells"... I've tried, and got rubbish:

I've tried both of the following ways to weight in the cone's spheres and got equally crappy results (min/max represent the values stored in the hi-z structure at a particular mip level):

I would think that the sphere intersection test with the hi-z structure needs to be done with the linear min and max depth values. One thing that confuses me is that in the article the weighting function only takes the 2D position of the current sphere center (as well as the mip_level for that 2D pos). How is the width of the _sphere_ calculated then?!? I would have thought you'd need to project the current circle into view space to then do a sphere intersection test against the coarse depth cells.

That part really confuses me.
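The closest I've gotten is computing the circle radius entirely in screen space by treating the cone as an isosceles triangle - but this is my own guess, not necessarily what the author does:

```hlsl
// My own guess, not the author's code: treat the cone as an isosceles
// triangle whose apex sits at the reflecting pixel, with half-angle
// coneTheta derived from the specular lobe / roughness.

// length of the triangle's base at a given distance along the ray
float isoscelesTriangleOpposite(float adjacentLength, float coneTheta)
{
    return 2.0f * tan(coneTheta) * adjacentLength;
}

// radius of the circle inscribed at the far end of the triangle, which
// would serve as the sample radius (and pick the blur mip level)
float isoscelesTriangleInRadius(float base, float height)
{
    return (base * (sqrt(base * base + 4.0f * height * height) - base))
           / (4.0f * height);
}
```

With something like this, the sphere width falls out of the march distance and the cone angle alone, so the weighting function wouldn't need the projected view-space circle - but again, that's speculation on my part.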

Here's an example of the case that needs to be solved. (note: you can see the color values that will be fetched by each sphere, the bigger the sphere, the blurrier the fetch). This is the case where a reflection approaches an edge. On the top image the ray hits the sphere, and on the bottom image, the ray hits the background and all of the contribution comes from the edge sphere (so the reflection result is the blurred sky). The question is how should the weight of each of those spheres be set so that there is a smooth transition between the left and the right reflection?

By hardcoding the weight of the first sphere to 1, we get a hint of what kind of results we could expect (minus the crappy hard edges):

Any thoughts? What am I not getting? Grrrr.

Jp

##### Share on other sites

So I've finally figured out what was causing the stair artifacts we were seeing when running at anything but power-of-two texture sizes.  In the article, the author uses offsets of -1 when obtaining the other three points for comparison, but it turns out that, at least on my NVIDIA card (760 GTX), the opposite is needed.  Using offsets in the positive direction (see below) alleviated the stair artifacts.  There seems to be an implementation difference in how ATI and NVIDIA cards handle this, because the code worked with -1 offsets on the ATI card it was tested on.  I still need to follow up to make sure changing the sign to positive doesn't break the technique on those cards, but at the very least we have an answer for what was causing it. :)  I've posted the modified hi-z buffer construction pixel shader I use below, as well as a screenshot at 1536x864 with no stair artifacts showing up.  Next steps are filtering this buffer to fill in the tiny artifacts/gaps that show up (as well as temporal stability, etc., eventually) and then applying the cone-tracing step, which Jp is doing some great work on. :)

-WFP

HiZ_PS.hlsl:

```hlsl
struct VertexOut
{
    float4 posH : SV_POSITION;
    float2 tex : TEXCOORD;
};

SamplerState sampPointClamp : register(s0);

Texture2D hiZBuffer : register(t0);

float2 main(VertexOut pIn) : SV_TARGET
{
    float2 texcoords = pIn.tex;
    float4 minDepth = 0.0f;
    float4 maxDepth = 0.0f;

    // sample level zero since only one mip level is available with the bound SRV
    float2 tx = hiZBuffer.SampleLevel(sampPointClamp, texcoords, 0.0f, int2(0, 0)).rg;
    minDepth.r = tx.r;
    maxDepth.r = tx.g;

    float2 ty = hiZBuffer.SampleLevel(sampPointClamp, texcoords, 0.0f, int2(0, 1)).rg;
    minDepth.g = ty.r;
    maxDepth.g = ty.g;

    float2 tz = hiZBuffer.SampleLevel(sampPointClamp, texcoords, 0.0f, int2(1, 0)).rg;
    minDepth.b = tz.r;
    maxDepth.b = tz.g;

    float2 tw = hiZBuffer.SampleLevel(sampPointClamp, texcoords, 0.0f, int2(1, 1)).rg;
    minDepth.a = tw.r;
    maxDepth.a = tw.g;

    return float2(
        min(min(minDepth.r, minDepth.g), min(minDepth.b, minDepth.a)),
        max(max(maxDepth.r, maxDepth.g), max(maxDepth.b, maxDepth.a)));
}
```


##### Share on other sites

> Thinking about it, if the output is really meant to be "the percentage of empty voxel volume relative to the total volume of the cells", then (I think) we should calculate the integration value as:

Reading pages 172-173, I think visibility is supposed to be "the percentage of empty space within the minimum and maximum of a depth cell", modulated by the visibility of the previous mip.

So I also think there is an error in the pre-integration pass, and the correct code would be:

```hlsl
float4 integration = (fineZ.xyzw - minZ) * abs(coarseVolume) * visibility.xyzw;
```

This makes MIP 1 in the page 159 diagram correct, but I still have no idea how the 37.5% visibility on MIP 2 was calculated.

Can one of you try the line of code above in your implementation and see how it looks? I haven't had time to implement the article myself.

Btw, has anyone tried to contact the article author about the source code? I wasn't able to find it anywhere.

Edited by TiagoCosta
