Help with GPU Pro 5 Hi-Z Screen Space Reflections

49 replies to this topic

#1 Bruzer100   Members   -  Reputation: 164


Posted 12 July 2014 - 07:54 AM

Hi there!  I'm trying to implement chapter 4 of the Lighting and Shading section in GPU Pro 5.  Basically, how to optimize my screen space reflections using a mip-mapped Z buffer to quickly converge on the intersection point of my reflection ray.


Sadly, the author wasn't allowed to release the code/demo he talks about in the article, so I've had to work out most of the shader myself.  I'm close, but there's one thing I don't get: if you always start at a lower (coarser) mip in the HiZ buffer, won't your starting ray depth often (always?) be _behind_ (greater Z than) what you read from the HiZ buffer?  After all, you build the HiZ buffer by taking the min(...) of the more detailed mip.
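To make the question concrete, here is a minimal CPU-side sketch in Python of the min reduction described above. It is not the book's code: `build_min_pyramid` is a hypothetical name, and the depth grid is assumed to be square with power-of-two dimensions.

```python
def build_min_pyramid(depth):
    """Build a hi-Z chain from a square power-of-two depth grid (list of rows).

    Level 0 is the full-resolution buffer; each coarser texel stores the
    minimum of the 2x2 block beneath it.
    """
    levels = [depth]
    while len(levels[-1]) > 1:
        fine = levels[-1]
        half = len(fine) // 2
        coarse = [
            [
                min(fine[2 * y][2 * x], fine[2 * y][2 * x + 1],
                    fine[2 * y + 1][2 * x], fine[2 * y + 1][2 * x + 1])
                for x in range(half)
            ]
            for y in range(half)
        ]
        levels.append(coarse)
    return levels


# Illustrative depth values only (0 = near, 1 = far).
depth = [
    [0.9, 0.8, 0.7, 0.7],
    [0.9, 0.2, 0.7, 0.6],
    [0.5, 0.5, 0.4, 0.4],
    [0.5, 0.5, 0.4, 0.3],
]
pyramid = build_min_pyramid(depth)
```

With this construction a coarse texel is never greater than any fine depth it covers, so a ray that starts on a surface has a depth greater than or equal to the coarse minimum of its own cell, which is consistent with the observation in the question.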


If you've implemented that chapter, or even just read and understood it, or have used HiZ tracing before, I'd be curious to hear your thoughts.




#2 WFP   Members   -  Reputation: 687


Posted 12 July 2014 - 01:19 PM

Hi Bruzer100,

I was also preparing to start implementing something similar, and was disappointed to find that the code was unavailable, especially since several places in the chapter specifically tell the reader to consult the source.  I'm sure that when I start my implementation in the next few days I'll run into issues similar to the ones you've come across, so if you wouldn't mind sharing what hurdles you've had to work around that weren't called out, or even want to share your implementation, it would be highly appreciated.  I'll bookmark this thread so that when I do start my implementation I can add anything I find to be valuable, especially if it was left out of the book's chapter.



#3 jgrenier   Members   -  Reputation: 246


Posted 11 August 2014 - 07:00 PM

Hi guys,


Same boat as you. Are you doing the ray marching in screen or view space?

On page 174, it's not clear to me how the function intersectDepthPlane should work:

float3 o = intersectDepthPlane(p.xy, d.xy, -p.z);

Is this a typo or am I missing something? Should it be 'p.xyz' since it should be re-projecting the point onto the near plane (o.z = 0)? The method can't assume the point returned is always at z=0 since it's also used to calculate the tmpRay position during the ray march (which needs to keep track of the .z component).


I would expect the method to look like this: (?)

float3 intersectDepthPlane(float3 p, float2 d, float z)
{
   return p + float3(d, 1) * z;
}


I'm not sure how intersectCellBoundary works with crossStep and crossOffset either... (why are these two helper variables needed? Why saturate the cross direction?)
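One plausible reading of those two helpers, sketched in Python on the CPU: the saturated step selects which edge of the current cell to intersect (index + 1 when moving in the positive direction, index + 0 when moving in the negative direction), and the signed offset nudges the hit slightly past that edge so the next cell lookup lands in the neighbouring cell. This is an interpretation, not the book's source; the function names, the 4x4 grid and the epsilon are assumptions for illustration, and the ray here is 2D only.

```python
import math


def get_cell(pos, cell_count):
    """Integer cell index of a [0, 1] texture-space position."""
    return (math.floor(pos[0] * cell_count[0]),
            math.floor(pos[1] * cell_count[1]))


def cross_terms(direction, epsilon):
    """Return (cross_step, cross_offset) for a 2D ray direction."""
    sign = tuple(1.0 if c >= 0.0 else -1.0 for c in direction)
    cross_offset = (sign[0] * epsilon[0], sign[1] * epsilon[1])
    # saturate(): +1 stays 1 (far edge of the cell), -1 becomes 0 (near edge)
    cross_step = tuple(max(0.0, s) for s in sign)
    return cross_step, cross_offset


def intersect_cell_boundary(o, d, cell, cell_count, cross_step, cross_offset):
    """Advance the 2D ray o + d * t to just past the nearest cell edge."""
    planes = [(cell[i] + cross_step[i]) / cell_count[i] + cross_offset[i]
              for i in range(2)]
    ts = [(planes[i] - o[i]) / d[i] if d[i] != 0.0 else math.inf
          for i in range(2)]
    t = min(ts)
    return (o[0] + d[0] * t, o[1] + d[1] * t)


cell_count = (4, 4)
origin = (0.30, 0.30)
eps = (0.001, 0.001)

step, offset = cross_terms((1.0, 0.5), eps)
hit_pos = intersect_cell_boundary(origin, (1.0, 0.5),
                                  get_cell(origin, cell_count),
                                  cell_count, step, offset)

step, offset = cross_terms((-1.0, -0.5), eps)
hit_neg = intersect_cell_boundary(origin, (-1.0, -0.5),
                                  get_cell(origin, cell_count),
                                  cell_count, step, offset)
```

Starting in cell (1, 1), the positive-direction ray ends up in cell (2, 1) and the negative-direction ray in cell (0, 1). Without the offset the hit would sit exactly on the shared edge, and rounding could leave it in the old cell.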




#4 WFP   Members   -  Reputation: 687


Posted 12 August 2014 - 03:33 PM

Hi Jp,


Bruzer and I have spoken a few times since this topic started originally and he has been great in helping me almost figure this thing out.  For his sanity and for the good of the larger audience, it's probably best for us to bring the conversation back to this thread, though, so I'll post below what I've worked out so far with his help.  Also, the chapter in GPU Pro 1 by Michal Drobot on Quadtree Displacement Mapping is a big help in understanding this, and is what the author of this article based his ray-tracing steps on.  I still have some very major issues in my implementation (screenshots below), so I'm hoping that anyone reading over this may be able to help out and call me out on things I've done in a bone-headed way.


You'll notice in my implementation that some of the method arguments are a little different from what's in the book.  For example, I pass the full float3 vectors to intersectDepthPlane and some other methods.
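For reference, the depth parameterisation behind intersectDepthPlane can be sketched on the CPU in Python. This assumes the setup described in the chapter's formulas (D = V / V.z and O = P - D * P.z); the function names and sample values are made up for the example.

```python
def make_depth_parameterised_ray(p, v):
    """p = (x, y, depth) in screen space, v = screen-space reflection vector.

    Returns (o, d) such that o + d * t has depth exactly t.
    """
    if v[2] == 0.0:
        raise ValueError("ray is parallel to the depth planes")
    d = tuple(c / v[2] for c in v)                      # d.z == 1
    o = tuple(pc - dc * p[2] for pc, dc in zip(p, d))   # o.z == 0
    return o, d


def intersect_depth_plane(o, d, t):
    """Point on the ray at depth t (same form as o + d * t in the shader)."""
    return tuple(oc + dc * t for oc, dc in zip(o, d))


p = (0.5, 0.5, 0.25)
v = (0.1, -0.2, 0.5)
o, d = make_depth_parameterised_ray(p, v)
back_at_p = intersect_depth_plane(o, d, p[2])
at_depth = intersect_depth_plane(o, d, 0.75)
```

Because d.z is 1 and o.z is 0, the ray parameter t is the depth itself, which is why a value read from the hi-z buffer can be passed straight in as t. The sketch also shows the weak spot: a reflection vector with v.z equal to zero cannot be parameterised this way, and one with negative v.z flips the direction of d.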


Also, I've done some preliminary testing on doing a small (8 or so iterations) linear ray march before doing the hi-z traversal in order to reduce artifacts of immediate intersections and found that it did help, but due to the current state of my shader, I pulled those back out until the basic stuff was working.


I hope this helps, and again, please call out any blatant errors you see in my current implementation attempt, as they clearly exist.



This is the pixel shader in its current state.  Notice that currently I'm still trying to get the ray-tracing through the hi-z buffer part working, so I'm overwriting the cone-tracing output to be the equivalent to a cone angle of 0 (i.e., a perfectly smooth/mirror surface).

#include "HiZSSRConstantBuffer.hlsli"
#include "../../LightingModel/PBL/LightUtils.hlsli"
#include "../../ConstantBuffers/PerFrame.hlsli"
#include "../../ShaderConstants.hlsli"

struct VertexOut
{
	float4 posH : SV_POSITION;
	float3 viewRay : VIEWRAY;
	float2 tex : TEXCOORD;
};

SamplerState sampPointClamp : register(s0); // point sampling, clamped borders
SamplerState sampTrilinearClamp : register(s1); // trilinear sampling, clamped borders

Texture2D hiZBuffer : register(t0); // hi-z buffer - all mip levels
Texture2D visibilityBuffer : register(t1); // visibility buffer - all mip levels
Texture2D colorBuffer : register(t2); // convolved color buffer - all mip levels
Texture2D normalBuffer : register(t3); // normal buffer - from g-buffer
Texture2D specularBuffer : register(t4); // specular buffer - from g-buffer (rgb = ior, a = roughness)

static const float HIZ_START_LEVEL = 2.0f;
static const float HIZ_STOP_LEVEL = 2.0f;
static const float HIZ_MAX_LEVEL = float(cb_mipCount);
static const float2 HIZ_CROSS_EPSILON = float2(texelWidth, texelHeight); // maybe need to be smaller or larger? this is mip level 0 texel size
static const uint MAX_ITERATIONS = 64u;

float linearizeDepth(float depth)
{
	return projectionB / (depth - projectionA);
}

// Hi-Z ray tracing methods

static const float2 hiZSize = cb_screenSize; // not sure if correct - this is mip level 0 size

float3 intersectDepthPlane(float3 o, float3 d, float t)
{
	return o + d * t;
}

float2 getCell(float2 ray, float2 cellCount)
{
	// does this need to be floor, or does it need fractional part - i think cells are meant to be whole pixel values (integer values) but not sure
	return floor(ray * cellCount);
}

float3 intersectCellBoundary(float3 o, float3 d, float2 cellIndex, float2 cellCount, float2 crossStep, float2 crossOffset)
{
	float2 index = cellIndex + crossStep;
	index /= cellCount;
	index += crossOffset;
	float2 delta = index - o.xy;
	delta /= d.xy;
	float t = min(delta.x, delta.y);
	return intersectDepthPlane(o, d, t);
}

float getMinimumDepthPlane(float2 ray, float level, float rootLevel)
{
	// not sure why we need rootLevel for this
	return hiZBuffer.SampleLevel(sampPointClamp, ray.xy, level).r;
}

float2 getCellCount(float level, float rootLevel)
{
	// not sure why we need rootLevel for this
	float2 div = level == 0.0f ? 1.0f : exp2(level);
	return cb_screenSize / div;
}

bool crossedCellBoundary(float2 cellIdxOne, float2 cellIdxTwo)
{
	return cellIdxOne.x != cellIdxTwo.x || cellIdxOne.y != cellIdxTwo.y;
}

float3 hiZTrace(float3 p, float3 v)
{
	const float rootLevel = float(cb_mipCount) - 1.0f; // convert to 0-based indexing
	float level = HIZ_START_LEVEL;

	uint iterations = 0u;

	// get the cell cross direction and a small offset to enter the next cell when doing cell crossing
	float2 crossStep = float2(v.x >= 0.0f ? 1.0f : -1.0f, v.y >= 0.0f ? 1.0f : -1.0f);
	float2 crossOffset = float2(crossStep.xy * HIZ_CROSS_EPSILON.xy);
	crossStep.xy = saturate(crossStep.xy);

	// set current ray to original screen coordinate and depth
	float3 ray = p.xyz;

	// scale vector such that z is 1.0f (maximum depth)
	float3 d = v.xyz / v.z;

	// set starting point to the point where z equals 0.0f (minimum depth)
	float3 o = intersectDepthPlane(p, d, -p.z);

	// cross to next cell to avoid immediate self-intersection
	float2 rayCell = getCell(ray.xy, hiZSize.xy);
	ray = intersectCellBoundary(o, d, rayCell.xy, hiZSize.xy, crossStep.xy, crossOffset.xy);

	while(level >= HIZ_STOP_LEVEL && iterations < MAX_ITERATIONS)
	{
		// get the minimum depth plane in which the current ray resides
		float minZ = getMinimumDepthPlane(ray.xy, level, rootLevel);
		// get the cell number of the current ray
		const float2 cellCount = getCellCount(level, rootLevel);
		const float2 oldCellIdx = getCell(ray.xy, cellCount);

		// intersect only if ray depth is below the minimum depth plane
		float3 tmpRay = intersectDepthPlane(o, d, max(ray.z, minZ));

		// get the new cell number as well
		const float2 newCellIdx = getCell(tmpRay.xy, cellCount);

		// if the new cell number is different from the old cell number, a cell was crossed
		if(crossedCellBoundary(oldCellIdx, newCellIdx))
		{
			// intersect the boundary of that cell instead, and go up a level for taking a larger step next iteration
			tmpRay = intersectCellBoundary(o, d, oldCellIdx, cellCount.xy, crossStep.xy, crossOffset.xy); //// NOTE added .xy to o and d arguments
			level = min(HIZ_MAX_LEVEL, level + 2.0f);
		}

		ray.xyz = tmpRay.xyz;

		// go down a level in the hi-z buffer
		--level;

		++iterations;
	}

	return ray;
}

// Hi-Z cone tracing methods

float specularPowerToConeAngle(float specularPower)
{
	// based on phong reflection model
	const float xi = 0.244f;
	float exponent = 1.0f / (specularPower + 1.0f);
	/*
	 * may need to try clamping very high exponents to 0.0f, test out on mirror surfaces first to gauge
	 * return specularPower >= 8192 ? 0.0f : cos(pow(xi, exponent));
	 */
	return cos(pow(xi, exponent));
}

float isoscelesTriangleOpposite(float adjacentLength, float coneTheta)
{
	// simple trig and algebra - soh, cah, toa - tan(theta) = opp/adj, opp = tan(theta) * adj, then multiply * 2.0f for isosceles triangle base
	return 2.0f * tan(coneTheta) * adjacentLength;
}

float isoscelesTriangleInRadius(float a, float h)
{
	float a2 = a * a;
	float fh2 = 4.0f * h * h;
	return (a * (sqrt(a2 + fh2) - a)) / (4.0f * max(h, 0.00001f));
}

float4 coneSampleWeightedColor(float2 samplePos, float mipChannel)
{
	// placeholder - this is just to get something on screen
	float3 sampleColor = colorBuffer.SampleLevel(sampTrilinearClamp, samplePos, mipChannel).rgb;
	float visibility = visibilityBuffer.SampleLevel(sampTrilinearClamp, samplePos, mipChannel).r;

	return float4(sampleColor * visibility, visibility);
}

float isoscelesTriangleNextAdjacent(float adjacentLength, float incircleRadius)
{
	// subtract the diameter of the incircle to get the adjacent side of the next level on the cone
	return adjacentLength - (incircleRadius * 2.0f);
}


float4 main(VertexOut pIn) : SV_TARGET
{
	/*
	 * Ray(t) = O + D> * t
	 * D> = V>SS / V>SSz
	 * O = PSS + D> * -PSSz
	 * V>SS = P'SS - PSS
	 * PSS = {texcoord.x, texcoord.y, depth} // screen/texture coordinate and depth
	 * PCS = (PVS + reflect(V>VS, N>VS)) * MPROJ
	 * P'SS = (PCS / PCSw) * [0.5f, -0.5f] + [0.5f, 0.5f]
	 */
	int3 loadIndices = int3(pIn.posH.xy, 0);
	float depth = hiZBuffer.Load(loadIndices).r;
	// PSS
	float3 positionSS = float3(pIn.tex, depth);
	float linearDepth = linearizeDepth(depth);
	// PVS
	float3 positionVS = pIn.viewRay * linearDepth;

	// V>VS - since calculations are in view-space, we can just normalize the position to point at it
	float3 toPositionVS = normalize(positionVS);
	// N>VS
	float3 normalVS = normalBuffer.Load(loadIndices).rgb;
	if(dot(normalVS, float3(1.0f, 1.0f, 1.0f)) == 0.0f)
	{
		return float4(0.0f, 0.0f, 0.0f, 0.0f);
	}
	float3 reflectVS = reflect(toPositionVS, normalVS);
	float4 positionPrimeSS4 = mul(float4(positionVS + reflectVS, 1.0f), projectionMatrix);
	float3 positionPrimeSS = (positionPrimeSS4.xyz / positionPrimeSS4.w);
	positionPrimeSS.x = positionPrimeSS.x * 0.5f + 0.5f;
	positionPrimeSS.y = positionPrimeSS.y * -0.5f + 0.5f;

	// V>SS - screen space reflection vector
	float3 reflectSS = positionPrimeSS - positionSS;

	// calculate the ray
	float3 raySS = hiZTrace(positionSS, reflectSS);

	// perform cone-tracing steps

	// get specular power from roughness
	float4 specularAll = specularBuffer.Load(loadIndices);
	float specularPower = roughnessToSpecularPower(specularAll.a);

	// convert to cone angle (maximum extent of the specular lobe aperture)
	float coneTheta = specularPowerToConeAngle(specularPower);

	// P1 = positionSS, P2 = raySS, adjacent length = ||P2 - P1||
	// need to check if this is correct calculation or not
	float2 deltaP = raySS.xy - positionSS.xy;
	float adjacentLength = length(deltaP);
	// need to check if this is correct calculation or not
	float2 adjacentUnit = normalize(deltaP);

	float4 totalColor = float4(0.0f, 0.0f, 0.0f, 0.0f);

	// cone-tracing using an isosceles triangle to approximate a cone in screen space
	for(int i = 0; i < 7; ++i)
	{
		// intersection length is the adjacent side, get the opposite side using trig
		float oppositeLength = isoscelesTriangleOpposite(adjacentLength, coneTheta);

		// calculate in-radius of the isosceles triangle
		float incircleSize = isoscelesTriangleInRadius(adjacentLength, oppositeLength);

		// get the sample position in screen space
		float2 samplePos = pIn.tex.xy + adjacentUnit * (adjacentLength - incircleSize);

		// convert the in-radius into screen size then check what power N to raise 2 to reach it - that power N becomes mip level to sample from
		float mipChannel = log2(incircleSize * max(cb_screenSize.x, cb_screenSize.y)); // try this with min instead of max

		/*
		 * Read color and accumulate it using trilinear filtering and weight it.
		 * Uses pre-convolved image (color buffer), pre-integrated transparency (visibility buffer),
		 * and hi-z buffer (hiZBuffer).
		 * Checks if cone sphere is below, between, or above the hi-z minimum and maximum and weights
		 * it together with transparency (visibility).
		 * Visibility is accumulated in the alpha channel.  Break if visibility is 100% or greater (>= 1.0f).
		 */
		totalColor += coneSampleWeightedColor(samplePos, mipChannel);
		if(totalColor.a >= 1.0f)
		{
			break;
		}

		adjacentLength = isoscelesTriangleNextAdjacent(adjacentLength, incircleSize);
	}

	// fake implementation while testing - overwrites entire cone tracing loop - equivalent of cone angle being 0.0f
	totalColor.rgb = colorBuffer.SampleLevel(sampPointClamp, raySS.xy, 0.0f).rgb;
	// end fake

	float3 toEye = -toPositionVS;
	// test this with saturate instead of abs, too - see which gives best result
	float3 specular = calculateFresnelTerm(specularAll.rgb, abs(dot(normalVS, toEye))) * RB_1DIVPI;

	return float4(totalColor.rgb * specular, 1.0f);
}

(EDIT: screenshots didn't show up so linking to Dropbox images instead)
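As a sanity check on the cone-tracing helpers, the in-radius formula used by isoscelesTriangleInRadius can be compared against the textbook area / semi-perimeter definition. The Python sketch below assumes `a` is the base of the isosceles triangle and `h` its height; the function names are made up for the comparison.

```python
import math


def isosceles_in_radius(base, height):
    """Same closed form as isoscelesTriangleInRadius(a, h) in the listing."""
    return (base * (math.sqrt(base * base + 4.0 * height * height) - base)
            / (4.0 * max(height, 1e-5)))


def in_radius_from_area(base, height):
    """Reference value: triangle area divided by its semi-perimeter."""
    side = math.hypot(base / 2.0, height)
    area = 0.5 * base * height
    semi_perimeter = 0.5 * (base + 2.0 * side)
    return area / semi_perimeter


closed_form = isosceles_in_radius(6.0, 4.0)
reference = in_radius_from_area(6.0, 4.0)
swapped = isosceles_in_radius(4.0, 6.0)
```

For base 6 and height 4 both functions give 1.5, so the closed form holds under that reading of the parameters. Swapping the arguments gives a different radius (about 1.44 here), so if `a` really is the base, the call isoscelesTriangleInRadius(adjacentLength, oppositeLength) in the listing passes the height first and may be worth double-checking.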






Edited by WFP, 12 August 2014 - 03:40 PM.

#5 jgrenier   Members   -  Reputation: 246


Posted 13 August 2014 - 07:52 AM



Thanks that is fantastic. My favorite comment is "// not sure why we need rootLevel for this" (since I have the exact same comment in my own code) :D


On my side, I've finished writing all of the subroutines missing from the chapter. I wrote them in MEL first to see if they worked in Maya, and will port them to HLSL this afternoon to test with the shader. I'm also focusing on the hi-z ray marching as a starting point, and I'm hoping the cone tracing passes will be simpler. Question: do you know what the author means by "The final demo uses minimum-maximum tracing which is a bit more complicated"? I'm not sure what he means by "maximum" tracing... When would we ever need the maximum depth value of a cell, since we can only march as far as the cell's boundary anyway? I'm scratching my head over this one : ) I'm only storing the minimum z value for now.


Here are the MEL procedures I've written so far (they seemed to give valid cell intersections in Maya):


Thanks again and will keep in touch!



proc float[] intersectDepthPlane(float $p[], float $d[], float $t)
{
    float $x = $p[0] + $d[0] * $t;
    float $y = $p[1] + $d[1] * $t;
    return {$x, $y};
}
proc float[] getCell(float $pos[], float $cellCount[])
{
    float $cellX = clamp(0, $cellCount[0] - 0.0001, $pos[0] * $cellCount[0]);
    float $cellY = clamp(0, $cellCount[1] - 0.0001, $pos[1] * $cellCount[1]);
    return {floor($cellX), floor($cellY)};
}
proc float[] intersectCellBoundary(float $pos[], float $dir[], float $cellId[], float $cellCount[], float $crossStep[], float $crossOffset[])
{
    float $cellWidth = 1.0/$cellCount[0];
    float $cellHeight = 1.0/$cellCount[1];
    float $xPlane = $cellId[0]/($cellCount[0]) + $cellWidth * $crossStep[0];
    float $yPlane = $cellId[1]/($cellCount[1]) + $cellHeight* $crossStep[1];
    float $tx = ($xPlane - $pos[0])/$dir[0];
    float $ty = ($yPlane - $pos[1])/$dir[1];
    float $t  = min($tx, $ty);
    float $intersection[] = intersectDepthPlane($pos, $dir, $t);
    return $intersection;
}
// Set the count info
float $cellCount[2] = {12,12};
// Get the origin info
float $ox = `getAttr o.translateX`;
float $oy = `getAttr o.translateY`;
float $dx = `getAttr d.translateX`;
float $dy = `getAttr d.translateY`;
// Get the direction info
float $ray[2] = {$dx, $dy};
float $d[2] = {$ray[0] - $ox, $ray[1] - $oy};
float $dl = sqrt($d[0] * $d[0] + $d[1] * $d[1]);
float $o[2] = {$ox, $oy};
float $d[2] = {$d[0]/$dl, $d[1]/$dl};
// Get the cross info
float $crossStep[2] = {1,1};
if($d[0] < 0)
    $crossStep[0] = -1; 
if($d[1] < 0)
    $crossStep[1] = -1;
float $eps = 0.0001;
float $crossOffset[2] = {$crossStep[0] * $eps, $crossStep[1] * $eps}; 
$crossStep[0] = clamp(0, 1, $crossStep[0]);
$crossStep[1] = clamp(0, 1, $crossStep[1]);
float $cellId[2] = getCell($ray, $cellCount);
print($cellId[0] + ", " + $cellId[1] + "\n");
$ray = intersectCellBoundary($o, $d, $cellId, $cellCount, $crossStep, $crossOffset);
// Display
catchQuiet(`delete intersection_ray_curve`);
catchQuiet(`curve -d 3 -p $o[0] $o[1] 0 -p $ray[0] $ray[1] 0 -k 0 -k 0 -k 0 -k 0 -name "intersection_ray_curve"`);
parent -r intersection_ray_curve directX;
select o;

#6 jgrenier   Members   -  Reputation: 246


Posted 13 August 2014 - 07:57 AM


static const float2 hiZSize = cb_screenSize; // not sure if correct - this is mip level 0 size

I was also wondering about this. To me it would make sense that this should be the size of the mip_level we are starting the ray march from. Since it is used to do the first cell boundary test, but the name of the variable seems to imply otherwise :) Not sure either. Feels great to have other people to chat about this :D
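To put numbers on that idea, a small Python sketch of the cell grid at different mip levels, using the same screenSize / exp2(level) rule as getCellCount in the shader. The 1024x512 resolution and the sample position are arbitrary, and the function names are made up for illustration.

```python
import math


def get_cell_count(screen_size, level):
    """Number of cells per axis at a mip level (screen size / 2^level)."""
    scale = 2.0 ** level
    return (screen_size[0] / scale, screen_size[1] / scale)


def get_cell(pos, cell_count):
    """Integer cell index of a [0, 1] texture-space position."""
    return (math.floor(pos[0] * cell_count[0]),
            math.floor(pos[1] * cell_count[1]))


screen = (1024, 512)
pos = (0.3, 0.3)

count0 = get_cell_count(screen, 0)   # grid used if hiZSize is the level 0 size
count2 = get_cell_count(screen, 2)   # grid at a start level of 2

cell0 = get_cell(pos, count0)
cell2 = get_cell(pos, count2)
```

At level 2 each cell is four times wider and taller than at level 0, so crossing one level 2 cell moves the ray up to four level 0 texels away from its starting point, whereas crossing one level 0 cell moves it at most one texel.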


#7 WFP   Members   -  Reputation: 687


Posted 13 August 2014 - 08:16 AM

Hey Jp,

Great to see you're making some good headway on this. I'm looking forward to seeing how your translation to HLSL works out.

I realize I forgot to answer a question of yours yesterday, but I think you figured out the answer anyway - I am doing the ray marching in screen space.

Regarding the min-max tracing, what the author means is that when you're creating your hi-z buffer, you save not only the minimum depth value [min(min(value.x, value.y), min(value.z, value.w))], you also store the maximum value as well [max(max(value.x, value.y), max(value.z, value.w))]. What this gives you is a better estimation of the depth of the object at the pixel you're currently processing. You can use this in the ray-tracing pass to walk behind an object - if the current ray depth (from O + D * t) is outside the range of [min, max] at that pixel, you know it's not intersecting and can continue marching along that ray without further processing at the current position. I do not have this in my implementation yet, as I'm just trying to get the basics working first.
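A minimal sketch of that min/max reduction in Python, assuming a square power-of-two grid and storing each texel as a (min, max) pair. The function name and the depth values are illustrative only.

```python
def reduce_min_max(fine):
    """Reduce a square power-of-two grid of (min, max) pairs by one mip level."""
    half = len(fine) // 2
    coarse = []
    for y in range(half):
        row = []
        for x in range(half):
            block = [fine[2 * y][2 * x], fine[2 * y][2 * x + 1],
                     fine[2 * y + 1][2 * x], fine[2 * y + 1][2 * x + 1]]
            row.append((min(b[0] for b in block), max(b[1] for b in block)))
        coarse.append(row)
    return coarse


depth = [
    [0.9, 0.8],
    [0.9, 0.2],
]
# At level 0 the minimum and maximum are both just the depth sample itself.
level0 = [[(z, z) for z in row] for row in depth]
level1 = reduce_min_max(level0)
```

Here the single level 1 texel becomes (0.2, 0.9): the nearest and farthest depth found anywhere in the 2x2 block.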

That's a good idea you had concerning the hiZSize, and this evening when I get back home (mine is a hobby project at the moment, so I work on it in my free time), I will try setting it to something like hiZSize = cb_screenSize / exp2(HIZ_START_LEVEL). One of my issues could very well be that I'm not taking a large enough step away from the starting point to begin with.

Glad to have another person to bounce ideas back and forth with!


(Edit: formatting and grammar)

#8 Bruzer100   Members   -  Reputation: 164


Posted 13 August 2014 - 09:09 AM

I tried using different cell sizes (mip levels) to offset the initial ray starting point.  Even going 2 down wasn't enough.  In the end, I ended up biasing much like shadow maps, to clean up the initial self intersections.  It means the reflections aren't perfectly lined up, but you can't tell once there is a blur applied.


I think you are having issues not rejecting the "wrong" ray hits.  This wasn't talked about in the article, or maybe I missed it, but the hiZ trace can return you a screen space position that is incorrect for the reflection.  Think of a ray going behind an object floating above the ground.  The Z position of the ray will be far behind the Z position of the object, but the hiZ trace will return you the screen space position of where the ray and object first intersect.  In this case, you need to understand it's a nonsensical intersection, and keep tracing.  This is also where the max part of the min/max buffer comes in handy, since you will now be "behind" the object, which the implementation in the book doesn't cover.
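One way to picture that rejection test is a three-way classification of the ray depth against the depth interval stored for a cell. The Python sketch below is only an illustration of the idea, not the author's unpublished implementation; `classify_ray_sample` and the `thickness` tolerance are hypothetical (at mip level 0 the min and max are the same sample, so some tolerance would be needed there).

```python
def classify_ray_sample(ray_z, min_z, max_z, thickness=0.0):
    """Classify a ray depth against a cell's [min_z, max_z] interval.

    Returns "in_front" if the ray has not reached the cell's geometry yet,
    "behind" if it has passed beyond it (a hit here would be nonsensical,
    so tracing should continue), and "inside" if a hit is plausible.
    """
    if ray_z < min_z:
        return "in_front"
    if ray_z > max_z + thickness:
        return "behind"
    return "inside"


states = [
    classify_ray_sample(0.30, 0.40, 0.60),
    classify_ray_sample(0.50, 0.40, 0.60),
    classify_ray_sample(0.90, 0.40, 0.60),
    classify_ray_sample(0.62, 0.40, 0.60, thickness=0.05),
]
```

A min-only trace cannot tell the "inside" and "behind" cases apart, which is the situation described above for a ray passing behind a floating object.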


The author says he does handle all that in his implementation that he never shows us.  :/  IMO it's kind of an incomplete article, and I'm disappointed the GPU Pro editors decided to include it knowing it was written against source code that couldn't be released.  Seems a bit disingenuous.


Also, I _had_ to apply temporal super sampling and temporal filtering to my results to get the effect to shipping quality.  The temporal stability of the technique is very poor - you'll see shimmering / aliasing in any real-world scene with some depth complexity to it.
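The post doesn't say which filter was used, so the following Python sketch only shows the most generic form of temporal filtering: an exponential moving average that blends each new frame into a history value. The blend factor of 0.1 is an arbitrary assumption, and a real implementation would also need reprojection and history rejection, which are left out here.

```python
def temporal_blend(history, current, alpha=0.1):
    """Move the history value a fraction alpha towards the current frame."""
    return history + (current - history) * alpha


# Worst-case shimmer: one pixel's reflection flips between 1 and 0 every frame.
raw = [float(frame % 2 == 0) for frame in range(200)]

filtered = []
value = raw[0]
for sample in raw:
    value = temporal_blend(value, sample)
    filtered.append(value)
```

After the filter settles, the frame-to-frame change drops from 1.0 in the raw signal to roughly 0.05, at the cost of the reflection lagging behind changes in the scene.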

#9 WFP   Members   -  Reputation: 687


Posted 13 August 2014 - 05:41 PM



I've spent just a moment this evening so far working some more on this, and it does seem that changing the hiZSize to what was mentioned above helps out some.  It is now defined as

static const float2 hiZSize = cb_screenSize / exp2(HIZ_START_LEVEL); 

It still needs some refinement, but I feel confident that either a small linear search or a bias like Bruzer mentioned will help.


The main issues I'm facing now are these stair-step-like artifacts that show up.  I've attached links to two images showing what I'm talking about.


Any idea what causes artifacts like this?  Bruzer mentioned some stair-like artifacts he was seeing due to using a floor() command where he didn't need one, but I only have the one there to make sure the cell is an integer and removing that doesn't remove the artifacts.


I'm almost wondering if I'm not doing something in the "setup" code incorrectly - that is, the code leading up to the hiZTrace() call where I'm getting the position and ray direction.  Maybe someone could give it a once-over to see if I've missed something there?






Dropbox image links:



#10 jgrenier   Members   -  Reputation: 246


Posted 13 August 2014 - 07:16 PM

@WFP, looking good :D This seems much better! Yes, it does seem like a small linear search at the end (I guess 4 taps for the missing first 2 mip levels) will probably help. What resolution are you rendering at? Power of 2? The hi-z pass I wrote this afternoon doesn't currently support buffers that aren't power of 2 (it gathers the mins and maxes incorrectly). If some cells have incorrect min/max info, you might get some ray misses. I didn't get very far with the ray march today though; I'll continue tomorrow. If I had to bet, I would say your screen space pos and dir are correct. Otherwise I don't think you'd get any good reflections at all.


@Bruzer, I sort of agree about the article. It really doesn't stand on its own feet without the code. For example, the article describes in great detail how to calculate a reflection vector in screen space, with diagrams of a reflected ray and a description of what a reflection is, and then, when it gets to the actual reflection algorithm: "to understand the algorithm, look at the diagram on page blah". There are a whole lot of subtleties captured neither by the pseudo code, the diagram, nor the code snippet.


I also looked at the simple linear ray march mini program and wondered how that could handle the case where a ray goes "under" an object (the code as is would conclude a reflection is good as soon as the ray reaches anything in front of it)...


And yes about the temporal filtering, I was anticipating the reflection pass to be unstable :( How many history buffers did you keep?


Well gentlemen, I'm very pleased I found this thread :)

Will post results as soon as I have anything.




#11 WFP   Members   -  Reputation: 687


Posted 13 August 2014 - 07:35 PM

Hey Jp,


I'm rendering at 1536x864 which I know is a bit of an unconventional resolution, but I use it to help ensure that my scenes and effects can be rendered without imposing limitations on windowed client dimensions.  The artifacts I mentioned above do still show up if I use more traditional dimensions like 1280x720 or 1920x1080.  I'm glad you mentioned using a power of 2 texture, though, because when I tested tonight by setting the output to 1024x512 and 1024x1024, the stair-like artifacts did seem to be alleviated, though some artifacts remain.  I wonder if I need to revisit the way I'm building my hi-z buffer due to not using power of two textures.  I will look into that tomorrow when I get some time and see if that helps out.





#12 WFP   Members   -  Reputation: 687


Posted 13 August 2014 - 08:41 PM

Actually, I got a little extra time this evening to work on it and wanted to update with a screenshot of running at 1024x512.  The power of two texture is clearly helping the stepping artifacts (I tried on several other power of 2 combinations, as well) so now I need to see what I can do to adapt that to non-power of two resolutions.  Any ideas?


The interlaced lines I'm confident can be repaired by updating the epsilon.  I haven't tested it much in my code yet, but I'm guessing just moving it closer to something like below will help a lot.

static const float2 HIZ_CROSS_EPSILON = float2(texelWidth, texelHeight) * exp2(HIZ_START_LEVEL);

There are a few other artifacts that appear under things like spheres and character arms that I think I can solve by using the min and max depth in combination with one another to walk behind objects - these are examples of the nonsensical intersections Bruzer mentioned.


Anyway, I'm calling it a night right now, but will be working more on it as soon as I get a chance.




Screenshot at 1024x512:


#13 WFP   Members   -  Reputation: 687


Posted 16 August 2014 - 12:29 PM

Only some minor updates to provide at the moment.  I've been able to confirm a few suspicions from my previous posts.


The first is that using power of two textures removes the stair-step artifacts.


The second is that setting the HIZ_CROSS_EPSILON to what I mentioned in the post above did indeed remove the interlaced lines.  I also found through testing though, that I could move the HIZ_START_LEVEL and HIZ_STOP_LEVEL to 0.0f and leave the epsilon to be the texel sizes and it would also remove the interlaced lines.  With either of these setups, the results were dramatically better and the only noticeable artifact in the ray-tracing portion is the nonsensical intersection stuff that can be solved by properly using the min/max buffer.  Here's what I landed on for HIZ_CROSS_EPSILON and it works well on both start/stop levels I've tested on (2.0f and 0.0f).

static const float2 HIZ_CROSS_EPSILON = float2(texelWidth, texelHeight) * exp2(HIZ_START_LEVEL + 1.0f);

I've included another screenshot to show the same scene (a stack of boxes - i.e., wonderful programmer art) with the interlaced lines gone.


If anyone has any ideas to get rid of my power of two texture size constraints that would be most appreciated.  The only thing I can think of right now is copying the necessary resources to a power of two texture before starting the effect, but I feel like that's bound to introduce its own set of problems, especially from copying the depth buffer to a different size.  Any other ideas?
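One commonly used approach for non-power-of-two hi-z chains, offered here as a possibility rather than a confirmed fix for the artifacts in this thread, is to widen the reduction footprint whenever the finer level has an odd dimension, so that no row or column of depth values is dropped. A Python sketch with hypothetical function names:

```python
def mip_size(size, level):
    """Mip dimension with round-down halving, never smaller than 1."""
    return max(1, size >> level)


def reduce_min_npot(fine):
    """Min-reduce one mip level, handling odd widths and heights.

    When the finer level has an odd dimension, each coarse texel reads three
    texels along that axis instead of two, so the leftover row or column is
    still accounted for.
    """
    fine_h, fine_w = len(fine), len(fine[0])
    coarse_w, coarse_h = max(1, fine_w // 2), max(1, fine_h // 2)
    span_x = 3 if fine_w % 2 else 2
    span_y = 3 if fine_h % 2 else 2
    coarse = []
    for y in range(coarse_h):
        rows = range(2 * y, min(2 * y + span_y, fine_h))
        out_row = []
        for x in range(coarse_w):
            cols = range(2 * x, min(2 * x + span_x, fine_w))
            out_row.append(min(fine[r][c] for r in rows for c in cols))
        coarse.append(out_row)
    return coarse


# Mip heights for an 864-pixel-tall buffer: level 5 is odd (27), level 6 is 13.
heights = [mip_size(864, level) for level in range(7)]

# A 3x2 level: a plain 2x2 reduction would never read the third column.
fine = [
    [0.9, 0.8, 0.1],
    [0.7, 0.6, 0.5],
]
coarse = reduce_min_npot(fine)
```

For 1536x864 the chain reaches an odd height at level 5 (27 rows), and a plain 2x2 reduction from there would skip one row of depth values. Whether that accounts for the stair-stepping seen at this resolution is not something this sketch can confirm.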


Screenshot at 1024x512:



Edit:  grammar

Edited by WFP, 16 August 2014 - 12:30 PM.

#14 WFP   Members   -  Reputation: 687


Posted 16 August 2014 - 01:10 PM

OK so I've found one way to address the power of two issue, but I'm still very open to suggestions if anyone has any.  I've updated my getMinimumDepthPlane method (below) to always sample level 0.0f (largest mip level) at the texture coordinates provided.  This does seem to fix my issues with the stair-like artifacts, but it doesn't sit well with me because if this were the correct solution, why would the author have passed in level and rootLevel in the first place?  Anyway, the ray tracing steps currently work fairly well now (still need to address other artifacts using min/max values) at any resolution and stepping through in the VS2013 graphics debugger shows that it converges on a solution (for most pixels tested) in about 13 iterations (far less than my 64 limit).

float getMinimumDepthPlane(float2 ray, float level, float rootLevel)
{
	// not sure why we need rootLevel for this - for textures that are non-power-of-two, 0.0f works better
	return hiZBuffer.SampleLevel(sampPointClamp, ray.xy, 0.0f).r;
}

Screenshot at 1536x864:


#15 Bruzer100   Members   -  Reputation: 164


Posted 16 August 2014 - 02:29 PM

You might have some issues with that, once your scene has a bit more depth complexity.  You are basically ignoring most of the depth values in a cell, which if you are at a lower mip level, could be quite a lot.

#16 WFP   Members   -  Reputation: 687


Posted 16 August 2014 - 03:39 PM

Yep, and that's exactly why it sits so uneasy with me. Just haven't been able to get rid of those stepping artifacts otherwise yet. I'll keep at it and see if anything else will get rid of them.

#17 WFP   Members   -  Reputation: 687


Posted 19 August 2014 - 08:11 AM

Just wanted to check in on this thread.  I still haven't gotten anywhere with removing the stair-like artifacts without forcing the mip level to 0 (which we know is wrong).  I tried using a trilinear sampler instead of the point sampler, but as I expected all that did was turn the stair artifacts into slopes, and they still noticeably exist.


@jgrenier Have you had any time to port your code to HLSL and if so have you had any luck with it or experienced artifacts similar to what I'm seeing?


@Bruzer100 Could you tell us about the samplers you used during the different steps of building out your hi-z, convolution, and integration buffers?  I'm using a point sampler for everything but the cone-tracing step (which in my code is currently disabled), and am wondering if perhaps I'm using an incorrect addressing mode or border mode (I currently use clamped borders).




#18 jgrenier   Members   -  Reputation: 246


Posted 27 August 2014 - 09:39 AM

Quick question regarding the visibility buffer: isn't there a "- minZ" missing on page 173? If this is meant to be the percentage of empty volume in a cell, it doesn't make sense to me that we do the integration with the fine values directly, i.e. the integration should be:

float4 integration = (fineZ.xyzw - minZ) * abs(coarseVolume) * visibility.xyzw;

Or am I missing something?

#19 jgrenier   Members   -  Reputation: 246


Posted 27 August 2014 - 09:44 AM

Figure 4.9 (page 159) doesn't really make sense to me either. To me it looks like MIP-1 should have visibilities calculated as [25%, 100%] (since 1/4 of the first two MIP-0 cells is empty). I feel like I'm missing something here :(

#20 WFP   Members   -  Reputation: 687


Posted 27 August 2014 - 06:53 PM

Hey Jp,


Regarding your first question - honestly I'm not sure.  I seem to have "better" results when I use the code presented in the book for the visibility pass (although I do include a divide by 0 check on that first division).  That being said, I currently have the cone-tracing part of the technique disabled in mine as mentioned in one of my above comments as I'm still a ways from figuring out the ray marching part.  When I do enable it, the results are not even to the point where I think it would be useful to post an image of them, so there's a lot of work I need to do on it, but I haven't been spending much time or energy on it due to the issues I've been having getting the ray marching to work.


As for your second question, I think the book is correct in the diagram provided, if a little confusing to look at.  The first four bars represent mip level 0 and all have 100% visibility.  The next two bars, the grey and the white, represent mip level 1, which they're obtaining by just accounting for the two nearest bars - halving the resolution of the first mip level (in the actual implementation, this is four values instead of two like shown in the book).  This gives the 50% and 100% values as shown.  And along the same lines, the final blackish bar is the combination of the mip level 1 values into mip level 2.  When going down a mip level in the visibility buffer, the value can be the same as or less than the value before it in the mip chain, but never above it.


If I've missed something or misunderstood your question, let me know and I'll try to update my explanation. :)