Shimmering / offset issues with shadow mapping


I am in the somewhat early stages of adding shadow mapping for directional light sources to my engine and after having some issues with this I decided to follow the step-by-step implementation offered by Alex Tardif here: http://alextardif.com/ShadowMapping.html

Even so, I'm getting about the same shimmering edges, as well as apparent frame-to-frame offsets of the whole depth map when moving the viewing camera.

Here's a short video to show the issues in action:

What strikes me as particularly odd is that there are significantly fewer artifacts when moving the camera left-to-right than when moving forward / backwards. More understandably, the most severe artifacts occur when changing the orientation of the camera.

Here's my relevant code, in case anybody can spot any issues or things I've missed that would be considered universally obvious.

The code is intentionally as close a match to the one presented in Tardif's article as possible, even if this makes it a bit messier with the conversions between the XMFLOATx types and XMVECTOR. I have also left out the second part of his article (or rather not gotten to it yet), which deals with downsampling and blurring, but I cannot see how that would have any relevance besides making the shadows appear smoother.

I am also only using a single cascade split for now, which arbitrarily spans the 1..400 depth range of the rendering camera (which roughly corresponds to the size of my testing scene):


// Create a new projection matrix for the rendering camera (which is assumed to use perspective projection here) that 
// only stretches over the current cascade
XMMATRIX matCascadeProjection = XMMatrixPerspectiveFovLH(pRenderingCamera->GetFOV(), pRenderingCamera->GetAspect(), 1.0f, 300.0f);
XMVECTOR frustumCorners[8] = {
	XMVectorSet(-1.0f, 1.0f,  0.0f, 0.0f), 
	XMVectorSet(1.0f,  1.0f,  0.0f, 0.0f), 
	XMVectorSet(1.0f,  -1.0f, 0.0f, 0.0f), 
	XMVectorSet(-1.0f, -1.0f, 0.0f, 0.0f), 
	XMVectorSet(-1.0f, 1.0f,  1.0f, 0.0f), 
	XMVectorSet(1.0f,  1.0f,  1.0f, 0.0f), 
	XMVectorSet(1.0f,  -1.0f, 1.0f, 0.0f), 
	XMVectorSet(-1.0f, -1.0f, 1.0f, 0.0f)
};
// NOTE: The transpose part here seems rather useless; the tutorial mentions it is for being sent to the GPU, but this
//       particular matrix never is. Nevertheless, I'll do it like this to achieve the highest possible correspondence to the
//       article's code snippets. Furthermore, not using a transposed matrix (and obviously not using the TransformTransposed 
//       function) seems to give identical results. Try to remove the transpose part once everything seems to work as intended.
XMMATRIX matCamViewProj		= XMMatrixTranspose(pRenderingCamera->GetViewMatrix() * matCascadeProjection);
XMMATRIX matInvCamViewProj	= XMMatrixInverse(nullptr, matCamViewProj);
// Unproject frustum corners into world space
for(size_t n = 0; n < 8; n++) {
	XMFLOAT3 tmp;
	XMStoreFloat3(&tmp, frustumCorners[n]);
	tmp = util::TransformTransposedFloat3(tmp, matInvCamViewProj);
	frustumCorners[n] = XMLoadFloat3(&tmp);
}

// Find frustum center
XMFLOAT3 frustumCenter(0.0f, 0.0f, 0.0f);
{
	XMVECTOR v = XMLoadFloat3(&frustumCenter);
	for(size_t n = 0; n < 8; n++)
		v += frustumCorners[n];
	v *= (1.0f / 8.0f);
	XMStoreFloat3(&frustumCenter, v);
}

// Retrieve normalized light direction
XMVECTOR lightDirection = XMVector3Normalize(light->GetTransform().GetForwardVector());

// Determine the radius of the to-be orthographic projection as the distance between the farthest frustum corner points divided by two
float radius = XMVectorGetX(XMVector3Length((frustumCorners[0] - frustumCorners[6]))) / 2.0f;	// The length is copied into each element

// Figure out how many texels per world-unit will fit if we project a cube with the given "radius" (side length / 2)
float texelsPerUnit = (float)shadowMapWidth / (radius * 2);	// NOTE: The shadow map *must* be square!

// Build a scaling matrix to scale evenly in all directions to the number of texels per unit
XMMATRIX matScaling = XMMatrixScaling(texelsPerUnit, texelsPerUnit, texelsPerUnit);

// Create look-at vector and matrix by accounting for scaling (and later snapping) to the number of texels per unit
const XMVECTOR UpVector		= XMVectorSet(0.0f, 1.0f, 0.0f, 0.0f);
const XMVECTOR ZeroVector	= XMVectorSet(0.0f, 0.0f, 0.0f, 0.0f);
XMVECTOR vecBaseLookat		= XMVectorSet(-XMVectorGetX(lightDirection), - XMVectorGetY(lightDirection), -XMVectorGetZ(lightDirection), 0.0f);
XMMATRIX matLookat		= XMMatrixMultiply(XMMatrixLookAtLH(ZeroVector, vecBaseLookat, UpVector), matScaling);
XMMATRIX matInvLookat		= XMMatrixInverse(nullptr, matLookat);	// Take note that this will also undo the scaling effect imposed on «matLookat»!

// Now the above can be used to move the frustum center in texel-sized increments (when transformed by matLookat, and the 
// result can then be brought back into world-space by the inverse lookat matrix).
frustumCenter	= util::TransformFloat3(frustumCenter, matLookat);
frustumCenter.x	= (float)floor(frustumCenter.x);	// Clamp to texel increment (by rounding down)
frustumCenter.y = (float)floor(frustumCenter.y);	// Clamp to texel increment (by rounding down)
frustumCenter	= util::TransformFloat3(frustumCenter, matInvLookat);

// Calculate eye position by backtracking in the opposite light direction, ie. towards the light, by the cascade radius * 2
XMVECTOR eye = XMLoadFloat3(&frustumCenter) - (lightDirection * radius * 2.0f);

// Build the final light view matrix
XMMATRIX matLightView = XMMatrixLookAtLH(eye, XMLoadFloat3(&frustumCenter), UpVector);

// Build the light's projection matrix. This is intended to keep a consistent size and should therefore minimize
// shimmering edges due to per-frame matrix recalculations.
// The near- and far value multiplications are arbitrary and meant to catch shadow casters outside of the frustum, 
// whose shadows may extend into it. These should probably be tweaked better later on, but let's see if it works at all first.
const float zMod = 6.0f;
XMMATRIX matLightProj = XMMatrixOrthographicOffCenterLH(-radius, radius, -radius, radius, -radius * zMod, radius * zMod);


// Associate the current matrices with the light source (the shader side will need to know the "light matrix" to properly sample the shadow map(s))
light->SetCascadeViewProjectionMatrix(split, matLightView * matLightProj);

// Bind the corresponding shadow map for rendering by the special, global shadow mapping camera
gGlob.pShadowCamera->SetDepthStencilBuffer(shadowMap, (UINT)light->GetShadowMapIndex() + split);
// Set the shadow camera's matrices to those of the light source
gGlob.pShadowCamera->SetViewMatrix(matLightView);
gGlob.pShadowCamera->SetProjectionMatrix(matLightProj);

// Render the depth (shadow) map (note that this automatically clears the associated depth-stencil view)
gGlob.pShadowCamera->RenderDepthOnly(true);

The implementations of util::TransformFloat3 and util::TransformTransposedFloat3 are direct copies of the implementations given by Tardif here: http://alextardif.com/code/transformvector3.txt, with the only change being that they use the XMFLOAT3 struct instead of Vector3.

For completeness, here's my code for those as well:


inline XMFLOAT3 TransformFloat3(const XMFLOAT3& point, const XMMATRIX& matrix) {
	XMFLOAT3 result;
	XMFLOAT4 temp(point.x, point.y, point.z, 1);	// Need a 4-part vector in order to multiply by a 4x4 matrix
	XMFLOAT4 temp2;

	temp2.x = temp.x * matrix._11 + temp.y * matrix._21 + temp.z * matrix._31 + temp.w * matrix._41;
	temp2.y = temp.x * matrix._12 + temp.y * matrix._22 + temp.z * matrix._32 + temp.w * matrix._42;
	temp2.z = temp.x * matrix._13 + temp.y * matrix._23 + temp.z * matrix._33 + temp.w * matrix._43;
	temp2.w = temp.x * matrix._14 + temp.y * matrix._24 + temp.z * matrix._34 + temp.w * matrix._44;

	result.x = temp2.x / temp2.w;			// View projection matrices make use of the W component
	result.y = temp2.y / temp2.w;
	result.z = temp2.z / temp2.w;

	return result;
}

inline XMFLOAT3 TransformTransposedFloat3(const XMFLOAT3& point, const XMMATRIX& matrix) {
	XMFLOAT3 result;
	XMFLOAT4 temp(point.x, point.y, point.z, 1);	// Need a 4-part vector in order to multiply by a 4x4 matrix
	XMFLOAT4 temp2;

	temp2.x = temp.x * matrix._11 + temp.y * matrix._12 + temp.z * matrix._13 + temp.w * matrix._14;
	temp2.y = temp.x * matrix._21 + temp.y * matrix._22 + temp.z * matrix._23 + temp.w * matrix._24;
	temp2.z = temp.x * matrix._31 + temp.y * matrix._32 + temp.z * matrix._33 + temp.w * matrix._34;
	temp2.w = temp.x * matrix._41 + temp.y * matrix._42 + temp.z * matrix._43 + temp.w * matrix._44;

	result.x = temp2.x / temp2.w;			// View projection matrices make use of the W component
	result.y = temp2.y / temp2.w;
	result.z = temp2.z / temp2.w;

	return result;
}

Any light-shedding on what may be at fault here would be most welcome.

I can also provide my HLSL code if requested, but I don't see how that can really be at fault since it doesn't do any offsetting or such, and the shadows are after all projected where they should be, were it not for the jumping around between frames. So I believe the fault lies with the depth (shadow) map rendering as outlined above.

The depth map is a slice in a Texture2DArray, 2048x2048 pixels in size and uses the DXGI_FORMAT_D32_FLOAT format. Naturally it has only a single mip slice.
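For reference, the array is created roughly along these lines (a sketch rather than my exact code; the underlying resource is typeless so a slice can be bound both as a D32_FLOAT depth-stencil view for rendering and as an R32_FLOAT shader resource view for sampling, and pDevice / numShadowMapSlices are just placeholder names):


// Sketch of the shadow map array setup (exact flags may differ).
D3D11_TEXTURE2D_DESC texDesc = {};
texDesc.Width            = 2048;
texDesc.Height           = 2048;
texDesc.MipLevels        = 1;                     // single mip slice
texDesc.ArraySize        = numShadowMapSlices;    // illustrative: one slice per light / cascade
texDesc.Format           = DXGI_FORMAT_R32_TYPELESS;
texDesc.SampleDesc.Count = 1;
texDesc.Usage            = D3D11_USAGE_DEFAULT;
texDesc.BindFlags        = D3D11_BIND_DEPTH_STENCIL | D3D11_BIND_SHADER_RESOURCE;
ID3D11Texture2D* pShadowMapArray = nullptr;
pDevice->CreateTexture2D(&texDesc, nullptr, &pShadowMapArray);
// Each DSV then uses DXGI_FORMAT_D32_FLOAT for its slice, and the SRV uses DXGI_FORMAT_R32_FLOAT over the array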


Isn't TransformFloat3(...) the same as XMVector3TransformCoord(...), and TransformTransposedFloat3(...) just an XMVector3TransformCoord(..., XMMatrixTranspose(...))?

Maybe you could use those, just to rule out any errors in those functions?
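Something along these lines, reusing the names from your snippet (just a sketch; XMVector3TransformCoord transforms by the full 4x4 matrix and divides by w, which is exactly what TransformFloat3 does by hand):


// Sketch: the unprojection loop using the DirectXMath built-in instead of the helpers.
// The inverse is taken from the untransposed view-projection, as your NOTE already suggests.
XMMATRIX matInvCamViewProj = XMMatrixInverse(nullptr,
	pRenderingCamera->GetViewMatrix() * matCascadeProjection);

for(size_t n = 0; n < 8; n++) {
	// Transforms the corner by the matrix and divides the result by w
	frustumCorners[n] = XMVector3TransformCoord(frustumCorners[n], matInvCamViewProj);
}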

.:vinterberg:.

Yes, they should be; I only used those to match the original code as closely as possible, since I assumed it would work.

Changing to the XM functions actually makes the discrepancies when moving the camera forward / backward slightly smaller (no noticeable change when moving left-to-right); however, it's still a far cry from acceptable, and the camera rotation artifacts are just as severe with either method, so there must be something else at play.

Thanks for the suggestion though, I guess the XM transform versions are somehow a bit more accurate.

Update: I have noticed that the shimmering only occurs while changing the orientation / translation of the rendering camera. Moving the light source causes some edge shimmering but much less so than when the camera moves.

Now the thing about these camera movements is that as soon as the camera stops moving / rotating, the shadows stabilize on the next frame. In other words, if the shadows project a certain way in frame 1, the camera is then moved over frames 2 - 10 and is stationary again at frame 11, the shadows will project to the same texels in frames 1 and 11, while flickering over frames 2 - 10. I'm starting to wonder if this could somehow be related to double buffering in that the shadow map is rendered with the "latest" view-projection matrix, but the shader lags one frame behind in using the matrix from the previous frame?
But it doesn't seem to make much sense that this would happen automatically; shouldn't the shader use the latest cbuffer values, as well as the latest depth map resource, rather than some old buffered copy of these? Or could something else cause this kind of behaviour?

Sorry for the double posting but I don't want to edit my previous post too many times.

So I managed to snapshot two consecutive frames using RenderDoc, and indeed: the light matrix used in the scene render is an exact match of the one used in the depth render for the previous frame.

Here's my results for reference (only showing the last row of the view matrix since that's the only one changing):


21.43514	45.88803	482.8738	1.00		// Shadow render, frame 1
22.50115	45.81404	482.772		1.00		// Scene render, frame 1

129.3685	-68.23999	325.803		1.00		// Shadow render, frame 2
21.43514	45.88803	482.8738	1.00		// Scene render, frame 2 (NOTE: matches shadow render, frame 1!)

In light of this, I guess my question changes to: is this intended behaviour, or am I messing something up somewhere else for this to happen?

And if it is indeed supposed to work like this, is there some standard way of solving the issue? I'm going to verify that the buffers do indeed get updated on the GPU after being changed, so that it isn't simply a case of me accidentally doing the update before the per-frame changes, but I don't think that's it.


I'm starting to wonder if this could somehow be related to double buffering in that the shadow map is rendered with the "latest" view-projection matrix, but the shader lags one frame behind in using the matrix from the previous frame?

I'm thinking you're updating your shader cbuffer before it gets updated by your "updating-routine", thus sending the previous frame's matrices into your shader...?

There's no double buffering going on internally; you're the one responsible for supplying all data to your own shaders, so it must be your code somehow :)

.:vinterberg:.

Yes indeed, I was accidentally updating the light buffer (which includes the transforms) prior to the shadow map generation in each frame.
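In case it helps anyone else, the fix was simply reordering the per-frame work so the light's constant buffer is uploaded after the cascade matrices have been recomputed and the shadow maps rendered; roughly like this (function names are just illustrative of my setup, not anything standard):


// Per-frame ordering that fixed the one-frame lag (illustrative function names)
UpdateCascadeMatrices(light, pRenderingCamera);   // recompute matLightView / matLightProj per cascade
RenderShadowMaps(light);                          // depth-only pass using the fresh matrices
UpdateLightConstantBuffer(light);                 // upload the matrices *after* they have changed
RenderScene();                                    // scene pass now samples with matching matrices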

Once I fixed that, the shimmering improved significantly, and now with cascades and PCF filtering implemented as well the quality is quite near perfect. There is still the occasional shimmer with certain light / camera direction combinations, but never in the closest cascade split. It would probably also look nicer if one could interpolate where the cascades change instead of having a hard edge (are there any articles detailing this around somewhere, by the way? I could probably achieve it on my own, but I'm always wary when it comes to shaders since a solution that is 10x faster, written by someone else, usually tends to show up further down the road).

But in conclusion, things are working better than I even dared to hope they would, so all good, and thanks for your clarification on the absence of internal double buffering; I didn't think there would be any, but you never know :)

https://msdn.microsoft.com/en-us/library/ee416307%28v=vs.85%29.aspx

This article refers to two DirectX samples (CascadedShadowMaps11 and VarianceShadows11), which show how to do the blend thing - plus there's other useful info in there!

I used those samples when doing my own cascaded shadow maps; they helped me figure out how to fit the frustums to the world (which is a good thing to do, to get the most precision out of the shadow maps) :)

.:vinterberg:.

By the way, regarding this part of your code:


// The near- and far value multiplications are arbitrary and meant to catch shadow casters outside of the frustum, 
// whose shadows may extend into it. These should probably be tweaked better later on, but let's see if it works at all first.
It's not necessary to pull back the shadow near clip in order to capture shadows from meshes that are outside the view frustum. You can handle this with a technique sometimes referred to as "pancaking", which flattens said meshes onto each cascade's near clip plane. See this thread for details. I recommend implementing it by disabling depth clipping in the rasterizer state, since that avoids artifacts for triangles that intersect the near clip plane.
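In D3D11 that amounts to creating the rasterizer state used for the shadow pass with depth clipping turned off, something like this sketch (pDevice is a placeholder for your ID3D11Device*):


// Rasterizer state for the shadow map pass with depth clipping disabled ("pancaking").
// Geometry in front of the light's near plane is no longer clipped away; its depth is
// clamped to the viewport depth range instead, flattening it onto the near plane.
D3D11_RASTERIZER_DESC rsDesc = {};
rsDesc.FillMode        = D3D11_FILL_SOLID;
rsDesc.CullMode        = D3D11_CULL_BACK;
rsDesc.DepthClipEnable = FALSE;               // the important part
// (depth bias settings etc. go here as usual)
ID3D11RasterizerState* pShadowRasterState = nullptr;
pDevice->CreateRasterizerState(&rsDesc, &pShadowRasterState);
// Bind it with RSSetState(pShadowRasterState) while rendering the shadow maps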

Thanks for the suggestions.
That pancaking trick is really clever MJP, thanks for pointing that out!

@Vinterberg: yes, I have actually referred to that article a lot; unfortunately it stays at a rather high level in a lot of places and doesn't really go into much detail, while other parts contain quite extensive information. I must however admit to never having looked into the VSM part before.

I think I have managed to get most of it working now, so that is very exciting indeed, save for two small things.

The first is that I need to implement some culling of objects that don't need to be considered when rendering the shadow maps (I haven't really looked into this yet and it is probably simple enough; I'd imagine I can just perform my usual frustum culling routine using an orthographic view-projection matrix that has its near plane pulled back towards the light source (or rather by the length of the scene camera's z-range), but is otherwise the same as each cascade split's light matrix?).
The second is how to get this shadow map interpolation working properly. I just whipped the following up for testing; it doesn't really create any visible difference compared to leaving the interpolation part out altogether, but am I going about this in the right way, or would I be better off changing my approach?


// If we're close enough to the edge of the selected cascade we should interpolate between this and the next one (assuming there is a next one)
float lerpPos	= 0.0;
float sFactor1	= light.shadowMapId != 0 ? SampleShadowFactor(light.shadowMapId + cascadeId, float3(lsTex, lsPos.z)) : 1.0;
float sFactor2	= 0.0;

// Use dynamic branching to not have to evaluate this for most pixel groups (thread groups)
if(lsPos.z >= 0.95 && light.shadowMapId != 0 && cascadeId < 3) {
	float4 lp2	= mul(float4(P, 1.0), light.matViewProj[cascadeId + 1]);
	lp2		/= lp2.w;
	float2 lt2	= float2(lp2.x / 2 + 0.5, lp2.y / -2 + 0.5);
	sFactor2	= SampleShadowFactor(light.shadowMapId + cascadeId + 1, float3(lt2, lp2.z));
	lerpPos		= 1.0 - ((1.0 - lsPos.z) * 20);
}

attenuation = lerp(sFactor1, sFactor2, lerpPos);

Some clarifications about the above code snippet: if shadowMapId is zero, the light source isn't casting any shadows, so that's why it is checked for.

Furthermore, I have 4 cascades so that's why there is a test to ensure there is indeed a next one. lsPos.z is the current pixel's depth in the current cascade's [cascadeId] light space. 0.95 is just an arbitrarily chosen depth value at which to start interpolating with the next cascade. The lerpPos variable is zero when lsPos.z == 0.95 and goes to one when lsPos.z approaches 1.

The first is that I need to implement some culling of objects that don't need to be considered when rendering the shadow maps (I haven't really looked into this yet and it is probably simple enough; I'd imagine I can just perform my usual frustum culling routine using an orthographic view-projection matrix that has its near plane pulled back towards the light source (or rather by the length of the scene camera's z-range), but is otherwise the same as each cascade split's light matrix?).


Yes, you can perform standard frustum/object intersection tests in order to cull objects for each cascade. Since the projection is orthographic, you can also treat the frustum as an OBB and test for intersection against that. Just be aware that if you use pancaking, then you have to treat the frustum as if it extended infinitely towards the light source. If you're going to cull by testing against the 6 planes of the frustum, then you can simply skip testing the near clip plane.
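A rough sketch of what that plane test could look like, assuming the six planes have been extracted from matLightView * matLightProj with normals pointing into the frustum and with index 4 being the near plane (names and conventions here are just illustrative):


// Sketch: cull a bounding sphere against one cascade's ortho frustum, skipping the
// near plane so casters behind it (handled by pancaking) still get rendered.
bool SphereVisibleToCascade(const XMFLOAT4 planes[6], const XMFLOAT3& center, float sphereRadius)
{
	for(int i = 0; i < 6; i++) {
		if(i == 4)
			continue;	// skip the near plane: treat the frustum as infinite towards the light
		float dist = planes[i].x * center.x + planes[i].y * center.y +
		             planes[i].z * center.z + planes[i].w;
		if(dist < -sphereRadius)
			return false;	// completely behind this plane -> outside the cascade
	}
	return true;
}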

The second is how to get this shadow map interpolation working properly. I just whipped the following up for testing; it doesn't really create any visible difference compared to leaving the interpolation part out altogether, but am I going about this in the right way, or would I be better off changing my approach?


Generally you want to determine if your pixel is at the "edge" of a cascade, using whichever method you use for partitioning the viewable area into your multiple cascades. You can have a look at my code for an example, if you'd like. In that sample app, the cascade is chosen using the view-space Z value (depth) of the pixel. It basically checks how far into the cascade the pixel is, and if it's in the last 10% of the depth range it starts to blend in the next cascade.
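The blend weight itself is then just the pixel's position within that last part of the cascade's depth range, clamped to [0, 1]. Expressed in C++ terms for clarity (the same arithmetic would live in the shader; splitNear / splitFar are placeholder names for the view-space depth bounds of the pixel's cascade):


// Sketch of the blend weight described above, based on view-space depth.
// blendBand is the fraction of the cascade's range over which to fade (e.g. 0.1 = last 10%).
float CascadeBlendWeight(float viewSpaceDepth, float splitNear, float splitFar, float blendBand)
{
	float range      = splitFar - splitNear;
	float blendStart = splitFar - range * blendBand;	// depth at which blending begins
	float t          = (viewSpaceDepth - blendStart) / (range * blendBand);
	return t < 0.0f ? 0.0f : (t > 1.0f ? 1.0f : t);		// 0 = current cascade only, 1 = next cascade only
}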

This topic is closed to new replies.
