Shadow Map Silhouette Revectorization (SMSR)

Published June 15, 2016 by Vladimir Bondarev, posted by VladimirBondarev

Shadow mapping is known for its compatibility with rendering hardware, low implementation complexity and ability to handle any kind of geometry. However, aliasing is a very common problem in shadow mapping. Projection and perspective aliasing are the two main discontinuity types which deteriorate projected shadow quality.

Since the introduction of shadow mapping, many clever algorithms have been developed to reduce or even completely remove shadow map aliasing. Algorithms which aim to remove aliasing completely are unfortunately not compatible with current GPU architectures to run in real-time and usually serve as hardware change proposals (LogPSM, Irregular Z-Buffer Technique). Of the algorithms which do run in real-time, some focus on optimal sample re-distribution (PSM, TSM, LiSPSM, CSM) and others serve as filtering techniques (VSM, PCF, BFSM).

Shadow Map Silhouette Revectorization (SMSR) is a filtering technique which re-approximates the shadow silhouette based on an MLAA implementation. SMSR consists of two main passes and a final merge pass (three passes in total). The first pass searches for discontinuity information. The second pass determines discontinuity length and orientation, which are translated into a normalized xy-space. The xy-space is used to perform a simple linear interpolation which determines the new edge. The third and final pass merges the new edges on top of the regular shadow map, resulting in a smoother shadow.

Figure 1: From left to right, revectorization process.

[subheading]First pass[/subheading]

Figure 2: Compressed silhouette discontinuity.

Inside the projected shadow map, we find shadow discontinuity by offsetting the projected coordinates by one shadow map sample in all 4 directions (left, top, right, bottom). The discontinuity is compressed into a single value per axis (red channel for horizontal and green channel for vertical discontinuity), which is then used in the second pass.

Compression layout:

0.0 = no discontinuity
0.5 = depending on the axis, to the left or to the bottom
0.75 = discontinuity in both directions
1.0 = depending on the axis, to the right or to the top

Fragment Shader - First pass:


struct FBO_FORWARD_IN
{
	float4 color0		: 	COLOR0;
};


float Visible(sampler2D	inShadowMap, float inShadowMapXY, float4 inProjectedCoordinate, int2 inOffset)
{
	return tex2Dproj( inShadowMap, inProjectedCoordinate + float4(inOffset,0,0) * (1.0f/inShadowMapXY) ).r;
}


float2 Disc(sampler2D inShadowMap, float inShadowMapXY, float4 inProjectedCoordinate)
{
	float center	= Visible(inShadowMap, inShadowMapXY, inProjectedCoordinate, int2(0,0));

	float right		= abs(Visible(inShadowMap, inShadowMapXY, inProjectedCoordinate, int2(1,0))	- center) * center;
	float left		= abs(Visible(inShadowMap, inShadowMapXY, inProjectedCoordinate, int2(-1,0))- center) * center;
	float top		= abs(Visible(inShadowMap, inShadowMapXY, inProjectedCoordinate, int2(0,-1))- center) * center;
	float bottom	= abs(Visible(inShadowMap, inShadowMapXY, inProjectedCoordinate, int2(0,1))	- center) * center;
		
	float4 disc		= float4(left, right, bottom, top);
	
	/*
		Compress results:
		0.0f	= no discontinuity
		0.5f	= depending on which axis, to left or to bottom
		0.75f	= discontinuity on both sides
		1.0f	= depending on which axis, to right or to top
	*/
	float2 dxdy		= 0.75f + (-disc.xz + disc.yw) * 0.25f;	
	
	// Step filters out axis where no discontinuities are found
	return dxdy * step(1.0f, float2(dot(disc.xy, 1.0f), dot(disc.zw, 1.0f)));
}

FBO_FORWARD_IN main(float4				inPos			: POSITION,
					uniform sampler2D	inSampler1		: TEXUNIT1,			// In Shadow Map
					uniform sampler2D	inTexPosition	: TEXUNIT2,			// Buffer containing world-space positions as seen from the camera
					float2				inUV			: TEXCOORD0,		// Current fragment
					uniform float4x4	inMatrixShadowLightProjectionBias,	// Light View Matrix
					uniform float		inConst0,							// Bias
					uniform float		inConst1							// Shadow-map width & height
					)
{
	FBO_FORWARD_IN	outFBO;

	float4 color				= float4(0,0,0,0);
	float3 pos					= tex2D(inTexPosition,	inUV).xyz;			// World position. Can be reconstructed from depth and inverse camera-projection-matrix
	
	// Projected depth-map coordinates, between 0 and 1
	float4 biasOffset			= float4(0,0, inConst0, 0);
	float4 projectedCoordinate	= mul(inMatrixShadowLightProjectionBias, float4(pos, 1.0f)) + biasOffset;
	
	// Clip everything outside shadow map rectangle
	// How is this performance wise? can we optimize it.
	if(	projectedCoordinate.x >= 0.0f &&
		projectedCoordinate.y >= 0.0f &&
		projectedCoordinate.x <= 1.0f &&
		projectedCoordinate.y <= 1.0f)
	{
		color.rg = Disc(inSampler1, inConst1, projectedCoordinate);
	}
	
	outFBO.color0 = color;
	return outFBO;
}
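
To make the compression layout concrete, the per-axis encoding performed by Disc and the decoding used later in the second pass can be mirrored on the CPU. The Python sketch below is purely illustrative: the function names encode_axis and decode_axis are hypothetical and do not appear in the shaders, and the inputs are assumed to be binary (0.0 or 1.0) discontinuity flags.

```python
def encode_axis(negative, positive):
    """Pack two binary discontinuity flags of one axis into a single value.

    negative: 1.0 if there is a discontinuity to the left (or bottom), else 0.0
    positive: 1.0 if there is a discontinuity to the right (or top), else 0.0
    """
    # Mirrors step(1.0f, left + right): an axis without any discontinuity becomes 0.0
    if negative + positive < 1.0:
        return 0.0
    # Mirrors 0.75f + (-left + right) * 0.25f
    return 0.75 + (positive - negative) * 0.25


def decode_axis(value):
    """Recover the signed direction: -1.0, 0.0 (both sides) or 1.0.

    Returns None for 0.0, which means "no discontinuity" rather than a direction.
    """
    if value == 0.0:
        return None
    return (value - 0.75) * 4.0


if __name__ == "__main__":
    for flags in [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]:
        code = encode_axis(*flags)
        print(flags, "->", code, "->", decode_axis(code))
```

The value 0.0 has to be treated separately when decoding, because (0.0 - 0.75) * 4.0 would otherwise yield -3.0; the second-pass shader handles this case explicitly as well.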

[subheading]Second pass[/subheading]

The second pass is the most important and demanding one. It determines discontinuity length, orientation and fragment position, which are translated into an orientated normalized discontinuity (xy-coordinate) space. The normalized space is then used to perform linear interpolation, which determines the new silhouette edge.

The main challenge is to correctly approximate, from screen-space, the discontinuity length in projected light-space. To do so, we have to determine how many screen-space pixels we must offset from the original screen-space position to reach the neighboring axis-aligned shadow-map sample in screen-space. The new screen-space position is determined by taking the current fragment's world-space position into projected light space, applying the offset, fetching and combining the correct depth and then projecting it back onto screen-space. This is the most demanding step, and probably also the most expensive one performance-wise.

Once we know where the neighboring shadow-map sample is located in screen-space, we step until we find a break in the discontinuity. The break can be triggered by exceeding the delta-depth threshold, by reaching the maximum search distance or by reaching a non-discontinuity sample on the opposite axis. By performing this iteration in screen-space, we can approximate the discontinuity length of the projected shadow.
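
The last leg of that round trip, going from a world-space position back to a screen-space UV coordinate, is plain matrix arithmetic. Below is a small CPU-side Python sketch of it. The helper names (mat_vec, world_to_uv) and the row-major list-of-rows matrix layout are assumptions made for illustration; only the final ((xy / w) + 1) * 0.5 remapping is taken from the second-pass shader.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (row-major list of rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def world_to_uv(world_pos, camera, projection):
    """Project a homogeneous world-space position to post-processing UV space."""
    cam_space = mat_vec(camera, world_pos)        # world space -> camera space
    proj_space = mat_vec(projection, cam_space)   # camera space -> projection space
    w = proj_space[3]
    # The perspective divide yields [-1, 1]; remap it to [0, 1] UV coordinates
    return ((proj_space[0] / w + 1.0) * 0.5, (proj_space[1] / w + 1.0) * 0.5)


IDENTITY = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

if __name__ == "__main__":
    # With identity matrices the origin lands in the middle of the screen
    print(world_to_uv([0.0, 0.0, 0.0, 1.0], IDENTITY, IDENTITY))
```

In the shader this projection is evaluated once per search step, which is one reason the pass is costly.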

Figure 3: Orientated Normalized Discontinuity Space.

Knowing the discontinuity length and the discontinuity begin, we can determine the position in orientated normalized discontinuity space. Once we have that information, we can fill or clip away a new shadow silhouette.
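
As a rough cross-check of how that position is derived, the InterpolationPos function from the second-pass shader can be ported to the CPU. The Python port below is a sketch under the assumption that step, saturate and lerp behave like the Cg intrinsics of the same name; the three helpers are hypothetical stand-ins for them.

```python
MAX_SEARCH = 8


def step(edge, x):
    return 1.0 if x >= edge else 0.0


def saturate(x):
    return min(max(x, 0.0), 1.0)


def lerp(a, b, t):
    return a + (b - a) * t


def interpolation_pos(edge, axis_disc, sub_coord):
    """edge = (search result in negative direction, search result in positive direction).

    A positive entry means the discontinuity begin was found in that direction.
    """
    # Each directional search starts at 1, so the sum counts the current sample twice
    edge_length = min(abs(edge[0]) + abs(edge[1]) - 1.0, MAX_SEARCH)

    # Sub-sample offset, only added along a direction in which a begin was found
    sub = [lerp(0.0, 1.0 - sub_coord, saturate(edge[0])) / edge_length,
           lerp(0.0, sub_coord, saturate(edge[1])) / edge_length]

    # A length-one discontinuity has a "begin" on both sides, so filter by direction
    length_is_one = step(0.0, 1.001 - edge_length)
    dir_filter = [step(axis_disc, 0.0), step(-axis_disc, 0.0)]
    sub = [s * lerp(1.0, f, length_is_one) for s, f in zip(sub, dir_filter)]

    # Normalized coordinate along the edge, measured from the begin sample
    p = [(1.0 - e / edge_length) * step(0.0, e) + s for e, s in zip(edge, sub)]
    return max(p)
```

For example, interpolation_pos((-2.0, 3.0), 1.0, 0.5) describes a fragment on an edge of length 4 whose begin lies three samples to the positive side, and it evaluates to 0.375.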

Figure 4: Clipped Area from Orientated Normalized Discontinuity Space.

In areas of low sample density, merging the new silhouette information with the lighting or image buffer results in a smoother shadow edge.
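
The merge itself is a simple blend towards the shadow colour. The short Python sketch below mirrors the lerp at the end of the second-pass shader; the function name merge_shadow is hypothetical, and black is used as the shadow colour only because the shader blends towards float3(0,0,0).

```python
def merge_shadow(color, fill):
    """Blend an RGB colour towards black by the fill factor (clamped to [0, 1])."""
    fill = min(max(fill, 0.0), 1.0)
    # lerp(color, black, fill) reduces to color * (1 - fill)
    return tuple(c * (1.0 - fill) for c in color)
```

A fill of 0.0 leaves the lighting buffer untouched, while a fill of 1.0 makes the fragment fully shadowed.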

Figure 5: Red indicates new shadow silhouette.

Fragment Shader - Second pass:


struct FBO_FORWARD_IN
{
	float4 color0		: 	COLOR0;
};

#define MAX_SEARCH		8


float IsBegin(float2 inDisc)
{
	float2 d = (inDisc - 0.75f) * 4.0f;
	if(inDisc.x == 0.0f)
		d.x = 0.0f;
	if(inDisc.y == 0.0f)
		d.y = 0.0f;
	return saturate(step(1.0f, dot(abs(d), 1.0f) * 0.5f));
}


float Search(	sampler2D	inTexPosition,
				sampler2D	inTexDisc,
				sampler2D	inShadowMap,
				float		inShadowMapXY,
				float4x4	inLight,
				float4x4	inCamera,
				float4x4	inProjection,
				float2		inCurrScrCoord,
				float2		inInitialDisc,
				int2		inDir)
{
	float dist				= 1.0f;
	float finalDist			= 0.0f;
	float2 currScrCoord		= inCurrScrCoord;
	float2 disc				= tex2D(inTexDisc, currScrCoord).xy;
	float foundBeginEdge	= IsBegin(disc);
	float initialDisc		= dot(inInitialDisc.yx*abs(inDir), 1.0f);
	float invShadowMapXY	= 1.0f / inShadowMapXY;
	
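	// NOTE: the opening of this loop did not survive in the article text (everything between "i<" and
	// "> invShadowMapXY" is missing). The lines marked "assumed" are reconstructed from the prose
	// description of this pass and are not the author's original code.
	for(int i=0; i<MAX_SEARCH; i++)
	{
		float4 projCoord	= mul(inLight, float4(tex2D(inTexPosition, currScrCoord).xyz, 1.0f));	// assumed: current sample in projected light-space
		float4 newProjCoord	= projCoord + float4(inDir, 0, 0) * invShadowMapXY;						// assumed: step one shadow-map sample along inDir
		float newDepth		= tex2D(inShadowMap, newProjCoord.xy).r;								// assumed: depth stored at the neighboring sample
		if(abs(newDepth - projCoord.z) > invShadowMapXY)											// Filters for depth discontinuity with given threshold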
			break;

		newProjCoord.z		= newDepth;
		float4 wpos			= mul(inverse(inLight), newProjCoord);					// transform the coordinate back to world-space coordinate
		float4 camSpace		= mul(inCamera, wpos);									// transform world space to camera space coordinate
		float4 projSpace	= mul(inProjection, camSpace);							// transform camera space to projection space
		currScrCoord		= ((projSpace.xy / projSpace.w) + 1.0f) * 0.5f;			// transform projection space to UV post-processing space
				
		disc				= tex2D(inTexDisc, currScrCoord).xy;					// Fetch target discontinuity
		if(dot(disc.yx*abs(inDir), 1.0f) != initialDisc)							// Break if discontinuity changes
			break;
			
		foundBeginEdge		= saturate(foundBeginEdge + IsBegin(disc));				// Check if current sample is the begin of discontinuity
		dist				+= 1.0f;												// Increment offset length
	}
	
	
	return lerp(-dist, dist, foundBeginEdge);										// if edge is not found, we are moving away from the begin point and the dist is negative.
}


float InterpolationPos(float2 inEdge, float inAxisDisc, float inSubCoord)
{
	// x - left
	// y - right
	
	/*
			x:-3			y:7
		 <--------p----------------------->
		-------------------------------------
		|	|	| p |	|	|	|	|	| B |
		-------------------------------------
		  1   2   3   4   5   6   7   8   9

		Length	= abs(x) + abs(y) - 1	= 9
	*/
	
	float edgeLength		= min(dot(abs(inEdge), 1.0f) - 1.0f, MAX_SEARCH);					// From directional edge length search, we initially add 1.0f to each direction. By subtracting 1 we correct the sum length.
			
	// Sub-sample coordinates, only added to positive edge direction
	float2 subCoord			= float2( lerp(0, 1-inSubCoord, saturate(inEdge.x)), lerp(0, inSubCoord, saturate(inEdge.y))) / edgeLength;
	
	// One-Length discontinuity, we have to handle it separately, since the discontinuity has "begin" in both directions
	float edgeLengthIsOne	= step(0.0f, 1.001f - edgeLength);									// Filter: 1.0f when the edge length is at most 1.0f (+ epsilon)
	float2 dirFilter		= float2(step(inAxisDisc, 0.0f), step(-inAxisDisc, 0.0f));			// Filter: Sub sample coordinates are only added if the discontinuity begins from correct side.
	subCoord				= subCoord * lerp(float2(1.0f), dirFilter, edgeLengthIsOne);		// Apply filter
	
	float2 p				= (1.0f - (inEdge) / (edgeLength)) * step(0.0f, inEdge) + subCoord;	// Create normalized coordinate
	return max(p.x, p.y);
}


FBO_FORWARD_IN main(float4				inPos			: POSITION,
					uniform sampler2D	inSampler0		: TEXUNIT0,			// Pass 1 data
					uniform sampler2D	inSampler1		: TEXUNIT1,			// Shadow map
					uniform sampler2D	inSampler2		: TEXUNIT2,			// Final Image, should be lighting buffer
					uniform sampler2D	inTexPosition	: TEXUNIT3,			// World coordinates (very bad, 96 bit buffer!)
					float2				inUV			: TEXCOORD0,
					uniform float4x4	inMatrixShadowLightProjectionBias,	// Light View Matrix
					uniform float4x4	inMatrixProjection,					// Projection matrix
					uniform float4x4	inMatrixCamera,						// Camera matrix
					uniform float		inConst1							// Shadow-map width & height
					)
{
	//discard;
	FBO_FORWARD_IN outFBO;
	
	outFBO.color0		= float4(0,0,0,0);
	float2 disc			= tex2D(inSampler0, inUV).rg;
	float fill			= 0.0f;
	
	// Discard non-discontinuity fragments
	float discSum		= dot(disc, 1.0f);
	if(discSum > 0.0f)
	{		
		/*
			We are looking for the starting points marked with X:
			
			|-----------------------|
			|///|///|///|///|///|///|
			|-----------------------|
			|///| X |	|	|	|	|
			|-----------------------|
			|///|	|	| X |///|	|
			|-----------------------|
			|///|	|///|///|///|	|
			|-----------------------|
			|///|	|	|	|	|	|
			|-----------------------|
			
			X - From where we should begin the search
			
			At the starting point X, both axes contain either 0.5f or 1.0f.
			Decoded code:
			0.5f = (0.5f - 0.75f) * 4.0f = -1.0f
			1.0f = (1.0f - 0.75f) * 4.0f = 1.0f
			
			0.0f	= no discontinuity
			0.5f	= depending on channel, to left or to bottom
			0.75f	= discontinuity in both directions
			1.0f	= depending on channel, to right or to top
		*/
		
		
		float3 pos			= tex2D(inTexPosition, inUV).xyz;													// World space coordinate
		float2 discDir		= (disc - 0.75f) * 4.0f;
		
		// Fetch discontinuity length
		float left			= Search( inTexPosition, inSampler0, inSampler1, inConst1, inMatrixShadowLightProjectionBias, inMatrixCamera, inMatrixProjection, inUV, disc, int2(-1,0));
		float right			= Search( inTexPosition, inSampler0, inSampler1, inConst1, inMatrixShadowLightProjectionBias, inMatrixCamera, inMatrixProjection, inUV, disc, int2(1,0));
		float up			= Search( inTexPosition, inSampler0, inSampler1, inConst1, inMatrixShadowLightProjectionBias, inMatrixCamera, inMatrixProjection, inUV, disc, int2(0,1));
		float down			= Search( inTexPosition, inSampler0, inSampler1, inConst1, inMatrixShadowLightProjectionBias, inMatrixCamera, inMatrixProjection, inUV, disc, int2(0,-1));
		
		// Create sub sample discontinuity normalized coordinates
		float4 subCoord			= frac(mul(inMatrixShadowLightProjectionBias, float4(pos, 1.0f)) * inConst1);	// Local sub-sample normalized coordinate
		float2 normalizedCoordinate;
		normalizedCoordinate.x	= InterpolationPos(float2(left, right), discDir.x, subCoord.x);
		normalizedCoordinate.y	= InterpolationPos(float2(down, up), -discDir.y, subCoord.y);
		
		// Based on discontinuity direction, clip new edge from normalized coordinates
		float2 xyClip;
		xyClip.x			= lerp(step(subCoord.y, normalizedCoordinate.x), step(1.0f - normalizedCoordinate.x, subCoord.y), step(discDir.y, 0.0f));
		xyClip.y			= lerp(step(1.0f - normalizedCoordinate.y, subCoord.x), step(1.0f-normalizedCoordinate.y, 1.0f-subCoord.x), step(discDir.x, 0.0f));
		fill				= dot(xyClip, 1.0f); // sum
		
		// If discontinuity is in both directions on a single axis, fill.
		fill				= lerp(1.0f, fill, abs(discDir.x));
		fill				= lerp(1.0f, fill, abs(discDir.y));
		fill				= saturate(fill);
	
	}
	
	
	// Merge
	float3 color		= tex2D(inSampler2, inUV).rgb;
	outFBO.color0.rgb	= lerp(color, float3(0,0,0), fill);
	return outFBO;
}
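
To see how the signed distances returned by Search arise, the loop can be reduced to a one-dimensional walk over the pass-one discontinuity codes. The Python sketch below makes strong simplifying assumptions: it ignores the light-space reprojection and the depth threshold entirely, treats the end of the row as a break, and uses hypothetical names (is_begin, search_row). It only illustrates the bookkeeping of dist and of the begin flag for a horizontal search.

```python
MAX_SEARCH = 8


def is_begin(disc):
    """True when both axes hold a one-sided discontinuity (codes 0.5 or 1.0)."""
    d = [0.0 if code == 0.0 else (code - 0.75) * 4.0 for code in disc]
    return (abs(d[0]) + abs(d[1])) * 0.5 >= 1.0


def search_row(row, start, direction):
    """Walk `row`, a list of (x_code, y_code) pairs, from `start` in `direction` (+1 or -1).

    Returns +dist if a begin sample was seen along the way, otherwise -dist.
    """
    dist = 1.0
    found_begin = is_begin(row[start])
    initial = row[start][1]          # a horizontal search tracks the vertical (green) code
    i = start
    for _ in range(MAX_SEARCH):      # break 1: maximum search distance reached
        i += direction
        if i < 0 or i >= len(row):   # simplification: the end of the row stops the search
            break
        if row[i][1] != initial:     # break 2: the code on the other axis changes
            break
        found_begin = found_begin or is_begin(row[i])
        dist += 1.0
    return dist if found_begin else -dist


if __name__ == "__main__":
    # Nine samples with a bottom discontinuity; the last one is the begin sample "B"
    row = [(0.0, 0.5)] * 8 + [(1.0, 0.5)]
    print(search_row(row, 2, -1), search_row(row, 2, +1))
```

Starting from the third sample of that row, the two searches return -3.0 and 7.0, which corresponds to the x:-3, y:7 example drawn in the comment inside InterpolationPos.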

[subheading]Results[/subheading]

The algorithm performs very well in areas with low shadow map sampling rates. It interpolates new edges, which increases the final visual quality of the projected shadow. However, the algorithm is not perfect: some edges are not as smooth as those of a high-resolution shadow map projection. See future research on how the visual quality could be improved even further.