Blur compute shader

Started by
2 comments, last by Syntac_ 7 years, 7 months ago

Hi,

I was looking at writing a blur compute shader and took a look at the Hieroglyph 3 implementations for inspiration. However, I'm confused as to why the blur shaders never guard against reading or writing outside of the UAV when the input image is larger than the thread group. Is it safe to do this?

If not, how do people normally write blur shaders where you could be trying to read neighboring pixels that don't exist? Do you just work on an area smaller than the image itself?

Here's a link to the gaussian blur shader I'm referring to.

Cheers.

Hi.

This is how you can clamp the samples/pixels at the borders.

Just to make sure we're safe on copyright etc., this is strongly based on the blur shader implementation in Frank D. Luna's D3D11 book.


[numthreads(N, 1, 1)]
void HorzBlurCS(int3 groupThreadID : SV_GroupThreadID,
				int3 dispatchThreadID : SV_DispatchThreadID)
{
	// Fill local thread storage to reduce bandwidth. To blur N pixels, we need to load N + 2*gBlurRadius pixels because of the blur radius.
	
	// This thread group runs N threads. To get the extra 2*gBlurRadius pixels, have 2*gBlurRadius threads sample an extra pixel.
	if(groupThreadID.x < gBlurRadius)
	{
		// clamp out of bound samples that occur at image borders
		int x = max(dispatchThreadID.x - gBlurRadius, 0);
		gCache[groupThreadID.x] = gInput[int2(x, dispatchThreadID.y)];
	}
	if(groupThreadID.x >= N-gBlurRadius)
	{
		// clamp out of bound samples that occur at image borders
		int x = min(dispatchThreadID.x + gBlurRadius, gInput.Length.x-1);
		gCache[groupThreadID.x+2*gBlurRadius] = gInput[int2(x, dispatchThreadID.y)];
	}

	// clamp out of bound samples that occur at image borders
	gCache[groupThreadID.x+gBlurRadius] = gInput[min(dispatchThreadID.xy, gInput.Length.xy-1)];

	// wait for all threads to finish.
	GroupMemoryBarrierWithGroupSync();
	
	// Now blur each pixel.
	float4 blurColor = float4(0, 0, 0, 0);
	
	[unroll]
	for(int i = -gBlurRadius; i <= gBlurRadius; ++i)
	{
		int k = groupThreadID.x + gBlurRadius + i;
		
		blurColor += gWeights[i+gBlurRadius]*gCache[k];
	}
	
	gOutput[dispatchThreadID.xy] = blurColor;
}
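
The shader above uses several names it does not declare (N, gBlurRadius, gWeights, gCache, gInput, gOutput). A minimal sketch of what those declarations might look like follows. The concrete values (256 threads, radius 5), the BLUR_RADIUS macro, the cbuffer name and the register slots are assumptions for illustration; the post does not show them.

```hlsl
// Assumed values: the post does not show these declarations.
#define N 256                 // threads per group, matches [numthreads(N, 1, 1)]
#define BLUR_RADIUS 5         // hypothetical radius
#define CacheSize (N + 2 * BLUR_RADIUS)

// Compile-time constant so the [unroll] loop has known bounds.
static const int gBlurRadius = BLUR_RADIUS;

cbuffer cbBlurWeights : register(b0)
{
	// One weight per tap, 2*BLUR_RADIUS + 1 in total, filled by the application.
	// Each array element in a constant buffer starts on its own 16-byte boundary.
	float gWeights[2 * BLUR_RADIUS + 1];
};

Texture2D           gInput  : register(t0);
RWTexture2D<float4> gOutput : register(u0);

// Thread group local storage: N pixels plus gBlurRadius extra on each side.
groupshared float4 gCache[CacheSize];
```

With these sizes, the largest index the shader writes is (N - 1) + 2*gBlurRadius = CacheSize - 1, so every gCache access stays inside the array.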

Crealysm game & engine development: http://www.crealysm.com

Looking for a passionate, disciplined and structured producer? PM me

In D3D11 all out-of-bounds reads and writes are well-defined for buffers and textures. Reading out-of-bounds returns 0, and writing out-of-bounds has no effect. Reading/writing out-of-bounds on thread group local storage is not defined, so be careful not to do that.

Note that in D3D12 if you use root SRVs or UAVs there is no bounds checking, and so reading/writing out-of-bounds will have undefined behavior.
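
If you would rather not rely on the API's out-of-bounds behavior, the checks can be written out explicitly. The following is a minimal sketch, not the Hieroglyph 3 or Luna code: it only copies a pixel through group shared memory, and the entry point name GuardedCopyCS, the R macro and the register slots are hypothetical. It clamps the texture read, keeps the group shared index in range by construction, and guards the UAV write.

```hlsl
#define N 256   // threads per group (assumed)
#define R 5     // hypothetical blur radius, only used to size the cache

Texture2D<float4>   gInput  : register(t0);
RWTexture2D<float4> gOutput : register(u0);

groupshared float4 gCache[N + 2 * R];

[numthreads(N, 1, 1)]
void GuardedCopyCS(uint3 gtid : SV_GroupThreadID,
                   uint3 dtid : SV_DispatchThreadID)
{
	// Query the texture size (assumes a non-empty texture).
	uint width, height;
	gInput.GetDimensions(width, height);

	// Clamp the coordinate so the read never leaves the texture.
	uint2 coord = min(dtid.xy, uint2(width - 1, height - 1));

	// gtid.x is in [0, N-1], so this index is in [R, N+R-1]: always inside gCache.
	gCache[gtid.x + R] = gInput[coord];

	// Every thread in the group must reach the barrier, so out-of-range
	// threads are not returned early; only the final write is guarded.
	GroupMemoryBarrierWithGroupSync();

	if (dtid.x < width && dtid.y < height)
	{
		gOutput[dtid.xy] = gCache[gtid.x + R];
	}
}
```

The same pattern should carry over to the blur shader above: keep its clamped loads as they are and wrap the final gOutput write in the width/height test.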

Ah OK, this makes sense. As I use LDS I was getting crazy values, but I thought they were coming from the UAV read. That helps, thanks.
