Blur compute shader

This topic is 473 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Hi,

 

I was just looking at writing a blur compute shader and took a look at the Hieroglyph 3 implementations for inspiration. However, I'm confused as to why the blur shaders never guard against reading/writing outside of the UAV when the input image is larger than the thread group. Is it safe to do this?

 

If not, how do people normally write blur shaders where you could be trying to read neighboring pixels that don't exist? Do you just work on an area smaller than the image itself?

 

Here's a link to the gaussian blur shader I'm referring to.

 

Cheers.

Hi.

This is how you can clamp the samples/pixels at the borders.

Just to make sure we're safe on copyright etc., this is strongly based on the blur shader implementation in Frank D. Luna's D3D11 book.

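// Declarations the shader relies on. The names come from the code below;
// the values of N and gBlurRadius and the size of gWeights are assumptions.
#define N 256
static const int gBlurRadius = 5;

cbuffer cbSettings
{
	float gWeights[11];	// 2 * gBlurRadius + 1 blur weights, set by the application
};

Texture2D gInput;
RWTexture2D<float4> gOutput;

// N pixels per group plus gBlurRadius extra pixels on each side.
groupshared float4 gCache[N + 2 * gBlurRadius];

[numthreads(N, 1, 1)]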
void HorzBlurCS(int3 groupThreadID : SV_GroupThreadID,
				int3 dispatchThreadID : SV_DispatchThreadID)
{
	// Fill local thread storage to reduce bandwidth. To blur N pixels, we need to
	// load N + 2 * gBlurRadius pixels because of the blur radius.
	
	// This thread group runs N threads. To get the extra 2 * gBlurRadius pixels, have 2 * gBlurRadius threads sample an extra pixel.
	if(groupThreadID.x < gBlurRadius)
	{
		// clamp out of bound samples that occur at image borders
		int x = max(dispatchThreadID.x - gBlurRadius, 0);
		gCache[groupThreadID.x] = gInput[int2(x, dispatchThreadID.y)];
	}
	if(groupThreadID.x >= N-gBlurRadius)
	{
		// clamp out of bound samples that occur at image borders
		int x = min(dispatchThreadID.x + gBlurRadius, gInput.Length.x-1);
		gCache[groupThreadID.x+2*gBlurRadius] = gInput[int2(x, dispatchThreadID.y)];
	}

	// clamp out of bound samples that occur at image borders
	gCache[groupThreadID.x+gBlurRadius] = gInput[min(dispatchThreadID.xy, gInput.Length.xy-1)];

	// wait for all threads to finish.
	GroupMemoryBarrierWithGroupSync();
	
	// Now blur each pixel.
	float4 blurColor = float4(0, 0, 0, 0);
	
	[unroll]
	for(int i = -gBlurRadius; i <= gBlurRadius; ++i)
	{
		int k = groupThreadID.x + gBlurRadius + i;
		
		blurColor += gWeights[i+gBlurRadius]*gCache[k];
	}
	
	gOutput[dispatchThreadID.xy] = blurColor;
}
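
The documented way to query a texture's size in HLSL is `GetDimensions`. As a minimal sketch, the same border clamp could be written with it as shown here. `ClampToImage` is a hypothetical helper name, not something from the book or from Hieroglyph 3, and `gInput` is redeclared only so the snippet stands on its own.

```hlsl
// Input texture, named as in the shader above.
Texture2D<float4> gInput;

// Hypothetical helper: clamps a pixel coordinate to the valid range of gInput,
// so that border threads re-read the nearest edge pixel.
int2 ClampToImage(int2 p)
{
    uint width, height;
    gInput.GetDimensions(width, height);
    return clamp(p, int2(0, 0), int2(width, height) - 1);
}
```

With that helper, the center load in the shader above would become `gCache[groupThreadID.x + gBlurRadius] = gInput[ClampToImage(dispatchThreadID.xy)];`.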

In D3D11 all out-of-bounds reads and writes are well-defined for buffers and textures. Reading out-of-bounds returns 0, and writing out-of-bounds has no effect. Reading or writing out-of-bounds on thread group local storage is not defined, so be careful not to do that.
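
To make the distinction concrete, here is a rough sketch of a horizontal box blur that leans on the defined texture behavior while keeping every groupshared index in range. All names (`gSource`, `gDest`, `sCache`, `BoxBlurCS`) and the sizes are made up for illustration. Note that without a clamp the zeros read past the border darken the image edges, which is why the shader posted above clamps instead.

```hlsl
#define GROUP_SIZE 64
#define RADIUS 2
#define CACHE_SIZE (GROUP_SIZE + 2 * RADIUS)

Texture2D<float4>   gSource : register(t0);  // hypothetical input
RWTexture2D<float4> gDest   : register(u0);  // hypothetical output

groupshared float4 sCache[CACHE_SIZE];

[numthreads(GROUP_SIZE, 1, 1)]
void BoxBlurCS(int3 gtid : SV_GroupThreadID,
               int3 dtid : SV_DispatchThreadID)
{
    // First pixel column covered by this thread group.
    int groupStartX = dtid.x - gtid.x;

    // Each thread fills one or two cache slots. The slot index always stays
    // in [0, CACHE_SIZE), so the groupshared write is in range.
    for (int slot = gtid.x; slot < CACHE_SIZE; slot += GROUP_SIZE)
    {
        // x may fall outside the image. In D3D11 that texture read returns 0.
        int x = groupStartX + slot - RADIUS;
        sCache[slot] = gSource[int2(x, dtid.y)];
    }

    // Wait until the whole cache has been written.
    GroupMemoryBarrierWithGroupSync();

    float4 sum = float4(0, 0, 0, 0);

    [unroll]
    for (int i = -RADIUS; i <= RADIUS; ++i)
    {
        // Index range is [gtid.x, gtid.x + 2 * RADIUS], always inside sCache.
        sum += sCache[gtid.x + RADIUS + i];
    }

    // If dtid is outside gDest, D3D11 drops the write.
    gDest[dtid.xy] = sum / (2 * RADIUS + 1);
}
```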

 

Note that in D3D12, if you use root SRVs or UAVs there is no bounds checking, so reading or writing out-of-bounds has undefined behavior.
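
For the root descriptor case, a minimal sketch of an explicit guard could look like this. It assumes the application binds `gData` as a root UAV and passes the element count itself; `gData`, `gElementCount`, `cbParams` and `ScaleCS` are hypothetical names.

```hlsl
// Assumed to be bound as a root UAV, so no bounds checking applies to it.
RWStructuredBuffer<float4> gData : register(u0);

cbuffer cbParams : register(b0)
{
    uint gElementCount;  // assumed to be supplied by the application
};

[numthreads(64, 1, 1)]
void ScaleCS(uint3 dispatchThreadID : SV_DispatchThreadID)
{
    uint i = dispatchThreadID.x;

    // The shader checks the range itself: threads past the end do nothing.
    if (i >= gElementCount)
        return;

    gData[i] = gData[i] * 0.5f;
}
```

The early return is safe here because the shader has no group barrier after it.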

Ah OK, this makes sense. As I use LDS I was getting crazy values, but I thought they came from the UAV read. That helps, thanks.
