Bilateral Blur - CS Implementation

Hey.

I have a Gaussian blur implemented in a compute shader, and now I want to implement a bilateral blur (filter), but I don't know where to start.

The formula for the Gaussian blur in 1D is (the implementation is split into a horizontal and a vertical pass to be more efficient):
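A common way to write the 1D Gaussian blur is sketched below. The symbols are assumptions chosen to line up with the shader in this post, not the post's own notation: `I` is the input row (or column), `O` the output, `r` the blur radius (`gBlurRadius`), and `w_i` the normalized weights (`gWeights[i + gBlurRadius]`).

```latex
O(x) = \sum_{i=-r}^{r} w_i \, I(x+i),
\qquad
w_i = \frac{\exp\!\left(-\frac{i^2}{2\sigma^2}\right)}
           {\sum_{j=-r}^{r} \exp\!\left(-\frac{j^2}{2\sigma^2}\right)},
\qquad
\sum_{i=-r}^{r} w_i = 1
```

The 11 values in `gWeights` sum to 1, so they play the role of `w_i` even if they were hand-picked rather than sampled from an exact Gaussian.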

The formula for bilateral blur is:
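In its usual textbook form, restricted to one dimension so it matches a single horizontal or vertical pass, the bilateral filter can be written as below. The symbols are assumptions: `I` is the input, `O` the output, `r` the radius, `w_i` the spatial weights (the same kind of weights a Gaussian blur uses), and `\sigma_r` a range parameter that controls how strongly intensity differences are penalized.

```latex
O(x) = \frac{1}{W(x)} \sum_{i=-r}^{r} w_i \; g\big(I(x+i) - I(x)\big) \; I(x+i),
\qquad
W(x) = \sum_{i=-r}^{r} w_i \; g\big(I(x+i) - I(x)\big),
\qquad
g(d) = \exp\!\left(-\frac{\lVert d \rVert^2}{2\sigma_r^2}\right)
```

Compared with a plain Gaussian blur, the only additions are the range term `g`, which shrinks the weight of neighbors whose color differs from the center pixel, and the division by `W(x)`, which is needed because the combined weights no longer sum to 1.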

The problem is that I don't know how to apply the values I have from the Gaussian blur to the bilateral blur.

The source of the Gaussian blur shader is:

cbuffer cbSettings
{
    float gWeights[11] =
    {
        0.05f, 0.05f, 0.1f, 0.1f, 0.1f, 0.2f, 0.1f, 0.1f, 0.1f, 0.05f, 0.05f,
    };
};

cbuffer cbFixed
{
    static const int gBlurRadius = 5;
};

Texture2D gInput;
RWTexture2D<float4> gOutput;

#define N 256
#define CacheSize (N + 2*gBlurRadius)
groupshared float4 gCache[CacheSize];

[numthreads(N, 1, 1)]
void HorzBlurCS(int3 groupThreadID : SV_GroupThreadID,
                int3 dispatchThreadID : SV_DispatchThreadID)
{
    //
    // Fill local thread storage to reduce bandwidth. To blur
    // N pixels, we will need to load N + 2*BlurRadius pixels
    // due to the blur radius.
    //

    // This thread group runs N threads. To get the extra 2*BlurRadius pixels,
    // have 2*BlurRadius threads sample an extra pixel.
    if(groupThreadID.x < gBlurRadius)
    {
        // Clamp out of bound samples that occur at image borders.
        int x = max(dispatchThreadID.x - gBlurRadius, 0);
        gCache[groupThreadID.x] = gInput[int2(x, dispatchThreadID.y)];
    }
    if(groupThreadID.x >= N-gBlurRadius)
    {
        // Clamp out of bound samples that occur at image borders.
        int x = min(dispatchThreadID.x + gBlurRadius, gInput.Length.x-1);
        gCache[groupThreadID.x+2*gBlurRadius] = gInput[int2(x, dispatchThreadID.y)];
    }

    // Clamp out of bound samples that occur at image borders.
    gCache[groupThreadID.x+gBlurRadius] = gInput[min(dispatchThreadID.xy, gInput.Length.xy-1)];

    // Wait for all threads to finish.
    GroupMemoryBarrierWithGroupSync();

    //
    // Now blur each pixel.
    //
    float4 blurColor = float4(0, 0, 0, 0);

    [unroll]
    for(int i = -gBlurRadius; i <= gBlurRadius; ++i)
    {
        int k = groupThreadID.x + gBlurRadius + i;
        blurColor += gWeights[i+gBlurRadius]*gCache[k];
    }

    gOutput[dispatchThreadID.xy] = blurColor;
}

[numthreads(1, N, 1)]
void VertBlurCS(int3 groupThreadID : SV_GroupThreadID,
                int3 dispatchThreadID : SV_DispatchThreadID)
{
    //
    // Fill local thread storage to reduce bandwidth. To blur
    // N pixels, we will need to load N + 2*BlurRadius pixels
    // due to the blur radius.
    //

    // This thread group runs N threads. To get the extra 2*BlurRadius pixels,
    // have 2*BlurRadius threads sample an extra pixel.
    if(groupThreadID.y < gBlurRadius)
    {
        // Clamp out of bound samples that occur at image borders.
        int y = max(dispatchThreadID.y - gBlurRadius, 0);
        gCache[groupThreadID.y] = gInput[int2(dispatchThreadID.x, y)];
    }
    if(groupThreadID.y >= N-gBlurRadius)
    {
        // Clamp out of bound samples that occur at image borders.
        int y = min(dispatchThreadID.y + gBlurRadius, gInput.Length.y-1);
        gCache[groupThreadID.y+2*gBlurRadius] = gInput[int2(dispatchThreadID.x, y)];
    }

    // Clamp out of bound samples that occur at image borders.
    gCache[groupThreadID.y+gBlurRadius] = gInput[min(dispatchThreadID.xy, gInput.Length.xy-1)];

    // Wait for all threads to finish.
    GroupMemoryBarrierWithGroupSync();

    //
    // Now blur each pixel.
    //
    float4 blurColor = float4(0, 0, 0, 0);

    [unroll]
    for(int i = -gBlurRadius; i <= gBlurRadius; ++i)
    {
        int k = groupThreadID.y + gBlurRadius + i;
        blurColor += gWeights[i+gBlurRadius]*gCache[k];
    }

    gOutput[dispatchThreadID.xy] = blurColor;
}
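One way the horizontal pass might be adapted to a bilateral blur is sketched below. The cache fill is unchanged. Only the final loop differs: each spatial weight from `gWeights` is multiplied by a range weight computed from the color difference to the center pixel, and the result is divided by the sum of the combined weights. The names `HorzBilateralCS` and `gRangeSigma` and the value `0.1f` are hypothetical, the weights are declared as a `static const` array for brevity, and `GetDimensions` is used in place of `gInput.Length`. Treat this as an untested sketch rather than a definitive implementation. Note also that a bilateral filter is not exactly separable, so running a horizontal and then a vertical pass is only an approximation of the full 2D filter.

```hlsl
// Sketch only: horizontal bilateral pass adapted from HorzBlurCS.
// gRangeSigma and HorzBilateralCS are hypothetical names for illustration.
static const int   gBlurRadius = 5;
static const float gRangeSigma = 0.1f; // assumed tuning value for colors in [0, 1]

// Spatial weights, same values as in the Gaussian blur.
static const float gWeights[11] =
{
    0.05f, 0.05f, 0.1f, 0.1f, 0.1f, 0.2f, 0.1f, 0.1f, 0.1f, 0.05f, 0.05f
};

Texture2D gInput;
RWTexture2D<float4> gOutput;

#define N 256
#define CacheSize (N + 2*gBlurRadius)
groupshared float4 gCache[CacheSize];

[numthreads(N, 1, 1)]
void HorzBilateralCS(int3 groupThreadID : SV_GroupThreadID,
                     int3 dispatchThreadID : SV_DispatchThreadID)
{
    uint width, height;
    gInput.GetDimensions(width, height);
    int2 maxCoord = int2(width, height) - 1;

    // Fill the shared cache the same way the Gaussian version does.
    if (groupThreadID.x < gBlurRadius)
    {
        int x = max(dispatchThreadID.x - gBlurRadius, 0);
        gCache[groupThreadID.x] = gInput[int2(x, dispatchThreadID.y)];
    }
    if (groupThreadID.x >= N - gBlurRadius)
    {
        int x = min(dispatchThreadID.x + gBlurRadius, maxCoord.x);
        gCache[groupThreadID.x + 2*gBlurRadius] = gInput[int2(x, dispatchThreadID.y)];
    }
    gCache[groupThreadID.x + gBlurRadius] = gInput[min(dispatchThreadID.xy, maxCoord)];

    GroupMemoryBarrierWithGroupSync();

    // The center pixel is the reference for the range (color) weight.
    float4 center    = gCache[groupThreadID.x + gBlurRadius];
    float4 colorSum  = float4(0, 0, 0, 0);
    float  weightSum = 0.0f;

    [unroll]
    for (int i = -gBlurRadius; i <= gBlurRadius; ++i)
    {
        float4 neighbor = gCache[groupThreadID.x + gBlurRadius + i];

        // Range weight: close to 1 for similar colors, close to 0 across edges.
        float3 diff = neighbor.rgb - center.rgb;
        float rangeWeight = exp(-dot(diff, diff) / (2.0f * gRangeSigma * gRangeSigma));

        // Combined weight = spatial weight * range weight.
        float w = gWeights[i + gBlurRadius] * rangeWeight;
        colorSum  += w * neighbor;
        weightSum += w;
    }

    // Normalize, because the combined weights no longer sum to 1.
    // weightSum is at least gWeights[gBlurRadius], since the center has rangeWeight 1.
    gOutput[dispatchThreadID.xy] = colorSum / weightSum;
}
```

In terms of the Gaussian shader, `gWeights[i+gBlurRadius]` is the spatial term and `gCache[k]` is the neighboring pixel value. The vertical pass would follow the same pattern with `x` and `y` swapped. A smaller `gRangeSigma` preserves edges more strongly, while a very large one makes the result approach the plain Gaussian blur.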

Even with the comments I find it very hard to follow. I'm not at all sure which values come from which calculation, so I have no idea how to implement it.

Maybe you can give me a hint about what exactly is what, or how I can approach the problem, because I really don't get it on my own.

Regards Helgon

Edit:

Some more questions

1) Almost every blur implementation I've found was done in the vertex shader. Is it really worth the "hard work" to do it in the CS? Of course it's much quicker, but is it such a huge difference? And if yes, why isn't it done more often in the CS?

2) And another little question: is the CS used often in game development, or is it used more by rendering software and mathematical applications?
