Sign in to follow this  
ZaiPpA

How does convolution express blur?

Recommended Posts

Hi. [smile] I've been trying to understand what convolution means (when used for blur), since convolution is often mentioned when blur is discussed. I almost get it now (especially by the help of this), but there are still some things i don't understand. My problem right now, is that the convolution formula doesn't really seem to fit with how i blur an image (and how i believe everyone else does it as well). Let's say we have an input-image 'in' and want to generate an output-image 'out' by blurring the input-image. Also, let n be the number of samples per pixel (the kernel-size) (n is odd). For simplicity, let's assume the image we want to blur is one-dimensional, since the formulas are simpler for 1D. This means that our input and output images 'in' and 'out' are 1D-arrays. Also, our kernel W (=the table with the weights) is an 1D-array. The formula i use: This is how i would write the expression for the output-color of pixel x: (this is more or less how i do it in my shaders)
Image Hosted by ImageShack.us
(where out[x] and in[x] is the output-color and input-color at coordinate x, and W[i] is the i'th entry in the weights-array.) So the output-color out[x] is a weighted sum of the n input-pixels around pixel x (both input-pixels to the left of x, to the right of x and x itself are sampled). The convolution formula: Now, when i look at the convolution formula for the output-color out[x], from for example here and here, it is almost the same as mine, but not quite:
Image Hosted by ImageShack.us
The difference between my formula and the convolution formula, lies in which pixels from the input image that are read. It is a bit confusing. I see two problems with the convolution formula:
  • The input-pixels which are read are mirrored. (because it subtracts by i, instead of adding by i). (guess this doesn't matter for symmetric kernels, like most blurs are, but it still seems weird to me)
  • It doesn't read the input-pixels from an area centered around the center-pixel x, but it reads only input-pixels left to x. (this doesn't work for normal blur, so it would seem like you can't use convolution for blur. But i know this is wrong. :) )
I'm sure there is something i'm missing, since i've read in several articles that blur can be expressed in terms of convolution. But it just doesn't seem to fit with how blur is done. What am i missing? Hope you can help shed some light on this :) Thanx! (hmm.. that was a lot of text... sorry for writing a novel :) )

Share this post


Link to post
Share on other sites
Quote:
Original post by ZaiPpA
The input-pixels which are read are mirrored.
(because it subtracts by i, instead of adding by i).
(guess this doesn't matter for symmetric kernels, like most blurs are, but it still seems weird to me)

Let's say that the image is function 'f' and the kernel is function 'g'. In order to "scan" both functions in the same direction, the kernel has to be symmetric, and you can only convolute the image by the kernel (i.e. g*f). This is often the case in blurring which is why we use the simplification. But by "scanning" the functions in the opposite direction, not only can your kernel be asymmetric, but you also have the equality f*g = g*f. Not that in practice you'd ever really commute the functions, but that's the reason why the formal convolution equation does the "mirrored" scans.

Quote:
Original post by ZaiPpA
It doesn't read the input-pixels from an area centered around the center-pixel x, but it reads only input-pixels left to x.
(this doesn't work for normal blur, so it would seem like you can't use convolution for blur. But i know this is wrong. :) )

It's just as indexing thing. They chose to convolute over the range [0,t] but you can use whatever range you want. You use the range of the kernel because those are the values you care about, and everything beyond the edges of the kernel is considered to be zero so it doesn't contribute to the integral anyway.

Share this post


Link to post
Share on other sites
Thanx for your reply! Very helpful

Quote:
Original post by Zipster
Let's say that the image is function 'f' and the kernel is function 'g'. In order to "scan" both functions in the same direction, the kernel has to be symmetric, and you can only convolute the image by the kernel (i.e. g*f). This is often the case in blurring which is why we use the simplification. But by "scanning" the functions in the opposite direction, not only can your kernel be asymmetric, but you also have the equality f*g = g*f. Not that in practice you'd ever really commute the functions, but that's the reason why the formal convolution equation does the "mirrored" scans.


I understand what you say, that if you "scan" the functions in opposite directions, then f*g = g*f. Good explanation! (nice to have a reason for the weirdness)

But i don't understand why the kernel has to be symmetric if you scan in the same direction. Do you mean something like: "We want f*g = g*f to hold. Because we want that, the kernel has to be symmetric if we scan in the same direction, otherwise it wouldn't hold"?
If so, i understand.


Quote:
Original post by Zipster
It's just as indexing thing. They chose to convolute over the range [0,t] but you can use whatever range you want. You use the range of the kernel because those are the values you care about, and everything beyond the edges of the kernel is considered to be zero so it doesn't contribute to the integral anyway.


Agreed, it's just an index thing. But it is kind of important, isn't it? I mean, different indices leads to different results. So it still confuses me.
Maybe i understand it, though... Please tell me if the following is what you mean:


When using the convolution formula for the blur (using the terminology from my first post), we're actually not giving it the input-image in, but a modified input-image in'.
When doing the convolution on pixel x, in' has implicitly first been:
- mirrored around x (so that in'[x]=in[x], in'[x+1]=in[x-1], etc... in'[x+i]=in[x-i] in general)
- and moved floor(n/2) to the right (so that in"[x]=in[x-floor(n/2)])

so for our modified image, this holds: in'[x-i] = in[x+i-floor(n/2)].

In that case my formula and the convolution formula are the same:
Image Hosted by ImageShack.us


If so, i get that too... Convolution is a general framework, you can plug lots of different stuff into. You can insert my formula from above into the framework by doing this modification and you can insert lots of other sampling strategies into the framework as well, like sampling pixels to the right of x instead of centered around x, etc etc etc.


Ok, so we don't give the input-image to the convolution formula directly, but a modified one.
But... This means that when we say/write "convolve this image by this kernel" to someone, it is pretty much undefined what we mean, unless we also describe which modification must be done to the input-image first, right? (ok, it's probably usually implicit or sometimes the unmodified image might be used.. but strictly speaking, i mean)


(hope my long text above isn't too stupid [smile])

Share this post


Link to post
Share on other sites
Convolutions in mathematics are a measure of how much one function overlaps another as it is "shifted" over it. Here's a MathWorld page for the mathematical sense.

For images, your image is effectively a 2D function of the colours of the scene. Your "other function" that you are sweeping over it is a blurring kernel, which is just another 2D function that takes a local average of the intensities of the colours. What the convolution does is sweep that blurring kernel across the whole image to do the blur. It's just the mathematical term for this process.

Convolutions can be used for other kernels than just blurring too. A lot of image-wide manipulations can be expressed as a convolution, like sharpening or band-pass filters, detecting edges, or finding the gradient of the image.

Edit: Here's a tutorial that I got by Googling for convolution and image processing. It mightn't be the best on the 'net, but it's got some diagrams and images that might help visualise the process.

Share this post


Link to post
Share on other sites
Quote:
Original post by ZaiPpA
But i don't understand why the kernel has to be symmetric if you scan in the same direction. Do you mean something like: "We want f*g = g*f to hold. Because we want that, the kernel has to be symmetric if we scan in the same direction, otherwise it wouldn't hold"?
If so, i understand.

I should clarify a little. When you scan in the same direction, you automatically lose f*g = g*f. Both sides of the equation will yield different results, and you have to chose the ride side depending on the result you want. If the function on the left was symmetric, say 'g', then g*f using the same direction convolution would be equal to g*f using the opposite direction convolution (the "proper" mathematical convolution).

Quote:
Original post by ZaiPpA
Agreed, it's just an index thing. But it is kind of important, isn't it? I mean, different indices leads to different results. So it still confuses me.

Convolution is just a tool, and the results will indeed be different depending on the bounds of integration. It depends entirely on what you're trying to achieve.

Quote:
Original post by ZaiPpA
But... This means that when we say/write "convolve this image by this kernel" to someone, it is pretty much undefined what we mean, unless we also describe which modification must be done to the input-image first, right? (ok, it's probably usually implicit or sometimes the unmodified image might be used.. but strictly speaking, i mean)

Strictly speaking, yes it depends on the context. If you just told someone to "convolute 'g' with 'f'", then they'd probably use the proper formula since it's the most generalized one that works in all cases and you haven't given them any more information. If you told someone to "blur image 'f' with kernel 'g'", then they might still use the proper convolution formula for correctness' sake, or they might use another graphics blurring algorithm (like your original formula). If 'f' and 'g' were waveforms, then they might use another completely different equation that's well-established in audio filtering. The point is that the essence of the convolution is still captured, even if the precise equation is modified to reflect the needs of whoever is using it. Convolution, by the way, is the time domain equivalent of multiplication in the frequency domain -- if 'F' is the Fourier transform, then f*g = F(f)xF(g) -- and convolution in the frequency domain is multiplication in the time domain.

Share this post


Link to post
Share on other sites
Thank you very much for the links. There are some good intuitions in it. (the animation in the wolfram link is nice!)

problem 1 (solved):

I see that the wolfram article says that the function g (the kernel in this case) must be mirrored before being used in the formula. So it seems like that part of my 'theory' above was correct (they mirror the kernel instead of the input-image around x, which is also simpler). This solves the mirror problem.

problem 2:

When looking at the discrete finite convolution formula (the one from my first post), there is still the problem that the index i is never negative, and thus it only makes a weighted sum of the pixels to the left of pixel x, instead of a weighted sum of the pixels on both sides of x (neighboring pixels).

One solution would be to modify the input image, before using it in the convolution formula, as i described above. Is this how it's done?


Or maybe this last problem is instead solved by zipsters post from above:
Quote:
They chose to convolute over the range [0,x] but you can use whatever range you want.
That would also solve the problem. I just didn't know that the size of the range was optional. (it doesn't seem like it is, when reading the different articles)

Is it optional because there actually isn't defined a 'finite convolution formula'? Only the 'infinite convolution formula' (with range [-∞,∞]) has been defined, and any finite formula is just derived from this one, by truncating the range? (for example to the range where the kernel is non-zero, as in my case)
(so the one with range [0,x] is just one way of doing it, but a way which is used so often, that it is mentioned in almost every article discussing convolution)

Is this how it is?

(if this is so, i just don't get why the articles i linked to above, use this range, since it clearly doesn't fit with sampling the neighbor pixels (at least not without some modification), even though this is what they say it does in the articles...)

This is the last thing i need to understand before all the pieces fall into place [smile]


Edit: Posted this post before i saw zipsters last post. I would still like a reply to this post, though.

[Edited by - ZaiPpA on October 26, 2008 8:56:41 PM]

Share this post


Link to post
Share on other sites
There is no practical difference between convolving a signal in the two kernel scenarios (-1 to 1) vs (0 to 2).

Either way, you have edge cases that you must handle because of your choices related to your application (image processing.)

In a strictly logical signal processing scenario, there are no edge cases, because there are no edges. There arent any absolute positions either. Everything is relative.

Note that in one case (for a 2D filter) you would have:

output[x] = k0 * input[x-1] + k1 * input[x] + k2 * input[x+1]

the other:

output[x] = k0 * input[x] + k1 * input[x+1] + k2 * input[x+2]

Notice that in both cases, every sample in output[] is the weighted sum of 3 consecutive input[] samples, that their order has not changed, and the coefficients are the same.

What you are 'stuck' on in your 'problem 2' is merely a technicality of the specifics of your application where you want output[x] to be based on weighted averages which are centered on input[x]. Convolution itself neither requires nor benefits from that sort of stipulation.


Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this