In a previous post, we defined a special case of a convolution commonly seen in image processing applications, and gave some code to apply these filters to images.
Now that we’ve seen what a linear filter is, we will look at some examples of commonly used kernels in image processing for things like blurring, and edge detection. For each, we will briefly discuss the rationale behind picking those particular kernels.
A combination of these different convolutions allows one to do a surprisingly large amount of things, including in computer vision applications.
While some of these operations can be achieved using far more sophisticated approaches, applying a simple convolution often achieves pretty good results.
This post focuses on blurring in particular.
A simple averaging (box filter), and Gaussian blur will be considered. In both cases, the key idea is that you are taking an average over a neighbouring area to produce pixel values that are closer to in its neighbours.
This increased similarity means the picture becomes less defined as visual features which we associate with higher contrasting values, start to blend closer to its neighbours.
A 3×3 box blur kernel looks like
As you can see, you really are just taking an average of the 9 pixel values around your chosen point.
A Gaussian blur is a similar, except instead of giving all the pixels the same weight, we weight them using the Gaussian function. For those, that are mathematically inclined, it looks like
where the is a standard deviation you choose, is the distance in the axis away from our centre, and is the disatnce in the axis away from our centre.
The idea with using a Gaussian is to capture the idea that things that are far away lose their relevance quickly, so their weight should also get small quickly, but in a smooth way. You could also very well choose another weighting function that also penalises distance from origin.
A 3×3 Gaussian blur kernel with standard deviation , might look approximately like
Where do these numbers come from, you ask? Lets refer back to the Gaussian function. When we are at the origin, the distance away from the origin is 0 in the x and y axis, so that
If we shift along the x axis by 1, then
so that moving by a distance of 1 either vertically, or horizontally gives you a Gaussian value that is roughly times the size of the origin.
If we move vertically by 1, then horizontally by 1, our Gaussian
is roughly times the size of the origin.
Hence, the chosen matrix is roughly proportional. If you sum up all the matrix entries, you end up with 50, so we divide the whole thing by 50 to average it out the sum (similar to the box blur).
Integers are chosen to make it easier for a computer to work with. Along the same train of thought, one could also approximate to powers of 2 to make arithmetic even faster on a computer.
Generated with OpenCV blur and GaussianBlur functions.
Box blur 31×31
Gaussian blur 31×31 with standard deviation 5
In the next post we examine sharpening using another convolution.