Many techniques in image processing involve transformations that make use of neighbouring information. One simple class of transformations just takes a linear combination of some fixed neighbours. This class of transformation can be used to perform operations like blurring, sharpening, and edge detection. A surprisingly large amount of image manipulation can be done effectively using a combination of these transformations. They are sometimes referred to as linear filters.
This post aims to lay out some of the theory behind linear filters used in image processing. The aim is not to be as general or abstract as possible; rather, the focus is on the details relevant to implementation. Some Python code will be presented to illustrate how one can apply these filters.
In a future post, examples of common convolutions used in image processing will be presented.
For the rest of this post, indexing of matrices will start at 0, rather than 1. If you have not come across matrices before, you can think of them like 2D arrays. If F is a matrix, then F(i, j) refers to an entry of the matrix F, where i refers to the row coordinate, and j refers to the column coordinate.
Consider the following example. Let F be a 4×4 matrix representing an image, and let H be a 3×3 matrix of the coefficients used in the linear combinations (the filter coefficients). H is sometimes called a kernel, or a mask.

We define a linear filter G by

    G(i, j) = Σ_{k=-1..1} Σ_{l=-1..1} H(1+k, 1+l) F(i+k, j+l).

After applying the filter, the new pixel value of the image at coordinate (1, 1) is

    G(1, 1) = Σ_{k=-1..1} Σ_{l=-1..1} H(1+k, 1+l) F(1+k, 1+l).
Notice that it is just the weighted sum of the pixel values surrounding the point (1, 1), with the matrix H providing the weights. This top-down, left-right multiply-and-add operation is sometimes called a correlation operation.
If instead we picked a point on the edge of the image, say (0, 0), the pixel values that are out of bounds, e.g., F(-1, -1), are treated as having the value 0.
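This boundary convention can be pictured as zero-padding the image. A small NumPy sketch, with assumed pixel values:

```python
import numpy as np

# A tiny 2x2 image with assumed pixel values.
F = np.array([[1., 2.],
              [3., 4.]])

# Treating out-of-bounds pixels as 0 is the same as surrounding the
# image with a one-pixel border of zeros before applying a 3x3 kernel.
P = np.pad(F, 1)

assert P.shape == (4, 4)
assert P[0, 0] == 0.0   # the out-of-bounds neighbour F(-1, -1) reads as 0
assert P[1, 1] == 1.0   # the original F(0, 0) is unchanged
```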
To be a little more general, if the size of H is (2N+1) × (2N+1), for some integer N ≥ 0, then

    G(i, j) = Σ_{k=-N..N} Σ_{l=-N..N} H(N+k, N+l) F(i+k, j+l).
With the way the filter is presented here, it is necessary to pick odd-numbered dimensions (i.e., sizes of the form 2N+1). Note that you can always turn non-square and even-dimensional kernels into equivalent square, odd-dimensional kernels by padding them with rows and columns of zeros, so we will just consider the latter.
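To see why this loses no generality, here is a sketch with an assumed 2×3 kernel: the zero entries contribute nothing to the weighted sum, though the padding should be chosen so that the intended centre coefficient lands in the middle of the square.

```python
import numpy as np

# An assumed 2x3 kernel: even height, non-square.
H = np.array([[1., 2., 3.],
              [4., 5., 6.]])

# Embed it in the next odd square size (3x3) by appending a zero row.
# The zero row adds nothing to any weighted sum computed with H_odd.
H_odd = np.pad(H, ((0, 1), (0, 0)))

assert H_odd.shape == (3, 3)
assert H_odd[2].sum() == 0.0   # the padded row is all zeros
```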
Alternatively, you can use a convolution operator rather than a correlation, i.e.,

    G(i, j) = Σ_{k=-N..N} Σ_{l=-N..N} H(N+k, N+l) F(i-k, j-l).

Note the signs: the image is now sampled at (i-k, j-l) rather than (i+k, j+l).
Applying the convolution operator with the example F and H yields a different result: the coefficients are applied in reverse order, from bottom to top, right to left. So to get the same filter using a convolution operation, we would instead define the kernel to be H rotated by 180° (equivalently, H flipped both vertically and horizontally).
In practice, a lot of kernels have nice rotational symmetry (i.e., rotating the values by 180° gives the same matrix), making it irrelevant whether you choose to apply a convolution or a correlation operation.
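A quick NumPy sketch of both points, with assumed pixel and coefficient values: flipping the kernel by 180° turns a correlation into a convolution, and for a 180°-symmetric kernel the two coincide.

```python
import numpy as np

F = np.arange(25, dtype=float).reshape(5, 5)   # assumed 5x5 image values

def correlate_at(F, H, i, j):
    # top-down, left-right multiply-and-add over the 3x3 neighbourhood of (i, j)
    return float((F[i-1:i+2, j-1:j+2] * H).sum())

def convolve_at(F, H, i, j):
    # convolution applies the coefficients bottom-to-top, right-to-left,
    # which is a correlation with the kernel rotated by 180 degrees
    return correlate_at(F, np.flip(H), i, j)

# A 180-degree-symmetric kernel is unchanged by the flip, so
# correlation and convolution agree at every point.
Hs = np.array([[1., 2., 1.],
               [2., 4., 2.],
               [1., 2., 1.]])
assert np.array_equal(np.flip(Hs), Hs)
assert convolve_at(F, Hs, 2, 2) == correlate_at(F, Hs, 2, 2)
assert correlate_at(F, Hs, 2, 2) == 192.0
```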
However, since convolutions have nice theoretical properties, there are good reasons why many people present these transformations using convolutions.
One of these nice properties is the idea of separability. If a linear filter is separable, then the key information in the kernel is contained in a column (or row) matrix. This allows us to reduce the number of operations required to compute the filter at each point from (2N+1)² multiplications to just 2(2N+1), by convolving the image with the column kernel v, followed by a convolution with its transpose (v as a row).
For example, if we take the same image F as before, but now with a separable kernel, we can compute the filter by first convolving each column of F with the column kernel v, and then convolving each row of the result with the row kernel vᵀ. This two-pass computation is equivalent to a single convolution with the 2D kernel v vᵀ, which is the matrix product (in the linear algebra sense) of the column v and the row vᵀ.
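For instance, with an assumed column kernel v = (1, 2, 1)ᵀ, the matrix product v vᵀ produces the corresponding 2D kernel:

```python
import numpy as np

v = np.array([[1.], [2.], [1.]])   # an assumed 3x1 column kernel

H = v @ v.T                        # matrix product of the column and the row

assert np.array_equal(H, np.array([[1., 2., 1.],
                                   [2., 4., 2.],
                                   [1., 2., 1.]]))
```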
Unfortunately, not all linear filters are separable. While we won't go into the details, a few comments won't go amiss. You can determine whether your 2D kernel matrix is separable by checking that it has matrix rank (in the linear algebra sense) one, or equivalently, that it has only one non-zero singular value. To actually calculate the column vector, one could take the first column of one of the unitary matrices obtained by the singular value decomposition (SVD), scaled by the only non-zero singular value, with the corresponding row of the other unitary matrix as the row kernel.
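As a sketch of the rank test and the SVD-based factorisation (the kernel values are assumed; here the singular value is split symmetrically between the two 1D factors, one of several equivalent scalings):

```python
import numpy as np

# An assumed separable kernel, built as an outer product and hence rank one.
H = np.outer([1., 2., 1.], [1., 2., 1.]) / 16.0

U, s, Vt = np.linalg.svd(H)

# Separable <=> rank one <=> exactly one non-zero singular value.
assert np.sum(s > 1e-10) == 1

# One convenient factorisation: split the singular value between the
# first column of U and the first row of Vt.
col = U[:, 0] * np.sqrt(s[0])
row = Vt[0, :] * np.sqrt(s[0])
assert np.allclose(np.outer(col, row), H)
```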
These examples assume a monochrome bitmap image stored in a 2D array.
Applying a 2D convolution to one position
# Returns the pixel value at coordinates (i, j) of the image
# after the filter is applied.
def filter2D(image, kernel, i, j):
    result = 0.0
    kLen = len(kernel)
    hght = len(image)
    wdth = len(image[0])
    N = (kLen - 1) // 2
    for k in range(-N, N + 1):
        for l in range(-N, N + 1):
            # Out-of-bounds pixels are treated as 0, so just skip them.
            if 0 <= i - k < hght and 0 <= j - l < wdth:
                result += image[i - k][j - l] * kernel[N + k][N + l]
    return result
Applying a separable filter to an image
def convCol(image, kernel):
    # 1D convolution of each column with the column kernel.
    kLen = len(kernel)
    height = len(image)
    width = len(image[0])
    result = [[0.0] * width for _ in range(height)]
    N = (kLen - 1) // 2
    for i in range(height):
        for j in range(width):
            for k in range(-N, N + 1):
                if 0 <= i - k < height:
                    result[i][j] += image[i - k][j] * kernel[N + k]
    return result

def convRow(image, kernel):
    # 1D convolution of each row with the row kernel.
    kLen = len(kernel)
    height = len(image)
    width = len(image[0])
    result = [[0.0] * width for _ in range(height)]
    N = (kLen - 1) // 2
    for i in range(height):
        for j in range(width):
            for k in range(-N, N + 1):
                if 0 <= j - k < width:
                    result[i][j] += image[i][j - k] * kernel[N + k]
    return result

# Returns the entire image with the separable filter applied to it.
def separableFilter(image, kernel):
    fPrime = convCol(image, kernel)
    return convRow(fPrime, kernel)
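As a sanity check on the idea (a self-contained NumPy sketch with an assumed 1D kernel, not a test of the exact code above): two 1D passes give the same result as a single 2D convolution with the outer-product kernel, under the same zero-boundary convention.

```python
import numpy as np

def conv2d(F, H):
    # Zero-padded 2D convolution producing an output the same size as F.
    a, b = H.shape[0] // 2, H.shape[1] // 2
    P = np.pad(F, ((a, a), (b, b)))
    Hf = np.flip(H)   # flip so the sliding-window sum is a convolution
    out = np.zeros_like(F)
    for i in range(F.shape[0]):
        for j in range(F.shape[1]):
            out[i, j] = (P[i:i + 2*a + 1, j:j + 2*b + 1] * Hf).sum()
    return out

F = np.arange(16, dtype=float).reshape(4, 4)   # assumed image values
v = np.array([1., 2., 1.]) / 4.0               # assumed 1D kernel

# Column pass, then row pass, versus one pass with the outer product.
two_pass = conv2d(conv2d(F, v.reshape(3, 1)), v.reshape(1, 3))
one_pass = conv2d(F, np.outer(v, v))
assert np.allclose(two_pass, one_pass)
```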
Most of the time though, you will probably just want to use something like OpenCV, and get access to these sorts of functions for free.
- Computer Vision: Algorithms and Applications, Richard Szeliski, Springer, 2011. Author homepage: http://szeliski.org/Book
- Computer vision: filtering (lecture notes), Raquel Urtasun: http://www.cs.toronto.edu/~urtasun/courses/CV/lecture02.pdf
- Separable convolutions, Steve Eddins: http://blogs.mathworks.com/steve/2006/10/04/separable-convolution
- Separable convolutions – Part 2, Steve Eddins: http://blogs.mathworks.com/steve/2006/11/28/separable-convolution-part-2
- Basics of convolutions: http://matlabtricks.com/post-3/the-basics-of-convolution