Image Filtering

Required Reading/Viewing:

  1. How Blurs & Filters Work - Computerphile. A quick note on this video: one thing Dr. Mike Pound discusses is using padding on the edges of images to keep sizes consistent through the filtering process. We won't be addressing padding just yet here, but will instead allow our images to become smaller as we filter.

Additional Reading

  1. Forsyth, David A., and Jean Ponce. Computer vision: a modern approach. Prentice Hall Professional Technical Reference, 2002. Section 4.1

Recommended Jupyter Theme for presenting this notebook:

jt -t grade3 -cellw=90% -fs=20 -tfs=20 -ofs=20
In [1]:
#(Optional) Download data + videos if you don't have them.
import os, sys
sys.path.append('..')
from util.get_and_unpack import get_and_unpack

if not os.path.isdir('../data/'):
    url = 'http://www.welchlabs.io/unccv/the_original_problem/data/data.zip'
    get_and_unpack(url, location='..')
    
if not os.path.isdir('../videos/'):
    url = 'http://www.welchlabs.io/unccv/the_original_problem/videos.zip'
    get_and_unpack(url, location='..')   

Problems with our implementation of the Roberts Cross

  1. Noisy - the Roberts Cross is quite susceptible to noise.
  2. Anisotropic - it doesn't respond as strongly to slanted edges.
  3. Slow - at least as we've implemented it here.
  4. No direction - as we've implemented the Roberts Cross here, we only learn the magnitude of our estimate of the image intensity derivative, not its direction.

Hopefully you've taken some time to think through some possible fixes to these problems. Let's start by seeing if we can make our images less noisy.

Noise

In [2]:
%pylab inline
Populating the interactive namespace from numpy and matplotlib
In [3]:
#Load image
im = imread('../data/medium_2/ball/ball_3.jpg')
In [4]:
#Let's have a look
fig = figure(0, (12,12))
imshow(im)
title('Gross, so noisy! What is this, 2003?');
  • This is a particularly noisy image from our dataset - it was captured under relatively low light conditions with the webcam of my 2008 MacBook.
  • The z-axis height of the surface below corresponds to the grayscale intensity of our image.

  • Let's have a look at the result of our implementation of the Roberts Cross from last time on this image.
In [5]:
#Add path to top level directory so we can access utilities
import sys
sys.path.append('..')

#Import grayscale method
from util.image import convert_to_grayscale
In [6]:
#Scale image from 0-1, and convert to grayscale
gray = convert_to_grayscale(im/255.) 
In [7]:
fig = figure(0, (6,6))
imshow(gray, cmap = 'gray');
In [8]:
hist(gray.ravel(), 100);
grid(1)
In [9]:
from util.filters import roberts_cross
In [10]:
edges = roberts_cross(gray)
In [11]:
fig = figure(0, (12,12))
imshow(edges);

Now, how can we remove some of the noise from our image?

  • Any ideas?
  • Are there properties of our noise that we can exploit to help us separate it from our signal?

Local Averages

A very successful/popular approach to noise reduction is to replace the individual pixels in our images with some type of weighted average of their local neighborhoods.

How would this operation change the appearance of our image?

  • Let's find out!
  • Now, before we code this up, let's generalize this process a little.
  • A more general way to think about the average of our neighborhood is as the dot product of our image patch with a (3x3) matrix.
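To make this concrete - here's a quick sketch (not one of the notebook's cells) showing that the plain average of a 3x3 neighborhood is exactly the elementwise product of the patch with a uniform (1/9) kernel, summed up:

```python
import numpy as np

# A made-up 3x3 image patch, just for illustration
patch = np.array([[1., 2., 3.],
                  [4., 5., 6.],
                  [7., 8., 9.]])

# Uniform averaging kernel: every neighbor gets equal weight 1/9
kernel = np.ones((3, 3)) / 9

# Elementwise multiply and sum - the "dot product" of patch and kernel
weighted_average = np.sum(patch * kernel)
print(weighted_average)  # 5.0, the mean of the patch
```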
In [12]:
#Little trick to progress through slides within the notebook
from IPython.display import Image, display
from ipywidgets import interact

#Quick method to let me step through "slides"
def slide_show(slide_num=1):     
    display(Image('../graphics/averaging_slides/' + str(slide_num).zfill(2) + '.png'))
In [13]:
interact(slide_show, slide_num = (1, 2));
  • It turns out that taking the dot product of our image patch with a small matrix or kernel like this is incredibly common in computer vision.
  • Doing this across a whole image is called filtering, convolution, or cross correlation; we'll get into the details later.
  • Let's write a method to filter an image with an arbitrary kernel/filter.
  • First, let's think through how exactly our method will work.


  • Now, let's think about how to describe what we would like to do mathematically.
  • But first, we need to define the coordinate system we'll use on our images.
In [14]:
fig = figure(0, (9,9))
ax = fig.add_subplot(111)
ax.imshow(edges)
ax.tick_params(axis='both', which='major', labelsize=25)
  • As you may already know, the coordinate system of images in matplotlib is defined with the origin in the upper left hand corner; positive directions are down and to the right. Our first index corresponds to the vertical dimension and our second index corresponds to the horizontal.
  • We'll set up our mathematics in the same way, where $I(i = 0, j = 0)$ represents the upper left pixel.



  • Alright, now we're getting somewhere!
  • This operation is known as cross correlation in 2 dimensions, and is very similar to convolution, something we'll discuss in depth later. To compute convolution instead of cross correlation, we simply flip our kernel relative to our image.
  • We'll use the symbol $ \star $ to denote cross correlation.
  • Not to be confused with $\ast$, which often denotes convolution.
  • Now, let's code up our cross correlation operation.
  • We could implement our filter exactly as written in the above equation; however, this would require iterating through $i, j$ as well, making for 4 for-loops. Also, the indexing of $S$ is a little awkward - we'll simplify things a little, and use np.multiply to multiply our image patch and kernel matrices together:
In [15]:
def filter_2d(im, kernel):
    '''
    Filter an image by taking the dot product of each 
    image neighborhood with the kernel matrix.
    Args:
    im = (H x W) grayscale floating point image
    kernel = (M x N) matrix, smaller than im
    Returns: 
    (H-M+1 x W-N+1) filtered image.
    '''

    M, N = kernel.shape
    H, W = im.shape
    filtered_image = np.zeros((H-M+1, W-N+1), dtype = 'float64')
    
    for i in range(filtered_image.shape[0]):
        for j in range(filtered_image.shape[1]):
            image_patch = im[i:i+M, j:j+N]
            filtered_image[i, j] = np.sum(np.multiply(image_patch, kernel))
            
    return filtered_image
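One of the problems we listed for our Roberts Cross implementation was speed, and the same criticism applies to these nested Python loops. As a sketch of an alternative (assuming NumPy >= 1.20 for np.lib.stride_tricks.sliding_window_view, and re-stating the loop version here so the comparison is self-contained), the double loop can be vectorized:

```python
import numpy as np

def filter_2d_loops(im, kernel):
    # Reference implementation with explicit loops, as in filter_2d above
    M, N = kernel.shape
    H, W = im.shape
    out = np.zeros((H - M + 1, W - N + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(im[i:i+M, j:j+N] * kernel)
    return out

def filter_2d_vectorized(im, kernel):
    # Gather all (M x N) neighborhoods at once, then contract with the kernel
    windows = np.lib.stride_tricks.sliding_window_view(im, kernel.shape)
    return np.einsum('ijkl,kl->ij', windows, kernel)

# Both versions should agree on a random test image
rng = np.random.default_rng(0)
im = rng.random((32, 32))
kernel = np.ones((3, 3)) / 9
print(np.allclose(filter_2d_loops(im, kernel),
                  filter_2d_vectorized(im, kernel)))  # True
```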

  • Now, remember that the whole point of all of this was to try to reduce the noise in our images by taking local averages.
  • Another name for this local average kernel is a box or boxcar function.
  • Let's try doing this with our new filter_2d method.
In [16]:
kernel = (1/9)*np.ones((3,3))
In [17]:
kernel
Out[17]:
array([[0.11111111, 0.11111111, 0.11111111],
       [0.11111111, 0.11111111, 0.11111111],
       [0.11111111, 0.11111111, 0.11111111]])
In [18]:
# Filter Image!
im_filtered = filter_2d(gray, kernel)
In [19]:
im_filtered.shape
Out[19]:
(254, 254)
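This matches the shape formula in filter_2d's docstring: with an (M x N) kernel, each output dimension shrinks by the kernel size minus one. A quick check, assuming the grayscale image is 256x256 (consistent with the (254, 254) output above):

```python
# Assumed input size, consistent with the output shape shown above
H, W = 256, 256
M, N = 3, 3  # kernel size

# Output shape of 'valid' filtering: each dimension shrinks by (kernel size - 1)
out_shape = (H - M + 1, W - N + 1)
print(out_shape)  # (254, 254)
```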
In [20]:
fig = figure(0, (20,10))

fig.add_subplot(1,2,1)
imshow(gray, cmap = 'gray');
title('Before');

fig.add_subplot(1,2,2)
imshow(im_filtered, cmap = 'gray');
title('After');
  • It may be a little hard to see, but our filtered image is a little smoother!

How could we smooth our image even further?

  • One way to achieve this is to use a larger kernel:
In [21]:
kernel = (1/81)*np.ones((9,9))
im_filtered = filter_2d(gray, kernel)

fig = figure(0, (20,10))

fig.add_subplot(1,2,1)
imshow(gray, cmap = 'gray');
title('Before');

fig.add_subplot(1,2,2)
imshow(im_filtered, cmap = 'gray');
title('After');
  • Now we can really see some smoothing taking place!
  • Let's have a look at our filtering operation on another image:
In [22]:
kernel = (1/81)*np.ones((9,9))
In [23]:
#Load up another image:
brick = imread('../data/easy/brick/brick_4.jpg')
In [24]:
#Scale intensity values between 0 and 1, convert to grayscale, and filter
brick_gray = convert_to_grayscale(brick/255.)
brick_filtered = filter_2d(brick_gray, kernel)
In [25]:
fig = figure(0, (12,12));
imshow(brick_filtered, cmap = 'gray');
  • A blurry brick!
  • Now, do the edges of this brick look weird to you at all?
  • Anything strange going on?
  • To me, it almost looks like there's some kind of "crosshatch" pattern around our edges.
  • Let's look at a toy example:
In [26]:
toy_image = np.zeros((21,21))
toy_image[10, 10] = 1.0
In [27]:
fig = figure(0, (6, 6))
imshow(toy_image,vmin=0, vmax=1);
  • Our toy image is a small (21x21) image of all zeros with a single value of 1 in the very center.
  • A discrete 2d Dirac Delta function.
  • Now, what will happen if we filter this function with our moving average?

Note: the color range is different in each image - be sure to pay attention to the colorbar units.


  • As you may have guessed, the correct answer is a.

Quick Explanation

  • One way to think about this is to imagine our kernel sliding over our image.
  • When our kernel is in, let's say, the upper left corner, all image values are zero - resulting in a filtered value of zero.
  • Now, whenever our kernel overlaps the central pixel at all, the resulting output value will be $\frac{1}{25} \cdot 1 = 0.04$.
  • It doesn't matter if our kernel is directly above our central pixel or if the central pixel just "clips" the corner of our kernel - if the central pixel is anywhere in the kernel, our output value will be a constant $0.04$.
  • This is why the central patch of our correct answer a is uniform.
  • Finally, the answer is not b. b is uniform, but its central patch is not quite large enough - it's about 3x3, and it should be the same size as our filter.
In [28]:
#And finally, just in case you're suspicious, we can actually perform the filtering in question:
box_5 = (1/25)*np.ones((5,5))
filtered_box_5 = filter_2d(toy_image, box_5)
imshow(filtered_box_5)
colorbar()
Out[28]:
<matplotlib.colorbar.Colorbar at 0x117edf748>

An important point here - the above result gives the impression that filtering has made the center pixel spread without changing its intensity, because we've scaled the colors to the maximum value in the filtered image. Another option here is to use a consistent color mapping, which gives us a better feel for how our input and output intensities compare:

In [29]:
fig = figure(0, (18, 7))
fig.add_subplot(1,2,1)
imshow(toy_image,vmin=0, vmax=1);
colorbar();
title('Before Filtering');

fig.add_subplot(1,2,2)
imshow(filtered_box_5, vmin=0, vmax=1);
colorbar();
title('After Filtering');

  • So the center pixel's intensity was distributed (smoothed) as a result of filtering.
  • Now, the reason for this fun little exercise:

This may not be a very "natural" way to blur our images.

When our eyes and most cameras create blur by being out of focus, the intensity generally falls off much more smoothly than we've seen with our boxcar smoothing filter.

How then, should we blur our images?

  • For our blur to look more natural, or behave like a de-focused camera lens, it would make sense for our blur to smoothly fade out from the central bright pixel - more like option d in the question above.
  • We can accomplish this by taking a weighted average around the central pixel in our kernel instead of a uniform average as we have thus far.
  • A very popular way to do this in image processing is to use Gaussian kernels:

  • Gaussian kernels have lots of nice properties, but the main one we care about here is that they decrease monotonically and smoothly from their central point.
  • Check out section 5.7 of Computer Vision by Shapiro + Stockman for some more nice properties.
  • For now, let's figure out how to implement these guys in Python.
  • Here's the equation for a 2 dimensional symmetric Gaussian kernel:

$$
G_{\sigma}(x, y) = \frac{1}{2 \pi \sigma^2} \exp \bigg(- \frac{x^2+y^2}{2 \sigma^2} \bigg)
$$

  • Let's think through this function a bit.
  • What do $x$, $y$, and $\sigma$ represent?
  • What values of $x$ and $y$ maximize $G_{\sigma}(x, y)$?
  • And most importantly, how do we make this continuous function definition into a kernel we can actually use?

Explanation

  • One way I like to think through problems like this is to pick some test numbers.
  • Let's pick the size of our kernel to be (5x5); this would make k=2.
  • We know that our Gaussian kernel should reach a maximum value at the center of our kernel (i=2, j=2).
  • Our Gaussian kernel function $G_{\sigma}(x, y)$ reaches a maximum when x=0 and y=0.
    • One way to think about this is that any non-zero value of $x$ or $y$, when squared, becomes a positive number, and makes the overall argument of our exponent negative. $e$ to a negative power is always smaller than $e^{0}$, so $G_{\sigma}(x, y)$ reaches a maximum when x=0 and y=0.
  • So the correct answer above should give an argument of zero inside our exponent at the center of our kernel, when (i=2, j=2).
  • The only answer that meets this criterion is d.
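We can sanity-check this reasoning numerically. Here's a minimal sketch (not one of the notebook's cells) evaluating the Gaussian equation directly - it's largest at the origin, where the exponent's argument is zero:

```python
import numpy as np

def G(x, y, sigma=1.0):
    # 2D symmetric Gaussian, as written in the equation above
    return (1 / (2 * np.pi * sigma**2)) * np.exp(-(x**2 + y**2) / (2 * sigma**2))

center = G(0, 0)
print(center)  # 1/(2*pi) ~= 0.1592 for sigma = 1

# Any non-zero (x, y) gives a strictly smaller value
print(all(G(x, y) < center for x, y in [(1, 0), (0, 1), (2, 2), (-1, 3)]))  # True
```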

Alright, let's code it up!

In [30]:
def make_gaussian_kernel(size, sigma):
    '''
    Create a Gaussian kernel of size x size. 
    Args: 
    size = must be an odd positive number
    sigma = standard deviation of the Gaussian in pixels
    Returns: A floating point (size x size) Gaussian kernel 
    ''' 
    #Make kernel of zeros:
    kernel = np.zeros((size, size))
    
    #Handle sigma = 0 case (will result in dividing by zero below if unchecked)
    if sigma == 0:
        return kernel 
    
    #Helpful for indexing:
    k = int((size-1)/2)
    
    for i in range(size):
        for j in range(size):
            kernel[i, j] = (1/(2*np.pi*sigma**2))*exp(-((i-k)**2 + (j-k)**2)/(2*sigma**2))
            
    return kernel
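One caveat with the implementation above: the discrete kernel's entries only approximately sum to 1, since the continuous normalizing constant $\frac{1}{2\pi\sigma^2}$ doesn't account for sampling and truncation, so filtering with it can slightly dim the image. A common refinement - sketched here as a variation, not part of the notebook's method - is to renormalize the kernel so its entries sum to exactly 1:

```python
import numpy as np

def make_normalized_gaussian_kernel(size, sigma):
    # Sample the Gaussian on an integer grid centered at the kernel's middle,
    # then renormalize so the weights sum to exactly 1
    k = (size - 1) // 2
    ii, jj = np.meshgrid(np.arange(size) - k, np.arange(size) - k)
    kernel = np.exp(-(ii**2 + jj**2) / (2 * sigma**2))
    return kernel / kernel.sum()

kernel = make_normalized_gaussian_kernel(11, 2)
print(round(kernel.sum(), 10))  # 1.0
```

With a kernel that sums to 1, filtering a constant image returns the same constant - the blur redistributes intensity without losing any.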
In [31]:
gaussian_kernel = make_gaussian_kernel(size = 11, sigma = 2)
In [32]:
imshow(gaussian_kernel)
Out[32]:
<matplotlib.image.AxesImage at 0x1192ebbe0>

And in 3d:

In [33]:
from mpl_toolkits.mplot3d import Axes3D

X, Y = np.meshgrid(range(11), range(11))
# Plot kernel as 3d surface.
fig = plt.figure(0, (9,9))
ax = fig.add_subplot(111, projection='3d')
surf = ax.plot_surface(X, Y, gaussian_kernel, cmap=cm.viridis, antialiased=True)

Alright, let's put it all together.

In [34]:
box_kernel = (1/25)*np.ones((5,5))
gaussian_kernel = make_gaussian_kernel(size = 7, sigma = 1.5)

fig = figure(0, (20, 12))

fig.add_subplot(2,3,1)
imshow(toy_image, cmap = 'gray')
title('Original Image');

fig.add_subplot(2,3,2)
imshow(filter_2d(toy_image, box_kernel), cmap = 'gray')
title('Box Function Blur');

fig.add_subplot(2,3,3)
imshow(filter_2d(toy_image, gaussian_kernel), cmap = 'gray')
title('Gaussian Blur');

fig.add_subplot(2,3,4)
imshow(brick_gray, cmap = 'gray')
title('Original Image');

fig.add_subplot(2,3,5)
imshow(filter_2d(brick_gray, box_kernel), cmap = 'gray')
title('Box Function Blur');

fig.add_subplot(2,3,6)
imshow(filter_2d(brick_gray, gaussian_kernel), cmap = 'gray')
title('Gaussian Blur');
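Gaussian kernels have one more property worth noting, relevant to the "slow" complaint from earlier: they're separable. As a sketch (using a hypothetical gauss_1d helper, not one of the notebook's utilities), the 2D kernel factors into the outer product of two 1D Gaussians, which means an (N x N) Gaussian filter can be applied as two cheap 1D passes:

```python
import numpy as np

def gauss_1d(size, sigma):
    # Hypothetical helper: a normalized 1D Gaussian sampled on an integer grid
    k = (size - 1) // 2
    x = np.arange(size) - k
    g = np.exp(-x**2 / (2 * sigma**2))
    return g / g.sum()

g1 = gauss_1d(7, 1.5)
g2d = np.outer(g1, g1)  # separability: 2D kernel = outer product of 1D kernels

# Compare against a directly constructed, normalized 2D Gaussian
k = 3
xx, yy = np.meshgrid(np.arange(7) - k, np.arange(7) - k)
direct = np.exp(-(xx**2 + yy**2) / (2 * 1.5**2))
direct /= direct.sum()
print(np.allclose(g2d, direct))  # True
```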

Ok, so we've found a good method for reducing the noise level in our images - let's try it out with our edge image from the Roberts Cross, and see if we can reduce some of the noise.

In [35]:
fig = figure(0, (6,6))
imshow(edges);
In [36]:
#Make gaussian kernel
gaussian_kernel = make_gaussian_kernel(size = 11, sigma = 2)

#Filter edge image
edges_filtered = filter_2d(edges, gaussian_kernel)
In [37]:
fig = figure(0, (16,9))
fig.add_subplot(1,2,1)
imshow(edges)
title('Original');

fig.add_subplot(1,2,2)
imshow(edges_filtered)
title('Filtered with Gaussian Kernel');
  • As you can see, some of our noise has been smoothed out by our Gaussian kernel.
  • This is a little easier to see in 3d:

  • Finally, we can use one of my favorite tools, the interact module to play with our smoothing:
In [38]:
from ipywidgets import interact
In [39]:
def blur_and_display(size = 5, sigma = 2):
    #Make gaussian kernel
    gaussian_kernel = make_gaussian_kernel(size = size, sigma = sigma)

    #Filter edge image
    edges_filtered = filter_2d(edges, gaussian_kernel)
    
    fig = figure(0, (9,9))
    imshow(edges_filtered)
In [40]:
interact(blur_and_display, size = (3, 31, 2), sigma = (1, 10, 0.5));

Appendix

Computation for first filtering example.

In [41]:
im = np.array([[3, 2, 8 , 5, 7], 
               [1, 4, 8, 9, 5], 
               [2, 6, 7, 3, 9], 
               [9, 7, 8, 3, 1], 
               [8, 2, 3, 4, 2]])
k = np.array([[0, -1, 0], 
              [-1, 1, 0],
              [0, 2, -1]])
              
filter_2d(im, k)
Out[41]:
array([[ 6.,  7., -7.],
       [ 6.,  6., -8.],
       [-7., -4., -2.]])