Wednesday, July 14, 2010

Activity 6: Fourier Transform Model of Image Formation

Fourier Transforms
Fourier transform is one of many linear transforms that changes signals into inverse space, meaning that if the signal is in x dimension, it will convert it into 1/x dimension or also known as the frequency space. The Fourier transform of a signal f(x,y) is given by:
Fourier transforms are commonly implemented as built-in function in many programming language. Some of these are Python, Matlab and Scilab. The Fast Fourier Transform (FFT) algorithm by Cooley and Turkey is a fast and efficient implementation of the Fourier transform. However, to maximize the efficiency of the FFT, the following signals must have sizes that are of powers of two (64x64, 128x128, 256x256, etc.).

In applying FFT on a signal, quadrants will be inverted diagonally, but this can be easily fixed by shifting back the quadrants to their original places. The output of the transformation yields imaginary numbers (since the integrand involves an exponential of a imaginary number), so to view the intensity of the transformation, the modulus of signal must be plotted. In inverse Fourier transform, it is just the same as using FFT but the reconstructed signal will be inverted due to the negative sign present in the Fourier transform equation.

Basics of FFT
Now let us explore and get familiar in the FFT function in scilab. Let us start with a circle, a simple geometric shape.
Figure 1. A binary image of a centered circle. Image dimension is 128x128.

Applying the FFT on the image in figure 1, we obtain the linear transform of the circle in inverse space shown in figure 2.
Figure 2. Fourier transform of the centered circle in figure 1.

Examining figure 2, we can only see a blank black image, but by shifting the quadrants back to their original position (since quadrants are inverted diagonally) we can see the true Fourier transformed signal of the image in figure 1.
Figure 3. Shifted Fourier transform of the centered circle.

In figure 3, we can see the shifted transformed signal and it is just a white centered dot. Now let's try to reconstruct the circle in figure 1 by applying again FFT in the signal in figure 2.
Figure 4. Inverse Fourier transform of the signal in figure 2.

In figure 4, we cannot observe the expected inversion of the image since it is a circle and it is rotationally symmetric.

Now let's us try to inject a bit of complication, instead of simple geometrical shape let's to apply Fourier transformation on the image of a letter "A".
Figure 5. A binary image of the letter "A".

Applying the Fourier transform and the quadrant shifting, we obtain the linear transform shown in figure 6.
Figure 6. (left) Unshifted Fourier transform and (right) shifted Fourier transform of the letter "A" in figure 5.

Now we can observe the difference between the Fourier transforms of the simple circle and the letter "A". Taking now the inverse FFT of the signal in figure 6, we can now obviously observe the inversion of the image.
Figure 7. Inverse Fourier transform of the image in figure 6.

Convolution
Convolution is a linear operation where a function, f(x,y), in a way, is stained by another function, g(x,y). The convolution is given by the equation shown below,
If f(x,y) and g(x,y) are linearly transformed (or applied with Fourier transform) to F(fx,fy) and G(fx,fy) the integral will simply just be a multiplication.
In a system, we can consider that F is the input signal and G is the transfer function of the system. So suppose that we have a digital camera (a lens and a CCD camera system), the lens will represent the linear transformation applied on the incoming signal and the size of the lens represents the transfer function.

With that in mind, we can now simulate an imaging device such as a digital camera. Let's try to simulate how an object will look like in an imaging device with different transfer function (size of lens). Our test image is shown in figure 8 below.
Figure 8. A binary image of the letters "V","I", and "P". The image dimension is 128x128.

For the lens, we will use lenses that have radius of 1, 5, 10, and 15 pixel. The convolution of the test image with the corresponding lenses is shown below.
Figure 9. Convolution of the test image with corresponding lenses. (top, left to right. bottom) lens with radius 1, 5, 10 and 15.

As expected after applying the inverse Fourier transform, all reconstructed images are inverted. As for the clarity of the reconstructed image, the larger the size of the lens the more clear and defined the reconstructed image is.This can be explained by considering the diffraction of the image with respect to the size of the lens. If the lens size is much lesser than the size of the object, most of the light coming from the object will diffract, however if the size of the lens is much larger than the object's size, almost all of the light coming from the object will pass through the lens and will not diffract.

What if our lens has a Gaussian transparency and has different sizes? What would the reconstructed image be?
Figure 10. A lens with Gaussian transparency and sizes of (left to right) 1, 5, and 10.

As observed from figure 10, we can see that as the size if the lens increases, the test image is more identifiable. However, because of the Gaussian transparency of the lens, less light passed through the lens giving the reconstructed image less information.

Correlation
Like the convolution, correlation is a linear operation that measures how does one function, f(x,y), is closely similar to another function, g(x,y). The equation for correlation is given by,
From the equation shown above, we can clearly see that the correlation equation is very much similar to the convolution equation. In fact if the either one of the function f(x,y) or g(x,y) is an even function, the correlation equation would just be a convolution equation.
Since the correlation equation measures how closely similar two signals are, it is often used in pattern recognition. Let's try using correlation in pattern recognition in this image shown in figure 11.
Figure 11. A binary image that will be used for pattern recognition.

Suppose we want to find all of the A's in the image shown above, hence, an image of a similar letter A in the image will serve as our pattern.
Figure 12. A binary image of the letter "A" that will serve as our pattern.

So now let's see the correlation of figures 11 and 12, and find out how the correlation function serves in the pattern recognition.
Figure 13. The recognition of the letter "A" pattern in the image in figure 11.

Observing figure 13, we can see in the image that the white spot indicate the position of all the letter in the image figure 11. The correlation function detect where all the "A" are located in the image. What if we try to find another pattern. Let's say we want to find all the "AI" and the "AIN" in the image in figure 11.
Figure 14. (above) The patterns used and (below) the corresponding correlation images.

As we can observe from the image above, the white spots in the correlation images indicates where the pattern is located in the image in figure 11.

Correlation can also be used to detect edges in an image. This can be done by creating a pattern that represents the edge that you want to detect. To give an example, let's say we want to detect all the horizontal edges in the image of figure 8. Our pattern will represent a horizontal line given by the matrix
The correlated image in figure 8 with the given matrix above is shown in the next figure. We can see that the horizontal edges of the image was given emphasis and the edges that are not horizontal are almost unobservable.
Figure 15. Correlated image of figure 8 with a horizontal pattern.

Now let us observe different edge patterns and its corresponding correlation images.


Figure 16. Correlation images with different edge patterns.

As we can see from figure 16, the resulting correlation image depends on what edge pattern was used in the correlation equation. It gives emphasis to edges that are similar to the pattern used.

Image manipulation is truly a very powerful technique and so it finds a lot of image processing applications. It can artificially be an imaging device, and it can do pattern recognition and edge detection. Transforming a signal in the inverse space also lessens the complexity in solving the convolution and correlation integrals. Understanding the concept behind frequency manipulation is truly worth the work. I would give myself a 10 for this activity.

Reference:
  • Soriano, M. A6 - Fourier Transform Model of Image Formation. Applied Physics 186 Handout. 2010.

No comments:

Post a Comment