Canned Tuna: Hot and Spicy: 08/01/2010

Saturday, August 28, 2010

Activity 11: Playing Notes by Image Processing

Playing notes by image processing?! Is this even possible?! At first, I thought it couldn't be done. But after pondering it for some time and brainstorming with a colleague/friend, I realized that it can be done, although it would be very tedious! Using the image processing techniques that I have learned so far, playing notes by image processing can now be realized. First things first: what image do we need so that we can play notes from it? A musical score sheet, of course. The one I used was a musical score sheet of Twinkle Twinkle Little Star.

Figure 1. Musical score sheet of Twinkle Twinkle Little Star.

Since the score sheet has two parts, the treble staff and the bass staff, I separated the two staffs and then split each staff into three lines.

Figure 2. Treble-staff separated into three lines.

Figure 3. Bass-staff separated into three lines.

Determining the pitch and note length

So now, how are we going to play notes from these images? The placement and shape of a note in a musical score sheet give information on its pitch and its timing or length. The placement of a note in the staff tells what pitch it has. It tells us the value of the note in the heptatonic scale, also known as the Do-Re-Mi-Fa-So-La-Ti scale, which is sometimes represented by the letters C-D-E-F-G-A-B respectively. Since I have two staffs in the musical score sheet, the treble and the bass, I might as well describe the difference between the two. The treble staff is defined by a treble clef, also known as the G-clef. In a treble staff, the bottom line is defined as E in the heptatonic scale, and the space above it is defined as F. Every upward/downward step from a line to a space, or vice versa, represents an upward/downward step in the heptatonic scale. The bass staff, on the other hand, is defined by a bass clef, also known as the F-clef. The same principles that apply in the treble staff apply in the bass staff. However, the bottom line in the bass staff is defined as G in the heptatonic scale.
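The line-and-space rule above can be sketched in a few lines of Python. This is only an illustration, not the Scilab code used in the activity; the names `LETTERS`, `BOTTOM_LINE` and `note_at` are made up for this example.

```python
# Note letters of the heptatonic scale, in ascending order.
LETTERS = ["C", "D", "E", "F", "G", "A", "B"]

# Letter assigned to the bottom line of each staff, as stated in the text.
BOTTOM_LINE = {"treble": "E", "bass": "G"}


def note_at(staff, steps):
    """Return the note letter `steps` line/space steps above the bottom line.

    steps = 0 is the bottom line, 1 is the space above it, 2 is the second
    line, and so on. Negative values go below the staff.
    """
    start = LETTERS.index(BOTTOM_LINE[staff])
    return LETTERS[(start + steps) % 7]


# Walk from the bottom line to the top line of each staff (9 positions).
print([note_at("treble", s) for s in range(9)])
print([note_at("bass", s) for s in range(9)])
```

The modulo wraps the letters around after B, so the same function also covers notes on ledger lines above or below the staff.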

The timing or length of a note is defined by its shape. A whole note has a different form from a half note, which in turn differs from a quarter note. Explaining this would be very lengthy, so I suggest you visit this site.

Now that we know what the important things are, pitch and timing, how do we implement them for note playing? First, let us implement the pitch part. Since the pitch is defined by the note's position in the staff, I applied morphological operations to pin down the y-locations of the notes in the staff.

Figure 4. Morphological processes applied to the three lines in the treble staff to pin down the y-location of the notes in the staff

The image above was scanned so that only the first pixel seen in each blob remains. From these y-coordinates, the pitch was then acquired using a pixel-to-real-value ratio. The frequencies of the heptatonic scale were obtained from this site.
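As a rough sketch of the pixel-to-pitch step for the treble staff, the Python below converts a y-coordinate into a staff step, then into a piano key number, and finally into a frequency using the equal-temperament relation f(n) = 440 * 2^((n - 49) / 12) from the piano key frequencies table. The parameters `y_bottom_line` and `line_spacing` are hypothetical measurements of the staff in pixels; the actual ratio used in the activity is not given in the post.

```python
# Semitone offsets of successive staff steps starting from E: E F G A B C D.
SEMITONES_FROM_E = [0, 1, 3, 5, 7, 8, 10]
E4_KEY = 44  # piano key number of E4, the treble bottom line (A4 is key 49)


def steps_from_y(y, y_bottom_line, line_spacing):
    """Staff steps above the bottom line; image y grows downward."""
    return round((y_bottom_line - y) / (line_spacing / 2))


def key_from_steps(steps):
    """Piano key number for a treble-staff position (natural notes only)."""
    octave, degree = divmod(steps, 7)
    return E4_KEY + 12 * octave + SEMITONES_FROM_E[degree]


def frequency(key):
    """Equal-temperament frequency in Hz of the given piano key."""
    return 440.0 * 2 ** ((key - 49) / 12)


# Hypothetical staff: bottom line at y = 100 px, lines 10 px apart.
for y in (110, 100, 90, 80):
    key = key_from_steps(steps_from_y(y, 100, 10))
    print(y, key, round(frequency(key), 2))
```

Half a line spacing corresponds to one step in the scale, which is why the y-distance is divided by `line_spacing / 2`.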

Now that the pitch problem is solved, only the timing part is left. To get the timing of each note, a correlation technique was used. Since only two types of notes are present in the score sheet, a half note and a quarter note, only two note images need to be correlated.

Figure 5. Images of the half note and quarter note that were used in the correlation technique.

After the line was correlated with the image of the note, the result was binarized at a threshold chosen so that at least one pixel per note would be left.
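The correlate-then-threshold idea can be sketched on a tiny binary image. The example below uses a direct sliding-window correlation in plain Python, which is an assumption made for readability; the post does not say how the correlation was computed in Scilab. The arrays `template` and `line` are made-up toy data.

```python
def correlate(image, template):
    """Slide `template` over `image` and sum the elementwise products."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    out = [[0] * (iw - tw + 1) for _ in range(ih - th + 1)]
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            out[r][c] = sum(
                image[r + i][c + j] * template[i][j]
                for i in range(th)
                for j in range(tw)
            )
    return out


def binarize(corr, fraction):
    """Keep pixels whose correlation is at least `fraction` of the peak."""
    peak = max(max(row) for row in corr)
    return [[1 if v >= fraction * peak else 0 for v in row] for row in corr]


# Toy "note" shape and a toy "staff line" containing two copies of it.
template = [[1, 1],
            [1, 0]]
line = [[0, 0, 0, 0, 0, 0, 0],
        [0, 1, 1, 0, 0, 1, 1],
        [0, 1, 0, 0, 0, 1, 0]]

peaks = binarize(correlate(line, template), 0.9)
for row in peaks:
    print(row)
```

With a threshold close to the peak value, each copy of the shape collapses to a single pixel, which is what the later classification step relies on.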

Figure 6. Binarized image of the correlation image of the 1st treble line with (top) half note and (bottom) quarter note.

The resulting images from the correlations of the line with the two notes were then added. This sum is the basis for classifying the length of each note.

Figure 7. Sum of the images in figure 6.

To classify the length of a note, we check whether the x-location of each pixel in the image in figure 7 matches an x-location found in the half note or quarter note correlation image. If it matches an x-location in one of the correlation images, then the note's length is defined by that correlation image.
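A minimal Python sketch of this lookup is shown below, assuming each correlation image has already been reduced to one pixel per note. The function names and the toy inputs are hypothetical.

```python
def x_locations(binary):
    """Set of columns that contain at least one foreground pixel."""
    return {c for row in binary for c, v in enumerate(row) if v}


def classify(half_peaks, quarter_peaks):
    """Return (x, length) pairs in left-to-right order.

    `half_peaks` and `quarter_peaks` are the binarized correlation images
    for the half note and the quarter note, with identical dimensions.
    """
    half_x = x_locations(half_peaks)
    # Sum of the two images, playing the role of figure 7.
    summed = [
        [a + b for a, b in zip(row_h, row_q)]
        for row_h, row_q in zip(half_peaks, quarter_peaks)
    ]
    notes = []
    for x in sorted(x_locations(summed)):
        notes.append((x, "half" if x in half_x else "quarter"))
    return notes


# Toy example: quarter notes at x = 1 and x = 3, a half note at x = 5.
print(classify([[0, 0, 0, 0, 0, 1]], [[0, 1, 0, 1, 0, 0]]))
```

Sorting the x-locations of the summed image also gives the order in which the notes should be played.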

Playing the note in Scilab

So now that we know the pitch and length of all the notes, how are we going to play them in Scilab? First, we must make a sinusoidal wave with the frequencies and lengths obtained from the image processing. The length of the note is defined by the function soundsec. In Scilab, the function sound is used to play sinusoidal waves. However, the sound function plays the sinusoid at a constant amplitude, which is different from how a musical instrument plays a note. In a musical instrument, each note played has an amplitude envelope. This envelope is divided into four parts: attack, decay, sustain and release.

Figure 8. Sound envelope of an instrument.

Now that I have the sound envelope, I multiplied it element by element with the sinusoidal wave of one note. After all the notes were multiplied by this sound envelope, the result was normalized. Normalization is required for the Scilab function wavwrite, because wavwrite expects the sound data to be in the range of -1 to 1.
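The envelope-and-normalize step can be sketched in Python as follows. The piecewise-linear envelope shape, the segment fractions, and the 8000 Hz sampling rate are all assumptions chosen for illustration; the post does not give the actual envelope values used in Scilab.

```python
import math


def sine_note(freq, seconds, rate=8000):
    """Constant-amplitude sinusoid sampled at `rate` Hz."""
    n = int(seconds * rate)
    return [math.sin(2 * math.pi * freq * i / rate) for i in range(n)]


def adsr(n, attack=0.1, decay=0.1, sustain_level=0.7, release=0.2):
    """Piecewise-linear envelope of length n.

    attack, decay and release are fractions of the note length; the
    sustain segment fills whatever remains.
    """
    a, d, r = int(attack * n), int(decay * n), int(release * n)
    s = n - a - d - r
    env = [i / a for i in range(a)]                             # 0 up to 1
    env += [1 - (1 - sustain_level) * i / d for i in range(d)]  # 1 down to sustain
    env += [sustain_level] * s                                  # hold
    env += [sustain_level * (1 - i / r) for i in range(r)]      # fade toward 0
    return env


def normalize(samples):
    """Scale samples so that the largest magnitude is 1."""
    peak = max(abs(v) for v in samples)
    return [v / peak for v in samples] if peak else samples


note = sine_note(440.0, 0.5)
env = adsr(len(note))
shaped = normalize([s * e for s, e in zip(note, env)])
print(len(shaped), min(shaped), max(shaped))
```

Normalizing after the envelope is applied keeps every sample inside the -1 to 1 range regardless of the sustain level chosen.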

The two parts of the song were then combined by wave superposition to generate the full audio sound of the musical score sheet.
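Superposition here simply means adding the two waveforms sample by sample. A small Python sketch, assuming both parts share the same sampling rate and that the shorter one is padded with silence (the function name `superpose` is made up for this example):

```python
def superpose(track_a, track_b):
    """Add two sample lists and rescale the sum into the -1 to 1 range."""
    n = max(len(track_a), len(track_b))
    a = track_a + [0.0] * (n - len(track_a))  # pad the shorter track
    b = track_b + [0.0] * (n - len(track_b))
    mixed = [x + y for x, y in zip(a, b)]
    peak = max((abs(v) for v in mixed), default=0.0)
    return [v / peak for v in mixed] if peak else mixed


print(superpose([0.5, 1.0], [0.5, 1.0, -0.25]))
```

The sum of two normalized tracks can reach a magnitude of 2, so the result is normalized again before it would be written to a file.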

Listening part

Now for the listening part. Since Blogger only lets you upload photo and video files, all audio files generated using Scilab were made into video files.

Warning: Bass notes in the audio/video file may not be heard using ordinary speakers. Use a speaker that is capable of playing low frequency sounds. Enjoy!

Twinkle Twinkle Little Star - Treble part

Twinkle Twinkle Little Star - Bass part

Twinkle Twinkle Little Star

Generating notes in Scilab got me very interested. The following audio/video files are the results of my interest in music and Scilab.

Livin' on a Prayer - Bass part

The music in the following audio/video file is something that I composed myself. It was created to simulate music that involves four instruments. Enjoy!

Feel Good song - by Mabi

Unfortunately, the audio/video files lose clarity when uploaded to Blogger, so you might as well download the audio files from here:

Twinkle Twinkle Little Star - Treble part

Twinkle Twinkle Little Star - Bass part

Twinkle Twinkle Little Star

Livin' on a Prayer - Bass part

Feel Good song - by Mabi

I really enjoyed this activity, so I'll definitely give myself a grade of 10 on this one.

References:

http://en.wikipedia.org/wiki/Piano_key_frequencies
http://www.lumanmagnum.net/physics/sci_wav.html
http://www.wikihow.com/Read-Music
http://www.8notes.com
Dr. Soriano. Applied Physics 186 activity manual: A11- Playing Notes by Image Processing. 2010.

Thursday, August 19, 2010

Activity 10: Binary Operations

When an image is binarized, it is split into two components, the background and the foreground. The background, in binary values, carries the zeros. It is often the part of the image that is ignored and disregarded. The foreground, on the other hand, is the one that is valued with ones, and it is often the region of interest in the image.

However, binarizing an image does not guarantee that the outcome will be noiseless, since noise may have the same gray level as the region of interest. This is where morphological operations such as opening and closing come into play. Opening is a morphological operator that involves two steps: the first is to erode the image with a specific structuring element, and the second is to dilate the result with the same structuring element used for erosion. The effect of this operation is that foreground regions that can contain the structuring element are preserved, while smaller parts of the foreground are removed. The other useful morphological operation is the Closing operator. Like the opening operator, the closing operator also involves two steps; however, it first dilates the image with a specific structuring element and then erodes the result with the same structuring element. The effect of this operation is that background regions that can contain the structuring element are preserved, while smaller background regions, such as small holes, are eliminated.

Now that we have the tools, we must define the goal. In this activity, an image of scattered punched paper, shown in figure 1, represents normal human cells. The area of these cells must be obtained and averaged using techniques that were discussed and used in the previous activities.
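To make the two-step definitions concrete, here is a small Python sketch in which a binary image is represented as a set of (row, column) foreground pixels. This representation and all function names are assumptions made for illustration; they are not the Scilab routines used in the activity.

```python
def dilate(A, B):
    """Union of A translated by every offset in the structuring element B."""
    return {(r + dr, c + dc) for (r, c) in A for (dr, dc) in B}


def erode(A, B):
    """Positions z where B translated to z fits entirely inside A."""
    candidates = {(r - dr, c - dc) for (r, c) in A for (dr, dc) in B}
    return {
        (r, c)
        for (r, c) in candidates
        if all((r + dr, c + dc) in A for (dr, dc) in B)
    }


def opening(A, B):
    return dilate(erode(A, B), B)  # erosion first, then dilation


def closing(A, B):
    return erode(dilate(A, B), B)  # dilation first, then erosion


strel = {(0, 0), (0, 1), (1, 0), (1, 1)}             # 2x2 square
square = {(r, c) for r in range(3) for c in range(3)}

noisy = square | {(5, 5)}                            # isolated noise pixel
holed = square - {(1, 1)}                            # one-pixel hole

print(opening(noisy, strel) == square)
print(closing(holed, strel) == square)
```

Opening removes the isolated pixel because the 2x2 element cannot fit inside it, while closing fills the one-pixel hole because the element cannot fit inside the hole.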

Figure 1. An image of the scattered punched papers that represents the normal human cell.

Now let us try to separate the foreground from the background by binarizing the image to a certain threshold.

Figure 2. Binarized image at threshold equals 0.8 a.u.

But from the binarized image, we can see that noise is still present, as predicted. So now, we apply one of the discussed operations, the Opening operator.

Figure 3. Opening operator applied to the image of the scattered punched paper.

Now to get the average pixel area of the cells, we apply the technique learned from Activity 4 (Area estimation of images with defined edges): we use the follow function to trace each contour and then apply Green's theorem. But before doing so, we first cut up the image into 256x256 pixel subimages and then use bwlabel to label each blob in each subimage, so that each blob's area can be measured independently.

The calculated areas were then kept in a list and plotted as a histogram to observe where the mean lies.
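For an ordered list of contour points, Green's theorem reduces to a sum over neighboring points, A = (1/2) |sum of (x_i * y_(i+1) - x_(i+1) * y_i)|. A short Python sketch, assuming the contour is given as (x, y) pairs in order around the blob (the function name `green_area` is made up for this example):

```python
def green_area(contour):
    """Area enclosed by an ordered, closed contour of (x, y) points."""
    total = 0.0
    n = len(contour)
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]  # wrap around to close the contour
        total += x0 * y1 - x1 * y0
    return abs(total) / 2


# A 4 x 3 rectangle traced corner by corner.
print(green_area([(0, 0), (4, 0), (4, 3), (0, 3)]))
```

The absolute value makes the result independent of whether the contour is traced clockwise or counterclockwise.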

Figure 4. Histogram of the cell area computed from the subimages.

But to confirm the mean cell area observed in the histogram, statistics were applied to mathematically obtain the mean and standard deviation of the cell area. Area values greater than 1000 pixels were first cropped from the histogram data. Values in this range were considered outliers, since they are overshoots produced by cells that are clumped together. The mean area and the standard deviation computed are:

Mean cell area: 428.21

Standard deviation: 174.05
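The cropping and averaging can be sketched with Python's statistics module. The area values below are made up for illustration, and the use of the population standard deviation is an assumption, since the post does not say which form was used.

```python
import statistics

# Hypothetical blob areas in pixels; the large values mimic clumped cells.
areas = [400, 450, 420, 1500, 380, 2300]

# Drop areas above 1000 pixels before computing the statistics.
kept = [a for a in areas if a <= 1000]

mean_area = statistics.mean(kept)
std_area = statistics.pstdev(kept)  # population standard deviation
print(kept, mean_area, round(std_area, 2))
```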

So now that we obtained the average area of a single cell, let us find an application for this. Consider the image in figure 5.

Figure 5. Image of cancer cells together with normal cells.

The image above was again made of punched paper, but this time larger pieces of punched paper were included. These large pieces represent cancer cells. Now that we know the average area of a normal cell, the goal is to implement a way to isolate these cancer cells from the normal cells. The image in figure 5 was also binarized at a certain threshold and then cleaned using the Opening operator.

Figure 6. (Top) Binarized image with a threshold of 0.8 and (Bottom) Resulting image after Opening operator was applied with a circular structuring element with a radius of 5 pixels.

Now to isolate the cancer cell, we will use again the opening operator but this time with a circular structuring element with radius equal to

The reason for this is that an area at the maximum deviation from the mean is still considered the area of a normal cell, and we know that the cancer cells are much larger than the normal cells. The resulting image after the Opening operator was applied is shown in the image below.
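The radius expression itself does not appear in the text above. One reading consistent with the explanation, stated here purely as an assumption, is the radius of a circle whose area equals the mean cell area plus one standard deviation. A quick Python check of what that would give with the values reported earlier:

```python
import math

mean_area = 428.21  # mean cell area reported above, in pixels
std_area = 174.05   # standard deviation reported above, in pixels


def radius_for_area(area):
    """Radius of a circle with the given area."""
    return math.sqrt(area / math.pi)


# Assumed upper bound for a normal cell: mean plus one standard deviation.
radius = radius_for_area(mean_area + std_area)
print(round(radius, 2))
```

Under this assumption the structuring element is larger than any normal cell, so opening removes the normal cells and leaves only the larger blobs.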

Figure 7. Isolated cancer cells.

In the image above, we can see that we successfully isolated the cancer cells from the normal cells by just using morphological operations and the techniques learned from the previous activity. Because of this success, I will give myself a grade of 10. I find this application very interesting since my research is about porous materials and so I can see that this technique is very useful in the field of my research.

References:

Dr. Soriano. Applied Physics 186 Activity manual, Activity 10: Binary Operations. 2010.
Morphology - Opening
Morphology - Closing
Opening Operator
Closing Operator

Saturday, August 7, 2010

Activity 9: Morphological Operations

Morphological operations are used in image processing as a method to enhance images for further processing and also for information extraction. These operations affect the shape of the image when applied: the shape may be expanded, shrunk, thinned, or deformed. Dilation and erosion are two of these operations. Dilation is a process where the image expands depending on the structuring element (or strel) used. Dilation, in set theory, is defined by,
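For reference, the standard set-theoretic definition of dilation is written out below. It is assumed, not confirmed, to match the form intended here. The symbol \hat{B} denotes the reflection of B about its origin and (\cdot)_z denotes translation by z.

```latex
A \oplus B = \{\, z \mid (\hat{B})_z \cap A \neq \varnothing \,\}
```

In words, the dilation collects every position z at which the reflected strel, moved to z, overlaps A in at least one pixel.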

To illustrate how Dilation works, let us look at figure 1. A is the image to be dilated and B is the strel that will be used.

Figure 1. Illustration of how Dilation works on image A with a strel B.

The second operation is Erosion, which is the opposite of Dilation. Erosion shrinks the image, again depending on the strel used. Erosion, again in set theory, is given by the equation,
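For reference, the standard set-theoretic definition of erosion is written out below. It is assumed, not confirmed, to match the form intended here. The symbol (B)_z denotes B translated by z.

```latex
A \ominus B = \{\, z \mid (B)_z \subseteq A \,\}
```

In words, the erosion keeps only the positions z at which the strel, moved to z, fits entirely inside A.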

The illustration for Erosion is shown in figure 2.

Figure 2. Illustration of how Erosion works on image A with a strel B.

In this activity, various strel's were used in dilating and eroding different shapes. The shapes that were used were a 5x5 square, a 3x4 right triangle, a 10x10 hollow square with a thickness of 2, and lastly a 5 units long cross. As for the strel's, a 2x2 square, a 2x1 rectangle, a 1x2 rectangle, a 3 units long cross and a diagonal were used. The images of the shapes and strel's used are shown below.

Shapes used

Strel's used

Figure 3. Images of the shapes and strel's used in performing Dilation and Erosion operations.

Hand-made predictions of the Dilation and Erosion results were made. The predictions are shown below.

Figure 4. Predictions of dilation results of a 5x5 square with the different strel's.

Figure 5. Predictions of erosion results of a 5x5 square with the different strel's.

Figure 6. Predictions of dilation results of a 3x4 right triangle with the different strel's.

Figure 7. Predictions of erosion results of a 3x4 right triangle with the different strel's.

Figure 8. Predictions of dilation results of a 10x10 hollow square with the different strel's.

Figure 9. Predictions of erosion results of a 10x10 hollow square with the different strel's.

Figure 10. Predictions of dilation results of a 5 units long cross with the different strel's.

Figure 11. Predictions of erosion results of a 5 units long cross with the different strel's.

Now let's put the shapes into Scilab, apply the dilate and erode functions, and see if our predictions match the simulations from Scilab.
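The same kind of check can be sketched outside Scilab. The Python below is only an illustration with shapes stored as sets of (row, column) pixels; it is not Scilab's dilate or erode. The strel origins are assumptions: the top-left pixel for the 2x2 square and the center pixel for the 3 units long cross. A different origin shifts the result but does not change its pixel count.

```python
def dilate(A, B):
    """All pixels a + b with a in the shape A and b in the strel B."""
    return {(r + dr, c + dc) for (r, c) in A for (dr, dc) in B}


def erode(A, B):
    """All positions z such that every pixel of B translated by z is in A."""
    candidates = {(r - dr, c - dc) for (r, c) in A for (dr, dc) in B}
    return {
        (r, c)
        for (r, c) in candidates
        if all((r + dr, c + dc) in A for (dr, dc) in B)
    }


square5 = {(r, c) for r in range(5) for c in range(5)}   # 5x5 square

square2 = {(0, 0), (0, 1), (1, 0), (1, 1)}               # origin at top-left
cross3 = {(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)}      # origin at center

for name, strel in (("2x2 square", square2), ("3-unit cross", cross3)):
    print(name, len(dilate(square5, strel)), len(erode(square5, strel)))
```

For the 5x5 square, the 2x2 strel grows the shape to 6x6 and shrinks it to 4x4, while the cross adds one row or column of 5 pixels on each side and erodes the shape to its inner 3x3 square.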

Figure 12. Dilation of a 5x5 square with different strel's using Scilab.

Figure 13. Erosion of a 5x5 square with different strel's using Scilab.

Figure 14. Dilation of a 3x4 right triangle with different strel's using Scilab.

Figure 15. Erosion of a 3x4 right triangle with different strel's using Scilab.

Figure 16. Dilation of a 10x10 hollow square with different strel's using Scilab.

Figure 17. Erosion of a 10x10 hollow square with different strel's using Scilab.

Figure 18. Dilation of a 5 units long cross with different strel's using Scilab.

Figure 19. Erosion of a 5 units long cross with different strel's using Scilab.

We can see that the predictions and the simulations from Scilab are mostly similar; there are just a few that didn't really match. Exploring other morphological functions in Scilab, let us take a look at the functions skel and thin. These functions, unlike dilate and erode, don't need strel's for morphing the images.

Figure 20. Images of the shapes after applying skel function from Scilab.

Figure 21. Images of the shapes after applying thin function from Scilab.

In this activity, morphological operations in Scilab were presented, and hand-made predictions were made to see how these operations work. Understanding how dilate and erode work is a bit tricky, but once you understand them, excitement comes in, especially when predicting the simulation results. I would give myself a grade of 10 for this activity.

Reference:

Dr. Soriano. Applied Physics 186 Activity Manual: A9 - Morphological Operations. 2010.