We humans are born with the instinct, or skill, to recognize things and tell them apart. Most of the time, our own observations and experiences are enough for us to learn which is which. Machines and computers, however, do not possess this instinct (they don't have instincts); much like how we teach babies and kids, we must exert effort to teach machines how to recognize the various things they may encounter. The difficulty is that, although their language is based on math, machines are still much harder to teach than kids (machines are stupid). In spite of this, the growing demand to increase productivity through automation, along with the physical limitations of the human body (there is only so much our bodies can do), pushes us to rely on machines and computers for various tasks. This need for automated machine recognition gave rise to a field known as machine vision. One subdomain of machine vision is pattern recognition which, as the name implies, means having a machine recognize and classify objects or patterns. In this activity we will use Scilab to do pattern recognition of objects from an image. The classification method we will use is minimum distance classification.
All pattern recognition problems start with a set of patterns or objects and a finite number of classification groups, or classes. The classes are differentiated from one another by a set of identifiable features and parameters. Hence, in all pattern recognition tasks, the first step is selecting the parameters or features on which we base our differentiation. Since we are dealing with images, it is logical to use visually obtained information as our features. These features may be color, shape, size, area, etc., and we may use as many as we see fit. We just have to remember to select features that are distinctive to our objects or patterns; otherwise we won't be able to classify them. In our method we place these features into vector form. That is, each object has a corresponding vector that contains all of its features (equation 1).
The next step is to train the computer to recognize the different patterns. In our method this simply means that for each class we select a few samples and we take the average of their feature vectors (equation 2). This mean vector serves as the representation of the whole class; that is, each class should have a unique mean feature vector.
Equation 2
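To make the training step concrete, here is a minimal sketch in plain Python (not the Scilab code used in this activity). It averages the feature vectors of the training samples of each class, as equation 2 describes. The function name `mean_feature_vectors` and the sample numbers are made up for illustration; they are not the values in Table 1.

```python
def mean_feature_vectors(samples):
    """Average the feature vectors of each class.

    samples: list of (class_label, feature_vector) pairs.
    Returns a dict mapping class_label -> mean feature vector.
    """
    sums, counts = {}, {}
    for label, vec in samples:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        # Accumulate the feature vector component by component.
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


# Hypothetical training vectors: (R, area/rectangle area, area/perimeter^2).
training = [
    (1, [0.60, 0.70, 0.050]),
    (1, [0.62, 0.72, 0.054]),
    (3, [0.30, 0.60, 0.040]),
    (3, [0.32, 0.58, 0.044]),
]
means = mean_feature_vectors(training)
```

Each entry of `means` then plays the role of mj for one class.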
The final step is to test whether our feature selection and training were sufficient. We do this by letting the computer classify the other objects (excluding those used in training). The accuracy of classification depends on how many of the objects are classified into their proper classes. The minimum distance classification, which we use in this activity, works by following equation 3.
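The equation images are not reproduced in this text. A reconstruction that is consistent with the definitions used here might look like the block below. The form of equation 3 is an assumption: it is the standard linear minimum-distance discriminant, chosen because the object is assigned to the class with the largest d. The symbols x_k (individual features), n (number of features), and N_j (number of training samples in class j) are names introduced only for this sketch.

```latex
\begin{align}
\mathbf{w}_i &= [x_1, x_2, \dots, x_n]^T \tag{1}\\
\mathbf{m}_j &= \frac{1}{N_j} \sum_{i \,\in\, \text{class } j} \mathbf{w}_i \tag{2}\\
d_j(\mathbf{w}_i) &= \mathbf{w}_i^T \mathbf{m}_j - \tfrac{1}{2}\, \mathbf{m}_j^T \mathbf{m}_j \tag{3}
\end{align}
```

Under this assumed form, the squared Euclidean distance between w_i and m_j equals w_i^T w_i - 2 d_j, so the class with the largest d_j is also the class whose mean is nearest to w_i.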
As in equations 1 and 2, wi is the feature vector of object i and mj is the mean feature vector of class j. Using this equation, we simply calculate d for each object with respect to each class. We then classify object i into the class that gives the highest value of d.
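The classification rule can be sketched in plain Python as follows. This is not the Scilab code from the activity, and it assumes that d has the standard linear form d = w·m - 0.5 m·m, which is my assumption about equation 3. The class means and test vectors below are made-up numbers, and `discriminant`, `classify`, and `accuracy` are hypothetical helper names.

```python
def discriminant(w, m):
    """Assumed form of equation 3: d = w.m - 0.5 * m.m."""
    dot_wm = sum(wi * mi for wi, mi in zip(w, m))
    dot_mm = sum(mi * mi for mi in m)
    return dot_wm - 0.5 * dot_mm


def classify(w, class_means):
    """Return the class label whose mean vector gives the highest d."""
    return max(class_means, key=lambda label: discriminant(w, class_means[label]))


def accuracy(test_set, class_means):
    """Fraction of (true_label, feature_vector) pairs classified correctly."""
    correct = sum(1 for label, w in test_set if classify(w, class_means) == label)
    return correct / len(test_set)


# Hypothetical mean vectors for grp 1 to grp 4 (not the values in Table 1).
class_means = {
    1: [0.60, 0.70, 0.050],
    2: [0.60, 0.50, 0.060],
    3: [0.30, 0.60, 0.040],
    4: [0.30, 0.75, 0.045],
}
# Hypothetical test objects with their true labels.
test_set = [
    (1, [0.61, 0.69, 0.051]),
    (4, [0.31, 0.74, 0.046]),
]
score = accuracy(test_set, class_means)
```

Here `score` is the fraction of test objects that land in their proper class, which is how the accuracy in this activity is judged.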
In this activity I chose the suits of playing cards as my object classes. Since I don't have a camera, I scanned the cards, and the scans served as my raw data. From these images I classified whether each pattern is a heart, diamond, spade, or club. Take note that I only consider the figures as my objects and not the actual playing card (in other words, the 10 of spades has 10 spade objects).
I used three features to construct the vectors describing each object. The first feature is the normalized R value (see activity 12). As expected, diamonds and hearts have essentially the same R value, and so do spades and clubs. Diamonds and hearts had a very high R value, whereas clubs and spades had a low one. This made it easy for me to segment the objects from the background (figure 1) and calculate the other features I used. However, the segmentation was still not perfect, so I used closing and opening operations (activity 8 and 9) to further improve the quality.
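As a rough illustration of the first feature, the sketch below assumes that "normalized R" means the normalized chromaticity coordinate r = R/(R+G+B); that definition is my assumption and is not spelled out above. It is plain Python rather than Scilab, and the threshold range and pixel values are hypothetical.

```python
def normalized_r(pixel):
    """Return r = R / (R + G + B) for one (R, G, B) pixel.

    Pure black pixels have no defined chromaticity, so 0.0 is returned.
    """
    r, g, b = pixel
    total = r + g + b
    return r / total if total else 0.0


def segment_by_r(image, lo, hi):
    """Binary mask: 1 where lo <= r <= hi, 0 elsewhere.

    image: list of rows, each row a list of (R, G, B) tuples.
    """
    return [[1 if lo <= normalized_r(p) <= hi else 0 for p in row] for row in image]


# One reddish pixel next to one near-white background pixel (made-up values).
image = [[(200, 30, 30), (250, 250, 250)]]
mask = segment_by_r(image, 0.5, 1.0)
```

A mask produced this way would still need the opening and closing clean-up mentioned above before the shape features are measured.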
The next 2 features I used both characterize the shape of each object. The second feature is the ratio of the total area of each object to the area of the minimum rectangle that can contain the object. The area of the minimum rectangle is simply the product of the maximum dimensions of the object ([max(x)-min(x)]*[max(y)-min(y)]). The final feature is the ratio of the object's total area to the square of its perimeter.
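The two shape features can be sketched from a binary mask as shown below, again in plain Python rather than Scilab. The area is taken as the pixel count, and the rectangle area follows the formula above. How the perimeter was measured is not stated here, so counting exposed pixel edges is an assumption of this sketch, and `shape_features` is a hypothetical name.

```python
def shape_features(mask):
    """Return (area / rectangle area, area / perimeter^2) for one object.

    mask: list of rows of 0/1 values containing a single object.
    """
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("mask contains no object pixels")
    area = len(pixels)
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    # Rectangle area as in the text: [max(x)-min(x)] * [max(y)-min(y)].
    box_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    if box_area == 0:
        raise ValueError("object must span at least two pixels in x and y")
    # Assumed perimeter: number of pixel edges that face the background.
    filled = set(pixels)
    perimeter = sum(
        1
        for x, y in pixels
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
        if (x + dx, y + dy) not in filled
    )
    return area / box_area, area / perimeter ** 2


# A filled 5x5 square inside a 7x7 mask (made-up test object).
mask = [[1 if 1 <= x <= 5 and 1 <= y <= 5 else 0 for x in range(7)] for y in range(7)]
rect_ratio, compactness = shape_features(mask)
```

Because max minus min is one pixel short of the true width, the first ratio can exceed 1 for very small objects; for objects the size of the scanned suit symbols the effect should be small.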
Table 1. Features and Group of Training set
Table 1 contains the features and group classification of the 9 samples per class that I used to construct the training set, along with the mean feature vector of each class. I labeled the groups such that grp 1 is hearts, grp 2 diamonds, grp 3 spades, and grp 4 clubs. Then, using the feature vectors of the other 40 objects in figure 1 (10 per suit/class), I calculated the corresponding d for each class. Table 2 shows the result of classifying these 40 objects, and we clearly see 100% classification accuracy. This is further highlighted in figure 2, where I plotted the feature vectors of each object. Although the plot of Area/maxDim vs R could not distinguish between heart and diamond, the other parameters compensated for this, as seen in the 3D plot (figure 2d).
Table 2. Parameters and Classification Results of the 40 Test Objects
Figure 2. a) Area/max Dimensions vs R, b) Area/Perimeter^2 vs R, c) Area/max Dimensions vs Area/Perimeter^2, d) 3D plot of R (x axis), Area/max Dimensions (y axis), Area/Perimeter^2 (z axis). The classes heart, diamond, spade, and club are represented by the red, green, blue, and black markers, respectively.
Having tested the accuracy of our method, I further tested its flexibility using larger images with mixed objects (figure 3), with a total of 140 objects. The results shown in Table 3 and Figure 4 still show 100% accurate classification.
Table 3. Classification Results of the 140 Test Objects in figure 3
Figure 4. a) Area/max Dimensions vs R, b) Area/Perimeter^2 vs R, c) Area/max Dimensions vs Area/Perimeter^2, d) 3D plot of R (x axis), Area/max Dimensions (y axis), Area/Perimeter^2 (z axis).
I give myself a grade of 10 in this activity for a job well done. I would like to thank Ms. Cindy Liza Esporlas for helping me scan the cards and sending me the images. I also would like to thank Mr. Jay Samuel Combinido for teaching me to do 3D plots.
Main Reference
Maricor Soriano, A14 - Pattern Recognition, AP186 2008