Visual classification is a simple and self-explanatory process, at least for humans. Machines, on the other hand, have a much tougher time classifying objects in images. Hence, classification has become one of the most engaging challenges for computer vision engineers. Whether your task is to distinguish between good and faulty products or to categorize facial expressions, similar methods can be used.
This article showcases a basic yet challenging and relevant (and delicious) classification problem. The goal is to classify different sorts of fruit and vegetables. There are 10 distinct classes, as shown in the image below.
The classic approach
The first step is to get familiar with the dataset in order to better understand the challenges of the task. It is also a good idea to write down the differences you notice between the classes. The two main features, not just in this case, are the color and the shape of the object. Both have weaknesses: the apparent shape varies with perspective, and the measured color is influenced by lighting conditions. To learn more about robust color detection, check out the following article.
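As a minimal sketch of a color feature that is less sensitive to brightness, the snippet below builds a histogram over hue instead of raw RGB values. The function name `hue_histogram`, the number of bins, and the saturation cutoff are illustrative assumptions, not part of any tool mentioned in this article.

```python
import colorsys


def hue_histogram(pixels, bins=12, min_saturation=0.2):
    """Return a normalized hue histogram for (r, g, b) pixels in 0..255.

    Hue is largely independent of brightness, so a dimly lit lemon and a
    brightly lit lemon land in the same bin.
    """
    counts = [0] * bins
    for r, g, b in pixels:
        h, s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if s < min_saturation:
            continue  # grey or white pixels carry no reliable hue
        counts[min(int(h * bins), bins - 1)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * bins
    return [c / total for c in counts]


# Made-up pixel values for illustration only.
bright_lemon = [(230, 220, 40)] * 4
dim_lemon = [(115, 110, 20)] * 4
tomato = [(200, 30, 30)] * 4
```

Comparing `hue_histogram(bright_lemon)` with `hue_histogram(dim_lemon)` shows that halving the brightness leaves the histogram unchanged, while `tomato` falls into a different bin. Real lighting changes also shift the color balance, so this only covers part of the problem.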
Another useful feature is texture. Imagine a task where you have to differentiate between a peach and a nectarine. Since they have the same color and the same shape, the distinguishing difference is that a peach has a rough surface while a nectarine has a smooth one.
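One very simple way to put a number on texture is to measure how much neighboring pixels differ. The sketch below is an assumption made for illustration (the name `roughness` and the tiny grayscale grids are made up) and is much cruder than the texture descriptors used in practice.

```python
def roughness(gray):
    """Mean absolute difference between adjacent pixels of a 2D grayscale grid.

    A smooth surface gives values near zero; a rough or fuzzy surface
    gives larger values.
    """
    rows, cols = len(gray), len(gray[0])
    total = 0
    count = 0
    for y in range(rows):
        for x in range(cols):
            if x + 1 < cols:  # horizontal neighbor
                total += abs(gray[y][x] - gray[y][x + 1])
                count += 1
            if y + 1 < rows:  # vertical neighbor
                total += abs(gray[y][x] - gray[y + 1][x])
                count += 1
    return total / count if count else 0.0


# Made-up 3x3 patches: one nearly uniform, one with strong local variation.
smooth_patch = [[100, 101, 100], [101, 100, 101], [100, 101, 100]]
rough_patch = [[100, 140, 90], [150, 95, 135], [85, 145, 100]]
```

Here `roughness(rough_patch)` is far larger than `roughness(smooth_patch)`, which is the kind of gap a classifier could use to separate a peach from a nectarine.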
The features mentioned above are used in the classic approach to classification. This is common practice nowadays, so programs like Halcon have built-in identifier functions to speed up the process. Halcon's identifier is based on color and texture features. It managed to correctly classify 69/70 images, a 1.4% error rate.
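To make the idea of classifying by hand-picked features concrete, here is a minimal nearest-centroid sketch. It is not how Halcon's identifier works internally, which this article does not describe. The function names and the two-value feature vectors (a hue value and a roughness value) are assumptions chosen for illustration.

```python
import math


def train_centroids(samples):
    """Average the feature vectors of each class.

    samples: list of (feature_vector, label) pairs.
    Returns a dict mapping each label to its mean feature vector.
    """
    sums = {}
    counts = {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Made-up training data: (hue, roughness) per image.
training = [
    ([0.16, 0.30], "lemon"),
    ([0.15, 0.35], "lemon"),
    ([0.02, 0.10], "tomato"),
    ([0.01, 0.12], "tomato"),
]
centroids = train_centroids(training)
```

With these numbers, `classify(centroids, [0.14, 0.28])` returns `"lemon"`. In practice the features should be scaled to comparable ranges, otherwise the feature with the largest values dominates the distance.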
What happens if the conditions are not perfect? One potential use of this concept is at the fruit and vegetable weighing stations in grocery stores. There, the produce is usually in a plastic bag, which can significantly interfere with texture detection. To simulate the bag, a plastic cover was placed over the fruit and vegetables. In this case, the identifier correctly classifies only 60/70 images, a 14.3% error rate. This is not good enough, since most classification tasks require an error rate lower than 5%.
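The error rate here is simply the share of misclassified images. A small sketch of that calculation follows, with the 5% requirement expressed as a default threshold; the function names are made up for this example.

```python
def error_rate(correct, total):
    """Percentage of images that were classified incorrectly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * (total - correct) / total


def meets_target(correct, total, max_error_percent=5.0):
    """True if the error rate is below the required limit."""
    return error_rate(correct, total) < max_error_percent


# Counts reported in this article, without and with the plastic cover.
without_cover = (69, 70)
with_cover = (60, 70)
```

Calling `meets_target(*without_cover)` and `meets_target(*with_cover)` checks each result against the 5% limit.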
Another popular approach to classification is deep learning. It is a machine learning technique based on artificial neural networks, which are designed to mimic the human brain. They can learn from data without being told which features to look at. For example, when humans see a lemon, they do not really think about its color and shape. They simply know it is a lemon because they have seen it many times before.
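The toy example below illustrates what learning without hand-written features means. It is a deliberately tiny network with a single hidden layer, written in plain Python and trained on made-up 4-value inputs, so it is far smaller than a real deep learning model and is not the model used in this article. All names, sizes, and data are assumptions.

```python
import math
import random


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


class TinyNet:
    """One hidden tanh layer followed by a softmax output."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        p = softmax([sum(w * v for w, v in zip(row, h)) + b
                     for row, b in zip(self.w2, self.b2)])
        return h, p

    def train_step(self, x, label, lr):
        """One gradient descent step on the cross-entropy loss of one sample."""
        h, p = self.forward(x)
        # Gradient of the loss with respect to the output logits.
        dz = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
        # Backpropagate to the hidden layer before changing w2.
        dh = [sum(dz[k] * self.w2[k][j] for k in range(len(dz))) * (1 - h[j] ** 2)
              for j in range(len(h))]
        for k, g in enumerate(dz):
            for j in range(len(h)):
                self.w2[k][j] -= lr * g * h[j]
            self.b2[k] -= lr * g
        for j, g in enumerate(dh):
            for i in range(len(x)):
                self.w1[j][i] -= lr * g * x[i]
            self.b1[j] -= lr * g
        return -math.log(p[label])


def train(net, data, epochs=300, lr=0.3):
    """Return the mean loss per epoch."""
    history = []
    for _ in range(epochs):
        losses = [net.train_step(x, label, lr) for x, label in data]
        history.append(sum(losses) / len(losses))
    return history


def predict(net, x):
    _, p = net.forward(x)
    return p.index(max(p))


# Made-up raw inputs: class 0 is bright on the left, class 1 on the right.
TOY_DATA = [
    ([1.0, 1.0, 0.0, 0.0], 0),
    ([0.9, 0.8, 0.1, 0.0], 0),
    ([0.0, 0.0, 1.0, 1.0], 1),
    ([0.1, 0.0, 0.9, 0.8], 1),
]
```

Creating `TinyNet(4, 3, 2)` and passing it to `train` together with `TOY_DATA` lets the network work out on its own which inputs matter. Nothing in the code says "look at the left half"; that rule emerges from the labeled examples.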
This makes deep learning a really powerful tool, although training the model requires a certain amount of computational power. On the other hand, it is not that difficult to implement, since there are many deep learning libraries that do most of the work for you. As with the previous method, it only requires a correctly labeled dataset, with optional preprocessing.
The deep learning method was applied to the second dataset, with the plastic cover on the fruit and vegetables, and it correctly classified 70/70 images, a 0% error rate.
In conclusion, comparing apples and oranges is not as easy as it sounds, unless you are using deep learning.