1. Introduction
As the population grows and living standards improve, consumer demands on the quality, quantity, and variety of food keep rising. Factories must therefore expand and adapt to keep up with the required production volume. Current research focuses on how to speed up the production line while maintaining the same food quality.
Size, weight, and shape are the major indices for setting standards in industrial operations such as packaging, grading, processing, and conveying. Researchers have therefore investigated the physical characteristics of fruit and vegetables to find ways of speeding up the production line [1,2,3,4,5,6,7,8,9,10]. Several works classify fruit/vegetables based on their shape and features. They apply various classifiers (Artificial Neural Network (ANN), Fourier descriptor (FD), …) to the types of fruit/vegetable listed in Table A1 of Appendix A. In addition, a comprehensive survey summarizes many different methods, tasks, and concepts as well as the key challenges and future directions for food computing [11]. Typically, the mass of a product is measured using a load cell integrated with a conveyor belt. Examples include the C35 AdvancedLine WD Checkweigher by Mettler Toledo, which measures weights from 25 g to 7000 g with a throughput of up to 250 packets per minute, the 2000 g Checkweigher from A&D Inspection, which has a throughput of up to 320 packets per minute, and digital scales for individuals or businesses [12,13]. The load cell converts gravitational force into a measurable electrical signal. A typical load cell requires filters and an analog-to-digital converter (ADC) to work. This leads to a high mechanical and structural cost. It is also time-consuming, because the system has to be recalibrated after a certain period of use that depends on the hardware quality.
Because of the limitations of the load cell, many researchers have looked for other ways to estimate weight/volume without one [14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38]. One of the most accurate approaches is to estimate the volume of the product and then use a pre-determined “density” to calculate the actual weight of the object, because mass and volume are strongly correlated. For instance, the correlation coefficient ($R^2$) between mass and volume is 0.9978 for mango [15], and 0.9045, 0.9592, 0.9641, and 0.9232 for tangerine, lime, lemon, and orange, respectively [28].
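As a minimal sketch of the volume-to-weight idea described above, the snippet below fits a single density value from paired volume and mass measurements and then uses it to convert an estimated volume into a mass. The function names, the units (mL and g), and the choice of a least-squares line through the origin are illustrative assumptions, not details taken from the cited works.

```python
def fit_density(volumes_ml, masses_g):
    """Least-squares density (g/mL) for the model mass = density * volume.

    Assumption: the fitted line passes through the origin, so the density is
    sum(v * m) / sum(v * v).
    """
    if len(volumes_ml) != len(masses_g) or not volumes_ml:
        raise ValueError("need equally sized, non-empty samples")
    numerator = sum(v * m for v, m in zip(volumes_ml, masses_g))
    denominator = sum(v * v for v in volumes_ml)
    if denominator == 0:
        raise ValueError("all volumes are zero")
    return numerator / denominator


def estimate_mass_g(volume_ml, density_g_per_ml):
    """Convert an estimated volume (mL) into a mass (g) with a fixed density."""
    if volume_ml < 0 or density_g_per_ml <= 0:
        raise ValueError("volume must be >= 0 and density must be > 0")
    return volume_ml * density_g_per_ml
```

In practice the density would be calibrated once per product type from reference samples and then reused for every item on the line.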
The traditional method for estimating the volume of an object is the water displacement method (WDM), which is presented in References [16,17]. The WDM is known to give accurate results provided that the object is fully submerged in the water. However, this is hard to guarantee in practice, because the buoyant force of the water constantly acts on the object. It is also difficult to maintain the force needed to hold the object in place in the water while taking a good reading.
Another method is 3D reconstruction, which scans the object from multiple views [22,23,24,25,26,27,39]. 3D reconstruction returns highly accurate results, but it requires many cameras. Moreover, calibrating the cameras against each other and synchronizing them so that all images are captured at the same instant is very complex. It also requires a complicated algorithm to preprocess the data and reconstruct a 3D representation of the object. 3D scanning, on the other hand, is only applicable in the laboratory, because the object has to be scanned many times from many different angles. 3D point-cloud reconstruction is therefore very time-consuming.
Some researchers have also used the geometrical features of an object to estimate its volume with Multiple Linear Regression (MLR) [14]. MLR is a supervised learning method fitted with a procedure called Ordinary Least Squares [40,41]. It predicts a dependent variable (here, the volume of an object) from two or more independent inputs (such as its widths) and returns a function of the form

$$\hat{y} = \beta_0 + \sum_{i=1}^{n} \beta_i x_i \qquad (1)$$

Here, $n$ is the total number of inputs, $\beta_i$ is the coefficient of the chosen feature $x_i$ (with $\beta_0$ the intercept), and $\hat{y}$ is the predicted volume.
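The following sketch shows how such a linear model can be fitted by Ordinary Least Squares. It assumes that each sample is a row of geometric features (for example, widths measured from an image) and that the model is an intercept plus a weighted sum of the features. The function names `fit_mlr` and `predict_mlr` are hypothetical, and the feature set used in Reference [14] is not reproduced here.

```python
import numpy as np


def fit_mlr(features, volumes):
    """Fit volume = b0 + b1*x1 + ... + bn*xn by ordinary least squares.

    features: array-like of shape (samples, n); volumes: array-like (samples,).
    Returns the coefficient vector [b0, b1, ..., bn].
    """
    X = np.asarray(features, dtype=float)
    y = np.asarray(volumes, dtype=float)
    # Prepend a column of ones so that the first coefficient is the intercept.
    design = np.column_stack([np.ones(X.shape[0]), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs


def predict_mlr(coeffs, features):
    """Predict volumes for one or more feature rows."""
    X = np.atleast_2d(np.asarray(features, dtype=float))
    return coeffs[0] + X @ coeffs[1:]
```

At least as many samples as coefficients are needed for a unique fit, and in practice far more are used to average out measurement noise.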
In addition, Artificial Neural Networks (ANNs) have been used for volume estimation [14,26,37]. One of the most popular approaches is to take a single image, measure the widths and/or lengths of the object, and feed them into the model [14,37]. There are many types of ANNs in the literature [26,37,42]. Among them, networks trained with the Levenberg-Marquardt (LM) algorithm still provide sufficiently accurate results in volume estimation [26,37]. Levenberg-Marquardt neural networks have been used to form non-linear combinations of various features to estimate fruit volumes [37,42]. The LM algorithm finds the minimum of a function that is a sum of squares of nonlinear functions. Because it is simple and robust, we decided to apply it to predict the volume of fruits/vegetables.
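The LM search direction $p$ is given by the solution of

$$(J^{T} J + \lambda I)\,p = -J^{T} f \qquad (3)$$

In Equation (3), $f$ is the vector of nonlinear functions, $J$ is defined as the Jacobian matrix of $f$, $\lambda$ is a non-negative scalar, and $I$ is the identity matrix. For some scalar $\Delta$ related to $\lambda$, the vector $p$ is the solution of:

$$\min_{p}\; \tfrac{1}{2}\lVert J p + f \rVert_2^2 \quad \text{subject to} \quad \lVert p \rVert_2 \le \Delta \qquad (4)$$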
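To make the role of the damping scalar concrete, the sketch below computes one Levenberg-Marquardt search direction in its standard textbook form, that is, the solution $p$ of $(J^{T}J + \lambda I)p = -J^{T}f$ for a residual vector $f$ and its Jacobian $J$. The function name `lm_step` is hypothetical, and the sketch covers a single step only, not a full neural-network training loop.

```python
import numpy as np


def lm_step(jacobian, residuals, lam):
    """Return the Levenberg-Marquardt direction p for a damping value lam >= 0.

    Solves (J^T J + lam * I) p = -J^T f. With lam = 0 this is the Gauss-Newton
    step; larger lam shortens the step and turns it towards steepest descent.
    """
    J = np.asarray(jacobian, dtype=float)
    f = np.asarray(residuals, dtype=float)
    if lam < 0:
        raise ValueError("lam must be non-negative")
    normal_matrix = J.T @ J + lam * np.eye(J.shape[1])
    return np.linalg.solve(normal_matrix, -J.T @ f)
```

A full optimizer would repeat this step, decreasing the damping value after a successful step and increasing it after a step that fails to reduce the sum of squares.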
Although the accuracy of ANNs is not high and each model works for only one type of product, these are very promising approaches. If appropriate features can be found, the accuracy will increase and the approach will become more suitable for real-time implementation.
On the other hand, some proposed methods use only one two-dimensional image of an object to estimate its volume with the disk method [29,30,31,32,33]. They slice the fruit into many disks with the properties shown in Figure 1 [15,28,29,30,31,32,33].
Figure 1 shows that, once we have the diameter ($d_i$) and height ($h$) of each slice, we can calculate its volume (see Equation (6)) from its area (see Equation (5)). Lastly, the volume of the object is calculated by summing the volumes of all $n$ disks (see Equation (7)). In detail, the height ($h$) is set to 1 pixel to obtain the most precise volume estimate.

$$A_i = \frac{\pi d_i^{2}}{4} \qquad (5)$$

$$V_i = A_i\,h \qquad (6)$$

$$V = \sum_{i=1}^{n} V_i \qquad (7)$$
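The disk method described above can be sketched in a few lines. The snippet assumes that the diameters of the slices have already been measured along the object's axis in the image, that every slice is a circular disk, and that all slices share the same height (1 pixel by default). The function name and the optional `units_per_pixel` scale factor are illustrative assumptions.

```python
import math


def disk_method_volume(diameters_px, height_px=1.0, units_per_pixel=1.0):
    """Sum the volumes of circular disks stacked along the object's axis.

    diameters_px: diameter of each slice, in pixels.
    height_px: thickness of every slice, in pixels.
    units_per_pixel: hypothetical calibration factor (e.g., cm per pixel);
        the result is in cubic units of that factor.
    """
    volume_px3 = sum(math.pi * (d / 2.0) ** 2 * height_px for d in diameters_px)
    return volume_px3 * units_per_pixel ** 3
```

For a cylinder that is 10 pixels long and 2 pixels wide, this returns 10π cubic pixels, which matches the closed-form cylinder volume.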
Ibrahim et al. [15] also modified the disk method to fit irregularly shaped objects. Given two views (top and side) of a mango, the area of each disk is defined in Equation (8), which uses the widths of the $i$th disk measured from the top view and the side view, respectively. Figure 2 illustrates one slice with the parameters used in Equation (8).
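One plausible reading of the two-view variant is that each slice is treated as an ellipse whose two axes are the widths seen from the top and from the side. The sketch below implements that reading. The elliptical area formula $\pi a b / 4$ is our assumption and is not confirmed to be the exact form of Equation (8) in Reference [15]; the function name is hypothetical.

```python
import math


def two_view_disk_volume(top_widths, side_widths, height=1.0):
    """Volume from elliptical slices built from top-view and side-view widths.

    Assumption: slice i is an ellipse with full axes top_widths[i] and
    side_widths[i], so its area is pi * a * b / 4.
    """
    if len(top_widths) != len(side_widths):
        raise ValueError("both views must provide the same number of slices")
    return sum(
        math.pi * a * b / 4.0 * height
        for a, b in zip(top_widths, side_widths)
    )
```

When both views report the same width for every slice, the result reduces to the circular disk method.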
The disk method, however, is best suited to objects with a round or elliptical shape. For a fruit/vegetable shaped like a carrot, it does not give the most accurate results. Moreover, Venkatesh et al. found that the tilt angle of the fruit or vegetable must be below 10 degrees for a reasonable volume estimate [19]. An alignment step must therefore be integrated into the production line, which adds the cost of an alignment module or extra workforce.
Some fruits/vegetables also have an elliptical-cone or spherical-cap shape at the top or bottom, such as the cucumber, melon, pumpkin, carrot, grape, tomato, kiwi fruit, and passionfruit. Among these, the carrot, cucumber, daikon, etc. are considered slender axi-symmetric fruits or vegetables because their height is much greater than their width. Because of this property, the approach of Venkatesh et al. cannot treat these fruits as ellipsoids or spheres [19]. In addition, that approach requires multiple manual captures from different angles. Thus, it is not practical to implement in a real machine vision system.
Our approach focuses on improving the accuracy and computational time of the disk method [29,30,31,32,33], so that the model is accurate and fast enough for real-time measurement [36]. Using only one top-view image of the fruit or vegetable, we slice the object into a number of equal slices. The top and bottom parts are then each approximated as an elliptical cone, and each slice of the middle part is treated as a chopped pyramid. Combining the results of these steps gives the estimated volume of the fruit/vegetable. In addition, since there is a high correlation between the weight of a fruit/vegetable and its volume [15,19,28], we can infer the mass from the volume: once we have the volumes of the top, middle, and bottom parts, we obtain the total volume and, together with the density, calculate the object’s weight. In this paper, we focus on the carrot and cucumber, which rank among the most common vegetables in the world [43].
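The idea of combining end caps with tapered middle slices can be illustrated as follows. This is only a sketch under simplifying assumptions and not the formulation given in the Materials and Methods section: cross-sections are taken to be circular (the axi-symmetric case), each end cap is a right circular cone, and each middle slice is a circular frustum standing in for the "chopped pyramid". All function and parameter names are hypothetical.

```python
import math


def cone_volume(base_width, height):
    """Volume of a right circular cone with the given base width (diameter)."""
    radius = base_width / 2.0
    return math.pi * radius * radius * height / 3.0


def frustum_volume(width_a, width_b, height):
    """Volume of a circular frustum between two parallel cross-sections."""
    ra, rb = width_a / 2.0, width_b / 2.0
    return math.pi * height * (ra * ra + ra * rb + rb * rb) / 3.0


def slender_volume(boundary_widths, slice_height, top_cap_height, bottom_cap_height):
    """Cone caps at both ends plus frustum slices for the middle part.

    boundary_widths: widths measured at the boundaries of the equal middle
        slices, ordered from top to bottom (at least two values).
    """
    if len(boundary_widths) < 2:
        raise ValueError("need at least two boundary widths")
    middle = sum(
        frustum_volume(a, b, slice_height)
        for a, b in zip(boundary_widths, boundary_widths[1:])
    )
    top = cone_volume(boundary_widths[0], top_cap_height)
    bottom = cone_volume(boundary_widths[-1], bottom_cap_height)
    return top + middle + bottom
```

Because a frustum follows the linear taper between two measured widths, far fewer slices are needed than with 1-pixel disks, which is where a saving in computation time would be expected to come from.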
We divided the content of this paper into three main sections. The Introduction briefly reviews the current concerns regarding volume measurement of fruits/vegetables on assembly lines as well as in laboratory settings, and discusses the methods from the literature that we apply and compare. The Materials and Methods section presents our proposed approach and how we collected the data. The Results section reports the outcomes of our method and of the other methods and compares them.
4. Discussion
From Table 7 and Table 8, we can see that the other methods have slightly lower accuracy and a higher deviation than our method, except that the ANN method has a lower deviation when estimating the volume of the carrot.
In detail, the volume estimation errors of the proposed method were 3.4% and 3.2% for the carrot and cucumber, respectively. On the same datasets, the disk method’s error rates were 6.7% and 6.6%, those of the ANN method were 3.9% and 5.5%, and those of MLR were 8.3% and 5.4%, respectively. For the carrot, the standard deviation of our method ranked second at 7.4 mL, behind the ANN method at 6.2 mL and ahead of MLR and the disk method at 12.5 mL and 14.8 mL, respectively. For the cucumber dataset, however, our proposed method had the lowest standard deviation at 6.7 mL, followed by MLR at 8.4 mL and the ANN method at 9.1 mL, while the disk method had the highest at 15.1 mL.
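For readers who wish to compute comparable figures on their own data, the sketch below derives a mean percentage error and a standard deviation from paired reference and estimated volumes. The exact definitions behind the tables are not restated in this section, so two assumptions are made: the error is the mean absolute percentage error, and the deviation is the sample standard deviation of the signed errors in mL. The function names are hypothetical.

```python
import statistics


def mean_percentage_error(true_ml, estimated_ml):
    """Mean absolute percentage error between reference and estimated volumes."""
    if len(true_ml) != len(estimated_ml) or not true_ml:
        raise ValueError("need equally sized, non-empty samples")
    return 100.0 * statistics.fmean(
        abs(e - t) / t for t, e in zip(true_ml, estimated_ml)
    )


def error_std_ml(true_ml, estimated_ml):
    """Sample standard deviation of the signed errors (estimated - true), in mL."""
    if len(true_ml) != len(estimated_ml) or len(true_ml) < 2:
        raise ValueError("need at least two paired samples")
    return statistics.stdev(e - t for t, e in zip(true_ml, estimated_ml))
```

Reference volumes would typically come from an independent measurement such as water displacement.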
In addition, most of our pipeline is fully automatic through the use of computer vision, whereas Örnek et al. had to measure each carrot manually [37]. Moreover, the accuracy of the disk method with two side images decreases because of some loss at the contact surface between the object and the base. The execution time of the disk method is also higher when each slice is 1 pixel thick.
Given the high correlation between mass and volume, we can estimate the weight via the density. All the results show that the image processing technique can be applied to estimate the size and mass of slender axi-symmetric fruits/vegetables, achieves high accuracy, and is fast enough to be embedded into production lines.
The errors of our system arise in several phases of the process, such as the preprocessing step, the camera, and computational rounding. The error is higher for fruits/vegetables whose shape is not axi-symmetric, such as a banana [36]. Nevertheless, the proposed method is able to handle a banana, which is a curved agricultural product, as shown in Figure 27. In addition, our system can process fruits/vegetables at different angles, which was a concern of Venkatesh et al. [19].
Figure 27 illustrates the slicing of a banana using our method. We found that the result is still reasonable, with a mean error of 5.7% and a deviation of the estimated volume of only 4.5875 mL. For bananas with smaller bending radii, our method provided better results, with an error of only 1.2%. This demonstrates that our method is still applicable to fruits/vegetables having small bending radii. Further improvement remains to be implemented in future work.