1. Introduction
Rice is a staple food for more than half of the world’s population [1]. As a primary crop, it contributes a dominant proportion of the balanced human diet, predominantly in Asia, where most of the world’s rice is grown and consumed [2]. Pakistan is the 3rd largest exporter (80%) and the 12th largest producer of rice worldwide. Rice is the country’s second largest cash crop, and about 10% of its agricultural land is under rice cultivation [3]. Therefore, high-quality rice production is a significant source of food, income, and economic growth [2]. Considering the continuing advancement of knowledge and technology in quality control, coupled with consumers’ food quality expectations, the need for precision and transparency in quality monitoring has become more important. Therefore, evaluating rice based on quality attributes is a pivotal step in achieving and maintaining high quality, both for consumption and for maximizing economic return [4].
There are typically seven physical parameters associated with rice quality: damaged, broken, paddy, colored (yellow and white), chalky, stones and foreign objects, and dimensions/weight of the rice kernel [5]. The quality of rice kernels is often compromised by the quality of the seed and by the working parts of the post-harvest agricultural processing machinery.
Generally, in parts of the world where rice is cultivated, and more specifically in Pakistan, the identification of grain type, grading, and gauging of quality attributes against national and international standards is done manually by human inspectors using Vernier calipers and weighing scales [6]. The labor-intensive checking process is multifaceted. It depends on human factors, for example, the number of people in the crew and the efficiency with which each performs a particular task. Although it ensures accuracy to some degree, it requires significant trained manpower, resulting in high labor costs, and judgments are still subjective. In practice, the quality of the kernels is manually checked every 1–2 h as a new batch is introduced into the warehouse [7].
In agriculture, replacing traditional visual and manual equipment-based (Vernier caliper and weighing scale) quality inspection with computer vision systems at the commercial level could provide fast, efficient, and accurate evaluation [4]. Computer vision-based determination of the qualitative characteristics of rice kernels will allow industrialists to adopt a non-destructive, automated, and continuous food inspection system [1]. This will allow processing operations to be monitored continuously, thereby enabling an operator to react quickly to changes in kernel material properties, thus reducing the overall manual inspection cost [8] while improving accuracy.
In this study, techniques using image processing and computer vision, combined with machine learning, were used to assess grain quality using a non-destructive and inexpensive approach. This study presents Pakistan’s first commercial automated rice quality assessment system. The uniqueness of our system compared with other computer vision-based rice quality assessment systems is that it detects all seven major physical features of the rice kernel that play a significant role in assessing and controlling the quality of the dry kernel. In addition, the system was deployed in different rice factories for a period of approximately 3 months to evaluate its efficacy. The testing indicated a successful implementation in the operation of rice factories, associated with the quality of the rice kernels. It was found that 99% accuracy was achieved for the size, weight, color, and chalkiness. For the other parameters, a precision of 98.8% was achieved for the classification of damaged and undamaged kernels, 98% accuracy for spotting broken kernels, and 100% for paddy rice kernels. Based on these findings, it was deduced that the system is efficient in analyzing the quality of kernels in IRRI-6, PK386, 1121 white rice, selah rice, super kernel basmati brown rice, and white rice based on their specific characteristics. The rice types selected for this study were based on their importance, as they are the most significant varieties produced, consumed, and exported internationally by Pakistan.
A detailed review of previous work on computer vision and machine learning technology for food grain classification is presented to gauge the novelty of our system against systems developed in the past.
1.1. Deep Neural Networks
Deep neural networks are popular and effective for image classification problems in machine learning. They do not rely on a separate feature extraction algorithm because they take image pixels as input, and they can produce high accuracies [8]. Ref. [4] compared the performance of four different algorithm frameworks for classifying processed rice into the corresponding classes of low-processed sound kernels (LPS), low-processed broken kernels (LPB), high-processed sound kernels (HPS), and high-processed broken kernels (HPB). The artificial neural network (ANN) algorithm, which has a 12 − 5 × 4 topology, was very effective with an accuracy of 98.72%. Ref. [2] used convolutional neural networks (CNN) to classify two types of rice, whole rice and broken rice, based on their different sizes. The data consisted of two thousand camera images of broken and whole rice of the Loc troi strain. The CNN model achieved a precision of 99.16% and 89.75% on the training and testing data, respectively. The study was conducted on only one rice type, which limits its usefulness for assessing the quality attributes of the other rice types that are mostly produced and consumed in the region.
1.2. Support-Vector Machines (SVM)
Support-vector machines (SVM) are supervised learning models that classify data by mapping the input vectors into a high-dimensional space and constructing a hyperplane that separates the classes. SVM is an effective technique for solving classification and pattern recognition problems [9]. Ref. [10] employed a multi-class SVM to classify and grade various varieties of rice into their respective classes with an accuracy of 86%. The SVM comprising the universal Pearson VII kernel function successfully classified processed rice into corresponding classes with an improved precision of 98.48%. Previously, an attempt was made [11] to develop an SVM algorithm based on a linear kernel to effectively separate overlapping rice kernels. Contour detection and watershed algorithms were used to evaluate the contours and perform segmentation, respectively. The calculated classification accuracy was 88.0%, whereas that of the segmentation was 96.0%.
1.3. Fuzzy Inference System
Machine learning techniques based on fuzzy logic have been shown to successfully model human judgment and analysis [12]. An intelligent method based on fuzzy logic was developed to qualitatively grade milled rice by utilizing the AND operator to generate 25 rules in the rule base of the fuzzy inference system. Compared with skilled workers, the overall confidence was 89.80%, and high sensitivity and specificity were observed for grading rice into its respective classes. Ref. [13] provides more recent evidence of the capabilities of fuzzy logic-based systems, implementing another automatic rice classification system based on an adaptive neuro-fuzzy inference system (ANFIS). They claimed that the system outperformed the k-nearest neighbor (k-NN) and SVM methods, with an accuracy of more than 98.5% in classifying broken and whole-grain rice.
1.4. k-Nearest Neighbor Algorithm (k-NN)
Owing to its simplicity, effectiveness, and nonparametric nature, the k-nearest neighbor algorithm (k-NN) is also popular for classification problems. It groups similar points together and determines classification boundaries based on the proximity of neighboring points [14]. Experiments for grading Paw San rice into three classes, A, B, and C, were conducted in [15]. Prior to segmentation, image pre-processing was performed on images captured against a dark background, and selected features were fed to k-NN classifiers. The resulting accuracies were as high as 100%, 93%, and 83% for classes A, B, and C, respectively.
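As a rough illustration of the proximity-based voting described above, the following minimal Python sketch classifies a kernel by majority vote among its k nearest labelled neighbours. The two features (length and width in mm), the sample values, the grade labels, and the helper name knn_predict are all assumptions made for illustration; they are not taken from [15].

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train: list of (feature_tuple, label) pairs; query: feature_tuple.
    """
    # Sort training points by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Vote among the labels of the k closest points.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (length_mm, width_mm) samples for two illustrative grades.
train = [
    ((7.2, 1.9), "A"), ((7.0, 2.0), "A"), ((7.4, 1.8), "A"),
    ((5.1, 2.1), "B"), ((4.9, 2.2), "B"), ((5.3, 2.0), "B"),
]

grade_long = knn_predict(train, (7.1, 1.9))
grade_short = knn_predict(train, (5.0, 2.1))
```

In a real pipeline the features would usually be scaled first, since Euclidean distance is sensitive to the units of each feature.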
1.5. Edge Detection, Segmentation, and Thresholding Algorithms
Recently, Ref. [16] proposed an evaluation system to inspect the quality of Type C4 Raja rice using the Canny edge detection algorithm on digitally processed images. The length-to-width ratio was calculated to determine the rice category. Ref. [17] are among the few researchers who have addressed the issue of rice grain evaluation by using an ordinary webcam for image acquisition. The bounding box and band detection techniques were executed to identify objects by creating an enclosing boundary and computing their area for discerning specific regions. One of the criteria for analysis was the overlapping and non-overlapping positioning of grains, where the enumeration error for overlapping grains was quite high at 53.82%, whereas that for non-overlapping grains was significantly lower at 0.47%.
The objective of our study is to qualitatively analyze rice kernels based on computer vision and machine learning techniques. The study successfully classified six different rice types: IRRI-6, PK386, 1121 white rice, selah rice, super kernel basmati brown rice, and white rice. Each rice type was classified by detecting all the physical features (damaged, broken, paddy, colored (yellow and white), chalky, stones, foreign objects, and weight) that are responsible for grading the rice product on the basis of quality. The images were acquired from a flatbed scanner to extract the morphological characteristics of the rice, such as size (length, width), color, and weight, and to assess chalkiness, yellowness, damage, and whether the kernels were paddy or not. Random forest, logistic regression (LR), and the Visual Geometry Group network (VGG-16) were the primary models used in this study. LR is a probability-based predictive model that is commonly used for classification applications. It uses a sigmoid function to map the predicted values to probabilities [18]. We believe that there is significant potential to develop methods based on LR and VGG-16 models. As reported in [19], rice varieties in Thailand have been classified using both LR and VGG-16.
They claimed to have chosen VGG-16 because it showed good results in a short span of time. However, to obtain better results, they had to significantly reduce the image size, compromising the efficiency of the model, whereas the LR model also yielded only satisfactory results. Ref. [8], in their work on rice classification, used the “inrange” function, which defines upper and lower boundaries, to classify rice type on the basis of color, but the accuracy achieved was lower than that reported in our study using a similar technique. We further developed a simple, user-friendly graphical user interface (GUI), which is missing in many studies [9,19,20] and limits their practical use.
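The sigmoid mapping mentioned above for the LR model can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the two-feature input, the weights, and the 0.5 decision threshold are hypothetical and are not the parameters of the model used in this study.

```python
import math

def sigmoid(z):
    """Map a real-valued score z to a probability in (0, 1)."""
    # Branch on the sign of z so that math.exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(features, weights, bias):
    """Probability of the positive class for one sample (assumed linear score)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

# Hypothetical weights; a real model would learn these from labelled data.
p = predict_proba([0.5, -1.0], weights=[2.0, 1.0], bias=0.0)
label = 1 if p >= 0.5 else 0  # assumed decision threshold of 0.5
```

The linear score is unbounded, and the sigmoid squashes it into a value that can be read as a class probability and thresholded.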
The application can be customized, allowing the end-user to input data and custom parameters, as every manufacturer has different standards for broken and paddy rice, chalkiness, and yellowness.
A detailed review of previous works is presented in Section I. The proposed system design methodology is described in Section II. The vetting and results are discussed in Section III. Section IV presents the conclusions of this study.
6. Results and Discussions
The results of the system were validated by cross-checking against the conventional method of measuring the characteristics associated with the quality of the rice product. A digital Vernier caliper was used to measure the length and width of various grains to determine the average difference between the values calculated by the system and those measured manually. For length, the difference is 3–4 percent, that is, up to 0.324 mm, while for width, it is approximately 10 percent, which is 0.208 mm. Yellowness is determined using the inRange function, where pixel thresholds are set to separate yellow from non-yellow rice grains, yielding 99% accuracy. Similarly, when the same process was applied to detect chalky rice after readjusting the HSV color space bounds, an accuracy of 99% was obtained. The VGG-16 model was used to evaluate damaged and paddy rice kernels, and it identified these features with accuracies of 98.8% and 100%, respectively. Figure 11 and Figure 12 show the system-generated confusion matrices of the damaged and paddy rice kernels, respectively.
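The threshold-based color check described above can be illustrated with a small, dependency-free Python sketch of the per-pixel test that an inRange-style function performs. The HSV bounds, the sample pixels, the 0.5 decision ratio, and the helper names are assumptions for illustration only, not the values used by the system. Note that OpenCV's cv2.inRange returns 255 rather than 1 for in-range pixels.

```python
def in_range(pixels, lower, upper):
    """Return a 0/1 mask: 1 where every channel lies within [lower, upper]."""
    return [
        1 if all(lo <= c <= hi for c, lo, hi in zip(px, lower, upper)) else 0
        for px in pixels
    ]

def yellow_fraction(pixels, lower, upper):
    """Fraction of a kernel's pixels that fall inside the colour bounds."""
    mask = in_range(pixels, lower, upper)
    return sum(mask) / len(mask) if mask else 0.0

# Hypothetical HSV bounds for "yellow" (OpenCV-style ranges: H 0-179, S and V 0-255).
LOWER_YELLOW = (20, 80, 80)
UPPER_YELLOW = (35, 255, 255)

# Hypothetical kernel: four HSV pixels, three of which are yellowish.
kernel = [(25, 120, 200), (30, 90, 180), (22, 200, 220), (0, 10, 240)]
frac = yellow_fraction(kernel, LOWER_YELLOW, UPPER_YELLOW)
is_yellow = frac >= 0.5  # assumed decision ratio
```

Detecting chalkiness with the same approach would only require a different pair of lower and upper bounds.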
To predict the weight of the kernel, a random forest regression model was employed; the dataset was split into two features, the number of trees (estimators) was 100, and the performance of the model was judged by the RMSE and MAPE on the testing dataset, which were 1.21 × 10⁻⁸ and 6.7 × 10⁻⁵, respectively. Based on the values of RMSE and MAPE, the system showed 99% accuracy. Figure 13 shows a graphical representation of the predicted and true weight values of the rice kernel.
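For reference, the two error metrics named above can be computed as in the following minimal Python sketch. The weight values are hypothetical and do not come from the study's dataset, and MAPE is returned here as a fraction (multiply by 100 for percent), since the text does not state which convention was used.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction; true values must be nonzero."""
    n = len(y_true)
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n

# Hypothetical true and predicted kernel weights in grams.
true_w = [0.020, 0.022, 0.018, 0.025]
pred_w = [0.021, 0.022, 0.017, 0.024]

weight_rmse = rmse(true_w, pred_w)
weight_mape = mape(true_w, pred_w)
```

RMSE is expressed in the units of the target (here grams), whereas MAPE is scale-free, which is why the two are often reported together.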
Broken rice was identified in the dataset by setting threshold values for the major axis length of the rice kernel, and the results generated by the system were compared with the broken-kernel values obtained from the rice industries, showing 98% accuracy.
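The thresholding step described above can be sketched as follows. This is a minimal illustration: the kernel lengths, the 5.25 mm cut-off, and the function name are assumptions, and the percentage is computed by kernel count, whereas a manufacturer's standard may define the cut-off and the percentage differently (for example, by weight).

```python
def broken_percentage(major_axis_mm, threshold_mm):
    """Percentage of kernels whose major-axis length is below the threshold."""
    if not major_axis_mm:
        return 0.0
    broken = sum(1 for length in major_axis_mm if length < threshold_mm)
    return 100.0 * broken / len(major_axis_mm)

# Hypothetical major-axis lengths (mm) for eight kernels and an assumed cut-off.
lengths = [7.1, 6.9, 4.2, 7.3, 3.8, 7.0, 6.8, 5.0]
pct = broken_percentage(lengths, threshold_mm=5.25)
```

Keeping the threshold as a parameter matches the customizable standards mentioned for the application, since each manufacturer can supply its own cut-off.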
Graphical User Interface (GUI): Python 0.1.9.2 decanter was utilized to make the desktop application, called National Grain Tech; this comprises an interactive design providing a user-friendly environment and features, such as multiple buttons/options to resize the data, and caters for the needs of a new user.
Figure 14 shows the front-end view of the desktop application.
The application is user-friendly and easy for non-technicians to use. It has the option to scan and run the test for 10 g or 100 g rice samples, and it can produce either a detailed or a summarized test report.
Figure 15 presents the summarized test results for a sample of each rice kernel and indicates the percentage of features such as yellow, chalky, damaged, and paddy rice. The sample contained 20% damaged rice. When selecting the particular feature tab, the particular rice kernels identified according to that feature are displayed.
Figure 16 shows the detailed quality analysis report generated by the application specifying total number of grains, average length and width, weight and number of grains for yellow, damaged, paddy, and chalky rice.
To the best of our knowledge, this is the first time a rice quality analyzer has been able to achieve such high accuracy while providing a rapid and comprehensive feature analysis. To test the developed system, it was deployed in different rice factories for approximately 3 months to evaluate its efficacy. The testing indicated a successful implementation in the operation of rice factories, associated with the quality of the rice kernels. Work is still underway to adapt this to large-scale implementation in Pakistan’s rice industry.
Table 3 summarizes the accuracy achieved for all features. The achieved accuracy for determining features such as size, weight, color, and chalkiness was 99%. Damaged and undamaged rice kernels were detected with 98.8% accuracy. In addition, broken and paddy rice kernel features were detected with 98% and 100% accuracy, respectively.
7. Conclusions
The traditional manual assessment of rice quality used by rice industries is error-prone, tedious, and time-consuming. This study presents Pakistan’s first AI-based rice quality analyzer (‘National Grain Tech’), which analyzes the quality of a sample of rice kernels in less than 60 s. The system was successfully tested on six different rice types (IRRI-6, PK386, 1121 white rice, selah rice, super kernel basmati brown, and white rice) and was able to predict seven major features that strongly contribute to the quality assessment of rice products, which is a unique result. Previous research has evaluated only single rice types and extracted a maximum of four features. The results demonstrated 99% accuracy in determining the size, weight, color, and chalkiness of rice kernels, whereas an accuracy of 98.8% was achieved for the classification of damaged and undamaged kernels, 98% for spotting broken kernels, and 100% for determining paddy rice kernels. The system proved to be non-destructive and precise, enabling an operator to react quickly to changes, in contrast to conventional methods that are inconsistent, unreliable, and inefficient. The results are significant because the developed system improves local rice quality testing and quality control capacity through a faster, more comprehensive, more accurate, and less expensive mechanism than those of previous research studies. The system was validated by deploying and testing it in various rice factories for a period of approximately 3 months. The testing indicated a successful implementation in the operation of the rice factories, improving testing time and accuracy and resulting in precise quality assessment, which significantly affects the price and export potential of the tested rice.
Based on the advice of rice experts in the industry, we plan to extend this work to include the detection of insects, moisture detection, and spotting of overlapped kernels by incorporating advanced techniques.