Accuracy Assessment of Tomato Harvest Working Time Predictions from Panoramic Cultivation Images

Naito, Hiroki; Ota, Tomohiko; Shimomoto, Kota; Hosoi, Fumiki; Fukatsu, Tokihiro

doi:10.3390/agriculture14122257


by Hiroki Naito ¹,²,*, Tomohiko Ota ², Kota Shimomoto ³, Fumiki Hosoi ¹ and Tokihiro Fukatsu ³

¹ Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 1138657, Japan

² Research Center for Agricultural Robotics, National Agriculture and Food Research Organization, Tsukuba 3050856, Japan

³ Institute of Agricultural Machinery, National Agriculture and Food Research Organization, Tsukuba 3050856, Japan

* Author to whom correspondence should be addressed.

Agriculture 2024, 14(12), 2257; https://doi.org/10.3390/agriculture14122257

Submission received: 29 October 2024 / Revised: 27 November 2024 / Accepted: 5 December 2024 / Published: 10 December 2024

(This article belongs to the Special Issue Sensor-Based Precision Agriculture)


Abstract:

The scale of horticultural facilities in Japan is expanding, making the efficient management of labor costs essential, particularly in large-scale tomato production. This study developed a consistent and practical system for predicting harvest working time and estimating the quantity and weight of harvested fruit using panoramic images of cultivation rows. The system integrates a deep learning model, the Mask Region-based convolutional neural network (Mask R-CNN), to count harvestable fruits from images and a predictive algorithm to estimate working time based on the fruit count. The results indicated that harvest working time, averaged over all workers, could be predicted with an error of 30.1% three days before the harvest date and 15.6% on the harvest date. The trial also revealed that prediction accuracy varied with workers’ experience and cultivation methods. This study highlights the system’s potential to optimize harvesting plans and labor allocation, providing a novel tool for reducing labor costs while maintaining efficiency in large-scale tomato greenhouse production.

Keywords:

tomato; prediction; harvest working time; deep learning; Mask R-CNN

1. Introduction

In the operation of large-scale horticultural facilities, labor shortages and high labor costs are common issues worldwide. Labor costs are a significant burden for large-scale facilities producing commercial horticultural crops, such as tomatoes (Solanum lycopersicum L.), which often require several dozen to several hundred employees. For example, in greenhouse horticulture, labor costs account for 29% and 25% of production costs in the Netherlands [1] and Israel [2], respectively. In Japan, a nationwide survey [3] (p. 84) indicates that labor costs constitute 35% of total production costs.

Reducing labor costs is a key factor in lowering overall production expenses. Therefore, efficient workforce management is critical for maintaining profitability while expanding the production scale [4]. To develop efficient work plans, research has been conducted on modeling the harvesting processes of sweet peppers [2] and cut roses [5]. Another study conducted work planning simulations for strawberry and raspberry production systems using multi-agent systems in which agents such as pickers and runners perform different tasks [6]. Tomato harvesting and shipment preparation remain unmechanized and are performed manually, resulting in a significant labor burden. The same survey [3] (p. 50) indicates that harvesting accounts for 24% of total work time in tomato production facilities in Japan, while 23% is spent on shipment preparation. The time required for harvesting is highly dependent on fluctuations in yield, which vary significantly with weather conditions and fruit set. Consequently, harvesting can sometimes finish later or earlier than planned. Therefore, if the harvest volume can be estimated beforehand, labor can be allocated more efficiently. Notably, the number of harvested tomatoes and their weight strongly correlate with the time required for harvesting work [7].

Previous studies developed systems to record work performance and improve farm work planning efficiency. In an early exploration of agricultural information systems, a labor management application was created that used a portable terminal to remotely log work hours and details on a database server [8]. To obtain more detailed data, systems that automatically measure weight and register working hours were proposed [7,9,10]. Additionally, cloud-based systems that gather this information and allow managers to view it remotely and in an integrated manner have also been developed [11]. By analyzing these data, managers can assess the work speed of individual workers [7].

Accurate work planning requires both individual work speed data and an estimate of the daily workload. In harvesting, this workload is represented by the number of harvested fruits. Having both sets of data enables accurate prediction of the time each worker needs to complete their tasks and helps evaluate worker performance. However, no method has been fully established to predict daily yields at fine temporal and spatial scales. Previous research has focused on predicting potential fruit yields based on dry matter production for tomato [12], cucumber [13], and sweet pepper [14]. More recently, machine learning-based methods that rely on environmental and historical harvest data have been proposed. For example, one study introduced a network architecture combining a temporal convolutional network and a recurrent neural network to predict future yields based on past data and greenhouse conditions [15]. Others have employed wavelet neural networks optimized by genetic algorithms to forecast tomato yields in sunlight greenhouses in China [16] or used artificial neural networks to simulate tomato crop dry matter weight and fresh fruit yields [17]. Several studies [18,19] estimated the number of harvestable fruits from crop images, reporting strong correlations (r² > 0.80) between estimated and actual yields. However, these studies did not aim to predict daily yields, which are critical for evaluating the daily workload. Other approaches, such as topological-based models combined with data mining [20], have been proposed to predict daily yields, but these focused solely on yield prediction without investigating the relationship between yield and work hours. A process modeling method was introduced to predict harvest working time for a greenhouse-based cut-flower production system [5], demonstrating that the system could achieve 94% accuracy in predicting working time. Nevertheless, the system’s complexity makes it impractical for routine use, highlighting the need for a more straightforward approach to predict harvest working time.

This study proposes a system that utilizes panoramic images of multiple plants in a greenhouse to predict harvest-related parameters, including harvest working time. Section 2 outlines how the number of harvested fruits, harvest weight, and working time are predicted. In Section 3, we explain the key finding of this study, which is that prediction accuracy varies depending on the worker, cultivation method, and prediction period. Section 4 discusses the factors affecting prediction accuracy and explores how this method can be integrated into work planning for commercial greenhouses.

2. Materials and Methods

2.1. Method of Collecting Panoramic Images of Cultivation Rows

2.1.1. Low-Stage Cultivation

Panoramic images of low-stage (LS) cultivated tomatoes were obtained in a pipe greenhouse at the Institute of Agricultural Machinery, the National Agriculture and Food Research Organization (NARO), in Tsukuba, Ibaraki, Japan. Image collection was conducted daily from 13 December 2018 to 12 March 2019. The tomato variety selected for this study was the Japanese ‘Momotaro York’. The greenhouse contained two rows of growing beds, with 18 plants per row on each side, arranged in two staggered rows; the plants were planted on 6 November 2018 (Figure 1a). A three-stage pruning system was employed, in which the plants were trained to a single stem and all axillary buds were removed.

As described in a previous study [21], the fruit-set monitoring system was used to collect panoramic images of the plants. The system generated panoramic images at night by moving along a heating pipe rail at a constant speed. Analysis of the acquired panoramic images can provide a wide range of information, including yield prediction, growth diagnoses, and disease and pest detection. This study utilized the system to predict harvest working time based on the number of harvestable fruits counted from the images. The system was programmed to start automatically at 3:00 a.m. each night and ran along the central row of the greenhouse, as indicated by the light blue arrow in Figure 1a, capturing panoramic images from one side of the two growing rows.

2.1.2. Long-Term Multi-Stage Cultivation

Panoramic images of long-term multi-stage (LTMS) tomatoes were captured in a high-eave greenhouse at the Institute of Vegetable and Floriculture Science, National Agriculture and Food Research Organization (NARO), Tsukuba, Ibaraki, Japan (Figure 1b). The images were collected daily from 1 June 2019 to 2 August 2019, using the same tomato variety, ‘Momotaro York’, as in the LS cultivation. The greenhouse contained six rows of Rockwool hydroponic beds, with 42 plants per row on each side, arranged in two staggered rows. High-wire cultivation was implemented, where the plants were trained to have a single main stem, and all axillary buds were removed. When the plants reached a height of three meters, the vines were lowered weekly using a high-wire system to maintain the visibility of three fruit clusters, including harvestable ones, in the images. After lowering the vines, all leaves below 160 cm in height were removed to improve the visibility of harvestable fruits. During LTMS cultivation, the fruit-set monitoring system automatically activated at 3:00 a.m. nightly to capture panoramic images. As shown by the red frame in Figure 1b, the system ran automatically along one row at the edge of the greenhouse, collecting panoramic images of both sides of each row nightly.

2.2. Prediction of the Harvest Parameters

2.2.1. Method of Estimating Harvestable Fruit

The number of harvestable fruits was estimated from the panoramic images using an instance segmentation model based on the Mask Region-based convolutional neural network (Mask R-CNN) [22], fine-tuned for tomato fruit detection. Of the fruit regions proposed by Mask R-CNN, only those deemed harvestable based on hue values were counted as harvestable fruits. Mask R-CNN was selected because, at the time of the experiment, it was one of the few models capable of performing instance segmentation and had been used in similar applications for segmenting tomato fruits [23]. Although Mask R-CNN tends to have longer inference times than one-stage models like YOLO, this was not an issue because the images in this study were post-processed. Mask R-CNN was trained following the methodology outlined in [21], where it was optimized to identify tomato fruit regions within the images. For this study, a fruit detection model was trained and validated using images from different rows within the experimental columns. The dataset consisted of 100 images from LS cultivation and 452 images from LTMS cultivation, which were split in a 3:1 ratio for training and validation, respectively. Training, validation, and testing of the Mask R-CNN model were conducted using the TensorFlow (ver. 1.10.0) and Keras (ver. 2.1.3) libraries (https://github.com/matterport/Mask_RCNN; accessed on 31 October 2024). The model was implemented on hardware consisting of a GeForce GTX 1080 Ti graphics card (NVIDIA Corporation, Santa Clara, CA, USA), an Intel Xeon processor E5-1650 v4 (6-core, 3.60 GHz) (Intel Corporation, Santa Clara, CA, USA), 64 GB of RAM, and a 64-bit Ubuntu 18.04 LTS operating system. The training process used a Mask R-CNN model pretrained on the MS COCO dataset and involved 300 epochs for each dataset, with the number of steps per epoch set to the number of training images. The learning rate and IoU threshold were set to 0.001 and 0.50, respectively. Average precision (AP) was calculated at each epoch, and the model with the highest AP was selected for further inference.
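The 3:1 split described above can be sketched as follows. This is a minimal illustration only: the function name `split_train_val`, the random shuffling, and the fixed seed are assumptions, since the text does not state how individual images were assigned to the training and validation sets.

```python
import random


def split_train_val(image_ids, train_parts=3, val_parts=1, seed=0):
    """Shuffle image IDs and split them in a train_parts:val_parts ratio.

    Random assignment is an assumption; the study may have assigned
    images differently (for example, by cultivation row).
    """
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    n_train = len(ids) * train_parts // (train_parts + val_parts)
    return ids[:n_train], ids[n_train:]


# Dataset sizes taken from the text: 100 LS images and 452 LTMS images.
ls_train, ls_val = split_train_val(range(100))
ltms_train, ltms_val = split_train_val(range(452))
print(len(ls_train), len(ls_val))      # 75 25
print(len(ltms_train), len(ltms_val))  # 339 113
```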

During inference, the tomato fruits in the panoramic images were detected and counted, and their maturity was assessed. Each panoramic image was divided into ten sections for both LS and LTMS cultivation, and inference was performed on each section. Mask R-CNN identified the candidate fruit areas, and the corresponding pixels were extracted. The extracted pixel values were then converted from the RGB to the HSL (Hue, Saturation, Lightness) color space, and the maturity level of each pixel was classified based on its hue value. This study used the hue-based method for fruit maturity classification rather than having Mask R-CNN classify maturity directly. A Mask R-CNN classifier requires a fixed labeled dataset and cannot easily adapt to changing harvest criteria, making it less practical for commercial greenhouses. In contrast, the HSL-based method allows quick adjustments through threshold modifications and is better suited to dynamic conditions.

The classification criteria (Table 1) divide fruits into three maturity stages: Green (pre-change), Turning (transitional), and Red (post-change). The color change observed in tomatoes before harvest is attributed to the degradation of chlorophylls a and b, which impart a green hue, and the simultaneous biosynthesis of β-carotene and lycopene, which produce orange and red hues [24]. To evaluate maturity levels based on fruit color changes, the hue in the HSL or HSI (Hue, Saturation, Intensity) color space is commonly employed. This study followed the methodology established in previous research [25], which classifies fruit regions based on hue.

After the classification step, the maturity level with the highest pixel count was assigned as the maturity level of the entire fruit. To address potential issues caused by overexposure due to specular reflection, pixels with a lightness (L) value of 210 or higher were excluded from the analysis. The thresholds for hue and lightness were determined empirically through preliminary experiments, since suitable values can vary depending on factors such as cultivar and imaging conditions. All of these processing steps were executed using the OpenCV (ver. 3.1.0) library in Python (ver. 3.5).
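The per-pixel hue classification and per-fruit majority vote described above can be sketched as follows. This is an illustrative sketch, not the study's implementation: the hue limits (`RED_MAX_HUE_DEG`, `RED_WRAP_MIN_HUE_DEG`, `TURNING_MAX_HUE_DEG`) are hypothetical placeholders because the actual criteria are those of Table 1, the lightness cutoff of 210 is assumed to be on a 0 to 255 scale, and the standard-library `colorsys` module is used here in place of OpenCV so that the example is self-contained.

```python
import colorsys
from collections import Counter

# Hypothetical hue limits in degrees; the study's actual values are in Table 1.
RED_MAX_HUE_DEG = 20.0        # assumed: hue below this counts as Red
RED_WRAP_MIN_HUE_DEG = 330.0  # assumed: hue wraps around, so this range is also Red
TURNING_MAX_HUE_DEG = 50.0    # assumed: hue between the Red limit and this is Turning
LIGHTNESS_CUTOFF = 210        # from the text; a 0-255 scale is assumed


def classify_pixel(r, g, b):
    """Return 'Red', 'Turning', 'Green', or None for an overexposed pixel."""
    h, l, _s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    if l * 255.0 >= LIGHTNESS_CUTOFF:
        return None  # likely specular reflection, excluded from the vote
    hue_deg = h * 360.0
    if hue_deg < RED_MAX_HUE_DEG or hue_deg >= RED_WRAP_MIN_HUE_DEG:
        return "Red"
    if hue_deg < TURNING_MAX_HUE_DEG:
        return "Turning"
    return "Green"


def classify_fruit(pixels):
    """Assign the maturity level with the highest pixel count to the fruit."""
    labels = (classify_pixel(r, g, b) for r, g, b in pixels)
    votes = Counter(label for label in labels if label is not None)
    if not votes:
        return None  # every pixel was overexposed
    return votes.most_common(1)[0][0]


# RGB pixels of one hypothetical fruit mask: mostly red, some orange, some glare.
fruit_pixels = [(200, 30, 30)] * 5 + [(220, 150, 40)] * 3 + [(255, 250, 250)] * 10
print(classify_fruit(fruit_pixels))
```

Because the stage boundaries are plain constants, a change in harvest criteria only requires editing the thresholds, which is the flexibility the text attributes to the hue-based approach.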

2.2.2. Measurement Methods for Ground Truth Data

Weekly harvest surveys were conducted during the growing seasons for both LS and LTMS cultivation. Workers with diverse experience levels were included in this study to account for the variability in harvest efficiency across different skill levels. Specifically, three workers were selected: Worker A, without any farming experience; Worker B, an experienced worker involved in research-related farming activities but not daily farming; and Worker C, who regularly engaged in harvesting operations in the experimental greenhouse. This selection was made to reflect the actual conditions of commercial greenhouses, where workers with varying skill levels, from beginners to experts, often work simultaneously.

In the LS cultivation, a harvest survey was performed once a week over seven weeks, from 24 January 2019 to 8 March 2019. Two workers participated: Workers A and B. Each worker manually harvested fruits from one row consisting of 18 plants. The yield and harvesting time monitoring system developed in a previous study [7] was used to record the number of harvested fruits ($N_{f}$), the harvested weight ($W_{f}$), and the harvest working time ($T_{h}$). This system consists of an electronic balance (FG30KBM, A & D Co., Ltd., Tokyo, Japan), a handheld barcode scanner (HMBC-880, Hibino intersound Co., Ltd., Tokyo, Japan), a microprocessor (MBED NXP LPC1768, Switchscience Co., Ltd., Tokyo, Japan), and a cart. By reading pre-registered barcodes of the type of work and the worker’s ID, the system automatically logged the harvested fruit weight and working time and saved the information on a micro-SD card.

For the LTMS cultivation, the harvest survey spanned eight weeks, from 12 June 2019 to 1 August 2019. A skilled worker, Worker C, harvested 20 plants from the designated rows shown in Figure 1b, including those inside the red box. During each harvesting session, three harvest parameters, $N_{f}$, $W_{f}$, and $T_{h}$, were systematically recorded.

2.2.3. Prediction and Accuracy Verification of the Harvest Parameters

The values of $N_{f}$, $W_{f}$, and $T_{h}$ were predicted from the number of harvestable fruits estimated from the collected panoramic images. Predictions were made on six different days for LS cultivation, ranging from the day of harvest to five days before harvest. In the case of LTMS cultivation, three prediction days were set, from the harvest day to two days prior. This limitation in LTMS cultivation was due to the manual adjustment of the tomato plants before the third day of harvest, which altered the positions of the fruit clusters. Consequently, only three days, which allowed eight weeks of uninterrupted data sampling, were included in the analysis. The prediction formula is defined by Equation (1). We used a linear regression equation for the prediction because the previous study [7] suggested a favorable linear relationship between the number of harvested fruits, weight, and working time. The number of harvestable fruits above a certain ripeness level, denoted as $x_{k}$, was used as the explanatory variable. The $N_{f}$, $W_{f}$, and $T_{h}$ values recorded on the actual harvest day were used as the objective variable ($y_{k}$), and ${\hat{y}}_{k}$ denotes its predicted value.

$$\hat{y}_{k} = a x_{k} + C \tag{1}$$

The explanatory variable $x_{k}$ was the number of harvestable fruits measured on the prediction day within the prediction period described above (6 days for LS cultivation, from 5 days before harvest to the day of harvest; 3 days for LTMS cultivation, from the day of harvest to 2 days before harvest). The subscript $k$ indexes the harvest week for which the prediction is made. In addition, to determine the explanatory variable $x_{k}$, the numbers of fruits at three different ripeness levels were compared: (a) the number of Turning-only fruits, (b) the number of Red-only fruits, and (c) the number of fruits classified as either Turning or Red. The ripeness criterion that produced the highest accuracy for same-day predictions was selected for the subsequent analyses.
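As a small illustration of the three candidate definitions (a) to (c), the counts can be derived from per-fruit maturity labels as sketched below. The function name, the dictionary keys, and the example labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def candidate_explanatory_variables(fruit_labels):
    """Count fruits under the three candidate ripeness criteria (a), (b), (c)."""
    counts = Counter(fruit_labels)
    return {
        "turning_only": counts["Turning"],                       # criterion (a)
        "red_only": counts["Red"],                               # criterion (b)
        "turning_plus_red": counts["Turning"] + counts["Red"],   # criterion (c)
    }


# Hypothetical labels for the fruits detected in one panoramic image.
labels = ["Green", "Turning", "Red", "Red", "Turning", "Green", "Red"]
print(candidate_explanatory_variables(labels))
```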

Prediction accuracy was evaluated using the leave-one-out cross-validation method. This method involved training the model on data from all weeks except the target prediction week. The coefficient $a$ and the constant $C$ in Equation (1) were determined using the least-squares method. The model was then used to predict the remaining data for the specific target week. The weighted absolute percentage error (WAPE), defined in Equation (2), served as the accuracy metric because it is suitable for analyzing data with significant variations in yield across different days. In the equation, $k$ indexes the weeks, which are the unit of prediction; $y_{k}$ and $\hat{y}_{k}$ are the measured and predicted values for week $k$; and $n$ denotes the total number of weeks (seven for LS cultivation and eight for LTMS cultivation).

$$\mathrm{WAPE} = \frac{\sum_{k=1}^{n} |y_{k} - \hat{y}_{k}|}{\sum_{k=1}^{n} |y_{k}|} \tag{2}$$
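The evaluation procedure built on Equations (1) and (2) can be sketched as follows: fit the line by least squares on all weeks but one, predict the held-out week, and score all held-out predictions with the WAPE. This is a minimal sketch; the function names are illustrative and the weekly numbers are made up for demonstration and are not data from the study.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept C of y = a * x + C (Equation (1))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def wape(actual, predicted):
    """Weighted absolute percentage error (Equation (2)), as a fraction."""
    total_error = sum(abs(y - p) for y, p in zip(actual, predicted))
    return total_error / sum(abs(y) for y in actual)


def loocv_wape(xs, ys):
    """Hold out each week in turn, fit on the rest, and score the predictions."""
    predictions = []
    for k in range(len(xs)):
        a, c = fit_line(xs[:k] + xs[k + 1:], ys[:k] + ys[k + 1:])
        predictions.append(a * xs[k] + c)
    return wape(ys, predictions), predictions


# Made-up weekly values for illustration only (not from the study):
# harvestable fruits counted in the images and measured working time in minutes.
image_counts = [12, 30, 25, 41, 18, 36, 22]
working_minutes = [3.1, 6.4, 5.9, 8.8, 4.0, 7.5, 5.1]
error, _ = loocv_wape(image_counts, working_minutes)
print(f"LOOCV WAPE: {error:.1%}")
```

Because the denominator sums the measured values over all weeks, weeks with small harvests do not inflate the score the way a per-week percentage error would.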

3. Results

3.1. Examples of Collected Cultivation Row Panoramic Images

Figure 2 shows panoramic images captured during both LS and LTMS cultivation, along with color images highlighting the areas identified as fruit regions through inference. In Figure 2a,a′, the LS cultivation method led to partial occlusion of fruits by leaves, reducing visibility due to the plant arrangement. In contrast, as shown in Figure 2b,b′, the LTMS cultivation method, which involved leaf removal, resulted in better visibility of the fruits.

3.2. Effect of Differences in Maturity Criteria on the Accuracy of the Same-Day Prediction

Figure 3 shows how the prediction error (WAPE) changed with the criterion for the number of harvestable fruits used as the explanatory variable ($x_{k}$) in same-day predictions. The three graphs show the differences among the workers. The prediction error was highest when only Turning fruits were considered. For Workers A and B in LS cultivation, the explanatory variable that included both Turning and Red fruits yielded the lowest error. In contrast, for Worker C in LTMS cultivation, the explanatory variable that included only Red fruits resulted in the lowest error. This trend was observed across all three harvest parameters. In many cases, significant differences were observed between the Turning criterion and the other criteria, whereas no significant difference was found between ‘Red’ and ‘Turning + Red’. Additionally, for $N_{f}$ of Worker A and Worker C, no significant difference was observed among the criteria.

In subsequent analyses, the number of fruits including both Turning and Red fruits was used as the explanatory variable in LS cultivation, whereas the number of Red-only fruits was used in LTMS cultivation. These criteria were selected because they yielded the lowest prediction error for each cultivation method.

3.3. Differences in Harvesting Efficiency by Workers

Figure 4 shows the relationship between actual harvesting time ($T_{h}$) and number of harvested fruits ($N_{f}$) for each worker. The equation displayed represents the regression line for each worker, with its slope indicating that worker’s harvesting speed. All regressions showed a high correlation, with a coefficient of determination higher than 0.95. The fastest harvesting speed was recorded for Worker C (2.5 s/fruit), followed by Worker B (7.3 s/fruit) and Worker A (8.1 s/fruit). Notably, a significant difference in efficiency was observed between the LS cultivation group (Workers A and B) and Worker C, which may be attributed to variations in cultivation methods and differences in daily greenhouse experience.
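To give a sense of scale for the reported slopes, the sketch below converts them into an approximate working time for a given number of fruits. It is a slope-only approximation: the regression intercepts appear only in Figure 4 and are not stated in the text, so they are omitted here, and the function name and the fruit count of 100 are illustrative assumptions.

```python
# Regression slopes reported in the text (seconds per harvested fruit).
SECONDS_PER_FRUIT = {"A": 8.1, "B": 7.3, "C": 2.5}


def approx_harvest_minutes(worker, n_fruits):
    """Slope-only estimate of working time in minutes (intercept omitted)."""
    return SECONDS_PER_FRUIT[worker] * n_fruits / 60.0


for worker in ("A", "B", "C"):
    minutes = approx_harvest_minutes(worker, 100)
    print(f"Worker {worker}: about {minutes:.1f} min for 100 fruits")
```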

3.4. Prediction and Accuracy Verification of Harvest Parameters

3.4.1. Number of Harvested Fruits ($N_{f}$)

Figure 5 shows the prediction results for $N_{f}$. Figure 5a illustrates the relationship between $N_{f}$ and the number of harvestable fruits estimated from the same-day images. Although regression lines were generated for each worker, the parameters were similar across workers, indicating that a common regression equation can be used for estimation. Figure 5b displays the prediction error, measured as the WAPE, for $N_{f}$. The horizontal axis, ‘Days before harvesting’, shows the number of days between the prediction date and the harvest date; that is, the number of harvestable fruits on each prediction date was substituted for the explanatory variable $x_{k}$ in Equation (1). The $N_{f}$ values of Workers A and B in LS cultivation were predicted within a 30% error margin starting three days before harvest and within a 20% error margin starting two days before harvest. Notably, experienced Worker B exhibited a greater prediction error than inexperienced Worker A until three days before harvest, after which the relationship reversed, with Worker B achieving higher prediction accuracy. Skilled Worker C in LTMS cultivation showed a prediction error of 34.6% two days before harvest, which decreased to 18.1% the day before harvest. This accuracy level was comparable to those of Workers A and B in LS cultivation.

3.4.2. Harvested Weight ($W_{f}$)

Figure 6 shows the results of the $W_{f}$ predictions. Figure 6a shows the relationship between $W_{f}$ and the number of harvestable fruits estimated from the same-day images. As with $N_{f}$, the regression lines were broadly similar, although the intercept and slope varied slightly among workers. Figure 6b shows the prediction error (WAPE) of $W_{f}$, with the horizontal axis representing the number of days between the prediction date and the harvest date. The prediction uses the number of harvestable fruits on each prediction date as the explanatory variable $x_{k}$ in Equation (1). $W_{f}$ exhibited a trend analogous to that of $N_{f}$. For Workers A and B in LS cultivation, the prediction error was about 25% three days before harvest, decreasing to approximately 15% two days before harvest, and averaging around 10% on the harvest day. Four days before harvest, experienced Worker B exhibited a larger prediction error than inexperienced Worker A; however, from three days before harvest, no notable difference was observed between the two workers, which differs from the trend seen in $N_{f}$ (Figure 5b).

Worker C in LTMS cultivation showed a similar trend, but the overall prediction error was greater than that observed for Workers A and B in LS cultivation, at approximately 20% even on the harvest day.

3.4.3. Harvest Working Time ($T_{h}$)

Finally, Figure 7 shows the prediction results for $T_{h}$. Figure 7a shows the relationship between $T_{h}$ and the number of harvestable fruits estimated from the same-day images. In contrast to $N_{f}$ and $W_{f}$, described in the previous sections, the regression lines differed for each worker. Figure 7b shows the prediction error (WAPE) of $T_{h}$, with the horizontal axis representing the number of days between the prediction date and the harvest date. The prediction uses the number of harvestable fruits on each prediction date as the explanatory variable $x_{k}$ in Equation (1). As with $N_{f}$ and $W_{f}$, the prediction accuracy improved as the harvest date approached. Workers in the LS cultivation (A and B) had a prediction error of less than 30% from three days before the harvest; however, this error was relatively large compared to the prediction errors for $N_{f}$ and $W_{f}$. By worker, the prediction error for unskilled Worker A remained at approximately 24.7% even on the day of harvest, whereas that for experienced Worker B decreased to 12.4%. Experienced Worker C in the LTMS cultivation had the lowest prediction error, in contrast to the results for $N_{f}$ and $W_{f}$; the error decreased to 9.6% on the day of harvest.

Averaged over all workers, the WAPE for $T_{h}$ was 30.1% three days before harvest and 15.6% on the day of harvest. The results for $T_{h}$ differed from those for $N_{f}$ and $W_{f}$. Specifically, the prediction error for $T_{h}$ was smallest for Worker C, followed by Worker B and Worker A. In other words, the more experienced the worker and the higher the fruit visibility, as in LTMS cultivation, the higher the prediction accuracy.

4. Discussion

In this study, a fruit-set monitoring system that can be operated in horticultural facilities was used to predict the harvest working time ($T_{h}$) for each worker, which is necessary for efficient planning. In addition to $T_{h}$, the accuracy of the prediction of the number of harvested fruits ($N_{f}$) and their weight ($W_{f}$) was investigated. We evaluated how the prediction accuracy of these parameters varied depending on the worker, cultivation method, and number of days before harvest on which the prediction was made.

The relationship between actual harvesting time ($T_{h}$) and number of harvested fruits ($N_{f}$) was investigated to compare the harvesting speeds of the three workers who participated in this study. As shown in Figure 4, the results indicate that the more experienced the worker, the higher the harvesting speed. In particular, the harvesting efficiency of Worker C in LTMS cultivation was nearly three times higher than that of Workers A and B in LS cultivation. As shown in Figure 2, the LTMS cultivation harvested by Worker C exhibited higher fruit visibility owing to leaf removal, which may have significantly increased the harvesting speed compared to the LS cultivation trial. A previous study [7] reported a high correlation between $N_{f}$ and $T_{h}$. The results of this study are consistent with this report, with high coefficients of determination ranging from 0.96 to 0.98 for all workers. However, the regression line parameters were not uniform, and different prediction equations were necessary for each worker.

Common trends were observed in the predictions of $N_{f}$ and $W_{f}$. Regarding the regression equations used for the predictions, no significant differences were identified among the workers or cultivation methods, as shown in Figure 5a and Figure 6a. This indicates that a single regression equation is applicable for predicting $N_{f}$ and $W_{f}$. However, attention should be paid to inherent error factors when using a single equation for prediction. For example, different ripeness criteria were selected to determine harvestable fruits for the LS and LTMS cultivation methods, and an incorrectly established criterion may introduce variability. The effect of the number of days between the prediction and harvest dates on prediction accuracy differed between LS and LTMS cultivation. In LS cultivation, the prediction error decreased from three days before harvest, remaining within 30% for $N_{f}$ and 25% for $W_{f}$. The improvement in prediction accuracy three days before harvest was attributed to the progression of fruit coloration, which brought the fruits closer to maturity. In contrast, for LTMS cultivation, the prediction accuracy began to improve two days before harvest, making the effective prediction period one day shorter than that of LS cultivation. This may be because LTMS cultivation occurred during summer, when fruit coloration progressed more rapidly than in LS cultivation. Thus, the prediction period using this method is influenced by seasonal variations to a small extent. As in a previous study [26], a detailed model that categorizes fruit maturation into distinct stages and describes the coloring process and the influence of temperature could explain these seasonal variations. Compared to a previous study [20] that predicted daily harvest weight using a topologically based model and reported an error of 26%, this study achieved more favorable results. It is important to note that the previous study predicted the harvest weight for the entire greenhouse, thereby incorporating errors arising from variations in growth conditions. In addition to environmental influences, factors such as labor on the day of harvest and past yields are also believed to affect yield. Since environmental factors, labor on the day of harvest, and past yields were not considered in this analysis, the accuracy of the predictions could be enhanced in future studies by incorporating these secondary data.

For the prediction of $T_{h}$, the prediction error on the day of harvest was 15.6% on average (Figure 7b). The corresponding accuracy was lower than that of previous works using the Gworks model, which reported 94% for Dutch cut roses [5] and 92–96% for sweet pepper [2]. One possible reason for this is the difference in model complexity. Our method is straightforward and uses only the number of harvestable fruits counted from the image as an explanatory variable. In contrast, previous research required many input parameters, such as the distribution of the harvest volume, job frequencies, greenhouse resources, and parameters of the probability density function related to the harvesting process [2,5], and these parameters had to be carefully measured in advance. In this respect, although the proposed method has an advantage in practical use, detailed modeling of the work process remains necessary to improve prediction accuracy.

In addition, regarding differences in accuracy among workers, the prediction error for $T_{h}$ was lower for Workers C, B, and A, with higher prediction accuracy for more experienced workers and for LTMS cultivation, in which fruit visibility is higher. The results suggest that as workers become more skilled, their work speed increases and becomes more consistent, which in turn lowers the prediction error for working time. In short, the higher the work efficiency, the more accurate the prediction of $T_{h}$. Work efficiency is affected by the worker's skill level and by the plant training method used in cultivation, which is related to the ease of harvesting. A study on a harvesting robot rather than human workers [27] showed that, in robotic harvesting, the obstacle-shielding rate is the dominant factor in the ease of harvesting when approaching the target fruit. In LTMS cultivation, leaves were removed before harvesting; thus, the shielding ratio was lower than in LS cultivation, which is believed to have facilitated harvesting. As another factor that may affect tomato harvesting efficiency, Japanese researchers investigated tomato harvesting by older adults and reported efficiency differences among age groups and even among older adults [28]. In addition, a report from the province of Almeria, Spain [29], compared work time data for approximately 50 workers with similar reports from Israel and concluded that Spanish workers were 60% faster, influenced by factors such as yield and whether the harvest location was clustered. Other reports [30] have shown that extremely high or low fruiting positions relative to the worker reduce harvesting efficiency. Another study [4], citing previous related studies, also noted that work efficiency is affected by climate, worker skill, crop structure, age, and fruit number. Although numerous factors affect work efficiency, resolving the problems that hinder the ease of harvesting removes uncertainties in the harvest task, such as poor decisions and work disruptions. Consequently, the harvest speed becomes more uniform, and the accuracy of $T_{h}$ estimation improves. Notably, the regression line for estimating $T_{h}$ has different parameters for different workers, as shown in Figure 7a. Therefore, we conclude that it is realistic to construct a prediction equation for each worker.
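The idea of one prediction equation per worker can be illustrated with a minimal sketch. It assumes that $T_{h}$ is modeled as a straight line in the harvestable fruit count, as the regression lines in Figure 7a suggest. The ordinary least-squares fit, the error measure (mean absolute percentage error), the names `WorkerModel` and `fit_worker_model`, and all sample numbers are illustrative assumptions, not the authors' code or data.

```python
from dataclasses import dataclass


@dataclass
class WorkerModel:
    """Straight-line model: working time = slope * fruit_count + intercept."""
    slope: float
    intercept: float

    def predict(self, fruit_count: float) -> float:
        return self.slope * fruit_count + self.intercept


def fit_worker_model(fruit_counts, working_times):
    """Ordinary least-squares fit of working time against fruit count."""
    n = len(fruit_counts)
    if n < 2 or n != len(working_times):
        raise ValueError("need at least two paired observations")
    mean_x = sum(fruit_counts) / n
    mean_y = sum(working_times) / n
    sxx = sum((x - mean_x) ** 2 for x in fruit_counts)
    if sxx == 0:
        raise ValueError("fruit counts must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(fruit_counts, working_times))
    slope = sxy / sxx
    return WorkerModel(slope=slope, intercept=mean_y - slope * mean_x)


def mean_absolute_percentage_error(model, fruit_counts, working_times):
    """Mean of |predicted - measured| / measured, in percent (assumed metric)."""
    errors = [abs(model.predict(x) - y) / y
              for x, y in zip(fruit_counts, working_times)]
    return 100.0 * sum(errors) / len(errors)


# Made-up observations: (harvestable fruit counts, working times in minutes).
observations = {
    "experienced": ([20, 40, 60, 80], [5.0, 9.0, 13.0, 17.0]),
    "novice": ([20, 40, 60, 80], [9.0, 14.0, 24.0, 27.0]),
}

# One model per worker, fitted only on that worker's own records.
models = {name: fit_worker_model(x, y) for name, (x, y) in observations.items()}

for name, (x, y) in observations.items():
    error = mean_absolute_percentage_error(models[name], x, y)
    print(f"{name}: slope={models[name].slope:.3f}, "
          f"intercept={models[name].intercept:.3f}, error={error:.1f}%")
```

In this toy data the worker with the more consistent pace yields the smaller error, which mirrors the qualitative pattern described above. In practice the error would have to be evaluated on harvest days not used for fitting.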

Finally, we discuss how the proposed harvest working time prediction method can be applied to actual work planning. At actual production sites, the original harvest work plan may need to change depending on worker attendance. As mentioned in the Introduction, systems that record individual workers' work speeds and progress in detail have been reported [4,7,9,11]. By integrating the harvesting efficiency data recorded by these systems with the prediction method proposed in this study, it becomes possible to estimate the time each worker will spend harvesting on a given day. The estimated time provides essential information for managers when assigning work. For example, Bechar et al. [31] conducted a detailed simulation of work paths and work procedures to optimize tomato work efficiency in a greenhouse, resulting in a 32% labor reduction. By extending this study, the estimated workload at each location in the facility could be obtained, and the optimal harvest route for the day could be planned based on such a simulation. Such quantitative planning of harvest work will help greenhouse managers ensure that the planned work is completed. In addition, when linked to a work management system such as that in [4], it would allow a quantitative evaluation of worker productivity and bring transparency to personnel evaluation. This is expected to provide new value in maintaining workers' work ethic and motivation in the face of the labor shortage, which has recently become a serious issue.
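As a rough illustration of how predicted working times could feed into work assignment, the sketch below distributes cultivation rows among the workers who are present so that the predicted finishing times stay balanced. It is a simple greedy heuristic chosen for illustration, not the simulation approach of [31]. The function `assign_rows`, the row names, the fruit counts, and the per-worker line parameters are all hypothetical.

```python
def assign_rows(row_fruit_counts, worker_models):
    """Greedy assignment: largest rows first, each to the worker who would finish earliest.

    row_fruit_counts: dict mapping row id -> predicted harvestable fruit count.
    worker_models: dict mapping worker id -> (slope, intercept) of a hypothetical
        per-worker line, minutes = slope * fruit_count + intercept.
    Returns (assignment, load): rows per worker and predicted minutes per worker.
    """
    assignment = {worker: [] for worker in worker_models}
    load = {worker: 0.0 for worker in worker_models}

    # Handle the heaviest rows first so that small rows can even out the loads.
    ordered = sorted(row_fruit_counts.items(), key=lambda item: item[1], reverse=True)
    for row, count in ordered:

        def finish_time(worker):
            slope, intercept = worker_models[worker]
            return load[worker] + slope * count + intercept

        best = min(worker_models, key=finish_time)
        load[best] = finish_time(best)
        assignment[best].append(row)
    return assignment, load


# Hypothetical inputs: fruit counts per row and per-worker line parameters.
rows = {"row1": 120, "row2": 80, "row3": 60, "row4": 40}
workers = {"A": (0.2, 1.0), "B": (0.3, 2.0)}

assignment, load = assign_rows(rows, workers)
for worker in workers:
    print(worker, assignment[worker], f"{load[worker]:.1f} min")
```

If a worker is absent, the same function can be rerun with that worker removed from `workers`, which gives a revised plan and the predicted completion time for the remaining staff.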

This system shows the potential to improve the efficiency of greenhouse operations, but several issues remain. The impact of optimized labor allocation on cost reduction needs to be verified. This research was conducted on a small scale; future verification should cover the entire facility and multiple workers. Further investigation is required into predictions when work other than harvesting is involved. Regarding the introduction of hardware, the safety and risks of unmanned nighttime robot operation must be addressed. Finally, the accuracy of the fruit detection model needs to be improved, for example by comparison with newer, more advanced models. Solving these issues will enhance the feasibility and effectiveness of the system.

5. Conclusions

In this study, we developed a system for predicting harvest working time ($T_{h}$) based on daily panoramic images of tomato cultivation rows, along with the number ($N_{f}$) and weight ($W_{f}$) of harvested fruits. The results revealed that prediction accuracy varied depending on the experience of the workers and the cultivation method used. More experienced workers demonstrated higher harvest efficiency, resulting in fewer prediction errors. These findings highlight the potential for implementing this system in commercial greenhouse operations to optimize worker assignments and improve overall productivity. Future research should explore integrating additional factors into the prediction model to further enhance accuracy and operational efficiency.

Author Contributions

Conceptualization, H.N., T.O. and T.F.; methodology, H.N. and T.O.; software, H.N.; validation, H.N.; resources, T.O. and T.F.; data curation, H.N. and K.S.; writing—original draft preparation, H.N.; writing—review and editing, H.N., T.O., K.S., F.H. and T.F.; visualization, H.N.; supervision, T.O., F.H. and T.F.; project administration, T.O. and T.F.; funding acquisition, T.O. and T.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by a grant from a project study commissioned by the Ministry of Agriculture, Forestry, and Fisheries, Japan, on ‘AI-based optimization of environmental control and labor management for large-scale greenhouse production’.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of the Institute of Agricultural Machinery, National Agriculture and Food Research Organization (protocol code No. 2901; date of approval 30 June 2017).

Data Availability Statement

The data presented in this paper are available upon request from the corresponding author.

Acknowledgments

We thank the Institute of Vegetable and Floriculture Science, NARO, for providing tomato plants as measurement materials.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Bac, C.W.; Van Henten, E.; Hemming, J.; Edan, Y. Harvesting Robots for High-Value Crops: State-of-the-Art Review and Challenges Ahead. J. Field Robot. 2014, 31, 888–911.
2. Elkoby, Z.; van’t Ooster, B.; Edan, Y. Simulation Analysis of Sweet Pepper Harvesting Operations. In Advances in Production Management Systems. Innovative and Knowledge-Based Production Management in a Global-Local World; Grabot, B., Vallespir, B., Gomes, S., Bouras, A., Kiritsis, D., Eds.; Springer: Berlin/Heidelberg, Germany, 2014; pp. 441–448.
3. JGHA. Survey on Large-Scale Facility Horticulture and Plant Factories: Fact-Finding Survey and Case Study; JGHA: Tokyo, Japan, 2024.
4. Ohyama, K.; Fujioka, J.; Sato, T.; Matsuo, T. System for determining the job status of individual laborers in a large-scale greenhouse. Comput. Electron. Agric. 2023, 206, 107661.
5. van’t Ooster, A.; Bontsema, J.; van Henten, E.J.; Hemming, S. GWorkS—A discrete event simulation model on crop handling processes in a mobile rose cultivation system. Biosyst. Eng. 2012, 112, 108–120.
6. Harman, H.; Sklar, E.I. Multi-agent task allocation for harvest management. Front. Robot. AI 2022, 9, 864745.
7. Ota, T.; Iwasaki, Y.; Nakano, A.; Kuribara, H.; Higashide, T. Development of yield and harvesting time monitoring system for tomato greenhouse Production. Eng. Agric. Environ. Food 2019, 12, 40–47.
8. Otuka, A.; Sugawara, K. A labor management application using handheld computers. Agric. Inf. Res. 2003, 12, 95–103.
9. Ampatzidis, Y.G.; Whiting, M.D.; Scharf, P.A.; Zhang, Q. Development and evaluation of a novel system for monitoring harvest labor efficiency. Comput. Electron. Agric. 2012, 88, 85–94.
10. Ampatzidis, Y.G.; Whiting, M.D.; Liu, B.; Scharf, P.A.; Pierce, F.J. Portable weighing system for monitoring picker efficiency during manual harvest of sweet cherry. Precis. Agric. 2013, 14, 162–171.
11. Ampatzidis, Y.; Tan, L.; Haley, R.; Whiting, M.D. Cloud-based harvest management information system for hand-harvested specialty crops. Comput. Electron. Agric. 2016, 122, 161–167.
12. Saito, T.; Kawasaki, Y.; Ahn, D.H.; Ohyama, A.; Higashide, T. Prediction and Improvement of Yield and Dry Matter Production Based on Modeling and Non-destructive Measurement in Year-round Greenhouse Tomatoes. Hortic. J. 2020, 89, 425–431.
13. Maeda, K.; Ahn, D.-H. Estimation of dry matter production and yield prediction in greenhouse cucumber without destructive measurements. Agriculture 2021, 11, 1186.
14. Watanabe, T.; Muramatsu, Y.; Homma, M.; Higashide, T.; Ahn, D.H. Development of a Simple Empirical Yield Prediction Model Based on Dry Matter Production in Sweet Pepper. Agriculture 2022, 68, 13–24.
15. Gong, L.; Yu, M.; Jiang, S.; Cutsuridis, V.; Pearson, S. Deep learning based prediction on greenhouse crop yield combined TCN and RNN. Sensors 2021, 21, 4537.
16. Wang, Y.; Xiao, R.; Yin, Y.; Liu, T. Prediction of tomato yield in Chinese-style solar greenhouses based on wavelet neural networks and genetic Algorithms. Information 2021, 12, 336.
17. López-Aguilar, K.; Benavides-Mendoza, A.; González-Morales, S.; Juárez-Maldonado, A.; Chiñas-Sánchez, P.; Morelos-Moreno, A. Artificial neural network modeling of greenhouse tomato yield and aerial dry matter. Agriculture 2020, 10, 97.
18. Bargoti, S.; Underwood, J.P. Image segmentation for fruit detection and yield estimation in apple orchards. J. Field Robot. 2017, 34, 1039–1060.
19. Apolo-Apolo, O.E.; Martínez-Guanter, J.; Egea, G.; Raja, P.; Pérez-Ruiz, M. Deep learning techniques for estimation of the yield and size of citrus fruits using a UAV. Eur. J. Agron. 2020, 115, 126030.
20. Hoshi, T.; Sasaki, T.; Tsutsui, H.; Watanabe, T.; Tagawa, F. A daily harvest prediction model of cherry tomatoes by mining from past averaging data and using topological case-based modeling. Comput. Electron. Agric. 2000, 29, 149–160.
21. Naito, H.; Shimomoto, K.; Fukatsu, T.; Hosoi, F.; Ota, T. Interoperability analysis of tomato fruit detection models for images taken at different Facilities, Cultivation Methods, and Times of the Day. AgriEngineering 2024, 6, 1827–1846.
22. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. arXiv 2018, arXiv:1703.06870.
23. Afonso, M.; Fonteijn, H.; Fiorentin, F.S.; Lensink, D.; Mooij, M.; Faber, N.; Polder, G.; Wehrens, R. Tomato Fruit Detection and Counting in Greenhouses Using Deep Learning. Front. Plant Sci. 2020, 11, 571299.
24. Carrillo-López, A.; Yahia, E.M. Changes in color-related compounds in tomato fruit exocarp and mesocarp during ripening using HPLC-APcI+-mass Spectrometry. J. Food Sci. Technol. 2014, 51, 2720–2726.
25. Hayashi, S.; Shigematsu, K.; Yamamoto, S.; Kobayashi, K.; Kohno, Y.; Kamata, J.; Kurita, M. Evaluation of a strawberry-harvesting robot in a field test. Biosyst. Eng. 2010, 105, 160–171.
26. Naito, H.; Kawasaki, Y.; Hidaka, K.; Higashide, T.; Misumi, M.; Ota, T.; Lee, U.; Takahashi, M.; Hosoi, F.; Nakagawa, J. Effect of air temperature on each fruit growth and ripening stage of strawberry ‘Koiminori’. Int. Agrophys. 2024, 38, 195–202.
27. Fujinaga, T.; Yasukawa, S.; Ishii, K. Development of automatic tomato fruit harvesting robot for facility horticulture. J. Robot. Soc. Jpn. 2021, 39, 921–925.
28. Takahashi, M.; Murata, K.; Aizawa, M. The research of aged persons’ working capacity and workload in harvesting tomatoes and bud pinching of mum. Agric. Res. 2010, 63, 143–144.
29. Manzano-Agugliaro, F.; Garcia-Cruz, A. Time study techniques applied to labor management in greenhouse tomato (Solanum lycopersicum L.) cultivation. Agrociencia 2009, 43, 267–277.
30. Atsumi, T.; Sakuma, H.; Higashiyama, T. On the Labuor Saving of Harvesting for Vegetables in Greenhouse Culture. Jpn. J. Farm Work. Res. 1980, 1980, 1–5.
31. Bechar, A.; Yosef, S.; Netanyahu, S.; Edan, Y. Improvement of work methods in tomato greenhouses using simulation. Trans. ASABE 2007, 50, 331–338.

Figure 1. Layout of the trial greenhouse: (a) low-stage cultivation arrangement; (b) long-term multi-stage cultivation arrangement. In both figures, the red boxes indicate the plants to be targeted, and the light blue arrows indicate the path lines of the fruit-set monitoring system.

Figure 2. Examples of panoramic images: (a) a panoramic image captured during LS cultivation; (a′) the LS cultivation panoramic image with fruit candidate regions highlighted in color; (b) a panoramic image captured during LTMS cultivation; (b′) the LTMS cultivation panoramic image with fruit candidate regions highlighted in color.

Figure 3. Prediction errors in $N_{f}$, $W_{f}$, and $T_{h}$ based on different maturity levels. Each graph shows the results for individual workers. The assigned letters represent the results of the Mann–Whitney U test, where different letters indicate significant differences between groups at the 5% significance level.

Figure 4. Relationship between $N_{f}$ and $T_{h}$ for each worker. The regression equations and coefficients of determination are shown alongside the regression lines.

Figure 5. Prediction results for $N_{f}$ for each worker: (a) scatter plots and regression equations depicting the relationship between the number of harvestable fruits counted from the panoramic image on the harvest day and $N_{f}$; (b) changes in the prediction error of $N_{f}$ with the number of days between the prediction day and the harvest day.

Figure 6. Prediction results for $W_{f}$ for each worker: (a) scatter plots and regression equations illustrating the relationship between the number of harvestable fruits counted from the panoramic image on the harvest day and $W_{f}$; (b) changes in the prediction error of $W_{f}$ with the number of days between the prediction day and the harvest day.

Figure 7. Prediction results for $T_{h}$ for each worker: (a) scatter plots and regression equations illustrating the relationship between the number of harvestable fruits counted from the panoramic image on the harvest day and $T_{h}$; (b) changes in the prediction error of $T_{h}$ with the number of days between the prediction day and the harvest day.

Table 1. Hue value ranges for classifying tomato fruit maturity.

Color	Hue Value
Red	0 ≤ hue < 40, 300 ≤ hue < 360
Turning	40 ≤ hue < 70
Green	70 ≤ hue < 160
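The ranges in Table 1 can be read as a simple lookup from hue to maturity class. The sketch below assumes that hue is expressed in degrees on a 0 to 360 scale, which the upper bound in the table suggests. How the hue of a detected fruit region is summarized (for example, as a mean or median over the region) is not stated here, so the function takes a single representative value. The function name `classify_maturity` is hypothetical.

```python
def classify_maturity(hue_degrees):
    """Map a hue angle in degrees to a maturity class using the Table 1 ranges.

    Returns "Red", "Turning", or "Green", or None when the hue lies in the
    interval that Table 1 does not cover (160 <= hue < 300).
    """
    hue = hue_degrees % 360.0  # wrap values such as 360 or -10 into [0, 360)
    if hue < 40 or hue >= 300:
        return "Red"
    if hue < 70:
        return "Turning"
    if hue < 160:
        return "Green"
    return None


# Example: representative hue values for a few detected fruit regions.
for hue in (12.0, 55.0, 110.0, 330.0, 200.0):
    print(hue, classify_maturity(hue))
```

Libraries differ in how they scale hue. OpenCV, for instance, stores hue for 8-bit images in the range 0 to 179 (degrees divided by two), so such values would need to be doubled before being passed to this function.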

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Naito, H.; Ota, T.; Shimomoto, K.; Hosoi, F.; Fukatsu, T. Accuracy Assessment of Tomato Harvest Working Time Predictions from Panoramic Cultivation Images. Agriculture 2024, 14, 2257. https://doi.org/10.3390/agriculture14122257
