1. Introduction
Corn is a crucial staple food worldwide [1], and its myriad products and derivatives find applications across various domains. Breeding is the key technology for increasing corn yield, and plant phenomics plays a pivotal role in modern breeding [2]. Scrutinizing phenotypic parameters among individual corn plants facilitates targeted crop improvement, aligns breeding with specific demands, and augments overall yield. During breeding, rapid acquisition of phenotypic parameters is a crucial task [3]: it not only improves the efficiency of phenotypic parameter extraction but also enables monitoring of plant growth [4]. Traditionally, extracting phenotypic parameters for corn involved manual measurements with rulers and protractors, leading to inefficiencies, substantial measurement errors, and potential harm to plants. One advanced method employs high-precision point cloud generation and 3D reconstruction of the plant [5]. Existing studies on phenotype extraction primarily focus on 3D reconstruction. Zermas [6] utilized high-resolution RGB imagery from UAVs and handheld cameras for corn 3D reconstruction, extracting parameters such as plant count, leaf area index, individual and average plant height, individual leaf length, leaf location, and angles relative to the stem. Zhao [7] used a single RGB image for the 3D reconstruction of plants and extracted parameters such as height, canopy size, and trunk diameter. Zhu [8] utilized an improved skeleton extraction algorithm to reconstruct a 3D model of a tomato plant. Li [9] utilized LiDAR and an RGB camera to obtain the heights of corn plants in the field. However, 3D point cloud approaches face several challenges: they require complex equipment, impose venue restrictions, involve intricate processing pipelines, and lack unified algorithms for point cloud data processing. This study addresses these issues by exploring binocular images for 3D keypoint extraction, efficiently extracting 3D phenotypic parameters from corn plants using inexpensive and widely available binocular cameras.
Traditional 3D keypoint detection methods for binocular data are mainly applied to the KITTI dataset [10], employing a cost volume to regress only one 3D center point coordinate for each bounding box. For example, Stereo CenterNet [11] focuses on regressing the 3D center, utilizing binocular images for regression, and subsequently obtaining the 3D bounding box. In addition, Keypoint3D [12] can use 3D information and monocular images to directly regress the 3D center of the object.
To overcome the above limitations and take advantage of prior information from 2D keypoint detection models, the 3D keypoint detection method proposed in this work, based on YOLOv7-Pose [13], utilizes binocular images: a modified 2D detector first jointly outputs 2D bounding boxes and keypoints in the left images. The same detector outputs 2D bounding boxes in the right images, allowing the creation of a union bounding box. By feeding the left and right bounding boxes to a stereo-matching network, depth maps are obtained. The depth map combined with the keypoints facilitates the extraction of 3D keypoints and subsequent phenotypic parameter extraction.
2D keypoint detection technology originated in human pose detection [14] and has since been applied to agriculture: Du [15] used keypoint detection and point clouds to detect the 3D poses of tomatoes, and Zheng [16] utilized the pixel positions and depths of keypoints to calculate the size of vegetables. In this work, corn plant keypoint detection is based on the YOLOv7-Pose method and stereo matching, leveraging coordinate point regression for faster detection speed. This study contributes by describing a binocular corn keypoint dataset and proposing the stereo corn phenotype extraction algorithm (SCPE) for corn plant phenotypic parameter extraction. SCPE comprises two modules: the YOLOv7-SlimPose model and the phenotypic parameter extraction module, the latter incorporating stereo matching, skeleton construction, and parameter calculation. The key contributions of this study are as follows:
- (1)
This work proposes a novel approach named SCPE for extracting phenotypic parameters for corn plants with binocular images. A binocular image dataset of corn plants has been constructed, with keypoints and bounding boxes accurately marked for each object.
- (2)
This work designed the YOLOv7-SlimPose model through a comprehensive analysis of the YOLOv7-Pose. Structure optimization, loss function adjustment, and model pruning were executed within the core architecture of YOLOv7-Pose. This model is designed to achieve precise detection of corn-bounding boxes and keypoints using much fewer parameters.
- (3)
This work proposes the phenotypic parameter extraction module to utilize the output of YOLOv7-SlimPose. Leveraging the model’s output, this module constructs skeletons of leaves and stems and extracts phenotypic parameters for corn plants. Moreover, the module facilitates corn plant growth monitoring functions [17], such as detecting lodging, monitoring the number of leaves, and assessing the overall normalcy of growth.
The remainder of the study is organized as follows: Section 2 provides an overview of the dataset construction, details the data augmentation methods employed during training, and presents the specifics of the proposed method, SCPE. Section 3 presents the performance of the SCPE algorithm in terms of keypoint detection and the results of phenotypic parameter extraction. Section 4 discusses the findings, and Section 5 provides a brief conclusion of the proposed SCPE.
2. Materials and Methods
2.1. Materials
2.1.1. Data Acquisition
The main task of this work is extracting phenotypic parameters of corn plants using binocular images, and the images utilized were obtained from the experimental field at Yangzhou University in Yangzhou, Jiangsu Province, China. The data collection took place from 15 July to 1 August 2023. The specific corn variety under investigation was SuYuNuo.1 (Jiangsu Zhonghe Seed Industry Co., Ltd., Nanjing, China).
To capture the entire developmental process of corn, images of the selected corn plants were taken daily from the heading to the maturity stage. During this period, a number of image pairs of corn plants were obtained that play an important role in phenotypic parameter extraction. Additionally, images were captured at different times and under various weather conditions to enhance the diversity of data under different lighting conditions, preventing overfitting of the model.
To ensure keypoint detection precision, a blue curtain was positioned behind the corn plants to minimize the impact of background clutter caused by the presence of similar corn plants during image capture. The ZED2I binocular depth camera was employed for capturing binocular photographs and depth maps. This work collected 1000 sets of raw data and expanded them to 4000 pairs through data augmentation. Each set included the camera’s internal parameters, binocular images, and depth maps. The RGB images and depth maps both had a resolution of 1080 × 720 pixels, and the images were saved in PNG format. These data were divided into training and test sets at a ratio of 9:1. Finally, 20 sets from the raw data were selected for phenotypic parameter extraction, and the extracted phenotypic parameters were compared with manual measurement values. The composition and use of the data are shown in Table 1.
2.1.2. Acquisition of Labels
In the process of data acquisition, the LabelMe annotation tool was used to label both bounding boxes and keypoints. The annotation file comprehensively captures the center coordinates, width, and height of each bounding box, along with the 2D coordinates of the keypoints, as shown in Figure 1. For each corn plant, two types of objects were marked, leaves and stems, with seven types of keypoints. As depicted in Figure 2, the seven types of keypoints are detailed as follows:
- (1)
Root point: The root point refers to the point at which the root of the corn plant connects to the ground.
- (2)
Top point: The top point refers to the highest (relative to the ground) point of the corn plant.
- (3)
Leaf connection point: The leaf connection point refers to the point at which a leaf is attached to the main stem.
- (4)
Leaf highest point: The highest point refers to the uppermost point of the leaf.
- (5)
Leaf angle point: The leaf angle point refers to the point one-quarter of the distance from the leaf connection point to the highest point of the leaf.
- (6)
Leaf tip point: The leaf tip point refers to the tip of the leaf.
- (7)
Stalk point: The stalk point refers to the point at which the corncob connects to the stem of the corn plant.
Figure 1. Example of keypoints labeled in images; bounding boxes of stems and leaves are not shown.
Figure 2. Diagram of the keypoints of the plant. The stem object contains three types of keypoints, and the leaf object contains four types of keypoints.
Based on the coordinates of the keypoints in the left image, the depth value z is extracted from the depth map, as illustrated in Figure 3. Unlike the KITTI dataset, where the left image label is not intrinsically linked to the right image label, this work employed the internal parameters of the camera to project the bounding box and keypoints from the left to the right image. The 3D coordinates $(X, Y, Z)$ of pixel $(u, v)$ were obtained using Equation (1), providing annotation information for the right image:

$$X = \frac{(u - c_x)\,z}{f_x}, \quad Y = \frac{(v - c_y)\,z}{f_y}, \quad Z = z \tag{1}$$

where $(u - c_x, v - c_y)$ is the location of the pixel relative to the camera center $(c_x, c_y)$, and $f_x$ and $f_y$ are the horizontal and vertical focal lengths, respectively.
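As a concrete illustration of Equation (1), the following minimal Python sketch back-projects a detected 2D keypoint to camera coordinates using the depth map and the intrinsic parameters; the variable names (fx, fy, cx, cy) and the numeric values are illustrative assumptions for a standard pinhole model without lens distortion, not the exact implementation used in this work.

```python
import numpy as np

def backproject_keypoint(u, v, depth_map, fx, fy, cx, cy):
    """Convert a 2D keypoint (u, v) plus its depth into 3D camera coordinates.

    depth_map : HxW array of metric depth values (same resolution as the image)
    fx, fy    : horizontal and vertical focal lengths in pixels
    cx, cy    : principal point (camera center) in pixels
    """
    z = float(depth_map[int(round(v)), int(round(u))])  # depth at the keypoint
    x = (u - cx) * z / fx                                # Equation (1)
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# Example with illustrative intrinsics for a 1080 x 720 image
depth = np.full((720, 1080), 2.5)                        # dummy 2.5 m depth map
point_3d = backproject_keypoint(540.0, 360.0, depth, fx=700.0, fy=700.0, cx=540.0, cy=360.0)
print(point_3d)  # -> [0.0, 0.0, 2.5]
```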
2.1.3. Data Augmentation
To enrich the training dataset and enhance the model’s generalization, this work implemented various data augmentation techniques. These methods include mosaic [18], random flip [19], scaling, and color-space conversion. Horizontal mirroring of images annotated with 2D boxes and keypoints was employed to effectively double the dataset size. Additionally, for improved color representation, the images were converted from RGB to HSV. The HSV color space is particularly advantageous for capturing primary color tones, brightness levels, and the contrast between lightness and darkness. In corn plant images, distinct color characteristics are evident for the background, corn, stems, and leaves. By incorporating data augmentation during training, the model’s generalization is significantly enhanced, especially in diverse lighting conditions. This approach contributes to a more robust and general model.
Figure 4 includes examples of augmented images and raw images.
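To make these operations concrete, the following Python/OpenCV sketch illustrates two of the augmentations described above, horizontal flipping with matching keypoint mirroring and RGB-to-HSV conversion; the keypoint layout (an N x 2 array of pixel coordinates) is an assumption made for illustration rather than the exact data format used in this work.

```python
import cv2
import numpy as np

def flip_horizontal(image, keypoints):
    """Mirror an image and its 2D keypoints about the vertical axis.

    image     : HxWx3 BGR array
    keypoints : Nx2 array of (x, y) pixel coordinates
    """
    h, w = image.shape[:2]
    flipped = cv2.flip(image, 1)                  # 1 = horizontal flip
    kps = keypoints.copy().astype(np.float32)
    kps[:, 0] = (w - 1) - kps[:, 0]               # mirror x coordinates
    return flipped, kps

def to_hsv(image):
    """Convert a BGR image to the HSV color space used for color augmentation."""
    return cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

img = np.zeros((720, 1080, 3), dtype=np.uint8)    # dummy image
kps = np.array([[100.0, 200.0], [500.0, 650.0]])  # dummy keypoints
flipped_img, flipped_kps = flip_horizontal(img, kps)
hsv_img = to_hsv(img)
```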
2.1.4. Phenotypic Parameter of Corn Plant
Considering the importance of phenotypic parameters in plant phenotyping and the ease of identifying the corresponding keypoints in the plant, this work attempted to extract four key phenotypic parameters from corn plants: plant height, the angle between leaf and stem, leaf length, and ear position. By leveraging the marked keypoints and depth information extracted from the depth map, the 3D coordinates of these keypoints can be obtained. By utilizing these 3D coordinates and associating them with the respective components, this work constructed the skeleton of the corn plant and extracted the phenotypic parameters. The calculation process was based on the following rules (a code sketch illustrating them follows the list):
- (1)
Height of Plant:
The height of the plant was defined as the distance from the root to the top of the plant. To calculate plant height, this work used the root, stalk, top, and leaf connection points. The Euclidean distances between adjacent points, ordered by their Y values, were calculated and summed to obtain the plant height.
- (2)
Angles between Leaf and Stem:
The angle between each leaf and the stem [20] varies from leaf to leaf, so the angles need to be calculated individually. Each angle was calculated as the angle between the line connecting the averaged leaf connection point to the angle point of the same leaf and the line connecting the leaf connection point to the point above it on the stem.
- (3)
Length of Leaf:
The length of each leaf on a corn plant varies. For the four keypoints on one leaf, the Euclidean distances were calculated between the leaf connection point and the angle point, between the angle point and the leaf’s highest point, and between the leaf’s highest point and the leaf tip point. Summing these Euclidean distances gives the length of the leaf.
- (4)
Ear Position:
The ear position [21] refers to the location of the corn cob on the entire corn stem. The Euclidean distance between the root point and the stalk point on the stem can be calculated; combined with the previously calculated plant height, the ear position can be obtained.
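The sketch below illustrates these rules with NumPy under the assumption that each keypoint is already available as a 3D coordinate; the function layout and the interpretation of ear position as the ratio of the root-to-stalk distance to plant height are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def dist(a, b):
    """Euclidean distance between two 3D points."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

def plant_height(stem_points):
    """Sum of distances between adjacent stem points ordered along the vertical axis."""
    ordered = sorted(stem_points, key=lambda p: p[1])      # order by Y (height)
    return sum(dist(a, b) for a, b in zip(ordered, ordered[1:]))

def leaf_length(connection, angle_pt, highest, tip):
    """Polyline length along the four leaf keypoints."""
    return dist(connection, angle_pt) + dist(angle_pt, highest) + dist(highest, tip)

def leaf_stem_angle(connection, angle_pt, stem_above):
    """Angle (degrees) between the leaf direction and the stem direction at the connection point."""
    v_leaf = np.asarray(angle_pt) - np.asarray(connection)
    v_stem = np.asarray(stem_above) - np.asarray(connection)
    cos_a = np.dot(v_leaf, v_stem) / (np.linalg.norm(v_leaf) * np.linalg.norm(v_stem))
    return float(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))

def ear_position(root, stalk, height):
    """Location of the ear along the stem, expressed relative to plant height (assumed ratio)."""
    return dist(root, stalk) / height
```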
2.2. Overall Technical Route
To address the challenge of extracting the phenotypic parameters of corn plants, this study introduces SCPE, which comprises the YOLOv7-SlimPose model and the phenotypic parameter extraction module. The overall flowchart of the proposed method is depicted in Figure 5.
The first step is to create a corn plant keypoint dataset to train the YOLOv7-SlimPose model. In the keypoint detection stage, the YOLOv7-SlimPose model performs both bounding box and keypoint detection within a single model: it outputs bounding boxes and keypoints for the left image and bounding boxes for the right image. Then, in the phenotypic parameter extraction stage, PSMNet is used to generate the depth map from the bounding boxes in the left and right images. Using the Z values obtained for the keypoints from the depth map, the 3D coordinates are calculated with the camera’s internal parameters as described in Section 2.1.2. From these 3D coordinates, the skeleton of the corn plant is constructed, leading to the extraction of phenotypic parameters for the corn plant. This comprehensive approach integrates both object detection and depth information to provide a robust solution for extracting phenotypic parameters. The pseudocode is shown in Algorithm 1.
Algorithm 1 Stereo Corn Phenotype Extraction Algorithm (SCPE)
Input: Left and right images
Output: Phenotypic parameters
Step 1: Keypoint detection by YOLOv7-SlimPose;
Step 2: if keypoints detected then
            pyramid stereo matching network
        else
            go back to Step 1
        end
Step 3: Skeleton construction;
Step 4: Phenotypic parameter computation;
2.3. Standard YOLOv7-Pose Model
YOLOv7-Pose is the kernel of the first stage of SCPE and the basic framework for the keypoint detection stage. YOLOv7-Pose, a keypoint detection network based on the YOLO structure [22], differs from approaches that encode raw images into heat maps, such as CenterNet [23]. Instead, it directly outputs the results end-to-end, leading to a significant enhancement in training speed. YOLOv7-Pose employs two multi-scale feature fusion paths, bottom-to-top and top-to-bottom, within the same framework. This design predicts all keypoints with an anchor for detection, effectively accomplishing the keypoint detection task without introducing excessive computational overhead.
The primary architecture of YOLOv7-Pose comprises the backbone network, neck layer, and head layer. The backbone network is responsible for extracting image features at multiple scales. The neck layer fuses the features from the backbone network at each scale. The head layer utilizes four feature maps and two decoupled heads to predict objects of different sizes and their keypoints.
As shown in Figure 6, the CBS convolution module, which constitutes the efficient layer aggregation network (ELAN) structure, is a pivotal component. It consists of two-dimensional convolution kernels of different sizes, a batch normalization function, and a SiLU activation function. Multiple basic CBS convolutions form an ELAN structure. The input information from the backbone layer is fused with the features in the neck layer using bottom-to-top and top-to-bottom strategies. Finally, the output of each head layer is connected to two decoupled heads to predict the bounding box and keypoints of the corn plant.
In the head layer, each object bounding box is characterized by six data elements: the anchor horizontal coordinate $C_x$, the anchor vertical coordinate $C_y$, the predicted box width $W$, the predicted box height $H$, the detection box confidence $b_{conf}$, and the class confidence $c_{conf}$. Each keypoint consists of three data elements: the horizontal coordinate $K_x$, the vertical coordinate $K_y$, and the confidence $K_{conf}$. Therefore, for each object, the network predicts 6 elements for the bounding box detection head and 21 elements for the keypoint detection head (7 keypoints with 3 elements each), totaling 27 elements, as shown in Equation (2):

$$\mathrm{Output} = \left[C_x, C_y, W, H, b_{conf}, c_{conf}, K_x^{1}, K_y^{1}, K_{conf}^{1}, \ldots, K_x^{7}, K_y^{7}, K_{conf}^{7}\right] \tag{2}$$
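A minimal sketch of how such a 27-element prediction vector could be split back into a bounding box and seven keypoints is given below; the element ordering follows Equation (2), and the field names are illustrative assumptions rather than the exact YOLOv7-Pose tensor layout.

```python
import numpy as np

def decode_prediction(vec):
    """Split a 27-element prediction vector into box and keypoint fields.

    vec layout (assumed): [cx, cy, w, h, box_conf, cls_conf,
                           kx1, ky1, kconf1, ..., kx7, ky7, kconf7]
    """
    vec = np.asarray(vec, dtype=np.float32)
    assert vec.shape == (27,)
    box = {
        "center": (vec[0], vec[1]),
        "size": (vec[2], vec[3]),
        "box_conf": vec[4],
        "cls_conf": vec[5],
    }
    keypoints = vec[6:].reshape(7, 3)   # rows: (x, y, confidence) for each of 7 keypoints
    return box, keypoints

box, kps = decode_prediction(np.arange(27, dtype=np.float32))
print(box["center"], kps.shape)         # -> (0.0, 1.0) (7, 3)
```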
2.4. YOLOv7-SlimPose
This work optimized the structure and bounding box loss function of YOLOv7-Pose in the keypoint detection stage of SCPE, resulting in a more streamlined model with fewer parameters, named YOLOv7-SlimPose, dedicated to keypoint detection in corn plants.
Figure 6 illustrates the architecture of YOLOv7-SlimPose compared with the original YOLOv7-Pose. The bounding box loss function in YOLOv7-SlimPose has been changed from the complete intersection over union (CIoU) to the minimum point distance-based IoU (MPDIoU) [24]. This change addresses issues encountered when the predicted box shares the same aspect ratio as the real labeled box but differs significantly in width and height values. To reduce the model’s computational demands and size, the neck component was optimized using GSConv and GSIN: GSConv replaces the convolutional layers with large numbers of parameters in the neck, whereas GSIN, inspired by Inception and GSConv, replaces the ELAN-H module of YOLOv7-Pose. After training, the model size was further reduced through pruning techniques.
2.4.1. Bounding Box Loss Function
The YOLOv7 bounding box employs the complete intersection over union (CIoU) loss [25], which, compared with the original IoU loss, takes into account the Euclidean distance between the box centers and considers cases of overlapping centroids and different aspect ratios. Equations (3)–(6) illustrate the calculation:

$$L_{CIoU} = 1 - \mathrm{IoU} + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v \tag{3}$$

$$\mathrm{IoU} = \frac{|A \cap B|}{|A \cup B|} \tag{4}$$

$$v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2 \tag{5}$$

$$\alpha = \frac{v}{(1 - \mathrm{IoU}) + v} \tag{6}$$

where $A$ and $B$ are the real and predicted boxes, IoU is the ratio of the intersection area to the union area of the real and predicted boxes, and $b$ and $b^{gt}$ are the coordinate positions of the centroids of the predicted and real boxes, respectively. $w$ and $h$ are the width and height of a bounding box, $c$ is the diagonal length of the smallest box enclosing both boxes, $\alpha$ is a weighting factor, $v$ is a penalty factor for the ratio of the width and height of the predicted box to that of the real box, and $\rho^2(\cdot)$ is the square of the Euclidean distance between the two centroids.
However, the CIoU loss encounters challenges when the predicted box has the same aspect ratio as the real labeled box but significantly different width and height values. To address this issue, this work selects the MPDIoU loss as a suitable alternative to the CIoU loss. MPDIoU incorporates three key factors: overlapping or non-overlapping areas, center-point distance, and deviations in width and height. The loss calculation is simplified by minimizing the point distance between the predicted bounding box and the ground-truth bounding box. This replacement aims to address the difficulties in the optimization process. For MPDIoU, Equations (7)–(10) present the calculation:

$$d_1^2 = (x_1^{B} - x_1^{A})^2 + (y_1^{B} - y_1^{A})^2 \tag{7}$$

$$d_2^2 = (x_2^{B} - x_2^{A})^2 + (y_2^{B} - y_2^{A})^2 \tag{8}$$

$$\mathrm{MPDIoU} = \mathrm{IoU} - \frac{d_1^2}{w^2 + h^2} - \frac{d_2^2}{w^2 + h^2} \tag{9}$$

$$L_{MPDIoU} = 1 - \mathrm{MPDIoU} \tag{10}$$

where $A$ and $B$ are the real box and predicted box, respectively; IoU is the ratio of the intersection area of the real and predicted boxes to their union area, as in Equation (4); $(x_1^{A}, y_1^{A})$ and $(x_2^{A}, y_2^{A})$ denote the top-left and bottom-right point coordinates of $A$; $(x_1^{B}, y_1^{B})$ and $(x_2^{B}, y_2^{B})$ denote the top-left and bottom-right point coordinates of $B$; and $w$ and $h$ are the width and height of the input image.
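As a hedged illustration of Equations (7)–(10), the following Python sketch computes the MPDIoU loss for axis-aligned boxes given as (x1, y1, x2, y2); it is a straightforward transcription of the equations above, not the authors' training code.

```python
def mpdiou_loss(box_a, box_b, img_w, img_h):
    """MPDIoU loss for two boxes in (x1, y1, x2, y2) format.

    box_a : ground-truth box, box_b : predicted box
    img_w, img_h : width and height of the input image
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection over union (Equation 4)
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between corresponding corners (Equations 7 and 8)
    d1_sq = (bx1 - ax1) ** 2 + (by1 - ay1) ** 2
    d2_sq = (bx2 - ax2) ** 2 + (by2 - ay2) ** 2

    # MPDIoU and its loss (Equations 9 and 10)
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou

print(mpdiou_loss((10, 10, 50, 80), (12, 14, 48, 90), img_w=1080, img_h=720))
```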
2.4.2. Slim Neck
In the age of equipment miniaturization, this work addresses the challenge of the model size of YOLOv7-Pose, which has almost twice the number of parameters of YOLOv7. The model size was reduced while maintaining precision; the strategy involved updating the neck part of the model and conducting pruning training. In the neck layer of YOLOv7-Pose, GSConv [26] was introduced at the positions of multi-channel convolutions, replacing the basic original convolution. GSConv builds on depthwise convolution [27], optimizing convolution operations through a multi-step process. Initially, it downsamples the channels with a standard convolution layer to reduce computational complexity. Then, it employs depthwise convolution for efficient feature map processing. The outcomes of these two operations are concatenated, leveraging their respective capabilities while preserving the network’s ability to capture essential features. Lastly, GSConv utilizes a shuffle operation to rearrange the channel order of the feature maps, enhancing information flow and facilitating more efficient computations.
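A minimal PyTorch sketch of a GSConv-style block, following the description above (standard convolution to half the output channels, depthwise convolution, concatenation, and channel shuffle), is shown below; it is a simplified reimplementation of the published GSConv idea [26], not the exact module used in this work.

```python
import torch
import torch.nn as nn

class GSConvSketch(nn.Module):
    """Simplified GSConv-style block: standard conv -> depthwise conv -> concat -> shuffle."""

    def __init__(self, in_ch, out_ch, kernel=3, stride=1):
        super().__init__()
        half = out_ch // 2
        self.conv = nn.Sequential(                       # standard conv to half the channels
            nn.Conv2d(in_ch, half, kernel, stride, kernel // 2, bias=False),
            nn.BatchNorm2d(half),
            nn.SiLU(),
        )
        self.dwconv = nn.Sequential(                     # depthwise conv on the reduced features
            nn.Conv2d(half, half, 5, 1, 2, groups=half, bias=False),
            nn.BatchNorm2d(half),
            nn.SiLU(),
        )

    def forward(self, x):
        a = self.conv(x)
        b = self.dwconv(a)
        y = torch.cat([a, b], dim=1)                     # concatenate both branches
        n, c, h, w = y.shape
        y = y.view(n, 2, c // 2, h, w).transpose(1, 2)   # channel shuffle across the two groups
        return y.reshape(n, c, h, w)

out = GSConvSketch(64, 128)(torch.randn(1, 64, 40, 40))
print(out.shape)   # -> torch.Size([1, 128, 40, 40])
```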
Additionally, this work introduces the GSIN module, based on the Inception [28] and GSConv concepts. The GSIN module extracts multi-dimensional features through four different parallel convolution branches and concatenates them. The outputs of these operations are then fused and convolved through the GSConv and CBS modules. GSIN extracts features better with fewer parameters and is employed to replace the ELAN-H modules with large channel counts in the neck layer. The structures of GSIN and GSConv are shown in Figure 6.
2.4.3. Pruning Training
To further reduce the parameters of the model and accelerate detection, structural pruning [29] was applied. The structural pruning process, executed on the model with the best training results, involved three main steps:
- (1)
Sparse Training:
In this step, the importance of each convolution kernel in the deep model was evaluated. By applying sparse training, crucial convolution kernels were identified.
- (2)
Model Pruning:
In this step, unimportant convolution kernels were removed during the model pruning stage.
- (3)
Fine-Tuning:
In this step, the pruned model was fine-tuned to achieve precision comparable to that of a normally trained network. This step ensures that, despite the reduction in parameters, the model maintains or even improves its performance.
In structural pruning, this work compared three mainstream pruning techniques: Slim, Group Slim, and Lamp. This comparison evaluates the effectiveness and trade-offs of each pruning approach in terms of model size and precision.
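As an illustration of the channel-importance idea underlying Slim-style structural pruning, the sketch below scores the channels of each BatchNorm layer by the magnitude of their scale factors (gamma) and selects the channels to keep under a given pruning ratio; it is a conceptual sketch only, since the actual pruning in this work also removes the corresponding filters and is followed by fine-tuning.

```python
import torch
import torch.nn as nn

def bn_channel_importance(model):
    """Collect |gamma| of every BatchNorm2d layer as a per-channel importance score."""
    scores = {}
    for name, module in model.named_modules():
        if isinstance(module, nn.BatchNorm2d):
            scores[name] = module.weight.detach().abs()
    return scores

def channels_to_keep(scores, prune_ratio=0.13):
    """For each BN layer, return the indices of the highest-importance channels to keep."""
    keep = {}
    for name, gamma in scores.items():
        n_keep = max(1, int(round(gamma.numel() * (1.0 - prune_ratio))))
        keep[name] = torch.topk(gamma, n_keep).indices.sort().values
    return keep

# Tiny demo model standing in for the trained detector
demo = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.SiLU())
keep = channels_to_keep(bn_channel_importance(demo), prune_ratio=0.13)
print({k: v.numel() for k, v in keep.items()})   # -> number of channels kept per BN layer
```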
2.5. Phenotypic Parameter Extraction Module
In this work, the phenotypic parameter extraction module of SCPE uses the output of the keypoint detection stage to produce the phenotypic parameters. The module involves three main steps:
- (1)
Stereo Matching Network:
After obtaining the bounding boxes of the corn plants from the left and right images, the boxes were combined to create a union box. The depth map was then generated through the stereo-matching network within the union box. PSMNet, which introduces a spatial pyramid pooling structure, was used for this purpose; it obtains a depth map of the object region by constructing a 4D cost volume and applying 3D convolutions. From the depth map, the 3D coordinates of the keypoints were determined based on the keypoints in the left image.
- (2)
Construction of the Skeleton:
Using these 3D coordinates, the skeleton of each leaf and the stem can be constructed. For one corn plant, skeleton construction is divided into the construction of the leaves and of the stem. The skeleton of each leaf was constructed from the leaf object detection result and its related keypoints: four keypoints are detected per leaf, and the leaf skeleton was constructed in the order leaf connection point, leaf angle point, leaf highest point, and leaf tip point. The skeleton of the stem was constructed from the stem object detection result, its related keypoints (root, stalk, and top points), and the leaf connection points: these points were first averaged and then ordered according to their Z values, and the skeleton was constructed in the resulting order.
- (3)
Phenotypic Parameter Computation:
After constructing the skeleton of the corn plant, the positions of the leaves and the stem, with their points in 3D space, can be obtained. Following the calculation method outlined in Section 2.1.4, the phenotypic parameters can be extracted.
Through these steps, the phenotypic parameter extraction module extracts the phenotypic parameters of the corn plant. The process of the module is shown in Figure 7.
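The grouping and ordering logic of the skeleton construction step can be sketched as follows; the keypoint dictionary layout is an illustrative assumption, and sorting the stem points by their Z values mirrors the ordering rule described above.

```python
import numpy as np

def build_leaf_skeleton(leaf_kps):
    """Order the four leaf keypoints into a polyline: connection -> angle -> highest -> tip."""
    order = ["connection", "angle", "highest", "tip"]
    return [np.asarray(leaf_kps[name]) for name in order]

def build_stem_skeleton(stem_kps, leaf_connection_points):
    """Collect root/stalk/top points plus leaf connection points and order them by Z value."""
    points = [np.asarray(stem_kps[name]) for name in ("root", "stalk", "top")]
    points += [np.asarray(p) for p in leaf_connection_points]
    return sorted(points, key=lambda p: p[2])          # order along the Z axis

leaf = {"connection": (0.0, 0.4, 2.1), "angle": (0.1, 0.5, 2.0),
        "highest": (0.2, 0.6, 1.9), "tip": (0.3, 0.5, 1.8)}
stem = {"root": (0.0, 0.0, 2.2), "stalk": (0.0, 0.6, 2.1), "top": (0.0, 1.5, 2.0)}
stem_skeleton = build_stem_skeleton(stem, [leaf["connection"]])
leaf_skeleton = build_leaf_skeleton(leaf)
```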
2.6. Experimental Setting
The software and hardware settings for model training and testing in this study are listed in Table 2. Training was run for 1100 epochs with a batch size of 4. Optimization was performed using the Adam optimizer, and the learning rate was adjusted every 500 epochs.
2.7. Model Evaluation Metrics
In this work, mAP was used to measure the accuracy of keypoint detection, and the error was used to measure the accuracy of phenotypic parameter extraction. For the keypoint detection task, evaluating the precision of the detection involves more than simply measuring the Euclidean distance between true and predicted points: the assessment depends on the type of keypoint, and different weights are assigned when calculating the similarity between the actual and predicted points. In this study, the same evaluation metrics as YOLOv7-Pose were adopted, utilizing the average precision (AP) and the mAP based on the keypoint similarity metric Object Keypoint Similarity (OKS). OKS is calculated using Equation (11):

$$\mathrm{OKS}_p = \frac{\sum_{i} \exp\!\left(-\dfrac{d_{pi}^2}{2\,s_p^2\,\sigma_i^2}\right)\,\delta(v_{pi} > 0)}{\sum_{i} \delta(v_{pi} > 0)} \tag{11}$$

where $p$ indexes the corn plants in the detection image ($p = 1, \ldots, N$, with $N$ the number of corn plants); $i$ denotes the keypoint number of the plant; $v_{pi}$ denotes the visibility of the keypoint in the image; $d_{pi}$ denotes the Euclidean distance between the true and predicted points, where a smaller value of $d_{pi}$ indicates a better prediction at that point; $s_p$ indicates the square root of the area occupied by the object detection bounding box that identifies the corn plant; and $\sigma_i$ measures the standard deviation of different keypoints. In this way, the Euclidean distance $d_{pi}$ of the keypoints, the area $s_p$ of the detected corn plant, the labeling bias $\sigma_i$, and the OKS are normalized to maintain consistency with the above analysis.
AP (average precision) is the average precision of the predicted results on the test set. As shown in Equation (12), a threshold $s$ is set for the OKS: if the OKS value of a detection is greater than $s$, the keypoint is considered correctly detected; otherwise, it is considered incorrect. The mAP (mean average precision) is the mean value of AP over different thresholds. As shown in Equation (13), the threshold was varied from 0.5 to 0.95 in steps of 0.05:

$$\mathrm{AP}^{s} = \frac{\sum \delta(\mathrm{OKS} > s)}{\sum 1} \tag{12}$$

$$\mathrm{mAP} = \frac{1}{10} \sum_{s \in \{0.50, 0.55, \ldots, 0.95\}} \mathrm{AP}^{s} \tag{13}$$
Error (%) is used to measure whether the extracted phenotypic parameters are accurate. The calculation formula is given in Equation (14):

$$\mathrm{Error}(\%) = \frac{|F - F_m|}{F_m} \times 100 \tag{14}$$

where $F$ denotes the extracted phenotypic parameters and $F_m$ denotes the phenotypic parameters obtained by manual measurement.
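The two metrics can be sketched in a few lines of NumPy as follows; the per-keypoint standard deviations (sigmas) are dataset-specific constants, and the values used here are placeholders rather than the ones used in this work.

```python
import numpy as np

def oks(pred, gt, visible, area, sigmas):
    """Object Keypoint Similarity (Equation 11) for one corn plant.

    pred, gt : Nx2 arrays of predicted / true keypoint coordinates
    visible  : length-N array of 0/1 visibility flags
    area     : bounding box area of the plant (s^2 in Equation 11)
    sigmas   : length-N per-keypoint standard deviations
    """
    d_sq = np.sum((pred - gt) ** 2, axis=1)
    e = np.exp(-d_sq / (2.0 * area * sigmas ** 2))
    mask = visible > 0
    return float(e[mask].sum() / max(mask.sum(), 1))

def error_percent(extracted, measured):
    """Relative error of an extracted phenotypic parameter (Equation 14)."""
    return abs(extracted - measured) / measured * 100.0

sigmas = np.full(7, 0.05)                       # placeholder per-keypoint constants
pred = np.random.rand(7, 2) * 100
oks_value = oks(pred, pred + 1.0, np.ones(7), area=900.0, sigmas=sigmas)
print(oks_value, error_percent(182.0, 175.0))   # OKS and a sample plant height error in %
```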
3. Results
In this section, the experiments follow the order of the proposed improvements.
Table 3 explores the impact of three different bounding box loss functions on the YOLOv7-SlimPose model. To address the inherent challenges of CIoU, this work proposes using MPDIoU. For comparative analysis, SIoU [30] was also included in the experiment. The results indicate that the highest mAP, 96.8%, was achieved with MPDIoU, surpassing the CIoU loss by 0.5% and the SIoU loss by 0.3%. This experiment suggests that employing MPDIoU in YOLOv7-SlimPose leads to superior bounding box regression compared with SIoU and CIoU, as the MPDIoU loss function accounts for the overlapping and non-overlapping areas, the center-point distance, and the width and height deviations.
Building on the MPDIoU improvement, this work simplified the neck component of the model by using GSConv and GSIN to replace the original convolution and ELAN-H modules in the neck. Table 4 shows that the replacement had no negative effects: a 0.1% improvement in mAP was obtained with 6 million fewer parameters.
Table 5 presents the results of further reducing the parameters through structural pruning. Three different pruning algorithms (Slim, Group Slim, and Lamp) were compared. The results show that none of the three pruning algorithms adversely impacted mAP. The Slim algorithm reduced the number of parameters by approximately 13%, outperforming Group Slim and Lamp.
Based on the above work, this study presents YOLOv7-SlimPose, an adaptation of YOLOv7-Pose obtained through the improvement of the IoU loss function, the replacement of components in the neck layer, and structural pruning. Table 6 compares YOLOv7-Pose and YOLOv7-SlimPose. YOLOv7-Pose exhibits an mAP of 95.7% with 80 million parameters. In contrast, YOLOv7-SlimPose demonstrates an improved mAP (96.8%) and fewer parameters (67 million). YOLOv7-SlimPose excels in predicting bounding boxes and keypoints, offers enhanced speed and efficiency, and is suitable for deployment on embedded devices.
Figure 8 shows the comparison of loss during training between YOLOv7-SlimPose and YOLOv7-Pose.
Figure 9 shows the P-R curve of the best result of mAP90.
Figure 10 shows the processing stages and the extracted phenotypic parameters of a corn plant from an image using the SCPE algorithm: the stem and leaves are first detected in the image, then the relevant keypoints are detected, the 3D skeleton of the plant is constructed from the related points of the leaves and stem, and finally the phenotypic parameters of the plant are extracted. The phenotypic parameters of the samples in Figure 10 are summarized in Table 7.
Twenty images were selected from the test set as the experimental data, and the phenotypic parameters extracted using the SCPE algorithm were compared with the true values manually measured from the 3D point cloud.
Table 8 presents the results of the SCPE algorithm on the experimental data. The SCPE algorithm was used to detect 20 samples with 440 keypoints, of which 430 keypoints were correctly detected, a precision of 98%. The YOLOv7-SlimPose model took an average of 0.09 s to detect a corn plant in one left or right image, and the PSMNet processing module took an average of 0.2 s to obtain the 3D coordinates. The construction of the skeleton and the phenotypic parameter extraction required almost no time. In total, the SCPE algorithm used 0.38 s to extract the phenotypic parameters. The errors for the experimental data are listed in Table 9.
Table 10 presents a comparison between the SCPE algorithm and other algorithms. The table shows that SCPE uses a keypoint detection-based method to extract the phenotypic parameters. SCPE was much faster than the methods based on 3D reconstruction, and its accuracy was also better. Although SCPE is difficult to use in a multi-plant environment, it is currently the fastest phenotypic parameter extraction algorithm for specific plants with a high degree of precision.
4. Discussion
This work proposes the SCPE method, which utilizes stereo images for the phenotypic parameter extraction of corn plants through a keypoint detection model. The cost-effectiveness of binocular cameras allows SCPE to strike an optimal balance between precision and cost. The SCPE method comprises two modules: the YOLOv7-SlimPose model for keypoint detection and the phenotypic parameter extraction module for constructing the skeleton and obtaining phenotypic parameters. The SCPE method demonstrates notable advancements over the original YOLOv7-Pose. To cater to miniaturized devices, the original model was extensively optimized to use fewer parameters and run at a higher speed.
In the first module, YOLOv7-SlimPose, used for keypoint detection, seven kinds of keypoints were identified, encompassing the various nodes of the corn plant. Building on YOLOv7-Pose, this work replaced the original bounding box loss function, CIoU loss, with MPDIoU loss. The GSConv and GSIN modules were further utilized to reduce the size of the neck in the original model, and the trained model was pruned with the Slim pruning algorithm, achieving a bounding box mAP of 96.8% while reducing the model size to 81.4% of its original size. In comparison to YOLOv7-Pose, YOLOv7-SlimPose excels in speed, model size, and keypoint detection precision. Owing to its easy-to-train characteristics, keypoint definitions, and data structure, the YOLOv7-SlimPose method can be seamlessly adapted to other crops or plants. The left and right images are passed through the model to obtain bounding boxes with keypoints.
In the second module, the phenotypic parameter extraction module for constructing the skeleton and obtaining phenotypic parameters, the depth map is acquired through PSMNet, which uses the object's bounding boxes from both the left and right images. The depth map, combined with the keypoints, yields the 3D coordinates used for extracting phenotypic parameters. Leveraging these 3D coordinates for the various keypoints allows the construction of the corn plant's skeleton. Subsequently, the phenotypic parameters are extracted by calculating Euclidean distances according to the predefined computational method. Additionally, the constructed skeleton provides ways to monitor the growth of corn plants, such as detecting lodging, monitoring the number of leaves, and assessing the overall normalcy of growth. SCPE requires only binocular images for the phenotypic parameter extraction of corn plants, achieving faster running speeds than traditional methods that rely on point cloud data. It operates in real time on the equipment, effectively meeting the demand for swift phenotypic parameter extraction. Despite using a small evaluation set of 20 samples, the errors were approximately 10% compared with the manual measurements. The YOLOv7-SlimPose model was trained using data from a laboratory setting; it can also be utilized in real outdoor farmland, but accuracy may slightly decrease.
Compared with previous studies on phenotypic parameter extraction, SCPE is not the most accurate, but it is the fastest and most convenient while still guaranteeing high precision (about 90%). Compared with most phenotypic parameter extraction work using 3D reconstruction, SCPE is more concise and efficient. It is therefore better suited for the phenotypic parameter extraction of large numbers of plants of the same category during the breeding stage.
It must be pointed out that, according to the experiments, the error in the results is mainly caused by errors in the depth map. Therefore, using better equipment, better stereo-matching models, or manual review to improve the quality of the depth maps can improve the accuracy of SCPE.
To enhance model generality, future research will involve building corn plant datasets for different varieties in real outdoor farmland and implementing filters to reduce background interference. Additionally, future work will explore the possibility of directly outputting phenotypic parameters using an end-to-end model with binocular data, eliminating the need for depth data. We believe that, through this future work, the SCPE algorithm can be applied to miniaturized devices, such as mobile phones and vehicle robots, to automatically extract the phenotypic parameters of dense crops in farmland environments without manual intervention.
5. Conclusions
This work introduces a novel SCPE algorithm designed to extract the phenotypic parameters of corn plants using stereo images. The SCPE method comprises the YOLOv7-SlimPose model and a phenotypic parameter extraction module. Building upon YOLOv7-Pose, the YOLOv7-SlimPose model replaces the original bounding box loss function, CIoU loss, with MPDIoU loss. To reduce the model size, this work adjusted the neck and pruned the trained model while maintaining precision.
The YOLOv7-SlimPose model successfully achieves bounding box and 2D keypoint detection in left and right images. The keypoint mAP of the YOLOv7-SlimPose model is 96.8%, with 61.5 million parameters and a detection speed of 0.38 s per corn plant image. Compared with the original model, the mAP shows a 0.3% increase, and the parameter count is reduced by 15%. Experimental results showcase the YOLOv7-SlimPose model’s effectiveness in keypoint detection tasks, boasting higher precision and a more compact model size compared to the original YOLOv7-Pose model.
The phenotypic parameter extraction module involves stereo matching, skeleton construction, and phenotypic parameter calculation. The stereo matching procedure efficiently yields a depth map from the bounding boxes in the left and right images in 0.2 s and provides the 3D coordinates of the keypoints; the skeleton is constructed based on these 3D coordinates, and the phenotypic parameters are extracted from the skeleton and the 3D coordinates.
The SCPE algorithm achieved an accuracy of about 90% for the phenotypic parameter extraction of corn plants, with an extraction speed of 0.38 s per corn plant. This makes it the fastest method currently available for phenotypic parameter extraction. The SCPE algorithm serves as a technical foundation for phenotypic parameter extraction in plant phenotyping, can also be used to monitor the growth of corn plants, and can be applied to other crops or plants, such as sorghum, rice, and wheat.
Author Contributions
Conceptualization, Y.G. and Z.L.; methodology, Z.L.; software, Y.G.; validation, Y.G.; formal analysis, Y.G.; investigation, Y.G. and Z.L.; writing, original draft preparation, Y.G.; writing, review and editing, Y.G.; visualization, Y.G.; supervision, B.L. and L.Z.; funding acquisition, Y.G. All authors read and agreed to the published version of the manuscript.
Funding
This work was supported by JST, the Establishment of University Fellowships Towards the Creation of Science Technology Innovation (grant number JPMJFS2133).
Data Availability Statement
The datasets analyzed during the current study are available from the corresponding author upon reasonable request.
Conflicts of Interest
The authors declare no conflict of interest.
Abbreviations
CIoU: complete intersection over union; ELAN: efficient layer aggregation network; mAP: mean average precision; MPDIoU: minimum point distance-based IoU; OKS: object keypoint similarity; SCPE: stereo corn phenotype extraction algorithm.
References
- García-Lara, S.; Serna-Saldivar, S.O. Corn history and culture. In Corn; Elsevier: Amsterdam, The Netherlands, 2019; pp. 1–18.
- Raju, S.K.K.; Thompson, A.M.; Schnable, J.C. Advances in plant phenomics: From data and algorithms to biological insights. Appl. Plant Sci. 2020, 8, e11386.
- Liu, X.; Li, N.; Huang, Y.; Lin, X.; Ren, Z. A comprehensive review on acquisition of phenotypic information of Prunoideae fruits: Image technology. Front. Plant Sci. 2023, 13, 1084847.
- Shang, Y.; Hasan, M.K.; Ahammed, G.J.; Li, M.; Yin, H.; Zhou, J. Applications of nanotechnology in plant growth and crop protection: A review. Molecules 2019, 24, 2558.
- Ma, Z.; Liu, S. A review of 3D reconstruction techniques in civil engineering and their applications. Adv. Eng. Inform. 2018, 37, 163–174.
- Zermas, D.; Morellas, V.; Mulla, D.; Papanikolopoulos, N. 3D model processing for high throughput phenotype extraction–the case of corn. Comput. Electron. Agric. 2020, 172, 105047.
- Zhao, G.; Cai, W.; Wang, Z.; Wu, H.; Peng, Y.; Cheng, L. Phenotypic parameters estimation of plants using deep learning-based 3-D reconstruction from single RGB image. IEEE Geosci. Remote Sens. Lett. 2022, 19, 2506705.
- Zhu, T.; Ma, X.; Guan, H.; Wu, X.; Wang, F.; Yang, C.; Jiang, Q. A method for detecting tomato canopies’ phenotypic traits based on improved skeleton extraction algorithm. Comput. Electron. Agric. 2023, 214, 108285.
- Li, Y.; Wen, W.; Fan, J.; Gou, W.; Gu, S.; Lu, X.; Yu, Z.; Wang, X.; Guo, X. Multi-source data fusion improves time-series phenotype accuracy in maize under a field high-throughput phenotyping platform. Plant Phenomics 2023, 5, 0043.
- Qian, R.; Lai, X.; Li, X. 3D object detection for autonomous driving: A survey. Pattern Recognit. 2022, 130, 108796.
- Shi, Y.; Guo, Y.; Mi, Z.; Li, X. Stereo CenterNet-based 3D object detection for autonomous driving. Neurocomputing 2022, 471, 219–229.
- Li, Z.; Gao, Y.; Hong, Q.; Du, Y.; Serikawa, S.; Zhang, L. Keypoint3D: Keypoint-based and anchor-free 3D object detection for autonomous driving with monocular vision. Remote Sens. 2023, 15, 1210.
- Nguyen, H.X.; Hoang, D.N.; Bui, H.V.; Dang, T.M. Development of a human daily action recognition system for smart-building applications. In Proceedings of the International Conference on Intelligent Systems & Networks, Hanoi, Vietnam, 18–19 March 2023; Springer Nature: Singapore, 2023; pp. 366–373.
- Fu, H.; Gao, J.; Liu, H. Human pose estimation and action recognition for fitness movements. Comput. Graph. 2023, 116, 418–426.
- Du, X.; Meng, Z.; Ma, Z.; Lu, W.; Cheng, H. Tomato 3D pose detection algorithm based on keypoint detection and point cloud processing. Comput. Electron. Agric. 2023, 212, 108056.
- Zheng, B.; Sun, G.; Meng, Z.; Nan, R. Vegetable size measurement based on stereo camera and keypoints detection. Sensors 2022, 22, 1617.
- Xiao, J.; Suab, S.A.; Chen, X.; Singh, C.K.; Singh, D.; Aggarwal, A.K.; Korom, A.; Widyatmanti, W.; Mollah, T.H.; Minh, H.V.T.; et al. Enhancing assessment of corn growth performance using unmanned aerial vehicles (UAVs) and deep learning. Measurement 2023, 214, 112764.
- Dulal, R.; Zheng, L.; Kabir, M.A.; McGrath, S.; Medway, J.; Swain, D.; Swain, W. Automatic cattle identification using YOLOv5 and mosaic augmentation: A comparative analysis. In Proceedings of the 2022 International Conference on Digital Image Computing: Techniques and Applications (DICTA), Sydney, Australia, 30 November–2 December 2022; pp. 1–8.
- Li, P.; Chen, X.; Shen, S. Stereo R-CNN based 3D object detection for autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 7644–7652.
- Atefi, A.; Ge, Y.; Pitla, S.; Schnable, J. Robotic detection and grasp of maize and sorghum: Stem measurement with contact. Robotics 2020, 9, 58.
- Ortez, O.A.; McMechan, A.J.; Hoegemeyer, T.; Rees, J.; Jackson-Ziems, T.; Elmore, R.W. Abnormal ear development in corn: A field survey. Agrosyst. Geosci. Environ. 2022, 5, e20242.
- Jiang, P.; Ergu, D.; Liu, F.; Cai, Y.; Ma, B. A review of YOLO algorithm developments. Procedia Comput. Sci. 2022, 199, 1066–1073.
- Duan, K.; Bai, S.; Xie, L.; Qi, H.; Huang, Q.; Tian, Q. CenterNet: Keypoint triplets for object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6569–6578.
- Siliang, M.; Yong, X. MPDIoU: A loss for efficient and accurate bounding box regression. arXiv 2023, arXiv:2307.07662.
- Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU loss: Faster and better learning for bounding box regression. Proc. AAAI Conf. Artif. Intell. 2020, 34, 12993–13000.
- Zhao, X.; Song, Y. Improved ship detection with YOLOv8 enhanced with MobileViT and GSConv. Electronics 2023, 12, 4666.
- Guo, Y.; Li, Y.; Wang, L.; Rosing, T. Depthwise convolution is all you need for learning multiple visual domains. Proc. AAAI Conf. Artif. Intell. 2019, 33, 8368–8375.
- Shah, S.R.; Qadri, S.; Bibi, H.; Shah, S.M.W.; Sharif, M.I.; Marinello, F. Comparing Inception V3, VGG 16, VGG 19, CNN, and ResNet 50: A case study on early detection of a rice disease. Agronomy 2023, 13, 1633.
- Vadera, S.; Ameen, S. Methods for pruning deep neural networks. IEEE Access 2022, 10, 63280–63300.
- Gevorgyan, Z. SIoU loss: More powerful learning for bounding box regression. arXiv 2022, arXiv:2205.12740.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).