1. Introduction
Mushrooms are a common type of fungus and are globally recognized for their culinary and medicinal applications. They are valued for their distinctive taste, rich nutritional content, anti-tumor properties, cholesterol-lowering effects, and significant commercial importance [1]. According to Expert Market Research, the global mushroom market was valued at approximately USD 68.03 billion in 2023, with a projected compound annual growth rate of 7.18% from 2024 to 2032 [2]. Shiitake mushrooms (Lentinula edodes), native to China, are a dual-purpose crop used in both food and medicine and have become one of China's most competitive agricultural exports [3]. In 2022, China produced 12.95 million tons of shiitake mushrooms, accounting for 98.3% of global production (Ministry of Agriculture and Rural Affairs of China). Phenotypic traits of shiitake mushrooms are vital for selecting high-quality strains and grading products. However, current phenotypic extraction technologies have limited research on the phenotype–genotype relationship, which is critical for variety improvement and product grading. Therefore, developing a high-throughput, precise, and non-destructive method for extracting phenotypic parameters of edible fungi is essential for advancing mycological research and industrial applications.
Current methods for obtaining phenotypic parameters include manual measurements, 2D imaging, and 3D point cloud extraction. Manual measurements are slow, repetitive, subjective, and prone to damaging the specimen during the process [4]. Two-dimensional imaging methods are limited to measuring phenotypic parameters, such as cap diameter, in two dimensions and cannot capture spatial information such as thickness or orientation [5,6]. The 3D point cloud extraction method employs techniques such as laser scanning, depth cameras, and multi-view image-based 3D reconstruction [7]. Laser tools, including laser scanners [8,9] and LiDAR systems [9,10,11], provide high-resolution, accurate point clouds with real-world dimensions but are expensive, offer low point cloud density, and have slow data collection speeds. Depth cameras, such as Kinect and Intel RealSense [12,13,14,15], facilitate rapid 3D data reconstruction but produce lower-quality point clouds because of their limited resolution and high sensitivity to environmental conditions.
The 3D reconstruction method based on multi-view images offers low cost, high point cloud density and accuracy, and is not easily affected by the environment. Structure from Motion (SfM) and Multi-View Stereo (MVS) are usually used together to obtain more comprehensive and accurate 3D reconstruction results and are widely used in the field of computer vision. For instance, He et al. [16] utilized a low-cost SfM-MVS system to obtain seven phenotypic parameters for strawberries. Hao et al. [17] developed MVS-Pheno V2 to collect point cloud data for cotton crops, employing the PointSegAt deep learning network to segment overlapping cotton leaves and extract phenotypic parameters. Xiao et al. [18] applied multi-view image reconstruction to extract ten phenotypic traits from 3D point clouds of sugar beet roots, while Xiao et al. [19,20] extended SfM-MVS technology to drones, designing a cross-surround photography method for the high-throughput phenotypic extraction of maize, sugar beet, cotton, and cotton bolls. Additionally, Sun et al. [21] created an ensemble learning framework that leveraged this technology to predict soybean yield, demonstrating that incorporating 3D structural features enhanced prediction accuracy. In summary, 3D reconstruction of point clouds from multi-view images is a cost-effective, precise, and robust approach that provides a valuable tool for crop breeding and agricultural production.
During the collection of multi-view images, significant irrelevant background content is often captured, reducing the efficiency of 3D reconstruction. He et al. [22] mitigated this issue by cropping multi-view images to reduce their size and enhance processing efficiency, aiding subsequent point cloud preprocessing. However, this method does not fully eliminate background information. Segmenting multi-view images before 3D reconstruction has been proven to completely remove background content [18]. Conversely, traditional segmentation algorithms face challenges in accurately preserving complex crop edges and handling background shadows. Threshold-based and watershed algorithms are sensitive to noise and rely on brightness variations, limiting their effectiveness in processing complex images [23]. Region-growing algorithms are capable of identifying continuous areas, but they are computationally intensive and highly sensitive to seed point selection. These methods generally lack robustness and fail to capture high-level semantic information in complex scenarios [24]. In contrast, deep learning has demonstrated exceptional performance in image segmentation, accurately identifying targets, preserving boundaries, and extracting regions of interest holistically. For example, Yang et al. [25] applied the U2-Net model to remove irrelevant backgrounds from multi-view images of leafy vegetable plants, effectively eliminating large-scale background noise in reconstructed point clouds. This approach enhanced the accuracy of 3D reconstruction and improved the efficiency of subsequent point cloud preprocessing.
However, research on multi-view 3D reconstruction algorithms for edible fungi (including shiitake mushrooms) is still limited [26,27]. Existing methods suffer from problems in point cloud segmentation, 3D reconstruction, and phenotypic analysis, such as incomplete background elimination, expensive equipment, heavy computational loads, and limited accuracy. In view of these limitations, this study used the YOLOv8 algorithm to segment multi-view images of shiitake mushrooms and the SfM-MVS algorithm for 3D reconstruction. Combined with the improved CP-PointNet++ model and clustering algorithms, an automatic point cloud segmentation pipeline was developed to achieve high-throughput, non-destructive, and fast extraction of shiitake mushroom phenotypic parameters. Finally, the extracted parameters were input into a generalized regression neural network (GRNN) for yield estimation, providing a robust tool for fungal breeding applications.
2. Materials and Methods
This study introduced an automated pipeline for 3D reconstruction, point cloud segmentation, phenotypic parameter extraction, and yield prediction of shiitake mushrooms using the SfM 3D reconstruction method. The pipeline consisted of three main stages.
First, the YOLOv8 model segmented regions of interest (ROI) from RGB images, and the segmented images were utilized for subsequent 3D point cloud reconstruction.
Subsequently, the obtained point cloud underwent a series of preprocessing steps, and an improved PointNet++ model, incorporating the CBAM module and Partial Convolution (PConv), was utilized for segmentation, generating point clouds for the pileus, stipe, and shiitake mushroom sawdust substrate. The region-growing algorithm and fast Euclidean clustering algorithm were then applied to segment the individual mushroom point clouds, enabling the calculation of phenotypic parameters.
Finally, the calculated phenotypic parameters were input into machine learning algorithms for yield estimation, as outlined in the detailed workflow shown in Figure 1.
Figure 1. Overall workflow for 3D reconstruction, point cloud segmentation, phenotypic calculation, and yield prediction of shiitake mushrooms. A novel CP-PointNet++ point cloud segmentation network was proposed based on CBAM and PConv.
2.1. Shiitake Mushroom Samples and Data Collection
Shiitake mushroom mycelia of the Shanghai Academy of Agricultural Sciences 509 strain were inoculated onto a shiitake mushroom sawdust substrate measuring 10 cm in diameter and 40 cm in length. The substrate was incubated in a growth chamber maintained at 20–25 °C and 85% humidity. After the appearance of brownish button primordia, thinning was performed, and the substrate was left to develop until most mushrooms reached maturity.
After the mushrooms matured, the substrates were removed for multi-view image capture. A smartphone (iPhone 13 Pro Max) with an aperture of f/1.5, an exposure time of 10 ms, and a resolution of 3024 × 4032 was fixed on a tripod positioned 30–50 cm from the culture substrate and markers. The markers were 30, 10, and 5 cm in length, width, and height, respectively. Horizontal images were captured at the same height as the substrate, along with two sets of downward images. For downward shots, the camera was placed 25 and 50 cm above the substrate at viewing angles of 30° and 60°, respectively.
The image capture process is illustrated in Figure 2a, with approximately 150 images from multiple angles (Figure 2b). After image collection, 163 mushrooms were harvested from the substrate. Phenotypic parameters, including pileus transverse and longitudinal diameters, pileus thickness, stipe diameter, stipe height, and mushroom mass, were manually measured using a soft ruler, Vernier caliper, and balance.
2.2. Semantic Segmentation and 3D Reconstruction of Multi-View Images
The collected multi-view RGB images contained substantial background information, which significantly affected the accuracy and efficiency of the subsequent 3D reconstruction; removing this background information was therefore essential. As an end-to-end network architecture, YOLOv8 demonstrates excellent performance in semantic segmentation and is well suited for identifying subtle semantic differences in complex environments. In this study, YOLOv8 was applied to segment the digital images and extract the ROI for the mushrooms. Image annotation was performed using LabelMe to classify the images into ROI and irrelevant areas. The mushroom pixels were assigned a value of 255 and labeled white, whereas the irrelevant areas were assigned a value of 0 and labeled black. An example of the annotation results is shown in Figure 2c. The dataset comprised 1482 images divided into training, validation, and testing sets in an 8:1:1 ratio.
- (1)
Image Segmentation Based on YOLOv8
The YOLOv8 [28] network architecture consists of three main parts.
Backbone: this component is responsible for feature extraction, using convolutional and deconvolutional layers combined with residual connections and bottleneck structures to reduce network size and enhance performance.
Neck: this component performs multi-scale feature fusion by integrating feature maps from various stages of the Backbone, thereby enhancing feature representation capabilities.
Head: this component executes the final segmentation task by iteratively training on the samples, minimizing the loss function, and enhancing the segmentation performance on the mushroom images.
During training, the image resolution was adjusted to 640 × 640, and YOLOv8x-Seg applied online image augmentation to introduce slight variations in each epoch. As a key data augmentation technique, several images were selected and combined into a mosaic, followed by basic augmentations including flipping, scaling, and color gamut adjustments, enabling the model to detect partially occluded objects and objects at varying positions. The AdamW optimizer was employed with a learning rate of 0.002, momentum of 0.9, a batch size of 32, and 300 epochs to ensure convergence of the loss function. During testing, the network generated a 640 × 480 mask image that was resized to the original input image size using bilinear interpolation. The ROI regions were then extracted based on the mask, with irrelevant areas filled in white.
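For orientation, the training configuration above can be expressed with the Ultralytics API. This is a minimal sketch under stated assumptions, not the authors' original script; the dataset YAML name, weight file, and image path are placeholders.

```python
from ultralytics import YOLO

# Start from pretrained YOLOv8x-Seg weights (hypothetical local path).
model = YOLO("yolov8x-seg.pt")

# Hyperparameters as reported: AdamW, lr 0.002, momentum 0.9, batch 32,
# 300 epochs, 640x640 input; Ultralytics applies online augmentation
# (mosaic, flips, scaling, HSV shifts) by default during training.
model.train(
    data="shiitake_seg.yaml",  # assumed dataset config
    imgsz=640,
    epochs=300,
    batch=32,
    optimizer="AdamW",
    lr0=0.002,
    momentum=0.9,
)

# Inference: the predicted masks are used to white out the background.
results = model.predict("mushroom_view_001.jpg")
```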
- (2)
Shiitake Mushroom Point Cloud 3D Reconstruction
The 3D reconstruction of shiitake mushrooms was performed using COLMAP (Ver. 3.9.1, https://colmap.github.io/, accessed on 18 August 2024). Initially, the Scale-Invariant Feature Transform (SIFT) algorithm extracted key points from the multi-view RGB images and generated feature descriptors for them. The key points from different viewpoints were then matched to establish correspondences between the images. The SfM algorithm estimated the camera position and orientation for each image based on the matched feature points, producing a sparse 3D point cloud that represented the preliminary geometric structure of the scene. Finally, the MVS algorithm refined the reconstruction by generating a dense 3D point cloud on the sparse point cloud foundation.
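The steps described above map onto COLMAP's command-line interface roughly as follows. This is a sketch with placeholder paths, not the exact commands used in the study.

```python
import os
import subprocess

def run(cmd):
    """Run one COLMAP CLI step and fail loudly on error."""
    subprocess.run(cmd, check=True)

db, imgs, out = "colmap.db", "images_segmented/", "output/"
os.makedirs(out + "sparse", exist_ok=True)

# 1. SIFT keypoint detection and descriptor extraction.
run(["colmap", "feature_extractor", "--database_path", db, "--image_path", imgs])
# 2. Pairwise feature matching across viewpoints.
run(["colmap", "exhaustive_matcher", "--database_path", db])
# 3. SfM: camera pose estimation and sparse reconstruction.
run(["colmap", "mapper", "--database_path", db, "--image_path", imgs,
     "--output_path", out + "sparse"])
# 4. MVS: undistort images, compute depth maps, fuse a dense point cloud.
run(["colmap", "image_undistorter", "--image_path", imgs,
     "--input_path", out + "sparse/0", "--output_path", out + "dense"])
run(["colmap", "patch_match_stereo", "--workspace_path", out + "dense"])
run(["colmap", "stereo_fusion", "--workspace_path", out + "dense",
     "--output_path", out + "dense/fused.ply"])
```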
2.3. Point Cloud Data Preprocessing
2.3.1. Point Cloud Downsampling and Scale Restoration
The dense point cloud reconstructed using the SfM-MVS approach had a large data volume, requiring a downsampling method to reduce the processing time of subsequent algorithms. A 3D voxel grid method was employed to generate a voxel grid from the input point cloud data. Within each voxel, the centroid of all points was used to approximate and replace them, reducing the data volume while preserving the structure. This process minimized computational time and enhanced the efficiency of subsequent algorithms. The voxel size was set to 0.008, removing an average of 82% of the points without altering the outer contours of the point cloud, ensuring accurate phenotypic parameter calculations.
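Voxel-grid downsampling of this kind is available in Open3D; a minimal sketch follows (the 0.008 voxel size matches the value reported above, in point-cloud units; file names are placeholders).

```python
import open3d as o3d

pcd = o3d.io.read_point_cloud("dense/fused.ply")

# Replace all points inside each 0.008-unit voxel with their centroid,
# thinning the cloud (~82% of points removed in this study) while
# preserving the outer contour.
down = pcd.voxel_down_sample(voxel_size=0.008)
o3d.io.write_point_cloud("fused_down.ply", down)
```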
To determine the true dimensions of the shiitake mushroom spawn, a coordinate scaling correction was applied to the reconstructed 3D point cloud. Using a reference marker, a scaling factor was calculated to adjust the point cloud coordinates to accurately represent the real-world dimensions. The calculation formula is as follows:

$$s = \frac{1}{3}\left(\frac{L_a}{L_r} + \frac{W_a}{W_r} + \frac{H_a}{H_r}\right)$$

where $L_a$, $W_a$, and $H_a$ represent the actual length, width, and height of the marker, respectively; $L_r$, $W_r$, and $H_r$ denote the corresponding length, width, and height in the reconstructed 3D point cloud, respectively. The original coordinates $(x, y, z)$ were transformed into new coordinates $(x', y', z') = (sx, sy, sz)$ based on the calculated scaling factor $s$.
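Under this reading of the formula (averaging the three per-axis ratios into a single isotropic factor), scale restoration amounts to the following sketch; the marker dimensions match those reported in Section 2.1.

```python
import numpy as np

def restore_scale(points, marker_actual, marker_cloud):
    """Scale point-cloud coordinates to real-world units.

    points:        (N, 3) array of reconstructed coordinates
    marker_actual: (L, W, H) of the physical marker, e.g. (30, 10, 5) cm
    marker_cloud:  (L, W, H) of the marker measured in the point cloud
    """
    ratios = np.asarray(marker_actual, float) / np.asarray(marker_cloud, float)
    s = ratios.mean()  # single isotropic scaling factor
    return points * s

# Hypothetical usage with a 30 x 10 x 5 cm marker:
# scaled = restore_scale(points, (30, 10, 5), measured_marker_dims)
```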
2.3.2. Point Cloud Filtering and Coordinate Correction
In the SfM algorithm, the camera position in the first image is often selected as the coordinate origin for the point cloud. However, with multiple datasets, this approach can produce disorganized coordinate systems, complicating subsequent point cloud processing. To address this, a centroid-shifting operation was performed. The centroid of the point cloud was calculated, and all points were translated to align the centroid with the origin (0,0,0).
The 3D point cloud data reconstructed from multi-view images segmented by YOLOv8 may contain outlier noise points owing to hardware limitations and human operations. To address this, a Statistical Outlier Removal (SOR) filter was applied, which is a widely used method in point cloud processing to remove outliers based on the local neighborhood statistics for each point. For this filtering, the parameters were set to K = 15 and n = 0.75, effectively reducing noise while preserving the main structure of the point cloud.
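Both preprocessing steps are standard in Open3D; in the sketch below, the reported K = 15 and 0.75 map onto `nb_neighbors` and `std_ratio`, and file names are placeholders.

```python
import numpy as np
import open3d as o3d

pcd = o3d.io.read_point_cloud("fused_down.ply")

# Centroid shift: translate the cloud so its centroid sits at (0, 0, 0).
pts = np.asarray(pcd.points)
pcd.translate(-pts.mean(axis=0))

# Statistical Outlier Removal: drop points whose mean distance to their
# K = 15 nearest neighbors exceeds 0.75 standard deviations.
filtered, kept_idx = pcd.remove_statistical_outlier(nb_neighbors=15,
                                                    std_ratio=0.75)
```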
The Random Sample Consensus (RANSAC) algorithm was employed to compute the normal vector $\mathbf{m} = (m_x, m_y, m_z)$ of the whiteboard, where $m_x$, $m_y$, and $m_z$ are the components of the plane's normal vector along the $x$, $y$, and $z$ axes, respectively. Using $\mathbf{m}$ and the z-axis normal vector $\mathbf{n} = (0, 0, 1)$, the rotation axis $\mathbf{u}$ and rotation angle $\theta$ were determined:

$$\mathbf{u} = \frac{\mathbf{m} \times \mathbf{n}}{\lVert \mathbf{m} \times \mathbf{n} \rVert}, \qquad \theta = \arccos\left(\frac{\mathbf{m} \cdot \mathbf{n}}{\lVert \mathbf{m} \rVert \, \lVert \mathbf{n} \rVert}\right)$$

The rotation matrix $R$ was then generated using Rodrigues' rotation formula and applied to each original point $\mathbf{p}$ to obtain the corrected point $\mathbf{p}'$, as represented by the following equations:

$$R = I + \sin\theta \, [\mathbf{u}]_\times + (1 - \cos\theta) \, [\mathbf{u}]_\times^2, \qquad \mathbf{p}' = R\,\mathbf{p}$$

where $I$ is the 3 × 3 identity matrix, and $[\mathbf{u}]_\times$ is the skew-symmetric matrix of the normalized rotation axis $\mathbf{u}$:

$$[\mathbf{u}]_\times = \begin{pmatrix} 0 & -u_z & u_y \\ u_z & 0 & -u_x \\ -u_y & u_x & 0 \end{pmatrix}$$
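A numpy sketch of this alignment, assuming the plane normal m has already been estimated (e.g., with Open3D's `segment_plane` RANSAC):

```python
import numpy as np

def rotation_to_z(m):
    """Rodrigues rotation matrix aligning plane normal m with the z-axis."""
    n = np.array([0.0, 0.0, 1.0])
    m = m / np.linalg.norm(m)
    axis = np.cross(m, n)
    axis_norm = np.linalg.norm(axis)
    if axis_norm < 1e-9:                  # already aligned with z
        return np.eye(3)
    u = axis / axis_norm                  # normalized rotation axis
    theta = np.arccos(np.clip(np.dot(m, n), -1.0, 1.0))
    K = np.array([[0.0, -u[2], u[1]],     # skew-symmetric [u]x
                  [u[2], 0.0, -u[0]],
                  [-u[1], u[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

# corrected_points = points @ rotation_to_z(m).T
```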
Pass-through filtering was applied to remove the marker and whiteboard beneath the aligned point cloud, yielding a clean point cloud of the shiitake mushroom spawn, as illustrated in Figure 3.
2.4. Shiitake Mushroom Spawn Point Cloud Segmentation Model
2.4.1. PointNet++ and Its Improvements
- (1)
CP-PointNet++
This study employed the CP-PointNet++ model as an enhanced version of PointNet++ [29] for point cloud segmentation. PointNet++ utilized a hierarchical feature extraction method to improve the model's understanding of local structures by sampling and grouping point cloud data at various scales. Through layer-wise feature transformation and pooling, rich local feature representations were extracted while preserving global structural information. The segmentation network of PointNet++ featured a multi-level architecture with a core structure based on Set Abstraction (SA) modules, each comprising three layers: Sampling Layer, Grouping Layer, and MLP Layer.
The PointNet++ segmentation network employed an Encoder–Decoder structure, with the features passed to subsequent modules via skip link concatenation. Feature propagation was facilitated by the interpolation function and unit PointNet. The interpolation function restored the features omitted during the down sampling process in the SA module, whereas the unit PointNet consisting of MLP and ReLU continued to extract the features from the data.
To overcome the limitations of the original PointNet++ model, such as insufficient feature extraction, low segmentation accuracy, and high memory usage, this study proposed the CP-PointNet++ model. CP-PointNet++ enhanced feature extraction by integrating the Convolutional Block Attention Module (CBAM) into the MLP layers of the original PointNet++ network. Additionally, Partial Convolution (PConv) replaced the standard convolutions, effectively reducing memory usage during training. The model structure is shown in Figure 4.
- (2)
Introduction of CBAM Module
In the shiitake mushroom point cloud, similar features across different categories can reduce segmentation accuracy. To address this, the CBAM [30] was incorporated into the MLP layers to enhance feature extraction capabilities. CBAM comprised two modules: the Channel Attention Module (CAM) and Spatial Attention Module (SAM). CAM processed the information of each channel using global average pooling and global max pooling. The outputs were combined through a shared fully connected layer and passed through a sigmoid function to generate the channel-level attention weights. These weights adjusted the feature responses, emphasizing the most relevant channels. The formula is as follows:

$$M_c(F) = \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big)$$
Unlike CAM, SAM focused on the significance of spatial locations. It extracted the spatial dimension information through global pooling and generated a spatial attention map using convolutional operations to adjust the importance of each spatial location. The formula is as follows:

$$M_s(F) = \sigma\big(f^{7 \times 7}([\mathrm{AvgPool}(F); \mathrm{MaxPool}(F)])\big)$$
By integrating the two attention mechanisms, CBAM inferred the attention maps and generated enhanced features. The combined formula for the two modules is as follows:

$$F' = M_c(F) \otimes F, \qquad F'' = M_s(F') \otimes F'$$

where $F$ represents the intermediate features in the network, $\otimes$ denotes element-wise multiplication, and $F''$ is the refined feature map. The module structure is shown in Figure 5.
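A compact PyTorch rendering of CBAM as described, laid out for (batch, channel, point) features to match per-point MLP layers; the channel reduction ratio of 16 is an assumption, not a value from the paper.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Channel + spatial attention over (B, C, N) point features."""
    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super().__init__()
        # CAM: shared MLP applied to global average- and max-pooled vectors.
        self.mlp = nn.Sequential(
            nn.Conv1d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv1d(channels // reduction, channels, 1, bias=False),
        )
        # SAM: convolution over concatenated channel-wise avg/max maps.
        self.spatial = nn.Conv1d(2, 1, spatial_kernel,
                                 padding=spatial_kernel // 2)

    def forward(self, x):                      # x: (B, C, N)
        avg = x.mean(dim=2, keepdim=True)      # global average pooling
        mx, _ = x.max(dim=2, keepdim=True)     # global max pooling
        ca = torch.sigmoid(self.mlp(avg) + self.mlp(mx))
        x = x * ca                             # F' = Mc(F) (*) F
        avg_s = x.mean(dim=1, keepdim=True)
        max_s, _ = x.max(dim=1, keepdim=True)
        sa = torch.sigmoid(self.spatial(torch.cat([avg_s, max_s], dim=1)))
        return x * sa                          # F'' = Ms(F') (*) F'
```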
- (3)
Replacement of Conv with PConv
PConv [31] replaced standard convolutions with lightweight convolutions to reduce the excessive similarity between channels in standard convolution layers. This was achieved by selectively utilizing a subset of channels for feature extraction, which was then concatenated with the remaining channels. Finally, point-wise convolutions were applied to strengthen the inter-channel correlations. The module structure is shown in Figure 6.
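A sketch of Partial Convolution in the FasterNet sense [31]: convolve only a fraction of the channels and pass the rest through untouched. The 1/4 split ratio is an assumption.

```python
import torch
import torch.nn as nn

class PConv1d(nn.Module):
    """Partial convolution: convolve the first 1/n_div of the channels
    and concatenate the untouched remainder (FasterNet-style)."""
    def __init__(self, channels, n_div=4, kernel_size=3):
        super().__init__()
        self.dim_conv = channels // n_div
        self.conv = nn.Conv1d(self.dim_conv, self.dim_conv, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x):                      # x: (B, C, N)
        x1, x2 = torch.split(
            x, [self.dim_conv, x.size(1) - self.dim_conv], dim=1)
        return torch.cat([self.conv(x1), x2], dim=1)

# A following 1x1 point-wise convolution restores inter-channel mixing,
# as described above.
```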
2.4.2. Pileus and Stipe Segmentation and Phenotypic Parameter Calculation
The CP-PointNet++ algorithm segmented the point cloud into three categories: pileus, stipe, and spawn. Further segmentation of the pileus and stipe collections was necessary. The pileus typically has a curved or circular shape, with a smooth surface and distinct curvature characteristics. Additionally, the normal directions of adjacent points exhibited a relatively consistent variation trend. A region-growing algorithm was therefore adopted to segment individual pilei.
The stipe exhibited spatial separability, but occlusion during data acquisition resulted in varying point cloud densities in the 3D reconstruction. Fast Euclidean clustering is a distance-based algorithm that assumes points within the same class are spatially proximate. This efficient and computationally simple algorithm was applied to segment the stipe collection based on the spatial characteristics of the stipe point cloud.
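Open3D does not ship the fast Euclidean clustering variant used here, but DBSCAN with a small `min_points` behaves like plain Euclidean distance clustering and illustrates the idea; `eps` is a placeholder that would need tuning to the stipe point spacing.

```python
import numpy as np
import open3d as o3d

stipe_pcd = o3d.io.read_point_cloud("stipes.ply")  # hypothetical file

# Distance-based clustering: points closer than eps join one cluster,
# separating spatially disjoint stipes. Label -1 marks noise.
labels = np.array(stipe_pcd.cluster_dbscan(eps=0.01, min_points=10))

for k in range(labels.max() + 1):
    single = stipe_pcd.select_by_index(np.where(labels == k)[0])
    # ... compute per-stipe phenotypic parameters on `single`
```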
The phenotypic parameters of shiitake mushrooms to be calculated included pileus transverse diameter, pileus longitudinal diameter, pileus height, stipe diameter, stipe height, and minimum bounding box (OBB) volume, which was the sum of the pileus and stipe bounding box volumes. The calculation methods are illustrated in Figure 7.
The PCA algorithm was applied to the pileus to determine the primary orientation of the point cloud and perform rotation. The transformed pileus was then projected onto the XOY plane, where the Euclidean distance between the two farthest points defined the transverse diameter. The longest diameter perpendicular to the transverse diameter was identified as the longitudinal diameter. The pileus thickness was calculated as the absolute difference between the maximum and minimum values along the z-axis.
After applying the PCA algorithm to rotate the stipe, slices were extracted at 25%, 50%, and 75% positions along the principal direction. The slice thickness parallel to the YOZ plane was set at 10% of the stipe height. For each slice, the least-squares method was used to fit a circle and calculate the diameter. The average of these three diameters was considered as the stipe diameter. Stipe height was calculated as the absolute difference between the maximum and minimum values along the x-axis.
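The slice-and-fit procedure can be sketched in numpy; the Kasa least-squares circle fit below is one standard choice for the unspecified fitting method, and the principal-axis convention follows the description above (PCA via SVD, with the stipe height along the first principal axis).

```python
import numpy as np

def fit_circle_kasa(xy):
    """Least-squares (Kasa) circle fit; returns (cx, cy, r)."""
    A = np.column_stack([2 * xy[:, 0], 2 * xy[:, 1], np.ones(len(xy))])
    b = (xy ** 2).sum(axis=1)
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    return cx, cy, np.sqrt(c + cx ** 2 + cy ** 2)

def stipe_diameter(stipe):
    """Average diameter of circles fitted to slices at 25/50/75% height."""
    pts = stipe - stipe.mean(axis=0)
    _, _, vt = np.linalg.svd(pts, full_matrices=False)
    aligned = pts @ vt.T              # axis 0 = principal (height) axis
    h = aligned[:, 0]
    span = h.max() - h.min()          # stipe height
    diameters = []
    for frac in (0.25, 0.50, 0.75):
        center = h.min() + frac * span
        mask = np.abs(h - center) < 0.05 * span   # 10%-thick slice
        _, _, r = fit_circle_kasa(aligned[mask, 1:])
        diameters.append(2 * r)
    return np.mean(diameters)
```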
2.5. Yield Estimation
The task of yield estimation is to build a yield estimation model for shiitake mushrooms. The model takes as input the six extracted phenotypic parameters (pileus transverse diameter, pileus longitudinal diameter, pileus thickness, stipe height, stipe diameter, and bounding box volume) and outputs the yield value, framing the task as a non-linear regression problem. Although this task is not complex for deep learning models, agricultural data are difficult to obtain, sample sizes are small, and models are prone to overfitting, so selecting an appropriate algorithm is particularly important for shiitake mushroom yield estimation. A literature review indicated that machine learning algorithms such as Partial Least Squares Regression (PLSR), Support Vector Regression (SVR), Random Forest (RF), and Generalized Regression Neural Network (GRNN) are well suited to such non-linear regression problems.
PLSR combined principal component analysis with multivariate regression to effectively address multicollinearity among input variables. It identified latent variables (principal components) to explain the input data variance and maximized the correlation between these variables and the output variable. SVR constructed an optimal hyperplane to minimize regression errors, excelled in handling high-dimensional and non-linear problems, and mapped data to higher-dimensional spaces using kernel functions for regression analysis. As an ensemble learning algorithm based on decision trees, RF made predictions by training multiple decision trees; its feature selection and random sampling make it robust against overfitting and effective at handling non-linear features. GRNN is a neural network model based on radial basis functions that performs weighted averaging based on the local similarity of input data. It fitted the data quickly, delivered predictions efficiently, handled noisy data effectively, and converged rapidly.
In this study, the four models were used to estimate the yield of shiitake mushrooms. The dataset was randomly divided, with 80% used for training and 20% for testing.
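PLSR, SVR, and RF are available directly in scikit-learn; GRNN is not, but it reduces to Nadaraya-Watson kernel regression with a Gaussian kernel, which can be sketched directly. The smoothing factor sigma is an assumed hyperparameter to be tuned on validation data.

```python
import numpy as np

class GRNN:
    """Generalized Regression Neural Network (Nadaraya-Watson form)."""
    def __init__(self, sigma=0.5):
        self.sigma = sigma

    def fit(self, X, y):
        self.X = np.asarray(X, float)
        self.y = np.asarray(y, float)
        return self

    def predict(self, Xq):
        Xq = np.asarray(Xq, float)
        # Squared distances between each query and every training sample.
        d2 = ((Xq[:, None, :] - self.X[None, :, :]) ** 2).sum(axis=-1)
        w = np.exp(-d2 / (2 * self.sigma ** 2))  # Gaussian pattern weights
        return (w @ self.y) / w.sum(axis=1)      # weighted-average output

# Hypothetical usage: X = six phenotypic parameters, y = mass in grams.
# yield_pred = GRNN(sigma=0.5).fit(X_train, y_train).predict(X_test)
```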
2.6. Model Training and Performance Evaluation
- (1)
Hardware and software environment for model training
The hardware environment for 3D reconstruction and model training included an Intel(R) Xeon(R) Gold 6246R CPU, an NVIDIA Quadro RTX 8000 GPU with 48 GB of memory, and 128 GB of RAM. The software environment ran on the Windows 10 operating system, with the deep learning models developed using PyTorch 1.13 and CUDA 11.7. The point cloud data were divided into training, validation, and test sets at an 8:1:1 ratio. The hyperparameters were configured as follows: a batch size of 16, 300 epochs, a learning rate of 0.001, the Adam optimizer, and a weight decay coefficient of 0.07.
- (2)
Semantic segmentation evaluation
The model was evaluated using Precision ($P$), Recall ($R$), $F1$ Score, and Average Precision ($AP$). The definitions of these metrics are as follows:

$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}$$

$$F1 = \frac{2 \times P \times R}{P + R}, \qquad AP = \int_0^1 P(R)\,\mathrm{d}R$$
In this context, True Positives (TPs) were the samples correctly predicted as positive by the model, whereas True Negatives (TNs) were the samples correctly predicted as negative. False Positives (FPs) were samples incorrectly predicted as positive, and False Negatives (FNs) were samples incorrectly predicted as negative.
- (3)
Point cloud segmentation evaluation
The performance of the trained PointNet++ model for point cloud segmentation was evaluated using Overall Accuracy ($OA$) and Mean Intersection over Union ($mIoU$). The corresponding formulas are as follows:

$$OA = \frac{\sum_{i=1}^{k} TP_i}{\sum_{i=1}^{k} (TP_i + FN_i)}, \qquad mIoU = \frac{1}{k} \sum_{i=1}^{k} \frac{TP_i}{TP_i + FP_i + FN_i}$$

where $k$ represents the number of classes, and $TP_i$, $FP_i$, and $FN_i$ represent the true positives, false positives, and false negatives for the $i$-th class, respectively.
- (4)
Phenotypic parameter calculation and yield estimation evaluation
The evaluation metrics selected were Mean Absolute Percentage Error ($MAPE$), Root Mean Squared Error ($RMSE$), Normalized Root Mean Squared Error ($nRMSE$), and Coefficient of Determination ($R^2$). Among these, $RMSE$, $nRMSE$, and $R^2$ were also used as evaluation metrics for the yield estimation model.

$$MAPE = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|, \qquad RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

$$nRMSE = \frac{RMSE}{\bar{y}} \times 100\%, \qquad R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

where $n$ represents the number of samples, $y_i$ is the observed value, $\hat{y}_i$ is the predicted value, and $\bar{y}$ is the mean of the observed values.
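These metrics translate directly into numpy, matching the formulas above; a small sketch for completeness:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MAPE, RMSE, nRMSE (%), and R^2 for observed vs. predicted values."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    err = y_true - y_pred
    rmse = np.sqrt(np.mean(err ** 2))
    return {
        "MAPE":  100.0 * np.mean(np.abs(err / y_true)),
        "RMSE":  rmse,
        "nRMSE": 100.0 * rmse / y_true.mean(),
        "R2":    1.0 - (err ** 2).sum()
                     / ((y_true - y_true.mean()) ** 2).sum(),
    }
```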
5. Conclusions
This study proposed an automated algorithm for the three-dimensional reconstruction and segmentation of shiitake mushrooms based on multi-view images. The workflow consisted of five main components: image acquisition and segmentation, 3D reconstruction, point cloud preprocessing, point cloud segmentation, and phenotypic parameter calculation. The YOLOv8 model achieved excellent performance in segmenting ROI from multi-view images, with an accuracy of 99.96%. The PointNet++ model enhanced with CBAM and PConv modules excelled in point cloud segmentation, achieving an OA of 97.45%. For the phenotypic parameter calculation, the nRMSE for the pileus transverse and longitudinal diameters and the stipe diameter was below 10%, whereas the errors for pileus height and stipe height were higher, with nRMSE values of 17% and 15%, respectively. Using the GRNN model, the shiitake mushroom yield was estimated from the extracted phenotypic parameters, achieving an RMSE of 2.276 g. This method could extend beyond shiitake mushrooms to other fungi, including white mushrooms, straw mushrooms, reishi mushrooms, lion's mane mushrooms, and oyster mushrooms. Future research will focus on integrating 3D reconstruction technologies with deep learning to enhance phenotypic parameter extraction, supporting applications in mushroom grading, phenotype–genotype analysis, and related fields.