1. Introduction
With the continuous development of plant phenomics, three-dimensional plant phenotypic analysis has become a challenging research topic. Point cloud segmentation using deep learning is the foundation of crop phenotype measurement and breeding. However, the point cloud datasets commonly used for training are scarce and difficult to obtain, and no widely used reference data exist for organ instance segmentation in phenotype extraction. In addition, because of the complex structure of plants, data annotation requires considerable manual processing. A well-labeled dataset is essential for deep-learning-based segmentation of plant point clouds, and such a dataset should have the following characteristics: complete plant structure, high precision, and coverage of multiple varieties and growth periods. Consequently, building a labeled crop point cloud dataset spanning the entire growth period is a key step toward accurate crop point cloud segmentation using deep learning.
Although the lack of well-labeled 3D plant datasets limits the further progress of plant point cloud segmentation [
1], many scholars have made significant advancements in building plant point cloud segmentation datasets in recent years. Zhou et al. [
2] manually segmented the 3D point cloud data of soybean plants and gave each point a ground truth label; these data served as the training set for point cloud segmentation and as ground truth for evaluating segmentation accuracy with machine learning methods. Li et al. [
3] used the MVS-Pheno platform to obtain multi-view images and point clouds of corn plants in a study of automatic organ-level point cloud segmentation based on high-throughput data acquisition and deep learning. The team also developed a data annotation toolkit specifically for corn plants, called Label3DMatch, and annotated the data to build a training dataset. Conn et al. [
4] planted tomato, tobacco, and sorghum under five growth conditions (ambient light, shade, high temperature, strong light, and drought) and performed 3D laser scanning of the plant shoot structure over 20–30 days of development (311 tomato, 105 tobacco, and 141 sorghum scans). A 3D plant dataset was then constructed, organized by species, condition, and time point. Li et al. [
5] took this original dataset, manually assigned stem and leaf semantic labels using the Semantic Segmentation Editor (SSE) tool, and established a well-labeled point cloud dataset for plant stem-leaf semantic segmentation and leaf instance segmentation. Hideaki et al. [
6] proposed a 3D phenotyping platform that measures plant growth and environmental information in a small indoor environment to obtain plant image datasets. They also introduced annotation tools with which leaf labels can be created manually, yet efficiently, on a pixel-by-pixel basis. Barth et al. [
7] rendered a synthetic dataset of 10,500 images with Blender. The scene contained 42 procedurally generated plant models with randomized plant parameters, based on 21 empirically measured plant characteristics at 115 locations on 15 plant stems. The fruit model was obtained by 3D scanning, and plant part textures were collected from photographs, yielding a reference dataset for modeling and for evaluating segmentation performance. David et al. [
8] established a large, diverse, and well-labeled wheat image dataset called the Global Wheat Head Detection (GWHD) dataset, containing 4700 high-resolution RGB images from multiple countries and 190,000 labeled wheat heads across different growth stages and a wide range of genotypes. Wang et al. [
9] constructed a lettuce point cloud dataset of 620 point clouds, combining real and synthetic data, for training a 3D instance segmentation network. Lai et al. [
10] used the SfM-MVS method to obtain point clouds of plant population scenes, which were then annotated in the style of the S3DIS dataset to produce data suitable for training and testing. To provide fundamental data support for the development of 3D point cloud segmentation and automatic phenotype acquisition for soybean, this study uses multiple-view stereo technology, taking advantage of its low cost, speed, and high precision, to construct 102 three-dimensional soybean plant models. These models were then manually labeled to build a point cloud segmentation dataset. Compared with other datasets, this dataset contains three-dimensional information on soybean plants over the whole growth period and offers advantages in model accuracy and quantity.
Binocular stereovision spatial positioning involves several key steps: image acquisition, camera calibration, image preprocessing, edge feature extraction, and stereo matching (a minimal sketch of the matching and reprojection steps follows this paragraph). Multi-view vision extends binocular vision by adding one or more cameras as measurement assistants, so that multiple image pairs of the same object can be captured from different angles. For the 3D reconstruction of a single plant, this method is best suited to low-sunlight laboratory conditions (Duan et al. [
11]; Hui et al. [
12]). This method can also be used for 3D reconstruction in the field, for example to study overall crop canopy volumes (Biskup et al. [
13]; Shafiekhani et al. [
14]). Compared with other methods, the multiple-view stereo method requires relatively simple equipment and builds models effectively with minimal human-computer interaction. Although its reconstruction speed is only moderate and it places high demands on environmental conditions, its reconstruction accuracy is high, it is easy to use, and the required equipment is relatively inexpensive. Zhu et al. [
15] built a soybean digital image acquisition platform based on a multi-perspective stereovision system of digital cameras covering different angles, effectively alleviating the mutual occlusion between soybean leaves, and then acquired morphological image sequences of target plants for 3D reconstruction. Nguyen et al. [
16] described a field 3D reconstruction system for plant phenotype acquisition. The system used synchronous, multi-view, high-resolution color digital images to create realistic 3D crop reconstructions and successfully obtained geometric characteristic parameters of the plant canopy. Lu et al. [
17] developed an MCP-based SfM system using multiple-view stereo technology and studied the appropriate 3D reconstruction method and the optimal shooting angle range. Choudhury et al. [
18] devised the 3DPhenoMV method, which takes plant images captured from multiple side views as input and reconstructs a 3D model of the plant from these views and the camera parameters. Miller et al. [
19] used low-cost hand-held cameras and SfM-MVS to reconstruct a spatially accurate 3D model of a single tree. Shi et al. [
20] adopted the multi-view method, allowing information from two-dimensional (2D) images to be integrated into the three-dimensional (3D) plant point cloud model, and evaluated the performance of 2D and multi-view methods on tomato seedlings. Lee et al. [
21] proposed an image-based 3D plant reconstruction system using multiple UAVs to simultaneously capture images of growing plants from different views and reconstruct 3D crop models based on structure-from-motion and multiple-view stereo algorithms. Sunvittayakul et al. [
22] developed a platform for acquiring 3D cassava root crown (CRC) models using close-range photogrammetry for phenotypic analysis. This novel method is low cost and easy to set up, requiring only a background sheet, a reference object, and a camera, making it suitable for field experiments in remote areas. Wu et al. [
23] developed MVS-Pheno V2, a phenotype analysis platform based on multi-view 3D reconstruction that focuses on low-growing plant shoots and enables high-throughput 3D data collection.
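To make the pipeline steps named at the start of this paragraph concrete, the following minimal Python sketch illustrates the stereo matching and 3D reprojection stages with OpenCV. It assumes an already calibrated and rectified image pair; the file names and the saved Q matrix are placeholders, not artifacts of any of the cited systems.

```python
import cv2
import numpy as np

# Placeholder rectified image pair; a real pipeline would first run
# cv2.stereoCalibrate and cv2.stereoRectify on calibration images.
left = cv2.imread("left_rectified.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right_rectified.png", cv2.IMREAD_GRAYSCALE)

# Semi-global block matching on the rectified pair (the stereo matching step).
matcher = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,   # disparity search range; must be divisible by 16
    blockSize=5,
    P1=8 * 5 ** 2,        # smoothness penalty for small disparity changes
    P2=32 * 5 ** 2,       # smoothness penalty for large disparity changes
    uniquenessRatio=10,
)
# compute() returns fixed-point disparities scaled by 16.
disparity = matcher.compute(left, right).astype(np.float32) / 16.0

# Reproject valid disparities to 3D using the 4x4 Q matrix
# produced by cv2.stereoRectify during calibration.
Q = np.load("Q.npy")
points3d = cv2.reprojectImageTo3D(disparity, Q)
valid = disparity > 0
cloud = points3d[valid]   # (N, 3) metric point cloud from one stereo pair
```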
In this study, the multiple-view stereo (MVS) method was used to reconstruct soybean plants. A soybean image acquisition platform was constructed to obtain multi-angle images of soybean plants at different growth stages. Based on the silhouette contour principle, models were built through contour approximation, vertex analysis, and triangulation, and the resulting 3D point clouds formed the original soybean dataset. The 3D soybean models were then manually labeled using CloudCompare v2.6.3, yielding an annotated 3D dataset of 102 models called Soybean-MVS. Owing to the inherent variation in the appearance and shape of natural objects, segmenting plant parts is challenging. To verify the usability of this dataset, the RandLA-Net and BAAF-Net point cloud semantic segmentation networks were trained and tested on Soybean-MVS.
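As a brief illustration of how a labeled model from such a dataset might be loaded and prepared for a segmentation network, consider the sketch below. The on-disk layout (whitespace-separated x y z r g b label columns) and the class indices are assumptions for illustration and may differ from the released Soybean-MVS format.

```python
import numpy as np

# Assumed layout: one text file per plant model with columns
# x y z r g b label; the class indices (0: leaf, 1: main stem, 2: stem)
# are illustrative, not the dataset's documented encoding.
def load_labeled_cloud(path):
    data = np.loadtxt(path)
    xyz = data[:, 0:3].astype(np.float32)
    rgb = data[:, 3:6].astype(np.float32) / 255.0
    labels = data[:, 6].astype(np.int64)
    return xyz, rgb, labels

def normalize(xyz):
    # Center the plant and scale it into a unit sphere so that models
    # from different growth stages share a comparable coordinate range.
    xyz = xyz - xyz.mean(axis=0, keepdims=True)
    return xyz / np.max(np.linalg.norm(xyz, axis=1))
```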
4. Discussion
This paper explored the growth of soybean plants based on 3D reconstruction technology.
Figure 10 shows the full soybean growth period, using the three-dimensional models of DN251 soybean plants constructed in this study as an example. The original and labeled three-dimensional whole-growth-period soybean datasets constructed in this study can provide an important basis for addressing issues raised by breeders, producers, and consumers. For example, crop phenotypic measurement requires effective phenotypic analysis of plant growth and morphological changes throughout the growth period; with this in mind, we propose the use of point cloud segmentation.
First, this paper chose the multiple-view stereo method to reconstruct soybean plants over the entire growth period. This method captures detailed plant information through crop images and extracts crop phenotypic parameters through related algorithms. Cao et al. [
26] developed a 3D image acquisition system that collects plant images from different angles to reconstruct 3D plant models; however, only 20 images were collected in that study, just meeting the minimum image overlap requirement for 3D reconstruction. In our study, 60 soybean plant images from different perspectives were collected at four different heights, so the reconstructed 3D models were more accurate. A three-dimensional dataset of the original soybean plants over the whole growth period was also established. By comparing the point cloud counts of the V (vegetative) and R (reproductive) stages, the relationship between the point cloud count of a 3D soybean model and the growth stage was analyzed, confirming that the number of plant points was consistent with real plant development. This provides an important basis for more accurate whole-growth-period 3D crop reconstruction in the future.
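A per-stage comparison of this kind might be scripted as follows; the directory layout and the stage-in-filename convention (e.g., DN251_V3.txt, DN251_R5.txt) are hypothetical, not the dataset's published organization.

```python
from pathlib import Path
import numpy as np

# Hypothetical convention: "_V" / "_R" in the file stem marks the
# vegetative or reproductive stage of the scanned plant.
counts = {"V": [], "R": []}
for f in Path("soybean_mvs_raw").glob("*.txt"):
    stage = "V" if "_V" in f.stem else "R"
    counts[stage].append(np.loadtxt(f).shape[0])

for stage, sizes in counts.items():
    if sizes:
        print(f"{stage} stages: {len(sizes)} models, "
              f"mean {int(np.mean(sizes)):,} points")
```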
Second, training point cloud segmentation models usually requires a large amount of labeled data, which is very costly to produce, particularly for dense prediction tasks such as semantic segmentation. In addition, plant phenotype datasets face the additional challenges of severe occlusion and varying lighting conditions, which make obtaining annotations more time-consuming (Rawat et al. [
27]). Gong et al. [
28] used a structured-light 3D scanning platform built around a special turntable to obtain 3D point cloud data of rice panicles, then labeled the points one by one with the open-source software LabelMe to create a rice panicle point cloud dataset. Boogaard et al. [
29] manually marked cucumber plants twice with CloudCompare and constructed annotated dataset A and annotated dataset B. Dutagaci et al. [
30] obtained 11 3D point cloud models of Rosa through X-ray tomography and manually annotated them, creating a labeled dataset, called ROSE-X, for evaluating 3D plant organ segmentation methods. However, these datasets do not emphasize three-dimensional data covering the entire plant growth period, and their data volumes are relatively small, limiting their completeness for subsequent studies such as phenotypic measurement over whole growth periods. In our study, Soybean-MVS, a labeled three-dimensional dataset of the whole soybean growth period, was constructed; it meets the data volume requirements for training and evaluating deep learning point cloud segmentation and ensures the completeness of the dataset used for point cloud segmentation research. This not only provides a basis for plant phenotype measurement, bionics research, and other issues but may also support exploration of the natural laws of plant growth.
Third, when labeling the dataset, because the main stem and stem classes of a soybean plant are visually similar and each plant has only one main stem, the number of main stem points is much lower than that of leaf and stem points, which led to low segmentation accuracy for main stems. In some cases, points on the petioles were classified as leaves. Nevertheless, the visualization results show that each point cloud segmentation network still segmented most of the points on the main stems. Therefore, the Soybean-MVS dataset can support effective point cloud segmentation.
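One standard countermeasure for such class imbalance, not part of the original experiments, is to weight the training loss by inverse class frequency; a minimal sketch, assuming the three-class labeling used above:

```python
import numpy as np

def class_weights(labels, num_classes=3):
    # labels: concatenated per-point class indices over the training split
    # (here assumed 0: leaf, 1: main stem, 2: stem).
    counts = np.bincount(labels, minlength=num_classes).astype(np.float64)
    freq = counts / counts.sum()
    weights = 1.0 / np.sqrt(freq + 1e-8)   # sqrt softens extreme ratios
    return weights / weights.sum() * num_classes

# The resulting weights could be passed to a weighted cross-entropy loss
# when training networks such as RandLA-Net or BAAF-Net.
```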
Finally, the Soybean-MVS dataset is universal. The universality of datasets is crucial to empirical research evaluation for at least three reasons: (1) it provides a basis for measuring progress by replicating and comparing results; (2) it reveals the shortcomings of the latest technology, paving the way for novel methods and research directions; and (3) it allows methods to be developed without first collecting and labeling data (Schunck et al. [
31]). Furthermore, highly universal data can meet the requirements of different point cloud segmentation models and yield highly reliable segmentation models. Turgut et al. [
32] adapted six point-cloud-based deep learning architectures (PointNet, among others) to segment the structure of a rosebush model and evaluated their performance on real rose shrubs using the ROSE-X and synthetic model datasets. In our paper, RandLA-Net and BAAF-Net were used for testing (other deep-learning-based 3D point cloud classification and segmentation models are also applicable). In the future, we will continue to expand and refine the Soybean-MVS dataset and apply it to other point cloud segmentation network models.
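For reference, the per-class IoU and mean IoU typically reported in such evaluations can be computed from a confusion matrix as in the sketch below, where pred and gt are integer class arrays and the three-class setup mirrors the leaf/main stem/stem labels assumed earlier:

```python
import numpy as np

def mean_iou(pred, gt, num_classes=3):
    # Build the confusion matrix: rows are ground truth, columns predictions.
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(conf, (gt, pred), 1)
    tp = np.diag(conf).astype(np.float64)
    denom = conf.sum(axis=0) + conf.sum(axis=1) - tp   # TP + FP + FN
    iou = tp / np.maximum(denom, 1)
    return iou, float(iou.mean())
```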