Rubber Tree Recognition Based on UAV RGB Multi-Angle Imagery and Deep Learning
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Dear Authors,
After analyzing the manuscript entitled “Rubber tree recognition based on UAV RGB multi-angle imagery and deep learning” by Liang et al., I make the following considerations:
The rapid and accurate identification of individual rubber trees (H. brasiliensis) not only plays an important role in predicting biomass and yield, but also contributes to estimating carbon sequestration. The trunks of defoliated trees can be clearly identified in high-resolution RGB images. For this purpose, during the study, a drone equipped with an RGB camera was used to capture high-resolution images of rubber plantations from three observation angles (-90°, -60°, 45°) and two flight directions (SN: perpendicular to the rows of rubber plantings, and WE: parallel to the rubber planting rows) during the deciduous period. Four convolutional neural networks (Multi-scale Attention Network, MAnet; Unet++; Unet; Pyramid Scene Parsing Network, PSPnet) were employed to explore the beneficial observation angles and directions for rubber tree identification and counting.
The Introduction is well written, apart from a few minor errors.
The Materials and Methods section of the manuscript requires clarification.
Discussion: I find it a little strange that the results of monitoring a deciduous species (rubber tree) are compared with evergreens (pines). Nevertheless, I think that the discussion section is properly written.
The citation format of the Reference list does not correspond to that expected in the journal Drones. Please correct it.
L 39: The scientific name of the plant should be written in italics.
L 40-42: I understand that it can be an important tree species (H. brasiliensis) in the tropics from the point of view of CO2 sequestration, but I think it requires a careful approach, since the target species is not native to Asia. It has been proven in many cases that the introduced species can change the environment, which damages the native species. Note that no such evidence is yet available for H. brasiliensis.
L 71-72: Please specify the plants with their scientific names.
L 101-103: There is no need to write such detailed figures in the Introduction, they should rather be transferred to the Discussion as a comparison.
L 123: It is advisable to add what made it possible to identify the target trees: Rubber trees can be identified by the characteristic color/pattern of their bark compared to the surrounding tree species. Or, what other characteristics of rubber trees made it possible to identify them in RGB images?
L 140: Figure 2?? Figure 1 should be here. Moreover, the references to figures are not in the correct format in the text (e.g. L 188). And the figures are not in the right place either. Please check the entire manuscript.
L 152-161: What software was the flight plan designed in? DroneDeploy was used for it?
L 155: Spaces are missing. “..5472 × 3648 pixels..” instead of “..5472×3648 pixels..” Please correct similar ones in the entire manuscript.
L 156: What was the longitudinal and transverse image overlap?
L 160-161 and L 172-173: How many rubber trees were there in the study area? 75 are mentioned in the first sentence and 614 in the second. What accounts for the difference?
L 206: “..proposed by Zhao et al. [30] to obtain..” instead of “..proposed by [30] to obtain..” Please check the entire manuscript.
L 221: a space is missing.
L 267 (Figure 4.): A more detailed figure caption is required, and what do the two blue squares represent in the figure?
L 383: delete spaces.
Considering the above, I suggest a minor revision of the manuscript.
Comments on the Quality of English Language
The English language of the manuscript is clear and understandable; however, I recommend minor corrections and revisions.
Author Response
Thank you for your suggestion. We have revised the manuscript accordingly and addressed the review comments point by point. The revised texts have been highlighted with a yellow background. Please note that the line numbers in this report refer to the revised manuscript.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
The manuscript "Rubber tree recognition based on UAV RGB multi-angle imagery and deep learning” explores the use of unmanned aerial vehicles and deep learning algorithms for rubber tree identification and counting during defoliation periods. It suggests that these methods, specifically using the Unet++ algorithm, can improve recognition accuracy, providing a new approach for forest monitoring during specific phenological periods.
The topic is suitable for the journal and is relevant to precision agriculture or precision forestry. However, it needs substantial revisions before it can be considered for publication.
While the current introduction concentrates heavily on deep learning (DL), it's essential to remember that this study delves into the use of UAVs in precision agriculture and forestry. Consequently, the authors are advised to revise the introduction and discussion sections to reflect this dual focus. Consider incorporating insights from the following articles:
• “Fast Tree Detection and Counting on UAVs for Sequential Aerial Images with Generating Orthophoto Mosaicing” https://doi.org/10.3390/rs14164113
• “Estimating Tree Height and Volume Using Unmanned Aerial Vehicle Photography and SfM Technology, with Verification of Result Accuracy” https://doi.org/10.3390/drones4020019
• “Tree Crown Detection and Delineation in a Temperate Deciduous Forest from UAV RGB Imagery Using Deep Learning Approaches: Effects of Spatial Resolution and Species Characteristics” https://doi.org/10.3390/rs15030778
• “A Novel Technique Using Planar Area and Ground Shadows Calculated from UAV RGB Imagery to Estimate Pistachio Tree (Pistacia vera L.) Canopy Volume” https://doi.org/10.3390/rs14236006
The 'Materials and Methods' section requires more information about the area of study. Moreover, the authors should consider providing a more direct comparison of the architectures in relation to their specific use case. This would help the reader understand the benefits and potential drawbacks of each architecture in the context of this research.
The 'Results' section is clear, although figures and tables would benefit from a detailed explanation in the text. Currently, some results are not fully discussed in the discussion section, an inconsistency that should be addressed. The 'Discussion' section needs to elaborate on the benefits of the tree trunk counting methodology for precision agriculture and the comparison with other works, which is not clearly expressed at present. Therefore, a subsection including this information is advised.
As for figures, captions should be self-explanatory. For each figure and table, there should be clear descriptions and interpretations, allowing the reader to understand the significance of the data without referring to the text. Therefore, all captions in the paper should be enlarged to explain much better what is contained in the figure or table without resorting to the text. In addition, the quality of the figures should be improved. For example, in figure 5 the letters and numbers do not read well.
Finally, the terminology used throughout the manuscript, "tree identification" and "crown," could lead to confusion, because these terms are employed for fully developed trees, including leaves. The more precise terms "trunk" or "tree trunk" and "bare crown" would improve clarity. Moreover, I do not understand why the authors write “Figure. X” instead of “Figure X”.
English language is fine but the paper uses both American (e.g., "recognize") and British (e.g., "recognise") English interchangeably. Authors should stick to one spelling convention for consistency. In addition, there are some errors and instances where clarity could be improved. For example, "Although a number of research on rubber plantation change detection" (line 43) should be "Although a number of research studies on rubber plantation change detection". Another example: "The forward and lateral overlap was set to 80% and 70%, respectively" (line 156). The verb 'was' should be 'were'.
Specific comments
Line 32
"This research provides a new idea for tree identification"
In reality, this research focuses on trunk identification, making this statement somewhat misleading.
Line 54-58
"For example, Zhang et al. [5] […] accuracy R2 achieved 0.68."
The focus of this study is on UAV technology, making references to satellite technology seem misplaced.
Line 59
"the remote sensing […] monitoring at the tree level "
I agree, thus the authors should maintain a focus on UAV technology, avoiding unrelated information about satellites.
Line 64
"various civilian applications"
As the paper pertains specifically to precision agriculture, references to "civilian uses" should be omitted to maintain focus.
Line 66-68
"UAV-based images acquired from […] aboveground biomass [11], tree crown [12], and tree height [13]."
This explanation should be expanded. Authors should develop a paragraph detailing current state-of-the-art methods for counting/assessing individual tree trunks/canopies/crowns.
Line 80-81
More specifics on the Fast R-CNN architecture and its advantages over contrast CNNs are needed. The assertion "faster and more accurate than the proposed contrast CNNs" lacks context and specificity.
Lines 86-91
The comparison between DL algorithms and traditional ML methods like SVM and RF appears unclear. Moreover, there is no discussion of computing power and time requirements… are DL methods really more efficient than ML methods? An in-depth analysis or comparison in the context of image recognition would be beneficial.
Line 117-123
While the difficulties of using UAV-based RGB images for rubber tree counting due to high canopy coverage and complex crown architecture are clearly outlined, additional references to prove these claims would be useful.
Line 137
The study area description would benefit from more details and information, such as the total size in hectares.
Line 155
"from 10:00 a.m. to 4:00 p.m."
The impact of sun position and shadowing on image quality is well known. How have the authors addressed this issue?
Line 169
"were mosaicked"
How have artifacts introduced during the mosaicking process been handled?
Line 177
"divided into training images of 80% and validation images of 20%"
What is the rationale behind these percentages? Why not 70-30 or 90-10? Altering these values might affect the quality of the results.
Line 183
"The DL method has been widely used in target detection, computer vision, natural language processing, speech recognition, and other fields."
Unnecessary information. Please, focus on the topic.
Lines 283-291
The explanation of results depicted in Figure 5 requires more detail. The text (not just the figure) should clarify what is shown in the figure, and any notable trends.
Line 299
"PSPnet and MAnet had obviously missing detection in some cases."
However, these results aren't discussed in the discussion section, which simply states, "Although MAnet achieved optimal target detection on data sets related to the medical field, this algorithm may not be applicable to all object detection tasks".
Line 357
The implications of this tree trunk counting methodology for precision agriculture/forestry should be discussed in a dedicated section.
Line 391
"The crowns and trunks"
When referring to a tree's canopy without leaves, the term "bare crown" should be used. Please correct this throughout the manuscript.
Lines 396-410
The choice of the flight directions SN and WE, along with their impact on the study's results, requires clearer explanation and substantiation. Additionally, how have different flight times influenced the results? For instance, tree shadows vary considerably in Figure 10.
Line 477
"in recognizing rubber trees"
The correct phrase should be "in recognizing rubber tree trunks". Please amend this throughout the manuscript.
Lines 485-487
"That means oblique observation has […] with DL algorithms."
This sentence is somewhat complicated and requires rephrasing for clarity.
Line 489
Future research or the broader implications of the study should be discussed more extensively in the discussion or conclusion section, offering readers a wider perspective on the field.
Comments on the Quality of English Language
English language is fine but the paper uses both American (e.g., "recognize") and British (e.g., "recognise") English interchangeably. Authors should stick to one spelling convention for consistency. In addition, there are some errors and instances where clarity could be improved. For example, "Although a number of research on rubber plantation change detection" (line 43) should be "Although a number of research studies on rubber plantation change detection". Another example: "The forward and lateral overlap was set to 80% and 70%, respectively" (line 156). The verb 'was' should be 'were'.
Author Response
Thank you for your suggestion. We have revised the manuscript accordingly and addressed the review comments point by point. The revised texts have been highlighted with a yellow background. Please note that the line numbers in this report refer to the revised manuscript.
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
The manuscript has been significantly improved by the authors. However, an important comment from the previous review regarding figure captions remains unaddressed:
"As for figures, captions should be self-explanatory. For each figure and table, there should be clear descriptions and interpretations, allowing the reader to understand the significance of the data without referring to the text. Therefore, all captions in the paper should be enlarged to explain much better what is contained in the figure or table without resorting to the text."
For instance, the caption for Figure 6 states:
"Figure 6. Rubber tree trunks detection results with four CNN algorithms."
But it lacks clarity. For example, it does not explain the yellow/purple colours or terms like "origin" and "ground truth".
A more comprehensive caption for Figure 6 could be:
"Figure 6. Detection results of rubber tree trunks using four distinct CNN algorithms: Multi-scale Attention Network (MAnet), Unet++, Unet, and Pyramid Scene Parsing Network (PSPnet). Trees are highlighted in yellow, while the background is depicted in purple. "Origin" is the image source, RGB input. "Ground truth" are the labelled images."
The same applies to captions for Figures 2, 3, and 8. Authors should revise the captions before the publication of the paper.
Comments on the Quality of English Language
There are a number of English language errors in the text. For example, in the discussion section, here is a list of identified errors by line number:
426: "Effects of UAV Multi-angle and Orientations Observation on Tree Identification in Rubber Forest" should be "Effects of UAV Multi-angle Observations and Orientations on Tree Identification in a Rubber Forest."
427: "This study accessed" should be "This study assessed."
451-452: "The large offset makes" should be "This large offset results in."
476-477: "This research employed four outstanding deep learning methods" is subjective. "Outstanding" might not be appropriate in a research context.
484: "performed well for X-ray microscopy image segmentation, crowd counting, and liver Computerized tomography (CT) images segmentation" should be "performed well in tasks like X-ray microscopy image segmentation, crowd counting, and liver CT image segmentation."
511-512: "The homogenous and high overlapping canopy characteristics of rubber trees make" should be "The homogeneous and highly overlapping canopy characteristics of rubber trees make."
This is just an example. The authors should revise the rest of the manuscript accordingly.
Author Response
We appreciate your valuable review comments and suggestions. The captions of the figures and tables in the manuscript have been revised to enhance their readability. We carefully checked and corrected the English language errors throughout the manuscript. For detailed information, please see the attachment.
Author Response File: Author Response.pdf