Article
Peer-Review Record

UAV Photogrammetric Surveys for Tree Height Estimation

by Giuseppina Vacca * and Enrica Vecchi
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 2 February 2024 / Revised: 14 March 2024 / Accepted: 18 March 2024 / Published: 20 March 2024
(This article belongs to the Section Drones in Agriculture and Forestry)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The reviewed paper deals with the interesting issue of automated determination of tree height from photogrammetrically processed UAV images. The topic remains relevant as photogrammetric reconstruction of vegetation still faces significant challenges such as moving leaves, thin branches, etc., which reduce the completeness and quality of point clouds. This can subsequently affect the accuracy of vegetation height determination. However, the article contains several formal, terminological and methodological errors, which I will address chronologically according to their occurrence in the text.

The explanation of abbreviations in the abstract is not consistent. For example, in precision agriculture, "(PA)" is in parentheses, but explanations are completely missing for "SfM" or "CHM." All abbreviations should be explained upon first use in the text (including LiDAR in the Introduction section).

In the Introduction section, there are absolutely no references related to photogrammetric tree height measurements or the SfM technology used in the article. Instead, it contains a paragraph with 20 citations regarding the use of LiDAR technology, which was practically not utilized in the article.

In section 2.2, there is a lack of information about the longitudinal and transverse overlap of images for individual flights. How were the missions planned? Or were they flown manually with interval shooting? What was the distribution of GCPs over the two areas? This is more important than the example of a target in Fig. 3.

In Table 1, incorrect information is provided in the Sensor column - the focal length data does not belong here, and it is also unclear whether it refers to the actual physical focal length or the full-frame equivalent. For flight number 1, the sensor resolution is stated as 48 MP, but the previous text mentions a setting of 12 MP - what resolution was actually used? It would be interesting to explain why a lower resolution was used than what is possible with the given UAV. From experience, we know that images in 48 MP mode are significantly noisy - was this the reason for using a different mode? The sensor resolution data for Area 1 and Area 2 is also inconsistently presented, once in MP and once in w x h (pixels) format. What is the pixel size on the DJI Mini 3 sensor? For Area 2, this data is presented (2.41 μm). It would be good to standardize the way the data is presented. What model of DJI Phantom was used for Area 2? Phantom 2, 3, or 4? Based on the camera designation, I assume Phantom 4, but this information should be directly in the article.

Table 1 - GCP accuracy - the errors for flight 3 (30 m) are too high (approx. 12 times higher than the GSD) and left without commentary. Max and min values of errors should be provided in a separate table. Did the authors use the GCPs during bundle adjustment? Or did they use only the basic 3D affine transformation after the import of reference coordinates? Why? If the distribution of GCPs was right, it could help to enhance the quality of the camera network.

Overall, it would be better to add a separate table with drone camera parameters (sensor size in mm and pixels, focal length, pixel size on the sensor), and include only information about flight height and GSD in Table 1.
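
To make the requested parameters concrete: the GSD follows directly from pixel pitch, flight height, and focal length. A minimal Python sketch (the 2.41 μm pitch is the value quoted above for the Area 2 camera; the 8.8 mm physical focal length is an illustrative assumption, not a value from the paper):

```python
def gsd_cm(pixel_pitch_um: float, flight_height_m: float, focal_length_mm: float) -> float:
    """Ground sampling distance in cm/pixel: GSD = pixel pitch * height / focal length."""
    return pixel_pitch_um * 1e-6 * flight_height_m / (focal_length_mm * 1e-3) * 100.0

# Example: 2.41 um pitch at 40 m altitude with an assumed 8.8 mm focal length.
print(f"{gsd_cm(2.41, 40.0, 8.8):.2f} cm/px")  # ~1.10 cm/px
```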

In section 2.3, line 160 - what is meant by "standard Structure from Motion algorithm"? The reader would expect that the provided reference [38] would lead to an article that makes it clear that Agisoft Metashape uses the "standard SfM algorithm", but there is no mention of Agisoft Metashape in the referenced article. Similarly, there is no mention of different existing SfM strategies (incremental, global, hierarchical) there. It would be appropriate to add further references on this issue or at least mention that Metashape likely uses an incremental SfM strategy. It is also not true that "SfM is a photogrammetric method for high-resolution topographic reconstructions." SfM is only used for image orientation (resulting in a sparse cloud of tie points and the parameters of interior and relative orientation). Completely different algorithms are then used for detailed surface reconstruction, which can basically be divided into local, global, and Multi-View Stereo (MVS). Metashape presumably uses one of the many MVS algorithms. The elaboration of the dense point cloud, DEM, and ortho photo generation definitely does not belong to the SfM workflow as the authors state in line 172.

Explanation is missing in line 174 for what "High quality parameter" means - various software packages designate this option differently. Essentially, it refers to determining the input resolution of images, and for Metashape, the original image size is reduced by half at the High setting (the raster area becomes a quarter). So, the images were not processed at their original 12 MP resolution.
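
As a quick check of the arithmetic described above, a minimal sketch (the per-side downscale factor for the High setting follows the reviewer's description of Metashape's behaviour):

```python
def effective_megapixels(native_mp: float, downscale_per_side: int) -> float:
    """Halving each image side divides the processed pixel count by four."""
    return native_mp / downscale_per_side ** 2

# High setting (half resolution per side): 12 MP images are processed at ~3 MP.
print(effective_megapixels(12, 2))  # 3.0
```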

Table 2. Why is the file size for flight number 3 twice as large as for flights 1 and 2 despite having approximately the same number of points in the point cloud?

Table 3. Flight 2, Mean value -0,26 should be -0.26

Page 8, Line 248 - is it possible that the mean difference value compared to direct measurement (-0.62 m) would decrease if a DEM with a higher resolution than 20 cm was used? Especially with smaller trees, some vertices could be lost in a sparser raster. Anyway, this result suggests that something is wrong with the data processing. With a GSD of 1 cm, we expect the depth (height) accuracy to be at most 2-3 times worse than the pixel size (approx. 2-3 cm). Also, from Fig. 6 b), it is evident that there is a systematic error present in the graph - all differences are negative, and compared to tree heights up to 1.4 m, it looks like entire trees are missing from the DEM. For better support of the conclusions, it would be good to add a depiction of the relationship between error size and tree height or its state (leafless, etc.) - provide some examples where the height determination differences were the largest (show a point cloud sample of a tree, its corresponding DEM, and the height difference from the GIS analysis).

It would also be good to present the accuracy of tree height determination not only as difference values but also as a relative error compared to tree height - because it matters whether we achieve an accuracy of 0.3 m in determining the height of a 3 m tall tree (10% error) or of a 30 cm plant (100% error).
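
A minimal sketch of the two error measures contrasted here (a hypothetical helper, not code from the paper):

```python
def height_errors(measured_m: float, estimated_m: float) -> tuple[float, float]:
    """Absolute difference (m) and relative error (%) w.r.t. the measured height."""
    diff = estimated_m - measured_m
    return diff, 100.0 * abs(diff) / measured_m

print(height_errors(3.0, 2.7))  # ≈ (-0.3, 10.0): 10% error on a 3 m tree
print(height_errors(0.3, 0.0))  # ≈ (-0.3, 100.0): 100% error on a 30 cm plant
```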

Line 346 - oblique images in the next experiments would definitely help enhance the quality of ground parts of the point cloud under the trees, which could lead to improvements in the classification process.

Author Response

Dear Reviewer,

Thank you for the careful comments, which helped us to improve this final version of the paper.

You can find the edited sections underlined in red in the manuscript file.

Hereafter we cite your comments using RC#, naming our replies with AA#.

Thank you for the attention you devoted to our work.

Regards,

The Authors.

RC1: The explanation of abbreviations in the abstract is not consistent. For example, in precision agriculture, "(PA)" is in parentheses, but explanations are completely missing for "SfM" or "CHM." All abbreviations should be explained upon first use in the text (including LiDAR in the Introduction section).

AA1: You are right, thank you. We fixed the abbreviations both in the abstract and in the introduction.

 

RC2: In the Introduction section, there are absolutely no references related to photogrammetric tree height measurements or the SfM technology used in the article. Instead, it contains a paragraph with 20 citations regarding the use of LiDAR technology, which was practically not utilized in the article.

AA2: Dear Reviewer, you are right. We now cite both sources already present in the old version and new ones (Lines 59-61).

 

RC3: In section 2.2, there is a lack of information about the longitudinal and transverse overlap of images for individual flights. How were the missions planned? Or were they flown manually with interval shooting? What was the distribution of GCPs over the two areas? This is more important than the example of a target in Fig. 3.

AA3: Dear Reviewer, you are right. We added this information in Section 2.2 (Lines 132-137; 147-151).

 

RC4: In Table 1, incorrect information is provided in the Sensor column - the focal length data does not belong here, and it is also unclear whether it refers to the actual physical focal length or the full-frame equivalent. For flight number 1, the sensor resolution is stated as 48 MP, but the previous text mentions a setting of 12 MP - what resolution was actually used? It would be interesting to explain why a lower resolution was used than what is possible with the given UAV. From experience, we know that images in 48 MP mode are significantly noisy - was this the reason for using a different mode? The sensor resolution data for Area 1 and Area 2 is also inconsistently presented, once in MP and once in w x h (pixels) format. What is the pixel size on the DJI Mini 3 sensor? For Area 2, this data is presented (2.41 μm). It would be good to standardize the way the data is presented. What model of DJI Phantom was used for Area 2? Phantom 2, 3, or 4? Based on the camera designation, I assume Phantom 4, but this information should be directly in the article.

AA4: Dear Reviewer, you are right. We homogenized the data in Table 1 and separated the camera parameters, as suggested in your next comment (RC6). Moreover, we expressed the focal lengths as the full-frame equivalent (Table 2). Concerning the model of the DJI Phantom, this was a typo in the draft of the paper; we included the model number (DJI Phantom 4) in the text (Line 147). About the sensor resolution: yes, we set a value of 12 MP following recommendations from online forums suggesting that 48 MP images are noisier and may thus prevent the detection of some details. Therefore, we agree with you, and we will deepen this aspect in future research.

 

RC5: Table 1 - GCP accuracy - the errors for flight 3 (30 m) are too high (approx. 12 times higher than the GSD) and left without commentary. Max and min values of errors should be provided in a separate table. Did the authors use the GCPs during bundle adjustment? Or did they use only the basic 3D affine transformation after the import of reference coordinates? Why? If the distribution of GCPs was right, it could help to enhance the quality of the camera network.

AA5: Dear Reviewer, thank you for your suggestion. We included a new table (now Table 3) providing these values. Moreover, we commented on the high values related to Flight 3 in Lines 187-190.
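
For readers unfamiliar with the distinction raised in RC5: using GCPs inside the bundle adjustment constrains the camera network itself, whereas a rigid post-hoc fit only maps the finished model onto the reference frame. A minimal sketch of the latter, a least-squares 3D similarity (Helmert) fit (an illustrative Umeyama-style NumPy solution, not the authors' actual processing):

```python
import numpy as np

def similarity_transform(src: np.ndarray, dst: np.ndarray):
    """Least-squares fit of dst ≈ s * R @ src + t over corresponding points.

    src, dst: (N, 3) arrays, e.g. model coordinates vs. surveyed GCP coordinates.
    Returns scale s, rotation R (3x3), and translation t (3,).
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    A, B = src - mu_s, dst - mu_d
    U, S, Vt = np.linalg.svd(B.T @ A / len(src))   # cross-covariance SVD
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:   # guard against a reflection
        D[2, 2] = -1.0
    R = U @ D @ Vt
    s = (S * np.diag(D)).sum() / (A ** 2).sum() * len(src)
    t = mu_d - s * R @ mu_s
    return s, R, t
```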

 

RC6: Overall, it would be better to add a separate table with drone camera parameters (sensor size in mm and pixels, focal length, pixel size on the sensor), and include only information about flight height and GSD in Table 1.

AA6: Dear Reviewer, thank you for your suggestion. We separated the information by adding a new table (Table 2) in the new version of the paper.

 

RC7: In section 2.3, line 160 - what is meant by "standard Structure from Motion algorithm"? The reader would expect that the provided reference [38] would lead to an article that makes it clear that Agisoft Metashape uses the "standard SfM algorithm", but there is no mention of Agisoft Metashape in the referenced article. Similarly, there is no mention of different existing SfM strategies (incremental, global, hierarchical) there. It would be appropriate to add further references on this issue or at least mention that Metashape likely uses an incremental SfM strategy. It is also not true that "SfM is a photogrammetric method for high-resolution topographic reconstructions." SfM is only used for image orientation (resulting in a sparse cloud of tie points and the parameters of interior and relative orientation). Completely different algorithms are then used for detailed surface reconstruction, which can basically be divided into local, global, and Multi-View Stereo (MVS). Metashape presumably uses one of the many MVS algorithms. The elaboration of the dense point cloud, DEM, and ortho photo generation definitely does not belong to the SfM workflow as the authors state in line 172.

AA7: Dear Reviewer, we were trying to simplify and be concise in this paragraph, but this may have resulted in a slightly inappropriate description. We added citations for the Structure from Motion workflow implemented by Metashape and changed the sentences according to your suggestions (Lines 163-173).
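
To illustrate the separation the reviewer insists on (image orientation via SfM, dense reconstruction via MVS, and derived products as a third step), a minimal sketch using Agisoft Metashape's Python scripting module (method names follow the 1.x API and change between versions, e.g. buildDenseCloud() became buildPointCloud() in 2.x; file names are hypothetical):

```python
import Metashape  # Agisoft Metashape Professional scripting module

doc = Metashape.Document()
chunk = doc.addChunk()
chunk.addPhotos(["IMG_0001.JPG", "IMG_0002.JPG"])  # hypothetical file names

# SfM stage: image orientation only. Feature matching plus (incremental)
# alignment yields the sparse tie-point cloud and the interior/exterior
# orientation parameters.
chunk.matchPhotos()
chunk.alignCameras()

# MVS stage: dense surface reconstruction, a separate algorithm family.
chunk.buildDepthMaps()   # quality/downscale settings are chosen here (see RC8)
chunk.buildDenseCloud()

# Derived products: not part of SfM proper.
chunk.buildDem()
chunk.buildOrthomosaic()
doc.save("project.psx")
```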

 

RC8: Explanation is missing in line 174 for what "High-quality parameter" means - various software packages designate this option differently. Essentially, it refers to determining the input resolution of images, and for Metashape, the original image size is reduced by half at the High setting (the raster area becomes a quarter). So, the images were not processed at their original 12 MP resolution.

AA8: Dear Reviewer, thank you for this suggestion. We added a more detailed explanation in Lines 179-181.

 

RC9: Table 2. Why is the file size for flight number 3 twice as large as for flights 1 and 2 despite having approximately the same number of points in the point cloud?

AA9: Dear Reviewer, thank you for your attention! This was a typo; we checked the actual values and replaced them in the new version of the paper (now Table 4). Thank you again.

 

RC10: Table 3. Flight 2, Mean value -0,26 should be -0.26

AA10: You are right, thank you.

 

RC11: Page 8, Line 248 - is it possible that the mean difference value compared to direct measurement (-0.62 m) would decrease if a DEM with higher resolution than 20 cm was used? Especially with smaller trees, some vertices could be lost in a sparser raster. Anyway, this result suggests that something is wrong with the data processing. With a GSD of 1 cm, we expect the depth (height) accuracy to be at most 2-3 times worse than the pixel size (approx. 2-3 cm). Also, from Fig. 6 b), it is evident that there is a systematic error present in the graph - all differences are negative, and compared to tree heights up to 1.4 m, it looks like entire trees are missing from the DEM. For better support of conclusions, it would be good to add a depiction of the relationship between error size and tree height or its state (leafless, etc.) - provide some examples where height determination differences were the largest (show a point cloud sample of a tree, its corresponding DEM, and difference height for GIS analysis).

AA11: Dear Reviewer, thank you for your comments. Below are our replies.

  1. Is it possible that the mean difference value compared to direct measurement (-0.62 m) would decrease if a DEM with a higher resolution than 20 cm was used? Especially with smaller trees, some vertices could be lost in a sparser raster.

We performed the same elaborations using a 5 cm resolution without any improvement in accuracy. For this reason, we decided to set a resolution of 20 cm to avoid heavy computational effort.

  2. Anyway, this result suggests that something is wrong with the data processing. With a GSD of 1 cm, we expect the depth (height) accuracy to be at most 2-3 times worse than the pixel size (approx. 2-3 cm). Also, from Fig. 6 b), it is evident that there is a systematic error present in the graph - all differences are negative, and compared to tree heights up to 1.4 m, it looks like entire trees are missing from the DEM.

We commented on the histogram of Area 2, underlining that the differences are negatively shifted: the issue in our case is not only related to the GSD but rather to the small size of the trees, which are hardly detectable in the dense point clouds.

  3. For better support of conclusions, it would be good to add a depiction of the relationship between error size and tree height or its state (leafless, etc.) - provide some examples where height determination differences were the largest (show a point cloud sample of a tree, its corresponding DEM, and the height difference from the GIS analysis).

Concerning Area 2, we explained that the obtained residual values were comparable with the actual trees' heights. Moreover, we underlined that almost all the plants were characterized by a very poor presence of leaves; we still tried to relate the differences to the plant species, without any significant outcome. For these reasons, we decided to present box and whisker plots and regression lines for the flights in Area 1 only.

 

RC12: It would also be good to present the accuracy of tree height determination not only as difference values but also as a relative error compared to tree height - because it matters whether we achieve an accuracy of 0.3 m in determining the height of a 3 m tall tree (10% error) or of a 30 cm plant (100% error).

AA12: Dear Reviewer, you are right. We included a more detailed statistical analysis in the Methodology Section (Lines 245-262). Note that, as previously explained in AA11, some of these results are not fully presented for Area 2, since the residuals already describe the observed behaviour.

 

RC13: Line 346 - oblique images in the next experiments would definitely help enhance the quality of ground parts of the point cloud under the trees, which could lead to improvements in the classification process.

AA13: Dear Reviewer, you are right. Our sentence may have stated the concept without explaining it properly. We modified the sentence in the new version of the paper (Lines 411-414).

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The paper does not bring out innovative research elements; its structure appears as a simple aerial photogrammetric survey contextualized in two different areas of investigation. Honestly, I see it as a simple processing of datasets.

Author Response

Dear Reviewer, thank you for reading our paper.

We are confident that this version of the manuscript has improved after integrating the revisions.

You can find the edited sections of the paper underlined in red.

Thank you for the attention you devoted to our work.

 

Sincerely.

The Authors.

Reviewer 3 Report

Comments and Suggestions for Authors

Find my comments in the pdf file.

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Moderate editing of the English language is required

Author Response

Dear Reviewer,

Thank you for the careful comments, which helped us to improve this final version of the paper.

You can find the edited sections underlined in red in the manuscript file.

Hereafter we cite your comments using RC#, naming our replies with AA#.

Thank you for the attention you devoted to our work.

Regards,

The Authors.

RC1: In line 10: What kind of trees (i.e., olive trees)? It is good to know this information from the beginning…

AA1: Dear Reviewer, you are right. We specified the tree species in the abstract.

 

RC2: In lines 19-20, are you referring to test area 2 (context of low trees)? I would suggest being consistent with the terms. Once in your previous line, you used area 1… For example, "while the application in area 2 was more problematic (in the case of shorter trees)".

AA2: Dear Reviewer, you are right. We changed the sentence according to your suggestion (Lines 21-22).

 

RC3: In line 21: instead of using "DTM", replace it with the "local maxima" algorithm. Either way, you have already mentioned the CHM.

AA3: Dear Reviewer, thank you. We fixed it according to your suggestions (Lines 23-24).

 

RC4: In line 28: I assume you wanted to say optical multispectral and not optical, multispectral. Optical data are also multispectral.

AA4: Dear Reviewer, you are right. We fixed the sentence according to your suggestions (Line 30).

 

RC5: In lines 51-53: I strongly recommend adding the following recent reference 1) Panagiotidis, D.; Abdollahnejad, A.; Slavik, M. 3D point cloud fusion from UAV and TLS to assess temperate managed forest structures. Int. J. Appl. Earth Obs. Geoinf. 2022, 112, 102917. https://www.sciencedirect.com/science/article/pii/S1569843222001182?via%3Dihub

AA5: Dear Reviewer, thank you for your suggestion. We added this reference (Lines 53-55).

 

RC6: In lines 55-57: I strongly recommend adding the following reference. It is more in agreement with your study: 2) Peter Surový, Nuno Almeida Ribeiro & Dimitrios Panagiotidis (2018) Estimation of positions and heights from UAV-sensed imagery in tree plantations in agrosilvopastoral systems, International Journal of Remote Sensing, 39:14, 4786-4800, DOI: 10.1080/01431161.2018.1434329 https://www.tandfonline.com/doi/full/10.1080/01431161.2018.1434329

AA6: Dear Reviewer, thank you for your suggestion. We added this reference (Lines 57-59).

 

RC7: Lines 90-97: No need to get into details in this section. Delete those lines; that's a topic for the methodology.

AA7: Dear Reviewer, thank you for your suggestion. We incorporated this paragraph in the Methodology Section, simplifying this part (Lines 91-97).

 

RC8: In general, the section of methodology is incomplete. I would suggest deleting some irrelevant parts and instead focusing more on the literature related to what you did in this paper. For example, add a paragraph on the local maxima algorithm (the one you used to estimate your heights), also citing related papers. Read the following and the previously suggested papers to get some ideas. Here is another suggestion that is strongly related to your study and that you can cite: 3) Dimitrios Panagiotidis, Azadeh Abdollahnejad, Peter Surový & Vasco Chiteculo (2017) Determining tree height and crown diameter from high-resolution UAV imagery, International Journal of Remote Sensing, 38:8-10, 2392-2410, DOI: 10.1080/01431161.2016.1264028 https://www.tandfonline.com/doi/full/10.1080/01431161.2016.1264028

AA8: Dear Reviewer, you are right. We added a more detailed description of the Local Maxima approach in Section 2.4 (2.5 in the old version of the paper) (Lines 225-234).
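
A minimal sketch of the local maxima approach, a moving-window filter over the CHM raster (an illustrative NumPy/SciPy version, not the authors' exact implementation):

```python
import numpy as np
from scipy.ndimage import maximum_filter

def local_maxima_tree_tops(chm: np.ndarray, window: int = 5, min_height: float = 0.5):
    """Return (row, col) indices of cells that are the maximum of their
    window x window neighbourhood and exceed min_height (metres)."""
    is_peak = (chm == maximum_filter(chm, size=window)) & (chm > min_height)
    return np.argwhere(is_peak)

# Toy CHM with two synthetic crowns
chm = np.zeros((50, 50))
chm[10, 10], chm[30, 35] = 4.2, 2.8
print(local_maxima_tree_tops(chm))  # [[10 10] [30 35]]
```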

 

RC9: In figure 1, fix the coordinate system in the map settings in ArcGIS or any other GIS software that you use; also add a north arrow.

AA9: Dear Reviewer, you are right, some information was missing in Figure 1. We specified the reference system in the figure caption: "ETRS89-ETRF2000-UTM32 reference system (EPSG: 6707)". We also added the term "plantation" in the "Study Areas" subsection (2.1), thank you.

 

RC10: In lines 120-121: It is not clear what you are trying to say with this statement. If I understood correctly, you want to say that in both sites there is negligible slope? If this is the case, it doesn't really matter for the construction of the terrain and surface models, since you will normalize the heights (CHMs).

AA10: Dear Reviewer, you are right; this sentence may have oversimplified the concept. We meant that the negligible slopes in both sites allow the production of the Digital Terrain Models without running into significant errors due to the interpolation process. Anyway, we modified the sentence in the new version of the paper (Lines 202-204, now in Section 2.3).
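
For context, the height normalization mentioned in RC10 is a per-cell subtraction of the two rasters; a minimal sketch (assuming co-registered DSM and DTM grids of equal shape):

```python
import numpy as np

def canopy_height_model(dsm: np.ndarray, dtm: np.ndarray) -> np.ndarray:
    """CHM = DSM - DTM; negative cells (interpolation noise) are clipped to zero."""
    return np.clip(dsm - dtm, 0.0, None)
```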

 

RC11: In my opinion, figure 2 doesn't really offer anything; I would recommend deleting it and keeping just figure 3, where you show the application of GCPs on the ground. In figure 2 I can also distinguish some heavy clouds. Was there any issue finding satellites, and how many was the Mavic able to find before your flight operation?

AA11: Dear Reviewer, we did not consider the GNSS data of the drone during the processing.

 

RC12: I also see that you used different drones in areas 1 and 2. What was the main reason for that? Also, in area 1, which is larger (1.2 ha), you used 6 GCPs, while in area 2, which is 0.5 ha, you used 7 GCPs. Could you please elaborate?

AA12: Dear Reviewer, the two considered areas are related to different contexts. Area 2 was involved in a wider project named CARMA (which we cited in the text: "Ecoserdiana. Progetto di Ricerca su Tecnologie di CARatterizzazione Monitoraggio e Analisi per il ripristino e la bonifica (CARMA) - Fondo Europeo di Sviluppo Regionale - Por Fesr Sardegna 2014-2020, available at: https://www.ecoserdiana.com/servizi/progetti-di-ricerca.html"), while Area 1 was chosen as a more standard case study. Unfortunately, we could not choose the drone employed in Area 2, but in both cases we kept the choice within the context of low-cost instruments. Concerning the number of GCPs, originally two more targets were placed in Area 1, but they were not visible during the flights.

 

RC13: Based on figure 2b (for area 2) and table 1, what was the reason for flying that high? I mean, the size of these young trees is barely distinguishable with the eye. It is very difficult for the drone to "capture points" from an altitude of >40 m. Your overlap is quite good, but the resolution is quite low, and I am curious to see your sparse and dense clouds from area 2.

AA13: Dear Reviewer, you are right. In the paper, we specified that the point cloud was not able to properly represent the young trees (Lines 292-294). Nevertheless, some disturbing elements prevented us from flying at lower altitudes in that case, but we are planning to acquire new data with lower flight altitudes, as specified in the conclusions (Lines 399-402).

 

RC14: Was the sensor pointed directly at nadir or tilted? I am asking because of figure 4; also, what are the white areas in the crowns of area 1? In the same image, I also distinguish shadowing effects due to the flight time. What was the flight time in all cases, and what was the date? Please add this info.

AA14: Dear Reviewer, you are right. Figure 4 was misleading, but we only used nadir images (this is now specified in Table 1). We changed the figure in the new version of the paper and added date and time information in Table 1.

 

RC15: I also suggest adding (in table 1 or 2) the flight path pattern and the duration of the flights.

AA15: Dear Reviewer, thank you for your suggestion; we added a column in Table 1.

 

RC16: In subsection 2.3, how many original images did you have, and how many were finally aligned in Metashape? I also suggest adding this information somewhere in the text. Also, what were the cell size and resolution of your models?

AA16: Dear Reviewer, you are right; the information about the number of aligned images was missing in the original draft of the paper. We added this data in the new version of the paper, in Lines 177-179. Concerning the cell size of the DTM and DSM models, this is specified in Lines 196-220 (now moved to Section 2.3).

 

RC17: For the reconstruction of the DTM mesh, did you use automatic or manual classification, and which classified point cloud was it based on, the sparse or the dense? And how about the DSM mesh…

AA17: Dear Reviewer, the DTM and DSM processing is explained at the end of Section 2.3 (Lines 210-220) (a different position in the text with respect to the previous version of the paper). Both DSMs and DTMs were produced using the Rasterize tool in CloudCompare, without involving Metashape. Concerning the DTMs, the interpolation was applied to the filtered dense point clouds using: i) the automatic CloudCompare filter (CSF) for Area 1; and ii) LAStools in QGIS for Area 2.
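
A minimal sketch of the kind of gridding the Rasterize step performs (an illustrative NumPy version; CloudCompare's own tool offers more options, such as filling empty cells by interpolation):

```python
import numpy as np

def rasterize_max(points: np.ndarray, cell: float = 0.20):
    """Grid an (N, 3) point cloud into a max-z raster (a simple DSM).

    The same idea with np.minimum.at over ground-filtered points (e.g. the
    CSF output) yields a DTM.
    """
    origin = points[:, :2].min(axis=0)
    c, r = ((points[:, :2] - origin) / cell).astype(int).T
    grid = np.full((r.max() + 1, c.max() + 1), -np.inf)
    np.maximum.at(grid, (r, c), points[:, 2])  # keep the highest z per cell
    grid[np.isinf(grid)] = np.nan              # mark empty cells
    return grid, origin
```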

 

RC18: In the methodology, after subsection 2.5, you need to add a separate subsection, or at least a small paragraph, dedicated to the statistical approach and the formulas that have been used.

AA18: Dear Reviewer, thank you for your suggestion. We added a paragraph concerning the statistical analysis to Subsection 2.5 (now relating to the field measurements) (Lines 245-262).
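
A minimal sketch of the summary statistics such a paragraph typically reports (illustrative only; the exact formulas used in Lines 245-262 are given in the revised paper):

```python
import numpy as np

def residual_stats(measured: np.ndarray, estimated: np.ndarray) -> dict:
    """Bias, dispersion, RMSE, and mean relative error of height residuals (m)."""
    d = estimated - measured
    return {
        "mean (bias)": float(d.mean()),
        "std": float(d.std(ddof=1)),
        "rmse": float(np.sqrt((d ** 2).mean())),
        "mean relative error %": float(100 * np.mean(np.abs(d) / measured)),
    }
```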

 

RC19: Have you considered using box and whisker plots to illustrate the variance of the measured and estimated height variables in both areas 1 and 2?

AA19: Dear Reviewer, thank you for the suggestion. We added a Figure (Figure 7) in the new version of the paper.
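
For reference, the suggested comparison takes only a few lines with Matplotlib (the height values below are hypothetical placeholders, not data from the paper):

```python
import matplotlib.pyplot as plt

measured = [3.1, 2.8, 3.4, 2.5, 3.0]    # hypothetical tree heights (m)
estimated = [2.9, 2.6, 3.3, 2.2, 2.8]   # hypothetical UAV-derived heights (m)

plt.boxplot([measured, estimated])
plt.xticks([1, 2], ["Measured", "UAV-estimated"])
plt.ylabel("Tree height (m)")
plt.show()
```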

 

RC20: In line 356: replace the term "low trees" with a more technical term; you could say shorter trees, seedlings, or saplings (you mentioned in line 285 "heights ranging between 10 cm and 140 cm"). Try to be consistent elsewhere in the text.

AA20: Dear Reviewer, thank you for this suggestion. We replaced the term, trying to keep it consistent throughout the text.

 

RC21: Lines 277-289 are basically describing/repeating the methodology again. I strongly recommend deleting this part; instead, try to discuss your findings more in relation to other works.

AA21: Dear Reviewer, you are right. We removed this part and changed the related paragraph in the discussion.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

.

Author Response

Thank you

Reviewer 3 Report

Comments and Suggestions for Authors

The authors have put in significant effort to address all comments, aiming to enhance the overall presentation and quality of this paper. Consequently, I am fully satisfied with its current state.

Author Response

Thank you

 
