Classification of Complicated Urban Forest Acoustic Scenes with Deep Learning Models
Round 1
Reviewer 1 Report
The paper titled “Classification of complicated urban forest acoustic scenes with deep learning models” was reviewed. In general, the manuscript is very well-written, structured, and informative.
Comments to the authors:
I have only some minor comments on your work:
Overall, the title, abstract and the keywords correspond to the aims and objectives of the manuscript.
The introduction section is very well-prepared, structured and informative, and the aim of the research is clearly outlined and justified.
Line 45: The abbreviation “PAM” for passive acoustic monitoring has already been introduced in line 11 of the Abstract. There is no need to repeat it.
The Materials and Methods section is clear, descriptive, and very well presented in sufficient detail.
Line 284: Specify the table number in the text: “… (the OA in the table represents the highest overall …”.
Line 301: Move the title of Table 3 to next page.
Line 307: Move text “3.3. Comparison of different models’ ability to predict new data” to next page.
Line 348: Move text “4. Discussion” to next page.
The results obtained from the study are clearly presented, detailed, and properly discussed with relevant research works.
The reviewer recommends revising the conclusions. The second and third paragraphs are not a summary of the results but present the proposals of the authors' future activities.
The references cited are appropriate to the research topic.
This reviewed manuscript is very interesting and valuable.
Best regards,
Author Response
Comment 1: Line 45: The abbreviation “PAM” for passive acoustic monitoring has already been introduced in line 11 of the Abstract. There is no need to repeat it.
Thank you for pointing this out. We have removed redundant descriptions in the revised manuscript.
Comment 2: Line 284: Specify the table number in the text: “… (the OA in the table represents the highest overall …”.
Thank you for highlighting this omission, which we have now corrected by adding the table number to the manuscript: ‘(the OA in the Table 2 represents the highest overall accuracy obtained by the model up to the current training epoch)’. (Page 8 Lines 288-290)
Comment 3: Line 301: Move the title of Table 3 to next page.
Thank you for pointing this out. We have now adjusted the corresponding content layout. (Page 9 Lines 310-311)
Comment 4: Line 307: Move text “3.3. Comparison of different models’ ability to predict new data” to next page.
Thank you for pointing this out. We have now adjusted the corresponding content layout. (Page 10 Line 317)
Comment 5: Line 348: Move text “4. Discussion” to next page.
Thank you for pointing this out. We have now adjusted the corresponding content layout. (Page 12 Line 358)
Comment 6: The reviewer recommends revising the conclusions. The second and third paragraphs are not a summary of the results but present the proposals of the authors' future activities.
Thank you for pointing this out. We agree with your comment. We have now removed the second and third paragraphs and added our plans for future work to the first paragraph: ‘In future work, we will further optimize the performance of existing models and try to deploy them on front-end devices with automatic data upload function, thereby reducing labor costs.’ (Page 14 Lines 468-470)
Reviewer 2 Report
This manuscript explores a methodology for evaluating bioacoustic data more efficiently and economically. The introduction does a good job of providing the context for how PAM is used in ecoacoustics and in evaluating anthropogenic disturbance of natural soundscapes, which justifies its potential utility in sound ecology. The methodology employed seems effective at producing a program with high accuracy and recall. The effort the authors made to compare different training models is also very interesting. The discussion does a good job of highlighting the potential of comparing different deep learning models for recognizing biological acoustic scenes, and of identifying areas of potential application and further research.
Minor comments:
LL. 42: increasing should be increasingly
LL. 43: add "to" before understand
LL. 61-63: please rephrase to avoid excessive repetition
LL. 135: scientific names should be in italic
LL. 157: I suggest that the authors replace this table (its information content is already explicit in the main text) with a figure summarizing the methodology of this work (described in LL. 147-155).
LL. 442: I suggest expanding section "4.3. Comparison of related studies" slightly by commenting on recent work on other animal taxa living in forest environments that explores the same difficulties in applying PAM techniques to monitor animal populations.
I also think it is quite important to cite the work of Farina and Sueur in the Introduction section.
Author Response
Comment 1: LL. 42: increasing should be increasingly
Thank you for pointing this out. We have revised it in the manuscript. (Page 1 Lines 43-44)
Comment 2: LL. 43: add "to" before understand
Thank you for pointing this out. We have revised it in the manuscript. (Page 1 Line 44)
Comment 3: LL. 61-63: please rephrase to avoid excessive repetition
We agree with your comment. We revised the text and it now reads: ‘Many methods in environmental sound recognition come from the field of speech recognition’. (Page 2 Lines 65-66)
Comment 4: LL. 135: scientific names should be in italic
Thank you for pointing this out. We have revised it in the manuscript. (Page 3 Line 138)
Comment 5: LL. 157: I suggest that the authors replace this table (its information content is already explicit in the main text) with a figure summarizing the methodology of this work (described in LL. 147-155).
We agree with your assessment. We have moved this table to the appendix and used a figure to outline how the dataset is divided. (Page 4 Lines 160-161; Page 14 Lines 484-485)
Comment 6: LL. 442: I suggest expanding section "4.3. Comparison of related studies" slightly by commenting on recent work on other animal taxa living in forest environments that explores the same difficulties in applying PAM techniques to monitor animal populations.
We agree with your assessment. We have expanded our discussion in Section 4.3 to cover the application of PAM to Hainan gibbon monitoring: ‘In the study of other animal populations, Dufourq et al. [61] designed and trained a high-accuracy deep learning model for detecting the call of Hainan gibbon Nomascus hainanus in the massive data collected by PAM. In this way, the efficiency of wildlife conservation can be improved, but how to obtain enough call samples of the target species is also a problem (for example, the habitat is inaccessible or the population is reduced because the species is threatened).’ (Page 13 Lines 451-456)
Comment 7: I also think it is quite important to cite the work of Farina and Sueur in the Introduction section.
Thank you for your recommendation. We have cited the work of Farina and Sueur in the revised manuscript. (Page 1 Lines 42-43; Page 2 Lines 58-59)