Hierarchical Feature Association and Global Correction Network for Change Detection
Round 1
Reviewer 1 Report
The paper presents a hierarchical feature association and global correction network for change detection (HFA-GCN). The proposed model comprises a hierarchical feature association module (HFA) for effective use of hierarchical change information, and a global correction module (GC) for obtaining more accurate change features while naturally extracting global information. The experiments are carried out on the open CDD, LEVIR-CD, and GZ-CD datasets and show the effectiveness of hierarchical feature association and global correction. On the whole, the idea is interesting and innovative. The basic line of research is well clarified, and the experimental verification seems reasonable to me. I recommend accepting the manuscript after the following revisions have been considered.
1. In Figure 1, the red lines that run from the output of DI to the input of GC lack clear markings. Please revise them so that readers can understand the overall network more intuitively.
2. The tables are referred to inconsistently in the article. Although I understand that ‘Table 1’ and ‘Table I’ denote the same table, it is recommended to unify these two expressions.
3. In the first three experimental results tables of the article, the second-best results are labeled, but in the last table, only the best results are labeled. Please make the labeling consistent.
4. In Figure 2, the specific addition and multiplication operations need a detailed explanation. Please add a description or explanatory text to the diagram.
5. In Eq. (16), the identical term DIi appears on both sides of the equation. To express the iterative use of information, the two occurrences should be distinguished.
6. The most challenging aspect of change detection is capturing land cover changes at different scales. How did the authors address this problem in this work?
Minor editing of English language is required.
Author Response
Thank you for your suggestions. We have revised the manuscript and responded to each comment; please see the attachment.
Author Response File: Author Response.docx
Reviewer 2 Report
Dear authors,
Overall, you have provided a well-written and well-structured article. I have only some minor points that should be improved.
l. 39, 58, 76, 92, 108, 141, 151, 165, 169: These are all headings of some sort, but each is in a different format/style. They should match and help to clearly structure the text.
l. 175, (183,), 370, 384, 405, 479: References to sections and tables use Roman numerals (I, II, ...) and do not match the captions.
l. 265-269: Unknown or missing letters or symbols.
l. 317f: There is an error about active/passive voice in the sentence starting with "Secondly,..."
l. 387, 408: The information on the datasets should (additionally) be included above in the corresponding chapters 3.1.x.
Chapter 3.3.: The evaluation would benefit from a few examples for the different measurements (especially FP and FN), as I cannot tell if these are related to pixels, objects, or...?
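To illustrate the distinction this last comment asks about, here is a minimal sketch that assumes the measurements are counted per pixel on binary change maps (1 = changed, 0 = unchanged). The manuscript may define them differently, for example per object. The function names and the toy 3x3 masks are hypothetical and are not taken from the paper.

```python
def pixel_confusion(pred, truth):
    """Count TP, FP, FN, TN pixel by pixel over two equally sized binary masks."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # changed pixel correctly predicted as changed
            elif p:
                fp += 1      # unchanged pixel wrongly predicted as changed
            elif t:
                fn += 1      # changed pixel that the prediction missed
            else:
                tn += 1      # unchanged pixel correctly left unchanged
    return tp, fp, fn, tn


def scores(tp, fp, fn):
    """Derive precision, recall, F1 and IoU from pixel counts (0.0 if undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, f1, iou


# Hypothetical toy masks: assumed ground truth and assumed prediction.
truth = [[0, 1, 1],
         [0, 1, 0],
         [0, 0, 0]]
pred = [[0, 1, 0],
        [1, 1, 0],
        [0, 0, 0]]

tp, fp, fn, tn = pixel_confusion(pred, truth)
precision, recall, f1, iou = scores(tp, fp, fn)
```

Under this per-pixel assumption, an FP is a single unchanged pixel marked as changed and an FN is a single changed pixel that was missed. An object-level definition would instead count whole connected regions, which can give quite different values for the same maps.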
Author Response
Thank you for your suggestions. We have revised the manuscript and responded to each comment; please see the attachment.
Author Response File: Author Response.docx
Reviewer 3 Report
Change detection is a challenging task in the field of remote sensing and holds significant research value. Firstly, I would like to acknowledge the innovation and effort put into your article. The authors’ ideas are presented clearly, and the proposed Hierarchical Feature Association and Global Correction Network (HFA-GCN) can be divided into five parts: Backbone (feature extraction), HFA (hierarchical feature association), decoder, GC (global feature correction), and changedecoder (final change decoupler). Through the HFA module, the same-scale features from the dual-branch feature extraction are associated with adjacent features and gradually decoupled. The decoupled change map is post-processed, and the GC module, based on an attention mechanism, associates the change map with features at the same level of feature extraction and corrects the predictions. Finally, the changedecoder is used to decouple and obtain the final change map.
Secondly, the HFA module concatenates features at different time steps and supplements and enhances the current features with adjacent features. As far as I know, previous works in the same domain have only considered separate processing of features from different time steps. The joint processing strategy proposed in this article, which considers both spatial and temporal features during multi-level feature association, demonstrates a certain level of novelty. The GC module exploits the characteristics of the self-attention mechanism, associating the predicted map with encoder features to achieve global correlation and correction. This solves the problem of misalignment caused by feature amplification or reduction and is a very innovative idea. Moreover, the two modules proposed by the authors achieve the expected results. However, the article should provide a demonstration of the backbone and also include experimental data to support the statement that the model has a higher computational complexity. Additionally, the article contains multiple minor mistakes; the authors should aim to improve the article's rigor.
Here are my suggestions for the authors:
1. Some of the keywords provided are not mentioned in the abstract. It is recommended to refine the abstract accordingly.
2. Structural diagrams should be provided for the backbone, decoder, and changedecoder.
3. The references for the compared network models should be indicated, and it is recommended to verify the publication dates of the papers that describe those models.
4. The dataset introduction should include more detailed information about the characteristics of each dataset.
5. For the proposed backbone, it is suggested to include comparative experiments with other well-known backbones, such as ResNet.
6. It is recommended to include efficiency comparison experiments (FLOPs, Params) between the proposed model and other models after conducting ablation experiments.
7. Regarding the ablation experiments, specifically the GC-RIF experiment on the LEVIR-CD dataset, the GC-RIF performance, in terms of F1 and IOU, should be better than that of GC. However, the opposite result is reported for this dataset. It is advised to repeat the experiments to ensure the accuracy of the results, or to provide some analysis.
8. There is a contradiction between the initial reference to "FDCFNet" (Page 7, Lines 317 and 319) and the use of "HFA-GCN" in all the other sections, such as the Experiment and Analysis.
9. The introduction to compared models in section 3.2 is too brief. It is recommended to provide a brief explanation about their characteristics.
10. It is suggested to include an introduction to the adopted loss functions and provide their equations.
11. The article states that five evaluation metrics are referenced but only four are shown in the experimental figures. This should be corrected.
12. The ablation experiment section should provide clear explanations for both GC and GC-RIF, as the current wording is somewhat obscure.
13. The inconsistency between "HAF" in Table 4 and the previously introduced "HFA" should be resolved.
No further comments.
Author Response
Thank you for your suggestions. We have revised the manuscript and responded to each comment; please see the attachment.
Author Response File: Author Response.docx
Reviewer 4 Report
In this paper, the authors propose a method based on hierarchical feature association and a global correction network for change detection in earth observation. The proposed method is based on sound principles, and the experimental design is good. The experimental results shown are better than the state of the art. I consider the article suitable for publication after a few minor revisions:
Comments:
1. The numbers at Lines 76, 92, and 108 are the same as the first-level section numbers; this should be corrected.
2. The formulas are not standardized; variables should be in italic or bold type.
3. Fig. 2 and Fig. 3 are not referred to at a corresponding position in the text.
4. The positioning of “where” in the text should be made consistent. Some instances are flush with no leading blanks, such as at Line 263, while others have leading blanks, such as at Line 293.
5. Sections 3.3.1-3.3.3 do not need third-level numbering; they could instead be labeled (1) or (a).
6. “OURS” is more appropriate than “OUR” in the experiments.
7. There are some small mistakes in the manuscript, such as Line 418 “In Fig.5, c”, Line 441 “Fig. 6”, Line 454 “In Fig. e”, and so on.
8. The authors state that “the disadvantages of this model are high computational effort, many parameters, and high resource consumption.” However, no table or other evidence is provided to support or discuss this.
Author Response
Thank you for your suggestions. We have revised the manuscript and responded to each comment; please see the attachment.
Author Response File: Author Response.docx
Round 2
Reviewer 3 Report
The authors have made appropriate revisions based on the comments and improved the quality of the manuscript. I recommend accepting the manuscript in its current form.