Bias in Deep Neural Networks in Land Use Characterization for International Development
Round 1
Reviewer 1 Report
Dear Authors,
Many thanks for your manuscript submission to Journal of Remote Sensing. I appreciate your research work, while a few aspects are suggested for your edits in the revision process:
a) Add some quantitative results in Abstract section, and remove "keyword 1, 2, 3" in the current version.
b) Add a short paragraph to summarize major contributions of your work.
c) If possible, consider adding a Related Work section that compares the origins of your models and algorithms with parallel approaches, and then proceed to the Materials and Methods section.
d) If you present the parameters of the algorithms and the key procedures of your experiments in tables (that is, using tables to note the crucial details), that would be much better for review.
e) This paper lacks quantitative results in the experimental section, which calls for careful edits and substantial rework to make your framework convincing.
f) Evaluation metrics are not clearly presented; please specify each of them in the equations and the updated experiments (including an ablation study, if viable). Also, I think the authors need to show convincingly that their research scenario at least intersects with topics in remote sensing.
g) While the authors cited dozens of references from the last five years, a great number of recent works on deep convolutional neural networks (DCNN) and deep belief neural networks (DBNN), etc. (and their variations) are missing. Meanwhile, each citation should comply with the publication format of the MDPI journal Remote Sensing.
h) Articles published in Remote Sensing range from 15 to 30 pages, with 20 pages on average. This manuscript has only 9 pages of content plus 2 pages of references. I suggest the authors spend some time on the expected rework and comprehensively improve the quality of the manuscript.
Once again, thank you and we look forward to seeing your future success. Stay well and good luck!
Best wishes,
Yours faithfully,
Author Response
Dear reviewer,
We appreciate the time and effort you took to go through our manuscript and provide valuable comments.
We made our best efforts to address the topics raised in your review comments.
In particular, we made a substantial effort to supplement and expand the presentation of the Results section, as recommended.
Please find the detailed answers to your comments below and in the manuscript as well.
Best regards,
-----------------------------------------
Many thanks for your manuscript submission to Journal of Remote Sensing. I appreciate your research work, while a few aspects are suggested for your edits in the revision process:
a) Add some quantitative results in Abstract section, and remove "keyword 1, 2, 3" in the current version.
The abstract is updated to include some quantitative results.
Keywords are updated in the manuscript.
b) Add a short paragraph to summarize major contributions of your work.
Thanks for the comment.
Our study is, to the best of our knowledge, the first effort to quantify biases in AI for land use classification using remote sensing data.
We have updated the abstract with the following sentences.
“The framework used in our study to better understand biases in DNN models would be useful when Machine learning (ML) techniques are adopted in lieu of ground based data collection for international development programs. Because such programs aim to solve issues of social inequality to which MLs are only applicable when they are transparent and accountable.”
c) If possible, consider adding a Related Work section that compares the origins of your models and algorithms with parallel approaches, and then proceed to the Materials and Methods section.
Thanks for the suggestion. We updated the manuscript with a new section of 'Related Works' and provided relevant references.
d) If you present the parameters of the algorithms and the key procedures of your experiments in tables (that is, using tables to note the crucial details), that would be much better for review.
We have updated the manuscript with tables showing the detailed parameters and variables in Appendix A (Tables A1, A2, A3, and A4).
e) This paper lacks quantitative results in the experimental section, which calls for careful edits and substantial rework to make your framework convincing.
We have substantially expanded the Methods and Results sections with information that was not included in the previous version of the manuscript.
f) Evaluation metrics are not clearly presented; please specify each of them in the equations and the updated experiments (including an ablation study, if viable). Also, I think the authors need to show convincingly that their research scenario at least intersects with topics in remote sensing.
Thanks for the comments. The manuscript is updated to include confusion matrices for overall model performance (Table 2), accuracies for urban and rural areas (Table 3), and accuracies for different clusters (Table 5, Table A1, Table A2). Graphs of the learning curve from our model (Figure 2) are also added to the manuscript.
As now stated in the abstract, our work has implications for cases where AI-based land use classification of remote sensing data is used in place of ground observation, because it provides a framework to assess biases.
g) While the authors cited dozens of references from the last five years, a great number of recent works on deep convolutional neural networks (DCNN) and deep belief neural networks (DBNN), etc. (and their variations) are missing. Meanwhile, each citation should comply with the publication format of the MDPI journal Remote Sensing.
Thanks for the comment. We have updated the manuscript with relevant references. The updates are in Section 2, Related Works, of the manuscript.
h) Articles published in Remote Sensing range from 15 to 30 pages, with 20 pages on average. This manuscript has only 9 pages of content plus 2 pages of references. I suggest the authors spend some time on the expected rework and comprehensively improve the quality of the manuscript.
Although we initially submitted this paper as a ‘technical note’, we took the reviewer’s comments on board and expanded our manuscript significantly. We believe that the manuscript is now long enough to be considered an article.
Reviewer 2 Report
Review comments: Manuscript ID:-remotesensing-1287147
General Comment
This is an interesting manuscript about exploring ways to quantify bias in Deep Neural Networks in land use characterization. However, there are aspects that require improvement and clearer presentation before it can be considered for publication. Specific comments and suggestions are included below.
Specific comments and suggestions
L1-19: I suggest including a concluding sentence in the abstract.
L22-25: “Applications of popular machine learning algorithms such as Deep Neural Networks (DNN) on real-world problems are on the increase……,” for such a claim, I suggest adding more sources.
L25-26: add source(s)
L31-33: I suggest including the accuracy values from such earlier studies. This will help show by how much the present study improved the classification accuracy.
L33-37: Which one is more affected by the socio-economic dynamics? The land use or land cover? To my knowledge, land cover is more affected by the socio-economic dynamics of a given region. If this is the case, why did the authors ignore this and focus this study on identifying bias only for land use classification? I suggest justifying the subject of the study.
L44-46: I also suggest including the accuracy value here, for the reason mentioned under L31-33.
L48: I suggest defining the abbreviation “AI”. Though it is common, there may be a reader who does not know what it stands for.
L116-120: How were the resolution differences between the images and the corresponding socio-economic variables standardized?
L129: By how much did it outperform?
L132-133: “The entire satellite imagery dataset was randomly split as 80% of images for training (2191 images) and 20% for testing (685 images),” I suggest adding source(s) for the basis of such a split.
L165: add source(s) of the formula.
L169-189: I suggest presenting the accuracy results in a confusion matrix table, where we can see the producer's, user's, and overall accuracies.
L236-240: “For each cluster, we calculated the distribution of eight socioeconomic covariates: distance to a road (log transformed), distance to waterway (log transformed), distance to IUCN areas, VIIRS night-time light data (which is often used as a proxy for wealth), elevation of the area, slope of the area, female population between 5 and 10 years, and male population between 5 and 10 years,” these are methods, not results. I suggest moving them to the appropriate place in the method section and presenting the results directly here.
L273-319: This is not a discussion. It is an extension of the major findings. I suggest reworking this part: either merge the discussion and results sections, or move those paragraphs into the results section and focus on discussing the major findings in this section. Generally, the discussion part needs to address the following points: how this study will increase our knowledge base and inspire others to conduct further research, how the study results support (or do not support) the findings of earlier studies, and whether your findings agree with current knowledge and expectations.
Author Response
Dear reviewer,
We appreciate the time and effort you took to go through our manuscript and provide valuable comments.
We made our best efforts to address the topics raised in your review comments.
In particular, we made a substantial effort to supplement and expand the presentation of the Results section, as recommended.
Please find the detailed answers to your comments below and in the manuscript as well.
Best regards,
-----------------------------------------------
General Comment
This is an interesting manuscript about exploring ways to quantify bias in Deep Neural Networks in land use characterization. However, there are aspects that require improvement and clearer presentation before it can be considered for publication. Specific comments and suggestions are included below.
Specific comments and suggestions
L1-19: I suggest including a concluding sentence in the abstract.
Thanks for the comment. We have updated the abstract with the following concluding sentences.
“The framework used in our study to better understand biases in DNN models would be useful when Machine learning (ML) techniques are adopted in lieu of ground based data collection for international development programs. Because such programs aim to solve issues of social inequality to which MLs are only applicable when they are transparent and accountable.”
L22-25: “Applications of popular machine learning algorithms such as Deep Neural Networks (DNN) on real-world problems are on the increase……,” for such a claim, I suggest adding more sources.
We have added a new Section 2, Related Works, to provide more references to previous works using DNN.
L25-26: add source(s)
The manuscript is updated with new references (line 39).
L31-33: I suggest including the accuracy values from such earlier studies. This will help show by how much the present study improved the classification accuracy.
We have updated the manuscript with the overall accuracy from this reference.
“DNN for land cover classification has been explored extensively due to its objectivity in producing promising results such as high-resolution building footprints with considerably high validation accuracy of 0.93 (3,4).”
L33-37: Which one is more affected by the socio-economic dynamics? The land use or land cover? To my knowledge, land cover is more affected by the socio-economic dynamics of a given region. If this is the case, why did the authors ignore this and focus this study on identifying bias only for land use classification? I suggest justifying the subject of the study.
Thanks for the comment. We agree that land cover may be more strongly influenced by socio-economic dynamics. However, our study was carried out within the context of a project that uses the results of a land use classification for a development project (lines 90-94). In addition, our manuscript is submitted to a Special Issue of Remote Sensing on “Land Use Classification with GIS and Remote Sensing Data Based on AI Technology.” Our strong focus on land use classification therefore follows from this context, not from a belief that land cover is less affected by socio-economic dynamics.
L44-46: I also suggest including the accuracy value here, for the reason mentioned under L31-33.
The manuscript is updated with the accuracies from the reference.
L48: I suggest defining the abbreviation “AI”. Though it is common, there may be a reader who does not know what it stands for.
The manuscript is updated to spell out AI in full.
L116-120: How were the resolution differences between the images and the corresponding socio-economic variables standardized?
Thanks for the comment. We used the spatially standardized data from WorldPop. More details on the resolution and the links to the data sources can be found in Table A4.
L129: By how much did it outperform?
Table 1 is added for an easy comparison between the performance of different algorithms.
L132-133: “The entire satellite imagery dataset was randomly split as 80% of images for training (2191 images) and 20% for testing (685 images),” I suggest adding source(s) for the basis of such a split.
We followed a widely adopted validation practice, using a proportion between training and test samples that many researchers use (for example, Chollet 2017). The manuscript is updated with the reference.
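As an illustration of the kind of random train/test split discussed in this comment, here is a minimal sketch using only the Python standard library. The function name, the image identifiers, and the fixed seed are assumptions made for the example and are not taken from the manuscript. The counts quoted above (2191 and 685) sum to 2876 and correspond to roughly a 76/24 split, so the sketch takes the training fraction as a parameter rather than hard-coding it.

```python
import random


def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of `items` and cut it into train and test lists.

    `train_fraction` and `seed` are illustrative defaults, not values
    confirmed by the manuscript.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical image identifiers; 2876 = 2191 + 685, the total implied
# by the counts quoted in the reviewer comment.
image_ids = [f"img_{i:04d}" for i in range(2876)]
train_ids, test_ids = split_dataset(image_ids)
print(len(train_ids), len(test_ids))
```

Fixing the seed keeps the split reproducible, so that accuracies reported for the test set refer to the same images across runs.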
L165: add source(s) of the formula.
A reference (Foody and Giles) on land cover classification accuracy assessment has been added to the manuscript.
L169-189: I suggest presenting the accuracy results in a confusion matrix table, where we can see the producer's, user's, and overall accuracies.
The manuscript is updated to include confusion matrices for overall model performance (Table 2), accuracies for urban and rural areas (Table 3), and accuracies for different clusters (Table 5, Table A1, Table A2).
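For readers less familiar with the accuracy terms requested here, the following minimal sketch shows how producer's, user's, and overall accuracies can be derived from a confusion matrix. The class labels and counts are hypothetical and are not results from the manuscript. The sketch assumes that rows hold the reference (ground-truth) class and columns hold the predicted class.

```python
def accuracy_summary(matrix, labels):
    """Return overall, producer's, and user's accuracies.

    matrix[i][j] is the number of samples whose reference class is
    labels[i] and whose predicted class is labels[j] (assumed layout).
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    overall = correct / total

    producers = {}  # correct / reference total (1 - omission error)
    users = {}      # correct / predicted total (1 - commission error)
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])
        col_total = sum(row[i] for row in matrix)
        producers[label] = matrix[i][i] / row_total if row_total else float("nan")
        users[label] = matrix[i][i] / col_total if col_total else float("nan")
    return overall, producers, users


# Hypothetical two-class example (illustrative counts only).
labels = ["school", "not_school"]
matrix = [
    [80, 20],  # reference: school
    [10, 90],  # reference: not_school
]
overall, producers, users = accuracy_summary(matrix, labels)
print(overall, producers, users)
```

Producer's accuracy corresponds to per-class recall and user's accuracy to per-class precision, which is why both are usually reported alongside the overall accuracy.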
L236-240: “For each cluster, we calculated the distribution of eight socioeconomic covariates: distance to a road (log transformed), distance to waterway (log transformed), distance to IUCN areas, VIIRS night-time light data (which is often used as a proxy for wealth), elevation of the area, slope of the area, female population between 5 and 10 years, and male population between 5 and 10 years,” these are methods, not results. I suggest moving them to the appropriate place in the method section and presenting the results directly here.
Thanks for the comment. We moved the paragraph to the method section, under 3.4 socio-economic covariates analysis.
L273-319: This is not a discussion. It is an extension of the major findings. I suggest reworking this part: either merge the discussion and results sections, or move those paragraphs into the results section and focus on discussing the major findings in this section. Generally, the discussion part needs to address the following points: how this study will increase our knowledge base and inspire others to conduct further research, how the study results support (or do not support) the findings of earlier studies, and whether your findings agree with current knowledge and expectations.
Thanks a lot for this comment. Originally this manuscript did not have separate results and discussion sections. We agree with the reviewer's point and take the recommendation to merge the two sections into one. We updated the manuscript by merging the ‘Results’ and ‘Discussion’ sections into ‘Results and Discussion’.
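The per-cluster covariate analysis described in the quoted passage could be sketched as follows. This is only an illustration under stated assumptions: the record layout, the field name `dist_road_m`, the use of the natural logarithm, and the choice of mean and standard deviation as summary statistics are all hypothetical and are not taken from the manuscript.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def summarize_by_cluster(records, covariate, log_transform=False):
    """Summarize one covariate per cluster.

    Each record is a dict with a "cluster" key and the covariate key.
    With log_transform=True the natural log is applied first, which
    assumes strictly positive values.
    """
    groups = defaultdict(list)
    for rec in records:
        value = rec[covariate]
        if log_transform:
            value = math.log(value)
        groups[rec["cluster"]].append(value)
    return {
        cluster: {"n": len(vals), "mean": mean(vals), "std": pstdev(vals)}
        for cluster, vals in sorted(groups.items())
    }


# Hypothetical records: cluster assignment plus distance to a road in metres.
records = [
    {"cluster": 0, "dist_road_m": 100.0},
    {"cluster": 0, "dist_road_m": 1000.0},
    {"cluster": 1, "dist_road_m": 10.0},
]
summary = summarize_by_cluster(records, "dist_road_m", log_transform=True)
print(summary)
```

The same function can be called once per covariate, with `log_transform=True` only for the distance variables that the passage describes as log transformed.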
Round 2
Reviewer 1 Report
Dear Authors,
Thanks a lot for your updated manuscript. After your careful edits, the overall quality of this paper has been significantly improved. There are a few minor issues (not limited to these), which I list below:
i) The original version cited each reference as "[ . ]", which I think is correct, while this version changed the format to "( . )", I don't know why. All other manuscripts I have reviewed use "[ . ]" or "[ . ] - [ . ]" or "[ . - . ]" when citing.
ii) Be sure to apply middle-alignment for all the related figures (and all of them should comply with the appropriate size and resolution in RS journal).
iii) Make the "k" of "k-means" italic, and use subscript of "1" for "F1-score".
iv) There are still some grammatical errors in the context of this version, and I think that the Conclusion section can be strengthened. Please apply.
v) References: some unprofessional citations of long conference persist in this manuscript, please check the RS Template to conduct proofreading.
We look forward to seeing your great success on acceptance. Good luck!
Best regards,
Yours faithfully,
Author Response
Dear reviewer,
Thanks a lot for your thorough review and kind words.
We appreciate your input throughout the review process which greatly helped to improve the quality of our manuscript.
Below, we address the comments; the manuscript has been updated accordingly.
Best,
Dohyung Kim
-------------------------------------------------------------------------
i) The original version cited each reference as "[ . ]", which I think is correct, while this version changed the format to "( . )", I don't know why. All other manuscripts I have reviewed use "[ . ]" or "[ . ] - [ . ]" or "[ . - . ]" when citing.
Thanks for the comment. We have changed the citation style following MDPI style.
ii) Be sure to apply middle-alignment for all the related figures (and all of them should comply with the appropriate size and resolution in RS journal).
We have used MDPI style for figures and, in response to the reviewer’s comment, we have applied middle-alignment to all the figures.
iii) Make the "k" of "k-means" italic, and use subscript of "1" for "F1-score".
Thanks for the comment. Updates are made in the manuscript accordingly.
iv) There are still some grammatical errors in the context of this version, and I think that the Conclusion section can be strengthened. Please apply.
We have gone through the manuscript and corrected the grammatical errors. We also updated the Conclusion section with the paragraphs below.
“At the beginning of this article, we argued that because of the lack of explainability and biases in the DNN base model for land use classification, it would demonstrate worse efficacy for the most vulnerable communities. Explainability is one of the most crucial factors of AI algorithms if they are to be applied in real world problems, but unfortunately, not many advances have been seen in the field of LULCC, especially for the applications on land use classification. Through a novel combination of techniques ranging from DNN to clustering algorithms, we explored and identified biases in AI, rendering the process of automatic identification of schools more transparent and explainable.
Through our study, we have identified three possible sources of bias in the DNN base models: the socioeconomic covariates used as proxy for socioeconomic development, the bias from the sample of images used to train the model and the bias introduced by the original dataset used to pre-train our deep neural network. One of the most critical findings from this study is that DNN based models could be least effective for the most vulnerable communities.
This finding is important when the DNN based results are used in a project to solve real world issues because such biases in DNN models can significantly undermine the effectiveness of development projects which are increasingly dependent on data and insights produced using DNN. As such, we envision that the framework of our study be applied to enlighten the errors and biases in DNN based models adopted in various types of humanitarian operations.
We have identified some lines of future work that could improve the bias detection framework presented in this paper. First, we may increase the numbers of clusters with more training samples to detect the more nuanced differences from the school buildings. Secondly, another approach worth considering would be to extract the identified school from each image (using the class activation gradient) and then training a new model with less noise only on the most important section of the image: the building itself. Furthermore, an ensemble model [57] could be used to take both kinds of features: the type of buildings and the landscape that surrounds it. It would be also fruitful to pursue further research about the relationship between a Deep Neural Network's performance and the socio-economic context of the landscapes where the subjects of the models are located, aiming to discover more geospatial variables linked with the performance of the model.
Finally, we hope that our study contributes in enhancing the awareness of the importance of the explainable and equitable algorithms for the development sector.”
v) References: some unprofessional citations of long conference persist in this manuscript, please check the RS Template to conduct proofreading.
We have gone through each citation and corrected errors.
Reviewer 2 Report
Review comments: Manuscript ID:- remotesensing-1287147
The manuscript (ID: remotesensing-1287147) entitled “Bias in Deep Neural Networks in Land Use Characterization for International Development” has gone through a significant revision compared to the earlier version. The major issues I raised have been taken into account.
The manuscript has merit for identifying ways to quantify biases in Deep Neural Networks for Land Use characterization with an example of identifying school buildings in Colombia.
The study uses datasets from satellite imagery and corresponding socio-economic variables to address the research objective. The findings identify three possible sources of bias, which can be used as a basis for future work. Thus, I recommend considering this manuscript for publication.
Author Response
Dear Reviewer,
Thanks a lot for your input throughout the review process.
We believe that the quality of our manuscript has been greatly improved thanks to your insightful comments.
We appreciate your time and effort once again.
Best,
Dohyung Kim