1. Introduction
With the expanding human population, human interventions in nature have intensified to fulfill diverse needs. Consequently, it has become crucial to monitor environmental changes to preserve wildlife and effectively manage human activities [1]. Field-based surveys are acknowledged as a primary method of detecting changes but are burdened with various drawbacks. They are time-consuming, require significant human resources for fieldwork, and have limited geographic coverage. These factors make it challenging to monitor changes using field-based techniques alone [2]. On the other hand, multi-temporal remote sensing images provide a cost-effective and efficient approach to monitoring changes in the Earth’s surface [3].
Change detection (CD) methods can be categorized into supervised and unsupervised approaches. Unsupervised methods directly detect changes without needing training samples, while supervised methods utilize training samples to identify changes [4]. Mishra et al. [5] studied land use and land cover changes in a Himalayan watershed using the maximum likelihood algorithm on Landsat-5 and Sentinel-2 images. Christaki et al. [6] applied Artificial Neural Networks to detect changes in UAV images after a catastrophic earthquake, focusing on textural features. Classical algorithms rely primarily on spectral information, which may yield less accurate outcomes, although spatial features can be added to improve identification accuracy. However, the manual extraction of spatial features is scene-specific and requires the careful selection of appropriate features from a range of options [7].
In contrast to classical feature extraction methods, deep learning networks can automatically extract high-level spectral–spatial information. As a result, the user’s involvement in determining and identifying suitable features is reduced. Furthermore, the extracted features will no longer depend on the image scene. Roy et al. [8] introduced a new framework based on convolutional neural networks (CNNs) in which deep spatial features extracted by a 2-D CNN were used as inputs for a 3-D CNN. Aghdami-Nia et al. [9] developed a modified version of the standard U-Net model, called the automatic coastline extraction framework, to enhance sea–land segmentation. Previous methods have primarily relied on single-scale CNNs, which limits their ability to capture the intricate multiscale spatial patterns inherent in images. Additionally, selecting the appropriate input patch size around each pixel requires precise user input.
The primary motivation behind our research is to improve the accuracy and comprehensiveness of change detection by automating the extraction of high-level information and overcoming the limitations of traditional CD methods. This study introduces a novel CD method based on a multiscale CNN (MSCNN) that exploits multiscale spectral–spatial features. The performance of the proposed model is evaluated against conventional techniques, namely the change vector analysis (CVA) method and a random forest (RF) algorithm. Comparative analyses demonstrate the superiority of our CNN-based CD method and provide valuable insights into its reliability and accuracy. The findings of this study have the potential to enhance wildlife conservation efforts, facilitate the effective management of human activities, and help broaden the effectiveness of remote sensing in environmental monitoring.
The paper is organized as follows: Section 2 outlines the research methodology, Section 3 discusses the experimental results, and Section 4 provides a summary of the conclusions.
2. Methodology
As mentioned, this study compares the proposed MSCNN CD method with two classical CD methods, CVA and RF. The workflow of the study is depicted in Figure 1. In the initial step, the images undergo geometric and radiometric preprocessing, which is a prerequisite for all subsequent analysis. In the next step, training and testing samples are collected from a manually generated ground truth image. Following this, change maps are generated using the three CD methods. Finally, the change maps are evaluated and compared. The following sections provide a summarized description of each of the employed CD methods.
2.1. CVA
CVA is an unsupervised technique that defines a change vector as the difference between two n-dimensional vectors in a feature space. These vectors represent two separate observations of the same pixel at different time points. The length of the change vector corresponds to the magnitude of the change event in the spectral feature space. The change magnitude (CM) can be quantified as follows:
$$CM = \sqrt{\sum_{j=1}^{n} \left( x_{2j} - x_{1j} \right)^{2}}$$
where $x_{ij}$ represents the digital number of band $j$ for the image of date $i$ ($i = 1, 2$) and $n$ is the number of bands [10]. Then, the Otsu thresholding technique [11] is employed to obtain a binary change map.
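As an illustration, the CVA magnitude and Otsu thresholding steps described above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the implementation used in this study: the function names (`change_magnitude`, `otsu_threshold`, `binary_change_map`), the 256-bin histogram, and the toy pixel values are all hypothetical, and images are represented as flat lists of per-pixel band vectors for brevity.

```python
import math


def change_magnitude(img_t1, img_t2):
    """Euclidean length of the change vector for every pixel.

    Each image is a list of pixels; each pixel is a list of band values.
    """
    return [
        math.sqrt(sum((b2 - b1) ** 2 for b1, b2 in zip(p1, p2)))
        for p1, p2 in zip(img_t1, img_t2)
    ]


def otsu_threshold(values, bins=256):
    """Threshold that maximizes the between-class variance (Otsu)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # constant input: nothing to separate
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_var, best_bin = -1.0, 0
    w0, sum0 = 0, 0.0
    for i, h in enumerate(hist):
        w0 += h              # pixels at or below bin i
        sum0 += i * h
        w1 = total - w0      # pixels above bin i
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_bin = var_between, i
    return lo + (best_bin + 1) * width  # upper edge of the best bin


def binary_change_map(img_t1, img_t2):
    """1 = change, 0 = no change, using CM and an Otsu threshold."""
    cm = change_magnitude(img_t1, img_t2)
    t = otsu_threshold(cm)
    return [1 if v > t else 0 for v in cm]


# Toy two-band example (hypothetical values, not from the study data).
t1 = [[10, 10], [10, 10], [10, 10], [10, 10]]
t2 = [[10, 10], [11, 10], [40, 50], [50, 40]]
cm = change_magnitude(t1, t2)
change_map = binary_change_map(t1, t2)
```

In this toy case the last two pixels have a much larger change magnitude than the first two, so the threshold separates them into the change class.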
2.2. RF
RF is a classifier that uses an ensemble of Classification and Regression Trees for prediction. The trees are generated using a bagging technique in which a subset of training samples is randomly selected with replacement. Consequently, certain samples may be drawn multiple times, while others may not be chosen at all [12]. The RF model is trained using the collected samples and is then applied to the stacked images to generate a change map.
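The sampling-with-replacement behaviour of bagging can be illustrated with a short sketch. It covers only the bootstrap step for a single tree, not a full RF classifier, and the helper names (`bootstrap_indices`, `out_of_bag`), the sample count, and the random seed are assumptions made for illustration.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement (the bag for one tree)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def out_of_bag(n_samples, bag):
    """Indices that were never drawn into the bag."""
    drawn = set(bag)
    return [i for i in range(n_samples) if i not in drawn]


rng = random.Random(0)   # fixed seed so the sketch is repeatable
n = 1000                 # hypothetical number of training samples
bag = bootstrap_indices(n, rng)
oob = out_of_bag(n, bag)

# Fraction of distinct samples that made it into the bag.
unique_fraction = len(set(bag)) / n
print(f"distinct samples in bag: {unique_fraction:.3f}")
print(f"out-of-bag samples: {len(oob)}")
```

For large sample counts, the expected fraction of distinct samples in a bag approaches 1 − 1/e (about 0.632), so roughly a third of the samples are left out of each tree.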
2.3. Multiscale CNN
CNNs have been extensively used in various remote sensing applications. CNNs utilize a shared-connection kernel to extract high-level spatial features. These networks comprise several types of layers, including convolutional, activation function, batch normalization, pooling, and fully connected layers [13]. As previously mentioned, single-scale CNNs struggle to capture multiscale information in remote sensing images, and determining the optimal patch size requires a time-consuming trial-and-error process [14]. This study introduces the MSCNN as a potential solution, incorporating a multiscale framework that eliminates the need to search for an optimal patch size and reduces reliance on a single value. The network architecture is depicted in Figure 2.
The change identification process uses separate 2-D CNNs with different input dimensions to classify the stacked bi-temporal Landsat images, and the Majority Voting (MV) algorithm is employed to integrate the results. Patch sizes of 3 × 3, 7 × 7, and 9 × 9 are used as inputs. The convolutional filters are ordered as [64, 128, 256], with a kernel size of 3 × 3. Batch normalization layers are used to address overfitting in the convolutional layers. The learning rate is set to 0.0001, and Adam is used as the optimizer.
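The multiscale input preparation and the MV fusion step can be sketched as follows. This is only an illustrative sketch: the CNNs themselves are omitted, the per-scale predictions are made-up values, the helper names (`extract_patch`, `majority_vote`) are hypothetical, and clamping to the nearest edge pixel is an assumed border policy, since the text does not state how image borders are handled.

```python
PATCH_SIZES = (3, 7, 9)  # patch sizes given in the text


def extract_patch(image, row, col, size):
    """Return a size x size patch centred on (row, col).

    `image` is a 2-D list (a single band for brevity). Coordinates that
    fall outside the image are clamped to the nearest edge pixel
    (assumed border policy).
    """
    half = size // 2
    n_rows, n_cols = len(image), len(image[0])
    return [
        [
            image[min(max(r, 0), n_rows - 1)][min(max(c, 0), n_cols - 1)]
            for c in range(col - half, col + half + 1)
        ]
        for r in range(row - half, row + half + 1)
    ]


def majority_vote(*label_maps):
    """Fuse binary label maps (flat lists of 0/1) pixel by pixel."""
    n_maps = len(label_maps)
    return [1 if 2 * sum(votes) > n_maps else 0 for votes in zip(*label_maps)]


# One patch per scale around the same pixel of a toy 10 x 10 image.
image = [[r * 10 + c for c in range(10)] for r in range(10)]
patches = {size: extract_patch(image, 0, 0, size) for size in PATCH_SIZES}

# Hypothetical outputs of the three single-scale networks for four pixels.
pred_3 = [1, 0, 1, 0]
pred_7 = [1, 0, 0, 0]
pred_9 = [0, 1, 1, 0]
fused = majority_vote(pred_3, pred_7, pred_9)
```

With three voters and binary labels, a pixel is marked as changed only when at least two of the three scales agree, so ties cannot occur.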
3. Experimental Results
3.1. Study Area and Dataset
This study utilizes Landsat-8 satellite images to evaluate the effectiveness of the proposed network in monitoring changes in the Sahand City area. Sahand City is situated in the East Azerbaijan province of Iran, at a longitude of 46°7′19.16″ E and a latitude of 37°56′18.41″ N. In response to the population growth of Tabriz, this city was established in 2007 as a measure of population control and city management. Sahand City, located 20 km southwest of Tabriz, has witnessed rapid development in recent decades, particularly after the construction of the Tabriz–Sahand highway, which has improved accessibility for residents of both cities. The Landsat images, acquired on 10 July 2013 and 1 August 2021, were obtained from Google Earth Engine. The geographic location of the studied area is depicted in Figure 3.
3.2. Result Analysis
Figure 4 visually presents the obtained binary change maps generated by the CVA, RF, and MSCNN techniques. The findings indicate that CVA incorrectly identified most areas as changes. In contrast, the RF and MSCNN techniques demonstrated superior performance in detecting changes.
The CVA method relies solely on pixel-based information and often exhibits inadequate efficacy in change detection. Supplying training samples to the change detection algorithm, i.e., using a supervised method, can significantly improve detection accuracy. To support the analysis, Figure 5 presents the confusion matrices of the binary change maps, computed against the ground truth data. Based on the confusion matrices, the CVA algorithm misclassified 190 pixels, while RF and MSCNN misclassified 132 and 50 pixels, respectively. These numbers provide valuable insights into the performance of each algorithm in terms of pixel classification accuracy.
Additional evaluation criteria, including precision, recall, F1 score, overall accuracy (O.A), and the kappa coefficient (K.C), are utilized to conduct a comprehensive and quantitative assessment of the results. Based on the assessment measures in Table 1, the proposed MSCNN approach achieves a precision of 89.58% in detecting changes, the highest among all methods. In contrast, CVA exhibits the lowest performance, with a precision of 60.42%. Similarly, on the other criteria, the proposed network outperforms both the RF and CVA algorithms. These findings highlight the superior performance of the proposed network compared to the other two methods.
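For reference, the evaluation criteria listed above can be computed from the four cells of a binary confusion matrix, as in the following minimal sketch. The function name `binary_metrics` and the counts used in the example are hypothetical and are not taken from Figure 5 or Table 1; the formulas are the standard definitions with "change" as the positive class.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard accuracy measures for a binary change map.

    tp/fp/fn/tn are the counts of true positives, false positives,
    false negatives, and true negatives ("change" is the positive class).
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    overall_accuracy = (tp + tn) / total
    # Agreement expected by chance, from the row and column totals.
    p_e = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (overall_accuracy - p_e) / (1 - p_e)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "overall_accuracy": overall_accuracy,
        "kappa": kappa,
    }


# Hypothetical counts chosen only to make the arithmetic easy to follow.
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The kappa coefficient corrects the overall accuracy for chance agreement, which is why it is lower than the overall accuracy whenever the classifier is imperfect.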
Sahand City has experienced notable transformations, particularly in converting barren lands into urban areas. Previously, this region was predominantly barren, providing an ideal location for urban development. The majority of changes observed in the area involve the construction of buildings and transportation routes.
4. Conclusions
Progress in remote sensing methodologies has greatly improved the monitoring of environmental changes, including in urban areas. This advancement has significantly enhanced our understanding of and ability to address ecosystem modifications, which also have notable economic implications. However, classical methods rely mainly on spectral information and can incorporate spatial information only through manually selected features, which limits their effectiveness in considering the spatial context of detected changes. On the other hand, deep learning-based techniques that extract spatial features automatically offer high accuracy in change detection. Due to the rapid population growth of Tabriz, Sahand City has undergone significant development in a short period to accommodate citizens. Therefore, examining changes in this city can provide valuable insights for better urban planning. To this end, this study compared a new deep learning-based CD approach with two classical CD methods, RF and CVA, to detect changes in Sahand City. Based on an evaluation of the results, the unsupervised CVA method had the lowest performance in CD. Employing the supervised RF algorithm can enhance change detection accuracy, but utilizing the MSCNN resulted in a remarkable 17% increase in the overall accuracy of the binary change map. The construction of buildings and new transportation infrastructure accounts for most of the changes in the area. This investigation challenges the widely held belief that simple algorithms can effectively detect changes. In contrast, the findings emphasize the importance and effectiveness of advanced deep learning techniques in substantially improving outcome accuracy.
Author Contributions
Conceptualization, S.T. and B.A.B.; methodology, S.T. and B.A.B.; software, S.T. and B.A.B.; validation, S.T. and B.A.B.; formal analysis, S.T. and B.A.B.; investigation, S.T. and B.A.B.; resources, S.T. and B.A.B.; data curation, S.T.; writing—original draft preparation, S.T.; writing—review and editing, S.T. and B.A.B.; visualization, S.T.; supervision, M.M. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Acknowledgments
We would like to thank the European Space Agency (ESA) for generously providing us with free access to Landsat 8 imagery.
Conflicts of Interest
The authors declare no conflicts of interest.
References
1. Shi, W.; Zhang, M.; Zhang, R.; Chen, S.; Zhan, Z. Change Detection Based on Artificial Intelligence: State-of-the-Art and Challenges. Remote Sens. 2020, 12, 1688.
2. Liu, S.; Marinelli, D.; Bruzzone, L.; Bovolo, F. A review of change detection in multitemporal hyperspectral images: Current techniques, applications, and challenges. IEEE Geosci. Remote Sens. Mag. 2019, 7, 140–158.
3. Wang, L.; Yan, J.; Mu, L.; Huang, L. Knowledge discovery from remote sensing images: A review. WIREs Data Min. Knowl. Discov. 2020, 10, e1371.
4. Chughtai, A.H.; Abbasi, H.; Karas, I.R. A review on change detection method and accuracy assessment for land use land cover. Remote Sens. Appl. Soc. Environ. 2021, 22, 100482.
5. Mishra, P.K.; Rai, A.; Rai, S.C. Land use and land cover change detection using geospatial techniques in the Sikkim Himalaya, India. Egypt. J. Remote Sens. Space Sci. 2020, 23, 133–143.
6. Christaki, M.; Vasilakos, C.; Papadopoulou, E.-E.; Tataris, G.; Siarkos, I.; Soulakellis, N. Building Change Detection Based on a Gray-Level Co-Occurrence Matrix and Artificial Neural Networks. Drones 2022, 6, 414.
7. Ji, M.; Liu, L.; Du, R.; Buchroithner, M.F. A Comparative Study of Texture and Convolutional Neural Network Features for Detecting Collapsed Buildings After Earthquakes Using Pre- and Post-Event Satellite Imagery. Remote Sens. 2019, 11, 1202.
8. Roy, S.K.; Krishna, G.; Dubey, S.R.; Chaudhuri, B.B. HybridSN: Exploring 3-D–2-D CNN Feature Hierarchy for Hyperspectral Image Classification. IEEE Geosci. Remote Sens. Lett. 2019, 17, 277–281.
9. Aghdami-Nia, M.; Shah-Hosseini, R.; Rostami, A.; Homayouni, S. Automatic coastline extraction through enhanced sea-land segmentation by modifying Standard U-Net. Int. J. Appl. Earth Obs. Geoinf. 2022, 109, 102785.
10. Nackaerts, K.; Vaesen, K.; Muys, B.; Coppin, P. Comparative performance of a modified change vector analysis in forest change detection. Int. J. Remote Sens. 2005, 26, 839–852.
11. Otsu, N. A Threshold Selection Method from Gray-Level Histograms. IEEE Trans. Syst. Man. Cybern. 1979, 9, 62–66.
12. Sheykhmousa, M.; Mahdianpari, M.; Ghanbari, H.; Mohammadimanesh, F.; Ghamisi, P.; Homayouni, S. Support Vector Machine Versus Random Forest for Remote Sensing Image Classification: A Meta-Analysis and Systematic Review. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 6308–6325.
13. Asghari Beirami, B.; Mokhtarzade, M. A new deep learning approach for hyperspectral image classification based on multifeature local kernel descriptors. Adv. Space Res. 2023, 72, 1703–1720.
14. Sharifi, O.; Mokhtarzadeh, M.; Asghari Beirami, B. A new deep learning approach for classification of hyperspectral images: Feature and decision level fusion of spectral and spatial features in multiscale CNN. Geocarto Int. 2021, 37, 4208–4233.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).