Multi-Class Double-Transformation Network for SAR Image Registration
Abstract
1. Introduction
- We utilize each key point directly as a class to design the multi-class model of SAR image registration, which avoids the difficulty of constructing positive instances (matched-point pairs) required by the traditional two-class (matched/unmatched) registration model.
- We design the double-transformation network with a coarse-to-fine structure, where the key points of the two images are used, respectively, to train two sub-networks that alternately predict the matched key points in the other image. This addresses the inconsistency between the categories of the training and testing sets.
- A precise-matching module is designed to reconcile the predictions of the two sub-networks and obtain consistent matched points, where the nearest points of each key point are introduced to refine the predicted matches (a minimal sketch of this cross-prediction and refinement scheme follows this list).
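To make the cross-prediction idea concrete, the sketch below treats each key point of one image as a class label for a classifier applied to patches of the other image, and then keeps only the pairs that survive a simple cycle-consistency check with a nearest-neighbour tolerance. This is a minimal illustration under assumptions, not the paper's implementation: `KeypointClassifier`, `mutual_refine`, the small MLP encoder, the `patch_dim` features, and the `radius` tolerance are placeholders, whereas the paper's sub-networks build on attention/Transformer backbones (see Section 2.2 and the backbone comparison table) and the actual refinement rule is the precise-matching module of Section 3.2.

```python
import numpy as np
import torch
import torch.nn as nn

class KeypointClassifier(nn.Module):
    """One sub-network: classifies the patch around a key point of one image
    into one of `num_classes` classes, where each class is a key point of the
    other image (so no matched/unmatched pairs have to be labelled)."""
    def __init__(self, patch_dim, num_classes, hidden=256):
        super().__init__()
        # Stand-in encoder; the paper's sub-networks use attention/Transformer backbones.
        self.net = nn.Sequential(
            nn.Linear(patch_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, patches):            # patches: (num_points, patch_dim)
        return self.net(patches)           # logits over the other image's key points

def mutual_refine(pred_a2b, pred_b2a, kpts_a, radius=3.0):
    """Illustrative stand-in for the precise-matching module: keep a pair (i, j)
    only if mapping j back through the second sub-network lands within `radius`
    pixels of key point i (a nearest-neighbour / cycle-consistency check)."""
    matches = []
    for i, j in enumerate(pred_a2b):       # sub-network 1: image A -> classes of B
        k = pred_b2a[j]                    # sub-network 2: image B -> classes of A
        if np.linalg.norm(kpts_a[i] - kpts_a[k]) <= radius:
            matches.append((i, int(j)))
    return matches

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_a, n_b, patch_dim = 40, 40, 64                          # toy sizes
    net_a2b = KeypointClassifier(patch_dim, num_classes=n_b)  # classes = key points of image B
    net_b2a = KeypointClassifier(patch_dim, num_classes=n_a)  # classes = key points of image A
    patches_a = torch.randn(n_a, patch_dim)                   # stand-in patch features
    patches_b = torch.randn(n_b, patch_dim)
    kpts_a = rng.uniform(0, 512, size=(n_a, 2))               # key-point coordinates in image A
    with torch.no_grad():
        pred_a2b = net_a2b(patches_a).argmax(dim=1).numpy()
        pred_b2a = net_b2a(patches_b).argmax(dim=1).numpy()
    print(mutual_refine(pred_a2b, pred_b2a, kpts_a))
```

The point of this structure is that each sub-network only needs the key points detected in the other image as its class set, so training requires no labelled matched/unmatched pairs; consistency between the two prediction directions is enforced afterwards by the refinement step.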
2. Related Works
2.1. The Attention Mechanism
2.2. The Transformer Model
3. The Proposed Method
3.1. The Multi-Class Double-Transformation Networks
3.1.1. Constructing Samples-Based Key Points
3.1.2. Multi-Class Double-Transformation Networks
3.2. The Precise-Matching Module
4. Experiments and Analyses
The registration performance is evaluated with the following indicators (symbol names as in the evaluation measures of [43]):
1. RMSE_all: the root mean square error of the registration result. Note that RMSE_all < 1 means that the performance reaches sub-pixel accuracy.
2. N: the number of matched-point pairs. A higher value may be beneficial for obtaining a transformation matrix with better registration performance.
3. RMSE_LOO: the error obtained based on the Leave-One-Out strategy and the root mean square error. For each point in the matched set, an error is computed from a transformation estimated on the remaining points, and RMSE_LOO is the average of these errors over all N points (see the sketch after this list).
4. P_quad: used to detect whether the retained feature points are evenly distributed over the quadrants of the image; its value should be less than a given threshold.
5. BPP: the bad-point proportion among the obtained matched-point pairs, where a point with a residual value above a certain threshold (r) is called a bad point.
6. Skew: the absolute value of the calculated correlation coefficient. The Spearman correlation coefficient is used in some cases; otherwise, the Pearson correlation coefficient is applied.
7. Scat: a statistical evaluation of the feature-point distribution over the entire image [43]; its value should also be less than a given threshold.
8. Φ: a linear combination of the above seven indicators. When P_quad cannot be obtained (shown as a dash in the tables), it is not used, and the combination is taken over the remaining indicators.
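As a concrete reference for the two RMSE-based indicators above, the following is a minimal sketch of their usual definitions, not the paper's evaluation code: it fits an affine transformation to the matched pairs by least squares and computes RMSE_all and the Leave-One-Out variant. The helper names `fit_affine`, `rmse_all`, and `rmse_loo` are illustrative assumptions.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine transform M (3x2) mapping src (N,2) onto dst (N,2)."""
    A = np.hstack([src, np.ones((src.shape[0], 1))])   # homogeneous coordinates
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M

def rmse_all(src, dst):
    """RMSE_all: root mean square residual over all matched pairs."""
    M = fit_affine(src, dst)
    res = np.hstack([src, np.ones((len(src), 1))]) @ M - dst
    return np.sqrt(np.mean(np.sum(res ** 2, axis=1)))

def rmse_loo(src, dst):
    """RMSE_LOO: each point is predicted by a transform fitted on the other points."""
    errs = []
    for i in range(len(src)):
        keep = np.arange(len(src)) != i
        M = fit_affine(src[keep], dst[keep])
        pred = np.append(src[i], 1.0) @ M
        errs.append(np.sum((pred - dst[i]) ** 2))
    return np.sqrt(np.mean(errs))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.uniform(0, 512, size=(30, 2))                       # toy matched points
    true_M = np.array([[1.01, 0.02], [-0.02, 0.99], [3.0, -2.0]]) # ground-truth affine
    dst = np.hstack([src, np.ones((30, 1))]) @ true_M + rng.normal(0, 0.3, (30, 2))
    print(f"RMSE_all = {rmse_all(src, dst):.4f}, RMSE_LOO = {rmse_loo(src, dst):.4f}")
```

Under these definitions, an RMSE_all below 1 pixel corresponds to the sub-pixel accuracy mentioned in indicator 1, and RMSE_LOO is typically slightly larger than RMSE_all because each point is predicted by a transformation estimated without it.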
4.1. Comparison and Analysis of the Experimental Results
4.2. The Visual Results of SAR Image Registration
4.3. Analyses on the Precise-Matching Module
4.4. Analyses on the Double-Transformation Network
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Fitch, J.P. Synthetic Aperture Radar; Springer Science & Business Media: New York, NY, USA, 2012.
- Joyce, K.E.; Samsonov, S.V.; Levick, S.R.; Engelbrecht, J.; Belliss, S. Mapping and monitoring geological hazards using optical, LiDAR, and synthetic aperture RADAR image data. Nat. Hazards 2014, 73, 137–163.
- Quartulli, M.; Olaizola, I.G. A review of EO image information mining. ISPRS J. Photogramm. Remote Sens. 2013, 75, 11–28.
- Wang, Y.; Du, L.; Dai, H. Unsupervised SAR image change detection based on SIFT keypoints and region information. IEEE Geosci. Remote Sens. Lett. 2016, 13, 931–935.
- Poulain, V.; Inglada, J.; Spigai, M. High-resolution optical and SAR image fusion for building database updating. IEEE Trans. Geosci. Remote Sens. 2011, 49, 2900–2910.
- Byun, Y.; Choi, J.; Han, Y. An Area-Based Image Fusion Scheme for the Integration of SAR and Optical Satellite Imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2013, 6, 2212–2220.
- Moser, G.; Serpico, S.B. Unsupervised Change Detection from Multichannel SAR Data by Markovian Data Fusion. IEEE Trans. Geosci. Remote Sens. 2009, 47, 2114–2128.
- Song, T.; Yi, S. Fast and Accurate Target Detection Based on Multiscale Saliency and Active Contour Model for High-Resolution SAR Images. IEEE Trans. Geosci. Remote Sens. 2016, 54, 5729–5744.
- Giusti, E.; Ghio, S.; Oveis, A.H.; Martorella, M. Proportional Similarity-Based OpenMax Classifier for Open Set Recognition in SAR Images. Remote Sens. 2022, 14, 4665.
- Schwind, P.; Suri, S.; Reinartz, P.; Siebert, A. Applicability of the SIFT operator to geometric SAR image registration. Int. J. Remote Sens. 2010, 31, 1959–1980.
- Wang, S.H.; You, H.J.; Fu, K. BFSIFT: A novel method to find feature matches for SAR image registration. IEEE Geosci. Remote Sens. Lett. 2012, 9, 649–653.
- Wang, S.; Quan, D.; Liang, X.; Ning, M.; Guo, Y.; Jiao, L. A deep learning framework for remote sensing image registration. ISPRS J. Photogramm. Remote Sens. 2018, 145, 148–164.
- Dellinger, F.; Delon, J.; Gousseau, Y.; Michel, J.; Tupin, F. SAR-SIFT: A SIFT-Like Algorithm for SAR Images. IEEE Trans. Geosci. Remote Sens. 2015, 53, 453–466.
- Wu, Y.; Miao, Q.; Ma, W.; Gong, M.; Wang, S. PSOSAC: Particle Swarm Optimization Sample Consensus Algorithm for Remote Sensing Image Registration. IEEE Geosci. Remote Sens. Lett. 2018, 15, 242–246.
- Liu, Y.; Zhou, Y.; Zhou, Y.; Ma, L.; Wang, B.; Zhang, F. Accelerating SAR Image Registration Using Swarm-Intelligent GPU Parallelization. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 5694–5703.
- Mao, S.; Yang, J.; Gou, S.; Jiao, L.; Xiong, T.; Xiong, L. Multi-Scale Fused SAR Image Registration Based on Deep Forest. Remote Sens. 2021, 13, 2227.
- Zhang, S.; Sui, L.; Zhou, R.; Xun, Z.; Du, C.; Guo, X. Mountainous SAR Image Registration Using Image Simulation and an L2E Robust Estimator. Sustainability 2022, 14, 9315.
- Gong, M.; Zhao, S.; Jiao, L.; Tian, D.; Wang, S. A novel coarse-to-fine scheme for automatic image registration based on SIFT and mutual information. IEEE Trans. Geosci. Remote Sens. 2013, 52, 4328–4338.
- Sollers, J.J.; Buchanan, T.W.; Mowrer, S.M.; Hill, L.K.; Thayer, J.F. Comparison of the ratio of the standard deviation of the RR interval and the root mean squared successive differences (SD/rMSSD) to the low frequency-to-high frequency (LF/HF) ratio in a patient population and normal healthy controls. Biomed. Sci. Instrum. 2007, 43, 158–163.
- Ma, W.; Zhang, J.; Wu, Y.; Jiao, L.; Zhu, H.; Zhao, W. A Novel Two-Step Registration Method for Remote Sensing Images Based on Deep and Local Features. IEEE Trans. Geosci. Remote Sens. 2019, 57, 4834–4843.
- Quan, D.; Wang, S.; Ning, M.; Xiong, T.; Jiao, L. Using deep neural networks for synthetic aperture radar image registration. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 10–15 July 2016; pp. 2799–2802.
- Ye, F.; Su, Y.; Xiao, H.; Zhao, X.; Min, W. Remote sensing image registration using convolutional neural network features. IEEE Geosci. Remote Sens. Lett. 2018, 15, 232–236.
- Mu, J.; Gou, S.; Mao, S.; Zheng, S. A Stepwise Matching Method for Multi-modal Image based on Cascaded Network. In Proceedings of the 29th ACM International Conference on Multimedia, Nice, France, 21–25 October 2021; pp. 1284–1292.
- Zou, B.; Li, H.; Zhang, L. Self-Supervised SAR Image Registration With SAR-Superpoint and Transformation Aggregation. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5201115.
- Mao, S.; Yang, J.; Gou, S.; Lu, K.; Jiao, L.; Xiong, T.; Xiong, L. Adaptive Self-Supervised SAR Image Registration with Modifications of Alignment Transformation. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5203715.
- Niu, Z.; Zhong, G.; Yu, H. A review on the attention mechanism of deep learning. Neurocomputing 2021, 452, 48–62.
- Chaudhari, S.; Mithal, V.; Polatkan, G.; Ramanath, R. An attentive survey of attention models. ACM Trans. Intell. Syst. Technol. 2021, 12, 53.
- Kim, Y.; Lee, J.; Lee, E.B.; Lee, J.H. Application of Natural Language Processing (NLP) and Text-Mining of Big-Data to Engineering-Procurement-Construction (EPC) Bid and Contract Documents. In Proceedings of the 6th Conference on Data Science and Machine Learning Applications (CDMA), Riyadh, Saudi Arabia, 4–5 March 2020; pp. 123–128.
- Xu, K.; Ba, J.; Kiros, R.; Cho, K.; Courville, A.; Salakhudinov, R.; Zemel, R.; Bengio, Y. Show, attend and tell: Neural image caption generation with visual attention. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 2048–2057.
- Guo, H.; Zheng, K.; Fan, X.; Yu, H.; Wang, S. Visual attention consistency under image transforms for multi-label image classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 729–739.
- Haut, J.M.; Paoletti, M.E.; Plaza, J.; Plaza, A.; Li, J. Visual attention-driven hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 8065–8080.
- Li, W.; Liu, K.; Zhang, L.; Cheng, F. Object detection based on an adaptive attention mechanism. Sci. Rep. 2020, 10, 11307.
- Zhu, Y.; Zhao, C.; Guo, H.; Wang, J.; Zhao, X.; Lu, H. Attention CoupleNet: Fully convolutional attention coupling network for object detection. IEEE Trans. Image Process. 2018, 28, 113–126.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
- Han, K.; Wang, Y.; Chen, H.; Chen, X.; Guo, J.; Liu, Z.; Tang, Y.; Xiao, A.; Xu, C.; Xu, Y.; et al. A Survey on Vision Transformer. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 87–110.
- Dai, Z.; Yang, Z.; Yang, Y.; Carbonell, J.; Le, Q.V.; Salakhutdinov, R. Transformer-XL: Attentive language models beyond a fixed-length context. arXiv 2019, arXiv:1901.02860.
- Chen, J.; Lu, Y.; Yu, Q.; Luo, X.; Adeli, E.; Wang, Y.; Lu, L.; Yuille, A.L.; Zhou, Y. TransUNet: Transformers make strong encoders for medical image segmentation. arXiv 2021, arXiv:2102.04306.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16 x 16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 10012–10022.
- Mansard, E.P.; Funke, E.R. The measurement of incident and reflected spectra using a least squares method. Coast. Eng. Proc. 1980, 17, 8.
- Touvron, H.; Cord, M.; Douze, M.; Massa, F.; Sablayrolles, A.; Jégou, H. Training data-efficient image transformers & distillation through attention. In Proceedings of the 38th International Conference on Machine Learning, PMLR, Virtual Event, 18–24 July 2021; pp. 10347–10357.
- Goncalves, H.; Goncalves, J.A.; Corte-Real, L. Measures for an Objective Evaluation of the Geometric Correction Process Quality. IEEE Geosci. Remote Sens. Lett. 2009, 6, 292–296.
- Lowe, D.G. Object recognition from local scale-invariant features. IEEE Int. Conf. Comput. Vis. 1999, 2, 1150–1157.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
Methods | N | RMSE_all | RMSE_LOO | P_quad | BPP | Skew | Scat | Φ |
---|---|---|---|---|---|---|---|---|
SIFT | 17 | 1.2076 | 1.2139 | — | 0.6471 | 0.1367 | 0.9991 | 0.7048 |
SAR-SIFT | 66 | 1.2455 | 1.2491 | 0.6300 | 0.6212 | 0.1251 | 0.9961 | 0.6784 |
VGG16-LS | 58 | 0.5611 | 0.5694 | 0.6665 | 0.2556 | 0.0389 | 1.0000 | 0.4420 |
ResNet50-LS | 68 | 0.4818 | 0.4966 | 0.7162 | 0.2818 | 0.1943 | 0.9766 | 0.4489 |
ViT-LS | 64 | 0.5218 | 0.5304 | 0.6101 | 0.2330 | 0.1072 | 1.0000 | 0.4296 |
DNN + RANSAC | 8 | 0.6471 | 0.6766 | – | 0.1818 | 0.0943 | 0.9766 | 0.4484 |
MSDF-Net | 39 | 0.4345 | 0.4893 | 0.6101 | 0.3124 | 0.1072 | 1.0000 | 0.4304 |
AdaSSIR | 47 | 0.4217 | 0.4459 | 0.6254 | 0.3377 | 0.1165 | 1.0000 | 0.4287 |
STDT-Net (Ours) | 78 | 0.4490 | 0.4520 | 0.6254 | 0.2277 | 0.1165 | 1.0000 | 0.4122 |
Rank/All | 1/10 | 3/10 | 2/10 | 2/7 | 2/10 | 4/10 | 4/4 | 1/10 |
Methods | N | RMSE_all | RMSE_LOO | P_quad | BPP | Skew | Scat | Φ |
---|---|---|---|---|---|---|---|---|
SIFT | 69 | 1.1768 | 1.1806 | 0.9013 | 0.6812 | | 0.9922 | 0.7010 |
SAR-SIFT | | 1.2487 | 1.2948 | 0.6016 | 0.6755 | 0.1274 | 0.9980 | 0.6910 |
VGG16-LS | 112 | 0.5604 | 0.5685 | 0.6150 | 0.3621 | 0.1271 | 1.0000 | 0.4626 |
ResNet50-LS | 120 | 0.4903 | 0.5064 | | 0.2515 | 0.1027 | 1.0000 | 0.4215 |
ViT-LS | 109 | 0.5276 | 0.5371 | 0.7162 | 0.2529 | 0.1105 | 1.0000 | 0.4472 |
DNN+RANSAC | 8 | 0.7293 | 0.7582 | – | 0.5000 | 0.1227 | 0.9766 | 0.5365 |
MSDF-Net | 12 | 0.4645 | 0.4835 | – | 0.4000 | 0.1175 | 0.9999 | 0.4356 |
AdaSSIR | 71 | 0.4637 | 0.4707 | 0.6013 | 0.4545 | 0.1072 | 1.0000 | 0.4504 |
STDT-Net (Ours) | 115 | | 0.4732 | 0.6740 | | 0.1175 | 1.0000 | |
Rank/All | 3/9 | 1/9 | 2/9 | 5/7 | 2/9 | 4/9 | 4/4 | 1/9 |
Methods | N | RMSE_all | RMSE_LOO | P_quad | BPP | Skew | Scat | Φ |
---|---|---|---|---|---|---|---|---|
SIFT | 11 | 0.9105 | 0.9436 | — | 0.5455 | 0.1055 | 0.9873 | 0.5908 |
SAR-SIFT | | 1.1424 | 1.2948 | 0.5910 | 0.7419 | 0.0962 | 1.0000 | 0.6636 |
VGG16-LS | 19 | 0.6089 | 0.6114 | — | 0.4211 | 0.1061 | 1.0000 | 0.4703 |
ResNet50-LS | 25 | 0.5725 | 0.5889 | 0.5814 | 0.6058 | 0.1387 | 1.0000 | 0.5102 |
ViT-LS | 20 | 0.5986 | 0.5571 | 0.5821 | 0.5875 | 0.1266 | 1.0000 | 0.5118 |
DNN+RANSAC | 10 | 0.8024 | 0.8518 | – | 0.6000 | 0.1381 | 0.9996 | 0.5821 |
MSDF-Net | 11 | 0.5923 | 0.6114 | – | 0.4351 | 0.0834 | 0.9990 | 0.4753 |
AdaSSIR | 20 | 0.5534 | 0.5720 | 0.5395 | 0.4444 | 0.1086 | 1.0000 | 0.4715 |
STDT-Net (Ours) | 24 | 0.5486 | | | 0.4038 | 0.1088 | 1.0000 | |
Rank/All | 3/9 | 1/9 | 1/9 | 2/7 | 1/9 | 6/9 | 4/4 | 1/9 |
Methods | N | RMSE_all | RMSE_LOO | P_quad | BPP | Skew | Scat | Φ |
---|---|---|---|---|---|---|---|---|
SIFT | 88 | 1.1696 | 1.1711 | 0.6399 | 0.7841 | 0.1138 | | 0.6757 |
SAR-SIFT | | 1.1903 | 1.1973 | 0.8961 | 0.8671 | 0.1318 | 1.0000 | 0.7390 |
VGG16-LS | 54 | 0.5406 | 0.5504 | 0.6804 | 0.3187 | 0.1277 | 1.0000 | 0.4607 |
ResNet50-LS | 70 | 0.5036 | 0.5106 | 0.7162 | 0.2778 | 0.1208 | 0.9999 | 0.4470 |
ViT-LS | 67 | 0.5015 | 0.5095 | | 0.2925 | 0.1281 | 1.0000 | 0.4356 |
DNN+RANSAC | 10 | 0.5784 | 0.5906 | – | 0.0000 | 0.1308 | 0.9999 | 0.3946 |
MSDF-Net | 52 | 0.5051 | 0.5220 | 0.6112 | 0.7692 | 0.1434 | 1.0000 | 0.5215 |
AdaSSIR | 68 | 0.4858 | 0.4994 | 0.6013 | 0.5714 | 0.1149 | 1.0000 | 0.4776 |
STDT-Net (Ours) | 79 | 0.4808 | 0.4954 | 0.6740 | 0.2692 | 0.1134 | 1.0000 | 0.4347 |
Rank/All | 3/9 | 1/9 | 1/9 | 5/7 | 2/9 | 1/9 | 4/4 | 2/9 |
Datasets | Branch | Without Precise-Matching | With Precise-Matching |
---|---|---|---|
Wuhan | R→S | 0.4598 | 0.4579 |
Wuhan | S→R | 0.4620 | 0.4590 |
YellowR1 | R→S | 0.5798 | 0.5525 |
YellowR1 | S→R | 0.5585 | 0.5535 |
YAMBA | R→S | 0.4788 | 0.4960 |
YAMBA | S→R | 0.4858 | 0.4763 |
YellowR2 | R→S | 0.5253 | 0.5185 |
YellowR2 | S→R | 0.5093 | 0.4960 |
Datasets | Performance | VGG16 | ResNet50 | ViT | Swin-Transformer |
---|---|---|---|---|---|
YellowR1 | (%) | 87.13 | 89.32 | 89.59 | 92.74 |
YellowR1 | (m) | 47 | 38 | 42 | 31 |
Wuhan | (%) | 89.26 | 92.71 | 91.10 | 94.83 |
Wuhan | (m) | 19 | 13 | 28 | 10 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).