DSFA-SwinNet: A Multi-Scale Attention Fusion Network for Photovoltaic Areas Detection
Abstract
1. Introduction
- The geographical distribution of PV installations is highly uneven and lacks real-time coordination strategies. As a result, fine-grained data cannot be integrated effectively, which constrains the maintenance of the facilities and the assessment of their eco-efficiency;
- The morphological complexity and textural diversity of PV areas make it challenging to accurately identify their detailed features;
- Existing methods focus on single spatial forms, overlooking the diverse scales of PV installations;
- As depicted in Figure 1, there is a significant scale difference between PV arrays and panels in images. While Swin-Transformer [49] mitigates computational complexity through windowing and a hierarchical design, its fixed window size limits multi-scale feature capture, leaving the model without internal multi-scale information [50,51,52];
- In feature learning, existing enhancement strategies do not sufficiently account for the specific characteristics of different PV areas, making it difficult to dynamically adapt to the multi-scale and multi-textural nature of PV features. Furthermore, they lack interpretability regarding hyperparameter tuning;
- Challenges arise as higher image resolutions lead to longer feature sequences, slowing down attention computations and reducing efficiency.
2. Materials and Methods
2.1. Materials
2.1.1. BDAPPV Dataset
2.1.2. Jiangsu PV Dataset
2.2. DSFA-SwinNet
2.2.1. Swin-Transformer Based U-Model Architecture
2.2.2. Dynamic Spatial-Frequency Attention
2.2.3. Pyramid Attention Refinement
2.2.4. Refined Skip Connection Strategy
2.2.5. Loss Function
3. Results
3.1. Experimental Setup
3.2. Evaluation Metrics
3.3. Preprocessing and Parameterization
3.3.1. Data Preprocessing
3.3.2. Hyperparameter Optimization
3.3.3. Experimental Parameter Settings
3.4. Comparison Evaluation
3.5. Complexity Analysis
3.6. Ablation Experiments
3.6.1. Ablation Experiments on Model Components
3.6.2. Ablation Experiments on Loss Functions
3.7. Cross-Subset Testing
- Training on the Google subset and predicting the test set from the PV03 subset;
- Training on the PV03 subset and predicting the test set from the Google subset.
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Construction of Photovoltaic Power Generation in 2023—National Energy Administration. Available online: http://www.nea.gov.cn/2024-02/28/c_1310765696.htm (accessed on 30 May 2024).
- Hernandez, R.R.; Easter, S.B.; Murphy-Mariscal, M.L.; Maestre, F.T.; Tavassoli, M.; Allen, E.B.; Barrows, C.W.; Belnap, J.; Ochoa-Hueso, R.; Ravi, S.; et al. Environmental Impacts of Utility-Scale Solar Energy. Renew. Sustain. Energy Rev. 2014, 29, 766–779. [Google Scholar] [CrossRef]
- Tawalbeh, M.; Al-Othman, A.; Kafiah, F.; Abdelsalam, E.; Almomani, F.; Alkasrawi, M. Environmental Impacts of Solar Photovoltaic Systems: A Critical Review of Recent Progress and Future Outlook. Sci. Total Environ. 2021, 759, 143528. [Google Scholar] [CrossRef] [PubMed]
- Levin, M.O.; Kalies, E.L.; Forester, E.; Jackson, E.L.A.; Levin, A.H.; Markus, C.; McKenzie, P.F.; Meek, J.B.; Hernandez, R.R. Solar Energy-Driven Land-Cover Change Could Alter Landscapes Critical to Animal Movement in the Continental United States. Environ. Sci. Technol. 2023, 57, 11499–11509. [Google Scholar] [CrossRef] [PubMed]
- Cheng, Y.; Wang, W.; Ren, Z.; Zhao, Y.; Liao, Y.; Ge, Y.; Wang, J.; He, J.; Gu, Y.; Wang, Y.; et al. Multi-Scale Feature Fusion and Transformer Network for Urban Green Space Segmentation from High-Resolution Remote Sensing Images. Int. J. Appl. Earth Obs. Geoinf. 2023, 124, 103514. [Google Scholar] [CrossRef]
- Manso-Callejo, M.-Á.; Cira, C.-I.; Arranz-Justel, J.-J.; Sinde-González, I.; Sălăgean, T. Assessment of the Large-Scale Extraction of Photovoltaic (PV) Panels with a Workflow Based on Artificial Neural Networks and Algorithmic Postprocessing of Vectorization Results. Int. J. Appl. Earth Obs. Geoinf. 2023, 125, 103563. [Google Scholar] [CrossRef]
- Yang, R.; He, G.; Yin, R.; Wang, G.; Zhang, Z.; Long, T.; Peng, Y.; Wang, J. A Novel Weakly-Supervised Method Based on the Segment Anything Model for Seamless Transition from Classification to Segmentation: A Case Study in Segmenting Latent Photovoltaic Locations. Int. J. Appl. Earth Obs. Geoinf. 2024, 130, 103929. [Google Scholar] [CrossRef]
- Kruitwagen, L.; Story, K.T.; Friedrich, J.; Byers, L.; Skillman, S.; Hepburn, C. A Global Inventory of Photovoltaic Solar Energy Generating Units. Nature 2021, 598, 604–610. [Google Scholar] [CrossRef] [PubMed]
- Zhang, X.; Xu, M.; Wang, S.; Huang, Y.; Xie, Z. Mapping Photovoltaic Power Plants in China Using Landsat, Random Forest, and Google Earth Engine. Earth Syst. Sci. Data 2022, 14, 3743–3755. [Google Scholar] [CrossRef]
- Jörges, C.; Vidal, H.S.; Hank, T.; Bach, H. Detection of Solar Photovoltaic Power Plants Using Satellite and Airborne Hyperspectral Imaging. Remote Sens. 2023, 15, 3403. [Google Scholar] [CrossRef]
- Yang, H.L.; Yuan, J.; Lunga, D.; Laverdiere, M.; Rose, A.; Bhaduri, B. Building Extraction at Scale Using Convolutional Neural Network: Mapping of the United States. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 2600–2614. [Google Scholar] [CrossRef]
- Li, F.; Dong, W.; Wu, W. A General Model for Comprehensive Electrical Characterization of Photovoltaics under Partial Shaded Conditions. Adv. Appl. Energy 2023, 9, 100118. [Google Scholar] [CrossRef]
- Lin, S.; Yao, X.; Liu, X.; Wang, S.; Chen, H.-M.; Ding, L.; Zhang, J.; Chen, G.; Mei, Q. MS-AGAN: Road Extraction via Multi-Scale Information Fusion and Asymmetric Generative Adversarial Networks from High-Resolution Remote Sensing Images under Complex Backgrounds. Remote Sens. 2023, 15, 3367. [Google Scholar] [CrossRef]
- Zhang, X.; Zeraatpisheh, M.; Rahman, M.M.; Wang, S.; Xu, M. Texture Is Important in Improving the Accuracy of Mapping Photovoltaic Power Plants: A Case Study of Ningxia Autonomous Region, China. Remote Sens. 2021, 13, 3909. [Google Scholar] [CrossRef]
- Xia, Z.; Li, Y.; Guo, X.; Chen, R. High-Resolution Mapping of Water Photovoltaic Development in China through Satellite Imagery. Int. J. Appl. Earth Obs. Geoinf. 2022, 107, 102707. [Google Scholar] [CrossRef]
- Plakman, V.; Rosier, J.; van Vliet, J. Solar Park Detection from Publicly Available Satellite Imagery. GISci. Remote Sens. 2022, 59, 462–481. [Google Scholar] [CrossRef]
- Malof, J.M.; Bradbury, K.; Collins, L.M.; Newell, R.G. Automatic Detection of Solar Photovoltaic Arrays in High Resolution Aerial Imagery. Appl. Energy 2016, 183, 229–240. [Google Scholar] [CrossRef]
- Chen, Z.; Kang, Y.; Sun, Z.; Wu, F.; Zhang, Q. Extraction of Photovoltaic Plants Using Machine Learning Methods: A Case Study of the Pilot Energy City of Golmud, China. Remote Sens. 2022, 14, 2697. [Google Scholar] [CrossRef]
- Li, Q.; Feng, Y.; Leng, Y.; Chen, D. SolarFinder: Automatic Detection of Solar Photovoltaic Arrays. In Proceedings of the 2020 19th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN), Sydney, Australia, 21–24 April 2020; pp. 193–204. [Google Scholar]
- Yuan, W.; Xu, W. MSST-Net: A Multi-Scale Adaptive Network for Building Extraction from Remote Sensing Images Based on Swin Transformer. Remote Sens. 2021, 13, 4743. [Google Scholar] [CrossRef]
- Ying, Z.; Li, M.; Tong, W.; Haiyong, C. Automatic Detection of Photovoltaic Module Cells Using Multi-Channel Convolutional Neural Network. In Proceedings of the 2018 Chinese Automation Congress (CAC), Xi’an, China, 30 November–2 December 2018; pp. 3571–3576. [Google Scholar]
- He, K.; Zhang, L. Automatic Detection and Mapping of Solar Photovoltaic Arrays with Deep Convolutional Neural Networks in High Resolution Satellite Images. In Proceedings of the 2020 IEEE 4th Conference on Energy Internet and Energy System Integration (EI2), Wuhan, China, 30 October–1 November 2020; pp. 3068–3073. [Google Scholar]
- Ishii, T.; Simo-Serra, E.; Iizuka, S.; Mochizuki, Y.; Sugimoto, A.; Ishikawa, H.; Nakamura, R. Detection by Classification of Buildings in Multispectral Satellite Imagery. In Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 4–8 December 2016; pp. 3344–3349. [Google Scholar]
- Shi, K.; Bai, L.; Wang, Z.; Tong, X.; Mulvenna, M.D.; Bond, R.R. Photovoltaic Installations Change Detection from Remote Sensing Images Using Deep Learning. In Proceedings of the IGARSS 2022—IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 17–22 July 2022; pp. 3231–3234. [Google Scholar]
- Parhar, P.; Sawasaki, R.; Todeschini, A.; Reed, C.; Vahabi, H.; Nusaputra, N.; Vergara, F. HyperionSolarNet: Solar Panel Detection from Aerial Images. arXiv 2022, arXiv:2201.02107. [Google Scholar]
- Jurakuziev, D.; Jumaboev, S.; Lee, M. A Framework to Estimate Generating Capacities of PV Systems Using Satellite Imagery Segmentation. Eng. Appl. Artif. Intell. 2023, 123, 106186. [Google Scholar] [CrossRef]
- Zhao, Z.; Chen, Y.; Li, K.; Ji, W.; Sun, H. Extracting Photovoltaic Panels From Heterogeneous Remote Sensing Images With Spatial and Spectral Differences. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 5553–5564. [Google Scholar] [CrossRef]
- Gasparyan, H.A.; Davtyan, T.A.; Agaian, S.S. A Novel Framework for Solar Panel Segmentation From Remote Sensing Images: Utilizing Chebyshev Transformer and Hyperspectral Decomposition. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–11. [Google Scholar] [CrossRef]
- Yu, J.; Wang, Z.; Majumdar, A.; Rajagopal, R. DeepSolar: A Machine Learning Framework to Efficiently Construct a Solar Deployment Database in the United States. Joule 2018, 2, 2605–2617. [Google Scholar] [CrossRef]
- Castello, R.; Roquette, S.; Esguerra, M.; Guerra, A.; Scartezzini, J.-L. Deep Learning in the Built Environment: Automatic Detection of Rooftop Solar Panels Using Convolutional Neural Networks. J. Phys. Conf. Ser. 2019, 1343, 012034. [Google Scholar] [CrossRef]
- Yuan, J.; Yang, H.-H.L.; Omitaomu, O.A.; Bhaduri, B.L. Large-Scale Solar Panel Mapping from Aerial Images Using Deep Convolutional Networks. In Proceedings of the 2016 IEEE International Conference on Big Data (Big Data), Washington, DC, USA, 5–8 December 2016; pp. 2703–2708. [Google Scholar]
- Moradi Sizkouhi, A.M.; Aghaei, M.; Esmailifar, S.M.; Mohammadi, M.R.; Grimaccia, F. Automatic Boundary Extraction of Large-Scale Photovoltaic Plants Using a Fully Convolutional Network on Aerial Imagery. IEEE J. Photovolt. 2020, 10, 1061–1067. [Google Scholar] [CrossRef]
- Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818. [Google Scholar]
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495. [Google Scholar] [CrossRef] [PubMed]
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 234–241. [Google Scholar]
- Dong, Y.; Yang, Z.; Liu, Q.; Zuo, R.; Wang, Z. Fusion of GaoFen-5 and Sentinel-2B Data for Lithological Mapping Using Vision Transformer Dynamic Graph Convolutional Network. Int. J. Appl. Earth Obs. Geoinf. 2024, 129, 103780. [Google Scholar] [CrossRef]
- Liang, M.; Zhang, X.; Yu, X.; Yu, L.; Meng, Z.; Zhang, X.; Jiao, L. An Efficient Transformer with Neighborhood Contrastive Tokenization for Hyperspectral Images Classification. Int. J. Appl. Earth Obs. Geoinf. 2024, 131, 103979. [Google Scholar] [CrossRef]
- Zhang, J.; Lin, S.; Ding, L.; Bruzzone, L. Multi-Scale Context Aggregation for Semantic Segmentation of Remote Sensing Images. Remote Sens. 2020, 12, 701. [Google Scholar] [CrossRef]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image Is Worth 16×16 Words: Transformers for Image Recognition at Scale. arXiv 2021, arXiv:2010.11929. [Google Scholar]
- Fu, W.; Xie, K.; Fang, L. Complementarity-Aware Local–Global Feature Fusion Network for Building Extraction in Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–13. [Google Scholar] [CrossRef]
- Roy, S.K.; Deria, A.; Hong, D.; Rasti, B.; Plaza, A.; Chanussot, J. Multimodal Fusion Transformer for Remote Sensing Image Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–20. [Google Scholar] [CrossRef]
- Chen, Y.; Zhou, J.; Ge, Y.; Dong, J. Uncovering the Rapid Expansion of Photovoltaic Power Plants in China from 2010 to 2022 Using Satellite Data and Deep Learning. Remote Sens. Environ. 2024, 305, 114100. [Google Scholar] [CrossRef]
- Guo, Z.; Lu, J.; Chen, Q.; Liu, Z.; Song, C.; Tan, H.; Zhang, H.; Yan, J. TransPV: Refining Photovoltaic Panel Detection Accuracy through a Vision Transformer-Based Deep Learning Model. Appl. Energy 2024, 355, 122282. [Google Scholar] [CrossRef]
- Hou, X.; Wang, B.; Hu, W.; Yin, L.; Wu, H. SolarNet: A Deep Learning Framework to Map Solar Power Plants In China From Satellite Imagery. arXiv 2019, arXiv:1912.03685. [Google Scholar]
- Zhu, R.; Guo, D.; Wong, M.S.; Qian, Z.; Chen, M.; Yang, B.; Chen, B.; Zhang, H.; You, L.; Heo, J.; et al. Deep Solar PV Refiner: A Detail-Oriented Deep Learning Network for Refined Segmentation of Photovoltaic Areas from Satellite Imagery. Int. J. Appl. Earth Obs. Geoinf. 2023, 116, 103134. [Google Scholar] [CrossRef]
- Wang, J.; Chen, X.; Shi, W.; Jiang, W.; Zhang, X.; Hua, L.; Liu, J.; Sui, H. Rooftop PV Segmenter: A Size-Aware Network for Segmenting Rooftop Photovoltaic Systems from High-Resolution Imagery. Remote Sens. 2023, 15, 5232. [Google Scholar] [CrossRef]
- Tan, M.; Luo, W.; Li, J.; Hao, M. TEMCA-Net: A Texture-Enhanced Deep Learning Network for Automatic Solar Panel Extraction in High Groundwater Table Mining Areas. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 2838–2848. [Google Scholar] [CrossRef]
- Kleebauer, M.; Marz, C.; Reudenbach, C.; Braun, M. Multi-Resolution Segmentation of Solar Photovoltaic Systems Using Deep Learning. Remote Sens. 2023, 15, 5687. [Google Scholar] [CrossRef]
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 10012–10022. [Google Scholar]
- Chen, Z.; Luo, Y.; Wang, J.; Li, J.; Wang, C.; Li, D. DPENet: Dual-Path Extraction Network Based on CNN and Transformer for Accurate Building and Road Extraction. Int. J. Appl. Earth Obs. Geoinf. 2023, 124, 103510. [Google Scholar] [CrossRef]
- Fan, J.; Shi, Z.; Ren, Z.; Zhou, Y.; Ji, M. DDPM-SegFormer: Highly Refined Feature Land Use and Land Cover Segmentation with a Fused Denoising Diffusion Probabilistic Model and Transformer. Int. J. Appl. Earth Obs. Geoinf. 2024, 133, 104093. [Google Scholar] [CrossRef]
- Liu, Y.; Gao, K.; Wang, H.; Yang, Z.; Wang, P.; Ji, S.; Huang, Y.; Zhu, Z.; Zhao, X. A Transformer-Based Multi-Modal Fusion Network for Semantic Segmentation of High-Resolution Remote Sensing Imagery. Int. J. Appl. Earth Obs. Geoinf. 2024, 133, 104083. [Google Scholar] [CrossRef]
- Kasmi, G.; Saint-Drenan, Y.-M.; Trebosc, D.; Jolivet, R.; Leloux, J.; Sarr, B.; Dubus, L. A Crowdsourced Dataset of Aerial Images with Annotated Solar Photovoltaic Arrays and Installation Metadata. Sci. Data 2023, 10, 59. [Google Scholar] [CrossRef]
- Jiang, H.; Yao, L.; Lu, N.; Qin, J.; Liu, T.; Liu, Y.; Zhou, C. Multi-Resolution Dataset for Photovoltaic Panel Segmentation from Satellite and Aerial Imagery. Earth Syst. Sci. Data 2021, 13, 5389–5401. [Google Scholar] [CrossRef]
- Ren, P.; Li, C.; Wang, G.; Xiao, Y.; Du, Q.; Liang, X.; Chang, X. Beyond Fixation: Dynamic Window Visual Transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2022; pp. 11987–11997. [Google Scholar]
- Qin, Z.; Zhang, P.; Wu, F.; Li, X. FcaNet: Frequency Channel Attention Networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 783–792. [Google Scholar]
- Chi, K.; Yuan, Y.; Wang, Q. Trinity-Net: Gradient-Guided Swin Transformer-Based Remote Sensing Image Dehazing and Beyond. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–14. [Google Scholar] [CrossRef]
- Berman, M.; Triki, A.R.; Blaschko, M.B. The Lovász-Softmax Loss: A Tractable Surrogate for the Optimization of the Intersection-Over-Union Measure in Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 4413–4421. [Google Scholar]
- Liaw, R.; Liang, E.; Nishihara, R.; Moritz, P.; Gonzalez, J.E.; Stoica, I. Tune: A Research Platform for Distributed Model Selection and Training. arXiv 2018, arXiv:1807.05118. [Google Scholar]
- Ishida, T.; Yamane, I.; Sakai, T.; Niu, G.; Sugiyama, M. Do We Need Zero Training Loss After Achieving Zero Training Error? arXiv 2021, arXiv:2002.08709. [Google Scholar]
- Cao, H.; Wang, Y.; Chen, J.; Jiang, D.; Zhang, X.; Tian, Q.; Wang, M. Swin-Unet: Unet-Like Pure Transformer for Medical Image Segmentation. In Proceedings of the Computer Vision—ECCV 2022 Workshops, Tel Aviv, Israel, 23–27 October 2022; Karlinsky, L., Michaeli, T., Nishino, K., Eds.; Springer Nature: Cham, Switzerland, 2023; pp. 205–218. [Google Scholar]
- Chen, J.; Lu, Y.; Yu, Q.; Luo, X.; Adeli, E.; Wang, Y.; Lu, L.; Yuille, A.L.; Zhou, Y. TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation. arXiv 2021, arXiv:2102.04306. [Google Scholar]
- Li, G.; Liu, Z.; Zeng, D.; Lin, W.; Ling, H. Adjacent Context Coordination Network for Salient Object Detection in Optical Remote Sensing Images. IEEE Trans. Cybern. 2023, 53, 526–538. [Google Scholar] [CrossRef]
- Ding, L.; Zheng, K.; Lin, D.; Chen, Y.; Liu, B.; Li, J.; Bruzzone, L. MP-ResNet: Multipath Residual Network for the Semantic Segmentation of High-Resolution PolSAR Images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [Google Scholar] [CrossRef]
- Li, R.; Zheng, S.; Duan, C.; Su, J.; Zhang, C. Multistage Attention ResU-Net for Semantic Segmentation of Fine-Resolution Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [Google Scholar] [CrossRef]
- Wu, H.; Huang, P.; Zhang, M.; Tang, W.; Yu, X. CMTFNet: CNN and Multiscale Transformer Fusion Network for Remote-Sensing Image Semantic Segmentation. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–12. [Google Scholar] [CrossRef]
- Xu, G.; Li, J.; Gao, G.; Lu, H.; Yang, J.; Yue, D. Lightweight Real-Time Semantic Segmentation Network with Efficient Transformer and CNN. IEEE Trans. Intell. Transp. Syst. 2023, 24, 15897–15906. [Google Scholar] [CrossRef]
- Xu, J.; Xiong, Z.; Bhattacharyya, S.P. PIDNet: A Real-Time Semantic Segmentation Network Inspired by PID Controllers. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 19529–19539. [Google Scholar]
- Qin, X.; Zhang, Z.; Huang, C.; Gao, C.; Dehghan, M.; Jagersand, M. BASNet: Boundary-Aware Salient Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 7479–7489. [Google Scholar]
Dataset | Ratio (PV:Not PV) |
---|---|
BDAPPV (Google) | 1:23.09 |
BDAPPV (IGN) | 1:75.43 |
PV01 | 1:1.32 |
PV03 | 1:0.89 |
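These PV:not-PV ratios quantify pixel-level class imbalance in each dataset. Such a ratio can be computed directly from binary label masks; the sketch below uses a synthetic toy mask rather than actual BDAPPV or PV03 data:

```python
def imbalance(mask):
    """Return bg/pv, i.e. the x in a '1:x' PV:not-PV ratio (1 = PV pixel)."""
    pv = sum(row.count(1) for row in mask)
    bg = sum(len(row) for row in mask) - pv
    return bg / pv

# Synthetic 4x4 mask: 2 PV pixels, 14 background pixels.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
x = imbalance(mask)
print(f"PV:not-PV = 1:{x:.2f}")  # prints "PV:not-PV = 1:7.00"
```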
Dataset | Input Size | Original Data | Processed Data | Training Set | Validation Set | Test Set |
---|---|---|---|---|---|---|
BDAPPV (Google) | 256 × 256 | 13,303 | 13,302 | 7981 | 2660 | 2661 |
PV03 | 256 × 256 | 2308 | 11,593 | 6955 | 2318 | 2320 |
Param | Search Space | Algorithm | Initial Values |
---|---|---|---|
batchsize | {2,4,8,16} | GS | 4 |
flooding | [0,1] | BO | 0.4 |
w1–w5 | [0,1] | BO | {1,1,1,1,1} |
α, β, γ | [0,1] | BO | {1,1,1} |
lr | [0.0001,0.1] | BO | 0.0001 |
momentum | [0.1,0.9] | BO | 0.9 |
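Assuming GS and BO in the table denote grid search and Bayesian optimization (the latter offered by, e.g., Ray Tune [101]), the hybrid scheme can be sketched as a grid loop over the discrete parameter with sampled trials for the continuous ones. The objective below is a hypothetical analytic surrogate standing in for an actual training run, and plain random sampling stands in for BO:

```python
import random

random.seed(0)

# Hypothetical surrogate objective: in practice each evaluation would train
# the model and return validation IoU; here it is a toy analytic function.
def objective(cfg):
    return (-(cfg["lr"] - 0.003) ** 2
            - 0.001 * (cfg["momentum"] - 0.9) ** 2
            - 0.01 * abs(cfg["batchsize"] - 16))

grid = [2, 4, 8, 16]                                   # batchsize: grid search (GS)
space = {"lr": (0.0001, 0.1), "momentum": (0.1, 0.9)}  # continuous: BO stand-in

best_score, best_cfg = float("-inf"), None
for bs in grid:                      # outer grid loop over the discrete param
    for _ in range(20):              # sampled trials per grid point
        cfg = {"batchsize": bs}
        cfg.update({k: random.uniform(lo, hi) for k, (lo, hi) in space.items()})
        score = objective(cfg)
        if score > best_score:
            best_score, best_cfg = score, cfg
print(best_cfg["batchsize"])  # prints 16 for this surrogate
```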
Param | Value |
---|---|
batchsize | 16 |
lr | 0.0030 |
flooding | 0.0528 |
momentum | 0.8000 |
weight_decay | 0.0001 |
α | 0.0419 |
β | 0.6464 |
γ | 0.6866 |
w1 | 0.0000 |
w2 | 0.6626 |
w3 | 0.4180 |
w4 | 1.0000 |
w5 | 1.0000 |
Model | Input Size | Number of Classification Heads |
---|---|---|
Swin-Unet | 256 × 256 | 1 |
TransUnet | 256 × 256 | 1 |
ACCoNet | 256 × 256 | 5 |
Unet | 256 × 256 | 1 |
MP-ResNet | 256 × 256 | 2 |
DeepLabV3+ | 256 × 256 | 2 |
MAResU-Net | 256 × 256 | 1 |
CMTFNet | 256 × 256 | 1 |
BASNet | 256 × 256 | 3 |
LETNet | 256 × 256 | 1 |
PIDNet-L | 256 × 256 | 3 |
DSFA-SwinNet | 256 × 256 | 5 |
Model | Precision | Recall | F1 | IoU |
---|---|---|---|---|
Swin-Unet | 72.91 | 89.23 | 78.64 | 67.05 |
TransUnet | 86.73 | 94.67 | 89.80 | 82.81 |
ACCoNet | 44.34 | 98.47 | 58.48 | 43.95 |
Unet | 82.31 | 95.45 | 87.62 | 79.47 |
MP-ResNet | 81.52 | 94.48 | 86.62 | 78.03 |
DeepLabV3+ | 86.24 | 95.82 | 90.28 | 83.25 |
MAResU-Net | 84.91 | 95.42 | 89.86 | 82.55 |
CMTFNet | 80.80 | 94.56 | 86.04 | 77.33 |
BASNet | 86.99 | 95.26 | 90.32 | 83.41 |
LETNet | 82.50 | 95.45 | 87.73 | 79.40 |
PIDNet-L | 78.69 | 94.58 | 84.83 | 75.25 |
DSFA-SwinNet | 85.97 | 96.56 | 90.44 | 83.50 |
Model | Precision | Recall | F1 | IoU |
---|---|---|---|---|
Swin-Unet | 94.10 | 94.67 | 94.00 | 89.43 |
TransUnet | 95.15 | 96.36 | 95.52 | 91.84 |
ACCoNet | 95.74 | 95.65 | 95.43 | 91.72 |
Unet | 95.17 | 96.14 | 95.37 | 91.64 |
MP-ResNet | 94.86 | 96.12 | 95.48 | 91.76 |
DeepLabV3+ | 94.69 | 96.81 | 95.52 | 91.81 |
MAResU-Net | 95.27 | 96.25 | 95.55 | 91.89 |
CMTFNet | 94.57 | 96.05 | 94.95 | 91.02 |
BASNet | 95.83 | 95.69 | 95.56 | 91.92 |
LETNet | 94.85 | 95.94 | 95.39 | 91.62 |
PIDNet-L | 94.52 | 95.62 | 94.68 | 90.65 |
DSFA-SwinNet | 95.24 | 96.42 | 95.57 | 92.00 |
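The Precision, Recall, F1, and IoU columns in the comparison tables above follow the standard pixel-wise definitions computed from true/false positives and false negatives; a minimal sketch on toy flat binary masks:

```python
def seg_metrics(pred, gt):
    """Pixel-wise Precision, Recall, F1 and IoU for flat binary masks (1 = PV)."""
    tp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(pred, gt) if p == 0 and g == 1)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)
    return precision, recall, f1, iou

pred = [1, 1, 1, 0, 0, 0, 1, 0]  # toy prediction
gt   = [1, 1, 0, 0, 1, 0, 1, 0]  # toy ground truth
precision, recall, f1, iou = seg_metrics(pred, gt)  # 0.75, 0.75, 0.75, 0.60
```

On this toy pair, tp = 3, fp = 1, fn = 1, so precision = recall = F1 = 0.75 and IoU = 3/5 = 0.60.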
Model | Training Duration (s) | Inference Time (s) | Parameters (M) | FLOPs (G) |
---|---|---|---|---|
Swin-Unet | 236.38 | 114.24 | 41.39 | 0.031 |
TransUnet | 317.59 | 135.06 | 105.32 | 3.88 |
ACCoNet | 2498.21 | 139.02 | 127.02 | 13.30 |
Unet | 235.00 | 83.50 | 13.40 | 10.68 |
MP-ResNet | 299.2 | 104.29 | 55.03 | 8.12 |
DeepLabV3+ | 195.92 | 112.28 | 46.62 | 3.99 |
MAResU-Net | 300.00 | 121.90 | 26.28 | 2.61 |
CMTFNet | 206.70 | 116.21 | 30.07 | 2.56 |
BASNet | 145.71 | 83.66 | 12.57 | 1.18 |
LETNet | 295.8 | 300.04 | 0.95 | 1.08 |
PIDNet-L | 224.46 | 309.87 | 37.30 | 2.16 |
DSFA-SwinNet | 168.78 | 108.38 | 25.88 | 0.99 |
Test | Skip Connection | DSFA | PAR | MLUH | DWA | Precision | Recall | F1 | IoU |
---|---|---|---|---|---|---|---|---|---|
1 | - | - | - | - | - | 76.23 | 89.04 | 80.95 | 69.58 |
2 | √ | - | - | - | - | 79.41 | 94.02 | 84.87 | 75.39 |
3 | √ | √ | - | - | - | 83.02 | 94.06 | 88.19 | 79.07 |
4 | √ | √ | √ | - | - | 83.37 | 94.71 | 88.67 | 80.43 |
5 | √ | √ | √ | √ | - | 84.54 | 95.37 | 89.63 | 82.69 |
6 | √ | √ | √ | √ | √ | 85.97 | 96.56 | 90.44 | 83.50 |
Test | Loss Function | Number of Classification Heads | Training Duration (s) | Precision | Recall | F1 | IoU |
---|---|---|---|---|---|---|---|
1 | αW | 5 | 177.15 | 74.39 | 96.19 | 82.34 | 72.10 |
2 | αW + βD + γL | 1 | 175.15 | 84.58 | 95.85 | 89.12 | 81.62 |
3 | αW + βD + γL | 5 | 185.13 | 85.97 | 96.56 | 90.44 | 83.50 |
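The combined loss αW + βD + γL in the ablation above is not spelled out in this excerpt. Assuming W, D, and L denote a weighted binary cross-entropy, a soft Dice term, and the binary Lovász hinge of Berman et al. [100] (an assumption consistent with the cited references, not a confirmed detail of the paper), a flat-binary sketch using the tuned α, β, γ from the parameter table might look like:

```python
import math

def weighted_bce(probs, labels, w_pos=2.0):
    # W: weighted binary cross-entropy; w_pos (assumed value) up-weights scarce PV pixels
    eps = 1e-7
    return sum(-(w_pos * y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
               for p, y in zip(probs, labels)) / len(labels)

def dice_loss(probs, labels):
    # D: soft Dice loss with +1 smoothing
    inter = sum(p * y for p, y in zip(probs, labels))
    return 1 - (2 * inter + 1) / (sum(probs) + sum(labels) + 1)

def lovasz_hinge(logits, labels):
    # L: binary Lovász hinge [100]: hinge errors sorted descending,
    # weighted by the gradient of the Lovász extension of the IoU loss
    signs = [2 * y - 1 for y in labels]
    errors = [1 - lg * s for lg, s in zip(logits, signs)]
    order = sorted(range(len(errors)), key=lambda i: -errors[i])
    inter = union = sum(labels)
    loss, prev = 0.0, 0.0
    for i in order:
        g = labels[i]
        inter -= g
        union += 1 - g
        jac = 1 - inter / union
        loss += max(errors[i], 0.0) * (jac - prev)
        prev = jac
    return loss

def combined_loss(logits, labels, alpha=0.0419, beta=0.6464, gamma=0.6866):
    # Defaults for alpha/beta/gamma taken from the tuned values in the parameter table
    probs = [1 / (1 + math.exp(-lg)) for lg in logits]
    return (alpha * weighted_bce(probs, labels)
            + beta * dice_loss(probs, labels)
            + gamma * lovasz_hinge(logits, labels))

good = combined_loss([5.0, -5.0, 5.0, -5.0], [1, 0, 1, 0])  # confident, correct
bad = combined_loss([-5.0, 5.0, -5.0, 5.0], [1, 0, 1, 0])   # confident, wrong
```

As a sanity check, the confidently correct prediction yields a much smaller loss than the confidently wrong one.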
Test | Training Set | Validation Set | Test Set | Precision | Recall | F1 | IoU |
---|---|---|---|---|---|---|---|
1 | Google | Google | PV03 | 74.32 | 81.73 | 74.87 | 67.04 |
2 | PV03 | PV03 | Google | 68.91 | 94.40 | 79.67 | 66.21 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lin, S.; Yang, Y.; Liu, X.; Tian, L. DSFA-SwinNet: A Multi-Scale Attention Fusion Network for Photovoltaic Areas Detection. Remote Sens. 2025, 17, 332. https://doi.org/10.3390/rs17020332