MASDF-Net: A Multi-Attention Codec Network with Selective and Dynamic Fusion for Skin Lesion Segmentation
Abstract
1. Introduction
- We propose a novel multi-attention encoder–decoder network with selective and dynamic fusion, named MASDF-Net, which effectively addresses the challenges of segmenting skin lesions with irregular shapes, blurry boundaries, and noise interference.
- We design the MAF module based on multi-attention mechanisms, aiming to enhance the network’s focus on global context information at deeper layers from multiple attention perspectives.
- To enhance the skip connections of the U-shaped network, we design the SIG module based on cross-attention. It learns to propagate rich positional information from low-level features and semantic information from high-level features, narrowing the semantic gap between the encoder and decoder.
- We design the MSCF module to dynamically fuse features of different scales in the decoder stage, improving the final segmentation results.
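To make the cross-attention idea behind the SIG module concrete, the sketch below shows queries drawn from high-level (decoder) features attending over keys/values from low-level (encoder) features. This is a minimal illustrative toy, not the paper's implementation: the projection matrices are random stand-ins, and the function name, shapes, and `d_k` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_skip(low, high, d_k=32, seed=0):
    """Cross-attention over a skip connection: queries come from the
    high-level (decoder) features, keys/values from the low-level
    (encoder) features, so semantic context selects which spatial
    details to pass on. `low`, `high`: (N, C) flattened feature tokens."""
    rng = np.random.default_rng(seed)
    c = low.shape[1]
    w_q = rng.standard_normal((c, d_k)) / np.sqrt(c)  # random stand-in projections
    w_k = rng.standard_normal((c, d_k)) / np.sqrt(c)
    w_v = rng.standard_normal((c, c)) / np.sqrt(c)
    q, k, v = high @ w_q, low @ w_k, low @ w_v
    attn = softmax(q @ k.T / np.sqrt(d_k), axis=-1)   # (N, N), each row sums to 1
    return attn @ v                                   # (N, C) re-weighted low-level detail

# Toy usage: 16 tokens with 64 channels from each stream.
rng = np.random.default_rng(1)
low = rng.standard_normal((16, 64))
high = rng.standard_normal((16, 64))
fused = cross_attention_skip(low, high)
print(fused.shape)  # (16, 64)
```

In a trained network the projections would be learned, and the attended low-level features would typically be merged back into the decoder path (e.g., by concatenation), which is where the "selective gathering" effect comes from.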
2. Related Work
2.1. U-Net and Its Variants
2.2. Vision Transformer
2.3. Transformer in Medical Image Segmentation
3. Methodology
3.1. Network Architecture
3.2. Multi-Attention Fusion Module
3.3. Selective Information Gathering Module
3.4. Multi-Scale Cascade Fusion Module
4. Experiments
4.1. Datasets
4.2. Loss Function
4.3. Implementation Details
4.4. Evaluation Metrics
4.5. Comparison with Several Existing Methods
4.5.1. Results on ISIC 2016, ISIC 2017 and ISIC 2018
4.5.2. Cross-Dataset Testing
4.6. Ablation Study
5. Limitations and Future Work
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Rogers, H.W.; Weinstock, M.A.; Feldman, S.R.; Coldiron, B.M. Incidence estimate of nonmelanoma skin cancer (keratinocyte carcinomas) in the US population, 2012. JAMA Dermatol. 2015, 151, 1081–1086. [Google Scholar] [CrossRef] [PubMed]
- Karimkhani, C.; Dellavalle, R.P.; Coffeng, L.E.; Flohr, C.; Hay, R.J.; Langan, S.M.; Nsoesie, E.O.; Ferrari, A.J.; Erskine, H.E.; Silverberg, J.I.; et al. Global skin disease morbidity and mortality: An update from the global burden of disease study 2013. JAMA Dermatol. 2017, 153, 406–412. [Google Scholar] [CrossRef]
- Jerant, A.F.; Johnson, J.T.; Sheridan, C.D.; Caffrey, T.J. Early detection and treatment of skin cancer. Am. Fam. Physician 2000, 62, 357–368. [Google Scholar]
- Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017, 542, 115–118. [Google Scholar] [CrossRef] [PubMed]
- Hasan, M.K.; Ahamad, M.A.; Yap, C.H.; Yang, G. A survey, review, and future trends of skin lesion segmentation and classification. Comput. Biol. Med. 2023, 155, 106624. [Google Scholar] [CrossRef]
- Silveira, M.; Nascimento, J.C.; Marques, J.S.; Marcal, A.R.S.; Mendonca, T.; Yamauchi, S.; Maeda, J.; Rozeira, J. Comparison of segmentation methods for melanoma diagnosis in dermoscopy images. IEEE J. Sel. Top. Signal Process. 2009, 3, 35–45. [Google Scholar] [CrossRef]
- Garnavi, R.; Aldeen, M.; Celebi, M.E.; Varigos, G.; Finch, S. Border detection in dermoscopy images using hybrid thresholding on optimized color channels. Comput. Med. Imaging Graph. 2011, 35, 105–115. [Google Scholar] [CrossRef]
- Thanh, D.N.; Erkan, U.; Prasath, V.S.; Kumar, V.; Hien, N.N. A skin lesion segmentation method for dermoscopic images based on adaptive thresholding with normalization of color models. In Proceedings of the 2019 6th International Conference on Electrical and Electronics Engineering (ICEEE), Istanbul, Turkey, 16–17 April 2019; pp. 116–120. [Google Scholar]
- Wong, A.; Scharcanski, J.; Fieguth, P. Automatic skin lesion segmentation via iterative stochastic region merging. IEEE Trans. Inf. Technol. Biomed. 2011, 15, 929–936. [Google Scholar] [CrossRef]
- Xie, F.; Bovik, A.C. Automatic segmentation of dermoscopy images using self-generating neural networks seeded by genetic algorithm. Pattern Recogn. 2013, 46, 1012–1019. [Google Scholar] [CrossRef]
- Shelhamer, E.; Long, J.; Darrell, T. Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 640–651. [Google Scholar] [CrossRef] [PubMed]
- Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 833–851. [Google Scholar]
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495. [Google Scholar] [CrossRef]
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; pp. 234–241. [Google Scholar]
- Oktay, O.; Schlemper, J.; Folgoc, L.L.; Lee, M.; Heinrich, M.; Misawa, K.; Mori, K.; McDonagh, S.; Hammerla, N.Y.; Kainz, B.; et al. Attention U-Net: Learning where to look for the pancreas. arXiv 2018, arXiv:1804.03999. [Google Scholar]
- Yan, X.; Tang, H.; Sun, S.; Ma, H.; Kong, D.; Xie, X. AFTer-UNet: Axial fusion transformer unet for medical image segmentation. In Proceedings of the 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2022; pp. 3270–3280. [Google Scholar]
- Li, H.; Zhai, D.H.; Xia, Y. ERDUnet: An efficient residual double-coding unet for medical image segmentation. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 2083–2096. [Google Scholar] [CrossRef]
- Zhang, X.; Li, Q.; Li, W.; Guo, Y.; Zhang, J.; Guo, C.; Chang, K.; Lovell, N.H. FD-Net: Feature distillation network for oral squamous cell carcinoma lymph node segmentation in hyperspectral imagery. IEEE J. Biomed. Health Inform. 2024, 28, 1552–1563. [Google Scholar] [CrossRef] [PubMed]
- Zhao, X.; Xu, W. NFMPAtt-Unet: Neighborhood fuzzy c-means multi-scale pyramid hybrid attention unet for medical image segmentation. Neural Netw. 2024, 178, 106489. [Google Scholar] [CrossRef]
- Wang, X.; Girshick, R.; Gupta, A.; He, K. Non-local neural networks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7794–7803. [Google Scholar]
- Wang, H.; Cao, P.; Wang, J.; Zaiane, O.R. UCTransNet: Rethinking the skip connections in u-net from a channel-wise perspective with transformer. Proc. AAAI Conf. Artif. Intell. 2022, 36, 2441–2449. [Google Scholar] [CrossRef]
- Ni, J.; Mu, W.; Pan, A.; Chen, Z. FSE-Net: Rethinking the up-sampling operation in encoder-decoder structure for retinal vessel segmentation. Biomed. Signal Process. Control. 2024, 90, 105861. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.M.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008. [Google Scholar]
- Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M. Restormer: Efficient transformer for high-resolution image restoration. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–23 June 2022; pp. 5718–5729. [Google Scholar]
- Li, M.; Zhao, Y.; Gui, G.; Zhang, F.; Luo, B.; Yang, C.; Gui, W.; Chang, K.; Wang, H. Object detection on low-resolution images with two-stage enhancement. Knowl.-Based Syst. 2024, 299, 111985. [Google Scholar] [CrossRef]
- Liu, N.; Zhang, N.; Wan, K.; Shao, L.; Han, J. Visual saliency transformer. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, BC, Canada, 11–17 October 2021; pp. 4702–4712. [Google Scholar]
- Chen, J.; Lu, Y.; Yu, Q.; Luo, X.; Adeli, E.; Wang, Y.; Lu, L.; Yuille, A.L.; Zhou, Y. TransUNet: Transformers make strong encoders for medical image segmentation. arXiv 2021, arXiv:2102.04306. [Google Scholar]
- Zhang, Y.; Liu, H.; Hu, Q. TransFuse: Fusing transformers and cnns for medical image segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2021, Strasbourg, France, 27 September–1 October 2021; pp. 14–24. [Google Scholar]
- Lee, S.H.; Lee, S.; Song, B.C. Vision transformer for small-size datasets. arXiv 2021, arXiv:2112.13492. [Google Scholar]
- Wang, W.; Xie, E.; Li, X.; Fan, D.P.; Song, K.; Liang, D.; Lu, T.; Luo, P.; Shao, L. Pyramid vision transformer: A versatile backbone for dense prediction without convolutions. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 568–578. [Google Scholar]
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141. [Google Scholar]
- Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 11531–11539. [Google Scholar]
- Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the Computer Vision—ECCV 2018, Munich, Germany, 8–14 September 2018; pp. 3–19. [Google Scholar]
- Qin, Z.; Zhang, P.; Wu, F.; Li, X. FcaNet: Frequency channel attention networks. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, BC, Canada, 11–17 October 2021; pp. 763–772. [Google Scholar]
- Feng, S.; Zhao, H.; Shi, F.; Cheng, X.; Wang, M.; Ma, Y.; Xiang, D.; Zhu, W.; Chen, X. CPFNet: Context pyramid fusion network for medical image segmentation. IEEE Trans. Med. Imaging 2020, 39, 3008–3018. [Google Scholar] [CrossRef]
- Gu, R.; Wang, G.; Song, T.; Huang, R.; Aertsen, M.; Deprest, J.; Ourselin, S.; Vercauteren, T.; Zhang, S. CA-Net: Comprehensive attention convolutional neural networks for explainable medical image segmentation. IEEE Trans. Med. Imaging 2021, 40, 699–711. [Google Scholar] [CrossRef] [PubMed]
- Wu, Z.; Su, L.; Huang, Q. Cascaded partial decoder for fast and accurate salient object detection. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 3902–3911. [Google Scholar]
- Siddique, N.; Paheding, S.; Elkin, C.P.; Devabhaktuni, V. U-net and its variants for medical image segmentation: A review of theory and applications. IEEE Access 2021, 9, 82031–82057. [Google Scholar] [CrossRef]
- Zhou, Z.; Rahman Siddiquee, M.M.; Tajbakhsh, N.; Liang, J. UNet++: A nested u-net architecture for medical image segmentation. In Proceedings of the Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Granada, Spain, 20 September 2018; pp. 3–11. [Google Scholar]
- Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2261–2269. [Google Scholar]
- Gu, Z.; Cheng, J.; Fu, H.; Zhou, K.; Hao, H.; Zhao, Y.; Zhang, T.; Gao, S.; Liu, J. CE-Net: Context encoder network for 2D medical image segmentation. IEEE Trans. Med. Imaging 2019, 38, 2281–2292. [Google Scholar] [CrossRef] [PubMed]
- Sun, Y.; Dai, D.; Zhang, Q.; Wang, Y.; Xu, S.; Lian, C. MSCA-Net: Multi-scale contextual attention network for skin lesion segmentation. Pattern Recogn. 2023, 139, 109524. [Google Scholar] [CrossRef]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. In Proceedings of the International Conference on Learning Representations, Virtual, 3–7 May 2021. [Google Scholar]
- Touvron, H.; Cord, M.; Douze, M.; Massa, F.; Sablayrolles, A.; Jegou, H. Training data-efficient image transformers & distillation through attention. In Proceedings of the International Conference on Machine Learning, Online, 18–24 July 2021; Volume 139, pp. 10347–10357. [Google Scholar]
- Zheng, S.; Lu, J.; Zhao, H.; Zhu, X.; Luo, Z.; Wang, Y.; Fu, Y.; Feng, J.; Xiang, T.; Torr, P.H.; et al. Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 6877–6886. [Google Scholar]
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, BC, Canada, 11–17 October 2021; pp. 9992–10002. [Google Scholar]
- Wang, W.; Xie, E.; Li, X.; Fan, D.P.; Song, K.; Liang, D.; Lu, T.; Luo, P.; Shao, L. PVT v2: Improved baselines with pyramid vision transformer. Comput. Vis. Media 2022, 8, 1–10. [Google Scholar]
- Cao, H.; Wang, Y.; Chen, J.; Jiang, D.; Zhang, X.; Tian, Q.; Wang, M. Swin-Unet: Unet-like pure transformer for medical image segmentation. In Proceedings of the European Conference on Computer Vision Workshops (ECCVW), Tel Aviv, Israel, 23–24 October 2022; pp. 205–218. [Google Scholar]
- Lin, A.; Chen, B.; Xu, J.; Zhang, Z.; Lu, G.; Zhang, D. DS-TransUNet: Dual swin transformer u-net for medical image segmentation. IEEE Trans. Instrum. Meas. 2022, 71, 1–15. [Google Scholar] [CrossRef]
- Li, Y.; Wang, Z.; Yin, L.; Zhu, Z.; Qi, G.; Liu, Y. X-Net: A dual encoding–decoding method in medical image segmentation. Vis. Comput. 2023, 39, 2223–2233. [Google Scholar] [CrossRef]
- Zhu, Z.; He, X.; Qi, G.; Li, Y.; Cong, B.; Liu, Y. Brain tumor segmentation based on the fusion of deep semantics and edge information in multimodal MRI. Inf. Fusion 2023, 91, 376–387. [Google Scholar] [CrossRef]
- Zhang, Z.; Sun, G.; Zheng, K.; Yang, J.K.; Zhu, X.R.; Li, Y. TC-Net: A joint learning framework based on cnn and vision transformer for multi-lesion medical images segmentation. Comput. Biol. Med. 2023, 161, 106967. [Google Scholar] [CrossRef]
- Cao, Y.; Xu, J.; Lin, S.; Wei, F.; Hu, H. GCNet: Non-local networks meet squeeze-excitation networks and beyond. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Republic of Korea, 27–28 October 2019; pp. 1971–1980. [Google Scholar]
- Huang, Z.; Wang, X.; Huang, L.; Huang, C.; Wei, Y.; Liu, W. CCNet: Criss-cross attention for semantic segmentation. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 603–612. [Google Scholar]
- Fan, D.P.; Ji, G.P.; Zhou, T.; Chen, G.; Fu, H.; Shen, J.; Shao, L. Pranet: Parallel reverse attention network for polyp segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2020, Virtual, 4–8 October 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 263–273. [Google Scholar]
- Fan, D.P.; Zhou, T.; Ji, G.P.; Zhou, Y.; Chen, G.; Fu, H.; Shen, J.; Shao, L. Inf-Net: Automatic covid-19 lung infection segmentation from ct images. IEEE Trans. Med. Imaging 2020, 39, 2626–2637. [Google Scholar] [CrossRef]
- Dong, B.; Wang, W.; Fan, D.P.; Li, J.; Fu, H.; Shao, L. Polyp-PVT: Polyp segmentation with pyramid vision transformers. CAAI Artif. Intell. Res. 2023, 2, 9150015. [Google Scholar] [CrossRef]
- Gutman, D.; Codella, N.C.F.; Celebi, E.; Helba, B.; Marchetti, M.; Mishra, N.; Halpern, A. Skin lesion analysis toward melanoma detection: A challenge at the International Symposium on Biomedical Imaging (ISBI) 2016, hosted by the International Skin Imaging Collaboration (ISIC). arXiv 2016, arXiv:1605.01397. [Google Scholar]
- Codella, N.C.F.; Gutman, D.; Celebi, M.E.; Helba, B.; Marchetti, M.A.; Dusza, S.W.; Kalloo, A.; Liopyris, K.; Mishra, N.; Kittler, H.; et al. Skin lesion analysis toward melanoma detection: A challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), hosted by the International Skin Imaging Collaboration (ISIC). In Proceedings of the 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), Washington, DC, USA, 4–7 April 2018; pp. 168–172. [Google Scholar]
- Codella, N.; Rotemberg, V.; Tschandl, P.; Celebi, M.E.; Dusza, S.; Gutman, D.; Helba, B.; Kalloo, A.; Liopyris, K.; Marchetti, M.; et al. Skin lesion analysis toward melanoma detection 2018: A challenge hosted by the International Skin Imaging Collaboration (ISIC). arXiv 2019, arXiv:1902.03368. [Google Scholar]
- Mendonça, T.; Ferreira, P.M.; Marques, J.S.; Marcal, A.R.S.; Rozeira, J. PH2 - A dermoscopic image database for research and benchmarking. In Proceedings of the 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Osaka, Japan, 3–7 July 2013; pp. 5437–5440. [Google Scholar]
- Dai, D.; Dong, C.; Xu, S.; Yan, Q.; Li, Z.; Zhang, C.; Luo, N. Ms RED: A novel multi-scale residual encoding and decoding network for skin lesion segmentation. Med. Image Anal. 2022, 75, 102293. [Google Scholar] [CrossRef] [PubMed]
- Zhang, W.; Lu, F.; Zhao, W.; Hu, Y.; Su, H.; Yuan, M. ACCPG-Net: A skin lesion segmentation network with adaptive channel-context-aware pyramid attention and global feature fusion. Comput. Biol. Med. 2023, 154, 106580. [Google Scholar] [CrossRef]
- Wei, J.; Wang, S.; Huang, Q. F3Net: Fusion, feedback and focus for salient object detection. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), New York, NY, USA, 7–12 February 2020; pp. 12321–12328. [Google Scholar]
- Zhang, X.; Li, W.; Gao, C.; Yang, Y.; Chang, K. Hyperspectral pathology image classification using dimension-driven multi-path attention residual network. Expert Syst. Appl. 2023, 230, 120615. [Google Scholar] [CrossRef]
- Li, L.; Ma, H.; Jia, Z. Change detection from sar images based on convolutional neural networks guided by saliency enhancement. Remote Sens. 2021, 13, 3697. [Google Scholar] [CrossRef]
- Zhao, X.; Liu, K.; Gao, K.; Li, W. Hyperspectral time-series target detection based on spectral perception and spatial-temporal tensor decomposition. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–12. [Google Scholar]
- Li, L.; Ma, H.; Zhang, X.; Zhao, X.; Lv, M.; Jia, Z. Synthetic aperture radar image change detection based on principal component analysis and two-level clustering. Remote Sens. 2024, 16, 1861. [Google Scholar] [CrossRef]
| Dataset | Type | Method | JI (%) ↑ | DSC (%) ↑ | ACC (%) ↑ | SE (%) ↑ | SP (%) ↑ | Para (M) ↓ | FLOPs (G) ↓ |
|---|---|---|---|---|---|---|---|---|---|
| ISIC 2016 | CNN | U-Net | 84.16 | 90.49 | 95.31 | 92.64 | 95.95 | 34.52 | 50.17 |
| | | AttU-Net | 84.55 | 90.85 | 95.57 | 93.61 | 95.63 | 34.87 | 51.01 |
| | | Deeplabv3+ | 86.29 | 92.17 | 96.19 | 92.30 | 96.54 | 39.76 | 33.19 |
| | | CE-Net | 84.99 | 91.37 | 95.49 | 93.24 | 95.17 | 29.00 | 6.81 |
| | | CPFNet | 84.66 | 91.05 | 95.47 | 92.78 | 96.11 | 30.65 | 6.18 |
| | | MSCA-Net | 86.41 | 92.09 | 96.02 | 94.08 | 95.58 | 27.09 | 9.02 |
| | Trans | TransFuse | 86.47 | 92.35 | 96.22 | 95.16 | 95.39 | 26.17 | 8.83 |
| | | Swin-Unet | 84.07 | 90.74 | 95.37 | 93.68 | 95.50 | 27.14 | 5.91 |
| | | UCTransNet | 84.91 | 91.12 | 95.63 | 94.85 | 95.33 | 66.24 | 32.98 |
| | | Polyp-PVT | 86.56 | 92.28 | 96.29 | 94.91 | 95.73 | 25.10 | 4.05 |
| | | MASDF-Net (Ours) | 87.35 | 92.98 | 96.68 | 93.73 | 96.78 | 28.07 | 5.04 |
| ISIC 2017 | CNN | U-Net | 75.60 | 84.19 | 93.07 | 84.16 | 96.78 | 34.52 | 50.17 |
| | | AttU-Net | 76.04 | 84.40 | 93.06 | 82.74 | 97.50 | 34.87 | 51.01 |
| | | Deeplabv3+ | 76.79 | 84.97 | 93.22 | 83.21 | 97.84 | 39.76 | 33.19 |
| | | CE-Net | 75.93 | 84.52 | 93.10 | 82.78 | 96.98 | 29.00 | 6.81 |
| | | CPFNet | 76.11 | 84.51 | 93.03 | 83.77 | 96.13 | 30.65 | 6.18 |
| | | MSCA-Net | 77.68 | 85.81 | 93.68 | 84.48 | 97.37 | 27.09 | 9.02 |
| | Trans | TransFuse | 78.81 | 86.60 | 94.32 | 85.58 | 97.01 | 26.17 | 8.83 |
| | | Swin-Unet | 74.45 | 83.07 | 92.62 | 81.75 | 96.76 | 27.14 | 5.91 |
| | | UCTransNet | 77.73 | 85.76 | 93.58 | 86.02 | 96.40 | 66.24 | 32.98 |
| | | Polyp-PVT | 76.72 | 85.17 | 93.80 | 81.92 | 97.70 | 25.10 | 4.05 |
| | | MASDF-Net (Ours) | 80.21 | 87.65 | 94.69 | 87.80 | 96.66 | 28.07 | 5.04 |
| ISIC 2018 | CNN | U-Net | 81.51 | 88.73 | 95.27 | 90.88 | 96.25 | 34.52 | 50.17 |
| | | AttU-Net | 81.88 | 89.11 | 95.26 | 92.50 | 95.28 | 34.87 | 51.01 |
| | | Deeplabv3+ | 82.59 | 89.56 | 95.93 | 88.98 | 97.13 | 39.76 | 33.19 |
| | | CE-Net | 82.43 | 89.38 | 95.51 | 91.69 | 95.36 | 29.00 | 6.81 |
| | | CPFNet | 81.23 | 88.41 | 94.81 | 92.47 | 94.64 | 30.65 | 6.18 |
| | | MSCA-Net | 83.46 | 90.17 | 96.10 | 90.97 | 96.66 | 27.09 | 9.02 |
| | Trans | TransFuse | 83.63 | 90.37 | 96.12 | 93.41 | 95.69 | 26.17 | 8.83 |
| | | Swin-Unet | 79.80 | 87.28 | 94.36 | 88.85 | 95.02 | 27.14 | 5.91 |
| | | UCTransNet | 82.06 | 89.07 | 95.42 | 90.95 | 96.22 | 66.24 | 32.98 |
| | | Polyp-PVT | 83.58 | 90.42 | 96.27 | 91.71 | 96.42 | 25.10 | 4.05 |
| | | MASDF-Net (Ours) | 84.80 | 91.22 | 96.61 | 92.66 | 96.65 | 28.07 | 5.04 |
| Type | Method | JI (%) ↑ | DSC (%) ↑ | ACC (%) ↑ | SE (%) ↑ | SP (%) ↑ |
|---|---|---|---|---|---|---|
| CNN | U-Net | 70.67 (−10.84) | 80.98 (−7.75) | 90.89 (−4.38) | 88.08 (−2.80) | 96.43 (+0.18) |
| | AttU-Net | 70.74 (−11.14) | 80.47 (−8.64) | 91.32 (−3.94) | 89.83 (−2.67) | 96.45 (+1.17) |
| | Deeplabv3+ | 74.28 (−8.31) | 83.10 (−6.46) | 93.06 (−2.87) | 88.63 (−0.35) | 97.07 (−0.06) |
| | CE-Net | 73.50 (−8.93) | 82.27 (−7.11) | 92.44 (−3.07) | 88.33 (−3.36) | 97.17 (+1.81) |
| | CPFNet | 74.80 (−6.43) | 83.63 (−4.78) | 92.73 (−2.08) | 90.08 (−2.39) | 96.66 (+2.02) |
| | MSCA-Net | 75.57 (−7.89) | 83.86 (−6.31) | 93.52 (−2.58) | 89.62 (−1.35) | 97.28 (+0.62) |
| Trans | TransFuse | 75.94 (−7.69) | 84.90 (−5.47) | 94.60 (−1.52) | 87.66 (−5.75) | 97.04 (+1.35) |
| | Swin-Unet | 74.89 (−4.91) | 83.06 (−4.22) | 92.85 (−1.51) | 90.13 (+1.28) | 97.36 (+2.34) |
| | UCTransNet | 74.16 (−7.90) | 83.58 (−5.49) | 92.70 (−2.72) | 89.22 (−1.73) | 96.48 (+0.26) |
| | Polyp-PVT | 75.39 (−8.19) | 84.06 (−6.36) | 93.56 (−2.71) | 88.72 (−2.99) | 96.37 (−0.05) |
| | MASDF-Net (Ours) | 79.64 (−5.16) | 87.66 (−3.56) | 97.00 (+0.39) | 90.45 (−2.21) | 98.28 (+1.63) |
ISIC 2018→PH2

| Type | Method | JI (%) ↑ | DSC (%) ↑ | ACC (%) ↑ | SE (%) ↑ | SP (%) ↑ |
|---|---|---|---|---|---|---|
| CNN | U-Net | 83.20 | 90.41 | 93.17 | 96.94 | 92.26 |
| | AttU-Net | 79.94 | 88.38 | 92.62 | 97.66 | 89.30 |
| | Deeplabv3+ | 82.62 | 90.27 | 94.01 | 98.78 | 89.81 |
| | CE-Net | 83.81 | 90.73 | 94.38 | 98.05 | 90.53 |
| | CPFNet | 81.81 | 89.34 | 93.32 | 97.92 | 89.49 |
| | MSCA-Net | 82.68 | 90.21 | 93.81 | 98.20 | 91.21 |
| Trans | TransFuse | 83.78 | 90.70 | 94.38 | 99.00 | 90.52 |
| | Swin-Unet | 82.44 | 89.87 | 93.55 | 97.07 | 90.83 |
| | UCTransNet | 83.10 | 90.33 | 93.83 | 96.94 | 91.88 |
| | Polyp-PVT | 83.94 | 90.80 | 94.72 | 98.65 | 91.19 |
| | MASDF-Net (Ours) | 84.64 | 91.38 | 95.06 | 98.98 | 91.90 |
| Method | JI (%) ↑ | DSC (%) ↑ | ACC (%) ↑ | SE (%) ↑ | SP (%) ↑ | Para (M) ↓ | FLOPs (G) ↓ |
|---|---|---|---|---|---|---|---|
| Baseline | 82.64 | 89.76 | 96.24 | 91.27 | 96.58 | 26.11 | 4.52 |
| Model 1 | 83.66 | 90.45 | 96.35 | 92.61 | 93.67 | 27.16 | 4.52 |
| Model 2 | 83.91 | 90.55 | 96.38 | 91.42 | 96.72 | 26.26 | 4.63 |
| Model 3 | 83.98 | 90.64 | 96.32 | 91.70 | 96.86 | 26.87 | 4.92 |
| Model 4 | 84.51 | 90.93 | 96.43 | 92.03 | 96.69 | 27.31 | 4.63 |
| Model 5 | 84.44 | 90.90 | 96.43 | 90.30 | 97.71 | 27.91 | 4.92 |
| Model 6 | 84.30 | 90.86 | 96.42 | 91.28 | 97.09 | 27.02 | 5.04 |
| Model 7 | 84.80 | 91.22 | 96.61 | 92.66 | 96.65 | 28.07 | 5.04 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Fu, J.; Deng, H. MASDF-Net: A Multi-Attention Codec Network with Selective and Dynamic Fusion for Skin Lesion Segmentation. Sensors 2024, 24, 5372. https://doi.org/10.3390/s24165372