Nonlinear Activation-Free Contextual Attention Network for Polyp Segmentation
Abstract
1. Introduction
- Instead of simply aggregating multi-layer feature maps into a coarse segmentation mask, the proposed network integrates the complementary information between layers with the contextual information within each layer. The resulting nonlinear activation-free uncertainty contextual attention network enhances the uncertain regions of the saliency map, which are highly correlated with boundary information.
- To handle polyps at varied locations, together with ambiguous regions that are easily confused with polyp tissue, a contextual attention module computes attention over regions with ambiguous saliency scores by jointly modeling the foreground and background areas. In addition, a simple parallel axial channel attention is proposed to correctly distinguish polyps from the background region.
- A nonlinear activation-free feature detail extraction and enhancement technique is introduced. Built on an improved U-shaped network with additional encoders and decoders, it fuses feature maps from multiple regions to explore the contextual features of each layer more accurately, improving both segmentation accuracy and computational efficiency.
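The "nonlinear activation-free" idea follows the simple-baselines line of work (Chen et al., "Simple baselines for image restoration"): conventional activations such as ReLU/GELU are replaced by a gate that splits the channels and multiplies the halves, combined with a simplified channel attention. The following is a minimal PyTorch sketch of such a block; the class and layer names are illustrative and not the paper's exact implementation:

```python
import torch
import torch.nn as nn

class SimpleGate(nn.Module):
    """Activation-free nonlinearity: split channels in half, multiply the halves."""
    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        return x1 * x2

class NAFBlockSketch(nn.Module):
    """Minimal activation-free block: pointwise conv -> depthwise conv ->
    SimpleGate -> simplified channel attention -> pointwise conv + residual."""
    def __init__(self, c):
        super().__init__()
        self.pw1 = nn.Conv2d(c, 2 * c, kernel_size=1)
        self.dw = nn.Conv2d(2 * c, 2 * c, kernel_size=3, padding=1, groups=2 * c)
        self.gate = SimpleGate()
        # simplified channel attention: global average pool + 1x1 conv, no activation
        self.sca = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Conv2d(c, c, kernel_size=1))
        self.pw2 = nn.Conv2d(c, c, kernel_size=1)

    def forward(self, x):
        y = self.pw1(x)        # c -> 2c channels
        y = self.dw(y)         # depthwise spatial mixing
        y = self.gate(y)       # 2c -> c channels, no ReLU/GELU anywhere
        y = y * self.sca(y)    # per-channel reweighting
        return x + self.pw2(y)

x = torch.randn(1, 16, 32, 32)
out = NAFBlockSketch(16)(x)
assert out.shape == x.shape
```

The point of the sketch is that every operation is a convolution, a pooling, or an elementwise multiply, so the block contains no explicit nonlinear activation function while remaining nonlinear through the gating product.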
2. Related Work
2.1. Medical Image Segmentation
2.2. Image Feature Details Extraction Enhancement
2.3. Polyp Segmentation
3. Method
3.1. Overall Architecture
3.2. Simple Parallel Axial Channel Attention
3.3. Nonlinear Activation-Free Network
3.4. Uncertainty Contextual Attention
4. Experimental Analysis
4.1. Experimental Datasets and Evaluation Metrics
4.2. Experimental Details
5. Results
5.1. Experimental Results under Different Methods
5.2. Ablation Experiment of Different Modules
6. Discussion
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; Springer International Publishing: Berlin/Heidelberg, Germany, 2015; pp. 234–241.
2. Oktay, O.; Schlemper, J.; Folgoc, L.L.; Lee, M.; Heinrich, M.; Misawa, K.; Mori, K.; McDonagh, S.; Hammerla, N.Y.; Kainz, B.; et al. Attention U-Net: Learning where to look for the pancreas. arXiv 2018, arXiv:1804.03999.
3. Haghighi, F.; Hosseinzadeh Taher, M.R.; Zhou, Z.; Gotway, M.B.; Liang, J. Learning semantics-enriched representation via self-discovery, self-classification, and self-restoration. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2020: 23rd International Conference, Lima, Peru, 4–8 October 2020; Springer International Publishing: Berlin/Heidelberg, Germany, 2020; pp. 137–147.
4. Bourouis, S.; Alroobaea, R.; Rubaiee, S.; Ahmed, A. Toward effective medical image analysis using hybrid approaches—Review, challenges and applications. Information 2020, 11, 155.
5. Tajbakhsh, N.; Gurudu, S.R.; Liang, J. Automated polyp detection in colonoscopy videos using shape and context information. IEEE Trans. Med. Imaging 2015, 35, 630–644.
6. Puyal, J.G.B.; Bhatia, K.K.; Brandao, P.; Ahmad, O.F.; Toth, D.; Kader, R.; Lovat, L.; Mountney, P.; Stoyanov, D. Endoscopic polyp segmentation using a hybrid 2D/3D CNN. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2020: 23rd International Conference, Lima, Peru, 4–8 October 2020; Springer International Publishing: Berlin/Heidelberg, Germany, 2020; pp. 295–305.
7. Feng, S.; Zhao, H.; Shi, F.; Cheng, X.; Wang, M.; Ma, Y.; Xiang, D.; Zhu, W.; Chen, X. CPFNet: Context pyramid fusion network for medical image segmentation. IEEE Trans. Med. Imaging 2020, 39, 3008–3018.
8. Song, J.; Chen, X.; Zhu, Q.; Shi, F.; Xiang, D.; Chen, Z.; Fan, Y.; Pan, L.; Zhu, W. Global and local feature reconstruction for medical image segmentation. IEEE Trans. Med. Imaging 2022, 41, 2273–2284.
9. Wen, Y.; Chen, L.; Deng, Y.; Zhang, Z.; Zhou, C. Pixel-wise triplet learning for enhancing boundary discrimination in medical image segmentation. Knowl. Based Syst. 2022, 243, 108424.
10. Zhou, Z.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J. UNet++: Redesigning skip connections to exploit multiscale features in image segmentation. IEEE Trans. Med. Imaging 2019, 39, 1856–1867.
11. Wang, Y.; Xi, Y.; Pan, X. Method for intestinal polyp segmentation by improving DeepLabv3+ network. J. Front. Comput. Sci. Technol. 2020, 14, 1243.
12. Zhou, Z.; Rahman Siddiquee, M.M.; Tajbakhsh, N.; Liang, J. UNet++: A nested U-Net architecture for medical image segmentation. In Proceedings of the Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support: 4th International Workshop, DLMIA 2018, and 8th International Workshop, ML-CDS 2018, Granada, Spain, 20 September 2018; Springer International Publishing: Berlin/Heidelberg, Germany, 2018; pp. 3–11.
13. Jha, D.; Smedsrud, P.H.; Riegler, M.A.; Johansen, D.; Lange, T.D.; Halvorsen, P. ResUNet++: An advanced architecture for medical image segmentation. In Proceedings of the 2019 IEEE International Symposium on Multimedia (ISM), San Diego, CA, USA, 9–11 December 2019; pp. 225–230.
14. Zhang, Z.; Liu, Q.; Wang, Y. Road extraction by deep residual U-Net. IEEE Geosci. Remote Sens. Lett. 2018, 15, 749–753.
15. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141.
16. Chen, L.C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking atrous convolution for semantic image segmentation. arXiv 2017, arXiv:1706.05587.
17. Cao, H.; Wang, Y.; Chen, J.; Jiang, D.; Zhang, X.; Tian, Q.; Wang, M. Swin-Unet: Unet-like pure transformer for medical image segmentation. arXiv 2021, arXiv:2105.05537.
18. Fan, D.P.; Ji, G.P.; Zhou, T.; Chen, G.; Fu, H.; Shen, J.; Shao, L. PraNet: Parallel reverse attention network for polyp segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2020: 23rd International Conference, Lima, Peru, 4–8 October 2020; Springer International Publishing: Berlin/Heidelberg, Germany, 2020; pp. 263–273.
19. Bardhi, O.; Sierra-Sosa, D.; Garcia-Zapirain, B.; Bujanda, L. Deep Learning Models for Colorectal Polyps. Information 2021, 12, 245.
20. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
21. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848.
22. Schlemper, J.; Oktay, O.; Schaap, M.; Heinrich, M.; Kainz, B.; Glocker, B.; Rueckert, D. Attention gated networks: Learning to leverage salient regions in medical images. Med. Image Anal. 2019, 53, 197–207.
23. Fu, J.; Liu, J.; Tian, H.; Li, Y.; Bao, Y.; Fang, Z.; Lu, H. Dual attention network for scene segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 3146–3154.
24. Yuan, Y.; Chen, X.; Wang, J. Object-contextual representations for semantic segmentation. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Springer International Publishing: Berlin/Heidelberg, Germany, 2020; pp. 173–190.
25. Tomassini, S.; Anbar, H.; Sbrollini, A.; Mortada, M.J.; Burattini, L.; Morettini, M. A Double-Stage 3D U-Net for On-Cloud Brain Extraction and Multi-Structure Segmentation from 7T MR Volumes. Information 2023, 14, 282.
26. Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H.; Shao, L. Multi-stage progressive image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 14821–14831.
27. Cho, S.J.; Ji, S.W.; Hong, J.P.; Jung, S.W.; Ko, S.J. Rethinking coarse-to-fine approach in single image deblurring. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual, 11–17 October 2021; pp. 4641–4650.
28. Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H. Restormer: Efficient transformer for high-resolution image restoration. arXiv 2021, arXiv:2111.09881.
29. Dauphin, Y.N.; Fan, A.; Auli, M.; Grangier, D. Language modeling with gated convolutional networks. In Proceedings of the International Conference on Machine Learning, PMLR, Sydney, Australia, 6–11 August 2017; pp. 933–941.
30. Chen, L.; Chu, X.; Zhang, X.; Sun, J. Simple baselines for image restoration. In Proceedings of the Computer Vision–ECCV 2022: 17th European Conference, Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; pp. 17–33.
31. Brandao, P.; Zisimopoulos, O.; Mazomenos, E.; Ciuti, G.; Bernal, J.; Scarzanella, M.V.; Menciassi, A.; Dario, P.; Koulaouzidis, A.; Arezzo, A.; et al. Towards a computed-aided diagnosis system in colonoscopy: Automatic polyp segmentation using convolution neural networks. J. Med. Robot. Res. 2018, 3, 1840002.
32. Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
33. Chen, S.; Tan, X.; Wang, B.; Hu, X. Reverse attention for salient object detection. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 234–250.
34. Wu, Z.; Su, L.; Huang, Q. Cascaded partial decoder for fast and accurate salient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 3907–3916.
35. Zhang, H.; Goodfellow, I.; Metaxas, D.; Odena, A. Self-attention generative adversarial networks. In Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 7354–7363.
36. Huang, Z.; Wang, X.; Huang, L.; Huang, C.; Wei, Y.; Liu, W. CCNet: Criss-cross attention for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 603–612.
37. Gao, S.H.; Cheng, M.M.; Zhao, K.; Zhang, X.Y.; Yang, M.H.; Torr, P. Res2Net: A new multi-scale backbone architecture. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 43, 652–662.
38. Tomar, N.K.; Jha, D.; Bagci, U.; Ali, S. TGANet: Text-guided attention for improved polyp segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2022: 25th International Conference, Singapore, 18–22 September 2022; Springer: Cham, Switzerland, 2022; pp. 151–160.
| Segmentation Model | Parameter Count | Context Information Fusion | Attention Mechanism | Perception of Detail and Edges | Computing Resources |
|---|---|---|---|---|---|
| UNet [1] | large | yes | no | normal | high |
| UNet++ [12] | large | yes | no | good | high |
| Deeplab [21] | large | yes | no | normal | high |
| ResUNet++ [13] | small | yes | no | normal | low |
| AttentionUNet [22] | large | yes | yes | good | high |
| PraNet [18] | small | yes | yes | good | low |
| Method | CVC-ClinicDB mIoU | CVC-ClinicDB mDic | CVC-ClinicDB MAE | Kvasir mIoU | Kvasir mDic | Kvasir MAE |
|---|---|---|---|---|---|---|
| UNet [1] | 0.7556 | 0.8232 | 0.0193 | 0.7462 | 0.8184 | 0.0550 |
| UNet++ [12] | 0.7298 | 0.7945 | 0.0226 | 0.7430 | 0.8215 | 0.0482 |
| ResUNet++ [13] | 0.7964 | 0.8092 | 0.0160 | 0.7933 | 0.8132 | 0.0544 |
| AttentionUNet [22] | 0.7745 | 0.8493 | 0.0216 | 0.7675 | 0.8420 | 0.0437 |
| PraNet [18] | 0.8490 | 0.8992 | 0.0092 | 0.8403 | 0.8980 | 0.0302 |
| NAF_UCANet | 0.8824 | 0.9276 | 0.0063 | 0.8576 | 0.9133 | 0.0250 |
| Method | CVC-ColonDB mIoU | CVC-ColonDB mDic | CVC-ColonDB MAE | ETIS mIoU | ETIS mDic | ETIS MAE | CVC-300 mIoU | CVC-300 mDic | CVC-300 MAE |
|---|---|---|---|---|---|---|---|---|---|
| UNet [1] | 0.4442 | 0.5120 | 0.0614 | 0.3352 | 0.3980 | 0.0363 | 0.6274 | 0.7102 | 0.0221 |
| UNet++ [12] | 0.4104 | 0.4830 | 0.0641 | 0.3446 | 0.4013 | 0.0355 | 0.6240 | 0.7074 | 0.0182 |
| ResUNet++ [13] | 0.3879 | 0.4844 | 0.0783 | 0.2274 | 0.2886 | 0.0552 | 0.4946 | 0.5968 | 0.0253 |
| AttentionUNet [22] | 0.5346 | 0.6224 | 0.0504 | 0.3720 | 0.4228 | 0.0390 | 0.7392 | 0.8254 | 0.0137 |
| PraNet [18] | 0.6403 | 0.7098 | 0.0455 | 0.6273 | 0.6796 | 0.0316 | 0.7971 | 0.8713 | 0.0103 |
| NAF_UCANet | 0.7120 | 0.7847 | 0.0334 | 0.6782 | 0.7644 | 0.0124 | 0.8496 | 0.9122 | 0.0049 |
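The tables report mIoU, mDic (mean Dice), and MAE. For a single binary prediction these reduce to the following computations (a minimal NumPy sketch; function names are illustrative, and the "m" prefix in the tables denotes averaging these per-image scores over a dataset, which this sketch omits):

```python
import numpy as np

def mean_iou(pred, gt, eps=1e-8):
    """IoU between two binary masks: intersection over union."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return (inter + eps) / (union + eps)

def mean_dice(pred, gt, eps=1e-8):
    """Dice coefficient: twice the intersection over the sum of mask areas."""
    inter = np.logical_and(pred, gt).sum()
    return (2 * inter + eps) / (pred.sum() + gt.sum() + eps)

def mae(prob, gt):
    """Mean absolute error between a predicted probability map and the ground truth."""
    return np.abs(prob - gt).mean()

pred = np.array([[1, 1], [0, 0]], dtype=bool)
gt = np.array([[1, 0], [0, 0]], dtype=bool)
iou = mean_iou(pred, gt)   # 1 overlapping pixel / 2 pixels in the union ≈ 0.5
dsc = mean_dice(pred, gt)  # 2*1 / (2 + 1) ≈ 0.667
err = mae(pred.astype(float), gt.astype(float))  # 1 differing pixel / 4 = 0.25
```

Lower MAE is better, while higher mIoU and mDic are better, which matches the direction of improvement shown for NAF_UCANet in the tables.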
| Method | CVC-ClinicDB mIoU | CVC-ClinicDB mDic | CVC-ClinicDB MAE | Kvasir mIoU | Kvasir mDic | Kvasir MAE |
|---|---|---|---|---|---|---|
| UNet | 0.7556 | 0.8232 | 0.0193 | 0.7462 | 0.8184 | 0.0550 |
| ResNet50 | 0.8382 | 0.8806 | 0.0119 | 0.8255 | 0.8746 | 0.0291 |
| ResNet50+SPACA | 0.8494 | 0.8920 | 0.0103 | 0.8328 | 0.8823 | 0.0289 |
| ResNet50+NAF | 0.8660 | 0.9101 | 0.0080 | 0.8451 | 0.8930 | 0.0276 |
| ResNet50+UCA | 0.8545 | 0.9012 | 0.0097 | 0.8436 | 0.8863 | 0.0284 |
| Baseline | 0.8700 | 0.9178 | 0.0079 | 0.8558 | 0.9022 | 0.0248 |
| Baseline+SPACA | 0.8632 | 0.9080 | 0.0083 | 0.8470 | 0.8979 | 0.0300 |
| Baseline+NAF | 0.8735 | 0.9195 | 0.0074 | 0.8529 | 0.9051 | 0.0285 |
| Baseline+UCA | 0.8703 | 0.9163 | 0.0077 | 0.8523 | 0.9050 | 0.0264 |
| NAF_UCANet | 0.8824 | 0.9276 | 0.0063 | 0.8576 | 0.9133 | 0.0250 |
| Method | CVC-ColonDB mIoU | CVC-ColonDB mDic | CVC-ColonDB MAE | ETIS mIoU | ETIS mDic | ETIS MAE | CVC-300 mIoU | CVC-300 mDic | CVC-300 MAE |
|---|---|---|---|---|---|---|---|---|---|
| UNet | 0.4442 | 0.5120 | 0.0614 | 0.3352 | 0.3980 | 0.0363 | 0.6274 | 0.7102 | 0.0221 |
| ResNet50 | 0.6545 | 0.7348 | 0.0396 | 0.5525 | 0.6319 | 0.0277 | 0.7659 | 0.8455 | 0.0114 |
| ResNet50+SPACA | 0.6637 | 0.7370 | 0.0370 | 0.5770 | 0.6532 | 0.0247 | 0.7714 | 0.8500 | 0.0112 |
| ResNet50+NAF | 0.6722 | 0.7413 | 0.0376 | 0.5748 | 0.6520 | 0.0215 | 0.7958 | 0.8711 | 0.0086 |
| ResNet50+UCA | 0.6586 | 0.7393 | 0.0388 | 0.5756 | 0.6584 | 0.0236 | 0.7893 | 0.8672 | 0.0099 |
| Baseline | 0.6741 | 0.7474 | 0.0407 | 0.5898 | 0.6625 | 0.0261 | 0.8196 | 0.8905 | 0.0065 |
| Baseline+SPACA | 0.6862 | 0.7628 | 0.0354 | 0.6111 | 0.6921 | 0.0173 | 0.8214 | 0.8929 | 0.0060 |
| Baseline+NAF | 0.6837 | 0.7542 | 0.0331 | 0.5978 | 0.6696 | 0.0203 | 0.8287 | 0.8992 | 0.0056 |
| Baseline+UCA | 0.6779 | 0.7507 | 0.0395 | 0.6155 | 0.6938 | 0.0228 | 0.8370 | 0.9025 | 0.0055 |
| NAF_UCANet | 0.7120 | 0.7847 | 0.0334 | 0.6782 | 0.7644 | 0.0124 | 0.8496 | 0.9122 | 0.0049 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wu, W.; Fan, H.; Fan, Y.; Wen, J. Nonlinear Activation-Free Contextual Attention Network for Polyp Segmentation. Information 2023, 14, 362. https://doi.org/10.3390/info14070362