A Multi-Task Learning and Multi-Branch Network for DR and DME Joint Grading
Abstract
1. Introduction
- (i) Build a multi-task learning and multi-branch network (MaMNet) for the simultaneous grading of DR and DME. By exploiting the relationship between the two grading tasks, multi-task learning makes the model more robust and improves grading accuracy (a hedged sketch of this setup is given after the list).
- (ii) Design a multi-branch network with four branches to strengthen the representation of the underlying features of DME and DR.
- (iii) Design a feature fusion module that fuses the self-features, the cross-features, and the global features involved in DR or DME to enhance the joint grading accuracy.
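The contributions above can be made concrete with a short sketch of the multi-task setup. This is a minimal illustration, not the authors' released code: the backbone is a placeholder, and the loss weights α = 0.5 and β = 1.0 are the best setting found in Section 4.2.1.

```python
# Hedged sketch of the joint DR/DME grading objective:
# L = alpha * L_DR + beta * L_DME, with a shared (placeholder) backbone and
# two task-specific softmax heads. `build_backbone` is illustrative only.
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_backbone(inputs):
    # Stand-in for the multi-branch feature extractor of Section 3.1.
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(inputs)
    return layers.MaxPooling2D(pool_size=(2, 2), strides=2)(x)

inputs = tf.keras.Input(shape=(224, 224, 3))
pooled = layers.GlobalAveragePooling2D()(build_backbone(inputs))

dr_out = layers.Dense(5, activation="softmax", name="dr")(pooled)    # DR grades 0-4
dme_out = layers.Dense(3, activation="softmax", name="dme")(pooled)  # DME grades 0-2

model = Model(inputs, [dr_out, dme_out])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss={"dr": "categorical_crossentropy", "dme": "categorical_crossentropy"},
    loss_weights={"dr": 0.5, "dme": 1.0},  # alpha, beta (Section 4.2.1)
    metrics={"dr": "accuracy", "dme": "accuracy"},
)
```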
2. Related Work
2.1. Multi-Task Learning
2.2. Multi-Branch Network
2.3. Attention Mechanism
3. Proposed Method
3.1. Proposed Multi-Branch Network
3.2. Feature Fusion Module
3.2.1. Self-Feature Extraction Network
- (1) The input feature map F is compressed by global average pooling (GA) to obtain a channel descriptor with a global receptive field. The calculation is described in Equation (1).
- (2) The descriptor is passed through two FC layers to capture the dependencies between individual channels. The process is given by Equation (2).
- (3) The resulting channel weight is multiplied with the original input feature F to obtain the channel-weighted disease grading discriminant feature F′. The calculation is described in Equation (3).
- (4) The feature F′ is convolved by two convolution branches to obtain two spatial feature maps. Each branch has two convolutional layers, one with a 1 × k kernel and the other with a k × 1 kernel; this design enlarges the receptive field without increasing the number of training parameters. The corresponding calculations are given by Equations (4) and (5).
- (5) The feature map obtained by fusing the two spatial feature maps is converted into the spatial weight SA.
- (6) F′ is multiplied with SA to obtain the disease-specific self-feature maps (a sketch of the full block follows this list).
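Under this reading, SFEN combines squeeze-and-excitation-style channel attention (steps 1–3) with a 1 × k / k × 1 spatial attention (steps 4–6). The Keras sketch below is an assumption-laden illustration, not the authors' implementation: the reduction ratio r, the fusion of the two spatial maps by addition, and the sigmoid activations are standard choices that the steps above do not pin down; k = 9 gave the best accuracy in Section 4.2.2.

```python
# Hedged sketch of SFEN: channel attention (Eqs. (1)-(3)) followed by
# spatial attention built from 1xk and kx1 convolutions (Eqs. (4)-(5)).
import tensorflow as tf
from tensorflow.keras import layers

def sfen_block(F, k=9, r=16):
    c = F.shape[-1]

    # (1)-(3): GA pooling, two FC layers, then channel re-weighting.
    s = layers.GlobalAveragePooling2D()(F)              # Eq. (1)
    s = layers.Dense(c // r, activation="relu")(s)      # Eq. (2), squeeze
    s = layers.Dense(c, activation="sigmoid")(s)        # Eq. (2), excite
    F_prime = layers.Multiply()([F, layers.Reshape((1, 1, c))(s)])  # Eq. (3)

    # (4): two branches, each with a 1xk and a kx1 convolution (Eqs. (4)-(5)).
    b1 = layers.Conv2D(c // r, (1, k), padding="same", activation="relu")(F_prime)
    b1 = layers.Conv2D(1, (k, 1), padding="same")(b1)
    b2 = layers.Conv2D(c // r, (k, 1), padding="same", activation="relu")(F_prime)
    b2 = layers.Conv2D(1, (1, k), padding="same")(b2)

    # (5): fuse the two spatial maps and convert them to the spatial weight SA.
    SA = layers.Activation("sigmoid")(layers.Add()([b1, b2]))

    # (6): re-weight F' spatially to obtain the disease-specific self-features.
    return layers.Multiply()([F_prime, SA])
```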
3.2.2. Cross-Feature Extraction Network
3.2.3. Atrous Spatial Pyramid Pooling Module
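The ASPP module follows the atrous spatial pyramid pooling of DeepLab (Chen et al.). For reference, a minimal Keras sketch of a standard ASPP block is given below; the dilation rates, filter count, and image-level pooling branch are conventional DeepLab choices, not necessarily the exact configuration used in MaMNet.

```python
# Minimal sketch of a standard ASPP block (DeepLab-style), assuming static
# spatial dimensions; rates and filter count are conventional, not confirmed.
import tensorflow as tf
from tensorflow.keras import layers

def aspp_block(x, filters=256, rates=(6, 12, 18)):
    h, w = x.shape[1], x.shape[2]
    # Parallel branches: one 1x1 conv plus dilated 3x3 convs at several rates.
    branches = [layers.Conv2D(filters, 1, padding="same", activation="relu")(x)]
    for r in rates:
        branches.append(layers.Conv2D(filters, 3, padding="same",
                                      dilation_rate=r, activation="relu")(x))
    # Image-level features: global pooling, 1x1 conv, bilinear upsampling.
    img = layers.GlobalAveragePooling2D(keepdims=True)(x)
    img = layers.Conv2D(filters, 1, activation="relu")(img)
    img = layers.UpSampling2D(size=(h, w), interpolation="bilinear")(img)
    branches.append(img)
    # Concatenate all branches and project back with a 1x1 convolution.
    return layers.Conv2D(filters, 1, padding="same", activation="relu")(
        layers.Concatenate()(branches))
```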
3.3. Disease Classification
4. Experiment and Results
4.1. Experimental Settings
4.1.1. Datasets
4.1.2. Evaluation Metrics
4.1.3. Experimental Environment and Training Methods
4.2. Experimental Results and Analysis
4.2.1. Loss Weight Setting Experiment
4.2.2. K-Parameter Setting Experiment
4.2.3. Module Ablation Experiments
- (1) When the CFEN module is added to the branch for DME grading, DME Ac improves by 0.029 and DR Ac by 0.010; when the CFEN module is added to the subnet for DR grading, Joint Ac improves by 0.009 and DR Ac by 0.019. These results demonstrate that the CFEN module uncovers additional correlation features for both DME grading and DR grading.
- (2) With the addition of the ASPP module, Joint Ac, DME Ac, and DR Ac improve by 0.020, 0.019, and 0.047, respectively, in the joint grading task. The experimental data show that ASPP better extracts global lesion features and improves the grading performance.
- (3) With both the CFEN and ASPP modules, the highest values of Joint Ac, DR Ac, and DME Ac are obtained. These results demonstrate that MaMNet effectively extracts the self-features, the correlated features between DME and DR, and the global features of DR, thereby enhancing the grading performance.
4.2.4. Comparison Experiments
4.2.5. Visualization of the Best Model Results
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
DR | Diabetic Retinopathy |
DME | Diabetic Macular Edema |
MaMNet | Multi-task Learning and Multi-branch Network |
MbN | Multi-branch Network |
SFEN | Self-Feature Extraction Network |
CFEN | Cross-Feature Extraction Network |
ASPP | Atrous Spatial Pyramid Pooling |
GA | Global Average Pooling Layer |
FC | Fully Connected Layer |
References
1. Wilkinson, C.P.; Ferris, F.L.; Klein, R.E.; Lee, P.P.; Agardh, C.D.; Davis, M.; Dills, D.; Kampik, A.; Pararajasegaram, R.; Verdaguer, J.T. Proposed international clinical diabetic retinopathy and diabetic macular edema disease severity scales. Ophthalmology 2003, 110, 1677–1682.
2. Ciulla, T.A.; Amador, A.G.; Zinman, B. Diabetic retinopathy and diabetic macular edema: Pathophysiology, screening, and novel therapies. Diabetes Care 2003, 26, 2653–2664.
3. Li, T.; Bo, W.; Hu, C.; Kang, H.; Liu, H.; Wang, K.; Fu, H. Applications of deep learning in fundus images: A review. Med. Image Anal. 2021, 69, 101971.
4. Wu, L.; Fernandez-Loaiza, P.; Sauma, J.; Hernandez-Bogantes, E.; Masis, M. Classification of diabetic retinopathy and diabetic macular edema. World J. Diabetes 2013, 4, 290.
5. Pratt, H.; Coenen, F.; Broadbent, D.M.; Harding, S.P.; Zheng, Y. Convolutional neural networks for diabetic retinopathy. Procedia Comput. Sci. 2016, 90, 200–205.
6. Gargeya, R.; Leng, T. Automated identification of diabetic retinopathy using deep learning. Ophthalmology 2017, 124, 962–969.
7. Gulshan, V.; Peng, L.; Coram, M.; Stumpe, M.C.; Du, W. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 2016, 316, 2402–2410.
8. Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Fei-Fei, L. ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 2015, 115, 211–252.
9. Zhang, W.; Zhong, J.; Yang, S.; Gao, Z.; Hu, J.; Chen, Y.; Yi, Z. Automated identification and grading system of diabetic retinopathy using deep neural networks. Knowl.-Based Syst. 2019, 175, 12–25.
10. Li, F.; Wang, Y.; Xu, T.; Dong, L.; Yan, L.; Jiang, M.; Zou, H. Deep learning-based automated detection for diabetic retinopathy and diabetic macular oedema in retinal fundus photographs. Eye 2022, 36, 1433–1441.
11. Wang, Z.; Yin, Y.; Shi, J.; Fang, W.; Li, H.; Wang, X. Zoom-in-Net: Deep mining lesions for diabetic retinopathy detection. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2017: 20th International Conference, Quebec City, QC, Canada, 11–13 September 2017.
12. Lin, Z.; Guo, R.; Wang, Y.; Wu, B.; Chen, T.; Wang, W.; Wu, J. A framework for identifying diabetic retinopathy based on anti-noise detection and attention-based fusion. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2018: 21st International Conference, Granada, Spain, 16–20 September 2018.
13. Zhou, Y.; He, X.; Huang, L.; Liu, L.; Zhu, F.; Cui, S.; Shao, L. Collaborative learning of semi-supervised segmentation and classification for medical images. In Proceedings of the 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019.
14. Perdomo, O.; Otalora, S.; Rodríguez, F.; Arevalo, J.; González, F.A. A novel machine learning model based on exudate localization to detect diabetic macular edema. Ophthalmic Med. Image Anal. Int. Workshop 2016, 3, 137–144.
15. Mo, J.; Zhang, L.; Feng, Y. Exudate-based diabetic macular edema recognition in retinal images using cascaded deep residual networks. Neurocomputing 2018, 290, 161–171.
16. He, X.; Zhou, Y.; Wang, B.; Cui, S.; Shao, L. DME-Net: Diabetic macular edema grading by auxiliary task learning. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2019: 22nd International Conference, Shenzhen, China, 13–17 October 2019.
17. Caruana, R. Multitask learning. Mach. Learn. 1997, 28, 41–75.
18. Tan, C.; Zhao, L.; Yan, Z.; Li, K.; Metaxas, D.; Zhan, Y. Deep multi-task and task-specific feature learning network for robust shape preserved organ segmentation. In Proceedings of the 15th IEEE International Symposium on Biomedical Imaging, Washington, DC, USA, 4–7 April 2018.
19. Liu, L.; Dou, Q.; Chen, H.; Olatunji, I.E.; Qin, J.; Heng, P.A. MTMR-Net: Multi-task deep learning with margin ranking loss for lung nodule analysis. In Proceedings of the 4th International Workshop on Deep Learning in Medical Image Analysis (DLMIA)/8th International Workshop on Multimodal Learning for Clinical Decision Support (ML-CDS), Granada, Spain, 20 September 2018.
20. Chen, Q.; Peng, Y.; Keenan, T.; Dharssi, S.; Agro, E.; Wong, W.T.; Lu, Z. A multitask deep learning model for the classification of age-related macular degeneration. AMIA Summits Transl. Sci. Proc. 2019, 2019, 505–514.
21. Xu, X.; Zhou, F.; Liu, B.; Bai, X. Multiple organ localization in CT image using triple-branch fully convolutional networks. IEEE Access 2019, 7, 98083–98093.
22. Tabarestani, S.; Aghili, M.; Eslami, M.; Cabrerizo, M.; Barreto, A.; Rishe, N.; Adjouadi, M. A distributed multitask multimodal approach for the prediction of Alzheimer's disease in a longitudinal study. NeuroImage 2020, 206, 116317.
23. He, L.; Li, H.; Wang, J.; Chen, M.; Gozdas, E.; Dillman, J.R.; Parikh, N.A. A multitask, multi-stage deep transfer learning model for early prediction of neurodevelopment in very preterm infants. Sci. Rep. 2020, 10, 15072.
24. Estienne, T.; Lerousseau, M.; Vakalopoulou, M.; Alvarez Andres, E.; Battistella, E.; Carré, A.; Deutsch, E. Deep learning-based concurrent brain registration and tumor segmentation. Front. Comput. Neurosci. 2020, 14, 17.
25. Jin, C.; Yu, H.; Ke, J.; Ding, P.; Yi, Y.; Jiang, X.; Li, R. Predicting treatment response from longitudinal images using multi-task deep learning. Nat. Commun. 2021, 12, 1851.
26. Hao, P.; Gao, X.; Li, Z.; Zhang, J.; Wu, F.; Bai, C. Multi-branch fusion network for myocardial infarction screening from 12-lead ECG images. Comput. Methods Programs Biomed. 2020, 184, 105286.
27. Zhuang, J. LadderNet: Multi-path networks based on U-Net for medical image segmentation. arXiv 2018, arXiv:1810.07810.
28. Yang, Z.; Ran, L.; Zhang, S.; Xia, Y.; Zhang, Y. EMS-Net: Ensemble of multiscale convolutional neural networks for classification of breast cancer histology images. Neurocomputing 2019, 366, 46–53.
29. Chaudhari, S.; Mithal, V.; Polatkan, G.; Ramanath, R. An attentive survey of attention models. ACM Trans. Intell. Syst. Technol. (TIST) 2021, 12, 1–32.
30. Sinha, A.; Dolz, J. Multi-scale self-guided attention for medical image segmentation. IEEE J. Biomed. Health Inform. 2020, 25, 121–130.
31. Cai, Y.; Wang, Y. MA-Unet: An improved version of Unet based on multi-scale and attention mechanism for medical image segmentation. In Proceedings of the Third International Conference on Electronics and Communication, Network and Computer Technology (ECNCT 2021), Xiamen, China, 5 April 2021.
32. Valanarasu, J.M.J.; Oza, P.; Hacihaliloglu, I.; Patel, V.M. Medical Transformer: Gated axial-attention for medical image segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, 27 September–1 October 2021.
33. Wu, Z.; Su, L.; Huang, Q. Cascaded partial decoder for fast and accurate salient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019.
34. Xie, H.; Zeng, X.; Lei, H.; Du, J.; Wang, J.; Zhang, G.; Lei, B. Cross-attention multi-branch network for fundus diseases classification using SLO images. Med. Image Anal. 2021, 71, 102031.
35. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018.
36. Peng, C.; Zhang, X.; Yu, G.; Luo, G.; Sun, J. Large kernel matters – improve semantic segmentation by global convolutional network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017.
37. Zhao, T.; Wu, X. Pyramid feature attention network for saliency detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019.
38. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848.
39. Porwal, P.; Pachade, S.; Kamble, R.; Kokare, M.; Deshmukh, G.; Sahasrabuddhe, V.; Meriaudeau, F. Indian diabetic retinopathy image dataset (IDRiD): A database for diabetic retinopathy screening research. Data 2018, 3, 25.
40. Diabetic Retinopathy: Segmentation and Grading Challenge. Available online: https://idrid.grand-challenge.org/Leaderboard/ (accessed on 27 March 2019).
41. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
42. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016.
43. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016.
44. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017.
45. Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017.
46. Yu, F.; Wang, D.; Shelhamer, E.; Darrell, T. Deep layer aggregation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018.
47. Afreen, S.; Bhurjee, A.K.; Aziz, R.M. Gene selection with Game Shapley Harris hawks optimizer for cancer classification. Chemom. Intell. Lab. Syst. 2023, 242, 104989.
Branch 1 | Branch 2 | Branch 3 | Branch 4 |
---|---|---|---|
Input (224 × 224 × 3) | Input (224 × 224 × 3) | Input (branch 2 output) | Input (branch 2 output) |
conv3 − 64 × 2 | conv3 − 64 × 2 | conv3 − 512 × 3 | conv3 − 512 × 3 |
Maxpool (pool_size = (2, 2), strides = 2) | Maxpool (pool_size = (2, 2), strides = 2) | Maxpool (pool_size = (2, 2), strides = 2) | Maxpool (pool_size = (2, 2), strides = 2) |
conv3 − 128 × 2 | conv3 − 128 × 2 | conv3 − 512 × 3 | conv3 − 512 × 3 |
Maxpool (pool_size = (2, 2), strides = 2) | Maxpool (pool_size = (2, 2), strides = 2) | Maxpool (pool_size = (3, 3), strides = 1) | Maxpool (pool_size = (3, 3), strides = 1) |
conv3 − 256 × 3 | conv3 − 256 × 3 | UpSampling | UpSampling |
Maxpool (pool_size = (2, 2), strides = 2) | Maxpool (pool_size = (2, 2), strides = 2) | | |
conv3 − 512 × 3 | conv3 − 512 × 3 | | |
Output | Output | Output | Output |
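Read column-wise, branches 1 and 2 follow a VGG-style layout, where conv3 − n denotes a 3 × 3 convolution with n output channels and × m means the layer is repeated m times. A hedged Keras sketch of branch 1 under that reading ("same" padding and ReLU activations are assumptions):

```python
# Hedged sketch of branch 1 from the table above (VGG-style stack);
# padding and activation choices are assumptions, not confirmed settings.
import tensorflow as tf
from tensorflow.keras import layers

def conv_stage(x, filters, n_convs):
    for _ in range(n_convs):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

inp = tf.keras.Input(shape=(224, 224, 3))
x = conv_stage(inp, 64, 2)                               # conv3-64 x 2
x = layers.MaxPooling2D(pool_size=(2, 2), strides=2)(x)
x = conv_stage(x, 128, 2)                                # conv3-128 x 2
x = layers.MaxPooling2D(pool_size=(2, 2), strides=2)(x)
x = conv_stage(x, 256, 3)                                # conv3-256 x 3
x = layers.MaxPooling2D(pool_size=(2, 2), strides=2)(x)
x = conv_stage(x, 512, 3)                                # conv3-512 x 3
branch1 = tf.keras.Model(inp, x, name="branch1")
```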
Parameter | Value |
---|---|
WARMUP_EPOCHS | 10 |
LEARNING_RATE | 1 × 10⁻⁴ |
WARMUP_LEARNING_RATE | 1 × 10⁻³ |
ES_PATIENCE | 5 |
RLROP_PATIENCE | 3 |
DECAY_DROP | 0.5 |
EPOCHS | 50 |
BATCH_SIZE | 8 |
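One plausible mapping of these parameters onto standard Keras callbacks is sketched below: training runs at WARMUP_LEARNING_RATE for the first WARMUP_EPOCHS and then drops to LEARNING_RATE, with early stopping and plateau-based decay. This is an assumption about how the parameters were wired together, not the authors' training script; `model`, `train_ds`, and `val_ds` are placeholders.

```python
# Hedged sketch: hyperparameters from the table mapped to Keras callbacks.
import tensorflow as tf

WARMUP_EPOCHS, EPOCHS, BATCH_SIZE = 10, 50, 8
LEARNING_RATE, WARMUP_LEARNING_RATE = 1e-4, 1e-3
ES_PATIENCE, RLROP_PATIENCE, DECAY_DROP = 5, 3, 0.5

def schedule(epoch, lr):
    if epoch < WARMUP_EPOCHS:
        return WARMUP_LEARNING_RATE   # warm-up phase
    if epoch == WARMUP_EPOCHS:
        return LEARNING_RATE          # switch to the base rate once
    return lr                         # afterwards, defer to ReduceLROnPlateau

callbacks = [
    tf.keras.callbacks.LearningRateScheduler(schedule),
    tf.keras.callbacks.EarlyStopping(patience=ES_PATIENCE,
                                     restore_best_weights=True),
    tf.keras.callbacks.ReduceLROnPlateau(patience=RLROP_PATIENCE,
                                         factor=DECAY_DROP),
]
# model.fit(train_ds, validation_data=val_ds, epochs=EPOCHS,
#           batch_size=BATCH_SIZE, callbacks=callbacks)
```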
α | β | Joint Ac | DR Ac | DME Ac |
---|---|---|---|---|
0.25 | 0.25 | 0.5049 | 0.5437 | 0.7573 |
0.25 | 0.5 | 0.4466 | 0.5049 | 0.8155 |
0.25 | 0.75 | 0.5146 | 0.5534 | 0.7961 |
0.25 | 1.0 | 0.5146 | 0.5728 | 0.7670 |
0.5 | 0.25 | 0.4660 | 0.5243 | 0.7864 |
0.5 | 0.5 | 0.4951 | 0.5534 | 0.8058 |
0.5 | 0.75 | 0.5922 | 0.6117 | 0.8058 |
0.5 | 1.0 | 0.6117 | 0.6407 | 0.7961 |
0.75 | 0.25 | 0.4757 | 0.5436 | 0.7961 |
0.75 | 0.5 | 0.4466 | 0.5340 | 0.7961 |
0.75 | 0.75 | 0.5243 | 0.5340 | 0.7767 |
0.75 | 1.0 | 0.5728 | 0.5825 | 0.8155 |
1.0 | 0.25 | 0.4466 | 0.4854 | 0.7864 |
1.0 | 0.5 | 0.5340 | 0.5922 | 0.7961 |
1.0 | 0.75 | 0.5243 | 0.5437 | 0.7670 |
1.0 | 1.0 | 0.5437 | 0.5534 | 0.8058 |
K | Joint Ac | DR Ac | DME Ac |
---|---|---|---|
5 | 0.534 | 0.563 | 0.777 |
7 | 0.534 | 0.583 | 0.796 |
9 | 0.612 | 0.641 | 0.796 |
11 | 0.563 | 0.583 | 0.786 |
No. | Methods | Joint Ac | DR Ac | DME Ac |
---|---|---|---|---|
1 | DR: MbN + SFEN; DME: MbN + SFEN | 0.583 | 0.602 | 0.767 |
2 | DR: MbN + SFEN; DME: MbN + SFEN + CFEN | 0.583 | 0.612 | 0.796 |
3 | DR: MbN + SFEN + CFEN; DME: MbN + SFEN | 0.592 | 0.621 | 0.777 |
4 | DR: MbN + SFEN + CFEN; DME: MbN + SFEN + CFEN | 0.592 | 0.621 | 0.777 |
5 | DR: MbN + SFEN + CFEN + ASPP; DME: MbN + SFEN + CFEN | 0.6117 | 0.6407 | 0.7961 |
Method | Joint Ac | DR Ac | DME Ac |
---|---|---|---|
LzyUNCC [40] | 0.631 | 0.748 | 0.806 |
VRT [40] | 0.553 | 0.592 | 0.816 |
Mammoth [40] | 0.515 | 0.544 | 0.835 |
HarangiM1 [40] | 0.476 | 0.553 | 0.748 |
AVSASVA [40] | 0.476 | 0.553 | 0.806 |
HarangiM2 [40] | 0.408 | 0.476 | 0.728 |
VGG16 [41] | 0.524 | 0.583 | 0.767 |
ResNet50 [42] | 0.524 | 0.592 | 0.757 |
InceptionV3 [43] | 0.437 | 0.563 | 0.767 |
DenseNet121 [44] | 0.456 | 0.485 | 0.699 |
Xception [45] | 0.467 | 0.515 | 0.738 |
Proposed Approach | 0.612 | 0.641 | 0.796 |
Input Image | True Label | Predicted Label | Various Grading Scores |
---|---|---|---|
(fundus image) | DR3 DME2 | DR3 DME2 | DR: 0.000, 0.000, 0.093, 0.697, 0.209; DME: 0.000, 0.052, 0.947 |
(fundus image) | DR2 DME2 | DR2 DME2 | DR: 0.010, 0.008, 0.889, 0.072, 0.021; DME: 0.002, 0.026, 0.972 |
(fundus image) | DR0 DME0 | DR0 DME0 | DR: 0.912, 0.022, 0.043, 0.016, 0.007; DME: 0.902, 0.065, 0.033 |
(fundus image) | DR2 DME2 | DR2 DME2 | DR: 0.006, 0.005, 0.924, 0.047, 0.017; DME: 0.000, 0.008, 0.991 |
(fundus image) | DR0 DME0 | DR0 DME0 | DR: 0.991, 0.003, 0.004, 0.001, 0.000; DME: 0.970, 0.018, 0.012 |
(fundus image) | DR4 DME2 | DR4 DME2 | DR: 0.000, 0.000, 0.017, 0.297, 0.686; DME: 0.000, 0.000, 1.000 |
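The grading scores behave as softmax probability vectors over the five DR grades and three DME grades; the predicted label is the arg-max of each vector. For example, for the first row:

```python
# Arg-max over the per-grade score vectors from the first table row.
import numpy as np

dr_scores = np.array([0.000, 0.000, 0.093, 0.697, 0.209])  # grades DR0-DR4
dme_scores = np.array([0.000, 0.052, 0.947])               # grades DME0-DME2
print(f"DR{dr_scores.argmax()} DME{dme_scores.argmax()}")  # -> DR3 DME2
```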
Xing, X.; Mao, S.; Yan, M.; Yu, H.; Yuan, D.; Zhu, C.; Zhang, C.; Zhou, J.; Xu, T. A Multi-Task Learning and Multi-Branch Network for DR and DME Joint Grading. Appl. Sci. 2024, 14, 138. https://doi.org/10.3390/app14010138