Depth-Aware Networks for Multi-Organ Lesion Detection in Chest CT Scans
Abstract
1. Introduction
- A pioneering attempt at the multi-organ lesion detection (MOLD) task on the DeepLesion dataset, which not only distinguishes lesions from non-lesions but also detects and classifies eight different lesion types, greatly enhancing clinical value.
- Improving on the network architecture of [3] by adapting its DALs through the skipped-layer hierarchical training (SHT) mechanism, which makes backpropagation more robust and reduces over-fitting.
- Incorporating domain knowledge through the depth-aware (DA) mechanism, which extracts features in three dimensions and benefits lesion type classification at minimal cost.
2. Related Work
2.1. Object Detection
2.2. FCN-Based Models
2.3. Lesion Detection in CADx
2.4. Multi-Level Feature Reuse and Supervision
2.5. Hidden Layer Supervision
2.6. Depth Information Obtainment
2.7. Research Objectives and Research Questions
- Develop an advanced deep learning model that integrates depth-aware (DA) and skipped-layer hierarchical training (SHT) mechanisms to enhance lesion detection accuracy.
- Evaluate the performance of the proposed model on a large, publicly available dataset (DeepLesion) and compare it to existing models.
- The integration of depth-aware mechanisms significantly improves the detection accuracy of multi-organ lesions in chest CT scans.
- The implementation of skipped-layer hierarchical training has a substantial impact on the performance of the Dense 3D context-enhanced (Dense 3DCE) network in detecting lesions of various sizes and appearances.
- The proposed DA-SHT Dense 3DCE model demonstrates superior detection accuracy and computational efficiency compared to existing state-of-the-art models.
3. Methodology
- Stage I: Depth Score Regressor (Step 1)—This initial stage, outlined by the green dashed box, takes the input slices and outputs a depth score for each. Once this stage has been trained, its depth scores are fixed and supplied to the next stage.
- Stage II: Lesion Detector (Step 2)—The second stage, marked by the orange dashed box, uses the depth scores generated in Stage I as fixed inputs to detect lesions and outputs the final lesion predictions for the CT slices. The process flows sequentially from Stage I to Stage II, as indicated by the brown arrows; a minimal code sketch of this two-stage flow follows the list.
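The sketch below (PyTorch, with hypothetical class names and deliberately simplified sub-networks) illustrates only how Stage I's frozen depth scores feed Stage II as fixed inputs; it is not the actual DA-SHT Dense 3DCE architecture.

```python
# Minimal sketch of the two-stage flow described above (hypothetical names;
# the paper's DA-SHT Dense 3DCE implementation is not reproduced here).
import torch
import torch.nn as nn

class DepthScoreRegressor(nn.Module):
    """Stage I: maps a CT slice to a scalar depth score (trained first, then frozen)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),          # global average pooling
        )
        self.fc = nn.Linear(128, 1)           # scalar depth score

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

class LesionDetector(nn.Module):
    """Stage II: consumes the slice plus its fixed depth score (stub backbone)."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(inplace=True))
        self.head = nn.Conv2d(64 + 1, 64, 1)  # depth score injected as an extra channel

    def forward(self, x, depth_score):
        feat = self.backbone(x)
        d = depth_score.view(-1, 1, 1, 1).expand(-1, 1, *feat.shape[2:])
        return self.head(torch.cat([feat, d], dim=1))

# Stage I is trained and then frozen; Stage II treats its outputs as fixed inputs.
regressor, detector = DepthScoreRegressor().eval(), LesionDetector()
slices = torch.randn(2, 1, 512, 512)           # a batch of CT slices
with torch.no_grad():                          # depth scores are fixed in Stage II
    scores = regressor(slices)
predictions = detector(slices, scores)
```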
3.1. Novel Dense 3DCE R-FCN
3.2. Skipped-Layer Hierarchical Training
3.3. Depth-Aware Mechanism
3.3.1. Unsupervised Depth Score Regression
3.3.2. The Depth-Aware Pathway
3.4. Participants and Dataset
3.5. Procedure
- Data pre-processing: The CT scans were preprocessed to normalize the pixel intensities and resize the images to a uniform resolution (a sketch of this step follows the list).
- Model development: We developed the Dense 3D context-enhanced (Dense 3DCE) network, integrating depth-aware (DA) and skipped-layer hierarchical training (SHT) mechanisms.
- Model training: The model was trained end-to-end on the DeepLesion dataset, using supervised learning to optimize detection accuracy.
- Model validation: We validated the model on both the original DeepLesion dataset and the extracted small lesion dataset to ensure generalizability and robustness.
- Model testing: The model’s performance was tested on a separate subset of the DeepLesion dataset to evaluate its effectiveness in detecting lesions of varying sizes and appearances.
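As a concrete illustration of the pre-processing step, the sketch below clips Hounsfield units to a fixed window, rescales the intensities, and resizes each slice to a uniform resolution. The specific window bounds and target size are assumptions for illustration, not the paper's exact settings.

```python
# Minimal pre-processing sketch, assuming a typical DeepLesion-style recipe.
import numpy as np
from scipy.ndimage import zoom

def preprocess_slice(hu_slice: np.ndarray,
                     window=(-1024.0, 3071.0),      # assumed HU window
                     target_size=(512, 512)) -> np.ndarray:
    lo, hi = window
    x = np.clip(hu_slice.astype(np.float32), lo, hi)
    x = (x - lo) / (hi - lo) * 255.0                # normalize pixel intensities
    factors = (target_size[0] / x.shape[0], target_size[1] / x.shape[1])
    return zoom(x, factors, order=1)                # bilinear resize to uniform size

pixels = preprocess_slice(np.random.randint(-1024, 3071, (768, 768)))
```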
3.6. Instruments and Implementation Details
3.7. Pre-Processing
3.8. Data Analysis
4. Experimental Results
4.1. Universal Lesion Detection
4.2. Multi-Organ Lesion Detection
4.3. Ablation Study
4.4. Sensitivities to Hyper-Parameters
4.5. Analysis of the Predicted Depth Scores
5. Discussion
6. Conclusions
6.1. Limitations
6.2. Future Research Directions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Yan, K.; Bagheri, M.; Summers, R.M. 3D context enhanced region-based convolutional neural network for end-to-end lesion detection. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2018, Proceedings of the 21st International Conference, Granada, Spain, 16–20 September 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 511–519.
2. Tao, Q.; Ge, Z.; Cai, J.; Yin, J.; See, S. Improving Deep Lesion Detection Using 3D Contextual and Spatial Attention. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2019, Proceedings of the 22nd International Conference, Shenzhen, China, 13–17 October 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 185–193.
3. Zhang, H.; Chung, A.C. Lesion Detection with Deep Aggregated 3D Contextual Feature and Auxiliary Information. In Machine Learning in Medical Imaging, Proceedings of the 10th International Workshop, MLMI 2019, Held in Conjunction with MICCAI 2019, Shenzhen, China, 13 October 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 45–53.
4. Li, Z.; Zhang, S.; Zhang, J.; Huang, K.; Wang, Y.; Yu, Y. MVP-Net: Multi-view FPN with Position-aware Attention for Deep Universal Lesion Detection. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2019, Proceedings of the 22nd International Conference, Shenzhen, China, 13–17 October 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 13–21.
5. Zlocha, M.; Dou, Q.; Glocker, B. Improving RetinaNet for CT Lesion Detection with Dense Masks from Weak RECIST Labels. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2019, Proceedings of the 22nd International Conference, Shenzhen, China, 13–17 October 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 402–410.
6. Tang, Y.B.; Yan, K.; Tang, Y.X.; Liu, J.; Xiao, J.; Summers, R.M. ULDor: A universal lesion detector for CT scans with pseudo masks and hard negative example mining. In Proceedings of the 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), Venice, Italy, 8–11 April 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 833–836.
7. Zhang, S.; Xu, J.; Chen, Y.C.; Ma, J.; Li, Z.; Wang, Y.; Yu, Y. Revisiting 3D context modeling with supervised pre-training for universal lesion detection in CT slices. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2020, Proceedings of the 23rd International Conference, Lima, Peru, 4–8 October 2020; Proceedings, Part IV 23; Springer: Berlin/Heidelberg, Germany, 2020; pp. 542–551.
8. Xu, Z.; Li, T.; Liu, Y.; Zhan, Y.; Chen, J.; Lukasiewicz, T. PAC-Net: Multi-pathway FPN with position attention guided connections and vertex distance IoU for 3D medical image detection. Front. Bioeng. Biotechnol. 2023, 11, 1049555.
9. Khan, M.A.; Muhammad, K.; Sharif, M.; Akram, T.; de Albuquerque, V.H.C. Multi-Class Skin Lesion Detection and Classification via Teledermatology. IEEE J. Biomed. Health Inform. 2021, 25, 4267–4275.
10. Wang, R.; Fan, J.; Li, Y. Deep multi-scale fusion neural network for multi-class arrhythmia detection. IEEE J. Biomed. Health Inform. 2020, 24, 2461–2472.
11. Nayak, D.R.; Dash, R.; Majhi, B. Automated diagnosis of multi-class brain abnormalities using MRI images: A deep convolutional neural network based method. Pattern Recognit. Lett. 2020, 138, 385–391.
12. Xu, X.; Zhou, F.; Liu, B.; Fu, D.; Bai, X. Efficient multiple organ localization in CT image using 3D region proposal network. IEEE Trans. Med. Imaging 2019, 38, 1885–1898.
13. Yan, K.; Tang, Y.; Peng, Y.; Sandfort, V.; Bagheri, M.; Lu, Z.; Summers, R.M. MULAN: Multitask universal lesion analysis network for joint lesion detection, tagging, and segmentation. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2019, Proceedings of the 22nd International Conference, Shenzhen, China, 13–17 October 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 194–202.
14. Ginès, P.; Guevara, M.; Arroyo, V.; Rodés, J. Hepatorenal syndrome. Lancet 2003, 362, 1819–1827.
15. Parkin, D.M.; Bray, F.; Ferlay, J.; Pisani, P. Global cancer statistics, 2002. CA Cancer J. Clin. 2005, 55, 74–108.
16. Yang, C.J.; Hwang, J.J.; Kang, W.Y.; Chong, I.W.; Wang, T.H.; Sheu, C.C.; Tsai, J.R.; Huang, M.S. Gastro-Intestinal metastasis of primary lung carcinoma: Clinical presentations and outcome. Lung Cancer 2006, 54, 319–323.
17. Hillers, T.K.; Sauve, M.D.; Guyatt, G.H. Analysis of published studies on the detection of extrathoracic metastases in patients presumed to have operable non-small cell lung cancer. Thorax 1994, 49, 14–19.
18. Yan, K.; Wang, X.; Lu, L.; Zhang, L.; Harrison, A.P.; Bagheri, M.; Summers, R.M. Deep lesion graphs in the wild: Relationship learning and organization of significant radiology image findings in a diverse large-scale lesion database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 9261–9270.
19. Yan, K.; Lu, L.; Summers, R.M. Unsupervised body part regression via spatially self-ordering convolutional neural networks. In Proceedings of the 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), Washington, DC, USA, 4–7 April 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1022–1025.
20. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
21. Dai, J.; Li, Y.; He, K.; Sun, J. R-FCN: Object detection via region-based fully convolutional networks. In Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; pp. 379–387.
22. Yan, K.; Wang, X.; Lu, L.; Summers, R.M. DeepLesion: Automated mining of large-scale lesion annotations and universal lesion detection with deep learning. J. Med. Imaging 2018, 5, 036501.
23. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
24. Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448.
25. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; pp. 91–99.
26. Zou, Z.; Shi, Z.; Guo, Y.; Ye, J. Object detection in 20 years: A survey. arXiv 2019, arXiv:1905.05055.
27. Song, H.; Sun, D.; Chun, S.; Jampani, V.; Han, D.; Heo, B.; Kim, W.; Yang, M.H. An extendable, efficient and effective transformer-based object detector. arXiv 2022, arXiv:2204.07962.
28. Liang, T.; Bao, H.; Pan, W.; Fan, X.; Li, H. DetectFormer: Category-assisted transformer for traffic scene object detection. Sensors 2022, 22, 4833.
29. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
30. Zhou, Y.; Kang, X.; Ren, F.; Lu, H.; Nakagawa, S.; Shan, X. A multi-attention and depthwise separable convolution network for medical image segmentation. Neurocomputing 2024, 564, 126970.
31. Karthik, R.; Radhakrishnan, M.; Rajalakshmi, R.; Raymann, J. Delineation of ischemic lesion from brain MRI using attention gated fully convolutional network. Biomed. Eng. Lett. 2021, 11, 3–13.
32. Yu, W.; Huang, Z.; Zhang, J.; Shan, H. SAN-Net: Learning generalization to unseen sites for stroke lesion segmentation with self-adaptive normalization. Comput. Biol. Med. 2023, 156, 106717.
33. Deng, J.; Dong, W.; Socher, R.; Li, L.; Li, K.; Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; IEEE: Piscataway, NJ, USA, 2009; pp. 248–255.
34. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969.
35. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
36. Lee, C.; Xie, S.; Gallagher, P.W.; Zhang, Z.; Tu, Z. Deeply-supervised nets. In Proceedings of the Artificial Intelligence and Statistics, San Diego, CA, USA, 9–12 May 2015; pp. 562–570.
37. Sun, D.; Yao, A.; Zhou, A.; Zhao, H. Deeply-supervised knowledge synergy. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 6997–7006.
38. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 1, 1097–1105.
39. Bell, S.; Zitnick, C.L.; Bala, K.; Girshick, R. Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2874–2883.
40. Kong, T.; Yao, A.; Chen, Y.; Sun, F. HyperNet: Towards accurate region proposal generation and joint object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 845–853.
41. Chen, T.; Li, M.; Li, Y.; Lin, M.; Wang, N.; Wang, M.; Xiao, T.; Xu, B.; Zhang, C.; Zhang, Z. MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv 2015, arXiv:1512.01274.
42. Yan, K.; Cai, J.; Harrison, A.P.; Jin, D.; Xiao, J.; Lu, L. Universal lesion detection by learning from multiple heterogeneously labeled datasets. arXiv 2020, arXiv:2005.13753.
43. Marimuthu, T.; Rajan, V.A.; Londhe, G.V.; Logeshwaran, J. Deep Learning for Automated Lesion Detection in Mammography. In Proceedings of the 2023 IEEE 2nd International Conference on Industrial Electronics: Developments & Applications (ICIDeA), Imphal, India, 29–30 September 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 383–388.
44. Chu, P.T.M.; Pham, T.H.B.; Vu, N.M., Jr.; Hoang, H.; Doan, T.M. The application of deep learning in lung cancerous lesion detection. medRxiv 2024. medRxiv:2024–04.
45. Fazilov, S.K.; Abdieva, K.S.; Yusupov, O. Patch-based lesion detection using deep learning method on small mammography dataset. In Artificial Intelligence, Blockchain, Computing and Security; CRC Press: Boca Raton, FL, USA, 2023; Volume 2, pp. 643–647.
46. Cao, K.; Xia, Y.; Yao, J.; Han, X.; Lambert, L.; Zhang, T.; Tang, W.; Jin, G.; Jiang, H.; Fang, X.; et al. Large-scale pancreatic cancer detection via non-contrast CT and deep learning. Nat. Med. 2023, 29, 3033–3043.
47. Faghani, S.; Baffour, F.I.; Ringler, M.D.; Hamilton-Cave, M.; Rouzrokh, P.; Moassefi, M.; Khosravi, B.; Erickson, B.J. A deep learning algorithm for detecting lytic bone lesions of multiple myeloma on CT. Skelet. Radiol. 2023, 52, 91–98.
48. Jiang, H.; Diao, Z.; Shi, T.; Zhou, Y.; Wang, F.; Hu, W.; Zhu, X.; Luo, S.; Tong, G.; Yao, Y.D. A review of deep learning-based multiple-lesion recognition from medical images: Classification, detection and segmentation. Comput. Biol. Med. 2023, 157, 106726.
49. Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common objects in context. In Proceedings of the Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Proceedings, Part V 13; Springer: Berlin/Heidelberg, Germany, 2014; pp. 740–755.
50. Zhou, S.K.; Greenspan, H.; Davatzikos, C.; Duncan, J.S.; Van Ginneken, B.; Madabhushi, A.; Prince, J.L.; Rueckert, D.; Summers, R.M. A review of deep learning in medical imaging: Imaging traits, technology trends, case studies with progress highlights, and future promises. Proc. IEEE 2021, 109, 820–838.
51. Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. arXiv 2014, arXiv:1409.0473.
52. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144.
Stage | Block | Layer Type | Kernel Attribute | Num of Filters
---|---|---|---|---
 | | Image Input Layer | |
Depth Score Regressor | Conv1 | Convolutional Layer | 3 × 3, stride = 1, padding = same | 64
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 64
 | | ReLU Layer | |
 | | Max Pooling | 2 × 3 |
 | Conv2 | Convolutional Layer | 3 × 3, stride = 1, padding = same | 128
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 128
 | | ReLU Layer | |
 | | Max Pooling | 2 × 3 |
 | Conv3 | Convolutional Layer | 3 × 3, stride = 1, padding = same | 256
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 256
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 256
 | | ReLU Layer | |
 | | Max Pooling | 2 × 3 |
 | Conv4 | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Max Pooling | 2 × 3 |
 | Conv5 | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Max Pooling | 2 × 3 |
 | Conv6 | Convolutional Layer | 1 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Global Average Pooling | |
 | | Fully Connected Layer | |
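To make the table concrete, here is a minimal PyTorch sketch of the depth score regressor as we read it from the table (the paper's implementation builds on MXNet [41]). The single-channel input and the interpretation of the table's "2 × 3" pooling entries as ordinary stride-2 max pooling are our assumptions.

```python
# Sketch of the Depth Score Regressor per the table above (illustrative only).
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, n_convs):
    """n_convs 3 × 3 conv + ReLU pairs, followed by max pooling."""
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, stride=1, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(2))   # assumed stride-2 pooling
    return layers

class DepthScoreRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            *conv_block(1, 64, 2),      # Conv1
            *conv_block(64, 128, 2),    # Conv2
            *conv_block(128, 256, 3),   # Conv3
            *conv_block(256, 512, 3),   # Conv4
            *conv_block(512, 512, 3),   # Conv5
            nn.Conv2d(512, 512, (1, 3), padding=(0, 1)),  # Conv6, 1 × 3 kernel
            nn.ReLU(inplace=True),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)   # global average pooling
        self.fc = nn.Linear(512, 1)           # scalar depth score

    def forward(self, x):
        return self.fc(self.pool(self.features(x)).flatten(1))

print(DepthScoreRegressor()(torch.randn(1, 1, 512, 512)).shape)  # torch.Size([1, 1])
```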
Stage | Block | Layer Type | Kernel Attribute | Num of Filters
---|---|---|---|---
 | | Image Input Layer | |
Pre-train model—VGG16 [20] | Conv1 | Convolutional Layer | 3 × 3, stride = 1, padding = same | 64
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 64
 | | ReLU Layer | |
 | | Max Pooling | 2 × 2 |
 | Conv2 | Convolutional Layer | 3 × 3, stride = 1, padding = same | 128
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 128
 | | ReLU Layer | |
 | | Max Pooling | 2 × 2 |
 | Conv3 | Convolutional Layer | 3 × 3, stride = 1, padding = same | 256
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 256
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 256
 | | ReLU Layer | |
 | | Max Pooling | 2 × 2 |
 | Conv4 | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Max Pooling | 2 × 2 |
 | Conv5 | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Max Pooling | 2 × 2 |
Stage | Block | Layer Type | Kernel Attribute | Num of Filters
---|---|---|---|---
 | | Image Input Layer | |
Lesion Detector | Conv1 | Convolutional Layer | 3 × 3, stride = 1, padding = same | 64
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 64
 | | ReLU Layer | |
 | | Max Pooling | 2 × 2 |
 | Conv2 | Convolutional Layer | 3 × 3, stride = 1, padding = same | 128
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 128
 | | ReLU Layer | |
 | | Max Pooling | 2 × 2 |
 | Conv3 | Convolutional Layer | 3 × 3, stride = 1, padding = same | 256
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 256
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 256
 | | ReLU Layer | |
 | | Max Pooling | 2 × 2 |
 | Conv4 | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | Conv5 | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | | Convolutional Layer | 3 × 3, stride = 1, padding = same | 512
 | | ReLU Layer | |
 | Conv6_1–Conv6_5 | Convolutional Layer | 1 × 1, stride = 1, no padding | 7 × 7 × 6
 | | PS ROI Pooling | 7 × 7 |
 | | Fully Connected Layer | |
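The position-sensitive head at the bottom of the table (a 1 × 1 convolution producing 7 × 7 × 6 score maps, followed by 7 × 7 PS ROI pooling) follows the R-FCN [21] design. Below is a minimal sketch using torchvision's `ps_roi_pool`; the feature map size, the ROI coordinates, and the spatial scale are illustrative assumptions, not values from the paper.

```python
# Sketch of an R-FCN-style position-sensitive head (illustrative values).
import torch
import torch.nn as nn
from torchvision.ops import ps_roi_pool

features = torch.randn(1, 512, 32, 32)                  # backbone output (assumed size)
score_maps = nn.Conv2d(512, 7 * 7 * 6, kernel_size=1)(features)  # 7 × 7 × 6 score maps
rois = torch.tensor([[0, 10.0, 10.0, 200.0, 180.0]])    # (batch_idx, x1, y1, x2, y2)
pooled = ps_roi_pool(score_maps, rois, output_size=7, spatial_scale=32 / 512)
print(pooled.shape)  # torch.Size([1, 6, 7, 7]): one 7 × 7 response map per output channel
```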
Dataset | Task | Lesions (Train) | Lesions (Validation) | Lesions (Test) | Number of CT Slices | Total Number of Lesions
---|---|---|---|---|---|---
Original DeepLesion dataset | ULD | 22,919 | 4889 | 4927 | 32,120 | 32,735
Extracted small lesion dataset | ULD | 15,921 | 3537 | 3392 | 22,533 | 22,850
Multi-organ lesion dataset | MOLD | 6871 | 1472 | 1473 | 9626 | 9816
Method | Sen@0.5 ± std | Sen@1 ± std | Sen@2 ± std | Sen@4 ± std | Sen@8 ± std | Sen@16 ± std | mAP ± std | IT (ms) ± std
---|---|---|---|---|---|---|---|---|
Faster R-CNN [25] | 58.93 ± 1.87 | 68.46 ± 0.18 | 76.69 ± 0.99 | 82.27 ± 1.75 | 85.98 ± 2.02 | 88.52 ± 1.88 | 52.80 ± 1.19 | 200 ± 3.79 |
Original R-FCN [21] | 57.89 ± 0.64 | 68.69 ± 0.56 | 76.60 ± 1.17 | 82.12 ± 1.38 | 86.28 ± 1.73 | 88.61 ± 1.79 | 50.13 ± 1.02 | 222 ± 31.77 |
3DCE, 9 slices [1] | 62.70 ± 0.71 | 72.20 ± 0.46 | 79.85 ± 1.58 | 84.51 ± 1.76 | 87.43 ± 1.73 | 89.67 ± 1.85 | 57.09 ± 1.93 | 239 ± 24.83 |
Dense DAL 3DCE R-FCN, 9 slices [3] | 63.32 ± 1.10 | 72.79 ± 0.95 | 80.94 ± 2.29 | 85.93 ± 2.29 | 88.88 ± 2.50 | 90.82 ± 2.38 | 58.28 ± 0.77 | 235 ± 3.51 |
DA-SHT Dense 3DCE R-FCN, 9 slices (ours) | 64.86 ± 1.43 | 74.39 ± 1.01 | 81.41 ± 2.08 | 86.04 ± 2.38 | 89.26 ± 1.82 | 91.38 ± 1.77 | 59.22 ± 1.19 | 241 ± 2.09 |
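In these tables, Sen@k denotes the sensitivity achieved at an average of k false positives per image (an FROC operating point), and IT (ms) reports inference time. As a reference for how such a metric can be computed, here is a minimal sketch with illustrative inputs; it is not the paper's evaluation code.

```python
# Minimal FROC-style Sen@k sketch (illustrative inputs, not the paper's code).
import numpy as np

def sensitivity_at_fp(scores, is_tp, num_images, num_lesions, fp_per_image):
    """scores: confidences of all detections; is_tp: 1 if a detection matched a lesion."""
    order = np.argsort(-np.asarray(scores, dtype=float))        # sort by confidence
    hits = np.asarray(is_tp, dtype=float)[order]
    tp, fp = np.cumsum(hits), np.cumsum(1.0 - hits)
    ok = fp <= fp_per_image * num_images                        # within the FP budget
    return float(tp[ok].max() / num_lesions) if ok.any() else 0.0

scores = [0.9, 0.8, 0.7, 0.6, 0.5]
is_tp  = [1,   0,   1,   0,   1]
print(sensitivity_at_fp(scores, is_tp, num_images=2, num_lesions=3, fp_per_image=1.0))
```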
Method | Sen@0.5 ± std | Sen@1 ± std | Sen@2 ± std | Sen@4 ± std | Sen@8 ± std | Sen@16 ± std | mAP ± std | IT (ms) ± std
---|---|---|---|---|---|---|---|---|
Faster R-CNN [25] | 56.19 ± 2.38 | 67.81 ± 0.80 | 75.98 ± 1.32 | 82.13 ± 0.68 | 86.14 ± 0.60 | 88.76 ± 0.57 | 50.98 ± 2.21 | 207 ± 13.05 |
Original R-FCN [21] | 56.45 ± 0.14 | 67.55 ± 0.98 | 76.02 ± 0.73 | 81.72 ± 0.59 | 86.22 ± 0.67 | 88.58 ± 0.54 | 50.17 ± 1.66 | 214 ± 3.97 |
3DCE R-FCN [1] | 60.25 ± 1.83 | 71.01 ± 1.19 | 78.99 ± 1.12 | 84.39 ± 0.58 | 87.66 ± 0.46 | 89.90 ± 0.70 | 54.62 ± 2.07 | 232 ± 2.65 |
Dense DAL 3DCE R-FCN [3] | 60.61 ± 1.60 | 71.52 ± 0.84 | 79.78 ± 0.95 | 85.10 ± 0.94 | 88.52 ± 0.99 | 90.68 ± 0.82 | 54.41 ± 1.49 | 243 ± 2.30 |
DA-SHT Dense 3DCE (ours) | 60.97 ± 1.32 | 72.50 ± 0.81 | 79.99 ± 0.24 | 84.89 ± 0.22 | 88.52 ± 0.26 | 90.86 ± 0.30 | 55.15 ± 1.29 | 233 ± 6.11 |
Method | AS@0.5 ± std | AS@1 ± std | AS@2 ± std | AS@4 ± std | AS@8 ± std | AS@16 ± std | mAP ± std
---|---|---|---|---|---|---|---|
Faster R-CNN [25] | 37.01 ± 0.49 | 47.38 ± 0.44 | 56.73 ± 0.89 | 64.69 ± 0.33 | 70.97 ± 0.52 | 75.92 ± 1.43 | 32.20 ± 0.42 |
Original R-FCN [21] | 31.52 ± 0.24 | 41.13 ± 0.33 | 49.69 ± 0.86 | 56.38 ± 1.16 | 63.21 ± 1.03 | 67.98 ± 1.05 | 25.74 ± 0.48 |
3DCE, 9 slices [1] | 36.77 ± 1.28 | 47.01 ± 2.39 | 56.80 ± 2.46 | 65.08 ± 2.34 | 70.83 ± 2.00 | 75.65 ± 1.82 | 32.43 ± 1.66 |
Dense DAL 3DCE R-FCN, 9 slices [3] | 42.09 ± 0.80 | 54.80 ± 1.44 | 65.17 ± 0.72 | 71.33 ± 1.28 | 76.79 ± 1.13 | 80.39 ± 0.81 | 36.76 ± 1.47 |
Dense DKS 3DCE R-FCN, 9 slices [37] | 34.00 ± 0.58 | 42.84 ± 1.58 | 53.50 ± 0.63 | 62.48 ± 0.09 | 67.90 ± 0.09 | 74.36 ± 1.00 | 29.76 ± 0.28 |
MP3D, 9 slices [7] | 44.32 ± 1.11 | 53.30 ± 0.89 | 61.69 ± 0.60 | 68.16 ± 0.90 | 74.94 ± 0.08 | 80.07 ± 0.26 | 43.58 ± 1.01 |
DA-SHT Dense 3DCE R-FCN, 9 slices (ours) | 44.37 ± 0.12 | 58.05 ± 1.13 | 67.29 ± 0.12 | 73.88 ± 0.17 | 78.97 ± 0.14 | 82.11 ± 0.12 | 40.39 ± 0.41 |
Lesion Type | Faster R-CNN | Original R-FCN | 3DCE, 9 Slices | Dense DAL 3DCE R-FCN | Dense DKS 3DCE R-FCN | DA-SHT Dense 3DCE R-FCN (Ours)
---|---|---|---|---|---|---|
BN | 28.79 ± 1.50 | 16.69 ± 7.02 | 15.34 ± 11.99 | 29.52 ± 0.66 | 30.85 ± 1.38 | 40.47 ± 0.66 |
AB | 26.68 ± 0.54 | 21.74 ± 1.06 | 26.41 ± 3.80 | 33.87 ± 2.66 | 24.84 ± 1.70 | 35.16 ± 2.66 |
ME | 43.02 ± 0.12 | 39.22 ± 0.74 | 45.57 ± 7.72 | 50.56 ± 0.54 | 46.83 ± 2.90 | 55.26 ± 2.40 |
LV | 34.58 ± 1.58 | 32.00 ± 1.98 | 38.79 ± 4.90 | 36.19 ± 2.40 | 30.84 ± 0.41 | 41.09 ± 2.28 |
LU | 52.28 ± 0.35 | 52.93 ± 0.63 | 56.02 ± 1.89 | 56.23 ± 2.28 | 52.76 ± 1.49 | 58.85 ± 2.52 |
KD | 29.13 ± 8.80 | 12.38 ± 1.85 | 25.72 ± 4.29 | 31.78 ± 2.52 | 17.01 ± 0.79 | 28.23 ± 6.37 |
ST | 22.28 ± 3.50 | 13.58 ± 1.67 | 24.68 ± 1.11 | 26.08 ± 5.64 | 16.65 ± 1.34 | 26.56 ± 5.64 |
PV | 20.84 ± 1.11 | 17.40 ± 3.54 | 26.94 ± 1.79 | 29.86 ± 0.41 | 18.28 ± 1.46 | 36.68 ± 0.41 |
Lesion Type | Faster R-CNN | Original R-FCN | 3DCE, 9 Slices | Dense DKS 3DCE R-FCN | Dense DAL 3DCE R-FCN | DA-SHT Dense 3DCE R-FCN (Ours)
---|---|---|---|---|---|---|
BN | 48.65 ± 4.22 | 37.84 ± 12.48 | 40.54 ± 11.42 | 54.05 ± 4.46 | 54.05 ± 4.65 | 59.46 ± 14.28 |
AB | 63.5 ± 1.48 | 56.13 ± 2.04 | 64.42 ± 3.81 | 63.19 ± 8.84 | 68.40 ± 2.86 | 71.78 ± 2.86 |
ME | 75.95 ± 1.42 | 70.99 ± 1.33 | 77.48 ± 1.02 | 79.77 ± 4.50 | 79.01 ± 1.43 | 82.44 ± 1.43 |
LV | 69.57 ± 2.28 | 69.57 ± 1.60 | 73.91 ± 2.47 | 71.20 ± 1.38 | 79.35 ± 3.84 | 78.26 ± 3.84 |
LU | 80.45 ± 2.28 | 79.89 ± 2.73 | 81.87 ± 1.68 | 79.32 ± 3.46 | 82.72 ± 2.38 | 84.14 ± 2.38 |
KD | 63.08 ± 6.79 | 43.08 ± 5.10 | 58.46 ± 0.54 | 46.15 ± 1.83 | 72.31 ± 2.99 | 61.54 ± 2.99 |
ST | 58.56 ± 0.78 | 46.85 ± 2.23 | 63.96 ± 1.67 | 51.35 ± 1.97 | 66.67 ± 1.33 | 73.87 ± 1.33 |
PV | 57.78 ± 1.53 | 46.67 ± 1.22 | 60.00 ± 0.71 | 54.81 ± 0.88 | 68.15 ± 0.18 | 74.07 ± 0.18 |
Method | Epoch 1 | Epoch 2 | Epoch 3 | Epoch 4 | Epoch 5 | Epoch 6 | Epoch 7 | Epoch 8 |
---|---|---|---|---|---|---|---|---|
Faster R-CNN [25] | 21.40 | 33.90 | 46.98 | 46.34 | 63.39 | 64.13 | 65.34 | 65.08 |
Original R-FCN [21] | 12.24 | 28.06 | 37.96 | 41.83 | 58.14 | 57.63 | 58.67 | 57.87 |
3DCE, 9 slices [1] | 21.81 | 39.18 | 47.81 | 51.06 | 67.10 | 68.91 | 68.78 | 69.41 |
Dense DAL 3DCE R-FCN, 9 slices [3] | 31.54 | 49.97 | 50.30 | 64.59 | 74.46 | 73.96 | 73.61 | 73.53 |
Dense DKS 3DCE R-FCN, 9 slices [37] | 17.79 | 29.53 | 38.29 | 44.87 | 62.31 | 63.06 | 63.77 | 63.74 |
DA-SHT Dense 3DCE R-FCN, 9 slices (ours) | 32.28 | 47.16 | 53.53 | 60.07 | 74.66 | 75.84 | 75.92 | 76.14 |
3DCE? | DENSE? | DA? | LOSS? | Sen@2 ± std | Sen@4 ± std | AS@2 ± std | AS@4 ± std | mAP ± std
---|---|---|---|---|---|---|---|---
✔ | | | | 78.99 ± 1.12 | 84.39 ± 0.58 | 56.80 ± 2.46 | 65.08 ± 2.34 | 32.43 ± 1.66
✔ | ✔ | | | 77.67 ± 0.36 | 83.33 ± 0.53 | 60.02 ± 1.02 | 67.32 ± 1.13 | 35.74 ± 1.40
✔ | | ✔ | | 79.36 ± 0.77 | 84.47 ± 0.24 | 63.58 ± 0.38 | 71.42 ± 0.26 | 37.99 ± 0.17
✔ | ✔ | | DAL | 79.78 ± 0.95 | 85.10 ± 0.94 | 65.17 ± 0.72 | 71.33 ± 1.28 | 36.76 ± 1.47
✔ | ✔ | ✔ | DAL | 79.84 ± 0.85 | 84.79 ± 0.35 | 66.25 ± 0.33 | 73.20 ± 0.13 | 40.29 ± 0.18
✔ | ✔ | ✔ | RANDOM | 79.93 ± 0.87 | 84.69 ± 0.36 | 66.30 ± 0.34 | 73.63 ± 0.14 | 40.38 ± 0.40
✔ | ✔ | ✔ | SHT | 79.99 ± 0.24 | 84.80 ± 0.22 | 67.29 ± 0.12 | 73.88 ± 0.17 | 40.39 ± 0.41
Hyper-Parameter Values | AS@0.5 ± std | AS@1 ± std | AS@2 ± std | AS@4 ± std | AS@8 ± std | AS@16 ± std | mAP ± std
---|---|---|---|---|---|---|---
10, 0.1 | 35.50 ± 0.22 | 47.13 ± 0.67 | 58.00 ± 0.04 | 66.69 ± 0.06 | 74.06 ± 0.52 | 77.83 ± 0.29 | 31.96 ± 0.15
10, 1 | 44.37 ± 0.12 | 58.05 ± 1.13 | 67.29 ± 0.12 | 73.88 ± 0.17 | 78.97 ± 0.14 | 82.11 ± 0.12 | 40.39 ± 0.41
10, 10 | 36.20 ± 0.19 | 48.48 ± 0.46 | 60.04 ± 0.73 | 67.04 ± 0.79 | 72.25 ± 0.81 | 77.51 ± 0.81 | 32.24 ± 0.37
10, 100 | 37.60 ± 0.17 | 49.43 ± 0.52 | 60.44 ± 0.33 | 68.10 ± 0.48 | 75.83 ± 0.33 | 79.59 ± 0.20 | 34.42 ± 0.24
Gaussian Noise | AS@0.5 ± std | AS@1 ± std | AS@2 ± std | AS@4 ± std | AS@8 ± std | AS@16 ± std | mAP ± std
---|---|---|---|---|---|---|---|
mean = 0, standard deviation = 0.01 | 46.24 ± 0.67 | 56.84 ± 0.52 | 67.31 ± 0.47 | 74.54 ± 0.16 | 78.78 ± 0.43 | 81.56 ± 0.29 | 40.69 ± 0.50 |
mean = 0, standard deviation = 0.02 | 44.56 ± 0.57 | 56.03 ± 0.27 | 67.37 ± 0.26 | 73.77 ± 0.18 | 78.89 ± 0.23 | 81.09 ± 0.09 | 40.39 ± 0.50 |
mean = 0, standard deviation = 0.1 | 43.81 ± 0.12 | 56.30 ± 0.27 | 66.51 ± 0.56 | 73.86 ± 0.75 | 77.77 ± 0.22 | 81.49 ± 0.36 | 39.90 ± 0.62 |
mean = 0, standard deviation = 0.2 | 43.80 ± 1.68 | 56.43 ± 1.00 | 67.24 ± 0.28 | 73.23 ± 0.44 | 78.40 ± 0.50 | 81.20 ± 0.59 | 39.53 ± 1.09 |