A Semi-Supervised Transformer-Based Deep Learning Framework for Automated Tooth Segmentation and Identification on Panoramic Radiographs
Abstract
:1. Introduction
2. Materials and Methods
2.1. Training and Testing Datasets (Benchmark TSI15k)
2.2. Network Architecture
2.3. Semi-Supervised Learning
- (i)
- Teacher pre-training: The teacher model, parameterized by θt, is exclusively trained on labeled data.
- (ii)
- Enhanced burn-in process: The student model, parameterized by θs, is initialized by the image encoder of MobileSAM [21] and trained on both labeled and unlabeled data using pseudo-labels generated by the teacher model in the first pre-training stage. During this phase, the teacher model remains fixed.
- (iii)
- Distillation stage: In this stage, the student model’s weights are transferred to the teacher model, and continue training the student on both labeled and unlabeled data as before. The teacher model is updated using an exponential moving average (EMA) [28] of the student’s weights. The workflow for this stage is illustrated in Figure 2.
2.4. Loss Function
2.5. Evaluation Metrics
2.6. Performance Comparison and Statistical Analysis
2.7. Experiment Settings
3. Results
3.1. Overall Performance
3.2. Performance Comparison between Fully Dentate and Partially Edentulous Cases
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Hung, K.F.; Yeung, A.W.K.; Bornstein, M.M.; Schwendicke, F. Personalized dental medicine, artificial intelligence, and their relevance for dentomaxillofacial imaging. Dentomaxillofac. Radiol. 2023, 52, 20220335. [Google Scholar] [CrossRef]
- Hung, K.F.; Ai, Q.Y.H.; Wong, L.M.; Yeung, A.W.K.; Li, D.T.S.; Leung, Y.Y. Current applications of deep learning and radiomics on CT and CBCT for maxillofacial diseases. Diagnostics 2022, 13, 110. [Google Scholar] [CrossRef]
- Hung, K.; Montalvao, C.; Tanaka, R.; Kawai, T.; Bornstein, M.M. The use and performance of artificial intelligence applications in dental and maxillofacial radiology: A systematic review. Dentomaxillofac. Radiol. 2020, 49, 20190107. [Google Scholar] [CrossRef]
- Joda, T.; Yeung, A.; Hung, K.; Zitzmann, N.; Bornstein, M. Disruptive innovation in dentistry: What it is and what could be next. J. Dent. Res. 2021, 100, 448–453. [Google Scholar] [CrossRef]
- Chen, X.; Ma, N.; Xu, T.; Xu, C. Deep learning-based tooth segmentation methods in medical imaging: A review. Proc. Inst. Mech. Eng. H 2024, 238, 115–131. [Google Scholar] [CrossRef]
- Zhao, Y.; Li, P.; Gao, C.; Liu, Y.; Chen, Q.; Yang, F.; Meng, D. TSASNet: Tooth segmentation on dental panoramic X-ray images by two-stage attention segmentation network. Knowl. Based Syst. 2020, 206, 106338. [Google Scholar] [CrossRef]
- Hou, S.; Zhou, T.; Liu, Y.; Dang, P.; Lu, H.; Shi, H. Teeth U-Net: A segmentation model of dental panoramic X-ray images for context semantics and contrast enhancement. Comput. Biol. Med. 2023, 152, 106296. [Google Scholar] [CrossRef]
- Wang, S.; Liang, S.; Chang, Q.; Zhang, L.; Gong, B.; Bai, Y.; Zuo, F.; Wang, Y.; Xie, X.; Gu, Y. STSN-Net: Simultaneous tooth segmentation and numbering method in crowded environments with deep learning. Diagnostics 2024, 14, 497. [Google Scholar] [CrossRef]
- Nagaraju, P.; Sudha, S.V. Design of a novel panoptic segmentation using multi-scale pooling model for tooth segmentation. Soft Comput. 2024, 28, 4185–4196. [Google Scholar] [CrossRef]
- Lin, S.; Hao, X.; Liu, Y.; Yan, D.; Liu, J.; Zhong, M. Lightweight deep learning methods for panoramic dental X-ray image segmentation. Neural Comput. Appl. 2023, 35, 8295–8306. [Google Scholar] [CrossRef]
- Chandrashekar, G.; AlQarni, S.; Bumann, E.E.; Lee, Y. Collaborative deep learning model for tooth segmentation and identification using panoramic radiographs. Comput. Biol. Med. 2022, 148, 105829. [Google Scholar] [CrossRef]
- Putra, R.H.; Astuti, E.R.; Putri, D.K.; Widiasri, M.; Laksanti, P.A.M.; Majidah, H.; Yoda, N. Automated permanent tooth detection and numbering on panoramic radiograph using a deep learning approach. Oral Surg. Oral Med. Oral Pathol. Oral Radiol. 2024, 137, 537–544. [Google Scholar] [CrossRef]
- Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241. [Google Scholar]
- Berrada, T.; Couprie, C.; Alahari, K.; Verbeek, J. Guided distillation for semi-supervised instance segmentation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 1–10 January 2024; pp. 475–483. [Google Scholar]
- Qayyum, A.; Tahir, A.; Butt, M.A.; Luke, A.; Abbas, H.T.; Qadir, J.; Arshad, K.; Assaleh, K.; Imran, M.A.; Abbasi, Q.H. Dental caries detection using a semi-supervised learning approach. Sci. Rep. 2023, 13, 749. [Google Scholar] [CrossRef]
- Hao, J.; Zhu, Y.; He, L.; Liu, M.; Tsoi, J.K.H.; Hung, K.F. T-mamba: A unified framework with Long-Range Dependency in dual-domain for 2D & 3D Tooth Segmentation. arXiv 2024, arXiv:2404.01065. [Google Scholar] [CrossRef]
- Humans in the Loop. Teeth Segmentation on Dental X-ray Images. 2023. Available online: https://www.kaggle.com/datasets/humansintheloop/teeth-segmentation-on-dental-x-ray-images (accessed on 20 May 2023).
- Panetta, K.; Rajendran, R.; Ramesh, A.; Rao, S.; Agaian, S. Tufts Dental Database: A multimodal panoramic X-ray dataset for benchmarking diagnostic systems. IEEE J. Biomed. Health Inform. 2022, 26, 1650–1659. [Google Scholar] [CrossRef]
- Hao, J.; Liu, M.; Yang, J.; Hung, K.F. GEM: Boost simple network for glass surface segmentation via vision foundation models. arXiv 2023, arXiv:2307.12018. [Google Scholar]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
- Zhang, C.; Han, D.; Qiao, Y.; Kim, J.U.; Bae, S.H.; Lee, S.; Hong, C.S. Faster segment anything: Towards lightweight sam for mobile applications. arXiv 2023, arXiv:2306.14289. [Google Scholar]
- Li, Y.; Mao, H.; Girshick, R.; He, K. Exploring plain vision transformer backbones for object detection. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 280–296. [Google Scholar]
- Radford, A.; Kim, J.W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. Learning transferable visual models from natural language supervision. In Proceedings of the ICML, Online, 18–24 July 2021; pp. 8748–8763. [Google Scholar]
- Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907. [Google Scholar]
- Li, D.; Yang, J.; Kreis, K.; Torralba, A.; Fidler, S. Semantic segmentation with generative models: Semi-supervised learning and strong out-of-domain generalization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 8300–8311. [Google Scholar]
- Springenberg, J.T. Unsupervised and semi-supervised learning with categorical generative adversarial networks. arXiv 2015, arXiv:1511.06390. [Google Scholar]
- Van Engelen, J.E.; Hoos, H.H. A survey on semi-supervised learning. Mach. Learn. 2020, 109, 373–440. [Google Scholar] [CrossRef]
- Cai, Z.; Ravichandran, A.; Maji, S.; Fowlkes, C.; Tu, Z.; Soatto, S. Exponential moving average normalization for self-supervised and semi-supervised learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 194–203. [Google Scholar]
- Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 658–666. [Google Scholar]
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
- Zhao, R.; Qian, B.; Zhang, X.; Li, Y.; Wei, R.; Liu, Y.; Pan, Y. Rethinking dice loss for medical image segmentation. In Proceedings of the 2020 IEEE International Conference on Data Mining (ICDM), Sorrento, Italy, 17–20 November 2020; IEEE: New York, NY, USA, 2020; pp. 851–860. [Google Scholar]
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask r-cnn. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969. [Google Scholar]
- Zhang, H.; Li, F.; Xu, H.; Huang, S.; Liu, S.; Ni, L.M.; Zhang, L. MP-Former: Mask-piloted transformer for image segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 18074–18083. [Google Scholar]
- Cheng, B.; Misra, I.; Schwing, A.G.; Kirillov, A.; Girdhar, R. Masked-attention mask transformer for universal image segmentation. arXiv 2022, arXiv:2112.01527. [Google Scholar]
- Li, F.; Zhang, H.; Xu, H.; Liu, S.; Zhang, L.; Ni, L.M.; Shum, H.Y. Mask dino: Towards a unified transformer-based framework for object detection and segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 3041–3050. [Google Scholar]
- Hung, K.F.; Ai, Q.Y.H.; Leung, Y.Y.; Yeung, A.W.K. Potential and impact of artificial intelligence algorithms in dento-maxillofacial radiology. Clin. Oral Investig. 2022, 26, 5535–5555. [Google Scholar] [CrossRef]
- Tuzoff, D.V.; Tuzova, L.N.; Bornstein, M.M.; Krasnov, A.S.; Kharchenko, M.A.; Nikolenko, S.I.; Sveshnikov, M.M.; Bednenko, G.B. Tooth detection and numbering in panoramic radiographs using convolutional neural networks. Dentomaxillofac. Radiol. 2019, 48, 20180051. [Google Scholar] [CrossRef]
TSI15k Dataset | |||
---|---|---|---|
Training Set | Test Set | ||
Cohort 1 | Labeled images | 1398 | 191 |
Cohort 2 | Unlabeled images | 14,728 | 0 |
Networks | Segmentation | Identification | Parameters (M) | |||
---|---|---|---|---|---|---|
IoU (%) | Dice (%) | Precision (%) | Recall (%) | F1 Score (%) | ||
Mask R-CNN | 91.58 * (p < 0.001) | 92.44 * (p < 0.001) | 92.24 * (p < 0.001) | 94.13 * (p < 0.001) | 93.17 * (p < 0.001) | 44.5 |
MPFormer | 93.26 * (p = 0.002) | 94.39 * (p = 0.006) | 90.99 * (p < 0.001) | 93.63 * (p < 0.001) | 92.29 * (p < 0.001) | 43.9 |
Mask2Former | 94.16 | 95.43 | 93.70 * (p = 0.002) | 96.45 | 95.06 * (p = 0.014) | 44.0 |
MaskDINO | 93.75 | 94.64 | 93.74 * (p = 0.010) | 95.81 * (p = 0.050) | 94.76 * (p = 0.022) | 52.0 |
GEM | 93.92 | 94.75 * (p = 0.043) | 93.96 * (p = 0.013) | 96.04 * (p = 0.005) | 94.99 * (p = 0.006) | 21.6 |
SemiTNet (ours) | 94.41 | 95.45 | 94.74 | 97.10 | 95.90 | 21.6 |
Fully Dentate Individuals (n = 40) | Partially Edentulous Individuals (n = 151) | |||||||||
---|---|---|---|---|---|---|---|---|---|---|
Network | Segmentation | Identification | Segmentation | Identification | ||||||
IoU (%) | Dice (%) | Precision (%) | Recall (%) | F1 Score (%) | IoU (%) | Dice (%) | Precision (%) | Recall (%) | F1 Score (%) | |
MaskR-CNN | 98.84 * (p = 0.033) | 98.98 * (p = 0.040) | 99.04 | 99.36 | 99.20 | 89.65 * (p < 0.001) | 90.70 * (p < 0.001) | 90.44 * (p < 0.001) | 92.74 * (p < 0.001) | 91.57 * (p < 0.001) |
MPFormer | 99.24 | 99.32 | 97.64 * (p = 0.004) | 97.82 * (p = 0.002) | 97.73 * (p = 0.003) | 91.67 * (p = 0.004) | 93.09 * (p = 0.011) | 89.23 * (p < 0.001) | 92.51 * (p < 0.001) | 90.84 * (p < 0.001) |
Mask2Former | 99.47 | 99.59 | 99.26 | 99.57 | 99.41 | 92.24 | 93.33 | 92.28 * (p = 0.003) | 94.82 | 93.53 * (p = 0.018) |
MaskDINO | 99.53 | 99.64 | 99.45 | 99.64 | 99.55 | 92.74 | 94.31 | 92.18 * (p = 0.017) | 95.61 | 93.86 * (p = 0.034) |
GEM | 99.84 | 99.86 | 99.65 | 99.65 | 99.65 | 92.35 | 93.39 * (p = 0.038) | 92.46 * (p = 0.013) | 95.09 * (p = 0.006) | 93.76 * (p = 0.006) |
SemiTNet (ours) | 99.76 | 99.78 | 99.69 | 99.72 | 99.70 | 93.00 | 94.30 | 93.42 | 96.40 | 94.89 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Hao, J.; Wong, L.M.; Shan, Z.; Ai, Q.Y.H.; Shi, X.; Tsoi, J.K.H.; Hung, K.F. A Semi-Supervised Transformer-Based Deep Learning Framework for Automated Tooth Segmentation and Identification on Panoramic Radiographs. Diagnostics 2024, 14, 1948. https://doi.org/10.3390/diagnostics14171948
Hao J, Wong LM, Shan Z, Ai QYH, Shi X, Tsoi JKH, Hung KF. A Semi-Supervised Transformer-Based Deep Learning Framework for Automated Tooth Segmentation and Identification on Panoramic Radiographs. Diagnostics. 2024; 14(17):1948. https://doi.org/10.3390/diagnostics14171948
Chicago/Turabian StyleHao, Jing, Lun M. Wong, Zhiyi Shan, Qi Yong H. Ai, Xieqi Shi, James Kit Hon Tsoi, and Kuo Feng Hung. 2024. "A Semi-Supervised Transformer-Based Deep Learning Framework for Automated Tooth Segmentation and Identification on Panoramic Radiographs" Diagnostics 14, no. 17: 1948. https://doi.org/10.3390/diagnostics14171948
APA StyleHao, J., Wong, L. M., Shan, Z., Ai, Q. Y. H., Shi, X., Tsoi, J. K. H., & Hung, K. F. (2024). A Semi-Supervised Transformer-Based Deep Learning Framework for Automated Tooth Segmentation and Identification on Panoramic Radiographs. Diagnostics, 14(17), 1948. https://doi.org/10.3390/diagnostics14171948