Multi-Scale 3D Cephalometric Landmark Detection Based on Direct Regression with 3D CNN Architectures
Abstract
1. Introduction
- (1) Performance comparison of 3D CNN architectures for direct landmark regression;
- (2) Development of an automatic cephalometric landmark detection network using multi-scale direct regression;
- (3) Validation on 150 sets of clinical maxillofacial CT data.
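The contributions above imply a two-stage, multi-scale pipeline (Sections 2.5.1–2.5.3): a coarse network regresses an approximate landmark position on a downsampled volume, a 3D region of interest (ROI) is cropped around that estimate, and a fine network refines the position inside the ROI. A minimal NumPy sketch of that flow follows; the `coarse_net`/`fine_net` stand-ins, the strided downsampling, and the shapes are illustrative assumptions, not the paper's exact implementation (the actual stages are 3D CNN regressors such as DenseNet169).

```python
import numpy as np

def crop_roi(volume, center, size):
    """Crop a cubic ROI around a (z, y, x) center, clamped to the volume bounds."""
    lo = np.clip(np.asarray(center) - size // 2, 0,
                 np.asarray(volume.shape) - size)
    z, y, x = lo
    return volume[z:z + size, y:y + size, x:x + size], lo

def coarse_to_fine(volume, coarse_net, fine_net, scale=4, roi=32):
    # Coarse stage: regress an approximate landmark on a downsampled volume.
    small = volume[::scale, ::scale, ::scale]
    approx = np.asarray(coarse_net(small)) * scale  # map back to full resolution
    # ROI stage: crop a cube around the coarse estimate.
    patch, origin = crop_roi(volume, approx.astype(int), roi)
    # Fine stage: regress the voxel position inside the ROI and offset it
    # by the ROI origin to recover full-volume coordinates.
    return origin + np.asarray(fine_net(patch))
```

The design choice this illustrates is why the proposed coarse-to-fine variant in Section 3 trades inference time for accuracy: the fine stage sees the landmark neighborhood at full resolution instead of the heavily downsampled whole volume.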
2. Methods
2.1. Method Overview
2.2. Data Description
2.3. Data Preprocessing
2.4. Three-Dimensional CNN Architectures
2.4.1. ResNet Architecture
2.4.2. DenseNet Architecture
2.4.3. Inception Architecture
2.4.4. InceptionResNet Architecture
2.5. Proposed Architecture
2.5.1. Coarse Detection
2.5.2. Three-Dimensional Region of Interest (ROI) Processing
2.5.3. Fine Localization
3. Experimental Results
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
| Architecture | Parameters | Accuracy (mm) | Time (s) |
|---|---|---|---|
| Inception-ResNetV2 | 67,647,802 | 2.978 ± 0.493 | 0.246 |
| InceptionV3 | 34,299,770 | 2.917 ± 0.501 | 0.225 |
| ResNet50 | 46,377,501 | 2.787 ± 0.545 | 0.118 |
| ResNet101 | 85,475,869 | 2.754 ± 0.473 | 0.142 |
| ResNet152 | 117,680,669 | 2.783 ± 0.490 | 0.302 |
| DenseNet201 | 25,732,762 | 2.505 ± 0.391 | 0.172 |
| DenseNet121 | 11,418,522 | 2.467 ± 0.435 | 0.232 |
| DenseNet169 | 18,850,970 | 2.464 ± 0.382 | 0.334 |
| DenseNet169 + coarse-to-fine (proposed) | 39,056,605 | 2.238 ± 0.364 | 3.835 |
| Landmark | Inception-ResNetV2 | InceptionV3 | ResNet50 | ResNet101 | ResNet152 | DenseNet201 | DenseNet121 | DenseNet169 | Proposed |
|---|---|---|---|---|---|---|---|---|---|
| 11apex | 2.497 ± 0.922 | 2.620 ± 1.177 | 2.302 ± 1.136 | 2.087 ± 0.956 | 2.379 ± 1.081 | 1.959 ± 0.889 | 2.068 ± 0.873 | 1.925 ± 0.876 | 1.815 ± 0.945 |
| 13 | 2.412 ± 1.090 | 2.473 ± 1.085 | 2.268 ± 0.936 | 2.330 ± 0.889 | 2.450 ± 1.106 | 2.145 ± 1.098 | 2.197 ± 0.944 | 2.153 ± 0.940 | 2.028 ± 0.913 |
| 16 | 3.221 ± 1.180 | 3.343 ± 1.270 | 2.865 ± 1.146 | 2.823 ± 1.388 | 3.201 ± 1.463 | 2.882 ± 1.226 | 2.956 ± 1.121 | 2.973 ± 1.247 | 2.875 ± 1.208 |
| 21apex | 2.477 ± 0.865 | 2.506 ± 1.255 | 2.221 ± 0.931 | 2.130 ± 0.976 | 2.475 ± 0.900 | 1.947 ± 0.776 | 1.951 ± 0.819 | 1.838 ± 0.785 | 1.953 ± 0.731 |
| 23 | 2.748 ± 1.226 | 3.008 ± 1.524 | 2.574 ± 1.065 | 2.572 ± 1.159 | 2.803 ± 1.265 | 2.278 ± 1.166 | 2.345 ± 1.228 | 2.286 ± 1.129 | 2.053 ± 1.074 |
| 26 | 3.097 ± 1.147 | 3.395 ± 1.452 | 2.996 ± 1.300 | 2.969 ± 1.271 | 3.243 ± 1.652 | 2.886 ± 1.288 | 2.986 ± 1.192 | 2.978 ± 1.357 | 2.900 ± 1.244 |
| A-point | 2.484 ± 1.001 | 2.595 ± 1.062 | 2.265 ± 0.953 | 2.306 ± 0.995 | 2.319 ± 0.843 | 2.083 ± 0.860 | 2.133 ± 0.913 | 2.179 ± 0.954 | 2.106 ± 0.985 |
| ANS | 2.778 ± 1.356 | 3.003 ± 1.367 | 2.611 ± 1.205 | 2.869 ± 1.422 | 2.394 ± 1.188 | 2.416 ± 1.325 | 2.414 ± 1.379 | 2.495 ± 1.322 | 2.656 ± 1.676 |
| B | 2.933 ± 1.408 | 2.824 ± 1.255 | 2.537 ± 1.407 | 2.746 ± 1.328 | 2.550 ± 1.198 | 2.821 ± 1.148 | 2.354 ± 1.255 | 2.515 ± 1.207 | 2.041 ± 1.118 |
| Ba | 3.271 ± 1.822 | 2.427 ± 1.132 | 3.102 ± 1.661 | 3.124 ± 1.631 | 3.051 ± 1.742 | 2.680 ± 1.504 | 2.418 ± 1.226 | 2.683 ± 1.386 | 2.131 ± 1.135 |
| CoL | 3.444 ± 1.073 | 3.383 ± 1.522 | 3.661 ± 1.861 | 3.821 ± 1.767 | 3.847 ± 1.285 | 3.274 ± 1.445 | 3.319 ± 1.300 | 2.907 ± 1.373 | 2.832 ± 1.393 |
| CoR | 3.618 ± 1.532 | 3.351 ± 1.486 | 3.549 ± 1.435 | 3.543 ± 1.236 | 3.326 ± 1.459 | 3.060 ± 1.306 | 2.946 ± 1.117 | 2.862 ± 1.035 | 2.860 ± 1.126 |
| Gn | 3.189 ± 1.438 | 2.650 ± 1.440 | 2.367 ± 1.219 | 2.373 ± 0.961 | 2.319 ± 0.800 | 2.196 ± 1.171 | 1.884 ± 1.075 | 1.938 ± 0.927 | 1.457 ± 0.674 |
| GoL | 4.448 ± 2.079 | 4.608 ± 1.949 | 4.104 ± 2.034 | 4.184 ± 2.197 | 4.104 ± 2.099 | 3.567 ± 2.101 | 3.660 ± 1.779 | 3.666 ± 1.866 | 3.038 ± 1.862 |
| GoR | 4.213 ± 1.623 | 3.740 ± 1.476 | 3.875 ± 1.459 | 3.741 ± 1.731 | 3.417 ± 1.574 | 3.473 ± 1.579 | 3.386 ± 1.654 | 3.469 ± 1.745 | 2.664 ± 1.444 |
| L1 | 2.645 ± 1.051 | 2.803 ± 1.029 | 2.932 ± 0.981 | 2.900 ± 0.979 | 2.924 ± 0.963 | 2.730 ± 0.877 | 2.502 ± 0.937 | 2.734 ± 0.942 | 2.291 ± 0.934 |
| Lt. FZS | 2.281 ± 1.198 | 2.654 ± 1.282 | 2.592 ± 1.180 | 2.419 ± 1.020 | 2.367 ± 1.254 | 2.053 ± 0.843 | 2.040 ± 0.853 | 2.120 ± 0.845 | 1.860 ± 0.891 |
| Menton | 3.048 ± 1.743 | 2.687 ± 1.249 | 2.358 ± 1.294 | 2.292 ± 0.884 | 2.253 ± 0.951 | 2.091 ± 1.107 | 2.042 ± 1.079 | 1.935 ± 0.981 | 1.533 ± 0.746 |
| Na | 2.297 ± 1.193 | 2.394 ± 1.102 | 2.228 ± 1.073 | 2.197 ± 0.912 | 2.218 ± 0.991 | 1.732 ± 0.930 | 1.708 ± 0.918 | 1.767 ± 0.992 | 1.699 ± 0.805 |
| Nasal tip | 3.001 ± 1.598 | 3.009 ± 1.184 | 2.574 ± 1.273 | 2.657 ± 1.163 | 2.756 ± 1.211 | 2.662 ± 1.125 | 2.795 ± 1.407 | 2.710 ± 1.400 | 2.689 ± 1.535 |
| OrL | 2.967 ± 1.505 | 3.093 ± 1.484 | 2.864 ± 1.418 | 2.559 ± 1.264 | 2.863 ± 1.424 | 2.644 ± 1.090 | 2.639 ± 1.028 | 2.633 ± 1.129 | 2.318 ± 1.211 |
| OrR | 3.183 ± 1.184 | 3.092 ± 1.210 | 2.601 ± 0.841 | 2.876 ± 1.166 | 2.785 ± 1.325 | 2.572 ± 0.996 | 2.624 ± 1.062 | 2.664 ± 1.003 | 2.312 ± 0.980 |
| PNS | 2.889 ± 1.574 | 2.698 ± 1.851 | 2.826 ± 1.787 | 2.828 ± 1.740 | 2.855 ± 1.741 | 2.687 ± 1.656 | 2.604 ± 1.636 | 2.804 ± 1.438 | 2.597 ± 1.451 |
| PoL | 3.570 ± 1.553 | 3.031 ± 1.229 | 3.583 ± 1.591 | 3.295 ± 1.649 | 3.321 ± 1.554 | 3.041 ± 1.300 | 2.830 ± 1.413 | 2.819 ± 1.216 | 2.774 ± 1.251 |
| PoR | 3.310 ± 1.419 | 2.704 ± 1.114 | 3.402 ± 1.627 | 2.865 ± 1.199 | 2.886 ± 1.338 | 2.459 ± 1.169 | 2.525 ± 1.024 | 2.658 ± 0.973 | 2.253 ± 0.978 |
| Pog | 3.045 ± 1.369 | 2.681 ± 1.247 | 2.467 ± 1.169 | 2.508 ± 0.966 | 2.603 ± 0.721 | 2.324 ± 0.937 | 1.968 ± 0.984 | 2.018 ± 0.847 | 1.687 ± 0.813 |
| Rt. FZS | 2.838 ± 1.253 | 2.594 ± 1.061 | 2.140 ± 1.025 | 2.371 ± 1.516 | 2.307 ± 1.172 | 2.140 ± 1.173 | 2.102 ± 1.085 | 1.991 ± 1.091 | 1.747 ± 1.102 |
| S | 2.232 ± 1.374 | 2.606 ± 1.300 | 2.544 ± 1.213 | 2.319 ± 1.311 | 2.381 ± 1.305 | 1.971 ± 1.119 | 2.039 ± 1.046 | 1.872 ± 1.109 | 1.845 ± 0.940 |
| Sp | 2.780 ± 1.808 | 2.879 ± 1.241 | 2.662 ± 1.551 | 2.537 ± 1.426 | 2.699 ± 1.589 | 2.149 ± 1.288 | 2.297 ± 1.370 | 2.122 ± 1.308 | 2.033 ± 1.374 |
| U1 | 2.430 ± 1.016 | 2.659 ± 1.062 | 2.532 ± 0.970 | 2.375 ± 1.053 | 2.382 ± 1.047 | 2.224 ± 1.062 | 2.272 ± 0.969 | 2.201 ± 0.897 | 2.091 ± 0.872 |
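The accuracy figures above report the mean ± standard deviation of the 3D point-to-point error in millimeters. As a hedged illustration (the paper's exact evaluation code is not shown in this excerpt), the metric reduces to the Euclidean distance between predicted and ground-truth landmark coordinates, scaled by the voxel spacing; isotropic spacing is assumed here for simplicity:

```python
import numpy as np

def radial_errors(pred, gt, spacing=1.0):
    """Mean and std of per-landmark 3D Euclidean errors.

    pred, gt: (N, 3) arrays of landmark coordinates in voxels;
    spacing: isotropic voxel size in mm (an assumption for illustration).
    """
    diff = (np.asarray(pred, float) - np.asarray(gt, float)) * spacing
    d = np.linalg.norm(diff, axis=-1)  # per-landmark distance in mm
    return d.mean(), d.std()
```

For example, a prediction offset by (3, 4, 0) voxels at 1 mm spacing contributes a 5 mm error to the mean.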
Share and Cite
Song, C.; Jeong, Y.; Huh, H.; Park, J.-W.; Paeng, J.-Y.; Ahn, J.; Son, J.; Jung, E. Multi-Scale 3D Cephalometric Landmark Detection Based on Direct Regression with 3D CNN Architectures. Diagnostics 2024, 14, 2605. https://doi.org/10.3390/diagnostics14222605