BUViTNet: Breast Ultrasound Detection via Vision Transformers
Abstract
1. Introduction
- Developed the first multistage transfer-learning method using vision transformers for breast cancer detection (a conceptual sketch of this protocol follows the list).
- Utilized microscopic image datasets, whose image features resemble those of ultrasound images, for intermediate-stage transfer learning to improve the performance of early breast cancer detection.
- Carefully studied the characteristics of different pretrained vision transformer models for transfer to ultrasound-image-based breast cancer detection.
- Investigated the effectiveness of the proposed BUViTNet method when applied to datasets from different sources, as well as to mixed datasets of different origins.
- Compared the performance of the BUViTNet method against vision transformers trained from scratch, conventional vision transformer-based transfer learning, and convolutional neural networks for breast cancer detection.
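The multistage protocol these contributions describe (ImageNet-pretrained ViT, intermediate fine-tuning on microscopic images, final fine-tuning on breast ultrasound) can be condensed into a conceptual sketch. This is not the authors' code: the torchvision ViT stands in for the pretrained backbone, the two-class head and the toy one-batch loaders are illustrative placeholders, and only the Adam optimizer and 0.0001 learning rate are taken from the paper's tables.

```python
import torch
from torchvision.models import vit_b_16, ViT_B_16_Weights

def fine_tune(model, loader, epochs=1, lr=1e-4):
    """One fine-tuning stage: Adam at LR 0.0001, as reported in the results tables."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in loader:
            opt.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            opt.step()
    return model

# Toy one-batch "loaders"; replace with real DataLoaders over the
# microscopic and ultrasound datasets.
microscopy_loader = [(torch.randn(2, 3, 224, 224), torch.tensor([0, 1]))]
ultrasound_loader = [(torch.randn(2, 3, 224, 224), torch.tensor([1, 0]))]

# Stage 1: start from an ImageNet-pretrained ViT backbone.
model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
model.heads = torch.nn.Linear(768, 2)  # fresh classification head (2 classes, illustrative)
# Stage 2: intermediate fine-tuning on related microscopic images.
model = fine_tune(model, microscopy_loader)
# Stage 3: final fine-tuning on the target breast ultrasound images.
model = fine_tune(model, ultrasound_loader)
```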
2. Materials and Methods
2.1. Dataset
2.2. Preprocessing
2.3. Proposed Method
- Conversion of images into patches: ViTs treat image patches the way the original transformer of Vaswani et al. [34] treats word tokens. Individual pixels could be used as tokens instead; however, this greatly increases the computational cost, and hardware capable of processing high-resolution inputs such as medical images at that granularity is difficult to obtain. Therefore, the input images were converted into patches, as in [26]; in this study, each image was divided into non-overlapping patches of 16 × 16 or 32 × 32 pixels, depending on the ViT variant (vitb_16, vitb_32, or vitl_32).
- Flattening and patch embedding: The patches were then flattened and passed through a single feed-forward layer to obtain a linear patch projection. This feed-forward layer contains the embedding matrix E described by Dosovitskiy et al. [26]; E is randomly initialized.
- Learnable embedding and positional embedding: A learnable class embedding is concatenated with the patch projections and is later used for classification. Because image patches, unlike the inputs of time-sequence models, have no natural ordering, the transformer adds positional embeddings to impose an order on the patches. Like the patch embedding matrix, the positional encoding matrix is randomly initialized.
- Multilayer perceptron (MLP) head: The outputs of the transformer encoder are passed to the MLP head for classification. Although the encoder produces one output per token, the MLP head uses only the output corresponding to the class embedding and ignores the rest; it outputs a probability distribution over the class labels for the input image. These four steps are condensed in the code sketch after this list.
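A minimal, self-contained sketch of this pipeline is given below. It is illustrative rather than the authors' implementation: the 224 × 224 input size is an assumption, the hyperparameters follow the vitb_16 configuration (12 layers, hidden size 768, MLP size 3072, 12 heads, 16 × 16 patches), and a single linear layer stands in for the MLP head.

```python
import torch
import torch.nn as nn

class MiniViT(nn.Module):
    def __init__(self, image_size=224, patch_size=16, dim=768,
                 depth=12, heads=12, mlp_dim=3072, num_classes=2):
        super().__init__()
        num_patches = (image_size // patch_size) ** 2
        patch_dim = 3 * patch_size * patch_size
        # Flattening + single feed-forward layer = linear patch projection E
        self.patch_embed = nn.Linear(patch_dim, dim)
        # Learnable class embedding and positional embeddings, randomly initialized
        self.cls_token = nn.Parameter(torch.randn(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.randn(1, num_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=mlp_dim, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        # Head reads only the class-token output; other token outputs are ignored
        self.mlp_head = nn.Linear(dim, num_classes)
        self.patch_size = patch_size

    def forward(self, img):                                  # img: (B, 3, H, W)
        b, p = img.size(0), self.patch_size
        # Convert the image into non-overlapping p x p patches and flatten each
        patches = img.unfold(2, p, p).unfold(3, p, p)        # (B, 3, H/p, W/p, p, p)
        patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, 3 * p * p)
        tokens = self.patch_embed(patches)                   # (B, N, dim)
        cls = self.cls_token.expand(b, -1, -1)
        x = torch.cat([cls, tokens], dim=1) + self.pos_embed  # prepend class token
        x = self.encoder(x)
        return self.mlp_head(x[:, 0])                        # logits from class token

# Softmax over the head's logits gives the label probability distribution.
probs = MiniViT()(torch.randn(1, 3, 224, 224)).softmax(dim=-1)
```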
2.4. Implementation Details
2.5. Experimental Settings
2.6. Evaluation Metrics
3. Results
4. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Siegel, R.L.; Miller, K.D.; Fuchs, H.E.; Jemal, A. Cancer Statistics, 2022. CA Cancer J. Clin. 2022, 72, 7–33.
- Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J. Clin. 2021, 71, 209–249.
- Siegel, R.L.; Miller, K.D.; Jemal, A. Cancer Statistics, 2020. CA Cancer J. Clin. 2020, 70, 7–30.
- Aggarwal, R.; Sounderajah, V.; Martin, G.; Ting, D.S.W.; Karthikesalingam, A.; King, D.; Ashrafian, H.; Darzi, A. Diagnostic Accuracy of Deep Learning in Medical Imaging: A Systematic Review and Meta-Analysis. NPJ Digit. Med. 2021, 4, 65.
- Lima, Z.S.; Ebadi, M.R.; Amjad, G.; Younesi, L. Application of Imaging Technologies in Breast Cancer Detection: A Review Article. Open Access Maced. J. Med. Sci. 2019, 7, 838–848.
- Hovda, T.; Tsuruda, K.; Hoff, S.R.; Sahlberg, K.K.; Hofvind, S. Radiological Review of Prior Screening Mammograms of Screen-Detected Breast Cancer. Eur. Radiol. 2021, 31, 2568–2579.
- Rothschild, J.; Lourenco, A.P.; Mainiero, M.B. Screening Mammography Recall Rate: Does Practice Site Matter? Radiology 2013, 269, 348–353.
- Geisel, J.; Raghu, M.; Hooley, R. The Role of Ultrasound in Breast Cancer Screening: The Case for and Against Ultrasound. Semin. Ultrasound CT MRI 2018, 39, 25–34.
- Liu, H.; Zhan, H.; Sun, D.; Zhang, Y. Comparison of BSGI, MRI, Mammography, and Ultrasound for the Diagnosis of Breast Lesions and Their Correlations with Specific Molecular Subtypes in Chinese Women. BMC Med. Imaging 2020, 20, 98.
- Mimura, T.; Okawa, S.; Kawaguchi, H.; Tanikawa, Y.; Hoshi, Y. Imaging the Human Thyroid Using Three-Dimensional Diffuse Optical Tomography: A Preliminary Study. Appl. Sci. 2021, 11, 1647.
- Bene, I.B.; Ciurea, A.I.; Ciortea, C.A.; Dudea, S.M. Pros and Cons for Automated Breast Ultrasound (ABUS): A Narrative Review. J. Pers. Med. 2021, 11, 703.
- Ayana, G.; Dese, K.; Raj, H.; Krishnamoorthy, J.; Kwa, T. De-Speckling Breast Cancer Ultrasound Images Using a Rotationally Invariant Block Matching Based Non-Local Means (RIBM-NLM) Method. Diagnostics 2022, 12, 862.
- Ayana, G.; Ryu, J. Ultrasound-Responsive Nanocarriers for Breast Cancer Chemotherapy. Micromachines 2022, 13, 1508.
- Yuan, W.H.; Hsu, H.C.; Chen, Y.Y.; Wu, C.H. Supplemental Breast Cancer-Screening Ultrasonography in Women with Dense Breasts: A Systematic Review and Meta-Analysis. Br. J. Cancer 2020, 123, 673–688.
- Wang, L. Early Diagnosis of Breast Cancer. Sensors 2017, 17, 1572.
- The American Cancer Society Medical and Editorial Content Team. Breast Cancer Early Detection and Diagnosis. Available online: https://www.cancer.org (accessed on 8 August 2022).
- Yap, M.H.; Pons, G.; Marti, J.; Ganau, S.; Sentis, M.; Zwiggelaar, R.; Davison, A.K.; Marti, R. Automated Breast Ultrasound Lesions Detection Using Convolutional Neural Networks. IEEE J. Biomed. Health Inform. 2018, 22, 1218–1226.
- Seely, J.M.; Alhassan, T. Screening for Breast Cancer in 2018—What Should We Be Doing Today? Curr. Oncol. 2018, 25, S115–S124.
- Chougrad, H.; Zouaki, H.; Alheyane, O. Multi-Label Transfer Learning for the Early Diagnosis of Breast Cancer. Neurocomputing 2020, 392, 168–180.
- Park, G.E.; Kang, B.J.; Kim, S.H.; Lee, J. Retrospective Review of Missed Cancer Detection and Its Mammography Findings with Artificial-Intelligence-Based, Computer-Aided Diagnosis. Diagnostics 2022, 12, 387.
- Mridha, M.F.; Hamid, M.A.; Monowar, M.M.; Keya, A.J.; Ohi, A.Q.; Islam, M.R.; Kim, J.-M. A Comprehensive Survey on Deep-Learning-Based Breast Cancer Diagnosis. Cancers 2021, 13, 6116.
- Oyelade, O.N.; Ezugwu, A.E.S. A State-of-the-Art Survey on Deep Learning Methods for Detection of Architectural Distortion from Digital Mammography. IEEE Access 2020, 8, 148644–148676.
- Salim, M.; Wåhlin, E.; Dembrower, K.; Azavedo, E.; Foukakis, T.; Liu, Y.; Smith, K.; Eklund, M.; Strand, F. External Evaluation of 3 Commercial Artificial Intelligence Algorithms for Independent Assessment of Screening Mammograms. JAMA Oncol. 2020, 6, 1581–1588.
- Murtaza, G.; Shuib, L.; Abdul Wahab, A.W.; Mujtaba, G.; Mujtaba, G.; Nweke, H.F.; Al-garadi, M.A.; Zulfiqar, F.; Raza, G.; Azmi, N.A. Deep Learning-Based Breast Cancer Classification through Medical Imaging Modalities: State of the Art and Research Challenges. Artif. Intell. Rev. 2020, 53, 1655–1720.
- Ayana, G.; Dese, K.; Choe, S. Transfer Learning in Breast Cancer Diagnoses via Ultrasound Imaging. Cancers 2021, 13, 738.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image Is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929.
- Ayana, G.; Park, J.; Choe, S.W. Patchless Multi-Stage Transfer Learning for Improved Mammographic Breast Mass Classification. Cancers 2022, 14, 1280.
- Ayana, G.; Park, J.; Jeong, J.W.; Choe, S.W. A Novel Multistage Transfer Learning for Ultrasound Breast Cancer Image Classification. Diagnostics 2022, 12, 135.
- Cuenat, S.; Couturier, R. Convolutional Neural Network (CNN) vs Vision Transformer (ViT) for Digital Holography. In Proceedings of the 2022 2nd International Conference on Computer, Control and Robotics (ICCCR), Shanghai, China, 18–20 March 2022; pp. 235–240.
- Khan, A.; Sohail, A.; Zahoora, U.; Qureshi, A.S. A Survey of the Recent Architectures of Deep Convolutional Neural Networks. Artif. Intell. Rev. 2020, 53, 5455–5516.
- Kiranyaz, S.; Avci, O.; Abdeljaber, O.; Ince, T.; Gabbouj, M.; Inman, D.J. 1D Convolutional Neural Networks and Applications: A Survey. Mech. Syst. Signal Process. 2021, 151, 107398.
- Al-Dhabyani, W.; Gomaa, M.; Khaled, H.; Fahmy, A. Dataset of Breast Ultrasound Images. Data Br. 2020, 28, 104863.
- Rodrigues, P.S. Breast Ultrasound Image. Mendeley Data 2018.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 6000–6010.
Model | Layers | Hidden Size | MLP Size | Heads | Patch Size |
---|---|---|---|---|---|
vitb_16 | 12 | 768 | 3072 | 12 | 16 × 16 |
vitb_32 | 12 | 768 | 3072 | 12 | 32 × 32 |
vitl_32 | 24 | 1024 | 4096 | 16 | 32 × 32 |
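The three variants in this table correspond to publicly available pretrained models. A minimal way to instantiate them, here via torchvision (the paper does not state which library was used, so treat this as an illustrative equivalent):

```python
# Instantiate the three ImageNet-pretrained ViT variants from the table above.
# torchvision >= 0.13 is assumed; the paper's names (vitb_16, vitb_32, vitl_32)
# map onto torchvision's vit_b_16, vit_b_32, and vit_l_32.
from torchvision.models import (vit_b_16, vit_b_32, vit_l_32,
                                ViT_B_16_Weights, ViT_B_32_Weights, ViT_L_32_Weights)

models = {
    "vitb_16": vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1),
    "vitb_32": vit_b_32(weights=ViT_B_32_Weights.IMAGENET1K_V1),
    "vitl_32": vit_l_32(weights=ViT_L_32_Weights.IMAGENET1K_V1),
}
for name, m in models.items():
    print(name, f"{sum(p.numel() for p in m.parameters()):,} parameters")
```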
Metrics | Formula |
---|---|
Accuracy | (TP + TN) / (TP + TN + FP + FN) |
F1 score | 2 × Precision × Recall / (Precision + Recall), with Precision = TP / (TP + FP) and Recall = TP / (TP + FN) |
AUC | Area under the ROC curve |
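A quick sketch of computing these metrics with scikit-learn (an illustrative library choice, not stated in the paper; the toy labels below are placeholders):

```python
# Compute the evaluation metrics reported in the results tables
# (accuracy, AUC, F1) for a toy binary example.
from sklearn.metrics import accuracy_score, roc_auc_score, f1_score

y_true = [0, 1, 1, 0, 1]             # ground-truth labels (e.g., benign=0, malignant=1)
y_prob = [0.2, 0.9, 0.7, 0.4, 0.6]   # predicted malignancy probabilities
y_pred = [int(p >= 0.5) for p in y_prob]  # threshold at 0.5 for hard labels

print("Accuracy:", accuracy_score(y_true, y_pred))
print("AUC:     ", roc_auc_score(y_true, y_prob))
print("F1 score:", f1_score(y_true, y_pred))
```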
Model | Optimizer | LR | Accuracy (95% CI) | AUC (95% CI) | F1 Score (95% CI) | Time (s) (95% CI) | Loss (95% CI) |
---|---|---|---|---|---|---|---|
Mendeley | |||||||
vitb_16 | Adam | 0.0001 | 1 ± 0 | 1 ± 0 | 1 ± 0 | 123.35 ± 2.7751 | 0.33 ± 0.0037 |
vitb_32 | Adam | 0.0001 | 1 ± 0 | 1 ± 0 | 1 ± 0 | 69.85 ± 1.4289 | 0.33 ± 0.0053 |
vitl_32 | Adam | 0.0001 | 1 ± 0 | 1 ± 0 | 1 ± 0 | 157.16 ± 6.8667 | 0.33 ± 0.0031 |
BUSI | |||||||
vitb_16 | Adam | 0.0001 | 0.952 ± 0.0296 | 0.968 ± 0.0232 | 0.966 ± 0.0323 | 371.39 ± 3.1718 | 0.68 ± 0.0213 |
vitb_32 | Adam | 0.0001 | 0.944 ± 0.0242 | 0.958 ± 0.0183 | 0.942 ± 0.0204 | 294.874 ± 2.2868 | 0.72 ± 0.0145 |
vitl_32 | Adam | 0.0001 | 0.928 ± 0.0309 | 0.947 ± 0.0315 | 0.922 ± 0.0333 | 407.074 ± 5.191 | 0.7 ± 0.0289 |
Mixed | |||||||
vitb_16 | Adam | 0.0001 | 0.919 ± 0.0188 | 0.937 ± 0.0256 | 0.919 ± 0.0225 | 407.024 ± 1.7652 | 0.43 ± 0.0051 |
vitb_32 | Adam | 0.0001 | 0.904 ± 0.0068 | 0.929 ± 0.0092 | 0.904 ± 0.0068 | 265.434 ± 1.8455 | 0.45 ± 0.013 |
vitl_32 | Adam | 0.0001 | 0.891 ± 0.0291 | 0.914 ± 0.343 | 0.894 ± 0.0335 | 448.092 ± 5.0215 | 0.44 ± 0.008 |
Model | Optimizer | LR | Accuracy (95% CI) | AUC (95% CI) | F1 Score (95% CI) | Time (s) (95% CI) | Loss (95% CI) |
---|---|---|---|---|---|---|---|
Mendeley | |||||||
vitb_16 | Adam | 0.0001 | 0.704 ± 0.3 | 0.730 ± 0.2 | 0.706 ± 0.3 | 123.35 ± 2.7751 | 0.33 ± 0.0037 |
vitb_32 | Adam | 0.0001 | 0.692 ± 0.1 | 0.715 ± 0.2 | 0.693 ± 0.2 | 69.85 ± 1.4289 | 0.33 ± 0.0053 |
vitl_32 | Adam | 0.0001 | 0.673 ± 0.1 | 0.695 ± 0.3 | 0.671 ± 0.2 | 157.16 ± 6.8667 | 0.33 ± 0.0031 |
BUSI | |||||||
vitb_16 | Adam | 0.0001 | 0.694 ± 0.0296 | 0.710 ± 0.07 | 0.693 ± 0.0323 | 371.39 ± 3.1718 | 0.68 ± 0.0213 |
vitb_32 | Adam | 0.0001 | 0.684 ± 0.0242 | 0.698 ± 0.0183 | 0.682 ± 0.0204 | 294.874 ± 2.2868 | 0.72 ± 0.0145 |
vitl_32 | Adam | 0.0001 | 0.669 ± 0.0309 | 0.6851 ± 0.0315 | 0.667 ± 0.0333 | 407.074 ± 5.191 | 0.7 ± 0.0289 |
Mixed | |||||||
vitb_16 | Adam | 0.0001 | 0.684 ± 0.0188 | 0.70 ± 0.125 | 0.689 ± 0.0225 | 407.024 ± 1.7652 | 0.43 ± 0.0051 |
vitb_32 | Adam | 0.0001 | 0.674 ± 0.0068 | 0.699 ± 0.0092 | 0.68 ± 0.0068 | 267.434 ± 1.8455 | 0.45 ± 0.013 |
vitl_32 | Adam | 0.0001 | 0.66 ± 0.0291 | 0.689 ± 0.343 | 0.67 ± 0.0335 | 448.092 ± 5.0215 | 0.44 ± 0.008 |
Model | Optimizer | LR | Accuracy (95% CI) | AUC (95% CI) | F1 Score (95% CI) | Time (s) (95% CI) | Loss (95% CI) |
---|---|---|---|---|---|---|---|
Mendeley | |||||||
vitb_16 | Adam | 0.0001 | 1 ± 0 | 1 ± 0 | 1 ± 0 | 118.35 ± 2.7751 | 0.33 ± 0.0037 |
vitb_32 | Adam | 0.0001 | 1 ± 0 | 1 ± 0 | 1 ± 0 | 64.85 ± 1.4289 | 0.33 ± 0.0053 |
vitl_32 | Adam | 0.0001 | 1 ± 0 | 1 ± 0 | 1 ± 0 | 152.16 ± 6.8667 | 0.33 ± 0.0031 |
BUSI | |||||||
vitb_16 | Adam | 0.0001 | 0.942 ± 0.0296 | 0.9541 ± 0.0272 | 0.936 ± 0.0323 | 366.39 ± 3.1718 | 0.68 ± 0.0213 |
vitb_32 | Adam | 0.0001 | 0.934 ± 0.0242 | 0.9548 ± 0.0183 | 0.932 ± 0.0204 | 289.874 ± 2.2868 | 0.72 ± 0.0145 |
vitl_32 | Adam | 0.0001 | 0.918 ± 0.0309 | 0.9351 ± 0.0315 | 0.912 ± 0.0333 | 402.074 ± 5.191 | 0.7 ± 0.0289 |
Mixed | |||||||
vitb_16 | Adam | 0.0001 | 0.914 ± 0.0188 | 0.9116 ± 0.0156 | 0.909 ± 0.0225 | 402.024 ± 1.7652 | 0.43 ± 0.0051 |
vitb_32 | Adam | 0.0001 | 0.894 ± 0.0068 | 0.8799 ± 0.0092 | 0.884 ± 0.0068 | 262.434 ± 1.8455 | 0.45 ± 0.013 |
vitl_32 | Adam | 0.0001 | 0.86 ± 0.0291 | 0.8499 ± 0.343 | 0.844 ± 0.0335 | 443.092 ± 5.0215 | 0.44 ± 0.008 |
Model | Optimizer | LR | Accuracy (95% CI) | AUC (95% CI) | F1 Score (95% CI) | Time (s) (95% CI) | Loss (95% CI) |
---|---|---|---|---|---|---|---|
Mendeley | |||||||
ResNet50 | Adam | 0.0001 | 0.965 ± 0.02 | 0.972 ± 0.01 | 0.964 ± 0.02 | 113.35 ± 2.7751 | 0.33 ± 0.0037 |
EfficientNetB2 | Adam | 0.0001 | 0.961 ± 0.03 | 0.969 ± 0 | 0.956 ± 0.03 | 115.85 ± 1.4289 | 0.33 ± 0.0053 |
InceptionNetV3 | Adam | 0.0001 | 0.947 ± 0.1 | 0.951 ± 0.01 | 0.941 ± 0.02 | 123.16 ± 6.8667 | 0.33 ± 0.0031 |
BUSI | |||||||
ResNet50 | Adam | 0.0001 | 0.862 ± 0.1 | 0.879 ± 0.2 | 0.864 ± 0.2 | 261.39 ± 3.1718 | 0.68 ± 0.0213 |
EfficientNetB2 | Adam | 0.0001 | 0.851 ± 0.2 | 0.864 ± 0.1 | 0.856 ± 0.3 | 270.874 ± 2.2868 | 0.72 ± 0.0145 |
InceptionNetV3 | Adam | 0.0001 | 0.854 ± 0.1 | 0.87 ± 0.05 | 0.859 ± 0.1 | 272.074 ± 5.191 | 0.7 ± 0.0289 |
Mixed | |||||||
ResNet50 | Adam | 0.0001 | 0.838 ± 0.2 | 0.836 ± 0.08 | 0.841 ± 0.1 | 302.024 ± 1.7652 | 0.43 ± 0.0051 |
EfficientNetB2 | Adam | 0.0001 | 0.824 ± 0.1 | 0.829 ± 0.1 | 0.824 ± 0.2 | 362.434 ± 1.8455 | 0.45 ± 0.013 |
InceptionNetV3 | Adam | 0.0001 | 0.806 ± 0.07 | 0.8199 ± 0.1 | 0.804 ± 0.1 | 343.092 ± 5.0215 | 0.44 ± 0.008 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Citation: Ayana, G.; Choe, S.-w. BUViTNet: Breast Ultrasound Detection via Vision Transformers. Diagnostics 2022, 12, 2654. https://doi.org/10.3390/diagnostics12112654