Multi-Class Skin Cancer Classification Using Vision Transformer Networks and Convolutional Neural Network-Based Pre-Trained Models
Abstract
1. Introduction
- To address the issue of class imbalance, an effective data augmentation technique was implemented to artificially increase the number of dataset samples (a minimal sketch is given after this list);
- The proposed fine-tuned ViT model outperformed state-of-the-art models for multi-class skin cancer classification;
- In this study, we also fine-tuned CNN-based pretrained models, including ResNet50, ResNet101, ResNet152, ResNet50V2, ResNet101V2, ResNet152V2, DenseNet121, DenseNet169, DenseNet201, VGG16, and VGG19;
- Extensive experiments were performed with the data augmentation technique to arrive at an effective model;
- The proposed system for classifying multi-class skin lesions offers professionals and patients accurate diagnostic information.
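The exact augmentation settings are detailed in Section 2.2. As a minimal sketch of the general approach, assuming a TensorFlow/Keras pipeline with illustrative transform parameters and a hypothetical directory path (neither taken from the paper), under-represented classes can be over-sampled with random geometric transforms:

```python
# Hedged sketch of class-balancing augmentation; the transform parameters and
# directory path below are illustrative assumptions, not the paper's settings.
import tensorflow as tf

augmenter = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=20,       # small random rotations
    width_shift_range=0.1,   # horizontal shifts
    height_shift_range=0.1,  # vertical shifts
    zoom_range=0.1,          # random zoom in/out
    horizontal_flip=True,    # mirror lesions left/right
    fill_mode="nearest",
)

# Stream augmented batches from the folders of under-represented classes
# (e.g., DF, VASC) until their counts approach the majority class (NV).
flow = augmenter.flow_from_directory(
    "data/train_minority_classes",  # hypothetical path
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical",
)
```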
2. Materials and Methods
2.1. Dataset
2.2. Data Augmentation
2.3. Effective CNN-Based Pretrained Model (Fine-Tuning of ResNet50)
Hyperparameter Tuning
2.4. Transfer Learning and Network Architecture Modifications of the CNN-Based Pretrained Models
2.5. Vision Transformer (ViT) Pretrained Model
3. Results and Discussion
3.1. Evaluation Metrics
- Accuracy: measures the overall correctness of the model’s predictions. It calculates the ratio of correctly classified samples to the total number of samples. Accuracy alone is not always sufficient for evaluation, especially when dealing with imbalanced datasets or when different types of errors have varying consequences;
- Precision: quantifies the model’s ability to correctly identify the positive samples among the predicted positives. It calculates the ratio of true positives to the sum of true positives and false positives. Precision focuses on the reliability of positive predictions;
- Recall: also known as sensitivity or the true positive rate, recall measures the model’s ability to correctly identify the positive samples among all actual positives. It calculates the ratio of true positives to the sum of true positives and false negatives. Recall focuses on the completeness of positive predictions;
- F1 Score: the harmonic mean of precision and recall. It provides a single metric that balances both precision and recall, making it useful for when there is an uneven class distribution or an equal emphasis on both types of errors. The F1 score ranges from 0 to 1, with 1 being the best performance.
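Expressed with the confusion matrix counts defined in the Abbreviations section (TP, FP, TN, FN), these descriptions correspond to the standard per-class formulas, which are averaged across the seven classes in the multi-class setting:

```latex
\mathrm{Accuracy}  = \frac{TP + TN}{TP + TN + FP + FN} \qquad
\mathrm{Precision} = \frac{TP}{TP + FP} \qquad
\mathrm{Recall}    = \frac{TP}{TP + FN} \qquad
F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
```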
3.2. Results and Discussion
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Meaning |
---|---|
AKIEC (AK) | Actinic keratoses |
BCC | Basal cell carcinoma |
BKL | Benign keratosis-like lesions |
DF | Dermatofibroma |
MEL | Melanoma |
NV | Melanocytic nevi |
TP | True positive |
FP | False positive |
TN | True negative |
FN | False negative |
ViT | Vision transformer |
MAX | Maximum |
MIN | Minimum |
AVG | Average |
HSI | Hyperspectral imaging |
RGB | Red, green, and blue (color images) |
References
- Cancer. Available online: https://www.who.int/news-room/fact-sheets/detail/cancer (accessed on 4 August 2022).
- Cancer—NHS. Available online: https://www.nhs.uk/conditions/cancer/ (accessed on 4 August 2022).
- Melanoma—The Skin Cancer Foundation. Available online: https://www.skincancer.org/skin-cancer-information/melanoma/ (accessed on 8 July 2023).
- Arroyo, J.L.G.; Zapirain, B.G. Automated Detection of Melanoma in Dermoscopic Images. In Computer Vision Techniques for the Diagnosis of Skin Cancer; Springer: Berlin/Heidelberg, Germany, 2014; pp. 139–192. [Google Scholar] [CrossRef]
- Pomponiu, V.; Nejati, H.; Cheung, N.-M. Deepmole: Deep neural networks for skin mole lesion classification. Proc. Int. Conf. Image Process. 2016, 2016, 2623–2627. [Google Scholar] [CrossRef]
- Guo, Y.; Liu, Y.; Oerlemans, A.; Lao, S.; Wu, S.; Lew, M.S. Deep learning for visual understanding: A review. Neurocomputing 2016, 187, 27–48. [Google Scholar] [CrossRef]
- Li, K.M.; Li, E.C. Skin Lesion Analysis Towards Melanoma Detection via End-to-end Deep Learning of Convolutional Neural Networks. arXiv 2018, arXiv:1807.08332. [Google Scholar]
- Li, H.; Pan, Y.; Zhao, J.; Zhang, L. Skin disease diagnosis with deep learning: A review. Neurocomputing 2021, 464, 364–393. [Google Scholar] [CrossRef]
- Goyal, M.; Knackstedt, T.; Yan, S.; Hassanpour, S. Artificial intelligence-based image classification methods for diagnosis of skin cancer: Challenges and opportunities. Comput. Biol. Med. 2020, 127, 104065. [Google Scholar] [CrossRef] [PubMed]
- Kumar, V.B.; Kumar, S.S.; Saboo, V. Dermatological Disease Detection Using Image Processing and Machine Learning. In Proceedings of the 2016 3rd International Conference on Artificial Intelligence and Pattern Recognition (AIPR), Lodz, Poland, 19–21 September 2016; pp. 88–93. [Google Scholar] [CrossRef]
- Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017, 542, 115–118. [Google Scholar] [CrossRef] [PubMed]
- Kawahara, J.; Daneshvar, S.; Argenziano, G.; Hamarneh, G. Seven-Point Checklist and Skin Lesion Classification Using Multitask Multimodal Neural Nets. IEEE J. Biomed. Health Inform. 2019, 23, 538–546. [Google Scholar] [CrossRef] [PubMed]
- Chandrashekar, G.; Sahin, F. A survey on feature selection methods. Comput. Electr. Eng. 2014, 40, 16–28. [Google Scholar] [CrossRef]
- Haenssle, H.A.; Fink, C.; Schneiderbauer, R.; Toberer, F.; Buhl, T.; Blum, A.; Kalloo, A.; Hassen, A.B.H.; Thomas, L.; Enk, A.; et al. Man against machine: Diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Ann. Oncol. 2018, 29, 1836–1842. [Google Scholar] [CrossRef] [PubMed]
- Bassel, A.; Abdulkareem, A.B.; Alyasseri, Z.A.A.; Sani, N.S.; Mohammed, H.J. Automatic Malignant and Benign Skin Cancer Classification Using a Hybrid Deep Learning Approach. Diagnostics 2022, 12, 2472. [Google Scholar] [CrossRef] [PubMed]
- Jeny, A.A.; Sakib, A.N.M.; Junayed, M.S.; Lima, K.A.; Ahmed, I.; Islam, M.B. SkNet: A Convolutional Neural Networks Based Classification Approach for Skin Cancer Classes. In Proceedings of the ICCIT 2020 23rd International Conference on Computer and Information Technology (ICCIT), Dhaka, Bangladesh, 19–21 December 2020. [Google Scholar] [CrossRef]
- Tabrizchi, H.; Parvizpour, S.; Razmara, J. An Improved VGG Model for Skin Cancer Detection. Neural Process. Lett. 2022, 1–18. [Google Scholar] [CrossRef]
- Skin Cancer MNIST: HAM10000|Kaggle. Available online: https://www.kaggle.com/datasets/kmader/skin-cancer-mnist-ham10000 (accessed on 11 September 2022).
- Residual Neural Network (ResNet). Available online: https://iq.opengenus.org/residual-neural-networks/ (accessed on 11 August 2022).
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. Available online: http://image-net.org/challenges/LSVRC/2015/ (accessed on 13 July 2023). [Google Scholar]
- Dropout Regularization in Neural Networks: How It Works and When to Use It—Programmathically. Available online: https://programmathically.com/dropout-regularization-in-neural-networks-how-it-works-and-when-to-use-it/ (accessed on 12 August 2022).
- What Are Hyperparameters? and How to Tune the Hyperparameters in a Deep Neural Network?|by Pranoy Radhakrishnan|Towards Data Science. Available online: https://towardsdatascience.com/what-are-hyperparameters-and-how-to-tune-the-hyperparameters-in-a-deep-neural-network-d0604917584a (accessed on 18 August 2022).
- Activation Functions in Neural Networks—GeeksforGeeks. Available online: https://www.geeksforgeeks.org/activation-functions-neural-networks/ (accessed on 18 August 2022).
- What, Why and Which?? Activation Functions|by Snehal Gharat|Medium. Available online: https://medium.com/@snaily16/what-why-and-which-activation-functions-b2bf748c0441 (accessed on 18 August 2022).
- Gentle Introduction to the Adam Optimization Algorithm for Deep Learning. Available online: https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/ (accessed on 18 August 2022).
- Long, M.; Cao, Y.; Wang, J.; Jordan, M.I. Learning Transferable Features with Deep Adaptation Networks. PMLR 2015, 37, 97–105. [Google Scholar]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
- Sufi, A. Skin Cancer Classification Using Deep Learning. 2022. Available online: http://dspace.uiu.ac.bd/handle/52243/2483 (accessed on 15 September 2022).
- Hosny, K.M.; Kassem, M.A.; Foaud, M.M. Skin Cancer Classification using Deep Learning and Transfer Learning. In Proceedings of the 2018 9th Cairo International Biomedical Engineering Conference (CIBEC), Cairo, Egypt, 20–22 December 2018; pp. 90–93. [Google Scholar] [CrossRef]
- Dorj, U.O.; Lee, K.K.; Choi, J.Y.; Lee, M. The skin cancer classification using deep convolutional neural network. Multimed. Tools Appl. 2018, 77, 9909–9924. [Google Scholar] [CrossRef]
- Budhiman, A.; Suyanto, S.; Arifianto, A. Melanoma Cancer Classification Using ResNet with Data Augmentation. In Proceedings of the 2019 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), Yogyakarta, Indonesia, 5–6 December 2019; pp. 17–20. [Google Scholar] [CrossRef]
- Ali, M.S.; Miah, M.S.; Haque, J.; Rahman, M.M.; Islam, M.K. An enhanced technique of skin cancer classification using deep convolutional neural network with transfer learning models. Mach. Learn. Appl. 2021, 5, 100036. [Google Scholar] [CrossRef]
- Jain, S.; Singhania, U.; Tripathy, B.; Nasr, E.A.; Aboudaif, M.K.; Kamrani, A.K. Deep Learning-Based Transfer Learning for Classification of Skin Cancer. Sensors 2021, 21, 8142. [Google Scholar] [CrossRef] [PubMed]
- Ali, K.; Shaikh, Z.A.; Khan, A.A.; Laghari, A.A. Multiclass skin cancer classification using EfficientNets—A first step towards preventing skin cancer. Neurosci. Inform. 2022, 2, 100034. [Google Scholar] [CrossRef]
- Huang, H.Y.; Hsiao, Y.P.; Mukundan, A.; Tsao, Y.M.; Chang, W.Y.; Wang, H.C. Classification of Skin Cancer Using Novel Hyperspectral Imaging Engineering via YOLOv5. J. Clin. Med. 2023, 12, 1134. [Google Scholar] [CrossRef] [PubMed]
Parameter | Values |
---|---|
Hidden neurons | 1024, 512, 256, 128, 64, 7 |
Epochs | 20 |
Dropout | 0.3 |
Activation function | ReLU, SoftMax |
Loss function | Categorical cross-entropy |
Optimizer | Adam |
Learning rate | 0.001 |
Batch size | 32 |
Early stopping | Yes |
Patience | 3 |
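The table above fully specifies the classification head used for fine-tuning. A minimal Keras sketch consistent with these hyperparameters is shown below; it assumes a frozen ImageNet-pretrained ResNet50 backbone with global average pooling and dropout after each dense layer, details that are assumptions rather than statements from the paper:

```python
# Hedged sketch: a fine-tuning head matching the listed hyperparameters
# (1024-512-256-128-64 dense layers, dropout 0.3, 7-way softmax, Adam @ 0.001,
# categorical cross-entropy, early stopping with patience 3).
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3)
)
base.trainable = False  # transfer learning: keep the pretrained backbone frozen

x = layers.GlobalAveragePooling2D()(base.output)
for units in (1024, 512, 256, 128, 64):
    x = layers.Dense(units, activation="relu")(x)
    x = layers.Dropout(0.3)(x)  # dropout placement is an assumption
outputs = layers.Dense(7, activation="softmax")(x)  # 7 skin lesion classes

model = models.Model(base.input, outputs)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

early_stop = tf.keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True)
# model.fit(x_train, y_train, validation_split=0.1,
#           epochs=20, batch_size=32, callbacks=[early_stop])
```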
Parameter | Values |
---|---|
Encoder and pooling layers dimensionality | 768 |
Transformer encoder hidden layers | 12 |
Feed-forward layer dimensionality | 3072 |
Hidden layer activation | GELU |
Hidden layer dropout | 0.1 |
Image size | 224 × 224 |
Channels | 3 |
Patches | 16 × 16 |
Balanced | True |
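These values match the ViT-Base architecture: 12 transformer encoder layers with hidden size 768, a 3072-dimensional feed-forward layer, GELU activations, and 16 × 16 patches over 224 × 224 RGB inputs. A minimal sketch of loading such a model for seven-class fine-tuning with the Hugging Face transformers library follows; the checkpoint name and label ordering are illustrative assumptions rather than details confirmed by the paper:

```python
# Hedged sketch: a ViT-Base model configured for 7-class skin lesion classification.
# The checkpoint and label order are assumptions for illustration only.
from transformers import ViTForImageClassification, ViTImageProcessor

checkpoint = "google/vit-base-patch16-224-in21k"
processor = ViTImageProcessor.from_pretrained(checkpoint)

labels = ["AKIEC", "BCC", "BKL", "DF", "MEL", "NV", "VASC"]
model = ViTForImageClassification.from_pretrained(
    checkpoint,
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={name: i for i, name in enumerate(labels)},
)

# inputs = processor(images=pil_image, return_tensors="pt")
# logits = model(**inputs).logits          # shape: (1, 7)
# predicted_class = labels[logits.argmax(-1).item()]
```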
[Figure/table: comparison of ResNet50, ResNet152, and ResNet101; image content not preserved in this extract.]
[Figure: confusion matrices for ResNet50 (82% accuracy) and ViT (92.14% accuracy); images not preserved in this extract.]
Class name | Precision | Recall | F1 | Support |
---|---|---|---|---|
AKIEC | 1.0000 | 0.9565 | 0.9778 | 23 |
BCC | 0.9615 | 0.9615 | 0.9615 | 26 |
BKL | 0.8824 | 0.7500 | 0.8108 | 20 |
DF | 1.0000 | 1.0000 | 1.0000 | 19 |
MEL | 0.8824 | 0.8824 | 0.8824 | 17 |
NV | 0.7500 | 0.9375 | 0.8333 | 16 |
VASC | 0.9474 | 0.9474 | 0.9474 | 19 |
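The per-class scores above, like the confusion matrices, follow directly from the model's predictions on the held-out test split. A minimal scikit-learn sketch is shown below; the small label arrays are placeholders that stand in for the real test-set labels and predictions:

```python
# Hedged sketch: reproducing a per-class report and confusion matrix with scikit-learn.
# y_true / y_pred are placeholder arrays, not the paper's actual test results.
from sklearn.metrics import classification_report, confusion_matrix

class_names = ["AKIEC", "BCC", "BKL", "DF", "MEL", "NV", "VASC"]
y_true = [0, 1, 2, 3, 4, 5, 6, 0, 1, 2]
y_pred = [0, 1, 2, 3, 4, 5, 6, 0, 1, 5]

print(classification_report(y_true, y_pred, target_names=class_names, digits=4))
print(confusion_matrix(y_true, y_pred))
```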
Authors and Year | Classes | Method | Evaluation Metrics | Results |
---|---|---|---|---|
Ali et al., 2021 [32] | 2 | Custom CNN-based model (proposed DCNN) | Accuracy | Train: 93.16%; Test: 91.93% |
Bassel et al., 2022 [14] | 2 | Stacking-CV (proposed) + Xception features | Accuracy | Test: 90.9% |
Jain et al., 2021 [33] | 7 | Xception Net transfer-learning-based model | Accuracy | Test: 90.48% |
Ali et al., 2021 [34] | 7 | EfficientNet B0–B7 transfer-learning-based models (best accuracy with EfficientNet B4) | Accuracy, Precision, Recall, F1 | 87.91%, 88%, 88%, 87% |
Huang et al., 2023 [35] | 3 | YOLOv5 (RGB images, HSI images) | Accuracy, Precision, Recall, F1, Specificity | RGB: 79.2%, 88.8%, 75.8%, 81.8%, 79.8%; HSI: 78.7%, 80%, 72.6%, 76.1%, 78.6% |
Proposed | 7 | Vision Transformer (RGB images) | Accuracy, Precision, Recall, F1 | 92.14%, 92.61%, 92.14%, 92.17% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Arshed, M.A.; Mumtaz, S.; Ibrahim, M.; Ahmed, S.; Tahir, M.; Shafi, M. Multi-Class Skin Cancer Classification Using Vision Transformer Networks and Convolutional Neural Network-Based Pre-Trained Models. Information 2023, 14, 415. https://doi.org/10.3390/info14070415