Number Recognition Through Color Distortion Using Convolutional Neural Networks
Abstract
1. Introduction
1.1. Prevalence
1.2. Motivation
1.3. Background
1.4. Outline
2. Experimental Design
2.1. Tools Used
2.2. Testing Process
2.3. Model Training and Selection
2.4. Metrics of Success
- Performance: the overall accuracy the model achieves on images it has never seen before. This is the primary metric for comparison against previous research.
- Precision: the percentage of the model's positive predictions that are correct.
- Recall (TPR): the percentage of actual positives that the model correctly identifies.
- Training Time (in seconds): how long the model took to train. As with the number of epochs, the goal is to train the model as quickly as possible.
- Evaluation Time (in seconds): how long the model took to classify a new image or batch of images. When the model is deployed in an OCR sensor, this is the metric that matters most in practice. A minimal sketch of how all five metrics can be collected follows this list.
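As a minimal sketch of how these five metrics can be collected, assuming a Keras-style model API, integer class labels, and scikit-learn's macro-averaged scorers (all assumptions, not details taken from the paper):

```python
import time

import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

def train_and_score(model, x_train, y_train, x_test, y_test):
    """Collect the five metrics above for one model.

    Assumes integer class labels and macro-averaged precision/recall;
    the training schedule (epochs, batch size) is omitted here and is
    an assumption, not a value from the paper.
    """
    start = time.perf_counter()
    model.fit(x_train, y_train, verbose=0)
    train_time = time.perf_counter() - start       # Training Time (s)

    start = time.perf_counter()
    y_pred = np.argmax(model.predict(x_test, verbose=0), axis=1)
    eval_time = time.perf_counter() - start        # Evaluation Time (s)

    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, average="macro"),
        "recall": recall_score(y_test, y_pred, average="macro"),
        "train_time_s": train_time,
        "eval_time_s": eval_time,
    }
```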
2.5. Test Cases
3. Results
4. Future Works
4.1. Improvements
4.2. Extensions
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
| Model Name | Trainable Parameters | Layers | Conv2D Layers | Dense Layers |
|---|---|---|---|---|
| MNIST | 2,416,330 | 8 | 3 | 2 |
| LeNet5 | 1,214,006 | 8 | 2 | 3 |
| VGG16 | 50,415,434 | 22 | 13 | 3 |
| AlexNet | 23,357,514 | 19 | 5 | 3 |
| Custom 1 | 1,469,466 | 19 | 5 | 3 |
| Custom 2 | 371,154 | 19 | 5 | 3 |
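The exact layer configurations behind these parameter counts are not reproduced here. As an illustrative sketch only, a Keras definition in the spirit of the baseline MNIST model (three Conv2D layers and two Dense layers, per the table) might look like the following; the filter widths are assumptions, so the printed parameter count will not match the 2,416,330 listed above.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Illustrative only: three Conv2D + two Dense layers, matching the
# layer counts in the table. Filter widths are assumed, not taken
# from the paper, so the reported trainable-parameter total will
# differ from the table.
model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.summary()  # prints the trainable-parameter count per layer
```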
Test 1—Original MNIST: 60 k Training Images, 10 k Testing Images

| Model | Accuracy | Precision | Recall | Training Time (s) | Evaluation Time (s) |
|---|---|---|---|---|---|
| MNIST | 98.37% | 98.38% | 98.36% | 835 | 6.33 |
| LeNet5 | 98.21% | 98.21% | 98.20% | 820 | 7.45 |
| VGG16 | 99.14% | 99.13% | 99.15% | 493 | 8.28 |
| AlexNet | 99.22% | 99.23% | 99.22% | 221 | 6.39 |
| Custom 1 | 99.12% | 99.11% | 99.12% | 199 | 6.69 |
| Custom 2 | 99.21% | 99.21% | 99.20% | 333 | 6.12 |
Test 2—Color MNIST: 60 k Training Images, 10 k Testing Images

| Model | Accuracy | Precision | Recall | Training Time (s) | Evaluation Time (s) |
|---|---|---|---|---|---|
| MNIST | 98.68% | 98.68% | 98.66% | 957 | 6.12 |
| LeNet5 | 97.90% | 97.91% | 97.87% | 938 | 8.23 |
| VGG16 | 99.10% | 99.10% | 99.09% | 566 | 8.30 |
| AlexNet | 99.24% | 99.26% | 99.23% | 276 | 7.51 |
| Custom 1 | 98.98% | 98.99% | 98.96% | 189 | 10.00 |
| Custom 2 | 99.16% | 99.17% | 99.16% | 247 | 8.66 |
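Test 2 uses a colorized variant of MNIST. The paper's exact colorization recipe is not reproduced here; as one hedged illustration, a random foreground/background tint can be applied per grayscale digit:

```python
import numpy as np

rng = np.random.default_rng(0)

def colorize(gray_batch):
    """Map grayscale digits (N, 28, 28), values in [0, 1], to RGB.

    Each image gets a random bright foreground and darker background
    color; this recipe is an illustrative assumption, not the paper's
    actual colorization method.
    """
    n = gray_batch.shape[0]
    fg = rng.uniform(0.5, 1.0, size=(n, 1, 1, 3))   # bright digit color
    bg = rng.uniform(0.0, 0.5, size=(n, 1, 1, 3))   # darker background
    g = gray_batch[..., np.newaxis]                  # (N, 28, 28, 1)
    return g * fg + (1.0 - g) * bg                   # (N, 28, 28, 3)
```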
Test 3—Grayscale Ishihara (each cell: Accuracy, Precision, Recall, Training Time, Evaluation Time): 10 k Training Images, 2 k Testing Images per Plate

| Plate | MNIST | LeNet5 | VGG16 |
|---|---|---|---|
| 2 | (43.00%, 78.18%, 43.00%, 113 s, 0.74 s) | (33.85%, 53.03%, 33.85%, 152 s, 0.84 s) | (84.85%, 88.25%, 84.85%, 391 s, 1.44 s) |
| 3 | (65.35%, 78.47%, 65.35%, 149 s, 0.77 s) | (42.00%, 64.12%, 42.00%, 153 s, 1.07 s) | (93.65%, 94.15%, 93.65%, 388 s, 1.07 s) |
| 4 | (50.90%, 72.67%, 50.90%, 152 s, 0.92 s) | (40.75%, 49.52%, 40.75%, 147 s, 0.90 s) | (75.90%, 83.55%, 75.90%, 383 s, 0.85 s) |
| 5 | (41.50%, 73.07%, 41.50%, 148 s, 0.88 s) | (25.95%, 57.72%, 25.95%, 151 s, 0.91 s) | (78.35%, 85.23%, 78.35%, 383 s, 1.03 s) |
| 6 | (45.20%, 73.90%, 45.20%, 152 s, 0.90 s) | (31.40%, 61.03%, 31.40%, 129 s, 0.87 s) | (43.95%, 66.19%, 43.95%, 383 s, 1.11 s) |
| 7 | (42.50%, 62.24%, 42.50%, 151 s, 0.72 s) | (32.45%, 61.74%, 32.45%, 151 s, 0.76 s) | (75.50%, 82.82%, 75.50%, 362 s, 0.95 s) |
| 8 | (82.65%, 82.88%, 82.65%, 152 s, 0.80 s) | (50.55%, 64.38%, 50.55%, 151 s, 0.59 s) | (68.30%, 81.07%, 68.30%, 359 s, 0.90 s) |
| 9 | (82.40%, 82.92%, 82.40%, 151 s, 0.79 s) | (41.20%, 62.24%, 41.20%, 149 s, 0.73 s) | (91.60%, 91.93%, 91.60%, 384 s, 0.89 s) |
| rand | (80.30%, 80.54%, 80.30%, 152 s, 0.72 s) | (60.45%, 60.83%, 60.45%, 150 s, 0.71 s) | (10.00%, 1.00%, 10.00%, 382 s, 0.95 s) |
| all | (91.16%, 91.35%, 91.16%, 1238 s, 6.08 s) | (78.43%, 78.55%, 78.43%, 1214 s, 5.84 s) | (98.53%, 98.53%, 98.53%, 732 s, 7.49 s) |

| Plate | AlexNet | Custom 1 | Custom 2 |
|---|---|---|---|
| 2 | (57.70%, 77.69%, 57.70%, 177 s, 0.91 s) | (81.00%, 85.56%, 81.00%, 166 s, 0.87 s) | (55.35%, 76.27%, 55.35%, 173 s, 0.91 s) |
| 3 | (85.75%, 87.47%, 85.75%, 177 s, 0.80 s) | (76.85%, 81.76%, 76.85%, 167 s, 0.93 s) | (69.60%, 75.39%, 69.60%, 165 s, 0.82 s) |
| 4 | (87.05%, 89.04%, 87.05%, 174 s, 0.94 s) | (69.70%, 80.16%, 69.70%, 172 s, 0.85 s) | (61.10%, 76.24%, 61.10%, 167 s, 0.91 s) |
| 5 | (82.45%, 88.81%, 82.45%, 179 s, 0.89 s) | (72.50%, 81.78%, 72.50%, 172 s, 0.89 s) | (69.20%, 76.53%, 69.20%, 170 s, 0.88 s) |
| 6 | (83.55%, 87.31%, 83.55%, 177 s, 0.86 s) | (67.10%, 79.72%, 67.10%, 175 s, 0.90 s) | (74.60%, 77.64%, 74.60%, 172 s, 0.95 s) |
| 7 | (78.75%, 83.65%, 78.75%, 175 s, 0.70 s) | (68.50%, 81.56%, 68.50%, 173 s, 0.74 s) | (67.80%, 74.95%, 67.80%, 172 s, 0.75 s) |
| 8 | (85.00%, 86.97%, 85.00%, 174 s, 0.75 s) | (68.85%, 77.39%, 68.85%, 172 s, 0.77 s) | (66.65%, 74.87%, 66.65%, 170 s, 0.75 s) |
| 9 | (67.85%, 81.03%, 67.85%, 177 s, 0.70 s) | (76.65%, 81.07%, 76.65%, 173 s, 0.69 s) | (75.35%, 76.99%, 75.35%, 169 s, 0.68 s) |
| rand | (78.25%, 81.50%, 78.25%, 175 s, 0.73 s) | (72.50%, 75.84%, 72.50%, 169 s, 0.73 s) | (68.30%, 74.57%, 68.30%, 169 s, 0.79 s) |
| all | (89.84%, 90.9%, 89.84%, 1413 s, 5.81 s) | (90.29%, 90.83%, 90.29%, 1391 s, 6.10 s) | (88.06%, 88.67%, 88.06%, 1395 s, 5.94 s) |
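Test 3 operates on grayscale versions of the Ishihara plates. The paper's exact conversion step is not specified here; a plausible sketch, assuming TensorFlow's standard luminance weighting, is:

```python
import tensorflow as tf

def ishihara_to_grayscale(rgb_batch):
    """Reduce color Ishihara plates (N, H, W, 3) to a single channel.

    Uses TensorFlow's standard luminance weights; whether the original
    experiments used this exact conversion is an assumption.
    """
    return tf.image.rgb_to_grayscale(rgb_batch)  # -> (N, H, W, 1)
```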
Test 4—Color Ishihara (each cell: Accuracy, Precision, Recall, Training Time, Evaluation Time): 10 k Training Images, 2 k Testing Images per Plate

| Plate | MNIST | LeNet5 | VGG16 |
|---|---|---|---|
| 2 | (94.25%, 94.28%, 94.25%, 162 s, 1.44 s) | (92.45%, 92.44%, 92.45%, 172 s, 1.42 s) | (98.25%, 98.27%, 98.25%, 413 s, 1.60 s) |
| 3 | (94.30%, 94.31%, 94.30%, 172 s, 1.45 s) | (93.35%, 93.38%, 93.35%, 166 s, 1.44 s) | (97.75%, 97.83%, 97.75%, 406 s, 1.35 s) |
| 4 | (95.90%, 95.91%, 95.90%, 171 s, 1.42 s) | (93.45%, 93.44%, 93.45%, 166 s, 1.38 s) | (97.50%, 97.58%, 97.50%, 401 s, 1.60 s) |
| 5 | (94.60%, 94.63%, 94.60%, 174 s, 1.43 s) | (92.25%, 92.32%, 92.25%, 171 s, 1.44 s) | (97.95%, 97.99%, 97.95%, 403 s, 1.77 s) |
| 6 | (95.40%, 95.42%, 95.40%, 171 s, 1.41 s) | (92.80%, 92.86%, 92.80%, 171 s, 1.47 s) | (97.45%, 97.50%, 97.45%, 392 s, 2.79 s) |
| 7 | (96.45%, 96.45%, 96.45%, 171 s, 1.55 s) | (93.00%, 93.01%, 93.00%, 169 s, 0.98 s) | (99.00%, 99.00%, 99.00%, 402 s, 0.92 s) |
| 8 | (94.75%, 94.76%, 94.75%, 171 s, 1.00 s) | (92.15%, 92.19%, 92.15%, 171 s, 1.03 s) | (99.10%, 99.11%, 99.10%, 398 s, 1.17 s) |
| 9 | (95.15%, 95.16%, 95.15%, 170 s, 0.94 s) | (92.80%, 92.83%, 92.80%, 168 s, 0.89 s) | (98.55%, 98.56%, 98.55%, 401 s, 1.06 s) |
| rand | (78.95%, 79.08%, 78.95%, 171 s, 1.19 s) | (44.20%, 43.83%, 44.20%, 168 s, 1.21 s) | (10.00%, 1.00%, 10.00%, 404 s, 0.92 s) |
| all | (92.30%, 92.40%, 92.30%, 1328 s, 10.74 s) | (82.26%, 83.68%, 82.26%, 1319 s, 8.85 s) | (98.31%, 98.32%, 98.31%, 572 s, 11.73 s) |

| Plate | AlexNet | Custom 1 | Custom 2 |
|---|---|---|---|
| 2 | (94.10%, 95.10%, 94.10%, 194 s, 1.41 s) | (95.75%, 95.95%, 95.75%, 191 s, 1.44 s) | (95.95%, 96.01%, 95.95%, 190 s, 1.36 s) |
| 3 | (96.30%, 96.43%, 96.30%, 196 s, 1.50 s) | (96.55%, 96.62%, 96.55%, 182 s, 1.51 s) | (94.55%, 94.84%, 94.55%, 190 s, 1.43 s) |
| 4 | (89.95%, 91.19%, 89.95%, 190 s, 1.42 s) | (96.85%, 96.92%, 96.85%, 190 s, 1.46 s) | (96.40%, 96.45%, 96.40%, 188 s, 1.41 s) |
| 5 | (95.25%, 95.43%, 95.25%, 195 s, 1.44 s) | (97.25%, 97.27%, 97.25%, 186 s, 1.46 s) | (96.20%, 96.26%, 96.20%, 190 s, 1.45 s) |
| 6 | (96.85%, 96.89%, 96.85%, 196 s, 1.45 s) | (96.65%, 96.84%, 96.65%, 190 s, 1.48 s) | (97.10%, 97.13%, 97.10%, 182 s, 1.47 s) |
| 7 | (94.40%, 95.04%, 94.40%, 193 s, 0.86 s) | (97.40%, 97.43%, 97.40%, 191 s, 0.94 s) | (96.55%, 96.69%, 96.55%, 188 s, 0.89 s) |
| 8 | (93.70%, 94.49%, 93.70%, 195 s, 0.97 s) | (95.75%, 96.05%, 95.75%, 192 s, 1.07 s) | (96.20%, 96.33%, 96.20%, 192 s, 0.91 s) |
| 9 | (95.70%, 95.92%, 95.70%, 199 s, 0.85 s) | (92.50%, 93.81%, 92.50%, 191 s, 0.94 s) | (96.95%, 97.00%, 96.95%, 192 s, 1.46 s) |
| rand | (83.70%, 85.24%, 83.70%, 196 s, 1.02 s) | (76.50%, 78.74%, 76.50%, 194 s, 0.91 s) | (76.20%, 77.45%, 76.20%, 194 s, 0.94 s) |
| all | (96.77%, 96.78%, 96.77%, 1512 s, 9.07 s) | (94.38%, 94.54%, 94.38%, 1484 s, 10.36 s) | (90.81%, 90.92%, 90.81%, 1472 s, 9.48 s) |
Color Ishihara—Test Case Without the Random Plate

| Model | Accuracy | Precision | Recall | Training Time (s) | Evaluation Time (s) |
|---|---|---|---|---|---|
| MNIST | 97.19% | 97.20% | 97.19% | 1395 | 9.87 |
| LeNet5 | 96.11% | 96.11% | 96.11% | 1382 | 10.48 |
| VGG16 | 98.88% | 98.89% | 98.88% | 640 | 11.54 |
| AlexNet | 98.55% | 98.56% | 98.55% | 493 | 10.00 |
| Custom 1 | 98.45% | 98.46% | 98.45% | 852 | 11.49 |
| Custom 2 | 98.09% | 98.11% | 98.09% | 1537 | 9.28 |
Test 5—Original MNIST with Grayscale Ishihara: 60 k Training Images, 10 k Testing Images

| Model | Accuracy | Precision | Recall | Training Time (s) | Evaluation Time (s) |
|---|---|---|---|---|---|
| MNIST | 10.00% | 1.00% | 10.00% | 805 | 4.77 |
| LeNet5 | 10.01% | 2.67% | 10.01% | 821 | 5.55 |
| VGG16 | 12.48% | 12.51% | 12.48% | 536 | 7.83 |
| AlexNet | 10.02% | 4.06% | 10.02% | 182 | 5.35 |
| Custom 1 | 9.95% | 4.56% | 9.95% | 158 | 5.95 |
| Custom 2 | 10.23% | 4.13% | 10.23% | 267 | 6.26 |

Test 6—Grayscale Ishihara with Original MNIST: 60 k Training Images, 10 k Testing Images

| Model | Accuracy | Precision | Recall | Training Time (s) | Evaluation Time (s) |
|---|---|---|---|---|---|
| MNIST | 32.13% | 31.14% | 32.94% | 785 | 5.84 |
| LeNet5 | 35.81% | 51.21% | 35.46% | 819 | 5.70 |
| VGG16 | 9.84% | 11.98% | 10.02% | 1104 | 6.56 |
| AlexNet | 10.28% | 1.03% | 10.00% | 944 | 5.34 |
| Custom 1 | 9.68% | 1.59% | 9.94% | 959 | 6.08 |
| Custom 2 | 10.32% | 1.03% | 10.00% | 912 | 6.63 |

Test 7—Color MNIST with Color Ishihara: 60 k Training Images, 10 k Testing Images

| Model | Accuracy | Precision | Recall | Training Time (s) | Evaluation Time (s) |
|---|---|---|---|---|---|
| MNIST | 9.93% | 5.64% | 9.93% | 950 | 8.65 |
| LeNet5 | 10.49% | 2.22% | 10.49% | 934 | 9.62 |
| VGG16 | 10.00% | 19.12% | 10.00% | 523 | 9.49 |
| AlexNet | 10.06% | 5.05% | 10.06% | 209 | 8.27 |
| Custom 1 | 10.02% | 2.62% | 10.02% | 249 | 10.36 |
| Custom 2 | 10.06% | 7.17% | 10.06% | 246 | 10.57 |

Test 8—Color Ishihara with Color MNIST: 60 k Training Images, 10 k Testing Images

| Model | Accuracy | Precision | Recall | Training Time (s) | Evaluation Time (s) |
|---|---|---|---|---|---|
| MNIST | 11.20% | 7.21% | 10.95% | 938 | 7.04 |
| LeNet5 | 25.55% | 55.60% | 24.57% | 901 | 7.15 |
| VGG16 | 11.48% | 13.30% | 10.13% | 1126 | 7.80 |
| AlexNet | 9.74% | 0.97% | 10.00% | 1058 | 8.19 |
| Custom 1 | 9.05% | 1.48% | 9.22% | 1029 | 7.77 |
| Custom 2 | 12.29% | 6.49% | 12.05% | 1026 | 8.60 |
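Tests 5 through 8 appear to pair training on one dataset with evaluation on the other, with no fine-tuning in between; the mirrored captions and near-chance accuracies (around 10%) suggest as much. A minimal sketch of that protocol, with hypothetical loader names that are not functions from the paper's code:

```python
# Cross-dataset protocol: train on dataset A, evaluate unchanged on
# dataset B. The loaders and the epoch count below are hypothetical
# placeholders, not values taken from the paper.
(x_train, y_train) = load_color_mnist_train()       # hypothetical loader
(x_test, y_test) = load_color_ishihara_test()       # hypothetical loader

model.fit(x_train, y_train, epochs=10, verbose=0)   # train on domain A
score = model.evaluate(x_test, y_test, verbose=0)   # test on domain B
print("cross-dataset accuracy:", score[1])          # near chance (~10%) here
```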