Visualization of Customized Convolutional Neural Network for Natural Language Recognition
Abstract
1. Introduction
- The dataset for the proposed research work was prepared for 24 classes of Gurumukhi month names, collected from 500 writers of different age groups and professions; each writer wrote every word twice, resulting in 24,000 words in the Gurumukhi month name dataset.
- A Convolutional Neural Network is proposed for the prepared Gurumukhi month name dataset.
- The focus of this work is to examine the overall results of the Convolutional Neural Network on the prepared dataset in terms of accuracy, precision, recall, and F1 score (see the metric sketch after this list).
- The proposed model’s performance is evaluated for different numbers of epochs and batch sizes in a comparative performance analysis.
- A performance comparison is conducted between the proposed Convolutional Neural Network model and various transfer learning models.
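The evaluation metrics named above can be reproduced from model predictions with standard tooling. The following is a minimal sketch, assuming scikit-learn and hypothetical label arrays `y_true` and `y_pred`; it is not the authors' evaluation code.

```python
# Minimal sketch: accuracy, precision, recall and F1 score for a
# 24-class problem. Assumes scikit-learn; y_true / y_pred are
# hypothetical integer class labels (0-23), not the paper's data.
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

y_true = [0, 5, 12, 23, 7, 7]   # ground-truth month-class indices (example)
y_pred = [0, 5, 12, 22, 7, 7]   # predicted month-class indices (example)

print("Accuracy :", accuracy_score(y_true, y_pred))
# Macro averaging treats all 24 month classes equally.
print("Precision:", precision_score(y_true, y_pred, average="macro", zero_division=0))
print("Recall   :", recall_score(y_true, y_pred, average="macro", zero_division=0))
print("F1 score :", f1_score(y_true, y_pred, average="macro", zero_division=0))
```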
2. Literature Review
3. Dataset Preparation
3.1. Dataset Collection
3.2. Digitization
3.3. Dataset Distribution in Respective Folders
3.4. Data Normalization
3.5. Data Augmentation
4. Methodology
4.1. Proposed CNN Model
4.2. Description of Bilinear Model of CNN
4.3. Proposed Model’s Training Parameters
5. Experiments and Result Analysis
5.1. Simulation of Proposed Model at 100 Epochs with Different Batch Sizes
5.1.1. Analysis with Batch Size 20
5.1.2. Analysis with Batch Size 30
5.1.3. Analysis with Batch Size 40
5.2. Simulation of Proposed Model at 40 Epochs with Different Batch Sizes
5.2.1. Analysis with Batch Size 20
5.2.2. Analysis with Batch Size 30
5.2.3. Analysis with Batch Size 40
5.3. Analysis of Proposed Model with Different Numbers of Epochs and Different Batch Sizes
6. Comparison of Proposed CNN Model with Transfer Learning Models at 100 Epochs and a Batch Size of 20
7. Comparison of the Proposed Model against Existing Text Recognition Systems
8. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Tappert, C.C.; Suen, C.Y.; Wakahara, T. The state of the art in online handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 1990, 12, 787–808. [Google Scholar] [CrossRef]
- Tay, Y.H.; Lallican, P.-M.; Khalid, M.; Knerr, S.; Viard-Gaudin, C. An analytical handwritten word recognition system with word-level discriminant training. In Proceedings of the Sixth International Conference on Document Analysis and Recognition, Seattle, WA, USA, 10–13 September 2001; pp. 726–730. [Google Scholar]
- Madhvanath, S.; Govindaraju, V. The role of holistic paradigms in handwritten word recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2001, 23, 149–164. [Google Scholar] [CrossRef]
- Połap, D.; Srivastava, G. Neural image reconstruction using a heuristic validation mechanism. Neural Comput. Appl. 2021, 33, 10787–10797. [Google Scholar] [CrossRef]
- Gao, G.; You, P.; Pan, R.; Han, S.; Zhang, Y.; Dai, Y.; Lee, H. Neural Image Compression via Attentional Multi-scale Back Projection and Frequency Decomposition. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 11–17 October 2021; pp. 14657–14666. [Google Scholar]
- Sharma, A.; Kumar, R.; Sharma, R.K. Online handwritten Gurmukhi character recognition using elastic matching. In Proceedings of the 2008 Congress on Image and Signal Processing, Sanya, China, 27–30 May 2008; pp. 391–396. [Google Scholar]
- Sharma, A.; Kumar, R.; Sharma, R.K. Rearrangement of recognized strokes in online handwritten Gurmukhi words recognition. In Proceedings of the 2009 10th International Conference on Document Analysis and Recognition, Barcelona, Spain, 26–29 July 2009; pp. 1241–1245. [Google Scholar]
- Kumar, R.; Singh, A. Detection and segmentation of lines and words in Gurmukhi handwritten text. In Proceedings of the 2010 IEEE 2nd International Advance Computing Conference (IACC), Patiala, India, 19–20 February 2010. [Google Scholar]
- Dhir, R. Moment based invariant feature extraction techniques for bilingual character recognition. In Proceedings of the 2010 2nd International Conference on Education Technology and Computer, Shanghai, China, 22–24 June 2010; pp. V4-80–V4-84. [Google Scholar]
- Kumar, M.; Jindal, M.K.; Sharma, R.K. Review on OCR for handwritten Indian scripts character recognition. In Advances in Digital Image Processing and Information Technology, Proceedings of the First International Conference on Digital Image Processing and Pattern Recognition, DPPR 2011, Tirunelveli, India, 23–25 September 2011; Springer: Heidelberg, Germany, 2011; Volume 205, pp. 268–276. [Google Scholar]
- Kumar, M.; Jindal, M.K.; Sharma, R.K. Classification of characters and grading writers in offline handwritten Gurmukhi script. In Proceedings of the 2011 International Conference on Image Information Processing, Shimla, India, 3–5 November 2011; pp. 1–4. [Google Scholar]
- Kumar, M.; Sharma, R.K.; Jindal, M.K. Efficient feature extraction techniques for offline handwritten Gurmukhi character recognition. Natl. Acad. Sci. Lett. 2014, 37, 381–391. [Google Scholar] [CrossRef]
- Kumar, R.; Sharma, R.K.; Sharma, A. Recognition of multi-stroke based online handwritten Gurmukhi aksharas. Proc. Natl. Acad. Sci. India Sect. A Phys. Sci. 2015, 85, 159–168. [Google Scholar] [CrossRef]
- Verma, K.; Sharma, R.K. Recognition of online handwritten Gurmukhi characters based on zone and stroke identification. Sādhanā 2017, 42, 701–712. [Google Scholar] [CrossRef]
- Kumar, M.; Jindal, M.K.; Sharma, R.K. Offline handwritten Gurmukhi character recognition: Analytical study of different transformations. Proc. Natl. Acad. Sci. India Sect. A Phys. Sci. 2017, 87, 137–143. [Google Scholar] [CrossRef]
- Singh, H.; Sharma, R.K.; Singh, V.P. Efficient zone identification approach for the recognition of online handwritten Gurmukhi script. Neural Comput. Appl. 2019, 31, 3957–3968. [Google Scholar] [CrossRef]
- Kumar, N.; Gupta, S. A novel handwritten Gurmukhi character recognition system based on deep neural networks. Int. J. Pure Appl. Math. 2017, 117, 663–678. [Google Scholar]
- Singh, H.; Sharma, R.K.; Singh, V.P. Recognition of online unconstrained handwritten Gurmukhi characters based on Finite State Automata. Sādhanā 2018, 43, 192. [Google Scholar] [CrossRef]
- Mahto, M.K.; Bhatia, K.; Sharma, R.K. Robust Offline Gurmukhi Handwritten Character Recognition using Multilayer Histogram Oriented Gradient Features. Int. J. Comput. Sci. Eng. 2018, 6, 915–925. [Google Scholar] [CrossRef]
- Kumar, M.; Jindal, M.K.; Sharma, R.K.; Jindal, S.R. A novel framework for writer identification based on pre-segmented Gurmukhi characters. Sādhanā 2018, 43, 197. [Google Scholar] [CrossRef]
- Sakshi; Garg, N.K.; Kumar, M. Writer Identification System for handwritten Gurmukhi characters: Study of different feature-classifier combinations. In Proceedings of International Conference on Computational Intelligence and Data Engineering; Springer: Singapore, 2018; Volume 9, pp. 125–131. [Google Scholar]
- Kumar, M.; Jindal, S.R.; Jindal, M.K.; Lehal, G.S. Improved recognition results of medieval handwritten Gurmukhi manuscripts using boosting and bagging methodologies. Neural Process. Lett. 2019, 50, 43–56. [Google Scholar] [CrossRef]
- Garg, A.; Jindal, M.K.; Singh, A. Degraded offline handwritten Gurmukhi character recognition: Study of various features and classifiers. Int. J. Inf. Technol. 2019, 14, 145–153. [Google Scholar] [CrossRef]
- Garg, A.; Jindal, M.K.; Singh, A. Offline handwritten Gurmukhi character recognition: K-NN vs. SVM classifier. Int. J. Inf. Technol. 2019, 13, 2389–2396. [Google Scholar] [CrossRef]
- Kumar, M.; Jindal, M.K.; Sharma, R.K.; Jindal, S.R. Performance evaluation of classifiers for the recognition of offline handwritten Gurmukhi characters and numerals: A study. Artif. Intell. Rev. 2020, 53, 2075–2097. [Google Scholar] [CrossRef]
- Jindal, U.; Gupta, S.; Jain, V.; Paprzycki, M. Offline handwritten Gurumukhi character recognition system using deep learning. In Advances in Bioinformatics, Multimedia, and Electronics Circuits and Signals, Proceedings of the 2019 International Conference on Computing, Power and Communication Technologies (GUCON), New Delhi, India, 27–28 September 2019; Springer: Singapore, 2020; Volume 1064, pp. 121–133. [Google Scholar]
- Malakar, S.; Sharma, P.; Singh, P.K.; Das, M.; Sarkar, R.; Nasipuri, M. A Holistic approach for handwritten Hindi word recognition. Int. J. Comput. Vis. Image Processing 2017, 7, 59–78. [Google Scholar] [CrossRef]
- Bhowmik, S.; Malakar, S.; Sarkar, R.; Basu, S.; Kundu, M. Off-line Bangla handwritten word recognition: A holistic approach. Neural Comput. Appl. 2019, 31, 5783–5798. [Google Scholar] [CrossRef]
- Kaur, H.; Kumar, M. Offline handwritten Gurumukhi word recognition using extreme gradient boosting methodology. Soft Comput. 2021, 25, 4451–4464. [Google Scholar] [CrossRef]
- Lin, T.-Y.; Roy Chowdhury, A.; Maji, S. Bilinear CNN Models for Fine-Grained Visual Recognition. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1449–1457. [Google Scholar]
- Zhu, Y.; Sun, W.; Cao, X.; Wang, C.; Wu, D.; Yang, Y.; Ye, N. TA-CNN: Two-way attention models in deep convolutional neural network for plant recognition. Neurocomputing 2019, 365, 191–200. [Google Scholar] [CrossRef]
Sr No. | Class Name in English | Class Name in Gurumukhi | Time Duration | Type of Month |
---|---|---|---|---|
1. | Vaisakh | ਵਿਸਾਖ | 14 April to 14 May | Desi Months |
2. | Jeth | ਜੇਠ | 15 May to 14 June | |
3. | Harh | ਹਾੜ੍ਹ | 15 June to 15 July | |
4. | Sawan | ਸਾਉਣ | 16 July to 15 August | |
5. | Bhado | ਭਾਦੋਂ | 16 August to 14 September | |
6. | Assu | ਅੱਸੂ | 15 September to 14 October | |
7. | Katak | ਕੱਤਕ | 15 October to 13 November | |
8. | Magar | ਮੱਘਰ | 14 November to 13 December | |
9. | Poh | ਪੋਹ | 14 December to 12 January | |
10. | Magh | ਮਾਘ | 13 January to 11 February | |
11. | Phagun | ਫੱਗਣ | 12 February to 13 March | |
12. | Chet | ਚੇਤ | 14 March to 13 April | |
13. | January | ਜਨਵਰੀ | 1 January to 31 January | English Months |
14. | February | ਫਰਵਰੀ | 1 February to 28/29 February | |
15. | March | ਮਾਰਚ | 1 March to 31 March | |
16. | April | ਅਪ੍ਰੈਲ | 1 April to 30 April | |
17. | May | ਮਈ | 1 May to 31 May | |
18. | June | ਜੂਨ | 1 June to 30 June | |
19. | July | ਜੁਲਾਈ | 1 July to 31 July | |
20. | August | ਅਗਸਤ | 1 August to 31 August | |
21. | September | ਸਤੰਬਰ | 1 September to 30 September | |
22. | October | ਅਕਤੂਬਰ | 1 October to 31 October | |
23. | November | ਨਵੰਬਰ | 1 November to 30 November | |
24. | December | ਦਸੰਬਰ | 1 December to 31 December | |
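Sections 3.3–3.5 describe distributing these 24 classes into respective folders, normalizing the images, and augmenting the data. The snippet below is a minimal sketch of such a pipeline, assuming TensorFlow/Keras, a hypothetical `gurmukhi_months/` directory with one sub-folder per class, and 50 × 50 grayscale inputs as in the architecture table; the authors' exact augmentation settings are not given here, so the values shown are illustrative.

```python
# Minimal sketch: loading 24 class folders of 50x50 grayscale word images
# with rescaling (normalization) and light augmentation.
# Assumes TensorFlow/Keras; "gurmukhi_months/" is a hypothetical path.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rescale=1.0 / 255.0,      # normalize pixel values to [0, 1]
    rotation_range=5,         # small rotations (illustrative values)
    width_shift_range=0.05,
    height_shift_range=0.05,
    zoom_range=0.05,
    validation_split=0.2,     # hold out 20% of images for validation
)

train_gen = datagen.flow_from_directory(
    "gurmukhi_months/",
    target_size=(50, 50),
    color_mode="grayscale",
    class_mode="categorical",  # 24 one-hot encoded month classes
    batch_size=20,
    subset="training",
)

val_gen = datagen.flow_from_directory(
    "gurmukhi_months/",
    target_size=(50, 50),
    color_mode="grayscale",
    class_mode="categorical",
    batch_size=20,
    subset="validation",
)
```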
S.No. | Layers | Input Image Size | Filter Size | No. of Filters | Activation Function | Output | Parameters |
---|---|---|---|---|---|---|---|
1 | Input Image | 50 × 50 × 1 | ----- | ----- | ----- | ----- | ----- |
2 | Convolutional | 50 × 50 × 1 | 3 × 3 | 32 | ReLU | 50 × 50 × 32 | 320 |
3 | Maxpooling | 50 × 50 × 32 | Pool size 3 × 3 | ------ | ------ | 16 × 16 × 32 | 0 |
4 | Convolutional | 16 × 16 × 32 | 3 × 3 | 64 | ReLU | 16 × 16 × 64 | 18,496 |
5 | Convolutional | 16 × 16 × 64 | 3 × 3 | 64 | ReLU | 16 × 16 × 64 | 36,928 |
6 | Maxpooling | 16 × 16 × 64 | Pool size 2 × 2 | ------ | ------ | 8 × 8 × 64 | 0 |
7 | Convolutional | 8 × 8 × 64 | 3 × 3 | 128 | ReLU | 8 × 8 × 128 | 73,856 |
8 | Convolutional | 8 × 8 × 128 | 3 × 3 | 128 | ReLU | 8 × 8 × 128 | 147,584 |
9 | Maxpooling | 8 × 8 × 128 | Pool size 2 × 2 | ------ | ------ | 4 × 4 × 128 | 0 |
10 | Flatten | 4 × 4 × 128 | ---- | ----- | ----- | 2048 | 0 |
11 | Dense | 2048 | ---- | ----- | ReLU | 1024 | 2,098,176 |
12 | Dense | 1024 | ---- | ----- | Softmax | 24 | 24,600 |
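The layer-wise configuration in the table above can be written directly as a Keras model. The sketch below is an illustrative reconstruction from the table, assuming TensorFlow/Keras and 'same' padding for the convolutional layers (which the listed output sizes imply); it reproduces the tabulated output shapes and parameter counts but is not the authors' released code.

```python
# Illustrative reconstruction of the architecture table above
# (assumes TensorFlow/Keras and 'same' padding).
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(50, 50, 1)),                                # 50x50 grayscale input
    layers.Conv2D(32, (3, 3), padding="same", activation="relu"),   # 50x50x32, 320 params
    layers.MaxPooling2D(pool_size=(3, 3)),                          # 16x16x32
    layers.Conv2D(64, (3, 3), padding="same", activation="relu"),   # 16x16x64, 18,496 params
    layers.Conv2D(64, (3, 3), padding="same", activation="relu"),   # 16x16x64, 36,928 params
    layers.MaxPooling2D(pool_size=(2, 2)),                          # 8x8x64
    layers.Conv2D(128, (3, 3), padding="same", activation="relu"),  # 8x8x128, 73,856 params
    layers.Conv2D(128, (3, 3), padding="same", activation="relu"),  # 8x8x128, 147,584 params
    layers.MaxPooling2D(pool_size=(2, 2)),                          # 4x4x128
    layers.Flatten(),                                               # 2048
    layers.Dense(1024, activation="relu"),                          # 2,098,176 params
    layers.Dense(24, activation="softmax"),                         # 24 month classes, 24,600 params
])

model.summary()  # should match the shapes and parameter counts in the table
```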
Adam Optimizer’s Specification | Learning Rate (LR) | Loss Function | Metric | Number of Epochs | Batch Size (BS) |
---|---|---|---|---|---|
learning rate = 1.0 × 10⁻³, beta1 = 0.9, beta2 = 0.999, epsilon = 1.0 × 10⁻⁷, decay = learning rate/epochs | 0.0001 | Categorical cross entropy | Accuracy | 100 | 20 |
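A training setup corresponding to the table above might look as follows. This is a minimal sketch, assuming TensorFlow/Keras and the `model`, `train_gen`, and `val_gen` objects from the earlier sketches; the table lists both 1.0 × 10⁻³ (optimizer specification) and 0.0001 (LR column), and the decay term is expressed here as an inverse-time decay schedule because newer Keras versions no longer accept a `decay` argument on the optimizer. The batch size of 20 is already set in the generators.

```python
# Minimal sketch of the training configuration in the table above.
# Assumes TensorFlow/Keras; `model`, `train_gen`, `val_gen` come from
# the earlier sketches (hypothetical objects).
import tensorflow as tf

EPOCHS = 100
BASE_LR = 1e-4  # LR column of the table; the optimizer spec also mentions 1e-3

# decay = learning rate / epochs, per the table, applied per training step.
lr_schedule = tf.keras.optimizers.schedules.InverseTimeDecay(
    initial_learning_rate=BASE_LR,
    decay_steps=1,
    decay_rate=BASE_LR / EPOCHS,
)

optimizer = tf.keras.optimizers.Adam(
    learning_rate=lr_schedule,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-7,
)

model.compile(optimizer=optimizer,
              loss="categorical_crossentropy",  # 24-class one-hot labels
              metrics=["accuracy"])

# Batch size (20) is fixed by the generators, so it is not passed to fit().
history = model.fit(train_gen,
                    validation_data=val_gen,
                    epochs=EPOCHS)
```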
Model | Training Accuracy | Validation Accuracy | Training Loss | Validation Loss | Overall Precision | Overall Recall | Overall F1 Score |
---|---|---|---|---|---|---|---|
ResNet 50 | 0.3299 | 0.3929 | 2.1693 | 1.9268 | 0.4482 | 0.3937 | 0.3892 |
VGG 19 | 0.7530 | 0.7771 | 0.7560 | 0.6647 | 0.7929 | 0.7767 | 0.7756 |
VGG 16 | 0.7925 | 0.8138 | 0.6274 | 0.5484 | 0.8223 | 0.8135 | 0.8115 |
Proposed Model | 0.9703 | 0.9950 | 0.0885 | 0.0230 | 0.9950 | 0.9951 | 0.9950 |
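The transfer learning baselines in the table above (ResNet 50, VGG 19, VGG 16) can be built from pretrained Keras backbones with a small classification head. The exact head, input size, and fine-tuning policy used by the authors are not given here, so the following is only an illustrative sketch for a VGG16 baseline; replicating the grayscale images to three channels for the ImageNet backbone is likewise an assumption.

```python
# Illustrative sketch of a VGG16 transfer learning baseline for the
# 24 month classes. The head architecture, 50x50x3 input, and frozen
# backbone are assumptions, not the authors' exact configuration.
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(50, 50, 3))
base.trainable = False  # freeze the pretrained convolutional backbone

transfer_model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(24, activation="softmax"),  # 24 Gurumukhi month classes
])

transfer_model.compile(optimizer="adam",
                       loss="categorical_crossentropy",
                       metrics=["accuracy"])
```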
The Authors (Year) | Feature Extraction Method | Classifier | Dataset Used | Accuracy |
---|---|---|---|---|
[12] | Parabola curve fitting and power curve fitting | SVM and k-NN | 3500 offline handwritten Gurumukhi characters | 98.10% |
[15] | Discrete wavelet transforms, discrete cosine transforms, fast Fourier transforms and fan beam transforms | SVM | 10,500 samples of isolated offline handwritten Gurumukhi characters. | 95.8% |
[17] | Local binary pattern (LBP) features, directional features, and regional features | Deep neural network | 2700 images of Gurumukhi text | 99.3% |
[19] | Histogram oriented gradient (HOG) and pyramid histogram oriented gradient (PHOG) features | SVM | 3500 handwritten Gurumukhi characters | 99.1% |
[20] | Zoning, diagonal, transition, intersection and open end points, centroid, the horizontal peak extent, the vertical peak extent, parabola curve fitting, and power curve fitting-based features | Naive Bayes (NB), decision Tree (DT), random forest (RF) and AdaBoostM1 | 49,000 samples of Gurumukhi handwritten text | 89.85% |
[22] | Zoning, discrete cosine transforms and gradient features | k-NN, SVM, decision tree (DT), random forest (RF) | Medieval handwritten Gurumukhi manuscripts | 95.91% |
[23] | Zoning, diagonal, peak extent-based features (horizontally and vertically) and shadow features | k-NN, decision tree (DT) and random forest | 8960 samples of Gurumukhi handwritten text | 96.03% |
[25] | Vertical peak extent, diagonal, and centroid features | k-NN, linear-SVM, RBF-SVM, naive Bayes, decision tree, CNN, and random forest | 13,000 samples that include 7000 characters and 6000 numerals | 87.9% |
[26] | Automatic feature extraction | Convolutional neural network | 3500 Gurumukhi characters | 98.32% |
Proposed Model | Automatic feature extraction | Convolutional neural network | 24,000 Gurumukhi Month Name Images | 99.50% |