Combining Multi-Scale Fusion and Attentional Mechanisms for Assessing Writing Accuracy
Abstract
Featured Application
1. Introduction
2. Multi-Scale-SE Recognition Methods
2.1. Multi-Scale-SE Recognition Models
2.2. The Bottleneck Layer Structure of the Fusion Attention Mechanism
2.3. Multi-Scale Fusion Feature Extraction
3. Results
3.1. Dataset
3.2. Experimental Parameters
3.3. Experimental Results and Ablation Experiments
3.4. Generalization Experiments and Comparisons
4. Conclusions and Further Research
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Layer Name | MSSE Model | Output Shape |
---|---|---|
Input | 96 × 96 image | 96 × 96 × 1 |
Conv1 | 3 × 3 conv | 96 × 96 × 64 |
Conv2 | 3 × 3 conv | 96 × 96 × 64 |
AvgPool | 3 × 3 avgpool | 48 × 48 × 64 |
MSSEBottleneck1 | 1 × 1 conv; 3 × 3 conv / 5 × 5 conv; 1 × 1 conv | 48 × 48 × 96 |
AvgPool | 3 × 3 avgpool | 24 × 24 × 96 |
MSSEBottleneck2 | 1 × 1 conv; 3 × 3 conv / 5 × 5 conv; 1 × 1 conv | 24 × 24 × 128 |
AvgPool | 3 × 3 avgpool | 12 × 12 × 128 |
MSSEBottleneck3 | 1 × 1 conv; 3 × 3 conv / 5 × 5 conv; 1 × 1 conv | 12 × 12 × 256 |
AvgPool | 3 × 3 avgpool | 6 × 6 × 256 |
MSSEBottleneck4 | 1 × 1 conv; 3 × 3 conv / 5 × 5 conv; 1 × 1 conv | 6 × 6 × 448 |
GAP | gap, dropout | 1 × 1 × 448 |
Output | Softmax | 3755 |
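The output shapes in the table above can be checked with a short shape trace. The sketch below is a minimal illustration, not the authors' code. It assumes stride 1 with padding 1 for the 3 × 3 convolutions, stride 2 with padding 1 for the 3 × 3 average pooling, and that each MSSE bottleneck keeps the spatial size and only changes the channel count. The table states none of these strides or paddings, and the names `conv_out`, `LAYERS`, and `trace_shapes` are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


# (layer name, kind, kernel size, output channels), following the table.
# Kernel sizes and channel counts come from the table; strides and
# paddings used in trace_shapes are assumptions.
LAYERS = [
    ("Conv1", "conv", 3, 64),
    ("Conv2", "conv", 3, 64),
    ("AvgPool", "pool", 3, None),
    ("MSSEBottleneck1", "block", None, 96),
    ("AvgPool", "pool", 3, None),
    ("MSSEBottleneck2", "block", None, 128),
    ("AvgPool", "pool", 3, None),
    ("MSSEBottleneck3", "block", None, 256),
    ("AvgPool", "pool", 3, None),
    ("MSSEBottleneck4", "block", None, 448),
]


def trace_shapes(size=96, channels=1):
    """Return (layer name, (height, width, channels)) for every layer."""
    shapes = [("Input", (size, size, channels))]
    for name, kind, kernel, out_ch in LAYERS:
        if kind == "conv":
            # Assumed stride 1, padding 1: spatial size is preserved.
            size = conv_out(size, kernel, stride=1, padding=1)
            channels = out_ch
        elif kind == "pool":
            # Assumed stride 2, padding 1: spatial size is halved.
            size = conv_out(size, kernel, stride=2, padding=1)
        else:
            # Bottleneck assumed to preserve spatial size.
            channels = out_ch
        shapes.append((name, (size, size, channels)))
    # Global average pooling collapses each channel to a single value.
    shapes.append(("GAP", (1, 1, channels)))
    return shapes


for name, shape in trace_shapes():
    print(f"{name:16s} {shape}")
```

Under these assumptions the trace reproduces the table: the resolution halves at every pooling step (96, 48, 24, 12, 6) and the feature vector has 448 channels before the 3755-way softmax.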
Dataset | Writers | Samples |
---|---|---|
Writing Grade Examination | 240 | 268,000 |
CASIA-HWDB1.0 | 420 | 1,680,258 |
CASIA-HWDB1.1 | 300 | 1,172,907 |
ICDAR-2013 | 60 | 224,419 |
Model | Accuracy | F1 | Recall | Kappa |
---|---|---|---|---|
MSSE | 98.6% | 96.9% | 95.7% | 94.3% |
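The four metrics in the table above can all be derived from a confusion matrix. The sketch below shows one common way to compute them, using macro averaging for recall and F1. Whether the reported figures use macro averaging is an assumption, since the table does not say. The function name `metrics` and the toy two-class matrix are hypothetical and are not data from the paper.

```python
def metrics(cm):
    """Accuracy, macro recall, macro F1 and Cohen's kappa.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]                             # true counts
    col_sums = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted counts
    accuracy = sum(cm[i][i] for i in range(k)) / total

    recalls = [cm[i][i] / row_sums[i] if row_sums[i] else 0.0 for i in range(k)]
    precisions = [cm[i][i] / col_sums[i] if col_sums[i] else 0.0 for i in range(k)]
    f1s = [2 * p * r / (p + r) if (p + r) else 0.0
           for p, r in zip(precisions, recalls)]

    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return accuracy, sum(recalls) / k, sum(f1s) / k, kappa


# Toy two-class example (illustrative numbers only).
toy = [[9, 1],
       [2, 8]]
acc, recall, f1, kappa = metrics(toy)
print(f"accuracy={acc:.3f} recall={recall:.3f} f1={f1:.3f} kappa={kappa:.3f}")
```

Kappa corrects accuracy for chance agreement, so it is lower than accuracy whenever chance agreement is above zero.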
Experiment | Mean ± SD (%) | 95% CI (%) | p-Value |
---|---|---|---|
Monte Carlo | 98.19 ± 0.316 | [97.964, 98.416] | 0.002 |
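The reported interval is consistent with a Student-t confidence interval on the mean. The sketch below reproduces it from the mean and standard deviation in the table. The number of Monte Carlo runs, `n = 10`, is an assumption: the table does not state it, and the value is chosen here because it reproduces the reported bounds. The critical value 2.262 is the standard two-sided 95% quantile of Student's t with 9 degrees of freedom.

```python
import math

mean = 98.19     # reported mean (from the table)
sd = 0.316       # reported standard deviation (from the table)
n = 10           # ASSUMED number of Monte Carlo runs; not stated in the table
t_crit = 2.262   # two-sided 95% critical value of Student's t, n - 1 = 9 d.o.f.

# Half-width of the interval: t * (standard error of the mean).
half_width = t_crit * sd / math.sqrt(n)
ci = (round(mean - half_width, 3), round(mean + half_width, 3))
print(ci)
```

With these inputs the half-width is about 0.226, which gives bounds matching the table to three decimals.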
Method | Accuracy |
---|---|
Basic | 96.54% |
Multi-Basic | 97.38% |
SE-Basic | 97.65% |
MSSE | 98.60% |
Method | Accuracy |
---|---|
Human Recognition | 96.13% |
DFE+DLQDF | 92.72% |
MCDNN | 95.79% |
ATR-CNN | 95.04% |
HCCR-Gabor-GoogLeNet | 96.35% |
DirectMap-ConvNet-Adaptation | 96.55% |
HCCR-CNN12layer-GSLRE 4X | 96.73% |
Ensemble-CNN-voting | 96.79% |
ResNet–Centerloss | 97.03% |
Skew Correction Based on ResNet | 95.50% |
SqueezeNet | 96.32% |
CUE | 96.96% |
Ours (MSSE) | 97.05% |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, R.; Shi, Y.; Tang, X.; Liu, X. Combining Multi-Scale Fusion and Attentional Mechanisms for Assessing Writing Accuracy. Appl. Sci. 2025, 15, 1204. https://doi.org/10.3390/app15031204