Article
Peer-Review Record

SGooTY: A Scheme Combining the GoogLeNet-Tiny and YOLOv5-CBAM Models for Nüshu Recognition

Electronics 2023, 12(13), 2819; https://doi.org/10.3390/electronics12132819
by Yan Zhang and Liumei Zhang *
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Submission received: 12 May 2023 / Revised: 22 June 2023 / Accepted: 23 June 2023 / Published: 26 June 2023
(This article belongs to the Special Issue Deep Learning for Computer Vision Application)

Round 1

Reviewer 1 Report

This manuscript proposes a publicly available Chinese Nüshu character dataset and applies several deep neural networks, with architectural modifications, to the image classification task. Overall, the paper is well written. The authors conducted a thorough literature review and presented the networks in detail. Only minor revisions are needed:

1. There are minor formatting issues. For example, in line 51, "datasets,In this paper," there should be a space before "In" and "In" should be lowercased; in line 325, "cross-entropy[41](BCE)", there should be a space before "(BCE)".

2. I suggest adding a short paragraph at the end of the Introduction section that briefly describes how the article is organized, so that readers can follow it more easily.

The quality of the English language is good; only minor formatting issues need to be fixed.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The authors present a method for recognizing handwritten characters. They developed a two-stage recognition system: GoogLeNet-tiny is applied in the first stage, and YOLOv5-CBAM in the second.

* Comments for unclear points in the manuscript:

1. The authors propose a two-stage recognition method. However, it is unclear how the first-stage model determines which characters it has recognized incorrectly and passes them to the second-stage model. Please describe this mechanism in more detail.

2. Ensemble methods, which combine several weak models, are a common way to classify data. Why did the authors develop a two-stage model instead?

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

1. Section 4.4.2, Parameter Setting, emphasizes the importance of the initial configuration. However, it does not explain why the Discrete Staircase Method was adopted as the optimization algorithm for training the deep learning model, given that multiple optimization algorithms are available for this purpose.

2. The paper claims that the addition of a BN (Batch Normalization) layer to the GoogLeNet-tiny model significantly improves its training speed. However, the paper does not provide direct comparison data between the GoogLeNet-tiny model and the GoogLeNet model. This lack of comparison or evaluation of performance differences and training speed improvement between the two models is a limitation.

3. In the experimental results, various graphs clearly demonstrate the training outcomes of YOLOv5. However, for the basic models (AlexNet-BN, AlexNet-SC, AlexNet-LR, and GoogLeNet-tiny), no visual material is provided apart from the validation accuracy and training loss graphs. To strengthen the paper, it would be beneficial to include graphs for the basic models similar to those shown for YOLOv5 training, so that their results are visualized just as clearly.
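
For readers unfamiliar with the term in comment 1, the following is a minimal sketch of a staircase (step-wise) learning-rate schedule. It assumes that the "Discrete Staircase Method" refers to this kind of schedule, which this record does not confirm. The function name `staircase_lr` and all parameter names and values are hypothetical and are not taken from the manuscript.

```python
def staircase_lr(initial_lr: float, decay_rate: float,
                 decay_steps: int, step: int) -> float:
    """Learning rate that drops by a fixed factor every `decay_steps` steps.

    Assumption: this is a generic step-decay schedule, not necessarily
    the configuration used by the authors.
    """
    # Integer division keeps the rate constant within each interval,
    # which produces the discrete "staircase" shape.
    completed_intervals = step // decay_steps
    return initial_lr * decay_rate ** completed_intervals
```

Under these assumptions the rate is constant for `decay_steps` training steps and then falls by the factor `decay_rate`, in contrast to a smooth exponential decay that changes at every step.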

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

1. In lines 237-242, what does the threshold mean? Please explain the role of the threshold in the training process and in the test process separately, with mathematical expressions.

2. For example, if the predicted label scores below the threshold value, is the input image then passed to YOLOv5 in the test process?

3. In lines 237-242, the first-stage threshold value is not given. The authors must state the exact threshold value that was applied.
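
The behaviour asked about in comment 2 can be sketched as a generic confidence-threshold cascade at test time. This is an illustration under stated assumptions, not the authors' implementation: the function `cascade_predict`, the `(label, confidence)` return convention, and the `threshold` parameter are hypothetical, and the actual threshold value is not given in this record.

```python
from typing import Any, Callable, Tuple

# (class index, confidence in [0, 1]); this tuple layout is an assumption.
Prediction = Tuple[int, float]
Model = Callable[[Any], Prediction]


def cascade_predict(
    image: Any,
    first_stage: Model,
    second_stage: Model,
    threshold: float,
) -> Tuple[int, float, int]:
    """Return (label, confidence, stage_used) for one input image."""
    label, confidence = first_stage(image)
    # Accept the first-stage result only if it is confident enough.
    if confidence >= threshold:
        return label, confidence, 1
    # Otherwise defer the same image to the second-stage model.
    label, confidence = second_stage(image)
    return label, confidence, 2
```

In this sketch the threshold acts only at inference time; whether it also plays a role during training is exactly what comment 1 asks the authors to clarify.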

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

I think the revised manuscript addresses all of the comments raised by the reviewer. I recommend it for publication.

Author Response

We greatly appreciate your valuable suggestions.
