ROENet: A ResNet-Based Output Ensemble for Malaria Parasite Classification
Round 1
Reviewer 1 Report
· Initially, they argued that there are some limitations in CNN-based models, and their proposed method is going to solve the above problems, but what the authors proposed is again a CNN-based model with the same problems they were supposed to solve!
· The tables are not referenced correctly!
· There are no references for the methods they have used including ELM, RVFL, and SNN. For example, for RVFL they have to cite the paper they get inspired from, and clearly state what is their contribution on top of the mentioned algorithm.
· An explanation is needed for the Max-epoch value in section 4.1. Why is this going to be converged that fast? Is there any explanation or hypothesis for that?
· In section 4.6, are the other methods benchmarked with the same dataset? The author should mention that.
· In section 4.6, the author claims that using RNN is faster than other SOTA methods. How can they claim this? Have they done any experiments and comparisons speed-wise? The reasons mentioned may correlate with the superiority of their method’s performance, but they cannot by themselves explain the roughly 0.4 percent higher results.
Author Response
- Initially, they argued that there are some limitations in CNN-based models, and their proposed method is going to solve the above problems, but what the authors proposed is again a CNN-based model with the same problems they were supposed to solve!
Response: Thank you for your advice. We have realized this problem and tried our best to fix it. ‘However, we believe that the classification performance of malaria parasites can be improved. We propose a new model (ROENet) to automatically classify malaria parasites on the blood smears.’
- The tables are not referenced correctly!
Response: Thank you for your careful review. We have checked all the table references.
- There are no references for the methods they have used including ELM, RVFL, and SNN. For example, for RVFL they have to cite the paper they get inspired from, and clearly state what is their contribution on top of the mentioned algorithm.
Response: Thank you for your advice. We added references in section 3.1. ‘We choose randomized neural networks (RNNs) as the classifier in our proposed model. Three RNNs are used in ROENet, which are random vector functional link (RVFL) [17], Schmidt neural network (SNN) [18], and extreme learning machine (ELM) [19].’
- An explanation is needed for the Max-epoch value in section 4.1. Why is this going to be converged that fast? Is there any explanation or hypothesis for that?
Response: Thank you for your suggestion. We have explained it in Section 4.1. ‘We set the max-epoch to 4 to prevent the overfitting problem. The learning rate is set as . The minibatch size is 128. The data set used in this paper is small and the batch size is large, so the convergence is fast.’
- In section 4.6, are the other methods benchmarked with the same dataset? The author should mention that.
Response: Thank you for your comment. We mentioned it in section 4.6. ‘The DCGAN, Computer-Automated-CNN, and 3-layer CNN used the same data set as this paper. Other SOTA methods used different data sets.’
- In section 4.6, the author claims that using RNN is faster than other SOTA methods. How can they claim this? Have they done any experiments and comparisons speed-wise? The reasons mentioned may correlate with the superiority of their method’s performance, but they cannot by themselves explain the roughly 0.4 percent higher results.
Response: Thank you for your careful review. In section 4.6, we have reinterpreted why our model achieved better classification performance. ‘There are three reasons why our model can achieve better results than other SOTA methods. (i). The ResNet-18 is the backbone of our model, which can accurately extract features. (ii). We use RNN as the classifier, which can avoid overfitting problems. (iii) The results of ROENet are the ensemble outputs from three RNNs, which can improve the classification performance.’
Reviewer 2 Report
The topic of this paper is ROENet: a ResNet-based output ensemble for malaria parasite classification. This paper mainly proposes (1) using a fine-tuned ResNet-18 as a backbone to extract features, and (2) ensembling three RNN (Randomized Neural Network) algorithms, namely random vector functional link (RVFL), Schmidt neural network (SNN), and extreme learning machine (ELM), as a classifier. This classifier is used for the malaria classification task. This approach is different from the general DNN approach and is innovative. In addition, the whole process of building the classification model is clearly explained in this paper, and all aspects of experimental analysis are included.
However, some more modifications are needed before it can be accepted. Here are some of my suggestions:
1. In section 2: To enhance the information and explanation of the modeled data, please describe the differences between the healthy and infected images, besides the example images. Are the pink areas marked by experts? Is the healthy image completely clean and free of mottling? Is a healthy image completely clean and free of noise? Is it infected as long as it has noise?
2. Please add more introduction of Randomized Neural Network, especially their differences from CNN. Apart from having fewer layers than CNN, what kinds of tasks are Randomized Neural Networks typically used to solve?
3. The introduction and reference of the three methods random vector functional link (RVFL), Schmidt neural network (SNN), and extreme learning machine (ELM) should be added to explain why these three methods are chosen.
4. The ensemble is one of the main points of this paper; the author should explain more about how to integrate the decisions from the three RNNs to get the final output.
5. In section 4.6, it is mentioned that "There are three reasons why our model can achieve better results than other SOTA methods. ...... (ii). We use RNN as the classifier, which is faster and can avoid overfitting problems.". Can you supplement the experimental data for this part?
6. Others:
6.1 In Table 2: RNN should be changed to RNN (Randomized Neural Network) to avoid mixing with Recurrent Neural Network when reading for the first time.
6.2 In rows 114, 135, 206, 210, 220, 234, 235, 247, 248 "Error! Reference source not found" appears.
6.3 There is a duplication of semantics in lines 246 to 248.
Author Response
- In section 2: To enhance the information and explanation of the modeled data, please describe the differences between the healthy and infected images, besides the example images. Are the pink areas marked by experts? Is the healthy image completely clean and free of mottling? Is a healthy image completely clean and free of noise? Is it infected as long as it has noise?
Response: Thank you for your suggestion. We explained it in section 2. ‘The image processing method used in this open dataset is to find parasites in the digital image of blood film. The typical shape, data, and visual appearance of parasites are marked manually by experts. If there is no expert mark, the image is uninfected.’
- Please add more introduction of Randomized Neural Network, especially their differences from CNN. Apart from having fewer layers than CNN, what kinds of tasks are Randomized Neural Networks typically used to solve?
Response: Thank you for your advice. We have added some introductions to Randomized Neural Network in section 3.3. ‘There are many layers in the CNN model, and each layer has many parameters. The randomized neural networks (RNNs) have only three simple layers: input layer, hidden layer, and output layer. The shallow three-layer structure of the RNN model can effectively alleviate the overfitting problem. The parameters (the randomized weights and biases) in the RNN model are also trained quickly. Because RNN has good classification performance, it has been applied to many machine learning tasks in fields such as geography, big data analysis, chemistry, and so on.’
- The introduction and reference of the three methods random vector functional link (RVFL), Schmidt neural network (SNN), and extreme learning machine (ELM) should be added to explain why these three methods are chosen.
Response: As suggested by the reviewer, we have added the introduction and reference of the three methods and explained the reason why we selected these three methods. ‘Three RNNs are used in this paper, which are ELM, RVFL, and SNN. ELM projects the input features into the hidden space randomly and does not need gradient-based backpropagation to adjust the weights. The most obvious structural difference between RVFL and ELM is that there is a quick connection between input and output in RVFL. This quick connection can effectively improve the classification performance of RVFL and the robustness of the model. SNN is an RNN model proposed by Schmidt, Kraijveld, and Duin thirty years ago. The structure of SNN is consistent with that of ELM, but in the SNN model, the output layer has a learnable output bias. These three RNN models are very classical and have achieved great classification performance since they were proposed.’
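The structural differences described in this response can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `fit_randomized_net`, the sigmoid activation, the uniform weight initialization, the hidden-layer size, the pseudo-inverse solution for the output weights, and the toy data are all assumptions made for illustration. The sketch only reflects what the response states: the hidden weights are random and never updated by backpropagation, RVFL adds a direct input-to-output link, and SNN adds a learnable output bias.

```python
import numpy as np

def fit_randomized_net(X, Y, n_hidden=32, variant="elm", seed=0):
    """Fit a three-layer randomized network and return a predict function.

    variant: "elm", "rvfl" (adds a direct input-to-output link), or
    "snn" (adds a learnable output bias). All hyperparameters here are
    illustrative assumptions, not values from the paper.
    """
    if variant not in ("elm", "rvfl", "snn"):
        raise ValueError(f"unknown variant: {variant}")
    rng = np.random.default_rng(seed)
    # Hidden weights and biases are drawn once at random and never trained.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)

    def design(X_in):
        H = 1.0 / (1.0 + np.exp(-(X_in @ W + b)))  # sigmoid hidden layer
        if variant == "rvfl":
            H = np.hstack([H, X_in])  # direct link from input to output
        elif variant == "snn":
            H = np.hstack([H, np.ones((X_in.shape[0], 1))])  # output bias
        return H

    # Only the output weights are learned, in closed form (least squares).
    beta = np.linalg.pinv(design(X)) @ Y
    return lambda X_new: design(X_new) @ beta

# Toy two-class data standing in for extracted image features (hypothetical).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (50, 4)), rng.normal(2.0, 1.0, (50, 4))])
labels = np.array([0] * 50 + [1] * 50)
Y = np.eye(2)[labels]  # one-hot targets

accuracies = {}
for variant in ("elm", "rvfl", "snn"):
    predict = fit_randomized_net(X, Y, variant=variant)
    accuracies[variant] = float(np.mean(predict(X).argmax(axis=1) == labels))
```

Because the output weights are obtained in a single least-squares solve rather than by iterative gradient descent, training is cheap; this is the property the response refers to when it says that ELM does not need gradient-based backpropagation.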
- The ensemble is one of the main points of this paper; the author should explain more about how to integrate the decisions from the three RNNs to get the final output.
Response: As the reviewer advised, we explained it in section 3.3. ‘Although the RNN model is simple, bad weights and biases will seriously affect the classification performance. So in this paper, we combine the results of three RNN models to get the final classification model based on majority voting. Because the three RNN models used in this paper have some differences, it is more helpful to obtain diversified information, to further improve the performance and robustness of the system.’
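The majority-voting step mentioned in this response can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `majority_vote` and the example predictions are hypothetical, and the labels 0 and 1 are assumed to stand for uninfected and infected.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists into one label per sample.

    predictions: a list with one list of predicted labels per model;
    all lists must have the same length.
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")
    voted = []
    for sample_votes in zip(*predictions):
        # Pick the label that most models agree on for this sample.
        label, _count = Counter(sample_votes).most_common(1)[0]
        voted.append(label)
    return voted

# Hypothetical predictions from the three classifiers on four samples.
elm_pred = [1, 0, 1, 1]
rvfl_pred = [1, 1, 0, 1]
snn_pred = [0, 0, 1, 1]
ensemble_pred = majority_vote([elm_pred, rvfl_pred, snn_pred])
print(ensemble_pred)  # [1, 0, 1, 1]
```

With three voters and two classes a tie cannot occur, so each sample always receives the label chosen by at least two of the three models.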
- In section 4.6, it is mentioned that "There are three reasons why our model can achieve better results than other SOTA methods. ...... (ii). We use RNN as the classifier, which is faster and can avoid overfitting problems.". Can you supplement the experimental data for this part?
Response: Thank you for your careful review. In section 4.6, we have reinterpreted why our model achieved better classification performance. ‘There are three reasons why our model can achieve better results than other SOTA methods. (i). The ResNet-18 is the backbone of our model, which can accurately extract features. (ii). We use RNN as the classifier, which can avoid overfitting problems. (iii) The results of ROENet are the ensemble outputs from three RNNs, which can improve the classification performance.’
- Others: 6.1 In Table 2: RNN should be changed to RNN (Randomized Neural Network) to avoid mixing with Recurrent Neural Network when reading for the first time. 6.2 In rows 114, 135, 206, 210, 220, 234, 235, 247, 248 "Error! Reference source not found" appears. 6.3 There is a duplication of semantics in lines 246 to 248.
Response: Thank you for your careful review. We have changed the RNN to RNN (Randomized Neural Network) in Table 2. We have fixed the "Error! Reference source not found" problem. We deleted the duplication of semantics.
Reviewer 3 Report
The research problem addressed in this paper is interesting, and I appreciate the authors' efforts. However, I suggest that the authors address the following comments.
1. To most of the readers, RNN is recurrent neural network.
2. Look into the proper citation of the references.
3. Why the ResNet architecture is preferred for solving this problem needs to be highlighted.
4. A comparative analysis with some basic deep learning architectures might better reflect the novelty of and need for the proposed model.
Author Response
- To most of the readers, RNN is recurrent neural network.
Response: Thank you for your advice. To avoid misleading readers, we now explain the term RNN where it is first used in the abstract, the main text, and the table.
- Look into the proper citation of the references.
Response: Thank you for your suggestion. We have checked all the reference citations to make sure that they are correct.
- Why the ResNet architecture is preferred for solving this problem needs to be highlighted.
Response: Thank you for your suggestion. We explain it in section 4.3. ‘Because the experimental results show that ResNet-18 achieves the best classification results when it is used as the backbone model, ResNet-18 is selected as the preferred architecture in this paper.’
- A comparative analysis with some basic deep learning architectures might better reflect the novelty of and need for the proposed model.
Response: As the reviewer suggested, we test the effects of the output ensemble in section 4.4 and compare our model with the basic ResNet in section 4.5. In both sections, our model achieves better classification performance, which supports the novelty of and need for the proposed model.
Round 2
Reviewer 1 Report
The authors improved the manuscript based on my comments.
Reviewer 2 Report
The manuscript has generally been revised and adjusted according to the previous suggestions, so we suggest accepting this paper.