NeuralNCD: A Neural Network Cognitive Diagnosis Model Based on Multi-Dimensional Features
Round 1
Reviewer 1 Report
The authors have investigated a neural network cognitive diagnosis model based on multi-dimensional features of the exercises for modeling the student learning process. The methodology used is not novel, but the application is well suited to the journal.
However, the article presents certain inaccuracies; please correct them!
57-74 the section is very repetitive!
109-113 rephrase the sentence
125 write "Information use"
Fig.4 describe in caption IRT and neu-add
Give the information about the database : number of data, students...etc..
The introduction section lacks of the description of Neural Network and applications. I suggest to cite the articles:
Sciuto, G. L., Susi, G., Cammarata, G., & Capizzi, G. (2016, June). A spiking neural network-based model for anaerobic digestion process. In 2016 International Symposium on Power Electronics, Electrical Drives, Automation and Motion (SPEEDAM) (pp. 996-1003). IEEE.
The article can be accepted with major revision
Author Response
Response to Reviewer 1 Comments:
Point 1: 57-74 the section is very repetitive!
Response 1: Thanks for your suggestions. We apologize for some structural errors that led to poor readability due to repeated definitions. The content of lines 57-74 has been revised: some repeated definitions have been removed, and some statements have been modified (Section 1, P.2, rows 57-75).
Point 2: 109-113 rephrase the sentence
Response 2: Thanks for your suggestions. The description of DINA has been revised (Section 2, P.2, rows 108-112).
Point 3: 125 write "Information use"
Response 3: Thanks for your suggestions. The HO-DINA information use has been added (Section 2, P.3, rows 123-128).
Point 4: Fig.4 describe in caption IRT and neu-add
Response 4: Thanks for your suggestions. We apologize for the blunder in Figure 4. "neu-add" is a writing error and has been corrected to NeuralNCD, which is the model proposed in this paper; IRT is introduced in (Section 1, P.2, row 49), citing the 8th reference.
Point 5: Give the information about the database : number of data, students...etc..
Response 5: Thanks for your suggestions. In this paper, we use two databases, ASSIST2009 and FrcSub. ASSIST2009 has 17,746 exercises that received 4,163 student responses and contains a total of 123 knowledge concepts. FrcSub has 536 student interactions on 20 questions, and its exercises contain a total of 8 different knowledge points.
Point 6: The introduction section lacks of the description of Neural Network and applications. I suggest to cite the articles.
Response 6: We have benefited from a careful reading of the recommended paper. We learned about the application of neural networks, as can be seen in Reference 12. The reference we added is the following:
- Sciuto, G. L., Susi, G., Cammarata, G., & Capizzi, G. (2016, June). A spiking neural network-based model for anaerobic digestion process. In 2016 International Symposium on Power Electronics, Electrical Drives, Automation and Motion (SPEEDAM) (pp. 996-1003). IEEE.
Thank you for your kind comments! And thank you so much for your professional and careful opinion!
Author Response File: Author Response.docx
Reviewer 2 Report
This article needs a complete review in several parts highlighted as follows (there are some oversights in the preparation of this article):
1. English review.
2. This article has too many self-citations. In this case, therefore, the recommendations of COPE (Committee on Publication Ethics) should be adopted.
3. Citations are missing in essential parts of the text, for example, some equations, RMSE, AUC, fuzzy set, deep learning, neural networks, Q-Matrix, etc.;
4. I suggest not repeating the definition of acronyms.
5. I recommend using benchmark publications in the bibliography whenever possible.
6. Some acronyms' definitions are missing, e.g., DNNa, BN, DIAN, REMS, etc.
7. Discriminate the use of the defined x equations in equations (7) and (8).
8. Equations (9)-(11): the Greek letter Phi (uppercase and lowercase) is used for the same parameter.
9. The Theta parameter definition is missing.
10. Unify the style used in the representing parameters/variables (italics and sometimes the usual style is used).
11. Considering the results presented in Table 2, I observe that there is no method that can be highlighted. Can the methodologies used for the resolution (comparative study) of the experiment performed to be considered benchmarks in the literature?
12. Considering the results presented in Table 2, I observe that there is no method that can be highlighted.
13. What are the advantages and disadvantages that should highlight?
14. I suggest to the Authors highlight objectively the innovation of this proposal about literature.
Author Response
Response to Reviewer 2 Comments:
Point 1: English review.
Response 1: Thanks for your suggestions. Confirmed.
Point 2: This article has too many self-citations. In this case, therefore, the recommendations of COPE (Committee on Publication Ethics) should be adopted.
Response 2: Thanks for your suggestions. We apologize for this problem and are actively working to improve it. We have restructured the references by removing three self-cited references, adding one related paper on neural network applications and two new papers on cognitive diagnostic methods, as well as two references for additional definitions.
Point 3: Citations are missing in essential parts of the text, for example, some equations, RMSE, AUC, fuzzy set, deep learning, neural networks, Q-Matrix, etc.
Response 3: Thanks for your suggestions. We apologize that some passages read poorly due to missing citations. Equations, deep learning, neural networks, and the Q-matrix are introduced in the introduction and related work sections (Sections 1 and 2), so no citations were added afterwards. RMSE and AUC are generic evaluation metrics, and a citation has been added in the paper: [32] Wu, S.; Flach, P. A scored AUC metric for classifier evaluation and selection. In Second Workshop on ROC Analysis in ML, Bonn, Germany, 2005; pp. 247-262.
Point 4: I suggest not repeating the definition of acronyms.
Response 4: Thanks for your suggestions. We apologize for some structural errors that led to poor readability due to repeated definitions. The definitions of acronyms in the text have been streamlined, and some repetitive definitions have been removed.
Point 5: I recommend using benchmark publications in the bibliography whenever possible.
Response 5: Thanks for your suggestions. We have revised the references to remove some of the less relevant ones and added five articles that can serve as benchmark publications.
Point 6: Some acronyms' definitions are missing, e.g., DNNa, BN, DIAN, REMS, etc.
Response 6: Thanks for your suggestions. DNNa and DNNb have been defined as single-layer neural networks with the same structure but different weights. BN is the data preprocessing layer; its definition has been added and adapted to the article structure (Section 3, P.6, rows 235-236). DINA is defined in the introductory section of the article (Section 1), so no changes were made there. RMSE is a generic evaluation metric, and the instructions for its use are explained in the text (Section 4, P.8, rows 283-284).
Point 7: Discriminate the use of the defined x equations in equations (7) and (8).
Response 7: The x defined in equations (7) and (8) is the same variable; the two equations describe a sequential process, with equation (7) applied first and equation (8) applied afterwards.
Point 8: Equations (9)-(11): the Greek letter Phi (uppercase and lowercase) is used for the same parameter.
Response 8: Phi in equations (9)-(11) always represents one layer of neural network computation, but the parameters are not the same.
Point 9: The Theta parameter definition is missing.
Response 9: Thanks for your suggestions. Theta is a trainable parameter; optimal values are obtained through training to help the neural network better normalize the data. We have added the definition in (Section 3, P.6, rows 241-243).
Point 10: Unify the style used in the representing parameters/variables (italics and sometimes the usual style is used).
Response 10: Thanks for your suggestions. The style of the representative parameters/variables has been standardized to italics.
Point 11: Considering the results presented in Table 2, I observe that there is no method that can be highlighted. Can the methodologies used for the resolution (comparative study) of the experiment performed to be considered benchmarks in the literature?
Response 11: Thanks for your suggestions. The previous description of the results was too brief and lacked critical analysis, which made it seem as though no method in the results section could be highlighted. Based on your comments, we have added an analysis of the results and of the reasons for the model's superiority.
On the ASSIST2009 dataset, the NeuralNCD model proposed in this paper shows a large improvement (5%-10%) in the individual evaluation metrics compared to the CDMs without neural networks (IRT, MIRT, PMF, DINA), which proves that the neural network used in this paper can better capture the relationship between students and exercises than traditional cognitive diagnostic models.
There is also a 2.5% improvement in the AUC evaluation metric compared to the latest NeuralCDM model using neural networks, which also indicates that integrating multiple exercise features into a cognitive diagnostic model can lead to more accurate and reasonable diagnostic results.
The comparison of all evaluation metrics is based on the results of other references [30], and the NeuralNCD model proposed in this paper shows a greater improvement in the diagnosis of student states.
Point 12: Considering the results presented in Table 2, I observe that there is no method that can be highlighted.
Response 12: Thanks for your suggestions. We admit that the conclusion for the FrcSub dataset was weak and did not get to the point.
Table 2 shows that, on the FrcSub dataset, the NeuralNCD model proposed in this paper achieves a 1%-5% improvement in the AUC evaluation metric, which is not very satisfactory compared to the model's improvement on the ASSIST2009 dataset. The reason is that the FrcSub dataset has fewer knowledge points per exercise, making it more difficult for the NeuralNCD model to extract exercise features; detailed reasons are given in the paper (Section 4, P.10, rows 359-361). We will address these issues in the future to improve the performance of the model.
Point 13: What are the advantages and disadvantages that should highlight?
Response 13: Thanks for your suggestions.
Advantages:
(1) The neural network used in this paper can better capture the relationship between students and exercises than traditional cognitive diagnostic models that only use interaction functions (e.g., logistic functions, inner products).
(2) Integrating multiple exercise features into the cognitive diagnostic model can produce more accurate and reasonable diagnostic results.
(3) The data preprocessing and monotonicity assumptions added in this paper enhance the performance of the neural network and improve the accuracy and interpretability of the model's diagnostic results.
Disadvantages:
(1) Since the model uses deep learning methods, it requires a large amount of data for training, and practical applications may not guarantee a data size large enough to train the model.
(2) The model is based on a 3-parameter logistic model, and although the influence of several exercise features (difficulty, discrimination, guess, and slip coefficients) on diagnosing students' knowledge level is considered, there may still be features that are not considered in real application scenarios.
Point 14: I suggest to the Authors highlight objectively the innovation of this proposal about literature.
Response 14: Thanks for your suggestions.
(1) In this paper, we propose a neural network cognitive diagnosis model based on multi-dimensional features. Based on a 3-parameter logistic model, the model can effectively utilize multiple exercise features (e.g., difficulty, discrimination, guess, and slip factors) so that it can better diagnose students' knowledge states.
(2) The model introduces data preprocessing mechanisms and monotonicity assumptions to enhance the accuracy and interpretability of the model. A data preprocessing layer is added to the neural network to enable it to better capture the interaction information between students and exercises, and the monotonicity assumption is introduced to make the diagnosed students' states more interpretable.
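For illustration, a monotonicity assumption of this kind is typically enforced in neural cognitive diagnosis by restricting the interaction layers to non-negative weights, so that a higher diagnosed mastery can never lower the predicted score. A minimal sketch of this idea (our own illustration with made-up numbers, not the paper's actual code):

```python
import numpy as np

def monotone_layer(x, w, b):
    """Affine layer whose weights are clipped to be non-negative, making
    the output non-decreasing in every input (mastery) dimension."""
    return x @ np.clip(w, 0.0, None) + b

# Hypothetical two-concept mastery vectors; the negative weight is clipped to 0
w = np.array([[0.5], [-0.3]])
b = np.zeros(1)
low  = monotone_layer(np.array([[0.2, 0.9]]), w, b)
high = monotone_layer(np.array([[0.8, 0.9]]), w, b)
print(float(high[0, 0]) >= float(low[0, 0]))  # -> True: more mastery never lowers the score
```

Under this constraint, interpretability follows directly: an increase in any diagnosed knowledge state can only increase the predicted probability of answering correctly.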
Thank you for your kind comments! And thank you so much for your professional and careful opinion!
Author Response File: Author Response.docx
Reviewer 3 Report
This study proposes a cognitive neural network model for diagnostics of the student's state during the learning process. The model is based on extracting exercise characteristics (e.g., difficulty, discrimination, guess and slip factor) and determining the nonlinear interaction between the student and the exercises. To improve the accuracy and interpretability of the diagnostic results, as well as to increase the convergence speed of the model, data preprocessing and monotonicity assumption mechanisms have been implemented. The study is interesting, well-structured and described, but there are some amendments that should be made before publishing.
I have the following comments and questions to the authors:
1. There are several reference lumps in the manuscript: [1,2], [4,5], [10,11], [12,13], [20-24], [25-27]. Please eliminate these lumps. This should be done by characterising each reference individually. This can be done by mentioning 1 or 2 phrases per reference to show how it is different from the others and why it deserves mentioning.
2. The sentence: “Multidimensional Item Response Theory Models (MIRT) was proposed by Reckase M D et al. [19] is…………” (Section 2, p. 3, rows 99, 100) needs a revision.
3. Some of the notations used in the equations such as: θ, η, E, s, b, etc. and some indices such as: i, h, j, etc., are not defined in the text.
4. At the beginning of the description of step 3 in the algorithm on page 7 missing “I” should be added to “nitialize“.
5. Table 1 should be presented in an appropriate (smaller) format.
6. The notations used in Eq. (19) related to the summation are unclear and they are not explained in the text.
7. How the value of the empirical parameter - "1.7" on page 5 in Eq. (7) is determined?
8. What is the number of neurons in hidden and output layers?
9. What type of data preprocessing mechanism is used in the model?
Author Response
Response to Reviewer 3 Comments:
Point 1: There are several reference lumps in the manuscript: [1,2], [4,5], [10,11], [12,13], [20-24], [25-27]. Please eliminate these lumps. This should be done by characterising each reference individually. This can be done by mentioning 1 or 2 phrases per reference to show how it is different from the others and why it deserves mentioning.
Response 1: Thanks for your suggestions. We apologize for some structural errors that led to poor readability due to reference lumps. We have restructured the paper. [1,2] mention the intelligent education system and the scale education platform, and are now characterized separately; [4,5] concern current research topics in the field of intelligent education, one of which is reviewed and one supplemented, so they are not suitable for subdivision; [10,11] concern the use of interaction functions, namely the logistic function and the inner product, and have each been expanded and described; [12,13] mention two applications of neural networks and have been expanded; [20-24] summarize the optimized DINA methods in this paper and are expanded and described in the text (Section 2, P.3, rows 115-127); [25-27] are three different big data application scenarios that are not suitable for individual expansion, but one reference was deleted considering that there were too many examples.
Point 2: The sentence: “Multidimensional Item Response Theory Models (MIRT) was proposed by Reckase M D et al. [19] is…………” (Section 2, p. 3, rows 99, 100) needs a revision.
Response 2: Thanks for your suggestions. We have revised it in the text (Section 2, P.3, rows 102-104).
Point 3: Some of the notations used in the equations such as: θ, η, E, s, b, etc. and some indices such as: i, h, j, etc., are not defined in the text.
Response 3: Thanks for your suggestions. We apologize that some notations read poorly due to missing definitions. Notations: θ is a trainable parameter whose definition has been added to the text (Section 3, P.6, rows 241-243); η is the learning rate (Section 4, P.9, row 328); E is the exercise vector (Section 3, P.4, row 177); s is the error rate of answers (Section 3, P.5, row 212); and b is the difficulty factor (Section 3, P.5, row 202).
Point 4: At the beginning of the description of step 3 in the algorithm on page 7 missing “I” should be added to “nitialize“.
Response 4: Thanks for your suggestions. We have revised "nitialize" to "Initialize".
Point 5: Table 1 should be presented in an appropriate (smaller) format.
Response 5: Thanks for your suggestions. We have modified the format of Table 1.
Point 6: The notations used in Eq. (19) related to the summation are unclear and they are not explained in the text.
Response 6: Thanks for your suggestions. Equation (19) describes the evaluation metric AUC, computed by the rank-sum formula AUC = (Σ_{i=1}^{M} r_i − M(M+1)/2) / (M·N), where r_i is the ranking position of the ith positive-class sample and M and N are the numbers of positive and negative samples; its computational complexity is O(n log n). A reference has been added for supplementation:
[33] Hand, D. J.; Till, R. J. A simple generalisation of the area under the ROC curve for multiple class classification problems. Machine Learning 2001, 45(2), 171-186.
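As a concrete check of this rank-based identity, the AUC can be computed directly from the positions of the positive samples in the ascending score ranking. A small sketch (assuming no tied scores; the function name is our own):

```python
def rank_auc(scores, labels):
    """AUC via the rank-sum identity: AUC = (sum of positive ranks
    - M(M+1)/2) / (M*N), with 1-based ranks over scores sorted
    ascending; M/N are the positive/negative counts. Ties ignored."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank_sum = sum(r + 1 for r, i in enumerate(order) if labels[i] == 1)
    m = sum(labels)
    n = len(labels) - m
    return (rank_sum - m * (m + 1) / 2) / (m * n)

# Perfectly separated scores give AUC = 1.0
print(rank_auc([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1]))  # -> 1.0
```

Sorting dominates the cost, which matches the O(n log n) complexity stated above.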
Point 7: How the value of the empirical parameter - "1.7" on page 5 in Eq. (7) is determined?
Response 7: The value 1.7 is determined by previous experience and by training models. We cannot guarantee that it is the best possible empirical parameter, but it is the best empirical parameter in the training results of this paper.
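For reference, this constant matches the conventional scaling factor D in the three-parameter logistic item response function, where D = 1.7 makes the logistic curve approximate the normal-ogive model to within about 0.01 (standard IRT notation assumed: a discrimination, b difficulty, c guessing, θ ability):

```latex
P(u_{ij} = 1 \mid \theta_i) = c_j + (1 - c_j)\,\frac{1}{1 + e^{-D\,a_j(\theta_i - b_j)}}, \qquad D = 1.7
```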
Point 8: What is the number of neurons in hidden and output layers?
Response 8: As stated in the text (Section 4, rows 326-327), the numbers of neurons in the two hidden layers and the output layer are 256, 128, and 1, respectively.
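To make the stated sizes concrete, a forward pass through such a 256-128-1 stack can be sketched as follows (our own illustration; the input width and sigmoid activations are assumptions, not the paper's exact configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    # Scaled random weights and zero biases for one fully connected layer
    return rng.standard_normal((n_in, n_out)) / np.sqrt(n_in), np.zeros(n_out)

def forward(x, layers):
    # Two hidden layers (256 and 128 neurons) and a 1-neuron output,
    # each followed by a sigmoid activation
    for w, b in layers:
        x = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return x

n_features = 123  # e.g. one input per knowledge concept (an assumption)
layers = [dense(n_features, 256), dense(256, 128), dense(128, 1)]
out = forward(rng.standard_normal((4, n_features)), layers)
print(out.shape)  # -> (4, 1): one predicted probability per student
```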
Point 9: What type of data preprocessing mechanism is used in the model?
Response 9: Thanks for your suggestions. The data preprocessing mechanism is data normalization: the sample data are first normalized, and a subsequent translation and scaling step allows the network to learn to recover the feature distribution that the original network was designed to learn.
Thank you for your kind comments! And thank you so much for your professional and careful opinion!
Author Response File: Author Response.docx
Reviewer 4 Report
Comments for author File: Comments.docx
Author Response
Response to Reviewer 4 Comments:
Point 1: p. 3, line 113, it should be "DINA" not "DIAN"
Response 1: Thanks for your suggestions. We have revised "DIAN" to "DINA".
Point 2: Please write out the CDM, which you are estimating.
Response 2: Thanks for your suggestions. This paper proposes a Neural Network Cognitive Diagnosis model (NeuralNCD) that incorporates multiple features. The model obtains more accurate diagnostic results by using neural networks to handle the nonlinear interaction between students and exercises. First, the student vector and the exercise vector are obtained through the Q-matrix; second, the multi-dimensional features of the exercises (e.g., difficulty, discrimination, guess, and slip) are obtained using the neural network; finally, item response theory and a neural network are employed to characterize the interaction between the student and the exercise in order to determine the student's cognitive state. At the same time, monotonicity assumptions and data preprocessing mechanisms are introduced into the neural network to improve the accuracy of the diagnostic results. Extensive experimental results on real-world datasets demonstrate the effectiveness of NeuralNCD with regard to both accuracy and interpretability in diagnosing students' cognitive states.
Point 3: Where are the predicted mastery probabilities based on the NeuralNCD?
Response 3: Thanks for your suggestions.
In this paper, the predicted mastery probability of NeuralNCD is based on the following points:
- The aspects considered in diagnosing the student's state are comprehensive, extracting multiple features of the exercises (e.g., difficulty, discrimination, guessing, and slipping).
- A neural network combined with a 3-parameter logistic model is used to process the nonlinear interaction data between students and exercises, and the ability of neural networks to process such data is widely recognized.
- We propose a monotonicity assumption and a data preprocessing mechanism to improve the performance of the neural network and compensate for its defects of long response time and poor interpretability.
Point 4: p.6, line 233, would you please further explain the out value y_{i}, gamma, and beta in details?
Response 4: Thanks for your suggestions. Gamma and beta perform the translation and scaling of the normalized data; they are introduced as trainable parameters so that optimal results can be obtained through training. The ultimate goal is to allow our neural network to learn to recover the feature distribution that the original network was designed to learn. y_{i} is the output value of the normalized data after the reconstruction transformation, i.e., the output of the data preprocessing layer.
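This normalize-then-rescale step is the standard batch normalization transform; a minimal sketch of it (names and shapes are illustrative, not the paper's code):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then apply the trainable
    scale (gamma) and shift (beta); the result y_i is the layer output."""
    x_hat = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)
    return gamma * x_hat + beta

x = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = batch_norm(x, gamma=np.ones(2), beta=np.zeros(2))
print(np.allclose(y.mean(axis=0), 0.0, atol=1e-7))  # -> True: zero mean per feature
```

With gamma = 1 and beta = 0 the output is simply standardized; training other values lets the network undo the normalization wherever that helps.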
Point 5: How many layers of NN?
Response 5: 6 layers: 2 hidden layers and one output layer, with a data preprocessing layer in front of each of them.
Point 6: P. 8, equation 18. Please define r_{i}.
Response 6: r_{i} is the predicted value for the ith exercise, which is defined in the text (Section 4, P.8, rows 291-292).
Point 7: Why are we comparing to IRT, MIRT models? I am not sure the motivation of doing so.
Response 7: Thanks for your suggestions. Modified.
- The IRT and MIRT models are classical cognitive diagnostic models, and although these methods were proposed earlier, they are still acceptable as comparison methods.
- The model proposed in this paper diagnoses student states by combining IRT with a neural network, so it is necessary to compare it with the IRT and MIRT models to highlight the changes after adding the neural network.
Point 8: Table 2, what are the cut-off values used for classifications?
Response 8: Thanks for your suggestions. The cut-off value used for classification is 0.5.
Point 9: How about the attribute profiles?
Response 9: The attribute profile is configured by the number of students, the number of exercises, and the number of knowledge types contained in the exercises. Each student is represented by a table whose rows are the student's answer results and whose columns are the knowledge concepts contained in the exercises.
Point 10: Would you please compare the computation efficiencies across models and different estimation methods?
Response 10: Thank you for your valuable comments; they have inspired us greatly. The NeuralNCD method proposed in this paper uses a neural network structure to improve the accuracy of the diagnostic results, and the experimental results prove that the neural network is effective in doing so. Improving accuracy therefore comes at the expense of computational efficiency, and the computational efficiency of the NeuralNCD method cannot be fairly compared with that of other estimation methods. However, we are studying this problem and will continue to do so in our future work.
Point 11: Please proofread your manuscript. Many grammar errors were identified.
Response 11: We apologize for the grammatical errors that led to poor readability due to sloppy writing and careless proofreading. An English professor has now helped us carefully correct every grammatical error and sentence.
Point 12: Please cite
Response 12: We have benefited from a careful reading of the three recommended papers. After reading them, especially the first, we learned from their excellent approach, as can be seen in References 19 and 26. The references we added include the following:
[26] Zhan, P., Man, K., Wind, S. A., & Malone, J. (2022). Cognitive Diagnosis Modeling Incorporating Response Times and Fixation Counts: Providing Comprehensive Feedback and Accurate Diagnosis. Journal of Educational and Behavioral Statistics. https://doi.org/10.3102/10769986221111085
[19] Zhan, P., Jiao, H., Man, K., & Wang, L. (2019). Using JAGS for Bayesian Cognitive Diagnosis Modeling: A Tutorial. Journal of Educational and Behavioral Statistics, 44(4), 473–503. https://doi.org/10.3102/1076998619826040
Thank you for your kind comments! And thank you so much for your professional and careful opinion!
Author Response File: Author Response.docx
Round 2
Reviewer 1 Report
The article can be accepted in the present form.
Reviewer 2 Report
---
Reviewer 3 Report
The authors have taken into account all the comments made and have revised the paper accordingly. I recommend it to be published in the journal.