Article

Mining and Utilizing Knowledge Correlation and Learners’ Similarity Can Greatly Improve Learning Efficiency and Effect: A Case Study on Chinese Writing Stroke Correction

1 Huzhou University Library, Huzhou University, Huzhou 313000, China
2 School of Information Engineering, Huzhou University, Huzhou 313000, China
3 Department of Training, China Language & Culture Press, Beijing 100010, China
* Author to whom correspondence should be addressed.
Sustainability 2023, 15(3), 2393; https://doi.org/10.3390/su15032393
Submission received: 20 December 2022 / Revised: 24 January 2023 / Accepted: 26 January 2023 / Published: 28 January 2023
(This article belongs to the Special Issue E-learning Personalization Systems and Sustainable Education)

Abstract

Using AI technology to improve teaching and learning is an important goal of educational sustainability. Mining the correlation between knowledge points allows discrete knowledge points to be integrated, which raises knowledge density and reduces the learning task. In addition, the successful experiences of similar learners can be shared, shortening the learning path of new learners. To address the widespread problem of irregular stroke order in Chinese writing and to teach and correct stroke order effectively, this study applies association rules to a large number of learners’ writing records to explore the potential correlation between error-prone Chinese characters, and on that basis compiles a set of error-prone Chinese characters. Every character in an error-prone category shares a common error-prone feature, so correcting that one error carries over to every character in the category, and the efficiency of learning stroke order can improve by a factor of tens. In a training and testing system built on the error-prone character set, we combine the set with an improved collaborative filtering algorithm to propose a learner-based personalized recommendation model for error-prone Chinese characters. Experimental results show that the Apriori algorithm with the lift measure can mine effective strong association rules and provides an important reference for the character set table. The improved collaborative filtering algorithm exploits the similarity between learners to share successful learning experiences and provide personalized recommendations of error-prone Chinese characters, and its recommendation performance is higher than that of the traditional collaborative filtering model. In tests with different types of learning groups, scores differ clearly between the independent pre-test and the post-training test, showing that the training effectively corrects irregular writing habits and further indicating that mining knowledge correlation and exploiting learners’ similarity can effectively improve the efficiency and effect of teaching and learning.

1. Introduction

Chinese characters are the carriers of Chinese history and culture, and have a rich traditional cultural heritage. The standardization of stroke order is the most basic part of the writing process, but it is also the part that people most easily overlook. Mastering the standard stroke order not only helps to correct the wrong writing habits, but also adapts to the physiological function of the wrist, making the written characters more proportional, balanced and beautiful, and improving the speed of writing to a certain extent [1]. According to statistics, among the more than 260,000 contestants in the competition Brush and Ink in China sponsored by the Ministry of Education of China in 2020 and 2021, 79.24% of them had problems with stroke order. However, as typical morpheme characters [2], the total number of Chinese characters is very large, and it is very difficult for learners to learn and correct the wrong stroke order character by character. Therefore, how to effectively learn the standard stroke and correct the wrong stroke is an important research topic of sustainable education.
Research has shown that computer technology used in the teaching process can provide students with new learning experiences and more effective learning, which can lead to sustainable education. For example, Li et al. [3] investigated how MOOC study groups watch videos together under different configurations. The results show that watching MOOCs in groups provides a highly satisfying learning experience, because learners feel connected and can interact with one another, which suggests that collaborative learning with the help of computer technology can increase students’ sense of participation and improve learning efficiency. Boroujeni and Dillenbourg [4] captured students’ behavioral patterns by analyzing sequential interaction logs, which enabled more effective and personalized support during the learning process. Troussas et al. [5] presented a fully operating and evaluated adaptive and intelligent e-learning system for second language acquisition, which provided each student with a unique educational experience. Furthermore, its inference system used the knowledge inference relationships between learning objects to create a personalized learning environment for each student.
It can be inferred from these studies that the successful experiences of similar learners can be shared to shorten the learning path of new learners. In addition, by mining the correlation between knowledge points, the discrete knowledge points can be integrated into integrated knowledge points, so as to improve the knowledge density and reduce the learning task. These two points will help to establish an effective and personalized learning system. According to this idea, we take the learning of Chinese character stroke order knowledge as a case study of e-learning personalization system and try to use artificial intelligence technology to improve the efficiency of learning standard stroke order and correcting wrong stroke order.
Chinese character forms consist of a limited set of basic strokes and components, so the writing of different characters is necessarily related. Such relationships can be obtained through association rule mining. The experience of Chinese character writers can be shared and learned from, and group writing experience can be propagated through collaborative filtering. Association rule mining is a typical data mining technique that has been widely used in computer-assisted education. Ding and Liu [6] achieved accurate recommendation of learning resources based on association rules in a big data environment, enhancing the experience of online learning. Zhang and Lu [7] applied association rule mining to teaching information management in universities, showing that data mining is helpful for teaching management. However, there is a paucity of research on incorporating association rules into the teaching of Chinese character writing. Collaborative filtering is one of the core techniques used in recommendation systems; it calculates preference information between similar users and then predicts what other users might be interested in [8].
As a typical study case, this paper addressed the learning need for efficient correction of Chinese writing stroke order, forming an error-prone character set through mining the correlation between error-prone Chinese characters and recommending exercises through an optimized collaborative filtering algorithm. A systematic stroke order correction method and system were achieved. To verify the effectiveness of the personalized stroke correction algorithm, we conducted special tests and effect evaluations on two different groups of learners. The experiments show that our method can effectively achieve personalized correction of Chinese characters’ stroke order and has effectiveness in teaching stroke order standardization to different groups, which can advance sustainable education.

2. Personalized Chinese Character Stroke Order Correction Algorithm

Based on the massive writing records of learners and on data mining technology, this paper constructs a personalized Chinese character stroke order correction algorithm. As shown in Figure 1, the algorithm is divided into two stages. The first stage constructs an error-prone Chinese character set library using the Apriori algorithm extended with the lift measure, which provides the basis for the second stage. The second stage applies learner-based collaborative filtering with inverse item frequency to the error-prone Chinese character set library and recommends effective experiences to learners by calculating their similarity. In short, association rules are used to explore the potential relationships between Chinese characters, and the error-prone Chinese character set library is compiled from them. A learner-based personalized recommendation model for error-prone Chinese characters, built on the improved collaborative filtering algorithm and centered on learners and the error-prone character set, then completes the personalized stroke order correction algorithm.

2.1. Apriori Algorithm with Lift Measure

As a data mining algorithm based on association rules [9], the Apriori algorithm can analyze valuable information from the writing records of different learners, and reflect the writing situation of most learners. It is an important means to summarize the types of error-prone Chinese characters and then generate the error-prone Chinese character set table.
Assume that the error-prone Chinese character data set D contains all the incorrect characters in the database. The non-empty item set Q represents one learner’s written record, an item set composed of several Chinese characters [10]. Let X and Y be two error-prone Chinese character sets in a learner written record Q, with $X \subseteq Q$ and $Y \subseteq Q$. If $X \neq \emptyset$, $Y \neq \emptyset$, and $X \cap Y = \emptyset$, then $X \Rightarrow Y$ constitutes an error-prone Chinese character association rule in data set D. The effectiveness of the association rules of the Apriori algorithm is usually measured by support and confidence [11]. Support is the proportion of records in the pre-processed writing record dataset C in which characters X and Y appear together, denoted $\mathrm{support}(X \Rightarrow Y)$, as shown in Equation (1). Confidence is the ratio of the number of records containing both X and Y to the number of records containing X in the pre-processed dataset C, denoted $\mathrm{confidence}(X \Rightarrow Y)$, as shown in Equation (2). Here $\mathrm{count}(X \cup Y)$ is the number of records in which characters X and Y occur together, and $\mathrm{support}(X)$ is the proportion of records in dataset C that contain X.
$$\mathrm{support}(X \Rightarrow Y) = P(X \cup Y) = \frac{\mathrm{count}(X \cup Y)}{\mathrm{count}(C)} \tag{1}$$
$$\mathrm{confidence}(X \Rightarrow Y) = P(Y \mid X) = \frac{\mathrm{support}(X \Rightarrow Y)}{\mathrm{support}(X)} \tag{2}$$
The traditional Apriori algorithm filters rules using two evaluation indexes, support and confidence, and many of the association rules mined this way are invalid. To address the shortcomings of the support–confidence framework, we introduce lift [12,13] to further filter the mined association rules. Lift is the ratio of the probability that character Y occurs given that character X occurs to the overall probability that character Y occurs, and it reflects the correlation between X and Y, as shown in Equation (3).
$$\mathrm{lift}(X \Rightarrow Y) = \frac{P(Y \mid X)}{P(Y)} = \frac{\mathrm{support}(X \Rightarrow Y)}{\mathrm{support}(X) \cdot \mathrm{support}(Y)} \tag{3}$$
$\mathrm{support}(Y)$ is the proportion of data set D in which character Y occurs. The value range of lift is [0, +∞). When the lift is greater than 1, the appearance of character X promotes the appearance of character Y, and the rule is called a positive correlation rule. When the lift is equal to 1, the occurrences of characters X and Y are independent events, and the rule is called an irrelevant rule. When the lift is less than 1, the occurrence of character X reduces the probability that character Y occurs, and the rule is called a negative correlation rule.
Therefore, the Apriori algorithm with the lift measure can extract meaningful association rules from the massive Chinese writing records and summarize them into the error-prone Chinese character set table, providing support for the subsequent effective recommendation of error-prone Chinese characters.
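To make the three measures concrete, the following is a minimal Python sketch that computes support, confidence and lift for single-character rules and keeps only the positively correlated ones. The toy records, the threshold values and the function names (`support`, `rule_metrics`, `strong_rules`) are illustrative assumptions, not the data or code used in the study.

```python
from itertools import permutations

# Hypothetical toy data: each record is the set of characters one learner wrote incorrectly.
records = [
    {"怀", "情", "龙"},
    {"怀", "情"},
    {"龙", "为"},
    {"怀", "情", "为"},
    {"龙", "为", "之"},
]

def support(itemset, records):
    """Fraction of records that contain every character in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= r for r in records) / len(records)

def rule_metrics(x, y, records):
    """Return (support, confidence, lift) of the rule X => Y, as in Equations (1)-(3)."""
    s_xy = support(set(x) | set(y), records)
    s_x = support(x, records)
    s_y = support(y, records)
    confidence = s_xy / s_x
    lift = s_xy / (s_x * s_y)
    return s_xy, confidence, lift

def strong_rules(records, min_support=0.4, min_confidence=0.6, min_lift=1.0):
    """Single-character rules x => y that pass all three thresholds (assumed values)."""
    chars = sorted(set().union(*records))
    rules = []
    for x, y in permutations(chars, 2):
        s, c, l = rule_metrics({x}, {y}, records)
        # Lift must exceed 1 so that only positive correlation rules survive.
        if s >= min_support and c >= min_confidence and l > min_lift:
            rules.append((x, y, s, c, l))
    return rules

for x, y, s, c, l in strong_rules(records):
    print(f"{x} => {y}: support={s:.2f}, confidence={c:.2f}, lift={l:.2f}")
```

In this toy data, a rule such as 怀 ⇒ 情 has a lift above 1 because the two characters co-occur more often than their individual frequencies would suggest.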

2.2. Learner-Based Collaborative Filtering-Inverse Item Frequency

By calculating the similarity of learners, the characters that similar learners wrote incorrectly can be recommended to the current learner, so that the experience of other learners is used to enhance learning efficiency.
First, the number of errors each learner makes on each Chinese character is counted. In general, the more often a learner writes a character incorrectly, the more error-prone that character is for that learner. These counts form the scoring matrix of learners for Chinese characters. The similarity between learners is then calculated to establish the nearest-neighbor set of the current learner, and the error-prone degree of each Chinese character is ranked over the learners in that set to obtain the recommended error-prone characters for the current learner. Jaccard similarity [14] and cosine similarity [15] can be used to calculate the similarity between learners, as shown in Equations (4) and (5):
$$w_{u,v,J} = \frac{|N(u) \cap N(v)|}{|N(u) \cup N(v)|} \tag{4}$$
$$w_{u,v,C} = \frac{|N(u) \cap N(v)|}{\sqrt{|N(u)|\,|N(v)|}} \tag{5}$$
$N(u)$ refers to the collection of wrong Chinese characters written by the learner with number $u$, and $N(v)$ refers to the collection of wrong Chinese characters written by the learner with number $v$.
However, neither of the two similarity measures above avoids the influence of handwriting errors on high-frequency error-prone characters: a character that many learners write incorrectly appears in the error sets of most learner pairs, so sharing it says little about how alike two learners are. The similarity measure therefore needs to be improved. When two learners make the same writing errors on low-frequency characters, this is more indicative of similarity between them. We therefore introduce User-based Collaborative Filtering with Inverse Item Frequency (UserCF-IIF) into the similarity calculation [14], which penalizes the effect of common error-prone characters shared by the learners on their similarity. The improved Jaccard similarity and cosine similarity formulas are shown in Equations (6) and (7):
$$w_{u,v} = \frac{\sum_{i \in N(u) \cap N(v)} \frac{1}{\log(1 + N(i))}}{|N(u) \cup N(v)|} \tag{6}$$
$$w_{u,v} = \frac{\sum_{i \in N(u) \cap N(v)} \frac{1}{\log(1 + N(i))}}{\sqrt{|N(u)|\,|N(v)|}} \tag{7}$$
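Here $i$ indexes the error-prone Chinese characters that both learners wrote incorrectly, and $N(i)$, following the usual UserCF-IIF formulation, counts the learners who wrote character $i$ incorrectly. Finally, the recommendation analysis of relevant error-prone Chinese characters is realized by sorting the similarity degree of learners.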
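The four similarity measures in Equations (4)–(7) can be sketched in a few lines of Python. This is an illustrative sketch only: the learner data are hypothetical, the natural logarithm is assumed, and N(i) is assumed to be the number of learners who wrote character i incorrectly.

```python
import math

# Hypothetical data: learner id -> set of characters that learner wrote incorrectly.
errors = {
    "u1": {"之", "怀", "龙"},
    "u2": {"之", "怀", "情"},
    "u3": {"之", "为"},
    "u4": {"之", "龙", "为"},
}

def jaccard(nu, nv):
    """Equation (4): shared errors over the union of both error sets."""
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0

def cosine(nu, nv):
    """Equation (5): shared errors over the geometric mean of the set sizes."""
    if not nu or not nv:
        return 0.0
    return len(nu & nv) / math.sqrt(len(nu) * len(nv))

def char_counts(errors):
    """Assumed N(i): how many learners wrote each character incorrectly."""
    counts = {}
    for chars in errors.values():
        for ch in chars:
            counts[ch] = counts.get(ch, 0) + 1
    return counts

def iif_sum(nu, nv, counts):
    """Numerator of Equations (6) and (7): rare shared errors weigh more."""
    return sum(1.0 / math.log(1 + counts[ch]) for ch in nu & nv)

def jaccard_iif(nu, nv, counts):
    """Equation (6)."""
    union = nu | nv
    return iif_sum(nu, nv, counts) / len(union) if union else 0.0

def cosine_iif(nu, nv, counts):
    """Equation (7)."""
    if not nu or not nv:
        return 0.0
    return iif_sum(nu, nv, counts) / math.sqrt(len(nu) * len(nv))

counts = char_counts(errors)
print(jaccard(errors["u1"], errors["u2"]), jaccard_iif(errors["u1"], errors["u2"], counts))
```

In this toy data every learner gets 之 wrong, so sharing it contributes only 1/log(5) to the numerator, whereas sharing the rarer 怀 contributes 1/log(3).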

3. Experimental Results and Analysis of Association Rules and Collaborative Filtering

This section takes the writing records of the competition Brush and Ink in China as the data source. For example, one participant’s record of incorrectly written characters is “连, 迈, 莲, 房, and 剪”. Section 3.1 describes the data source and the pre-processing methods used to extract effective data. Section 3.2 uses the improved association rule algorithm to mine and analyze typical Chinese characters, calculates the correlation strength between different Chinese characters and error types, and compiles the error-prone Chinese character set library. Section 3.3 compares the improved collaborative filtering algorithm with several traditional collaborative filtering algorithms to show its effectiveness in recommending the characters that learners are likely to write incorrectly.

3.1. Data Pre-Processing

The data come from the standardized writing questions of the 2020 and 2021 competition Brush and Ink in China, which was open to teachers, students in schools and colleges, and members of the community across the country. One hundred and thirty-seven error-prone Chinese characters were selected as the question bank for standard writing. We randomly sampled the writing records of 20,000 participants from 2020 and 2021 as the research data. The data pre-processing steps are as follows.
  • Data Cleaning: Delete missing data to complete data cleaning. Since some participants left several questions empty without answering them directly, resulting in vacant answer data and wrong judging data, these records need to be deleted. In addition, the purpose of the research is to mine information about Chinese characters, so redundant data such as participants’ cell phone numbers and titles were deleted.
  • Data Integration: Multiple sub-data were integrated into one data file and duplicate records were removed to resolve data redundancy. Since there may be multiple submissions by a participant resulting in the data records being saved multiple times, these duplicate data need to be removed to avoid data redundancy.
  • Data Conversion: The form of data is subject to the requirements of the algorithm, and the data used for mining needs to be processed by data conversion. Since character data is generally not directly used as input to the algorithm, it is necessary to encode the character data into digital data to make it meet the requirements of the algorithm.
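The three pre-processing steps above can be sketched as a small Python pipeline. The record layout (the `id`, `phone`, `wrong` and `unanswered` fields) and the function name `preprocess` are hypothetical, chosen only to illustrate cleaning, integration and conversion; the actual competition data format is not described in the text.

```python
# Hypothetical raw submissions; the field names are assumptions for illustration.
raw_submissions = [
    {"id": "p1", "phone": "000-0001", "wrong": ["连", "迈", "莲"], "unanswered": 0},
    {"id": "p2", "phone": "000-0002", "wrong": ["房"], "unanswered": 3},  # incomplete
    {"id": "p1", "phone": "000-0001", "wrong": ["连", "迈", "莲"], "unanswered": 0},  # resubmission
    {"id": "p3", "phone": "000-0003", "wrong": ["连", "剪"], "unanswered": 0},
]

def preprocess(submissions):
    """Return (transactions, vocab): integer-encoded error records and the encoding table."""
    # Data cleaning: drop records with unanswered questions and discard
    # fields that are irrelevant to mining (here, the phone number).
    cleaned = [
        {"id": s["id"], "wrong": tuple(s["wrong"])}
        for s in submissions
        if s["unanswered"] == 0
    ]

    # Data integration: remove duplicate records caused by repeated submissions.
    seen, unique = set(), []
    for s in cleaned:
        key = (s["id"], s["wrong"])
        if key not in seen:
            seen.add(key)
            unique.append(s)

    # Data conversion: encode each character as an integer id.
    vocab = {}
    transactions = []
    for s in unique:
        transactions.append(sorted(vocab.setdefault(ch, len(vocab)) for ch in s["wrong"]))
    return transactions, vocab

transactions, vocab = preprocess(raw_submissions)
print(transactions)
print(vocab)
```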

3.2. The Error-Prone Character Set Table of Stroke Order Based on Association Rules

Using the Apriori algorithm with the lift measure on the pre-processed contest data, we mined the relationships between error-prone Chinese characters and error types (incorrect stroke sequence or incorrect number of strokes) as well as the relationships among error-prone Chinese characters. Some of the mined results are shown in Table 1.
It can be seen from Table 1 that error-prone Chinese characters are often significantly associated with specific error types. Taking “之” as an example, the confidence of the rule linking “之” to the error type “Wrong number of strokes” is 0.99. The rule is therefore highly reliable: learners are very likely to make this type of error when practicing this character, which matches the observation that the two strokes “horizontal-break” and “right-falling” are easily written as one folding stroke. In writing correction, attention should therefore be paid to the number of strokes in such characters, which can be grouped into the set of characters with incorrectly joined strokes. In addition, the character “怀”, no. 6 in Table 1, is highly correlated with the character “情”: learners who make writing errors on “怀” may also make errors on “情”. Both characters contain the component “忄”, whose stroke order is easy to get wrong, indicating a fairly intuitive correlation between Chinese characters that share a component. Error-prone Chinese characters of this kind can be grouped into the set of characters with the same error-prone components. There are also Chinese characters with no intuitive correlation. Through the Apriori algorithm, we found that the characters “龙” and “为” are correlated. Structurally, what they have in common is an independent dot stroke, which suggests summing up the stroke order rules of Chinese characters with independent dot strokes and grouping them into sets of characters with the same error-prone features. Other characters have none of the above features but are still easy to write incorrectly because of their complex structure; they can be grouped into the set of characters with complex structures that are not easy to write correctly.
By mining the correlation between error-prone Chinese characters, we obtained several error-prone Chinese character categories. We then constructed the basic error-prone Chinese character set table by expanding the set of characters within each category (Table 2). Each category contains dozens of Chinese characters with a common error-prone feature, so correcting that one error carries over to every character in the category, and the efficiency of learning stroke order can improve by a factor of tens.
Thus far, we have described a method for generating the error-prone Chinese character set library based on the improved Apriori algorithm. By drawing on the characters in this library, we can make personalized recommendations according to user information. To verify the effectiveness of the error-prone Chinese character set library, we imported it into an applet that we developed for internet users to practice with. The writing records of each user were extracted and analyzed for the types of stroke and stroke order errors they contain; some of the exercise data are shown in Table 3.
We can see that the characters learners write incorrectly are related to one another. Some share an explicit component: for example, the characters “龙” and “拢” written by learner no. 1 both contain the component “龙”. Others share an implicit component, such as the characters “丑” and “再” written by learner no. 5, which both contain a “土” structure. Where “土” is embedded in a character, the stroke order rule is “vertical first and then two horizontal” [16]. Following this idea, 38 different error types and their character sets were summarized through data mining and analysis to form the error-prone Chinese character set table, and the correlations between its characters were confirmed in the learners’ writing records.
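The level-wise search by which Apriori finds groups of characters that are frequently written incorrectly together can be sketched as follows. This is a generic, minimal Apriori sketch in Python, not the authors' implementation; the learner records and the `min_support` value are hypothetical.

```python
from itertools import combinations

def apriori(records, min_support):
    """Return {frozenset: support} for every itemset with support >= min_support."""
    records = [frozenset(r) for r in records]
    n = len(records)

    def frequent(candidates):
        """Keep the candidates whose support reaches the threshold."""
        kept = {}
        for c in candidates:
            s = sum(c <= r for r in records) / n
            if s >= min_support:
                kept[c] = s
        return kept

    # Level 1: frequent single characters.
    level = frequent({frozenset([ch]) for r in records for ch in r})
    result = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: merge frequent (k-1)-itemsets into candidates of size k.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        level = frequent(candidates)
        result.update(level)
        k += 1
    return result

# Hypothetical learner records (sets of incorrectly written characters).
learner_records = [
    {"龙", "拢", "为"},
    {"龙", "拢"},
    {"丑", "再"},
    {"龙", "拢", "为"},
    {"丑", "再", "为"},
]
frequent_sets = apriori(learner_records, min_support=0.4)
for itemset, s in sorted(frequent_sets.items(), key=lambda kv: (len(kv[0]), -kv[1])):
    print(sorted(itemset), round(s, 2))
```

The prune step is what keeps the search tractable: a candidate such as {丑, 再, 为} is discarded without counting because its subset {丑, 为} is already known to be infrequent.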

3.3. Recommendation of Error-Prone Chinese Characters based on Collaborative Filtering

The learner-based collaborative filtering algorithm for recommending error-prone Chinese characters uses as experimental samples the learners and their Chinese character writing records from the test system built on the error-prone Chinese character set table. The experiment performs intelligent recommendation of error-prone Chinese characters with UserCF-IIF and focuses on the quality of the recommendations obtained with the improved cosine similarity.
The purpose of the algorithm applied in the recommendation is to recommend the most error-prone Chinese characters to learners, so the top-N recommendation strategy was used [17]. To evaluate the recommendation results objectively, we adopted evaluation indexes commonly used in recommendation systems, namely precision, recall and coverage. Precision is the proportion of the error-prone Chinese characters recommended to learners that the learners actually wrote incorrectly. Recall is the proportion of learners’ actual errors in the test set that appear in the recommended lists. Coverage is the proportion of the whole error-prone Chinese character set table that is covered by all recommended characters. The formulas are shown in Equations (8)–(10):
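$$\mathrm{Precision} = \frac{\sum_{u} |R(u) \cap T(u)|}{\sum_{u} |R(u)|} \tag{8}$$
$$\mathrm{Recall} = \frac{\sum_{u} |R(u) \cap T(u)|}{\sum_{u} |T(u)|} \tag{9}$$
$$\mathrm{Coverage} = \frac{\left|\bigcup_{u} R(u)\right|}{|I|} \tag{10}$$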
$R(u)$ represents the set of error-prone Chinese characters recommended for learner $u$, $T(u)$ represents the characters that learner $u$ actually wrote incorrectly in the test set, $I$ represents the whole error-prone Chinese character set table, $P(i)$ represents the prevalence of Chinese character $i$, and $N$ represents the length of the recommendation list $R(u)$.
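As a quick illustration of Equations (8)–(10), the sketch below computes the three indexes for a pair of hypothetical learners. The function name `evaluate` and all data are assumptions made for the example.

```python
def evaluate(recommended, truth, all_items):
    """Precision, recall and coverage over all learners, as in Equations (8)-(10)."""
    # Numerator shared by precision and recall: recommended characters that
    # the learner really wrote incorrectly, summed over learners.
    hits = sum(len(recommended[u] & truth[u]) for u in recommended)
    precision = hits / sum(len(recommended[u]) for u in recommended)
    recall = hits / sum(len(truth[u]) for u in recommended)
    # Coverage: distinct recommended characters over the whole character set table.
    covered = set().union(*recommended.values())
    coverage = len(covered) / len(all_items)
    return precision, recall, coverage

# Hypothetical recommendation lists R(u), test-set errors T(u) and character table I.
recommended = {"u1": {"之", "怀"}, "u2": {"龙", "为"}}
truth = {"u1": {"怀", "情"}, "u2": {"龙"}}
all_items = {"之", "怀", "情", "龙", "为", "再"}

precision, recall, coverage = evaluate(recommended, truth, all_items)
print(precision, recall, coverage)
```

Precision and recall share a numerator and differ only in whether the hits are divided by what was recommended or by what was actually wrong, which is why a random recommender can score well on coverage while scoring poorly on both.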
The comparison algorithms adopted in this paper were three collaborative filtering algorithms: UserCF (learner-based collaborative filtering), MostPopularCF (popularity-based collaborative filtering) and RandomCF (random recommendation) [18]. All algorithms were tested separately using Jaccard similarity and cosine similarity for comparison. The length of the recommendation list in the experiment was 10, and the number of similar learners (neighbors) ranged from 5 to 30.
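How the list length and the number of neighbors enter a learner-based top-N recommendation can be sketched as follows. This is a simplified illustration, not the experimental code: it uses binary error sets instead of the error-count scoring matrix, the Jaccard-IIF similarity with the natural logarithm, and hypothetical learners; `recommend`, `k` and `n` are names chosen for the example.

```python
import math
from collections import Counter

def recommend(target, errors, k=3, n=2):
    """Top-n characters for `target`, scored by its k most similar learners."""
    # Assumed N(i): number of learners who wrote each character incorrectly.
    counts = Counter(ch for chars in errors.values() for ch in chars)

    def sim(u, v):
        """Jaccard similarity with the inverse item frequency penalty."""
        nu, nv = errors[u], errors[v]
        union = nu | nv
        if not union:
            return 0.0
        return sum(1.0 / math.log(1 + counts[ch]) for ch in nu & nv) / len(union)

    # Nearest-neighbor set: the k most similar learners with non-zero similarity.
    others = [v for v in errors if v != target and sim(target, v) > 0]
    neighbors = sorted(others, key=lambda v: sim(target, v), reverse=True)[:k]

    # Score characters the target has not yet got wrong by summed neighbor similarity.
    scores = {}
    for v in neighbors:
        w = sim(target, v)
        for ch in errors[v] - errors[target]:
            scores[ch] = scores.get(ch, 0.0) + w
    return sorted(scores, key=lambda ch: (-scores[ch], ch))[:n]

# Hypothetical learners and their incorrectly written characters.
errors = {
    "new": {"怀", "龙"},
    "a": {"怀", "龙", "情"},
    "b": {"怀", "情", "为"},
    "c": {"之"},
    "d": {"龙", "为", "拢"},
}
print(recommend("new", errors, k=3, n=2))
```

With the experimental settings described above, `n` would be 10 and `k` would be varied between 5 and 30.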
As can be seen from Table 4, collaborative filtering based on the improved similarity raises the precision of error-prone Chinese character recommendation. Among the models compared, Jaccard-UserCF-IIF has the best precision. RandomCF has the best coverage, with Jaccard-UserCF-IIF second; RandomCF covers more characters because it recommends at random, but its precision and recall are poor. Overall, Jaccard-UserCF-IIF shows the best comprehensive performance among all models.
We further analyze the model with the best comprehensive performance, namely Jaccard-UserCF-IIF. Figure 2 reports the influence of the number of neighbors on its recommendation performance. As the number of neighbors increases, the recommendation performance improves gradually, and precision and recall reach their maximum at 25 neighbors. Coverage decreases as the number of neighbors increases, while prevalence increases steadily.

4. Experimental Test and Effect Evaluation of Stroke Order Correction Algorithm

To verify the effectiveness of the personalized stroke correction algorithm, we developed an error-prone Chinese character writing stroke correction training WeChat applet and conducted special tests and effect evaluations on two different groups of learners. Among them, Test Experiment 1 reported the results of a pre-and post-training test of 65 junior normal students in a college of education in Zhejiang Province, and Test Experiment 2 reported the results of 593 pupils in a district of Hangzhou.

4.1. Test Experiment 1

The first experiment invited junior normal (teacher-training) students from a university in Zhejiang Province as experimental subjects. There were 65 valid participants in total, including 13 male and 56 female students with an average age of about 21 years; the pre-test and post-test scores are shown in Figure 3.
The abscissa of Figure 3 represents the number of experimental subjects, the ordinate represents the scores obtained from the test, and the horizontal bar chart represents the test scores of different experimental subjects before training, which objectively reflects the overall basic stroke standardization level of the subjects. The black line shows the test scores of the different subjects after the training. The following basic conclusions can be drawn: 37 of the 65 learners (57%) improved their scores to varying degrees, 19 learners’ test scores remained the same, 17 of whom scored perfect on both the pre- and post-tests. Additionally, 9 learners’ test scores did not improve effectively.
A paired-samples t-test (Table 5) gave a correlation coefficient of 0.655 (p < 0.001) between learners’ pre-training and post-training scores. The students scored 85.46 ± 14.60 on the pre-test and 91.92 ± 9.66 on the post-test, so the mean score improved by 6.46 points, an increase of 7.56%. The difference is statistically significant (t = −4.718, p < 0.001), indicating that the personalized stroke correction algorithm is effective in stroke order standardization training for normal students.
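For readers who want to reproduce this kind of analysis, the following Python sketch computes a paired-samples t statistic from the per-learner score differences and checks the reported relative gain. The five score pairs are hypothetical, and the sketch assumes the differences are taken as pre-test minus post-test, which would explain why an improvement yields a negative t value.

```python
import math
from statistics import mean, stdev

def paired_t(pre, post):
    """Return (t, degrees of freedom) for paired samples, using differences pre - post."""
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    # t = mean difference divided by its standard error (sample standard deviation).
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical scores for five learners (not the study data).
pre = [70, 85, 90, 60, 100]
post = [80, 90, 90, 75, 100]
t, df = paired_t(pre, post)
print(f"t = {t:.3f}, df = {df}")

# Relative gain from the reported means: (91.92 - 85.46) / 85.46.
relative_gain = (91.92 - 85.46) / 85.46
print(f"relative gain = {relative_gain:.2%}")
```

The p-value would then be read from a t distribution with n − 1 degrees of freedom, for example with a statistics package.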

4.2. Test Experiment 2

Experiment 2 organized students from several elementary schools in a district of Hangzhou to participate in this test, and obtained data from 593 valid experimenters, including 120 in grade 1, 98 in grade 2, 94 in grade 3, 122 in grade 4, 138 in grade 5, and 21 in grade 6, as shown in Figure 4.
The grid portion of Figure 4 represents the pre-test scores before training, reflecting the initial level of the learners. The black slash portion represents the post-test scores after training. It can be seen that the average scores of all grades improved by more than 10 points after the training. Table 6 counts the number of learners whose scores changed, by grade. In total, 460 of the 593 learners who participated in the test improved their scores, accounting for 77.6%; 40 had unchanged test scores, of whom 6 had perfect scores on both the pre-test and the post-test; and 93 did not improve their test scores.
Further analysis by paired-samples t-test (Table 7) gave a correlation coefficient of 0.524 (p < 0.001) between student scores before and after training. The students scored 66.47 ± 17.42 on the pre-test and 79.84 ± 15.46 on the post-test, so the mean score improved by 13.37 points, an increase of 20.11%. The difference is statistically significant (t = −20.172, p < 0.001), indicating that the personalized stroke correction algorithm is effective in stroke order standardization training for elementary school pupils.
In summary, both Test Experiment 1 and Test Experiment 2 show that our method can effectively achieve personalized correction of Chinese characters’ stroke order and is effective in teaching stroke order standardization to different groups.

5. Conclusions

As an educational sustainability case study, a personalized Chinese stroke order correction algorithm was successfully developed to correct irregular writing habits. In this algorithm, the Apriori algorithm improved by lift measure was first used to construct the error-prone Chinese character set table, and the improved collaborative filtering algorithm was then used to develop a learner-based personalized error-prone Chinese character recommendation model. The empirical testing of the personalized stroke correction algorithm of two experiments showed that the experimental testers’ performance was significantly improved after training. The overall results illustrated the effectiveness of the proposed algorithms. However, the strength of the association rules was not sufficient due to massive competition data with sparsity, which deserves further in-depth investigation. In future studies, we will further optimize the error-prone Chinese character set table and introduce more perspectives of learner information to improve the performances of the proposed algorithms. The methods in this study can also be extended to relevance mining of other subjects and the design of teaching strategies, due to the fact that knowledge is relevant and learners have similar groups in each domain.

Author Contributions

Conceptualization, Q.L. and H.Q.; methodology, Q.L., C.Z. (Caifeng Zhang) and H.Q.; software, C.Z. (Caifeng Zhang) and Y.D.; validation, Q.L. and Y.D; formal analysis, C.Z. (Caifeng Zhang) and Q.L.; investigation, C.Z. (Caifeng Zhang) and Q.L.; resources, Q.L., C.Z. (Caifeng Zhang) and H.Q.; data curation, Q.L., C.Z. (Caifeng Zhang) and Y.D.; writing—original draft preparation, C.Z. (Caifeng Zhang), Q.L. and X.Z.; writing—review and editing, C.Z. (Chu Zhang), Q.L. and X.Z.; supervision, H.Q. and C.Z. (Chu Zhang); project administration, H.Q., C.Z. (Chu Zhang) and M.L.; funding acquisition, Q.L. and H.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Chaomi S&T Company Cooperation Project (Grant number: HK16003) and the Scientific Research Fund of Zhejiang Provincial Education Department (Grant number: Y202248424).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wu, D.L.; Chen, Y. The Current Situation and Reflections on the Study of Strokes and Errors of Chinese Characters in the Last Decade. Pop. Lit. Arts 2020, 2020, 155–157. [Google Scholar]
  2. Xia, Y.; Xie, R.B.; Wang, Z.L.; Ruan, S.F.; Wu, X.C. Developmental relationships among morpheme awareness, Chinese character recognition, and vocabulary knowledge in lower elementary Chinese children - A cross-lagged study. J. Psychol. 2022, 54, 905–916. [Google Scholar]
  3. Li, N.; Verma, H.; Skevi, A.; Zufferey, G.; Blom, J.; Dillenbourg, P. Watching MOOCs together: Investigating co-located MOOC study groups. Distance. Educ. 2014, 35, 217–233. [Google Scholar] [CrossRef]
  4. Boroujeni, M.S.; Dillenbourg, P. Discovery and temporal analysis of latent study patterns in MOOC interaction sequences. In Proceedings of the 8th International Conference on Learning Analytics and Knowledge, Virtual Event, 5–9 March 2018; pp. 206–215. [Google Scholar]
  5. Troussas, C.; Chrysafiadi, K.; Virvou, M. An intelligent adaptive fuzzy-based inference system for computer-assisted language learning. Expert. Syst. Appl. 2019, 127, 85–96. [Google Scholar] [CrossRef]
  6. Ding, J.H.; Liu, H.Z. Accurate recommendation of learning resources based on multi-dimensional correlation analysis in age of big data. E-educ. Res. 2018, 39, 53–59+66. [Google Scholar]
  7. Zhang, L.; Lu, Z. Applications of association rule mining in Teaching Evaluation. In Proceedings of the 2018 3rd International Conference on Humanities Science, Management and Education Technology (HSMET 2018), Nanjing, China, 8–10 June 2018; pp. 331–334. [Google Scholar]
  8. Zhang, Y.J.; Dong, Z.; Meng, X.W. Research on Personalized Advertising Recommendation Systems and Their Applications. Chin. J. Comput. 2021, 44, 531–563. [Google Scholar]
  9. Chen, H.J. Design of Information Recommendation Book Management System based on Apriori Data Mining Algorithm. Mod. Electron. Tech. 2019, 42, 115–119+124. [Google Scholar]
  10. Guo, P.; Cai, C. Data Mining and Analysis of Students’ Score Based on Clustering and Association Algorithm. Comput. Eng. Appl. 2019, 55, 169–179. [Google Scholar]
  11. Wang, Y.Z.; Shen, Y.J.; Wang, L.J. The Causes Analysis of Traffic Accident Black Spots based on Improved Interest Measurement and Apriori Algorithm. J. Zhejiang Univ. (Sci. Ed.) 2021, 48, 349–355. [Google Scholar]
  12. Harahap, M.; Husein, A.M.; Aisyah, S.; Lubis, F.R.; Wijaya, B.A. Mining association rule based on the diseases population for recommendation of medicine need. J. Phys. Conf. Ser. 2018, 1007, 012017. [Google Scholar] [CrossRef]
  13. Das, S.; Dutta, A.; Jalayer, M.; Bibeka, A.; Wu, L. Factors influencing the patterns of wrong-way driving crashes on freeway exit ramps and median crossovers: Exploration using ‘Eclat’ association rules to promote safety. Int. J. Transp. Sci. Technol. 2018, 7, 114–123. [Google Scholar] [CrossRef]
  14. Kosub, S. A note on the triangle inequality for the Jaccard distance. Pattern. Recogn. Lett. 2018, 120, 36–38. [Google Scholar] [CrossRef] [Green Version]
  15. Li, Y.Y.; Deng, H.J. Collaborative filtering recommendation algorithm based on improved cosine similarity. Comput. Mod. 2020, 2020, 69–74. [Google Scholar]
  16. Lang, Q.; Ma, J.; Qi, H.N. Studies on Teaching of Stroke Order and Re-summarizing of Stroke Order Norms. J. Huzhou Univ. 2020, 42, 102–107. [Google Scholar]
  17. Xue, F.; He, X.; Wang, X.; Xu, J.; Liu, K.; Hong, R. Deep Item-based Collaborative Filtering for Top-N Recommendation. ACM T. Inform. Syst. 2019, 37, 1–25. [Google Scholar] [CrossRef]
  18. Bedi, P.; Gautam, A.; Sharma, C. Using Novelty Score of Unseen Items to Handle Popularity Bias in Recommender Systems. In Proceedings of the International Conference on Contemporary Computing and Informatics, Noida, India, 8–9 May 2015; pp. 934–939. [Google Scholar]
Figure 1. Personalized Chinese stroke order correction algorithm.
Figure 2. The effect of the number of nearest neighbors on the performance of Jaccard-UserCF-IIF recommendation. Subfigure (a) reports the influence of different neighbor numbers on Precision. Subfigure (b) reports the influence of different neighbor numbers on Recall. Subfigure (c) reports the influence of different neighbor numbers on Coverage.
Figure 3. Comparison of experimental test scores of junior normal students from a university in Zhejiang Province.
Figure 4. Comparison of mean test scores of participating elementary school grades in a city district.
Table 1. Partial association rules.
| Order | Association Rule | Support | Confidence | Lift |
|---|---|---|---|---|
| 1 | 之 → Wrong number of strokes | 0.06 | 0.99 | 2.21 |
| 2 | 山 → Wrong order of stroke | 0.02 | 0.83 | 1.50 |
| 3 | 家 → Wrong number of strokes | 0.02 | 0.94 | 2.08 |
| 4 | 义 → Wrong order of stroke | 0.03 | 0.97 | 1.75 |
| 5 | 为 → Wrong order of stroke | 0.07 | 0.94 | 1.70 |
| 6 | 怀 → 情 | 0.01 | 0.43 | 2.10 |
| 7 | 存 → 好 | 0.01 | 0.42 | 3.28 |
| 8 | 龙 → 为 | 0.36 | 0.47 | 1.19 |
| 9 | 鹿 → 花 | 0.02 | 0.50 | 4.81 |
| 10 | 乘 → 老 | 0.01 | 0.41 | 4.31 |
Note: Some contents on the table are Chinese characters.
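
The three columns of Table 1 follow the standard definitions of support, confidence, and lift for a rule A → B. The following is a minimal sketch of those definitions only, not the authors' mining pipeline; the function names and the toy `records` list are hypothetical and are not taken from the study's data.

```python
from typing import FrozenSet, Iterable, List, Tuple


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def rule_metrics(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Tuple[float, float, float]:
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    s_a = support(transactions, a)
    s_c = support(transactions, c)
    if s_a == 0 or s_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    s_ac = support(transactions, a | c)
    confidence = s_ac / s_a      # P(B | A)
    lift = confidence / s_c      # P(B | A) / P(B); values above 1 indicate positive association
    return s_ac, confidence, lift


# Hypothetical toy data: each set holds the characters one learner wrote incorrectly.
records = [
    frozenset({"怀", "情", "龙"}),
    frozenset({"怀", "情"}),
    frozenset({"情", "为"}),
    frozenset({"龙", "为"}),
    frozenset({"怀", "为"}),
]
s, c, l = rule_metrics(records, {"怀"}, {"情"})
print(f"support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

A rule with high confidence but lift close to 1 (such as rule 8 in Table 1) mostly reflects a frequent consequent, which is why lift is a useful additional filter.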
Table 2. Stroke order error-prone type character set table.
| Associated Charset | Rule | Explain Label |
|---|---|---|
| 军, 挥, 辉, 连, 轰… | The structure of “车” ends with vertical | The structure of “车” is not the side, the last stroke vertical |
| 怕, 忙, 快, 怜, 怪… | The structure of “忄” write two points left and right | The structure of “忄” writes two points first and then writes vertical, in line with the writing method |
| 刀, 刃, 分, 初, 剪… | The structure of “刀” ends with the prime | The last stroke of “万, 刀, 力, and 乃” is prime |
| 灯, 灾, 灼, 灵, 烂… | The structure of “火” write two points right and left | “人” structure write together, dot and prime in the “人” above, first write dot and prime, then write “人” |
| 义, 仪, 斗, 门, 闪… | Anything on the top or top left should be written first | Write according to the most basic structure leading rule from top to bottom |
Note: Some contents on the table are Chinese characters.
Table 3. Part of learners’ practice records.
| Learner Order | Characters with Wrong Stroke Order | Characters with Wrong Strokes Number |
|---|---|---|
| 1 | 搜, 军, 龙, 丹, 拢 | 连, 初, 莲, 扔, 字 |
| 2 | 浑, 挥, 辆, 防, 房 | 迈, 转, 轮, 初, 莲 |
| 3 | 奶, 圾, 船, 母, 丑 | 区, 医, 能, 北, 笼 |
| 4 | 连, 迈, 莲, 房, 剪 | 轰, 软, 防, 浑, 连 |
| 5 | 丑, 再, 垂, 每, 悔 | 极, 扔, 里, 重, 秀 |
Note: Some contents on the table are Chinese characters.
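
To illustrate how records like those in Table 3 can drive a learner-based recommendation, the sketch below computes Jaccard similarity between learners' error sets and suggests characters that similar learners got wrong. It uses the "Characters with Wrong Stroke Order" column of Table 3 as input. The neighbor count `k`, the list length `top_n`, and the similarity-sum scoring are assumptions made for illustration; the IIF weighting of the paper's improved model is not reproduced here.

```python
from collections import defaultdict
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(records: Dict[int, Set[str]], target: int, k: int = 2, top_n: int = 3) -> List[str]:
    """Suggest characters the target learner has not yet missed but similar learners have."""
    own = records[target]
    sims = [(jaccard(own, chars), other) for other, chars in records.items() if other != target]
    # Keep only the k most similar learners that share at least one error.
    neighbors = sorted((p for p in sims if p[0] > 0), key=lambda p: p[0], reverse=True)[:k]
    scores: Dict[str, float] = defaultdict(float)
    for sim, other in neighbors:
        for ch in records[other] - own:
            scores[ch] += sim  # assumed scoring: sum of neighbor similarities
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [ch for ch, _ in ranked[:top_n]]


# Wrong-stroke-order characters per learner, copied from Table 3.
records = {
    1: {"搜", "军", "龙", "丹", "拢"},
    2: {"浑", "挥", "辆", "防", "房"},
    3: {"奶", "圾", "船", "母", "丑"},
    4: {"连", "迈", "莲", "房", "剪"},
    5: {"丑", "再", "垂", "每", "悔"},
}
suggestions = recommend(records, target=3)
print(suggestions)
```

In this small excerpt, learners 3 and 5 share only 丑, so learner 3's suggestions come entirely from learner 5's remaining errors; learner 1 shares no errors with anyone and receives no suggestions, which shows how sparse records limit a similarity-based approach.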
Table 4. The average influence of different models on the recommendation results of error-prone Chinese characters.
| Model | Precision | Recall | Coverage |
|---|---|---|---|
| cos-UserCF | 0.4611 | 0.5338 | 0.7902 |
| Jaccard-UserCF | 0.4569 | 0.521 | 0.7894 |
| cos-MostPopularCF | 0.2759 | 0.31 | 0.2304 |
| Jaccard-MostPopularCF | 0.2736 | 0.3065 | 0.2206 |
| cos-RandomCF | 0.0692 | 0.0793 | 1 |
| Jaccard-RandomCF | 0.0747 | 0.0835 | 1 |
| cos-UserCF-IIF | 0.4596 | 0.5166 | 0.8259 |
| Jaccard-UserCF-IIF | 0.4709 | 0.5296 | 0.8061 |
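
The three metrics in Table 4 can be sketched with the usual top-N definitions: precision is the share of recommended characters that the learner actually got wrong, recall is the share of the learner's actual errors that were recommended, and coverage is the share of the character catalog that appears in at least one recommendation list. These definitions are an assumption here, since the paper's exact formulas may differ; `evaluate` and the toy inputs are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def evaluate(
    recommended: Dict[str, List[str]],
    actual: Dict[str, Set[str]],
    catalog: Set[str],
) -> Tuple[float, float, float]:
    """Return (precision, recall, coverage) aggregated over all learners."""
    hits = sum(len(set(recs) & actual.get(u, set())) for u, recs in recommended.items())
    n_recommended = sum(len(recs) for recs in recommended.values())
    n_actual = sum(len(actual.get(u, set())) for u in recommended)
    covered = set().union(*recommended.values()) if recommended else set()
    precision = hits / n_recommended if n_recommended else 0.0
    recall = hits / n_actual if n_actual else 0.0
    coverage = len(covered & catalog) / len(catalog) if catalog else 0.0
    return precision, recall, coverage


# Hypothetical toy inputs, not taken from the study's datasets.
recommended = {"u1": ["军", "连", "初"], "u2": ["连", "房", "剪"]}
actual = {"u1": {"军", "初", "莲", "字"}, "u2": {"房"}}
catalog = {"军", "连", "初", "房", "剪", "莲", "字", "迈", "轰", "浑"}

precision, recall, coverage = evaluate(recommended, actual, catalog)
print(f"precision={precision:.2f} recall={recall:.2f} coverage={coverage:.2f}")
```

Under these definitions a random recommender can reach a coverage of 1 while scoring poorly on precision and recall, and a most-popular recommender shows the opposite tendency toward low coverage, which matches the pattern in Table 4.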
Table 5. Paired sample statistics and test (Dataset 1).
| Pair 1 | Average | Number of Cases | Standard Deviation | Standard Error Mean | 95% Confidence Interval of the Difference, Lower Limit | 95% Confidence Interval of the Difference, Upper Limit | Degree of Freedom | Sig. (2-Tailed) |
|---|---|---|---|---|---|---|---|---|
| First test | 85.46153846 | 65 | 14.59567244 | 1.810370537 | | | | |
| Effect test | 91.91794872 | 65 | 9.655530721 | 1.197621190 | | | | |
| First test − Effect test | −6.45641026 | | 11.03302637 | 1.368478498 | −9.19026033 | −3.72256018 | 64 | 0.000 |
Note: Correlation coefficient = 0.655, p < 0.001, t = −4.718 (t-Test, Dataset 1).
Table 6. The number of learners whose scores changed by grade.
| Grade | Rising | No change | Decline | Sum |
|---|---|---|---|---|
| 1 | 93 | 10 | 17 | 120 |
| 2 | 70 | 11 | 17 | 98 |
| 3 | 74 | 5 | 15 | 94 |
| 4 | 100 | 4 | 18 | 122 |
| 5 | 113 | 7 | 18 | 138 |
| 6 | 10 | 3 | 8 | 21 |
| Sum | 460 | 40 | 93 | 593 |
Table 7. Paired sample statistics and test (Dataset 2).
| Pair 1 | Average | Number of Cases | Standard Deviation | Standard Error Mean | 95% Confidence Interval of the Difference, Lower Limit | 95% Confidence Interval of the Difference, Upper Limit | Degree of Freedom | Sig. (2-Tailed) |
|---|---|---|---|---|---|---|---|---|
| Pre-test | 66.47 | 593 | 17.419 | 0.715 | | | | |
| Post-test | 79.84 | 593 | 15.461 | 0.635 | | | | |
| Pre-test − Post-test | −13.363 | | 16.131 | 0.662 | −14.664 | −12.062 | 592 | 0.000 |
Note: Correlation coefficient = 0.524, p < 0.001, t = −20.172 (t-Test, Dataset 2).
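
The t statistics reported under Tables 5 and 7 can be checked from the summary rows alone, since a paired t-test uses t = mean difference / (standard deviation of the differences / √n) with n − 1 degrees of freedom. The sketch below applies that formula to the difference rows of both tables; the helper name `paired_t_from_summary` is hypothetical, and the confidence limits and p-values are not recomputed because they require a t-distribution quantile.

```python
import math
from typing import Tuple


def paired_t_from_summary(mean_diff: float, sd_diff: float, n: int) -> Tuple[float, float, int]:
    """Return (standard error, t statistic, degrees of freedom) for a paired t-test."""
    if n < 2:
        raise ValueError("a paired t-test needs at least two pairs")
    se = sd_diff / math.sqrt(n)   # standard error of the mean difference
    return se, mean_diff / se, n - 1


# Difference rows copied from Table 5 (Dataset 1) and Table 7 (Dataset 2).
se1, t1, df1 = paired_t_from_summary(-6.45641026, 11.03302637, 65)
se2, t2, df2 = paired_t_from_summary(-13.363, 16.131, 593)

print(f"Dataset 1: SE={se1:.4f}, t={t1:.3f}, df={df1}")
print(f"Dataset 2: SE={se2:.4f}, t={t2:.3f}, df={df2}")
```

Both recomputed values agree with the reported statistics to within rounding of the tabulated inputs.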
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Lang, Q.; Zhang, C.; Qi, H.; Du, Y.; Zhu, X.; Zhang, C.; Li, M. Mining and Utilizing Knowledge Correlation and Learners’ Similarity Can Greatly Improve Learning Efficiency and Effect: A Case Study on Chinese Writing Stroke Correction. Sustainability 2023, 15, 2393. https://doi.org/10.3390/su15032393
