4.2. The pyBKT Library
Badrinath et al. [22] developed pyBKT as an accessible, easy-to-implement, and computationally efficient Python library that facilitates the estimation of various BKT models. The pyBKT library is built on a hidden Markov model (HMM) in which an observable node (the learner's dichotomous response sequence) reflects performance on a skill and a hidden node represents the learner's latent knowledge state, governed by four parameters: the probability of initial knowledge (i.e., P(L0)), the transition probability (i.e., the probability of learning; P(T)), the probability of slipping (i.e., P(S)), and the probability of guessing (i.e., P(G)). pyBKT uses the expectation-maximization algorithm to estimate these parameters from the dichotomous response sequences and computes the probability that a learner has mastered a skill (i.e., knowledge component; KC) at time t.
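To make the estimation target concrete, the following minimal sketch (not part of pyBKT; all parameter values are illustrative) shows how the standard BKT update propagates the mastery probability across a dichotomous response sequence:

```python
def bkt_update(p_know, correct, p_transit, p_slip, p_guess):
    """One step of the standard BKT update for a single skill.

    p_know: current probability that the skill is known, P(L_t)
    correct: 1 if the response was correct, 0 otherwise
    """
    if correct:
        # P(L_t | correct): known and did not slip vs. unknown and guessed
        posterior = (p_know * (1 - p_slip)) / (
            p_know * (1 - p_slip) + (1 - p_know) * p_guess)
    else:
        # P(L_t | incorrect): known but slipped vs. unknown and did not guess
        posterior = (p_know * p_slip) / (
            p_know * p_slip + (1 - p_know) * (1 - p_guess))
    # Learning transition: P(L_{t+1}) = posterior + (1 - posterior) * P(T)
    return posterior + (1 - posterior) * p_transit

# Example: trace mastery over a short response sequence
p_know = 0.3  # illustrative P(L0)
for obs in [1, 0, 1, 1]:
    p_know = bkt_update(p_know, obs, p_transit=0.2, p_slip=0.1, p_guess=0.25)
    print(round(p_know, 3))
```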
pyBKT requires Python 3.5 or higher. The pyBKT library's internal data helper processes a response-per-row data file (.csv or .tab) to create an internal pyBKT data format [22]. In simulation studies, the estimation error asymptotes at 50 students when the learn and slip parameters are estimated, and at 100 students when the prior and guess parameters are added to the model. Sequence length is also an important factor for model estimation accuracy: a sequence length of at least 15 responses per skill is needed. That is, accurate model estimation requires at least 50 students for the learn and slip parameters, or 100 students when the prior and guess parameters are included, with sequences of at least 15 responses per skill (for more information, see Badrinath et al. [22]). The pyBKT library depends on the numpy, pandas, scikit-learn, scipy, and joblib libraries.
The pyBKT library provides functions for a wide range of tasks, including data generation, model estimation, prediction, cross-validation, parameter initialization, and evaluation [22]. Furthermore, pyBKT includes many BKT variants, such as Standard Knowledge Tracing, the Knowledge Tracing and Forgets model (i.e., forgets), the Item Difficulty Effect model (i.e., multigs), the Prior per Student model (i.e., multiprior), the Item Order Effect model (i.e., multipair), and the Item Learning Effect model (i.e., multilearn). The pyBKT GitHub page (https://github.com/CAHLR/pyBKT; accessed on 19 July 2023) includes a quick-start tutorial and several code examples for estimating various BKT models. The pyBKT library is compatible with Windows, Mac OS X, and Linux operating systems.
4.3. Case Study 1: Estimating the Standard BKT Model
In the following case study, we demonstrate how pyBKT can be used to estimate the standard BKT model with the Cognitive Tutor dataset. For a standard BKT model, at least four variables should exist in the dataset: a learner identifier, a response order identifier, a skill (knowledge component) name, and the correctness of each response (in pyBKT's expected format, user_id, order_id, skill_name, and correct).
First, we begin with the installation of pyBKT and then import it along with the other libraries necessary for implementing our case studies (see Listing 1).
Listing 1. Installing necessary Python libraries.
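A minimal sketch of what Listing 1 might contain; the import path follows pyBKT's documented API, while the notebook-style pip command is an assumption about the working environment:

```python
# Install pyBKT from PyPI (uncomment when running inside a notebook)
# !pip install pyBKT

import numpy as np
import pandas as pd
from pyBKT.models import Model  # main pyBKT modeling class
```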
Next, we read the file for the Cognitive Tutor dataset in Python. The accepted input formats for pyBKT are Pandas data frames and data files of type .csv (comma-separated values) or .tsv (tab-separated values). We can import the dataset as a Pandas data frame and then use it in the pyBKT functions, or we can directly refer to the file location (Listing 2).
Listing 2. Importing the Cognitive Tutor dataset.
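A sketch of Listing 2; the file name ct.csv is a hypothetical placeholder for the Cognitive Tutor data file:

```python
# Option 1: read the response-per-row file into a Pandas data frame
ct_df = pd.read_csv('ct.csv')

# Option 2: skip the data frame and point pyBKT at the file directly,
# e.g., model.fit(data_path='ct.csv', ...)
```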
The Model function fits the BKT model to all skills available in the Cognitive Tutor dataset. However, we can also train the model on a single skill or a set of skills. In the Cognitive Tutor dataset, there are 12 math skills (i.e., knowledge components), indicated in the KC(Default) column. Assume that a teacher is interested in understanding students' mastery of generating plots for various purposes, which encompasses knowledge components such as plot terminating improper fraction, plot non-terminating improper fraction, plot whole number, plot decimal - thousandths, and plot pi. We can use a regular expression to match the knowledge components that include the word plot when selecting the skills to use for training BKT. We use skills = ".*Plot.*" to select the plot-related knowledge components and fit a standard BKT model, without considering other model specifications such as forgets, multigs, multiprior, multipair, or multilearn. In Listing 3, we initialize a BKT model and fit it for the skills matching Plot.
Listing 3. Training BKT for a specific set of skills.
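A sketch of Listing 3 using pyBKT's Model class; the seed and num_fits values are illustrative:

```python
# Initialize a BKT model; the seed fixes the random parameter initialization
model = Model(seed=42, num_fits=1)

# Fit a standard BKT model only on the plot-related knowledge components
model.fit(data=ct_df, skills='.*Plot.*')
```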
Alternatively, we can train a standard BKT model on a single math skill available in the Cognitive Tutor dataset using the skills argument. For example, specifying skills = "Plot imperfect radical" would train the BKT model on that particular skill (i.e., plotting imperfect radicals). In addition, we can train the model on each unique skill in the dataset using a loop (see Listing 4):
Listing 4. Training BKT for each unique skill.
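A sketch of Listing 4. Because pyBKT interprets the skills argument as a regular expression, re.escape is used here to protect skill names containing metacharacters; the skill_name column follows pyBKT's expected format:

```python
import re

# Fit a separate standard BKT model for every unique skill in the dataset
for skill in ct_df['skill_name'].unique():
    skill_model = Model(seed=42, num_fits=1)
    skill_model.fit(data=ct_df, skills=re.escape(skill))
    print(skill)
    print(skill_model.params())
```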
To model learner data using the pyBKT library, the dataset must contain column names familiar to pyBKT (e.g., order_id, skill_name, and correct). In the above code snippet, we did not need to change the column names because pyBKT can automatically infer the column name mappings in the Cognitive Tutor dataset. However, if the dataset includes unfamiliar column names, the user needs to specify a mapping from the dataset's column names to pyBKT's expected column names. This mapping is passed through the model defaults. Assume that the Cognitive Tutor dataset included three columns, person, skill_math, and answer, representing the order ID, skill names, and correctness, respectively. For these columns, we could specify the default column names (see Listing 5) to look up in the dataset and use them in the model training process as follows:
Listing 5. Mapping the existing column names to expected pyBKT columns.
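A sketch of Listing 5, mapping the hypothetical person, skill_math, and answer columns to pyBKT's expected fields through the defaults argument:

```python
# Keys are pyBKT's expected columns; values are the dataset's actual names
defaults = {'order_id': 'person',
            'skill_name': 'skill_math',
            'correct': 'answer'}

model = Model(seed=42, num_fits=1)
model.fit(data=ct_df, defaults=defaults)
```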
pyBKT offers a variety of features for prediction and evaluation. We can use the params() function to obtain the fitted parameters, including the probability of prior knowledge, the probability of guessing, the probability of slipping, and the probability of learning each skill. We can extract the fitted parameters as shown in Listing 6.
Figure 2 shows the pyBKT output with the estimated parameters for each skill. In the output, "prior" refers to the prior probability of knowing, "learns" is the probability of transitioning to the knowing state given not knowing, "guesses" is the probability of answering correctly given the not-knowing state, and "slips" is the probability of answering incorrectly given the knowing state. Note that all "forgets" values in the output (i.e., the probability of transitioning to the not-knowing state given knowing) are zero because the standard BKT model assumes no forgetting.
Listing 6. Parameters for unknown states.
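A sketch of Listing 6; params() returns a Pandas data frame with one row per skill-parameter combination:

```python
# Inspect the fitted prior, learns, guesses, slips, and forgets values
print(model.params())
```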
Furthermore, we can evaluate the prediction accuracy of the fitted BKT model using the root-mean-square error (RMSE), which is the default evaluation metric. Alternatively, we can use metric = ['auc'] to obtain the area under the curve (AUC) for the trained model (see Listing 7).
Listing 7. Model evaluation.
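A sketch of Listing 7 using pyBKT's evaluate() method:

```python
# RMSE is the default evaluation metric
training_rmse = model.evaluate(data=ct_df)

# Request the AUC metric instead
training_auc = model.evaluate(data=ct_df, metric='auc')

print(training_rmse, training_auc)
```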
On the pyBKT GitHub page (https://github.com/CAHLR/pyBKT; accessed on 19 July 2023), the authors of the pyBKT library also demonstrate how to use a custom accuracy function for model evaluation, as given in Listing 8:
Listing 8. Custom evaluation metrics.
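A sketch of Listing 8, following the custom-metric pattern in pyBKT's documentation; evaluate() accepts any function of the true and predicted values:

```python
def mae(true_vals, pred_vals):
    """Custom metric: mean absolute error."""
    return np.mean(np.abs(true_vals - pred_vals))

training_mae = model.evaluate(data=ct_df, metric=mae)
print(training_mae)
```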
We found an overall accuracy of 67%, with RMSE, AUC, and MAE values of 0.45, 0.74, and 0.41, respectively, for the skills related to generating plots. These results, however, reflect evaluation on the whole dataset, as we did not split the data into training and test sets. Therefore, we can employ k-fold cross-validation, as given in Listing 9, to obtain a more reliable evaluation of the BKT model's performance.
Listing 9. Three-fold cross-validation.
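A sketch of Listing 9 using pyBKT's crossvalidate() method; the seed is illustrative:

```python
# Three-fold cross-validation of the plot-related skills;
# returns a data frame with one error value per skill
model = Model(seed=42, num_fits=1)
cv_errors = model.crossvalidate(data=ct_df, skills='.*Plot.*', folds=3)
print(cv_errors)
```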
After running a three-fold cross-validation, we obtain RMSE values of 0.48, 0.44, 0.50, 0.47, 0.29, and 0.48 for plot non-terminating improper fraction, plot imperfect radical, plot terminating improper fraction, plot pi, plot whole number, and plot decimal - thousandths, respectively (see Figure 3).
We can also use different variants of the standard BKT model (see Listing 10), such as forgets, which relaxes the assumption that learners do not forget the content or skill they have mastered; multigs, which fits different guess and slip parameters for each class; and multilearn, which fits a different learn rate (and forget rate, when forgets is enabled) for each class.
In this specification (see Listing 10), we assume that students may forget a mastered knowledge component, allow the BKT model to have different guess and slip parameters, and estimate a separate learn rate. For this model, we found an RMSE value of 0.39, an accuracy of 78%, and an AUC value of 0.85. These findings suggest that model fit improved when we allowed students to forget, used different guess and slip parameters, and enabled different learn rates.
Listing 10. Training BKT variants.
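A sketch of Listing 10, enabling the forgets, multigs, and multilearn extensions in a single fit:

```python
# Allow forgetting, class-specific guess/slip parameters, and
# class-specific learn (and forget) rates in one model
model = Model(seed=42, num_fits=1)
model.fit(data=ct_df, skills='.*Plot.*',
          forgets=True, multigs=True, multilearn=True)

print(model.evaluate(data=ct_df, metric=['rmse', 'auc']))
```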
4.5. Case Study 3: Associations between IRT and BKT Parameters
In our final case study, we compare the parameters of the IRT (1PL) and BKT models. In IRT, the parameters (difficulty, or b, and ability, or θ) are specific to individual items and learners, respectively, and are estimated separately. In BKT, the parameters (initial knowledge, learning rate, guess probability, and slip probability) are estimated globally for each skill or concept. Furthermore, IRT focuses on modeling individual learner abilities and item difficulties, providing a continuous measure of learner proficiency, whereas BKT models learning as a dynamic process and can capture changes in knowledge over time. Overall, the two models have distinct parameterizations and capture different aspects of learning, and the choice between them depends on the research or educational context and the specific learning aspects of interest.
In the IRT model, item difficulty refers to the level of challenge of an item, or the probability of a correct response from individuals with average ability. In the BKT model, the guess and slip parameters represent the probability of guessing the correct answer and the probability of slipping, respectively. The difficulty parameter in IRT can be defined based on these BKT parameters by re-parameterizing the guess parameter [29]. The following code snippet shows how to compare the 1PL IRT and BKT model parameters by converting the BKT parameters to IRT parameters and then calculating the correlation between the IRT parameters and the transformed BKT parameters (Listing 12).
Listing 12. Python code example linking BKT and IRT parameters.
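A minimal sketch of the idea behind Listing 12. The logit re-parameterization of the guess probability into an IRT-style difficulty is one plausible reading of the conversion in [29], and the arrays below are hypothetical placeholders rather than the study's actual estimates:

```python
import numpy as np
from scipy import stats

# Hypothetical per-item estimates standing in for the fitted values
irt_difficulty = np.array([-0.8, 0.1, 0.9, 1.4])  # 1PL b parameters
bkt_guess = np.array([0.55, 0.40, 0.25, 0.15])    # BKT guess probabilities

# Re-parameterize guess as an IRT-style difficulty: harder items should
# yield lower probabilities of a correct response in the not-knowing state
bkt_difficulty = np.log((1 - bkt_guess) / bkt_guess)

# Pearson correlation between the two sets of difficulty parameters
r, p = stats.pearsonr(irt_difficulty, bkt_difficulty)
print(f'Pearson r = {r:.2f}, p = {p:.3f}')
```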
The results shown in Table 2 indicate the Pearson correlations calculated between the IRT difficulty parameters and the difficulty parameters reconstructed from BKT, where the reconstructed value for a specific problem p in knowledge component k is derived from the re-parameterized guess parameter. The results suggest that the difficulty parameters from IRT and the converted difficulty parameters from BKT generally were highly correlated, indicating close alignment between the IRT and BKT models.