stanbkt.utils.sim_simple_BKT

stanbkt.utils.sim_simple_BKT(n_students=10, n_problems=20, n_kcs=1, prior=0.1, learn=0.01, forget=0.05, guess=0.2, slip=0.1, rng_seed=None, kc_sequence=None, frac=1.0)

Simulate student problem responses under a simple BKT model.

Generates a synthetic dataset by sampling problem responses from a Bayesian Knowledge Tracing (BKT) model with fixed parameters.
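
The generative process behind such a simulation can be sketched for one student and one KC as follows. This is an illustrative, standard-library-only sketch and not the stanbkt implementation: the helper name simulate_student is hypothetical, and the order of steps (respond first, then update the knowledge state) is an assumed convention that the documentation above does not confirm.

```python
import random


def simulate_student(n_problems, prior, learn, forget, guess, slip, rng):
    """Sample one student's 0/1 responses on a single KC (illustrative only)."""
    # Initial latent knowledge state, drawn from the prior.
    knows = rng.random() < prior
    responses = []
    for _ in range(n_problems):
        # Emission: a student who knows the KC answers correctly unless they
        # slip; a student who does not know it is correct only by guessing.
        p_correct = (1.0 - slip) if knows else guess
        responses.append(1 if rng.random() < p_correct else 0)
        # Transition (assumed here to happen after the response).
        if knows:
            knows = not (rng.random() < forget)
        else:
            knows = rng.random() < learn
    return responses


# Same parameter values as the defaults in the signature above.
rng = random.Random(0)
responses = simulate_student(
    20, prior=0.1, learn=0.01, forget=0.05, guess=0.2, slip=0.1, rng=rng
)
```

With guess and slip both set to 0 the responses mirror the latent knowledge state exactly, which is a convenient sanity check for any simulator of this kind.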

Parameters:

n_students (int, default 10) – Number of students to simulate.
n_problems (int, default 20) – Number of problems to simulate.
n_kcs (int, default 1) – Number of knowledge components (KCs).
prior (Union[float, Sequence[float]], default 0.1) – Initial knowledge probability. A scalar is broadcast to all KCs; a sequence must have length n_kcs.
learn (Union[float, Sequence[float]], default 0.01) – Learning (transition to mastery) probability. Scalar or sequence of length n_kcs.
forget (Union[float, Sequence[float]], default 0.05) – Forgetting probability. Scalar or sequence of length n_kcs.
guess (Union[float, Sequence[float]], default 0.2) – Guessing probability (correct response without knowledge). Scalar or sequence of length n_kcs.
slip (Union[float, Sequence[float]], default 0.1) – Slipping probability (incorrect response despite knowledge). Scalar or sequence of length n_kcs.
rng_seed (int or None, optional) – Random seed for reproducibility.
kc_sequence (array-like of int or None, optional) – KC assignment for each problem. If None, the assignments are sampled randomly.
frac (float, default 1.0) – Fraction of rows to include in the output dataset. This simulates missing data, or students not completing all problems, by randomly dropping rows after simulation.
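
The effect of frac can be illustrated with a small standard-library sketch that drops rows at random after simulation. The helper drop_rows is hypothetical; whether stanbkt rounds the row count, preserves row order, or samples across all students at once is not stated above, so those choices are assumptions made only for this example.

```python
import random


def drop_rows(rows, frac, rng):
    """Keep a random subset of about frac * len(rows) rows, in original order."""
    if not 0.0 <= frac <= 1.0:
        raise ValueError("frac must be in [0, 1]")
    n_keep = round(frac * len(rows))
    # Sample row indices without replacement, then restore the original order.
    keep = sorted(rng.sample(range(len(rows)), n_keep))
    return [rows[i] for i in keep]


# 10 students x 20 problems gives 200 (student_id, problem_id) rows.
rows = [(s, p) for s in range(10) for p in range(20)]
kept = drop_rows(rows, frac=0.75, rng=random.Random(0))
```

Because rows are dropped over the whole table in this sketch, individual students end up with different numbers of observed problems, which is the "students not completing all problems" situation described for frac.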

Returns:

Simulated dataset with columns: student_id, problem_id, correct, kc_id.

Return type:

DataFrame

Raises:

ValueError – If parameter lengths do not match n_kcs or if kc_sequence is invalid.
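
A minimal sketch of the kind of broadcasting and validation described above, using only the standard library. The helpers broadcast_param and check_kc_sequence are hypothetical names, and the rule that every kc_sequence entry must lie in range(n_kcs) is an assumption, since the documentation does not spell out what makes a sequence invalid.

```python
def broadcast_param(value, n_kcs, name):
    """Return a list of n_kcs floats from a scalar or a sequence."""
    if isinstance(value, (int, float)):
        # A scalar applies to every KC.
        return [float(value)] * n_kcs
    values = [float(v) for v in value]
    if len(values) != n_kcs:
        raise ValueError(f"{name} has length {len(values)}, expected {n_kcs}")
    return values


def check_kc_sequence(kc_sequence, n_kcs):
    """Assumed validity rule: every entry is a KC index in range(n_kcs)."""
    for kc in kc_sequence:
        if not 0 <= kc < n_kcs:
            raise ValueError(f"kc_sequence entry {kc} is outside 0..{n_kcs - 1}")
    return list(kc_sequence)


guess = broadcast_param(0.2, 3, "guess")
prior = broadcast_param([0.1, 0.3, 0.5], 3, "prior")
kcs = check_kc_sequence([0, 2, 1, 1], 3)
```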