stanbkt.utils.sim_simple_BKT

stanbkt.utils.sim_simple_BKT(n_students=10, n_problems=20, n_kcs=1, prior=0.1, learn=0.01, forget=0.05, guess=0.2, slip=0.1, rng_seed=None, kc_sequence=None, frac=1.0)

Simulate student problem responses under a simple BKT model.

Generates a synthetic dataset by sampling problem responses from a Bayesian Knowledge Tracing (BKT) model with fixed parameters.
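The generative process behind a BKT simulation can be sketched as follows. This is a minimal, standard-library illustration for one student and one KC, not the code this function runs: the helper name `simulate_bkt_responses`, its `seed` argument, and the choice to apply the learn/forget transition after each response are assumptions made for the example.

```python
import random


def simulate_bkt_responses(n_problems, prior, learn, forget, guess, slip, seed=None):
    """Sample one student's 0/1 responses on a single KC (illustrative only)."""
    rng = random.Random(seed)
    # Initial latent state: the student knows the KC with probability `prior`.
    known = rng.random() < prior
    responses = []
    for _ in range(n_problems):
        # Emission: a student who knows the KC answers correctly unless they slip;
        # a student who does not know it answers correctly only by guessing.
        p_correct = (1.0 - slip) if known else guess
        responses.append(int(rng.random() < p_correct))
        # Transition (assumed to happen after the response):
        # a known KC may be forgotten, an unknown KC may be learned.
        if known:
            known = not (rng.random() < forget)
        else:
            known = rng.random() < learn
    return responses


# Same parameter values as the defaults in the signature above.
responses = simulate_bkt_responses(
    20, prior=0.1, learn=0.01, forget=0.05, guess=0.2, slip=0.1, seed=0
)
```

Passing the same `seed` twice yields the same response sequence, which is the behavior a seed argument such as `rng_seed` is meant to provide.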

Parameters:
  • n_students (int, default 10) – Number of students to simulate.

  • n_problems (int, default 20) – Number of problems to simulate.

  • n_kcs (int, default 1) – Number of knowledge components (KCs).

  • prior (Union[float, Sequence[float]], default 0.1) – Initial knowledge probability. Either a scalar, which is broadcast to all KCs, or an array of length n_kcs.

  • learn (Union[float, Sequence[float]], default 0.01) – Learning (mastery) probability. Scalar or array of length n_kcs.

  • forget (Union[float, Sequence[float]], default 0.05) – Forgetting probability. Scalar or array of length n_kcs.

  • guess (Union[float, Sequence[float]], default 0.2) – Guessing probability (correct response without knowledge). Scalar or array of length n_kcs.

  • slip (Union[float, Sequence[float]], default 0.1) – Slipping probability (incorrect response despite knowledge). Scalar or array of length n_kcs.

  • rng_seed (int or None, optional) – Random seed for reproducibility.

  • kc_sequence (array-like of int or None, optional) – KC assignment for each problem. If None, the assignment is sampled randomly.

  • frac (float, default 1.0) – Fraction of rows to include in the output dataset. This simulates missing data, or students not completing all problems, by randomly dropping rows after simulation.

Returns:

Simulated dataset with columns: student_id, problem_id, correct, kc_id.

Return type:

DataFrame
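The effect of frac on the returned rows can be pictured with a small standard-library sketch. The helper `drop_rows`, the rounding rule, and the placeholder rows are assumptions for illustration; this page does not state how the function selects the rows it keeps.

```python
import random


def drop_rows(rows, frac, seed=None):
    """Keep a random fraction of rows, preserving their original order (illustrative)."""
    rng = random.Random(seed)
    n_keep = round(frac * len(rows))  # assumed rounding rule
    keep = sorted(rng.sample(range(len(rows)), n_keep))
    return [rows[i] for i in keep]


# Placeholder rows with the documented columns: 2 students x 5 problems.
rows = [
    {"student_id": s, "problem_id": p, "correct": 0, "kc_id": 0}
    for s in range(2)
    for p in range(5)
]
subset = drop_rows(rows, frac=0.8, seed=1)  # keeps 8 of the 10 rows
```

With frac=1.0 every row is kept, so the output equals the full simulated dataset.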

Raises:

ValueError – If parameter lengths do not match n_kcs or if kc_sequence is invalid.
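The scalar-or-array convention and the length check described above could look roughly like this. `broadcast_param` is a hypothetical helper written for this example, not part of the documented API, and the error message wording is likewise an assumption.

```python
def broadcast_param(value, n_kcs, name):
    """Return a list of n_kcs floats from a scalar or a sequence (hypothetical helper)."""
    if isinstance(value, (int, float)):
        # A scalar is repeated once per knowledge component.
        return [float(value)] * n_kcs
    values = [float(v) for v in value]
    if len(values) != n_kcs:
        raise ValueError(f"{name} has length {len(values)}, expected n_kcs={n_kcs}")
    return values


prior = broadcast_param(0.1, 3, "prior")         # scalar, repeated for 3 KCs
learn = broadcast_param([0.01, 0.02], 2, "learn")  # per-KC values, length matches

try:
    broadcast_param([0.2, 0.3], 3, "guess")      # length 2 but n_kcs=3
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)
```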