stanbkt.fits#
Classes
BaseFitOptions | Base dataclass for typed Stan fit options.
FitBase | Base class for StanBKT fits.
FitFactory | Factory for creating fit classes and options based on fit method.
FitMetadata | Root metadata for persisted fits.
FitMethod | Enumeration of supported fitting methods.
MCMCFit | Fit class using Markov Chain Monte Carlo (MCMC) sampling.
MCMCFitOptions | Common options for cmdstanpy.CmdStanModel.sample().
MLEFit | Fit class using Maximum Likelihood Estimation (MLE) / Optimization.
MLEFitOptions | Common options for cmdstanpy.CmdStanModel.optimize().
PFFitOptions | Common options for cmdstanpy.CmdStanModel.pathfinder().
PathfinderFit | Fit class using Pathfinder variational approximation.
VBFit | Fit class using Variational Bayes (VB) approximation.
VBFitOptions | Common options for cmdstanpy.CmdStanModel.variational().
API Details
BKT Fit results and options for different inference methods.
This module defines the core fit result classes for various inference methods (MCMC, MLE, VB, Pathfinder) and their associated options. It also includes a factory for creating fit instances based on user-specified options.
- class stanbkt.fits.BaseFitOptions(seed=None, extra_kwargs=<factory>)#
Bases: object
Base dataclass for typed Stan fit options.
- Parameters:
Notes
Subclasses can add strongly typed fields and use extra_kwargs for less common or future CmdStanPy options.
- classmethod from_dict(d)#
Create fit options from a dictionary.
Known dataclass fields are extracted and used for instantiation. Remaining keys are stored in extra_kwargs for CmdStanPy.
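The split between known fields and leftover keys can be sketched as follows. This is a minimal illustration, not stanbkt's implementation: the class name SketchFitOptions and its chains field are hypothetical stand-ins, and only the behavior described above (known fields instantiate the dataclass, remaining keys go to extra_kwargs) is taken from the documentation.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, Optional


@dataclass
class SketchFitOptions:
    """Hypothetical stand-in for a fit-options dataclass (not stanbkt code)."""

    seed: Optional[int] = None
    chains: int = 4
    extra_kwargs: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, d: Dict[str, Any]) -> "SketchFitOptions":
        # Names of the declared dataclass fields, excluding the catch-all.
        known = {f.name for f in fields(cls)} - {"extra_kwargs"}
        # Start from any extras the caller already grouped explicitly.
        extras = dict(d.get("extra_kwargs", {}))
        init_kwargs = {}
        for key, value in d.items():
            if key == "extra_kwargs":
                continue
            if key in known:
                init_kwargs[key] = value  # typed field
            else:
                extras[key] = value  # unknown key, kept for later forwarding
        return cls(**init_kwargs, extra_kwargs=extras)


opts = SketchFitOptions.from_dict({"seed": 42, "chains": 2, "max_treedepth": 12})
```

Here seed and chains populate typed fields, while max_treedepth, which this sketch class does not declare, ends up in opts.extra_kwargs.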
- class stanbkt.fits.FitBase(verbose=VerbosityLevel.INFO, fits=None, fit_metadata=None, cache_summary=True, summary_percentiles=(2.5, 97.5), _summary_cache=None)#
Bases: VerboseMixin, ABC
Base class for StanBKT fits.
This class provides shared fit state management and delegates all persistence to stanbkt.fits.persistence.
- Variables:
fits (dict[str, CmdStanFit]) – Mapping of knowledge component IDs to CmdStan fit objects.
num_fitted_kcs (int) – Number of knowledge components that have been fitted.
_fit_metadata (FitMetadata) – Metadata used to resolve persisted fit folders.
_summary_cache (dict[str, pd.DataFrame]) – Cached summary DataFrames for each knowledge component.
_summary_percentiles (tuple[float, float], default (2.5, 97.5)) – Percentiles used for generating summary statistics. Values should be in range [1, 99].
- Parameters:
verbose (VerbosityLevel)
fit_metadata (FitMetadata | None)
cache_summary (bool)
- add_fit(kc, fit, overwrite_kcs=False, group2index=None, groups=None)#
Add a fit for a knowledge component to the model’s fit state.
- Parameters:
kc (str) – Knowledge component identifier.
fit (Union[CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder]) – CmdStan fit object to add for the KC.
overwrite_kcs (bool) – Whether to overwrite existing fits for KCs that are being added again.
group2index (dict[str, int] | None) – Optional mapping from group ID to 1-based index used for this KC’s fit.
groups (set[str] | None) – Optional set of group IDs used for this KC’s fit.
- Raises:
ValueError – If the fit’s method is incompatible with the model’s fit method, or if a fit for the KC already exists and overwrite_kcs=False.
- Return type:
- get_fit(kc)#
Get the fit for a knowledge component.
- Parameters:
kc (str) – Knowledge component identifier.
- Returns:
CmdStan fit object for the specified KC.
- Return type:
Union[CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder]
- Raises:
KeyError – If no fit exists for the specified KC.
- get_fit_save_entry(kc)#
Return persisted fit metadata entry for a KC, if available.
- has_kc(kc)#
Check if a fit exists for a knowledge component.
- log(msg, level=VerbosityLevel.INFO)#
Log a message if verbosity level permits.
- Parameters:
msg (str) – Message to log.
level (VerbosityLevel) – Verbosity level of this message. Message is printed if self.verbose >= level. Lower enum values = higher verbosity.
- set_verbosity(level)#
Set the verbosity level for logging.
- Parameters:
level (VerbosityLevel) – New verbosity level.
- Raises:
ValueError – If level is not a valid VerbosityLevel.
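The add_fit / get_fit / has_kc contract documented for FitBase can be illustrated with a small stand-in. This is a sketch under stated assumptions: SketchFitStore is a hypothetical class, plain strings stand in for CmdStan fit objects, and the fit-method compatibility check, group bookkeeping, and persistence are deliberately left out.

```python
from typing import Any, Dict


class SketchFitStore:
    """Hypothetical stand-in mirroring the documented add/get/has contract."""

    def __init__(self) -> None:
        # Maps knowledge component IDs to fit objects.
        self.fits: Dict[str, Any] = {}

    @property
    def num_fitted_kcs(self) -> int:
        return len(self.fits)

    def has_kc(self, kc: str) -> bool:
        return kc in self.fits

    def add_fit(self, kc: str, fit: Any, overwrite_kcs: bool = False) -> None:
        # Refuse to silently replace an existing fit.
        if self.has_kc(kc) and not overwrite_kcs:
            raise ValueError(f"A fit for KC {kc!r} already exists")
        self.fits[kc] = fit

    def get_fit(self, kc: str) -> Any:
        if not self.has_kc(kc):
            raise KeyError(f"No fit exists for KC {kc!r}")
        return self.fits[kc]


store = SketchFitStore()
store.add_fit("fractions", fit="first-fit")
try:
    store.add_fit("fractions", fit="second-fit")
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
# An explicit overwrite replaces the stored fit.
store.add_fit("fractions", fit="second-fit", overwrite_kcs=True)
```

Checking has_kc before calling get_fit avoids the KeyError path for KCs that were never fitted.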
- class stanbkt.fits.FitFactory#
Bases: object
Factory for creating fit classes and options based on fit method.
- FIT_METHOD_TO_OPTION_MAPPING: dict[FitMethod, type[MCMCFitOptions | VBFitOptions | MLEFitOptions | PFFitOptions]] = {FitMethod.MCMC: <class 'stanbkt.fits.fit_options.MCMCFitOptions'>, FitMethod.MLE: <class 'stanbkt.fits.fit_options.MLEFitOptions'>, FitMethod.PATHFINDER: <class 'stanbkt.fits.fit_options.PFFitOptions'>, FitMethod.VB: <class 'stanbkt.fits.fit_options.VBFitOptions'>}#
- static create_default_fit_options(fit_method)#
Get default fit options for a given fit method.
- Parameters:
fit_method (FitMethod) – Fit method for which to get default options.
- Returns:
Default fit options for the specified method.
- Return type:
Union[MCMCFitOptions, VBFitOptions, MLEFitOptions, PFFitOptions]
- Raises:
ValueError – If fit method is unsupported.
- static create_fit_options_from_dict(fit_option_dict, fit_method)#
Create fit options from a dictionary for a given fit method.
- Parameters:
- Returns:
Fit options instance created from the dictionary.
- Return type:
Union[MCMCFitOptions, VBFitOptions, MLEFitOptions, PFFitOptions]
- Raises:
ValueError – If fit method is unsupported.
- static get_fit_class_from_method(fit_method)#
Get the expected CmdStan fit class for this fit method.
- Returns:
Expected CmdStan fit class corresponding to this fit method.
- Return type:
- Raises:
ValueError – If fit method is unsupported.
- Parameters:
fit_method (FitMethod)
- static verify_fit_options_compatibility(fit_options, fit_method, cpp_compile_kwargs={})#
Verify that provided fit options are compatible with the specified fit method.
- Parameters:
fit_options (Union[MCMCFitOptions, VBFitOptions, MLEFitOptions, PFFitOptions]) – Fit options to verify.
fit_method (FitMethod) – Fit method for which to verify compatibility.
cpp_compile_kwargs (dict) – C++ compile kwargs to check for compatibility with fit options (e.g. for MCMC multi-threading), by default {}.
- Raises:
TypeError – If fit options are not compatible with the specified fit method.
ValueError – If fit method is unsupported.
- Return type:
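The factory pattern described for FitFactory, a mapping from fit method to options class plus a type check, can be sketched as follows. All names here (SketchMethod, McmcOpts, MleOpts, OPTION_CLASSES, default_options, verify_compatible) are hypothetical stand-ins; the real class also inspects cpp_compile_kwargs, which this sketch omits.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class SketchMethod(str, Enum):
    MCMC = "mcmc"
    MLE = "mle"
    VB = "vb"  # intentionally left out of the mapping below


@dataclass
class McmcOpts:
    chains: int = 4


@dataclass
class MleOpts:
    iter: int = 2000


# One options class per supported method.
OPTION_CLASSES: Dict[SketchMethod, type] = {
    SketchMethod.MCMC: McmcOpts,
    SketchMethod.MLE: MleOpts,
}


def default_options(method: SketchMethod):
    """Instantiate the options class registered for a method."""
    if method not in OPTION_CLASSES:
        raise ValueError(f"Unsupported fit method: {method!r}")
    return OPTION_CLASSES[method]()


def verify_compatible(options: object, method: SketchMethod) -> None:
    """Raise TypeError when options do not belong to the given method."""
    if method not in OPTION_CLASSES:
        raise ValueError(f"Unsupported fit method: {method!r}")
    expected = OPTION_CLASSES[method]
    if not isinstance(options, expected):
        raise TypeError(
            f"Expected {expected.__name__}, got {type(options).__name__}"
        )


mle_defaults = default_options(SketchMethod.MLE)
try:
    verify_compatible(mle_defaults, SketchMethod.MCMC)
    mismatch_detected = False
except TypeError:
    mismatch_detected = True
```

Keeping the mapping in one dictionary means that supporting a new method only requires a new entry, and both helper functions pick it up automatically.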
- class stanbkt.fits.FitMetadata(fit_method, fit_saves=<factory>, summary_percentiles=(2.5, 97.5))#
Bases: object
Root metadata for persisted fits.
- Variables:
fit_method (FitMethod) – Method used to fit all attached KCs.
fit_saves (FitSaves) – Saved fit folder entries, keyed by knowledge component identifier.
summary_percentiles (tuple[float, float], default (2.5, 97.5)) – Lower and upper percentiles used when computing summary statistics. Values should be in range [1, 99]. Persisted so that cached summaries remain valid after a save/load round-trip.
- Parameters:
- class stanbkt.fits.FitMethod(*values)#
Bases: StrEnum
Enumeration of supported fitting methods.
- Variables:
- MCMC = 'mcmc'#
- MLE = 'mle'#
- PATHFINDER = 'pathfinder'#
- VB = 'vb'#
- static infer_fit_method_from_stan_fit(fit)#
Infer the fit method from a CmdStan fit object.
- Parameters:
fit (Union[CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder]) – Fit object created by CmdStanPy.
- Returns:
Inferred fit method enum value.
- Return type:
- Raises:
ValueError – If fit type is unsupported.
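Inferring a method from a fit object amounts to a type dispatch. The sketch below is an assumption about how such a lookup could work, not the library's code: FakeMCMC and FakeMLE stand in for CmdStanPy's result classes, and a str-mixin Enum is used in place of StrEnum so the example also runs on Python versions before 3.11.

```python
from enum import Enum


class SketchFitMethod(str, Enum):
    MCMC = "mcmc"
    MLE = "mle"
    PATHFINDER = "pathfinder"
    VB = "vb"


class FakeMCMC:
    """Stand-in for cmdstanpy.CmdStanMCMC."""


class FakeMLE:
    """Stand-in for cmdstanpy.CmdStanMLE."""


# Ordered (fit type, method) pairs checked with isinstance.
_TYPE_TO_METHOD = (
    (FakeMCMC, SketchFitMethod.MCMC),
    (FakeMLE, SketchFitMethod.MLE),
)


def infer_method(fit: object) -> SketchFitMethod:
    for fit_type, method in _TYPE_TO_METHOD:
        if isinstance(fit, fit_type):
            return method
    raise ValueError(f"Unsupported fit type: {type(fit).__name__}")


inferred = infer_method(FakeMLE())
# Members can also be looked up from their string value.
from_string = SketchFitMethod("pathfinder")
```

Because the enum members are strings, a value such as "mcmc" read from a configuration file compares equal to the corresponding member.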
- class stanbkt.fits.MCMCFit(verbose=VerbosityLevel.INFO, fits=None, fit_metadata=None, cache_summary=True, summary_percentiles=(2.5, 97.5), _summary_cache=None)#
Bases: FitBase
Fit class using Markov Chain Monte Carlo (MCMC) sampling.
This class wraps CmdStanPy’s MCMC sampler to fit BKT models using full Bayesian inference via Hamiltonian Monte Carlo sampling.
Inherits all state management from FitBase.
- Parameters:
verbose (VerbosityLevel)
fit_metadata (FitMetadata | None)
cache_summary (bool)
- add_fit(kc, fit, overwrite_kcs=False, group2index=None, groups=None)#
Add a fit for a knowledge component to the model’s fit state.
- Parameters:
kc (str) – Knowledge component identifier.
fit (Union[CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder]) – CmdStan fit object to add for the KC.
overwrite_kcs (bool) – Whether to overwrite existing fits for KCs that are being added again.
group2index (dict[str, int] | None) – Optional mapping from group ID to 1-based index used for this KC’s fit.
groups (set[str] | None) – Optional set of group IDs used for this KC’s fit.
- Raises:
ValueError – If the fit’s method is incompatible with the model’s fit method, or if a fit for the KC already exists and overwrite_kcs=False.
- Return type:
- diagnose()#
Generate MCMC diagnostic information.
Provides convergence diagnostics (Rhat, effective sample size, etc.) for the fitted MCMC chains. This method is not yet implemented.
- get_fit(kc)#
Get the fit for a knowledge component.
- Parameters:
kc (str) – Knowledge component identifier.
- Returns:
CmdStan fit object for the specified KC.
- Return type:
Union[CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder]
- Raises:
KeyError – If no fit exists for the specified KC.
- get_fit_save_entry(kc)#
Return persisted fit metadata entry for a KC, if available.
- has_kc(kc)#
Check if a fit exists for a knowledge component.
- log(msg, level=VerbosityLevel.INFO)#
Log a message if verbosity level permits.
- Parameters:
msg (str) – Message to log.
level (VerbosityLevel) – Verbosity level of this message. Message is printed if self.verbose >= level. Lower enum values = higher verbosity.
- set_verbosity(level)#
Set the verbosity level for logging.
- Parameters:
level (VerbosityLevel) – New verbosity level.
- Raises:
ValueError – If level is not a valid VerbosityLevel.
- class stanbkt.fits.MCMCFitOptions(seed=None, extra_kwargs=<factory>, chains=4, parallel_chains=4, threads_per_chain=1, iter_warmup=1000, iter_sampling=1000, save_warmup=None, thin=None, adapt_delta=None, max_treedepth=None, show_progress=True, show_console=False)#
Bases: BaseFitOptions
Common options for cmdstanpy.CmdStanModel.sample().
- Parameters:
chains (int) – Number of Markov chains.
parallel_chains (int) – Number of chains to run in parallel.
threads_per_chain (int) – Number of threads used per chain.
iter_warmup (int) – Warmup iterations per chain.
iter_sampling (int) – Sampling iterations per chain.
seed (int | list[int] | None) – RNG seed (single seed or one seed per chain).
adapt_delta (float | None) – Target acceptance statistic for NUTS adaptation.
show_progress (bool) – Whether to show sampling progress.
show_console (bool) – Whether to stream CmdStan console output.
- classmethod from_dict(d)#
Create fit options from a dictionary.
Known dataclass fields are extracted and used for instantiation. Remaining keys are stored in extra_kwargs for CmdStanPy.
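As a quick worked example of what the MCMCFitOptions defaults imply, the arithmetic below uses only the default values from the signature above. It assumes no thinning and that warmup draws are not saved. The thread figure is a rough upper bound that applies when the Stan model is compiled with threading support.

```python
# Default values taken from the MCMCFitOptions signature.
chains = 4
parallel_chains = 4
threads_per_chain = 1
iter_warmup = 1000
iter_sampling = 1000

# Post-warmup draws kept across all chains (no thinning assumed).
total_draws = chains * iter_sampling

# Iterations actually computed, warmup included.
total_iterations = chains * (iter_warmup + iter_sampling)

# Rough upper bound on threads in use at once.
peak_threads = parallel_chains * threads_per_chain

print(total_draws, total_iterations, peak_threads)  # 4000 8000 4
```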
- class stanbkt.fits.MLEFit(verbose=VerbosityLevel.INFO, fits=None, fit_metadata=None, cache_summary=True, summary_percentiles=(2.5, 97.5), _summary_cache=None)#
Bases: FitBase
Fit class using Maximum Likelihood Estimation (MLE) / Optimization.
This class wraps CmdStanPy’s optimization algorithm to fit BKT models by finding point estimates that maximize the likelihood function.
Inherits all state management from FitBase.
- Parameters:
verbose (VerbosityLevel)
fit_metadata (FitMetadata | None)
cache_summary (bool)
- add_fit(kc, fit, overwrite_kcs=False, group2index=None, groups=None)#
Add a fit for a knowledge component to the model’s fit state.
- Parameters:
kc (str) – Knowledge component identifier.
fit (Union[CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder]) – CmdStan fit object to add for the KC.
overwrite_kcs (bool) – Whether to overwrite existing fits for KCs that are being added again.
group2index (dict[str, int] | None) – Optional mapping from group ID to 1-based index used for this KC’s fit.
groups (set[str] | None) – Optional set of group IDs used for this KC’s fit.
- Raises:
ValueError – If the fit’s method is incompatible with the model’s fit method, or if a fit for the KC already exists and overwrite_kcs=False.
- Return type:
- get_fit(kc)#
Get the fit for a knowledge component.
- Parameters:
kc (str) – Knowledge component identifier.
- Returns:
CmdStan fit object for the specified KC.
- Return type:
Union[CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder]
- Raises:
KeyError – If no fit exists for the specified KC.
- get_fit_save_entry(kc)#
Return persisted fit metadata entry for a KC, if available.
- has_kc(kc)#
Check if a fit exists for a knowledge component.
- log(msg, level=VerbosityLevel.INFO)#
Log a message if verbosity level permits.
- Parameters:
msg (str) – Message to log.
level (VerbosityLevel) – Verbosity level of this message. Message is printed if self.verbose >= level. Lower enum values = higher verbosity.
- set_verbosity(level)#
Set the verbosity level for logging.
- Parameters:
level (VerbosityLevel) – New verbosity level.
- Raises:
ValueError – If level is not a valid VerbosityLevel.
- class stanbkt.fits.MLEFitOptions(seed=None, extra_kwargs=<factory>, algorithm=None, iter=2000, jacobian=False, tol_obj=None, tol_rel_obj=None, tol_grad=None, tol_rel_grad=None, tol_param=None, history_size=None)#
Bases: BaseFitOptions
Common options for cmdstanpy.CmdStanModel.optimize().
- Parameters:
algorithm (str | None) – Optimization algorithm (for example, "lbfgs", "bfgs", or "newton").
iter (int) – Maximum number of optimization iterations.
jacobian (bool) – Whether to include Jacobian adjustment for constrained parameters.
tol_obj (float | None) – Convergence tolerance on changes in objective function value.
tol_rel_obj (float | None) – Convergence tolerance on relative changes in objective function value.
tol_grad (float | None) – Convergence tolerance on the norm of the gradient.
tol_rel_grad (float | None) – Convergence tolerance on the relative norm of the gradient.
tol_param (float | None) – Convergence tolerance on changes in parameter value.
history_size (int | None) – History size for the L-BFGS Hessian approximation. Values of 5–10 are usually sufficient; must be less than the parameter-space dimensionality.
- classmethod from_dict(d)#
Create fit options from a dictionary.
Known dataclass fields are extracted and used for instantiation. Remaining keys are stored in extra_kwargs for CmdStanPy.
- to_dict()#
Convert options to a CmdStanPy kwargs dictionary.
None values are removed so CmdStanPy can apply its own defaults.
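The None-dropping behavior of to_dict can be sketched with a small dataclass. SketchOptimizeOptions is a hypothetical stand-in with a subset of the fields listed above. Merging extra_kwargs into the top level of the result is an assumption of this sketch, not something the documentation states.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SketchOptimizeOptions:
    """Hypothetical stand-in with a few optimize-style fields."""

    seed: Optional[int] = None
    algorithm: Optional[str] = None
    iter: int = 2000
    jacobian: bool = False
    extra_kwargs: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        raw = asdict(self)
        extras = raw.pop("extra_kwargs")
        # Drop only None; falsy values such as False or 0 are kept.
        kwargs = {k: v for k, v in raw.items() if v is not None}
        # Assumption: extras are forwarded as top-level keyword arguments.
        kwargs.update(extras)
        return kwargs


kwargs = SketchOptimizeOptions(
    algorithm="lbfgs", extra_kwargs={"refresh": 100}
).to_dict()
```

In this sketch seed is omitted from kwargs because it is None, while jacobian=False survives: the filter tests for None specifically instead of truthiness.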
- class stanbkt.fits.PFFitOptions(seed=None, extra_kwargs=<factory>, init_alpha=None, tol_obj=None, tol_rel_obj=None, tol_grad=None, tol_rel_grad=None, tol_param=None, history_size=None, num_paths=None, max_lbfgs_iters=None, draws=None, num_elbo_draws=None, psis_resample=True, calculate_lp=True, inits=None, show_console=False, refresh=None, num_threads=None)#
Bases: BaseFitOptions
Common options for cmdstanpy.CmdStanModel.pathfinder().
- Parameters:
init_alpha (float | None) – Initial step size parameter for Pathfinder.
tol_obj (float | None) – Absolute tolerance on the objective value.
tol_rel_obj (float | None) – Relative tolerance on the objective value.
tol_grad (float | None) – Absolute tolerance on the gradient norm.
tol_rel_grad (float | None) – Relative tolerance on the gradient norm.
history_size (int | None) – History size used by the underlying L-BFGS optimizer.
num_paths (int | None) – Number of Pathfinder optimization paths.
max_lbfgs_iters (int | None) – Maximum number of L-BFGS iterations per path.
draws (int | None) – Number of draws returned from the Pathfinder approximation.
num_elbo_draws (int | None) – Number of draws used for ELBO estimation.
psis_resample (bool) – Whether to use PSIS resampling for returned draws.
calculate_lp (bool) – Whether to calculate log probability values for draws.
inits (dict[str, float] | float | PathLike | str | None) – Initial parameter values or path to an initialization file.
show_console (bool) – Whether to stream CmdStan console output.
refresh (int | None) – Frequency of progress messages written by CmdStan.
num_threads (int | None) – Number of threads available to the CmdStan process.
Notes
This dataclass intentionally exposes only a small set of commonly used Pathfinder arguments. Less common options can be supplied through extra_kwargs.
- classmethod from_dict(d)#
Create fit options from a dictionary.
Known dataclass fields are extracted and used for instantiation. Remaining keys are stored in extra_kwargs for CmdStanPy.
- to_dict()#
Convert options to a CmdStanPy kwargs dictionary.
None values are removed so CmdStanPy can apply its own defaults.
- class stanbkt.fits.PathfinderFit(verbose=VerbosityLevel.INFO, fits=None, fit_metadata=None, cache_summary=True, summary_percentiles=(2.5, 97.5), _summary_cache=None)#
Bases: FitBase
Fit class using Pathfinder variational approximation.
This class wraps CmdStanPy’s Pathfinder algorithm to fit BKT models using a fast variational approximation that explores the posterior geometry.
Inherits all state management from FitBase.
- Parameters:
verbose (VerbosityLevel)
fit_metadata (FitMetadata | None)
cache_summary (bool)
- add_fit(kc, fit, overwrite_kcs=False, group2index=None, groups=None)#
Add a fit for a knowledge component to the model’s fit state.
- Parameters:
kc (str) – Knowledge component identifier.
fit (Union[CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder]) – CmdStan fit object to add for the KC.
overwrite_kcs (bool) – Whether to overwrite existing fits for KCs that are being added again.
group2index (dict[str, int] | None) – Optional mapping from group ID to 1-based index used for this KC’s fit.
groups (set[str] | None) – Optional set of group IDs used for this KC’s fit.
- Raises:
ValueError – If the fit’s method is incompatible with the model’s fit method, or if a fit for the KC already exists and overwrite_kcs=False.
- Return type:
- get_fit(kc)#
Get the fit for a knowledge component.
- Parameters:
kc (str) – Knowledge component identifier.
- Returns:
CmdStan fit object for the specified KC.
- Return type:
Union[CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder]
- Raises:
KeyError – If no fit exists for the specified KC.
- get_fit_save_entry(kc)#
Return persisted fit metadata entry for a KC, if available.
- has_kc(kc)#
Check if a fit exists for a knowledge component.
- log(msg, level=VerbosityLevel.INFO)#
Log a message if verbosity level permits.
- Parameters:
msg (str) – Message to log.
level (VerbosityLevel) – Verbosity level of this message. Message is printed if self.verbose >= level. Lower enum values = higher verbosity.
- set_verbosity(level)#
Set the verbosity level for logging.
- Parameters:
level (VerbosityLevel) – New verbosity level.
- Raises:
ValueError – If level is not a valid VerbosityLevel.
- class stanbkt.fits.VBFit(verbose=VerbosityLevel.INFO, fits=None, fit_metadata=None, cache_summary=True, summary_percentiles=(2.5, 97.5), _summary_cache=None)#
Bases: FitBase
Fit class using Variational Bayes (VB) approximation.
This class wraps CmdStanPy’s variational Bayes algorithm to fit BKT models using fast approximate posterior inference.
Inherits all state management from FitBase.
- Parameters:
verbose (VerbosityLevel)
fit_metadata (FitMetadata | None)
cache_summary (bool)
- add_fit(kc, fit, overwrite_kcs=False, group2index=None, groups=None)#
Add a fit for a knowledge component to the model’s fit state.
- Parameters:
kc (str) – Knowledge component identifier.
fit (Union[CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder]) – CmdStan fit object to add for the KC.
overwrite_kcs (bool) – Whether to overwrite existing fits for KCs that are being added again.
group2index (dict[str, int] | None) – Optional mapping from group ID to 1-based index used for this KC’s fit.
groups (set[str] | None) – Optional set of group IDs used for this KC’s fit.
- Raises:
ValueError – If the fit’s method is incompatible with the model’s fit method, or if a fit for the KC already exists and overwrite_kcs=False.
- Return type:
- get_fit(kc)#
Get the fit for a knowledge component.
- Parameters:
kc (str) – Knowledge component identifier.
- Returns:
CmdStan fit object for the specified KC.
- Return type:
Union[CmdStanMCMC, CmdStanMLE, CmdStanVB, CmdStanPathfinder]
- Raises:
KeyError – If no fit exists for the specified KC.
- get_fit_save_entry(kc)#
Return persisted fit metadata entry for a KC, if available.
- has_kc(kc)#
Check if a fit exists for a knowledge component.
- log(msg, level=VerbosityLevel.INFO)#
Log a message if verbosity level permits.
- Parameters:
msg (str) – Message to log.
level (VerbosityLevel) – Verbosity level of this message. Message is printed if self.verbose >= level. Lower enum values = higher verbosity.
- set_verbosity(level)#
Set the verbosity level for logging.
- Parameters:
level (VerbosityLevel) – New verbosity level.
- Raises:
ValueError – If level is not a valid VerbosityLevel.
- class stanbkt.fits.VBFitOptions(seed=None, extra_kwargs=<factory>, algorithm='meanfield', iter=None, grad_samples=1, elbo_samples=None, eta=None, draws=None, require_converged=True)#
Bases: BaseFitOptions
Common options for cmdstanpy.CmdStanModel.variational().
- Parameters:
algorithm (str) – Variational algorithm (for example, "meanfield" or "fullrank").
grad_samples (int | None) – Number of Monte Carlo gradient samples.
elbo_samples (int | None) – Number of Monte Carlo ELBO samples.
draws (int | None) – Number of approximate posterior draws to save.
require_converged (bool)
- classmethod from_dict(d)#
Create fit options from a dictionary.
Known dataclass fields are extracted and used for instantiation. Remaining keys are stored in extra_kwargs for CmdStanPy.