calibration
Define the calibration class
Classes
| Name | Description |
|---|---|
| Calibration | Customized STIsim calibration class. |
Calibration
calibration.Calibration(
    sim,
    calib_pars,
    data=None,
    weights=None,
    extra_results=None,
    save_results=False,
    check_fn=None,
    **kwargs,
)
Customized STIsim calibration class.
Inherits all the functionality of the Starsim calibration class, but adds:
- A default build function that routes parameters using dot notation (e.g. 'hiv.beta_m2f')
- A default evaluation function that uses the data provided in the constructor
- A get_pars method for extracting calibrated parameter sets
If no build_fn is provided, the class uses default_build_fn, which looks up modules via sim.get_module() and sets their parameters.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| sim | Sim | The simulation to calibrate | required |
| calib_pars | dict | Parameters to calibrate using dot notation, e.g. {'hiv.beta_m2f': dict(low=0.01, high=0.1, guess=0.05)} | required |
| data | DataFrame | Calibration targets with 'time' column + result columns | None |
| weights | dict | Optional weight multipliers per result | None |
| extra_results | list | Additional results to track beyond data columns | None |
| save_results | bool | Save sim results for each trial | False |
Examples::
sim = make_sim()
data = pd.read_csv('calibration_data.csv')
calib_pars = {
    'hiv.beta_m2f': dict(low=0.01, high=0.1, guess=0.05),
    'structuredsexual.prop_f0': dict(low=0.5, high=0.9, guess=0.8),
}
calib = sti.Calibration(sim=sim, calib_pars=calib_pars, data=data, total_trials=100)
calib.calibrate()
# Extract best parameters and run multi-sim
par_sets = calib.get_pars(n=200)
msim = sti.make_calib_sims(calib=calib, n_parsets=200)
Methods
| Name | Description |
|---|---|
| calibrate | Perform calibration with crash-recovery support. |
| get_pars | Extract top-N calibrated parameter sets as a list of flat dicts. |
| load_results | Load the results from the tmp files, tracking which loaded successfully |
| parse_study | Parse the study into a data frame – called automatically |
| run_trial | Define the objective for Optuna |
| save | Save calibration results. |
| shrink | Shrink the results to only the best fit |
| worker | Run a single worker, catching exceptions so one crash doesn’t kill all workers |
calibrate
calibration.Calibration.calibrate(calib_pars=None, **kwargs)
Perform calibration with crash-recovery support.
If continue_db=True and the database already has completed trials, only the remaining trials will be run. This allows recovery from crashes by simply re-running the same command.
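The recovery behaviour amounts to a count of what is left to run. The sketch below is an illustration only: remaining_trials is a hypothetical helper, not part of STIsim, and the real implementation may count trial states differently.

```python
def remaining_trials(total_trials, n_completed):
    """Hypothetical helper: number of trials still to run after a restart."""
    return max(0, total_trials - n_completed)

# A 100-trial calibration that crashed after 63 completed trials
print(remaining_trials(100, 63))   # 37
# Re-running a finished calibration schedules nothing new
print(remaining_trials(100, 100))  # 0
```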
get_pars
calibration.Calibration.get_pars(n=None)
Extract top-N calibrated parameter sets as a list of flat dicts.
Each dict maps 'module.par' keys to scalar values (metadata columns like index, mismatch, and rand_seed are stripped). The returned dicts can be passed directly to set_sim_pars or make_calib_sims.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| n | int | Number of top parameter sets. None returns all. | None |
Returns
| Name | Type | Description |
|---|---|---|
| | list[dict] | Parameter sets sorted by mismatch (best first). |
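The selection described above can be sketched with plain dicts standing in for rows of calib.df. This is an assumption-laden illustration, not the STIsim implementation: top_par_sets and META_KEYS are hypothetical names, and the real set of metadata columns may be larger than the three named in the text.

```python
# Metadata columns named in the description; the real list may differ
META_KEYS = {'index', 'mismatch', 'rand_seed'}

def top_par_sets(rows, n=None):
    """Sort rows by mismatch (best first), keep the top n, strip metadata."""
    ranked = sorted(rows, key=lambda r: r['mismatch'])
    if n is not None:
        ranked = ranked[:n]
    return [{k: v for k, v in r.items() if k not in META_KEYS} for r in ranked]

rows = [
    {'index': 0, 'mismatch': 2.5, 'rand_seed': 11, 'hiv.beta_m2f': 0.08},
    {'index': 1, 'mismatch': 0.7, 'rand_seed': 12, 'hiv.beta_m2f': 0.04},
    {'index': 2, 'mismatch': 1.3, 'rand_seed': 13, 'hiv.beta_m2f': 0.05},
]
best = top_par_sets(rows, n=2)
print(best)  # [{'hiv.beta_m2f': 0.04}, {'hiv.beta_m2f': 0.05}]
```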
load_results
calibration.Calibration.load_results(study)
Load the results from the tmp files, tracking which loaded successfully
parse_study
calibration.Calibration.parse_study(study)
Parse the study into a data frame – called automatically
run_trial
calibration.Calibration.run_trial(trial)
Define the objective for Optuna
save
calibration.Calibration.save(
    filename,
    shrink=True,
    n_results=None,
    save_pars=True,
    pars_filename=None,
)
Save calibration results.
Optionally shrinks to the top n_results before saving (default: keep top 10% of completed trials). Also saves the parameter DataFrame as a separate file.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| filename | str / Path | Path for the saved calibration object. | required |
| shrink | bool | If True (default), shrink before saving. The full (unshrunk) object is saved with a _full suffix first. | True |
| n_results | int | Number of top results to keep when shrinking. Default: len(self.df) // 10 (minimum 1). | None |
| save_pars | bool | If True (default), also save calib.df. | True |
| pars_filename | str / Path | Path for the parameter DataFrame. If None, uses {stem}_pars.df next to filename. | None |
Returns
| Name | Type | Description |
|---|---|---|
| | | The (possibly shrunk) calibration object. |
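The two defaults documented above (top 10% of trials with a minimum of one, and a {stem}_pars.df file next to filename) can be worked through as follows. Both helper names are hypothetical and exist only for this illustration; the example file name is also made up.

```python
from pathlib import Path

def default_n_results(n_trials):
    """Keep the top 10% of trials, but never fewer than one."""
    return max(1, n_trials // 10)

def default_pars_filename(filename):
    """Build '{stem}_pars.df' in the same folder as the calibration file."""
    path = Path(filename)
    return path.with_name(f'{path.stem}_pars.df')

print(default_n_results(250))  # 25
print(default_n_results(4))    # 1
print(default_pars_filename('results/hiv_calib.obj').as_posix())  # results/hiv_calib_pars.df
```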
shrink
calibration.Calibration.shrink(n_results=100, make_df=True)
Shrink the results to only the best fit
worker
calibration.Calibration.worker()
Run a single worker, catching exceptions so one crash doesn’t kill all workers
Functions
| Name | Description |
|---|---|
| compute_gof | Calculate the goodness of fit. Uses normalized absolute error by default, but is highly customizable. |
| default_build_fn | Default build function for STIsim calibration. |
| flatten_calib_pars | Normalize calibration parameters to flat dot-notation format. |
| make_calib_sims | Create and run simulations using calibrated parameters. |
| set_sim_pars | Set calibrated parameters on a sim. |
compute_gof
calibration.compute_gof(
    actual,
    predicted,
    normalize=True,
    use_frac=False,
    use_squared=False,
    as_scalar='none',
    eps=1e-09,
    skestimator=None,
    estimator=None,
    **kwargs,
)
Calculate the goodness of fit. Uses normalized absolute error by default, but is highly customizable. For example, mean squared error is equivalent to setting normalize=False, use_squared=True, as_scalar='mean'.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| actual | arr | array of actual (data) points | required |
| predicted | arr | corresponding array of predicted (model) points | required |
| normalize | bool | whether to divide the values by the largest value in either series | True |
| use_frac | bool | convert to fractional mismatches rather than absolute | False |
| use_squared | bool | square the mismatches | False |
| as_scalar | str | return as a scalar instead of a time series: choices are sum, mean, median | 'none' |
| eps | float | to avoid divide-by-zero | 1e-09 |
| skestimator | str | if provided, use this scikit-learn estimator instead | None |
| estimator | func | if provided, use this custom estimator instead | None |
| kwargs | dict | passed to the scikit-learn or custom estimator | {} |
Returns
| Name | Type | Description |
|---|---|---|
| gofs | arr | array of goodness-of-fit values, or a single value if as_scalar is 'sum', 'mean', or 'median' |
Examples::
x1 = np.cumsum(np.random.random(100))
x2 = np.cumsum(np.random.random(100))
e1 = compute_gof(x1, x2) # Default, normalized absolute error
e2 = compute_gof(x1, x2, normalize=False, use_frac=True) # Fractional error
e3 = compute_gof(x1, x2, normalize=False, use_squared=True, as_scalar='mean') # Mean squared error
e4 = compute_gof(x1, x2, skestimator='mean_squared_error') # Scikit-learn's MSE method
e5 = compute_gof(x1, x2, as_scalar='median') # Normalized median absolute error -- highly robust
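To make the mean-squared-error equivalence concrete, here is a simplified pure-Python sketch. It is not the library function: gof_sketch is a hypothetical name, and it covers only the use_squared and as_scalar options, leaving out normalization, fractional errors, and the estimator hooks.

```python
from statistics import mean, median

def gof_sketch(actual, predicted, use_squared=False, as_scalar='none'):
    """Simplified mismatch: absolute errors, optionally squared and reduced."""
    gofs = [abs(a - p) for a, p in zip(actual, predicted)]
    if use_squared:
        gofs = [g ** 2 for g in gofs]
    reducers = {'sum': sum, 'mean': mean, 'median': median}
    if as_scalar in reducers:
        return reducers[as_scalar](gofs)
    return gofs  # as_scalar='none' keeps the full series

actual = [1.0, 2.0, 4.0]
predicted = [1.5, 2.0, 3.0]
print(gof_sketch(actual, predicted))  # [0.5, 0.0, 1.0]

# Mean squared error: square the mismatches, then average them
mse = gof_sketch(actual, predicted, use_squared=True, as_scalar='mean')
print(mse)
```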
default_build_fn
calibration.default_build_fn(sim, calib_pars, **kwargs)
Default build function for STIsim calibration.
Routes calibration parameters to the correct sim module using dot notation:
- 'hiv.beta_m2f' → sim.get_module('hiv').pars['beta_m2f']
- 'structuredsexual.prop_f0' → sim.get_module('structuredsexual').pars['prop_f0']
- 'hiv_syph.rel_sus_syph_hiv' → sim.get_module('hiv_syph').pars['rel_sus_syph_hiv']
- 'symp_algo.rel_test' → sim.get_module('symp_algo').pars['rel_test']
All parameters are set on the uninitialized sim before sim.init() is called. This works because every module stores its pars dict immediately on construction.
rand_seed is handled specially: set_sim_pars skips it (it is in _META_KEYS), so it is applied directly to sim.pars here. This ensures that reseed=True in Calibration actually uses a different seed per trial, and that the stored rand_seed column in calib.df can be used to reproduce each trial exactly.
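The routing and the rand_seed special case can be sketched with stand-in objects. Everything here is hypothetical scaffolding for illustration: StubModule, StubSim, and build_sketch are not STIsim classes or functions, and the sketch assumes rand_seed arrives as a plain integer while other entries carry a 'value' key.

```python
class StubModule:
    """Hypothetical stand-in for a module that stores a pars dict on construction."""
    def __init__(self, name, **pars):
        self.name = name
        self.pars = dict(pars)

class StubSim:
    """Hypothetical stand-in for a sim exposing get_module() and pars."""
    def __init__(self, modules, rand_seed=0):
        self.modules = {m.name: m for m in modules}
        self.pars = {'rand_seed': rand_seed}

    def get_module(self, name):
        return self.modules[name]

def build_sketch(sim, calib_pars):
    """Route each sampled value to its module; apply rand_seed to the sim itself."""
    for key, spec in calib_pars.items():
        if key == 'rand_seed':
            sim.pars['rand_seed'] = spec  # Assumed to be a plain integer
            continue
        module_name, par_name = key.split('.', 1)
        sim.get_module(module_name).pars[par_name] = spec['value']
    return sim

sim = StubSim([StubModule('hiv', beta_m2f=0.035), StubModule('structuredsexual', f1_conc=0.16)])
calib_pars = {
    'hiv.beta_m2f': dict(low=0.01, high=0.10, guess=0.035, value=0.05),
    'structuredsexual.f1_conc': dict(low=0.005, high=0.3, guess=0.16, value=0.10),
    'rand_seed': 42,
}
sim = build_sketch(sim, calib_pars)
print(sim.get_module('hiv').pars['beta_m2f'])  # 0.05
print(sim.pars['rand_seed'])                   # 42
```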
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| sim | Sim | An uninitialized simulation (modules must be instances, not strings) | required |
| calib_pars | dict | Calibration parameters with values set by the sampler | required |
Returns
| Name | Type | Description |
|---|---|---|
| | Sim | The initialized and modified simulation |
Example::
calib_pars = {
    'hiv.beta_m2f': dict(low=0.01, high=0.10, guess=0.035, value=0.05),
    'structuredsexual.f1_conc': dict(low=0.005, high=0.3, guess=0.16, value=0.10),
    'hiv_syph.rel_sus_syph_hiv': dict(low=1.0, high=4.0, guess=2.5, value=3.0),
    'symp_algo.rel_test': dict(low=0.5, high=1.5, guess=1.0, value=1.2),
}
sim = default_build_fn(sim, calib_pars)
flatten_calib_pars
calibration.flatten_calib_pars(calib_pars)
Normalize calibration parameters to flat dot-notation format.
Accepts two formats:
Nested (grouped by module)::
dict(hiv=dict(beta_m2f=dict(low=0.01, high=0.10)))
Flat (dot notation — returned unchanged)::
{'hiv.beta_m2f': dict(low=0.01, high=0.10)}
Nested format is detected when a value is a dict whose keys don’t overlap with spec keys (low, high, guess, etc.).
Returns
| Name | Type | Description |
|---|---|---|
| | dict | Flat dict with 'module.par' keys. |
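The nested-format detection rule can be illustrated with a short sketch. It is not the library code: flatten_sketch is a hypothetical name, and SPEC_KEYS is an assumed subset of the spec keys (the text lists low, high, guess, "etc.").

```python
# Assumed subset of spec keys; the real list may include more
SPEC_KEYS = {'low', 'high', 'guess', 'value'}

def flatten_sketch(calib_pars):
    """Expand module-grouped entries into 'module.par' keys; pass flat entries through."""
    flat = {}
    for key, val in calib_pars.items():
        # Nested if the value is a dict with no spec keys among its own keys
        is_nested = isinstance(val, dict) and not (set(val) & SPEC_KEYS)
        if is_nested:
            for par, spec in val.items():
                flat[f'{key}.{par}'] = spec
        else:
            flat[key] = val
    return flat

nested = dict(hiv=dict(beta_m2f=dict(low=0.01, high=0.10)))
already_flat = {'hiv.beta_m2f': dict(low=0.01, high=0.10)}
print(flatten_sketch(nested))        # {'hiv.beta_m2f': {'low': 0.01, 'high': 0.1}}
print(flatten_sketch(already_flat))  # {'hiv.beta_m2f': {'low': 0.01, 'high': 0.1}}
```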
make_calib_sims
calibration.make_calib_sims(
    calib=None,
    calib_pars=None,
    sim=None,
    n_parsets=None,
    check_fn=None,
    seeds_per_par=1,
    **kwargs,
)
Create and run simulations using calibrated parameters.
Provide either a Calibration object or a set of calibration parameters directly. The function creates one sim per parameter set (with optional seed replication), runs them in parallel, and optionally filters the results.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| calib | Calibration | A completed calibration. Extracts pars via calib.get_pars() and uses calib.sim as the base if sim is not provided. | None |
| calib_pars | dict / list[dict] / DataFrame | Parameter source, one of: a dict (single parameter set → 1 sim × seeds_per_par); a list[dict] (N parameter sets → N sims); a DataFrame (rows are parameter sets, like calib.df). | None |
| sim | Sim | Base (uninitialized) simulation to copy. If None, uses calib.sim. | None |
| n_parsets | int | Number of top parameter sets to use. None = all. | None |
| check_fn | callable | Post-run filter: check_fn(sim) → bool. Sims returning False are dropped. If None and calib is provided, uses calib.check_fn. | None |
| seeds_per_par | int | Random seeds per parameter set. When > 1, each par set is run with multiple seeds and only the first surviving seed (per check_fn) is kept. | 1 |
| **kwargs | | Passed to ss.parallel(). | {} |
Returns
| Name | Type | Description |
|---|---|---|
| | ss.MultiSim | A MultiSim containing the completed simulations. |
Examples::
# From a Calibration object
msim = sti.make_calib_sims(calib=calib, n_parsets=200)
# From a saved parameters DataFrame
pars_df = sc.loadobj('results/pars.df')
msim = sti.make_calib_sims(calib_pars=pars_df, sim=make_sim(), n_parsets=50)
# Single parameter set with multiple seeds
msim = sti.make_calib_sims(
    calib_pars={'hiv.beta_m2f': 0.05, 'syph.beta_m2f': 0.2},
    sim=make_sim(), seeds_per_par=5,
)
# Different scenario with calibrated parameters
msim = sti.make_calib_sims(
    calib_pars=pars_df, n_parsets=10, seeds_per_par=5,
    sim=make_sim(scenario='intervention'),
    check_fn=lambda s: float(np.sum(s.results.syph.new_infections[-60:])) > 0,
)
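The seeds_per_par filtering rule (keep only the first surviving seed for each parameter set) can be sketched with plain dicts in place of completed sims. The helper name and the record fields are hypothetical, and the sketch assumes runs arrive ordered by seed within each parameter set.

```python
def keep_first_survivor(runs, check_fn):
    """For each parameter set, keep only the first run that passes check_fn."""
    kept = {}
    for run in runs:  # Assumed to be ordered by seed within each parameter set
        if run['par_set'] not in kept and check_fn(run):
            kept[run['par_set']] = run
    return list(kept.values())

runs = [
    {'par_set': 0, 'seed': 0, 'recent_infections': 0},
    {'par_set': 0, 'seed': 1, 'recent_infections': 14},
    {'par_set': 0, 'seed': 2, 'recent_infections': 9},
    {'par_set': 1, 'seed': 0, 'recent_infections': 0},
    {'par_set': 1, 'seed': 1, 'recent_infections': 0},
]
survivors = keep_first_survivor(runs, lambda r: r['recent_infections'] > 0)
print(survivors)  # [{'par_set': 0, 'seed': 1, 'recent_infections': 14}]
```

In this toy input, parameter set 1 has no surviving seed and is dropped entirely.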
set_sim_pars
calibration.set_sim_pars(sim, pars)
Set calibrated parameters on a sim.
All parameters are set directly on the module objects via sim.get_module(). This works on uninitialized sims because every module stores its pars dict immediately on construction.
Supports both dot notation ('hiv.beta_m2f') and legacy underscore format ('hiv_beta_m2f'). Legacy keys are matched greedily against the sim’s module names (longest match first).
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| sim | Sim | A simulation (modules must be instances, not strings) | required |
| pars | dict | Flat parameter dict, e.g. {'hiv.beta_m2f': 0.05, ...} | required |
Returns
| Name | Type | Description |
|---|---|---|
| | Sim | The same sim, modified in place |
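Longest-match-first resolution of legacy underscore keys matters when one module name is a prefix of another (for example hiv and hiv_syph). A minimal sketch, assuming the module names are available as a list; split_legacy_key is a hypothetical helper and the real matching may handle more edge cases.

```python
def split_legacy_key(key, module_names):
    """Split 'module_par' into (module, par), trying the longest module names first."""
    for name in sorted(module_names, key=len, reverse=True):
        prefix = name + '_'
        if key.startswith(prefix):
            return name, key[len(prefix):]
    raise KeyError(f'No module matches {key!r}')

modules = ['hiv', 'hiv_syph', 'syph']
print(split_legacy_key('hiv_beta_m2f', modules))               # ('hiv', 'beta_m2f')
print(split_legacy_key('hiv_syph_rel_sus_syph_hiv', modules))  # ('hiv_syph', 'rel_sus_syph_hiv')
```

Trying hiv first on the second key would wrongly yield the parameter name syph_rel_sus_syph_hiv on the hiv module, which is why the longer name is checked first.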