calibration

Define the calibration class

Classes

Name Description
Calibration Customized STIsim calibration class.

Calibration

calibration.Calibration(
    sim,
    calib_pars,
    data=None,
    weights=None,
    extra_results=None,
    save_results=False,
    check_fn=None,
    **kwargs,
)

Customized STIsim calibration class.

Inherits all the functionality of the Starsim calibration class, but adds:

  • A default build function that routes parameters using dot notation (e.g. 'hiv.beta_m2f')
  • A default evaluation function that uses the data provided in the constructor
  • A get_pars method for extracting calibrated parameter sets

If no build_fn is provided, the class uses default_build_fn, which looks up modules via sim.get_module() and sets their parameters.

Parameters

Name Type Description Default
sim Sim The simulation to calibrate required
calib_pars dict Parameters to calibrate using dot notation, e.g. {'hiv.beta_m2f': dict(low=0.01, high=0.1, guess=0.05)} required
data DataFrame Calibration targets: a 'time' column plus one column per result None
weights dict Optional weight multipliers per result None
extra_results list Additional results to track beyond data columns None
save_results bool Save sim results for each trial False

Examples::

import pandas as pd
import stisim as sti

sim = make_sim()  # User-defined function returning the sim to calibrate
data = pd.read_csv('calibration_data.csv')
calib_pars = {
    'hiv.beta_m2f': dict(low=0.01, high=0.1, guess=0.05),
    'structuredsexual.prop_f0': dict(low=0.5, high=0.9, guess=0.8),
}
calib = sti.Calibration(sim=sim, calib_pars=calib_pars, data=data, total_trials=100)
calib.calibrate()

# Extract best parameters and run multi-sim
par_sets = calib.get_pars(n=200)
msim = sti.make_calib_sims(calib=calib, n_parsets=200)

Methods

Name Description
calibrate Perform calibration with crash-recovery support.
get_pars Extract top-N calibrated parameter sets as a list of flat dicts.
load_results Load the results from the tmp files, tracking which loaded successfully
parse_study Parse the study into a data frame – called automatically
run_trial Define the objective for Optuna
save Save calibration results.
shrink Shrink the results to only the best fit
worker Run a single worker, catching exceptions so one crash doesn’t kill all workers
calibrate
calibration.Calibration.calibrate(calib_pars=None, **kwargs)

Perform calibration with crash-recovery support.

If continue_db=True and the database already has completed trials, only the remaining trials will be run. This allows recovery from crashes by simply re-running the same command.
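
The recovery rule comes down to simple bookkeeping. A minimal sketch, assuming a hypothetical helper named remaining_trials (not part of STIsim) and assuming that only completed trials count toward the total:

```python
def remaining_trials(total_trials, n_completed):
    """Hypothetical helper: how many trials are still to run after a restart."""
    # Never return a negative count if the study already exceeds the target
    return max(0, total_trials - n_completed)

print(remaining_trials(100, 37))   # 63 trials left after a crash at trial 37
print(remaining_trials(100, 100))  # 0: nothing left to do
```

Re-running the same command after a crash would therefore launch only the missing trials rather than starting over.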

get_pars
calibration.Calibration.get_pars(n=None)

Extract top-N calibrated parameter sets as a list of flat dicts.

Each dict maps 'module.par' keys to scalar values (metadata columns like index, mismatch, and rand_seed are stripped). The returned dicts can be passed directly to set_sim_pars or make_calib_sims.

Parameters
Name Type Description Default
n int Number of top parameter sets. None returns all. None
Returns
Name Type Description
list[dict] Parameter sets sorted by mismatch (best first).
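
The sort-and-strip behaviour described for get_pars can be illustrated without STIsim. This is a sketch only: top_par_sets and META_KEYS are hypothetical names, each trial is represented as a plain dict, and the metadata keys are the three named in the description.

```python
# Assumed metadata keys, taken from the description above
META_KEYS = {'index', 'mismatch', 'rand_seed'}

def top_par_sets(rows, n=None):
    """Hypothetical stand-in for get_pars(): rows is a list of dicts, one per trial."""
    ranked = sorted(rows, key=lambda row: row['mismatch'])  # Best (lowest) first
    if n is not None:
        ranked = ranked[:n]
    # Drop metadata so only 'module.par' keys remain
    return [{k: v for k, v in row.items() if k not in META_KEYS} for row in ranked]

rows = [
    {'index': 0, 'mismatch': 2.5, 'rand_seed': 11, 'hiv.beta_m2f': 0.08},
    {'index': 1, 'mismatch': 0.7, 'rand_seed': 12, 'hiv.beta_m2f': 0.04},
    {'index': 2, 'mismatch': 1.3, 'rand_seed': 13, 'hiv.beta_m2f': 0.06},
]
print(top_par_sets(rows, n=2))  # [{'hiv.beta_m2f': 0.04}, {'hiv.beta_m2f': 0.06}]
```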
load_results
calibration.Calibration.load_results(study)

Load the results from the tmp files, tracking which loaded successfully

parse_study
calibration.Calibration.parse_study(study)

Parse the study into a data frame – called automatically

run_trial
calibration.Calibration.run_trial(trial)

Define the objective for Optuna

save
calibration.Calibration.save(
    filename,
    shrink=True,
    n_results=None,
    save_pars=True,
    pars_filename=None,
)

Save calibration results.

Optionally shrinks to the top n_results before saving (default: keep top 10% of completed trials). Also saves the parameter DataFrame as a separate file.

Parameters
Name Type Description Default
filename str / Path Path for the saved calibration object. required
shrink bool If True (default), shrink before saving. The full (unshrunk) object is saved with a _full suffix first. True
n_results int Number of top results to keep when shrinking. Default: len(self.df) // 10 (minimum 1). None
save_pars bool If True (default), also save calib.df. True
pars_filename str / Path Path for the parameter DataFrame. If None, uses {stem}_pars.df next to filename. None
Returns
Name Type Description
The (possibly shrunk) calibration object.
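
The two defaults in the table above (n_results and pars_filename) can be sketched as follows. The helper names and the 'results/calib.obj' path are hypothetical and serve only to illustrate the documented rules, not the actual implementation.

```python
from pathlib import Path

def default_n_results(n_trials):
    """Top 10% of completed trials, but never fewer than one."""
    return max(1, n_trials // 10)

def default_pars_filename(filename):
    """Place '{stem}_pars.df' next to the calibration file."""
    path = Path(filename)
    return path.with_name(f'{path.stem}_pars.df')

print(default_n_results(250))  # 25
print(default_n_results(4))    # 1
print(default_pars_filename('results/calib.obj').name)  # calib_pars.df
```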
shrink
calibration.Calibration.shrink(n_results=100, make_df=True)

Shrink the results to only the best fit

worker
calibration.Calibration.worker()

Run a single worker, catching exceptions so one crash doesn’t kill all workers

Functions

Name Description
compute_gof Calculate the goodness of fit. By default use normalized absolute error, but highly customizable.
default_build_fn Default build function for STIsim calibration.
flatten_calib_pars Normalize calibration parameters to flat dot-notation format.
make_calib_sims Create and run simulations using calibrated parameters.
set_sim_pars Set calibrated parameters on a sim.

compute_gof

calibration.compute_gof(
    actual,
    predicted,
    normalize=True,
    use_frac=False,
    use_squared=False,
    as_scalar='none',
    eps=1e-09,
    skestimator=None,
    estimator=None,
    **kwargs,
)

Calculate the goodness of fit. By default this uses normalized absolute error, but it is highly customizable. For example, mean squared error is equivalent to setting normalize=False, use_squared=True, as_scalar='mean'.

Parameters

Name Type Description Default
actual arr array of actual (data) points required
predicted arr corresponding array of predicted (model) points required
normalize bool whether to divide the values by the largest value in either series True
use_frac bool convert to fractional mismatches rather than absolute False
use_squared bool square the mismatches False
as_scalar str return as a scalar instead of a time series: choices are sum, mean, median 'none'
eps float to avoid divide-by-zero 1e-09
skestimator str if provided, use this scikit-learn estimator instead None
estimator func if provided, use this custom estimator instead None
kwargs dict passed to the scikit-learn or custom estimator {}

Returns

Name Type Description
gofs arr array of goodness-of-fit values, or a single value if as_scalar is 'sum', 'mean', or 'median'

Examples::

import numpy as np

x1 = np.cumsum(np.random.random(100))
x2 = np.cumsum(np.random.random(100))

e1 = compute_gof(x1, x2) # Default, normalized absolute error
e2 = compute_gof(x1, x2, normalize=False, use_frac=True) # Fractional error
e3 = compute_gof(x1, x2, normalize=False, use_squared=True, as_scalar='mean') # Mean squared error
e4 = compute_gof(x1, x2, skestimator='mean_squared_error') # Scikit-learn's MSE method
e5 = compute_gof(x1, x2, as_scalar='median') # Normalized median absolute error -- highly robust
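
To make the two most common settings concrete, the sketch below reimplements them in plain Python. It follows the parameter descriptions above (normalization divides by the largest value in either series) and leaves out the eps guard; the function names are hypothetical and the real compute_gof may differ in detail.

```python
def normalized_abs_error(actual, predicted):
    """Absolute mismatch divided by the largest value in either series."""
    scale = max(max(actual), max(predicted))
    return [abs(a - p) / scale for a, p in zip(actual, predicted)]

def mean_squared_error(actual, predicted):
    """Counterpart of normalize=False, use_squared=True, as_scalar='mean'."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

actual    = [1.0, 2.0, 4.0]
predicted = [1.0, 3.0, 2.0]
print(normalized_abs_error(actual, predicted))  # [0.0, 0.25, 0.5]
print(mean_squared_error(actual, predicted))    # (0 + 1 + 4) / 3, about 1.67
```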

default_build_fn

calibration.default_build_fn(sim, calib_pars, **kwargs)

Default build function for STIsim calibration.

Routes calibration parameters to the correct sim module using dot notation:

  • 'hiv.beta_m2f' → sim.get_module('hiv').pars['beta_m2f']
  • 'structuredsexual.prop_f0' → sim.get_module('structuredsexual').pars['prop_f0']
  • 'hiv_syph.rel_sus_syph_hiv' → sim.get_module('hiv_syph').pars['rel_sus_syph_hiv']
  • 'symp_algo.rel_test' → sim.get_module('symp_algo').pars['rel_test']

All parameters are set on the uninitialized sim before sim.init() is called. This works because every module stores its pars dict immediately on construction.

rand_seed is handled specially: set_sim_pars skips it (it is in _META_KEYS), so it is applied directly to sim.pars here. This ensures that reseed=True in Calibration actually uses a different seed per trial, and that the rand_seed column stored in calib.df can be used to reproduce each trial exactly.

Parameters

Name Type Description Default
sim Sim An uninitialized simulation (modules must be instances, not strings) required
calib_pars dict Calibration parameters with values set by the sampler required

Returns

Name Type Description
Sim The initialized and modified simulation

Example::

calib_pars = {
    'hiv.beta_m2f':               dict(low=0.01, high=0.10, guess=0.035, value=0.05),
    'structuredsexual.f1_conc':    dict(low=0.005, high=0.3, guess=0.16, value=0.10),
    'hiv_syph.rel_sus_syph_hiv':   dict(low=1.0, high=4.0, guess=2.5, value=3.0),
    'symp_algo.rel_test':          dict(low=0.5, high=1.5, guess=1.0, value=1.2),
}
sim = default_build_fn(sim, calib_pars)
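
The routing rule can be illustrated with toy objects. Everything in this sketch is hypothetical: Module and ToySim are stand-ins that model only get_module() and a pars dict, route_pars is not the library function, and passing rand_seed as a plain integer is an assumption about how the seed arrives.

```python
class Module:
    """Hypothetical stand-in for a sim module that stores a pars dict."""
    def __init__(self, name, **pars):
        self.name = name
        self.pars = dict(pars)

class ToySim:
    """Hypothetical stand-in for a sim; only get_module() and pars are modelled."""
    def __init__(self, modules):
        self._modules = {m.name: m for m in modules}
        self.pars = {'rand_seed': 0}

    def get_module(self, name):
        return self._modules[name]

def route_pars(sim, calib_pars):
    """Apply sampled values to the matching module, treating rand_seed specially."""
    for key, spec in calib_pars.items():
        if key == 'rand_seed':
            sim.pars['rand_seed'] = spec  # Sim-level setting, not a module parameter
            continue
        module_name, par_name = key.split('.', 1)
        sim.get_module(module_name).pars[par_name] = spec['value']
    return sim

toy = ToySim([Module('hiv', beta_m2f=0.035), Module('symp_algo', rel_test=1.0)])
toy_pars = {
    'hiv.beta_m2f': dict(low=0.01, high=0.10, guess=0.035, value=0.05),
    'symp_algo.rel_test': dict(low=0.5, high=1.5, guess=1.0, value=1.2),
    'rand_seed': 42,  # Assumed to arrive as a plain integer
}
route_pars(toy, toy_pars)
print(toy.get_module('hiv').pars['beta_m2f'])  # 0.05
print(toy.pars['rand_seed'])                   # 42
```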

flatten_calib_pars

calibration.flatten_calib_pars(calib_pars)

Normalize calibration parameters to flat dot-notation format.

Accepts two formats:

Nested (grouped by module)::

dict(hiv=dict(beta_m2f=dict(low=0.01, high=0.10)))

Flat (dot notation — returned unchanged)::

{'hiv.beta_m2f': dict(low=0.01, high=0.10)}

Nested format is detected when a value is a dict whose keys don’t overlap with spec keys (low, high, guess, etc.).

Returns

Name Type Description
dict Flat dict with 'module.par' keys.
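
The detection rule can be sketched in a few lines. This is an illustrative reimplementation, not the library code: the function name flatten is hypothetical, and SPEC_KEYS is an assumed set, since the source names only low, high, and guess and leaves the rest open.

```python
# Assumed spec keys; the source lists low, high and guess and leaves the rest open
SPEC_KEYS = {'low', 'high', 'guess', 'value'}

def flatten(calib_pars):
    """Hypothetical re-implementation: convert nested specs to 'module.par' keys."""
    flat = {}
    for key, val in calib_pars.items():
        # Nested if the value is a dict that shares no keys with a parameter spec
        is_nested = isinstance(val, dict) and not (set(val) & SPEC_KEYS)
        if is_nested:
            for par_name, spec in val.items():
                flat[f'{key}.{par_name}'] = spec
        else:
            flat[key] = val  # Already flat
    return flat

nested = dict(hiv=dict(beta_m2f=dict(low=0.01, high=0.10)))
already_flat = {'hiv.beta_m2f': dict(low=0.01, high=0.10)}
print(flatten(nested))                  # {'hiv.beta_m2f': {'low': 0.01, 'high': 0.1}}
print(flatten(nested) == already_flat)  # True
```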

make_calib_sims

calibration.make_calib_sims(
    calib=None,
    calib_pars=None,
    sim=None,
    n_parsets=None,
    check_fn=None,
    seeds_per_par=1,
    **kwargs,
)

Create and run simulations using calibrated parameters.

Provide either a Calibration object or a set of calibration parameters directly. The function creates one sim per parameter set (with optional seed replication), runs them in parallel, and optionally filters the results.

Parameters

Name Type Description Default
calib Calibration A completed calibration. Extracts pars via calib.get_pars() and uses calib.sim as the base if sim is not provided. None
calib_pars dict / list[dict] / DataFrame Parameter source, one of: a dict (single parameter set → 1 sim × seeds_per_par); a list[dict] (N parameter sets → N sims); or a DataFrame whose rows are parameter sets (like calib.df). None
sim Sim Base (uninitialized) simulation to copy. If None, uses calib.sim. None
n_parsets int Number of top parameter sets to use. None = all. None
check_fn callable Post-run filter — check_fn(sim) → bool. Sims returning False are dropped. If None and calib is provided, uses calib.check_fn. None
seeds_per_par int Random seeds per parameter set. When > 1, each par set is run with multiple seeds and only the first surviving seed (per check_fn) is kept. 1
**kwargs Passed to ss.parallel(). {}

Returns

Name Type Description
ss.MultiSim A MultiSim containing the completed simulations.

Examples::

# From a Calibration object
msim = sti.make_calib_sims(calib=calib, n_parsets=200)

# From a saved parameters DataFrame
pars_df = sc.loadobj('results/pars.df')
msim = sti.make_calib_sims(calib_pars=pars_df, sim=make_sim(), n_parsets=50)

# Single parameter set with multiple seeds
msim = sti.make_calib_sims(
    calib_pars={'hiv.beta_m2f': 0.05, 'syph.beta_m2f': 0.2},
    sim=make_sim(), seeds_per_par=5,
)

# Different scenario with calibrated parameters
msim = sti.make_calib_sims(
    calib_pars=pars_df, n_parsets=10, seeds_per_par=5,
    sim=make_sim(scenario='intervention'),
    check_fn=lambda s: float(np.sum(s.results.syph.new_infections[-60:])) > 0,
)
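
The seeds_per_par rule ("only the first surviving seed is kept") can be sketched with plain dicts standing in for completed sims. The helper first_surviving and the result layout are hypothetical, and dropping a parameter set whose seeds all fail is an assumption about the behaviour.

```python
def first_surviving(runs, check_fn):
    """Keep, for each parameter set, the first run that passes check_fn.

    runs maps a parameter-set index to its results in seed order.
    """
    kept = {}
    for par_index, results in runs.items():
        for result in results:
            if check_fn(result):
                kept[par_index] = result
                break  # Later seeds for this parameter set are ignored
    return kept

runs = {
    0: [{'seed': 0, 'new_infections': 0}, {'seed': 1, 'new_infections': 12}],
    1: [{'seed': 0, 'new_infections': 5}, {'seed': 1, 'new_infections': 9}],
    2: [{'seed': 0, 'new_infections': 0}, {'seed': 1, 'new_infections': 0}],
}
kept = first_surviving(runs, check_fn=lambda r: r['new_infections'] > 0)
print({k: v['seed'] for k, v in kept.items()})  # {0: 1, 1: 0}
```

Parameter set 2 disappears here because none of its seeds pass the check.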

set_sim_pars

calibration.set_sim_pars(sim, pars)

Set calibrated parameters on a sim.

All parameters are set directly on the module objects via sim.get_module(). This works on uninitialized sims because every module stores its pars dict immediately on construction.

Supports both dot notation ('hiv.beta_m2f') and legacy underscore format ('hiv_beta_m2f'). Legacy keys are matched greedily against the sim’s module names (longest match first).

Parameters

Name Type Description Default
sim Sim A simulation (modules must be instances, not strings) required
pars dict Flat parameter dict, e.g. {'hiv.beta_m2f': 0.05, ...} required

Returns

Name Type Description
Sim The same sim, modified in place
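
The longest-match rule for legacy keys matters when one module name is a prefix of another, such as 'hiv' and 'hiv_syph'. A minimal sketch, assuming a hypothetical helper split_key that is not part of STIsim:

```python
def split_key(key, module_names):
    """Hypothetical helper: split a parameter key into (module, parameter)."""
    if '.' in key:
        module_name, par_name = key.split('.', 1)
        return module_name, par_name
    # Legacy format: try the longest module names first so 'hiv_syph' wins over 'hiv'
    for name in sorted(module_names, key=len, reverse=True):
        prefix = name + '_'
        if key.startswith(prefix):
            return name, key[len(prefix):]
    raise KeyError(f'No module matches {key!r}')

modules = ['hiv', 'syph', 'hiv_syph']
print(split_key('hiv.beta_m2f', modules))               # ('hiv', 'beta_m2f')
print(split_key('hiv_beta_m2f', modules))               # ('hiv', 'beta_m2f')
print(split_key('hiv_syph_rel_sus_syph_hiv', modules))  # ('hiv_syph', 'rel_sus_syph_hiv')
```

Matching shortest-first would instead read the last key as module 'hiv' with parameter 'syph_rel_sus_syph_hiv', which is why dot notation is the less ambiguous format.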