calibration

Define the calibration class

Classes

Name	Description
Calibration	Customized STIsim calibration class.

Calibration

calibration.Calibration(
    sim,
    calib_pars,
    data=None,
    weights=None,
    extra_results=None,
    save_results=False,
    check_fn=None,
    **kwargs,
)

Customized STIsim calibration class.

Inherits all the functionality of the Starsim calibration class, but adds:

- A default build function that routes parameters using dot notation (e.g. 'hiv.beta_m2f')
- A default evaluation function that uses the data provided in the constructor
- A `get_pars` method for extracting calibrated parameter sets

If no `build_fn` is provided, the class uses `default_build_fn`, which looks up modules via `sim.get_module()` and sets their parameters.

Parameters

Name	Type	Description	Default
sim	Sim	The simulation to calibrate	required
calib_pars	dict	Parameters to calibrate using dot notation, e.g. `{'hiv.beta_m2f': dict(low=0.01, high=0.1, guess=0.05)}`	required
data	`DataFrame`	Calibration targets: a 'time' column plus one column per result	`None`
weights	dict	Optional weight multipliers per result	`None`
extra_results	list	Additional results to track beyond data columns	`None`
save_results	bool	Save sim results for each trial	`False`

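Examples:

import pandas as pd
import stisim as sti

sim = make_sim()  # make_sim() is a user-defined function that builds the sim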
data = pd.read_csv('calibration_data.csv')
calib_pars = {
    'hiv.beta_m2f': dict(low=0.01, high=0.1, guess=0.05),
    'structuredsexual.prop_f0': dict(low=0.5, high=0.9, guess=0.8),
}
calib = sti.Calibration(sim=sim, calib_pars=calib_pars, data=data, total_trials=100)
calib.calibrate()

# Extract best parameters and run multi-sim
par_sets = calib.get_pars(n=200)
msim = sti.make_calib_sims(calib=calib, n_parsets=200)

Methods

Name	Description
calibrate	Perform calibration with crash-recovery support.
get_pars	Extract top-N calibrated parameter sets as a list of flat dicts.
load_results	Load the results from the tmp files, tracking which loaded successfully
parse_study	Parse the study into a data frame – called automatically
run_trial	Define the objective for Optuna
save	Save calibration results.
shrink	Shrink the results to only the best fit
worker	Run a single worker, catching exceptions so one crash doesn’t kill all workers

calibrate

calibration.Calibration.calibrate(calib_pars=None, **kwargs)

Perform calibration with crash-recovery support.

If continue_db=True and the database already has completed trials, only the remaining trials will be run. This allows recovery from crashes by simply re-running the same command.
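
The crash-recovery arithmetic can be pictured with a small sketch. This is not the STIsim implementation: `remaining_trials` is a hypothetical helper, and it assumes that recovery simply subtracts the trials already completed in the database from the requested total.

```python
def remaining_trials(total_trials, n_completed, continue_db=True):
    """Hypothetical helper: how many trials still need to run."""
    if not continue_db:
        # Assumption: without continue_db the full budget is run from scratch
        return total_trials
    # Never return a negative count if the database holds extra trials
    return max(0, total_trials - n_completed)

print(remaining_trials(100, 37))   # 63 trials left after a crash at trial 37
print(remaining_trials(100, 100))  # 0, nothing left to do
```

Under this assumption, re-running the same command after a crash only pays for the trials that had not finished.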

get_pars

calibration.Calibration.get_pars(n=None)

Extract top-N calibrated parameter sets as a list of flat dicts.

Each dict maps 'module.par' keys to scalar values (metadata columns like index, mismatch, and rand_seed are stripped). The returned dicts can be passed directly to `set_sim_pars` or `make_calib_sims`.

Parameters

Name	Type	Description	Default
n	int	Number of top parameter sets. `None` returns all.	`None`

Returns

Name	Type	Description
	list[dict]	Parameter sets sorted by mismatch (best first).
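
The selection logic described above can be sketched on plain row dicts. This is an illustration only: `top_par_sets` and `META_KEYS` are hypothetical names, the real method works on `calib.df`, and the sketch assumes that a lower mismatch means a better fit.

```python
# Assumption: the metadata columns are exactly the three named in the text
META_KEYS = {'index', 'mismatch', 'rand_seed'}

def top_par_sets(rows, n=None):
    """Hypothetical stand-in for get_pars, operating on a list of row dicts."""
    ranked = sorted(rows, key=lambda row: row['mismatch'])  # lowest mismatch first
    if n is not None:
        ranked = ranked[:n]
    # Strip metadata so only 'module.par' keys remain
    return [{k: v for k, v in row.items() if k not in META_KEYS} for row in ranked]

rows = [
    {'index': 0, 'mismatch': 2.5, 'rand_seed': 11, 'hiv.beta_m2f': 0.08},
    {'index': 1, 'mismatch': 0.7, 'rand_seed': 12, 'hiv.beta_m2f': 0.05},
    {'index': 2, 'mismatch': 1.3, 'rand_seed': 13, 'hiv.beta_m2f': 0.03},
]
best = top_par_sets(rows, n=2)
print(best)  # [{'hiv.beta_m2f': 0.05}, {'hiv.beta_m2f': 0.03}]
```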

load_results

calibration.Calibration.load_results(study)

Load the results from the tmp files, tracking which loaded successfully

parse_study

calibration.Calibration.parse_study(study)

Parse the study into a data frame – called automatically

run_trial

calibration.Calibration.run_trial(trial)

Define the objective for Optuna

save

calibration.Calibration.save(
    filename,
    shrink=True,
    n_results=None,
    save_pars=True,
    pars_filename=None,
)

Save calibration results.

Optionally shrinks to the top n_results before saving (default: keep top 10% of completed trials). Also saves the parameter DataFrame as a separate file.

Parameters

Name	Type	Description	Default
filename	str / `Path`	Path for the saved calibration object.	required
shrink	bool	If True (default), shrink before saving. The full (unshrunk) object is saved with a `_full` suffix first.	`True`
n_results	int	Number of top results to keep when shrinking. Default: `len(self.df) // 10` (minimum 1).	`None`
save_pars	bool	If True (default), also save `calib.df`.	`True`
pars_filename	str / `Path`	Path for the parameter DataFrame. If None, uses `{stem}_pars.df` next to filename.	`None`

Returns

Name	Type	Description
	Calibration	The (possibly shrunk) calibration object.
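
The two defaults in the table above (`n_results` and `pars_filename`) can be written out as a short sketch. The helper names and the `.obj` extension are hypothetical; only the `len(self.df) // 10` (minimum 1) rule and the `{stem}_pars.df` pattern come from the description.

```python
from pathlib import Path

def default_n_results(n_trials):
    """Top 10% of completed trials, but never fewer than one."""
    return max(1, n_trials // 10)

def default_pars_path(filename):
    """Build '{stem}_pars.df' in the same folder as the calibration file."""
    path = Path(filename)
    return path.with_name(f'{path.stem}_pars.df')

print(default_n_results(250))  # 25
print(default_n_results(7))    # 1, because 7 // 10 is 0
print(default_pars_path('results/hiv_calib.obj').name)  # hiv_calib_pars.df
```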

shrink

calibration.Calibration.shrink(n_results=100, make_df=True)

Shrink the results to only the best fit

worker

calibration.Calibration.worker()

Run a single worker, catching exceptions so one crash doesn’t kill all workers

Functions

Name	Description
compute_gof	Calculate the goodness of fit. By default use normalized absolute error, but highly customizable.
default_build_fn	Default build function for STIsim calibration.
flatten_calib_pars	Normalize calibration parameters to flat dot-notation format.
make_calib_sims	Create and run simulations using calibrated parameters.
set_sim_pars	Set calibrated parameters on a sim.

compute_gof

calibration.compute_gof(
    actual,
    predicted,
    normalize=True,
    use_frac=False,
    use_squared=False,
    as_scalar='none',
    eps=1e-09,
    skestimator=None,
    estimator=None,
    **kwargs,
)

Calculate the goodness of fit. The default is normalized absolute error, but the function is highly customizable. For example, mean squared error is equivalent to setting normalize=False, use_squared=True, as_scalar='mean'.

Parameters

Name	Type	Description	Default
actual	`arr`	array of actual (data) points	required
predicted	`arr`	corresponding array of predicted (model) points	required
normalize	bool	whether to divide the values by the largest value in either series	`True`
use_frac	bool	convert to fractional mismatches rather than absolute	`False`
use_squared	bool	square the mismatches	`False`
as_scalar	str	return a scalar instead of a time series; choices are 'sum', 'mean', 'median'	`'none'`
eps	float	to avoid divide-by-zero	`1e-09`
skestimator	str	if provided, use this scikit-learn estimator instead	`None`
estimator	`func`	if provided, use this custom estimator instead	`None`
kwargs	dict	passed to the scikit-learn or custom estimator	`{}`

Returns

Name	Type	Description
gofs	`arr`	array of goodness-of-fit values, or a single value if as_scalar is 'sum', 'mean', or 'median'

Examples:

import numpy as np

x1 = np.cumsum(np.random.random(100))
x2 = np.cumsum(np.random.random(100))

e1 = compute_gof(x1, x2) # Default, normalized absolute error
e2 = compute_gof(x1, x2, normalize=False, use_frac=True) # Fractional error
e3 = compute_gof(x1, x2, normalize=False, use_squared=True, as_scalar='mean') # Mean squared error
e4 = compute_gof(x1, x2, skestimator='mean_squared_error') # Scikit-learn's MSE method
e5 = compute_gof(x1, x2, as_scalar='median') # Normalized median absolute error -- highly robust
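
To make the default and the mean-squared-error equivalence concrete, here is a simplified, dependency-free sketch. `gof_sketch` is a hypothetical stand-in, not the library function: it omits `use_frac`, `eps` and the estimator options, and it assumes normalization divides by the largest absolute value found in either series, as the parameter table states.

```python
import statistics

def gof_sketch(actual, predicted, normalize=True, use_squared=False, as_scalar='none'):
    """Simplified stand-in for compute_gof using plain Python lists."""
    gofs = [abs(a - p) for a, p in zip(actual, predicted)]
    if normalize:
        # Assumption: scale by the largest absolute value in either series
        scale = max(abs(v) for v in list(actual) + list(predicted))
        if scale > 0:
            gofs = [g / scale for g in gofs]
    if use_squared:
        gofs = [g ** 2 for g in gofs]
    if as_scalar == 'sum':
        return sum(gofs)
    if as_scalar == 'mean':
        return sum(gofs) / len(gofs)
    if as_scalar == 'median':
        return statistics.median(gofs)
    return gofs  # 'none': keep the full series

actual = [1.0, 2.0, 4.0]
predicted = [1.0, 3.0, 2.0]
default_gof = gof_sketch(actual, predicted)
print(default_gof)  # [0.0, 0.25, 0.5]

# Mean squared error: (0**2 + 1**2 + 2**2) / 3
mse = gof_sketch(actual, predicted, normalize=False, use_squared=True, as_scalar='mean')
print(mse)
```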

default_build_fn

calibration.default_build_fn(sim, calib_pars, **kwargs)

Default build function for STIsim calibration.

Routes calibration parameters to the correct sim module using dot notation:

- `'hiv.beta_m2f'` → `sim.get_module('hiv').pars['beta_m2f']`
- `'structuredsexual.prop_f0'` → `sim.get_module('structuredsexual').pars['prop_f0']`
- `'hiv_syph.rel_sus_syph_hiv'` → `sim.get_module('hiv_syph').pars['rel_sus_syph_hiv']`
- `'symp_algo.rel_test'` → `sim.get_module('symp_algo').pars['rel_test']`

All parameters are set on the uninitialized sim before sim.init() is called. This works because every module stores its pars dict immediately on construction.

rand_seed is handled specially: set_sim_pars skips it (it is in _META_KEYS), so it is applied directly to sim.pars here. This ensures that reseed=True in `Calibration` actually uses a different seed per trial, and that the stored rand_seed column in calib.df can be used to reproduce each trial exactly.

Parameters

Name	Type	Description	Default
sim	Sim	An uninitialized simulation (modules must be instances, not strings)	required
calib_pars	dict	Calibration parameters with values set by the sampler	required

Returns

Name	Type	Description
	Sim	The initialized and modified simulation

Example:

calib_pars = {
    'hiv.beta_m2f':               dict(low=0.01, high=0.10, guess=0.035, value=0.05),
    'structuredsexual.f1_conc':    dict(low=0.005, high=0.3, guess=0.16, value=0.10),
    'hiv_syph.rel_sus_syph_hiv':   dict(low=1.0, high=4.0, guess=2.5, value=3.0),
    'symp_algo.rel_test':          dict(low=0.5, high=1.5, guess=1.0, value=1.2),
}
sim = default_build_fn(sim, calib_pars)
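
The routing rule can be illustrated without STIsim installed. In the sketch below, `FakeModule`, `FakeSim` and `route_pars` are hypothetical stand-ins that mimic only `get_module()` and the `pars` dicts; the assumption that `rand_seed` arrives as a bare integer is also ours, not the source's.

```python
class FakeModule:
    """Hypothetical stand-in for a sim module that stores a pars dict."""
    def __init__(self, name, **pars):
        self.name = name
        self.pars = dict(pars)

class FakeSim:
    """Hypothetical stand-in for a sim; only get_module() and pars are mimicked."""
    def __init__(self, *modules):
        self.modules = {m.name: m for m in modules}
        self.pars = {'rand_seed': 0}

    def get_module(self, name):
        return self.modules[name]

def route_pars(sim, calib_pars):
    """Route 'module.par' keys to module pars; apply rand_seed to the sim itself."""
    for key, spec in calib_pars.items():
        if key == 'rand_seed':
            sim.pars['rand_seed'] = spec  # assumption: passed as a bare integer
            continue
        module_name, par_name = key.split('.', 1)
        sim.get_module(module_name).pars[par_name] = spec['value']
    return sim

demo_sim = FakeSim(FakeModule('hiv', beta_m2f=0.035), FakeModule('symp_algo', rel_test=1.0))
route_pars(demo_sim, {
    'hiv.beta_m2f': dict(low=0.01, high=0.10, guess=0.035, value=0.05),
    'symp_algo.rel_test': dict(low=0.5, high=1.5, guess=1.0, value=1.2),
    'rand_seed': 42,
})
print(demo_sim.get_module('hiv').pars)  # {'beta_m2f': 0.05}
print(demo_sim.pars)                    # {'rand_seed': 42}
```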

flatten_calib_pars

calibration.flatten_calib_pars(calib_pars)

Normalize calibration parameters to flat dot-notation format.

Accepts two formats:

Nested (grouped by module):

dict(hiv=dict(beta_m2f=dict(low=0.01, high=0.10)))

Flat (dot notation, returned unchanged):

{'hiv.beta_m2f': dict(low=0.01, high=0.10)}

Nested format is detected when a value is a dict whose keys don’t overlap with spec keys (low, high, guess, etc.).

Returns

Name	Type	Description
	dict	Flat dict with `'module.par'` keys.
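
The detection rule lends itself to a short sketch. `flatten_sketch` and `SPEC_KEYS` are hypothetical; the source lists the spec keys only as "low, high, guess, etc.", so the exact set used here is an assumption.

```python
# Assumption: the full set of spec keys; the source names low, high and guess "etc."
SPEC_KEYS = {'low', 'high', 'guess', 'value'}

def flatten_sketch(calib_pars):
    """Convert nested {module: {par: spec}} input to flat {'module.par': spec}."""
    flat = {}
    for key, val in calib_pars.items():
        # Nested if the value is a dict that shares no keys with a parameter spec
        is_nested = isinstance(val, dict) and not (set(val) & SPEC_KEYS)
        if is_nested:
            for par_name, spec in val.items():
                flat[f'{key}.{par_name}'] = spec
        else:
            flat[key] = val  # already flat: keep as-is
    return flat

nested = dict(hiv=dict(beta_m2f=dict(low=0.01, high=0.10)))
flat = {'hiv.beta_m2f': dict(low=0.01, high=0.10)}
print(flatten_sketch(nested) == flat)  # True
print(flatten_sketch(flat) == flat)    # True
```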

make_calib_sims

calibration.make_calib_sims(
    calib=None,
    calib_pars=None,
    sim=None,
    n_parsets=None,
    check_fn=None,
    seeds_per_par=1,
    **kwargs,
)

Create and run simulations using calibrated parameters.

Provide either a `Calibration` object or a set of calibration parameters directly. The function creates one sim per parameter set (with optional seed replication), runs them in parallel, and optionally filters the results.

Parameters

Name	Type	Description	Default
calib	Calibration	A completed calibration. Extracts pars via `calib.get_pars()` and uses `calib.sim` as the base if sim is not provided.	`None`
calib_pars	dict / list[dict] / `DataFrame`	Parameter source, one of: a dict (single parameter set → 1 sim × seeds_per_par); a list[dict] (N parameter sets → N sims); a DataFrame (rows are parameter sets, like `calib.df`)	`None`
sim	Sim	Base (uninitialized) simulation to copy. If `None`, uses `calib.sim`.	`None`
n_parsets	int	Number of top parameter sets to use. `None` = all.	`None`
check_fn	callable	Post-run filter — `check_fn(sim) → bool`. Sims returning `False` are dropped. If `None` and calib is provided, uses `calib.check_fn`.	`None`
seeds_per_par	int	Random seeds per parameter set. When > 1, each par set is run with multiple seeds and only the first surviving seed (per `check_fn`) is kept.	`1`
**kwargs		Passed to `ss.parallel()`.	`{}`

Returns

Name	Type	Description
	ss.MultiSim	A MultiSim containing the completed simulations.

Examples:

import numpy as np
import sciris as sc
import stisim as sti

# From a Calibration object
msim = sti.make_calib_sims(calib=calib, n_parsets=200)

# From a saved parameters DataFrame
pars_df = sc.loadobj('results/pars.df')
msim = sti.make_calib_sims(calib_pars=pars_df, sim=make_sim(), n_parsets=50)

# Single parameter set with multiple seeds
msim = sti.make_calib_sims(
    calib_pars={'hiv.beta_m2f': 0.05, 'syph.beta_m2f': 0.2},
    sim=make_sim(), seeds_per_par=5,
)

# Different scenario with calibrated parameters
msim = sti.make_calib_sims(
    calib_pars=pars_df, n_parsets=10, seeds_per_par=5,
    sim=make_sim(scenario='intervention'),
    check_fn=lambda s: float(np.sum(s.results.syph.new_infections[-60:])) > 0,
)
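
The "first surviving seed" rule for `seeds_per_par > 1` can be sketched with toy stand-ins. Everything below is hypothetical (`first_surviving`, `toy_run`, and the made-up die-out rule). The sketch also runs sequentially and stops early, whereas the description says the real sims are run in parallel and filtered afterwards.

```python
def first_surviving(par_sets, seeds_per_par, run_fn, check_fn):
    """Keep, for each parameter set, the first seed whose result passes check_fn."""
    kept = []
    for pars in par_sets:
        for seed in range(seeds_per_par):
            result = run_fn(pars, seed)
            if check_fn(result):
                kept.append(result)
                break  # later seeds for this parameter set are ignored
    return kept

def toy_run(pars, seed):
    # Made-up rule: the epidemic dies out for seeds below pars['min_seed']
    infections = 0 if seed < pars['min_seed'] else 10
    return {'name': pars['name'], 'seed': seed, 'infections': infections}

par_sets = [
    {'name': 'A', 'min_seed': 0},  # survives on the first seed
    {'name': 'B', 'min_seed': 2},  # survives only from seed 2 onward
    {'name': 'C', 'min_seed': 9},  # never survives within 3 seeds
]
kept = first_surviving(par_sets, seeds_per_par=3, run_fn=toy_run,
                       check_fn=lambda r: r['infections'] > 0)
print([(r['name'], r['seed']) for r in kept])  # [('A', 0), ('B', 2)]
```

In this toy run, parameter set C contributes no sim at all, so the result can hold fewer sims than there are parameter sets.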

set_sim_pars

calibration.set_sim_pars(sim, pars)

Set calibrated parameters on a sim.

All parameters are set directly on the module objects via sim.get_module(). This works on uninitialized sims because every module stores its pars dict immediately on construction.

Supports both dot notation ('hiv.beta_m2f') and legacy underscore format ('hiv_beta_m2f'). Legacy keys are matched greedily against the sim’s module names (longest match first).

Parameters

Name	Type	Description	Default
sim	Sim	A simulation (modules must be instances, not strings)	required
pars	dict	Flat parameter dict, e.g. `{'hiv.beta_m2f': 0.05, ...}`	required

Returns

Name	Type	Description
	Sim	The same sim, modified in place
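
The key-matching rule for legacy underscore keys can be sketched as follows. `split_key` is a hypothetical helper, and raising `KeyError` for an unmatched key is an assumption; the source does not say what happens in that case.

```python
def split_key(key, module_names):
    """Return (module, par) for dot-notation or legacy underscore keys."""
    if '.' in key:
        module, par = key.split('.', 1)
        return module, par
    # Legacy format: try the longest module names first, so that
    # 'hiv_syph' wins over 'hiv' for 'hiv_syph_rel_sus_syph_hiv'
    for module in sorted(module_names, key=len, reverse=True):
        prefix = module + '_'
        if key.startswith(prefix):
            return module, key[len(prefix):]
    raise KeyError(f'No module matches {key!r}')  # assumption: unmatched keys are an error

modules = ['hiv', 'syph', 'hiv_syph', 'structuredsexual']
print(split_key('hiv.beta_m2f', modules))               # ('hiv', 'beta_m2f')
print(split_key('hiv_beta_m2f', modules))               # ('hiv', 'beta_m2f')
print(split_key('hiv_syph_rel_sus_syph_hiv', modules))  # ('hiv_syph', 'rel_sus_syph_hiv')
```

Trying the longest names first matters because a shorter module name can be a prefix of a longer one.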