
xp_launch⚓︎

Tools (notably xpList) for setup and running of experiments (known as xps).

See da_methods.da_method for the strict definition of xps.

Modules:

dapper : Root package of DAPPER.
pb : Make progbar (wrapper around tqdm) and read1.

xpList ⚓︎

Bases: list

Subclass of list specialized for experiment ("xp") objects.

Main use: administrate experiment launches.

Modifications to list:

  • xpList.append supports unique to enable lazy xp declaration.
  • __iadd__ (+=) supports adding single xps. This is hacky, but convenience is king.
  • __getitem__ supports lists, similar to np.ndarray
  • __repr__: prints the list as rows of a table, where the columns represent attributes whose value is not shared among all xps. Refer to xpList.prep_table for more information.

Add-ons:

  • xpList.launch: run the experiments in current list.
  • xpList.prep_table: find all attributes of the xps in the list; classify as distinct, redundant, or common.
  • xpList.gen_names: use xpList.prep_table to generate a short & unique name for each xp in the list.
  • xpList.tabulate_avrgs: tabulate time-averaged results.
  • xpList.inds to search by kw-attrs.
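The unique-append and += behaviours listed above can be illustrated with a minimal stand-in class (simplified sketch, not the actual DAPPER implementation):

```python
# Stand-in illustrating xpList's unique-append and `+=` conveniences
# (simplified; not the DAPPER source).
class UniqueList(list):
    def __init__(self, *args, unique=False):
        super().__init__(*args)
        self.unique = unique

    def append(self, item):
        # With `unique`, skip duplicates (the linear scan is what
        # makes this "relatively slow").
        if self.unique and item in self:
            return
        super().append(item)

    def __iadd__(self, item):
        # Convenience: `+=` appends a single item.
        self.append(item)
        return self

xps = UniqueList(unique=True)
xps += "EnKF(N=10)"
xps += "EnKF(N=10)"  # duplicate: silently skipped
xps += "Var3D()"
print(xps)  # ['EnKF(N=10)', 'Var3D()']
```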

Parameters:

args : entries
    Nothing, or a list of xps. Default: ().
unique : bool
    If True, duplicates won't get appended. Makes append (and __iadd__) relatively slow. Use extend or __add__ or combinator to bypass this validation. Default: False.

Also see
  • Examples: docs/examples/basic_2, docs/examples/basic_3
  • xp_process.xpSpace, which is used for presenting experiment results, as opposed to this class (xpList), which handles launching experiments.

da_methods ⚓︎

List da_method attributes in this list.

__getitem__(keys) ⚓︎

Indexing, also by a list of indices.

append(xp) ⚓︎

Append xp, unless self.unique is set and xp is already present.

gen_names(abbrev=6, tab=False) ⚓︎

Similar to self.__repr__(), but:

  • returns list of names
  • tabulation is optional
  • attaches (abbreviated) labels to each attribute

inds(strict=True, missingval='NONSENSE', **kws) ⚓︎

Find (all) indices of xps whose attributes match kws.

If strict, then xps lacking a requested attribute will not match, unless missingval matches the required value.
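For illustration, here is a toy version of this lookup over stand-in objects (hypothetical helper; the real method lives on xpList):

```python
from types import SimpleNamespace

def inds(xps, strict=True, missingval="NONSENSE", **kws):
    """Toy version of xpList.inds: indices of xps matching the kw-attrs."""
    def match(xp):
        for k, v in kws.items():
            # If not strict, a missing attribute counts as a match.
            attr = getattr(xp, k, missingval if strict else v)
            if attr != v:
                return False
        return True
    return [i for i, xp in enumerate(xps) if match(xp)]

xps = [SimpleNamespace(da_method="EnKF", N=10),
       SimpleNamespace(da_method="EnKF", N=40),
       SimpleNamespace(da_method="Var3D")]
print(inds(xps, da_method="EnKF"))    # [0, 1]
print(inds(xps, N=40))                # [1]   (strict: xps[2] lacks N)
print(inds(xps, strict=False, N=40))  # [1, 2]
```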

launch(HMM, save_as='noname', mp=False, fail_gently=None, **kwargs) ⚓︎

Essentially: for xp in self: run_experiment(xp, ..., **kwargs).

See run_experiment for documentation on the kwargs and fail_gently. See tools.datafiles.create_run_dir for documentation on save_as.

Depending on mp, run_experiment is delegated as follows:

  • False: caller process (no parallelisation)
  • True or "MP" or an int: multiprocessing on this host
  • "GCP" or "Google" or dict(server="GCP"): the DAPPER server (Google Cloud Computing with HTCondor).
    • Specify a list of files as mp["files"] to include them in working directory of the server workers.
    • In order to use absolute paths, the list should consist of tuples, where the first item is relative to the second (which is an absolute path). The root is then not included in the working directory of the server.
    • If this dict field is empty, then all python files in sys.path[0] are uploaded.

See docs/examples/basic_2.py and docs/examples/basic_3.py for example use.
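As a rough sketch of how an mp argument like the above could be normalised into a backend choice (hypothetical helper, for illustration only; the real dispatch is inside xpList.launch):

```python
# Hypothetical normalisation of an `mp` argument as documented above
# (illustrative only; not DAPPER's actual dispatch code).
def resolve_backend(mp):
    if mp is False or mp is None:
        return {"backend": "serial"}            # caller process
    if mp is True or mp == "MP":
        return {"backend": "multiprocessing", "n_workers": None}
    if isinstance(mp, int):
        return {"backend": "multiprocessing", "n_workers": mp}
    if mp in ("GCP", "Google"):
        mp = {"server": "GCP"}
    if isinstance(mp, dict) and mp.get("server") == "GCP":
        # Empty "files" would mean: upload all python files in sys.path[0].
        return {"backend": "GCP", "files": mp.get("files", [])}
    raise ValueError(f"Unrecognised mp: {mp!r}")

print(resolve_backend(4))  # {'backend': 'multiprocessing', 'n_workers': 4}
```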

prep_table(nomerge=()) ⚓︎

Classify all attrs. of all xps as distinct, redundant, or common.

An attribute of the xps is inserted in one of the 3 dicts as follows: The attribute names become dict keys. If the values of an attribute (collected from all of the xps) are all equal, then the attribute is inserted in common, but only with a single value. If they are all the same or missing, then it is inserted in redundant with a single value. Otherwise, it is inserted in distinct, with its full list of values (filling with None where the attribute was missing in the corresponding xp).

The attrs in distinct are sufficient to (but not generally necessary, since there might exist a subset of attributes that) uniquely identify each xp in the list (the redundant and common can be "squeezed" out). Thus, a table of the xps does not need to list all of the attributes. This function also does the heavy lifting for xp_process.xpSpace.squeeze.

Parameters:

nomerge : list
    Attributes that should always be seen as distinct. Default: ().
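The classification rule can be illustrated with plain dicts standing in for xps (simplified toy; not DAPPER's implementation, and nomerge is omitted):

```python
# Toy illustration of the distinct/redundant/common classification
# performed by xpList.prep_table (simplified; not the DAPPER source).
def classify(dicts):
    distinct, redundant, common = {}, {}, {}
    keys = {k for d in dicts for k in d}
    for k in sorted(keys):
        vals = [d.get(k) for d in dicts]        # None marks a missing attr
        present = [v for v in vals if v is not None]
        if all(v == vals[0] for v in vals):
            common[k] = vals[0]                 # equal in every xp
        elif all(v == present[0] for v in present):
            redundant[k] = present[0]           # equal where present, missing elsewhere
        else:
            distinct[k] = vals                  # varies: keep the full column
    return distinct, redundant, common

xps = [dict(da_method="EnKF", N=10, infl=1.02),
       dict(da_method="EnKF", N=40, infl=1.02),
       dict(da_method="EnKF", N=40)]
d, r, c = classify(xps)
print(d)  # {'N': [10, 40, 40]}
print(r)  # {'infl': 1.02}
print(c)  # {'da_method': 'EnKF'}
```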

combinator(param_dict, **glob_dict) ⚓︎

Mass creation of xp's by combining the value lists in the param_dict.

Returns a function (for_params) that creates all possible combinations of parameters (from their value lists) for a given da_methods.da_method. This is a good deal more efficient than relying on xpList's unique. Parameters that are:

  • not found among the args of the given DA method are ignored by for_params.
  • specified as keywords to for_params get fixed to that value, overriding the corresponding (if any) value list in the param_dict.

Warning

Beware! If, e.g., infl or rot are in param_dict, aimed at the EnKF, but you forget that they are also attributes of some method where you don't actually want to use them (e.g. SVGDF), then you'll create many more xps than you intend.
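The combination logic itself can be sketched with itertools.product. In this toy stand-in the DA method is represented by a set of accepted argument names, whereas the real for_params takes the method class:

```python
# Hedged sketch of combinator-style mass xp creation
# (mimics the idea; not DAPPER's implementation).
from itertools import product

def combinator(param_dict, **glob_dict):
    def for_params(da_method_args, **fixed):
        # Keep only params the DA method accepts; keywords to
        # for_params fix a value and drop the corresponding value list.
        pd = {k: v for k, v in param_dict.items()
              if k in da_method_args and k not in fixed}
        combos = []
        for values in product(*pd.values()):
            xp = dict(glob_dict, **fixed, **dict(zip(pd.keys(), values)))
            combos.append(xp)
        return combos
    return for_params

for_params = combinator(dict(N=[10, 20], infl=[1.0, 1.05]), seed=3000)
xps = for_params({"N", "infl"}, infl=1.02)  # fix infl; only N varies
print(len(xps))  # 2
```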

run_experiment(xp, label, savedir, HMM, setup=seed_and_simulate, free=True, statkeys=False, fail_gently=False, **stat_kwargs) ⚓︎

Used by xp_launch.xpList.launch to run each single (DA) experiment ("xp").

This involves steps similar to docs/examples/basic_1.py, i.e.:

  • setup : Initialize experiment.
  • xp.assimilate : run DA, pass on exception if fail_gently
  • xp.stats.average_in_time : result averaging
  • xp.avrgs.tabulate : result printing
  • dill.dump : result storage

Parameters:

xp : object (required)
    Type: a da_methods.da_method-decorated class.
label : str (required)
    Name attached to progressbar during assimilation.
savedir : str (required)
    Path of folder wherein to store the experiment data.
HMM : HiddenMarkovModel (required)
    Container defining the system.
free : bool
    Whether (or not) to del xp.stats after the experiment is done, so as to free up memory and/or not save this data (just keeping xp.avrgs). Default: True.
statkeys : list
    A list of names (possibly in the form of abbreviations) of the statistical averages that should be printed immediately after this xp. Default: False.
fail_gently : bool
    Whether (or not) to propagate exceptions. Default: False.
setup : function
    This function must take two arguments: HMM and xp, and return the HMM to be used by the DA methods (typically the same as the input HMM, but possibly modified), along with the (typically synthetic) truth and obs time series.

This gives you the ability to customize almost any aspect of the individual experiments within a batch launch of experiments (i.e. not just the parameters of the DA method). Typically you will grab one or more parameter values stored in the xp (see da_methods.da_method) and act on them, usually by assigning them to some object that impacts the experiment. Thus, by generating a new xp for each such parameter value, you can investigate the impact/sensitivity of the results to this parameter. Examples include:

  • Setting the seed. See the default setup, namely seed_and_simulate, for how this is, or should be, done.
  • Setting some aspect of the HMM such as the observation noise, or the interval between observations. This could be achieved for example by:

    def setup(hmm, xp):
        hmm.Obs.noise = GaussRV(M=hmm.Nx, C=xp.obs_noise)
        hmm.tseq.dkObs = xp.time_between_obs
        import dapper as dpr
        return dpr.seed_and_simulate(hmm, xp)
    

    This process could involve more steps, for example loading a full covariance matrix from a data file, as specified by the obs_noise parameter, before assigning it to C. Also note that the import statement is not strictly necessary (assuming dapper was already imported in the outer scope, typically the main script), except when running the experiments on a remote server.

    Sometimes, the parameter you want to set is not accessible as one of the conventional attributes of the HMM. For example, the Force in the Lorenz-96 model. In that case you can add these lines to the setup function:

    import dapper.mods.Lorenz96 as core
    core.Force = xp.the_force_parameter
    

    However, if your model is an OOP instance, the import approach will not work because it will serve you the original model instance, while setup() deals with a copy of it. Instead, you could re-initialize the entire model in setup() and overwrite HMM.Dyn. However, it is probably easier to just assign the instance to some custom attribute before launching the experiments, e.g. HMM.Dyn.object = the_model_instance, enabling you to set parameters on HMM.Dyn.object in setup(). Note that this approach won't work for modules (for ex., combining the above examples, HMM.Dyn.object = core) because modules are not serializable.

  • Using a different HMM entirely for the truth/obs (xx/yy) generation, than the one that will be used by the DA. Or loading the truth/obs time series from file. In both cases, you might also have to do some cropping or slicing of xx and yy before returning them.
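For the last case, a hedged sketch (the file name, the xp.truth_file attribute, and the pickled layout are all hypothetical):

```python
import pickle

def setup_from_file(hmm, xp):
    # Load pre-generated truth/obs instead of simulating them.
    with open(xp.truth_file, "rb") as f:   # xp.truth_file: hypothetical attribute
        xx, yy = pickle.load(f)
    # Crop to this experiment's time sequence: K+1 truth states, Ko+1 obs
    # (assumed layout; adapt to however the series were stored).
    return hmm, xx[: hmm.tseq.K + 1], yy[: hmm.tseq.Ko + 1]
```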

Default: seed_and_simulate.

seed_and_simulate(HMM, xp) ⚓︎

Default experiment setup (sets seed and simulates truth and obs).

Used by xp_launch.xpList.launch via xp_launch.run_experiment.

Parameters:

HMM : HiddenMarkovModel (required)
    Container defining the system.
xp : object (required)
    Type: a da_methods.da_method-decorated class.
    xp.seed should be set (and be an int).
    Without xp.seed the seed does not get set, and different xps will use different seeds (unless you do some funky hacking). Reproducibility for a script as a whole can still be achieved by setting the seed at the outset of the script. To avoid even that, set xp.seed to None or "clock".

Returns:

tuple : (xx, yy)
    The simulated truth and observations.
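The seeding logic described above can be sketched as follows (stand-in simulation with the stdlib random module; not the actual DAPPER source, which uses numpy's RNG):

```python
import random

_MISSING = object()

def seed_and_simulate_sketch(HMM, xp):
    seed = getattr(xp, "seed", _MISSING)
    if seed is _MISSING:
        pass                      # seed not set: RNG state left alone
    elif seed in (None, "clock"):
        random.seed()             # explicitly non-reproducible: reseed from entropy
    else:
        random.seed(seed)         # reproducible per-xp
    xx = [random.random() for _ in range(5)]       # stand-in "truth"
    yy = [x + 0.1 * random.random() for x in xx]   # stand-in "obs"
    return xx, yy
```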