Basic 1
A single, interactive, synthetic ("twin") experiment¶
If run as a script in a terminal (i.e. not as a jupyter notebook) then the liveplotting is interactive (can be paused, skipped, etc).
Imports¶
NB: On Gooble Colab,
replace %matplotlib notebook
(right below) by
!python -m pip install git+https://github.com/nansencenter/DAPPER.git
%matplotlib notebook
import dapper as dpr
import dapper.da_methods as da
Generate the same random numbers each time this script is run:
dpr.set_seed(3000);
Load experiment setup: the hidden Markov model (HMM)¶
from dapper.mods.Lorenz63.sakov2012 import HMM
HMM # ⇒ printout (in notebooks)
HiddenMarkovModel({ 'Dyn': Operator({ 'M': 3, 'model': rk4 integration of <function dapper.mods.Lorenz63.dxdt(x)> <function with_rk4.<locals>.step at 0x119028720>, 'noise': GaussRV({ 'mu': array([0., 0., 0.]), 'M': 3, 'C': 0 }), 'linear': <function dstep_dx at 0x119028540> }), 'Obs': <TimeDependentOperator> CONSTANT operator sepcified by .Op1: Operator({ 'M': 3, 'model': Direct obs. at [0 1 2] <function partial_Id_Obs.<locals>.model at 0x119028c20>, 'noise': GaussRV({ 'C': <CovMat> M: 3 kind: 'diag' trunc: 1.0 rk: 3 full: [[2. 0. 0.] [0. 2. 0.] [0. 0. 2.]] diag: [2. 2. 2.], 'mu': array([0., 0., 0.]), 'M': 3 }), 'linear': Constant matrix [[1. 0. 0.] [0. 1. 0.] [0. 0. 1.]] <function partial_Id_Obs.<locals>.linear at 0x119028e00> }), 'tseq': <Chronology> - K: 25025 - Ko: 1000 - T: 250.25 - BurnIn: 16.0 - dto: 0.25 - dt: 0.01, 'X0': GaussRV({ 'C': <CovMat> M: 3 kind: 'diag' trunc: 1.0 rk: 3 full: [[2. 0. 0.] [0. 2. 0.] [0. 0. 2.]] diag: [2. 2. 2.], 'mu': array([ 1.509, -1.531, 25.46 ]), 'M': 3 }), 'liveplotters': [(1, <class 'dapper.tools.liveplotting.correlations'>), (1, <function sliding_marginals.<locals>.init at 0x11903c720>), (1, <function phase_particles.<locals>.init at 0x11903c5e0>)], 'sectors': {}, 'name': 'Lorenz63/sakov2012.py' })
Simulate synthetic truth (xx) and noisy obs (yy)¶
A variable named <char><char>
conventionally refers to a time series of <char>
.
HMM.tseq.T = 30 # shorten experiment
xx, yy = HMM.simulate()
Truth & Obs: 100%|█████████████| 3000/3000 [00:00<00:00, 42197.63it/s]
Specify a DA method¶
Here "xp" is short for "experiment" configuration.
# xp = da.OptInterp()
# xp = da.Var3D()
# xp = da.ExtKF(infl=90)
xp = da.EnKF("Sqrt", N=10, infl=1.02, rot=True)
# xp = da.PartFilt(N=100, reg=2.4, NER=0.3)
xp # ⇒ printout (in notebooks)
EnKF(upd_a='Sqrt', N=10, infl=1.02, rot=True, fnoise_treatm='Stoch')
Assimilate yy¶
Note that the assimilation "knows" the HMM,
but xx
is only used for performance assessment.
xp.assimilate(HMM, xx, yy)
EnKF: 100%|████████████████████| 3000/3000 [00:00<00:00, 14291.97it/s]
Average the time series¶
Computes a set of different statistics.
# print(xp.stats) # ⇒ long printout
xp.stats.average_in_time()
Print some of these time-averages
# print(xp.avrgs) # ⇒ long printout
print(xp.avrgs.tabulate(["rmse.a", "rmv.a"]))
rmse.a 1σ rmv.a 1σ ------------ ----------- 0.78 ±0.06 0.57 ±0.07
Replay liveplotters¶
Further diagnostic plots¶
import dapper.tools.viz as viz
Excercise (methods)¶
- Try out each of the above DA methods (currently commented out).
- Next, remove the call to
replay
, and setliveplots=False
above. - Now, use the iterative EnKS (
iEnKS
), and try to find a parameter combination for it so that you achieve a lowerrmse.a
than with thePartFilt
.
Hint: In general, there is no free lunch. Similarly, not all methods work
for all problems; additionally, methods often have parameters that require
tuning. Luckily, in DAPPER, you should be able to find suitably tuned
configuration settings for various DA methods in the files that define the
HMM. If you do not find a suggested configuration for a given method, you
will have to tune it yourself. The example script basic_2
shows how DAPPER
facilitates the tuning process, and basic_3
takes this further.
Excercise (diagnostics)¶
- Create a new code cell, and copy-paste the above
print(...tabulate)
command into it. Then, replacermse
byerr.rms
. This should yield the same printout, as is merely an abbreviation of the latter. - Next, figure out how to print the time average forecast (i.e. prior) error
(and
rmv
) instead. Explain (in broad terms) why the values are larger than for the analysis values. - Finally, instead of the
rms
spatial/field averages, print the regular mean (.m
) averages. Explain whyerr.m
is nearly zero, in contrast toerr.rms
.
Excercise (memory)¶
Why are the replay plots not as smooth as the liveplot?
Hint: provide the keyword store_u=True
to assimilate()
to avoid this.