Developer guide
Conventions
- Ensemble (data) matrices are np.ndarrays with shape N-by-Nx. This shape (orientation) is contrary to the EnKF literature, but has the following advantages:
  - Improves speed in row-by-row accessing, since that's np's default orientation.
  - Facilitates broadcasting for, e.g., centring the matrix.
  - Fewer indices: [n,:] yields the same as [n].
  - Beneficial operator precedence without (). E.g. dy @ Rinv @ Y.T @ Pw (where dy is a vector).
  - Less transposing for ensemble space formulae.
  - It's the standard for data matrices in the broader statistical literature.
- Naming:
  - E: ensemble matrix
  - w: ensemble weights or coefficients
  - X: centred ensemble ("anomalies")
  - N: ensemble size
  - Nx: state size
  - Ny: observation size
  - Double letters mean a sequence of something. For example:
    - xx: Time series of truth; shape (K+1, Nx)
    - yy: Time series of observations; shape (Ko+1, Ny)
    - EE: Time series of ensemble matrices
    - ii, jj: Sequences of indices (integers)
  - xps: an xpList or xpDict, where xp abbreviates "experiment".
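The orientation convention can be illustrated with a small NumPy sketch. It is not taken from the DAPPER code base: the sizes are arbitrary, and dy, Rinv, Y and Pw are random or identity placeholders chosen only so that the shapes line up.

```python
import numpy as np

rng = np.random.default_rng(0)
N, Nx, Ny = 5, 3, 2                 # ensemble size, state size, observation size

E = rng.standard_normal((N, Nx))    # ensemble matrix: one member per row

# Centring by broadcasting: the mean has shape (Nx,) and is subtracted from every row.
X = E - E.mean(axis=0)

# Fewer indices: member n is E[n], which equals E[n, :].
n = 2
same = np.array_equal(E[n], E[n, :])

# Operator precedence: with a vector on the left, `@` evaluates left to right,
# so every intermediate product is a vector-matrix product.
# These arrays are placeholders (assumptions), not actual filter quantities.
dy = rng.standard_normal(Ny)
Rinv = np.eye(Ny)
Y = rng.standard_normal((N, Ny))
Pw = np.eye(N)
w = dy @ Rinv @ Y.T @ Pw            # shape (N,), no parentheses needed
```

With the opposite (Nx-by-N) orientation, the centring step would need an explicit keepdims=True or a transpose, and selecting a member would need [:, n].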
Install for development
Include the dev tools as part of the installation (detailed in the README):
pip install -e '.[dev]'
PS: If you want to be able to use static analysis tools (pyright) with dapper all the while working from outside its directory, you should also append --config-settings editable_mode=compat to the above command. Ref pyright doc and pyright issue. Alternatively, there is the extraPaths setting.
Run tests
By default, only doctests are run when executing pytest.
To run the main tests, do this:
pytest tests
You can also append test_plotting.py, for example, which is otherwise ignored for being slow.
If the test with the QG model in test_HMM.py fails (simply because you have not compiled it), that is fine (that test does not run in CI either).
Pre-commit and linting
Pull requests (PRs) to DAPPER are checked with continuous integration (CI), which runs the tests, the linting, and some pre-commit hooks. To avoid having to wait for the CI server to run all of this, you'll want to run them on your own computer:
pre-commit install
pre-commit run --all-files
Now, every time you commit, these checks will run on the staged files. For detailed linting messages, run
ruff check --output-format=concise
Obsolete
Kept for reference.
You may also want to display linting issues in your editor as you code. Below is a suggested configuration of VS-Code with the pylance plug-in or Vim (with the coc.nvim plug-in with the pyright extension)
// Put this in settings.json (VS-Code) or ~/.vim/coc-settings.json (For Vim)
{
"python.analysis.typeCheckingMode": "off",
"python.analysis.useLibraryCodeForTypes": true,
"python.analysis.extraPaths": ["scripts"],
"python.formatting.provider": "autopep8",
"python.formatting.autopep8Path": "~/.local/bin/autopep8",
"python.linting.enabled": true,
"python.linting.flake8Enabled": true,
"python.linting.flake8Args": ["lint"],
"python.linting.flake8Path": "${env:CONDA_PREFIX}/bin/flakeheaven",
// Without VENV (requires `pip3 install --user flakeheaven flake8-docstrings flake8-bugbear ...`)
// "python.linting.flake8Path": "[USE PATH PRINTED BY PIP ABOVE]/Python/3.8/bin/flakeheaven",
}
Adding to the examples
Example scripts are very useful, and contributions are very desirable. As well as showcasing some feature, new examples should make sure to reproduce some published literature results. After making the example, consider converting the script to the Jupyter notebook format (or vice versa) so that the example can be run on Colab without users needing to install anything (see examples/README.md). This should be done using the jupytext plug-in (with the lightscript format), so that the paired files can be kept in sync.
Documentation
The documentation is built with pdoc, e.g.
pdoc -t docs/templates --math --docformat=numpy docs/bib/bib.py docs/dev_guide.py ./dapper
Hosting
The above command is run by a GitHub Actions workflow whenever the master branch gets updated. The gh-pages branch is no longer being used. Instead, actions/deploy-pages creates an artefact that is deployed to GitHub Pages.
Bibliography
In order to add new references, insert their bibtex into docs/bib/refs.bib, then run docs/bib/make_bib.py, which will format and add entries to docs/bib/bib.py.
Profiling
- Launch your Python script using kernprof -l -v my_script.py
- Functions decorated with profile will be timed, line-by-line.
- If your script is launched in the regular way (i.e. without kernprof), then profile will not be present in the builtins. Instead of deleting your decorations, you could also define a pass-through fallback.
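A minimal sketch of such a fallback, assuming that kernprof exposes profile through the builtins module (as the list above implies). The function step is a hypothetical placeholder for your own code.

```python
import builtins

try:
    # Present when the script is launched via `kernprof -l`.
    profile = builtins.profile
except AttributeError:
    # Regular launch: fall back to a decorator that returns the function unchanged.
    def profile(func):
        return func


@profile
def step(x):
    """Hypothetical function to be timed line-by-line."""
    return 2 * x


result = step(3)
```

With this at the top of the module, the decorations can stay in place regardless of how the script is launched.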
Publishing a release on PyPI
cd DAPPER
Bump version number in __init__.py
Merge dev1 into master
git checkout master
git merge --no-commit --no-ff dev1
# Fix conflicts, e.g.
# git rm <unwanted-file>
git commit
Tag
git tag -a v$(python setup.py --version) -m 'My description'
git push origin --tags
Clean
rm -rf build/ dist *.egg-info .eggs
Add new files to package_data and packages in setup.py
Build
./setup.py sdist bdist_wheel
Upload to PyPI
twine upload --repository pypi dist/*
Upload to Test.PyPI
twine upload --repository testpypi dist/*
where ~/.pypirc contains
[distutils]
index-servers=
    pypi
    testpypi

[pypi]
username: myuser
password: mypass

[testpypi]
repository: https://test.pypi.org/legacy/
username: myuser
password: mypass
Return to the dev1 branch
git checkout dev1
Test installation
Install from Test.PyPI
pip install --extra-index-url https://test.pypi.org/simple/ dapper
Install from PyPI
pip install dapper
Install into specific dir (includes all of the dependencies)
pip install dapper -t MyDir
Install with options
pip install 'dapper[dev,Qt]'
Install from local (makes installation accessible from everywhere)
pip install -e .