Is your feature request related to a problem? Please describe.
Maintainability and code quality are important parts to keep anemoi alive and well and avoid technical debt.
Pre-commit hooks are an easy way to automatically flag and possibly fix these issues, but above this we can run pre-commit in Github Actions to automatically check code compliance across many definitions.
We already implement many pre-commit hooks, but from experience I would propose the following hooks in addition:
Describe the solution you'd like
- Use
pygrep-hooks
to enforce type annotations, check for blanket noqa's (they should be specific to what they exempt) and check for log.warns
- repo: https://github.com/pre-commit/pygrep-hooks
rev: v1.10.0 # Use the ref you want to point at
hooks:
- id: python-use-type-annotations # Check for missing type annotations
- id: python-check-blanket-noqa # Check for # noqa: all
- id: python-no-log-warn # Check for log.warn
- Use
docsig
to check docstrings against function signature
- repo: https://github.com/jshwi/docsig # Check docstrings against function sig
rev: v0.44.2
hooks:
- id: docsig
args:
- --ignore-no-params # Allow docstrings without parameters
- --check-dunders # Check dunder methods
- --check-overridden # Check overridden methods
- --check-protected # Check protected methods
- --check-class # Check class docstrings
- --disable=E113 # Disable empty docstrings
- --summary # Print a summary
There is no formal check that docstrings actually represent the function they describe. The additional settings make the docstring checking only necessary, when parameters are set and doesn't fail when no docstring is set, as a fairly lenient implementation. (This can be a point of discussion, whether we want to force docstrings and parameter descriptions.)
- I would suggest expanding the
ruff
ruleset to ALL
, which expands the checking to a wide variety of possible problems.
By default ruff
includes ruleset E
and F
which are pydocstyle errors and flake8 errors.
There are many more rulesets that improve the overall code quality and should be considered. We could also enable those explicitly, or simply rely on ALL
to include the full expansion of community accepted best practices.
From experimentation I had also disabled some specific rulesets:
"E203", whitespace before punctuation
"D100", missing docstring in public module
"D101", missing docstring in public class
"D102", missing docstring in public method
"D103", missing docstring in public function
"D104", missing docstring in public package
"D105", missing docstring in magic method
"D401", First line of docstring written in imperative mood
"S101", Use of assert (asserts are usually good in our case)
"PT018", Composite pytest asserts
These could be discussed, as empty docstrings may actually not be wanted in the framework.
Describe alternatives you've considered
Alternatives would be to enable each package we want individually:
Currently in AIFS we are working with these:
select = [
"A", # flake8 builtins
"B", # flake8-bugbear
"D", # pydocstyle
"E", # pycodestyle error
"W", # pycodestyle warning
"F", # flake8 error
"UP", # Pyupgrade
"SIM", # flake8-simplify
"N", # pep8 naming
"YTT", # flake8-2020
"S", # bandit
"COM", # Commas
"C4", # Comprehensions
"DTZ", # Datetimes
"ISC", # Implicit string concatenation
"ICN", # Import conventions
"LOG", # Logging
"PIE", # Misc lints
"T20", # Print statements
"PT", # Pytest
"Q", # Quotes
"RSE", # Raises
"TID", # Tidy imports
"PTH", # Use Pathlib
"PGH", # Pygrep hooks
"R", # Refactor
"FLY", # Fstrings
"PERF", # Perfomance linting
"FURB", # Modernising code
"RUF", # Ruff specific
"NPY", # Numpys
# "PL", # Pylint
# "TD", # Todos
# "FBT", # Boolean traps
# "CPY", # Copyright
]
Only disabling four, which would still be useful for a framework. The boolean traps ruleset might make coding harder however, and might necessitate some refactoring of parts of the code.
Alternatively we can introduce pre-commit hooks these rulesets implement:
- Use
bandit
for common code vulnerabilities (implemented in ruleset S
):
- repo: https://github.com/pycqa/bandit # Check code for common security issues
rev: 1.7.7
hooks:
- id: bandit
- Use
docformatter
to enforce consistent formatting of docstrings (ruleset from pydocstyle D
, E
, W
):
- repo: https://github.com/PyCQA/docformatter # Format docstrings
rev: v1.7.5
hooks:
- id: docformatter
args:
- -s numpy
- --black
- --in-place
- Use
pyupgrade
to automatically upgrade Python syntax from older patterns (e.g. upgrading from percent style formatting '%s %s' % (a, b)
to '{} {}'.format(a, b)
)
- repo: https://github.com/asottile/pyupgrade # Upgrade Python syntax
rev: v3.15.1
hooks:
- id: pyupgrade
args:
- --py38-plus
Additional context
Bandit or ruleset S
especially may expose vulnerabilities in code, which makes our life as maintainers a little easier.
Organisation
ECMWF