Contributing to climakitae
Audience
This guide is for contributors who want to fix bugs, add features, or improve documentation. If you only want to use climakitae, start with Getting Started.
On this page
- Environment Setup
- Running Tests
- Code Style
- Docstrings
- Where New Features Go
- Adding a Processor
- Adding a Validator
- PR Checklist
Environment Setup
climakitae uses uv exclusively. Do not use bare python/pip or conda
outside of an activated .venv.
# One-time setup from the repo root
uv venv
source .venv/bin/activate
uv sync
uv pip install -e . --no-deps # editable install, skips dep conflicts; uv venv does not put pip in the venv by default
After any new dependency is added to pyproject.toml, re-run uv sync.
Running Tests
# Basic test suite (2–3 min, runs on every PR)
uv run python -m pytest -n auto -m "not advanced" --no-header -q
# Advanced tests (require network / external data — CI-only by default)
uv run python -m pytest -m "advanced"
# With coverage report
uv run python -m pytest --cov=climakitae --cov-report=xml --cov-branch
Test files mirror the source structure under tests/. A new module at
climakitae/new_core/processors/my_processor.py should have a matching test
file at tests/new_core/processors/test_my_processor.py.
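The mirroring rule amounts to a small path transformation. The helper below is a hypothetical sketch (expected_test_path is not part of climakitae); it assumes the package root is climakitae/, tests live under tests/, and test files take a test_ prefix, as in the example above.

```python
from pathlib import PurePosixPath


def expected_test_path(source_path: str) -> str:
    """Map a source module path to the test file path that mirrors it."""
    path = PurePosixPath(source_path)
    # Drop the leading package directory ("climakitae") and keep the rest.
    relative = path.relative_to("climakitae")
    # Prefix the file name with "test_" and re-root the path under "tests/".
    mirrored = relative.with_name(f"test_{relative.name}")
    return str(PurePosixPath("tests") / mirrored)


print(expected_test_path("climakitae/new_core/processors/my_processor.py"))
# → tests/new_core/processors/test_my_processor.py
```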
Tests marked with @pytest.mark.advanced are skipped on feature branches and
run only on main or on PRs labeled "Advanced Testing".
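As an illustration of the marker, a test module might separate fast tests from ones that need network access roughly like this. The function names and bodies are placeholders, not tests from the climakitae suite.

```python
import pytest


def test_threshold_is_positive():
    """Unmarked, so it runs in the basic suite (-m "not advanced")."""
    threshold = 0.5
    assert threshold > 0


@pytest.mark.advanced
def test_remote_catalog_roundtrip():
    """Selected by -m "advanced" and deselected by -m "not advanced"."""
    # Placeholder body: a real advanced test would read external data here.
    assert True
```

With this layout, the basic command shown earlier collects only the first test, and the advanced command collects only the second.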
Code Style
Two formatters, black and isort, are required by CI. Run both before committing:
black climakitae/ tests/
isort climakitae/ tests/
isort reports many files as unsorted on a fresh clone — this is a known
quirk. Always run isort . before opening a PR to avoid failing the check.
Docstrings
NumPy style is required on every module, class, function, and method —
no exceptions. The API reference docs are auto-generated by mkdocstrings
from these docstrings, so missing or malformed docstrings cause broken pages.
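Because malformed docstrings break the generated pages, a quick local check can catch missing sections before review. The sketch below is a hypothetical helper, not a climakitae or mkdocstrings tool; it only looks for NumPy-style section headers followed by a dashed underline and does not validate their contents.

```python
import inspect


def missing_numpy_sections(obj, required=("Parameters", "Returns")):
    """Return the required NumPy-style sections absent from obj's docstring."""
    doc = inspect.getdoc(obj) or ""
    lines = [line.strip() for line in doc.splitlines()]
    found = set()
    # A NumPy section is a header line followed by a line made only of dashes.
    for header, underline in zip(lines, lines[1:]):
        if underline and set(underline) == {"-"}:
            found.add(header)
    return [name for name in required if name not in found]


def clip(value: float, limit: float = 1.0) -> float:
    """Clip a value to an upper limit.

    Parameters
    ----------
    value : float
        Number to clip.
    limit : float, optional
        Upper bound. Default is 1.0.

    Returns
    -------
    float
        The smaller of *value* and *limit*.
    """
    return min(value, limit)


def bare(x):
    """Do something."""
    return x


print(missing_numpy_sections(clip))  # → []
print(missing_numpy_sections(bare))  # → ['Parameters', 'Returns']
```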
import xarray as xr


def my_function(data: xr.Dataset, threshold: float = 0.5) -> xr.Dataset:
    """Short one-line summary.

    Longer description if needed. Explain *why*, not just *what*.

    Parameters
    ----------
    data : xr.Dataset
        Input climate dataset with a ``time`` dimension.
    threshold : float, optional
        Cutoff value. Default is 0.5.

    Returns
    -------
    xr.Dataset
        Dataset with values below *threshold* masked to NaN.

    Examples
    --------
    >>> result = my_function(ds, threshold=0.3)
    """
    # Keep values at or above the threshold; everything else becomes NaN.
    return data.where(data >= threshold)
Where New Features Go
climakitae has two interfaces — do not add features to the legacy core:
| Directory | Purpose | Policy |
|---|---|---|
| climakitae/core/ | Legacy get_data() / DataParameters API | Maintenance only: bug fixes, no new features |
| climakitae/new_core/ | Modern ClimateData builder API | All new features go here |
Adding a Processor
Processors live in climakitae/new_core/processors/. Use
processors/template.py as a starting point.
- Create climakitae/new_core/processors/my_processor.py.
- Subclass DataProcessor and decorate with @register_processor.
- Implement execute(), update_context(), and set_data_accessor().
- Add tests in tests/new_core/processors/test_my_processor.py.
import xarray as xr

from climakitae.new_core.processors.abc_data_processor import (
    DataProcessor,
    register_processor,
)


@register_processor(key="my_processor", priority=50)
class MyProcessor(DataProcessor):
    """One-line summary of what this processor does.

    Parameters
    ----------
    value : dict
        Processor configuration dict passed by the user.
    """

    def __init__(self, value: dict) -> None:
        super().__init__(value)
        # parse and validate your config here

    def execute(self, data: xr.Dataset, context: dict) -> xr.Dataset:
        """Apply the processing step."""
        ...
        return data

    def update_context(self, context: dict) -> None:
        """Record metadata about this step in the context dict."""
        ...

    def set_data_accessor(self, catalog) -> None:
        """Receive the DataCatalog instance (if needed for data lookups)."""
        ...
The priority controls execution order — lower numbers run first. See
existing processors for reference values.
Adding a Validator
Validators live in climakitae/new_core/param_validation/. Use the
@register_catalog_validator decorator to bind a validator to a catalog key.
from climakitae.new_core.param_validation.abc_param_validation import (
    ParameterValidator,
    register_catalog_validator,
)


@register_catalog_validator("my_catalog")
class MyCatalogValidator(ParameterValidator):
    """Validates query parameters for 'my_catalog'."""

    def validate(self, parameters: dict) -> dict:
        # raise ValueError or emit warnings for invalid combos
        return parameters
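The body of validate() typically raises on hard errors and warns on recoverable ones. The standalone function below sketches that pattern without depending on climakitae; the parameter names variable and time_slice and both rules are assumptions made up for illustration, not the real catalog schema.

```python
import warnings


def validate_parameters(parameters: dict) -> dict:
    """Check a query-parameter dict: raise on errors, warn on soft issues."""
    # Hypothetical hard rule: "variable" is a required key.
    if "variable" not in parameters:
        raise ValueError("'variable' is required")

    if "time_slice" not in parameters:
        # Hypothetical soft rule: a missing time range is allowed but noted.
        warnings.warn(
            "No 'time_slice' given; the full record will be queried.",
            UserWarning,
        )
    else:
        start, end = parameters["time_slice"]
        # Hypothetical hard rule: the range must not be reversed.
        if start > end:
            raise ValueError(f"time_slice start {start} is after end {end}")

    return parameters


print(validate_parameters({"variable": "tas", "time_slice": (2030, 2060)}))
```

Returning the (possibly normalized) dict rather than a boolean matches the validate() signature shown above, which returns a dict.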
PR Checklist
Before opening a pull request:
- [ ] black climakitae/ tests/ passes (no diff)
- [ ] isort climakitae/ tests/ passes (no diff)
- [ ] uv run python -m pytest -n auto -m "not advanced" passes
- [ ] NumPy docstrings on every new public symbol
- [ ] New features added to new_core/, not core/
- [ ] Tests added that mirror the source structure
For significant changes, open a draft PR early and describe the motivation so reviewers can give feedback before full implementation.