Contributing to climakitae

Audience

This guide is for contributors who want to fix bugs, add features, or improve documentation. If you only want to use climakitae, start with Getting Started.


Environment Setup

climakitae uses uv exclusively. Do not use bare python/pip or conda outside of an activated .venv.

# One-time setup from the repo root
uv venv
source .venv/bin/activate
uv sync
pip install -e . --no-deps   # editable install; --no-deps leaves the packages installed by uv sync untouched

After any new dependency is added to pyproject.toml, re-run uv sync.


Running Tests

# Basic test suite (2–3 min, runs on every PR)
uv run python -m pytest -n auto -m "not advanced" --no-header -q

# Advanced tests (require network / external data — CI-only by default)
uv run python -m pytest -m "advanced"

# With coverage report
pytest --cov=climakitae --cov-report=xml --cov-branch

Test files mirror the source structure under tests/. A new module at climakitae/new_core/processors/my_processor.py should have a matching test file at tests/new_core/processors/test_my_processor.py.
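
The mirroring rule can be written down as a small path mapping. The following is a minimal sketch only: expected_test_path is a hypothetical helper made up for illustration and is not part of climakitae.

```python
from pathlib import PurePosixPath


def expected_test_path(source: str) -> str:
    """Map a module path under climakitae/ to its mirrored test path.

    Hypothetical helper, for illustration only.
    """
    src = PurePosixPath(source)
    parts = src.parts
    if len(parts) < 2 or parts[0] != "climakitae" or src.suffix != ".py":
        raise ValueError(f"not a climakitae module path: {source!r}")
    # Swap the package root for tests/ and prefix the file name with test_.
    return str(PurePosixPath("tests", *parts[1:-1], f"test_{src.name}"))


print(expected_test_path("climakitae/new_core/processors/my_processor.py"))
# tests/new_core/processors/test_my_processor.py
```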

Tests marked with @pytest.mark.advanced are skipped on feature branches and run only on main or on PRs labeled "Advanced Testing".


Code Style

Two formatters are required by CI — run both before committing:

black climakitae/ tests/
isort climakitae/ tests/

On a fresh clone, isort reports many files as unsorted; this is a known quirk. Always run isort . before opening a PR so the import-order check does not fail.


Docstrings

NumPy style is required on every module, class, function, and method — no exceptions. The API reference docs are auto-generated by mkdocstrings from these docstrings, so missing or malformed docstrings cause broken pages.

import xarray as xr


def my_function(data: xr.Dataset, threshold: float = 0.5) -> xr.Dataset:
    """Short one-line summary.

    Longer description if needed. Explain *why*, not just *what*.

    Parameters
    ----------
    data : xr.Dataset
        Input climate dataset with a ``time`` dimension.
    threshold : float, optional
        Cutoff value. Default is 0.5.

    Returns
    -------
    xr.Dataset
        Dataset with values below *threshold* masked to NaN.

    Examples
    --------
    >>> result = my_function(ds, threshold=0.3)
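    """
    # Keep values at or above the threshold; everything else becomes NaN.
    return data.where(data >= threshold)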
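
Before pushing, it can help to confirm locally that a docstring contains the expected NumPy-style sections. The sketch below is a rough check under stated assumptions: missing_numpy_sections is a hypothetical helper, it only looks for a section header followed by a dashed underline, and it is not the validation that mkdocstrings or CI performs.

```python
import inspect
import re


def missing_numpy_sections(obj, required=("Parameters", "Returns")):
    """Return the required section headers absent from obj's docstring.

    Hypothetical helper, for illustration only.
    """
    doc = inspect.getdoc(obj) or ""
    missing = []
    for section in required:
        # A NumPy-style section is a header line followed by a dashed underline.
        pattern = rf"^{re.escape(section)}\n-+$"
        if not re.search(pattern, doc, flags=re.MULTILINE):
            missing.append(section)
    return missing


def documented(x: int) -> int:
    """Double a value.

    Parameters
    ----------
    x : int
        Value to double.

    Returns
    -------
    int
        Twice ``x``.
    """
    return 2 * x


def undocumented(x):
    return x


print(missing_numpy_sections(documented))    # []
print(missing_numpy_sections(undocumented))  # ['Parameters', 'Returns']
```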

Where New Features Go

climakitae has two interfaces — do not add features to the legacy core:

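| Directory            | Purpose                                | Policy                                       |
|----------------------|----------------------------------------|----------------------------------------------|
| climakitae/core/     | Legacy get_data() / DataParameters API | Maintenance only: bug fixes, no new features |
| climakitae/new_core/ | Modern ClimateData builder API         | All new features go here                     |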

Adding a Processor

Processors live in climakitae/new_core/processors/. Use processors/template.py as a starting point.

  1. Create climakitae/new_core/processors/my_processor.py.
  2. Subclass DataProcessor and decorate with @register_processor.
  3. Implement execute(), update_context(), and set_data_accessor().
  4. Add tests in tests/new_core/processors/test_my_processor.py.

import xarray as xr

from climakitae.new_core.processors.abc_data_processor import (
    DataProcessor,
    register_processor,
)


@register_processor(key="my_processor", priority=50)
class MyProcessor(DataProcessor):
    """One-line summary of what this processor does.

    Parameters
    ----------
    value : dict
        Processor configuration dict passed by the user.
    """

    def __init__(self, value: dict) -> None:
        super().__init__(value)
        # parse and validate your config here

    def execute(self, data: xr.Dataset, context: dict) -> xr.Dataset:
        """Apply the processing step."""
        ...
        return data

    def update_context(self, context: dict) -> None:
        """Record metadata about this step in the context dict."""
        ...

    def set_data_accessor(self, catalog) -> None:
        """Receive the DataCatalog instance (if needed for data lookups)."""
        ...

The priority controls execution order — lower numbers run first. See existing processors for reference values.


Adding a Validator

Validators live in climakitae/new_core/param_validation/. Use the @register_catalog_validator decorator to bind a validator to a catalog key.

from climakitae.new_core.param_validation.abc_param_validation import (
    ParameterValidator,
    register_catalog_validator,
)


@register_catalog_validator("my_catalog")
class MyCatalogValidator(ParameterValidator):
    """Validates query parameters for 'my_catalog'."""

    def validate(self, parameters: dict) -> dict:
        # raise ValueError or emit warnings for invalid combos
        return parameters
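
The body of validate() typically does two things: reject queries that cannot work, and warn about ones that can be repaired. The standalone sketch below illustrates that split without depending on climakitae; the parameter names "variable" and "resolution" and the default value are assumptions chosen for the example, not real catalog keys.

```python
import warnings

DEFAULT_RESOLUTION = "coarse"  # hypothetical default, for illustration only


def validate(parameters: dict) -> dict:
    """Return a checked copy of parameters, or raise on invalid input."""
    checked = dict(parameters)  # avoid mutating the caller's dict
    if not checked.get("variable"):
        # Unrecoverable: there is nothing sensible to query.
        raise ValueError("'variable' is required")
    if "resolution" not in checked:
        # Recoverable: fill in a default and tell the user about it.
        warnings.warn(
            f"'resolution' not set; falling back to {DEFAULT_RESOLUTION!r}",
            UserWarning,
            stacklevel=2,
        )
        checked["resolution"] = DEFAULT_RESOLUTION
    return checked
```

Returning a copy keeps the caller's dict unchanged, which makes the validator easier to test in isolation.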

PR Checklist

Before opening a pull request:

  • [ ] black climakitae/ tests/ passes (no diff)
  • [ ] isort climakitae/ tests/ passes (no diff)
  • [ ] uv run python -m pytest -n auto -m "not advanced" passes
  • [ ] NumPy docstrings on every new public symbol
  • [ ] New features added to new_core/, not core/
  • [ ] Tests added that mirror the source structure

For significant changes, open a draft PR early and describe the motivation so reviewers can give feedback before full implementation.