Legacy Core Concepts
The mental model behind the legacy climakitae.core interface — how queries are
configured, why the field names differ from the modern interface, and how a
query flows from configuration to data.
Audience
This page is for readers maintaining or reading existing legacy code. If you are writing new code, start with the ClimateData interface instead.
Warning
climakitae.core is maintained for backward compatibility only. New work
should use climakitae.new_core.user_interface.ClimateData.
The configuration object
Internally, the legacy interface is built around a configuration object,
DataParameters. It was originally designed to back a GUI, so it carries
observers that keep dependent fields (resolution, timescale, scenario, cached
area) in sync as values change.
You rarely need to build one by hand. The recommended entry point is
get_data(), which accepts GUI-style keyword arguments, constructs and
validates a DataParameters object for you, and returns the data:
from climakitae.core.data_interface import get_data
data = get_data(
    variable="Air Temperature at 2m",
    resolution="9 km",
    timescale="hourly",
)
Driving a DataParameters instance directly and calling .retrieve() is still
supported for GUI-style workflows, but get_data() is the preferred path for
new and maintained code.
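The observer behaviour described above can be pictured with a small plain-Python sketch. It is not the actual DataParameters implementation: the class name `ToyParameters`, the `VALID_OPTIONS` table, and the option values in it are assumptions made up for illustration. It only shows the idea of a dependent field being reset when the field it depends on changes.

```python
# Illustrative option table (an assumption, not read from the real catalog).
VALID_OPTIONS = {
    "Dynamical": {
        "resolution": ["3 km", "9 km", "45 km"],
        "timescale": ["hourly", "daily", "monthly"],
    },
    "Statistical": {
        "resolution": ["3 km"],
        "timescale": ["daily", "monthly"],
    },
}


class ToyParameters:
    """Hypothetical stand-in for DataParameters; shows the observer idea only."""

    def __init__(self):
        self._downscaling_method = "Dynamical"
        self.resolution = "9 km"
        self.timescale = "hourly"

    @property
    def downscaling_method(self):
        return self._downscaling_method

    @downscaling_method.setter
    def downscaling_method(self, value):
        if value not in VALID_OPTIONS:
            raise ValueError(f"Unknown downscaling method: {value!r}")
        self._downscaling_method = value
        self._sync_dependents()

    def _sync_dependents(self):
        # "Observer": runs whenever downscaling_method changes and resets any
        # dependent field whose current value is no longer allowed.
        allowed = VALID_OPTIONS[self._downscaling_method]
        if self.resolution not in allowed["resolution"]:
            self.resolution = allowed["resolution"][0]
        if self.timescale not in allowed["timescale"]:
            self.timescale = allowed["timescale"][0]


params = ToyParameters()
params.downscaling_method = "Statistical"
print(params.resolution, params.timescale)  # prints: 3 km daily
```

The real object tracks more fields than this toy does; the sketch only illustrates why changing one value can silently change another.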
GUI-style field names
The most important thing to understand about the legacy interface is that its
field names are human-readable GUI labels, not the catalog-native names used
by ClimateData. For example, the legacy interface uses "9 km" where the
modern interface uses the grid label "d02", and "Statistical" where the
modern interface uses the activity id "LOCA2".
This means legacy code reads more like the old web tool and less like the underlying intake-esm catalog.
Legacy → modern mapping
When porting legacy code, this table maps the common GUI-style fields to their
modern ClimateData equivalents:
| Legacy field | Meaning | Modern equivalent |
|---|---|---|
| downscaling_method | Dynamical, Statistical, or both | activity_id ("WRF" / "LOCA2") |
| resolution | 3 km, 9 km, or 45 km | grid_label ("d03" / "d02" / "d01") |
| timescale | hourly, daily, monthly | table_id ("1hr" / "day" / "mon") |
| scenario_ssp / scenario_historical | Scenario selection buckets | experiment_id |
| area_subset / cached_area | Named boundary selection | clip processor |
| time_slice | Year-range tuple | time_slice processor |
| variable | GUI display name | variable (catalog id, e.g. t2max / tasmax) |
See the migration guide for a complete walkthrough.
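When porting many calls, the three fields with a one-to-one value mapping can be translated mechanically. The sketch below encodes only the rows of the table that list explicit values. The helper name `translate_legacy_kwargs` is hypothetical and not part of climakitae, and the output is a plain dictionary keyed by the modern field names, not a ClimateData query.

```python
# Value mappings taken from the legacy -> modern table.
LEGACY_TO_MODERN = {
    "downscaling_method": ("activity_id", {"Dynamical": "WRF", "Statistical": "LOCA2"}),
    "resolution": ("grid_label", {"3 km": "d03", "9 km": "d02", "45 km": "d01"}),
    "timescale": ("table_id", {"hourly": "1hr", "daily": "day", "monthly": "mon"}),
}


def translate_legacy_kwargs(legacy):
    """Hypothetical helper: translate the three simple legacy fields.

    Fields without a one-to-one value mapping (variable, scenarios, area,
    time slice) are returned separately so they can be ported by hand.
    """
    modern, unmapped = {}, {}
    for field, value in legacy.items():
        if field in LEGACY_TO_MODERN:
            modern_field, values = LEGACY_TO_MODERN[field]
            modern[modern_field] = values[value]  # KeyError on an unknown value
        else:
            unmapped[field] = value
    return modern, unmapped


modern, unmapped = translate_legacy_kwargs(
    {"variable": "Air Temperature at 2m", "resolution": "9 km", "timescale": "hourly"}
)
print(modern)    # {'grid_label': 'd02', 'table_id': '1hr'}
print(unmapped)  # {'variable': 'Air Temperature at 2m'}
```

Returning the unmapped fields instead of guessing at them keeps the remaining manual work visible.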
Query flow
A legacy query moves through four stages:
1. get_data() builds a DataParameters object from your keyword arguments and loads the singleton DataInterface with the available options.
2. Option observers validate and keep fields like resolution, timescale, scenario_ssp, and cached_area in sync.
3. get_data() calls the catalog loader (the same loader used by DataParameters.retrieve()).
4. The loader returns an xarray.DataArray, xarray.Dataset, or a list of DataArray objects depending on the request.
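Because the final stage can hand back either a single object or a list, code that consumes the result may want to normalise it first. A minimal sketch follows. The helper `as_list` is hypothetical, and the strings are stand-ins for the xarray objects a real call would return.

```python
def as_list(result):
    """Return the loader result as a list, wrapping a single object if needed."""
    if isinstance(result, list):
        return result
    return [result]


# Stand-in values; in real use `result` would come from get_data().
single = "one-array"
several = ["array-a", "array-b"]
print(as_list(single))   # ['one-array']
print(as_list(several))  # ['array-a', 'array-b']
```

Downstream code can then loop over the result without checking its type at every step.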
Always prefer calling get_data() with keyword arguments — it handles the
DataParameters construction and validation for you.
Like the modern interface, the result is lazily loaded — data streams from S3 only when you compute, plot, or export it.
Named boundaries
Spatial subsetting in the legacy interface is driven by the
Boundaries loader. The GUI exposes a small set of boundary
categories (area_subset) and, within each, a list of named regions
(cached_area):
from climakitae.core.data_interface import DataInterface
boundaries = DataInterface().geographies
county_lookup = boundaries.boundary_dict()["CA counties"]
In the modern interface this is replaced by the much more flexible
clip processor.
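To picture how the two legacy fields relate, the sketch below treats area_subset as a category key and cached_area as a list of names that must belong to that category. The `TOY_BOUNDARIES` table and the `check_selection` helper are hypothetical stand-ins. The real categories and region names come from the Boundaries loader, and its actual data structure may differ.

```python
# Hypothetical stand-in for the category -> named-region lookup.
TOY_BOUNDARIES = {
    "none": ["entire domain"],
    "CA counties": ["Alameda County", "Los Angeles County"],
}


def check_selection(area_subset, cached_area, boundaries=TOY_BOUNDARIES):
    """Raise ValueError if a cached_area name is not in the area_subset category."""
    if area_subset not in boundaries:
        raise ValueError(f"Unknown area_subset: {area_subset!r}")
    missing = [name for name in cached_area if name not in boundaries[area_subset]]
    if missing:
        raise ValueError(f"{missing} not found under {area_subset!r}")
    return True


print(check_selection("CA counties", ["Alameda County"]))  # True
```

Mixing a region name with the wrong category raises an error here, which mirrors the pairing the two fields are meant to express.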