ClimateData class
The ClimateData class is the main entry point for the ClimateData interface.
It exposes the fluent / builder API used to assemble climate-data queries.
API reference for climakitae.new_core.user_interface.ClimateData.
A fluent interface for accessing climate data.
This class provides a chainable interface for setting query parameters and retrieving climate data. It uses a factory pattern to create datasets and validators based on the specified parameters.
The interface supports various climate data sources and allows flexible querying with different combinations of parameters. Every parameter-setting method returns the instance itself, so multiple parameters can be set in a single expression.
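The chaining behaviour described above can be sketched with a small stand-in class. `QueryBuilder`, its `build` method, and the keys it stores are hypothetical names chosen for illustration; this is not the ClimateData implementation, only a minimal sketch of the return-self pattern.

```python
from typing import Any, Dict


class QueryBuilder:
    """Hypothetical stand-in for a chainable query interface (not ClimateData)."""

    def __init__(self) -> None:
        self._query: Dict[str, Any] = {}

    def catalog(self, catalog: str) -> "QueryBuilder":
        self._query["catalog"] = catalog
        return self  # returning self is what makes chaining possible

    def variable(self, variable: str) -> "QueryBuilder":
        self._query["variable_id"] = variable
        return self

    def build(self) -> Dict[str, Any]:
        # Hand back a copy so callers cannot mutate internal state.
        return dict(self._query)


query = QueryBuilder().catalog("cadcat").variable("prec").build()
print(query)
```

Because each setter returns the same object, the order of the calls does not matter for the final query in this sketch.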
Other Parameters:
| Name | Type | Description |
|---|---|---|
| catalog | str | The data catalog to use. |
| installation | str | The installation type. |
| activity_id | str | The activity identifier. |
| institution_id | str | The institution identifier. |
| source_id | str | The source identifier. |
| experiment_id | str or list of str | The experiment identifier. |
| table_id | str | The temporal resolution. |
| grid_label | str | The spatial resolution. |
| variable_id | str | The climate variable. |
| processes | dict | Dictionary of data processing operations to apply. |
Methods:
| Name | Description |
|---|---|
| verbosity | Set the logging verbosity level. |
| catalog | Set the data catalog to use. |
| installation | Set the installation type. |
| activity_id | Set the activity identifier. |
| institution_id | Set the institution identifier. |
| source_id | Set the source identifier. |
| experiment_id | Set the experiment identifier(s). |
| table_id | Set the temporal resolution. |
| grid_label | Set the spatial resolution. |
| variable | Set the climate variable to retrieve. |
| station_id | Set the station identifier. |
| network_id | Set the network identifier. |
| processes | Set processing operations to apply to the data. |
| get | Execute the query and retrieve the climate data. |
| show_query | Display the current query configuration. |
| show_catalog_options | Display available catalog options. |
| show_installation_options | Display available installation options. |
| show_activity_id_options | Display available activity ID options. |
| show_institution_id_options | Display available institution ID options. |
| show_source_id_options | Display available source ID options. |
| show_experiment_id_options | Display available experiment ID options. |
| show_table_id_options | Display available table ID (temporal resolution) options. |
| show_grid_label_options | Display available grid label (spatial resolution) options. |
| show_variable_options | Display available climate variable options. |
| show_station_id_options | Display available station ID options. |
| show_network_id_options | Display available network ID options. |
| show_derived_variables | Display registered derived variables. |
| show_processors | Display registered data processors. |
| show_station_options | Display available weather station options. |
| show_boundary_options | Display available boundary options; pass a boundary type to list sub-options. |
| show_all_options | Display all available options for exploration. |
Returns:
| Type | Description |
|---|---|
| DataArray or None | The retrieved climate data as a lazy-loaded xarray DataArray, or None if the query fails or required parameters are missing. |
Raises:
| Type | Description |
|---|---|
| ValueError | If required parameters are missing or invalid during validation. |
| Exception | If there is an error during data retrieval or processing. |
Examples:
Basic usage with method chaining:
>>> cd = ClimateData()
>>> data = (cd
...     .catalog("cadcat")
...     .activity_id("WRF")
...     .experiment_id("historical")
...     .table_id("1hr")
...     .grid_label("d02")
...     .variable("prec")
...     .get()
... )
Exploring available options:
>>> cd = ClimateData()
>>> cd.show_catalog_options()
>>> cd.catalog("cadcat").show_variable_options()
Using with processing:
>>> processes = {"spatial_avg": "region", "temporal_avg": "monthly"}
>>> data = (ClimateData()
...     .catalog("climate")
...     .variable("pr")
...     .processes(processes)
...     .get())
Initialize the ClimateData interface.
Sets up the factory for dataset creation and initializes query parameters to their default (UNSET) state. Optionally configures logging to file or stdout.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| log_file | str | Path to log file. If None, logs to stdout. | None |
| verbosity | int | Logging verbosity level: <= -2: effectively silent (no logs); -1: WARNING; 0: INFO (default); > 0: DEBUG. | 0 |
Source code in climakitae/new_core/user_interface.py
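The "default (UNSET) state" mentioned above can be pictured with a sentinel object that is distinct from None. The sketch below is an assumption about the general pattern, not the library's code: `_Unset`, `QUERY_KEYS`, and `new_query` are hypothetical names, and the key list is an illustrative subset.

```python
class _Unset:
    """Hypothetical sentinel type meaning 'no value supplied'."""

    def __repr__(self) -> str:
        return "UNSET"

    def __bool__(self) -> bool:
        return False


UNSET = _Unset()

# Illustrative subset of query parameters (assumption, not the full list).
QUERY_KEYS = ("catalog", "activity_id", "table_id", "variable_id")


def new_query() -> dict:
    """Return a query dict with every parameter in the unset state."""
    return {key: UNSET for key in QUERY_KEYS}


query = new_query()
query["catalog"] = "cadcat"
# Identity comparison finds parameters that were never set.
missing = [key for key, value in query.items() if value is UNSET]
print(missing)
```

A dedicated sentinel lets None remain a legitimate user-supplied value while still making "never set" detectable.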
verbosity(level)
Set the logging verbosity level.
This method allows dynamic adjustment of logging verbosity and supports method chaining.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| level | int | Logging verbosity level: <= -2: effectively silent (no logs); -1: WARNING; 0: INFO (default); > 0: DEBUG. | required |
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance for method chaining. |
Examples:
>>> cd = ClimateData()
>>> cd.verbosity(-1)  # warnings only
>>> cd.verbosity(0)  # info (default)
>>> cd.verbosity(1)  # debug
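The verbosity mapping above can be expressed against the standard-library logging levels. This is a minimal sketch assuming the mapping works as documented; `verbosity_to_level` and the logger name are hypothetical, and the library may implement the "silent" case differently.

```python
import logging


def verbosity_to_level(verbosity: int) -> int:
    """Map a verbosity integer onto a standard-library logging level."""
    if verbosity <= -2:
        # Above CRITICAL, so no standard record passes the threshold.
        return logging.CRITICAL + 1
    if verbosity == -1:
        return logging.WARNING
    if verbosity == 0:
        return logging.INFO
    return logging.DEBUG


logger = logging.getLogger("verbosity_sketch")  # hypothetical logger name
logger.setLevel(verbosity_to_level(-1))
```

With the level set from `-1`, warnings are emitted and info messages are suppressed.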
catalog(catalog)
Set the data catalog to use for the query.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| catalog | str | The name of the catalog (e.g., "renewables", "climate"). | required |
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance for method chaining. |
installation(installation)
Set the installation type for the query.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| installation | str | The installation type (e.g., "pv_utility", "wind_offshore"). | required |
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance for method chaining. |
activity_id(activity_id)
Set the activity identifier for the query.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| activity_id | str | The activity ID (e.g., "CMIP6", "CORDEX"). | required |
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance for method chaining. |
institution_id(institution_id)
Set the institution identifier for the query.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| institution_id | str | The institution ID (e.g., "CNRM", "DWD"). | required |
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance for method chaining. |
source_id(source_id)
Set the source identifier for the query.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_id | str | The source ID (e.g., "GCM", "RCM", "Station"). | required |
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance for method chaining. |
experiment_id(experiment_id)
Set the experiment identifier for the query.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| experiment_id | str or list of str | The experiment ID (e.g., "historical", "ssp245") or a list of experiment IDs to query multiple scenarios at once. | required |
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance for method chaining. |
Examples:
>>> cd = ClimateData()
>>> cd.experiment_id("ssp245")  # Single experiment
>>> cd.experiment_id(["historical", "ssp245", "ssp370"])  # Multiple
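Accepting either a single string or a list is commonly handled by normalizing the input to a list first. The helper below is a hypothetical sketch of that idea; whether the library normalizes internally in this way is an assumption.

```python
from typing import List, Union


def normalize_experiment_ids(experiment_id: Union[str, List[str]]) -> List[str]:
    """Return experiment IDs as a list, whether given one string or several."""
    if isinstance(experiment_id, str):
        # Check str first: a string is itself iterable, character by character.
        return [experiment_id]
    ids = list(experiment_id)
    if not all(isinstance(item, str) for item in ids):
        raise ValueError("experiment_id must be a str or a list of str")
    return ids


single = normalize_experiment_ids("ssp245")
several = normalize_experiment_ids(["historical", "ssp245", "ssp370"])
```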
table_id(table_id)
Set the temporal resolution identifier for the query.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| table_id | str | The temporal resolution (e.g., "1hr", "day", "mon"). | required |
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance for method chaining. |
grid_label(grid_label)
Set the spatial resolution identifier for the query.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| grid_label | str | The spatial resolution (e.g., "d01", "d02", "d03"). | required |
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance for method chaining. |
variable(variable)
Set the climate variable to retrieve.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| variable | str | The variable identifier (e.g., "tasmax", "pr", "cf"). Can also be a registered derived variable name. | required |
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance for method chaining. |
derived_variable(name, depends_on, func, description='', units='', **query_extras)
Register and query a user-defined derived variable.
This method registers a custom function that computes a new variable from existing source variables, then sets that variable as the query target. The computation happens during data loading (not as a post-processor).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name for the new derived variable. This becomes queryable like any other variable in the catalog. | required |
| depends_on | list of str | List of source variable IDs required for the computation (e.g., ['tasmax', 'tasmin'] or ['t2', 'rh']). | required |
| func | callable | Function that takes an xarray.Dataset and returns a modified Dataset with the new variable added; calc_temp_range in the example below has the expected form. | required |
| description | str | Human-readable description of what this variable represents. | '' |
| units | str | Expected units of the derived variable. | '' |
| **query_extras | | Additional query constraints (e.g., table_id='day'). | {} |
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance for method chaining. |
Examples:
Define and query a custom temperature range variable:
>>> def calc_temp_range(ds):
...     ds['temp_range'] = ds.tasmax - ds.tasmin
...     ds['temp_range'].attrs = {'units': 'K', 'long_name': 'Daily Range'}
...     return ds
...
>>> cd = ClimateData()
>>> data = (cd
...     .catalog("cadcat")
...     .activity_id("LOCA2")
...     .table_id("day")
...     .grid_label("d03")
...     .derived_variable(
...         name='temp_range',
...         depends_on=['tasmax', 'tasmin'],
...         func=calc_temp_range,
...         description='Daily temperature range',
...         units='K'
...     )
...     .get())
Notes
- Registration is permanent for the Python session
- The function must add the variable to the dataset and return it
- Set appropriate attributes (units, long_name) on the new variable
- For complex post-load transformations, use processors instead
See Also
show_derived_variables : View all registered derived variables
climakitae.new_core.derived_variables : Module documentation
station_id(station_id)
Set the station identifier for the query.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| station_id | str | The station ID (e.g., "ASOSAWOS_72019300117", "ASOSAWOS_72020200118"). | required |
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance for method chaining. |
network_id(network_id)
Set the network identifier for the query.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| network_id | str or list of str | The network ID (e.g., "ASOSAWOS", "CWOP"). | required |
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance for method chaining. |
processes(processes)
Set processing operations to apply to the retrieved data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| processes | Dict[str, Union[str, Iterable]] | A dictionary of processing operations and their parameters. | required |
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance for method chaining. |
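One way to picture a dictionary of processing operations is as a sequence of named steps, each looked up in a registry and applied with its parameters. The sketch below is an assumption about that general pattern: the processor names `scale` and `clip_min`, the `PROCESSORS` registry, and the in-order application are all hypothetical and are not taken from the library.

```python
from typing import Any, Callable, Dict, List

# Hypothetical processors operating on a plain list of values.
PROCESSORS: Dict[str, Callable[[List[float], Any], List[float]]] = {
    "scale": lambda values, factor: [v * factor for v in values],
    "clip_min": lambda values, floor: [max(v, floor) for v in values],
}


def apply_processes(values: List[float], processes: Dict[str, Any]) -> List[float]:
    """Apply each named processor in dictionary order."""
    for name, params in processes.items():
        if name not in PROCESSORS:
            raise ValueError(f"unknown processor: {name}")
        values = PROCESSORS[name](values, params)
    return values


result = apply_processes([-1.0, 2.0], {"scale": 2.0, "clip_min": 0.0})
print(result)
```

Use show_processors to see which processor names the library actually registers.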
get()
Execute the configured query and retrieve climate data.
Validates required parameters, creates the appropriate dataset using the factory pattern, executes the query, and resets the query state for the next use.
Thread Safety
This method takes a snapshot of the query at the start of execution, making it safe to call from multiple threads on the same ClimateData instance. However, for maximum clarity and safety, it is recommended to use separate ClimateData instances in multi-threaded scenarios.
Returns:
| Type | Description |
|---|---|
| Optional[DataArray] | The retrieved climate data as a lazy-loaded xarray DataArray, or None if the query fails or validation errors occur. |
Raises:
| Type | Description |
|---|---|
| ValueError | If required parameters are missing during validation. |
| Exception | If there are errors during dataset creation or execution. |
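The snapshot-then-reset behaviour described for get() can be sketched as follows. `SnapshotQuery` is a hypothetical class, the lock is an assumption added for the sketch, and returning the snapshot stands in for returning data; the library's actual mechanism may differ.

```python
import copy
import threading
from typing import Any, Dict, Optional


class SnapshotQuery:
    """Hypothetical class illustrating snapshot-then-reset (not ClimateData)."""

    def __init__(self) -> None:
        self._query: Dict[str, Any] = {}
        self._lock = threading.Lock()  # assumption: guard shared state

    def set(self, key: str, value: Any) -> "SnapshotQuery":
        with self._lock:
            self._query[key] = value
        return self

    def get(self) -> Optional[Dict[str, Any]]:
        # Copy and clear under one lock so each call sees a consistent query.
        with self._lock:
            snapshot = copy.deepcopy(self._query)
            self._query = {}
        if "variable_id" not in snapshot:
            return None  # stands in for "required parameter missing"
        return snapshot  # stands in for the retrieved data


q = SnapshotQuery()
first = q.set("variable_id", "pr").get()
second = q.get()  # the query was reset by the first call
```

Working from a private copy means later setter calls cannot alter a query that is already executing, which is also why a second get() without new parameters finds nothing to run.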
show_query()
Display the current query configuration.
show_catalog_options(show_n=None)
Display available catalog options.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| show_n | int | Maximum number of options to display. If None (default), shows all options. | None |
show_installation_options(show_n=None)
Display available installation options.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| show_n | int | Maximum number of options to display. If None (default), shows all options. | None |
show_activity_id_options(show_n=None)
Display available activity ID options.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| show_n | int | Maximum number of options to display. If None (default), shows all options. | None |
show_institution_id_options(show_n=None)
Display available institution ID options.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| show_n | int | Maximum number of options to display. If None (default), shows all options. | None |
show_source_id_options(show_n=None)
Display available source ID options.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| show_n | int | Maximum number of options to display. If None (default), shows all options. | None |
show_experiment_id_options(show_n=None)
Display available experiment ID options.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| show_n | int | Maximum number of options to display. If None (default), shows all options. | None |
show_station_id_options(show_n=None)
Display available station ID options.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| show_n | int | Maximum number of stations to display. If None (default), shows all stations. | None |
show_network_id_options(show_n=None)
Display available network ID options.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| show_n | int | Maximum number of options to display. If None (default), shows all options. | None |
show_table_id_options(show_n=None)
Display available table ID options (Temporal resolutions).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| show_n | int | Maximum number of options to display. If None (default), shows all options. | None |
show_grid_label_options(show_n=None)
Display available grid label options (Spatial resolutions).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| show_n | int | Maximum number of options to display. If None (default), shows all options. | None |
show_variable_options(show_n=None)
Display available variable options.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| show_n | int | Maximum number of options to display. If None (default), shows all options. | None |
show_derived_variables()
Display all registered derived variables.
Shows both builtin and user-registered derived variables with their dependencies and descriptions.
Examples:
>>> cd = ClimateData()
>>> cd.show_derived_variables()
Derived Variables (computed from source variables during loading):
------------------------------------------------------------------
wind_speed_10m depends on: u10, v10
heat_index depends on: t2, rh
...
show_processors(show_n=None)
Display available data processors.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| show_n | int | Maximum number of processors to display. If None (default), shows all processors. | None |
show_station_options(show_n=None)
Display available station options for data retrieval.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| show_n | int | Maximum number of stations to display. If None (default), shows all stations. | None |
show_boundary_options(boundary_type=UNSET, show_n=None)
Display available boundaries for spatial queries.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| boundary_type | str | The type of boundary to display (e.g., "ca_counties", "ca_watersheds"). If not specified, displays available boundary types. | UNSET |
| show_n | int | Maximum number of boundaries to display. If None (default), shows all boundaries. | None |
show_all_options()
Display all available options for exploration.
reset()
Manually reset the query parameters.
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance with reset parameters. |
copy_query()
Get a copy of the current query parameters.
Returns:
| Type | Description |
|---|---|
| Dict[str, Any] | A copy of the current query parameters. |
load_query(query_params)
Load query parameters from a dictionary.
Uses the individual setter methods to ensure validation is applied to each parameter. Unknown keys are silently ignored.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| query_params | Dict[str, Any] | Dictionary of query parameters to load. Supported keys: catalog, installation, activity_id, institution_id, source_id, experiment_id, table_id, grid_label, variable_id, processes. | required |
Returns:
| Type | Description |
|---|---|
| ClimateData | The current instance with loaded parameters. |
Raises:
| Type | Description |
|---|---|
| ValueError | If any parameter value fails validation. |
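The round trip between copy_query and load_query, including the "route through setters, ignore unknown keys" behaviour, can be sketched with a small stand-in. `MiniQuery`, its two parameters, and its non-empty-string validation rule are hypothetical; the real setters may validate differently.

```python
from typing import Any, Dict


class MiniQuery:
    """Hypothetical two-parameter query object (not ClimateData itself)."""

    SUPPORTED_KEYS = ("catalog", "table_id")

    def __init__(self) -> None:
        self._query: Dict[str, Any] = {}

    def _set(self, key: str, value: Any) -> "MiniQuery":
        # Assumed validation rule: reject empty or non-string values.
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"{key} must be a non-empty string")
        self._query[key] = value
        return self

    def catalog(self, catalog: str) -> "MiniQuery":
        return self._set("catalog", catalog)

    def table_id(self, table_id: str) -> "MiniQuery":
        return self._set("table_id", table_id)

    def copy_query(self) -> Dict[str, Any]:
        return dict(self._query)

    def load_query(self, query_params: Dict[str, Any]) -> "MiniQuery":
        for key in self.SUPPORTED_KEYS:
            if key in query_params:
                # Route through the setter so its validation runs.
                getattr(self, key)(query_params[key])
        return self  # keys outside SUPPORTED_KEYS were skipped silently


saved = MiniQuery().catalog("cadcat").table_id("day").copy_query()
restored = MiniQuery().load_query({**saved, "not_a_parameter": 1})
print(restored.copy_query())
```

Reusing the setters keeps validation in one place, so a dictionary loaded from disk is checked exactly as if each parameter had been chained by hand.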