Legacy Boundaries

The climakitae.core.boundaries module provides the geographic lookup layer that the legacy API uses to subset data by named regions, stations, and bounding boxes.

Warning

This page documents the legacy climakitae.core.boundaries module. It is kept for backward compatibility. New code should use climakitae.new_core.data_access.boundaries.

What this module does

  • loads the legacy boundary parquet catalog
  • caches the available geometries in memory
  • exposes lookup dictionaries used by DataParameters.cached_area
  • normalizes a small set of boundary categories used by the GUI

Available boundary groups

Boundary key                                     Meaning
states                                           Western US states
CA counties                                      California counties
CA watersheds                                    California HUC8 watersheds
CA Electric Load Serving Entities (IOU & POU)    Investor-owned and publicly-owned utilities
CA Electricity Demand Forecast Zones             Forecast zones used for demand planning
CA Electric Balancing Authority Areas            Balancing authority areas
lat/lon                                          Coordinate-based selection
none                                             Entire-domain selection

Usage example

from climakitae.core.data_interface import DataInterface

boundaries = DataInterface().geographies
county_lookup = boundaries.boundary_dict()["CA counties"]
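
To make the two-level shape of this mapping concrete, the sketch below uses a hand-built stand-in instead of calling climakitae. The "none" and "lat/lon" entries mirror the boundary_dict() source shown later on this page. The county names, the index values, and the helper resolve_area are hypothetical and exist only to illustrate the lookup.

```python
# Hand-built stand-in for the mapping returned by boundary_dict().
# County names and index values are made up for illustration.
stand_in_options = {
    "none": {"entire domain": 0},
    "lat/lon": {"coordinate selection": 0},
    "CA counties": {"Alameda County": 12, "Yolo County": 3},
}


def resolve_area(options, category, name):
    """Return the index stored for `name` within `category` (hypothetical helper)."""
    try:
        group = options[category]
    except KeyError:
        raise KeyError(f"unknown boundary category: {category!r}") from None
    try:
        return group[name]
    except KeyError:
        raise KeyError(f"{name!r} is not in category {category!r}") from None


county_index = resolve_area(stand_in_options, "CA counties", "Yolo County")
domain_index = resolve_area(stand_in_options, "none", "entire domain")
```

The outer key selects a boundary group and the inner key selects one named area within it, which is the same access pattern as boundary_dict()["CA counties"] in the example above.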

Public API

Gets geospatial polygon data from the parquet catalog stored on S3. Used to access boundaries for subsetting data by state, county, and other regions.

Attributes:

Name                            Type         Description
_cat                            Catalog      Parquet boundary catalog instance
_us_states                      DataFrame    Table of US state names and geometries
_ca_counties                    DataFrame    Table of California county names and geometries, sorted alphabetically by county name
_ca_watersheds                  DataFrame    Table of California watershed names and geometries, sorted alphabetically by watershed name
_ca_utilities                   DataFrame    Table of California IOUs and POUs, with names and geometries
_ca_forecast_zones              DataFrame    Table of California Demand Forecast Zones
_ca_electric_balancing_areas    DataFrame    Table of Electric Balancing Areas

Methods:

Name                             Description
_get_us_states                   Returns a dict of state abbreviations and indices
_get_ca_counties                 Returns a dict of California counties and their indices
_get_ca_watersheds               Returns a dict for CA watersheds and their indices
_get_forecast_zones              Returns a dict for CA electricity demand forecast zones
_get_ious_pous                   Returns a dict for CA electric load serving entities (IOUs & POUs)
_get_electric_balancing_areas    Returns a dict for CA Electric Balancing Authority Areas

Source code in climakitae/core/boundaries.py
def __init__(self, boundary_catalog):
    # Connect intake Catalog to class
    self._cat = boundary_catalog

load()

Reads the parquet files and sets the class attributes.

Source code in climakitae/core/boundaries.py
def load(self):
    """Read parquet files and sets class attributes."""
    self._us_states = self._cat.states.read()
    self._ca_counties = self._cat.counties.read().sort_values("NAME")
    self._ca_watersheds = self._cat.huc8.read().sort_values("Name")
    self._ca_utilities = self._cat.utilities.read()
    self._ca_forecast_zones = self._cat.dfz.read()
    self._ca_electric_balancing_areas = self._cat.eba.read()

    # EBA CALISO polygon has two options
    # One of the polygons is super tiny, with a negligible area
    # Perhaps this is an error from the producers of the data
    # Just grab the CALISO polygon with the large area
    tiny_caliso = self._ca_electric_balancing_areas.loc[
        (self._ca_electric_balancing_areas["NAME"] == "CALISO")
        & (self._ca_electric_balancing_areas["SHAPE_Area"] < 100)
    ].index
    self._ca_electric_balancing_areas = self._ca_electric_balancing_areas.drop(
        tiny_caliso
    )

    # For Forecast Zones named "Other", replace that with the name of the county
    self._ca_forecast_zones.loc[
        self._ca_forecast_zones["FZ_Name"] == "Other", "FZ_Name"
    ] = self._ca_forecast_zones["FZ_Def"]
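
The two cleanup steps at the end of load() can be tried in isolation on toy tables. The following is a minimal sketch, not the library's own test data: the column names (NAME, SHAPE_Area, FZ_Name, FZ_Def) come from the source above, while every row value is invented for illustration.

```python
import pandas as pd

# Toy balancing-area table: two CALISO rows, one with a negligible area.
# All values are made up; only the column names come from load().
eba = pd.DataFrame(
    {
        "NAME": ["CALISO", "CALISO", "LADWP"],
        "SHAPE_Area": [5.0e10, 12.0, 3.0e9],
    }
)
# Find the index labels of the tiny CALISO rows, then drop them.
tiny_caliso = eba.loc[(eba["NAME"] == "CALISO") & (eba["SHAPE_Area"] < 100)].index
eba = eba.drop(tiny_caliso)

# Toy forecast-zone table: one zone is named "Other".
zones = pd.DataFrame(
    {
        "FZ_Name": ["Central Valley", "Other"],
        "FZ_Def": ["Fresno", "Modoc"],
    }
)
# Only rows selected by the mask are overwritten; the right-hand Series
# is aligned on the index, so each row receives its own FZ_Def value.
zones.loc[zones["FZ_Name"] == "Other", "FZ_Name"] = zones["FZ_Def"]
```

Note that drop() leaves the remaining index labels unchanged, so the surviving rows keep the labels they had in the original table.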

boundary_dict()

Return a dict of the other boundary dicts, used to populate ck.Select.

This returns a dictionary of lookup dictionaries for each set of geoparquet files that the user might be choosing from. It is used to populate the DataParameters cached_area dynamically as the category in the area_subset parameter changes.

Returns:

Type    Description
dict    Dictionary of lookup dictionaries, keyed by boundary category
Source code in climakitae/core/boundaries.py
def boundary_dict(self):
    """Return a dict of the other boundary dicts, used to populate ck.Select.

    This returns a dictionary of lookup dictionaries for each set of
    geoparquet files that the user might be choosing from. It is used to
    populate the `DataParameters` cached_area dynamically as the category
    in the area_subset parameter changes.

    Returns
    -------
    dict

    """
    all_options = {
        "none": {"entire domain": 0},
        "lat/lon": {"coordinate selection": 0},
        "states": self._get_us_states(),
        "CA counties": self._get_ca_counties(),
        "CA watersheds": self._get_ca_watersheds(),
        "CA Electric Load Serving Entities (IOU & POU)": self._get_ious_pous(),
        "CA Electricity Demand Forecast Zones": self._get_forecast_zones(),
        "CA Electric Balancing Authority Areas": self._get_electric_balancing_areas(),
    }
    return all_options

Notes on behavior

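  • The county and watershed tables are sorted by name (the NAME and Name columns, respectively) so that options appear in a stable alphabetical order.
  • The balancing authority table drops the CALISO row whose SHAPE_Area is below 100 and keeps the larger CALISO geometry.
  • Forecast zones whose FZ_Name is Other are renamed to the value of their FZ_Def column, which the source comment describes as the county name.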
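
One consequence of sorting before building a lookup is that the keys come out in alphabetical order while each value still points at the row's original index label. The sketch below illustrates this on a toy table. The dict construction with zip is an assumption made for illustration, since the bodies of the _get_* methods are not shown on this page, and the county names are sample values.

```python
import pandas as pd

# Toy county table in its stored (unsorted) order.
counties = pd.DataFrame({"NAME": ["Yolo", "Alameda", "Marin"]})

# sort_values reorders the rows but keeps each row's original index label.
counties = counties.sort_values("NAME")

# Assumed construction of a name -> index lookup; the real
# _get_ca_counties body is not shown on this page.
county_lookup = dict(zip(counties["NAME"], counties.index))
```

Because Python dicts preserve insertion order, iterating over county_lookup yields the names alphabetically, which is what keeps option ordering stable.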