Core Boundaries (Detailed)

Geographic boundary management for climate data clipping.

Overview

climakitae.core.boundaries provides the Boundaries class for managing geographic regions used to subset climate data. It handles:

  • Loading predefined boundaries (counties, watersheds, utilities)
  • Lazy loading to minimize memory usage
  • Caching of boundary geometries
  • Geographic operations (clipping, spatial queries)
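As a rough picture of what lazy loading with caching means here, the sketch below defers reading a table until it is first accessed and then reuses the result. The class name LazyBoundaryTables and its readers argument are hypothetical and exist only for illustration; this is not the climakitae implementation, and the legacy load() method documented further down this page reads all tables in one call.

```python
from functools import cached_property


class LazyBoundaryTables:
    """Hypothetical illustration of lazy loading; not part of climakitae."""

    def __init__(self, readers):
        # readers: mapping of table name -> zero-argument callable that loads it
        self._readers = readers
        self.read_count = 0

    def _read(self, name):
        # Count reads so the caching effect is observable
        self.read_count += 1
        return self._readers[name]()

    @cached_property
    def counties(self):
        # Runs on first access only; the result is then stored on the instance
        return self._read("counties")


# Toy reader standing in for a parquet read
tables = LazyBoundaryTables({"counties": lambda: ["Alameda", "Los Angeles"]})
first = tables.counties   # triggers the read
second = tables.counties  # served from the cached value
```

Nothing is read when the object is constructed, and repeated access returns the same cached object.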

Note

Both the legacy module (climakitae.core.boundaries) and the ClimateData interface (climakitae.new_core.data_access.boundaries) implement similar functionality. New code should use the ClimateData version.

Boundaries Class

Get geospatial polygon data from the S3 stored parquet catalog. Used to access boundaries for subsetting data by state, county, etc.

Attributes:

  • _cat (Catalog): Parquet boundary catalog instance
  • _us_states (DataFrame): Table of US state names and geometries
  • _ca_counties (DataFrame): Table of California county names and geometries, sorted alphabetically by county name
  • _ca_watersheds (DataFrame): Table of California watershed names and geometries, sorted alphabetically by watershed name
  • _ca_utilities (DataFrame): Table of California IOUs and POUs, names and geometries
  • _ca_forecast_zones (DataFrame): Table of California Demand Forecast Zones
  • _ca_electric_balancing_areas (DataFrame): Table of Electric Balancing Areas

Methods:

  • _get_us_states: Returns a dict of state abbreviations and indices
  • _get_ca_counties: Returns a dict of California counties and their indices
  • _get_ca_watersheds: Returns a dict for CA watersheds and their indices
  • _get_forecast_zones: Returns a dict for CA electricity demand forecast zones
  • _get_ious_pous: Returns a dict for CA electric load serving entities IOUs & POUs
  • _get_electric_balancing_areas: Returns a dict for CA Electric Balancing Authority Areas

Source code in climakitae/core/boundaries.py
def __init__(self, boundary_catalog):
    # Connect intake Catalog to class
    self._cat = boundary_catalog
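The _get_* methods listed above are described as returning dicts of names and indices, but their bodies are not shown on this page. The sketch below is one plausible way to build such a lookup from a table that has been sorted by name, as load() does for counties. The toy table and the county_lookup name are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the county table; the real one is read from the parquet catalog
counties = pd.DataFrame({"NAME": ["Los Angeles", "Alameda", "Kern"]}).sort_values("NAME")

# sort_values keeps the original index labels, so each name still maps to
# the label of its source row rather than to its position after sorting
county_lookup = dict(zip(counties["NAME"], counties.index))
```

Because the values are index labels, they can be passed to DataFrame.loc to recover the matching rows.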

load()

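Reads the parquet files and sets the class attributes. It also drops the negligible-area duplicate CALISO polygon from the balancing areas table and replaces forecast zone names of "Other" with the corresponding FZ_Def value.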

Source code in climakitae/core/boundaries.py
def load(self):
    """Read parquet files and sets class attributes."""
    self._us_states = self._cat.states.read()
    self._ca_counties = self._cat.counties.read().sort_values("NAME")
    self._ca_watersheds = self._cat.huc8.read().sort_values("Name")
    self._ca_utilities = self._cat.utilities.read()
    self._ca_forecast_zones = self._cat.dfz.read()
    self._ca_electric_balancing_areas = self._cat.eba.read()

    # EBA CALISO polygon has two options
    # One of the polygons is super tiny, with a negligible area
    # Perhaps this is an error from the producers of the data
    # Just grab the CALISO polygon with the large area
    tiny_caliso = self._ca_electric_balancing_areas.loc[
        (self._ca_electric_balancing_areas["NAME"] == "CALISO")
        & (self._ca_electric_balancing_areas["SHAPE_Area"] < 100)
    ].index
    self._ca_electric_balancing_areas = self._ca_electric_balancing_areas.drop(
        tiny_caliso
    )

    # For Forecast Zones named "Other", replace that with the name of the county
    self._ca_forecast_zones.loc[
        self._ca_forecast_zones["FZ_Name"] == "Other", "FZ_Name"
    ] = self._ca_forecast_zones["FZ_Def"]
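The two clean-up steps at the end of load() can be tried on small tables. The sketch below repeats the same pandas operations on made-up data; the column names come from the source above, while the row values ("LADWP", "Zone A", "County X", and the areas) are invented for illustration.

```python
import pandas as pd

# Toy balancing-area table with a duplicate, negligible-area CALISO row
eba = pd.DataFrame(
    {
        "NAME": ["CALISO", "CALISO", "LADWP"],
        "SHAPE_Area": [5.0e10, 12.0, 3.0e9],
    }
)

# Find the tiny CALISO row by name and area, then drop it by index label
tiny_caliso = eba.loc[(eba["NAME"] == "CALISO") & (eba["SHAPE_Area"] < 100)].index
eba = eba.drop(tiny_caliso)

# Toy forecast-zone table where some zones are only named "Other"
fz = pd.DataFrame(
    {
        "FZ_Name": ["Zone A", "Other", "Other"],
        "FZ_Def": ["Zone A def", "County X", "County Y"],
    }
)

# The right-hand Series is aligned on the index, so only the masked rows
# receive their own FZ_Def value
fz.loc[fz["FZ_Name"] == "Other", "FZ_Name"] = fz["FZ_Def"]
```

After these steps a single CALISO row remains, and every forecast zone has a distinguishing name.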

boundary_dict()

Return a dict of the other boundary dicts, used to populate ck.Select.

This returns a dictionary of lookup dictionaries for each set of geoparquet files that the user might be choosing from. It is used to populate the DataParameters cached_area dynamically as the category in the area_subset parameter changes.

Returns:

dict: maps each boundary category name to its lookup dictionary

Source code in climakitae/core/boundaries.py
def boundary_dict(self):
    """Return a dict of the other boundary dicts, used to populate ck.Select.

    This returns a dictionary of lookup dictionaries for each set of
    geoparquet files that the user might be choosing from. It is used to
    populate the `DataParameters` cached_area dynamically as the category
    in the area_subset parameter changes.

    Returns
    -------
    dict

    """
    all_options = {
        "none": {"entire domain": 0},
        "lat/lon": {"coordinate selection": 0},
        "states": self._get_us_states(),
        "CA counties": self._get_ca_counties(),
        "CA watersheds": self._get_ca_watersheds(),
        "CA Electric Load Serving Entities (IOU & POU)": self._get_ious_pous(),
        "CA Electricity Demand Forecast Zones": self._get_forecast_zones(),
        "CA Electric Balancing Authority Areas": self._get_electric_balancing_areas(),
    }
    return all_options
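The returned structure is a two-level mapping: category name, then region name, then row index. A minimal sketch of a lookup against it follows. The helper resolve_area is hypothetical and not part of climakitae, and the county indices are made up.

```python
def resolve_area(all_options, category, names):
    """Return the row indices for `names` within one boundary category."""
    lookup = all_options[category]
    missing = [n for n in names if n not in lookup]
    if missing:
        raise KeyError(f"Unknown {category} entries: {missing}")
    return [lookup[n] for n in names]


# Toy stand-in for the dict returned by boundary_dict()
all_options = {
    "none": {"entire domain": 0},
    "lat/lon": {"coordinate selection": 0},
    "CA counties": {"Alameda": 0, "Contra Costa": 6},  # made-up indices
}

indices = resolve_area(all_options, "CA counties", ["Alameda", "Contra Costa"])
```

The "none" and "lat/lon" categories follow the same shape, each with a single placeholder entry.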

Available Boundary Types

The Boundaries class provides access to several predefined boundary catalogs:

  • US States — Western US states
  • CA Counties — All 58 California counties
  • CA Watersheds — HUC8-level watersheds in California
  • CA Utilities — Investor-owned utilities (IOUs) and publicly-owned utilities (POUs)
  • CA Electric Zones — Electricity demand forecast zones
  • CA Balancing Authorities — Electric balancing authority areas
  • CA Census Tracts — Census tract boundaries
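The names in the list above are shorter than the category keys used by boundary_dict(). The mapping below is a rough correspondence inferred from the source shown on this page, not an official table. Note that the boundary_dict() source shown here has no census tract category, so that entry is left unmatched.

```python
# Display name from the list above -> category key in boundary_dict()
DISPLAY_TO_CATEGORY = {
    "US States": "states",
    "CA Counties": "CA counties",
    "CA Watersheds": "CA watersheds",
    "CA Utilities": "CA Electric Load Serving Entities (IOU & POU)",
    "CA Electric Zones": "CA Electricity Demand Forecast Zones",
    "CA Balancing Authorities": "CA Electric Balancing Authority Areas",
    "CA Census Tracts": None,  # no matching key in the boundary_dict() source shown
}

# Entries with no corresponding category in the legacy class
unmatched = [name for name, key in DISPLAY_TO_CATEGORY.items() if key is None]
```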

Usage Example

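import intake

from climakitae.core.boundaries import Boundaries

# Boundaries requires an intake catalog of the boundary parquet files.
# Assumption: the path below is a placeholder; this page does not give the catalog location.
boundary_catalog = intake.open_catalog("path/to/boundary_catalog.yaml")

# Initialize the boundary loader and read the parquet tables
boundaries = Boundaries(boundary_catalog)
boundaries.load()

# Get available counties as a dict of county name -> row index
counties = boundaries.boundary_dict()["CA counties"]

# Get geometry for a specific county
# Assumptions: "Los Angeles" matches a key in `counties`, and the table has a "geometry" column
la_geom = boundaries._ca_counties.loc[counties["Los Angeles"], "geometry"]

# Multiple regions: select the rows, then union them with your geometry library of choice
multi_rows = boundaries._ca_counties.loc[[counties[n] for n in ["Alameda", "Contra Costa"]]]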