Core Boundaries (Detailed)

Geographic boundary management for climate data clipping.

Overview

climakitae.core.boundaries provides the Boundaries class for managing geographic regions used to subset climate data. It handles:

- Loading predefined boundaries (counties, watersheds, utilities)
- Lazy loading to minimize memory usage
- Caching of boundary geometries
- Geographic operations (clipping, spatial queries)
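The lazy loading and caching listed above can be illustrated with the standard library alone. This is a minimal sketch of the general pattern, not the climakitae implementation: `LazyBoundaryTables`, `readers`, and `read_counts` are hypothetical names, and the lambda stands in for a parquet read.

```python
from functools import cached_property


class LazyBoundaryTables:
    """Hypothetical wrapper: a table is read on first access, then cached."""

    def __init__(self, readers):
        # readers maps a table name to a zero-argument callable that loads it
        self._readers = readers
        # Track how often each reader actually runs (for illustration only)
        self.read_counts = {name: 0 for name in readers}

    def _read(self, name):
        self.read_counts[name] += 1
        return self._readers[name]()

    @cached_property
    def counties(self):
        # Runs once; later accesses return the cached result
        return self._read("counties")


# The lambda is a stand-in for an expensive parquet read
tables = LazyBoundaryTables({"counties": lambda: ["Alameda", "Los Angeles"]})
first = tables.counties   # triggers the read
second = tables.counties  # served from the cache
```

Nothing is read when the object is constructed, and the reader runs only once however many times `counties` is accessed.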

Note

Both the legacy module (climakitae.core.boundaries) and the ClimateData interface (climakitae.new_core.data_access.boundaries) implement similar functionality. New code should use the ClimateData version.

Boundaries Class

Gets geospatial polygon data from the S3-stored parquet catalog. Used to access boundaries for subsetting data by state, county, etc.

Attributes:

Name	Type	Description
`_cat`	`Catalog`	Parquet boundary catalog instance
`_us_states`	`DataFrame`	Table of US state names and geometries
`_ca_counties`	`DataFrame`	Table of California county names and geometries, sorted alphabetically by county name
`_ca_watersheds`	`DataFrame`	Table of California watershed names and geometries, sorted alphabetically by watershed name
`_ca_utilities`	`DataFrame`	Table of California IOUs and POUs, names and geometries
`_ca_forecast_zones`	`DataFrame`	Table of California Demand Forecast Zones
`_ca_electric_balancing_areas`	`DataFrame`	Table of Electric Balancing Areas

Methods:

Name	Description
`_get_us_states`	Returns a dict of state abbreviations and indices
`_get_ca_counties`	Returns a dict of California counties and their indices
`_get_ca_watersheds`	Returns a dict for CA watersheds and their indices
`_get_forecast_zones`	Returns a dict for CA electricity demand forecast zones
`_get_ious_pous`	Returns a dict for CA electric load serving entities IOUs & POUs
`_get_electric_balancing_areas`	Returns a dict for CA Electric Balancing Authority Areas
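Each `_get_*` method above returns a dict that maps a name to an index. The sketch below shows one way such a lookup could be built from a table; it is an assumption about the pattern, not the library's code. The helper `name_to_index` and the toy index values are hypothetical; only the `NAME` column and the sort by name come from the source.

```python
import pandas as pd

# Toy stand-in for a boundary table; real tables also carry a geometry column
counties = pd.DataFrame(
    {"NAME": ["Los Angeles", "Alameda", "Contra Costa"]},
    index=[10, 11, 12],
).sort_values("NAME")


def name_to_index(table, name_column):
    """Map each name in `name_column` to its row index label (hypothetical helper)."""
    return dict(zip(table[name_column], table.index.tolist()))


lookup = name_to_index(counties, "NAME")
```

Because the table is sorted before the lookup is built, the dict keys come out in alphabetical order while the values keep the original row labels.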

Source code in climakitae/core/boundaries.py

def __init__(self, boundary_catalog):
    # Connect intake Catalog to class
    self._cat = boundary_catalog

`load()`

Reads parquet files and sets class attributes.

Source code in climakitae/core/boundaries.py

def load(self):
    """Read parquet files and sets class attributes."""
    self._us_states = self._cat.states.read()
    self._ca_counties = self._cat.counties.read().sort_values("NAME")
    self._ca_watersheds = self._cat.huc8.read().sort_values("Name")
    self._ca_utilities = self._cat.utilities.read()
    self._ca_forecast_zones = self._cat.dfz.read()
    self._ca_electric_balancing_areas = self._cat.eba.read()

    # EBA CALISO polygon has two options
    # One of the polygons is super tiny, with a negligible area
    # Perhaps this is an error from the producers of the data
    # Just grab the CALISO polygon with the large area
    tiny_caliso = self._ca_electric_balancing_areas.loc[
        (self._ca_electric_balancing_areas["NAME"] == "CALISO")
        & (self._ca_electric_balancing_areas["SHAPE_Area"] < 100)
    ].index
    self._ca_electric_balancing_areas = self._ca_electric_balancing_areas.drop(
        tiny_caliso
    )

    # For Forecast Zones named "Other", replace that with the name of the county
    self._ca_forecast_zones.loc[
        self._ca_forecast_zones["FZ_Name"] == "Other", "FZ_Name"
    ] = self._ca_forecast_zones["FZ_Def"]
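The two cleanup steps at the end of `load()` can be tried on toy tables. This is a sketch with made-up rows, assuming only the column names that appear in the source (`NAME`, `SHAPE_Area`, `FZ_Name`, `FZ_Def`); the values are illustrative and not taken from the real catalog.

```python
import pandas as pd

# Toy balancing-area table with a duplicate, near-zero-area CALISO record
eba = pd.DataFrame(
    {"NAME": ["CALISO", "CALISO", "BANC"], "SHAPE_Area": [5.0e10, 3.2, 4.0e9]}
)
# Find the tiny CALISO polygon by name and area, then drop it by index label
tiny = eba.loc[(eba["NAME"] == "CALISO") & (eba["SHAPE_Area"] < 100)].index
eba = eba.drop(tiny)

# Toy forecast-zone table; rows named "Other" take their name from FZ_Def
zones = pd.DataFrame(
    {
        "FZ_Name": ["Zone A", "Other", "Other"],
        "FZ_Def": ["Definition A", "County X", "County Y"],
    }
)
# The right-hand Series is aligned on the row index, so only masked rows change
zones.loc[zones["FZ_Name"] == "Other", "FZ_Name"] = zones["FZ_Def"]
```

The second step relies on pandas aligning the assigned Series with the selected rows, which is why the full `FZ_Def` column can be passed without filtering it first.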

`boundary_dict()`

Return a dict of the other boundary dicts, used to populate ck.Select.

This returns a dictionary of lookup dictionaries for each set of geoparquet files that the user might be choosing from. It is used to populate the DataParameters cached_area dynamically as the category in the area_subset parameter changes.

Returns:

Type	Description
`dict`	Maps each boundary category name to its lookup dictionary of area names and indices

Source code in climakitae/core/boundaries.py

def boundary_dict(self):
    """Return a dict of the other boundary dicts, used to populate ck.Select.

    This returns a dictionary of lookup dictionaries for each set of
    geoparquet files that the user might be choosing from. It is used to
    populate the `DataParameters` cached_area dynamically as the category
    in the area_subset parameter changes.

    Returns
    -------
    dict

    """
    all_options = {
        "none": {"entire domain": 0},
        "lat/lon": {"coordinate selection": 0},
        "states": self._get_us_states(),
        "CA counties": self._get_ca_counties(),
        "CA watersheds": self._get_ca_watersheds(),
        "CA Electric Load Serving Entities (IOU & POU)": self._get_ious_pous(),
        "CA Electricity Demand Forecast Zones": self._get_forecast_zones(),
        "CA Electric Balancing Authority Areas": self._get_electric_balancing_areas(),
    }
    return all_options
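The nested structure returned by `boundary_dict()` lends itself to a two-step lookup: pick a category, then pick areas within it. The sketch below uses a toy dict with illustrative names and index values; `area_choices` and `area_indices` are hypothetical helpers, not part of climakitae.

```python
# Toy stand-in for the structure returned by boundary_dict();
# the area names and index values are illustrative only
all_options = {
    "none": {"entire domain": 0},
    "lat/lon": {"coordinate selection": 0},
    "CA counties": {"Alameda": 11, "Contra Costa": 12, "Los Angeles": 10},
}


def area_choices(options, area_subset):
    """Return the selectable area names for one boundary category."""
    return list(options[area_subset])


def area_indices(options, area_subset, names):
    """Look up the row index for each selected area name."""
    lookup = options[area_subset]
    return [lookup[name] for name in names]


county_choices = area_choices(all_options, "CA counties")
selected = area_indices(all_options, "CA counties", ["Alameda", "Contra Costa"])
```

Switching the category only changes which inner dict is consulted, which is how the list of available areas can be refreshed when `area_subset` changes.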

Available Boundary Types

The Boundaries class provides access to several predefined boundary catalogs:

US States — Western US states
CA Counties — All 58 California counties
CA Watersheds — HUC8-level watersheds in California
CA Utilities — Investor-owned utilities (IOUs) and publicly-owned utilities (POUs)
CA Electric Zones — Electricity demand forecast zones
CA Balancing Authorities — Electric balancing authority areas
CA Census Tracts — Census tract boundaries

Usage Example

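import intake

from climakitae.core.boundaries import Boundaries

# Open the parquet boundary catalog with intake.
# NOTE: placeholder path; substitute the actual catalog location.
boundary_catalog = intake.open_catalog("path/to/boundary_catalog.yaml")

# Initialize the boundary loader, then read the parquet tables
boundaries = Boundaries(boundary_catalog)
boundaries.load()

# Get available counties as a {name: index} lookup
counties = boundaries.boundary_dict()["CA counties"]

# Get the row (name and geometry) for a specific county.
# The name must match the catalog's NAME column; check `counties` for the spelling.
county_table = boundaries._ca_counties
la_geom = county_table[county_table["NAME"] == "Los Angeles"]

# Multiple regions: select several rows (union them afterwards if one geometry is needed)
multi_geom = county_table[county_table["NAME"].isin(["Alameda", "Contra Costa"])]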