Input Loading Pipeline

Purpose

This spec defines the Cobre input loading architecture: the rank-0 centric loading pattern, file loading sequence with dependency ordering, dependency resolution, conditional loading rules, sparse time-series expansion, data broadcasting strategy, parallel policy loading for warm-start, and the transition to the in-memory data model.

For the file inventory and directory layout, see Input Directory Structure. For the in-memory data model after loading completes, see Internal Structures.

1. Loading Architecture

Input loading follows a rank-0 centric pattern: rank 0 loads and validates all input data, then broadcasts to worker ranks. This design:

  • Minimizes filesystem contention on parallel filesystems
  • Centralizes validation logic on a single rank
  • Reduces complexity of error handling across ranks
  • Ensures all ranks receive identical, validated data

The one exception is policy loading for warm-start (§7), where all ranks load in parallel to avoid bottlenecking on large policy files.

```mermaid
flowchart TB
    subgraph R0 ["Rank 0 — load + validate"]
        direction TB
        C["config.json<br/><i>schema + parse</i>"]
        ST["stages.json<br/><i>branching, noise</i>"]
        SE["System entities <i>(dependency order)</i><br/>buses → lines → hydros → thermals → ncs<br/><i>each: schema validate → cross-ref → index</i>"]
        SD["Stochastic data<br/>seasonal_stats → ar_coefficients → correlation → external/<br/><i>PAR fit (if history) or load precomputed</i>"]
        CN["Constraints<br/><i>generic_constraints + coefficients</i>"]
        VS(["Validated System"])
        C --> ST --> SE --> SD --> CN --> VS
    end

    BC(["broadcast"])
    ALL["All other ranks:<br/>receive System, allocate, set up solver"]
    PW["Policy warm-start<br/><i>all ranks read in parallel (exception)</i>"]

    VS --> BC --> ALL
    PW -.-> ALL
```

Fail-fast: the first validation error aborts the pipeline. All ranks receive identical validated data.

2. File Loading Sequence

Files are loaded in dependency order so that each file can be validated against previously loaded data. Loading fails fast on the first error.

Schema validation first: Every JSON file undergoes full schema validation before any other checks. This includes required vs. optional fields, data types, value ranges, and structural constraints (e.g., array lengths, enum values). Similarly, every Parquet file is validated for expected columns, column types, and per-column value constraints. The “Validation” column in the tables below lists only the additional cross-reference and semantic checks performed after schema validation passes.

Note: The validations listed in this section are illustrative, not exhaustive. Additional validations may be introduced during implementation or as a result of reviewing other specs. See Validation Architecture for the complete multi-layer validation design.

2.1 Root-Level Files

| Order | File | Dependencies | Validation |
|---|---|---|---|
| 1 | config.json | None | JSON schema, execution mode, section structure |
| 2 | stages.json | config | Stage count, season mapping, policy graph, block counts |
| 3 | penalties.json | None | All penalty values > 0, required categories present |
| 4 | initial_conditions.json | None | Array lengths deferred until entity registries loaded |

2.2 System Entity Registries

| Order | File | Dependencies | Validation |
|---|---|---|---|
| 5 | system/buses.json | None | Bus IDs unique |
| 6 | system/lines.json | buses | Source/target bus references valid |
| 7 | system/hydros.json | buses | Bus references valid, cascade references acyclic |
| 8 | system/thermals.json | buses | Bus references valid |
| 9 | system/non_controllable_sources.json | buses | Bus references valid (optional file) |
| 10 | system/pumping_stations.json | hydros, buses | Hydro and bus references valid (optional file) |
| 11 | system/energy_contracts.json | buses | Bus references valid (optional file) |

2.3 System Extension Data

| Order | File | Dependencies | Validation |
|---|---|---|---|
| 12 | system/hydro_geometry.parquet | hydros | Hydro ID coverage, monotonic volume-area-level curves |
| 13 | system/hydro_production_models.json | hydros, stages | Hydro/stage references valid (optional file) |
| 14 | system/fpha_hyperplanes.parquet | hydros | Required only when FPHA source is "precomputed" (optional file) |

2.4 Scenario Data

| Order | File | Dependencies | Validation |
|---|---|---|---|
| 15 | scenarios/inflow_seasonal_stats.parquet | hydros, stages | Hydro/stage coverage (optional file) |
| 16 | scenarios/inflow_ar_coefficients.parquet | hydros, stages | Hydro/stage/lag coverage, consistent AR order (optional file) |
| 17 | scenarios/inflow_history.parquet | hydros | Hydro ID coverage (optional file) |
| 18 | scenarios/load_seasonal_stats.parquet | buses, stages | Bus/stage coverage (optional file) |
| 19 | scenarios/load_factors.json | stages | Block count consistency (optional file) |
| 20 | scenarios/correlation.json | hydros | Group membership covers all hydros (optional file) |
| 21 | scenarios/external_scenarios.parquet | entities, stages | Entity/stage/scenario coverage (optional file) |

2.5 Constraints and Overrides

| Order | File | Dependencies | Validation |
|---|---|---|---|
| 22 | constraints/thermal_bounds.parquet | thermals, stages | Entity/stage references valid (optional file) |
| 23 | constraints/hydro_bounds.parquet | hydros, stages | Entity/stage references valid (optional file) |
| 24 | constraints/line_bounds.parquet | lines, stages | Entity/stage references valid (optional file) |
| 25 | constraints/pumping_bounds.parquet | pumping, stages | Entity/stage references valid (optional file) |
| 26 | constraints/contract_bounds.parquet | contracts, stages | Entity/stage references valid (optional file) |
| 27 | constraints/exchange_factors.json | lines, stages | Line/stage references valid (optional file) |
| 28 | constraints/generic_constraints.json | entities | Entity references valid (optional file) |
| 29 | constraints/generic_constraint_bounds.parquet | generic constraints, stages | Constraint/stage references valid (optional file) |
| 30 | constraints/penalty_overrides_bus.parquet | buses, stages | Entity/stage references valid (optional file) |
| 31 | constraints/penalty_overrides_line.parquet | lines, stages | Entity/stage references valid (optional file) |
| 32 | constraints/penalty_overrides_hydro.parquet | hydros, stages | Entity/stage references valid (optional file) |
| 33 | constraints/penalty_overrides_ncs.parquet | NCS, stages | Entity/stage references valid (optional file) |

2.6 Cross-Reference Validation

After all files are loaded, a final cross-reference validation pass checks every inter-entity reference for existence, consistency, and structural soundness. This pass corresponds to Layers 3 and 4 of the Validation Architecture pipeline. Failures produce CrossReferenceError or ConstraintError variants of LoadError (§8.1).

The following table enumerates all 26 cross-reference validation rules. Rules are grouped by source entity type, followed by structural and coverage checks. Per-file schema validations (field types, required fields, value ranges) are NOT repeated here — those are covered by the “Schema validation first” note at the top of §2.

| # | Source Entity | Reference Field | Target Entity | Validation Rule | Error on Failure |
|---|---|---|---|---|---|
| 1 | Line | source_bus_id | Bus | Must resolve to an existing bus ID in buses.json | CrossReferenceError: line references non-existent source bus |
| 2 | Line | target_bus_id | Bus | Must resolve to an existing bus ID in buses.json | CrossReferenceError: line references non-existent target bus |
| 3 | Line | source_bus_id, target_bus_id | Bus | source_bus_id must differ from target_bus_id (no self-loops) | ConstraintError: line has identical source and target bus |
| 4 | Thermal | bus_id | Bus | Must resolve to an existing bus ID in buses.json | CrossReferenceError: thermal references non-existent bus |
| 5 | Hydro | bus_id | Bus | Must resolve to an existing bus ID in buses.json | CrossReferenceError: hydro references non-existent bus |
| 6 | Hydro | downstream_id | Hydro | When non-null, must resolve to an existing hydro ID in hydros.json | CrossReferenceError: hydro references non-existent downstream hydro |
| 7 | Hydro | diversion.downstream_id | Hydro | When diversion is present, must resolve to an existing hydro ID in hydros.json | CrossReferenceError: hydro diversion references non-existent destination hydro |
| 8 | PumpingStation | bus_id | Bus | Must resolve to an existing bus ID in buses.json | CrossReferenceError: pumping station references non-existent bus |
| 9 | PumpingStation | source_hydro_id | Hydro | Must resolve to an existing hydro ID in hydros.json | CrossReferenceError: pumping station references non-existent source hydro |
| 10 | PumpingStation | destination_hydro_id | Hydro | Must resolve to an existing hydro ID in hydros.json | CrossReferenceError: pumping station references non-existent destination hydro |
| 11 | PumpingStation | source_hydro_id, destination_hydro_id | Hydro | source_hydro_id must differ from destination_hydro_id (no self-pumping) | ConstraintError: pumping station has identical source and destination hydro |
| 12 | EnergyContract | bus_id | Bus | Must resolve to an existing bus ID in buses.json | CrossReferenceError: energy contract references non-existent bus |
| 13 | NonControllableSource | bus_id | Bus | Must resolve to an existing bus ID in buses.json | CrossReferenceError: non-controllable source references non-existent bus |
| 14 | GenericConstraint | entity IDs in expression | Multiple | Every entity ID in the parsed expression must exist in the corresponding entity registry (hydro, thermal, line, bus, pumping station, or contract) | CrossReferenceError: generic constraint expression references non-existent entity |
| 15 | GenericConstraint | constraint_id (in bounds file) | GenericConstraint | Every constraint_id in generic_constraint_bounds.parquet must reference an existing constraint definition in generic_constraints.json | CrossReferenceError: constraint bounds reference non-existent constraint definition |
| 16 | Hydro (cascade) | downstream_id (all hydros) | Hydro | The directed graph formed by all downstream_id references must be acyclic (DAG); cycle detection via topological sort or DFS | ConstraintError: hydro cascade contains a cycle (list participating hydro IDs) |
| 17 | InitialConditions | hydro_id (in storage) | Hydro | Every hydro_id in the storage array must exist in the hydro registry | CrossReferenceError: initial conditions reference non-existent hydro |
| 18 | InitialConditions | hydro_id (in filling_storage) | Hydro | Every hydro_id in the filling_storage array must exist in the hydro registry and must have a filling configuration | CrossReferenceError: filling initial conditions reference non-existent or non-filling hydro |
| 19 | InitialConditions | coverage check | Hydro | Every operating hydro must appear in storage; every filling hydro must appear in filling_storage; no hydro appears in both arrays; no extra entries allowed | ConstraintError: initial conditions coverage mismatch (missing hydro or duplicate entry) |
| 20 | PolicyGraph | transition source_id, target_id | Stage | Every stage ID referenced in policy graph transitions must exist in the stage collection | CrossReferenceError: policy graph transition references non-existent stage |
| 21 | Inflow model | hydro coverage | Hydro | Every operating hydro with PAR-based scenario generation must have corresponding entries in inflow_seasonal_stats.parquet and inflow_ar_coefficients.parquet | ConstraintError: inflow model coverage incomplete (list missing hydro IDs) |
| 22 | FPHA hyperplanes | hydro_id coverage | Hydro | Every hydro with FPHA source "precomputed" must have hyperplane entries in fpha_hyperplanes.parquet; hydros with source "computed" do not require this file | ConstraintError: missing precomputed FPHA hyperplanes for hydro (planes are generated during Initialization only for source "computed") |
| 23 | All entities | entry_stage_id | Stage | When non-null, must resolve to an existing stage ID in the stage collection; the entry stage marks when the entity becomes active | CrossReferenceError: entity references non-existent entry stage |
| 24 | All entities | exit_stage_id | Stage | When non-null, must resolve to an existing stage ID in the stage collection; must be >= entry_stage_id when both are present | CrossReferenceError: entity references non-existent exit stage, or exit precedes entry |
| 25 | Hydro | filling.start_stage_id | Stage | When filling configuration is present, must resolve to an existing stage ID; must fall within the [entry_stage_id, exit_stage_id] range when the lifecycle is bounded | CrossReferenceError: hydro filling references non-existent start stage or stage outside lifecycle |
| 26 | Thermal | gnl_config.lag_stages | Stage (implicit) | When GNL config is present, lag_stages must be positive and the thermal’s lifecycle must span at least lag_stages + 1 stages for dispatch anticipation | ConstraintError: GNL lag exceeds thermal lifecycle span |

Execution order within §2.6. Rules 1–15 and 23–26 (referential integrity, Layer 3) execute first. Rules 16–22 (structural and dimensional, Layers 4–5) execute after all foreign-key references are confirmed valid, because structural checks such as cascade acyclicity (rule 16) depend on all downstream_id references being resolved.

Error collection. All 26 checks run to completion (within their layer), collecting every violation. The user sees every cross-reference problem in a single validation report, consistent with the error collection strategy in Validation Architecture §3.
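As an illustration of this collect-then-report strategy, a loader might structure the pass as sketched below; the function and helper names (cross_reference_pass, check_referential_integrity, check_structural) are hypothetical, not part of the cobre-io API.

```rust
// Hypothetical sketch of the §2.6 pass: referential-integrity rules run first
// and collect every violation; structural rules run only once all foreign-key
// references resolve, since they traverse those references (e.g., rule 16).
fn cross_reference_pass(system: &System) -> Result<(), Vec<LoadError>> {
    // Rules 1–15 and 23–26: referential integrity (Layer 3), collected.
    let mut errors: Vec<LoadError> = check_referential_integrity(system);
    if errors.is_empty() {
        // Rules 16–22: structural and dimensional checks, also collected.
        errors.extend(check_structural(system));
    }
    if errors.is_empty() { Ok(()) } else { Err(errors) }
}
```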

2.7 Policy Files (Warm-Start Only)

| Order | File | Dependencies | Validation |
|---|---|---|---|
| 34 | policy/* | All above | State dictionary matches current system, cut format valid |

Policy loading uses a different pattern — see §7 Parallel Policy Loading.

3. Dependency Graph

Input files form a directed acyclic graph (DAG) of dependencies. The loading sequence in SS2 is a valid topological ordering of this DAG. Files at the same dependency level may be loaded in any order relative to each other.

Placeholder — The dependency graph diagram (../../diagrams/exports/svg/data/json-schema-dependencies.svg) will be revised after the text review is complete.

4. Conditional Loading

Some files are loaded only when certain conditions are met. Missing optional files are not errors — the loader uses defaults (typically empty collections or identity values).

| Condition | Effect |
|---|---|
| training.enabled = false in config.json | Skip scenario noise generation (but still load models if simulation needs them) |
| simulation.enabled = false in config.json | Skip simulation-specific scenario setup |
| policy.mode = "warm_start" in config.json | Load policy/* files (§7) |
| Policy graph has cycle (stages.json) | Validate cycle structure per Infinite Horizon |
| Hydros with FPHA production model, source "precomputed" | Require fpha_hyperplanes.parquet |
| Hydros with FPHA production model, source "computed" | FPHA hyperplanes computed during Initialization phase from geometry and topology data (see §8) |
| pumping_stations.json present | Load pumping bounds, validate hydro/bus references |
| energy_contracts.json present | Load contract bounds |
| non_controllable_sources.json present | Load NCS penalty overrides |
| external_scenarios.parquet present | Use external scenarios instead of PAR-generated ones |

Note on external scenarios scope: The presence of external_scenarios.parquet does NOT imply simulation-only usage. External scenarios can also be used during training — see Scenario Generation §4.2. Loading and validation apply identically regardless of the target phase.
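A minimal sketch of the optional-file rule, using pumping stations as the example; parse_registry is a hypothetical helper standing in for the usual schema-validate-and-parse path.

```rust
use std::path::Path;

// Missing optional files are not errors (§4): the loader substitutes the
// default, here an empty collection. `parse_registry` is hypothetical.
fn load_optional_pumping_stations(path: &Path) -> Result<Vec<PumpingStation>, LoadError> {
    if !path.exists() {
        return Ok(Vec::new()); // default: empty registry
    }
    parse_registry(path) // schema validation + parse, as for any required file
}
```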

5. Sparse Time-Series Expansion

Time-series Parquet files (bounds, penalty overrides) use sparse representation: only rows with non-default values are stored. Stages and entities not present in the file receive default values.

Expansion behavior:

  1. Load the sparse Parquet file (only rows with explicit values).
  2. Build a dense (stage × entity) structure initialized with defaults.
  3. Overlay the sparse values onto the dense structure.
  4. Validate that all stage IDs and entity IDs in the sparse data are valid (reference loaded registries).

Default values: Each file type has its own default. For bounds files, the default is the static bound from the entity registry. For penalty overrides, the default is the global value from penalties.json. This is defined per file in the relevant data model spec — see Input System Entities and Penalty System.
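A sketch of the expansion, under the simplifying assumption that the sparse rows have already been index-resolved and validated against the registries (step 4); the names and the flat row-major layout are illustrative.

```rust
// Sparse-to-dense expansion (§5): start from per-entity defaults, then overlay
// the explicit sparse rows. Dense layout is row-major (stage × entity).
fn expand_sparse(
    n_stages: usize,
    n_entities: usize,
    defaults: &[f64],                    // one default per entity (static bound or global penalty)
    sparse_rows: &[(usize, usize, f64)], // (stage_idx, entity_idx, value)
) -> Vec<f64> {
    assert_eq!(defaults.len(), n_entities);
    let mut dense = Vec::with_capacity(n_stages * n_entities);
    for _ in 0..n_stages {
        dense.extend_from_slice(defaults); // steps 1–2: dense structure of defaults
    }
    for &(stage, entity, value) in sparse_rows {
        dense[stage * n_entities + entity] = value; // step 3: overlay sparse values
    }
    dense
}
```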

6. Broadcast Strategy

After rank 0 loads and validates all data, it broadcasts to worker ranks. Data is serialized to contiguous byte buffers for MPI broadcast. The broadcast uses a two-step protocol: first the buffer size, then the buffer contents.

Broadcast categories:

| Category | Data | Strategy |
|---|---|---|
| Small objects | Config, stages, penalties, initial conditions | Single MPI_Bcast |
| Entity registries | Buses, lines, hydros, thermals, NCS, pumping, contracts | Single MPI_Bcast |
| Scenario models | PAR parameters, correlation matrices, load models | Single MPI_Bcast |
| Bounds/overrides | All constraint bounds, penalty overrides, exchange factors | Single MPI_Bcast (sparse form, expand locally) |
| Policy cuts | FCF cuts for warm-start | Parallel load (§7) |

Sparse broadcast optimization: Bounds and penalty override files are broadcast in their sparse Parquet form. Each rank performs the sparse-to-dense expansion locally (§5). This reduces broadcast volume — only non-default values are transmitted.

6.1 Serialization Format

The serialization format for MPI broadcast is postcard (via serde).

postcard is a compact binary serde format. Rank 0 serializes the System struct via postcard::to_allocvec(&system), producing a Vec<u8> byte buffer. The buffer is broadcast via MPI_Bcast (two-step: size first, then contents). Receiving ranks deserialize via postcard::from_bytes::<System>(&buffer) into an owned System value. The format uses variable-length integer encoding (varint) for compact payloads and serde’s trait-based dispatch to handle all Rust types, including Vec, String, Option, data-carrying enums, and nested structs.
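A hedged sketch of the two-step protocol, assuming the rsmpi mpi crate; the crate choice and function shape are illustrative, with System standing in for the broadcast payload.

```rust
use mpi::traits::*; // rsmpi; crate choice is an implementation assumption

// Two-step broadcast (§6.1): size first, then contents. On rank 0 the buffer
// holds the postcard-serialized System; on workers it is filled by the Bcast.
fn broadcast_system(world: &impl Communicator, system_on_root: Option<&System>) -> System {
    let root = world.process_at_rank(0);
    let mut buf: Vec<u8> = match system_on_root {
        Some(system) => postcard::to_allocvec(system).expect("serialize System"),
        None => Vec::new(),
    };
    let mut len = buf.len() as u64;
    root.broadcast_into(&mut len);     // step 1: buffer size
    buf.resize(len as usize, 0);       // workers allocate exactly `len` bytes
    root.broadcast_into(&mut buf[..]); // step 2: buffer contents
    postcard::from_bytes::<System>(&buf).expect("deserialize System")
}
```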

Why postcard over rkyv: rkyv was previously specified for this use case based on its zero-copy deserialization capability. An evidence-based evaluation (see plans/implementation-readiness-audit/epic-07-serialization-eval-and-output-api/report-027-rkyv-evaluation.md) found that the zero-copy benefit is immaterial for a once-per-execution operation: the marginal deserialization saving is under 2 ms for a ~6 MB payload, occurring exactly once in a program that runs for minutes to hours. Meanwhile, rkyv imposes 129 additional derive annotations across 43 types (on top of the 86 serde derives already required for JSON loading), requires wrapper types for external types like chrono::NaiveDate, and is pre-1.0 with a history of breaking API changes. postcard reuses the serde derives that all cobre-core types must implement for JSON input loading, adding zero additional trait burden. It is post-1.0 stable, has a minimal dependency footprint, and produces slightly smaller payloads than rkyv due to varint encoding.

Why not bincode: bincode was considered but is unmaintained. Its last release predates the current Rust edition.

Sparse broadcast and Parquet pass-through: For bounds and penalty override data broadcast in sparse Parquet form (see broadcast categories table above), postcard serializes only the metadata envelope (entity IDs, stage ranges, file identity). The raw Parquet bytes are passed through as-is in a separate broadcast buffer — they are not re-serialized through postcard. Each rank deserializes the metadata envelope via postcard and then performs sparse-to-dense expansion locally from the Parquet bytes.
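The metadata envelope might look like the following; these field names are illustrative, not fixed by this spec.

```rust
// Illustrative metadata envelope for the sparse Parquet pass-through (§6.1).
// The envelope travels through postcard; the raw Parquet bytes follow in a
// separate broadcast buffer of length `parquet_len`.
#[derive(serde::Serialize, serde::Deserialize)]
struct SparseTableEnvelope {
    file: String,            // file identity, e.g. a constraints/*.parquet path
    entity_ids: Vec<String>, // entities covered by the sparse rows
    stage_range: (u32, u32), // inclusive stage ID range of the sparse rows
    parquet_len: u64,        // byte length of the pass-through Parquet buffer
}
```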

6.2 Required Trait Bounds

All types that participate in the System broadcast must derive serde::Serialize and serde::Deserialize. These are the same trait bounds already required for JSON input loading — no additional broadcast-specific derives are needed. The complete list of types requiring these trait bounds:

Entity types:

  • System, Bus, Line, Hydro, Thermal, PumpingStation, EnergyContract, NonControllableSource

Temporal and topological types:

  • Stage, PolicyGraph, CascadeTopology, NetworkTopology

Pre-resolved data types:

  • ResolvedPenalties, ResolvedBounds

Scenario pipeline types:

  • ParModel, CorrelationModel

Other types:

  • InitialConditions, GenericConstraint

All nested types within these top-level types (e.g., tagged union variants within Hydro, individual bound entries within ResolvedBounds) must also satisfy the same trait bounds transitively. External types such as chrono::NaiveDate implement serde traits natively via the chrono/serde feature flag.

Derive example:

```rust
#[derive(serde::Serialize, serde::Deserialize)]
pub struct Hydro { /* ... */ }
```

HashMap lookup indices are NOT serialized. The System struct contains HashMap<EntityId, usize> fields (bus_index, hydro_index, thermal_index, etc.) that serve as O(1) lookup indices from entity ID to position in the corresponding Vec. These indices are excluded from serialization via #[serde(skip)]. Each receiving rank rebuilds them locally from the deserialized entity collections. This is correct because the indices are derived data — they are deterministically reconstructable from the entity vectors — and excluding them reduces the broadcast payload size and avoids serializing HashMap internal layout, which is not stable across allocator states.
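A sketch of the skip-and-rebuild pattern; the field names follow the text above, while the String-keyed id and the rebuild method shape are illustrative (the spec's key type is EntityId).

```rust
use std::collections::HashMap;

#[derive(serde::Serialize, serde::Deserialize)]
pub struct System {
    pub hydros: Vec<Hydro>,
    /// Derived O(1) lookup from entity ID to Vec position. Skipped during
    /// broadcast and rebuilt locally on each receiving rank.
    #[serde(skip)]
    hydro_index: HashMap<String, usize>,
    // ... bus_index, thermal_index, etc. follow the same pattern ...
}

impl System {
    /// Rebuild derived indices after `postcard::from_bytes` on a worker rank.
    /// Assumes the Hydro type above exposes a String `id` field.
    pub fn rebuild_indices(&mut self) {
        self.hydro_index = self
            .hydros
            .iter()
            .enumerate()
            .map(|(i, h)| (h.id.clone(), i))
            .collect();
    }
}
```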

6.3 Buffer Allocation

postcard produces a standard Vec<u8> with no special alignment requirements. The MPI receive buffer on worker ranks is a standard Vec<u8> allocated to the exact required size.

Rank 0 serializes via postcard::to_allocvec(&system), which returns a Vec<u8>. The two-step broadcast protocol (size first, then contents) allows worker ranks to allocate a receive buffer of the exact required size before the second MPI_Bcast call. No aligned allocation or special allocator is needed.

6.4 Versioning Scope

postcard does not provide built-in schema evolution. The binary format is derived from the serde Serialize/Deserialize implementations, which are tied to the Rust struct layout — any change to field order, field types, or field count produces an incompatible encoding. There is no field tagging, no optional-field mechanism, and no forward or backward compatibility guarantee.

This is acceptable because postcard is used exclusively for in-memory MPI broadcast within a single program execution. All ranks run the same binary, so the struct layout is identical on the serializing and deserializing sides. There is no cross-version compatibility requirement for broadcast data — the buffer exists only for the duration of the broadcast operation and is never persisted to disk.

postcard is NOT suitable for long-term storage. Any data that must survive across program versions (policy cuts, checkpoints, warm-start files) uses FlatBuffers, which provides schema evolution and cross-version compatibility. See Binary Formats §3 for the FlatBuffers policy persistence format.

7. Parallel Policy Loading (Warm-Start)

Policy files (cuts, states, vertices, basis) can be large. Loading them on rank 0 and broadcasting would create a bottleneck. Instead, all ranks load in parallel:

  1. Rank 0 loads and broadcasts the policy/metadata.json and policy/state_dictionary.json (small files).
  2. All ranks validate the state dictionary against the current system (entity count, state variable mapping).
  3. Each rank loads a subset of policy stage files from policy/cuts/ — stages are assigned round-robin by rank (see the sketch after this list).
  4. After local loading, ranks exchange cuts so that every rank has the complete policy. The exchange strategy (e.g., MPI_Allgatherv or shared memory window + inter-node broadcast) is an implementation choice.
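A minimal sketch of the round-robin assignment in step 3; the function shape is illustrative.

```rust
// Round-robin stage assignment (§7 step 3): stage files are spread evenly
// across ranks so no single rank reads the whole policy.
fn stages_for_rank(rank: usize, n_ranks: usize, n_stages: usize) -> Vec<usize> {
    (0..n_stages).filter(|stage| stage % n_ranks == rank).collect()
}
```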

For the policy file format (FlatBuffers .bin files), see Binary Formats §3.2.

7.1 Warm-Start Compatibility Validation

Cuts encode state-variable coefficients whose dimension and semantic ordering are determined by the system that produced them. If the state dimension changes (different hydro count), or the state vector structure changes (different PAR orders, different production models), loaded cuts become dimensionally inconsistent with the current LP and produce silent numerical errors — wrong dual coefficients applied to wrong state variables. Therefore, warm-start loading verifies structural compatibility before accepting any cuts.

The following four checks run in order after policy/metadata.json and policy/state_dictionary.json are broadcast (§7 step 1) and before any cut deserialization begins (§7 step 3). All four must pass; the first failure aborts loading with LoadError::PolicyIncompatible (§8.1).

| Check | Policy Source | System Source | Comparison | Failure |
|---|---|---|---|---|
| Hydro count | PolicyMetadata.state_dimension (derived from hydro count at training time) | system.n_hydros() | Exact equality | LoadError::PolicyIncompatible |
| Maximum PAR order per hydro | Stored per-hydro in policy metadata | max(system.par_models[h].order for each season) per hydro h | Exact equality per hydro | LoadError::PolicyIncompatible |
| Production method per hydro | Stored per-hydro variant tag in policy metadata | system.hydros[h].generation_model variant tag | Exact equality per hydro | LoadError::PolicyIncompatible |
| PAR model parameters per hydro | Stored per-hydro PAR coefficients and seasonal statistics in policy metadata | system.par_models[h] (all seasons) | Exact equality per hydro per season | LoadError::PolicyIncompatible |
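As an illustration, the first check might be written as below; the PolicyMetadata fields and method names follow the table, but the surrounding API is hypothetical.

```rust
// First compatibility check of §7.1: exact state-dimension equality. A
// mismatch is a hard error, and no cut files are read afterwards.
fn check_hydro_count(meta: &PolicyMetadata, system: &System) -> Result<(), LoadError> {
    if meta.state_dimension != system.n_hydros() {
        return Err(LoadError::PolicyIncompatible {
            check: "hydro count".to_string(),
            policy_value: meta.state_dimension.to_string(),
            system_value: system.n_hydros().to_string(),
        });
    }
    Ok(())
}
```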

Safe modifications. The following system properties may change between the training run that produced the policy and the current warm-start run without invalidating the loaded cuts:

  • Exchange limits (line bounds)
  • Load profiles (demand values)
  • Inflow scenarios (opening realizations)
  • Block durations
  • Penalty values
  • Thermal costs

These properties affect the right-hand side or objective coefficients of the stage LP but do not alter the state variable dimension or the structural mapping between cut coefficients and state variables.

Error behavior. All four validation failures are hard errors — LoadError::PolicyIncompatible — and training cannot proceed with an incompatible policy. This check runs on every rank (since all ranks received the metadata broadcast in §7 step 1) and runs before any cut deserialization begins (§7 step 3). If validation fails, no cut files are read, avoiding wasted I/O on an incompatible policy.

8. Transition to In-Memory Model

After loading and broadcasting, each rank constructs its in-memory data model from the loaded data. The in-memory structures are defined in Internal Structures and are not specified here.

FPHA preprocessing: For hydros with FPHA production model and source "computed", the FPHA hyperplanes are fitted during the Initialization phase (after validation, before training). Fitting uses the hydro geometry (volume-area-level curves from hydro_geometry.parquet), topology data (productivity, tailrace, hydraulic losses, turbine efficiency from hydros.json), and fitting configuration (discretization points from hydro_production_models.json). See Hydro Production Models for the mathematical formulation and Input Hydro Extensions for the required input data.

Implementation note (v0.1.4): The computed-source FPHA path is fully implemented in cobre-sddp::hydro_models::prepare_hydro_models. It runs on MPI rank 0 during Initialization, before the training loop or simulation pipeline begins. The fitting grid spans three dimensions: volume, turbined flow, and spillage (see fpha_fitting.rs). After fitting, computed hyperplanes are written to output/hydro_models/fpha_hyperplanes.parquet (see Output Schemas §8.1) and held in memory for LP assembly. No rank broadcast of raw hyperplane rows is performed — ranks receive hydro model assignments through the StudySetup struct.

The key invariant is: after the Initialization phase completes, all ranks hold identical copies of the validated case data (except for per-rank scenario assignments, which are determined during Scenario Generation). See CLI and Lifecycle §5.2 for the phase sequence.

8.1 load_case Public API

load_case is the primary entry point from cobre-cli (rank 0) into cobre-io. It performs the complete loading sequence (§2.1–2.5), cross-reference validation (§2.6), and returns a fully resolved System value ready for MPI broadcast via postcard serialization (§6.1).

Function signature:

```rust
/// Load, validate, and resolve a Cobre case from the input directory.
///
/// This is the primary entry point from cobre-cli (rank 0) into cobre-io.
/// Returns the fully resolved System struct ready for MPI broadcast.
///
/// # Errors
/// Returns LoadError if any file cannot be read, parsed, or validated.
pub fn load_case(path: &Path) -> Result<System, LoadError>
```

Parameters:

  • path: &Path – the case directory containing config.json and all subdirectories (system/, scenarios/, constraints/, policy/). This is the root directory of the case, not a path to any individual file.

Return type:

  • System – an owned value (not Arc<System>). The caller (cobre-cli or cobre-python) owns the returned System and is responsible for broadcasting it to worker ranks via the postcard serialization protocol (§6.1). After broadcast, each rank owns its own deserialized copy. Ownership transfer is the simplest pattern: no reference counting, no shared-memory coordination, no lifetime entanglement between the loading phase and the algorithm phase.
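A hypothetical rank-0 call site combining load_case with the broadcast protocol; the cobre_io module path and the error handling are simplified for illustration.

```rust
use std::path::Path;

// Hypothetical cobre-cli call site: load on rank 0, serialize, broadcast.
fn run_rank0(case_dir: &Path) -> Result<(), LoadError> {
    let system = cobre_io::load_case(case_dir)?; // §2.1–2.6, plus §5 expansion
    let buffer = postcard::to_allocvec(&system)
        .expect("System is serde-serializable (§6.2)");
    // ... two-step MPI_Bcast: buffer.len() first, then contents (§6.1) ...
    Ok(())
}
```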

Config loading:

load_case does NOT accept a Config parameter. The config.json file is loaded as step 1 within load_case itself, per the loading sequence (§2.1). The config governs conditional loading decisions (§4) such as whether to require FPHA hyperplanes or scenario model files. Accepting a pre-loaded config would split the loading sequence across two call sites, creating a risk that conditional loading decisions are evaluated against a config that does not match the case directory contents.

Responsibility boundary:

load_case performs the following steps, in order:

  1. Load and validate config.json (§2.1).
  2. Load root-level files: stages.json, penalties.json, initial_conditions.json (§2.1).
  3. Load system entity registries (§2.2).
  4. Load system extension data (§2.3).
  5. Load scenario data (§2.4).
  6. Load constraints and overrides (§2.5).
  7. Apply sparse time-series expansion (§5).
  8. Perform cross-reference validation (§2.6).
  9. Construct and return the System struct with all collections in canonical order.

load_case does NOT load policy files (§2.7). Policy loading follows a different pattern – all ranks load in parallel (§7) – and is a separate operation invoked after the System broadcast. See CLI and Lifecycle §5.2 for the phase sequence that separates case loading from policy loading.

LoadError enum:

```rust
use std::path::PathBuf;

/// Errors that can occur during case loading.
#[derive(Debug, thiserror::Error)]
pub enum LoadError {
    /// Filesystem read failure (file not found, permission denied, I/O error).
    #[error("I/O error reading {path:?}: {source}")]
    IoError {
        path: PathBuf,
        source: std::io::Error,
    },

    /// JSON or Parquet parsing failure (malformed content, encoding error).
    #[error("parse error in {path:?}: {message}")]
    ParseError {
        path: PathBuf,
        message: String,
    },

    /// Schema validation failure (missing required field, wrong type, value out of range).
    #[error("schema error in {path:?}, field {field}: {message}")]
    SchemaError {
        path: PathBuf,
        field: String,
        message: String,
    },

    /// Cross-reference validation failure (dangling entity ID, broken foreign key).
    #[error("cross-reference error: {source_entity} in {source_file:?} references \
             non-existent {target_entity} in {target_collection}")]
    CrossReferenceError {
        source_file: PathBuf,
        source_entity: String,
        target_collection: String,
        target_entity: String,
    },

    /// Semantic constraint violation (acyclic cascade, complete coverage, consistency).
    #[error("constraint violation: {description}")]
    ConstraintError {
        description: String,
    },

    /// Warm-start policy is structurally incompatible with the current system.
    /// See §7.1 for the four compatibility checks.
    #[error("policy incompatible: {check} mismatch — policy has {policy_value}, \
             system has {system_value}")]
    PolicyIncompatible {
        check: String,
        policy_value: String,
        system_value: String,
    },
}
```

The variants are ordered by the phase in which they typically occur:

| Variant | Typical Phase | Example |
|---|---|---|
| IoError | File read (any step) | system/hydros.json not found |
| ParseError | File parse (any step) | Malformed JSON in stages.json |
| SchemaError | Schema validation | hydros.json entry missing required field bus_id |
| CrossReferenceError | Cross-reference validation | Hydro bus_id = "BUS_99" not found in bus registry |
| ConstraintError | Semantic validation | Hydro cascade contains a cycle; inflow model coverage is incomplete |
| PolicyIncompatible | Warm-start validation (§7.1) | Policy trained with 45 hydros, system now has 47 |

All variants carry enough context for the caller to produce a diagnostic message without re-reading the input files.
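For example, a caller can surface any variant directly through its Display implementation (derived from thiserror's #[error] attributes); the function below is illustrative, not part of the cobre-cli API.

```rust
// Illustrative diagnostic path in the caller: every LoadError variant formats
// itself via its #[error("...")] attribute, so no input re-reading is needed.
fn report_and_exit(err: LoadError) -> ! {
    eprintln!("cobre: case loading failed: {err}");
    std::process::exit(1);
}
```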
