Performance Adaptation Layer
Purpose
This spec defines the transformation layer that converts cobre-core’s clarity-first data model into cobre-sddp’s performance-adapted runtime representations. The Internal Structures 1.1 dual-nature design principle establishes that cobre-core optimizes for correctness and readability while cobre-sddp optimizes for cache locality, SIMD alignment, and zero-allocation hot paths. This spec fills the gap between that principle and the concrete performance-adapted types scattered across the solver, training loop, scenario generation, and binary format specs.
Specifically, this spec:
- Defines a taxonomy of transformation strategies used across the adaptation layer
- Inventories all performance-adapted types, their owning crate, source data, and authoritative spec
- Specifies the initialization build order as a dependency graph
- Maps entity fields from cobre-core structs to their performance consumers
- States the contracts that the adaptation layer requires from cobre-core and guarantees to cobre-sddp
The adaptation layer executes entirely during the Initialization and Scenario Gen phases (CLI and Lifecycle SS5.2a). After these phases complete, all performance-adapted types are immutable for the duration of the Training and Simulation phases — no structural modifications, no heap allocations, no recomputation. The training loop operates exclusively on the adapted representations.
1. Transformation Taxonomy
The adaptation layer uses five distinct strategies to convert domain-oriented data into solver-ready representations. Each strategy addresses a specific performance concern. A single performance-adapted type may employ multiple strategies simultaneously.
1.1 Array-of-Structs to Struct-of-Arrays (AoS→SoA)
What it does. Transposes entity-oriented collections (Vec<Hydro>) into field-oriented parallel arrays (Vec<f64> per field). Each array holds one scalar field across all entities in canonical order.
Why. The SDDP hot path iterates over entities to patch RHS values, extract state, and assemble constraint rows. AoS layout scatters each field across cache lines (one Hydro struct spans hundreds of bytes; adjacent hydros’ max_storage_hm3 values are hundreds of bytes apart). SoA layout places all max_storage_hm3 values in a contiguous f64 slice, enabling sequential memory access and hardware prefetching.
Where it appears.
- PAR preprocessing arrays: base[T][N], coefficients[T][N][max_order], scales[T][N] (Scenario Generation SS1.3)
- Pre-resolved bounds: stage×entity penalty and bound lookups (Internal Structures 10–11)
- Entity bounds for LP construction: max_storage[N], min_outflow[N], max_turbined[N], etc., extracted from Vec<Hydro> for building stage template column/row bounds
Extraction pattern.
// AoS → SoA extraction at initialization (once, O(N) per field)
let max_storage: Vec<f64> = system.hydros.iter()
    .map(|h| h.max_storage_hm3)
    .collect();
let min_outflow: Vec<f64> = system.hydros.iter()
    .map(|h| h.min_outflow_m3s)
    .collect();
After extraction, the source Vec<Hydro> is still held by the System struct (it lives for the entire execution), but the training loop accesses only the SoA arrays.
1.2 Algebraic Absorption (Precomputation)
What it does. Evaluates algebraic expressions that combine multiple domain parameters into a single runtime value, eliminating redundant arithmetic from the hot path. The absorbed expression is typically a constant per (stage, entity) pair.
Why. Some LP coefficients and RHS values require combining 3–5 domain parameters that individually have clear physical meaning but, once combined, form a single constant. Computing this combination on every LP solve (millions of times) wastes cycles on arithmetic whose inputs never change within a stage.
Where it appears.
- PrecomputedParLp.deterministic_base: absorbs the seasonal-mean terms of the PAR model into one constant per (stage, hydro), so the hot path does one multiply-add (base + sigma * noise) instead of a loop (Internal Structures 14, PAR Inflow Model 7)
- Time conversion factor per stage: precomputed from block durations (Notation Conventions 3.1)
- FPHA hyperplane coefficients: pre-fitted from geometry data during initialization, stored as dense f64 arrays per (hydro, plane), consumed as LP constraint coefficients (Hydro Production Models 2)
Correctness requirement. The precomputed value must produce bit-for-bit identical results to evaluating the full expression inline. The theory spec (e.g., PAR Inflow Model 7) defines the algebraic derivation; the precomputed struct is the implementation contract. A mismatch between the theory derivation and the precomputed expression is a bug — the theory spec is the oracle.
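To illustrate the absorption idea, the sketch below folds seasonal means into a single constant for one (stage, hydro) pair. It assumes the textbook PAR(p) form a_t = mu_t + sum_l psi_l (a_{t-l} - mu_{t-l}) + sigma * noise; PAR Inflow Model 7 remains the authoritative derivation, and all function names here are hypothetical.

```rust
/// Hypothetical helper: folds the seasonal means into one constant,
/// base = mu_t - sum_l psi[l] * mu_lag[l], computed once at initialization.
fn deterministic_base(mu_t: f64, psi: &[f64], mu_lag: &[f64]) -> f64 {
    assert_eq!(psi.len(), mu_lag.len());
    let lagged: f64 = psi.iter().zip(mu_lag).map(|(p, m)| p * m).sum();
    mu_t - lagged
}

/// Hot-path term named in the text: one multiply-add per (stage, hydro).
fn noise_rhs(base: f64, sigma: f64, noise: f64) -> f64 {
    base + sigma * noise
}

/// Reference form with the means left inline (assumed PAR(p) recursion).
fn inflow_inline(
    mu_t: f64,
    psi: &[f64],
    mu_lag: &[f64],
    lags: &[f64],
    sigma: f64,
    noise: f64,
) -> f64 {
    let ar: f64 = psi
        .iter()
        .zip(mu_lag.iter().zip(lags))
        .map(|(p, (m, a))| p * (a - m))
        .sum();
    mu_t + ar + sigma * noise
}

/// Absorbed form: the lag term carries only psi * a; the means live in `base`.
fn inflow_absorbed(base: f64, psi: &[f64], lags: &[f64], sigma: f64, noise: f64) -> f64 {
    let ar: f64 = psi.iter().zip(lags).map(|(p, a)| p * a).sum();
    noise_rhs(base, sigma, noise) + ar
}
```

In this sketch the absorbed form never reads the seasonal means at solve time; they are consumed once by deterministic_base during initialization.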
1.3 Index Flattening
What it does. Maps entity IDs (EntityId(i32)) to contiguous 0-based indices (usize) via the canonical ordering established during input loading (Design Principles 3). All performance-adapted arrays use the flattened index as their access key.
Why. Entity IDs are sparse, user-assigned, and potentially non-contiguous (e.g., hydro IDs 101, 205, 310). LP variable positions, SoA array slots, and cut coefficient positions require dense 0-based indexing. The canonical sort (by ascending EntityId) produces a bijection: the entity at sort position i occupies index i in every performance array.
Where it appears. Everywhere. Every SoA array, every LP column/row formula, every cut coefficient, every state vector position uses the flattened index. The StageIndexer (Training Loop SS5.5) is the canonical expression of this mapping — it translates semantic names (storage, lags, theta) to Range<usize> positions in the LP solution vector.
Mapping lifecycle. The mapping is established once during input loading (canonical sort in Input Loading Pipeline SS3) and is implicit thereafter — position i in any SoA array always refers to the entity at canonical position i. No explicit lookup table is needed at runtime because the canonical ordering is a structural invariant, not a runtime query.
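A minimal sketch of the flattening step, assuming entity IDs are plain i32 values and that downstream references have already passed cross-reference validation. The function names are hypothetical; the HashMap exists only during initialization, consistent with the statement that no lookup table is needed at runtime.

```rust
use std::collections::HashMap;

/// Sorts sparse entity IDs ascending; the sort position is the flattened index.
fn canonical_order(mut ids: Vec<i32>) -> Vec<i32> {
    ids.sort_unstable();
    ids
}

/// Init-time only: resolves optional downstream IDs to positional indices.
/// `ids` must already be in canonical order; `downstream[i]` belongs to `ids[i]`.
/// Panics on a dangling reference, which validation is assumed to have ruled out.
fn flatten_downstream(ids: &[i32], downstream: &[Option<i32>]) -> Vec<Option<usize>> {
    let position: HashMap<i32, usize> =
        ids.iter().enumerate().map(|(i, &id)| (id, i)).collect();
    downstream
        .iter()
        .copied()
        .map(|d| d.map(|id| position[&id]))
        .collect()
}
```

With the IDs from the example above (101, 205, 310), the hydros occupy indices 0, 1, and 2 regardless of declaration order, and a cascade 101 → 205 → 310 flattens to downstream indices 1, 2, and none.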
1.4 Layout Reshaping for Access Pattern
What it does. Arranges data so that the dominant access pattern touches contiguous memory. This is distinct from SoA (section 1.1), which transposes field orientation. Layout reshaping concerns the ordering of elements within a single array to match how the algorithm iterates over them.
Why. The hardware prefetcher performs best on sequential or strided access. An array whose layout matches the iteration order gets free prefetching. An array whose layout opposes the iteration order suffers cache misses on every access.
Where it appears.
- LP column layout: State variables (storage, then lags) are placed at the column prefix so that state extraction is a single memcpy from primal[0..n_state] (Solver Abstraction SS2.1). Decision variables follow. This layout was designed to make the hot-path access pattern (extract state, extract duals for cut coefficients) contiguous.
- LP row layout: Fixing constraints (storage, then lag) are placed at the row prefix so that dual extraction for cut coefficients is dual[0..n_state], with row-column symmetry: row r’s dual is the cut coefficient for the state variable at column r (Training Loop SS5.5.1).
- Opening tree tensor: [stages × openings × dim] with stage as the outer dimension. The backward pass iterates stages in reverse, accessing all openings at a given stage, which are contiguous in memory (Scenario Generation SS2.3).
- Noise cache: Scenario-major layout [scenario × stage × entity] for forward pass sequential access within a trajectory (Scenario Generation SS5.1). Trade-off: backward pass accesses non-contiguously, but the cache fits in L3 and the LP solve dominates latency (see Scenario Generation SS5.1 rationale).
- Cut pool coefficients: Intercepts separated from coefficient vectors to avoid polluting cache during coefficient copy loops; 64-byte alignment for prefetch boundary matching (Binary Formats 3.4).
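The stage-outer layout of the opening tree can be sketched as a flat buffer with an explicit offset formula. This is an illustrative structure only: the type name, field names, and methods are assumptions, not the definitions in Scenario Generation SS2.3.

```rust
/// Dense opening tree stored stage-major: [stages x openings x dim].
struct OpeningTree {
    data: Vec<f64>,
    n_openings: usize,
    dim: usize,
}

impl OpeningTree {
    fn new(n_stages: usize, n_openings: usize, dim: usize) -> Self {
        Self {
            data: vec![0.0; n_stages * n_openings * dim],
            n_openings,
            dim,
        }
    }

    /// Flat offset of the first element of (stage, opening).
    fn offset(&self, stage: usize, opening: usize) -> usize {
        (stage * self.n_openings + opening) * self.dim
    }

    /// All openings of one stage as a single contiguous slice.
    fn stage(&self, stage: usize) -> &[f64] {
        let start = self.offset(stage, 0);
        &self.data[start..start + self.n_openings * self.dim]
    }

    /// One noise vector of length `dim`.
    fn opening(&self, stage: usize, opening: usize) -> &[f64] {
        let start = self.offset(stage, opening);
        &self.data[start..start + self.dim]
    }
}
```

Because stage is the outermost dimension, a backward-pass visit to one stage reads a single contiguous run of n_openings × dim values.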
1.5 Temporal Flattening
What it does. Pre-resolves season-indexed or period-indexed parameters into stage-indexed arrays, eliminating season-lookup indirection at runtime. The domain model naturally uses seasons (12 months or 52 weeks) because physical parameters are seasonal. The solver needs stage-indexed values (60–120 stages) because it iterates stage-by-stage.
Why. A season lookup (season = stage_to_season_map[t]; value = seasonal_values[season]) is a dependent load — the CPU cannot prefetch the value until the season index is resolved. At worst-case scale (160 hydros × 120 stages × millions of iterations), the cumulative cost of dependent loads is significant. Pre-resolving to value = stage_values[t][h] eliminates the indirection.
Where it appears.
- PrecomputedParLp arrays: All three arrays (psi, deterministic_base, sigma) are indexed [stage][hydro], not [season][hydro], despite the underlying PAR model being defined per-season (Internal Structures 14)
- PAR preprocessing arrays in scenario generation: base[T][N], coefficients[T][N][max_order], scales[T][N] use stage as the outer dimension (Scenario Generation SS1.3)
- Pre-resolved penalties: stage-varying penalty costs resolved from the three-tier cascade (global → entity → stage override) into a per-(entity, stage) lookup (Internal Structures 10)
- Pre-resolved bounds: stage-varying entity bounds resolved from base values with sparse stage overrides applied (Internal Structures 11)
Memory cost. Temporal flattening trades memory for speed. A seasonal PAR model stores 12 × N values; the stage-indexed version stores T × N values (at worst-case scale: 120 × 160 = 19,200 vs. 12 × 160 = 1,920 — a 10× expansion; at the production baseline of 60 stages: 60 × 160 = 9,600 vs. 12 × 160 = 1,920 — a 5× expansion). At 8 bytes per f64, the absolute cost is negligible (< 1 MB for all temporally flattened arrays combined).
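The expansion from season-indexed to stage-indexed storage can be sketched as follows. The function names and the flat row-major layout are assumptions made for illustration; the actual array definitions live in Internal Structures 14 and Scenario Generation SS1.3.

```rust
/// Expands season-indexed values into a flat stage-major array:
/// out[t * n_hydros + h] = seasonal[stage_to_season[t] * n_hydros + h].
fn flatten_to_stages(seasonal: &[f64], n_hydros: usize, stage_to_season: &[usize]) -> Vec<f64> {
    let mut out = Vec::with_capacity(stage_to_season.len() * n_hydros);
    for &season in stage_to_season {
        // The season lookup happens here, once, instead of on every LP solve.
        let row = &seasonal[season * n_hydros..(season + 1) * n_hydros];
        out.extend_from_slice(row);
    }
    out
}

/// Size in bytes of one flattened f64 array.
fn flattened_bytes(n_stages: usize, n_hydros: usize) -> usize {
    n_stages * n_hydros * std::mem::size_of::<f64>()
}
```

At the worst-case scale quoted above, flattened_bytes(120, 160) is 153,600 bytes per array, which is why the memory cost stays well below 1 MB even with several flattened arrays.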
2. Performance-Adapted Type Inventory
Every performance-adapted type in the Cobre ecosystem is listed below, grouped by owning crate and lifecycle phase. The “Source Data” column identifies which cobre-core types feed each adapted type. The “Strategies” column references the taxonomy in section 1.
2.1 Types Built During Initialization Phase
| Type | Owner | Source Data | Strategies | Authoritative Spec |
|---|---|---|---|---|
| StageTemplate (CSC arrays: col_starts, row_indices, values, col_lower, col_upper, row_lower, row_upper, objective) | cobre-solver (built by cobre-sddp) | System entities, Stage/Block definitions, pre-resolved bounds and penalties, PrecomputedParLp.psi | AoS→SoA (entity fields → CSC coefficient values), Index flattening (entity ID → column/row position), Layout reshaping (state prefix, fixing constraint prefix), Temporal flattening (stage-varying bounds pre-resolved) | Solver Abstraction SS11.1, Solver Interface Trait SS4.4 |
| StageIndexer (column/row range map) | cobre-solver | System dimensions (hydro count, max PAR order) | Index flattening (semantic names → Range<usize>) | Training Loop SS5.5 |
| SolverWorkspace (solver instance, pre-allocated buffers, per-stage basis cache) | cobre-solver | LP dimensions from StageTemplate, stage count | None (allocation, not transformation) | Solver Workspaces SS1 |
| PrecomputedParLp (psi, deterministic_base, sigma) | cobre-sddp | ParModel (seasonal means, AR coefficients, residual std devs), season-to-stage mapping | Algebraic absorption (deterministic_base), Temporal flattening (season → stage indexing) | Internal Structures 14, PAR Inflow Model 7 |
| StageLpCache (complete pre-assembled LP per stage in CSC format) | cobre-solver (built by cobre-sddp) | StageTemplate CSC + CutPool slot structure + column/row bounds + objective | Layout reshaping (structural template + cut coefficients → unified CSC with 15K pre-allocated cut slots) | Solver Abstraction SS11.4 |
| CutPool (slot-based, intercept/activity bitmap/metadata only) | cobre-sddp | Pre-allocated from config (max_cuts_per_stage, n_state); warm-start cuts from FlatBuffers policy on resume | Index flattening (slot indices). Coefficients absorbed into StageLpCache CSC; pool retains metadata (~12 MB across 60 stages at 15K capacity) | Cut Management Impl SS1, Binary Formats 3.4 |
| FPHA hyperplane coefficients | cobre-sddp | Hydro geometry, pre-fitted planes from hydro_extensions | Algebraic absorption (geometry → hyperplane coefficients) | Hydro Production Models 2 |
| Cascade topology arrays (upstream indices, downstream indices per hydro) | cobre-sddp | Hydro.downstream_id, CascadeTopology | Index flattening (entity IDs → positional indices), AoS→SoA | Internal Structures 5 |
| Bus-entity membership arrays (which hydros/thermals/lines connect to each bus) | cobre-sddp | Hydro.bus_id, Thermal.bus_id, Line.source_bus_id/target_bus_id | Index flattening, AoS→SoA | Internal Structures 1.2 |
| Pre-resolved bounds ([stage][entity] lookup) | cobre-core (loaded by cobre-io) | Base entity bounds + sparse stage overrides from hydro_bounds/thermal_bounds | Temporal flattening | Internal Structures 11 |
| Pre-resolved penalties ([stage][entity] lookup) | cobre-core (loaded by cobre-io) | Global defaults + entity overrides + stage overrides from penalties.json | Temporal flattening | Internal Structures 10 |
2.2 Types Built During Scenario Gen Phase
| Type | Owner | Source Data | Strategies | Authoritative Spec |
|---|---|---|---|---|
| PAR preprocessing arrays (base, coefficients, scales, orders) | cobre-stochastic | ParModel parameters, season definitions, stage definitions | AoS→SoA, Temporal flattening (season → stage), Algebraic absorption | Scenario Generation SS1.3 |
| Spectrally-decomposed correlation matrices | cobre-stochastic | Correlation matrices from correlation.json or estimated from residuals | Algebraic absorption (spectral factor computed once) | Scenario Generation SS2.1 |
| Opening tree tensor ([stages × openings × dim]) | cobre-stochastic | Derived RNG seeds, spectral factors, PAR preprocessing arrays | Layout reshaping (stage-outer for backward pass access) | Scenario Generation SS2.3 |
2.3 Types That Are Not Transformations
For completeness, these types are sometimes discussed alongside performance-adapted views but are not transformations of cobre-core data — they are runtime artifacts produced by the algorithm:
| Type | Owner | Nature |
|---|---|---|
| State vectors ([f64; n_state]) | cobre-sddp | Extracted from LP primal solution during forward pass |
| Cut coefficients | cobre-sddp | Extracted from LP dual solution during backward pass |
| Convergence history | cobre-sddp | Accumulated per-iteration bound statistics |
| Simulation results | cobre-sddp | Extracted from LP solutions during simulation forward pass |
3. Initialization Build Order
The performance-adapted types form a dependency graph. Each type requires certain inputs to be available before construction. The build order below is the topological sort of this graph, grouped by lifecycle phase.
3.1 Dependency Graph
System (from Validation phase, broadcast to all ranks)
 │
 ├──→ Pre-resolved bounds [Internal Structures §11]
 │      (base entity bounds + stage overrides → [stage][entity] lookup)
 │
 ├──→ Pre-resolved penalties [Internal Structures §10]
 │      (global → entity → stage cascade → [stage][entity] lookup)
 │
 ├──→ Cascade topology arrays
 │      (Hydro.downstream_id → flattened upstream/downstream index arrays)
 │
 ├──→ Bus-entity membership arrays
 │      (entity bus_id fields → per-bus entity index lists)
 │
 ├──→ PrecomputedParLp [Internal Structures §14]
 │      (ParModel + season-stage mapping → psi, deterministic_base, sigma)
 │
 ├──→ FPHA hyperplane coefficients
 │      (hydro geometry → fitted plane arrays)
 │
 └──→ StageIndexer [Training Loop §5.5]
        (system dimensions → Range<usize> index map)
         │
         └──→ StageTemplate [Solver Abstraction §11.1]
                (ALL of the above + LP formulation rules → CSC arrays)
                 │
                 ├──→ SolverWorkspace [Solver Workspaces §1]
                 │      (LP dimensions from template → allocated buffers + solver instance)
                 │
                 ├──→ CutPool [Cut Management Impl §1]
                 │      (n_state from indexer, max_cuts from config → metadata-only allocation)
                 │       │
                 │       └──→ (optional) Warm-start cut loading from FlatBuffers policy
                 │
                 └──→ StageLpCache [Solver Abstraction §11.4]
                        (StageTemplate CSC + empty cut slots → complete LP per stage)
                        (SharedRegion with NUMA-interleaved allocation; ~22.3 GB)

─── Initialization / Scenario Gen boundary ───

PrecomputedParLp + System
 │
 ├──→ PAR preprocessing arrays [Scenario Generation §1.3]
 │      (seasonal PAR params → stage-indexed SoA arrays)
 │
 ├──→ Spectral decomposition [Scenario Generation §2.1]
 │      (correlation matrices → symmetric spectral factors)
 │
 └──→ Opening tree [Scenario Generation §2.3]
        (RNG seeds + spectral factors + PAR arrays → dense tensor)
3.2 Parallelism During Build
Most build steps are sequential (single-threaded on each rank), with two exceptions:
- Solver workspace allocation must happen inside a parallel region so that first-touch NUMA policy places each workspace’s buffers on the owning thread’s NUMA node (Solver Workspaces SS1.3).
- Opening tree generation is embarrassingly parallel across stages — each stage’s openings are independent once the spectral factors exist.
All other construction is sequential because the cost is negligible relative to training (one-time arithmetic vs. millions of LP solves).
3.3 Build Order as Implementation Checklist
The following sequence is a linearization of the dependency graph. Steps at the same indent level are independent and may execute in any order.
INITIALIZATION PHASE:
1. Receive broadcast System from rank 0
2. Build pre-resolved bounds (§11) and penalties (§10)
3. Build cascade topology arrays
4. Build bus-entity membership arrays
5. Build PrecomputedParLp (§14) — needs ParModel + stage definitions
6. Fit FPHA hyperplanes (if computed source) — needs hydro geometry
7. Build StageIndexer for each stage — needs system dimensions only
8. Build StageTemplate for each stage — needs everything from steps 2–7
8a. Assemble initial StageLpCache from StageTemplate + empty cut slots (SharedRegion, NUMA-interleaved)
9. Allocate SolverWorkspaces (in parallel region) — needs LP dimensions from step 8
10. Pre-allocate CutPool — metadata-only allocation (intercepts + activity bitmap); needs n_state from step 7
11. (Warm-start only) Load cuts from FlatBuffers policy into CutPool and StageLpCache
SCENARIO GEN PHASE:
12. Build PAR preprocessing SoA arrays — needs PrecomputedParLp from step 5
13. Decompose correlation matrices — needs correlation data from System
14. Generate opening tree — needs steps 12–13
Steps 2–7 depend only on System and are independent of each other. Step 8 depends on 2–7. Step 8a depends on 8. Steps 9–11 depend on 8 (step 10 also takes n_state from step 7, and StageLpCache-aware warm-start in step 11 needs 8a). Step 12 depends on 5, step 13 only on System, and step 14 on 12–13; none of steps 12–14 depends on 8–11.
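One way to keep a chosen linearization honest is to check it against the prerequisites mechanically. The sketch below transcribes the per-step "needs" notes from the checklist into a hypothetical prerequisites function (step 8a is omitted for brevity, and step 11 is assumed to need only the CutPool from step 10) and verifies that each step runs after everything it needs.

```rust
/// Prerequisites per checklist step, transcribed from the "needs" notes.
/// Returns None for a step number that is not in the checklist.
fn prerequisites(step: u32) -> Option<&'static [u32]> {
    let deps: &'static [u32] = match step {
        1 => &[],
        2..=7 => &[1],            // depend only on the broadcast System
        8 => &[2, 3, 4, 5, 6, 7], // StageTemplate needs everything before it
        9 => &[8],
        10 => &[7, 8],
        11 => &[10],
        12 => &[5],
        13 => &[1],
        14 => &[12, 13],
        _ => return None,
    };
    Some(deps)
}

/// Returns true if every step in `order` appears after all of its prerequisites.
fn respects_dependencies(order: &[u32]) -> bool {
    let mut done: Vec<u32> = Vec::with_capacity(order.len());
    for &step in order {
        match prerequisites(step) {
            Some(deps) if deps.iter().all(|d| done.contains(d)) => done.push(step),
            _ => return false,
        }
    }
    true
}
```

Under these assumptions the numbered order 1 through 14 passes, and so does an order that runs the Scenario Gen steps 12–14 before steps 8–11, reflecting their independence.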
4. Entity Data Flow
This section maps individual entity fields from cobre-core structs to their ultimate consumers in the performance layer. The mapping answers the question: “when I change field X in the Hydro struct, which performance-adapted types need to be rebuilt?”
4.1 Hydro Fields
The Hydro entity (Internal Structures 1.9.4) is the most complex, with ~20 fields feeding multiple LP elements.
Reservoir and flow bounds → StageTemplate column/row bounds:
| Field | LP Element | Template Array | How It Enters |
|---|---|---|---|
| min_storage_hm3 | Column lower bound on the storage variable | col_lower[h] | Direct copy (with stage override from pre-resolved bounds) |
| max_storage_hm3 | Column upper bound on the storage variable | col_upper[h] | Direct copy (with stage override) |
| min_outflow_m3s | Row lower bound on outflow constraint | row_lower[outflow_row(h)] | Direct copy (with stage override). Soft: slack variable created with outflow_violation_below_cost. |
| max_outflow_m3s | Row upper bound on outflow constraint | row_upper[outflow_row(h)] | Direct copy when Some; +∞ when None. Soft: slack variable. |
| min_turbined_m3s | Column lower bound on the turbined flow variable | col_lower[turbined_col(h, k)] | Direct copy. Soft: slack variable. |
| max_turbined_m3s | Column upper bound on the turbined flow variable | col_upper[turbined_col(h, k)] | Direct copy. Hard bound. |
| min_generation_mw | Row lower bound on generation constraint | row_lower[gen_row(h, k)] | Direct copy. Soft: slack variable. |
| max_generation_mw | Row upper bound on generation constraint | row_upper[gen_row(h, k)] | Direct copy. Hard bound. |
Generation model → StageTemplate constraint coefficients:
| Field | LP Element | Template Array | How It Enters |
|---|---|---|---|
| generation_model: ConstantProductivity { productivity_mw_per_m3s } | Coefficient in generation constraint linking generation and turbined flow | values[...] in CSC | Written as LP coefficient: the productivity relation becomes a row with coefficients on the generation and turbined-flow columns |
| generation_model: Fpha | FPHA hyperplane constraints | Multiple rows in CSC per hydro, one per hyperplane | Each plane contributes one constraint row (form defined in Hydro Production Models 2). Coefficients from FPHA fitting. |
Topology → Cascade arrays and water balance constraint structure:
| Field | Performance Consumer | How It Enters |
|---|---|---|
| bus_id | Bus-entity membership arrays | Determines which bus’s load balance constraint includes this hydro’s generation |
| downstream_id | Cascade topology arrays, water balance constraint structure | Determines outflow coupling: this hydro’s outflow appears as inflow to the downstream hydro in the water balance constraint |
Penalties → StageTemplate objective coefficients:
| Field (via HydroPenalties) | LP Element | Template Array |
|---|---|---|
| spillage_cost | Objective coefficient on spillage variable | objective[spillage_col(h, k)] |
| storage_violation_below_cost | Objective coefficient on storage slack | objective[storage_slack_col(h)] |
| All other penalty fields | Objective coefficients on corresponding slack variables | objective[slack_col(...)] |
Pre-resolved penalties (Internal Structures 10) are already temporally flattened — the stage template builder reads the penalty for the target stage directly without cascade resolution.
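The three-tier cascade that the template builder no longer has to run can be sketched as below. The function names, the Option-based override representation, and the flat table layout are assumptions for illustration; Internal Structures 10 defines the actual structure.

```rust
/// Resolves one penalty value through the three-tier cascade:
/// stage override, else entity override, else global default.
fn resolve_penalty(global: f64, entity: Option<f64>, stage: Option<f64>) -> f64 {
    stage.or(entity).unwrap_or(global)
}

/// Materializes a flat [stage][entity] table so later reads are a single index.
/// `stage_overrides` holds sparse (stage, entity, value) triples.
fn pre_resolve(
    n_stages: usize,
    global: f64,
    entity_overrides: &[Option<f64>],
    stage_overrides: &[(usize, usize, f64)],
) -> Vec<f64> {
    let n = entity_overrides.len();
    let mut table = Vec::with_capacity(n_stages * n);
    // Fill every stage with the entity-or-global value first.
    for _ in 0..n_stages {
        for &entity in entity_overrides {
            table.push(resolve_penalty(global, entity, None));
        }
    }
    // Then apply the sparse stage overrides on top.
    for &(stage, entity, value) in stage_overrides {
        table[stage * n + entity] = value;
    }
    table
}
```

After this step, reading the penalty for entity h at stage t is table[t * n + h], with no cascade logic left on the consumer side.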
PAR model → PrecomputedParLp and LP constraints:
| Source | Performance Consumer | Transformation |
|---|---|---|
| ParModel.seasonal_means | PrecomputedParLp.deterministic_base | Algebraic absorption of the seasonal means into one constant (derivation in PAR Inflow Model 7) |
| ParModel.ar_coefficients | PrecomputedParLp.psi → StageTemplate AR dynamics constraint coefficients | Temporal flattening (season → stage), then written as LP constraint coefficients |
| ParModel.residual_std | PrecomputedParLp.sigma → hot-path RHS patching (base + sigma * noise) | Temporal flattening |
Fields consumed only at initialization (not in any performance array):
| Field | Usage |
|---|---|
| id | Canonical sorting key. After sort, the positional index replaces the ID for all performance access. |
| name | Logging, error messages, output column headers. Not in any LP or performance array. |
| entry_stage_id, exit_stage_id | Entity lifecycle filtering. Determines which stages include this hydro in the LP. Not a runtime value: it gates which per-stage template includes the hydro. |
| tailrace, hydraulic_losses, efficiency | Consumed during FPHA hyperplane fitting (initialization). Not stored in performance arrays; their effect is absorbed into the fitted hyperplane coefficients. |
| evaporation_coefficients_mm | Consumed during stage template construction for the evaporation constraint. Combined with surface area to produce a per-stage evaporation bound. Absorbed into the LP row bound. |
| filling | Consumed during stage template construction to adjust constraints for filling-period stages. |
4.2 General Pattern
The Hydro mapping above illustrates the general pattern that applies to all entity types:
- Scalar bounds (min/max values) become column or row bounds in the StageTemplate CSC arrays, potentially with stage overrides from pre-resolved bounds.
- Costs and penalties become objective coefficients in the StageTemplate.
- Topology references (bus_id, downstream_id, source_bus_id/target_bus_id) become index-flattened membership arrays and determine constraint structure.
- Identity fields (id, name) are consumed at initialization for sorting and logging; they do not appear in performance arrays.
- Lifecycle fields (entry_stage_id, exit_stage_id) gate per-stage entity inclusion; they are not runtime values.
- Model parameters (productivity, AR coefficients, thermal cost curve segments) become LP constraint coefficients or are absorbed via precomputation.
The same analysis applies to Thermal (cost curve segments → piecewise objective, capacity → column bounds, bus_id → membership), Line (capacity → column bounds, losses → constraint coefficient, bus IDs → load balance structure), Bus (deficit segments → multiple deficit variables with piecewise costs), and other entity types. The mapping for each follows the same six categories above.
4.3 What Changes Require Rebuilding What
| Change in cobre-core | Affected Performance Types | Rebuild Scope |
|---|---|---|
| Entity field value (e.g., max_storage_hm3 for one hydro) | StageTemplate for stages where the hydro is active | Per-stage template rebuild. Does not affect StageIndexer, CutPool, or PAR arrays. |
| Entity count (add/remove a hydro) | All types: StageIndexer, StageTemplate, CutPool (n_state changes), PrecomputedParLp, PAR arrays, cascade arrays, membership arrays | Full rebuild. This changes LP dimensions. |
| Stage count or block structure | StageTemplate (all stages), PrecomputedParLp, PAR arrays, opening tree, SolverWorkspace (basis cache size), CutPool (per-stage pool count) | Full rebuild. |
| PAR model coefficients only | PrecomputedParLp, PAR preprocessing arrays, opening tree | Partial rebuild. StageTemplate unaffected if AR constraint structure unchanged. |
| Penalty value only | StageTemplate objective coefficients for affected stages | Per-stage template rebuild of objective array only. |
In practice, the initialization is fast enough (< 1 second for production-scale systems) that partial rebuilds are an unnecessary optimization. The table above documents dependencies for reasoning about correctness, not for implementing incremental rebuild.
5. Contracts and Invariants
5.1 What the Adaptation Layer Requires from cobre-core
| Requirement | Guaranteed By | Spec Reference |
|---|---|---|
| All entity collections sorted by ascending EntityId (canonical ordering) | Input loading pipeline, canonicalization step | Design Principles 3, Input Loading Pipeline SS3 |
| All defaults resolved: no None where a default exists | Input loading pipeline, default resolution step | Internal Structures 1.9 (Resolved annotation) |
| All cross-references valid: downstream_id points to an existing hydro, bus_id points to an existing bus | Input loading pipeline, cross-reference validation | Validation Architecture SS2 |
| All stage overrides applied: pre-resolved bounds and penalties available as [stage][entity] lookups | Input loading pipeline, resolution steps | Internal Structures 10–11 |
| System struct immutable for the duration of Initialization + Scenario Gen + Training + Simulation | Ownership model: System is shared via &System | Internal Structures 1.3 |
| ParModel parameters pass validation (positive residual variance, AR polynomial stability) | Input loading pipeline, model validation | PAR Inflow Model 6 |
5.2 What the Adaptation Layer Guarantees to the Training Loop
| Guarantee | Implementation | Spec Reference |
|---|---|---|
| All performance-adapted types are immutable after construction (except StageLpCache) | Built during Initialization/Scenario Gen; no mutation API exposed. StageLpCache is updated between iterations by leader rank via SharedRegion (fence + barrier) — read-only during forward/backward passes | CLI and Lifecycle SS5.2a, Solver Abstraction SS11.4 |
| No heap allocation during Training/Simulation phases from adapted types | All buffers pre-allocated; Vec capacities set at construction | Ecosystem Guidelines 6, Memory Architecture 3.3 |
| SoA arrays and LP column/row indices use identical canonical ordering | Both derived from the same canonical sort | Design Principles 3 |
| State extraction is a contiguous memcpy from primal[0..n_state] | LP column layout places state variables at prefix | Solver Abstraction SS2.1 |
| Dual extraction for cut coefficients is a contiguous memcpy from dual[0..n_state] | LP row layout places fixing constraints at prefix with row-column symmetry | Solver Abstraction SS2.2, Training Loop SS5.5.1 |
| Cut coefficient dot products operate on 64-byte-aligned f64 arrays | Allocation uses Layout::from_size_align(..., 64) | Solver Abstraction SS2.5, Training Loop SS5.1.1 |
| StageTemplate CSC arrays are shared read-only across all threads within a rank | Templates are Send + Sync; no interior mutability | Solver Abstraction SS11.1 |
| StageIndexer produces identical results on all MPI ranks | LP structure depends only on System (identical on all ranks) | Training Loop SS5.5.1 |
| Thread-local SolverWorkspace buffers are NUMA-local to the owning thread | First-touch allocation inside parallel region | Solver Workspaces SS1.3, Memory Architecture 2.1 |
| Pre-resolved penalties and bounds are O(1) per (entity, stage) lookup | Materialized as flat [stage][entity] arrays during loading | Internal Structures 10–11 |
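The two prefix-extraction guarantees amount to a pair of slice copies into pre-allocated buffers. The sketch below uses hypothetical function names and ignores any sign or scaling convention applied to duals, which the referenced specs define.

```rust
/// Copies the state prefix of the primal solution into a pre-allocated buffer.
/// The buffer length plays the role of n_state; no allocation occurs here.
fn extract_state(primal: &[f64], state: &mut [f64]) {
    let n_state = state.len();
    state.copy_from_slice(&primal[..n_state]);
}

/// Copies the duals of the fixing constraints (row prefix) into a
/// pre-allocated cut-coefficient buffer. Row r pairs with state column r.
fn extract_cut_coefficients(dual: &[f64], coefficients: &mut [f64]) {
    let n_state = coefficients.len();
    coefficients.copy_from_slice(&dual[..n_state]);
}
```

Both helpers rely on the layout guarantee: if state variables or fixing constraints were not at the prefix, each extraction would need a gather loop over scattered indices instead of one contiguous copy.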
5.3 Boundary Location
The adaptation boundary is the cobre-sddp initialization function. Its signature (defined in Internal Structures 1.3):
cobre_sddp::train(system: &System, config: &TrainingConfig, comm: &C) -> Result<TrainingResult, TrainError>
Inside train, the first operation (before entering the iteration loop) is to build all performance-adapted views from &System. After this build phase, the iteration loop never accesses System entity fields directly — all hot-path data comes from the adapted types.
The boundary is one-way: data flows from cobre-core types into performance-adapted types during initialization, and never flows back. The adapted types are consumed and discarded when train returns. Runtime artifacts (cuts, convergence history, simulation results) are written to output files, not back into System.
Cross-References
- Internal Structures 1.1: Dual-nature design principle (the high-level contract this spec elaborates)
- Internal Structures 10–11: Pre-resolved penalties and bounds (temporal flattening at the cobre-io/cobre-core boundary)
- Internal Structures 14: PrecomputedParLp (algebraic absorption + temporal flattening)
- Solver Abstraction SS2: LP column and row layout convention (layout reshaping)
- Solver Abstraction SS11.1: StageTemplate construction and CSC representation
- Solver Abstraction SS11.4: StageLpCache design, sizing, SharedRegion ownership, update/read contracts
- Solver Workspaces SS1: Thread-local workspace allocation and lifecycle
- Training Loop SS5.5: StageIndexer definition and usage
- Scenario Generation SS1.3: PAR preprocessing SoA arrays
- Scenario Generation SS2.3: Opening tree tensor layout
- Binary Formats 3.4: Cut pool memory layout requirements
- Cut Management Impl SS1: Cut pool runtime structure
- CLI and Lifecycle SS5.2a: Phase ordering and initialization sequencing
- Design Principles 3: Declaration order invariance (canonical ordering foundation)
- Ecosystem Guidelines 6: Performance principles (cache locality, SIMD, zero allocation)
- Memory Architecture 2–3: NUMA placement, cache line alignment, zero-allocation enforcement
- PAR Inflow Model 7: Algebraic derivation of precomputed PAR components
- Notation Conventions 3.1: Time conversion factor derivation
- Design Principles 7: Decision to use f64 with unit suffixes; the adaptation boundary is the cast point identified in that section