Solver Interface Trait

Purpose

This spec defines the SolverInterface trait – the backend abstraction through which optimization algorithms perform LP operations: loading models, adding constraint rows, updating row and column bounds, solving, warm-starting from a cached basis, and extracting solution data. The solver is resolved as a generic type parameter at compile time, following the same monomorphization pattern used by the Communicator trait (Communicator Trait §3). Because LP solvers wrap FFI calls to C libraries (HiGHS, CLP) on a per-thread exclusive-ownership basis, the trait requires Send but not Sync. The operations contract originates from Solver Abstraction SS4, which defines the behavioral requirements validated against both reference solver APIs.

Convention: Rust traits as specification guidelines. The Rust trait definitions, method signatures, and struct declarations throughout this specification corpus serve as guidelines for implementation, not as absolute source-of-truth contracts that must be reproduced verbatim. Their purpose is twofold: (1) to express behavioral contracts, preconditions, postconditions, and type-level invariants more precisely than prose alone, and (2) to anchor conformance test suites that verify backend interchangeability (see Backend Testing §1). Implementation may diverge in naming, parameter ordering, error representation, or internal organization when practical considerations demand it – provided the behavioral contracts and conformance tests continue to pass. When a trait signature and a prose description conflict, the prose description (which captures the domain intent) takes precedence; the conflict should be resolved by updating the trait signature. This convention applies to all trait-bearing specification documents in src/specs/.

1. Trait Definition

The solver interface is modeled as a Rust trait with 10 methods: one for each operation in the solver interface contract (Solver Abstraction SS4.1), plus a name() method for diagnostics.

#![allow(unused)]
fn main() {
/// Backend abstraction for LP solver operations in optimization algorithms.
///
/// Implementations wrap a single solver instance (HiGHS `void*` handle,
/// CLP `Clp_Simplex*` handle, etc.) and encapsulate all solver-specific
/// behavior: API calling conventions, retry logic, dual normalization,
/// and basis format translation.
///
/// The trait requires `Send` (transferable between threads) but NOT `Sync`.
/// Each solver instance is exclusively owned by one thread at a time --
/// LP solvers are not thread-safe. The thread-local workspace pattern
/// (see Solver Workspaces SS1.1) ensures exclusive ownership without
/// runtime synchronization.
pub trait SolverInterface: Send {
    /// Bulk-load a pre-assembled structural LP into the solver.
    ///
    /// The stage template contains the LP matrix in CSC form, column and
    /// row bounds, and objective coefficients. This operation replaces
    /// any previously loaded model. The LP layout follows the convention
    /// defined in [Solver Abstraction SS2](./solver-abstraction.md).
    ///
    /// Maps to `Highs_passLp` (HiGHS) or `Clp_loadProblem` (CLP).
    fn load_model(&mut self, template: &StageTemplate);

    /// Batch-add constraint rows to the LP (dynamic constraint region).
    ///
    /// The cut batch contains active cuts in CSR format, ready for a
    /// single `addRows` call. Cuts are appended at the bottom of the
    /// constraint matrix per [Solver Abstraction SS2.2](./solver-abstraction.md).
    ///
    /// Maps to `Highs_addRows` (HiGHS) or `Clp_addRows` (CLP).
    fn add_rows(&mut self, cuts: &RowBatch);

    /// Update row bounds (constraint RHS values).
    ///
    /// Takes three parallel slices: `indices` (row indices to patch),
    /// `lower` (new lower bounds), and `upper` (new upper bounds).
    /// Updates inflow RHS, state-fixing constraints, and noise-fixing
    /// values without modifying the structural LP. For equality
    /// constraints, set lower[i] = upper[i] = value.
    /// This is the primary modification performed between successive
    /// solves at the same stage (within-stage incremental updates per
    /// [Solver Abstraction SS11.4](./solver-abstraction.md)).
    ///
    /// Maps to `Highs_changeRowsBoundsBySet` (HiGHS) or mutable pointer
    /// access via `Clp_rowLower()`/`Clp_rowUpper()` (CLP).
    fn set_row_bounds(&mut self, indices: &[usize], lower: &[f64], upper: &[f64]);

    /// Update column bounds (variable lower/upper bounds).
    ///
    /// Takes three parallel slices: `indices` (column indices to patch),
    /// `lower` (new lower bounds), and `upper` (new upper bounds).
    /// Updates variable bounds without modifying the structural LP.
    /// This method is not used in minimal viable SDDP but is included
    /// for completeness — future extensions (e.g., thermal unit
    /// commitment bounds, battery state-of-charge limits) may require
    /// per-scenario column bound updates.
    ///
    /// Maps to `Highs_changeColsBoundsBySet` (HiGHS) or mutable pointer
    /// access via `Clp_colLower()`/`Clp_colUpper()` (CLP).
    fn set_col_bounds(&mut self, indices: &[usize], lower: &[f64], upper: &[f64]);

    /// Solve the loaded LP, returning a zero-copy view or terminal error.
    ///
    /// Hot-path method encapsulating internal retry logic (see SS6).
    /// Returns either a valid solution view with normalized duals
    /// (see SS7) or a terminal error. The caller never sees intermediate
    /// retry attempts. The returned `SolutionView` borrows solver-internal
    /// buffers and is valid until the next `&mut self` call. Call
    /// `SolutionView::to_owned()` when the solution must outlive the borrow.
    ///
    /// Maps to `Highs_run` (HiGHS) or `Clp_dual`/`Clp_initialDualSolve`
    /// (CLP).
    fn solve(&mut self) -> Result<SolutionView<'_>, SolverError>;

    /// Inject a basis and solve, returning a zero-copy `SolutionView`.
    ///
    /// Loads the provided basis into the solver before invoking the
    /// solve sequence. Status codes in `basis` are injected directly
    /// without per-element enum translation. The basis structure splits
    /// at the cut boundary: static rows are reused directly, new dynamic
    /// constraint rows are initialized as Basic per
    /// [Solver Abstraction SS2.3](./solver-abstraction.md).
    /// On success the returned view borrows solver-internal buffers and
    /// is valid until the next `&mut self` call. Call
    /// `SolutionView::to_owned()` when the solution must outlive the borrow.
    ///
    /// Maps to `Highs_setBasis` + `Highs_run` (HiGHS) or
    /// `Clp_copyinStatus` + `Clp_dual` (CLP).
    fn solve_with_basis(
        &mut self,
        basis: &Basis,
    ) -> Result<SolutionView<'_>, SolverError>;

    /// Clear all internal solver state.
    ///
    /// Resets the solver to a clean state: clears cached basis,
    /// factorization workspace, and any accumulated internal state.
    /// After reset, the solver is ready for a fresh `load_model` call.
    ///
    /// Maps to `Highs_clearSolver` (HiGHS) or model reconstruction
    /// (CLP).
    fn reset(&mut self);

    /// Write solver-native i32 status codes into a caller-owned Basis buffer.
    ///
    /// The caller pre-allocates a `Basis` with `Basis::new` and reuses it
    /// across iterations, eliminating per-element enum translation overhead.
    /// The buffer is not resized by this method. The implementation writes
    /// into the first `num_cols` entries of `out.col_status` and the first
    /// `num_rows` entries of `out.row_status`. Panics if no model is loaded.
    /// Stored in the original problem space (not presolved) per
    /// [Solver Abstraction SS9](./solver-abstraction.md).
    ///
    /// Maps to `Highs_getBasis` (HiGHS) or `Clp_statusArray` (CLP).
    fn get_basis(&mut self, out: &mut Basis);

    /// Return accumulated solve metrics.
    ///
    /// Provides total solve count, total simplex iterations, retry
    /// count, failure count, and cumulative wall-clock time. Counters
    /// accumulate across all solves performed by this instance since
    /// construction.
    fn statistics(&self) -> SolverStatistics;

    /// Return the solver backend name.
    ///
    /// Used for logging, diagnostics, and checkpoint metadata.
    /// Returns a static string such as `"highs"` or `"clp"`.
    fn name(&self) -> &'static str;
}
}

Thread safety model: The Send bound allows solver instances to be transferred between threads (e.g., during thread pool initialization), but the absence of Sync prevents concurrent access. This matches the reality of C-library solver handles, which maintain mutable internal state (factorization workspace, working arrays) that is not safe to share. The thread-local workspace pattern in Solver Workspaces SS1.1 ensures each OpenMP thread owns exactly one solver instance for the entire training run.
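As an illustration of the Send-but-not-Sync model described above, the following minimal sketch wraps a raw pointer in a hypothetical `FakeHandle` type (a stand-in for a real C solver handle, not part of this specification). A raw pointer field makes the type neither `Send` nor `Sync` by default; an explicit `unsafe impl Send` opts back in to cross-thread transfer, while `Sync` is deliberately left unimplemented.

```rust
use std::thread;

/// Hypothetical stand-in for a C solver handle (e.g. a HiGHS `void*`).
/// The raw pointer field makes this type !Send and !Sync by default.
pub struct FakeHandle {
    ptr: *mut u64,
}

impl FakeHandle {
    pub fn new() -> Self {
        FakeHandle { ptr: Box::into_raw(Box::new(0u64)) }
    }

    /// Mutates state behind the pointer, as an FFI solve call would.
    pub fn bump(&mut self) -> u64 {
        // SAFETY: `ptr` came from `Box::into_raw` and is exclusively owned.
        unsafe {
            *self.ptr += 1;
            *self.ptr
        }
    }
}

impl Drop for FakeHandle {
    fn drop(&mut self) {
        // SAFETY: `ptr` was created in `new` and is freed exactly once here.
        unsafe { drop(Box::from_raw(self.ptr)) }
    }
}

// SAFETY (assumption of this sketch): the handle has no thread affinity and
// is used by one thread at a time. `Sync` is intentionally NOT implemented.
unsafe impl Send for FakeHandle {}

/// Moves the handle into a worker thread, which then owns it exclusively.
pub fn run_on_worker(mut handle: FakeHandle) -> u64 {
    thread::spawn(move || {
        handle.bump();
        handle.bump()
    })
    .join()
    .expect("worker thread panicked")
}
```

Sharing `&FakeHandle` across threads (for example through `Arc<FakeHandle>` captured by two spawned closures) would be rejected at compile time, which is the property the trait bound relies on.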

Mutability: All methods that modify solver state or write to internal buffers (load_model, add_rows, set_row_bounds, set_col_bounds, solve, solve_with_basis, reset, get_basis) take &mut self. get_basis requires &mut self because it writes to internal scratch buffers during extraction; it also takes a &mut Basis output parameter so the caller can pre-allocate and reuse the buffer across iterations without per-solve allocation. Read-only accessors (statistics, name) take &self.

2. Method Contracts

2.1 load_model

load_model bulk-loads a pre-assembled structural LP into the solver instance. This is the first step of the LP rebuild sequence at each stage transition (Solver Abstraction SS11.2, step 1). The stage template is built once at initialization and shared read-only across all threads within an MPI rank.

Preconditions:

| Condition | Description |
| --- | --- |
| template contains a valid CSC matrix | Column starts, row indices, and values arrays are consistent; no out-of-bounds indices |
| template follows the LP layout convention | Column and row ordering per Solver Abstraction SS2 |
| template.num_cols > 0 and template.num_rows > 0 | Non-empty LP |

Postconditions:

| Condition | Description |
| --- | --- |
| Solver holds the structural LP from template | Previous model (if any) is fully replaced |
| No cuts are present | The loaded model contains only structural constraints; constraint rows are added separately via add_rows |
| Solver basis is cleared | Any cached basis from a previous model is invalidated |

Infallibility: This method does not return Result. The stage template is validated during initialization (Solver Abstraction SS11.1); passing an invalid template is a programming error (panic on violation).
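The CSC consistency precondition can be made concrete with a small check. The sketch below is illustrative only: `CscMatrix` is a hypothetical subset of `StageTemplate` (reusing its field names), and `validate_csc` is an assumed helper name, not a function defined by this specification. It shows the kind of validation that initialization could perform so that load_model itself can remain infallible.

```rust
/// Hypothetical subset of `StageTemplate` holding only the CSC arrays.
pub struct CscMatrix {
    pub num_cols: usize,
    pub num_rows: usize,
    pub col_starts: Vec<i32>,
    pub row_indices: Vec<i32>,
    pub values: Vec<f64>,
}

/// Returns `Err` describing the first violated invariant, `Ok(())` otherwise.
pub fn validate_csc(m: &CscMatrix) -> Result<(), String> {
    if m.num_cols == 0 || m.num_rows == 0 {
        return Err("LP must be non-empty".into());
    }
    if m.col_starts.len() != m.num_cols + 1 {
        return Err("col_starts must have num_cols + 1 entries".into());
    }
    if m.col_starts[0] != 0 {
        return Err("col_starts must begin at 0".into());
    }
    if m.col_starts.windows(2).any(|w| w[0] > w[1]) {
        return Err("col_starts must be non-decreasing".into());
    }
    // Safe cast: the checks above guarantee the last offset is >= 0.
    let num_nz = m.col_starts[m.num_cols] as usize;
    if m.row_indices.len() != num_nz || m.values.len() != num_nz {
        return Err("row_indices and values must both have num_nz entries".into());
    }
    if m.row_indices.iter().any(|&r| r < 0 || r as usize >= m.num_rows) {
        return Err("row index out of bounds".into());
    }
    Ok(())
}
```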

2.2 add_rows

add_rows appends constraint rows to the dynamic constraint region in a single batch call. In SDDP, this is used to add Benders cuts. This is step 2 of the LP rebuild sequence (Solver Abstraction SS11.2). The row batch is assembled from the cut pool’s activity bitmap for the current stage.

Preconditions:

| Condition | Description |
| --- | --- |
| load_model has been called | A structural LP must be loaded before adding cuts |
| cuts contains valid CSR row data | Row starts, column indices, values, and bounds arrays are consistent |
| Cut column indices reference valid columns in the loaded model | Indices within [0, num_cols) |

Postconditions:

| Condition | Description |
| --- | --- |
| Active cuts are appended as rows at [n_static, n_static + cuts.num_rows) | Cut row positions follow Solver Abstraction SS2.2 bottom region |
| Structural rows [0, n_static) are unchanged | Adding cuts does not modify the structural LP |
| Solver basis is not automatically set | Caller must use solve_with_basis to apply a cached basis |

Infallibility: This method does not return Result. The cut batch is assembled from the pre-validated cut pool (Solver Abstraction SS5); invalid CSR data is a programming error (panic on violation).

2.3 set_row_bounds

set_row_bounds updates row bounds (constraint RHS values) without structural LP changes. This is step 3 of the LP rebuild sequence and the primary modification between successive solves at the same stage (Solver Abstraction SS11.4). Takes three parallel slices: indices (row indices to patch), lower (new lower bounds), and upper (new upper bounds). All three slices must have equal length. For equality constraints (water balance, lag fixing, noise fixing), set lower[i] = upper[i] = value.

Preconditions:

| Condition | Description |
| --- | --- |
| load_model has been called | A model must be loaded before patching |
| All indices in indices are valid | Each index references a valid row in the loaded model |
| All values in lower and upper are finite | No NaN or infinity |
| lower[i] <= upper[i] for each i | Lower bound does not exceed upper bound |

Postconditions:

| Condition | Description |
| --- | --- |
| Row lower and upper bounds at each patched index are updated | The LP reflects the current scenario realization |
| Non-patched rows are unchanged | Only the specified row indices are modified |
| Column bounds are unchanged | Row patching does not affect column bounds |
| Solver basis is preserved | Patching does not invalidate a previously set basis |

Infallibility: This method does not return Result. Patch indices are computed from the LP layout convention (Solver Abstraction SS2); out-of-bounds indices are a programming error (panic on violation).

Solver API mapping:

| Solver | API Call |
| --- | --- |
| HiGHS | Highs_changeRowsBoundsBySet(model, num_set, indices, lower, upper) |
| CLP | Mutable double* via Clp_rowLower() / Clp_rowUpper() |
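The set_row_bounds contract (three parallel slices, finite values, lower not exceeding upper, panic on violation) can be sketched against a plain in-memory model. `RowBounds` below is a hypothetical stand-in for the solver's row-bound arrays, not a type from this specification, and a real backend would forward the slices to the solver API instead of writing into vectors.

```rust
/// Hypothetical in-memory stand-in for a solver's row-bound arrays.
pub struct RowBounds {
    pub lower: Vec<f64>,
    pub upper: Vec<f64>,
}

impl RowBounds {
    /// Patches the given rows; panics on any precondition violation,
    /// mirroring the "programming error" policy of the contract.
    pub fn set_row_bounds(&mut self, indices: &[usize], lower: &[f64], upper: &[f64]) {
        assert_eq!(indices.len(), lower.len(), "indices/lower length mismatch");
        assert_eq!(indices.len(), upper.len(), "indices/upper length mismatch");
        for ((&row, &lo), &hi) in indices.iter().zip(lower).zip(upper) {
            assert!(row < self.lower.len(), "row index {} out of bounds", row);
            assert!(lo.is_finite() && hi.is_finite(), "bounds must be finite");
            assert!(lo <= hi, "lower exceeds upper at row {}", row);
            self.lower[row] = lo;
            self.upper[row] = hi;
        }
    }
}
```

An equality constraint is expressed by passing the same value in both slices, for example `rb.set_row_bounds(&[1], &[4.5], &[4.5])` fixes row 1 at 4.5 and leaves every other row untouched.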

2.3a set_col_bounds

set_col_bounds updates column bounds (variable lower/upper bounds) without structural LP changes. Takes three parallel slices: indices (column indices to patch), lower (new lower bounds), and upper (new upper bounds). All three slices must have equal length. This method is not used in minimal viable SDDP but is included for completeness – future extensions (e.g., thermal unit commitment bounds, battery state-of-charge limits) may require per-scenario column bound updates.

Preconditions:

| Condition | Description |
| --- | --- |
| load_model has been called | A model must be loaded before patching |
| All indices in indices are valid | Each index references a valid column in the loaded model |
| All values in lower and upper are finite | No NaN or infinity |
| lower[i] <= upper[i] for each i | Lower bound does not exceed upper bound |

Postconditions:

| Condition | Description |
| --- | --- |
| Column lower and upper bounds at each patched index are updated | The LP reflects the current scenario realization |
| Non-patched columns are unchanged | Only the specified column indices are modified |
| Row bounds are unchanged | Column patching does not affect row bounds |
| Solver basis is preserved | Patching does not invalidate a previously set basis |

Infallibility: This method does not return Result. Patch indices are computed from the LP layout convention (Solver Abstraction SS2); out-of-bounds indices are a programming error (panic on violation).

Solver API mapping:

| Solver | API Call |
| --- | --- |
| HiGHS | Highs_changeColsBoundsBySet(model, num_set, indices, lower, upper) |
| CLP | Mutable double* via Clp_colLower() / Clp_colUpper() |

2.4 solve

solve invokes the LP solver with its internal retry logic and returns either a valid solution or a terminal error. This is the primary hot-path method – called millions of times during a training run.

Preconditions:

| Condition | Description |
| --- | --- |
| load_model has been called | A model must be loaded |
| Cuts and scenario patches have been applied as needed | The LP is ready to solve |

Postconditions (on Ok):

| Condition | Description |
| --- | --- |
| SolutionView.objective is the optimal objective value | Minimization sense |
| SolutionView.primal contains optimal primal values | Length equals num_cols |
| SolutionView.dual contains normalized dual values | Sign convention per Solver Abstraction SS8; see SS7 |
| SolutionView.dual.len() == num_rows | One dual per constraint (structural + cuts) |
| SolutionView borrows solver-internal buffers (zero-copy) | Valid until the next &mut self call; call to_owned() to persist |
| Solver basis reflects the optimal solution | Available via get_basis() after a successful solve |
| SolverStatistics counters are incremented | Solve count, iteration count, and timing updated |

Postconditions (on Err):

| Condition | Description |
| --- | --- |
| SolverError variant identifies the terminal failure | After all retry attempts are exhausted (see SS6) |
| Solver state is unspecified | The caller should call reset() before reusing the solver |
| SolverStatistics.retry_count reflects retry attempts | Retry attempts are tracked even on failure |

Fallibility: This method returns Result<SolutionView<'_>, SolverError>. LP solves wrap FFI calls to C libraries that may encounter numerical difficulties, infeasibility, or other solver-internal failures that cannot be prevented by precondition checks. SolutionView is a zero-copy borrow of solver-internal buffers; call SolutionView::to_owned() to convert to an owned LpSolution when the data must outlive the solver borrow.

2.5 solve_with_basis

solve_with_basis sets a cached basis for warm-starting before solving. This combines the set basis and solve operations into a single method to ensure the basis is applied atomically with the solve. Warm-starting from a cached basis typically reduces simplex iterations by 80-95% compared to cold starts.

Preconditions:

| Condition | Description |
| --- | --- |
| load_model has been called | A model must be loaded |
| basis.col_status.len() matches the loaded model’s column count | Basis dimension matches current LP |
| basis.row_status.len() matches the loaded model’s row count (structural + cuts) | Basis covers all rows including appended cuts |

Postconditions (on Ok):

| Condition | Description |
| --- | --- |
| Same as solve() Ok postconditions | Valid SolutionView with normalized duals (zero-copy borrow) |
| Simplex iterations are typically reduced vs. cold start | Warm-start benefit is observable in SolverStatistics |

Postconditions (on Err):

| Condition | Description |
| --- | --- |
| Same as solve() Err postconditions | Terminal error after retry exhaustion |
| Implementation may fall back to cold start during retry | Basis rejection is a valid retry escalation step |

Fallibility: Same as solve() – returns Result<SolutionView<'_>, SolverError>.

Basis dimension mismatch handling: If the provided basis dimensions do not match the current LP (e.g., because cuts were added since the basis was saved), the solver implementation must handle this gracefully. Per Solver Abstraction SS2.3, the static portion of the basis is position-stable; only the dynamic constraint portion needs extension (new dynamic constraint rows initialized as Basic) or truncation.
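One possible reading of the extension and truncation rule is sketched below. It assumes the implementation works on its own copy of the row statuses (the `basis` argument is borrowed immutably) and uses the HiGHS code 1 for Basic, as listed in the Basis type documentation in SS4.2. The helper name `fit_row_status` is hypothetical.

```rust
/// HiGHS-style status code for a basic row (the Basis docs list 1 = Basic).
pub const BASIC: i32 = 1;

/// Resizes a row-status buffer to `num_rows`.
///
/// Entries in the position-stable prefix are left untouched. When the LP has
/// gained rows, the new trailing entries are initialized as Basic; when it
/// has lost rows, the trailing entries are dropped.
pub fn fit_row_status(row_status: &mut Vec<i32>, num_rows: usize) {
    // `Vec::resize` truncates when shrinking and pads with `BASIC` when growing.
    row_status.resize(num_rows, BASIC);
}
```

For example, a saved status vector `[0, 1, 2]` fitted to five rows becomes `[0, 1, 2, 1, 1]`, and fitted to two rows becomes `[0, 1]`.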

2.6 reset

reset clears all internal solver state, returning the instance to a clean state equivalent to a freshly constructed solver. This is used for error recovery (after a terminal SolverError) or when switching between fundamentally different LP structures.

Preconditions: None. reset can be called at any time.

Postconditions:

| Condition | Description |
| --- | --- |
| Solver state is clean | No loaded model, no cached basis, no factorization |
| load_model must be called before next solve | The solver cannot solve without a loaded model |
| Statistics are preserved | reset does not zero the accumulated statistics counters |

Infallibility: This method does not return Result. Clearing solver state is a local operation with no failure modes.

2.7 get_basis

get_basis writes solver-native i32 status codes into a caller-owned Basis buffer. The caller pre-allocates a Basis with Basis::new and reuses it across iterations, eliminating per-solve allocation and per-element enum translation overhead on the hot path. The basis is stored in the original problem space (not presolved) to ensure portability across solver versions and presolve strategies (Solver Abstraction SS9).

Preconditions:

| Condition | Description |
| --- | --- |
| A successful solve or solve_with_basis has completed | A basis exists only after a successful solve |
| out is a pre-allocated Basis | Created via Basis::new(num_cols, num_rows) and reused across iterations |

Postconditions:

| Condition | Description |
| --- | --- |
| out.col_status[0..num_cols] contains column status codes | Written in place; buffer is not resized |
| out.row_status[0..num_rows] contains row status codes | Includes both structural and dynamic constraint rows |
| Status values are in the canonical set | AtLower, Basic, AtUpper, Free, or Fixed per Solver Abstraction SS9 |

Infallibility: This method does not return Result. After a successful solve, the basis always exists and can be extracted. Calling get_basis without a prior successful solve is a programming error (panic on violation).

2.8 statistics

statistics returns accumulated solve metrics for this solver instance. The counters grow monotonically across all solves performed since construction; reset does not clear them.

Preconditions: None. Can be called at any time, including before any solves.

Postconditions:

| Condition | Description |
| --- | --- |
| All fields are non-negative | Counters start at zero and only increment |
| solve_count >= success_count + failure_count | Total solves decompose into successes and failures |
| retry_count counts individual retry attempts | One retry attempt per escalation step per failed solve |

Infallibility: This method does not return Result. It reads internal counters with no failure modes.

2.9 name

name returns a static string identifying the solver backend. Used for logging, diagnostics, and checkpoint metadata.

Preconditions: None.

Postconditions:

| Condition | Description |
| --- | --- |
| Returns a non-empty &'static str | Lifetime is 'static – no allocation needed |
| Value is constant for the implementation | "highs", "clp", etc. |

Infallibility: This method does not return Result. It returns a compile-time constant.

3. Error Type

The SolverError enum categorizes terminal LP solve failures. These are the errors that reach the SDDP algorithm after all solver-internal retry logic has been exhausted. The six variants correspond to the error categories defined in Solver Abstraction SS6.

#![allow(unused)]
fn main() {
/// Terminal LP solve error returned after all retry attempts are exhausted.
///
/// The calling algorithm uses the variant to determine its response:
/// hard stop (`Infeasible`, `Unbounded`, `NumericalDifficulty`,
/// `InternalError`) or terminate with a diagnostic error
/// (`TimeLimitExceeded`, `IterationLimit`).
#[derive(Debug)]
pub enum SolverError {
    /// The LP has no feasible solution.
    ///
    /// Indicates a data error (inconsistent bounds or constraints) or
    /// a modeling error. The calling algorithm performs a hard stop.
    Infeasible,

    /// The LP objective is unbounded below.
    ///
    /// Indicates a modeling error (missing bounds, incorrect objective
    /// sign). The calling algorithm performs a hard stop.
    Unbounded,

    /// Solver encountered numerical difficulties that persisted through
    /// all retry attempts.
    ///
    /// The calling algorithm should log the error and perform a hard stop.
    NumericalDifficulty {
        /// Human-readable description of the numerical issue from the solver.
        message: String,
    },

    /// Per-solve wall-clock time budget exhausted.
    TimeLimitExceeded {
        /// Elapsed wall-clock time in seconds.
        elapsed_seconds: f64,
    },

    /// Solver simplex iteration limit reached.
    IterationLimit {
        /// Number of iterations performed.
        iterations: u64,
    },

    /// Unrecoverable solver-internal failure.
    ///
    /// Covers FFI panics, memory allocation failures within the solver,
    /// corrupted internal state, or any error not classifiable into the
    /// above categories. The calling algorithm logs and performs a hard stop.
    InternalError {
        /// Human-readable error description.
        message: String,
        /// Solver-specific error code, if available.
        error_code: Option<i32>,
    },
}
}

Error-to-response mapping:

| Variant | Hard Stop | Diagnostic |
| --- | --- | --- |
| Infeasible | Yes | No |
| Unbounded | Yes | No |
| NumericalDifficulty | Yes | No |
| TimeLimitExceeded | No | Yes |
| IterationLimit | No | Yes |
| InternalError | Yes | No |
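The mapping in the table above can be expressed as a single exhaustive match. In this sketch the enum is repeated from the definition above so the block is self-contained; the `Response` enum and the `response_for` function are hypothetical names introduced for illustration, not part of the specified API.

```rust
#[derive(Debug)]
pub enum SolverError {
    Infeasible,
    Unbounded,
    NumericalDifficulty { message: String },
    TimeLimitExceeded { elapsed_seconds: f64 },
    IterationLimit { iterations: u64 },
    InternalError { message: String, error_code: Option<i32> },
}

/// Hypothetical classification of how the calling algorithm reacts.
#[derive(Debug, PartialEq, Eq)]
pub enum Response {
    HardStop,
    Diagnostic,
}

/// Maps each terminal error to its response category.
/// The match has no wildcard arm, so adding a variant forces a decision here.
pub fn response_for(err: &SolverError) -> Response {
    match err {
        SolverError::Infeasible
        | SolverError::Unbounded
        | SolverError::NumericalDifficulty { .. }
        | SolverError::InternalError { .. } => Response::HardStop,
        SolverError::TimeLimitExceeded { .. }
        | SolverError::IterationLimit { .. } => Response::Diagnostic,
    }
}
```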

4. Supporting Types

4.1 SolutionView and LpSolution

SolutionView<'a> is the primary return type of solve() and solve_with_basis(). It borrows directly from solver-internal buffers (zero-copy), avoiding per-solve heap allocation on the hot path. The lifetime 'a ties the view to the solver instance, enforced at compile time by the Rust borrow checker: the view is valid until the next &mut self call on the solver. Call SolutionView::to_owned() to convert to an owned LpSolution when the data must outlive the current borrow or survive a subsequent solver call.

#![allow(unused)]
fn main() {
/// Zero-copy view of an LP solution, borrowing directly from
/// solver-internal buffers.
///
/// Valid until the next mutating method call on the solver (any
/// `&mut self` call). Use `to_owned()` to convert to an owned
/// `LpSolution` when the solution data must outlive the current borrow.
///
/// All values are in the original (unscaled) problem space. Dual values
/// are normalized to the canonical sign convention per
/// [Solver Abstraction SS8](./solver-abstraction.md) -- see SS7.
#[derive(Debug, Clone, Copy)]
pub struct SolutionView<'a> {
    /// Optimal objective value (minimization sense).
    pub objective: f64,

    /// Primal variable values, indexed by column.
    /// Length equals `num_cols`. State variables occupy the contiguous
    /// prefix `[0, n_state)` per [Solver Abstraction SS2.1](./solver-abstraction.md).
    pub primal: &'a [f64],

    /// Dual multipliers (shadow prices), indexed by row.
    /// Length equals `num_rows` (structural + cuts). Cut-relevant
    /// constraint duals occupy the contiguous prefix `[0, n_dual_relevant)`
    /// per [Solver Abstraction SS2.2](./solver-abstraction.md).
    ///
    /// Sign convention: normalized per SS7 before returning.
    pub dual: &'a [f64],

    /// Reduced costs, indexed by column.
    /// Length equals `num_cols`.
    pub reduced_costs: &'a [f64],

    /// Number of simplex iterations performed for this solve.
    pub iterations: u64,

    /// Wall-clock solve time in seconds (excluding retry overhead).
    pub solve_time_seconds: f64,
}

/// Complete owned solution from a successful LP solve.
///
/// Produced by `SolutionView::to_owned()`. All values are in the
/// original (unscaled) problem space. Dual values are normalized to
/// the canonical sign convention per
/// [Solver Abstraction SS8](./solver-abstraction.md) -- see SS7.
pub struct LpSolution {
    /// Optimal objective value (minimization sense).
    pub objective: f64,

    /// Primal variable values, indexed by column.
    /// Length equals `num_cols`.
    pub primal: Vec<f64>,

    /// Dual multipliers (shadow prices), indexed by row.
    /// Length equals `num_rows` (structural + cuts).
    /// Sign convention: normalized per SS7 before returning.
    pub dual: Vec<f64>,

    /// Reduced costs, indexed by column.
    /// Length equals `num_cols`.
    pub reduced_costs: Vec<f64>,

    /// Number of simplex iterations performed for this solve.
    pub iterations: u64,

    /// Wall-clock solve time in seconds (excluding retry overhead).
    pub solve_time_seconds: f64,
}
}
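The borrow relationship between a view and its solver can be demonstrated with a toy model. Everything in the sketch below (`ToySolver`, `View`, `Owned`, `set_rhs`, `two_solves`) is hypothetical and heavily reduced; the "solve" just fills a buffer. The point is the lifetime pattern: the view returned by `solve` borrows the solver, so the data must be copied out with `to_owned()` before the next `&mut self` call.

```rust
/// Reduced stand-in for `SolutionView`: borrows the solver's buffer.
pub struct View<'a> {
    pub objective: f64,
    pub primal: &'a [f64],
}

/// Reduced stand-in for `LpSolution`: owns its data.
pub struct Owned {
    pub objective: f64,
    pub primal: Vec<f64>,
}

impl View<'_> {
    /// Copies the borrowed data so it can outlive the solver borrow.
    pub fn to_owned(&self) -> Owned {
        Owned { objective: self.objective, primal: self.primal.to_vec() }
    }
}

/// Toy solver whose "solve" writes `rhs` into every primal entry.
pub struct ToySolver {
    primal: Vec<f64>,
    rhs: f64,
}

impl ToySolver {
    pub fn new(num_cols: usize) -> Self {
        ToySolver { primal: vec![0.0; num_cols], rhs: 0.0 }
    }

    pub fn set_rhs(&mut self, rhs: f64) {
        self.rhs = rhs;
    }

    /// The returned view borrows `self`; no allocation happens here.
    pub fn solve(&mut self) -> View<'_> {
        for x in self.primal.iter_mut() {
            *x = self.rhs;
        }
        View { objective: self.primal.iter().sum(), primal: &self.primal }
    }
}

/// Keeps the first solution alive across a second solve.
pub fn two_solves() -> (Owned, f64) {
    let mut solver = ToySolver::new(2);
    solver.set_rhs(1.0);
    let first = solver.solve().to_owned(); // copy out before the next &mut call
    solver.set_rhs(3.0); // allowed: the first view is no longer borrowed
    let second_objective = solver.solve().objective;
    (first, second_objective)
}
```

Holding the first `View` itself (rather than its owned copy) across the `set_rhs` call would fail to compile, which is the compile-time guarantee the specification refers to.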

4.2 Basis

#![allow(unused)]
fn main() {
/// Raw simplex basis stored as solver-native i32 status codes.
///
/// Each element is a solver-specific status code (e.g., HiGHS uses
/// 0=AtLower, 1=Basic, 2=AtUpper, 3=Free, 4=Fixed). The codes are
/// opaque to the calling algorithm — they are extracted from one solve
/// and passed back to the next via `solve_with_basis` for warm-starting.
///
/// Stored in the original problem space (not presolved) to ensure
/// portability across solver versions and presolve strategies
/// ([Solver Abstraction SS9](./solver-abstraction.md)).
pub struct Basis {
    /// Basis status codes for each column (variable), in solver-native encoding.
    pub col_status: Vec<i32>,

    /// Basis status codes for each row (constraint), in solver-native encoding.
    /// Includes both static rows and dynamic constraint rows.
    pub row_status: Vec<i32>,
}
}
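The pre-allocate-and-reuse pattern mentioned for get_basis can be sketched as follows. The body of `Basis::new` is an assumption (the specification names the constructor but does not show it; zero-filling is chosen here only for illustration), and `write_basis` is a hypothetical stand-in for the extraction step that writes into the prefix of the buffer without resizing it.

```rust
pub struct Basis {
    pub col_status: Vec<i32>,
    pub row_status: Vec<i32>,
}

impl Basis {
    /// Assumed constructor: allocates zero-filled buffers of the given sizes.
    pub fn new(num_cols: usize, num_rows: usize) -> Self {
        Basis {
            col_status: vec![0; num_cols],
            row_status: vec![0; num_rows],
        }
    }
}

/// Hypothetical extraction step: copies solver-side status codes into the
/// first `cols.len()` / `rows.len()` entries of `out`.
/// The buffer is never resized; a too-small buffer causes a panic.
pub fn write_basis(cols: &[i32], rows: &[i32], out: &mut Basis) {
    out.col_status[..cols.len()].copy_from_slice(cols);
    out.row_status[..rows.len()].copy_from_slice(rows);
}
```

Because the buffer lengths never change, a caller can size the `Basis` once for the largest LP it expects and pass the same instance to every extraction.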

4.3 SolverStatistics

#![allow(unused)]
fn main() {
/// Accumulated solve metrics for a single solver instance.
///
/// Counters grow monotonically from construction. Thread-local --
/// each thread owns one solver instance and accumulates its own
/// statistics. Aggregated across threads via reduction after training
/// completes.
///
/// `reset()` does **not** zero statistics counters. They persist across
/// model reloads for the lifetime of the solver instance.
pub struct SolverStatistics {
    /// Total number of `solve` and `solve_with_basis` calls.
    pub solve_count: u64,

    /// Number of solves that returned `Ok` (optimal solution found).
    pub success_count: u64,

    /// Number of solves that returned `Err` (terminal failure after retries).
    pub failure_count: u64,

    /// Total simplex iterations summed across all solves.
    pub total_iterations: u64,

    /// Total retry attempts summed across all failed solves.
    pub retry_count: u64,

    /// Cumulative wall-clock time spent in solver calls, in seconds.
    pub total_solve_time_seconds: f64,

    /// Number of times `solve_with_basis` fell back to cold-start due to
    /// basis rejection.
    pub basis_rejections: u64,

    /// Number of solves that returned optimal on the first attempt
    /// (before any retry). Enables first-try rate computation:
    /// `first_try_rate = first_try_successes / solve_count`.
    /// The complement `success_count - first_try_successes` gives the
    /// number of retried solves.
    pub first_try_successes: u64,

    /// Total number of `solve_with_basis` calls (basis offers).
    /// Combined with `basis_rejections`, enables basis hit rate computation:
    /// `basis_hit_rate = 1 - basis_rejections / basis_offered`.
    pub basis_offered: u64,

    /// Total number of `load_model` calls.
    pub load_model_count: u64,

    /// Total number of `add_rows` calls.
    pub add_rows_count: u64,

    /// Cumulative wall-clock time spent in `load_model` calls, in seconds.
    pub total_load_model_time_seconds: f64,

    /// Cumulative wall-clock time spent in `add_rows` calls, in seconds.
    pub total_add_rows_time_seconds: f64,

    /// Cumulative wall-clock time spent in `set_row_bounds` and
    /// `set_col_bounds` calls, in seconds.
    pub total_set_bounds_time_seconds: f64,

    /// Cumulative wall-clock time spent in `set_basis` FFI calls, in seconds.
    /// Accumulated by `solve_with_basis` around the basis installation step.
    /// `solve()` (without basis) does not increment this counter.
    pub total_basis_set_time_seconds: f64,

    /// Per-level retry success histogram (12 levels, indexed 0 through 11).
    /// `retry_level_histogram[k]` counts how many solves were recovered at
    /// retry level `k`. The sum equals `success_count - first_try_successes`.
    pub retry_level_histogram: Vec<u64>,
}
}
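The two derived rates named in the field documentation can be computed as shown in this sketch. `StatsSubset` is a hypothetical reduction of `SolverStatistics` to the four counters involved, and returning `None` for a zero denominator is a design choice of the sketch, not something the specification prescribes.

```rust
/// Hypothetical subset of `SolverStatistics` with only the counters
/// needed for the two derived rates.
pub struct StatsSubset {
    pub solve_count: u64,
    pub first_try_successes: u64,
    pub basis_offered: u64,
    pub basis_rejections: u64,
}

/// first_try_rate = first_try_successes / solve_count.
/// Returns `None` before any solve has been recorded.
pub fn first_try_rate(s: &StatsSubset) -> Option<f64> {
    if s.solve_count == 0 {
        None
    } else {
        Some(s.first_try_successes as f64 / s.solve_count as f64)
    }
}

/// basis_hit_rate = 1 - basis_rejections / basis_offered.
/// Returns `None` when no basis has been offered yet.
pub fn basis_hit_rate(s: &StatsSubset) -> Option<f64> {
    if s.basis_offered == 0 {
        None
    } else {
        Some(1.0 - s.basis_rejections as f64 / s.basis_offered as f64)
    }
}
```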

4.4 StageTemplate

The StageTemplate holds the pre-assembled structural LP for one stage in solver-ready CSC form. It is built once at initialization and shared read-only across all threads. The full specification of its contents, lifecycle, and memory layout is in Solver Abstraction SS11.1. This spec defines only the type signature for use in the SolverInterface trait.

Construction ownership: The cobre-sddp crate owns StageTemplate construction. A builder function in cobre-sddp takes a reference to the resolved System struct (Internal Structures SS1) and the StageDefinition for the target stage, and produces a StageTemplate. The solver crate (cobre-solver) receives StageTemplate as an opaque data holder and does not interpret its contents – it bulk-loads the CSC arrays into the underlying LP solver without understanding what the columns or rows represent. This separation ensures that LP modeling concerns (variable ordering, constraint structure, state dimension) remain in cobre-sddp, while solver concerns (API calls, retry logic, basis management) remain in cobre-solver.

#![allow(unused)]
fn main() {
// Construction function signature (in cobre-sddp, not cobre-solver):
//
// pub fn build_stage_template(
//     system: &System,
//     stage_def: &StageDefinition,
//     stage_index: usize,
// ) -> StageTemplate;
}
#![allow(unused)]
fn main() {
/// Pre-assembled structural LP for one stage, in CSC (column-major) form.
///
/// Built once at initialization from resolved internal structures
/// ([Internal Structures](../data-model/internal-structures.md)).
/// Shared read-only across all threads within an MPI rank.
/// Column and row ordering follows [Solver Abstraction SS2](./solver-abstraction.md).
pub struct StageTemplate {
    /// Number of columns (variables).
    pub num_cols: usize,
    /// Number of static rows (constraints, excluding dynamic constraints).
    pub num_rows: usize,
    /// Number of non-zero entries in the structural matrix.
    pub num_nz: usize,

    /// CSC column start offsets (`i32` for HiGHS FFI compatibility).
    /// Length: `num_cols + 1`; `col_starts[num_cols] == num_nz`.
    pub col_starts: Vec<i32>,
    /// CSC row indices (`i32` for HiGHS FFI compatibility). Length: `num_nz`.
    pub row_indices: Vec<i32>,
    /// CSC non-zero values. Length: `num_nz`.
    pub values: Vec<f64>,

    /// Column lower bounds. Length: `num_cols`.
    pub col_lower: Vec<f64>,
    /// Column upper bounds. Length: `num_cols`.
    pub col_upper: Vec<f64>,
    /// Objective coefficients. Length: `num_cols`.
    pub objective: Vec<f64>,

    /// Row lower bounds. Length: `num_rows`.
    pub row_lower: Vec<f64>,
    /// Row upper bounds. Length: `num_rows`.
    pub row_upper: Vec<f64>,

    /// Number of state variables (contiguous prefix of columns).
    /// Equal to N * (1 + L) per [Solver Abstraction SS2.1](./solver-abstraction.md).
    pub n_state: usize,
    /// Number of state values to transfer between stages.
    /// Equal to N * L per [Solver Abstraction SS2.1](./solver-abstraction.md)
    /// (storage + all lags except the oldest).
    pub n_transfer: usize,
    /// Number of dual-relevant constraint rows (contiguous prefix of rows).
    /// Currently equal to `n_state` (= `N + N*L` where `N` is the number
    /// of hydros and `L` is the maximum PAR lag order). FPHA and generic
    /// variable constraint rows are structural and not included in the
    /// dual-relevant set. Cut coefficients are extracted from
    /// `dual[0..n_dual_relevant]`.
    pub n_dual_relevant: usize,
    /// Number of operating hydros at this stage.
    pub n_hydro: usize,
    /// Maximum PAR order across all operating hydros at this stage.
    /// Determines the uniform lag stride: all hydros store `max_par_order`
    /// lag values regardless of their individual PAR order, enabling SIMD
    /// vectorization with a single contiguous state stride.
    pub max_par_order: usize,

    /// Per-column scaling factors for numerical conditioning.
    /// When non-empty (length `num_cols`), the constraint matrix, objective
    /// coefficients, and column bounds have been pre-scaled by these factors.
    /// The calling algorithm is responsible for unscaling primal values after
    /// each solve: `x_original[j] = col_scale[j] * x_scaled[j]`.
    /// When empty, no column scaling has been applied and solver results are
    /// used directly.
    pub col_scale: Vec<f64>,
    /// Per-row scaling factors for numerical conditioning.
    /// When non-empty (length `num_rows`), the constraint matrix and row
    /// bounds have been pre-scaled by these factors. The calling algorithm
    /// is responsible for unscaling dual values after each solve:
    /// `dual_original[i] = row_scale[i] * dual_scaled[i]`.
    /// When empty, no row scaling has been applied and solver results are
    /// used directly.
    pub row_scale: Vec<f64>,
}
}
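The length and offset invariants documented on the fields above lend themselves to a cheap sanity check at construction time. The following is a minimal sketch under stated assumptions: `CscMatrix` is a hypothetical trimmed-down mirror of the CSC fields of StageTemplate, and `validate`, `state_dims`, `unscale`, and `example_matrix` are illustrative helpers that do not appear in the spec.

```rust
/// Hypothetical mirror of the CSC fields of `StageTemplate`.
pub struct CscMatrix {
    pub num_cols: usize,
    pub num_rows: usize,
    pub num_nz: usize,
    pub col_starts: Vec<i32>,
    pub row_indices: Vec<i32>,
    pub values: Vec<f64>,
}

impl CscMatrix {
    /// Check the documented length and offset invariants.
    pub fn validate(&self) -> Result<(), String> {
        if self.col_starts.len() != self.num_cols + 1 {
            return Err("col_starts must have length num_cols + 1".to_string());
        }
        if self.col_starts[0] != 0 {
            return Err("col_starts[0] must be 0".to_string());
        }
        if i64::from(self.col_starts[self.num_cols]) != self.num_nz as i64 {
            return Err("col_starts[num_cols] must equal num_nz".to_string());
        }
        if self.col_starts.windows(2).any(|w| w[0] > w[1]) {
            return Err("col_starts must be non-decreasing".to_string());
        }
        if self.row_indices.len() != self.num_nz || self.values.len() != self.num_nz {
            return Err("row_indices and values must have length num_nz".to_string());
        }
        if self.row_indices.iter().any(|&r| r < 0 || r as usize >= self.num_rows) {
            return Err("row index out of range".to_string());
        }
        Ok(())
    }
}

/// State dimensions per the field docs: (n_state, n_transfer) = (N*(1+L), N*L).
pub fn state_dims(n_hydro: usize, max_par_order: usize) -> (usize, usize) {
    (n_hydro * (1 + max_par_order), n_hydro * max_par_order)
}

/// Undo scaling: `original[k] = scale[k] * scaled[k]`; an empty scale means "not scaled".
pub fn unscale(scaled: &[f64], scale: &[f64]) -> Vec<f64> {
    if scale.is_empty() {
        return scaled.to_vec();
    }
    scaled.iter().zip(scale).map(|(v, s)| v * s).collect()
}

/// The 2x3 matrix [[1, 0, 2], [0, 3, 4]] stored column by column.
pub fn example_matrix() -> CscMatrix {
    CscMatrix {
        num_cols: 3,
        num_rows: 2,
        num_nz: 4,
        col_starts: vec![0, 1, 2, 4],
        row_indices: vec![0, 1, 0, 1],
        values: vec![1.0, 3.0, 2.0, 4.0],
    }
}
```

The same `unscale` helper covers both `col_scale` (applied to primal values) and `row_scale` (applied to duals), since both documented formulas are element-wise products.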

4.5 RowBatch

The RowBatch holds constraint rows for batch addition in CSR (row-major) form, ready for a single add_rows call. In SDDP, it is assembled from the cut pool’s activity bitmap before each LP rebuild.

#![allow(unused)]
fn main() {
/// Batch of constraint rows for addition, in CSR (row-major) form.
///
/// In SDDP, assembled from the cut pool activity bitmap for the current stage.
/// Passed to `add_rows` for a single batch row-addition call.
/// See [Solver Abstraction SS5.4](./solver-abstraction.md) for the
/// assembly protocol.
pub struct RowBatch {
    /// Number of active cuts in this batch.
    pub num_rows: usize,

    /// CSR row start offsets (`i32` for HiGHS FFI compatibility).
    /// Length: `num_rows + 1`. `row_starts[num_rows]` equals the total
    /// number of non-zeros.
    pub row_starts: Vec<i32>,
    /// CSR column indices (`i32` for HiGHS FFI compatibility).
    /// Length: total non-zeros across all cuts.
    pub col_indices: Vec<i32>,
    /// CSR non-zero values. Length: total non-zeros across all cuts.
    pub values: Vec<f64>,

    /// Row lower bounds (cut intercepts alpha). Length: `num_rows`.
    pub row_lower: Vec<f64>,
    /// Row upper bounds (all +infinity). Length: `num_rows`.
    pub row_upper: Vec<f64>,
}
}
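To make the CSR layout concrete, the sketch below assembles a batch from a small cut pool filtered by an activity bitmap. It is an illustration, not the assembly protocol of Solver Abstraction SS5.4: the `Cut` type, the `assemble_row_batch` function, the `theta_col` parameter (index of the future-cost column), and the row form `theta - beta . x >= alpha` are all assumptions made for this example. The `RowBatch` struct is repeated so the block stands alone.

```rust
/// Local copy of the struct above so this example is self-contained.
pub struct RowBatch {
    pub num_rows: usize,
    pub row_starts: Vec<i32>,
    pub col_indices: Vec<i32>,
    pub values: Vec<f64>,
    pub row_lower: Vec<f64>,
    pub row_upper: Vec<f64>,
}

/// Hypothetical cut record: intercept alpha and dense gradient beta over state columns.
pub struct Cut {
    pub intercept: f64,
    pub coefficients: Vec<f64>,
}

/// Assemble active cuts into CSR rows of the assumed form `theta - beta . x >= alpha`.
pub fn assemble_row_batch(cuts: &[Cut], active: &[bool], theta_col: i32) -> RowBatch {
    let mut batch = RowBatch {
        num_rows: 0,
        row_starts: vec![0],
        col_indices: Vec::new(),
        values: Vec::new(),
        row_lower: Vec::new(),
        row_upper: Vec::new(),
    };
    for (cut, &is_active) in cuts.iter().zip(active) {
        if !is_active {
            continue; // inactive cuts are skipped entirely
        }
        for (j, &beta) in cut.coefficients.iter().enumerate() {
            if beta != 0.0 {
                batch.col_indices.push(j as i32);
                batch.values.push(-beta);
            }
        }
        // Coefficient of the future-cost variable.
        batch.col_indices.push(theta_col);
        batch.values.push(1.0);
        // Close the row: the next offset is the running non-zero count.
        batch.row_starts.push(batch.values.len() as i32);
        batch.row_lower.push(cut.intercept);
        batch.row_upper.push(f64::INFINITY);
        batch.num_rows += 1;
    }
    batch
}
```

Because `row_starts` always ends with the total non-zero count, the resulting arrays can be handed to a single batch row-addition call without any further bookkeeping.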

5. Dispatch Mechanism

Decision DEC-006 (active): Box<dyn Trait> rejected for all closed variant sets; enum dispatch used for algorithm variants (RiskMeasure, HorizonMode, SamplingScheme, CutSelectionStrategy, StoppingRuleSet); compile-time monomorphization reserved for FFI-wrapping traits.

The SolverInterface trait uses compile-time monomorphization – the training loop is generic over the solver type, and the concrete implementation is resolved at compile time. This is the same pattern used by the Communicator trait (Communicator Trait §3) and documented as the solver selection strategy in Solver Abstraction SS10.

#![allow(unused)]
fn main() {
/// Train the SDDP policy using the provided solver and communicator backends.
///
/// Both generic parameters are resolved at compile time -- no trait object
/// indirection, no vtable lookup, no dynamic dispatch on the hot path.
pub fn train<S: SolverInterface, C: Communicator>(
    solver_factory: impl Fn() -> S,
    comm: &C,
    config: &TrainingConfig,
    stages: &[StageTemplate],
    // ... other parameters
) -> TrainingResult {
    // Each OpenMP thread creates its own solver instance via solver_factory
    // Thread-local workspace holds the solver for the entire training run
    todo!()
}
}
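The factory-per-thread pattern in the signature above can be exercised with a toy backend. The sketch below is hedged in several ways: the trait is a reduced stand-in with only two methods, `MockSolver` and `solve_all` are hypothetical names, and `std::thread::scope` is used here as a substitute for the OpenMP threading model described in Hybrid Parallelism. What it does show accurately is that each worker builds its own solver from the factory and that the generic parameter is resolved statically.

```rust
use std::thread;

/// Reduced stand-in for the real trait (hypothetical: two methods only).
pub trait SolverInterface: Send {
    fn name(&self) -> &'static str;
    /// Pretend to solve an LP parameterized by a single RHS value.
    fn solve(&mut self, rhs: f64) -> f64;
}

/// Toy backend used only for this example.
pub struct MockSolver {
    pub solve_count: u64,
}

impl SolverInterface for MockSolver {
    fn name(&self) -> &'static str {
        "mock"
    }
    fn solve(&mut self, rhs: f64) -> f64 {
        self.solve_count += 1;
        2.0 * rhs
    }
}

/// Split `rhs` across worker threads; each worker owns one solver instance.
pub fn solve_all<S, F>(solver_factory: F, n_threads: usize, rhs: &[f64]) -> Vec<f64>
where
    S: SolverInterface,
    F: Fn() -> S + Sync,
{
    let n = n_threads.max(1);
    let chunk_len = ((rhs.len() + n - 1) / n).max(1);
    let factory = &solver_factory;
    thread::scope(|scope| {
        // Spawn every worker before joining any of them.
        let handles: Vec<_> = rhs
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || {
                    // The solver is created on the worker thread and never shared.
                    let mut solver = factory();
                    chunk.iter().map(|&b| solver.solve(b)).collect::<Vec<f64>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect::<Vec<f64>>()
    })
}
```

Since `S` is a type parameter rather than a trait object, the call `solver.solve(b)` compiles to a direct call into `MockSolver::solve` with no vtable lookup.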

Why compile-time monomorphization (not enum dispatch): The solver interface is fundamentally different from the five algorithm-variant abstractions (risk measure, cut formulation, horizon mode, sampling scheme, cut selection strategy) that use enum dispatch:

| Aspect | SolverInterface | Algorithm-Variant Traits |
| --- | --- | --- |
| Variation scope | Global – one solver per build | May vary per stage (risk measure) or per run (others) |
| Variant count | One per binary (feature-gated) | 2-3 variants coexist in the same binary |
| Call frequency | Millions of times (every LP solve) | Hundreds to thousands per iteration |
| FFI boundary | Wraps C library calls | Pure Rust computation |
| Performance sensitivity | Extremely high – on the hot path | Low to moderate – dominated by LP solve cost |
| Dispatch pattern | Compile-time monomorphization | Enum match at call site |

The key distinction is that exactly one solver backend is active per build (selected via Cargo feature flags per Solver Abstraction SS10). The binary never needs to dispatch between HiGHS and CLP at runtime. Monomorphization eliminates virtual dispatch overhead entirely, enables the compiler to inline solver-specific code paths, and is consistent with the established Communicator trait pattern.

Contrast with enum dispatch: The algorithm-variant traits (e.g., RiskMeasure, HorizonMode) use enum dispatch because they have a small, fixed variant set where multiple variants may coexist in the same binary (e.g., per-stage risk measure variation). The solver interface has neither characteristic: there is exactly one active solver per binary, and the call frequency is orders of magnitude higher.
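For comparison, enum dispatch over a closed variant set might look like the following sketch. The variant names `Expectation` and `WorstCase` and the `evaluate` method are hypothetical and chosen only to keep the example short; the actual RiskMeasure variants are defined elsewhere in the specification corpus.

```rust
/// Hypothetical closed variant set; the real variants are defined elsewhere.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum RiskMeasure {
    Expectation,
    WorstCase,
}

impl RiskMeasure {
    /// Aggregate scenario costs with a `match` at the call site.
    pub fn evaluate(&self, costs: &[f64], probabilities: &[f64]) -> f64 {
        match self {
            // Probability-weighted mean of the scenario costs.
            RiskMeasure::Expectation => costs
                .iter()
                .zip(probabilities)
                .map(|(c, p)| c * p)
                .sum(),
            // Largest scenario cost, ignoring probabilities.
            RiskMeasure::WorstCase => costs
                .iter()
                .copied()
                .fold(f64::NEG_INFINITY, f64::max),
        }
    }
}

/// Per-stage variation: both variants coexist in the same binary.
pub fn evaluate_stages(measures: &[RiskMeasure], costs: &[f64], probabilities: &[f64]) -> Vec<f64> {
    measures
        .iter()
        .map(|m| m.evaluate(costs, probabilities))
        .collect()
}
```

The per-stage vector in `evaluate_stages` is exactly the situation monomorphization cannot express with a single type parameter, which is why the two mechanisms are kept separate.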

6. Retry Logic Encapsulation

Retry logic for numerical difficulties is encapsulated within each SolverInterface implementation. The solve() and solve_with_basis() methods handle retries internally and return only the final result (success or terminal error) to the caller. The calling algorithm never sees intermediate retry attempts. The retry_max_attempts and retry_time_budget_seconds parameters are sourced from config.json (see Configuration Reference section 3.5).

The retry behavioral contract is defined in Solver Abstraction SS7:

  • Maximum attempts: Configurable upper bound (default: 5)
  • Time budget: Configurable wall-clock budget for all attempts combined
  • Escalating strategies: Least-disruptive (clear basis) to most-disruptive (switch algorithm)
  • Final disposition: Terminal SolverError with best partial solution if available
  • Logging: Each retry attempt logged at debug level

For solver-specific retry escalation sequences, see HiGHS Implementation and CLP Implementation. The implementation specs define which HiGHS/CLP API calls correspond to each escalation step (clear basis, disable presolve, switch to interior point, relax tolerances).
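The contract above (bounded attempts, a shared wall-clock budget, escalating strategies, a single final result) can be sketched as a small driver loop. Everything named here is an assumption made for illustration: the `Escalation` ladder follows the order listed in the previous paragraph, `max_retries` is interpreted as a cap on retries after the initial attempt, and `SolveOutcome` omits the partial solution that a terminal SolverError may carry. Real backends map each step to HiGHS or CLP API calls as described in their implementation specs.

```rust
use std::time::{Duration, Instant};

/// Hypothetical escalation ladder, least to most disruptive.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Escalation {
    ClearBasis,
    DisablePresolve,
    InteriorPoint,
    RelaxTolerances,
}

/// Simplified final disposition returned to the caller.
#[derive(Debug, PartialEq)]
pub enum SolveOutcome {
    Optimal { objective: f64, retries: usize },
    Failed { attempts: usize },
}

/// Run one plain attempt, then escalate until success, attempt cap, or budget exhaustion.
/// `attempt` returns `Some(objective)` on success and `None` on a numerical failure.
pub fn solve_with_retries<F>(
    mut attempt: F,
    max_retries: usize,
    time_budget: Duration,
) -> SolveOutcome
where
    F: FnMut(Option<Escalation>) -> Option<f64>,
{
    const LADDER: [Escalation; 4] = [
        Escalation::ClearBasis,
        Escalation::DisablePresolve,
        Escalation::InteriorPoint,
        Escalation::RelaxTolerances,
    ];
    let start = Instant::now();

    // Initial attempt with no escalation applied.
    if let Some(objective) = attempt(None) {
        return SolveOutcome::Optimal { objective, retries: 0 };
    }

    for (k, &step) in LADDER.iter().enumerate().take(max_retries) {
        // The budget covers all attempts combined, so check before each retry.
        if start.elapsed() >= time_budget {
            return SolveOutcome::Failed { attempts: k + 1 };
        }
        if let Some(objective) = attempt(Some(step)) {
            return SolveOutcome::Optimal { objective, retries: k + 1 };
        }
    }
    SolveOutcome::Failed { attempts: 1 + LADDER.len().min(max_retries) }
}
```

The caller sees only the returned `SolveOutcome`; intermediate failures stay inside the loop, which is the encapsulation property this section requires.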

7. Dual Normalization Contract

All dual multipliers in SolutionView.dual (and consequently LpSolution.dual via to_owned()) are pre-normalized to the canonical sign convention defined in Solver Abstraction SS8 before the solution is returned to the caller. Solver-specific sign differences are resolved within the SolverInterface implementation.

Canonical convention: A positive dual on a constraint means that increasing the RHS increases the objective ($\pi = \partial z^* / \partial b$).

Why this matters: The cut coefficient computation in the backward pass extracts dual multipliers from the cut-relevant constraint rows ([0, n_dual_relevant)) and uses them directly as cut gradients, with no further sign adjustment.

A sign error in these duals produces cuts that point in the wrong direction, leading to divergence of the outer approximation. By normalizing duals inside the solver implementation, the cut generation logic is solver-agnostic and provably correct regardless of which backend is active.

Implementation responsibility: Each solver backend must know its native dual sign convention and apply the appropriate transformation. For example, if a solver reports duals with the opposite sign for constraints, the implementation negates those duals before populating SolutionView.dual. This transformation is applied once per solve, adding negligible overhead to the solution extraction step.
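A minimal sketch of this responsibility follows, assuming a backend whose native duals are either already canonical or uniformly negated. The `NativeDualSign` enum and the `normalize_duals` and `cut_gradient` functions are hypothetical names; a real backend may need a per-row rule rather than a single global flip.

```rust
/// Hypothetical description of a backend's native dual sign.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum NativeDualSign {
    /// Native duals already follow the canonical convention.
    Canonical,
    /// Native duals have the opposite sign and must be negated.
    Opposite,
}

/// Convert native duals to the canonical convention in place, once per solve.
pub fn normalize_duals(duals: &mut [f64], native: NativeDualSign) {
    if native == NativeDualSign::Opposite {
        for d in duals.iter_mut() {
            *d = -*d;
        }
    }
}

/// Caller-side view: the cut gradient is the dual-relevant prefix, used as-is.
pub fn cut_gradient(dual: &[f64], n_dual_relevant: usize) -> &[f64] {
    &dual[..n_dual_relevant]
}
```

With the flip confined to `normalize_duals`, the backward pass can call `cut_gradient` identically for every backend.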

Cross-References

  • Solver Abstraction – Operations contract (SS4), LP layout convention (SS2), cut pool design (SS5), error categories (SS6), retry logic contract (SS7), dual normalization (SS8), basis storage (SS9), compile-time selection (SS10), stage templates and rebuild strategy (SS11)
  • Solver Workspaces – Thread-local solver infrastructure (SS1), per-stage basis cache (SS1.5), pre-allocated solution buffers (SS1.2), solve statistics (SS1.6)
  • HiGHS Implementation – HiGHS-specific API mapping (SS2), retry strategy, batch operations, memory footprint
  • CLP Implementation – CLP-specific API mapping (SS2), C++ wrapper strategy, mutable pointer optimization, cloning path
  • Communicator Trait – Compile-time monomorphization pattern (SS3) that the solver interface follows; convention blockquote source
  • Backend Testing – Conformance test methodology applicable to solver backend testing
  • Extension Points – Dispatch mechanism analysis (SS7) contrasting compile-time monomorphization (solver) with enum dispatch (algorithm variants); variant selection pipeline (SS6)
  • Training Loop – Forward pass (SS4) and backward pass (SS6) that drive solver invocations; abstraction points (SS3) parameterizing the training loop
  • Cut Management Implementation – Cut pool activity bitmap (SS1.1), CSR assembly for addRows (SS1), cut coefficient computation using duals from LpSolution
  • LP Formulation – Constraint structure that defines which duals are cut-relevant and feed into the cut coefficient formula
  • Binary Formats – Cut pool memory layout (SS3.4) that produces RowBatch inputs
  • Internal Structures – Logical in-memory data model from which StageTemplate is built
  • Hybrid Parallelism – OpenMP threading model (SS3) requiring thread-local solvers with Send but not Sync
  • Memory Architecture – NUMA-aware allocation (SS2) for solver workspaces