Skip to content

Cut Management

This spec defines the complete Benders cut lifecycle in Cobre: what cuts represent mathematically, how cut coefficients relate to LP dual variables, how per-scenario cuts are aggregated into a single cut, under what conditions cuts are valid, and what selection strategies are available to control cut pool growth.

A Benders cut at stage t1t-1 is a linear inequality that provides a lower bound on the cost-to-go function Vt(x)V_t(x):

θα+hHπhvvh+hH=1Phπh,lagah,\theta \geq \alpha + \sum_{h \in \mathcal{H}} \pi^v_h \cdot v_h + \sum_{h \in \mathcal{H}} \sum_{\ell=1}^{P_h} \pi^{lag}_{h,\ell} \cdot a_{h,\ell}

where:

  • θ\theta is the future cost variable in the stage t1t-1 LP
  • α\alpha is the cut intercept
  • πhv\pi^v_h is the storage coefficient for hydro hh (marginal value of water)
  • πh,lag\pi^{lag}_{h,\ell} is the AR lag coefficient for hydro hh, lag \ell (marginal value of inflow history)
  • vhv_h, ah,a_{h,\ell} are the state variables (see LP Formulation)

The cut coefficients are dense — every state variable (all storage volumes and all AR lags) has a non-zero coefficient in every cut. This density is a consequence of the full-state column-bound pinning used for reduced-cost extraction (§2).

Cobre adds cuts monotonically across iterations: an active cut is never removed from the lower-bound LP within a training run. Because the cut set only grows, the lower-bound estimate is non-decreasing across iterations of a training run. This is a methodology-level guarantee — the outer approximation of the value function can only become tighter iteration by iteration.

After solving the stage tt subproblem for trial state x^t1\hat{x}_{t-1} and scenario ωt\omega_t, the cut coefficients are derived from the LP reduced costs of the pinned incoming-state columns — the columns whose lower and upper bounds were set equal to the incoming state value (see LP Formulation §4a).

Both storage and inflow lags use the same pattern: an incoming-state LP variable is pinned to the trial value by equal column bounds, and the reduced cost of that column gives the cut coefficient directly:

Pinned columnReduced costCut coefficientUnits
Incoming storage (hydro hh)cˉhin\bar{c}^{in}_hπt,hv=cˉhin/dhcol\pi^v_{t,h} = \bar{c}^{in}_h / d^{col}_h$/hm³
AR lag (hydro hh, lag \ell)cˉh,lag\bar{c}^{lag}_{h,\ell}πt,h,lag=cˉh,lag/dh,col\pi^{lag}_{t,h,\ell} = \bar{c}^{lag}_{h,\ell} / d^{col}_{h,\ell}$/(m³/s)

where dcold^{col} is the per-column prescaler factor that unscales the reduced cost; a further factor KK converts to original cost units at the reporting boundary (see LP Formulation §12). When anticipated thermals are present, each anticipated-state slot column contributes one coefficient by the same rule.

The cut intercept ensures the cut passes through the trial point:

αt=Qt(x^t1,ωt)hHπt,hvv^hhH=1Phπt,h,laga^h,\alpha_t = Q_t(\hat{x}_{t-1}, \omega_t) - \sum_{h \in \mathcal{H}} \pi^v_{t,h} \cdot \hat{v}_h - \sum_{h \in \mathcal{H}} \sum_{\ell=1}^{P_h} \pi^{lag}_{t,h,\ell} \cdot \hat{a}_{h,\ell}

where Qt(x^t1,ωt)Q_t(\hat{x}_{t-1}, \omega_t) is the optimal objective value of the stage tt subproblem.

Sign convention: By the LP envelope theorem, Qt/x^j=πj\partial Q_t / \partial \hat{x}_j = \pi_j, the sensitivity of the optimal value to the pinned incoming state. For a column pinned at xj=xˉj=x^j\underline{x}_j = \bar{x}_j = \hat{x}_j, that sensitivity is exactly the column’s reduced cost — equal, by KKT parity, to the multiplier the former equality row xjin=x^jx^{in}_j = \hat{x}_j would have carried — so the cut coefficient is taken directly from the reduced cost with no sign flip (after the per-column unscaling above). The reduced cost automatically captures all downstream effects: for storage, this includes contributions from the water balance, FPHA hyperplanes, and any generic constraints that reference the incoming storage variable vhinv^{in}_h (see LP Formulation).

In the single-cut formulation, per-scenario cuts from the backward pass are aggregated into one cut per trial point by taking the probability-weighted expectation:

αˉt1=ωΩtp(ω)αt(ω)\bar{\alpha}_{t-1} = \sum_{\omega \in \Omega_t} p(\omega) \cdot \alpha_t(\omega) πˉt1,hv=ωΩtp(ω)πt,hv(ω)\bar{\pi}^v_{t-1,h} = \sum_{\omega \in \Omega_t} p(\omega) \cdot \pi^v_{t,h}(\omega) πˉt1,h,lag=ωΩtp(ω)πt,h,lag(ω)\bar{\pi}^{lag}_{t-1,h,\ell} = \sum_{\omega \in \Omega_t} p(\omega) \cdot \pi^{lag}_{t,h,\ell}(\omega)

where p(ω)p(\omega) is the probability of scenario ω\omega.

The aggregated cut (αˉ,πˉv,πˉlag)(\bar{\alpha}, \bar{\pi}^v, \bar{\pi}^{lag}) is added to stage t1t-1‘s cut pool.

A cut is valid if it is a lower bound on the true cost-to-go function everywhere in the feasible state space:

αk+πkxVt+1(x)xXt\alpha_k + \pi_k^\top x \leq V_{t+1}(x) \quad \forall x \in \mathcal{X}_t

Validity conditions: The cuts generated by SDDP are valid under:

  1. Convexity of stage subproblems — guaranteed because all subproblems are LPs
  2. Relatively complete recourse — feasibility for all states and scenarios. In Cobre, this is guaranteed by the recourse slack system (Category 1 penalties): every constraint that could be violated by exogenous uncertainty has a penalty slack variable, ensuring the LP is always feasible. See Penalty System.
  3. Correct sensitivity extraction — the reduced costs used as cut coefficients must come from an optimal LP solution (not an infeasible or unbounded one)

The number of cuts grows as O(iterations×forward_passes)\mathcal{O}(\text{iterations} \times \text{forward\_passes}). Many older cuts become redundant as newer, tighter cuts are generated. Without selection, the number of cut rows the LP carries grows linearly with iteration count.

Append-only pool with stable slots: Cuts are never deleted. Every cut ever generated is retained for the lifetime of the run at a stable, deterministic slot index — the slot is a fixed function of the iteration and forward-pass index, which is what makes the cut order reproducible across runs and rank counts (see Determinism Guarantees). The pool is never compacted.

Deactivation mechanism (periodic-pruning methods): A deactivated cut keeps its slot but is excluded from the per-iteration template rebake, so only active cuts are encoded as LP rows on each forward/backward solve. In the persistent lower-bound LP — where cut rows are never structurally removed, so the bound stays monotone — deactivation instead toggles the cut row’s bound to a trivially-satisfied ±\pm\infty sentinel. Both routes preserve the slot index for reproducibility and make reactivation exact: a cut that selection later restores is re-baked (or its bound restored) at the same slot.

Two families of selection strategy are available. Periodic-pruning methods (Level-1, LML1, Domination) run a value-evaluation pass after each nn-th iteration and deactivate redundant cuts from the pool. Dynamic Cut Selection (DCS) takes a different approach: it keeps the pool entirely append-only — never deactivating any cut — and instead controls which cuts are resident in each individual stage LP solve (§8).

Cut selection works from a value-evaluation view of activity. At a visited forward-pass trial point x^\hat{x}, each cut’s value is αk+πkx^\alpha_k + \pi_k^\top \hat{x}, and the per-state best value is

V(x^)=maxk{αk+πkx^}V^*(\hat{x}) = \max_k \left\{ \alpha_k + \pi_k^\top \hat{x} \right\}

taken over all populated cuts, active and inactive. A cut is active (near-optimal) at x^\hat{x} when its value lies within a tolerance band of the best:

cut k is active at x^    V(x^)(αk+πkx^)ϵ\text{cut } k \text{ is active at } \hat{x} \iff V^*(\hat{x}) - (\alpha_k + \pi_k^\top \hat{x}) \le \epsilon

This is equivalent to the cut being binding (or near-binding) at the LP optimum reached from x^\hat{x} — a cut at the per-state maximum is the one the future-cost variable θ\theta rests on. The tolerance ϵ\epsilon is the strategy’s tie-breaking band: tie_tolerance for Level-1/LML1, domination_tolerance for Domination (§10).

ToleranceEffect
0Only exact-maximum cuts count as active
1e-10Default: ties within rounding of the maximum are kept active
largerWider near-optimal band retained
very largeAll cuts considered active (no deactivation)

Three periodic-pruning strategies are available, in increasing order of aggressiveness. All three share one value-evaluation kernel: every populated cut (active and inactive) is evaluated at every visited trial point, the per-state maximum is taken, and a per-state survival rule decides which cuts to keep. The kernel treats deactivation and reactivation symmetrically — in a single pass, a selected cut that is currently inactive is reactivated and an active cut not selected anywhere is deactivated. It is bit-deterministic regardless of thread count (see Determinism Guarantees).

Visited-states window: these strategies score cuts against the trial points held in the visited-states archive. To bound memory on long runs, the archive keeps only the most recent trial points — roughly those gathered over the last check_frequency iterations. After each selection run the archive is trimmed to that window, so a run sees up to about two windows of accumulated states before the trim; older trial points are then evicted. Dynamic Cut Selection (§8) does not use this archive.

At each visited trial point, every cut within tie_tolerance of the per-state maximum value survives; the selected set is the union of these near-maximum cuts across all visited states. A cut is deactivated only if, at every visited state, its value is more than tie_tolerance below the maximum there. This strategy was originally proposed by de Matos, Philpott & Finardi (2015).

Properties:

  • Least aggressive — retains any cut that is near-optimal at some visited state
  • Preserves the convergence guarantee (see section 9)

Like Level-1, but at each visited state only the single oldest eligible cut within tie_tolerance of the maximum survives (oldest = smallest slot index among non-warm-start cuts). The selected set is the union of these oldest-at-maximum cuts across visited states. Guigues & Bandarra (2019).

Properties:

  • More aggressive than Level-1 — when several cuts tie at a state, only the oldest is kept, so redundant near-duplicates are shed faster
  • Preserves the convergence guarantee (see section 9)

A cut is dominated if, at every visited state, the maximum over all populated cuts exceeds its value by more than domination_tolerance. Dominated cuts contribute nothing to the policy at any visited state and are deactivated; inactive cuts that achieve the maximum somewhere are reactivated. Formally, cut kk is dominated when

maxjk{αj+πjx^}(αk+πkx^)>domination_tolerancex^visited states\max_{j \neq k} \left\{ \alpha_j + \pi_j^\top \hat{x} \right\} - \left( \alpha_k + \pi_k^\top \hat{x} \right) > \text{domination\_tolerance} \quad \forall \hat{x} \in \text{visited states}

This is the same max-survival logic as Level-1, using domination_tolerance in place of tie_tolerance.

Properties:

  • Most aggressive — directly identifies cuts that provide no value at any known operating point
  • Cost is O(cuts×visited states)\mathcal{O}(|\text{cuts}| \times |\text{visited states}|) per stage per check; the kernel evaluates it as a dense matrix product distributed across threads
  • May deactivate cuts that would be active at unvisited states (acceptable as the visited set grows dense)

Dynamic Cut Selection (DCS) is selected by selection.method = "dynamic" and is mutually exclusive with the periodic-pruning methods (Level-1, LML1, Domination). Unlike them, DCS never deactivates cuts from the pool — it keeps the pool entirely append-only and controls only which cuts are resident in each stage LP at solve time.

Motivation: rather than baking every active cut into each stage LP, DCS solves with a small resident subset and grows it lazily until the solution is provably exact, keeping per-solve LP size bounded as the pool grows. The full pool is retained off-LP and every cut remains a candidate.

At iteration kk, the DCS loop for one (stage, solve) is:

  1. Seed the resident set with cuts that were active at this stage within the last k2k_2 iterations (the seed window), plus all cuts generated in the current iteration. The seed is derived only from synchronized per-slot pool metadata (last_active_iter and iteration_generated) — never from a per-worker solve trace. This makes the seed bit-identical across thread and MPI-rank counts.

  2. Solve the stage LP with the current resident set. Let xx^* be the optimal state vector and θ\theta^* the optimal future-cost value.

  3. Score the omitted (non-resident) candidate cuts. Cut coefficients are stored in raw (unscaled) space while the LP solves in scaled space, so both θ\theta^* and each state column must first be unscaled: xraw=col_scale[c]xscaledx_{\text{raw}} = \text{col\_scale}[c] \cdot x_{\text{scaled}}. For candidate cut ii with intercept αi\alpha_i and gradient i\nabla_i, the future-cost floor it would impose at xx^* is

    fi=αi+ixrawf_i = \alpha_i + \nabla_i \cdot x^*_{\text{raw}}

    The candidate is violated iff fiθraw>εviolf_i - \theta^*_{\text{raw}} > \varepsilon_{\text{viol}} (strict). This follows from the cut-row convention x+θα-\nabla \cdot x + \theta \ge \alpha.

  4. Add the top nadicn_{\text{adic}} most-violated candidates (sorted by violation magnitude descending, ties broken by ascending slot id). Warm re-solve from the retained basis and return to step 2.

  5. Stop when no candidate is violated. At this point the resident-subset optimum equals the full-pool optimum — every omitted cut is satisfied — so the result is exact. Backward duals and forward primals are extracted from this final LP.

By default every pool cut is a candidate (k1=k_1 = \infty, i.e. candidate_recency absent), which preserves exactness. Setting candidate_recency to a finite k1k_1 restricts candidates to cuts generated within the last k1k_1 iterations. This is a deliberate, inexact speedup: cuts older than the window are never scored or added, even when violated.

8.3 Bounded inner loop with full-pool fallback

Section titled “8.3 Bounded inner loop with full-pool fallback”

The add/re-solve loop is capped at 50 rounds (not user-configurable). If the cap is reached with violations remaining, all remaining violated candidates are added and one final solve is performed — degrading to a full-pool solve for that LP, which preserves exactness.

8.4 Pass uniformity and warm-start synergy

Section titled “8.4 Pass uniformity and warm-start synergy”

DCS applies identically across the backward pass, the forward pass, and simulation. Forward and simulation also run to exactness — no early stop — to avoid trajectory drift: an early stop in the forward pass would shift the states the backward pass visits, producing policy drift.

Because each inner re-solve adds only a few rows from a retained basis, DCS and LP basis warm-start are complementary, not in tension. See LP Warm-Start.

seed_window = 0 is valid and meaningful: the resident set is seeded only with cuts generated in the current iteration, with no history. This matches the NEWAVE selcor.dat behavior (zero-history seeding), and the lazy loop then grows the resident set from scratch each solve. A larger seed_window trades a slightly larger initial LP for fewer inner iterations.

The resident-set selection is bit-identical across thread and MPI-rank counts:

  • The seed comes from synchronized pool metadata (last_active_iter, iteration_generated), not per-worker traces.
  • The optimal state xx^* is scenario-deterministic (a function of the LP, not the rank).
  • Candidate scoring uses a bit-deterministic batched matrix product (gemm_block).
  • The violation sort uses a total ordering with an ascending-slot tie-break, making the top-nadicn_{\text{adic}} selection stable.

See Determinism Guarantees.

Theorem (Bandarra & Guigues, 2021): Under Level-1 or LML1 cut selection, SDDP with finitely many scenarios converges to the optimal value function with probability 1.

Key insight: Removing cuts that are never active at any visited state does not affect the outer approximation quality at those states. As the set of visited states becomes dense over iterations, the approximation converges.

DCS and convergence: DCS is exact at every solve (§8.1 step 5) — the optimum it returns equals the full-pool optimum — so it does not weaken the convergence argument. The exactness guarantee rests on k1=k_1 = \infty (every pool cut remains a candidate); setting a finite candidate_recency sacrifices this guarantee.

Bandarra, M., & Guigues, V. (2021). “Single cut and multicut stochastic dual dynamic programming with cut selection for multistage stochastic linear programs: convergence proof and numerical experiments.” Computational Management Science, 18(2), 125-148. https://doi.org/10.1007/s10287-021-00387-8

The training.cut_selection object has two always-on top-level knobs plus an optional tagged selection object. Omitting selection disables row selection. The selection.method field chooses the strategy; each method carries only its own parameters.

Math symbolConfig keyDefaultApplies to
row_activity_tolerance0.0Always-on (top-level)
max_active_per_stagenone (no cap)Always-on (top-level)
ϵ\epsilonselection.tie_tolerance1e-10level1, lml1
selection.check_frequency5 (must be > 0)level1, lml1, domination
ϵ\epsilonselection.domination_tolerancerequired, no defaultdomination
k2k_2selection.seed_window5 (0 is valid)dynamic
k1k_1selection.candidate_recencyunset (\infty, exact)dynamic
nadicn_{\text{adic}}selection.max_added_per_round10 (must be ≥ 1)dynamic
εviol\varepsilon_{\text{viol}}selection.violation_tolerance1e-10dynamic
selection.start_iteration2 (must be ≥ 1)dynamic
  • LP Formulation — Column-bound state pinning whose reduced costs give cut coefficients; Benders cut constraints in the LP
  • LP Warm-Start — Basis warm-start that is complementary to DCS inner re-solves
  • PAR Inflow Model — AR lag state variables that appear in cut coefficients
  • SDDP Algorithm — Forward/backward pass structure that drives cut generation
  • Scenario Generation — Fixed opening tree that defines backward pass branchings; sampling scheme abstraction
  • Penalty System — Recourse slacks that guarantee relatively complete recourse (cut validity condition)
  • Stopping Rules — Convergence criteria that depend on cut quality
  • Discount Rate — Discount factor scaling in cut aggregation
  • Risk Measures — Risk-averse cut generation (CVaR modifies aggregation weights)
  • Block Formulations — Block structure that affects how per-block duals contribute to cut coefficients
  • Determinism Guarantees — Bit-identical results across thread and MPI-rank counts, including DCS resident-set seeding