Toy Single-Reservoir SDDP Walkthrough
Purpose
Section titled “Purpose”This chapter walks through one complete SDDP iteration on a deliberately small system — one reservoir, one thermal unit, one demand block, four stages, and a 0-order (pure seasonal-sampling) inflow model — using numbers chosen so every cut coefficient and every dual variable can be verified by hand. The goal is to make the abstract machinery of forward pass, backward pass, cut construction, and lower-bound update tangible.
Cobre ships an actual reference case at examples/1dtoy/ that a curious
reader can run end-to-end (cobre run examples/1dtoy). That case has
two thermals, eight monthly stages, and ten openings per stage; the
chapter here uses simpler numbers — one thermal, four stages, three
openings — chosen for tractability. This walkthrough is a pedagogical
caricature, not a reproduction of the shipped case.
The chapter does not explain how the underlying mechanisms work; it shows them working. It does not cover multi-reservoir effects, spatial inflow correlation, the FPHA production model, risk-measure weighting, or autoregressive inflow memory. Those topics belong to the chapters cited in section 9 and to the multi-reservoir companion in Toy Four-Reservoir Walkthrough.
1. The Case in One Picture
Section titled “1. The Case in One Picture”The system has one bus, one reservoir, one thermal unit, and one demand block. The reservoir receives a stochastic inflow drawn from a 0-order seasonal-sampling model; the thermal unit and a deficit slack cover any shortfall that the reservoir cannot serve.
graph LR
INF["Inflow a_t<br/>(0-order)"]
RES["Reservoir<br/>v cap 100"]
GEN["Hydro q"]
TH["Thermal g_th<br/>cost 50"]
DEF["Deficit δ<br/>cost 1000"]
BUS["Bus<br/>demand D = 40"]
INF --> RES
RES --> GEN --> BUS
TH --> BUS
DEF --> BUS
Case parameters (chosen for hand-verifiability):
| Parameter | Symbol | Value |
|---|---|---|
| Stages | 4 | |
| Inflow openings per stage | 3 | |
| Initial storage | 30 (storage units) | |
| Reservoir capacity | 100 | |
| Demand per stage | 40 | |
| Inflow seasonal mean | 30 | |
| Inflow seasonal std deviation | 10 | |
| Thermal marginal cost | 50 | |
| Deficit cost | 1000 | |
| Discount factor | 1.0 |
Demand is set above the mean inflow () so the reservoir depletes over the four stages, eventually forcing thermal dispatch and producing non-trivial dual variables in the backward pass.
2. Stage LP for This Case
Section titled “2. Stage LP for This Case”Each stage solves the following LP (specialised from the general formulation in LP Formulation to one hydro, one thermal, one demand block, and a 0-order inflow):
Objective:
Load balance:
Water balance:
Incoming-storage pinning (binds incoming storage to the trial value by equal column bounds; the pinned column’s reduced cost becomes the cut coefficient):
Inflow (treated as data, not a state variable; see section 3):
Bounds:
Future cost variable : in the terminal stage 4 there are no cuts, and is bound to zero. As the backward pass runs, cuts of the form are added to earlier stages’ LPs. Because the value function is decreasing in storage (more water means lower future cost), the cut slope is negative.
Note the absence of any AR-lag state variable: the 0-order inflow has no memory across stages, so storage is the only state.
3. The 0-Order Inflow Model on This Case
Section titled “3. The 0-Order Inflow Model on This Case”The inflow at every stage is sampled independently from a normal distribution with stage-seasonal mean and standard deviation:
with and for every stage in this walkthrough.
The draws across stages are independent. This is the degenerate
case of the PAR Inflow Model: no lag
terms, no AR coefficients, no initial-lag state. The data file
inflow_seasonal_stats.parquet carries only and per
hydro and per stage, and the inflow_ar_coefficients.parquet file is
absent — the order-selection procedure landed at for every
season (the white-noise case described in section 4 of
PAR Inflow Model).
The three openings used in the backward pass correspond to with equal probability , giving inflows . Because the openings are identical across stages and there is no AR memory, the same three-point distribution applies at every stage.
4. Iteration 1 — Forward Pass
Section titled “4. Iteration 1 — Forward Pass”The forward pass samples one trajectory using at every stage. Under zero noise the inflow stays at the seasonal mean throughout: for .
At each stage, the LP minimises with free at zero (no cuts in iteration 1). Pure hydro dispatch dominates whenever water is available.
Stage 1 — incoming storage , inflow . Available water: . Optimal: , , . End storage . Stage cost: .
Stage 2 — , . Available: . , , . Stage cost: .
Stage 3 — , . Available: . , , . Stage cost: .
Stage 4 — , . Available: . The LP sets (all water turbined), (thermal fills the gap), , . Stage cost: .
Forward-pass summary:
| Stage | Stage cost | |||||
|---|---|---|---|---|---|---|
| 1 | 30 | 30 | 40 | 0 | 20 | 0 |
| 2 | 20 | 30 | 40 | 0 | 10 | 0 |
| 3 | 10 | 30 | 40 | 0 | 0 | 0 |
| 4 | 0 | 30 | 30 | 10 | 0 | 500 |
Upper-bound estimate from this trajectory: . This is one realisation of total cost; the statistical upper bound is the average over many simulated trajectories.
5. Iteration 1 — Backward Pass
Section titled “5. Iteration 1 — Backward Pass”The backward pass walks stages . At each stage it fixes the incoming storage to the trial point from the forward pass, evaluates all three openings, reads the reduced cost of the pinned incoming-storage column, computes per-opening intercepts, and aggregates into one cut (see Cut Management sections 2–3).
Stage 4 (terminal)
Section titled “Stage 4 (terminal)”Trial point: . Inflows under the three openings: .
| Opening | Available | |||||
|---|---|---|---|---|---|---|
| 20 | 20 | 20 | 20 | 1000 | ||
| 30 | 30 | 30 | 10 | 500 | ||
| 40 | 40 | 40 | 0 | 0 |
Storage cut coefficient , the reduced cost of the pinned incoming-storage column. By the LP envelope theorem applied to the pinned bound :
- For and (water-limited, thermal active): one extra unit of enables one extra unit of turbining, displacing one unit of thermal worth . Optimal cost falls by , so .
- For (demand met exactly by hydro alone): the LP already has ; extra storage flows into terminal , which has zero value at the terminal stage. .
| Opening | |
|---|---|
Per-opening intercepts (anchoring each cut at the trial point):
With this simplifies to : .
Single-cut aggregation (uniform probability ; see Cut Management section 3):
Cut added to stage 3’s LP:
Sanity check. At the cut gives , matching the expected stage-4 cost from the table above. At the cut gives , after which the implicit bound takes over.
Stage 3
Section titled “Stage 3”Trial point: .
The stage-3 LP now carries the cut . For each opening the optimiser balances spending on thermal now against keeping more water for the future (worth per unit via the cut).
| Opening | Available | ||||||
|---|---|---|---|---|---|---|---|
| 20 | 30 | 30 | 10 | 0 | |||
| 30 | 40 | 40 | 0 | 0 | |||
| 40 | 50 | 40 | 0 | 10 |
For the system is water-limited; the LP turbines all 30 units of available water and dispatches 10 MW of thermal, ending with and the cut binding at .
For available water exactly meets demand; no thermal, end , cut value .
For the optimiser must choose how much of the surplus to release. Decreasing raises by one (cost ) and raises by one (saves on the cut); the marginal cost of holding more is , so the optimiser pushes to its load-balance bound at , leaving and the cut value at .
Storage cut coefficients (reduced costs of the pinned incoming-storage column):
- (water-limited): one extra unit of frees one extra turbine unit, saves of thermal. .
- (demand exactly met): the LP is at a kink; both (water-limited regime) and (storage-flow regime) are valid subgradients. The walkthrough takes the basis returning .
- (water surplus, holding storage): one extra unit of raises by one, lowers the cut value by . .
Per-opening intercepts :
| Opening | |||
|---|---|---|---|
Aggregation ():
Cut added to stage 2’s LP:
Sanity check. At the cut evaluates to , matching the probability-weighted expectation .
Stages 2 and 1
Section titled “Stages 2 and 1”The same procedure repeats at stages 2 and 1. At each stage:
- Pin the incoming storage to the trial point from the forward pass (column bounds).
- Solve all three opening LPs, including the cut from the next stage.
- Read the reduced cost of the pinned incoming-storage column.
- Compute per-opening intercepts via .
- Aggregate by probability-weighted averaging.
- Add the cut to the previous stage’s LP.
By the end of the backward pass, stages 1 through 3 each carry one cut. Stage 4 has none (it is the terminal stage and has no future to approximate).
6. Iteration 1 — Lower Bound
Section titled “6. Iteration 1 — Lower Bound”After the backward pass installs a cut at stage 1, the lower bound for iteration 1 is computed by solving the stage-1 LP for every opening with the cut active and taking the probability-weighted expectation (see SDDP Algorithm section 3.3):
where is the stage-1 optimal objective under opening with the iteration-1 cut in place, and is the fixed initial storage. The lower bound rises from (iteration 0, no cuts) to a positive value once the first cut is installed; the value reflects the policy’s expected cost given the partial information encoded in the single iteration-1 cut at each stage.
The lower bound is non-decreasing across iterations — adding cuts only tightens the outer approximation, never loosening it (see Cut Management section 1).
7. The Future Cost Function Across Iterations
Section titled “7. The Future Cost Function Across Iterations”After iteration 1 each non-terminal stage holds one Benders cut. As the algorithm continues:
- Iteration 2 samples a different forward trajectory (different draws), visits different trial points, and adds a second cut at each non-terminal stage.
- Subsequent iterations each add one cut per non-terminal stage. The outer approximation tightens around the true convex future-cost function (FCF) — each new cut is tangent at a new trial point, ruling out overestimates in that region.
For this case the FCF at each stage is a univariate function of storage . Each cut is a line in this one-dimensional state space, and the outer approximation is the pointwise maximum over all cuts:
Because is decreasing in storage (more water means lower future cost) and convex, the slopes are negative and the approximation is a lower envelope of decreasing lines. As grows, visited trial points spread across the state space — low-storage trajectories force the algorithm to evaluate cut quality in the scarcity region, adding cuts that are informative for water-stressed scenarios.
8. Convergence on This Case
Section titled “8. Convergence on This Case”As iterations accumulate, the lower bound rises and the simulation-based upper-bound estimate (the mean over many forward trajectories) converges. The relative gap
narrows as cuts accumulate. For a problem of this size (one reservoir, four stages, three openings) the gap typically narrows below 1% within tens of iterations and below 0.1% within a few hundred. These are illustrative scales, not guarantees: the actual iteration count depends on the demand-to-inflow ratio, the initial storage, and the noise level — all of which determine how often the reservoir reaches zero and how sensitive the value function is to small state changes.
The stopping rule fires when the gap falls below the configured tolerance or the iteration limit is reached — see Stopping Rules for the available criteria and their combinations. The upper-bound estimator used here is the simulation-based estimator; a deterministic inner-approximation bound is also available — see Upper Bound Evaluation.
9. What This Example Does Not Show
Section titled “9. What This Example Does Not Show”This case isolates the core SDDP loop in its simplest form. It cannot illustrate:
- Multi-reservoir effects: the cut becomes a multivariate hyperplane with one storage coefficient per reservoir; the optimiser must balance releases across plants. See Toy Four-Reservoir Walkthrough.
- Spatial inflow correlation: when multiple plants share a wet/dry signal, their innovations must be drawn from a correlated multivariate normal via the spectral factorisation described in PAR Inflow Model section 8.
- Autoregressive inflow memory: a PAR(p) model with adds one lag state variable per lag and one cut coefficient per lag; the cut becomes a hyperplane in storage and lag state space. See PAR Inflow Model.
- FPHA production model: nonlinear head-dependent efficiency approximated by piecewise-linear hyperplanes; one of the planes can bind, contributing to the storage cut coefficient. See Hydro Production Models.
- Risk-measure effects: CVaR weighting that shifts cut aggregation probabilities away from uniform . See Risk Measures.
The Toy Four-Reservoir Walkthrough extends this case to four independent reservoirs at four buses with transmission coupling, illustrating how the cut hyperplane and the per-bus dispatch mechanics scale up.
Cross-References
Section titled “Cross-References”- SDDP Algorithm — Forward and backward pass structure, lower-bound computation, convergence monitoring
- LP Formulation — Complete stage LP: load balance, water balance, column-bound state pinning, objective taxonomy
- Cut Management — Dual extraction, per-opening intercepts, single-cut aggregation, cut validity, sign convention
- PAR Inflow Model — Inflow model definition; the degenerate case (white noise) used in this walkthrough; stored vs computed quantities
- Stopping Rules — Iteration limit, gap threshold, bound-stalling criteria
- Upper Bound Evaluation — Simulation-based and inner-approximation upper bounds
- Toy Four-Reservoir Walkthrough — Multi-reservoir extension at 4 buses with transmission and independent 0-order inflows