Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Upper Bound Evaluation

Status: Not Implemented. This spec describes a planned design that has not yet been implemented.

Purpose

This spec defines the upper bound evaluation mechanism in Cobre via inner approximation (SIDP): the vertex-based value function approximation, Lipschitz interpolation, the linearized upper bound LP, gap computation, and configuration. It complements the outer approximation (cuts) described in SDDP Algorithm by providing a convergence certificate through deterministic upper bounds.

For notation conventions (index sets, parameters, decision variables, dual variables), see Notation Conventions.

Symbol convention: This spec uses for the discount factor. See Discount Rate.

1 Motivation

Standard SDDP provides only a lower bound (outer approximation) through cuts. For convergence verification, we need an upper bound (inner approximation). This is especially important for:

  1. Risk-averse problems: CVaR objectives cannot be reliably estimated via Monte Carlo simulation of the policy
  2. Convergence certificates: Gap provides a true optimality measure
  3. Conservative policies: Inner approximation gives “at most ” guarantees

Deterministic vs. simulation-based upper bounds: The simulation-based stopping rules (see Stopping Rules) estimate an upper bound by running the SDDP policy on sampled scenarios and averaging costs. This is a statistical estimate — valid for risk-neutral problems but not for risk-averse (CVaR) objectives. The inner approximation described here provides a deterministic upper bound that is valid regardless of the risk measure.

2 Vertex-Based Inner Approximation

The inner approximation is constructed from vertices (visited state-value pairs):

where each vertex stores:

  • : State vector visited during forward passes
  • : Upper bound on expected cost-to-go from that state (computed recursively)

3 Lipschitz Interpolation

For a new state not in , the upper bound is computed via Lipschitz interpolation:

where is the Lipschitz constant for stage .

Interpretation: The upper bound at is the minimum over all vertices of “vertex value plus distance penalty.” This forms a concave piecewise-linear function — the inner (concave) counterpart to the outer (convex) cut approximation.

4 Lipschitz Constant Computation

The Lipschitz constant bounds the maximum rate of change of the value function with respect to the state. For SDDP with penalty-based feasibility (relatively complete recourse):

Backward accumulation:

where:

  • is the maximum penalty coefficient at stage (e.g., deficit penalty in $/MWh)
  • is the discount factor for transition (see Discount Rate)

Note: The discount factor appears in the Lipschitz accumulation because the future cost is discounted. Without discounting (), grows linearly with the remaining horizon.

Example: With deficit penalty $/MWh over 5 stages, no discounting:

Stage Lipschitz
51,000
42,000
33,000
24,000
15,000

5 Vertex Value Computation

During the upper bound evaluation pass (a backward pass variant):

At terminal stage :

At stage :

For each vertex :

  1. For each scenario , solve the stage subproblem with incoming state and realization
  2. Obtain the optimal next-stage state
  3. Evaluate the inner approximation at the next stage:
  4. Set vertex value as the expected discounted cost-to-go:

Expectation: The vertex value is an expectation over scenarios, not a single-scenario value. This parallels the backward pass for cuts, which also computes expected cost-to-go.

6 Upper Bound Evaluation LP

For policy evaluation with inner approximation, the stage LP replaces the outer approximation (cut constraints on ) with inner approximation (vertex constraints on ).

Standard LP (outer approximation — lower bound):

Inner approximation LP (upper bound):

Direction: Cut constraints are lower bounds on (). Vertex constraints are upper bounds on (). The cut approximation is convex (piecewise-linear from below); the vertex approximation is concave (piecewise-linear from above).

7 Linearized Upper Bound LP

The absolute value in the vertex constraints is linearized using standard splitting:

Additional variables (per vertex , per state component ):

VariableDomainDescription
Positive deviation from vertex in dimension
Negative deviation from vertex in dimension
freeUpper bound on future cost

Constraints (for each vertex ):

8 Gap Computation

At each iteration where the upper bound is evaluated:

Lower bound (from cuts at stage 1):

Upper bound (from vertices at stage 1):

Relative gap:

Convergence: As , for convex problems with finitely many scenarios.

For stopping rules that use the gap, see Stopping Rules.

9 Computational Considerations

AspectImpact
Vertices per stageTypically
LP size increase additional variables
Evaluation frequencyTrade-off between gap accuracy and runtime
MemoryVertices stored separately from cuts

Recommendation: Enable upper bound evaluation every 5-10 iterations after an initial burn-in period (10+ iterations) for convergence monitoring without excessive overhead.

10 Infinite Horizon

For cyclic policy graphs (see Infinite Horizon), the inner approximation operates on the same seasonal cut-pool structure: vertices are organized by season , not by absolute stage ID. The Lipschitz constant must account for the cumulative discount around the cycle, which bounds the geometric series of future contributions.

The convergence guarantee still holds: with , both the outer (cut) and inner (vertex) approximations converge to the true value function at the fixed point.

11 References

Costa, B.F.P., & Leclere, V. (2023). “Duality of upper bounds in stochastic dynamic programming.” Optimization Online. https://optimization-online.org/?p=23738

Philpott, A.B., de Matos, V.L., & Finardi, E.C. (2013). “On solving multistage stochastic programs with coherent risk measures.” Operations Research, 61(4), 957-970. https://doi.org/10.1287/opre.2013.1175

Cross-References

  • SDDP Algorithm — Core algorithm providing the outer approximation (lower bound) that this spec complements
  • Notation Conventions — Standard symbols for state variables, value functions, and cost-to-go
  • Discount Rate — Discount factor used in vertex value computation (§5) and Lipschitz accumulation (§4)
  • Infinite Horizon — Seasonal vertex organization for cyclic policy graphs
  • Cut Management — Outer approximation cuts that provide the lower bound counterpart
  • Stopping Rules — Convergence criteria that use the gap between inner and outer approximations
  • Risk Measures — CVaR objectives where deterministic upper bounds are essential
  • Binary Formats §3.3 — FlatBuffers Vertex and StageVertices schemas for persistence
  • Input Directory Structure §2.2upper_bound_evaluation configuration in config.json
  • Configuration Reference — Full configuration schema with defaults