Risk Measures

Purpose

This spec defines the risk-averse SDDP formulation used in Cobre, based on Conditional Value-at-Risk (CVaR). It covers the CVaR definition, the convex combination risk measure, dual representations, the risk-averse subgradient theorem, modified Bellman equation with discount factor, risk-averse cut generation, per-stage risk profiles, and the critical implications for bound validity.

For notation conventions (index sets, parameters, decision variables, dual variables), see Notation Conventions.

Symbol conventions:

$α$ denotes the CVaR confidence level (matching the alpha field in stages.json). This is the standard convention in the risk measure literature.

$\hat{α}_{t}(ω)$ denotes per-scenario cut intercepts within this spec. This corresponds to $α$ in Cut Management §1, renamed here to avoid collision with the CVaR parameter.

$d$ denotes the discount factor. See Discount Rate.

$μ$ denotes the risk-adjusted probability measure (not $q$ , which denotes turbined flow).

$ψ (p, μ)$ denotes the dual penalty function in the general dual representation.

1 Motivation

Risk-neutral SDDP minimizes expected cost, which can lead to policies that perform poorly in adverse scenarios. Risk-averse SDDP incorporates a coherent risk measure (typically CVaR) to protect against tail risks while maintaining the convexity properties required for valid cut generation.

Figure: Risk measures. Cost distribution with E[C] and CVaR_α marked, tail region shaded, and the convex combination ρ = λ·CVaR + (1−λ)·E[C].

2 Conditional Value-at-Risk (CVaR)

For a random variable $Z$ representing cost and confidence level $α \in (0, 1]$ :

$CVaR_{α}(Z) = \min_{η \in \mathbb{R}} \left\{ η + \frac{1}{α} E[(Z - η)^{+}] \right\}$

where $(Z - η)^{+} = max (0, Z - η)$ captures the excess cost above threshold $η$ .

Interpretation: $CVaR_{α}$ is the expected cost in the worst $α$ -fraction of scenarios.

$α$	Risk Posture	Meaning
1.0	Risk-neutral	$CVaR_{1} = E[Z]$ (expected value)
0.5	Moderately risk-averse	Average of worst 50% of outcomes
0.2	Risk-averse	Average of worst 20% of outcomes
0.05	Highly risk-averse	Average of worst 5% of outcomes
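The definition above can be checked numerically on a small discrete distribution. The following is a minimal sketch, not Cobre code: the function name `cvar` and the four-point cost distribution are illustrative assumptions. It relies on the fact that, for a discrete distribution, the objective is piecewise linear and convex in $η$ , so a minimizer can be found among the cost values.

```python
def cvar(costs, probs, alpha):
    """CVaR_alpha of a discrete cost distribution via the min-over-eta formula.

    Hypothetical helper for illustration; alpha is the tail fraction in (0, 1].
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")

    def objective(eta):
        # eta + (1/alpha) * E[(Z - eta)^+]
        excess = sum(p * max(0.0, z - eta) for z, p in zip(costs, probs))
        return eta + excess / alpha

    # Breakpoints of the piecewise-linear objective are the cost values,
    # so it suffices to evaluate the objective at each of them.
    return min(objective(z) for z in costs)


# Illustrative data: four equally likely cost outcomes.
costs = [1.0, 2.0, 3.0, 4.0]
probs = [0.25, 0.25, 0.25, 0.25]

print(cvar(costs, probs, 1.0))  # alpha = 1 recovers the expected value
print(cvar(costs, probs, 0.5))  # mean of the worst half of the outcomes
```

With these numbers, $α = 1$ gives the mean 2.5 and $α = 0.5$ gives 3.5, the average of the two most expensive outcomes, in line with the table.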

3 Convex Combination Risk Measure

Cobre uses a convex combination of expectation and CVaR (following the SDDP.jl convention):

$ρ^{λ, α} [Z] = (1 - λ) E [Z] + λ \cdot CVaR_{α} [Z]$

where:

$λ \in [0, 1]$ : Risk aversion weight (0 = risk-neutral, 1 = pure CVaR)
$α \in (0, 1]$ : CVaR confidence level

This is sometimes called the EAVaR (Expectation + Average Value-at-Risk) risk measure.
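
As a small worked example (illustrative numbers, not taken from Cobre data), let $Z$ take the values 1, 2, 3, 4 with equal probability, and set $α = 0.5$ and $λ = 0.5$ . The worst half of the outcomes is $\{3, 4\}$ , so:

$E[Z] = 2.5, \quad CVaR_{0.5}[Z] = \frac{3 + 4}{2} = 3.5$

$ρ^{0.5, 0.5}[Z] = (1 - 0.5) \cdot 2.5 + 0.5 \cdot 3.5 = 3.0$

The result lies between the risk-neutral value ( $λ = 0$ gives 2.5) and the pure CVaR value ( $λ = 1$ gives 3.5).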

4 Dual Representation of Convex Risk Measures

Convex risk measures have a dual representation that is essential for computing risk-averse cuts:

$F[Z] = \sup_{μ \in M(p)} \left\{ E_{μ}[Z] - ψ(p, μ) \right\}$

where:

$M (p) \subseteq P$ is a convex subset of the probability simplex
$ψ (p, μ)$ is a concave penalty function
$P = \{ p \geq 0 : \sum_{ω} p_{ω} = 1 \}$

Interpretation: The dual computes the expectation with respect to the worst probability vector $μ$ within the set $M$ , less a penalty term $ψ (p, μ)$ .

4.1 CVaR Dual Representation

For $CVaR_{α}$ , the dual representation is:

$CVaR_{α}[Z] = \sup_{μ \in M_{α}(p)} E_{μ}[Z]$

where the risk set $M_{α}(p)$ is:

$M_{α}(p) = \left\{ μ \geq 0 : \sum_{ω} μ_{ω} = 1, \ μ_{ω} \leq \frac{p_{ω}}{α} \ \forall ω \right\}$

The penalty $ψ (p, μ) = 0$ for CVaR (no penalty term).

Interpretation: CVaR puts more probability weight on the worst outcomes, with each scenario receiving at most $p_{ω} / α$ probability mass. For small $α$ , only the worst scenarios receive significant weight.
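
As a worked example (illustrative numbers), take four equally likely scenarios with costs $Z = (1, 2, 3, 4)$ , so $p_{ω} = 0.25$ , and let $α = 0.5$ . Each scenario can carry at most $p_{ω} / α = 0.5$ , so the supremum is attained by loading the two most expensive scenarios to their bound:

$μ^{*} = (0, \ 0, \ 0.5, \ 0.5)$

$E_{μ^{*}}[Z] = 0.5 \cdot 3 + 0.5 \cdot 4 = 3.5$

This matches the primal definition in §2, which gives the average of the worst 50% of outcomes.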

4.2 EAVaR Dual Representation

For the convex combination $ρ^{λ, α} [Z] = (1 - λ) E [Z] + λ \cdot CVaR_{α} [Z]$ :

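$M^{EAVaR}(p) = \left\{ μ : \sum_{ω} μ_{ω} = 1, \ (1 - λ) p_{ω} \leq μ_{ω} \leq (1 - λ) p_{ω} + \frac{λ p_{ω}}{α} \ \forall ω \right\}$

Equivalently, every $μ \in M^{EAVaR}(p)$ has the form $μ = (1 - λ) p + λ q$ with $q \in M_{α}(p)$ . The lower bound $(1 - λ) p_{ω}$ is the weight contributed by the expectation term, and as for CVaR the penalty is $ψ(p, μ) = 0$ .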

5 Risk-Averse Subgradient Theorem

The key theorem for computing risk-averse cuts:

Theorem (Risk-Averse Subgradient): Let $V(x, ω)$ be convex with respect to $x$ for all fixed $ω \in Ω$ , and let $λ(\tilde{x}, ω)$ be a subgradient of $V(x, ω)$ at $x = \tilde{x}$ .

If $μ^{*} = \arg\max_{μ \in M(p)} \left\{ E_{μ}[V(\tilde{x}, ω)] - ψ(p, μ) \right\}$ , then:

$\sum_{ω \in Ω} μ_{ω}^{*} \cdot λ(\tilde{x}, ω)$

is a subgradient of $F [V (x, ω)]$ at $\tilde{x}$ .

Application to Cut Generation: In SDDP, the subgradients $λ (\tilde{x}, ω)$ are the cut coefficients $π_{t} (ω)$ obtained from LP duals (see Cut Management §2). The risk-averse cut coefficients are computed by replacing the uniform scenario probabilities with risk-adjusted probabilities $μ^{*}$ :

$\bar{π}_{t - 1, h} = \sum_{ω \in Ω_{t}} μ_{ω}^{*} \cdot π_{t, h}(ω)$

where $μ^{*}$ is the optimal dual probability vector computed from the scenario costs $\{ Q_{t}(\hat{x}, ω) \}_{ω \in Ω_{t}}$ .

6 Risk-Averse Bellman Equation

The risk-averse value function with discount factor $d$ satisfies:

$V_{t}(x_{t - 1}) = ρ^{λ_{t}, α_{t}} \left[ \min_{x_{t}} \left\{ c_{t}^{⊤} x_{t} + d_{t \to t + 1} \cdot V_{t + 1}(x_{t}) : (x_{t}, x_{t - 1}) \text{ feasible} \right\} \right]$

This modifies the standard Bellman recursion in two ways:

Risk measure replaces expectation: $ρ^{λ_{t}, α_{t}} [\cdot]$ replaces $E [\cdot]$
Discount factor on future cost: $d_{t \to t + 1} \cdot V_{t + 1} (x_{t})$ discounts the cost-to-go (see Discount Rate §2)

Time consistency: The risk measure is applied stage-wise (nested formulation), not to the total cost. This guarantees time consistency, which is essential for the dynamic programming decomposition. Formally: $ρ_{1} [ρ_{2} [\dots ρ_{T - 1} [\cdot]]]$ , not $ρ [total cost]$ .

In the LP subproblem at stage $t$ , the future cost variable $θ$ appears in the objective as $d_{t \to t + 1} \cdot θ$ (see Discount Rate §5). Cuts bound $θ$ (not $d \cdot θ$ ), so the discount factor multiplies $θ$ only in the objective — exactly as in the risk-neutral case.

7 Cut Generation with Risk Measures

For each visited state $\hat{x}_{t - 1}$ , compute the risk-averse cut as follows:

Step 1: Solve subproblems for all realizations $ω \in Ω_{t}$ :

$Q_{t}(\hat{x}_{t - 1}, ω) = \min_{x_{t}} \left\{ c_{t}^{⊤} x_{t} + d_{t \to t + 1} \cdot θ_{t} : \text{constraints} \right\}$

Extract dual solutions $π_{t}(ω)$ and compute per-scenario cut coefficients:

Intercept: $\hat{α}_{t}(ω) = Q_{t}(\hat{x}_{t - 1}, ω) - π_{t}(ω)^{⊤} \hat{x}_{t - 1}$
Coefficients: $π_{t} (ω)$ — derived from LP duals (see Cut Management §2)

Step 2: Compute risk-adjusted scenario weights $μ_{ω}^{*}$ .

Each scenario $ω$ has a probability lower bound $(1 - λ_{t}) p_{ω}$ , contributed by the expectation term, and an upper bound:

$\bar{μ}_{ω} = (1 - λ_{t}) p_{ω} + \frac{λ_{t} p_{ω}}{α_{t}}$

Since $\bar{μ}_{ω} > p_{ω}$ when $λ_{t} > 0$ and $α_{t} < 1$ , the total capacity $\sum_{ω} \bar{μ}_{ω} > 1$ . The risk-adjusted weights are found by assigning as much weight as possible to the most expensive scenarios:

Start every scenario at its lower bound $(1 - λ_{t}) p_{ω}$ ; this uses $1 - λ_{t}$ of the total mass and leaves $λ_{t}$ to distribute
Sort scenarios by cost $Q_{ω}$ in descending order
Walk down the sorted list, raising $μ_{ω}^{*}$ to $\bar{μ}_{ω}$ (the upper bound) for each scenario
When the cumulative weight reaches 1, the current scenario receives the remaining fraction, and all cheaper scenarios stay at their lower bound (which is 0 when $λ_{t} = 1$ )

This is a greedy allocation (continuous knapsack): it places maximum weight on the worst scenarios and minimum weight on the best.

Special cases:

Risk-neutral ( $λ_{t} = 0$ ): $\bar{μ}_{ω} = p_{ω}$ for all $ω$ , so $μ^{*} = p$ and this reduces to the standard aggregation from Cut Management §3.

Pure CVaR ( $λ_{t} = 1$ ): $\bar{μ}_{ω} = p_{ω} / α_{t}$ . Only the worst $α_{t}$ -fraction of scenarios receive weight; all others get $μ_{ω}^{*} = 0$ .

Convex combination ( $0 < λ_{t} < 1$ ): All scenarios receive at least some weight (the $(1 - λ_{t}) p_{ω}$ floor), but the worst scenarios receive up to $\bar{μ}_{ω}$ .

Equivalence note: This sorting procedure produces the same $μ^{*}$ as solving the dual LP from §4.2, because the LP maximizes a linear objective subject to per-scenario bounds and a single normalization constraint, a structure whose optimal solution is the greedy allocation above.
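
The weight computation in Step 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not Cobre's implementation: the function name `risk_adjusted_weights` and the sample data are hypothetical. The sketch starts every scenario at the $(1 - λ_{t}) p_{ω}$ floor noted in the convex-combination special case, then hands the remaining mass $λ_{t}$ to the most expensive scenarios, each capped at an extra $λ_{t} p_{ω} / α_{t}$ .

```python
def risk_adjusted_weights(costs, probs, lam, alpha):
    """Greedy risk-adjusted weights for (1 - lam) * E + lam * CVaR_alpha.

    Hypothetical helper: costs[i] is the scenario cost, probs[i] its nominal
    probability, lam the risk aversion weight, alpha the CVaR tail fraction.
    """
    if not (0.0 <= lam <= 1.0 and 0.0 < alpha <= 1.0):
        raise ValueError("need 0 <= lam <= 1 and 0 < alpha <= 1")

    # Every scenario keeps the floor contributed by the expectation term.
    weights = [(1.0 - lam) * p for p in probs]
    remaining = lam  # probability mass still to be distributed

    # Visit scenarios from most to least expensive.
    order = sorted(range(len(costs)), key=lambda i: costs[i], reverse=True)
    for i in order:
        if remaining <= 0.0:
            break
        # Raise the weight towards the upper bound, without exceeding
        # the mass that is left.
        extra = min(lam * probs[i] / alpha, remaining)
        weights[i] += extra
        remaining -= extra
    return weights


# Illustrative data: four equally likely scenarios.
costs = [1.0, 2.0, 3.0, 4.0]
probs = [0.25, 0.25, 0.25, 0.25]

neutral = risk_adjusted_weights(costs, probs, lam=0.0, alpha=0.5)
pure_cvar = risk_adjusted_weights(costs, probs, lam=1.0, alpha=0.5)
mixed = risk_adjusted_weights(costs, probs, lam=0.5, alpha=0.5)
print(neutral, pure_cvar, mixed)
```

For the mixed case the weighted cost $\sum_{ω} μ_{ω}^{*} Q_{ω}$ equals $(1 - λ) E[Q] + λ \cdot CVaR_{α}[Q] = 3.0$ for this data, which is a convenient sanity check on any implementation.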

Step 3: Compute risk-averse cut coefficients using $μ^{*}$ (justified by the theorem in §5):

$\bar{\hat{α}}_{t - 1} = \sum_{ω \in Ω_{t}} μ_{ω}^{*} \cdot \hat{α}_{t}(ω)$

$\bar{π}_{t - 1} = \sum_{ω \in Ω_{t}} μ_{ω}^{*} \cdot π_{t}(ω)$

Step 4: Add cut to stage $t - 1$ :

$θ_{t - 1} \geq \bar{\hat{α}}_{t - 1} + \bar{π}_{t - 1}^{⊤} x_{t - 1}$

Comparison with risk-neutral aggregation: The only difference from Cut Management §3 is that the scenario probabilities $p (ω)$ are replaced by the risk-adjusted weights $μ_{ω}^{*}$ .
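
Steps 1, 3 and 4 can be sketched for given weights as follows. This is an illustrative sketch only: the function name `aggregate_cut`, the two-dimensional state, and all numbers are assumptions, and the weights are taken as given (for example, from the Step 2 procedure). The final check uses the fact that the aggregated cut is tight at the visited state, where it equals $\sum_{ω} μ_{ω}^{*} Q_{t}(\hat{x}_{t - 1}, ω)$ .

```python
def aggregate_cut(x_hat, scenario_costs, scenario_duals, weights):
    """Aggregate per-scenario cuts into one cut using risk-adjusted weights.

    Hypothetical helper: scenario_duals[k] is the coefficient vector pi(omega_k),
    scenario_costs[k] the optimal value Q(x_hat, omega_k).
    Returns (intercept, coefficients) of the aggregated cut.
    """
    n = len(x_hat)
    intercept = 0.0
    coeffs = [0.0] * n
    for q, pi, mu in zip(scenario_costs, scenario_duals, weights):
        # Per-scenario intercept: Q - pi^T x_hat
        alpha_hat = q - sum(pi_h * x_h for pi_h, x_h in zip(pi, x_hat))
        intercept += mu * alpha_hat
        for h in range(n):
            coeffs[h] += mu * pi[h]
    return intercept, coeffs


# Illustrative data: two scenarios, two state variables.
x_hat = [10.0, 5.0]
scenario_costs = [100.0, 160.0]
scenario_duals = [[-2.0, -1.0], [-4.0, -3.0]]
weights = [0.25, 0.75]  # assumed risk-adjusted weights, more mass on the costly scenario

intercept, coeffs = aggregate_cut(x_hat, scenario_costs, scenario_duals, weights)
cut_at_x_hat = intercept + sum(c * x for c, x in zip(coeffs, x_hat))
print(intercept, coeffs, cut_at_x_hat)
```

Passing the nominal probabilities as `weights` reproduces the risk-neutral aggregation, so a single code path can serve both cases.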

8 Per-Stage Risk Profiles

Risk aversion can vary by stage. The risk_measure field in stages.json specifies $(λ_{t}, α_{t})$ per stage (see Input Scenarios §1.7):

Option	Description
`"expectation"`	Risk-neutral expected value (default)
`{"cvar": {...}}`	CVaR parameters with `alpha` (confidence level) and `lambda` (weight)

Example with stage-varying risk:

{
  "stages": [
    {
      "id": 0,
      "risk_measure": { "cvar": { "alpha": 0.95, "lambda": 0.5 } }
    },
    {
      "id": 1,
      "risk_measure": { "cvar": { "alpha": 0.95, "lambda": 0.25 } }
    },
    {
      "id": 2,
      "risk_measure": "expectation"
    }
  ]
}

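This allows decreasing risk aversion over the horizon (e.g., higher $λ$ for near-term stages, lower for distant stages). Under the convention of §2, alpha is the tail fraction, so $α = 0.95$ in this example averages the worst 95% of outcomes; a smaller value such as 0.05 puts much stronger emphasis on the tail.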
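
A reader of this configuration might map each stage to a $(λ_{t}, α_{t})$ pair along the following lines. This is a hedged sketch, not Cobre's parser: the function name `risk_parameters` is hypothetical, and two choices are assumptions made here, namely that a missing risk_measure field falls back to "expectation" and that "expectation" is represented as $(λ, α) = (0, 1)$ .

```python
import json


def risk_parameters(stage):
    """Return (lam, alpha) for one stage entry (hypothetical helper)."""
    # Assumption: an absent field means the default, "expectation".
    spec = stage.get("risk_measure", "expectation")
    if spec == "expectation":
        # Risk-neutral: lam = 0 makes the CVaR term vanish.
        return 0.0, 1.0
    cvar = spec["cvar"]
    lam, alpha = float(cvar["lambda"]), float(cvar["alpha"])
    if not (0.0 <= lam <= 1.0 and 0.0 < alpha <= 1.0):
        raise ValueError(f"invalid CVaR parameters in stage {stage.get('id')}")
    return lam, alpha


config = json.loads("""
{
  "stages": [
    {"id": 0, "risk_measure": {"cvar": {"alpha": 0.95, "lambda": 0.5}}},
    {"id": 1, "risk_measure": {"cvar": {"alpha": 0.95, "lambda": 0.25}}},
    {"id": 2, "risk_measure": "expectation"}
  ]
}
""")

profile = {stage["id"]: risk_parameters(stage) for stage in config["stages"]}
print(profile)
```

Validating the ranges at load time keeps out-of-range values (such as $α = 0$ ) from reaching the weight computation in §7, where they would cause a division by zero.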

9 Upper Bound with Risk Measures

Monte Carlo simulation cannot directly estimate the upper bound for CVaR problems because:

CVaR is computed over the entire distribution, not sample averages
The optimal $η$ (VaR threshold) changes with the policy

For risk-averse problems, the inner approximation (SIDP) provides deterministic upper bounds that remain valid regardless of the risk measure. See Upper Bound Evaluation §1 for the complete formulation.

10 Lower Bound Validity with Risk Measures

Critical Warning: The lower bound computed during SDDP training is NOT a valid bound for risk-averse problems.

Why the Lower Bound Fails

In risk-neutral SDDP, the lower bound $\underline{z} = V_{1} (x_{0})$ is the optimal value of the first-stage LP, which uses cuts that provide valid outer approximations of the expected future cost. This bound converges to the true optimal expected cost.

In risk-averse SDDP, this property does not hold because:

Cuts approximate nested risk measures: Each cut approximates $ρ_{t} [V_{t + 1} (x_{t}, ω)]$ , where the risk measure $ρ_{t}$ depends on the distribution of costs at that stage.
The LP optimizes under the wrong distribution: The first-stage LP optimizes $ρ_{1} [V_{2}]$ , but the risk-adjusted distribution used in the cuts was computed for the training states, not for the optimal first-stage decision.
Nested risk measures have no tower property: Unlike $E[\cdot]$ , the nested application of CVaR does not collapse to a single risk measure applied to the total cost. In general, $ρ_{1}[ρ_{2}[\dots]] \neq ρ[\text{total cost}]$ .
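
The gap between nested and end-of-horizon CVaR can be seen on a toy two-stage example. The sketch below is illustrative only: the helper `cvar`, the independent two-point stage costs, and $α = 0.5$ are all assumptions chosen to keep the arithmetic small, and there are no decisions, only random costs.

```python
def cvar(costs, probs, alpha):
    """CVaR_alpha of a discrete distribution (hypothetical helper).

    Uses the min-over-eta definition; for a discrete distribution a minimizer
    lies at one of the cost values.
    """
    def objective(eta):
        excess = sum(p * max(0.0, z - eta) for z, p in zip(costs, probs))
        return eta + excess / alpha

    return min(objective(z) for z in costs)


ALPHA = 0.5
stage_costs = [0.0, 10.0]  # each stage costs 0 or 10, independently
stage_probs = [0.5, 0.5]

# Nested: apply CVaR to stage 2 first, then to stage 1 plus that value.
inner = cvar(stage_costs, stage_probs, ALPHA)
nested = cvar([c1 + inner for c1 in stage_costs], stage_probs, ALPHA)

# End of horizon: apply CVaR once to the distribution of the total cost.
totals = [c1 + c2 for c1 in stage_costs for c2 in stage_costs]
total_probs = [p1 * p2 for p1 in stage_probs for p2 in stage_probs]
end_of_horizon = cvar(totals, total_probs, ALPHA)

print(nested, end_of_horizon)
```

In this toy example the nested value is 20 and the end-of-horizon value is 15: the nested formulation takes the worst half at every stage, whereas the single CVaR averages the worst half of the four total-cost outcomes.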

What the “Lower Bound” Represents

For risk-averse problems, the value $\underline{z} = V_{1} (x_{0})$ computed by SDDP is:

A convergence indicator: It increases monotonically and plateaus when additional cuts provide no improvement
NOT a valid lower bound on the true risk-averse optimal cost

Recommendations

Purpose	Method
Convergence monitoring	Track $\underline{z}$ stabilization (bound stalling rule — see Stopping Rules §4)
Valid upper bound	Inner approximation (SIDP) — see Upper Bound Evaluation
Policy evaluation	Monte Carlo simulation with risk-averse policy decisions

Convergence reporting: When risk measures are enabled, convergence reports should label the “lower bound” as “convergence indicator” or explicitly note that it is not a valid bound.

11 References

Philpott, A.B., de Matos, V.L., & Finardi, E.C. (2013). “On solving multistage stochastic programs with coherent risk measures.” Operations Research, 61(4), 957-970. https://doi.org/10.1287/opre.2013.1175

Shapiro, A. (2011). “Analysis of stochastic dual dynamic programming method.” European Journal of Operational Research, 209(1), 63-72.

Cross-References

Notation Conventions — Symbol definitions, dual variable notation, and sign conventions
SDDP Algorithm — Bellman recursion and forward/backward pass structure modified by risk measures
Cut Management — Dual extraction and cut coefficient computation; risk-averse aggregation replaces $p (ω)$ with $μ_{ω}^{*}$ (§7)
Stopping Rules — Bound stalling recommended for risk-averse convergence monitoring; simulation-based stopping limitations
Discount Rate — Discount factor $d$ convention and discounted Bellman equation
Infinite Horizon — Interaction of risk measures with periodic policy graphs
Upper Bound Evaluation — SIDP inner approximation for valid upper bounds with CVaR objectives
Input Scenarios §1.7 — JSON schema for risk_measure field: "expectation" or {"cvar": {"alpha": ..., "lambda": ...}}


Cobre Methodology Reference