Risk Measures

This spec defines the risk-averse SDDP formulation used in Cobre, based on Conditional Value-at-Risk (CVaR). It covers the CVaR definition, the convex combination risk measure, dual representations, the risk-averse subgradient theorem, modified Bellman equation with discount factor, risk-averse cut generation, per-stage risk profiles, and the critical implications for bound validity.

For notation conventions (index sets, parameters, decision variables, dual variables), see Notation Conventions.

Risk-neutral SDDP minimizes expected cost, which can lead to policies that perform poorly in adverse scenarios. Risk-averse SDDP incorporates a coherent risk measure (typically CVaR) to protect against tail risks while maintaining the convexity properties required for valid cut generation.

Figure: cost distribution $f(C)$ for a right-skewed Gamma law, with $\mathbb{E}[C]$, $\mathrm{VaR}_\alpha$ and $\mathrm{CVaR}_\alpha$ marked and the worst $(1-\alpha)$ tail shaded. The convex-combination measure $\rho^{\lambda,\alpha}[C] = (1-\lambda)\,\mathbb{E}[C] + \lambda\,\mathrm{CVaR}_\alpha[C]$ interpolates between the risk-neutral mean and the tail. All three markers are derived numerically from the PDF.

For a random variable $Z$ representing cost and confidence level $\alpha \in (0, 1]$:

$$
\text{CVaR}_\alpha(Z) = \min_{\eta \in \mathbb{R}} \left\{ \eta + \frac{1}{\alpha} \mathbb{E}\left[(Z - \eta)^+\right] \right\}
$$

where $(Z - \eta)^+ = \max(0, Z - \eta)$ captures the excess cost above threshold $\eta$.

Interpretation: $\text{CVaR}_\alpha$ is the expected cost in the worst $\alpha$-fraction of scenarios.

| $\alpha$ | Risk Posture | Meaning |
|---|---|---|
| 1.0 | Risk-neutral | $\text{CVaR}_1 = \mathbb{E}[Z]$ (expected value) |
| 0.5 | Moderately risk-averse | Average of worst 50% of outcomes |
| 0.2 | Risk-averse | Average of worst 20% of outcomes |
| 0.05 | Highly risk-averse | Average of worst 5% of outcomes |
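
The definition can be checked numerically on a small discrete distribution. The sketch below is illustrative only: the function name `cvar` and the cost values are assumptions, not part of Cobre. For a finite distribution the minimum over $\eta$ is attained at one of the cost values, so the code evaluates the objective at each of them.

```python
def cvar(costs, probs, alpha):
    """CVaR_alpha of a discrete cost distribution via the min-over-eta formula."""

    def objective(eta):
        # eta + (1 / alpha) * E[(Z - eta)^+]
        excess = sum(p * max(0.0, z - eta) for z, p in zip(costs, probs))
        return eta + excess / alpha

    # The objective is piecewise linear and convex with kinks at the cost
    # values, so one of them is a minimiser.
    return min(objective(eta) for eta in costs)


# Hypothetical data: five equiprobable cost outcomes.
costs = [10.0, 20.0, 30.0, 40.0, 100.0]
probs = [0.2] * 5

results = {alpha: cvar(costs, probs, alpha) for alpha in (1.0, 0.5, 0.2)}
for alpha, value in results.items():
    print(f"alpha={alpha}: CVaR={value:.2f}")
```

With these numbers, $\alpha = 1$ reproduces the mean (40), $\alpha = 0.5$ averages the worst half of the probability mass (62), and $\alpha = 0.2$ returns the single worst outcome (100).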

Cobre uses a convex combination of expectation and CVaR (following the SDDP.jl convention):

$$
\rho^{\lambda, \alpha}[Z] = (1 - \lambda) \mathbb{E}[Z] + \lambda \cdot \text{CVaR}_\alpha[Z]
$$

where:

  • $\lambda \in [0, 1]$: Risk aversion weight (0 = risk-neutral, 1 = pure CVaR)
  • $\alpha \in (0, 1]$: CVaR confidence level

This is sometimes called the EAVaR (Expectation + Average Value-at-Risk) risk measure.
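
As a worked illustration with assumed numbers (not taken from Cobre), consider five equiprobable costs $10, 20, 30, 40, 100$ with $\alpha = 0.2$ and $\lambda = 0.5$. The worst 20% of the probability mass is the single outcome 100, so:

$$
\mathbb{E}[Z] = \frac{10 + 20 + 30 + 40 + 100}{5} = 40, \qquad \text{CVaR}_{0.2}[Z] = 100, \qquad \rho^{0.5,\,0.2}[Z] = 0.5 \cdot 40 + 0.5 \cdot 100 = 70
$$

Setting $\lambda = 0$ recovers the mean (40) and $\lambda = 1$ recovers pure CVaR (100).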

4 Dual Representation of Convex Risk Measures

Convex risk measures have a dual representation that is essential for computing risk-averse cuts:

$$
\mathbb{F}[Z] = \sup_{\mu \in \mathcal{M}(p)} \mathbb{E}_\mu[Z] - \psi(p, \mu)
$$

where:

  • $\mathcal{M}(p) \subseteq \mathcal{P}$ is a convex subset of the probability simplex
  • $\psi(p, \mu)$ is a concave penalty function
  • $\mathcal{P} = \{p \geq 0 : \sum_{\omega} p_\omega = 1\}$

Interpretation: The dual computes the expectation with respect to the worst probability vector $\mu$ within the set $\mathcal{M}$, less a penalty term $\psi(p, \mu)$.

For $\text{CVaR}_\alpha$, the dual representation is:

$$
\text{CVaR}_\alpha[Z] = \sup_{\mu \in \mathcal{M}_\alpha(p)} \mathbb{E}_\mu[Z]
$$

where the risk set $\mathcal{M}_\alpha(p)$ is:

$$
\mathcal{M}_\alpha(p) = \left\{\mu \geq 0 : \sum_\omega \mu_\omega = 1, \; \mu_\omega \leq \frac{p_\omega}{\alpha} \; \forall \omega \right\}
$$

The penalty $\psi(p, \mu) = 0$ for CVaR (no penalty term).

Interpretation: CVaR puts more probability weight on the worst outcomes, with each scenario receiving at most $p_\omega / \alpha$ probability mass. For small $\alpha$, only the worst scenarios receive significant weight.
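
The supremum over $\mathcal{M}_\alpha(p)$ is a small linear program whose solution can be built greedily. The following sketch, with hypothetical names and data, fills the most expensive scenarios up to their cap $p_\omega / \alpha$ and evaluates the resulting expectation.

```python
def cvar_worst_case_weights(costs, probs, alpha):
    """Maximise sum(mu * cost) subject to mu >= 0, sum(mu) = 1, mu <= p / alpha."""
    mu = [0.0] * len(costs)
    remaining = 1.0
    # Visit scenarios from most to least expensive; each one is filled up to
    # its cap p / alpha until the unit of probability mass is used up.
    for i in sorted(range(len(costs)), key=lambda k: costs[k], reverse=True):
        mu[i] = min(probs[i] / alpha, remaining)
        remaining -= mu[i]
    return mu


# Hypothetical data: five equiprobable cost outcomes.
costs = [10.0, 20.0, 30.0, 40.0, 100.0]
probs = [0.2] * 5

mu = cvar_worst_case_weights(costs, probs, alpha=0.5)
dual_value = sum(m * z for m, z in zip(mu, costs))
print(mu, dual_value)
```

For $\alpha = 0.5$ the cap is 0.4 per scenario, so the two most expensive scenarios are filled to the cap, the third receives the remaining 0.2, and the dual value is $0.4 \cdot 100 + 0.4 \cdot 40 + 0.2 \cdot 30 = 62$, the average of the worst half of the distribution.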

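For the convex combination $\rho^{\lambda, \alpha}[Z] = (1-\lambda)\mathbb{E}[Z] + \lambda \cdot \text{CVaR}_\alpha[Z]$, the risk set mixes the nominal distribution with the CVaR risk set, $\mathcal{M}^{EAVaR}(p) = (1-\lambda)\, p + \lambda\, \mathcal{M}_\alpha(p)$:

$$
\mathcal{M}^{EAVaR}(p) = \left\{\mu : \sum_\omega \mu_\omega = 1, \; (1-\lambda)\, p_\omega \leq \mu_\omega \leq (1-\lambda)\, p_\omega + \frac{\lambda p_\omega}{\alpha} \; \forall \omega \right\}
$$

The lower bound $(1-\lambda)\, p_\omega$ is the share of probability mass fixed by the expectation term; without it the supremum would correspond to a pure CVaR at a different level rather than to the convex combination.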

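The key theorem for computing risk-averse cuts (the risk-averse subgradient theorem): let $V(x, \omega)$ be convex in $x$ for every fixed $\omega$, and let $\lambda(\tilde{x}, \omega)$ be a subgradient of $V(x, \omega)$ with respect to $x$ at $x = \tilde{x}$. Then $\sum_\omega \mu^*_\omega \, \lambda(\tilde{x}, \omega)$ is a subgradient of $\mathbb{F}[V(x, \omega)]$ at $\tilde{x}$, where $\mu^*$ attains the supremum in the dual representation evaluated at $\tilde{x}$, i.e. $\mu^* \in \arg\sup_{\mu \in \mathcal{M}(p)} \mathbb{E}_\mu[V(\tilde{x}, \omega)] - \psi(p, \mu)$.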

Application to Cut Generation: In SDDP, the subgradients $\lambda(\tilde{x}, \omega)$ are the cut coefficients $\pi_t(\omega)$ obtained from LP duals (see Cut Management §2). The risk-averse cut coefficients are computed by replacing the nominal scenario probabilities $p_\omega$ with risk-adjusted probabilities $\mu^*$:

$$
\bar{\pi}_{t-1,h} = \sum_{\omega \in \Omega_t} \mu^*_\omega \cdot \pi_{t,h}(\omega)
$$

where $\mu^*$ is the optimal dual probability vector computed from the scenario costs $\{Q_t(\hat{x}_{t-1}, \omega)\}_{\omega \in \Omega_t}$.
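
The subgradient property behind this weighting can be checked numerically on a toy problem. The sketch below is an assumption-laden illustration: it uses pure CVaR, a scalar state, and a made-up convex function $V(x, \omega) = |x - \omega|$ in place of a real stage value function. It verifies the subgradient inequality $\mathbb{F}[V(x, \omega)] \geq \mathbb{F}[V(\tilde{x}, \omega)] + g \cdot (x - \tilde{x})$ on a grid, where $g$ is the $\mu^*$-weighted slope.

```python
def cvar_weights(costs, probs, alpha):
    """Greedy maximiser of sum(mu * cost) with mu <= p / alpha and sum(mu) = 1."""
    mu, remaining = [0.0] * len(costs), 1.0
    for i in sorted(range(len(costs)), key=lambda k: costs[k], reverse=True):
        mu[i] = min(probs[i] / alpha, remaining)
        remaining -= mu[i]
    return mu


def toy_value(x, omega):
    """Hypothetical convex stage value V(x, omega) = |x - omega|."""
    return abs(x - omega)


def toy_slope(x, omega):
    """A subgradient of |x - omega| with respect to x (0 at the kink)."""
    return (x > omega) - (x < omega)


def risk_and_subgradient(x, omegas, probs, alpha):
    """CVaR of V(x, .) and the mu*-weighted slope at x."""
    costs = [toy_value(x, w) for w in omegas]
    mu = cvar_weights(costs, probs, alpha)
    risk = sum(m * c for m, c in zip(mu, costs))
    slope = sum(m * toy_slope(x, w) for m, w in zip(mu, omegas))
    return risk, slope


omegas, probs, alpha = [0.0, 1.0, 4.0], [0.5, 0.3, 0.2], 0.4
x_tilde = 3.0
risk_tilde, g = risk_and_subgradient(x_tilde, omegas, probs, alpha)

# The linearisation at x_tilde must stay below the risk-adjusted value everywhere.
grid = [-2.0 + 0.25 * k for k in range(41)]
is_subgradient = all(
    risk_and_subgradient(x, omegas, probs, alpha)[0]
    >= risk_tilde + g * (x - x_tilde) - 1e-9
    for x in grid
)
print(risk_tilde, g, is_subgradient)
```

A grid check is not a proof, but it makes the role of $\mu^*$ concrete: the weights are recomputed from the scenario costs at the point where the cut is built.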

The risk-averse value function with discount factor $d$ satisfies:

$$
V_t(x_{t-1}) = \rho^{\lambda_t, \alpha_t}\left[\min_{x_t} \left\{ c_t^\top x_t + d_{t \to t+1} \cdot V_{t+1}(x_t) : (x_t, x_{t-1}) \text{ feasible} \right\}\right]
$$

This modifies the standard Bellman recursion in two ways:

  1. Risk measure replaces expectation: $\rho^{\lambda_t, \alpha_t}[\cdot]$ replaces $\mathbb{E}[\cdot]$
  2. Discount factor on future cost: $d_{t \to t+1} \cdot V_{t+1}(x_t)$ discounts the cost-to-go (see Discount Rate §2)

In the LP subproblem at stage $t$, the future cost variable $\theta$ appears in the objective as $d_{t \to t+1} \cdot \theta$ (see Discount Rate §5). Cuts bound $\theta$ (not $d \cdot \theta$), so the discount factor multiplies $\theta$ only in the objective, exactly as in the risk-neutral case.
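
To make the placement of the discount factor concrete, the sketch below evaluates a stage objective for a fixed decision, using a scalar state and made-up cuts. The helper names and numbers are hypothetical, and no LP solver is involved: for a fixed $x$, the smallest feasible $\theta$ is simply the highest cut.

```python
def future_cost(x, cuts):
    """Smallest theta satisfying every cut theta >= intercept + slope * x."""
    return max(intercept + slope * x for intercept, slope in cuts)


def stage_objective(x, unit_cost, discount, cuts):
    """Immediate cost plus discounted future cost for a fixed decision x."""
    theta = future_cost(x, cuts)  # cuts bound theta itself, undiscounted
    return unit_cost * x + discount * theta  # the discount enters only here


# Hypothetical cuts as (intercept, slope) pairs.
cuts = [(100.0, -2.0), (60.0, -0.5)]
value = stage_objective(10.0, unit_cost=3.0, discount=0.9, cuts=cuts)
print(value)
```

At $x = 10$ the cuts give $\theta = \max(80, 55) = 80$, so the objective is $3 \cdot 10 + 0.9 \cdot 80 = 102$. The stored cuts are unchanged if the discount factor changes.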

For each visited state $\hat{x}_{t-1}$, compute the risk-averse cut as follows:

Step 1: Solve subproblems for all realizations $\omega \in \Omega_t$:

$$
Q_t(\hat{x}_{t-1}, \omega) = \min_{x_t} \left\{ c_t^\top x_t + d_{t \to t+1} \cdot \theta_t : \text{constraints} \right\}
$$

Extract dual solutions $\pi_t(\omega)$ and compute per-scenario cut coefficients:

  • Intercept: $\hat{\alpha}_t(\omega) = Q_t(\hat{x}_{t-1}, \omega) - \pi_t(\omega)^\top \hat{x}_{t-1}$
  • Coefficients: $\pi_t(\omega)$, derived from LP duals (see Cut Management §2)

Step 2: Compute risk-adjusted scenario weights $\mu^*_\omega$.

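Each scenario $\omega$ has a lower and an upper probability bound:

$$
(1 - \lambda_t) \, p_\omega \;\leq\; \mu_\omega \;\leq\; \bar{\mu}_\omega = (1 - \lambda_t) \, p_\omega + \frac{\lambda_t \, p_\omega}{\alpha_t}
$$

The lower bound comes from the expectation term of $\rho^{\lambda_t, \alpha_t}$, which keeps a share $(1 - \lambda_t)\, p_\omega$ on every scenario; the remaining mass $\lambda_t$ is distributed by the CVaR term. Since $\bar{\mu}_\omega > p_\omega$ when $\lambda_t > 0$ and $\alpha_t < 1$, the total capacity $\sum_\omega \bar{\mu}_\omega > 1$, so not every scenario can sit at its upper bound. The risk-adjusted weights are found by assigning as much weight as possible to the most expensive scenarios:

  1. Sort scenarios by cost $Q_\omega$ in descending order
  2. Start every scenario at its lower bound $(1 - \lambda_t)\, p_\omega$
  3. Walk down the sorted list, raising $\mu^*_\omega$ to $\bar{\mu}_\omega$ (the upper bound) for each scenario
  4. When the cumulative weight reaches 1, the current scenario receives the remaining fraction, and all cheaper scenarios stay at their lower bound (which is $0$ only for pure CVaR, $\lambda_t = 1$)

This is a greedy allocation (continuous knapsack): it places maximum weight on the worst scenarios and minimum weight on the best.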
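
A minimal sketch of the weight computation follows, assuming a discrete scenario set and hypothetical names. It writes the weights as the mix $\mu^* = (1-\lambda)\, p + \lambda\, q$, where $q$ is the greedy CVaR allocation with cap $p_\omega / \alpha$; this is what the convex combination $(1-\lambda)\,\mathbb{E} + \lambda\,\text{CVaR}_\alpha$ implies, so scenarios below the cutoff keep the baseline weight $(1-\lambda)\, p_\omega$.

```python
def risk_adjusted_weights(costs, probs, lam, alpha):
    """Weights mu* for (1 - lam) * E + lam * CVaR_alpha on a discrete distribution."""
    # CVaR part: fill the most expensive scenarios up to the cap p / alpha.
    q, remaining = [0.0] * len(costs), 1.0
    for i in sorted(range(len(costs)), key=lambda k: costs[k], reverse=True):
        q[i] = min(probs[i] / alpha, remaining)
        remaining -= q[i]
    # Mix with the nominal probabilities: each scenario keeps at least
    # (1 - lam) * p and receives at most (1 - lam) * p + lam * p / alpha.
    return [(1.0 - lam) * p + lam * qi for p, qi in zip(probs, q)]


# Hypothetical scenario costs Q_omega with uniform nominal probabilities.
costs = [10.0, 20.0, 30.0, 40.0, 100.0]
probs = [0.2] * 5

mu = risk_adjusted_weights(costs, probs, lam=0.5, alpha=0.2)
risk_value = sum(m * c for m, c in zip(mu, costs))
print(mu, risk_value)
```

With $\lambda = 0.5$ and $\alpha = 0.2$ the most expensive scenario receives weight 0.6 and each of the others 0.1, so the weighted cost is 70, which matches $0.5 \cdot 40 + 0.5 \cdot 100$ computed directly from the definition.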

Step 3: Compute risk-averse cut coefficients using $\mu^*$ (justified by the theorem in §5):

$$
\bar{\hat{\alpha}}_{t-1} = \sum_{\omega \in \Omega_t} \mu^*_\omega \cdot \hat{\alpha}_t(\omega)
$$

$$
\bar{\pi}_{t-1} = \sum_{\omega \in \Omega_t} \mu^*_\omega \cdot \pi_t(\omega)
$$

Step 4: Add cut to stage $t-1$:

$$
\theta_{t-1} \geq \bar{\hat{\alpha}}_{t-1} + \bar{\pi}_{t-1}^\top x_{t-1}
$$
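
Steps 1, 3 and 4 can be sketched as a single aggregation routine, taking the Step 2 weights as an input. The function name, data layout and numbers below are assumptions for illustration; in practice the scenario values and duals would come from the LP solves of Step 1.

```python
def aggregate_cut(x_hat, scenario_values, scenario_duals, weights):
    """Return (intercept, coefficients) of the risk-averse cut built at x_hat."""
    n_state = len(x_hat)
    intercept = 0.0
    coeffs = [0.0] * n_state
    for q_value, pi, mu in zip(scenario_values, scenario_duals, weights):
        # Per-scenario intercept: Q(x_hat, omega) - pi(omega)^T x_hat
        alpha_hat = q_value - sum(p * x for p, x in zip(pi, x_hat))
        intercept += mu * alpha_hat
        for h in range(n_state):
            coeffs[h] += mu * pi[h]
    return intercept, coeffs


# Hypothetical two-dimensional state with three scenarios.
x_hat = [50.0, 20.0]
values = [300.0, 450.0, 900.0]                           # Q_t(x_hat, omega)
duals = [[-1.0, -0.5], [-2.0, -1.0], [-4.0, -3.0]]       # pi_t(omega)
weights = [0.1, 0.3, 0.6]                                # mu*, assumed given

intercept, coeffs = aggregate_cut(x_hat, values, duals, weights)
cut_at_x_hat = intercept + sum(c * x for c, x in zip(coeffs, x_hat))
print(intercept, coeffs, cut_at_x_hat)
```

A useful sanity check is that the cut is tight at the point where it was built: evaluated at `x_hat` it equals the $\mu^*$-weighted scenario cost, here $0.1 \cdot 300 + 0.3 \cdot 450 + 0.6 \cdot 900 = 705$.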

Risk aversion can vary by stage. The `risk_measure` field in `stages.json` specifies $(\lambda_t, \alpha_t)$ per stage:

| Option | Description |
|---|---|
| `"expectation"` | Risk-neutral expected value (default) |
| `{"cvar": {...}}` | CVaR parameters with `alpha` (confidence level) and `lambda` (weight) |

Example with stage-varying risk:

```json
{
  "stages": [
    {
      "id": 0,
      "risk_measure": { "cvar": { "alpha": 0.95, "lambda": 0.5 } }
    },
    {
      "id": 1,
      "risk_measure": { "cvar": { "alpha": 0.95, "lambda": 0.25 } }
    },
    {
      "id": 2,
      "risk_measure": "expectation"
    }
  ]
}
```

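This allows decreasing risk aversion over the horizon (e.g., higher $\lambda$ for near-term stages, lower for distant stages). Note that under the CVaR definition used in this spec, where $\alpha = 1$ is risk-neutral, `"alpha": 0.95` averages over the worst 95% of outcomes.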
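
One way to read such a profile is to map each entry to a $(\lambda_t, \alpha_t)$ pair. The sketch below is not Cobre's loader: the helper `risk_params` and the validation rules are assumptions derived from the parameter ranges stated earlier, and `"expectation"` is mapped to $(\lambda, \alpha) = (0, 1)$, which is the risk-neutral case of the convex combination.

```python
import json


def risk_params(risk_measure):
    """Map a risk_measure entry to a (lambda, alpha) pair (hypothetical helper)."""
    if risk_measure == "expectation":
        return 0.0, 1.0  # lambda = 0 is risk-neutral; alpha is then irrelevant
    cvar = risk_measure["cvar"]
    lam, alpha = cvar["lambda"], cvar["alpha"]
    if not (0.0 <= lam <= 1.0 and 0.0 < alpha <= 1.0):
        raise ValueError(f"invalid CVaR parameters: lambda={lam}, alpha={alpha}")
    return lam, alpha


document = json.loads("""
{
  "stages": [
    {"id": 0, "risk_measure": {"cvar": {"alpha": 0.95, "lambda": 0.5}}},
    {"id": 1, "risk_measure": {"cvar": {"alpha": 0.95, "lambda": 0.25}}},
    {"id": 2, "risk_measure": "expectation"}
  ]
}
""")

# A missing field falls back to "expectation", the documented default.
profile = {
    stage["id"]: risk_params(stage.get("risk_measure", "expectation"))
    for stage in document["stages"]
}
print(profile)
```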

Monte Carlo simulation cannot directly estimate the upper bound for CVaR problems because:

  1. CVaR depends on the tail of the entire cost distribution, so a sample average of simulated costs does not estimate it
  2. The optimal $\eta$ (VaR threshold) changes with the policy

For risk-averse problems, the inner approximation (SIDP) provides deterministic upper bounds that remain valid regardless of the risk measure. See Upper Bound Evaluation §1 for the complete formulation.

10 Lower Bound Validity with Risk Measures

In risk-neutral SDDP, the lower bound $\underline{z} = V_1(x_0)$ is the optimal value of the first-stage LP, which uses cuts that provide valid outer approximations of the expected future cost. This bound converges to the true optimal expected cost.

In risk-averse SDDP, this property does not hold because:

  1. Cuts approximate nested risk measures: Each cut approximates $\rho_t[V_{t+1}(x_t, \omega)]$, where the risk measure $\rho_t$ depends on the distribution of costs at that stage.

  2. The LP optimizes under the wrong distribution: The first-stage LP optimizes $\rho_1[V_2]$, but the risk-adjusted distribution used in the cuts was computed for the training states, not for the optimal first-stage decision.

  3. Nested risk measures are not time-consistent in expectation: Unlike $\mathbb{E}[\cdot]$, which satisfies the tower property, the nested application of CVaR does not reduce to a single risk measure of the total cost. In general:

    $$
    \rho_1[\rho_2[\ldots]] \neq \rho[\text{total cost}]
    $$
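
The gap between nested and end-of-horizon CVaR is easy to reproduce on a two-stage toy problem. The sketch below uses assumed data (two independent stages, each costing 0 or 10 with equal probability, $\alpha = 0.5$) and a hypothetical `cvar` helper that averages the worst $\alpha$-fraction of a discrete distribution.

```python
def cvar(costs, probs, alpha):
    """Average of the worst alpha-fraction of a discrete cost distribution."""
    remaining, total = alpha, 0.0
    for cost, prob in sorted(zip(costs, probs), reverse=True):
        weight = min(prob, remaining)  # take mass from the most expensive outcomes
        total += weight * cost
        remaining -= weight
    return total / alpha


alpha = 0.5
stage_costs = [0.0, 10.0]
stage_probs = [0.5, 0.5]

# Nested: apply CVaR to stage 2 first, then to stage-1 cost plus that value.
inner = cvar(stage_costs, stage_probs, alpha)
nested = cvar([c + inner for c in stage_costs], stage_probs, alpha)

# End-of-horizon: apply CVaR once to the distribution of the total cost.
totals = [a + b for a in stage_costs for b in stage_costs]
total_probs = [pa * pb for pa in stage_probs for pb in stage_probs]
single = cvar(totals, total_probs, alpha)

print(nested, single)
```

Here the nested value is 20 (worst case at both stages), while CVaR of the total cost is 15 (the average of the outcomes 20 and 10, each with probability 0.25). With $\alpha = 1$ both reduce to the expected total cost of 10.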

For risk-averse problems, the value $\underline{z} = V_1(x_0)$ computed by SDDP is:

  • A convergence indicator: It increases monotonically and plateaus when additional cuts provide no improvement
  • NOT a valid lower bound on the true risk-averse optimal cost

| Purpose | Method |
|---|---|
| Convergence monitoring | Track $\underline{z}$ stabilization (bound stalling rule; see Stopping Rules §4) |
| Valid upper bound | Inner approximation (SIDP); see Upper Bound Evaluation |
| Policy evaluation | Monte Carlo simulation with risk-averse policy decisions |

Philpott, A.B., de Matos, V.L., & Finardi, E.C. (2013). “On solving multistage stochastic programs with coherent risk measures.” Operations Research, 61(4), 957-970. https://doi.org/10.1287/opre.2013.1175

Shapiro, A. (2011). “Analysis of stochastic dual dynamic programming method.” European Journal of Operational Research, 209(1), 63-72.

  • Notation Conventions: Symbol definitions, dual variable notation, and sign conventions
  • SDDP Algorithm: Bellman recursion and forward/backward pass structure modified by risk measures
  • Cut Management: Dual extraction and cut coefficient computation; risk-averse aggregation replaces $p(\omega)$ with $\mu^*_\omega$ (§7)
  • Stopping Rules: Bound stalling recommended for risk-averse convergence monitoring; simulation-based stopping limitations
  • Discount Rate: Discount factor $d$ convention and discounted Bellman equation
  • Horizon Modes: Interaction of risk measures with cyclic policy graphs
  • Upper Bound Evaluation: SIDP inner approximation for valid upper bounds with CVaR objectives