Discount Rate Formulation

Purpose

This spec defines how discount rates are incorporated into the Cobre SDDP solver: the discounted Bellman equation, stage-dependent discount factors, the effect on the future cost variable $θ$, cumulative discounting, and the effect on lower/upper bound computation.

For the infinite periodic horizon formulation (where discounting is required for convergence), see Infinite Horizon.

For notation conventions (index sets, parameters, decision variables, dual variables), see Notation Conventions.

Symbol convention: This spec uses $d$ for the discount factor. The deficit variable $δ_{b, k, s}$ uses a different symbol (lowercase delta), so there is no conflict.

1 Motivation

The discount factor $d \in (0, 1]$ captures the time value of money or risk preference, where future costs are valued less than present costs. This is essential for:

Infinite horizon problems: Ensuring convergence of the value function
Economic consistency: Reflecting opportunity cost of capital
Risk adjustment: Implicitly reducing weight of distant uncertain outcomes

2 Discounted Bellman Equation

The standard risk-neutral Bellman recursion with discount factor $d$ is:

$V_{t} (x_{t - 1}) = \mathbb{E}_{ω_{t}} \left[ \min_{x_{t} \in X_{t} (ω_{t})} \left\{ c_{t} (x_{t}, u_{t}) + d_{t \to t + 1} \cdot V_{t + 1} (x_{t}) \right\} \right]$

where:

$c_{t} (x_{t}, u_{t})$ is the immediate cost at stage $t$
$V_{t + 1} (x_{t})$ is the future cost function (cost-to-go)
$d_{t \to t + 1}$ is the discount factor for the transition from stage $t$ to $t + 1$

Formulation Note: The discount factor $d$ multiplies only the future cost $V_{t + 1}$, not the immediate cost $c_{t}$. This is the standard SDDP convention:

Immediate cost $c_{t}$: Not discounted (incurred “now” at stage $t$)

Future cost $d \cdot V_{t + 1}$: Discounted to present value at stage $t$

This is mathematically equivalent to computing all costs at “time 0” (stage 1) present value, where stage $t$ costs are multiplied by $\prod_{s = 1}^{t - 1} d_{s \to s + 1}$ in the objective. The Bellman formulation above is the recursive form that SDDP exploits.
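As a worked illustration of this equivalence, consider a single fixed trajectory over three stages with terminal value $V_{4} = 0$, dropping the expectation and the minimization for readability (both simplifications are assumptions made only for this sketch). Unrolling the recursion gives:

```latex
\begin{aligned}
V_{1} &= c_{1} + d_{1 \to 2} \cdot V_{2} \\
      &= c_{1} + d_{1 \to 2} \cdot \left( c_{2} + d_{2 \to 3} \cdot V_{3} \right) \\
      &= c_{1} + d_{1 \to 2} \cdot c_{2} + d_{1 \to 2} \cdot d_{2 \to 3} \cdot c_{3}
\end{aligned}
```

Each stage $t$ cost ends up multiplied by $\prod_{s = 1}^{t - 1} d_{s \to s + 1}$, which is the time-0 present value weighting described above.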

3 Input Specification

Discount rates are specified as an annual rate in stages.json, within the policy_graph section:

{
  "policy_graph": {
    "type": "finite_horizon",
    "annual_discount_rate": 0.06,
    "transitions": [
      { "source_id": 0, "target_id": 1, "probability": 1.0 },
      { "source_id": 1, "target_id": 2, "probability": 1.0 }
    ]
  }
}

The system converts the annual rate to a per-transition discount factor based on each stage’s duration:

Stage duration $Δ t$ is derived from end_date - start_date, expressed in years
Transition discount factor: $d_{t \to t + 1} = \frac{1}{(1 + r_{\text{annual}})^{Δ t}}$
The duration used is that of the source stage (the stage whose future cost is being discounted)

A value of 0.0 means no discounting ($d = 1.0$ for all transitions).
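The conversion can be sketched in a few lines of Python. This is an illustration rather than Cobre's implementation: the function names are hypothetical, and the 365.25 days-per-year convention is an assumption, since the spec only states that the duration is expressed in years.

```python
from datetime import date

# Assumption: one year is 365.25 days; the spec only says "expressed in years".
DAYS_PER_YEAR = 365.25


def stage_duration_years(start_date: date, end_date: date) -> float:
    """Stage duration (delta t) in years, derived from end_date - start_date."""
    return (end_date - start_date).days / DAYS_PER_YEAR


def transition_discount_factor(annual_rate: float, duration_years: float) -> float:
    """d = 1 / (1 + r_annual) ** delta_t, using the source stage's duration."""
    if annual_rate <= -1.0:
        raise ValueError("annual_rate must be greater than -1")
    return 1.0 / (1.0 + annual_rate) ** duration_years


# A monthly source stage (January 2024) at a 6% annual rate.
dt = stage_duration_years(date(2024, 1, 1), date(2024, 2, 1))
d_monthly = transition_discount_factor(0.06, dt)

# A rate of 0.0 yields d = 1.0, i.e. no discounting.
d_none = transition_discount_factor(0.0, dt)
```

For a monthly stage the resulting factor is only slightly below 1, and a full-year stage at 6% gives $1 / 1.06$.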

Individual transitions may override the global rate:

{
  "source_id": 59,
  "target_id": 48,
  "probability": 1.0,
  "annual_discount_rate": 0.1
}

For the complete policy_graph schema, see Input Scenarios §1.2.
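One way to picture the override rule is a lookup that prefers the transition's own annual_discount_rate and otherwise falls back to the global value. The sketch below is hypothetical: the helper name is invented, and treating a missing global rate as 0.0 is an assumption not stated in this spec.

```python
def effective_annual_rate(policy_graph: dict, transition: dict) -> float:
    """Return the per-transition rate if present, else the global rate."""
    # Assumption: a missing global rate is treated as 0.0 (no discounting).
    global_rate = policy_graph.get("annual_discount_rate", 0.0)
    return transition.get("annual_discount_rate", global_rate)


policy_graph = {
    "type": "finite_horizon",
    "annual_discount_rate": 0.06,
    "transitions": [
        {"source_id": 0, "target_id": 1, "probability": 1.0},
        # This transition overrides the global rate.
        {"source_id": 1, "target_id": 2, "probability": 1.0,
         "annual_discount_rate": 0.1},
    ],
}

rates = [effective_annual_rate(policy_graph, t)
         for t in policy_graph["transitions"]]
```

Here the first transition inherits the global 0.06 and the second uses its own 0.1.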

4 Discount Factor in the Stage Subproblem

The discount factor is applied to the future cost variable $θ$ in the stage $t$ objective, not to the cut coefficients:

$Q_{t} (x_{t - 1}, ω_{t}) = \min_{x_{t}, u_{t}, θ} \left\{ c_{t} (x_{t}, u_{t}) + d_{t \to t + 1} \cdot θ \right\}$

subject to all standard constraints (load balance, hydro balance, etc.) and Benders cuts:

$θ \geq α_{i} + \sum_{h} π_{i, h}^{v} \cdot v_{h} + \sum_{h, ℓ} π_{i, h, ℓ}^{\text{lag}} \cdot a_{h, ℓ} \quad \forall i$

The cut coefficients $(α_{i}, π_{i, h}^{v}, π_{i, h, ℓ}^{\text{lag}})$ are the undiscounted values from the backward pass. Cuts are stored and managed in undiscounted form; the discount factor appears only in the objective coefficient of $θ$. See Cut Management for cut generation and aggregation details.

Why discount on $θ$, not on cuts: Applying the discount factor to the $θ$ variable rather than scaling each cut individually is simpler and avoids modifying cut coefficients. The mathematical result is identical: because $d > 0$, having $θ \geq α + π^{⊤} x$ with $d \cdot θ$ in the objective is equivalent to having $θ' \geq d \cdot (α + π^{⊤} x)$ with $θ'$ in the objective (substitute $θ' = d \cdot θ$).
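The equivalence can be checked numerically on a toy problem. The sketch below assumes a single scalar state, a linear immediate cost, and made-up cut values. Because $θ$ is minimized, it sits at the maximum of the cut right-hand sides, so no LP solver is needed for the check.

```python
def objective_discount_on_theta(x, cost_per_unit, d, cuts):
    """c*x + d*theta, with theta at its smallest value allowed by the cuts."""
    theta = max(alpha + pi * x for alpha, pi in cuts)
    return cost_per_unit * x + d * theta


def objective_discount_on_cuts(x, cost_per_unit, d, cuts):
    """c*x + theta', with theta' bounded below by the scaled cuts."""
    theta_prime = max(d * (alpha + pi * x) for alpha, pi in cuts)
    return cost_per_unit * x + theta_prime


# Illustrative undiscounted cuts (alpha_i, pi_i) in one state dimension.
cuts = [(10.0, -2.0), (6.0, -0.5), (4.0, 0.0)]
d = 0.95
grid = [i * 0.5 for i in range(21)]  # x from 0 to 10 in steps of 0.5

# Largest difference between the two formulations over the grid.
max_gap = max(
    abs(objective_discount_on_theta(x, 1.0, d, cuts)
        - objective_discount_on_cuts(x, 1.0, d, cuts))
    for x in grid
)
```

The two objectives agree at every grid point, which is why the cuts can stay in undiscounted form.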

5 Cumulative Discounting

For a path from stage 1 to stage $T$, the cumulative discount factor is:

$d_{1 \to T} = \prod_{t = 1}^{T - 1} d_{t \to t + 1}$

The present value at stage 1 of costs incurred at stage $T$ is:

$PV_{1} [c_{T}] = d_{1 \to T} \cdot c_{T}$

where $d_{1 \to 1} = 1$.
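A running product yields every cumulative factor in one pass. The following minimal sketch uses a hypothetical helper name and illustrative factor values; it is not taken from the Cobre code base.

```python
from itertools import accumulate
from operator import mul


def cumulative_discount_factors(transition_factors):
    """Map [d_{1->2}, ..., d_{T-1->T}] to [d_{1->1}, d_{1->2}, ..., d_{1->T}]."""
    # initial=1.0 supplies d_{1->1} = 1, the empty product.
    return list(accumulate(transition_factors, mul, initial=1.0))


factors = [0.99, 0.98, 0.97]  # three transitions, hence four stages
cumulative = cumulative_discount_factors(factors)

# Present value at stage 1 of a cost of 1,000,000 incurred at stage 4.
pv_last = cumulative[-1] * 1_000_000
```

The list has one entry per stage, so `cumulative[t - 1]` is the factor $d_{1 \to t}$ for a 1-based stage index $t$.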

6 Lower Bound with Discounting

The deterministic lower bound at iteration $k$ is:

$\underline{z}^{k} = c_{1} (\hat{x}_{1}^{k}) + d_{1 \to 2} \cdot θ_{1}^{k}$

where $θ_{1}^{k}$ is the optimal value of the future cost variable at stage 1. Because the discount factor cascades through $θ$ at each stage, the lower bound already reflects cumulative discounting to stage 1 present value.

7 Upper Bound (Simulation) with Discounting

When simulating the policy to estimate the upper bound:

$\bar{z}^{k} = \frac{1}{M} \sum_{m = 1}^{M} \sum_{t = 1}^{T} d_{1 \to t} \cdot c_{t} (\hat{x}_{t}^{k, m})$

Each stage’s immediate cost is explicitly discounted to stage 1 present value using the cumulative discount factor.
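The estimator can be sketched as follows, assuming the simulated immediate costs are available as $M$ rows of $T$ values and that the same transition factors apply to every scenario (true for a linear policy graph). The function names and cost values are illustrative only.

```python
def cumulative_factors(transition_factors):
    """Return [d_{1->1}, ..., d_{1->T}] from the T-1 transition factors."""
    out = [1.0]
    for d in transition_factors:
        out.append(out[-1] * d)
    return out


def discounted_upper_bound(stage_costs, transition_factors):
    """Mean over scenarios of sum_t d_{1->t} * c_t.

    stage_costs: M rows (scenarios), each with T immediate costs.
    transition_factors: T-1 values d_{t->t+1}.
    """
    cum = cumulative_factors(transition_factors)
    totals = []
    for row in stage_costs:
        if len(row) != len(cum):
            raise ValueError("each scenario needs one cost per stage")
        totals.append(sum(d * c for d, c in zip(cum, row)))
    return sum(totals) / len(totals)


# Two scenarios, three stages; factors chosen for easy arithmetic.
costs = [[100.0, 200.0, 300.0],
         [100.0, 100.0, 100.0]]
z_bar = discounted_upper_bound(costs, [0.5, 0.5])
```

With these inputs the per-scenario present values are 275 and 175, so the estimate is their mean, 225.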

For stopping rules that use these bounds, see Stopping Rules.

8 Reporting

Both the lower bound $\underline{z}$ and upper bound $\bar{z}$ represent total expected cost expressed in present value at stage 1:

A lower bound of $100M means the optimal policy costs at least $100M in stage-1 dollars
Future costs are already discounted: a $1M cost at stage 12 with $d_{1 \to 12} = 0.95$ contributes $0.95M to the bounds
Comparisons between bounds and between iterations are valid because they use consistent discounting
When reporting per-stage costs in simulation outputs, Cobre reports both nominal (undiscounted) and present value costs

9 Infinite Horizon Considerations

For cyclic policy graphs (infinite periodic horizon), discounting is required for convergence. The cumulative discount around one full cycle must satisfy:

$d_{\text{cycle}} = \prod_{t \in \text{cycle}} d_{t \to t + 1} < 1$

This ensures the value function remains finite: $\lim_{n \to \infty} d_{\text{cycle}}^{n} \cdot V_{t} (x) = 0$.

Typical setup: An annual discount rate of 6% gives $d_{\text{cycle}} = 1 / 1.06 \approx 0.94$ per 12-month cycle.
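A simple validation of the cycle condition might look like the sketch below. It assumes twelve equal stages of exactly 1/12 year each, which is a simplification since real monthly stages differ in length, and the helper name is hypothetical.

```python
import math


def cycle_discount_factor(transition_factors):
    """Product of d_{t->t+1} over every transition in the cycle."""
    return math.prod(transition_factors)


# Assumption: 12 equal stages of 1/12 year each at a 6% annual rate.
monthly = 1.0 / (1.0 + 0.06) ** (1.0 / 12.0)
d_cycle = cycle_discount_factor([monthly] * 12)

# The infinite-horizon formulation requires a strictly contracting cycle.
converges = d_cycle < 1.0
```

With a zero annual rate every factor is 1.0, the product is 1.0, and the check fails, which matches the requirement that discounting be present for cyclic graphs.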

For the complete infinite horizon formulation — including cut sharing within cycles, modified forward/backward passes, and convergence criteria — see Infinite Horizon.

Cross-References

Input Scenarios §1.2 — policy_graph schema with annual_discount_rate and per-transition overrides
SDDP Algorithm — Core Bellman equation and forward/backward pass structure that discount rates modify
Cut Management — Cut coefficients remain undiscounted; discount applied to $θ$ in objective
Stopping Rules — Convergence criteria using discounted lower/upper bounds
Upper Bound Evaluation — Inner approximation uses discounted vertex values
Infinite Horizon — Cycle detection, cut sharing, modified passes, convergence for periodic problems
Configuration Reference — stages.json transition discount rates

Cobre Methodology Reference