Infinite Periodic Horizon

Status: Not Implemented. This spec describes a planned design that has not yet been implemented.

Purpose

This spec defines the infinite periodic horizon formulation for Cobre SDDP: the periodic structure, convergence requirements, cut sharing within cycles, modified forward and backward pass behavior, and convergence criteria. This formulation eliminates “end-of-world” effects where finite-horizon SDDP empties reservoirs near the terminal stage.

For the discount factor mechanics that make infinite horizon convergence possible, see Discount Rate.

Symbol convention: This spec uses $d$ for the discount factor and $π$ for cut coefficients.

1 Motivation

Standard finite-horizon SDDP has a terminal condition $V_{T+1}(x) = 0$, causing the algorithm to attribute zero value to stored water at the end of the horizon. For long-term planning, an infinite periodic horizon better represents the ongoing nature of hydrothermal operations by allowing the value function to reflect perpetual future use.

2 Periodic Structure

Consider a system with $P$ stages per cycle (e.g., 12 monthly stages). Let $τ(t)$ denote the season (position within the cycle) for stage $t$:

$τ(t) = ((t - 1) \bmod P) + 1 \in \{1, 2, \dots, P\}$

Stages with the same season share structural properties: demand patterns, inflow statistics, block definitions, and stochastic process parameters.
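
The season mapping can be sketched in a few lines. This is a minimal illustration that assumes 1-based stage indices, as in the formula above; the function name `season` is hypothetical and not part of any Cobre API.

```python
def season(t: int, P: int) -> int:
    """Return the 1-based season tau(t) for a 1-based stage index t."""
    if t < 1 or P < 1:
        raise ValueError("t and P must be positive integers")
    return (t - 1) % P + 1


# Stages 1, 13 and 25 all map to season 1 of a 12-stage cycle.
print([season(t, 12) for t in (1, 12, 13, 24, 25)])  # [1, 12, 1, 12, 1]
```

Stages that map to the same season are the ones that share demand patterns, inflow statistics, and the other structural properties listed above.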

3 Cyclic Policy Graph

A cyclic policy graph is defined when a transition in stages.json points from a later stage back to an earlier one, forming a cycle. The policy_graph type must be "cyclic":

{
  "policy_graph": {
    "type": "cyclic",
    "annual_discount_rate": 0.06,
    "transitions": [
      { "source_id": 0, "target_id": 1, "probability": 1.0 },
      { "source_id": 59, "target_id": 48, "probability": 1.0 }
    ]
  }
}

In this example, stage 59 transitions back to stage 48, creating a 12-stage cycle (stages 48-59).
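
As a rough sketch of how the cycle could be recovered from the transition list, the snippet below looks for the single transition whose target precedes its source. It assumes stage IDs are integers in chronological order and that there is exactly one back-edge; the helper name `find_cycle` is hypothetical and not taken from the Cobre code base.

```python
def find_cycle(transitions):
    """Return (entry_id, exit_id, length) of the cycle closed by the single back-edge."""
    # A back-edge points from a later stage to an earlier one.
    back_edges = [t for t in transitions if t["target_id"] < t["source_id"]]
    if len(back_edges) != 1:
        raise ValueError(f"expected exactly one back-edge, found {len(back_edges)}")
    entry_id = back_edges[0]["target_id"]
    exit_id = back_edges[0]["source_id"]
    return entry_id, exit_id, exit_id - entry_id + 1


transitions = [
    {"source_id": 0, "target_id": 1, "probability": 1.0},
    {"source_id": 59, "target_id": 48, "probability": 1.0},
]
print(find_cycle(transitions))  # (48, 59, 12)
```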

For the complete policy_graph schema and per-transition discount rate overrides, see Input Scenarios §1.2.

4 Discount Requirement for Convergence

For the value function to remain finite, the cumulative discount around one full cycle must be strictly less than 1:

$d_{\text{cycle}} = \prod_{t \in \text{cycle}} d_{t \to t + 1} < 1$

This ensures:

$\lim_{n \to \infty} d_{\text{cycle}}^{n} \cdot V_{t}(x) = 0$

Typical setup: A 6% annual discount rate gives $d_{\text{cycle}} = 1 / 1.06 \approx 0.94$ per 12-month cycle.

Validation: The system rejects cyclic policy graphs where the cumulative cycle discount factor is $\geq 1$. See Input Scenarios §1.2.
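
The check can be illustrated with a short sketch. The conversion from an annual rate to a per-stage factor by compounding, $(1 + r)^{-1/12}$, is an assumption made here for illustration (the authoritative rule is in Discount Rate), and the function names are hypothetical.

```python
import math


def stage_discount_factor(annual_rate: float, stages_per_year: int = 12) -> float:
    """Per-stage discount factor, assuming compound conversion of the annual rate."""
    return (1.0 + annual_rate) ** (-1.0 / stages_per_year)


def validate_cycle_discount(stage_factors) -> float:
    """Return the cumulative cycle discount; raise if it is not strictly below 1."""
    d_cycle = math.prod(stage_factors)
    if d_cycle >= 1.0:
        raise ValueError(f"cumulative cycle discount {d_cycle} must be < 1")
    return d_cycle


d_stage = stage_discount_factor(0.06)
d_cycle = validate_cycle_discount([d_stage] * 12)
print(round(d_cycle, 4))  # 0.9434
```

A cycle with no discounting, such as `validate_cycle_discount([1.0] * 12)`, raises `ValueError` in this sketch, mirroring the rejection rule above.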

5 Cut Sharing Within Cycles

Stages at the same position within the cycle share their value function approximation. Let $C_{τ} = \{t : τ(t) = τ\}$ be the set of all stages with season $τ$.

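A cut generated at any stage $t \in C_{τ}$ is valid for all stages in $C_{τ}$. With $K_{τ}$ denoting the set of cuts collected for season $τ$, the shared lower approximation is:

$\underline{V}_{τ}(x) = \max_{k \in K_{τ}} \{ α_{k} + π_{k}^{\top} x \}$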

The cut pool is organized by season $τ \in \{1, \dots, P\}$, not by absolute stage ID. This means a single cycle’s worth of cut pools represents the entire infinite horizon.
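
A season-keyed pool might look like the following sketch. The class and method names are hypothetical, stage indices are assumed to be 1-based, and an empty pool is treated as carrying no information (the maximum over an empty set).

```python
from collections import defaultdict


class SeasonalCutPool:
    """Stores cuts (alpha, pi) by season rather than by absolute stage ID."""

    def __init__(self, P: int):
        self.P = P
        self._cuts = defaultdict(list)  # season -> list of (alpha, pi)

    def season(self, t: int) -> int:
        return (t - 1) % self.P + 1

    def add_cut(self, t: int, alpha: float, pi) -> None:
        # A cut generated at stage t is filed under its season tau(t).
        self._cuts[self.season(t)].append((alpha, tuple(pi)))

    def evaluate(self, t: int, x) -> float:
        """Lower approximation at state x: max over the season's cuts."""
        cuts = self._cuts.get(self.season(t))
        if not cuts:
            return float("-inf")  # no cuts yet for this season
        return max(a + sum(p * xi for p, xi in zip(pi, x)) for a, pi in cuts)


pool = SeasonalCutPool(P=12)
pool.add_cut(1, 10.0, [-2.0])   # generated at stage 1 (season 1)
pool.add_cut(13, 4.0, [-0.5])   # generated at stage 13 (also season 1)
print(pool.evaluate(25, [3.0]))  # 4.0, stage 25 reuses both season-1 cuts
```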

6 Forward Pass Behavior

In the infinite-horizon formulation, the forward pass simulates the policy over multiple cycles until the discounted contribution becomes negligible:

1. The forward pass starts at the cycle entry point and proceeds through the stages.
2. At each stage, the immediate cost is accumulated with cumulative discounting: $d_{1 \to t} \cdot c_{t}$.
3. When the pass reaches the end of the cycle, it wraps back to the cycle start.
4. The pass terminates when either:
   - The cumulative discount factor drops below a tolerance threshold (the remaining contribution is negligible)
   - A maximum number of stages is reached (safety bound, e.g., 240 stages = 20 years for monthly stages)

The max_horizon_length parameter provides the safety bound.
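
The termination logic can be sketched as a simple loop. This is an illustration only: `stage_cost` is a hypothetical callback standing in for the stage subproblem solve, `discount_tolerance` is an assumed parameter name, and the first stage is taken to be undiscounted.

```python
from typing import Callable, Sequence, Tuple


def simulate_forward_pass(
    stage_cost: Callable[[int], float],
    stage_discounts: Sequence[float],
    discount_tolerance: float = 1e-3,
    max_horizon_length: int = 240,
) -> Tuple[float, int]:
    """Accumulate discounted cost over repeated cycles; return (cost, stages visited)."""
    P = len(stage_discounts)
    total_cost = 0.0
    cumulative_discount = 1.0  # first stage is undiscounted (assumption)
    stages_visited = 0
    while (stages_visited < max_horizon_length
           and cumulative_discount >= discount_tolerance):
        season = stages_visited % P + 1  # wraps back to the cycle start
        total_cost += cumulative_discount * stage_cost(season)
        cumulative_discount *= stage_discounts[season - 1]
        stages_visited += 1
    return total_cost, stages_visited


# Toy case: unit cost, discount 0.5 per stage, tolerance 0.1.
print(simulate_forward_pass(lambda s: 1.0, [0.5, 0.5], discount_tolerance=0.1))
# (1.875, 4)

# Monthly stages at a 6% annual rate (compound conversion assumed).
monthly = [1.06 ** (-1.0 / 12)] * 12
_, n_stages = simulate_forward_pass(lambda s: 1.0, monthly)
print(n_stages)  # 240: the safety bound ends the pass before the tolerance does
```

With a 6% annual rate the cumulative discount after 20 years is still about 0.31, so in this setting the safety bound, not the tolerance, determines the length of the pass.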

7 Backward Pass Behavior

The backward pass generates cuts for each season in the cycle:

- Cuts are generated using the same mechanics as finite horizon (see Cut Management).
- Each cut is added to the season’s cut pool, applicable to all stages at that cycle position.
- The backward pass completes a full cycle per iteration.

Convergence of the backward pass: The outer approximation has converged when the value functions are stable across consecutive cycles:

$\max_{τ \in \{1, \dots, P\}} \left( \underline{z}^{k, τ} - \underline{z}^{k - P, τ} \right) < δ_{\text{cycle}}$

where $δ_{\text{cycle}}$ is the cycle_convergence_tolerance parameter.
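
One way to read the criterion is sketched below. It assumes that $\underline{z}^{k, τ}$ is a lower-bound value recorded for season $τ$ at index $k$, and that $k - P$ refers to the record $P$ steps earlier; both the interpretation and the names are assumptions for illustration.

```python
def cycle_converged(lower_bounds, P: int, k: int, tolerance: float) -> bool:
    """Check max over seasons of z[k, tau] - z[k - P, tau] against the tolerance.

    lower_bounds maps (index, season) -> recorded lower-bound value.
    """
    if k - P < 0:
        return False  # not enough history to compare yet
    largest_change = max(
        lower_bounds[(k, tau)] - lower_bounds[(k - P, tau)]
        for tau in range(1, P + 1)
    )
    return largest_change < tolerance


# Two seasons, comparing index 2 against index 0.
z = {(0, 1): 10.0, (0, 2): 8.0, (2, 1): 10.5, (2, 2): 8.25}
print(cycle_converged(z, P=2, k=2, tolerance=1.0))  # True  (largest change is 0.5)
print(cycle_converged(z, P=2, k=2, tolerance=0.4))  # False
```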

8 Fixed-Point Interpretation

The infinite-horizon SDDP finds the fixed point of the Bellman operator:

$V_{τ} = T_{τ} V_{τ + 1}$

where $T_{τ}$ is the one-stage Bellman operator for season $τ$:

$(T_{τ} V)(x) = \mathbb{E}_{ω_{τ}} \left[ \min_{x'} \{ c_{τ}(x', u) + d \cdot V(x') \} \right]$

Convergence is achieved when the value functions at all seasons stabilize, meaning the outer approximation is a sufficiently tight lower bound on the true fixed point.
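
To connect the fixed point with the discount requirement of §4, the relation can be unrolled around one cycle. The sketch below assumes that season indices wrap (so $τ + 1$ means $1$ when $τ = P$), that value functions are bounded, and it writes $d_{τ}$ for the discount factor on the transition leaving season $τ$; this notation is introduced here for illustration only.

```latex
% Unrolling V_tau = T_tau V_{tau+1} once around the cycle:
V_{1} = T_{1} T_{2} \cdots T_{P} V_{1}

% Each one-stage operator is Lipschitz in the sup norm with constant d_tau:
\| T_{\tau} V - T_{\tau} W \|_{\infty} \le d_{\tau} \, \| V - W \|_{\infty}

% Composing around the cycle multiplies the constants:
\| T_{1} \cdots T_{P} V - T_{1} \cdots T_{P} W \|_{\infty}
  \le \Big( \prod_{\tau = 1}^{P} d_{\tau} \Big) \| V - W \|_{\infty}
  = d_{\text{cycle}} \, \| V - W \|_{\infty}
```

Under these assumptions, $d_{\text{cycle}} < 1$ makes the composed operator a contraction, so a unique bounded fixed point exists by the Banach fixed-point theorem.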

9 Reference

Costa, B.F.P., Calixto, A.O., Sousa, R.F.S., Figueiredo, R.T., Penna, D.D.J., Khenayfis, L.S., & Oliveira, A.M.R. (2025). “Boundary conditions for hydrothermal operation planning problems: the infinite horizon approach.” Proceeding Series of the Brazilian Society of Computational and Applied Mathematics, 11(1), 1-7. https://doi.org/10.5540/03.2025.011.01.0355

Cross-References

Discount Rate — Discount factor mechanics, Bellman equation, cumulative discounting
Input Scenarios §1.2 — policy_graph schema with type: "cyclic", annual_discount_rate, and transition definitions
SDDP Algorithm §4.2 — High-level overview of cyclic policy graphs
Cut Management — Cut generation and aggregation (undiscounted cuts, discount on $θ$ )
Stopping Rules — Convergence criteria using discounted bounds
Configuration Reference — Horizon and cycle configuration parameters
