Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

PAR(p) Autoregressive Models

What Is a PAR(p) Model?

A Periodic Autoregressive model of order p (PAR(p)) is a time series model designed for data with strong seasonal patterns. It extends the classical autoregressive (AR) model by allowing every parameter to vary by season — the coefficients that govern January inflows are different from those that govern July inflows.

The “order p” indicates how many past time steps the model looks back. A PAR(3) model for a given month predicts the current inflow using the inflows from the previous three months. The order can differ by season: January might need only one lag while April might need four, reflecting different hydrological dynamics across the year.

The PAR(p) Equation

For hydro plant at stage falling in season , the PAR(p) model is:

In words: the inflow at stage equals the seasonal mean, plus a weighted combination of how much recent inflows deviated from their seasonal means, plus a random shock.

Parameters by Season

Each season has its own set of parameters:

ParameterSymbolRole
Seasonal meanExpected inflow for season
AR coefficientsWeights on past deviations from the mean
Residual stdScale of the random innovation
InnovationStandardized random shock

The seasonal mean and sample standard deviation are estimated from historical data. The AR coefficients are fitted using the Yule-Walker equations (see below). The residual standard deviation is derived at runtime from the other parameters (it is not stored independently).

How Lags Become State Variables

In the SDDP framework, decisions at each stage depend on a set of state variables that summarize everything the optimizer needs to know from the past. For the PAR(p) model, the state variables are the lagged inflows:

where is the reservoir volume and are the lagged inflows needed by the autoregressive equation. Each lag adds one state variable per hydro plant to the SDDP subproblem.

This is significant for problem size: a system with 150 hydro plants and a maximum PAR order of 6 adds up to state variables beyond the reservoir volumes. The LP formulation includes constraints that “shift” lagged inflows forward from one stage to the next, ensuring the autoregressive structure is respected across the Bellman recursion.

Stored vs. Computed Quantities

Cobre stores the natural outputs of the fitting process:

  • Stored: seasonal means (), seasonal sample standard deviations (), AR order (), and AR coefficients in original units ()
  • Computed at runtime: the residual standard deviation , derived from the stored quantities to guarantee consistency

This design avoids redundancy — is fully determined by the other parameters and recomputing it is inexpensive.

Yule-Walker Fitting Procedure

When fitting PAR(p) parameters from historical inflow data, the AR coefficients are estimated by solving the Yule-Walker equations — a linear system that relates the autocorrelations of the data to the model coefficients. The procedure has six steps.

Implementation status: As of v0.1.1, this full fitting procedure is implemented in cobre-stochastic’s estimation module. Steps 1–5 are carried out by the seasonal statistics, autocorrelation, and AR coefficient estimators; Step 6 selects the model order via AIC before the final coefficients are computed.

Step 1 — Seasonal Statistics

For each season , compute the sample mean and standard deviation from historical observations :

where is the number of historical observations for season .

Step 2 — Seasonal Autocorrelations

Compute the cross-seasonal autocorrelation at lag for season . The cross-seasonal structure arises because lag at season reaches back to season (cyclically):

Note that is the standard deviation of season , not of season . This is the defining feature of a periodic (as opposed to stationary) autoregressive model.

Step 3 — Yule-Walker System

For each season , the coefficients in standardized form satisfy:

where:

The solution is:

The matrix is not a standard Toeplitz matrix (because consecutive rows use different seasons’ correlations), but it has a similar structure. The correlation matrix must be positive definite for the solution to exist; if not, the historical record may be too short for the requested order.

Levinson-Durbin recursion: In the implementation, the Yule-Walker system is solved using the Levinson-Durbin recursion rather than direct matrix inversion. This algorithm exploits the near-Toeplitz structure of to solve the system in time and space, compared to for a general LU factorization. For the per-season orders typical in hydro studies (), the difference is small in absolute terms, but the recursion also produces intermediate partial autocorrelation coefficients that are useful for diagnosing model fit.

Step 4 — Residual Standard Deviation

After solving the Yule-Walker system, the residual standard deviation for season is:

This equals times the square root of the unexplained variance fraction. If , the model overfits — it explains all historical variance, leaving no room for the noise term.

Step 5 — Convert to Original Units

The Yule-Walker solution yields coefficients in standardized form (dimensionless, relating standardized deviations). The LP requires original-unit coefficients:

These are computed once at initialization and used directly as LP constraint matrix entries.

Step 6 — Model Order Selection (AIC)

Before Steps 3–5 are applied at the final model order, the implementation selects the order for each season using the Akaike Information Criterion (AIC). For a candidate order , the AIC is:

where is the number of historical observations for the season and is the residual variance obtained by solving the Yule-Walker system at order . The first term penalizes poor fit (large residual variance); the second term penalizes model complexity (many lags). The selected order minimizes the criterion over the candidate range:

where is the user-specified maximum order. Order corresponds to a white-noise model (no autoregressive structure); its residual variance equals the sample variance .

Implementation: This criterion is implemented in select_ar_order_aic in the cobre-stochastic estimation module. The function evaluates for each candidate order and returns the minimizer. BIC and coefficient significance tests are recognized alternatives but are not implemented in v0.1.1.

Key Properties

  • Periodicity: All parameters vary by season, matching the strong seasonality of hydrological data.
  • Parsimony: The model order is selected per season using the AIC criterion (implemented in v0.1.1 via select_ar_order_aic). BIC and coefficient significance tests are recognized alternatives but are deferred.
  • Stationarity: Fitted models are validated to ensure the AR process does not diverge — the characteristic polynomial roots must lie outside the unit circle.
  • Positive residual variance: After fitting, must hold for all seasons. A zero or negative residual variance indicates overfitting.

Further Reading


See also: For implementation details and usage, see the Stochastic Modeling guide in the software book.