Configuration Reference

Purpose

This spec provides a comprehensive mapping between Cobre configuration options and their effects on LP subproblem construction and solver behavior. Configuration is split across two files: config.json (solver-side settings) and stages.json (temporal structure, per-stage variants, scenario configuration). This spec documents all options, their valid values, LP effects, and links to the defining math or data model spec.

Decision DEC-018 (active): MPI/HPC parameters removed from config.json — all are auto-detected implementation details or contradicted by approved architecture.

For the file layout and config.json schema overview, see Input Directory Structure §2. For the stages.json schema, see Input Scenarios.

1. Configuration File Split

| File | Scope | Where Defined |
|---|---|---|
| `config.json` | Solver behavior: modeling options, training parameters, simulation, scenario source, I/O | Input Directory Structure §2 |
| `stages.json` | Temporal structure: stages, blocks, `block_mode`, policy graph, risk measure, per-stage `num_scenarios` | Input Scenarios |

Design rationale: Settings that are inherently per-stage (block mode, risk measure) live in stages.json alongside the stage definitions. Settings that are global solver parameters (training iteration count, cut selection, inflow non-negativity method, upper bound evaluation, scenario source) live in config.json.

1.1 CLI Presentation Settings

The output format (--output-format human|json|json-lines) is a CLI flag, not a configuration parameter. It is intentionally excluded from config.json because:

  1. Output format is a presentation concern, not a computation concern. It does not affect solver behavior, random seeds, convergence criteria, or output files on disk. Mixing presentation settings with algorithm configuration would violate separation of concerns.

  2. The same case may be run with different output formats depending on the consumer: human for interactive HPC sessions, json for CI/CD pipelines, json-lines for agent monitoring. These are per-invocation choices, not per-case choices.

  3. Reproducibility: Two runs with the same config.json and inputs must produce identical results regardless of output format. If output format were in config.json, it would appear in the configuration hash (data_integrity.config_hash in Output Infrastructure §2), creating spurious hash differences.

Similarly, the --quiet and --no-progress flags are CLI-only. They suppress output but do not change what is computed or written to disk. See CLI and Lifecycle §3.1 for the global CLI flags and Structured Output §5 for the output format negotiation.

2. Modeling Options (config.json → modeling)

2.1 Inflow Non-Negativity Treatment

| Option | Value | LP Effect | Reference |
|---|---|---|---|
| `modeling.inflow_non_negativity.method` | `"none"` | No slack; may cause infeasibility | Inflow Non-Negativity |
| `modeling.inflow_non_negativity.method` | `"penalty"` | Add slack with penalty | Inflow Non-Negativity |
| `modeling.inflow_non_negativity.method` | `"truncation"` | Pre-truncate in scenario generation | Inflow Non-Negativity |
| `modeling.inflow_non_negativity.method` | `"truncation_with_penalty"` | Noise adjustment slack | Inflow Non-Negativity |
| `modeling.inflow_non_negativity.penalty_cost` | float | Penalty coefficient (default: 1000) | Inflow Non-Negativity |

| Method | Math Formulation | LP Variables Added |
|---|---|---|
| `none` | Direct AR output | None |
| `penalty` | See Inflow Non-Negativity | `inflow_slack` |
| `truncation` | See Inflow Non-Negativity | None |
| `truncation_with_penalty` | See Inflow Non-Negativity | `noise_slack` |
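
For example, a minimal `modeling` block selecting the penalty method with its default coefficient (the same shape as the complete example in §8):

```json
{
  "modeling": {
    "inflow_non_negativity": {
      "method": "penalty",
      "penalty_cost": 1000.0
    }
  }
}
```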

2.2 Penalty Coefficients

Penalty defaults are defined in penalties.json, not config.json. The penalty system uses a three-tier cascade: global defaults (penalties.json) → entity overrides (entity registry files) → stage overrides (constraints/penalty_overrides_*.parquet). See Penalty System for the full specification.

The penalty categories are:

| Category | Purpose | Examples |
|---|---|---|
| Recourse slacks | LP feasibility | `deficit_segments`, `excess_cost` |
| Constraint violations | Policy shaping | `storage_violation_below_cost`, `filling_target_violation_cost`, `outflow_violation_*_cost` |
| Regularization | Solution guidance | `spillage_cost`, `fpha_turbined_cost`, `exchange_cost`, `curtailment_cost` |

Priority ordering: Filling target > Storage violation > Deficit > Constraint violations > Resource costs > Regularization.
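
For orientation only, a `penalties.json` global-defaults fragment might look like the sketch below. The field names come from the category table above, but the value types and nesting are hypothetical — the authoritative schema is in the Penalty System spec:

```json
{
  "deficit_segments": [
    { "depth_fraction": 0.05, "cost": 5000.0 }
  ],
  "storage_violation_below_cost": 2500.0,
  "spillage_cost": 0.001,
  "exchange_cost": 0.005
}
```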

3. Training Options (config.json → training)

3.1 Training Enabled

| Option | Type | Default | Description |
|---|---|---|---|
| `training.enabled` | bool | `true` | When `false`, skip the Training phase and proceed directly to Simulation. See CLI and Lifecycle §5 for lifecycle phases. |

3.1a Seed and Cut Formulation

| Option | Type | Default | Description |
|---|---|---|---|
| `training.tree_seed` | int or null | `42` | Random seed for reproducible opening tree generation. When `null`, uses the default seed (42). Negative values are interpreted as their absolute value. |
| `training.cut_formulation` | string or null | `null` | Cut formulation variant: `"single"` (one aggregated cut per stage per iteration) or `"multi"` (one cut per forward-pass scenario per stage). When `null`, the solver selects the default. |

3.2 Forward Pass Count

| Option | Type | Default | Description |
|---|---|---|---|
| `training.forward_passes` | int | mandatory (no default) | Number of scenario trajectories per iteration |

The loader must reject any configuration that omits training.forward_passes. This field has no default because the optimal value depends on the problem structure and the available MPI ranks; an absent value would silently produce incorrect cut pool sizing (see Cut Management Implementation §1.3).
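
A minimal sketch of that loader rule (illustrative only — not the actual Cobre loader API):

```python
# Hypothetical validation mirroring the rule above: a missing
# training.forward_passes is a hard error, never a silent default.
def validate_forward_passes(training: dict) -> int:
    if "forward_passes" not in training:
        raise ValueError("training.forward_passes is mandatory and has no default")
    n = training["forward_passes"]
    if not isinstance(n, int) or n < 1:
        raise ValueError("training.forward_passes must be a positive integer")
    return n
```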

3.3 Cut Selection

| Option | Value | LP Effect | Reference |
|---|---|---|---|
| `training.cut_selection.enabled` | bool | Enable/disable cut pruning | Cut Management |
| `training.cut_selection.method` | `"level1"` | Keep ever-active cuts | Cut Management §7 |
| `training.cut_selection.method` | `"lml1"` | Limited-memory level-1 | Cut Management §7 |
| `training.cut_selection.method` | `"domination"` | Remove dominated cuts | Cut Management §7 |
| `training.cut_selection.threshold` | int | Strategy-specific threshold (default 0). For `level1`: activity count threshold (u64). For `lml1`: memory window in iterations (u64). For `domination`: near-binding tolerance epsilon (f64). | Cut Selection Strategy Trait §5 |
| `training.cut_selection.check_frequency` | int | Iterations between pruning checks (must be > 0, default 5) | Cut Management Implementation §2 |
| `training.cut_selection.cut_activity_tolerance` | float (default `1e-6`) | Minimum dual multiplier value for a cut to be considered binding (active). Used by all selection strategies. | Cut Management Implementation §6.1 |

| Method | Algorithm |
|---|---|
| `level1` | Keep cuts that were binding at least once (deactivate when `active_count <= threshold`) |
| `lml1` | Keep the most recently active cuts within a memory window (deactivate when `current_iteration - last_active_iter > threshold`) |
| `domination` | Remove cuts dominated at all visited forward-pass states (a cut is dominated when no visited state shows it within `threshold` of the best active cut) |
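
The three deactivation predicates can be sketched as follows (illustrative pseudologic, not the Cobre implementation; function names are hypothetical):

```python
# level1: a cut is deactivated once its lifetime binding count is at
# or below the threshold (threshold 0 keeps every ever-binding cut).
def level1_deactivate(active_count: int, threshold: int) -> bool:
    return active_count <= threshold

# lml1: a cut is deactivated when it has not been binding within the
# last `threshold` iterations (a sliding memory window).
def lml1_deactivate(current_iteration: int, last_active_iter: int, threshold: int) -> bool:
    return current_iteration - last_active_iter > threshold

# domination: a cut is deactivated when at every visited state its
# value trails the best active cut by more than `threshold`
# (gap = best_cut_value - this_cut_value at that state).
def domination_deactivate(gaps_at_visited_states: list, threshold: float) -> bool:
    return all(gap > threshold for gap in gaps_at_visited_states)
```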

3.4 Stopping Rules

| Option | Type | Description |
|---|---|---|
| `training.stopping_rules` | array | List of stopping rule configurations |
| `training.stopping_mode` | string | `"any"` (OR logic) or `"all"` (AND logic) |

Available rule types:

| Rule Type | Parameters | Description | Reference |
|---|---|---|---|
| `iteration_limit` | `limit: int` | Maximum iteration count (mandatory) | Stopping Rules §2 |
| `time_limit` | `seconds: int` | Wall-clock time limit | Stopping Rules §3 |
| `bound_stalling` | `iterations: int`, `tolerance: float` | LB relative improvement below tolerance | Stopping Rules §4 |
| `simulation` | `replications`, `period`, `bound_window`, `distance_tol`, `bound_tol` | Policy stability via Monte Carlo simulation | Stopping Rules §5 |
```json
{
  "training": {
    "stopping_rules": [
      { "type": "iteration_limit", "limit": 100 },
      { "type": "bound_stalling", "iterations": 10, "tolerance": 0.0001 }
    ],
    "stopping_mode": "any"
  }
}
```

3.5 Solver Retry Configuration

| Option | Type | Default | Description |
|---|---|---|---|
| `training.solver.retry_max_attempts` | int | `5` | Maximum number of solver retry attempts before propagating a hard error. |
| `training.solver.retry_time_budget_seconds` | float | `30.0` | Total time budget (seconds) across all retry attempts for a single LP solve. |

The retry escalation strategy — which solver parameters are varied across attempts — is encapsulated within each solver implementation and is not user-configurable. The external config parameters above control only the stopping conditions for the retry loop. See Solver Interface Trait §6 and Solver Abstraction §7 for the encapsulation contract.
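
A `training.solver` block spelling out the documented defaults:

```json
{
  "training": {
    "solver": {
      "retry_max_attempts": 5,
      "retry_time_budget_seconds": 30.0
    }
  }
}
```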

4. Upper Bound Evaluation (config.json → upper_bound_evaluation)

| Option | Value | LP Effect | Reference |
|---|---|---|---|
| `upper_bound_evaluation.enabled` | bool | Enable vertex-based inner approximation | Upper Bound Evaluation |
| `upper_bound_evaluation.initial_iteration` | int | First iteration to compute UB | Upper Bound Evaluation |
| `upper_bound_evaluation.interval_iterations` | int | Iterations between UB evaluations | Upper Bound Evaluation |
| `upper_bound_evaluation.lipschitz.mode` | `"auto"` | Auto-compute Lipschitz constants | Upper Bound Evaluation |
| `upper_bound_evaluation.lipschitz.fallback_value` | float | Fallback when auto-computation fails | Upper Bound Evaluation |
| `upper_bound_evaluation.lipschitz.scale_factor` | float | Multiplicative safety margin | Upper Bound Evaluation |
```json
{
  "upper_bound_evaluation": {
    "enabled": true,
    "initial_iteration": 10,
    "interval_iterations": 5,
    "lipschitz": {
      "mode": "auto",
      "fallback_value": 10000.0,
      "scale_factor": 1.1
    }
  }
}
```

5. Horizon and Discount (stages.json → policy_graph)

The horizon mode and discount rate are derived from the policy_graph structure in stages.json. See Input Scenarios §1.2 for the full schema and Extension Points §4 for variant selection.

| Option | Value | LP Effect | Reference |
|---|---|---|---|
| `policy_graph.type` | `"finite_horizon"` | Terminal value | SDDP Algorithm §4.1 |
| `policy_graph.type` | `"cyclic"` | Cycle detection, cut sharing by season | Infinite Horizon |
| `policy_graph.annual_discount_rate` | float | Per-transition factor | Discount Rate |
| `policy_graph.transitions[].annual_discount_rate` | float | Per-transition override | Input Scenarios §1.2 |

Discount rate conversion: The annual_discount_rate is converted to a per-transition discount factor:

factor = (1 + annual_discount_rate)^(−Δt)

where Δt is the source stage duration in years. A rate of 0.0 means no discounting (factor = 1).
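
As a worked example, assuming the standard compounding form factor = (1 + rate)^(−Δt) — a reconstruction; the Discount Rate spec is authoritative:

```python
# Hypothetical helper: annual rate -> per-transition discount factor.
def transition_discount_factor(annual_discount_rate: float, dt_years: float) -> float:
    return (1.0 + annual_discount_rate) ** (-dt_years)

# 6%/year over a one-month stage barely discounts each transition...
monthly = transition_discount_factor(0.06, 1.0 / 12.0)  # ~0.9952
# ...while a full-year stage discounts by 1/1.06.
annual = transition_discount_factor(0.06, 1.0)          # ~0.9434
# A rate of 0.0 disables discounting entirely.
assert transition_discount_factor(0.0, 1.0 / 12.0) == 1.0
```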

Finite horizon example:

```json
{
  "policy_graph": {
    "type": "finite_horizon",
    "annual_discount_rate": 0.06,
    "transitions": [
      { "source_id": 0, "target_id": 1, "probability": 1.0 },
      { "source_id": 1, "target_id": 2, "probability": 1.0 }
    ]
  }
}
```

Cyclic (infinite periodic) horizon:

```json
{
  "policy_graph": {
    "type": "cyclic",
    "annual_discount_rate": 0.06,
    "transitions": [
      { "source_id": 0, "target_id": 1, "probability": 1.0 },
      { "source_id": 59, "target_id": 48, "probability": 1.0 }
    ]
  }
}
```

Validation: For cyclic graphs, the cumulative discount around a cycle must be strictly less than 1 (see Extension Points §4.3).
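
A sketch of that check, assuming the per-transition factor (1 + rate)^(−Δt) and a cycle discount equal to the product of factors around the cycle — the normative rule lives in Extension Points §4.3:

```python
# Hypothetical validation: the discount accumulated around one full
# cycle must stay strictly below 1 for the infinite horizon to be
# well-defined.
def cycle_discount(annual_discount_rate: float, stage_durations_years: list) -> float:
    total = 1.0
    for dt in stage_durations_years:
        total *= (1.0 + annual_discount_rate) ** (-dt)
    return total

def validate_cyclic(annual_discount_rate: float, stage_durations_years: list) -> None:
    if cycle_discount(annual_discount_rate, stage_durations_years) >= 1.0:
        raise ValueError("cyclic policy graph requires cumulative cycle discount < 1")

# Twelve monthly stages at 6%/year discount one cycle by 1/1.06.
validate_cyclic(0.06, [1.0 / 12.0] * 12)
```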

6. Per-Stage Options (stages.json → stages[])

These settings vary by stage. Each is configured in the stage object within stages.json. See Input Scenarios §1.4 for the full stage field table.

6.1 Block Mode

| Option | Value | LP Effect | Reference |
|---|---|---|---|
| `block_mode` | `"parallel"` | Single water balance per stage, averaged generation | Block Formulations |
| `block_mode` | `"chronological"` | Per-block storage variables, sequential water balance | Block Formulations |

Default: "parallel". Can vary by stage (e.g., chronological for near-term, parallel for distant stages).

6.2 Risk Measure

| Option | Value | LP Effect | Reference |
|---|---|---|---|
| `risk_measure` | `"expectation"` | Risk-neutral expected value | Risk Measures §1 |
| `risk_measure` | `{"cvar": {"alpha": ..., "lambda": ...}}` | Convex combination of expectation and CVaR | Risk Measures §3 |

Default: "expectation". Can vary by stage (e.g., higher risk aversion for near-term stages). See Extension Points §2 for variant selection and validation rules.

6.3 Scenario Source and Sampling Scheme

The scenario_source object configures how scenarios are selected during the SDDP forward pass. It lives in config.json under training.scenario_source (for training) and simulation.scenario_source (for simulation). When simulation.scenario_source is absent, it falls back to training.scenario_source. See Scenario Generation §3 for the full abstraction and Extension Points §5 for variant selection.

Each stochastic class (inflow, load, NCS) has its own sampling scheme, configured via per-class sub-objects. The ScenarioSource struct groups the three per-class schemes with a shared seed and optional historical year selection:

| Option | Value | Effect | Reference |
|---|---|---|---|
| `scenario_source.inflow.scheme` | `"in_sample"` | Inflow forward pass samples from the fixed opening tree | Scenario Generation §3 |
| `scenario_source.inflow.scheme` | `"out_of_sample"` | Inflow forward pass draws from independently generated noise | Scenario Generation §3 |
| `scenario_source.inflow.scheme` | `"external"` | Inflow forward pass draws from user-provided per-class scenario data | Scenario Generation §4 |
| `scenario_source.inflow.scheme` | `"historical"` | Inflow forward pass replays historical inflow sequences | Scenario Generation §3 |
| `scenario_source.load.scheme` | (same 4 variants) | Load class sampling scheme | Scenario Generation §3 |
| `scenario_source.ncs.scheme` | (same 4 variants) | NCS class sampling scheme | Scenario Generation §3 |
| `scenario_source.seed` | i64 | Base seed for reproducible noise generation (required for `in_sample`) | Input Scenarios §2.1 |
| `scenario_source.historical_years` | array of i32 | Specific historical years for the `historical` scheme (optional) | Input Scenarios §2.1 |

| Sampling Scheme | Forward Noise Source | Backward Noise Source |
|---|---|---|
| `in_sample` | Opening tree (PAR-generated) | Same opening tree |
| `out_of_sample` | Independently generated Monte Carlo noise | Opening tree from same PAR model |
| `external` | User-provided per-class scenario values | Opening tree from PAR fitted to external data |
| `historical` | Historical records mapped to stages | Opening tree from PAR fitted to history |
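
An example mixing schemes per class (field names follow the table above; the scheme combination and year list are illustrative):

```json
{
  "training": {
    "scenario_source": {
      "seed": 42,
      "inflow": { "scheme": "historical" },
      "load": { "scheme": "in_sample" },
      "ncs": { "scheme": "in_sample" },
      "historical_years": [1985, 1990, 2001]
    }
  }
}
```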

6.4 Opening Tree Size

| Option | Location | Default | Effect | Reference |
|---|---|---|---|---|
| `num_scenarios` | stages.json → stages[i] | — | Number of backward-pass noise branchings for stage i | Scenario Generation §2.3 |

The opening tree is generated once before training and remains fixed throughout. Larger values improve cut quality but increase backward pass cost linearly. Each stage can have a different branching factor — this is required for complete tree mode where near-term stages use num_scenarios: 1 and the final stage uses the full branching count.
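
For instance, a complete-tree configuration might look like the excerpt below (stage objects abbreviated to the relevant field; all other required stage fields omitted):

```json
{
  "stages": [
    { "id": 0, "num_scenarios": 1 },
    { "id": 1, "num_scenarios": 1 },
    { "id": 2, "num_scenarios": 20 }
  ]
}
```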

Deferred: Monte Carlo backward sampling — sample noise terms per backward step. See Deferred Features §C.14.

6.5 Production Function

The hydro production model is configured per-hydro (not globally) in hydros.json, with optional per-stage selection in hydro_production_models.json. See Input Hydro Extensions §2.

| Model | LP Effect | Phase | Reference |
|---|---|---|---|
| `constant_productivity` | Fixed productivity coefficient | Training + Simulation | Hydro Production Models §1 |
| `fpha` | Piecewise-linear head approximation | Training + Simulation | Hydro Production Models §2 |
| `linearized_head` | Bilinear head term (linearized) | Simulation only | Hydro Production Models §3 |

7. Simulation Options (config.json → simulation)

The simulation section controls the optional post-training simulation phase and its I/O behavior.

7.1 Core Simulation Settings

| Option | Type | Default | Description |
|---|---|---|---|
| `simulation.enabled` | bool | `false` | Enable the simulation phase after training. |
| `simulation.num_scenarios` | int | `2000` | Number of independent Monte Carlo simulation scenarios to evaluate. |
| `simulation.policy_type` | string | `"outer"` | Policy representation for simulation. `"outer"` uses the cut pool (Benders cuts). |
| `simulation.output_path` | string | `null` | Optional custom directory for simulation output files. |
| `simulation.output_mode` | string | `null` | Output mode: `"streaming"` or `"batched"`. |

When simulation.enabled is false or num_scenarios is 0, the simulation phase is skipped entirely.

7.2 Scenario Source (Simulation)

The simulation phase uses simulation.scenario_source when present, otherwise falls back to training.scenario_source. The format is identical to the training scenario source (see §6.3) – a per-class object with inflow, load, and ncs sub-objects, plus optional seed and historical_years.
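
For example, a run can train in-sample and simulate out-of-sample by overriding only the simulation source (schemes and seeds illustrative):

```json
{
  "training": {
    "scenario_source": {
      "seed": 42,
      "inflow": { "scheme": "in_sample" },
      "load": { "scheme": "in_sample" },
      "ncs": { "scheme": "in_sample" }
    }
  },
  "simulation": {
    "scenario_source": {
      "seed": 1042,
      "inflow": { "scheme": "out_of_sample" },
      "load": { "scheme": "in_sample" },
      "ncs": { "scheme": "in_sample" }
    }
  }
}
```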

7.3 I/O Options

| Option | Type | Default | Description | Reference |
|---|---|---|---|---|
| `simulation.io_channel_capacity` | int | `64` | Bounded channel capacity between simulation threads and the I/O writer thread. Higher values reduce backpressure under fast simulation / slow disk, at the cost of increased peak memory. | Output Infrastructure §6.2 |
```json
{
  "simulation": {
    "enabled": true,
    "num_scenarios": 2000,
    "policy_type": "outer",
    "io_channel_capacity": 128
  }
}
```

7a. Policy Options (config.json → policy)

Controls policy persistence: checkpoint saving and warm-start loading.

| Option | Type | Default | Description |
|---|---|---|---|
| `policy.path` | string | `"./policy"` | Directory where policy data (cuts, states, vertices, basis) is stored. |
| `policy.mode` | `"fresh"`, `"warm_start"`, or `"resume"` | `"fresh"` | Initialization mode. `"fresh"` starts from scratch; `"warm_start"` loads cuts from a previous run; `"resume"` continues an interrupted run. |
| `policy.validate_compatibility` | bool | `true` | When loading a policy, verify that entity counts, stage counts, and cut dimensions match the current system. |

7a.1 Checkpointing

Nested under policy.checkpointing:

| Option | Type | Default | Description |
|---|---|---|---|
| `policy.checkpointing.enabled` | bool | `false` | Enable periodic checkpointing during training. |
| `policy.checkpointing.initial_iteration` | int | `null` | First iteration to write a checkpoint. |
| `policy.checkpointing.interval_iterations` | int | `null` | Iterations between checkpoints. |
| `policy.checkpointing.store_basis` | bool | `false` | Include the LP basis in checkpoints for warm-start. |
| `policy.checkpointing.compress` | bool | `false` | Compress checkpoint files. |
```json
{
  "policy": {
    "path": "./policy",
    "mode": "warm_start",
    "validate_compatibility": true,
    "checkpointing": {
      "enabled": true,
      "initial_iteration": 10,
      "interval_iterations": 20,
      "store_basis": true
    }
  }
}
```

7b. Export Options (config.json → exports)

Controls which outputs are written to the results directory.

| Option | Type | Default | Description |
|---|---|---|---|
| `exports.training` | bool | `true` | Write training convergence data (Parquet). |
| `exports.cuts` | bool | `true` | Write the cut pool (FlatBuffers). |
| `exports.states` | bool | `false` | Write visited state vectors (Parquet). |
| `exports.vertices` | bool | `true` | Write inner-approximation vertices when applicable (Parquet). |
| `exports.simulation` | bool | `true` | Write per-entity simulation results (Parquet). |
| `exports.forward_detail` | bool | `false` | Write per-scenario forward-pass detail (large; disabled by default). |
| `exports.backward_detail` | bool | `false` | Write per-scenario backward-pass detail (large; disabled by default). |
| `exports.stochastic` | bool | `false` | Export stochastic preprocessing artifacts to `output/stochastic/`. |
| `exports.compression` | `"zstd"`, `"lz4"`, or null | `null` | Output Parquet compression algorithm. `null` uses the crate default (zstd). |
```json
{
  "exports": {
    "training": true,
    "cuts": true,
    "states": true,
    "simulation": true,
    "stochastic": false,
    "compression": "zstd"
  }
}
```

7c. Estimation Options (config.json → estimation)

Controls the PAR(p) model estimation pipeline. When the case provides inflow_history.parquet, Cobre can automatically estimate AR coefficients instead of requiring pre-computed inflow_ar_coefficients.parquet.

| Option | Type | Default | Description |
|---|---|---|---|
| `estimation.max_order` | int | `6` | Maximum lag order considered during autoregressive model fitting. |
| `estimation.order_selection` | string | `"pacf"` | Order selection criterion: `"pacf"` (PACF-based). `"fixed"` is deprecated and remaps to `"pacf"` at load time. |
| `estimation.min_observations_per_season` | int | `30` | Minimum observations per (entity, season) group to proceed with estimation. |
| `estimation.max_coefficient_magnitude` | float | `null` | Safety net: reduce to order 0 if any coefficient exceeds this magnitude. |
```json
{
  "estimation": {
    "max_order": 6,
    "order_selection": "pacf",
    "min_observations_per_season": 30
  }
}
```

8. Complete Example

config.json

```json
{
  "modeling": {
    "inflow_non_negativity": {
      "method": "penalty",
      "penalty_cost": 1000.0
    }
  },
  "training": {
    "enabled": true,
    "tree_seed": 42,
    "forward_passes": 10,
    "cut_formulation": "single",
    "cut_selection": {
      "enabled": true,
      "method": "level1",
      "threshold": 0,
      "check_frequency": 10
    },
    "stopping_rules": [
      { "type": "iteration_limit", "limit": 100 },
      { "type": "bound_stalling", "iterations": 10, "tolerance": 0.0001 }
    ],
    "stopping_mode": "any",
    "scenario_source": {
      "seed": 42,
      "inflow": { "scheme": "in_sample" },
      "load": { "scheme": "in_sample" },
      "ncs": { "scheme": "in_sample" }
    }
  },
  "upper_bound_evaluation": {
    "enabled": true,
    "initial_iteration": 10,
    "interval_iterations": 5
  },
  "policy": {
    "path": "./policy",
    "mode": "fresh",
    "validate_compatibility": true
  },
  "simulation": {
    "enabled": true,
    "num_scenarios": 2000,
    "policy_type": "outer"
  },
  "exports": {
    "training": true,
    "cuts": true,
    "states": false,
    "simulation": true,
    "stochastic": false,
    "compression": "zstd"
  },
  "estimation": {
    "max_order": 6,
    "order_selection": "pacf",
    "min_observations_per_season": 30
  }
}
```

stages.json (excerpt — per-stage options)

```json
{
  "policy_graph": {
    "type": "finite_horizon",
    "annual_discount_rate": 0.06,
    "transitions": [
      { "source_id": 0, "target_id": 1, "probability": 1.0 },
      { "source_id": 1, "target_id": 2, "probability": 1.0 }
    ]
  },
  "stages": [
    {
      "id": 0,
      "start_date": "2024-01-01",
      "end_date": "2024-02-01",
      "season_id": 0,
      "block_mode": "chronological",
      "risk_measure": { "cvar": { "alpha": 0.95, "lambda": 0.5 } },
      "num_scenarios": 20,
      "blocks": [
        { "id": 0, "name": "LEVE", "hours": 168 },
        { "id": 1, "name": "MEDIA", "hours": 336 },
        { "id": 2, "name": "PESADA", "hours": 240 }
      ],
      "state_variables": { "storage": true, "inflow_lags": true }
    },
    {
      "id": 1,
      "start_date": "2024-02-01",
      "end_date": "2024-03-01",
      "season_id": 1,
      "block_mode": "parallel",
      "risk_measure": "expectation",
      "num_scenarios": 20,
      "blocks": [{ "id": 0, "name": "UNICO", "hours": 696 }],
      "state_variables": { "storage": true, "inflow_lags": true }
    }
  ]
}
```

Note: block_mode, risk_measure, num_scenarios, and state_variables vary by stage. scenario_source is in config.json under training.scenario_source (global for the run). policy_graph defines the horizon mode and discount rate. See Input Scenarios for the full schema.

9. Formulation-to-Configuration Mapping

This table maps each mathematical formulation to its configuration source and data files.

| Formulation Topic | Config Location | Data Files | Spec Reference |
|---|---|---|---|
| Block Formulation | stages.json → stages[].block_mode | Stage.blocks[], per-block water balance | Block Formulations |
| Production Functions | hydros.json + hydro_production_models.json | fpha_hyperplanes.parquet, hydro_geometry.parquet | Hydro Production Models |
| PAR(p) Model | scenarios/inflow_seasonal_stats.parquet, inflow_ar_coefficients.parquet | Seasonal means/std, AR coefficients per (hydro, stage, lag) | PAR Inflow Model |
| Non-Negativity | config.json → modeling.inflow_non_negativity | LP slack variables, penalty coefficients | Inflow Non-Negativity |
| Cut Generation | N/A (runtime) | Cut intercept/coefficients, dual extraction | Cut Management |
| Cut Selection | config.json → training.cut_selection | Cut activity tracking | Cut Management |
| Stopping Rules | config.json → training.stopping_rules[] | Convergence metrics | Stopping Rules |
| Discount Rate | stages.json → policy_graph.annual_discount_rate | Per-transition discount factor, cut scaling | Discount Rate |
| Horizon Mode | stages.json → policy_graph.type | Policy graph cycle detection, cut sharing | SDDP Algorithm §4 |
| Inner Approximation | config.json → upper_bound_evaluation | Vertex storage, Lipschitz constants | Upper Bound Evaluation |
| Risk-Averse CVaR | stages.json → stages[].risk_measure | Risk-adjusted probability computation | Risk Measures |
| Scenario Source | config.json → training.scenario_source | Opening tree, per-class external scenario files, inflow_history.parquet | Scenario Generation §3 |
| Penalty System | penalties.json + entity registries + constraints/penalty_overrides_*.parquet | Three-tier cascade: global → entity → stage overrides | Penalty System |

10. Variable Correspondence

| Quantity | Source | Type |
|---|---|---|
| Hydro storage | `hydros.json` → `storage` | `f64` |
| Incoming storage (state) | Internal state vector | `Vec<f64>` |
| Incremental inflow | `inflow_seasonal_stats.parquet` (`mean_m3s`) | `f64` |
| AR coefficients | `inflow_ar_coefficients.parquet` → `coefficient` | `f64` |
| Residual std dev | Computed from seasonal statistics and AR coefficients at runtime | `f64` |
| Future cost variable | LP variable | `f64` |
| Cut intercept | FlatBuffers policy data (see Binary Formats §3) | `f64` |
| Cut coefficients | FlatBuffers policy data (see Binary Formats §3) | `Vec<f64>` |
| Discount factor | `stages.json` → `policy_graph.annual_discount_rate` | `f64` |
| Lipschitz constant | Computed from penalties | `f64` |

Cross-References