Sampling Scheme Testing and Conformance
Purpose
This spec defines the conformance test suite for the SamplingScheme enum and its three methods (sample_forward, requires_noise_inversion, backward_tree_source), as specified in Sampling Scheme Trait. The suite verifies that all three variants (InSample, External, Historical) produce correct noise vectors, noise inversion flags, and backward tree source indicators against hand-computable reference values. All test cases use small scenarios (3 stages, 2 hydros, 5 openings) so that expected outputs can be verified by manual calculation using the PAR model equations from PAR(p) Inflow Model and the noise inversion procedure from Scenario Generation SS4.3.
The forward-backward separation tests (SS2) are the highest-priority tests in this spec: they verify the invariant that changing the sampling scheme does not alter the backward pass noise vectors. This invariant is the foundation of correctness for iterative optimization algorithms when forward and backward noise sources differ (Sampling Scheme Trait SS5, Scenario Generation SS3.1).
Test cases reference the method contracts from Sampling Scheme Trait SS2, the validation rules S1-S4 from Extension Points SS5.3, and the noise inversion formula from Scenario Generation SS4.3.
SS1. Conformance Test Suite
Test naming convention: test_sampling_{variant}_{method}_{scenario} where {variant} is insample, external, or historical, {method} is sample_forward, requires_noise_inversion, or backward_tree_source, and {scenario} describes the test case.
Shared test fixture: Unless otherwise noted, tests use the following small scenario setup with 3 stages, 2 hydros, and 5 openings.
PAR model parameters (both hydros, all stages use same season for simplicity):
| Parameter | Hydro 0 | Hydro 1 |
|---|---|---|
| $\mu$ (mean, m3/s) | 100.0 | 200.0 |
| $\sigma$ (residual std, m3/s) | 10.0 | 20.0 |
| AR order | 1 | 1 |
| $\psi_1$ (AR coeff, original units) | 0.3 | 0.4 |
| $b = \mu - \psi_1 \mu$ (base value, m3/s) | 70.0 | 120.0 |
Correlation: Identity matrix (no spatial correlation). Each hydro receives independent noise.
Fixed opening tree (5 openings x 3 stages x 2 hydros): Pre-generated noise vectors indexed by opening $j$, stage $t$, and hydro $h$:
| Opening | Stage | Hydro 0 | Hydro 1 |
|---|---|---|---|
| 0 | 0 | -1.2 | 0.5 |
| 0 | 1 | 0.8 | -0.3 |
| 0 | 2 | -0.1 | 1.1 |
| 1 | 0 | 0.3 | -0.7 |
| 1 | 1 | 1.5 | 0.2 |
| 1 | 2 | -0.6 | -1.0 |
| 2 | 0 | 0.0 | 0.0 |
| 2 | 1 | -0.9 | 1.4 |
| 2 | 2 | 0.7 | -0.5 |
| 3 | 0 | 1.1 | 0.8 |
| 3 | 1 | -0.4 | -1.2 |
| 3 | 2 | 0.2 | 0.6 |
| 4 | 0 | -0.8 | 1.3 |
| 4 | 1 | 0.6 | -0.1 |
| 4 | 2 | 1.0 | 0.4 |
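
The fixture table above maps naturally onto a flat array. The following is a minimal sketch, assuming a hypothetical `OpeningTree` type with an opening-major layout (opening, then stage, then hydro); neither the type's fields nor the layout are prescribed by this spec.

```rust
/// Hypothetical flat storage for the fixture opening tree.
/// Layout assumption: index = (opening * n_stages + stage) * n_hydros + hydro,
/// which matches the row order of the fixture table.
pub struct OpeningTree {
    n_stages: usize,
    n_hydros: usize,
    values: Vec<f64>,
}

impl OpeningTree {
    /// Builds the 5 x 3 x 2 fixture tree from the table values.
    pub fn fixture() -> Self {
        let values = vec![
            -1.2, 0.5, 0.8, -0.3, -0.1, 1.1, // opening 0, stages 0..=2
            0.3, -0.7, 1.5, 0.2, -0.6, -1.0, // opening 1
            0.0, 0.0, -0.9, 1.4, 0.7, -0.5, // opening 2
            1.1, 0.8, -0.4, -1.2, 0.2, 0.6, // opening 3
            -0.8, 1.3, 0.6, -0.1, 1.0, 0.4, // opening 4
        ];
        OpeningTree { n_stages: 3, n_hydros: 2, values }
    }

    /// Number of openings implied by the stored length.
    pub fn n_openings(&self) -> usize {
        self.values.len() / (self.n_stages * self.n_hydros)
    }

    /// Returns the noise slice (one value per hydro) for (opening, stage).
    pub fn noise(&self, opening: usize, stage: usize) -> &[f64] {
        assert!(opening < self.n_openings() && stage < self.n_stages);
        let start = (opening * self.n_stages + stage) * self.n_hydros;
        &self.values[start..start + self.n_hydros]
    }
}
```

With this layout, an InSample lookup for a drawn opening index is a single slice operation, for example `OpeningTree::fixture().noise(2, 0)` for opening 2 at stage 0.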
External scenario data (3 external scenarios x 3 stages x 2 hydros): Inflow values in m3/s:
| Scenario | Stage | Hydro 0 inflow | Hydro 1 inflow |
|---|---|---|---|
| 0 | 0 | 105.0 | 210.0 |
| 0 | 1 | 98.0 | 225.0 |
| 0 | 2 | 112.0 | 195.0 |
| 1 | 0 | 88.0 | 180.0 |
| 1 | 1 | 110.0 | 240.0 |
| 1 | 2 | 95.0 | 205.0 |
| 2 | 0 | 102.0 | 215.0 |
| 2 | 1 | 107.0 | 190.0 |
| 2 | 2 | 99.0 | 220.0 |
Historical inflow data (3 years x 3 stages x 2 hydros): Same structure as external scenarios but loaded from inflow_history.parquet. For this fixture, historical data uses the same values as external scenario data above.
Lag initialization: All lag values are initialized to the respective hydro mean ($\mu$: 100.0 for hydro 0, 200.0 for hydro 1).
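Noise inversion reference (External scenario 0, stage 0): with target inflow $a$, lag inflow $a_{\text{lag}}$, mean $\mu$, AR coefficient $\psi_1$, residual std $\sigma$, and base value $b = \mu - \psi_1 \mu$ (70.0 for hydro 0, 120.0 for hydro 1), the inverted noise is $\varepsilon = (a - b - \psi_1 a_{\text{lag}}) / \sigma$.
For hydro 0: $\varepsilon = (105.0 - 70.0 - 0.3 \times 100.0) / 10.0 = 0.5$
For hydro 1: $\varepsilon = (210.0 - 120.0 - 0.4 \times 200.0) / 20.0 = 0.5$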
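
The single-stage inversion for this fixture can be sketched as a small helper. This is an illustration only: the `Par1` type and its field names are hypothetical, and the base value is assumed to be the mean minus the AR coefficient times the mean, which holds here because every stage shares one season. The authoritative formula is the one in Scenario Generation SS4.3.

```rust
/// PAR(1) parameters for one hydro, in original units (m3/s).
/// Hypothetical type for illustration; not part of the specified API.
#[derive(Clone, Copy)]
pub struct Par1 {
    pub mean: f64,         // mean inflow
    pub residual_std: f64, // residual standard deviation
    pub ar_coeff: f64,     // order-1 AR coefficient, original units
}

impl Par1 {
    /// Base value, assumed to be mean - ar_coeff * mean
    /// (valid when all stages share the same season, as in the fixture).
    pub fn base(&self) -> f64 {
        self.mean - self.ar_coeff * self.mean
    }

    /// Inverts a target inflow into a noise term given the lag-1 inflow.
    pub fn invert(&self, inflow: f64, lag: f64) -> f64 {
        (inflow - self.base() - self.ar_coeff * lag) / self.residual_std
    }
}
```

For external scenario 0 at stage 0, calling `invert(105.0, 100.0)` with the hydro 0 parameters and `invert(210.0, 200.0)` with the hydro 1 parameters should both give 0.5 up to floating-point rounding.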
Seed configuration: Base seed = 42. Deterministic seed derivation: seed(iteration, scenario_index, stage_id) = hash(42, iteration, scenario_index, stage_id). The exact hash function is implementation-defined, but given fixed inputs the derived seed must be identical across MPI ranks.
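
Since the hash function is implementation-defined, the sketch below uses 64-bit FNV-1a purely as a stand-in. The function name `derive_seed`, the byte order, and the choice of FNV-1a are all assumptions made for illustration. The property that matters is that the result depends only on the four inputs.

```rust
/// Derives a seed from (base_seed, iteration, scenario_index, stage_id).
/// FNV-1a (64-bit) over little-endian bytes is an assumption for this sketch;
/// the spec leaves the actual hash function implementation-defined.
pub fn derive_seed(base_seed: u64, iteration: u64, scenario_index: u64, stage_id: u64) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for component in [base_seed, iteration, scenario_index, stage_id] {
        // Fixed little-endian encoding keeps the result platform-independent.
        for byte in component.to_le_bytes() {
            hash ^= u64::from(byte);
            hash = hash.wrapping_mul(PRIME);
        }
    }
    hash
}
```

Because no rank or thread identifier enters the computation, any rank that evaluates `derive_seed(42, iteration, scenario_index, stage_id)` obtains the same value.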
SS1.1 sample_forward Conformance
| Test Name | Input Scenario | Expected Observable Behavior | Variant |
|---|---|---|---|
| test_sampling_insample_sample_forward_basic | Shared fixture. InSample with seed=42. Call sample_forward(stage_id=0, scenario_index=0, rng) where rng is initialized from seed(iteration=0, scenario_index=0, stage_id=0). Suppose the RNG draws index $j = 2$ from $\{0, 1, 2, 3, 4\}$. | Returns NoiseVector { values: [0.0, 0.0] }, which is the opening tree row for opening 2, stage 0. The noise values are looked up directly from the opening tree – no noise generation or inversion occurs during the forward pass. | InSample |
| test_sampling_insample_sample_forward_different_stage | Shared fixture. InSample with seed=42. Call sample_forward(stage_id=1, scenario_index=0, rng) where rng draws index $j = 0$. | Returns NoiseVector { values: [0.8, -0.3] }, which is the opening tree row for opening 0, stage 1. Each stage draws an independent opening index. | InSample |
| test_sampling_external_sample_forward_random | Shared fixture. External with selection_mode = Random. Call sample_forward(stage_id=0, scenario_index=0, rng) where rng selects external scenario 0. Noise inversion at stage 0 with lag $= \mu$ and base values 70.0 (hydro 0) and 120.0 (hydro 1): hydro 0: $(105.0 - 70.0 - 0.3 \times 100.0) / 10.0 = 0.5$, hydro 1: $(210.0 - 120.0 - 0.4 \times 200.0) / 20.0 = 0.5$. | Returns NoiseVector { values: [0.5, 0.5] }. The external inflow values are inverted to noise terms via the PAR model. The returned vector has length 2 (one per hydro). | External |
| test_sampling_external_sample_forward_sequential | Shared fixture. External with selection_mode = Sequential. Call sample_forward(stage_id=0, scenario_index=1, rng). Sequential mode assigns scenario $1 \bmod 3 = 1$. Noise inversion for external scenario 1, stage 0 with lag $= \mu$: hydro 0: $(88.0 - 70.0 - 0.3 \times 100.0) / 10.0 = -1.2$, hydro 1: $(180.0 - 120.0 - 0.4 \times 200.0) / 20.0 = -1.0$. | Returns NoiseVector { values: [-1.2, -1.0] }. Sequential mode deterministically assigns external scenario index without using the RNG. The rng parameter is unused in sequential mode. | External |
| test_sampling_external_sample_forward_stage_chain | Shared fixture. External with selection_mode = Sequential, scenario_index=0 (selects external scenario 0). Invoke sample_forward for all 3 stages sequentially, updating lags after each stage. Stage 0: hydro 0 inversion = 0.5, hydro 1 inversion = 0.5 (computed above). Now update lags to the stage-0 inflows: hydro 0 lag $= 105.0$, hydro 1 lag $= 210.0$. Stage 1: hydro 0: $(98.0 - 70.0 - 0.3 \times 105.0) / 10.0 = -0.35$, hydro 1: $(225.0 - 120.0 - 0.4 \times 210.0) / 20.0 = 1.05$. Update lags: hydro 0 lag $= 98.0$, hydro 1 lag $= 225.0$. Stage 2: hydro 0: $(112.0 - 70.0 - 0.3 \times 98.0) / 10.0 = 1.26$, hydro 1: $(195.0 - 120.0 - 0.4 \times 225.0) / 20.0 = -0.75$. | Stage 0: NoiseVector { values: [0.5, 0.5] }. Stage 1: NoiseVector { values: [-0.35, 1.05] }. Stage 2: NoiseVector { values: [1.26, -0.75] }. Noise inversion proceeds sequentially through stages because each stage’s lag depends on the previous stage’s target inflow. The full 3-stage chain produces 6 concrete noise values. | External |
| test_sampling_historical_sample_forward_basic | Shared fixture. Historical variant. Call sample_forward(stage_id=0, scenario_index=0, rng). Historical variant replays year 0 (mapped via season definitions). The inflow values for year 0 are identical to external scenario 0 in the shared fixture. Noise inversion at stage 0: hydro 0: $(105.0 - 70.0 - 0.3 \times 100.0) / 10.0 = 0.5$, hydro 1: $(210.0 - 120.0 - 0.4 \times 200.0) / 20.0 = 0.5$. | Returns NoiseVector { values: [0.5, 0.5] }. Historical inflows are inverted using the same PAR inversion formula as External. With identical inflow values and PAR parameters, the result is identical to the External variant for the same data. | Historical |
| test_sampling_insample_sample_forward_single_opening | Modified fixture: 1 opening instead of 5. Opening tree has only opening 0 with noise $[-1.2, 0.5]$ at stage 0. InSample with seed=42. Call sample_forward(stage_id=0, scenario_index=0, rng). | Returns NoiseVector { values: [-1.2, 0.5] }. With a single opening, the sampled index is always $j = 0$. The RNG draw is deterministic but trivial – only one valid index exists. | InSample |
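
The stage-chain case in the table above hinges on the lag update between stages. A minimal sketch of that loop follows, assuming an order-1 model, lags initialized to the hydro mean, and a season-independent base value of mean minus AR coefficient times mean. The function name `invert_chain` is hypothetical.

```rust
/// Inverts a multi-stage inflow trajectory for one hydro into noise terms.
/// Assumes an order-1 model whose base value is mean - psi * mean,
/// as in the shared fixture where all stages share one season.
pub fn invert_chain(mean: f64, sigma: f64, psi: f64, inflows: &[f64]) -> Vec<f64> {
    let base = mean - psi * mean;
    let mut lag = mean; // lag initialization: the hydro mean
    let mut noise = Vec::with_capacity(inflows.len());
    for &inflow in inflows {
        noise.push((inflow - base - psi * lag) / sigma);
        lag = inflow; // the target inflow becomes the next stage's lag
    }
    noise
}
```

Applied to external scenario 0, `invert_chain(100.0, 10.0, 0.3, &[105.0, 98.0, 112.0])` corresponds to the hydro 0 column of the chain and `invert_chain(200.0, 20.0, 0.4, &[210.0, 225.0, 195.0])` to the hydro 1 column.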
SS1.2 requires_noise_inversion Conformance
| Test Name | Input Scenario | Expected Observable Behavior | Variant |
|---|---|---|---|
| test_sampling_insample_requires_noise_inversion | InSample variant with seed=42. | Returns false. InSample operates directly on pre-generated noise vectors from the opening tree. No inversion of raw inflow values is needed. | InSample |
| test_sampling_external_requires_noise_inversion | External variant with selection_mode = Random. | Returns true. External scenarios provide raw inflow values (m3/s) that must be inverted to noise terms ($\varepsilon$) via the PAR model before use in the stage LP. | External |
| test_sampling_historical_requires_noise_inversion | Historical variant. | Returns true. Historical inflows are raw values that must be inverted to noise terms, following the same inversion procedure as External. | Historical |
SS1.3 backward_tree_source Conformance
| Test Name | Input Scenario | Expected Observable Behavior | Variant |
|---|---|---|---|
| test_sampling_insample_backward_tree_source | InSample variant with seed=42. | Returns BackwardTreeSource::UserProvidedPAR. The opening tree is generated from the user-supplied PAR parameters (inflow_seasonal_stats.parquet and inflow_ar_coefficients.parquet). | InSample |
| test_sampling_external_backward_tree_source | External variant with selection_mode = Random. | Returns BackwardTreeSource::FittedToExternalData. The opening tree is generated from a PAR model fitted to the external scenario data (external_scenarios.parquet), so backward branchings reflect the statistical properties of the forward scenarios. | External |
| test_sampling_historical_backward_tree_source | Historical variant. | Returns BackwardTreeSource::FittedToHistoricalData. The opening tree is generated from a PAR model fitted to the historical inflow data (inflow_history.parquet). | Historical |
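
The expectations in SS1.2 and SS1.3 amount to two total functions over the three variants. The sketch below is a simplified illustration: the variant payloads are reduced (the External variant here omits its scenario data), and the exact definitions belong to Sampling Scheme Trait.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SelectionMode {
    Random,
    Sequential,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackwardTreeSource {
    UserProvidedPAR,
    FittedToExternalData,
    FittedToHistoricalData,
}

/// Simplified stand-in for the real enum: payloads are reduced for brevity.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SamplingScheme {
    InSample { seed: u64 },
    External { selection_mode: SelectionMode },
    Historical,
}

impl SamplingScheme {
    /// Only InSample works on pre-generated noise; the others hold raw inflows.
    pub fn requires_noise_inversion(&self) -> bool {
        !matches!(self, SamplingScheme::InSample { .. })
    }

    /// Maps each variant to the data its backward opening tree is built from.
    pub fn backward_tree_source(&self) -> BackwardTreeSource {
        match self {
            SamplingScheme::InSample { .. } => BackwardTreeSource::UserProvidedPAR,
            SamplingScheme::External { .. } => BackwardTreeSource::FittedToExternalData,
            SamplingScheme::Historical => BackwardTreeSource::FittedToHistoricalData,
        }
    }
}
```

An exhaustive `match` means that adding a fourth variant fails to compile until its backward tree source is decided, which is one reason to prefer it over a wildcard arm.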
SS1.4 Edge Cases
| Test Name | Input Scenario | Expected Observable Behavior | Variant |
|---|---|---|---|
| test_sampling_insample_sample_forward_single_stage | Modified fixture: 1 stage, 2 hydros, 5 openings. InSample with seed=42. Call sample_forward(stage_id=0, scenario_index=0, rng) where RNG draws index $j = 3$. | Returns NoiseVector { values: [1.1, 0.8] } (opening 3, stage 0 from the shared fixture opening tree). Single-stage operation is a valid degenerate case – the forward pass visits exactly one stage per trajectory. | InSample |
| test_sampling_external_sample_forward_sequential_cycling | Shared fixture with 3 external scenarios. External with selection_mode = Sequential. Call sample_forward(stage_id=0, scenario_index=4, rng). Sequential mode selects scenario $4 \bmod 3 = 1$. Noise inversion for external scenario 1, stage 0 with base values 70.0 (hydro 0) and 120.0 (hydro 1): hydro 0: $(88.0 - 70.0 - 0.3 \times 100.0) / 10.0 = -1.2$, hydro 1: $(180.0 - 120.0 - 0.4 \times 200.0) / 20.0 = -1.0$. | Returns NoiseVector { values: [-1.2, -1.0] }. When scenario_index exceeds the external scenario count, sequential mode wraps around using modular arithmetic. Scenario 4 maps to the same data as scenario 1. | External |
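
The wrap-around rule for sequential mode is a single modulo. A minimal sketch, with the hypothetical helper name `sequential_scenario`:

```rust
/// Maps a forward scenario index onto an external scenario in sequential mode.
/// The helper name is hypothetical; the wrap-around rule is the one tested above.
pub fn sequential_scenario(scenario_index: usize, n_external: usize) -> usize {
    assert!(n_external > 0, "at least one external scenario is required");
    scenario_index % n_external
}
```

With 3 external scenarios, indices 1 and 4 both select external scenario 1, so the two sequential tests expect the same noise vector.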
SS2. Forward-Backward Separation Tests
These tests verify the most critical invariant of the sampling scheme abstraction: the backward pass ALWAYS uses the fixed opening tree, regardless of the forward sampling scheme (Sampling Scheme Trait SS5, Scenario Generation SS3.1, Extension Points SS5.4).
The invariant implies that changing the forward sampling scheme (e.g., from InSample to External) must NOT alter the backward pass noise vectors. Only the forward pass noise source changes; the backward pass continues to evaluate all branchings from the same fixed opening tree.
Test approach: Run a complete SDDP backward pass at a fixed trial point under two different sampling scheme configurations. Collect the backward pass noise vectors (one per opening per stage). Verify that the two configurations produce identical backward noise vectors.
| Test Name | Input Scenario | Expected Observable Behavior |
|---|---|---|
| test_sampling_forward_backward_separation_insample_vs_external | Shared fixture. Run two configurations: (A) InSample with seed=42, (B) External with selection_mode = Random and the shared fixture external scenarios. For both configurations, execute a backward pass at stage 1 with a fixed trial point (any fixed values). The backward pass evaluates all 5 openings, solving one subproblem per opening with noise from the opening tree. | The backward pass noise vectors at stage 1 are identical under both configurations: opening 0 gets $[0.8, -0.3]$, opening 1 gets $[1.5, 0.2]$, opening 2 gets $[-0.9, 1.4]$, opening 3 gets $[-0.4, -1.2]$, opening 4 gets $[0.6, -0.1]$. These are the stage-1 rows from the fixed opening tree, unchanged regardless of forward sampling scheme. |
| test_sampling_forward_backward_separation_insample_vs_historical | Same as above but comparing (A) InSample with seed=42 against (C) Historical variant with the shared fixture historical data. Execute a backward pass at stage 2 evaluating all 5 openings. | The backward pass noise vectors at stage 2 are identical under both configurations: opening 0 gets $[-0.1, 1.1]$, opening 1 gets $[-0.6, -1.0]$, opening 2 gets $[0.7, -0.5]$, opening 3 gets $[0.2, 0.6]$, opening 4 gets $[1.0, 0.4]$. The forward sampling scheme (InSample vs Historical) does not affect the backward pass opening tree. |
| test_sampling_forward_backward_separation_cut_coefficients | Shared fixture. Run two full SDDP iterations (forward + backward) with (A) InSample seed=42 and (B) External selection_mode = Sequential. After both runs complete iteration 1, inspect the cut coefficients (intercept and slopes) generated at each stage. The cuts are generated from backward pass subproblem duals, which depend only on the opening tree noise and the trial point. Since the trial points may differ (because forward paths differ), compare only the backward pass noise inputs, not the cut values themselves. | At each stage in the backward pass, the noise vectors used for cut generation are the same 5-opening tree rows under both (A) and (B). Configuration (B) produces different forward trial points (because the forward noise comes from external data), but the backward pass noise vectors are invariant. This separation is what allows SDDP to generate valid cuts when the forward and backward noise sources differ. |
SS3. Reproducibility Tests
These tests verify that InSample forward sampling with the same seed produces identical scenarios regardless of MPI rank assignment, restart, or thread ordering (Scenario Generation SS2.2, Sampling Scheme Trait SS2.1 deterministic output postcondition).
| Test Name | Input Scenario | Expected Observable Behavior |
|---|---|---|
| test_sampling_insample_reproducibility_same_seed | Shared fixture. InSample with seed=42. Run sample_forward(stage_id=0, scenario_index=0, rng) twice, each time initializing rng from the same derived seed seed(iteration=0, scenario_index=0, stage_id=0). | Both calls return identical NoiseVector values. The deterministic seed derivation from (iteration, scenario_index, stage_id) ensures that the same tuple always produces the same RNG state and therefore the same sampled opening index. |
| test_sampling_insample_reproducibility_cross_rank | Shared fixture. InSample with seed=42. Simulate two MPI configurations: (A) 1 rank processes all 5 scenarios at stage 0, (B) 2 ranks process scenarios 0-2 and 3-4 respectively. For each scenario index $s \in \{0, \dots, 4\}$, call sample_forward(stage_id=0, scenario_index=s, rng_s) where rng_s is derived from seed(iteration=0, s, 0). | Configuration (A) and (B) produce identical NoiseVector for every scenario index. The seed derivation depends on (iteration, scenario_index, stage_id) only – not on MPI rank, rank count, or which rank processes which scenario. Scenario 3 produces the same noise whether processed by rank 0 (in config A) or rank 1 (in config B). |
| test_sampling_insample_reproducibility_different_seeds | Shared fixture. Run InSample with seed=42 and InSample with seed=99, both calling sample_forward(stage_id=0, scenario_index=0, rng). | The two calls return different NoiseVector values (with overwhelming probability). Different base seeds produce different derived seeds, which produce different RNG states and different sampled opening indices. This confirms that the seed parameter meaningfully controls the random sequence. |
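
The cross-rank test can be simulated without MPI by iterating over a list of per-rank scenario assignments. The sketch below assumes SplitMix64 both as the mixing function for seed derivation and as the RNG; both are stand-ins, since the spec fixes neither. All function names are hypothetical.

```rust
/// One SplitMix64 step; stands in for the real RNG (an assumption).
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9e37_79b9_7f4a_7c15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xbf58_476d_1ce4_e5b9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94d0_49bb_1331_11eb);
    z ^ (z >> 31)
}

/// Hypothetical seed derivation: folds the tuple into the base seed.
/// No rank or thread identifier is an input.
fn derive_seed(base_seed: u64, iteration: u64, scenario_index: u64, stage_id: u64) -> u64 {
    let mut state = base_seed;
    for part in [iteration, scenario_index, stage_id] {
        let mixed = splitmix64(&mut state);
        state = mixed ^ part;
    }
    state
}

/// Draws an opening index in 0..n_openings (modulo bias ignored in this sketch).
pub fn sample_opening(
    base_seed: u64,
    iteration: u64,
    scenario_index: u64,
    stage_id: u64,
    n_openings: u64,
) -> u64 {
    let mut rng = derive_seed(base_seed, iteration, scenario_index, stage_id);
    splitmix64(&mut rng) % n_openings
}

/// Simulates a rank layout: each inner Vec lists the scenarios one rank owns.
/// Returns the opening drawn for each scenario at iteration 0, stage 0.
pub fn run_layout(layout: &[Vec<u64>], n_scenarios: usize) -> Vec<Option<u64>> {
    let mut drawn = vec![None; n_scenarios];
    for rank_scenarios in layout {
        for &s in rank_scenarios {
            drawn[s as usize] = Some(sample_opening(42, 0, s, 0, 5));
        }
    }
    drawn
}
```

Comparing `run_layout(&[vec![0, 1, 2, 3, 4]], 5)` with `run_layout(&[vec![0, 1, 2], vec![3, 4]], 5)` mirrors configurations (A) and (B): the two results are equal because the layout never reaches the seed derivation.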
SS4. Validation Tests
These tests verify that the validation rules S1-S4 from Extension Points SS5.3 and Sampling Scheme Trait SS6 are correctly enforced during configuration loading.
| Test Name | Input Scenario | Expected Observable Behavior | Rule |
|---|---|---|---|
| test_sampling_validate_s1_missing_seed | stages.json contains "scenario_source": { "sampling_scheme": "in_sample" } with no seed field. | Configuration loading rejects with SamplingSchemeValidationError::MissingSeed. The InSample variant requires an explicit seed for reproducibility – implicit or random seeds are not allowed. | S1 |
| test_sampling_validate_s2_missing_external_file | stages.json contains "scenario_source": { "sampling_scheme": "external", "selection_mode": "random" } but the input directory does not contain external_scenarios.parquet. | Configuration loading rejects with SamplingSchemeValidationError::MissingExternalScenarioFile { expected_path: "<input_dir>/scenarios/external_scenarios.parquet" }. The External variant cannot function without the external scenario data file. | S2 |
| test_sampling_validate_s3_missing_historical_file | stages.json contains "scenario_source": { "sampling_scheme": "historical" } but the input directory does not contain inflow_history.parquet. | Configuration loading rejects with SamplingSchemeValidationError::MissingHistoricalInflowFile { expected_path: "<input_dir>/scenarios/inflow_history.parquet" }. The Historical variant cannot function without the historical inflow data file. | S3 |
| test_sampling_validate_s4_invalid_selection_mode | stages.json contains "scenario_source": { "sampling_scheme": "external", "selection_mode": "weighted" } with the required external_scenarios.parquet present. | Configuration loading rejects with SamplingSchemeValidationError::InvalidSelectionMode { value: "weighted" }. Only "random" and "sequential" are valid selection modes for the External variant. | S4 |
| test_sampling_validate_s1_seed_present_passes | stages.json contains "scenario_source": { "sampling_scheme": "in_sample", "seed": 42 }. | Configuration loading succeeds. Returns a valid SamplingScheme::InSample { seed: 42 }. The seed is present and valid. | S1 |
| test_sampling_validate_s2_external_file_present_passes | stages.json contains "scenario_source": { "sampling_scheme": "external", "selection_mode": "random" } and external_scenarios.parquet exists in the input directory with valid schema. | Configuration loading succeeds. Returns a valid SamplingScheme::External { scenarios: ..., selection_mode: Random }. Both the file and selection mode are valid. | S2 |
| test_sampling_validate_s4_selection_mode_sequential_passes | stages.json contains "scenario_source": { "sampling_scheme": "external", "selection_mode": "sequential" } with valid external_scenarios.parquet. | Configuration loading succeeds. Returns a valid SamplingScheme::External { scenarios: ..., selection_mode: Sequential }. The "sequential" value is one of the two accepted selection modes. | S4 |
| test_sampling_validate_s4_selection_mode_default | stages.json contains "scenario_source": { "sampling_scheme": "external" } (no selection_mode field) with valid external_scenarios.parquet. | Configuration loading succeeds with selection_mode defaulting to Random. The selection_mode field has a default value of "random" per Sampling Scheme Trait SS3.1. | S4 |
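
Rules S1-S4 can be sketched as one validation function over an already-parsed view of the scenario_source object. Several things here are assumptions for illustration: the `ScenarioSource` and `SchemeKind` types, the injected `file_exists` callback, the reduced `External` payload, and the choice to check S2 before S4. The error variants and expected paths follow the table above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SelectionMode {
    Random,
    Sequential,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SchemeKind {
    InSample,
    External,
    Historical,
}

/// Simplified stand-in: the real External variant also carries scenario data.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SamplingScheme {
    InSample { seed: u64 },
    External { selection_mode: SelectionMode },
    Historical,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SamplingSchemeValidationError {
    MissingSeed,
    MissingExternalScenarioFile { expected_path: String },
    MissingHistoricalInflowFile { expected_path: String },
    InvalidSelectionMode { value: String },
}

/// Hypothetical, already-parsed view of the scenario_source JSON object.
pub struct ScenarioSource<'a> {
    pub kind: SchemeKind,
    pub seed: Option<u64>,
    pub selection_mode: Option<&'a str>,
}

pub fn validate(
    src: &ScenarioSource<'_>,
    input_dir: &str,
    file_exists: impl Fn(&str) -> bool,
) -> Result<SamplingScheme, SamplingSchemeValidationError> {
    match src.kind {
        // S1: InSample requires an explicit seed.
        SchemeKind::InSample => src
            .seed
            .map(|seed| SamplingScheme::InSample { seed })
            .ok_or(SamplingSchemeValidationError::MissingSeed),
        SchemeKind::External => {
            // S2: the external scenario file must exist (checked first here; an assumption).
            let path = format!("{}/scenarios/external_scenarios.parquet", input_dir);
            if !file_exists(&path) {
                return Err(SamplingSchemeValidationError::MissingExternalScenarioFile {
                    expected_path: path,
                });
            }
            // S4: only "random" (the default) and "sequential" are accepted.
            match src.selection_mode {
                None | Some("random") => Ok(SamplingScheme::External {
                    selection_mode: SelectionMode::Random,
                }),
                Some("sequential") => Ok(SamplingScheme::External {
                    selection_mode: SelectionMode::Sequential,
                }),
                Some(other) => Err(SamplingSchemeValidationError::InvalidSelectionMode {
                    value: other.to_string(),
                }),
            }
        }
        // S3: the historical inflow file must exist.
        SchemeKind::Historical => {
            let path = format!("{}/scenarios/inflow_history.parquet", input_dir);
            if file_exists(&path) {
                Ok(SamplingScheme::Historical)
            } else {
                Err(SamplingSchemeValidationError::MissingHistoricalInflowFile {
                    expected_path: path,
                })
            }
        }
    }
}
```

Injecting `file_exists` as a closure keeps the sketch testable without touching the filesystem; a real loader would presumably query the input directory directly.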
SS5. Parallel Generation Tests
These tests verify the communication-free parallel noise generation model documented in Scenario Generation SS2.2b. They complement the reproducibility tests in SS3 by explicitly varying MPI rank count, thread count, and opening tree generation configurations, and verifying that results are bit-identical across configurations. These tests validate the architectural claim that deterministic seed derivation eliminates the need for inter-rank communication during noise generation.
Test naming convention: test_sampling_parallel_{component}_{scenario} where {component} is noise, opening_tree, or forward_pass, and {scenario} describes the parallel configuration being tested.
| Test Name | Input Scenario | Expected Observable Behavior |
|---|---|---|
| test_sampling_parallel_noise_rank_independence | Shared fixture from SS1. InSample with seed=42. Two MPI configurations: (A) 1 rank processes all 5 scenarios at stage 0, (B) 4 ranks process scenarios [0-1], [2-3], [4], [] respectively. For each scenario index $s$, generate the noise vector at stage 0 using seed(iteration=0, s, stage_id=0). | Configurations (A) and (B) produce identical NoiseVector for every scenario index. The noise for scenario $s$ depends only on the tuple (base_seed=42, iteration=0, s, stage_id=0) – not on rank count, rank assignment, or which rank generates it. This is a direct consequence of the communication-free noise generation model (Scenario Generation SS2.2b). |
| test_sampling_parallel_noise_thread_independence | Shared fixture. InSample with seed=42. Single rank, two thread configurations: (A) 1 Rayon worker thread processes all 5 scenarios, (B) 4 Rayon worker threads process scenarios dynamically via work-stealing. Generate noise vectors for all 5 scenarios at all 3 stages. | Configurations (A) and (B) produce identical NoiseVector for every (scenario, stage) pair. Thread ID does not appear in the seed derivation – only (base_seed, iteration, scenario_index, stage_id). The work-stealing scheduler may assign scenarios to different threads across runs, but the deterministic seed ensures identical outputs. |
| test_sampling_parallel_opening_tree_rank_equivalence | Shared fixture with 5 openings, 3 stages, 2 hydros, base seed=42. Two generation configurations: (A) single rank generates all 5 openings for all 3 stages, (B) 2 ranks generate openings [0-2] and [3-4] respectively (contiguous block assignment per Work Distribution §3.1). Each opening’s noise is generated from seed(base_seed=42, opening_index, stage). | The resulting OpeningTree is bit-identical under both configurations. For each (opening $j$, stage $t$, hydro $h$), the noise value matches the shared fixture opening tree table from SS1. Multi-rank generation is equivalent to single-rank generation because the seed depends on (base_seed, opening_index, stage) only (Scenario Generation SS2.3c). |
| test_sampling_parallel_opening_tree_thread_equivalence | Same as test_sampling_parallel_opening_tree_rank_equivalence but within a single rank. Two thread configurations: (A) 1 thread generates all 5 openings, (B) 3 threads generate openings [0-1], [2-3], [4] respectively. Openings are the outer loop; stages are the inner loop per opening (per Scenario Generation SS2.3c). | Bit-identical OpeningTree under both configurations. Each thread writes to a disjoint portion of the opening tree; the result is independent of thread assignment. The per-opening seed derivation ensures that the noise for opening $j$ at stage $t$ is fully determined by (base_seed, j, t) regardless of which thread computes it. |
| test_sampling_parallel_forward_pass_multi_config | Shared fixture. InSample with seed=42. Run a complete forward pass (all 3 stages, 5 scenarios) under three configurations: (A) 1 rank / 1 thread, (B) 2 ranks / 1 thread each, (C) 1 rank / 4 threads. At each (scenario, stage), the forward pass invokes sample_forward with the deterministically seeded RNG. | All three configurations produce identical forward pass results: same sampled opening indices, same noise vectors, same LP solutions (assuming deterministic LP solver). The forward pass noise path is invariant to MPI rank count and thread count because the seed derivation depends only on (base_seed, iteration, scenario_index, stage_id) (Scenario Generation SS2.2b). |
| test_sampling_parallel_external_noise_inversion_rank_independence | Shared fixture. External variant with selection_mode = Sequential. Two configurations: (A) 1 rank processes all 5 scenarios, (B) 2 ranks process scenarios [0-2] and [3-4] respectively. Generate inverted noise for all 3 stages of scenario 0 (using the external scenario data and the noise inversion formula from Scenario Generation SS4.3). | Both configurations produce identical inverted noise vectors for scenario 0. The noise inversion depends on (external_scenario_data, PAR_parameters, scenario_index, stage_id) – not on rank assignment. Scenario 0 produces the same 3-stage chain regardless of which rank processes it: stage 0 = [0.5, 0.5], stage 1 = [-0.35, 1.05], stage 2 = [1.26, -0.75] (values from test_sampling_external_sample_forward_stage_chain in SS1.1). |
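
The thread-equivalence property for opening tree generation can be sketched with scoped threads from the standard library. This sketch deliberately uses `std::thread::scope` rather than Rayon, uses SplitMix64 as a stand-in generator producing uniform values instead of the real noise distribution, and assumes contiguous blocks of ceil(openings / threads) openings per thread. All names are hypothetical.

```rust
/// One SplitMix64 step; stands in for the real generator (an assumption).
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9e37_79b9_7f4a_7c15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xbf58_476d_1ce4_e5b9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94d0_49bb_1331_11eb);
    z ^ (z >> 31)
}

/// Hypothetical per-(opening, stage) seed; no thread or rank id is an input.
fn opening_seed(base_seed: u64, opening: u64, stage: u64) -> u64 {
    let mut state = base_seed;
    for part in [opening, stage] {
        let mixed = splitmix64(&mut state);
        state = mixed ^ part;
    }
    state
}

/// Fills one opening's block: stages are the inner loop, as in the spec.
fn fill_opening(base_seed: u64, opening: usize, n_stages: usize, n_hydros: usize, block: &mut [f64]) {
    for stage in 0..n_stages {
        let mut rng = opening_seed(base_seed, opening as u64, stage as u64);
        for hydro in 0..n_hydros {
            // Uniform in [0, 1) as a placeholder for the real noise draw.
            let u = (splitmix64(&mut rng) >> 11) as f64 / (1u64 << 53) as f64;
            block[stage * n_hydros + hydro] = u;
        }
    }
}

/// Generates the tree with `n_threads` threads, each owning a contiguous
/// block of openings and writing to a disjoint slice of the output.
pub fn generate_tree(
    base_seed: u64,
    n_openings: usize,
    n_stages: usize,
    n_hydros: usize,
    n_threads: usize,
) -> Vec<f64> {
    let block_len = n_stages * n_hydros;
    assert!(n_openings > 0 && block_len > 0 && n_threads > 0);
    let mut tree = vec![0.0; n_openings * block_len];
    let per_thread = (n_openings + n_threads - 1) / n_threads; // ceil division
    std::thread::scope(|scope| {
        for (chunk_idx, chunk) in tree.chunks_mut(per_thread * block_len).enumerate() {
            scope.spawn(move || {
                for (k, block) in chunk.chunks_mut(block_len).enumerate() {
                    let opening = chunk_idx * per_thread + k;
                    fill_opening(base_seed, opening, n_stages, n_hydros, block);
                }
            });
        }
    });
    tree
}
```

With 5 openings, `generate_tree(42, 5, 3, 2, 3)` assigns openings [0-1], [2-3], [4] to three threads, and the result can be compared element by element with `generate_tree(42, 5, 3, 2, 1)`. Using `chunks_mut` gives each thread exclusive access to its slice, so no locking is needed.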
Cross-References
- Sampling Scheme Trait – Enum definition (SS1), method contracts for sample_forward (SS2.1), requires_noise_inversion (SS2.2), backward_tree_source (SS2.3), supporting types (SS3), forward-backward separation invariant (SS5), validation rules (SS6)
- Scenario Generation – Three orthogonal concerns (SS3.1), forward sampling schemes (SS3.2), backward sampling (SS3.4), opening tree (SS2.3), noise inversion formula (SS4.3), PAR fitting from external data (SS4.2), reproducible sampling (SS2.2)
- Extension Points – Sampling scheme variant table (SS5.1), configuration mapping (SS5.2), validation rules S1-S4 (SS5.3), forward-backward separation invariant (SS5.4), variant selection pipeline (SS6)
- PAR(p) Inflow Model – PAR model definition (SS1), parameter set (SS2), residual std derivation (SS3), inflow computation formula
- Input Scenarios SS2.1 – scenario_source JSON schema: sampling_scheme, seed, selection_mode
- Input Scenarios SS2.5 – external_scenarios.parquet schema: stage_id, scenario_id, hydro_id, value_m3s
- Risk Measure Testing – Sibling conformance test spec following the same table format pattern
- Horizon Mode Testing – Sibling conformance test spec following the same table format pattern
- Backend Testing – Conformance test suite structure and requirements table format (reference pattern for this spec)
- Scenario Generation SS2.2b – Communication-free parallel noise generation work distribution (tested by SS5)
- Work Distribution §3.1 – Contiguous block assignment formula used in parallel generation tests (SS5)