Cut Selection Testing and Conformance

Purpose

This spec defines the conformance test suite for the CutSelectionStrategy enum and its three methods (should_run, select, update_activity), as specified in Cut Selection Strategy Trait. The suite verifies that the three variants (Level1, LML1, Dominated) correctly identify cuts for deactivation according to their respective selection criteria, that the aggressiveness ordering Level1 <= LML1 <= Dominated holds on identical cut pools, and that the convergence-safe variants (Level1, LML1) never deactivate a currently binding cut. All test cases use small cut pools (5 cuts, 3 visited states) so that expected outputs can be verified by hand calculation using the activity definitions from Cut Management SS6-SS7.

Test cases reference the method contracts from Cut Selection Strategy Trait SS2, the selection strategy definitions from Cut Management SS7, and the activity tracking implementation from Cut Management Implementation SS6.

SS1. Conformance Test Suite

Test naming convention: test_cutselection_{variant}_{method}_{scenario} where {variant} is level1, lml1, or dominated, {method} is select, should_run, or update_activity, and {scenario} describes the test case.

Shared test fixture: Unless otherwise noted, tests use the following 5-cut pool at a single stage with 1-dimensional state variable (scalar coefficients). The pool is at iteration 20 and check_frequency=5 for all variants.

Cut Index	Intercept ( $α_{k}$ )	Coefficient ( $π_{k}$ )	active_count	last_active_iter	iteration_generated	Active
0	10.0	2.0	3	18	2	yes
1	15.0	1.0	0	5	5	yes
2	8.0	3.0	1	8	3	yes
3	12.0	1.5	5	19	1	yes
4	6.0	0.5	0	4	4	yes

Visited states (used by Dominated variant):

State Index	$\overset{x}{^}$
0	1.0
1	3.0
2	5.0

Cut values at visited states (hand-computed: $α_{k} + π_{k} \cdot \overset{x}{^}$ ):

Cut \ State	$\overset{x}{^} = 1.0$	$\overset{x}{^} = 3.0$	$\overset{x}{^} = 5.0$
0	12.0	16.0	20.0
1	16.0	18.0	20.0
2	11.0	17.0	23.0
3	13.5	16.5	19.5
4	6.5	7.5	8.5

Max value among other cuts at each state (for domination check with threshold=0):

Cut	Best-other at $\overset{x}{^} = 1.0$	Best-other at $\overset{x}{^} = 3.0$	Best-other at $\overset{x}{^} = 5.0$	Dominated at all 3?
0	max(16, 11, 13.5, 6.5)=16.0	max(18, 17, 16.5, 7.5)=18.0	max(20, 23, 19.5, 8.5)=23.0	yes: 12<16, 16<18, 20<23
1	max(12, 11, 13.5, 6.5)=13.5	max(16, 17, 16.5, 7.5)=17.0	max(20, 23, 19.5, 8.5)=23.0	no: 16.0>13.5 at $\overset{x}{^} = 1.0$
2	max(12, 16, 13.5, 6.5)=16.0	max(16, 18, 16.5, 7.5)=18.0	max(20, 20, 19.5, 8.5)=20.0	no: 23.0>20.0 at $\overset{x}{^} = 5.0$ (not dominated, cut 2 achieves 23.0 which is the max)
3	max(12, 16, 11, 6.5)=16.0	max(16, 18, 17, 7.5)=18.0	max(20, 20, 23, 8.5)=23.0	yes: 13.5<16, 16.5<18, 19.5<23
4	max(12, 16, 11, 13.5)=16.0	max(16, 18, 17, 16.5)=18.0	max(20, 20, 23, 19.5)=23.0	yes: 6.5<16, 7.5<18, 8.5<23

Domination summary: Cuts 0, 3, and 4 are dominated at all three visited states. Cut 1 is NOT dominated (it achieves the maximum value 16.0 at $\overset{x}{^} = 1.0$ ). Cut 2 is NOT dominated (it achieves the maximum value 23.0 at $\overset{x}{^} = 5.0$ ).

SS1.1 Level1 Conformance

Test Name	Input Scenario	Expected Observable Behavior	Variant
`test_cutselection_level1_select_deactivate_never_active`	Shared fixture, iteration=20, threshold=0. Cuts 1 and 4 have active_count=0. Cuts 0, 2, 3 have active_count > 0.	DeactivationSet contains indices {1, 4}. Cuts 0, 2, 3 are retained (active_count > 0).	Level1
`test_cutselection_level1_select_retain_all_active`	Modified fixture: all 5 cuts have active_count >= 1. Specifically: cut 1 active_count=1, cut 4 active_count=2, others unchanged.	DeactivationSet is empty. No cut has active_count=0, so nothing is deactivated.	Level1
`test_cutselection_level1_select_deactivate_all_inactive`	Modified fixture: all 5 cuts have active_count=0.	DeactivationSet contains indices {0, 1, 2, 3, 4}. All cuts have never been binding.	Level1
`test_cutselection_level1_select_single_cut`	Pool contains only cut 0 with active_count=0.	DeactivationSet contains index {0}. A single never-active cut is deactivated.	Level1

SS1.2 LML1 Conformance

Test Name	Input Scenario	Expected Observable Behavior	Variant
`test_cutselection_lml1_select_deactivate_old_cuts`	Shared fixture, iteration=20, memory_window=10, threshold=0. Retention threshold: 20 - 10 = 10. Cuts with last_active_iter < 10: cut 1 (last_active_iter=5), cut 2 (last_active_iter=8), cut 4 (last_active_iter=4). Cuts with last_active_iter >= 10: cut 0 (18), cut 3 (19).	DeactivationSet contains indices {1, 2, 4}. Cuts 0 and 3 are retained (recently active within window).	LML1
`test_cutselection_lml1_select_retain_recent`	Shared fixture, iteration=20, memory_window=20, threshold=0. Retention threshold: 20 - 20 = 0. All cuts have last_active_iter >= 0.	DeactivationSet is empty. The large memory window retains all cuts (behavior approaches Level1).	LML1
`test_cutselection_lml1_select_deactivate_all_old`	Modified fixture: all 5 cuts have last_active_iter=1. Iteration=20, memory_window=5. Retention threshold: 20 - 5 = 15. All cuts have last_active_iter=1 < 15.	DeactivationSet contains indices {0, 1, 2, 3, 4}. All cuts are outside the memory window.	LML1

SS1.3 Dominated Conformance

Test Name	Input Scenario	Expected Observable Behavior	Variant
`test_cutselection_dominated_select_deactivate_dominated`	Shared fixture, iteration=20, threshold=0. Cut values at visited states as computed in the fixture table. Cuts 0, 3, 4 are dominated at all 3 visited states. Cuts 1 and 2 are not dominated (each achieves the max value at at least one state).	DeactivationSet contains indices {0, 3, 4}. Cuts 1 and 2 are retained (not dominated at all states).	Dominated
`test_cutselection_dominated_select_partial_domination_retained`	Modified fixture: 3 cuts with coefficients such that cut 0 ( $α$ =10, $π$ =2) is dominated at states $\overset{x}{^}$ =1.0 and $\overset{x}{^}$ =3.0 by cut 1 ( $α$ =15, $π$ =1), but at $\overset{x}{^}$ =5.0 cut 0 achieves value 20.0 while cut 1 achieves 20.0 (tied, not strictly dominated within threshold=0).	DeactivationSet does NOT contain index 0. Partial domination (dominated at only some states) is insufficient for deactivation.	Dominated
`test_cutselection_dominated_select_none_dominated`	3 cuts each achieving the maximum value at exactly one visited state. Cut 0: $α$ =20, $π$ =0 (value 20 everywhere). Cut 1: $α$ =0, $π$ =8 (values 8, 24, 40). Cut 2: $α$ =15, $π$ =2 (values 17, 21, 25). States: $\overset{x}{^}$ =1.0, $\overset{x}{^}$ =3.0, $\overset{x}{^}$ =5.0. At $\overset{x}{^}$ =1.0: max is cut 0 (20). At $\overset{x}{^}$ =3.0: max is cut 1 (24). At $\overset{x}{^}$ =5.0: max is cut 1 (40). Cut 0 is not dominated (achieves max at $\overset{x}{^}$ =1.0). Cut 1 is not dominated (achieves max at $\overset{x}{^}$ =3.0 and $\overset{x}{^}$ =5.0). Cut 2 is dominated at all 3 states: 17<20, 21<24, 25<40.	DeactivationSet contains index {2}. Cut 2 is dominated everywhere. Cuts 0 and 1 are retained.	Dominated

SS1.4 should_run Conformance

Test Name	Input Scenario	Expected Observable Behavior	Variant
`test_cutselection_level1_should_run_periodic`	check_frequency=5. Test iterations 0, 1, 4, 5, 10, 15, 20.	Returns false for iterations 0, 1, 4. Returns true for iterations 5, 10, 15, 20. Iteration 0 is always false (no cuts yet). Iteration 5 is the first trigger (5 % 5 == 0 and 5 > 0).	Level1
`test_cutselection_lml1_should_run_frequency_one`	check_frequency=1. Test iterations 0, 1, 2, 3.	Returns false for iteration 0. Returns true for iterations 1, 2, 3. With frequency 1, selection runs every iteration (except iteration 0).	LML1

SS1.5 update_activity Conformance

The update_activity method accepts an is_binding: bool parameter. When is_binding is false, metadata must remain unchanged regardless of variant.

Test Name	Input Scenario	Expected Observable Behavior	Variant
`test_cutselection_level1_update_activity_increment`	CutMetadata with active_count=3. Call update_activity(is_binding=true) at iteration 15.	metadata.active_count becomes 4. The counter is incremented by 1 regardless of iteration number.	Level1
`test_cutselection_lml1_update_activity_timestamp`	CutMetadata with last_active_iter=8. Call update_activity(is_binding=true) at iteration 15.	metadata.last_active_iter becomes 15. The timestamp is set to the current iteration, replacing the old value.	LML1
`test_cutselection_dominated_update_activity_reset`	CutMetadata with domination_count=7. Call update_activity(is_binding=true) at iteration 15.	metadata.domination_count becomes 0. The binding event resets the domination counter (the cut is not dominated at this state).	Dominated
`test_cutselection_level1_update_activity_not_binding`	CutMetadata with active_count=3. Call update_activity(is_binding=false) at iteration 15.	metadata.active_count remains 3. When is_binding is false, no field is modified.	Level1
`test_cutselection_lml1_update_activity_not_binding`	CutMetadata with last_active_iter=8. Call update_activity(is_binding=false) at iteration 15.	metadata.last_active_iter remains 8. The non-binding call is a no-op.	LML1
`test_cutselection_dominated_update_activity_not_binding`	CutMetadata with domination_count=7. Call update_activity(is_binding=false) at iteration 15.	metadata.domination_count remains 7. The non-binding call does not reset the counter.	Dominated

SS1.6 select_for_stage Conformance

The select_for_stage method accepts a stage_index: u32 parameter that is propagated into the returned DeactivationSet.stage_index field.

Test Name	Input Scenario	Expected Observable Behavior	Variant
`test_cutselection_level1_select_for_stage_propagates_index`	Shared fixture, iteration=20, threshold=0. Call select_for_stage with stage_index=5.	Returned DeactivationSet has stage_index=5 and indices={1, 4}. The stage_index is passed through without modification.	Level1
`test_cutselection_dominated_select_for_stage_propagates_index`	Shared fixture, iteration=20, threshold=0. Call select_for_stage with stage_index=3 and the visited states from the shared fixture.	Returned DeactivationSet has stage_index=3 and indices={0, 3, 4}. The stage_index is independent of the deactivation logic.	Dominated
`test_cutselection_select_delegates_to_select_for_stage`	Shared fixture, iteration=20, threshold=0 (Level1). Call both select() and select_for_stage(stage_index=0). Compare results.	Both calls return identical indices. The select() convenience method delegates to select_for_stage with stage_index=0.	Level1

SS2. Aggressiveness Ordering Tests

These tests verify the relative aggressiveness ordering of the three selection strategies: given the same cut pool and configuration, Level1 deactivates the fewest cuts, Dominated deactivates the most, and LML1 falls between them. This ordering is a structural property of the strategy definitions (Cut Management SS7) and must hold for any cut pool.

The ordering property is: $∣ deactivated by Level1 ∣ \leq ∣ deactivated by LML1 ∣ \leq ∣ deactivated by Dominated ∣$ .

Fixture for aggressiveness ordering: The shared test fixture from SS1 is used. All three variants use threshold=0 and check_frequency=5. The LML1 variant uses memory_window=10. The test is evaluated at iteration=20.

Test Name	Input Scenario	Expected Observable Behavior
`test_cutselection_aggressiveness_ordering_shared_fixture`	Shared fixture at iteration=20. Level1 (threshold=0, check_frequency=5): deactivates cuts with active_count=0, which are cuts {1, 4}. Count=2. LML1 (threshold=0, check_frequency=5, memory_window=10): deactivates cuts with last_active_iter < 10, which are cuts {1, 2, 4}. Count=3. Dominated (threshold=0, check_frequency=5): deactivates cuts dominated at all visited states, which are cuts {0, 3, 4}. Count=3.	Level1 deactivates 2 cuts ({1, 4}). LML1 deactivates 3 cuts ({1, 2, 4}). Dominated deactivates 3 cuts ({0, 3, 4}). Ordering holds: 2 <= 3 <= 3. Note: the deactivated SETS differ across strategies (they identify different cuts), but the COUNT ordering Level1 <= LML1 <= Dominated is satisfied.
`test_cutselection_aggressiveness_ordering_all_recently_active`	Modified fixture: all 5 cuts have active_count >= 1 and last_active_iter=19 (all recently active). Same visited states. Level1: no cuts have active_count=0, deactivates 0. LML1 (memory_window=10): all last_active_iter=19 >= 10, deactivates 0. Dominated: evaluates domination at visited states. Using shared fixture coefficients, cuts 0, 3, 4 are still dominated at all states.	Level1 deactivates 0 cuts. LML1 deactivates 0 cuts. Dominated deactivates 3 cuts ({0, 3, 4}). Ordering holds: 0 <= 0 <= 3. Demonstrates that Dominated can deactivate cuts that both Level1 and LML1 would retain, because domination is a geometric property independent of activity history.

SS3. Convergence Property Tests

These tests verify the convergence-critical invariant from Cut Management SS8 and Cut Selection Strategy Trait SS7: Level1 and LML1 never deactivate a cut that is currently binding at any visited state. A cut is currently binding if its dual multiplier is positive at the most recent LP solution – equivalently, if it achieves the maximum value $max_{k} {α_{k} + π_{k}^{⊤} \overset{x}{^}}$ at some visited state within the activity threshold.

Key insight: A binding cut necessarily has active_count > 0 (for Level1) and last_active_iter equal to the current or a recent iteration (for LML1). Therefore, the strategies’ deactivation criteria structurally exclude binding cuts.

Test Name	Input Scenario	Expected Observable Behavior
`test_cutselection_level1_preserves_binding_cut`	Shared fixture. At state $\overset{x}{^} = 1.0$ , cut 1 achieves the maximum value (16.0) and is therefore binding. However, cut 1 has active_count=0 in the shared fixture. This is a contradiction: if cut 1 is truly binding at iteration 20, then `update_activity` would have been called, setting active_count >= 1. This test constructs the corrected scenario: cut 1 has active_count=1 (after being binding). Level1 at iteration=20.	DeactivationSet does NOT contain index 1. After `update_activity` is called for the binding event, cut 1 has active_count=1 > 0, so Level1 retains it. The precondition that `update_activity` is called before `select` (Cut Selection Strategy Trait SS2.2) guarantees that binding cuts always have active_count > 0 when Level1 runs.
`test_cutselection_lml1_preserves_binding_cut`	Modified fixture: cut 2 is binding at state $\overset{x}{^} = 5.0$ (value 23.0 is the maximum). `update_activity` is called at iteration 20, setting cut 2’s last_active_iter=20. LML1 with memory_window=10, iteration=20. Retention threshold=10. Cut 2 now has last_active_iter=20 >= 10.	DeactivationSet does NOT contain index 2. The `update_activity` call at the current iteration refreshes the timestamp to 20, which is within the memory window. The precondition that `update_activity` runs before `select` ensures binding cuts always have a current timestamp when LML1 evaluates them.
`test_cutselection_level1_never_deactivates_positive_active_count`	Construct a pool where every cut has active_count >= 1 (all were binding at some prior iteration). Cuts have varying active_count values: [1, 50, 1, 100, 2].	DeactivationSet is empty. Level1’s deactivation criterion (active_count == 0) structurally excludes any cut with a positive active count. No binding cut – past or present – can be deactivated by Level1.
`test_cutselection_lml1_never_deactivates_current_iteration`	Construct a pool where all 5 cuts have last_active_iter equal to the current iteration (20). LML1 with memory_window=1, iteration=20. Retention threshold = 20 - 1 = 19. All cuts have last_active_iter=20 >= 19.	DeactivationSet is empty. Even with the smallest practical memory window (1), cuts active at the current iteration are always retained. A binding cut at the current iteration is guaranteed to have last_active_iter = current iteration after `update_activity`.

SS4. Parallel Selection Integration Tests

Decision DEC-016 (active): Cut selection uses deferred parallel execution — stages distributed across ranks and threads, with DeactivationSet allgatherv and leader-only SharedRegion write.

These tests verify the distributed parallel selection model described in Cut Selection Strategy Trait SS2.2a and Cut Management Implementation SS7.1a-SS7.1b. The tests validate three properties: (1) determinism — the same deactivation results regardless of rank/thread count, (2) correct stage partitioning — all stages are covered exactly once, and (3) allgatherv completeness — every rank holds the full deactivation set after gathering.

Test naming convention: test_cutselection_parallel_{property}_{scenario} where {property} is determinism, partitioning, or allgatherv, and {scenario} describes the test case.

Shared fixture extension: Tests in this section use the SS1 shared fixture (5-cut pool, 1-dimensional state) replicated across 3 stages (stages 2, 3, 4) to simulate a $T = 4$ system. Each stage has an identical cut pool (same coefficients, metadata, and visited states as the SS1 fixture). The Level1 strategy with threshold=0 and check_frequency=5 at iteration=20 is used unless otherwise noted.

SS4.1 Determinism Tests

Test Name	Input Scenario	Expected Observable Behavior
`test_cutselection_parallel_determinism_rank_independence`	Run Level1 selection on 3 stages with 1 rank (sequential baseline), then with 2 ranks (stages split: rank 0 gets stages 2-3, rank 1 gets stage 4), then with 3 ranks (one stage each). Compare DeactivationSets.	All three configurations produce identical DeactivationSets for each stage: stages 2, 3, and 4 each deactivate cuts {1, 4} (the never-active cuts from the shared fixture). The deactivation result depends only on the cut pool data, not on which rank computed it.
`test_cutselection_parallel_determinism_thread_independence`	Run Level1 selection on 3 stages with 1 rank and 1 thread, then with 1 rank and 4 threads (Rayon work-stealing distributes stages across threads). Compare DeactivationSets.	Identical DeactivationSets for each stage regardless of thread count. The `select` method is a pure function of the cut pool data — no thread-local state affects the result.

SS4.2 Stage Partitioning Tests

Test Name	Input Scenario	Expected Observable Behavior
`test_cutselection_parallel_partitioning_complete_coverage`	$T = 60$ stages (stages 2-60), $R = 16$ ranks. Compute the stage assignment for each rank using the block formula from Cut Selection Strategy Trait SS2.2a. Collect all assigned stages.	The union of all ranks’ assigned stages equals ${2, 3, \dots, 60}$ . No stage is missing (complete coverage). No stage appears in two ranks’ assignments (no overlap).
`test_cutselection_parallel_partitioning_uneven_division`	$T = 10$ stages (stages 2-10, $S = 9$ ), $R = 4$ ranks. Block size $B = ⌈ 9/4 ⌉ = 3$ .	Rank 0: stages {2, 3, 4}. Rank 1: stages {5, 6, 7}. Rank 2: stages {8, 9, 10}. Rank 3: no stages (empty assignment). The last rank may have zero stages when $S$ is not divisible by $R$ . The implementation must handle empty assignments gracefully (no-op, zero-length allgatherv contribution).
`test_cutselection_parallel_partitioning_single_rank`	$T = 60$ stages, $R = 1$ rank.	Rank 0 receives all 59 stages {2, 3, …, 60}. The partitioning degenerates to the sequential case. DeactivationSets are identical to the sequential baseline from SS1.

SS4.3 Allgatherv Completeness Tests

Test Name	Input Scenario	Expected Observable Behavior
`test_cutselection_parallel_allgatherv_all_ranks_have_full_set`	3 stages, 3 ranks (one stage per rank). Each rank runs `select` on its assigned stage and contributes a DeactivationSet to `allgatherv`.	After `allgatherv`, every rank holds DeactivationSets for all 3 stages: stage 2 deactivates {1, 4}, stage 3 deactivates {1, 4}, stage 4 deactivates {1, 4}. No rank is missing any stage’s result.
`test_cutselection_parallel_allgatherv_empty_deactivation_set`	Modified fixture: all 5 cuts have active_count >= 1 across all 3 stages. Level1 at iteration=20. No cuts meet the deactivation criterion. 2 ranks.	After `allgatherv`, every rank holds empty DeactivationSets for all 3 stages. The allgatherv correctly handles zero-length payloads (count=0 per stage).
`test_cutselection_parallel_allgatherv_mixed_deactivation_counts`	3 stages with different cut pools: stage 2 has 2 cuts with active_count=0, stage 3 has 0 cuts with active_count=0 (all active), stage 4 has 5 cuts with active_count=0 (all deactivated). 2 ranks.	After `allgatherv`, every rank holds: stage 2 deactivates 2 cuts, stage 3 deactivates 0 cuts, stage 4 deactivates 5 cuts. Variable-length DeactivationSets are correctly gathered and reassembled.

Cross-References

Cut Selection Strategy Trait – Enum definition (SS1), method contracts for should_run (SS2.1), select (SS2.2), and update_activity (SS2.3), parallel work distribution (SS2.2a), per-cut tracking metadata (SS3.2), dispatch mechanism (SS4), validation rules C1-C5 (SS5), activity bitmap interaction (SS6), convergence guarantee (SS7)
Cut Management (Math) – Cut activity definition (SS6), activity threshold (SS6), Level-1 selection (SS7.1), LML1 selection (SS7.2), Dominated cut detection (SS7.3), convergence guarantee theorem (SS8), selection parameters (SS9)
Cut Management Implementation – Per-stage cut pool structure (SS1.1), deterministic slot assignment (SS1.2), selection strategy implementation (SS2), Level-1 (SS2.1), LML1 (SS2.2), Dominated (SS2.3), activity tracking and binding detection (SS6), parallel selection phase (SS7.1a), StageLpCache update phase (SS7.1b)
Extension Points – Variant selection pipeline (SS6) where CutSelectionConfig is validated and converted, dispatch mechanism analysis (SS7)
Backend Testing – Conformance test suite structure and requirements table format (reference pattern for this spec)
Risk Measure Testing – Sibling conformance test spec following the same structure (shared fixture, conformance tables, variant equivalence, numerical properties)
Horizon Mode Testing – Sibling conformance test spec following the same structure
Sampling Scheme Testing – Sibling conformance test spec following the same structure
Solver Workspaces – Stage solve workflow (SS1.4) where binding detection occurs and update_activity is invoked
Training Loop – Backward pass iteration where update_activity and select are called in sequence (SS7.3), cut selection step in iteration lifecycle (SS2.1 step 4a)
Work Distribution – Contiguous block assignment formula (§3.1) used in stage partitioning tests (SS4.2)
Synchronization – DeactivationSet allgatherv (§1.4a) verified in completeness tests (SS4.3)
Communicator Trait – allgatherv contract (SS2.1) for DeactivationSet gathering

Keyboard shortcuts

Cobre Methodology Reference