factrix.metrics.oos

Out-of-sample (OOS) persistence analysis for any time-indexed series.

This tool is agnostic to what the series represents — it only knows about IS/OOS splits on a time-indexed numeric sequence.

Notes

Pipeline. Time-series only, IS/OOS window split on a 1-D series; descriptive decay diagnostic (no formal H₀).

Input. DataFrame with date, value (IC series, CAAR series, spread series).

Output. MetricOutput with value = median survival ratio + sign-flip / status detail in metadata.

factrix.metrics.oos.SplitDetail dataclass

SplitDetail(is_ratio: float, mean_is: float, mean_oos: float, survival_ratio: float, sign_flipped: bool)

Per-split IS/OOS calculation intermediate.

Not part of the public return API (which is MetricOutput); surfaced as a typed helper for callers that want typed access to individual splits without round-tripping through metadata["per_split"] dicts.

survival_ratio = |mean_OOS| / |mean_IS| — 1.0 = OOS matches IS, 0.0 = signal vanished out of sample. Higher is better.
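
As a minimal sketch of the arithmetic above, the standalone dataclass below mirrors the SplitDetail fields and derives survival_ratio and sign_flipped from the two means. The names SplitSketch and from_means are hypothetical and not part of factrix, and returning NaN for a zero IS mean is an assumption rather than documented behaviour.

```python
from dataclasses import asdict, dataclass
import math


@dataclass(frozen=True)
class SplitSketch:
    """Illustrative stand-in for SplitDetail (not the factrix class)."""

    is_ratio: float
    mean_is: float
    mean_oos: float
    survival_ratio: float
    sign_flipped: bool

    @classmethod
    def from_means(cls, is_ratio: float, mean_is: float, mean_oos: float) -> "SplitSketch":
        # survival_ratio = |mean_OOS| / |mean_IS|; undefined for mean_IS == 0 (assumed: NaN).
        ratio = abs(mean_oos) / abs(mean_is) if mean_is != 0 else math.nan
        # The two means have opposite signs exactly when their product is negative.
        flipped = mean_is * mean_oos < 0
        return cls(is_ratio, mean_is, mean_oos, ratio, flipped)


split = SplitSketch.from_means(is_ratio=0.6, mean_is=0.08, mean_oos=0.04)
print(asdict(split))  # plain dict with the same five keys
```

A ratio of 0.5 here means the out-of-sample mean kept half of its in-sample magnitude; a negative mean_oos with the same magnitude would give the same ratio but set sign_flipped to True.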

factrix.metrics.oos.multi_split_oos_decay

multi_split_oos_decay(series: DataFrame, value_col: str = 'value', splits: list[tuple[float, float]] | None = None, survival_threshold: float = 0.5) -> MetricOutput

Multi-split out-of-sample (OOS) survival analysis with sign-flip detection.

For each split point, divides the series into IS and OOS portions, computes |mean_OOS| / |mean_IS| (the survival ratio), and checks for sign flips. The reported ratio is the median across splits.

Parameters:

  • series (DataFrame, required): DataFrame with date and value_col columns, sorted by date.
  • value_col (str, default 'value'): name of the column in series that holds the values.
  • splits (list[tuple[float, float]] | None, default None): list of (IS_fraction, OOS_fraction) tuples. When None, [(0.6, 0.4), (0.7, 0.3), (0.8, 0.2)] is used.
  • survival_threshold (float, default 0.5): minimum survival ratio for PASS.

Returns:

MetricOutput with:

  • name: "oos_decay"
  • value: median survival ratio across splits (NaN on short-circuit)
  • stat: None. Descriptive only: no hypothesis test is attached, because a t-stat at MIN_OOS_PERIODS = 5 would have power ≈ 0 and invite mis-reading the diagnostic as a significance test.
  • metadata:
      • sign_flipped (bool): True if any split had a sign flip
      • status ("PASS" | "VETOED")
      • per_split (list[dict]): see SplitDetail.to_dict
      • method (str): "multi-split OOS decay"
      • survival_threshold (float)
      • reason (str, short-circuit only): "insufficient_oos_periods" or "no_valid_splits"

Notes

For each split fraction f, partition the sorted series into IS (first f·n) and OOS (remainder). Per-split survival ratio is s_f = |mean_OOS| / |mean_IS|; reported headline is median_f s_f. A split is flagged as sign_flipped when mean_IS and mean_OOS have opposite signs — any such split sets status = VETOED.

factrix reports the median across splits rather than the mean, because a single regime change landing inside one split distorts the mean disproportionately. The output is descriptive only: no p_value is emitted (the multi-split structure already conveys the signal-decay message, and a t-test at the MIN_OOS_PERIODS floor would have power ≈ 0).
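
The procedure described in these notes can be sketched in plain Python. This is not the factrix implementation: the function name oos_decay_sketch, the rounding of the split point, the way the MIN_OOS_PERIODS floor is applied, and the rule that a below-threshold median yields "VETOED" are all assumptions made for illustration.

```python
import math
from statistics import fmean, median

MIN_OOS_PERIODS = 5  # floor named in the docs; how factrix applies it is assumed here
DEFAULT_SPLITS = [(0.6, 0.4), (0.7, 0.3), (0.8, 0.2)]


def oos_decay_sketch(values, splits=None, survival_threshold=0.5):
    """Median survival ratio across splits with a sign-flip veto (illustrative)."""
    splits = splits or DEFAULT_SPLITS
    n = len(values)
    if n < 2 * MIN_OOS_PERIODS:
        return {"value": math.nan, "reason": "insufficient_oos_periods"}

    per_split = []
    for is_frac, _oos_frac in splits:
        cut = round(is_frac * n)  # IS = first f*n observations, OOS = remainder
        is_part, oos_part = values[:cut], values[cut:]
        if len(is_part) < MIN_OOS_PERIODS or len(oos_part) < MIN_OOS_PERIODS:
            continue  # assumed: a split too short on either side is skipped
        mean_is, mean_oos = fmean(is_part), fmean(oos_part)
        if mean_is == 0:
            continue  # ratio undefined; skipping is an assumption
        per_split.append({
            "is_ratio": is_frac,
            "survival_ratio": abs(mean_oos) / abs(mean_is),
            "sign_flipped": mean_is * mean_oos < 0,
        })

    if not per_split:
        return {"value": math.nan, "reason": "no_valid_splits"}

    value = median(s["survival_ratio"] for s in per_split)
    sign_flipped = any(s["sign_flipped"] for s in per_split)
    # Assumed gate: VETOED on any sign flip or on a median below the threshold.
    status = "VETOED" if sign_flipped or value < survival_threshold else "PASS"
    return {"value": value, "status": status,
            "sign_flipped": sign_flipped, "per_split": per_split}


# A series whose level halves after 60% of the sample.
halved = oos_decay_sketch([0.10] * 60 + [0.05] * 40)
print(halved["value"], halved["status"])
```

In this toy series only the 0.6 split separates the two regimes cleanly; the later splits pull some of the weaker observations into the IS window, so their ratios are slightly higher and the median lands just above 0.5.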

References
  • McLean & Pontiff (2016): post-publication returns are ~58% lower than in-sample. About 26 percentage points of that drop is pure out-of-sample decay; the remaining ~32 points are attributable to publication itself.
  • López de Prado (2018): combinatorial purged cross-validation (CPCV) for robust train/test splits.

Examples:

Survival on a per-date information coefficient (IC) series from factrix.metrics.ic.compute_ic:

>>> import factrix as fx
>>> from factrix.preprocess import compute_forward_return
>>> from factrix.metrics.ic import compute_ic
>>> from factrix.metrics.oos import multi_split_oos_decay
>>> panel = compute_forward_return(
...     fx.datasets.make_cs_panel(n_assets=80, n_dates=240, seed=0),
...     forward_periods=5,
... )
>>> series = compute_ic(panel).rename({"ic": "value"}).select("date", "value")
>>> result = multi_split_oos_decay(series)
>>> result.name
'oos_decay'

Descriptive only — no formal \(H_0\)

multi_split_oos_decay emits a survival ratio + sign-flip detail; no p_value is attached and stat is None. A \(t\)-test at the MIN_OOS_PERIODS floor would have power \(\approx 0\) and would invite mis-reading the diagnostic as a significance test. Callers routing this output into Benjamini-Hochberg-Yekutieli (BHY) / gate logic must read status ("PASS" / "VETOED") and sign_flipped, not a probability.
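
A minimal sketch of gate logic that reads the categorical fields rather than a probability. The helper name passes_oos_gate is hypothetical, it operates on a plain metadata dict with the keys listed under Returns, and treating short-circuited results as failing is an assumption.

```python
def passes_oos_gate(metadata: dict) -> bool:
    """Return True only for a PASS status with no sign flip (illustrative gate)."""
    # Short-circuited results carry a "reason" key; assumed here to fail the gate.
    if "reason" in metadata:
        return False
    return metadata.get("status") == "PASS" and not metadata.get("sign_flipped", False)


print(passes_oos_gate({"status": "PASS", "sign_flipped": False}))    # True
print(passes_oos_gate({"status": "VETOED", "sign_flipped": True}))   # False
print(passes_oos_gate({"reason": "insufficient_oos_periods"}))       # False
```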

Use cases

  • Persistence read on a factor-return series


    multi_split_oos_decay is a (*, CONTINUOUS, *, TIMESERIES) diagnostic — input is a 1-D (date, value) series, typically information coefficient (IC) from compute_ic, spread from compute_spread_series, or any other factor-mimicking-portfolio return series. Reports \(|\mathrm{mean}_{\text{OOS}}| / |\mathrm{mean}_{\text{IS}}|\) across multiple (IS_fraction, OOS_fraction) splits.

  • Sign-flip veto


    Any split with opposite-signed IS and out-of-sample (OOS) means sets sign_flipped = True and forces status = "VETOED". An IC sign flip out of sample means the factor predicts the wrong direction, not just a weaker one. McLean & Pontiff (2016) report returns about 26 % lower out of sample and about 58 % lower post-publication, i.e. survival ratios of roughly 0.74 and 0.42; factrix's default survival_threshold = 0.5 sits inside that range.

  • Median across splits, not mean


    Headline is median_f s_f over the default splits [(0.6, 0.4), (0.7, 0.3), (0.8, 0.2)]. A single regime change landing inside one split distorts the mean disproportionately; the median absorbs it.
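
The median-versus-mean point can be checked with a small numeric illustration. The three survival ratios below are made up for the example: two splits look healthy and one is hit by a regime change.

```python
from statistics import fmean, median

# Hypothetical per-split survival ratios; the last split straddles a regime change.
ratios = [0.70, 0.65, 0.05]

print(f"mean   = {fmean(ratios):.2f}")   # 0.47, dragged below the 0.5 default threshold
print(f"median = {median(ratios):.2f}")  # 0.65, unaffected by the single outlier
```

With the default survival_threshold of 0.5, a mean-based headline would fail on this input while the median-based one would not.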

Choosing a function

  • Multi-split OOS survival + sign-flip gate on a (date, value) series: multi_split_oos_decay
  • Typed accessor for an individual split's (is_ratio, mean_is, mean_oos, ...): SplitDetail

Worked example — IC series fed into multi_split_oos_decay

compute_ic → multi_split_oos_decay

import factrix as fx
from factrix.metrics.ic import compute_ic
from factrix.metrics.oos import multi_split_oos_decay
from factrix.preprocess import compute_forward_return

raw   = fx.datasets.make_cs_panel(
    n_assets=100, n_dates=1000, ic_target=0.08, seed=2024,
)
panel = compute_forward_return(raw, forward_periods=5)

# The series diagnostic consumes (date, value); the value column on
# the compute_ic output is named ``ic``.
ic_df = compute_ic(panel)
out   = multi_split_oos_decay(ic_df, value_col="ic")
print(out.value, out.metadata["status"], out.metadata["sign_flipped"])
# 0.94   PASS   False   (approximate)
for split in out.metadata["per_split"]:
    print(split)
# {"is_ratio": 0.6, "mean_is": 0.080, "mean_oos": 0.077,
#  "survival_ratio": 0.96, "sign_flipped": False}
# ...
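
For reading the per-split detail at a glance, the dicts can be rendered as fixed-width rows. The helper format_splits is hypothetical and works on any list of dicts with the SplitDetail keys; the sample row is hand-written in the shape shown above and its values are illustrative only.

```python
def format_splits(per_split: list[dict]) -> list[str]:
    """Render each per-split dict as one fixed-width text row."""
    rows = []
    for s in per_split:
        flag = "FLIP" if s["sign_flipped"] else "ok"
        rows.append(
            f"IS={s['is_ratio']:.1f}  mean_is={s['mean_is']:+.3f}  "
            f"mean_oos={s['mean_oos']:+.3f}  survival={s['survival_ratio']:.2f}  {flag}"
        )
    return rows


# Hand-written sample; not actual factrix output.
sample = [
    {"is_ratio": 0.6, "mean_is": 0.080, "mean_oos": 0.077,
     "survival_ratio": 0.96, "sign_flipped": False},
]
for row in format_splits(sample):
    print(row)
```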

See also

  • compute_ic / compute_spread_series


    Canonical producers of the (date, value) series this diagnostic consumes.

    api/metrics/ic →

  • hit_rate / trend


    Sibling series diagnostics on the same input shape — sign significance and slope detection. Pair with oos when both in-sample magnitude and out-of-sample persistence matter.

    api/metrics/hit_rate →

  • by_slice


    Per-slice survival summaries (regime / universe / sector).

    api/by-slice →

  • Metric applicability reference


    When this metric applies and the sample-size guards that gate it (MIN_OOS_PERIODS * 2 floor; per-split MIN_OOS_PERIODS on each side).

    reference/metric-applicability →

  • Series diagnostics landing


    Adjacent axis-agnostic series diagnostics.

    api/metrics/series-tools →