factrix.metrics.ts_quantile ¶

Time-series quantile bucketing + monotonicity test (issue #5).

Diagnostic for (COMMON, CONTINUOUS, *) and single-asset TIMESERIES cells: bucket the factor history into quantiles and check the conditional mean forward return per bucket. Catches U-shape, inverted-U, and extreme-only signals, which a linear ordinary least squares (OLS) β assumes away and collapses into a pass / fail verdict on a single slope.

Standalone metric — does not enter the registry. See ARCHITECTURE.md §"Registry procedure vs standalone metric" for the distinction. SPARSE / binary signals are out of scope; the input gate redirects to event_quality helpers.

Notes

Pipeline. Per-date aggregation to a common (_f, _r) series (cross-section step), then quantile-bucketed Newey-West (NW) heteroskedasticity-and-autocorrelation-consistent (HAC) OLS on that time series; Wald χ² on the top-bottom bucket spread.
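The cross-section step can be pictured with a small sketch. The source does not say which aggregator is used, so the equal-weighted mean below is an assumption, and `aggregate_per_date` is a hypothetical helper rather than a factrix function. For a broadcast factor the mean simply recovers the shared per-date value.

```python
from collections import defaultdict
from statistics import fmean


def aggregate_per_date(rows):
    """Collapse long-panel rows (date, factor, forward_return) to one (_f, _r) per date.

    Assumption: the cross-section step is an equal-weighted mean.
    """
    by_date = defaultdict(list)
    for date, factor, fwd_ret in rows:
        by_date[date].append((factor, fwd_ret))
    series = []
    for date in sorted(by_date):
        pairs = by_date[date]
        f = fmean(p[0] for p in pairs)
        r = fmean(p[1] for p in pairs)
        series.append((date, f, r))
    return series


# Two assets, two dates; the factor is broadcast (same value per date).
rows = [
    ("2024-01-02", 0.5, 0.010),
    ("2024-01-02", 0.5, 0.030),
    ("2024-01-03", -1.0, -0.020),
    ("2024-01-03", -1.0, 0.000),
]
series = aggregate_per_date(rows)
for row in series:
    print(row)
```

The resulting list is the single time series that the quantile bucketing and HAC regression then operate on.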

factrix.metrics.ts_quantile.ts_quantile_spread ¶

ts_quantile_spread(df: DataFrame, *, factor_col: str = 'factor', return_col: str = 'forward_return', n_groups: int = 5, forward_periods: int | None = None, nw_lags: int | None = None) -> MetricOutput

Bucket time-series factor by historical quantiles, test conditional means.

Reported:

value = top-bottom spread (β_{K-1} - β_0)
stat = Wald on H0: β_{K-1} = β_0 → two-sided p in metadata
metadata["spearman_rho"] / spearman_p = small-sample monotonicity diagnostic across the K bucket means
metadata["buckets"] = per-bucket {idx, mean_return, n}
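As an illustration of the monotonicity diagnostic, the sketch below computes a Spearman rank correlation between the bucket index 0..K-1 and a set of bucket means. It assumes no ties among the means so that the closed-form expression applies; `spearman_rho` is a hypothetical helper, the numbers are made up, and how factrix derives `spearman_p` is not shown here.

```python
def spearman_rho(bucket_means):
    """Spearman rank correlation between bucket index 0..K-1 and the bucket means.

    Assumes no ties, so rho = 1 - 6 * sum(d^2) / (K * (K^2 - 1)).
    """
    k = len(bucket_means)
    order = sorted(range(k), key=lambda i: bucket_means[i])
    ranks = [0] * k
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    d2 = sum((ranks[i] - i) ** 2 for i in range(k))
    return 1 - 6 * d2 / (k * (k * k - 1))


# Illustrative bucket means only.
monotone = [-0.0009, -0.0004, 0.0000, 0.0004, 0.0009]
lopsided_u = [0.0009, 0.0001, -0.0003, 0.0002, 0.0010]

print(spearman_rho(monotone))    # 1.0
print(spearman_rho(lopsided_u))  # about 0.3
```

With only K points the statistic is coarse, which is why it is described as a small-sample diagnostic rather than a headline test.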

Gate (issue #5): n_unique(factor) >= n_groups * 2. Below the gate the factor cannot sustain quantile cuts — short-circuits with a redirect to event_quality.* for binary / sparse signals.
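A minimal sketch of the gate condition, assuming the distinct-value count is taken over the per-date factor series; `passes_quantile_gate` is a hypothetical name and the real implementation may treat nulls differently:

```python
def passes_quantile_gate(factor_values, n_groups=5):
    """Return True when the factor has at least n_groups * 2 distinct values."""
    n_unique = len(set(factor_values))
    return n_unique >= n_groups * 2


binary_signal = [0, 1, 0, 0, 1, 1, 0, 1]          # 2 distinct values
continuous_signal = [0.1 * i for i in range(40)]  # 40 distinct values

print(passes_quantile_gate(binary_signal))      # False
print(passes_quantile_gate(continuous_signal))  # True
```

A binary signal fails for any `n_groups >= 2`, which matches the redirect to the event-style helpers.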

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Long panel; aggregated to per-date `(_f, _r)` internally.	required
`factor_col`	`str`	Column carrying the factor.	`'factor'`
`return_col`	`str`	Column carrying the forward return.	`'forward_return'`
`n_groups`	`int`	Number of quantile buckets `K` to cut the factor history into.	`5`
`forward_periods`	`int \| None`	Overlap horizon of the forward return; floors the Newey-West (NW) bandwidth.	`None`
`nw_lags`	`int \| None`	Override for the NW lag count. `None` resolves to the standard rule given `forward_periods` and `T`.	`None`

Returns:

Type	Description
`MetricOutput`	`MetricOutput` whose `value` is the top-bottom bucket spread; bucket detail and the Spearman monotonicity diagnostic live in `metadata`. Short-circuits with a reason code when input shape is insufficient (no `date` / factor / return column, fewer than `MIN_PORTFOLIO_PERIODS_HARD` rows, or factor variation below `n_groups * 2` distinct values).

Notes

Aggregate the panel to per-date (_f, _r), ordinal-rank into K = n_groups buckets by historical _f quantile, run r_t = sum_k beta_k * I(bucket_t = k) + eps with NW heteroskedasticity-and-autocorrelation-consistent (HAC) covariance, and form the spread value = beta_{K-1} - beta_0 with Wald p-value on H0: beta_{K-1} = beta_0. A Spearman(0..K-1, beta) rank-monotonicity diagnostic across buckets is reported alongside.
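The regression described above can be sketched in NumPy. This is an illustration under stated assumptions, not the factrix implementation: buckets come from full-sample ordinal ranks, the HAC "meat" uses Bartlett weights with no small-sample correction, and `bucket_spread_wald` is a hypothetical helper name. With a no-intercept indicator design, each coefficient equals the mean return of its bucket.

```python
import math

import numpy as np


def bucket_spread_wald(f, r, n_groups=5, nw_lags=4):
    """Regress r on bucket indicators of f; NW HAC Wald test on top minus bottom."""
    f = np.asarray(f, dtype=float)
    r = np.asarray(r, dtype=float)
    T = len(f)
    # Ordinal rank -> K roughly equal-sized buckets labelled 0..K-1.
    ranks = np.argsort(np.argsort(f, kind="stable"), kind="stable")
    bucket = ranks * n_groups // T
    # No-intercept design: one indicator column per bucket.
    X = (bucket[:, None] == np.arange(n_groups)[None, :]).astype(float)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ r            # equals the per-bucket mean return
    u = r - X @ beta
    # Newey-West long-run covariance with Bartlett weights 1 - j / (L + 1).
    Z = X * u[:, None]
    S = Z.T @ Z
    for j in range(1, nw_lags + 1):
        w = 1.0 - j / (nw_lags + 1.0)
        G = Z[j:].T @ Z[:-j]
        S += w * (G + G.T)
    V = XtX_inv @ S @ XtX_inv           # HAC covariance of beta
    c = np.zeros(n_groups)
    c[-1], c[0] = 1.0, -1.0             # contrast: beta_{K-1} - beta_0
    spread = float(c @ beta)
    wald = spread ** 2 / float(c @ V @ c)  # chi-square, 1 d.o.f. under H0
    return spread, wald, beta


# Synthetic series with a linear signal (illustrative numbers only).
rng = np.random.default_rng(0)
f = rng.standard_normal(1000)
r = 0.005 * f + 0.01 * rng.standard_normal(1000)

spread, wald, beta = bucket_spread_wald(f, r, n_groups=5, nw_lags=4)
p_value = math.erfc(math.sqrt(wald / 2.0))  # upper tail of chi-square(1)
print(spread, wald, p_value)
```

Because the contrast has a single restriction, the Wald statistic is the square of a t-type ratio, so its p-value is two-sided by construction.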

factrix uses NW HAC + Wald rather than Welch t for cross-method comparability with ts_asymmetry / ts_beta_t_nw and because forward_periods > 1 breaks the iid assumption Welch relies on.

References

Newey-West 1987: HAC covariance underpinning the Wald test.
Andrews 1991: Bartlett growth rate T^(1/3) used for the default lag.
Hansen-Hodrick 1980: forward_periods - 1 floor for overlapping returns.
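One plausible reading of the default lag rule combines the two ingredients cited above: a T^(1/3) growth rate and a `forward_periods - 1` floor. The exact constant and rounding factrix applies are not stated in this page, so the sketch below is an assumption and `default_nw_lags` is a hypothetical helper.

```python
import math


def default_nw_lags(T, forward_periods=None):
    """Hypothetical default: floor(T^(1/3)), floored at forward_periods - 1."""
    lags = int(math.floor(T ** (1.0 / 3.0)))
    if forward_periods is not None:
        # Overlapping h-period returns induce autocorrelation up to lag h - 1.
        lags = max(lags, forward_periods - 1)
    return lags


print(default_nw_lags(995))                      # 9
print(default_nw_lags(995, forward_periods=5))   # 9
print(default_nw_lags(100, forward_periods=21))  # 20
```

In the last call the overlap floor binds: a short sample with a long forward horizon needs more lags than the growth rate alone would give.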

Examples:

>>> import factrix as fx
>>> from factrix.preprocess import compute_forward_return
>>> from factrix.metrics.ts_quantile import ts_quantile_spread
>>> panel = compute_forward_return(
...     fx.datasets.make_cs_panel(n_assets=80, n_dates=180, seed=0),
...     forward_periods=5,
... )
>>> result = ts_quantile_spread(panel, n_groups=5)
>>> result.name
'ts_quantile_spread'

Timeseries-mode conventions

FACTOR_ADF_P persistence diagnostic, plain stage-1 SE rationale, and the forward_periods vs signal_horizon bias framing apply here as for the rest of the TS-mode family. See Timeseries-mode conventions.

Use cases¶

Detect non-linear factor → return shape

A linear ordinary least squares (OLS) \(\beta\) reports a single slope and misses U-shape, inverted-U, and extreme-only signals. ts_quantile_spread aggregates the panel to a per-date \((\_f, \_r)\) series, buckets _f into \(K\) historical quantiles, and reads the conditional mean return per bucket, which preserves whatever shape the relationship has.

Top-bottom spread significance with Newey-West (NW) heteroskedasticity-and-autocorrelation-consistent (HAC) covariance

The headline is \(\beta_{K-1} - \beta_0\), the conditional mean difference between the top and bottom quantiles of the factor history. Inference is a Wald \(\chi^2\) on \(H_0: \beta_{K-1} = \beta_0\) with Newey-West HAC covariance; the kernel choice is consistent with ts_asymmetry so cross-method \(p\)-values stay comparable under overlapping forward returns.

Rank-monotonicity diagnostic across buckets

metadata["spearman_rho"] / spearman_p give a non-parametric Spearman rank check on the bucket-mean sequence \((\beta_0, \ldots, \beta_{K-1})\). Because the top-bottom Wald compares only the two extreme buckets, it cannot tell a monotone progression from a lopsided U with the same endpoints; the rank check separates the two.
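To make the non-linear-shape use case concrete, the sketch below builds a noiseless, symmetric U-shaped relationship and compares three summaries. The data are synthetic and the bucketing is a plain equal-count split on sorted values, which is an assumption about how the quantile cut behaves. Both the linear slope and the top-bottom spread come out at essentially zero, while the per-bucket means expose the U.

```python
import numpy as np

f = np.linspace(-2.0, 2.0, 1000)   # symmetric factor history, already sorted
r = 0.001 * f ** 2                 # return depends only on |f|: a pure U-shape

slope = np.polyfit(f, r, 1)[0]     # linear OLS slope of r on f

K = 5
bucket = np.arange(len(f)) * K // len(f)   # sorted input, so index equals rank
means = np.array([r[bucket == k].mean() for k in range(K)])
spread = means[-1] - means[0]

print(slope)    # numerically zero
print(spread)   # numerically zero
print(means)    # high at both ends, low in the middle
```

For a symmetric U, the per-bucket detail in `metadata["buckets"]` is the part of the output to inspect, since neither the spread nor a single slope carries the signal.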

Worked example — quantile-bucketed conditional means on a common-factor panel¶

broadcast common factor → ts_quantile_spread

import factrix as fx
import polars as pl
from factrix.metrics.ts_quantile import ts_quantile_spread
from factrix.preprocess import compute_forward_return

# Build a panel whose ``factor`` is broadcast (one value per date,
# shared across all assets) — VIX / USD-index style.
raw = fx.datasets.make_cs_panel(
    n_assets=50, n_dates=1000, ic_target=0.08, seed=2024,
)
common = raw.group_by("date").agg(pl.col("factor").mean().alias("factor"))
panel  = raw.drop("factor").join(common, on="date")
panel  = compute_forward_return(panel, forward_periods=5)

out = ts_quantile_spread(panel, n_groups=5, forward_periods=5)
print(out.value, out.stat, out.metadata["p_value"])
# 0.0018  3.20  0.0014   (approximate)
print(out.metadata["spearman_rho"], out.metadata["spearman_p"])
# 0.90  0.037   (approximate; positive ⇒ monotone shape)
for b in out.metadata["buckets"]:
    print(b)
# {"idx": 0, "mean_return": -0.00091, "n": 199}
# ...
# {"idx": 4, "mean_return":  0.00091, "n": 199}
