Multi-factor screening

Source notebook: examples/multi_factor_screening.ipynb.

Apply Benjamini-Hochberg-Yekutieli (BHY) false discovery rate control to a batch of candidate factors. The notebook demonstrates the explicit-family contract and the duplicate-identity defense: the behavior that is hard to learn from API docstrings alone.

Factor type

This recipe uses multi_factor.bhy(...) over a list of EvaluationResult objects. Each result is produced by evaluate(panel, metrics=..., factor_cols=[<name>]) for any registered cell; screening is factor-type agnostic.

The input list is the family. Each result must carry a unique identity (factor, forward_periods). The recommended path is to name the factor column distinctly per candidate panel and pass it via factor_cols=[...]; evaluate stamps the factor name onto each returned EvaluationResult.

If the input pools several horizons and you omit expand_over, one step-up runs over the full factor × horizon search, so every factor-horizon pair counts toward the family size, and an informational RuntimeWarning is emitted. Pass expand_over=("forward_periods",) only for predeclared horizon-specific screens that will be selected and reported separately; it does not protect against picking the best-looking horizon afterwards.
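
The cost of pooling can be sketched without factrix. The snippet below assumes that expand_over=("forward_periods",) runs one step-up per horizon (a reading of the text above, not a statement about the library's internals) and compares the Benjamini-Yekutieli cutoff for the smallest p-value under both family definitions. The factor and horizon counts are hypothetical.

```python
def by_rank1_threshold(n_tests: int, q: float = 0.05) -> float:
    """Benjamini-Yekutieli cutoff for the smallest p-value in a family."""
    c_n = sum(1.0 / i for i in range(1, n_tests + 1))  # harmonic correction c(N)
    return q / (n_tests * c_n)


# Hypothetical screen: 6 factors evaluated at 3 horizons.
n_factors, n_horizons = 6, 3

pooled = by_rank1_threshold(n_factors * n_horizons)  # one family of 18 tests
per_horizon = by_rank1_threshold(n_factors)  # each horizon is its own family of 6

print(f"pooled family cutoff:      {pooled:.3e}")
print(f"per-horizon family cutoff: {per_horizon:.3e}")
```

The pooled cutoff is the stricter of the two, which is the price of being free to choose a horizon after looking at the results.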

Use this when

You have two or more candidate factors and want type-I error control under multiple testing.
Candidates are evaluated under a declared family: same research question, metric, and horizon unless you explicitly split with expand_over.
You prefer FDR control (BHY) over family-wise error rate control (Bonferroni). FDR control is appropriate when you can tolerate a bounded expected share of false discoveries among the survivors and power matters.
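
To make the power trade-off concrete, here is a minimal NumPy sketch that counts rejections under Bonferroni and under the Benjamini-Yekutieli step-up for one family. The p-values are hypothetical and the helper names are illustrative; neither is part of factrix.

```python
import numpy as np


def bonferroni_rejections(p_values, q=0.05):
    """Count p-values at or below the single Bonferroni cutoff q / N."""
    p = np.asarray(p_values, dtype=float)
    return int(np.sum(p <= q / p.size))


def by_rejections(p_values, q=0.05):
    """Count rejections of the Benjamini-Yekutieli step-up procedure."""
    p = np.sort(np.asarray(p_values, dtype=float))
    n = p.size
    c_n = np.sum(1.0 / np.arange(1, n + 1))  # harmonic correction c(N)
    thresholds = np.arange(1, n + 1) * q / (n * c_n)
    passing = np.nonzero(p <= thresholds)[0]
    # Step-up: reject everything up to the largest rank that passes.
    return int(passing[-1] + 1) if passing.size else 0


# Hypothetical family: eight small p-values and two clear nulls.
p_values = [0.001, 0.002, 0.003, 0.004, 0.0045, 0.006, 0.007, 0.008, 0.5, 0.9]
n_bonferroni = bonferroni_rejections(p_values)
n_by = by_rejections(p_values)
print(f"Bonferroni: {n_bonferroni} rejections, BY: {n_by} rejections")
```

BY is not uniformly more powerful: its cutoff for the smallest p-value, q / (N·c(N)), is stricter than Bonferroni's q / N, so a family with only one or two small p-values can fare worse under BY.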

What it tests

For an input family of size N, the BHY step-up sorts the p-values in ascending order, finds the largest rank k with p_(k) ≤ k·q / (N·c(N)), where c(N) = 1 + 1/2 + ... + 1/N, and keeps the results at ranks 1 through k. Cross-family aggregation is not performed; that remains the caller's responsibility.
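
The adjusted p-values that go with this procedure can be sketched in a few lines of NumPy. This is a standalone illustration of the textbook Benjamini-Yekutieli adjustment, not factrix's implementation; by_adjust is a hypothetical helper. The input is the raw p_value column from the illustrative output in section 3.

```python
import numpy as np


def by_adjust(p_values):
    """Benjamini-Yekutieli adjusted p-values, returned in input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    c_n = np.sum(1.0 / np.arange(1, n + 1))  # harmonic correction c(N)
    order = np.argsort(p, kind="stable")
    scaled = p[order] * n * c_n / np.arange(1, n + 1)
    # Running minimum from the largest p-value down keeps the adjusted
    # values monotone in rank, which is what makes this a step-up rule.
    monotone = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(monotone, 1.0)
    return adjusted


# Raw p-values copied from the illustrative output in section 3
# (variant_0 .. variant_4, then variant_0_copy).
raw_p = [2.136e-39, 4.052e-26, 6.682e-22, 3.987e-17, 7.407e-14, 2.136e-39]
adj_p = by_adjust(raw_p)
survives = adj_p <= 0.05

for raw, adj in zip(raw_p, adj_p):
    print(f"p_value={raw:.4g}  adj_p={adj:.4g}")
```

A result survives at level q exactly when its adjusted p-value is at most q. On these six inputs the sketch agrees with the adj_p column of the illustrative output to the printed precision, including the tied pair variant_0 and variant_0_copy.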

Output to read

bhy returns a dict of BhyResult containers keyed by metric label. Below, survivors and adj_p are attributes of one BhyResult, and results is the input list:

len(survivors) vs len(results) gives a coarse hit rate.
survivors[i].factor names a factor that cleared the family-adjusted bar; survivors[i].metrics["ic"].p_value is its raw p-value.
adj_p[i] is the BHY-adjusted p-value driving the survivor decision.

1. Setup

from __future__ import annotations

import factrix as fx
import numpy as np
import polars as pl
from factrix.preprocess import compute_forward_return

2. Build a single-family batch

Six candidate factors, all evaluated under Newey-West IC with forward_periods=5. This is a valid BHY input where one step-up actually controls the declared family.

We start from one ground-truth factor and add increasing IID noise to produce variants with varying signal strengths. Each variant is materialized under its own column name (variant_0 through variant_4) so evaluate(..., factor_cols=[name]) stamps a distinct factor per result. The extra variant_0_copy column is a deliberate duplicate: a different hypothesis identity with the same values, useful for the post-BHY redundancy check.

from factrix.metrics import ic

raw = fx.datasets.make_cs_panel(
    n_assets=100,
    n_dates=500,
    ic_target=0.08,
    seed=2024,
)
panel = compute_forward_return(raw, forward_periods=5)


def variant_panel(
    base: pl.DataFrame, *, name: str, scale: float, seed: int
) -> pl.DataFrame:
    """Add IID noise on top of the ground-truth factor."""
    rng = np.random.default_rng(seed)
    noisy = base["factor"].to_numpy() + scale * rng.standard_normal(base.height)
    return base.drop("factor").with_columns(pl.Series(name, noisy))


candidates = {
    f"variant_{i}": variant_panel(
        panel, name=f"variant_{i}", scale=0.5 + 0.3 * i, seed=100 + i
    )
    for i in range(5)
}
candidates["variant_0_copy"] = candidates["variant_0"].rename(
    {"variant_0": "variant_0_copy"}
)

results = []
for name, candidate_panel in candidates.items():
    results.extend(
        fx.evaluate(
            candidate_panel,
            metrics={"ic": ic(inference=fx.inference.NEWEY_WEST)},
            factor_cols=[name],
            forward_periods=5,
        ).values()
    )

# Raw p-values are descriptive here. FDR-controlled decisions wait for BHY.
for res in results:
    print(f"  {res.factor:12s} p_value={res.metrics['ic'].p_value:.4g}")
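
The noise scales used above weaken the signal in a predictable way. As a rough sketch, assume the base factor has unit variance and the added noise is independent of both the factor and the forward returns (an assumption about the synthetic panel, not a documented property of make_cs_panel). Then the correlation between a noisy variant and returns shrinks by a factor of 1 / sqrt(1 + scale²) relative to the base factor.

```python
import math

# Same noise scales as the candidate loop: 0.5 + 0.3 * i for i in 0..4.
scales = [0.5 + 0.3 * i for i in range(5)]

# Expected shrinkage of the factor-return correlation, under the
# unit-variance and independence assumptions stated in the lead-in.
attenuation = [1.0 / math.sqrt(1.0 + s * s) for s in scales]

for i, (s, a) in enumerate(zip(scales, attenuation)):
    print(f"variant_{i}: scale={s:.1f}  expected IC multiplier={a:.3f}")
```

The multiplier falls steadily from variant_0 to variant_4, which is why the raw p-values printed above are expected to grow with the variant index.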

3. Apply BHY

The input list is the family. bhy runs one Benjamini-Yekutieli step-up over all results and returns a dict of BhyResult containers keyed by metric label. Each BhyResult exposes .survivors, .adj_p, .q, .expand_over, and .n_tests for audit.

bhy_ic = fx.multi_factor.bhy(results, metrics=["ic"], q=0.05)["ic"]
print(f"BHY survivors: {len(bhy_ic.survivors)} / {len(results)}")
for res, adj in zip(bhy_ic.survivors, bhy_ic.adj_p, strict=True):
    print(
        f"  {res.factor:12s} p_value={res.metrics['ic'].p_value:.4g}  adj_p={adj:.4g}"
    )

# BhyResult also renders as a table in Jupyter.
bhy_ic

Illustrative output:

BHY survivors: 6 / 6
  variant_0    p_value=2.136e-39  adj_p=1.57e-38
  variant_1    p_value=4.052e-26  adj_p=1.985e-25
  variant_2    p_value=6.682e-22  adj_p=2.456e-21
  variant_3    p_value=3.987e-17  adj_p=1.172e-16
  variant_4    p_value=7.407e-14  adj_p=1.815e-13
  variant_0_copy p_value=2.136e-39  adj_p=1.57e-38

4. Check redundancy after BHY

BHY controls the false-discovery rate for the declared family; it does not say the survivors are economically distinct. Before treating a survivor list as a factor set, inspect pairwise factor correlation and run fixed spanning comparisons on spread series.

from factrix.metrics.quantile import compute_spread_series
from factrix.metrics.spanning import spanning_alpha

survivor_names = [res.factor for res in bhy_ic.survivors]
factor_panel = candidates[survivor_names[0]].select(
    "date", "asset_id", survivor_names[0]
)
for name in survivor_names[1:]:
    factor_panel = factor_panel.join(
        candidates[name].select("date", "asset_id", name),
        on=["date", "asset_id"],
    )

corr_rows = []
for left_idx, left in enumerate(survivor_names):
    for right in survivor_names[left_idx + 1 :]:
        mean_corr = (
            factor_panel.group_by("date")
            .agg(pl.corr(left, right).alias("corr"))
            .select(pl.col("corr").mean())
            .item()
        )
        corr_rows.append({"left": left, "right": right, "mean_cs_corr": mean_corr})

corr_table = pl.DataFrame(corr_rows).sort("mean_cs_corr", descending=True)
print(corr_table.head(5))

spreads = {
    name: compute_spread_series(
        candidates[name],
        forward_periods=5,
        n_groups=5,
        factor_cols=[name],
    )[name]
    for name in survivor_names
}
base_name = survivor_names[0]
spanning_rows = []
for name in survivor_names[1:]:
    out = spanning_alpha(spreads[name], base_spreads={base_name: spreads[base_name]})
    spanning_rows.append(
        {
            "candidate": name,
            "base": base_name,
            "alpha": out.value,
            "alpha_p_value": out.p_value,
            "r_squared": out.metadata.get("r_squared"),
        }
    )

spanning_table = pl.DataFrame(spanning_rows).sort("alpha_p_value")
print(spanning_table)

variant_0_copy passes BHY because it carries the same strong signal as variant_0, but the correlation and spanning rows show it is not new alpha. spanning_alpha is a fixed comparison diagnostic: choose the base factor intentionally, then read alpha and p-value. Do not read this as stepwise post-selection inference. greedy_forward_selection remains a model-construction helper, and its returned t-stats are explicitly marked invalid for inference.
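
For intuition about what a spanning comparison measures, the sketch below regresses a candidate spread series on a base spread series and reads the intercept. It is a plain OLS illustration on synthetic data with classical standard errors; it is not the spanning_alpha implementation, and ols_alpha and all series here are hypothetical. A near-copy is used instead of an exact copy because an exact duplicate leaves zero residual variance and an undefined t-statistic.

```python
import numpy as np


def ols_alpha(candidate, base):
    """Intercept, its t-statistic, and R^2 from regressing candidate on base."""
    y = np.asarray(candidate, dtype=float)
    x = np.column_stack([np.ones_like(y), np.asarray(base, dtype=float)])
    beta, *_ = np.linalg.lstsq(x, y, rcond=None)
    resid = y - x @ beta
    dof = y.size - x.shape[1]
    sigma2 = resid @ resid / dof  # classical (homoskedastic) residual variance
    cov = sigma2 * np.linalg.inv(x.T @ x)
    t_alpha = beta[0] / np.sqrt(cov[0, 0])
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return beta[0], t_alpha, r2


rng = np.random.default_rng(0)
base = 0.01 + 0.02 * rng.standard_normal(500)  # synthetic base spread
near_copy = base + 0.001 * rng.standard_normal(500)  # almost the same series
distinct = 0.01 + 0.02 * rng.standard_normal(500)  # independent, own mean

alpha_copy, t_copy, r2_copy = ols_alpha(near_copy, base)
alpha_new, t_new, r2_new = ols_alpha(distinct, base)
print(f"near copy: alpha={alpha_copy:.5f}  R^2={r2_copy:.3f}")
print(f"distinct:  alpha={alpha_new:.5f}  t={t_new:.1f}  R^2={r2_new:.3f}")
```

The near-copy is almost fully explained by the base (R² close to 1, alpha close to 0), while the independent series keeps an intercept the base cannot account for. That contrast is the pattern to look for in the spanning table.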

5. Duplicate-identity defense

evaluate() stamps each result's factor from its factor_cols name, so two results built off the same "factor" column both land at identity ("factor", forward_periods). Pass two such results to bhy() and the family-resolution layer raises UserInputError rather than silently treating distinct candidates as one hypothesis.

The canonical fix is what section 2 already does: distinct column name per panel plus evaluate(..., factor_cols=[name]). Below we deliberately take the colliding path to surface the error, then show the fix.

from factrix.metrics import fm_beta, ic

# Colliding path: both calls use factor_cols=["factor"].
unstamped = []
unstamped.extend(
    fx.evaluate(
        panel,
        metrics={"ic": ic(inference=fx.inference.NEWEY_WEST)},
        factor_cols=["factor"],
        forward_periods=5,
    ).values()
)
unstamped.extend(
    fx.evaluate(
        panel, metrics={"fm": fm_beta()}, factor_cols=["factor"], forward_periods=5
    ).values()
)
try:
    fx.multi_factor.bhy(unstamped, metrics=["ic"], q=0.05)
except fx.UserInputError as exc:
    print("UserInputError raised as expected:")
    print(str(exc))

# Canonical fix: rename each panel's factor column upfront so the
# returned results carry distinct `factor` identities.
stamped = []
stamped.extend(
    fx.evaluate(
        panel.rename({"factor": "ic_var"}),
        metrics={"ic": ic(inference=fx.inference.NEWEY_WEST)},
        factor_cols=["ic_var"],
        forward_periods=5,
    ).values()
)
stamped.extend(
    fx.evaluate(
        panel.rename({"factor": "fm_var"}),
        metrics={"fm": fm_beta()},
        factor_cols=["fm_var"],
        forward_periods=5,
    ).values()
)
print(f"\ncanonical fix identities: {[r.factor for r in stamped]}")
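
If you assemble large batches, it can help to check identities yourself before calling bhy. The sketch below is a hypothetical pre-flight helper, not factrix's family-resolution logic. It assumes only that each result exposes factor and forward_periods attributes, as described above, and uses a stand-in dataclass so that it runs without factrix.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """Hypothetical stand-in for an evaluation result's identity fields."""

    factor: str
    forward_periods: int


def find_duplicate_identities(results):
    """Return every (factor, forward_periods) pair that appears more than once."""
    counts = Counter((r.factor, r.forward_periods) for r in results)
    return sorted(key for key, n in counts.items() if n > 1)


colliding = [Result("factor", 5), Result("factor", 5)]
renamed = [Result("ic_var", 5), Result("fm_var", 5)]

duplicates_before = find_duplicate_identities(colliding)
duplicates_after = find_duplicate_identities(renamed)
print(f"colliding batch duplicates: {duplicates_before}")
print(f"renamed batch duplicates:   {duplicates_after}")
```

An empty list means every result carries a unique identity under this definition. Anything else names the collisions to fix by renaming factor columns before evaluation.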

6. Where to go next

For the broader structural details, see Panel vs timeseries, Large-scale evaluation, the bhy API reference, and the spanning_alpha API reference.