Batch screening with Benjamini-Hochberg-Yekutieli

Answers

What is Benjamini-Hochberg-Yekutieli (BHY), when to use it, and how multi_factor.bhy partitions the candidate set across statistical families. For the API signature, see multi_factor. For the underlying theorem and assumptions, see Statistical methods.

BHY controls false discovery rate (FDR) within a statistical family: evaluate multiple candidate factors under the same procedure, then apply step-up correction on the resulting p-values.

Do not mix families

p-values from information coefficient (IC) / FM / TS-β carry different null distributions and cannot be pooled. bhy() partitions automatically — see below — but if you assemble the input list yourself across procedures, the FDR guarantee breaks.

Basic usage¶

import factrix as fx

candidates = ["mom_5d", "mom_20d", "mom_60d"]
cfg = fx.AnalysisConfig.individual_continuous(metric=fx.Metric.IC, forward_periods=5)

profiles = [fx.evaluate(panel, cfg, factor_col=name) for name in candidates]
survivors = fx.multi_factor.bhy(profiles, q=0.05)
survivor_names = [p.factor_id for p in survivors.profiles]

evaluate() stamps factor_col into profile.factor_id, so the survivor → name mapping reads off the survivor profiles directly — no external name → profile dict, no is-comparison idiom. Survivors.profiles lists the survivors in their original input order; Survivors.adj_p carries the bucket-local BHY-adjusted p in matching order.

Each evaluate call repays the per-date cross-section overhead (sort / group-by / rank) on its own — that cost is intrinsic to producing one FactorProfile per signal in factrix today. bhy() operates on the resulting list for FDR control; it does not reduce the per-signal evaluation cost.

bhy() automatically partitions by (procedure, forward_periods) — you do not pass a group key. Same-procedure, same-horizon profiles form one family. Different horizons always split: each horizon carries its own null distribution and effective sample size; pooling dilutes the step-up threshold and silently inflates FDR.

If any family degenerates to size=1 (typical misuse: one factor evaluated across multiple scenarios), bhy() emits a RuntimeWarning — at size=1 BHY equals the raw threshold and provides no FDR correction.

Horizon-shopping correction¶

bhy() controls FDR within a horizon. If you sweep multiple horizons per factor and pick the minimum p, the horizon selection itself is hidden multiple testing (K = number of horizons). You must collapse the horizon dimension first with a family-wise error rate (FWER) procedure, then feed the result into BHY.

Background¶

The multiple-testing discipline for factor research established by Harvey-Liu-Zhu 2016 motivates correcting for selection once factor candidates and horizons are swept — a 5% nominal threshold no longer controls type-I error. factrix's specific composition (FWER across horizons, then FDR within) is a project-level application; HLZ themselves prescribe stricter thresholds, not this two-axis stack. The reason factrix picks FWER for the inner step is the dependence structure Boudoukh-Richardson-Whitelaw 2008 documents: under the null and a persistent regressor, ordinary least squares (OLS) slope estimators across horizons are highly correlated — approaching unity between adjacent horizons at dividend-yield-like persistence — so the K horizons behave more like one repeatedly-tested null than K independent draws, and the positive regression dependence on a subset (PRDS) condition fails. Independence- and PRDS-friendly FDR procedures (Benjamini-Hochberg (BH) / BHY) assume neither identity and lose their level guarantees in this regime.

Bailey & López de Prado (2014) formalises the parallel multiple-trials problem on the Sharpe axis (Deflated Sharpe Ratio) for backtest selection — same correction path, different statistic; not implemented in factrix.

Recommended FWER procedures:

Procedure	When to use
Bonferroni (`p × K`)	K small (≤ 5), horizons approximately independent
Holm (step-down)	K larger, or p-values vary widely in strength

Do not use BHY as the inner procedure

(1) Picking one representative p is a FWER problem, not FDR. (2) BHY ∘ BHY has no composition theorem. (3) For small K, BHY's c(m) factor makes it more conservative than Bonferroni anyway.

Do not flatten K × H profiles into one bhy() call

Flattening K factors × H horizons into K×H profiles and feeding them to bhy() directly is wrong. BHY partitions by horizon into H families of K — correct for "pick factors within each horizon" — but wrong for "pick best horizon per factor."