factrix.multi_factor.bhy ¶

bhy(profiles: Iterable[FactorProfile], *, expand_over: Sequence[str] | None = None, estimator: Estimator | None = None, q: float = 0.05) -> Survivors

Benjamini-Hochberg-Yekutieli (BHY) step-up false discovery rate (FDR) within one declared family; return the survivors.

The input list is treated as a single family. When expand_over is supplied, one independent step-up runs per unique tuple of profile.context[k] for k in expand_over (Benjamini & Bogomolov 2014 selective inference). Cell / horizon partitioning is the caller's responsibility — v0.5 retired the implicit auto-split.

Parameters:

Name	Type	Description	Default
`profiles`	`Iterable[FactorProfile]`	Iterable of `FactorProfile`. The full input is one family unless `expand_over` further partitions it.	required
`expand_over`	`Sequence[str] \| None`	Optional context keys whose distinct value tuples split the input into independent BHY step-up batches. `None` runs a single step-up over all profiles.	`None`
`estimator`	`Estimator \| None`	Optional inference-method override (#170). An :class:`~factrix.stats.Estimator` instance (e.g. `factrix.stats.NeweyWest()`) names which p-value to feed the step-up; `None` uses each profile's `primary_p`. The instance reports its applicability per cell and dispatches to a `StatCode` key in `profile.stats`.	`None`
`q`	`float`	Nominal false discovery rate target. The BHY step-up controls FDR ≤ q under positive regression dependence on a subset (PRDS); under arbitrary dependence the effective level is `q / sum(1/k for k in 1..n)`. Default `0.05`.	`0.05`

Returns:

Type	Description
`Survivors`	`Survivors` container in input order; `adj_p` carries the
`Survivors`	bucket-local BHY-adjusted p-value and the survivor set is
`Survivors`	defined as `adj_p <= q` (single source of truth — no separate
`Survivors`	rejection mask path).

Raises:

Type	Description
`UserInputError`	On any family-resolution invariant failure (unknown / identity-shadowing `expand_over` name; an `estimator` not applicable to a profile's cell or whose dispatched `StatCode` is unpopulated; duplicate partition key — typically fixed by setting unique `factor_id` per profile or splitting via `expand_over`).

Warns:

Type	Description
`RuntimeWarning`	When the input mixes `forward_periods` while `expand_over` is `None` — pooling horizons in one step-up dilutes the per-rank threshold and silently inflates FDR. Or when most `expand_over` buckets contain a single profile (BHY on n=1 is a raw cutoff and provides no FDR correction).

Notes

Horizon multiplicity is a two-axis problem — FDR within a horizon (this function's job) ∘ family-wise error rate (FWER) across horizons (the caller's job, before invoking bhy()). The two cannot be collapsed into a single step-up: under the null and a persistent regressor, slope estimators across horizons are nearly perfectly correlated, so pooling them dilutes the per-rank threshold and silently inflates FDR — the rationale for the forward_periods-mixed warning above. See the "Horizon-shopping correction" section of docs/guides/batch-screening.md for the recommended Bonferroni / Holm inner step.

References

Harvey, Liu & Zhu (2016). "…and the Cross-Section of Expected Returns." Review of Financial Studies, 29(1), 5–68. Multiple-testing discipline for factor research motivating correction for selection; the specific FWER ∘ FDR composition is factrix's application, not HLZ's prescription.
Boudoukh, Richardson & Whitelaw (2008). "The Myth of Long-Horizon Predictability." Review of Financial Studies, 21(4), 1577–1605. Across-horizon slope estimators are highly correlated under the null for persistent regressors — independence and PRDS fail when horizons share information through the predictor, so an FWER (independence-free) inner step is the factrix-side response to that dependence.

Examples:

Screen five candidate factors with BHY at q=0.05 — each panel is a seeded synthetic cross-section, factor_id is stamped post-hoc so the family resolver sees distinct identities:

>>> import dataclasses
>>> import factrix as fx
>>> from factrix.preprocess import compute_forward_return
>>> cfg = fx.AnalysisConfig.individual_continuous(forward_periods=5)
>>> profiles = [
...     dataclasses.replace(
...         fx.evaluate(
...             compute_forward_return(
...                 fx.datasets.make_cs_panel(
...                     n_assets=100, n_dates=250, seed=i,
...                 ),
...                 forward_periods=5,
...             ),
...             cfg,
...         ),
...         factor_id=f"alpha_{i}",
...     )
...     for i in range(5)
... ]
>>> survivors = fx.multi_factor.bhy(profiles, q=0.05)

Use cases¶

Screening a candidate pool with false discovery rate (FDR) control

Run evaluate over m candidate signals on the same return panel, feed the resulting list of FactorProfile to bhy, and read the surviving subset off the Survivors container. Controls FDR ≤ q under arbitrary dependence — the regime that matches a correlated factor pool (e.g. 200 momentum variants on one panel).
Splitting into independent families

Pass expand_over=[<context key>] to run one step-up per distinct bucket — for instance per regime_id or universe_id — under the Benjamini & Bogomolov (2014) selective-inference framing. Each bucket's m, threshold, and survivors stay self-contained.
Swapping the inference path

Pass estimator=NeweyWest() (or another Estimator) to feed a specific heteroskedasticity-and-autocorrelation-consistent (HAC) variant's p-value into the step-up math rather than primary_p. The procedure dispatches via Estimator.emits_for to a StatCode on profile.stats.
Auditing the family boundary

Survivors.expand_over / Survivors.n_tests / Survivors.q record the family declared and the m fed into each bucket's step-up, so the FDR claim is self-contained in the return object.

Survivors attributes¶

See the docstring Examples block above for the canonical multi-factor call. The returned Survivors container exposes:

Attribute	Type	Meaning
`survivors.profiles`	`list[FactorProfile]`	input order, surviving subset
`survivors.adj_p`	`np.ndarray`	bucket-local Benjamini-Hochberg-Yekutieli (BHY)-adjusted p-value, index-aligned with `profiles`
`survivors.q`	`float`	the nominal target you passed
`survivors.expand_over`	`tuple[str, ...]`	`()` for a single family; `("regime_id",)` etc. otherwise
`survivors.n_tests`	`dict[tuple, int]`	`{(): N}` or `{bucket_key: m_per_bucket}`

Jupyter rendering surfaces a three-column text / HTML table of identity | primary_p | adj_p, plus an expand_over_values column when buckets are declared.

The input list is the family. bhy runs one Benjamini–Yekutieli step-up over all profiles by default, returning the surviving subset wrapped in a Survivors container with rich Jupyter rendering (three-column text / HTML table of identity | primary_p | adj_p, plus an expand_over_values column when buckets are declared). Each panel carries its factor under a distinct column name and evaluate(..., factor_col=name) auto-stamps factor_id from that name; this is the canonical multi-factor pattern. When you cannot rename the column (e.g. an upstream loader fixes it), reach for dataclasses.replace(profile, factor_id=...) as an escape hatch.

Set expand_over=[<context key>] to declare per-bucket independent families (Benjamini & Bogomolov 2014 selective inference); for example, expand_over=["regime_id"] runs one step-up per regime.

bhy implements the Benjamini–Yekutieli (2001) procedure with the harmonic dependence correction \(c(m) = \sum_{i=1}^{m} 1/i\) baked into the threshold. Pass your nominal q directly — no manual division. The procedure controls FDR ≤ q under arbitrary positive or negative dependence at the cost of a \(1/\ln m\) power loss relative to plain Benjamini-Hochberg (BH). Plain Benjamini–Hochberg (1995) is not offered: typical factor-pool dependence violates its Positive Regression Dependency on a Subset (PRDS) assumption.

Parameters¶

Kwarg	Default	Meaning
`expand_over`	`None`	Context keys whose distinct value tuples split the input into independent step-ups. Names must live in `FactorProfile.context` (never identity).
`estimator`	`None` (= `primary_p`)	An `Estimator` instance (e.g. `NeweyWest()`) selecting which inference method's p-value to feed into the step-up math. Dispatches via `Estimator.emits_for` to a `StatCode` key on `profile.stats` (see #170, #187).
`q`	`0.05`	Nominal false discovery rate target. The Benjamini–Yekutieli \(c(m)\) correction is applied internally — pass the level you actually want; do not pre-divide.

Identity vs context (anti-shopping defense)¶

Identity = (factor_id, forward_periods) — names which hypothesis. Context = mutable dict of slicing conditions (universe_id, regime_id, …) — names which slice of the data the hypothesis was tested on. expand_over may only name context keys, never identity.

Concretely: if expand_over=["factor_id"] were allowed, every factor would land in its own size-1 family and pass its own step-up trivially — FDR control across the screening universe collapses. Same logic for forward_periods: per-horizon families would let a candidate "shop" the horizon at which its noise happens to clear the threshold. Forcing expand_over to live in context keeps the family axis orthogonal to the hypothesis being tested. See Development § Architecture § Family functions for the full invariant list.

Sample restriction vs hypothesis dimension¶

A context key (universe_id, regime_id, …) can play one of two roles in a screening run, and the role you intend dictates whether to pre-filter or to pass it through expand_over:

Sample restriction — you have already committed to a single slice (e.g. "this study runs on tw50 only"). The context value is a pre-registered scope, not a tested dimension. Filter the input list upstream and call bhy without naming that key. FDR is controlled over the implied family factor_id × forward_periods.
Hypothesis dimension — you genuinely want to ask "across these contexts, which factor / context combinations survive?" The context value is part of the hypothesis identity at the family level. Pass it via expand_over=[<key>]; one independent step-up runs per distinct value tuple and FDR is controlled within each bucket (Benjamini & Bogomolov 2014).

User intent	API call	Family scope per step-up
"Run BHY on the `tw50` universe only"	`bhy([p for p in profiles if p.context["universe_id"] == "tw50"])`	`factor_id × forward_periods`
"Test the same factors on `tw50` and `tw100`, treating universe as a tested dimension"	`bhy(profiles, expand_over=["universe_id"])`	`factor_id × forward_periods × universe_id`, one step-up per universe
"Run BHY only in `bull` regime months"	`bhy([p for p in profiles if p.context["regime_id"] == "bull"])`	`factor_id × forward_periods`
"Test across `bull` and `bear` regimes as separate hypotheses"	`bhy(profiles, expand_over=["regime_id"])`	one step-up per regime
"Test universe × regime as a joint hypothesis grid"	`bhy(profiles, expand_over=["universe_id", "regime_id"])`	one step-up per `(universe, regime)` cell

bhy deliberately offers no implicit default for ambiguous contexts: if the same (factor_id, forward_periods) appears more than once with no expand_over, the duplicate-partition check raises with both fixes (canonical factor_id rename and expand_over=[<key>]) called out in the error. This forces an explicit commitment to the family boundary before any step-up runs.

A context key can graduate from sample restriction to hypothesis dimension when the study scope widens (extending a tw50-only run to also cover tw100 once tw100 profiles exist). What the API refuses to make easy is the reverse path: re-running with expand_over= toggled to whichever shape produces more survivors. That is p-hacking on the family boundary itself — the family must be declared before inspecting the adjusted p-values it produces.

Return type: `Survivors` (#171)¶

bhy returns a Survivors container, not a bare list. The container is procedure-agnostic — adj_p carries the function's procedure-canonical adjusted p-value (BHY here; future Holm / Bonferroni / Romano-Wolf functions populate the same shape via their own *_adjusted_p).

Survivors carries only the kept rows. The construction rule inside bhy is: compute adj_p_all over the full bucket-local input via bhy_adjusted_p, then keep {i : adj_p_all[i] <= q} and slice both profiles and adj_p to that index set. The survivor mask is derived from the same adjusted p-values that downstream code reads — not a separate rejection-mask path — so tie / boundary edge cases where two parallel step-up implementations could disagree are eliminated by construction.

adj_p[i] is computed within profiles[i]'s own expand_over bucket — bucket-local n and p_array, never pooled across buckets. This is the per-family adjustment that selective-inference theory (Benjamini & Bogomolov 2014) requires; n_tests[bucket_key] records the m fed into each step-up so the audit trail is self-contained.

Migration from the v0.10 list-of-profiles return:

# before
survivors = bhy(profiles, q=0.05)
for s in survivors:
    print(s.factor_id)

# after
survivors = bhy(profiles, q=0.05)
for s in survivors.profiles:
    print(s.factor_id)
# or just `repr(survivors)` in Jupyter for the table view

factrix.multi_factor.Survivors `dataclass` ¶

Survivors(profiles: list[FactorProfile], adj_p: ndarray, q: float, expand_over: tuple[str, ...], n_tests: Mapping[tuple[Any, ...], int], pc_p: ndarray | None = None, min_pass: int | None = None, n_passed_uncorr: ndarray | None = None)

Survivor container for multi-factor screening functions with rich Jupyter rendering.

Procedure-agnostic: adj_p carries the function's procedure-canonical adjusted p-value (Benjamini-Hochberg-Yekutieli (BHY) bhy_adjusted_p, Holm step-down, Bonferroni min(p*m, 1), Romano-Wolf resampling, ...). The contract is survivor[i] iff adj_p[i] <= q — a duality every step-up / step-down family procedure satisfies.

Invariants

len(profiles) == len(adj_p) and entries align in input order. Per-bucket independent step-up uses bucket-local n and p_array; adj_p[i] reflects profiles[i]'s own bucket only (family-partition idea from Benjamini-Bogomolov (2014); factrix does not apply BB14's selection-adjusted within- family level inflation), not a global cross-bucket adjustment.

Attributes:

Name	Type	Description
`profiles`	`list[FactorProfile]`	Survivors in input order.
`adj_p`	`ndarray`	Bucket-local adjusted p-values aligned with `profiles`.
`q`	`float`	Nominal false discovery rate (FDR) (or family-wise) target shared across all buckets.
`expand_over`	`tuple[str, ...]`	Context keys used to partition the input into independent step-up buckets (`bhy`) or to aggregate conditions per identity (`partial_conjunction`). Empty tuple when the full input is one family.
`n_tests`	`Mapping[tuple[Any, ...], int]`	Family size per bucket fed into the step-up math. Keying depends on the producing function: `bhy` keys by `expand_over_values` tuple (`()` for the single-bucket case); `partial_conjunction` keys by `identity` tuple (`(factor_id, forward_periods)`) and records the `m` condition count per identity.
`pc_p`	`ndarray \| None`	Raw partial-conjunction p-value per survivor (Benjamini-Heller (2008) Bonferroni-style: `(m - min_pass + 1) * p_((min_pass))`, capped at 1). `None` when the producing function is not `partial_conjunction`.
`min_pass`	`int \| None`	`k` in the `k` of `m` partial conjunction test. `None` when the producing function is not `partial_conjunction`.
`n_passed_uncorr`	`ndarray \| None`	Per-identity count of raw p-values strictly below `q` (descriptive — not used in inference; flags borderline cases and data gaps at a glance). The cutoff is the caller's `q` (same value driving the BHY step-up), so this count moves with `q`; using it to override `adj_p` survivor selection is the anti-shopping failure mode `partial_conjunction` exists to prevent. `None` when the producing function is not `partial_conjunction`.

Examples:

>>> import dataclasses
>>> import factrix as fx
>>> from factrix.preprocess import compute_forward_return
>>> cfg = fx.AnalysisConfig.individual_continuous(forward_periods=5)
>>> profiles = [
...     dataclasses.replace(
...         fx.evaluate(
...             compute_forward_return(
...                 fx.datasets.make_cs_panel(n_assets=30, n_dates=120, seed=i),
...                 forward_periods=5,
...             ),
...             cfg,
...         ),
...         factor_id=f"alpha_{i}",
...     )
...     for i in range(3)
... ]
>>> survivors = fx.multi_factor.bhy(profiles, q=0.05)
>>> isinstance(survivors, fx.multi_factor.Survivors)
True
>>> len(survivors) == len(survivors.profiles)
True
>>> survivors.q == 0.05
True

Behaviour change (#161)¶

Before	After
`bhy(profiles, threshold=0.05)`	`bhy(profiles, q=0.05)` (`threshold=` removed in v0.12.0; raises `TypeError`)
`bhy(profiles, gate=StatCode.X)`	`bhy(profiles, p_stat=StatCode.X)` (`gate=` removed in v0.11.0)
auto-partition by dispatch cell × horizon	caller declares the family; mixed `forward_periods` without `expand_over` emits `RuntimeWarning`. Fix: split the call per horizon, or pass `expand_over=[<context key>]` if profiles legitimately co-exist as one family across horizons.
same `factor_id` across cells silently auto-split	raises `UserInputError` (duplicate identity). Fix (canonical): name each panel's factor column distinctly and pass `evaluate(..., factor_col=name)`. Fix (escape hatch): post-hoc stamp via `dataclasses.replace(profile, factor_id=...)`. Or: use `expand_over` if profiles really do share identity but belong in separate test buckets.

Design rationale¶

For why BHY (rather than Bayesian or reality-check / SPA bootstraps) see Reference § Statistical methods § Multiple-testing and Development § Design notes § BHY. For the architectural place of _resolve_family and the closed-form vs resampling-based function classification see Development § Architecture § Family functions.