Input contract — the panel must satisfy the four-column floor documented in Panel schema.
factrix.evaluate
evaluate(panel: Any, config: AnalysisConfig | None = None, /, *, factor_col: str = 'factor') -> FactorProfile
Evaluate one factor against its forward returns and return a FactorProfile.
The profile carries primary_p (the headline p-value for downstream
false discovery rate (FDR) control), the cell-specific statistics,
sample-size diagnostics, warnings, and the identity / context tuple
used by the multi-factor aggregators
(bhy / partial_conjunction / bhy_hierarchical).
All factrix-raised errors inherit from FactrixError.
Dispatch lore — cell schema, Mode, multi-factor cost
Dispatch is explicit: there is no auto-fallback when the panel shape
does not match the cell. The one exception is Common × Continuous
at N == 1, which auto-routes to the TIMESERIES single-series path
(profile.mode == "TIMESERIES") so single-asset macro factors
still flow through.
Required columns per cell. Every cell floors its
INPUT_SCHEMA at the same four columns. Optional columns
activate additional standalone metrics; when a column is absent,
the metrics that depend on it short-circuit gracefully (NaN + reason).
| Cell | Required | Optional column → enables |
|---|---|---|
| Individual × Continuous (ic, fama_macbeth) | date, asset_id, factor, forward_return | market_cap (or any name passed as weight_col=) → quantile_spread_vw value-weighting |
| Individual × Sparse (event studies) | date, asset_id, factor, forward_return | price → event_around_return, mfe_mae_summary (degrade gracefully if absent) |
| Common × Continuous (broadcast macro factor) | date, asset_id, factor, forward_return | none |
| Common × Sparse (broadcast event dummy) | date, asset_id, factor, forward_return | none |
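As a rough illustration of the four-column floor, the sketch below checks a list of column names before a panel is handed to evaluate. The helper missing_required is hypothetical and not part of factrix, and it assumes that factor_col simply replaces the name "factor" in the required set.

```python
# Hypothetical pre-flight check; NOT a factrix function.
# Assumption: factor_col substitutes for "factor" in the four-column floor.
REQUIRED_COLUMNS = ("date", "asset_id", "factor", "forward_return")


def missing_required(columns, factor_col="factor"):
    """Return the required column names absent from `columns`, in contract order."""
    required = tuple(factor_col if name == "factor" else name
                     for name in REQUIRED_COLUMNS)
    present = set(columns)
    return [name for name in required if name not in present]


print(missing_required(["date", "asset_id", "factor"]))
# → ['forward_return']
print(missing_required(["date", "asset_id", "alpha", "forward_return"],
                       factor_col="alpha"))
# → []
```

An empty list means the floor is satisfied; optional columns such as market_cap or price are deliberately not checked here because their absence only disables extra metrics.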
forward_return is part of the input contract — attach it via
compute_forward_return
before the call so the horizon is explicit and aligned with
config.forward_periods.
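To make the horizon alignment concrete, here is a minimal sketch of an h-period forward return for a single asset. It assumes simple returns computed from a price series; the formula and input columns that compute_forward_return actually uses are not documented in this section, so forward_simple_return is a hypothetical stand-in only.

```python
# Hypothetical illustration; NOT the factrix implementation.
# Assumption: forward return = price[t + h] / price[t] - 1 (simple return).
def forward_simple_return(prices, h):
    """h-period forward simple return; the last h rows have no future price."""
    n = len(prices)
    return [
        prices[t + h] / prices[t] - 1.0 if t + h < n else None
        for t in range(n)
    ]


prices = [100.0, 110.0, 121.0, 133.1]
returns = forward_simple_return(prices, h=2)
print(returns)  # two finite values followed by two None entries
```

Under this assumption the last h observations per asset carry no forward return, which is why the horizon used to build the column should match config.forward_periods.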
Mode — PANEL vs TIMESERIES. Derived at evaluate-time from
N = panel["asset_id"].n_unique() and surfaced on
profile.mode:
| profile.mode | When | Inference |
|---|---|---|
| "PANEL" | N ≥ 2 cross-sectional / event cells | per-date statistic → time-series mean with Newey-West (NW) heteroskedasticity-and-autocorrelation-consistent (HAC) |
| "TIMESERIES" | Common × Continuous with N == 1 | single-series ordinary least squares (OLS) with plain SE; HAC only on stage-2 aggregation |
Full conventions: Timeseries-mode conventions. Sample-guard contract: Panel vs timeseries.
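The Mode derivation in the table can be sketched as a small function over the asset_id values. derive_mode is a hypothetical helper, and the scope / signal strings ("common", "individual", "continuous") are assumptions for illustration. Cells that the table does not cover at N == 1 raise here because this section does not say how factrix handles them.

```python
# Hypothetical sketch of the Mode table; NOT the factrix dispatcher.
def derive_mode(asset_ids, scope, signal):
    """Map the number of distinct assets (N) and the cell to a mode label."""
    n_assets = len(set(asset_ids))
    if n_assets >= 2:
        return "PANEL"
    if n_assets == 1 and (scope, signal) == ("common", "continuous"):
        return "TIMESERIES"
    # Not covered by the table above; the real behaviour is not assumed here.
    raise ValueError(f"no documented mode for N={n_assets}, cell={scope} x {signal}")


print(derive_mode(["A", "B", "A"], "individual", "continuous"))
# → PANEL
print(derive_mode(["SPX", "SPX", "SPX"], "common", "continuous"))
# → TIMESERIES
```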
Multi-factor cost. Each call repeats the per-date
cross-section work (sort / group-by / rank / Herfindahl-Hirschman index (HHI)) on its own, so
cost scales as O(n_factors × per_date_cost). There is no
shared-pass primitive; bhy controls
FDR but does not reduce the per-signal evaluation cost.
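For orientation on the FDR step, the sketch below applies a Benjamini-Yekutieli step-up rule to a list of p-values such as the primary_p values collected from separate evaluate calls. It assumes that bhy refers to the Benjamini-Hochberg-Yekutieli procedure; bhy_reject is a hypothetical stand-alone function and says nothing about the signature or return type of the factrix aggregator.

```python
# Hypothetical stand-alone sketch; NOT factrix.multi_factor.bhy.
# Assumption: "bhy" means the Benjamini-Yekutieli step-up procedure.
def bhy_reject(p_values, q=0.05):
    """Return one boolean per p-value: True where the null is rejected at FDR level q."""
    m = len(p_values)
    if m == 0:
        return []
    # Harmonic correction c(m) = sum_{i=1..m} 1/i guards against dependence.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / (m * c_m):
            k_max = rank  # step-up: keep the largest rank that passes
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


print(bhy_reject([0.001, 0.2, 0.004, 0.9], q=0.05))
# → [True, False, True, False]
```

The correction runs once over m p-values, so its cost is negligible next to the m separate evaluations that produce them.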
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| panel | Any | Long-format panel satisfying the four-column floor | required |
| config | AnalysisConfig \| None | Validated | None |
| factor_col | str | Name of the signal column on | 'factor' |
Returns:
| Type | Description |
|---|---|
| FactorProfile | |
Raises:
| Type | Description |
|---|---|
| MissingConfigError | |
| IncompatibleAxisError | |
| ModeAxisError | Legal cell has no procedure under the derived |
| InsufficientSampleError | |
| ValueError | |
Examples:
Single-factor inference on a cross-sectional panel:
>>> import factrix as fx
>>> from factrix.preprocess import compute_forward_return
>>> raw = fx.datasets.make_cs_panel(n_assets=100, n_dates=250)
>>> panel = compute_forward_return(raw, forward_periods=5)
>>> cfg = fx.AnalysisConfig.individual_continuous(forward_periods=5)
>>> profile = fx.evaluate(panel, cfg)
Non-default signal column name:
>>> panel_renamed = panel.rename({"factor": "alpha"})
>>> profile = fx.evaluate(panel_renamed, cfg, factor_col="alpha")
Multi-factor screening with FDR: see factrix.multi_factor.bhy.
Use cases
- Single-factor significance
  One panel + one AnalysisConfig → one FactorProfile carrying primary_p and the cell-specific statistics.
- Batch screening with false discovery rate (FDR)
  Loop evaluate over candidate signal columns and feed the resulting list of profiles to bhy for false-discovery-rate control. See Batch screening.
- Cross-cell apples-to-apples
  Swap the AnalysisConfig factory to compare information coefficient (IC) rank-ordering against Fama-MacBeth λ on the same panel, or individual-asset factors against broadcast macro factors. Return shape is identical across cells.
- TIMESERIES auto-routing
  Common × Continuous with N == 1 falls back to single-series ordinary least squares (OLS) with Newey-West heteroskedasticity-and-autocorrelation-consistent (HAC) SE, so single-asset macro factors flow through the same entry point without a parallel code path.
Worked example: single-factor smoke test
Synthetic panel → evaluate → read primary_p + diagnose()
Full runnable example complementing the doctest snippets in Examples
above with realistic console output and a diagnose() dump.
import factrix as fx
from factrix.preprocess import compute_forward_return
raw = fx.datasets.make_cs_panel(
    n_assets=100, n_dates=500, ic_target=0.08, seed=2024,
)
panel = compute_forward_return(raw, forward_periods=5)
cfg = fx.AnalysisConfig.individual_continuous(
    metric=fx.Metric.IC, forward_periods=5,
)
profile = fx.evaluate(panel, cfg)
print("primary_p =", round(profile.primary_p, 4))
# → primary_p = 0.0
print(profile.diagnose())
# {'identity': {'factor_id': 'factor', 'forward_periods': 5},
# 'context': {},
# 'cell': {'scope': 'individual', 'signal': 'continuous',
# 'metric': 'ic', 'mode': 'panel'},
# 'n_obs': 494, 'n_pairs': 49400, 'n_periods': 494, 'n_assets': 100,
# 'primary_p': 2.13e-40,
# 'primary_stat': 14.60,
# 'primary_stat_name': 't_nw',
# 'warnings': [], 'info_notes': [],
# 'stats': {'mean': 0.0722, 't_nw': 14.60, 'p_nw': 2.13e-40},
# 'metadata': {'t_nw': {'nw_lags': 5}, 'p_nw': {'nw_lags': 5}}}
Config recipes: one per dispatch cell
Minimum-viable AnalysisConfig for each of the four cells. The
evaluate(panel, cfg) call site is identical; only cfg changes.
Rank predictive ordering — Spearman IC + Newey-West (NW) HAC.
Unit-of-exposure premium — Fama-MacBeth λ.
Event study with sparse {0, R} event triggers (R is any real
magnitude; {0, 1} for a pure event flag is the simplest form).
Attach a price
column on the panel to also get event_around_return /
mfe_mae_summary in the profile.
Broadcast macro factor (e.g. VIX). With N == 1 on the panel,
evaluate auto-routes to single-series OLS with NW HAC SE
(profile.mode == "TIMESERIES").
Per-cell required / optional columns and the PANEL ↔ TIMESERIES Mode derivation are documented in the Dispatch lore admonition above.
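To show what the first recipe measures, the sketch below computes a Spearman rank IC for one date and a Newey-West t-statistic for the mean of a per-date IC series. It is a stdlib-only illustration under stated assumptions (average ranks for ties, Bartlett kernel weights, population-style autocovariances divided by n). The helper names are hypothetical, and factrix's own lag selection, small-sample corrections, and p-value distribution are not reproduced.

```python
import math


# Hypothetical helpers; NOT the factrix implementation.
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_ic(factor, forward_return):
    """Rank IC for one date: Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(factor), average_ranks(forward_return))


def newey_west_t(series, lags):
    """t-statistic of the series mean with a Bartlett-kernel HAC variance."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    long_run_var = sum(d * d for d in dev) / n
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)  # Bartlett weight
        gamma = sum(dev[t] * dev[t - lag] for t in range(lag, n)) / n
        long_run_var += 2.0 * weight * gamma
    return mean / math.sqrt(long_run_var / n)


ic_today = spearman_ic([0.3, -1.2, 0.8, 0.1], [0.01, -0.02, 0.03, 0.00])
print(ic_today)
```

In PANEL mode the same idea applies date by date: one IC per cross-section, then a single HAC-adjusted test on the resulting time series.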
Next steps
- Batch screening guide
  Wires evaluate into the multi-factor FDR pipeline: loop over candidates while preserving identity / context; choose between bhy / partial_conjunction / bhy_hierarchical; mixed-cell batches; primary_p vs stats at the FDR stage.
- Panel schema
  New to the input contract? Start here for the four-column floor (date, asset_id, factor, forward_return), dtype semantics, and optional columns that activate extra metrics.
See also
- Timeseries-mode conventions
  The N == 1 auto-routing rules and SE conventions for single-series paths.
- Panel vs timeseries sample guard
  Sample-size floors and the InsufficientSampleError recovery path.
- run_metrics: descriptive twin
  Computes the same statistics but makes no FDR claim. Use when you want the numbers without the inference framing.