Input contract — the panel must satisfy the four-column floor documented in Panel schema.

factrix.run_metrics

run_metrics(panel: DataFrame, cfg: AnalysisConfig, *, factor_col: str = 'factor', metrics: list[str] | None = None) -> MetricsBundle

Run every standalone descriptive metric the cell exposes.

Parallel to factrix.evaluate: both consume (panel, cfg), and each produces a disjoint result type. run_metrics collects the cell's factrix.metrics surface and returns a MetricsBundle keyed by metric name. The information coefficient (IC) family is computed from one shared, cached compute_ic result; every other panel-direct metric is called directly.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| panel | DataFrame | Canonical-column panel (date, asset_id, factor, forward_return). Renamed internally if factor_col is not "factor", mirroring factrix.evaluate. | required |
| cfg | AnalysisConfig | Validated AnalysisConfig. cfg.scope and cfg.signal route metric discovery; cfg.forward_periods (a single int) is the horizon every metric sees. Cross-horizon sweeps go through a user-written comprehension into compare(bundles) or bhy(profiles, expand_over=["forward_periods"]); run_metrics itself does not loop horizons (#147 §E / #186). | required |
| factor_col | str | Name of the signal column on panel. Renamed to "factor" internally before dispatch so metric callables see the canonical schema. | 'factor' |
| metrics | list[str] \| None | None (default) auto-discovers every applicable metric from factrix.list_metrics. Pass an explicit list to run a subset; unknown names, or names in factrix._metric_index._AUTO_DISCOVER_EXCLUDED, raise factrix.UserInputError with a fix path (a fuzzy suggestion plus the explicit-import recipe for stage-1 consumers). | None |

Returns:

| Type | Description |
| --- | --- |
| MetricsBundle | A MetricsBundle whose metrics map carries every MetricOutput produced (including short-circuit outputs for sample-floor failures) and whose skipped map carries per-metric reasons for metrics that could not auto-run. |

Raises:

| Type | Description |
| --- | --- |
| UserInputError | metrics contains an unknown name or a name that needs explicit kwargs not threaded by run_metrics v1. Carries a fuzzy suggestion plus the documented reason from factrix._metric_index._AUTO_DISCOVER_EXCLUDED. |
| RunMetricsError | Wraps an unexpected exception raised inside a metric callable or the IC stage-1 helper. Treat as a likely factrix bug; the original exception is chained via __cause__. Sample-floor exceptions (factrix.InsufficientSampleError) and metric-internal short-circuits are converted to short-circuit MetricOutput entries inside the bundle, not raised. |

Examples:

Auto-discover every applicable metric for the cell:

>>> import factrix as fx
>>> from factrix.preprocess import compute_forward_return
>>> raw = fx.datasets.make_cs_panel(n_assets=100, n_dates=250)
>>> panel = compute_forward_return(raw, forward_periods=5)
>>> cfg = fx.AnalysisConfig.individual_continuous(forward_periods=5)
>>> bundle = fx.run_metrics(panel, cfg)
>>> ic_output = bundle["ic"]
>>> long_frame = bundle.to_frame()
>>> skipped = dict(bundle.skipped)

Restrict to a subset by name:

>>> bundle = fx.run_metrics(panel, cfg, metrics=["ic"])

Cell-level descriptive batch runner — the descriptive twin of evaluate. Where evaluate runs a cell's primary inferential procedure and returns a FactorProfile, run_metrics fans out across the cell's standalone descriptive metrics in factrix.metrics.* and returns a MetricsBundle.

The two paths share the (panel, cfg) entry contract; their result types are disjoint by design.

| Function | Returns | Use when |
| --- | --- | --- |
| evaluate | FactorProfile (primary p, drives false discovery rate (FDR)) | you want the single inferential decision for the cell |
| run_metrics | MetricsBundle (cell's descriptive surface) | you want every standalone metric the cell exposes for plotting / dashboards / cross-factor comparison |

Both can be called on the same (panel, cfg); neither is a prerequisite for the other.

Identity uniqueness in sweeps

panel follows the same canonical schema as evaluate (date, asset_id, factor, forward_return); factor_col renames an alternate signal column to "factor" internally before dispatch. bundle.identity = (factor_col, cfg.forward_periods) — when looping over candidate signals, pass factor_col=name for each so bundle.identity stays unique across the sweep (otherwise every bundle reports ("factor", h) and concatenated to_frame() outputs collide).
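The collision can be illustrated without factrix. The sketch below uses a hypothetical stand-in, fake_run_metrics, that reproduces only the identity rule (factor_col, forward_periods) stated above; it is not the library's implementation, and the signal names are made up.

```python
from collections import Counter


def fake_run_metrics(factor_col: str = "factor", forward_periods: int = 5):
    """Hypothetical stand-in: returns only the identity a bundle would carry."""
    return (factor_col, forward_periods)


signals = ["momentum", "value", "quality"]  # illustrative column names

# Forgetting factor_col: every bundle reports the default ("factor", 5).
colliding = [fake_run_metrics() for _ in signals]

# Passing factor_col=name keeps each identity distinct across the sweep.
unique = [fake_run_metrics(factor_col=name) for name in signals]


def has_duplicates(identities):
    """True when at least one identity appears more than once."""
    return any(count > 1 for count in Counter(identities).values())
```

A check such as has_duplicates over the identities of a sweep, run before concatenating to_frame() outputs, is one inexpensive way to catch the problem early.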

What auto-discover runs

metrics=None (default) runs every metric list_metrics(cfg.scope, cfg.signal) exposes for the cell, after three filters:

  1. input_kind == "panel" — drops scalar-input utilities (breakeven_cost, net_spread).
  2. _STAGE1_HELPERS — drops shared compute_* helpers (they produce stage-1 frames consumed by other metrics).
  3. _AUTO_DISCOVER_EXCLUDED — drops metrics that need explicit kwargs run_metrics does not thread (per-row reason; surfaced on bundle.skipped).
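
The three filters above can be sketched as a single pass over a registry. Everything here is a simplified assumption: MetricSpec, REGISTRY, and discover are hypothetical names, and the input_kind values assigned to compute_ic and caar are guesses for illustration, not the library's actual registry contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricSpec:
    """Hypothetical registry row; the real registry shape may differ."""
    name: str
    input_kind: str


# Illustrative registry. Metric names come from the text; input_kind
# values for compute_ic and caar are assumptions.
REGISTRY = [
    MetricSpec("ic", "panel"),
    MetricSpec("ic_ir", "panel"),
    MetricSpec("breakeven_cost", "scalar"),
    MetricSpec("compute_ic", "panel"),
    MetricSpec("caar", "panel"),
]
STAGE1_HELPERS = {"compute_ic"}
AUTO_DISCOVER_EXCLUDED = {"caar": "needs explicit kwargs; call it directly"}


def discover(registry):
    """Apply the three filters in order; return (runnable names, skipped reasons)."""
    runnable, skipped = [], {}
    for spec in registry:
        if spec.input_kind != "panel":           # filter 1: scalar-input utilities
            continue
        if spec.name in STAGE1_HELPERS:          # filter 2: shared compute_* helpers
            continue
        if spec.name in AUTO_DISCOVER_EXCLUDED:  # filter 3: recorded with a reason
            skipped[spec.name] = AUTO_DISCOVER_EXCLUDED[spec.name]
            continue
        runnable.append(spec.name)
    return runnable, skipped


runnable, skipped = discover(REGISTRY)
```

Only the third filter leaves a trace, which matches the description of bundle.skipped as carrying per-metric reasons.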

In v1 the information coefficient (IC) family (ic, ic_newey_west, ic_ir) shares a single compute_ic(panel) per call. Other stage-1 consumers (caar, fama_macbeth, ts_beta, mfe_mae_summary, plus series / spread consumers) live in the auto-discover exclusion set; the bundle's skipped map carries the explicit-import recipe for each. v1.x will extend stage-1 wiring per cell.
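
The shared stage-1 pattern can be sketched as follows. This is a toy model, not factrix code: compute_ic here simply returns the per-date IC values it is given, and the formulas for ic (mean) and ic_ir (mean over sample standard deviation) are common textbook definitions assumed for illustration. ic_newey_west is omitted to avoid guessing at its estimator.

```python
from statistics import mean, stdev

call_count = {"compute_ic": 0}


def compute_ic(panel):
    """Toy stage-1 helper: counts calls and returns a per-date IC series."""
    call_count["compute_ic"] += 1
    return list(panel)


# Toy consumers of the shared series (assumed definitions).
IC_FAMILY = {
    "ic": lambda series: mean(series),
    "ic_ir": lambda series: mean(series) / stdev(series),
}


def run_ic_family(panel):
    """Compute the stage-1 series once and fan it out to every consumer."""
    ic_series = compute_ic(panel)
    return {name: fn(ic_series) for name, fn in IC_FAMILY.items()}


results = run_ic_family([0.02, 0.05, -0.01, 0.04])
```

However many consumers are registered, the helper runs once per call, which is the property the paragraph above describes.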

Explicit subset error semantics

When metrics= is passed an explicit list (see the docstring Examples block above for the canonical call shape), unknown names raise UserInputError with a fuzzy suggestion plus the full candidate list. Names registered for the cell but in the auto-discover exclusion set raise the same error type with the documented reason and the explicit-call recipe — run_metrics never silently drops a name the caller asked for.
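
One way such validation could work is sketched below. The use of difflib.get_close_matches for the fuzzy suggestion is an assumption (the source does not say how suggestions are produced), and UserInputError, CANDIDATES, EXCLUDED, and validate_metrics are stand-ins defined here for illustration only.

```python
import difflib


class UserInputError(ValueError):
    """Stand-in for factrix.UserInputError."""


CANDIDATES = ["ic", "ic_newey_west", "ic_ir"]  # illustrative cell surface
EXCLUDED = {"caar": "needs explicit kwargs; import and call it directly"}


def validate_metrics(requested, candidates, excluded):
    """Return the requested names, or raise with a fix path for the first bad one."""
    for name in requested:
        if name in excluded:
            # Registered but not auto-runnable: surface the documented reason.
            raise UserInputError(f"{name!r} cannot run via run_metrics: {excluded[name]}")
        if name not in candidates:
            close = difflib.get_close_matches(name, candidates, n=1)
            hint = f" Did you mean {close[0]!r}?" if close else ""
            raise UserInputError(
                f"Unknown metric {name!r}.{hint} Candidates: {sorted(candidates)}"
            )
    return list(requested)
```

Both failure modes raise the same error type, so a caller needs a single except clause, and no requested name is dropped silently.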

Cross-horizon and cross-universe analysis

run_metrics runs at exactly one horizon — cfg.forward_periods — and on exactly the panel passed in. Sweeps go through the existing functions; run_metrics does not ship a horizons=[...] helper because cross-horizon analysis is the job of compare(bundles) (descriptive, v1.x — see #148) or bhy(expand_over=["forward_periods"]) (FDR controlled, anti-shopping defense per #160). Keeping run_metrics single-horizon preserves the (panel, cfg) → bundle contract symmetric with evaluate.

import factrix as fx

# panel and cfg are the objects built in the Examples block
horizons = [1, 5, 21]

# Descriptive sweep — no FDR claim
bundles = [
    fx.run_metrics(panel, cfg.replace(forward_periods=h))
    for h in horizons
]
# compare(bundles)  # descriptive cross-factor view; v1.x function, see #148

# Inferential sweep — FDR-controlled
profiles = [
    fx.evaluate(panel, cfg.replace(forward_periods=h))
    for h in horizons
]
fx.multi_factor.bhy(profiles, expand_over=["forward_periods"])

Universe / regime works the same way: filter the panel, optionally stamp bundle.context via dataclasses.replace, then concatenate through compare / bhy. See the Identity / context guide.
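
Stamping context with dataclasses.replace can be sketched on a stand-in type. BundleStub is hypothetical and carries only the two fields needed here; the "universe" key and its value are illustrative, not a documented convention.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Mapping, Tuple


@dataclass(frozen=True)
class BundleStub:
    """Hypothetical stand-in for a bundle: identity plus context only."""
    identity: Tuple[str, int]
    context: Mapping[str, Any] = field(default_factory=dict)


# Bundle as it would come back from a run on a filtered panel: empty context.
base = BundleStub(identity=("momentum", 5))

# replace() builds a new frozen instance; the original is left untouched.
large_cap = replace(base, context={"universe": "large_cap"})
```

Because the dataclass is frozen, replace is the idiomatic way to derive a stamped copy rather than mutating the bundle in place.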

MetricsBundle

Frozen dataclass keyed by identity = (factor_id, forward_periods) (matches FactorProfile.identity per #160).

| Member | Type | Description |
| --- | --- | --- |
| identity | (str, int) | hypothesis dimensions |
| metrics | Mapping[str, MetricOutput] | every metric that produced a value (incl. short-circuit NaN outputs with metadata["reason"]) |
| skipped | Mapping[str, str] | metric → reason for everything excluded from auto-discover |
| context | Mapping[str, Any] | sample-restriction dimensions; v1 always empty, populated by downstream slicers or by user via dataclasses.replace after panel-side filtering |

Access patterns:

  • bundle["ic"] — dict-style metric lookup
  • "ic" in bundle / list(bundle) / iter(bundle) — operate on the metric keys
  • bundle.to_frame() — long-form pl.DataFrame (one row per metric); fixed 8-column schema for stable pl.concat([b.to_frame() ...])

Hashing is disabled (__hash__ = None) because the bundle holds MetricOutput instances whose metadata is a mutable dict. Group bundles by identity (a hashable tuple).
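
The effect of disabling hashing, and the suggested workaround, can be sketched on a stand-in class. BundleStub is hypothetical; only the __hash__ = None choice and the tuple-valued identity are taken from the text above.

```python
from dataclasses import dataclass, field
from typing import Tuple


@dataclass(frozen=True)
class BundleStub:
    """Hypothetical stand-in: frozen, but explicitly unhashable."""
    identity: Tuple[str, int]
    metrics: dict = field(default_factory=dict)
    __hash__ = None  # mirrors the documented choice


bundles = [
    BundleStub(("momentum", 1)),
    BundleStub(("momentum", 5)),
    BundleStub(("value", 5)),
]

# The bundle itself cannot be a dict key or set member...
try:
    hash(bundles[0])
    hashable = True
except TypeError:
    hashable = False

# ...but its identity tuple can.
by_identity = {b.identity: b for b in bundles}
```

Keying on identity also makes duplicate identities visible: the dict ends up shorter than the input list when two bundles share one.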

to_frame schema

| Column | Type | Source |
| --- | --- | --- |
| factor_id | str | identity[0] |
| forward_periods | int | identity[1] |
| metric | str | mapping key |
| value | float | MetricOutput.value |
| stat | float \| null | MetricOutput.stat |
| significance | str \| null | MetricOutput.significance |
| p_value | float \| null | metadata["p_value"] |
| short_circuit_reason | str \| null | metadata["reason"] |
metadata is not flattened — its shape is heterogeneous across metrics (per-regime dicts, per-horizon entries, KP source labels…). Reach into bundle["name"].metadata directly. context.* is not flattened in v1 (the column would always be empty).
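
The column mapping in the schema table can be sketched with plain dictionaries instead of a pl.DataFrame. OutputStub and to_rows are hypothetical stand-ins; the real to_frame may build its frame differently, and the metadata values are invented.

```python
import math
from dataclasses import dataclass, field
from typing import Optional

COLUMNS = (
    "factor_id", "forward_periods", "metric", "value",
    "stat", "significance", "p_value", "short_circuit_reason",
)


@dataclass
class OutputStub:
    """Hypothetical stand-in for MetricOutput."""
    value: float
    stat: Optional[float] = None
    significance: Optional[str] = None
    metadata: dict = field(default_factory=dict)


def to_rows(identity, metrics):
    """One row per metric; only two metadata keys are lifted into columns."""
    factor_id, forward_periods = identity
    return [
        {
            "factor_id": factor_id,
            "forward_periods": forward_periods,
            "metric": name,
            "value": out.value,
            "stat": out.stat,
            "significance": out.significance,
            "p_value": out.metadata.get("p_value"),
            "short_circuit_reason": out.metadata.get("reason"),
        }
        for name, out in metrics.items()
    ]


rows = to_rows(
    ("momentum", 5),
    {
        "ic": OutputStub(0.03, stat=2.1, metadata={"p_value": 0.04, "extra": {"a": 1}}),
        "ic_ir": OutputStub(math.nan, metadata={"reason": "insufficient sample"}),
    },
)
```

The "extra" key stays on the metadata dict and never becomes a column, which is what keeps the schema fixed across heterogeneous metrics.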

Error handling

Three classes, three responses:

| Class | Source | What run_metrics does |
| --- | --- | --- |
| A Sample-floor / data-quality | InsufficientSampleError, metric-internal _short_circuit_output | Convert to a short-circuit MetricOutput (value=NaN, metadata["reason"]=...) inside the bundle. Other metrics keep running. |
| B User input | unknown / excluded metrics=[...], missing / colliding factor_col | Raise UserInputError with fuzzy suggestion + fix path. |
| C Unexpected | bug in a metric callable or stage-1 helper | Raise RunMetricsError wrapping the original exception (chained via __cause__); attributes .cell, .metric_name, .stage identify which metric broke. |
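
Classes A and C can be sketched as a dispatch loop. The exception classes below are stand-ins, run_all is a hypothetical name, the .cell attribute is omitted, and the stage value "metric" is an assumed label; none of this is the library's implementation.

```python
import math


class InsufficientSampleError(Exception):
    """Stand-in for factrix.InsufficientSampleError."""


class RunMetricsError(RuntimeError):
    """Stand-in wrapper carrying which metric broke and at which stage."""

    def __init__(self, metric_name, stage):
        super().__init__(f"metric {metric_name!r} failed during {stage}")
        self.metric_name = metric_name
        self.stage = stage


def run_all(callables, panel):
    """Run each metric; downgrade class A to NaN entries, wrap class C."""
    results = {}
    for name, fn in callables.items():
        try:
            results[name] = {"value": fn(panel), "reason": None}
        except InsufficientSampleError as exc:
            # Class A: record a short-circuit entry and keep going.
            results[name] = {"value": math.nan, "reason": str(exc)}
        except Exception as exc:
            # Class C: chain the original so it is reachable via __cause__.
            raise RunMetricsError(name, stage="metric") from exc
    return results


def ok(panel):
    return 1.0


def thin(panel):
    raise InsufficientSampleError("fewer dates than the sample floor")


def buggy(panel):
    return 1 / 0


partial = run_all({"ok": ok, "thin": thin}, panel=[])
```

Calling run_all with buggy included raises RunMetricsError, and its __cause__ is the original ZeroDivisionError, so the traceback of the actual failure is not lost.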

A single logging.info line at logger name factrix.run_metrics summarises ran / skipped counts per call (observability hook for batch jobs; not the primary user surface — bundle.skipped is).
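
To surface that summary line in a batch job, the named logger can be configured with the standard library alone. Only the logger name comes from the text above; the handler and format string are illustrative choices.

```python
import logging

# Logger name as documented; everything below is ordinary stdlib configuration.
logger = logging.getLogger("factrix.run_metrics")
logger.setLevel(logging.INFO)

handler = logging.StreamHandler()  # writes to stderr by default
handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(message)s"))
logger.addHandler(handler)
```

If the application already configures the root logger at INFO or lower, the explicit handler is unnecessary and may produce duplicate lines.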