Input contract — the panel must satisfy the four-column floor documented in Panel schema.

factrix.run_metrics

run_metrics(panel: DataFrame, cfg: AnalysisConfig, *, factor_col: str = 'factor', metrics: list[str] | None = None) -> MetricsBundle

Run every standalone descriptive metric the cell exposes.

Parallel to factrix.evaluate: both consume (panel, cfg), and each produces a disjoint result type. run_metrics collects the cell's factrix.metrics surface and returns a MetricsBundle keyed by metric name. The information coefficient (IC) family is computed from one shared, cached compute_ic result; every other panel-direct metric is called directly.

Parameters:

Name	Type	Description	Default
`panel`	`DataFrame`	Canonical-column panel (`date, asset_id, factor, forward_return`). Renamed internally if `factor_col` is not `"factor"`, mirroring :func:`factrix.evaluate`.	required
`cfg`	`AnalysisConfig`	Validated :class:`AnalysisConfig`. `cfg.scope` and `cfg.signal` route metric discovery; `cfg.forward_periods` (a single int) is the horizon every metric sees. Cross-horizon sweeps are written as a user-side comprehension that feeds `compare(bundles)` or `bhy(profiles, expand_over=["forward_periods"])`; `run_metrics` itself does not loop over horizons (#147 §E / 186).	required
`factor_col`	`str`	Name of the signal column on `panel`. Renamed to `"factor"` internally before dispatch so metric callables see the canonical schema.	`'factor'`
`metrics`	`list[str] | None`	`None` (default) auto-discovers every applicable metric from :func:`factrix.list_metrics`. Pass an explicit list to run a subset; unknown or :data:`~factrix._metric_index._AUTO_DISCOVER_EXCLUDED` names raise :class:`factrix.UserInputError` with a fix path (a fuzzy suggestion plus the explicit-import recipe for stage-1 consumers).	`None`

Returns:

Type	Description
`MetricsBundle`	A :class:`MetricsBundle` whose `metrics` map carries every `MetricOutput` produced (including short-circuit outputs for sample-floor failures) and whose `skipped` map carries per-metric reasons for metrics that could not auto-run.

Raises:

Type	Description
`UserInputError`	`metrics` contains an unknown name or a name that needs explicit kwargs not threaded by `run_metrics` v1. Carries a fuzzy suggestion plus the documented reason from :data:`~factrix._metric_index._AUTO_DISCOVER_EXCLUDED`.
`RunMetricsError`	Wraps an unexpected exception raised inside a metric callable or the IC stage-1 helper. Treat as a likely factrix bug; the original exception is chained via `__cause__`. Sample-floor exceptions (:class:`factrix.InsufficientSampleError`) and metric-internal short-circuits are converted to short-circuit `MetricOutput` entries inside the bundle, not raised.

Examples:

Auto-discover every applicable metric for the cell:

>>> import factrix as fx
>>> from factrix.preprocess import compute_forward_return
>>> raw = fx.datasets.make_cs_panel(n_assets=100, n_dates=250)
>>> panel = compute_forward_return(raw, forward_periods=5)
>>> cfg = fx.AnalysisConfig.individual_continuous(forward_periods=5)
>>> bundle = fx.run_metrics(panel, cfg)
>>> ic_output = bundle["ic"]
>>> long_frame = bundle.to_frame()
>>> skipped = dict(bundle.skipped)

Restrict to a subset by name:

>>> bundle = fx.run_metrics(panel, cfg, metrics=["ic"])

Cell-level descriptive batch runner — the descriptive twin of evaluate. Where evaluate runs a cell's primary inferential procedure and returns a FactorProfile, run_metrics fans out across the cell's standalone descriptive metrics in factrix.metrics.* and returns a MetricsBundle.

The two paths share the (panel, cfg) entry contract; their result types are disjoint by design.

Function	Returns	Use when
`evaluate`	`FactorProfile` (primary p-value, which drives false discovery rate (FDR) control)	you want the single inferential decision for the cell
`run_metrics`	`MetricsBundle` (cell's descriptive surface)	you want every standalone metric the cell exposes for plotting / dashboards / cross-factor comparison

Both can be called on the same (panel, cfg); neither is a prerequisite for the other.

Identity uniqueness in sweeps

panel follows the same canonical schema as evaluate (date, asset_id, factor, forward_return); factor_col renames an alternate signal column to "factor" internally before dispatch. bundle.identity is (factor_col, cfg.forward_periods). When looping over candidate signals, pass factor_col=name for each one so that bundle.identity stays unique across the sweep; otherwise every bundle reports ("factor", h) and concatenated to_frame() outputs collide.
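
The collision described above can be illustrated without factrix. The following is a minimal sketch: `identity_for` is a hypothetical stand-in that only mimics how `bundle.identity` is described here, and the signal names are made up.

```python
from collections import Counter


def identity_for(factor_col: str = "factor", forward_periods: int = 5) -> tuple:
    """Hypothetical stand-in: mimics bundle.identity = (factor_col, forward_periods)."""
    return (factor_col, forward_periods)


def duplicated(identities):
    """Return the identities that appear more than once."""
    return [ident for ident, n in Counter(identities).items() if n > 1]


candidates = ["momentum", "value", "quality"]  # illustrative signal names

# Sweep that relies on the default factor_col: every identity is ("factor", 5).
colliding = [identity_for() for _ in candidates]

# Sweep that passes factor_col=name: identities stay distinct.
unique = [identity_for(factor_col=name) for name in candidates]

print(duplicated(colliding))
print(duplicated(unique))
```

A check like `duplicated` can be run on the identities of real bundles before concatenating their frames.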

What auto-discover runs

metrics=None (default) runs every metric list_metrics(cfg.scope, cfg.signal) exposes for the cell, after three filters:

input_kind == "panel" — drops scalar-input utilities (breakeven_cost, net_spread).
_STAGE1_HELPERS — drops shared compute_* helpers (they produce stage-1 frames consumed by other metrics).
_AUTO_DISCOVER_EXCLUDED — drops metrics that need explicit kwargs run_metrics does not thread (per-row reason; surfaced on bundle.skipped).
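
The three filters can be pictured as a single pass over a registry. This is a sketch under stated assumptions, not the library's implementation: `ToyMetric`, the registry contents, and the exclusion reason are hypothetical, and only the filter order and the metric names come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyMetric:
    """Hypothetical registry entry; the real registry is behind list_metrics."""
    name: str
    input_kind: str  # "panel" or "scalar"


REGISTRY = [
    ToyMetric("ic", "panel"),
    ToyMetric("ic_ir", "panel"),
    ToyMetric("compute_ic", "panel"),
    ToyMetric("caar", "panel"),
    ToyMetric("breakeven_cost", "scalar"),
]
STAGE1_HELPERS = {"compute_ic"}
AUTO_DISCOVER_EXCLUDED = {"caar": "needs explicit kwargs (illustrative reason)"}


def auto_discover(registry):
    """Apply the three filters in order; return (names to run, skipped reasons)."""
    to_run, skipped = [], {}
    for metric in registry:
        if metric.input_kind != "panel":
            continue  # filter 1: scalar-input utilities are dropped
        if metric.name in STAGE1_HELPERS:
            continue  # filter 2: shared compute_* helpers are dropped
        if metric.name in AUTO_DISCOVER_EXCLUDED:
            # filter 3: excluded metrics are recorded with their reason
            skipped[metric.name] = AUTO_DISCOVER_EXCLUDED[metric.name]
            continue
        to_run.append(metric.name)
    return to_run, skipped


to_run, skipped = auto_discover(REGISTRY)
```

In this sketch only the third filter leaves a trace in `skipped`, matching the note that excluded metrics are surfaced on bundle.skipped.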

In v1 the information coefficient (IC) family (ic, ic_newey_west, ic_ir) shares a single compute_ic(panel) per call. Other stage-1 consumers (caar, fama_macbeth, ts_beta, mfe_mae_summary, plus series / spread consumers) live in the auto-discover exclusion set; the bundle's skipped map carries the explicit-import recipe for each. v1.x will extend stage-1 wiring per cell.

Explicit subset error semantics

When metrics= is passed an explicit list (see the docstring Examples block above for the canonical call shape), unknown names raise UserInputError with a fuzzy suggestion plus the full candidate list. Names registered for the cell but in the auto-discover exclusion set raise the same error type with the documented reason and the explicit-call recipe — run_metrics never silently drops a name the caller asked for.
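
The fail-loud behaviour can be sketched with the standard library. Everything below is an assumption made for illustration: `ToyUserInputError` stands in for UserInputError, the candidate and exclusion lists are made up, and `difflib` is only one plausible way to produce a fuzzy suggestion.

```python
import difflib


class ToyUserInputError(ValueError):
    """Hypothetical stand-in for factrix.UserInputError."""


RUNNABLE = ["ic", "ic_newey_west", "ic_ir"]
EXCLUDED = {"caar": "needs explicit kwargs (illustrative reason)"}


def validate_subset(requested):
    """Raise on any name that cannot run; never drop a requested name silently."""
    for name in requested:
        if name in EXCLUDED:
            raise ToyUserInputError(f"{name!r} cannot auto-run: {EXCLUDED[name]}")
        if name not in RUNNABLE:
            close = difflib.get_close_matches(name, RUNNABLE, n=1)
            hint = f" Did you mean {close[0]!r}?" if close else ""
            raise ToyUserInputError(
                f"Unknown metric {name!r}.{hint} Candidates: {RUNNABLE}"
            )
    return list(requested)
```

Validating the whole list before running anything means a typo surfaces immediately instead of after a long batch.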

Cross-horizon and cross-universe analysis

run_metrics runs at exactly one horizon — cfg.forward_periods — and on exactly the panel passed in. Sweeps go through the existing functions; run_metrics does not ship a horizons=[...] helper because cross-horizon analysis is the job of compare(bundles) (descriptive, v1.x — see #148) or bhy(expand_over=["forward_periods"]) (FDR controlled, anti-shopping defense per #160). Keeping run_metrics single-horizon preserves the (panel, cfg) → bundle contract symmetric with evaluate.

import factrix as fx

# panel and cfg are assumed to be built as in the docstring Examples block.
horizons = [1, 5, 21]

# Descriptive sweep — no FDR claim
bundles = [
    fx.run_metrics(panel, cfg.replace(forward_periods=h))
    for h in horizons
]
# compare(bundles)  # descriptive cross-horizon view; v1.x function, see #148

# Inferential sweep — FDR-controlled
profiles = [
    fx.evaluate(panel, cfg.replace(forward_periods=h))
    for h in horizons
]
fx.multi_factor.bhy(profiles, expand_over=["forward_periods"])

Universe and regime sweeps work the same way: filter the panel, optionally stamp bundle.context via dataclasses.replace, then combine the results through compare / bhy. See the Identity / context guide.
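
Stamping context on a frozen dataclass relies on `dataclasses.replace`, which returns a modified copy rather than mutating in place. The sketch below uses a hypothetical `ToyBundle` with only two of the documented members, and the `"universe"` key is an illustrative label, not a documented one.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Mapping


@dataclass(frozen=True)
class ToyBundle:
    """Hypothetical stand-in carrying only identity and context."""
    identity: tuple
    context: Mapping[str, Any] = field(default_factory=dict)


# Pretend this bundle came from a run on a panel filtered to large caps.
bundle = ToyBundle(identity=("factor", 5))

# replace() builds a new frozen instance; the original stays untouched.
large_cap = replace(bundle, context={"universe": "large_cap"})
```

Because the original bundle is unchanged, the same base bundle can be stamped differently for each slice of a sweep.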

MetricsBundle

Frozen dataclass keyed by identity = (factor_id, forward_periods) (matches FactorProfile.identity per #160).

Member	Type	Description
`identity`	`(str, int)`	hypothesis dimensions
`metrics`	`Mapping[str, MetricOutput]`	every metric that produced a value (incl. short-circuit `NaN` outputs with `metadata["reason"]`)
`skipped`	`Mapping[str, str]`	metric → reason for everything excluded from auto-discover
`context`	`Mapping[str, Any]`	sample-restriction dimensions; always empty in v1, populated by downstream slicers or by the user via `dataclasses.replace` after panel-side filtering

Access patterns:

bundle["ic"] — dict-style metric lookup
"ic" in bundle / list(bundle) / iter(bundle) — operate on the metric keys
bundle.to_frame() — long-form pl.DataFrame (one row per metric); fixed 8-column schema for stable pl.concat([b.to_frame() ...])

Hashing is disabled (__hash__ = None) because the bundle holds MetricOutput instances whose metadata is a mutable dict. Group bundles by identity (a hashable tuple).
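
Grouping by identity rather than by the bundle itself can be sketched as follows. `ToyHashlessBundle` is a hypothetical stand-in that only reproduces the `__hash__ = None` behaviour described above; the identities are made up.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToyHashlessBundle:
    """Hypothetical stand-in: frozen, but explicitly unhashable."""
    identity: tuple
    metrics: dict = field(default_factory=dict)
    __hash__ = None  # mirrors the documented "hashing is disabled"


bundles = [
    ToyHashlessBundle(("momentum", 5)),
    ToyHashlessBundle(("momentum", 21)),
    ToyHashlessBundle(("value", 5)),
]

# Using a bundle as a dict key or set member fails with TypeError.
try:
    hash(bundles[0])
    hashable = True
except TypeError:
    hashable = False

# The identity tuple is hashable, so it works as a key.
by_identity = {b.identity: b for b in bundles}

# Group every horizon of the same factor together.
by_factor = defaultdict(list)
for b in bundles:
    by_factor[b.identity[0]].append(b)
```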

to_frame schema

Column	Type	Source
`factor_id`	`str`	`identity[0]`
`forward_periods`	`int`	`identity[1]`
`metric`	`str`	mapping key
`value`	`float`	`MetricOutput.value`
`stat`	`float | null`	`MetricOutput.stat`
`significance`	`str | null`	`MetricOutput.significance`
`p_value`	`float | null`	`metadata["p_value"]`
`short_circuit_reason`	`str | null`	`metadata["reason"]`

metadata is not flattened — its shape is heterogeneous across metrics (per-regime dicts, per-horizon entries, KP source labels…). Reach into bundle["name"].metadata directly. context.* is not flattened in v1 (the column would always be empty).
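
The column mapping in the table can be sketched as plain dictionaries. This is an illustration only: `ToyMetricOutput` and `to_rows` are hypothetical, the real to_frame() returns a pl.DataFrame, and all numbers and labels below are made up.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyMetricOutput:
    """Hypothetical stand-in for MetricOutput."""
    value: float
    stat: Optional[float] = None
    significance: Optional[str] = None
    metadata: dict = field(default_factory=dict)


COLUMNS = (
    "factor_id", "forward_periods", "metric", "value",
    "stat", "significance", "p_value", "short_circuit_reason",
)


def to_rows(identity, metrics):
    """One row per metric; only two metadata keys are lifted into columns."""
    factor_id, forward_periods = identity
    return [
        {
            "factor_id": factor_id,
            "forward_periods": forward_periods,
            "metric": name,
            "value": out.value,
            "stat": out.stat,
            "significance": out.significance,
            "p_value": out.metadata.get("p_value"),
            "short_circuit_reason": out.metadata.get("reason"),
        }
        for name, out in metrics.items()
    ]


metrics = {
    "ic": ToyMetricOutput(
        0.04, stat=2.1, significance="**",
        metadata={"p_value": 0.036, "per_regime": {"bull": 0.05}},
    ),
    "ic_ir": ToyMetricOutput(math.nan, metadata={"reason": "insufficient sample"}),
}
rows = to_rows(("momentum", 5), metrics)
```

The extra `per_regime` entry stays on the output's metadata and never becomes a column, which is what keeps the row shape identical across metrics.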

Error handling

Three classes, three responses:

Class	Source	What `run_metrics` does
A Sample-floor / data-quality	`InsufficientSampleError`, metric-internal `_short_circuit_output`	Convert to a short-circuit `MetricOutput` (`value=NaN`, `metadata["reason"]=...`) inside the bundle. Other metrics keep running.
B User input	unknown / excluded `metrics=[...]`, missing / colliding `factor_col`	Raise `UserInputError` with fuzzy suggestion + fix path.
C Unexpected	bug in a metric callable or stage-1 helper	Raise `RunMetricsError` wrapping the original exception (chained via `__cause__`); attributes `.cell`, `.metric_name`, `.stage` identify which metric broke.
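
Classes A and C can be sketched as two `except` branches around each metric call (class B is input validation and would happen before any metric runs). All names below are hypothetical stand-ins, the `"metric"` stage label is an assumption, and the real error also carries a `.cell` attribute that this sketch omits.

```python
import math


class ToyInsufficientSampleError(Exception):
    """Hypothetical stand-in for InsufficientSampleError."""


class ToyRunMetricsError(Exception):
    """Hypothetical stand-in for RunMetricsError."""

    def __init__(self, message, *, metric_name, stage):
        super().__init__(message)
        self.metric_name = metric_name
        self.stage = stage


def run_all(callables, panel):
    outputs = {}
    for name, fn in callables.items():
        try:
            outputs[name] = {"value": fn(panel), "metadata": {}}
        except ToyInsufficientSampleError as exc:
            # Class A: record a NaN short-circuit and keep going.
            outputs[name] = {"value": math.nan, "metadata": {"reason": str(exc)}}
        except Exception as exc:
            # Class C: wrap and chain so the original traceback survives.
            raise ToyRunMetricsError(
                f"metric {name!r} failed", metric_name=name, stage="metric"
            ) from exc
    return outputs


def mean_metric(panel):
    return sum(panel) / len(panel)


def needs_ten(panel):
    if len(panel) < 10:
        raise ToyInsufficientSampleError("need at least 10 observations")
    return 1.0


def buggy(panel):
    return panel[100]  # IndexError on a short panel: an unexpected failure


outputs = run_all({"mean": mean_metric, "needs_ten": needs_ten}, [1.0, 2.0, 3.0])
```

Calling `run_all({"buggy": buggy}, [1.0])` raises `ToyRunMetricsError` with the `IndexError` available on `__cause__`.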

A single logging.info line at logger name factrix.run_metrics summarises ran / skipped counts per call (observability hook for batch jobs; not the primary user surface — bundle.skipped is).
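
A batch job can capture that summary line by attaching a handler to the named logger. The sketch below uses only the standard `logging` module; the logger name comes from the text above, while `ListHandler` and the simulated message format are assumptions, since the real message layout is not documented here.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted messages in memory for later inspection."""

    def __init__(self):
        super().__init__(level=logging.INFO)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("factrix.run_metrics")
logger.setLevel(logging.INFO)  # INFO records are dropped at the default level
handler = ListHandler()
logger.addHandler(handler)

# Simulated summary line standing in for what a run_metrics call would log.
logger.info("ran=%d skipped=%d", 3, 4)

logger.removeHandler(handler)  # detach once the batch is done
```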