factrix.compare ¶

compare(artifacts: CompareInput, *, sort_by: str | None = None) -> DataFrame

Render a leaderboard pl.DataFrame for a list of artifacts.

Parameters:

Name	Type	Description	Default
`artifacts`	`CompareInput`	One of three input shapes (input-type dispatch — no `compare_profiles` / `compare_bundles` split): `list[FactorProfile]` → identity + context + `primary_stat` / `primary_stat_name` / `primary_p` `list[MetricsBundle]` → identity + context + one column per standalone metric (`MetricOutput.value`) :class:`~factrix.multi_factor.Survivors` → as the profile branch plus a final `adj_p` column (read from `Survivors.adj_p`). `expand_over` dimensions surface as ordinary context columns (via `profile.context[k]`). Mixed-type lists raise.	required
`sort_by`	`str \| None`	Column name to sort by. `None` keeps input order. `nulls_last=True` matches polars default so heterogeneous context / metric coverage stays robust.	`None`

Returns:

Type	Description
`DataFrame`	`pl.DataFrame` with columns laid out as `factor_id`,
`DataFrame`	`forward_periods`, then context keys (union across entries,
`DataFrame`	first-seen order), then branch-specific columns.

Raises:

Type	Description
`UserInputError`	Empty input; mixed artifact types; `sort_by` not present in the output schema (with fuzzy suggestion).

Examples:

Leaderboard from a list of :class:FactorProfile:

>>> import dataclasses
>>> import factrix as fx
>>> from factrix.preprocess import compute_forward_return
>>> cfg = fx.AnalysisConfig.individual_continuous(forward_periods=5)
>>> profiles = [
...     dataclasses.replace(
...         fx.evaluate(
...             compute_forward_return(
...                 fx.datasets.make_cs_panel(
...                     n_assets=100, n_dates=250, seed=i,
...                 ),
...                 forward_periods=5,
...             ),
...             cfg,
...         ),
...         factor_id=f"alpha_{i}",
...     )
...     for i in range(3)
... ]
>>> leaderboard = fx.compare(profiles, sort_by="primary_p")

Leaderboard from a :class:~factrix.multi_factor.Survivors (adds an adj_p column):

>>> survivors = fx.multi_factor.bhy(profiles, q=0.5)
>>> board = fx.compare(survivors, sort_by="adj_p")

Leaderboard renderer that stacks N artifacts side by side as a polars DataFrame. Pure projection — no metric is recomputed; Survivors.adj_p is read straight through so Benjamini-Hochberg-Yekutieli (BHY) survivor tables keep their adjusted p-values without manual re-attach.

import factrix as fx

profiles = [fx.evaluate(panel, cfg, factor_col=c) for c in candidates]
fx.compare(profiles, sort_by="primary_p")

When to reach for `compare`¶

Use case	Function	Notes
Rank N `evaluate` results	`compare(list[FactorProfile])`	Identity + context + `primary_stat` / `primary_stat_name` / `primary_p`
Rank N `run_metrics` results	`compare(list[MetricsBundle])`	Identity + context + one column per standalone metric (`MetricOutput.value`)
Rank BHY survivors	`compare(Survivors)`	Profile schema plus `adj_p` (read from `Survivors.adj_p`)
Re-run inference under perturbations	`robustness` (#178)	`compare` is a pure view; `robustness` recomputes
Test factor across slices	`slice_pairwise_test` / `slice_joint_test`	Re-runs inference per slice; `compare` does not

If you need fresh statistics, you want a re-compute function. compare is strictly read-through.

Input dispatch¶

Single entrypoint, input-type dispatch — no compare_profiles / compare_bundles split, so the call site does not branch on artifact shape.

compare(
    artifacts: list[FactorProfile] | list[MetricsBundle] | Survivors,
    *,
    sort_by: str | None = None,
) -> pl.DataFrame

`list[FactorProfile]`¶

┌───────────────┬─────────────────┬─────────────┬──────────────┬───────────────────┬──────────┐
│ factor_id     │ forward_periods │ universe_id │ primary_stat │ primary_stat_name │ primary_p│
│ str           │ i64             │ str         │ f64          │ str               │ f64      │
╞═══════════════╪═════════════════╪═════════════╪══════════════╪═══════════════════╪══════════╡
│ quality_roe   │ 1               │ large_cap   │ 3.21         │ t_nw              │ 0.0013   │
│ momentum_12_1 │ 1               │ large_cap   │ 2.84         │ t_nw              │ 0.0046   │
│ value_btm     │ 1               │ large_cap   │ 1.92         │ t_nw              │ 0.0550   │
└───────────────┴─────────────────┴─────────────┴──────────────┴───────────────────┴──────────┘

primary_stat_name looks redundant when every entry shares one procedure, but it is the only disambiguation for mixed lists — for example a Newey-West t-stat alongside a block-bootstrap p-only entry. The column carries the StatCode.value slug ("t_nw" / "wald_nwcl" / "p_boot" / …).

`list[MetricsBundle]`¶

┌───────────────┬─────────────────┬─────────────┬───────┬───────┬───────────┬──────────┐
│ factor_id     │ forward_periods │ universe_id │ ic    │ ic_ir │ fm_lambda │ hit_rate │
│ str           │ i64             │ str         │ f64   │ f64   │ f64       │ f64      │
╞═══════════════╪═════════════════╪═════════════╪═══════╪═══════╪═══════════╪══════════╡
│ quality_roe   │ 1               │ large_cap   │ 0.051 │ 1.83  │ 0.0042    │ 0.561    │
│ momentum_12_1 │ 1               │ large_cap   │ 0.042 │ 1.52  │ 0.0031    │ 0.547    │
│ value_btm     │ 1               │ large_cap   │ 0.028 │ 0.89  │ 0.0019    │ 0.521    │
└───────────────┴─────────────────┴─────────────┴───────┴───────┴───────────┴──────────┘

One column per metric, projected from MetricOutput.value. Per-cell n_obs (first-class on MetricOutput) is not flattened — for 4 metrics that would double the table to 8 columns and drown the leaderboard. When you need sample-size honesty for a specific cell, look up bundle[metric].n_obs directly.

`Survivors`¶

survivors = fx.multi_factor.bhy(profiles, q=0.05)
fx.compare(survivors, sort_by="adj_p")

┌───────────────┬─────────────────┬─────────────┬──────────────┬───────────────────┬──────────┬────────┐
│ factor_id     │ forward_periods │ universe_id │ primary_stat │ primary_stat_name │ primary_p│ adj_p  │
│ str           │ i64             │ str         │ f64          │ str               │ f64      │ f64    │
╞═══════════════╪═════════════════╪═════════════╪══════════════╪═══════════════════╪══════════╪════════╡
│ quality_roe   │ 1               │ large_cap   │ 3.21         │ t_nw              │ 0.0013   │ 0.0078 │
│ momentum_12_1 │ 1               │ large_cap   │ 2.84         │ t_nw              │ 0.0046   │ 0.0120 │
└───────────────┴─────────────────┴─────────────┴──────────────┴───────────────────┴──────────┴────────┘

When bhy(..., expand_over=[k, ...]) was used, the partitioning keys are already inside profile.context[k]. compare reads them through the same context-key path as every other context column — there is no sidecar expand_over_values field on Survivors, and the renderer does no reverse lookup.

Column policy¶

Concern	Behaviour
Identity	Always flattens to `factor_id` + `forward_periods` (two columns), matching `FactorProfile.identity` and `MetricsBundle.identity`.
Context	Union of keys across entries, ordered by first appearance; missing keys fill with `null`. Matches `pl.concat(how="diagonal")`.
`sort_by`	`None` keeps input order; otherwise polars sort with `nulls_last=True`. Unknown column raises with a fuzzy suggestion.
Mixed-type list	`FactorProfile` and `MetricsBundle` cannot be mixed — raises with the offending indices.
Empty input	`[]` and empty `Survivors` raise (rather than returning a schema-undefined empty frame).

Errors¶

compare raises UserInputError for every input shape issue (empty input, mixed types, unknown sort_by). Unknown sort_by carries suggestions populated by difflib against the output schema.

factrix.compare ¶

When to reach for compare¶