Stat keys
Answers
MetricOutput schema — which metadata key is the primary inference target, which are auxiliary, what the headline stat carries.
For applicability gates, see Metric applicability.
For computation pipeline, see Metric pipelines.
Per-metric schema of the MetricOutput
returned by every public callable in factrix.metrics.
For the SE / test machinery itself see
Statistical methods. For the
MetricOutput.name → docs-page reverse index see
MetricOutput. The
evaluate()-side equivalent — FactorProfile.stats keyed by
StatCode — lives at
FactorProfile § stats keys by cell.
metadata keys are tagged by role in the per-metric subsections
below:
- primary: carries p_value / the inference target.
- secondary-test: a complementary p-value / statistic from a different test on the same data (e.g. the long_p_value / short_p_value legs of quantile_spread).
- descriptive: sample-size diagnostics, method labels, parameter echoes; not a test result.
- conditional: emitted only on certain branches; the trigger is named in parentheses.
Hypothesis-test metrics share a common envelope (p_value,
stat_type, h0, method) — listed once here, not repeated per
metric below. Cross-slice inference functions
(slice_pairwise_test /
slice_joint_test) are
not listed in the table: their headline output is a DataFrame of
contrasts, not a sidecar to a primary value.
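As a rough illustration of how a consumer might separate hypothesis-test metrics from descriptive ones, the sketch below checks for the four shared envelope keys. The MetricOutputLike dataclass and both helper functions are hypothetical stand-ins that mirror only the fields named on this page; they are not the factrix classes, and the sample values are made up.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# Envelope keys shared by hypothesis-test metrics, as listed above.
ENVELOPE_KEYS = ("p_value", "stat_type", "h0", "method")


@dataclass
class MetricOutputLike:
    """Hypothetical stand-in for MetricOutput (not the factrix class)."""
    name: str
    value: float
    stat: Optional[float] = None
    n_obs: Optional[int] = None
    metadata: dict[str, Any] = field(default_factory=dict)


def has_test_envelope(out: MetricOutputLike) -> bool:
    """True when every envelope key is present in metadata."""
    return all(key in out.metadata for key in ENVELOPE_KEYS)


def auxiliary_keys(out: MetricOutputLike) -> list[str]:
    """Metadata keys left over once the envelope is set aside."""
    return sorted(k for k in out.metadata if k not in ENVELOPE_KEYS)


# Illustrative values only.
ic_like = MetricOutputLike(
    name="ic", value=0.031, stat=2.4, n_obs=120,
    metadata={"p_value": 0.018, "stat_type": "t", "h0": "mean(IC) = 0",
              "method": "illustrative", "n_periods": 120, "tie_ratio": 0.0},
)
ic_ir_like = MetricOutputLike(
    name="ic_ir", value=0.42,
    metadata={"mean_ic": 0.031, "std_ic": 0.074, "n_periods": 120},
)
```

Requiring all four keys, rather than p_value alone, keeps sentinel or short-circuit p_value entries from being mistaken for a test result.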
Cross-metric summary¶
| Metric | Primary stat (MetricOutput.stat) | Primary metadata key | value |
|---|---|---|---|
| ic | Newey-West (NW) heteroskedasticity-and-autocorrelation-consistent (HAC) t on per-date information coefficient (IC) series | p_value | mean(IC) |
| ic_newey_west | NW HAC t (overlapping) | p_value | mean(IC) |
| ic_ir | none (descriptive) | none | mean(IC) / std(IC) |
| fama_macbeth | NW HAC t on per-date λ | p_value | mean(β) |
| pooled_ols | clustered ordinary least squares (OLS) t (or None if G < 3) | p_value | pooled β |
| beta_sign_consistency | none (descriptive) | none | fraction with expected sign |
| caar | non-overlapping t on event-date CAAR | p_value | mean(CAAR) |
| bmp_test | BMP cross-sectional z on SAR | p_value | mean(SAR) |
| corrado_rank_test | nonparametric rank z | p_value | mean(U × sign(factor)) |
| hit_rate | binomial test (or normal z) | p_value | hit rate ∈ [0, 1] |
| event_hit_rate | binomial test (or normal z) | p_value | hit rate ∈ [0, 1] |
| event_ic | Fisher-transformed Spearman z | p_value | Spearman ρ |
| event_skewness | D'Agostino skew z (N ≥ 20) | p_value (conditional) | Fisher skewness |
| profit_factor | none (descriptive) | none | gains / \|losses\| |
| signal_density | none (descriptive) | none | mean bars per event |
| event_around_return | none (descriptive) | none | mean leakage score |
| monotonicity | cross-asset t on signed Spearman | p_value | mean \|Spearman\| |
| quantile_spread | NW HAC t on top-bottom spread | p_value | mean(spread) |
| quantile_spread_vw | NW HAC t on vw spread | p_value | mean(vw spread) |
| top_concentration | one-sided t on diversity ratio | p_value | mean(eff_n / n_top) |
| clustering_diagnostic | none (descriptive) | none | event-date Herfindahl-Hirschman index (HHI) |
| mfe_mae_summary | none (descriptive) | none | MFE_p50 / \|MAE_p75\| |
| multi_split_oos_decay | none (descriptive) | none | median(survival) |
| spanning_alpha | OLS t on α | p_value | spanning α |
| greedy_forward_selection | none (selection meta) | none | (NaN; results in metadata) |
| ic_trend | Theil-Sen slope t (CI-based) | p_value | Theil-Sen slope |
| ts_beta | cross-asset t on per-asset β | p_value | mean(β) |
| ts_beta_sign_consistency | none (descriptive) | none | max(p, 1-p) on sign fraction |
| mean_r_squared | none (descriptive) | none | mean(R²) |
| ts_asymmetry | Wald χ² (NW HAC) on slope sum / equality | p_value | β_long + β_short |
| ts_quantile_spread | Wald (NW HAC) on bucket β contrast | p_value | top − bottom bucket β |
| turnover | none (descriptive) | none | 1 − mean(rank-AC) |
| notional_turnover | none (descriptive) | none | replaced fraction |
| breakeven_cost | none (descriptive) | none | breakeven spread (bps) |
| net_spread | none (descriptive) | none | net spread (bps) |
Per-metric schemas¶
ic family (factrix.metrics.ic)¶
ic¶
- primary: p_value, from a non-overlapping t-test on the per-date IC series with stride forward_periods.
- descriptive: n_periods, tie_ratio (median across dates).
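The stride idea behind a non-overlapping t-test can be sketched in a few lines: keep one observation per holding window so that adjacent samples do not share returns, then run an ordinary one-sample t against zero. The function name is hypothetical, and starting the stride at index 0 is an assumption; the library may choose a different offset.

```python
import math
import statistics


def non_overlapping_t(series: list[float], forward_periods: int) -> tuple[float, int]:
    """t-statistic for H0: mean = 0 on every forward_periods-th observation.

    Assumption: sampling starts at index 0.
    """
    sample = series[::forward_periods]  # one observation per holding window
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two non-overlapping observations")
    sd = statistics.stdev(sample)  # sample standard deviation (ddof = 1)
    if sd == 0.0:
        raise ValueError("zero variance in the sampled series")
    return statistics.fmean(sample) / (sd / math.sqrt(n)), n


# Nine daily IC values with a 3-period horizon leave three usable points.
t_stat, n_used = non_overlapping_t(
    [0.02, 0.05, 0.01, 0.04, 0.03, 0.00, 0.06, 0.02, 0.01], forward_periods=3
)
```

The sample shrinks by a factor of forward_periods, which is the price paid for using a plain standard error instead of a HAC one.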
ic_newey_west¶
- primary: p_value, from the NW HAC t on the overlapping IC series.
- descriptive: n_periods, newey_west_lags, forward_periods, tie_ratio (median across dates).
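For readers unfamiliar with the NW HAC t, the following is a minimal sketch of a Newey-West standard error for a sample mean using Bartlett kernel weights. It is a textbook form written for illustration; the function name is hypothetical, and how factrix selects newey_west_lags or applies small-sample adjustments is not specified here.

```python
import math


def newey_west_se_of_mean(x: list[float], lags: int) -> float:
    """Newey-West standard error of the mean with Bartlett weights."""
    n = len(x)
    if n < 2 or lags < 0 or lags >= n:
        raise ValueError("need n >= 2 and 0 <= lags < n")
    mean = sum(x) / n
    dev = [v - mean for v in x]

    def autocov(lag: int) -> float:
        # Autocovariance at the given lag, normalised by n.
        return sum(dev[t] * dev[t - lag] for t in range(lag, n)) / n

    long_run_var = autocov(0)
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1)  # Bartlett kernel weight
        long_run_var += 2.0 * weight * autocov(lag)
    return math.sqrt(long_run_var / n)


def newey_west_t(x: list[float], lags: int) -> float:
    """t-statistic for H0: mean = 0 using the HAC standard error."""
    return (sum(x) / len(x)) / newey_west_se_of_mean(x, lags)
```

With lags = 0 the estimator collapses to the usual (population-variance) standard error; positive autocorrelation at the included lags widens it.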
ic_ir¶
Descriptive metric — MetricOutput.stat is None and no p_value
is emitted.
- descriptive: mean_ic, std_ic, n_periods, tie_ratio.
fama_macbeth family (factrix.metrics.fama_macbeth)¶
fama_macbeth (emits MetricOutput.name = "fm_beta")¶
- primary: p_value, from the NW HAC t on per-date λ. With is_estimated_factor=True the Shanken EIV correction is applied post-hoc and the corrected p_value replaces the raw value.
- secondary-test (conditional, Shanken applied): p_value_uncorrected, stat_uncorrected.
- descriptive: n_periods, newey_west_lags, forward_periods, is_estimated_factor, warning_codes (conditional).
- descriptive (conditional, Shanken applied): shanken_c, shanken_factor_return_var, shanken_factor_return_var_source.
- descriptive (conditional, σ²_f ≈ 0): shanken_correction="skipped_zero_factor_variance". The correction is undefined when the factor-return variance collapses; the uncorrected NW result is reported.
pooled_ols (emits MetricOutput.name = "pooled_beta")¶
- primary: p_value, from a single- or two-way clustered OLS t. When the cluster count G < 3 the test is short-circuited with stat = None and p_value = 1.0.
- Sample size: MetricOutput.n_obs (row count entering the test).
- descriptive: n_clusters (one-way) or n_clusters_a, n_clusters_b, n_clusters_intersection (two-way).
- descriptive (conditional, short-circuit): reason = "insufficient_clusters", n_clusters (smallest G; the first-class n_obs carries the row count), min_required (always 3).
- descriptive (conditional): variance_non_psd_fallback, which names the fallback path when the meat matrix is non-PSD.
beta_sign_consistency¶
Descriptive; no test.
- descriptive: expected_sign, n_periods.
caar family (factrix.metrics.caar)¶
caar¶
- primary: p_value, from a non-overlapping t on per-event-date CAAR.
- descriptive: n_event_dates, n_sampled, warning_codes (conditional, e.g. FEW_EVENTS).
bmp_test¶
Boehmer-Musumeci-Poulsen standardised-abnormal-return cross-sectional
z test, with optional Kolari-Pynnönen clustering adjustment.
- primary: p_value.
- descriptive: n_events, n_dropped, std_sar, estimation_window, include_prediction_error_variance.
- descriptive (conditional, KP applied): kolari_pynnonen_r, kolari_pynnonen_n_eff, kolari_pynnonen_r_source, kolari_pynnonen_applied, kolari_pynnonen_scaling, stat_uncorrected.
corrado (factrix.metrics.corrado)¶
corrado_rank_test (emits MetricOutput.name = "corrado_rank")¶
- primary: p_value, from the Corrado nonparametric rank z.
- descriptive: n_events, n_total_obs.
hit_rate (factrix.metrics.hit_rate)¶
hit_rate¶
MetricOutput.stat is the binomial hit count when the exact branch
runs, the normal z when the approximation branch runs;
stat_type discriminates ("binomial_hits" vs "z").
- primary: p_value, from a binomial / normal-approximation test on non-overlapping wins (stride forward_periods).
- descriptive: n_hits, n_total.
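The two branches described above can be sketched as follows. The null P(hit) = 0.5, the two-sided alternative, and the exact_max_n cutoff are all assumptions made for illustration; the source does not state which null or switching rule factrix uses. Only the two stat_type labels come from this page.

```python
import math


def hit_rate_test(n_hits: int, n_total: int, exact_max_n: int = 50) -> dict:
    """Two-sided test of H0: P(hit) = 0.5.

    exact_max_n is a hypothetical cutoff for choosing the exact branch.
    """
    if n_total <= 0 or not 0 <= n_hits <= n_total:
        raise ValueError("need 0 <= n_hits <= n_total and n_total > 0")
    if n_total <= exact_max_n:
        # Exact branch: under p = 0.5 the distribution is symmetric, so the
        # two-sided p-value is twice the smaller tail, capped at 1.
        k = min(n_hits, n_total - n_hits)
        tail = sum(math.comb(n_total, i) for i in range(k + 1)) / 2 ** n_total
        return {"stat": float(n_hits), "stat_type": "binomial_hits",
                "p_value": min(1.0, 2.0 * tail)}
    # Approximation branch: normal z for a proportion against 0.5.
    z = (n_hits - 0.5 * n_total) / math.sqrt(0.25 * n_total)
    return {"stat": z, "stat_type": "z",
            "p_value": math.erfc(abs(z) / math.sqrt(2.0))}


small = hit_rate_test(8, 10)    # exact branch, p_value = 0.109375
large = hit_rate_test(60, 100)  # z branch, z = 2.0
```

Because stat means different things on the two branches (a count versus a z-score), a consumer comparing outputs should read stat_type first.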
event_quality (factrix.metrics.event_quality)¶
event_hit_rate¶
Same shape as hit_rate (binomial / normal-approx branches).
- primary: p_value.
- descriptive: n_events, n_hits.
event_ic¶
- primary: p_value, from the Fisher-transformed Spearman ρ between |factor| and signed_car.
- descriptive: n_events.
MetricOutput.stat = None and the short-circuit reason is set to
"not_applicable_discrete_signal" when the signal lacks magnitude
variance (e.g. binary {-1, +1}).
event_skewness¶
- primary (conditional, N ≥ 20): p_value, from the D'Agostino skew z.
- descriptive: n_events.
When N < 20, MetricOutput.stat = None and p_value / stat_type
/ h0 / method are omitted — the metric reports the Fisher
skewness in value only.
profit_factor¶
Descriptive; no test.
- descriptive: total_gains, total_losses, n_events, n_wins, n_losses.
signal_density¶
Per-asset event frequency; descriptive (the period-axis analogue
is clustering_diagnostic).
- descriptive: n_events_total, n_assets_with_events, mean_events_per_asset, mean_bars_between_events.
event_horizon (factrix.metrics.event_horizon)¶
event_around_return¶
Pre/post-event return profile; descriptive.
- descriptive: per_offset (dict offset → {mean, median, p25, p75, hit_rate, n}), interpretation.
- descriptive: p_value (sentinel; not a test result, kept for uniform MetricOutput shape).
monotonicity (factrix.metrics.monotonicity)¶
monotonicity¶
MetricOutput.value carries the magnitude (mean |Spearman|);
MetricOutput.stat carries the cross-asset t on the signed
Spearman series. The split is intentional — magnitude and direction
consistency are read separately.
- primary: p_value, from the cross-asset t (H₀: μ = 0).
- descriptive: mean_signed, n_valid_periods, n_groups, tie_ratio, tie_policy.
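The split between magnitude and direction can be seen in a small sketch: the same list of signed Spearman values yields a large value but a zero stat when the sign keeps flipping. The function name and the flat list input are hypothetical simplifications of however the library aggregates its Spearman series.

```python
import math
import statistics
from typing import Optional


def magnitude_and_direction(signed_rho: list[float]) -> dict:
    """Mean |rho| (magnitude) alongside a t on the signed values (direction)."""
    n = len(signed_rho)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_signed = statistics.fmean(signed_rho)
    sd = statistics.stdev(signed_rho)
    stat: Optional[float] = mean_signed / (sd / math.sqrt(n)) if sd > 0 else None
    return {
        "value": statistics.fmean([abs(r) for r in signed_rho]),
        "mean_signed": mean_signed,
        "stat": stat,
    }


# Strong ordering every time, but the direction alternates.
flipping = magnitude_and_direction([0.8, -0.8, 0.8, -0.8])
# Strong ordering with a stable direction.
stable = magnitude_and_direction([0.8, 0.7, 0.9, 0.8])
```

Here flipping has value 0.8 with stat 0.0, while stable has the same value and a large positive stat, which is why the two fields are read separately.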
quantile (factrix.metrics.quantile)¶
quantile_spread¶
- primary: p_value, from a non-overlapping t-test on the (top − bottom) spread series.
- secondary-test: long_alpha, long_stat, long_p_value (long-leg attribution: mean excess and t / p-value).
- secondary-test: short_alpha, short_stat, short_p_value, short_significance (short-leg attribution).
- descriptive: n_periods, tie_ratio, tie_policy.
quantile_spread_vw¶
Value-weighted variant. Same metadata shape as quantile_spread
plus a weights_lagged flag indicating whether the weighting input
was lagged before the join (descriptive).
concentration (factrix.metrics.concentration)¶
top_concentration¶
H₀: ratio ≥ 0.5 (one-sided). Tests whether the top-bucket
diversity ratio (effective-n / n_top, derived from HHI) falls
below the 0.5 threshold — i.e. concentration risk.
- primary: p_value, from the one-sided t.
- descriptive: mean_n_top, ratio_eff_to_total, tie_ratio, weight_by, warning_codes (conditional).
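To make the diversity ratio concrete, the sketch below derives effective-n from HHI as 1 / Σ sᵢ² over normalised weight shares and divides by the bucket size, then forms a t-statistic against the 0.5 threshold across dates. Both function names are hypothetical, and the per-date weights are assumed to be non-negative; the exact weighting input (weight_by) used by the library is not reproduced here.

```python
import math
import statistics


def diversity_ratio(weights: list[float]) -> float:
    """effective-n / n_top for one top bucket, with effective-n = 1 / HHI."""
    total = sum(weights)
    if not weights or total <= 0:
        raise ValueError("need positive total weight")
    shares = [w / total for w in weights]
    hhi = sum(s * s for s in shares)  # Herfindahl-Hirschman index
    return (1.0 / hhi) / len(shares)


def t_against_threshold(ratios: list[float], threshold: float = 0.5) -> float:
    """t-statistic of mean(ratio) minus threshold; negative values count
    against H0: ratio >= threshold."""
    n = len(ratios)
    sd = statistics.stdev(ratios)
    if sd == 0.0:
        raise ValueError("zero variance across dates")
    return (statistics.fmean(ratios) - threshold) / (sd / math.sqrt(n))


equal = diversity_ratio([1.0, 1.0, 1.0, 1.0])         # 1.0: fully diversified
dominated = diversity_ratio([0.7, 0.1, 0.1, 0.1])     # about 0.48: below 0.5
t_stat = t_against_threshold([0.40, 0.45, 0.35, 0.40])
```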
clustering (factrix.metrics.clustering)¶
clustering_diagnostic (emits MetricOutput.name = "clustering_hhi")¶
Descriptive; period-axis concentration of event dates.
- descriptive: n_events, n_event_dates, effective_n_dates, hhi_normalized, cluster_window.
mfe_mae (factrix.metrics.mfe_mae)¶
mfe_mae_summary¶
Descriptive; no test.
- descriptive: mfe_p50, mae_p75, mae_p95, mfe_mae_ratio, bars_to_mfe_mean, bars_to_mae_mean, n_events.
- descriptive (conditional, when σ-normalised inputs available): mfe_z_p50, mae_z_p75, mfe_mae_ratio_z, n_events_z.
- descriptive: p_value (sentinel).
oos (factrix.metrics.oos)¶
multi_split_oos_decay (emits MetricOutput.name = "oos_decay")¶
MetricOutput.stat = None; rank-based PASS/VETO gate, no formal
hypothesis test.
- descriptive: status ("PASS" / "VETOED"), sign_flipped, per_split (list of {is_ratio, mean_is, mean_oos, survival_ratio, sign_flipped}), survival_threshold, n_splits, method.
spanning (factrix.metrics.spanning)¶
spanning_alpha¶
- primary: p_value, from the OLS t on α in the multivariate spanning regression. Plain (non-HAC) SE, which assumes the input spread series are non-overlapping.
- Sample size: MetricOutput.n_obs (length of the aligned candidate-series).
- descriptive: n_base_factors, base_factors (list of base-factor names), betas (per-base OLS slope dict), r_squared.
- descriptive (conditional, short-circuit): reason.
greedy_forward_selection¶
Stepwise selection meta-metric; MetricOutput.value is NaN and
MetricOutput.stat = None. Per-candidate t-stats are not valid
for inference (selection bias).
- descriptive: selected_factors (list of SpanningResult), eliminated_factors, all_candidates, t_stats_inference_invalid (always True).
trend (factrix.metrics.trend)¶
ic_trend¶
Theil-Sen median slope on the IC series. The reported MetricOutput.stat
is the slope-t derived from the rank-based confidence interval.
- primary: p_value, the slope significance from the Theil-Sen CI.
- descriptive: n_periods, ci_low, ci_high, ci_excludes_zero, intercept.
- descriptive (conditional, augmented Dickey-Fuller (ADF) run): adf_stat, adf_p, unit_root_suspected.
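The Theil-Sen slope itself is simply the median of all pairwise slopes, which the sketch below computes on an evenly spaced series. The function name is hypothetical, and the rank-based confidence interval, slope-t, and intercept that the metric also reports are deliberately left out because their exact construction is not described here.

```python
import statistics


def theil_sen_slope(y: list[float]) -> float:
    """Median of pairwise slopes for observations at t = 0, 1, 2, ..."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    slopes = [(y[j] - y[i]) / (j - i) for i in range(n) for j in range(i + 1, n)]
    return statistics.median(slopes)


clean = theil_sen_slope([0.05, 0.04, 0.03, 0.02, 0.01])
# One wild IC print barely moves the estimate.
with_outlier = theil_sen_slope([0.05, 0.04, 0.50, 0.02, 0.01])
```

Both calls return a slope of about -0.01 per period, which illustrates why a median-based slope is a reasonable fit for noisy IC series.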
ts_beta (factrix.metrics.ts_beta)¶
ts_beta¶
- primary: p_value, from the cross-asset t on the per-asset OLS β distribution.
- descriptive: n_assets, beta_std, median_beta.
mean_r_squared¶
Descriptive; no test.
- descriptive: n_assets, median_r_squared, min_r_squared, max_r_squared.
ts_beta_sign_consistency¶
Descriptive symmetric consistency — value ∈ [0.5, 1.0].
- descriptive: n_assets, fraction_positive.
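The [0.5, 1.0] range follows directly from taking max(p, 1 − p) on the positive-sign fraction, as this small sketch shows. The function name is hypothetical, and dropping exact zeros before counting signs is an assumption; the source does not say how zero betas are treated.

```python
def symmetric_sign_consistency(betas: list[float]) -> tuple[float, float]:
    """Return (max(p, 1 - p), p) where p is the fraction of positive betas.

    Assumption: betas equal to zero are dropped before counting signs.
    """
    nonzero = [b for b in betas if b != 0]
    if not nonzero:
        raise ValueError("no non-zero betas")
    p = sum(1 for b in nonzero if b > 0) / len(nonzero)
    return max(p, 1.0 - p), p


mostly_positive = symmetric_sign_consistency([0.2, 0.1, -0.3, 0.4])   # (0.75, 0.75)
mostly_negative = symmetric_sign_consistency([-1.0, -2.0, -3.0, 1.0])  # (0.75, 0.25)
```

Both examples score 0.75: the metric rewards agreement among assets regardless of which sign they agree on, and fraction_positive is what tells the two cases apart.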
ts_asymmetry (factrix.metrics.ts_asymmetry)¶
ts_asymmetry¶
Two complementary methods:
- Method A (always): Wald χ² on H₀: β_long + β_short = 0 with NW HAC SE.
- Method B (conditional, ≥ 2 distinct values per side): Wald χ² on H₀: β_pos = β_neg.

- primary: p_value (Method A).
- secondary-test (conditional, Method B ran): beta_pos, beta_neg, p_wald_slopes.
- descriptive: beta_long, beta_short, abs_short_over_long, n_pos, n_neg, n_zero, n_periods, nw_lags_used, method_b_skipped (conditional), intercept (conditional), beta_zero (conditional).
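A single-restriction Wald test such as Method A reduces to one line of arithmetic once the coefficient estimates and their covariance are available. The sketch below assumes those inputs are supplied by the caller (in the metric they would come from the NW HAC covariance); the function name and argument names are hypothetical.

```python
import math


def wald_sum_zero(beta_long: float, beta_short: float,
                  var_long: float, var_short: float,
                  cov_long_short: float) -> tuple[float, float]:
    """Wald chi-square (1 df) and p-value for H0: beta_long + beta_short = 0."""
    # Var(a + b) = Var(a) + Var(b) + 2 Cov(a, b)
    var_sum = var_long + var_short + 2.0 * cov_long_short
    if var_sum <= 0.0:
        raise ValueError("variance of the restriction must be positive")
    wald = (beta_long + beta_short) ** 2 / var_sum
    # Survival function of chi-square with 1 df: P(|Z| > sqrt(wald)).
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return wald, p_value


# Illustrative numbers: the long slope dominates the short slope.
wald, p_value = wald_sum_zero(0.5, -0.1, 0.01, 0.01, 0.0)
```

With these inputs the statistic is 8.0 and the p-value is about 0.0047, so the slopes would not be judged to cancel.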
ts_quantile (factrix.metrics.ts_quantile)¶
ts_quantile_spread¶
- primary: p_value, from a Wald χ² (NW HAC) test of H₀: β_top = β_bottom in an OLS fit on bucket dummies.
- secondary-test: spearman_rho, spearman_p, the small-sample Spearman of (bucket-idx, mean-return) used as a monotonicity diagnostic.
- descriptive: n_groups, n_periods, n_distinct_factor, nw_lags_used, buckets (list of {idx, mean_return, n}).
tradability (factrix.metrics.tradability)¶
All four are descriptive — MetricOutput.stat = None and no
p_value is emitted. They feed cost/benefit arithmetic, not
inference.
turnover¶
- descriptive: mean_rank_autocorrelation, std_rank_autocorrelation, n_pairs, forward_periods, quantile, n_cross_section_mean.
notional_turnover¶
- descriptive: n_rebalances, n_groups, forward_periods, mean_tail_size.
breakeven_cost¶
Scalar-input metric (consumes pre-aggregated scalars rather than a date-keyed DataFrame).
- descriptive: gross_spread, turnover, forward_periods.
net_spread¶
Scalar-input metric.
- descriptive: gross_spread, cost_drag, estimated_cost_bps, turnover, forward_periods.
Short-circuit envelope¶
Every metric falls back to a uniform short-circuit MetricOutput
when input data fails the metric's preconditions (insufficient
sample, no events, degenerate signal, …). The fallback shape is:
- value = float("nan"), stat = None, significance = "".
- MetricOutput.n_obs: int | None is the first-class sample size the estimator saw before bailing (e.g. how many periods / events were actually available). Populated when the short-circuit knows the number; None otherwise.
- metadata["reason"]: str names the short-circuit branch (e.g. "insufficient_periods", "no_events", "not_applicable_discrete_signal", "insufficient_clusters").
- metadata["p_value"] = 1.0 is a conservative default: the Benjamini-Hochberg-Yekutieli (BHY) correction then never flags a short-circuited metric as significant, and it does not crash on a missing p-value.
- Optional diagnostic keys naming what was missing or under-spec: min_required, min_required_per_asset, min_required_per_regime, missing_column, std_u, hint, n_distinct. Each is descriptive and emitted only on the short-circuit branch that needed it; consumers should branch on reason before reading.
The auxiliary metadata keys listed in the per-metric subsections
above are not present on the short-circuit path.
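A consumer-side sketch of the two points above: branch on reason before touching branch-specific diagnostic keys, and observe that a p_value of 1.0 passes through a multiple-testing correction without ever being flagged. The metadata dicts are hand-written stand-ins, both function names are hypothetical, and bhy_rejections is a plain Benjamini-Yekutieli step-up rule written for illustration, not the factrix implementation.

```python
from typing import Any, Optional


def short_circuit_note(metadata: dict[str, Any]) -> Optional[str]:
    """Describe a short-circuited output, or return None on the normal path."""
    reason = metadata.get("reason")
    if reason is None:
        return None
    if reason == "insufficient_clusters":
        # n_clusters / min_required are read only on the branch that emits them.
        return (f"{reason}: G={metadata.get('n_clusters')}, "
                f"need {metadata.get('min_required')}")
    return str(reason)


def bhy_rejections(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Benjamini-Yekutieli step-up: True where the null is rejected."""
    m = len(p_values)
    if m == 0:
        return []
    c_m = sum(1.0 / i for i in range(1, m + 1))  # harmonic correction
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / (m * c_m):
            k_max = rank  # largest rank that clears its threshold
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


batch = [
    {"p_value": 0.001},
    {"p_value": 0.30},
    {"reason": "insufficient_clusters", "p_value": 1.0,
     "n_clusters": 2, "min_required": 3},
    {"p_value": 0.004},
]
notes = [short_circuit_note(meta) for meta in batch]
flags = bhy_rejections([meta["p_value"] for meta in batch])
# flags == [True, False, False, True]
```

The short-circuited entry still counts toward the number of tests m, so including it makes the correction slightly stricter for the remaining metrics.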