Warning / info / stat codes
Structured enum payloads attached to every
FactorProfile. Use these as the single source of truth (SSOT)
when you need to filter, route, or trigger downstream behaviour from
factrix output without parsing free-text strings.
The three enums:
- WarningCode: risk flags surfaced by profile.diagnose(). They do not affect primary_p; the user decides whether to pre-filter on warnings before multi-factor Benjamini-Hochberg-Yekutieli (BHY).
- InfoCode: information-severity diagnose annotations (e.g. scope axis collapsed under Mode = TIMESERIES).
- StatCode: canonical names for the scalar statistics that populate FactorProfile.metrics.
Each member's trigger / meaning is sourced from
factrix._codes.<Code>.description (single source of truth, also
returned at runtime by profile.diagnose()). For the per-procedure
breakdown of which codes a given pipeline can emit, see
Architecture § Procedure pipelines.
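As a rough illustration of pre-filtering on warning codes before a multi-factor BHY pass, the sketch below uses a hypothetical DemoProfile stand-in (not the factrix FactorProfile API) and a plain Benjamini-Yekutieli step-up written from the textbook definition. The BLOCKING set is an example policy chosen by the caller, not a factrix default, and the p-values are made up for the demonstration.

```python
from dataclasses import dataclass, field


@dataclass
class DemoProfile:
    """Hypothetical stand-in for a factor result; not the factrix API."""
    name: str
    primary_p: float
    warning_codes: set = field(default_factory=set)


def by_reject(p_values, alpha=0.05):
    """Benjamini-Yekutieli step-up.

    Rejects the k smallest p-values, where k is the largest rank with
    p_(k) <= k * alpha / (m * c(m)) and c(m) = sum_{i=1..m} 1/i.
    """
    m = len(p_values)
    if m == 0:
        return []
    c_m = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / (m * c_m):
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Codes this caller treats as disqualifying (a policy choice for the example).
BLOCKING = {"unreliable_se_short_periods", "small_cross_section_n"}

profiles = [
    DemoProfile("momentum", 0.001),
    DemoProfile("value", 0.02, {"serial_correlation_detected"}),
    DemoProfile("size", 0.04),
    DemoProfile("carry", 0.0005, {"small_cross_section_n"}),
]

# Set intersection on code values: no free-text parsing involved.
kept = [p for p in profiles if not (p.warning_codes & BLOCKING)]
flags = by_reject([p.primary_p for p in kept])
survivors = [p.name for p, rejected in zip(kept, flags) if rejected]
print(survivors)
```

Filtering first shrinks the family size m, which changes the BHY thresholds; whether that is appropriate is exactly the decision the text leaves to the user.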
WarningCode
| WarningCode | Trigger / meaning |
|---|---|
| unreliable_se_short_periods | n_periods is below the WARN floor (~30); NW HAC SE may be biased. Reused across panel time-series guards (MIN_PERIODS_WARN) and primitive inference (MIN_FM_PERIODS_WARN); both default to 30. |
| event_window_overlap | Adjacent events sit within forward_periods; AR windows overlap. |
| persistent_regressor | ADF p > 0.10 on the continuous factor; β may carry Stambaugh bias. |
| serial_correlation_detected | Ljung-Box p < 0.05 on residuals; NW lag may be under-set. |
| small_cross_section_n | PANEL cross-asset t-test with n_assets < MIN_ASSETS (10); df=n_assets-1 too low: t_crit at n_assets=3 ≈ 4.30 (+119% vs asymptotic 1.96). |
| borderline_cross_section_n | PANEL cross-asset t-test with MIN_ASSETS ≤ n_assets < MIN_ASSETS_WARN (10..29); residual t_crit inflation 5–15%; read borderline p-values cautiously. |
| sparse_common_few_events | (COMMON, SPARSE, PANEL) broadcast dummy has MIN_BROADCAST_EVENTS_HARD ≤ n_events < MIN_BROADCAST_EVENTS_WARN (5..19); per-asset β estimable but cross-event averaging too thin for asymptotic t. |
| sparse_magnitude_weighted | Sparse factor column is mixed-sign and not a clean ±1 ternary; statistic is magnitude-weighted (Sefcik-Thompson) rather than textbook MacKinlay signed CAAR. Apply .sign() before calling for sign-flip semantics. |
| few_events_brown_warner | CAAR significance test with MIN_EVENTS_HARD ≤ n_event_dates < MIN_EVENTS_WARN (4..29); t-stat returned but Brown-Warner (1985) convention treats sub-30 events as power-thin for the asymptotic t-distribution; read borderline p-values cautiously. |
| borderline_portfolio_periods | top_concentration with MIN_PORTFOLIO_PERIODS_HARD ≤ n_periods < MIN_PORTFOLIO_PERIODS_WARN (3..19); one-sided t-test on the per-date diversification ratio is returned but df=n-1 inflates t_crit relative to the asymptotic cutoff. |
| rect_kernel_negative_variance | Rectangular-kernel HAC variance-of-mean came out negative (no PSD guarantee, Andrews 1991); clamped to 0 → SE=0, t=0, p=1.0. Fires only on short / mildly anti-correlated samples. |
| singular_weight_matrix | GMM long-run covariance Ŝ was numerically singular; J-statistic was computed via Moore-Penrose pseudo-inverse rather than a true inverse. Fires on rank-deficient or strongly collinear moment matrices. |
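To make the rect_kernel_negative_variance row concrete, here is a minimal standard-library sketch of how an equally weighted (rectangular) HAC sum can go negative and what clamping does to the result. The function names are hypothetical, the two-sided p-value uses an asymptotic normal approximation as a simplifying assumption, and the toy series is deliberately alternating so the effect is easy to see; none of this is the factrix implementation.

```python
import math


def autocovariance(x, lag):
    """Sample autocovariance at `lag`, normalised by n (not n - lag)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n)) / n


def rect_kernel_var_of_mean(x, h):
    """Var(mean) = (g0 + 2 * sum_{j=1..h-1} gj) / n, equal weight on every lag."""
    long_run = autocovariance(x, 0) + 2.0 * sum(
        autocovariance(x, j) for j in range(1, h)
    )
    return long_run / len(x)


def rect_kernel_t_test(x, h):
    """Return (t, se, p, clamped); a negative variance is clamped to 0."""
    raw = rect_kernel_var_of_mean(x, h)
    clamped = raw < 0.0
    se = math.sqrt(max(raw, 0.0))
    mean = sum(x) / len(x)
    t = mean / se if se > 0.0 else 0.0
    # Two-sided p from the normal approximation (an assumption of this sketch).
    p = math.erfc(abs(t) / math.sqrt(2.0))
    return t, se, p, clamped


# Toy series with negative lag-1 autocovariance that outweighs the variance term.
alternating = [1.0, -0.8, 1.1, -0.9, 1.2, -1.0, 0.9, -1.1]
t, se, p, clamped = rect_kernel_t_test(alternating, h=2)
print(t, se, p, clamped)
```

With h=1 the sum reduces to the plain variance term and cannot go negative; the clamp only matters once lagged autocovariances enter with full weight.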
Bases: StrEnum
Procedure-degradation flags (replaces v3 DegradedMode).
Each value carries a one-line description gloss for
profile.diagnose() consumers (review fix UX-4) — pure metadata,
StrEnum value identity is unchanged.
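One common way to attach a description to a StrEnum member while leaving its string value untouched is a custom `__new__`. The sketch below assumes that pattern purely for illustration; DemoWarningCode is a hypothetical class, and the factrix enums may be built differently.

```python
try:
    from enum import StrEnum  # Python 3.11+
except ImportError:  # older interpreters: fall back to a str-mixin Enum
    from enum import Enum

    class StrEnum(str, Enum):
        pass


class DemoWarningCode(StrEnum):
    """Hypothetical enum: each member is (value, description)."""

    def __new__(cls, value, description):
        member = str.__new__(cls, value)
        member._value_ = value            # identity and lookup use the value only
        member.description = description  # pure metadata riding along
        return member

    EVENT_WINDOW_OVERLAP = (
        "event_window_overlap",
        "Adjacent events sit within forward_periods; AR windows overlap.",
    )
    PERSISTENT_REGRESSOR = (
        "persistent_regressor",
        "ADF p > 0.10 on the continuous factor; β may carry Stambaugh bias.",
    )


code = DemoWarningCode("persistent_regressor")  # lookup by plain string value
print(code == "persistent_regressor", code.description)
```

Because the description never enters `_value_`, comparisons, dict keys, and serialised payloads behave exactly as they would for a bare string enum.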
InfoCode
| InfoCode | Trigger / meaning |
|---|---|
| scope_axis_collapsed | N=1 collapsed scope axis; routed via _SCOPE_COLLAPSED sentinel. |
StatCode
| StatCode | Trigger / meaning |
|---|---|
| mean | Cell primary point estimate (interpretation per profile.config.metric: IC mean, FM λ mean, CAAR event-only mean, or TS β / E[β]). |
| t_nw | Newey-West HAC t-stat on the cell primary estimate. Implementation convention lives in factrix.stats.NeweyWest. |
| p_nw | Two-sided p-value from the Newey-West HAC t-test on the cell primary estimate. Sibling of T_NW. |
| t_hh | Hansen-Hodrick (1980) rectangular-kernel HAC t-stat on the cell primary estimate. Sibling of T_NW; uses Var(mean) = (γ₀ + 2 Σ_{j=1..h-1} γⱼ) / n instead of NW's Bartlett kernel. |
| p_hh | Two-sided p-value from the Hansen-Hodrick (1980) rectangular-kernel HAC t-test on the cell primary estimate. Implementation convention lives in factrix.stats.HansenHodrick. |
| j_gmm | Hansen (1982) GMM J-statistic for over-identifying moment restrictions; chi-square distributed under H₀ with df = n_moments - n_params. Implementation convention lives in factrix.stats.GMM. |
| p_gmm | Right-tail p-value from the Hansen (1982) GMM J-test (1 - χ²_df.cdf(J_GMM)). Sibling under the (J_GMM, P_GMM) algorithm-pair convention; computed by factrix.stats.GMM. |
| wald_nwcl | Wald χ² statistic for a linear restriction on slice contrasts / joint coefficients, computed under NW Bartlett HAC plus one-way cluster on the slice grouping. Implementation convention lives in factrix.stats.WaldNWCluster. |
| p_wald_nwcl | P-value from WALD_NWCL. Sibling under the (WALD_NWCL, P_WALD_NWCL) algorithm-pair convention. |
| wald_twoway | Wald χ² statistic for a linear restriction on a panel coefficient vector, computed under two-way cluster on (date, asset) (Cameron-Gelbach-Miller 2011). Implementation convention lives in factrix.stats.WaldTwoWayCluster. |
| p_wald_twoway | P-value from WALD_TWOWAY. Sibling under the (WALD_TWOWAY, P_WALD_TWOWAY) algorithm-pair convention. |
| p_boot | Empirical two-sided p-value from a block-bootstrap resample of a paired-diff statistic. Implementation convention lives in factrix.stats.BlockBootstrap (Politis-Romano stationary or Künsch fixed scheme; Politis-White auto block length). Single key for both schemes; scheme choice is metadata, not StatCode. |
| factor_adf_tau | ADF τ statistic on the factor input series (constant-only specification); fed to the MacKinnon 1996 response-surface for FACTOR_ADF_P. |
| factor_adf_p | ADF unit-root test p-value on the factor input series (MacKinnon 1996 response-surface; constant-only specification). p > 0.05 flags persistent regressor regime. |
| resid_ljung_box_q | Ljung-Box Q statistic on regression residuals (TS-dummy single-asset path); compared against χ²(h) for RESID_LJUNG_BOX_P. |
| resid_ljung_box_p | Ljung-Box p-value on residual autocorrelation (TS-dummy single-asset path); p < 0.05 flags under-set NW lag. |
| event_hhi_value | Herfindahl concentration of event counts across equal-width period bins on the panel's time axis; high values flag time-axis clumping. Does not measure within-asset event clustering. |
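For readers unfamiliar with the Herfindahl measure behind event_hhi_value, the following sketch bins event positions into equal-width period bins and sums the squared shares. The function name, the integer period index, and the default of 10 bins are assumptions made for the example; the actual binning used by factrix is not stated here.

```python
def event_hhi(event_periods, n_periods, n_bins=10):
    """Herfindahl index of event counts over equal-width time bins.

    event_periods: integer period indices in [0, n_periods) where events occur.
    Returns a value between 1/n_bins (evenly spread) and 1.0 (one bin holds all).
    """
    if not event_periods:
        return float("nan")
    counts = [0] * n_bins
    for t in event_periods:
        # Map the period index to its bin; clamp the last edge into the final bin.
        b = min(t * n_bins // n_periods, n_bins - 1)
        counts[b] += 1
    total = len(event_periods)
    return sum((c / total) ** 2 for c in counts)


spread = event_hhi(list(range(0, 100, 10)), n_periods=100)  # one event per bin
clumped = event_hhi(list(range(0, 10)), n_periods=100)      # all events in bin 0
print(spread, clumped)
```

The index only sees how events distribute along the time axis, which is why, as the table notes, it says nothing about clustering within a single asset.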
Bases: StrEnum
Cell-specific scalar stats keyed in FactorProfile.stats.
Naming grammar (#187): each code is <TARGET>_<KIND>.
- TARGET: what is being measured. Primary (cell main effect) carries no prefix because FactorProfile.config (scope/signal/metric) is the single source of truth for cell identity. Diagnostic carries an explicit prefix (FACTOR_ / RESID_ / EVENT_) because the target lives outside config.
- KIND: what kind of number. _MEAN / _VALUE for point estimates, statistic-named suffixes (_T_NW / _TAU / _Q) for test statistics, _P_<algo> for p-values (_P_NW / _P_HH / _P_GMM). Diagnostic p-values keep their <TARGET>_<TEST>_P shape (FACTOR_ADF_P / RESID_LJUNG_BOX_P); the asymmetry is structural, see below.
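The TARGET half of the grammar above can be read off a code value mechanically. The helper below is a hypothetical sketch that works on the lowercase string values from the StatCode table and assumes the three diagnostic prefixes listed above are the complete set.

```python
# Diagnostic prefixes named in the grammar; anything else is a primary stat.
DIAGNOSTIC_TARGETS = ("factor", "resid", "event")


def split_stat_code(code):
    """Split a StatCode value into (target, remainder).

    Primary stats carry no prefix, so they come back with target "primary"
    and the full code as the remainder.
    """
    head, sep, rest = code.partition("_")
    if sep and head in DIAGNOSTIC_TARGETS:
        return head, rest
    return "primary", code


for value in ("mean", "t_nw", "p_wald_twoway", "factor_adf_p", "resid_ljung_box_q"):
    print(value, "->", split_stat_code(value))
```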
Inference primary stats — algorithm-suffixed pair shape
Each inference algorithm emits a (test statistic, p-value) pair with
KINDs that abbreviate the test statistic's reference distribution:
T (Student-t / asymptotic normal), J (Hansen J / χ²),
WALD (Wald χ²), F (Snedecor F), LR (likelihood ratio).
Currently shipping: (T_NW, P_NW) for Newey-West,
(T_HH, P_HH) for Hansen-Hodrick, (J_GMM, P_GMM) for
Hansen (1982) generalized method of moments (GMM) J-test
(over-identification is χ², not t, so GMM emits J rather than T),
(WALD_NWCL, P_WALD_NWCL) for Newey-West (NW)
heteroskedasticity-and-autocorrelation-consistent (HAC) + one-way
cluster on the slice grouping, and (WALD_TWOWAY,
P_WALD_TWOWAY) for two-way cluster on (date, asset) (slice-test
functions, #153 / #176). The Wald pairs follow the same
<KIND>_<ALGO> shape — KIND = WALD (χ² statistic name,
parallel to T), ALGO names the cluster-SE family
(parallel to NW / HH naming the kernel family). P_BOOT
ships alongside as the singleton emitted by BlockBootstrap:
empirical p-values have no parametric test statistic to publish,
and BlockBootstrap is a single Estimator class (fixed vs
stationary scheme is a ctor arg living in metadata).
Why primary p-value is P_<algo> while diagnostic p-value is
<target>_<test>_P: primary p has a single conceptual target
(the cell's primary estimate, identified by profile.config) so
the prefix slot carries the algorithm choice. Diagnostic p has
multiple non-primary targets (factor input / residual / event
distribution) so the prefix slot carries the target axis and the
test name floats with KIND. Both grammars co-exist deliberately.
Redesign trigger — when (a) ≥ 4 inference algorithms ship
concurrently or (b) ≥ 3 distinct test-statistic KINDs (T / J /
Wald / F / LR) coexist, the flat <KIND>_<ALGO> enum becomes
a (kind × algo) cardinality product and a structured shape
(profile.inference[Algo.X] = {test_stat, kind, p, df}) earns
its breaking-change cost. Below those thresholds the flat
enum stays cheaper. As of #191 there are 6 algorithms
(NW / HH / GMM / NWCL / DC / BlockBootstrap) and 3 KINDs
(T / J / WALD), so the flat enum is over budget on both axes; any
further inference algorithm must trigger the structured-shape
redesign discussion before the enum is extended.
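The budget check and the structured shape can be sketched from the flat codes alone. In the example below the PAIRS table, its short algorithm keys, and to_structured are all hypothetical; the numeric values are made up; and the df field from the proposed shape is left out because the flat metrics in this toy input do not carry it.

```python
# (test-statistic code, p-value code) per algorithm, taken from the StatCode table.
# Keys are illustrative labels, not factrix identifiers.
PAIRS = {
    "nw": ("t_nw", "p_nw"),
    "hh": ("t_hh", "p_hh"),
    "gmm": ("j_gmm", "p_gmm"),
    "nwcl": ("wald_nwcl", "p_wald_nwcl"),
    "twoway": ("wald_twoway", "p_wald_twoway"),
    "boot": (None, "p_boot"),  # empirical p-value, no parametric statistic
}

# KIND is the leading token of each test-statistic code.
kinds = {stat.split("_")[0] for stat, _ in PAIRS.values() if stat is not None}
over_algo_budget = len(PAIRS) >= 4   # trigger (a)
over_kind_budget = len(kinds) >= 3   # trigger (b)


def to_structured(flat):
    """Regroup flat {code: value} metrics into {algo: {test_stat, kind, p}}."""
    out = {}
    for algo, (stat_code, p_code) in PAIRS.items():
        if p_code not in flat:
            continue
        out[algo] = {
            "test_stat": flat.get(stat_code) if stat_code else None,
            "kind": stat_code.split("_")[0] if stat_code else None,
            "p": flat[p_code],
        }
    return out


flat_metrics = {"t_nw": 2.5, "p_nw": 0.012, "p_boot": 0.03}  # illustrative numbers
print(over_algo_budget, over_kind_budget, to_structured(flat_metrics))
```

Grouping by algorithm makes the (kind × algo) product explicit as nested keys instead of encoding it in enum member names.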
Convention: df always means statistical degrees of freedom.
Wherever df appears in factrix StatCode descriptions, metadata
inner-dict keys (profile.metadata[StatCode.X]["df"]), or
profile.inference[...] schema fields, it carries the statistics
sense — never a DataFrame. This matches scipy's API
(scipy.stats.chi2.sf(..., df=...)) and is uniform across the
codebase. DataFrames are spelled out as DataFrame in type hints
and as df only in user-facing function-argument names where the
Python variable convention is unambiguous from context.
is_p_value returns True for any code whose
underscore-separated tokens include "p". The
family-function estimator= kwarg (#170) dispatches via
Estimator.emits_for
and is implicitly a p-value source by construction.
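The token rule described above is small enough to restate as a standalone function. This is a sketch of the stated rule applied to plain string values, not the factrix source.

```python
def is_p_value(code):
    """True when any underscore-separated token of the code is exactly "p"."""
    return "p" in code.lower().split("_")


for value in ("p_nw", "p_wald_twoway", "factor_adf_p", "t_nw", "event_hhi_value"):
    print(value, is_p_value(value))
```

Matching whole tokens rather than substrings keeps codes such as event_hhi_value or resid_ljung_box_q from being misread as p-values, and it covers both the P_<algo> prefix form and the <TARGET>_<TEST>_P suffix form.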