factrix.metrics.hit_rate
Hit rate computation for any time-indexed series.
Notes
Pipeline. Time-series only: the 1-D series is sampled at non-overlapping dates, then tested with a binomial test against p = 0.5.
Input. DataFrame with date and value columns, or a 1-D array.
Output. Proportion of periods where the value satisfies a condition (default: value > 0).
factrix.metrics.hit_rate.hit_rate
hit_rate(series: DataFrame, value_col: str = 'value', forward_periods: int = 5) -> MetricOutput
Hit rate = proportion of periods where value > 0.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| series | DataFrame | DataFrame with date and value columns. | required |
| forward_periods | int | Sampling interval for non-overlapping dates. | 5 |
Returns:
| Type | Description |
|---|---|
| MetricOutput | MetricOutput with value = hit rate (0.0-1.0). |
Notes
\(\text{rate} = \#\{t : \text{value}_t > 0\} / n\) on a non-overlapping subsample at stride forward_periods. Two-sided binomial test against \(H_0: p = 0.5\): exact binomial below _BINOMIAL_EXACT_CUTOFF, normal-approximation z-test \(z = (\text{rate} - 0.5)\sqrt{n} / 0.5\) above.
factrix reports the actual statistic (hits or z) consistent with the test branch taken, so a reader cannot mistake an exact-binomial p for a Gaussian z. Non-overlap stride mirrors the information coefficient (IC) pipeline so autocorrelation from overlapping forward returns does not leak in.
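The two branches described above can be sketched in plain Python. This is a minimal illustration, not factrix's implementation: `hit_rate_sketch`, `binom_two_sided_p`, the returned dict, and the `exact_cutoff=30` default are all assumptions made for the example (the actual value of _BINOMIAL_EXACT_CUTOFF is not stated here). The exact branch uses the common "sum of all outcomes no more likely than the observed one" definition of a two-sided p-value, which may differ from the definition factrix uses.

```python
import math


def binom_two_sided_p(hits: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: total mass of outcomes
    that are no more likely than the observed hit count."""
    pmf = [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    threshold = pmf[hits] * (1 + 1e-7)  # tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


def hit_rate_sketch(values, forward_periods=5, exact_cutoff=30):
    """Hypothetical stand-in for the documented pipeline.

    exact_cutoff is an assumed value, not factrix's _BINOMIAL_EXACT_CUTOFF.
    """
    sampled = values[::forward_periods]  # non-overlapping subsample
    n = len(sampled)
    if n == 0:
        raise ValueError("empty series after subsampling")
    hits = sum(1 for v in sampled if v > 0)
    rate = hits / n
    if n < exact_cutoff:
        # Exact branch: the reported statistic is the hit count.
        stat, stat_type = hits, "hits"
        p_value = binom_two_sided_p(hits, n)
    else:
        # Normal-approximation branch: z = (rate - 0.5) * sqrt(n) / 0.5.
        stat, stat_type = (rate - 0.5) * math.sqrt(n) / 0.5, "z"
        p_value = math.erfc(abs(stat) / math.sqrt(2))  # two-sided Gaussian tail
    return {"rate": rate, "n": n, "stat": stat,
            "stat_type": stat_type, "p_value": p_value}


# 50 points at stride 5 leave 10 sampled points, 8 of them positive.
small = [0.1] * 50
small[5] = -0.1
small[10] = -0.1
exact_result = hit_rate_sketch(small, forward_periods=5)

# 100 sampled points with 60 hits fall in the z-test branch.
large = [0.1] * 60 + [-0.1] * 40
normal_result = hit_rate_sketch(large, forward_periods=1)
```

Keeping `stat` and `stat_type` together in the result is what lets a caller tell which branch produced the p-value.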
References
Hansen-Hodrick 1980: overlapping-return autocorrelation horizon motivating the non-overlap stride.
Examples:
Hit rate of a per-date IC series produced by factrix.metrics.ic.compute_ic:
>>> import factrix as fx
>>> from factrix.preprocess import compute_forward_return
>>> from factrix.metrics.ic import compute_ic
>>> from factrix.metrics.hit_rate import hit_rate
>>> panel = compute_forward_return(
...     fx.datasets.make_cs_panel(n_assets=80, n_dates=180, seed=0),
...     forward_periods=5,
... )
>>> series = compute_ic(panel).rename({"ic": "value"}).select("date", "value")
>>> result = hit_rate(series, forward_periods=5)
>>> result.name
'hit_rate'
Use cases
- Sign-significance on a factor-return series
  hit_rate is a (*, CONTINUOUS, *, TIMESERIES) diagnostic: input is a 1-D series with (date, value), not the raw panel. Typical pipes: per-date information coefficient (IC) from compute_ic, quantile spread from compute_spread_series, or any other factor-mimicking-portfolio return series. Reports the fraction of periods with value > 0 against \(H_0: p = 0.5\).
- Two-branch test under one \(p\)-value
  Below _BINOMIAL_EXACT_CUTOFF the test is the exact two-sided binomial; above the cutoff it switches to the normal-approximation \(z\). stat / stat_type track the branch actually taken, so a reader can never see stat=z paired with an exact-binomial \(p\).
- Non-overlap stride matches the IC pipeline
  The series is sub-sampled at stride forward_periods before the test so the MA dependence induced by overlapping forward returns does not leak in; same convention as quantile_spread and the other non-overlap inference paths.
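To make the non-overlap argument concrete, the sketch below counts how many one-period returns two forward-return windows share, then applies the stride to a toy (date, value) series. The helper names `shared_periods` and `non_overlapping` are hypothetical, and starting the stride at the first date is an assumption; factrix may anchor the subsample differently.

```python
from datetime import date, timedelta


def shared_periods(t0: int, t1: int, h: int) -> int:
    """Number of one-period returns shared by the h-period forward
    windows that start at t0 and t1."""
    return len(set(range(t0, t0 + h)) & set(range(t1, t1 + h)))


def non_overlapping(rows, forward_periods: int):
    """Keep every forward_periods-th row of a date-sorted (date, value) series."""
    ordered = sorted(rows, key=lambda row: row[0])
    return ordered[::forward_periods]


h = 5
adjacent_overlap = shared_periods(0, 1, h)  # adjacent dates share h - 1 periods
strided_overlap = shared_periods(0, h, h)   # dates h apart share none

start = date(2024, 1, 1)
rows = [(start + timedelta(days=i), 0.01 * (i + 1)) for i in range(12)]
sampled = non_overlapping(rows, forward_periods=h)
sampled_dates = [d for d, _ in sampled]
```

Adjacent windows share all but one period, which is the source of the moving-average dependence; at a stride equal to the horizon the windows are disjoint.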
Choosing a function
| Goal | Function |
|---|---|
| Hit-rate significance vs \(p = 0.5\) on a (date, value) series | hit_rate |
| Per-date hit indicator (value > 0 cast to float) for slice-test plumbing | per_date_series |
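The per-date indicator in the second row is simply value > 0 cast to float, and its mean is the hit rate. A plain-Python sketch follows; `hit_indicator` is a hypothetical helper written for illustration and does not reproduce the signature or return type of per_date_series.

```python
def hit_indicator(rows):
    """Map each (date, value) pair to (date, 1.0 if value > 0 else 0.0)."""
    return [(d, float(v > 0)) for d, v in rows]


rows = [
    ("2024-01-01", 0.02),
    ("2024-01-02", -0.01),
    ("2024-01-03", 0.0),   # strict inequality: a zero value is not a hit
    ("2024-01-04", 0.05),
]
indicators = hit_indicator(rows)
flags = [flag for _, flag in indicators]
mean_hit = sum(flags) / len(flags)  # equals the hit rate of the series
```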
Worked example: IC series fed into hit_rate
compute_ic → hit_rate
import factrix as fx
from factrix.metrics.ic import compute_ic
from factrix.metrics.hit_rate import hit_rate
from factrix.preprocess import compute_forward_return
raw = fx.datasets.make_cs_panel(
    n_assets=100, n_dates=500, ic_target=0.08, seed=2024,
)
panel = compute_forward_return(raw, forward_periods=5)
# The series diagnostic consumes (date, value); the value column on
# the compute_ic output is named ``ic``.
ic_df = compute_ic(panel)
out = hit_rate(ic_df, value_col="ic", forward_periods=5)
print(out.value, out.stat, out.metadata["p_value"], out.metadata["method"])
# 0.62 19 0.011 exact-binomial (approximate)
See also
- compute_ic / compute_spread_series
  Canonical producers of the (date, value) series this diagnostic consumes. compute_ic emits (date, ic, tie_ratio); compute_spread_series emits (date, spread, ...).
- trend / oos
  Sibling series diagnostics on the same input shape: slope detection and IS/out-of-sample (OOS) persistence.
- by_slice
  Per-slice hit-rate summaries; uses per_date_series as the slice-test capability hook.
- Statistical methods
  Two-sided binomial branching, non-overlap stride discipline, and the Hansen-Hodrick autocorrelation floor that motivates it.
- Metric applicability reference
  When this metric applies and the sample-size guards that gate it (MIN_ASSETS_PER_DATE_IC floor on the sampled series).
- Series diagnostics landing
  Adjacent axis-agnostic series diagnostics.