Stats: the receipts, as they accumulate

Recent trend

Active calibration

The model has a learning loop: every night it re-reads the log and computes how much to shrink each segment's edges based on observed hit rates. A factor of 1.00 means raw model output; anything below 1.00 means the segment has historically underperformed enough that the published edge gets discounted. The factor is bounded at a 0.50 minimum and capped at 1.00: we can shrink, never inflate.
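As a rough illustration of the bounds described above, here is a minimal sketch. The exact formula the nightly job uses is not stated on this page, so the ratio of observed to expected hit rate is an assumption, as are the names `shrink_factor`, `published_edge`, and `expected_rate`. Only the 0.50 floor and the 1.00 ceiling come from the text.

```python
def shrink_factor(hits: int, plays: int, expected_rate: float,
                  floor: float = 0.50) -> float:
    """Hypothetical shrink factor for one segment.

    Assumption: factor = observed hit rate / expected hit rate,
    clamped to [floor, 1.0]. The page only states the bounds.
    """
    if plays == 0 or expected_rate <= 0:
        return 1.0  # no evidence yet: leave the raw model output alone
    ratio = (hits / plays) / expected_rate
    # Never inflate (cap at 1.0), never shrink below the floor.
    return max(floor, min(1.0, ratio))


def published_edge(raw_edge: float, factor: float) -> float:
    """Discount the raw model edge by the segment's shrink factor."""
    return raw_edge * factor
```

Under these assumptions, a segment that hits more often than expected still gets a factor of 1.00, and a segment that badly underperforms bottoms out at 0.50.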

Lineup-aware coverage — afternoon re-run

The morning slate uses team-level offense. At 5 PM ET we re-fetch each game's posted lineup and rebuild the offense number from the actual hitters (weighted by batting order, with L/R splits against the opposing starter). Picks can change. This table tracks coverage and how often the lineup re-run flipped the model's pick from the morning slate.
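The re-run can be sketched roughly as follows. Everything specific here is hypothetical: the `ORDER_WEIGHTS` values, the per-hitter `vs_L` / `vs_R` fields, and the reading of "coverage" as the share of morning games that had a posted lineup at re-run time. The page does not give the actual weights or definitions.

```python
# Hypothetical weights: earlier batting-order slots count for more.
ORDER_WEIGHTS = [1.12, 1.10, 1.07, 1.04, 1.01, 0.98, 0.95, 0.92, 0.89]


def lineup_offense(lineup: list[dict], starter_hand: str) -> float:
    """Weighted average of each hitter's split against the starter's hand.

    lineup: hitters in batting order, each with 'vs_L' and 'vs_R' values.
    starter_hand: 'L' or 'R' for the opposing starter.
    """
    key = "vs_L" if starter_hand == "L" else "vs_R"
    pairs = list(zip(ORDER_WEIGHTS, lineup))
    total = sum(w * hitter[key] for w, hitter in pairs)
    return total / sum(w for w, _ in pairs)


def rerun_summary(morning: dict[str, str],
                  afternoon: dict[str, str]) -> tuple[float, float]:
    """Return (coverage, flip_rate).

    morning: game id -> morning pick.
    afternoon: game id -> re-run pick, only for games with a posted lineup.
    """
    if not morning:
        return 0.0, 0.0
    covered = [g for g in morning if g in afternoon]
    coverage = len(covered) / len(morning)
    if not covered:
        return coverage, 0.0
    flips = sum(1 for g in covered if afternoon[g] != morning[g])
    return coverage, flips / len(covered)
```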

Rolling model win %

AGREE vs FLIP — the audit thesis

AGREE = model and market favour the same side; the model just thinks the favourite should be bigger. FLIP = they disagree on which team is the favourite. If FLIP plays systematically underperform AGREE, the edge filter is leaking, and we know what to fix.
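The split above could be computed along these lines. This is a sketch that assumes both the model and the market are expressed as a win probability for the same team (here the home team). The function name `classify_play` and the handling of an exact 0.50 as favouring the home side are illustrative choices, not something the page specifies.

```python
def classify_play(model_p_home: float, market_p_home: float) -> str:
    """Label a play AGREE or FLIP.

    AGREE: model and market favour the same team.
    FLIP: they disagree on which team is the favourite.
    Assumption: a probability of exactly 0.50 counts as favouring home.
    """
    model_fav_home = model_p_home >= 0.5
    market_fav_home = market_p_home >= 0.5
    return "AGREE" if model_fav_home == market_fav_home else "FLIP"
```

For example, a model at 62% against a market at 55% on the same team is AGREE, while a model at 45% against a market at 55% is FLIP.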

Hit rate & ROI by edge bucket

All ROI numbers assume flat staking at -110 (52.4% breakeven). Wilson 95% confidence intervals are printed alongside; small samples have very wide ranges and shouldn't be over-read.
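For reference, the arithmetic behind those numbers can be sketched as below. The function names are illustrative, and pushes are ignored for simplicity, which is an assumption. At -110 a bettor risks 110 to win 100, so breakeven is 110 / 210, about 52.4%.

```python
import math


def breakeven_rate(risk: float = 110.0, win: float = 100.0) -> float:
    """Win rate needed to break even when risking `risk` to win `win`."""
    return risk / (risk + win)


def flat_roi(wins: int, losses: int,
             risk: float = 110.0, win: float = 100.0) -> float:
    """ROI per unit staked with one unit on every play (pushes ignored)."""
    plays = wins + losses
    if plays == 0:
        return 0.0
    profit = wins * (win / risk) - losses
    return profit / plays


def wilson_interval(wins: int, plays: int,
                    z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a hit rate (z = 1.96 gives about 95%)."""
    if plays == 0:
        return 0.0, 1.0  # no data: the interval is the whole range
    p = wins / plays
    denom = 1 + z * z / plays
    centre = (p + z * z / (2 * plays)) / denom
    half = z * math.sqrt(p * (1 - p) / plays
                         + z * z / (4 * plays * plays)) / denom
    return centre - half, centre + half
```

With 11 wins in 20 plays the hit rate is 55% and flat ROI at -110 is +5%, but the Wilson interval runs from roughly 34% to 74%, which straddles the breakeven rate. That is the sense in which small samples shouldn't be over-read.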

Calibration curve

When the model says a team has a 60-65% chance to win, does that team actually win ~60-65% of the time? A bar below the predicted line means the model is overconfident in that band; a bar above means it is underconfident.
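A calibration table of this kind can be built roughly as follows. This is a sketch, not the page's actual code: the 5-point band width matches the 60-65% example, but the function name, the row layout, and the use of the band's mean prediction as the "predicted line" are assumptions.

```python
from collections import defaultdict


def calibration_table(preds: list[float], outcomes: list[int],
                      width_pct: int = 5) -> list[dict]:
    """Group predictions into probability bands and compare to results.

    preds: predicted win probabilities in [0, 1].
    outcomes: 1 if the team won, 0 otherwise.
    A negative gap (observed below predicted) means overconfident.
    """
    last_band = 100 // width_pct - 1
    bands = defaultdict(lambda: {"n": 0, "wins": 0, "pred_sum": 0.0})
    for p, won in zip(preds, outcomes):
        # Integer basis points avoid float trouble at band edges like 0.60.
        idx = min(round(p * 10000) // (width_pct * 100), last_band)
        band = bands[idx]
        band["n"] += 1
        band["wins"] += won
        band["pred_sum"] += p
    rows = []
    for idx in sorted(bands):
        band = bands[idx]
        observed = band["wins"] / band["n"]
        predicted = band["pred_sum"] / band["n"]
        rows.append({
            "band": (idx * width_pct / 100, (idx + 1) * width_pct / 100),
            "n": band["n"],
            "predicted": predicted,
            "observed": observed,
            "gap": observed - predicted,
        })
    return rows
```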