Honest receipts

The Calibration Index.

Every prediction we make is logged. Every outcome is scored. Honest receipts for a discipline that lives or dies by them.

88.7%
Distribution accuracy on Pew US benchmarks · How we measure
Live · today
Distribution accuracy.

How well our calibrated populations replicate known attitude and behaviour distributions on benchmark surveys.
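The scoring formula is not spelled out in this section. As a minimal sketch, assuming accuracy is scored as one minus the total variation distance between simulated and benchmark response shares, a single survey question could be scored as below. The function name `distribution_accuracy` and the sample shares are hypothetical and for illustration only.

```python
def distribution_accuracy(simulated, benchmark):
    """Return 1 - total variation distance between two response distributions.

    Both arguments map response option -> share, and each must sum to 1.
    This metric is an assumption for illustration, not a documented method.
    """
    for name, dist in (("simulated", simulated), ("benchmark", benchmark)):
        total = sum(dist.values())
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"{name} shares sum to {total}, expected 1.0")
    # An option missing from one side counts as a share of 0 there.
    options = set(simulated) | set(benchmark)
    tvd = 0.5 * sum(
        abs(simulated.get(o, 0.0) - benchmark.get(o, 0.0)) for o in options
    )
    return 1.0 - tvd


# Hypothetical shares for one survey question (illustrative numbers only).
benchmark = {"agree": 0.50, "neutral": 0.30, "disagree": 0.20}
simulated = {"agree": 0.45, "neutral": 0.30, "disagree": 0.25}
score = distribution_accuracy(simulated, benchmark)
```

Under this assumed metric, identical distributions score 1.0 and completely disjoint ones score 0.0; a headline figure would then be an average over many benchmark questions.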

What we have shipped: 88.7% on Pew US benchmarks; 85.3% on Pew India (the first system to replicate India at population scale); and results 4.9× closer to the human self-consistency ceiling than the average frontier LLM, across 5,878 SHA-256-verified API calls.
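How each call is verified is not detailed here. One way a SHA-256 receipt can work, sketched under the assumption that each logged call is serialized to canonical JSON and its digest published alongside it, is shown below. The helper names `receipt_digest` and `verify_receipt` and the record fields are hypothetical.

```python
import hashlib
import json


def receipt_digest(record):
    """Hash a logged call deterministically.

    Keys are sorted and separators are compact so the same record always
    produces the same bytes, and therefore the same SHA-256 digest.
    """
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_receipt(record, published_digest):
    """Return True when the record still hashes to the published digest."""
    return receipt_digest(record) == published_digest


# Hypothetical log entry; the field names are illustrative only.
record = {"call_id": 1, "prompt": "example question", "response": "example answer"}
digest = receipt_digest(record)

# Any edit to the logged record changes the digest, so verification fails.
tampered = dict(record, response="edited answer")
```

Here `verify_receipt(record, digest)` returns True and `verify_receipt(tampered, digest)` returns False, which is the property that lets a reader check a published log has not been altered after the fact.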

In progress
Outcome accuracy.

Whether a decision we predicted to succeed or fail in market actually did.

What we have shipped: the tracking structure. Data accumulates with each shipped engagement and each Blind Replay. Distribution accuracy is necessary but not sufficient: outcome accuracy is the discipline's load-bearing claim, and it must be earned with every engagement.

Blind replays

Cold predictions on public launches.

A Blind Replay is a forecast made on a public 2024–25 consumer launch whose outcome is now known. We publish the prediction, the actual outcome, and the verdict: hit, miss, or partial. The empty seats are the point.

BLIND REPLAY · 01 · Coming Q2 2026
Target · Pending disclosure
  • Launch type: CPG holiday SKU
  • Prediction shipping in Q2
  • Actual outcome to be measured 90 days post-launch
  • Verdict: TBD
BLIND REPLAY · 02 · Coming Q2 2026
Target · Pending disclosure
  • Launch type: DTC national expansion
  • Prediction shipping in Q2
  • Actual outcome to be measured 90 days post-launch
  • Verdict: TBD
Scope of calibration

Where it holds.
Where it doesn't.

A platform that doesn't publish a scope statement is selling intuition, not engineering. Here's where calibration is live, partial, or absent.

Regions: US · India · EU · LATAM
Categories: CPG Food/Bev · CPG Personal Care · Fintech · SaaS B2B

● Live · ◐ Partial · ○ Not yet calibrated