Solving the perfect bracket.
An internal research program building the first systematically-derived perfect NCAA tournament bracket — a 1-in-9.2-quintillion problem that has resisted every attempt for forty years and currently carries more than a billion dollars in standing prizes.
PYTHIA — Probabilistic Yield-Tree Heuristic Inference Architecture
March Madness. The bracket the world has decided is impossible.
PYTHIA is built around a single, well-defined target: the NCAA Division I men's basketball tournament. Sixty-four teams. Sixty-three games. Two outcomes per game. The correct path through the bracket is one — out of nine quintillion two hundred twenty-three quadrillion, three hundred seventy-two trillion, and change.
2⁶³ = 9,223,372,036,854,775,808 candidate configurations.
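As a quick check on that figure, the count follows from the shape of a 64-team single-elimination bracket: each round halves the field, so the rounds hold 32, 16, 8, 4, 2, and 1 games, 63 in total, and each game has two possible winners. A minimal Python sketch (the function name is illustrative, not part of PYTHIA):

```python
def games_in_bracket(n_teams: int) -> int:
    """Count games in a single-elimination bracket of n_teams (a power of two)."""
    if n_teams < 2 or n_teams & (n_teams - 1):
        raise ValueError("n_teams must be a power of two, at least 2")
    games = 0
    while n_teams > 1:
        n_teams //= 2      # half of the remaining teams are eliminated this round
        games += n_teams   # one game per eliminated team
    return games


n_games = games_in_bracket(64)
n_brackets = 2 ** n_games  # two possible winners per game

print(n_games, f"{n_brackets:,}")  # 63 9,223,372,036,854,775,808
```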
For four decades, across more than a billion submitted brackets, the problem has resisted everything thrown at it — crowd consensus, expert picks, brute-force compute, and every modeling effort on record. ESPN alone processes tens of millions of brackets a year, and not one of them has ever survived past the second weekend intact.
The interesting feature is not the size. It is that the size lives inside a system with structure. That gap — between uniform probability and structured probability — is the entire research surface PYTHIA was designed to operate on.
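One way to see the gap between uniform and structured probability: under the uniform view every bracket is equally likely, while a picker who gets each game right more often than a coin flip has a far better chance. The sketch below assumes, purely for illustration, a constant and independent per-game accuracy. The 0.7 figure is a placeholder, not a PYTHIA result.

```python
N_GAMES = 63


def perfect_bracket_probability(per_game_accuracy: float,
                                n_games: int = N_GAMES) -> float:
    """Chance of picking every game correctly, assuming independent games
    that all share the same pick accuracy (a simplifying assumption)."""
    if not 0.0 <= per_game_accuracy <= 1.0:
        raise ValueError("per_game_accuracy must be in [0, 1]")
    return per_game_accuracy ** n_games


uniform = perfect_bracket_probability(0.5)     # coin flips: 1 / 2**63
structured = perfect_bracket_probability(0.7)  # hypothetical informed picker

print(f"uniform:    1 in {1 / uniform:.3g}")
print(f"structured: 1 in {1 / structured:.3g}")
```

Under these assumptions a per-game edge shortens the odds by several orders of magnitude, yet a perfect bracket remains extremely unlikely. That remaining distance is the gap the text describes.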
9.22 × 10¹⁸
Uniform-probability outcome space.
0
Perfect brackets in forty years of public attempts.
$1B+
In prizes currently posted across operators.
49 / 63
Best documented partial solution.
Search smaller, not harder.
The frontier consensus on hard search is to throw more compute at the same space. PYTHIA does the opposite: it changes the shape of the space before any compute is committed. Most prior bracket attempts have tried to find a needle in a haystack. We are working on the haystack.
The space is large. It is not random.
Genuine probabilistic structure exists in this system and has been visible in the historical record for decades. That structure is exploitable. Reduction precedes prediction: every modeling cycle PYTHIA runs operates on a deliberately collapsed space, not on the full nine-quintillion surface. Compute spent inside an unreduced space is compute wasted.
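The reduction rules themselves are not published, so the following is only a toy illustration of why reduction pays off. If some hypothetical rule pinned the outcome of k games before any model ran, the remaining space would be 2^(63 - k). The function names and the value of k are assumptions made for this example.

```python
import math

N_GAMES = 63


def remaining_configurations(n_fixed: int, n_games: int = N_GAMES) -> int:
    """Brackets still possible after n_fixed game outcomes are pinned."""
    if not 0 <= n_fixed <= n_games:
        raise ValueError("n_fixed must be between 0 and n_games")
    return 2 ** (n_games - n_fixed)


def orders_of_magnitude_removed(n_fixed: int) -> float:
    """Base-10 orders of magnitude by which the space shrinks."""
    return n_fixed * math.log10(2)


k = 20  # hypothetical number of pinned games, chosen only for illustration
print(f"{remaining_configurations(k):,}")       # 8,796,093,022,208
print(f"{orders_of_magnitude_removed(k):.1f}")  # 6.0
```

Each pinned game halves the space, so roughly every ten pinned games remove three orders of magnitude.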
Four modules. One pipeline.
PYTHIA is a single architecture composed of four modules that run in sequence and feed each other. Each is independently evaluated. None is allowed to compensate for failure in another. The specifics of each module — the constraints, the models, the validation regime — are intentionally not published.
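The module internals are not published, but the sequencing rule described above (run in order, evaluate each stage on its own, and never let a later stage paper over an earlier failure) can be sketched generically. Everything below, including the `Stage` type and the placeholder stage functions, is hypothetical scaffolding rather than PYTHIA's code.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Stage:
    name: str
    run: Callable[[Any], Any]     # transforms the previous stage's output
    check: Callable[[Any], bool]  # stage-local evaluation gate


class StageFailure(RuntimeError):
    """Raised when a stage does not pass its own evaluation."""


def run_pipeline(stages: list[Stage], data: Any) -> Any:
    """Run stages in order; stop at the first stage that fails its own check."""
    for stage in stages:
        data = stage.run(data)
        if not stage.check(data):
            raise StageFailure(f"{stage.name} failed its evaluation")
    return data


# Placeholder stages: each only tags the data so the flow is visible.
stages = [
    Stage("reduction", lambda d: d + ["reduced"], lambda d: "reduced" in d),
    Stage("inference", lambda d: d + ["scored"], lambda d: "scored" in d),
    Stage("generation", lambda d: d + ["candidates"], lambda d: "candidates" in d),
    Stage("validation", lambda d: d + ["validated"], lambda d: "validated" in d),
]

result = run_pipeline(stages, [])
print(result)
```

Because each gate inspects only its own stage's output and a failure halts the run, a weak early stage cannot be masked by a strong later one.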
Reduction.
Collapses the live configuration space using deterministic, structural features of the system itself — before any probabilistic model is consulted.
Status: Built
Inference.
An ensemble of heterogeneous predictive models operating inside the reduced space, designed so that each model's failure modes are absorbed by the others' strengths.
Status: In Progress
Generation.
Produces candidate paths at scale, stratified across the ensemble's posterior, with explicit coverage guarantees on high-likelihood regions.
Status: Specified
Validation.
Adversarial evaluation against held-out historical regimes, with no candidate path permitted into the production set without surviving out-of-distribution stress.
Status: Specified
Four commitments. None negotiable.
Every decision inside PYTHIA is tested against these. They are not heuristics. They are the contract the architecture is held to, regardless of result.
Reduction precedes prediction.
Predictive models are never asked to do work that structural analysis can do for free. Compute spent inside an unreduced space is compute wasted, and we treat it as such.
Out-of-sample is the only sample.
Any result that has not survived data the model has never seen is treated as an artifact of fitting, not evidence of skill. Backtests on training data do not count as evidence.
The system, not the bracket.
The interesting object is the architecture itself. The terminal output is a byproduct. A method that solves once but cannot be explained or repeated is not a method — it is an anecdote.
Quiet by default.
Specifics, partners, vendors, and intermediate findings stay internal until there is no longer a reason to keep them there. There is no commercial surface to defend, and no audience to perform for.
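Of the four commitments, "out-of-sample is the only sample" is the easiest to make concrete. One common approach is a season-level holdout: whole tournaments are withheld from fitting, and only results on those withheld seasons count as evidence. A minimal sketch, assuming game records are dictionaries with a `season` key (a hypothetical data shape, since the actual validation regime is not published):

```python
def split_by_season(records, holdout_seasons):
    """Split records so that held-out seasons never appear in training data."""
    holdout = set(holdout_seasons)
    train = [r for r in records if r["season"] not in holdout]
    test = [r for r in records if r["season"] in holdout]
    return train, test


# Made-up placeholder records, only to exercise the split.
records = [
    {"season": season, "game_id": game_id}
    for season in (2015, 2016, 2017, 2018)
    for game_id in range(3)
]

train, test = split_by_season(records, holdout_seasons=[2018])

# No season may appear on both sides of the split.
train_seasons = {r["season"] for r in train}
test_seasons = {r["season"] for r in test}
assert train_seasons.isdisjoint(test_seasons)

print(len(train), len(test))  # 9 3
```

Splitting by whole season rather than by individual game keeps results from the same tournament from leaking between the fitting and evaluation sets.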
Five phases. Foundation to tournament.
Each phase is a complete deliverable, not a milestone. We do not advance until the phase actually works under live use. Phase order is fixed; scope inside a phase is not.
Phase 1 closed. Phase 2 underway.
The reduction layer is real, and it works.
Phase 1 closed once the system provably collapsed the live configuration space by many orders of magnitude using only structural and historical priors, with no predictive model consulted. Phase 2 opens on top of that reduced space. The ensemble is being assembled.