
What is this?
PADP tests whether an LLM agent can generate positive expected value on prediction markets through structured probabilistic reasoning.
The agent follows a 13-step protocol that includes Fermi decomposition, base rate analysis from historical data, causal modeling, Bayesian updating, pre-mortem and red-team stress testing, and Kelly criterion position sizing. A typical analysis run takes 30 minutes and produces a thesis letter documenting the full reasoning chain.
The bankroll is $1,000. Results are tracked as markets resolve.
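As a rough illustration of the Kelly sizing step, here is a minimal sketch, not PADP's actual implementation. It assumes a binary contract that pays $1 on the side you hold, ignores fees and slippage, and uses hypothetical names (kelly_fraction, position_size). The 0.25 fractional-Kelly multiplier is an illustrative assumption, not a documented protocol setting.

```python
def kelly_fraction(p: float, price: float) -> float:
    """Signed Kelly fraction for a binary contract paying $1.

    p     -- the agent's estimated probability of YES
    price -- the market price of a YES share, in dollars (0 < price < 1)
    Positive result: buy YES. Negative result: buy NO. Zero: no edge.
    """
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if p > price:
        # YES costs `price` and pays 1, so net odds b = (1 - price) / price.
        # Kelly f = (p*b - (1 - p)) / b simplifies to (p - price) / (1 - price).
        return (p - price) / (1.0 - price)
    if p < price:
        # NO costs (1 - price) and wins with probability (1 - p).
        return -(price - p) / price
    return 0.0


def position_size(bankroll: float, p: float, price: float,
                  kelly_multiplier: float = 0.25) -> tuple[str, float]:
    """Return (side, dollar stake) under fractional Kelly (assumed multiplier)."""
    f = kelly_fraction(p, price)
    side = "YES" if f > 0 else "NO" if f < 0 else "NONE"
    return side, round(abs(f) * kelly_multiplier * bankroll, 2)


if __name__ == "__main__":
    # Agent estimates 60% on a market priced at 50 cents, $1,000 bankroll.
    print(position_size(1000.0, 0.60, 0.50))  # ('YES', 50.0)
```

Full Kelly here would stake 20% of the bankroll. Scaling it down is a common hedge against an overconfident probability estimate, which matters when the estimate itself is the thing being tested.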
Why prediction markets?
Prediction markets are a hard test. No ambiguity, no subjective judges. You are either right or wrong, and the market tells you in real time if your edge is real.
If an LLM can consistently find mispriced probabilities here, that is evidence of reasoning ability that standard benchmarks do not capture.
Why this protocol?
Most approaches to LLM forecasting are shallow: single prompts, no research, no structured decomposition.
PADP forces depth. The agent has to clarify resolution criteria, gather context, find historical precedents, decompose the question, build a causal model, generate and stress-test estimates, and size positions based on edge and uncertainty.
The thesis letters are the artifact. They show the work.
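The base-rate and Bayesian-updating steps can be illustrated with a small sketch. It starts from a historical base rate and applies one likelihood ratio per piece of evidence, working in log-odds. The function name and the example numbers are hypothetical, and treating the pieces of evidence as independent is a simplifying assumption that the source does not specify.

```python
import math
from typing import Iterable


def update_probability(prior: float, likelihood_ratios: Iterable[float]) -> float:
    """Bayesian update in odds form.

    prior             -- base-rate probability, strictly between 0 and 1
    likelihood_ratios -- for each piece of evidence,
                         P(evidence | YES) / P(evidence | NO), each > 0
    Assumes the pieces of evidence are conditionally independent.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    log_odds = math.log(prior / (1.0 - prior))
    for lr in likelihood_ratios:
        if lr <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        log_odds += math.log(lr)  # multiplying odds == adding log-odds
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Illustrative numbers only: a 20% base rate, then two pieces of
    # evidence that are 2x and 1.5x more likely if the outcome is YES.
    posterior = update_probability(0.20, [2.0, 1.5])
    print(round(posterior, 4))  # 0.4286
```

Prior odds of 0.25 multiplied by 2 and then 1.5 give odds of 0.75, which is a probability of 3/7. Correlated evidence would make this overstate the update, which is one reason for the protocol's stress-testing steps.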
Stack
Claude Sonnet 4.5 via Claude Code.
“If it works, that is a new capability. If it does not, the question becomes how long until it can. Either way, it is data.”