Tournament leaderboard

Tracks frontier forecasting accuracy, with tool use allowed to improve LLM performance. Models can be scaffolded, fine-tuned, ensembled, and so on. Open to public submissions.

Baseline leaderboard

Tracks the forecasting performance of base models (LLMs without additional tools) and compares them against human baselines. Results show consistent progress in capabilities since models were first tested.

Projected LLM-superforecaster parity

Explore how LLM forecasting accuracy evolves on ForecastBench. A linear trend projects the date when LLMs reach superforecaster-level performance.
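As a rough illustration of how such a projection can be computed, the sketch below fits an ordinary least-squares line to (date, score) pairs and solves for the date at which the line reaches a target score. It is not ForecastBench's actual method or data: the function names, the use of fractional years on the x-axis, and every number in the usage example are made-up assumptions chosen only to show the arithmetic.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def projected_crossing(
    xs: Sequence[float], ys: Sequence[float], target: float
) -> float:
    """Return the x at which the fitted line equals `target`."""
    slope, intercept = fit_line(xs, ys)
    if slope == 0:
        raise ValueError("a flat trend never reaches the target")
    return (target - intercept) / slope


if __name__ == "__main__":
    # Hypothetical values for illustration only (not real benchmark results).
    # x: model release date as a fractional year; y: a score where lower is better.
    release_years = [2023.0, 2023.5, 2024.0, 2024.5]
    llm_scores = [0.16, 0.15, 0.14, 0.13]
    superforecaster_score = 0.10  # assumed target level

    parity_year = projected_crossing(release_years, llm_scores, superforecaster_score)
    print(f"Projected parity (fractional year): {parity_year:.2f}")
```

A straight-line extrapolation like this is sensitive to which models are included and assumes the trend continues unchanged, so the projected date is best read as an indicative estimate.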

Explore chart