Baseline leaderboard

Tracks the forecasting performance of base LLMs without additional tools, compares them against human baselines, and shows consistent progress in capabilities since models were first tested.

Tournament leaderboard

Tracks frontier accuracy by allowing tool use to improve LLM performance. Models can be scaffolded, fine-tuned, ensembled, and so on. Open to public submissions.

Projected LLM-superforecaster parity

Explore how LLM forecasting accuracy evolves on ForecastBench. A linear trend projects the date when LLMs reach superforecaster-level performance.
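As a rough illustration of what a linear-trend projection involves, the sketch below fits an ordinary least-squares line to model scores over release dates and solves for the date at which the line reaches a target score. It is not ForecastBench's actual method: the function names, the data points, and the target value are all invented for illustration, and the only assumption about the score is that lower is better.

```python
from datetime import date, timedelta


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def projected_parity_date(release_dates, scores, target_score):
    """Date at which the fitted trend line reaches target_score."""
    origin = min(release_dates)
    # Express each release date as days since the earliest release.
    xs = [(d - origin).days for d in release_dates]
    slope, intercept = fit_line(xs, scores)
    if slope == 0:
        raise ValueError("a flat trend never reaches the target")
    days_to_parity = (target_score - intercept) / slope
    return origin + timedelta(days=round(days_to_parity))


# Hypothetical data: NOT real leaderboard results.
start = date(2024, 1, 1)
release_dates = [start + timedelta(days=n) for n in (0, 100, 200)]
scores = [0.20, 0.18, 0.16]   # made-up error scores, lower is better
superforecaster_score = 0.10  # made-up target

parity = projected_parity_date(release_dates, scores, superforecaster_score)
print(parity.isoformat())
```

If the fitted trend moves away from the target, the crossing point falls before the earliest data point, so a projection like this is only meaningful when the slope points toward the target. A straight-line extrapolation also carries no uncertainty estimate of its own.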
