Baseline leaderboard
Tracks how well base LLMs forecast without additional tools, compares them against human baselines, and shows consistent progress in capabilities since models were first tested.
Tournament leaderboard
Tracks frontier accuracy by allowing tool use to improve LLM performance. Models can be scaffolded, fine-tuned, ensembled, and so on. Open to public submissions.
Projected LLM-superforecaster parity
Explore how LLM forecasting accuracy evolves on ForecastBench. A linear trend projects the date when LLMs reach superforecaster-level performance.
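As a rough illustration of how a linear trend can yield a projected parity date, the sketch below fits an ordinary least-squares line to (date, score) pairs and solves for the date at which the line reaches a target score. It is not ForecastBench's actual method or data: the function names, the assumption that each model contributes one score on a lower-is-better scale, and all numbers in the example are hypothetical.

```python
from datetime import date, timedelta


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def projected_parity_date(observations, target_score):
    """Extrapolate a linear trend to the date it reaches target_score.

    observations: list of (date, score) pairs, earliest first (hypothetical format).
    target_score: the superforecaster-level score to reach (assumed known).
    Returns None when the fitted trend is flat and never reaches the target.
    """
    origin = observations[0][0]
    # Express each date as days elapsed since the first observation.
    xs = [(d - origin).days for d, _ in observations]
    ys = [score for _, score in observations]
    slope, intercept = fit_line(xs, ys)
    if slope == 0:
        return None
    # Solve intercept + slope * days = target_score for days.
    days = (target_score - intercept) / slope
    return origin + timedelta(days=round(days))


# Made-up scores for illustration only (lower is better).
example = [
    (date(2024, 1, 1), 0.20),
    (date(2024, 4, 10), 0.19),
    (date(2024, 7, 19), 0.18),
]
print(projected_parity_date(example, target_score=0.15))
```

A straight-line extrapolation like this is sensitive to which models are included and to the choice of time axis, so the projected date is best read alongside an uncertainty estimate rather than as a point prediction.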