Tournament leaderboard
Tracks frontier forecasting accuracy, with tool use allowed to improve LLM performance. Models can be scaffolded, fine-tuned, ensembled, and so on. Open to public submissions.
Baseline leaderboard
Tracks the forecasting performance of base LLMs without additional tools. Results are compared against human baselines and show consistent progress in capabilities since models were first tested.
Projected LLM-superforecaster parity
Explore how LLM forecasting accuracy evolves on ForecastBench. A linear trend projects the date when LLMs reach superforecaster-level performance.
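As a rough illustration of how a linear trend can yield a projected parity date, the sketch below fits an ordinary least-squares line to (date, score) pairs and solves for the day the line reaches a target score. It is a minimal sketch under stated assumptions, not ForecastBench's actual method: the function names, the choice of model release date as the x-axis, and the idea of a single scalar score per model are all hypothetical, and any numbers passed in are placeholders rather than real leaderboard results.

```python
from datetime import date, timedelta


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("need at least two distinct dates to fit a trend")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def projected_parity_date(release_dates, scores, target_score):
    """Date at which the fitted line reaches target_score.

    Assumption (hypothetical setup): each model contributes one score,
    plotted against its release date. The crossing point may lie in the
    past if the trend is moving away from the target.
    """
    first = min(release_dates)
    # Use days since the earliest date as x to keep the arithmetic stable.
    xs = [(d - first).days for d in release_dates]
    slope, intercept = fit_line(xs, scores)
    if slope == 0:
        raise ValueError("flat trend never reaches the target")
    days_to_cross = (target_score - intercept) / slope
    return first + timedelta(days=round(days_to_cross))
```

A call such as `projected_parity_date(dates, llm_scores, superforecaster_score)` would return the extrapolated crossing date. A straight-line extrapolation is sensitive to which models are included and to the time window, so the result is better read as a trend summary than as a prediction.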