Tournament leaderboard^‡

The tournament leaderboard tracks frontier LLM forecasting accuracy, where teams are free to enhance models in any way they choose—with tools, added context, fine-tuning, ensembling, or other methods. Its purpose is to capture the forefront of LLM forecasting ability. The models submitted regularly by the ForecastBench team are provided the crowd forecast as context for market questions.

Performance on dataset and market questions is scored using the difficulty-adjusted Brier score to account for differences in question difficulty across question sets. Forecasters are ranked by their overall score, which is the equal-weighted average of the dataset and market scores.
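As a rough illustration of how the overall score combines the two question types, here is a minimal sketch. It uses the plain Brier score (mean squared error between a probability forecast and the 0/1 outcome) as a stand-in; the difficulty adjustment applied on the actual leaderboard is not reproduced here. The function names and the sample forecasts are hypothetical and chosen only for illustration.

```python
from statistics import mean


def brier_score(forecasts, outcomes):
    """Plain (unadjusted) Brier score: mean squared error between
    probability forecasts and binary outcomes. Lower is better.
    NOTE: the leaderboard uses a difficulty-adjusted variant, which
    this sketch does not implement."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal length")
    return mean((f - o) ** 2 for f, o in zip(forecasts, outcomes))


def overall_score(dataset_score, market_score):
    """Equal-weighted average of the dataset and market scores."""
    return (dataset_score + market_score) / 2


# Hypothetical forecasts and resolutions, for illustration only.
dataset_score = brier_score([0.8, 0.3], [1, 0])
market_score = brier_score([0.6, 0.1], [1, 0])
overall = overall_score(dataset_score, market_score)
print(f"dataset={dataset_score:.3f} market={market_score:.3f} overall={overall:.3f}")
```

Because the two scores are weighted equally, a forecaster with many more market questions than dataset questions is not dominated by market performance in the overall ranking.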

Hover over the column titles to see tooltips with further explanations. Notes are at the bottom of the page.

The Tournament Leaderboard is open to public submissions.

‡Notes

To ensure leaderboard stability, models are included on the leaderboard 50 days after forecast submission.
Human comparison groups are highlighted in red.
The zero-shot and scratchpad prompts used for the models run by ForecastBench can be found on GitHub.
The ForecastBench baseline forecasters are described on the Changelog.
The "crowd forecast" provided to models run by ForecastBench was valid 10 days before the forecast due date. This delay exists to allow us to run human surveys periodically. Note that crowd forecasts only affect market questions, as there is no crowd forecast for dataset questions.

Benchmark your model!

Would you like to have your model's forecasting capabilities evaluated on ForecastBench? We’re creating a community of forecasters who are engaging with LLMs to discover the forefront of their forecasting abilities. Though your setup does not need to be made open source, we do provide a growing list of ForecastBench participants who have left the door open to collaboration in this way. We'll be in touch with top performers to discuss their forecasting strategies and, potentially, feature them in a Forecasting Research Institute blog post.

To participate, follow the instructions on how to submit.
