How does ForecastBench work?

ForecastBench is a dynamic, continuously updated benchmark designed to measure the accuracy of ML systems on a constantly changing set of forecasting questions.

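We evaluate LLMs by regularly asking them to make probabilistic forecasts about future events. Because the outcomes are unknown when the forecasts are made, the answers cannot appear in any model's training data, which keeps the benchmark contamination-free.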

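We use two types of binary prediction questions: market questions, which concern real-world events, and dataset questions, which are based on structured time-series data.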

ForecastBench operates as a fully automated, dynamic system. New forecasting rounds occur every two weeks, with each round generating 500 questions split evenly between market and dataset questions. The leaderboard is updated nightly as new data becomes available and market questions resolve over time, allowing us to continuously track forecasting performance.

To construct the performance ranking, we evaluate forecasters separately on market questions and dataset questions. The overall ranking combines these scores, equally weighting performance by question type. As a result, the overall ranking provides a comprehensive assessment of forecasting ability across both structured time-series data (dataset questions) and real-world events (market questions).
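The equal weighting described above can be illustrated with a minimal sketch. The scoring rule used here (the Brier score) and the function names `brier_score` and `overall_score` are assumptions made for illustration, not a description of the actual ForecastBench implementation; the forecasts and outcomes are invented toy values.

```python
from statistics import mean


def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and
    binary outcomes (0 or 1). Lower is better.

    Assumption: the Brier score stands in for whatever accuracy
    metric the benchmark actually uses.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and the same length")
    return mean((p - o) ** 2 for p, o in zip(forecasts, outcomes))


def overall_score(market_score, dataset_score):
    """Combine the two per-type scores with equal weight,
    regardless of how many questions each type contains."""
    return 0.5 * market_score + 0.5 * dataset_score


# Toy data: two resolved market questions, four resolved dataset questions.
market = brier_score([0.9, 0.2], [1, 0])
dataset = brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 1])
overall = overall_score(market, dataset)

print(f"market={market:.4f} dataset={dataset:.4f} overall={overall:.4f}")
```

In this sketch the weighting is applied per question type rather than per question, so the four dataset questions do not outweigh the two market questions in the combined score.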

Blog

For a high-level overview of ForecastBench, including motivation, key design decisions, and early results, see our introductory blog post on the Forecasting Research Institute Substack.

Team

ForecastBench is developed and maintained by the Forecasting Research Institute, a nonprofit research organization dedicated to advancing the science, practice, and use of forecasting. The ForecastBench team is committed to open science: we publicly provide our code, datasets (where licensing permits), and methodology to support reproducible research. For correspondence, please contact forecastbench@forecastingresearch.org.

Houtan Bastani

Code

Simas Kučinskas

Data

Zachary Jacobs

Surveys

Ezra Karger

Advisor

Philip E. Tetlock

Advisor

Past contributors

Yueh-Han Chen

Danny Halawi

Fred Zhang

Funding

ForecastBench is supported by a grant from Open Philanthropy.

The Forecasting Research Institute's funders exercise no editorial control or influence over our research methodology, findings, or conclusions.