Will AI finally unlock the potential of probabilistic forecasting?

TL;DR: Hopefully, but it won’t be because of superhuman accuracy.

AI forecasting systems are the talk of the forecasting world. AI forecasting tournaments are running, VC money is flowing, and benchmarks are being published. We’re building an AI forecasting system into our new platform, ARC. As with the rest of crowdsourced forecasting’s history, most of the chatter is around accuracy. How accurate are the LLMs? When will they beat human and pro forecasters? Can you train models specifically for forecasting accuracy? In my opinion, the conversation is far too focused on accuracy.

Cultivate Labs has been around for over a decade now, building forecasting platforms and running them both inside and outside organizations. Inkling, our predecessor, started in this space even earlier. When I look through the list of companies, think tanks, and government agencies that we’ve worked with over the years, it’s a source of pride. But when I look at the list of those that opted not to renew their contracts, it’s embarrassing. The reasons they give for cancelling are varied. Conspicuously absent from those reasons? Accuracy. In 10+ years, I can’t remember a single time that someone told us the forecasts weren’t accurate enough.

To be clear: I am not claiming that accuracy isn’t important. It plainly is. But I would contend that the near-exclusive focus on accuracy ignores the reasons these systems haven’t been broadly adopted.

Where are the shortcomings then and how will AI help?

We could talk for days about this, but let’s pick three key ones:

1. Capacity

Historically, one of our major challenges has been getting enough humans to forecast. The people who participate in forecasting platforms, both paid and unpaid, are absolute unsung heroes & heroines of our space. But there’s a limit to how many questions a single forecaster can cover, however enthusiastic they might be. If you’re running a forecasting platform inside your company but are only able to recruit a handful of participants, then you’re severely constrained in the problem space that your limited portfolio of questions can cover.

2. Speed

Similar to the capacity issue, it’s probably obvious that humans can only respond so quickly to new questions. In most cases, you’re probably waiting a number of days before you’re generating a meaningful signal. Maybe less obvious, but even more limiting, is the question development lifecycle. In human forecasting systems, questions must be well crafted so they don’t end up voided, wasting precious human forecasting time. Translating vague or esoteric question topics into well-authored questions can take weeks of collaboration and research.

3. The strategic vs. resolvable question divide

Executive-level decision makers care about higher-level questions related to their strategic objectives (e.g. Is the US still committed to NATO?). Resolvable forecasting questions, by their nature, become narrow and parochial (e.g. Will the US withdraw its rotational forces from Lithuania by December 31, 2026?). We try to craft good questions connecting the two, but it’s not unusual for months of forecasting to go by while the situation evolves, and suddenly the forecasting questions have lost much of their relevance to the overarching question. Changing the questions mid-stream risks wasting precious forecaster time, and asking new questions bumps against the capacity issue we’ve mentioned.

The forecasting space abounds with platitudes like “better forecasts lead to better decision making.” But if the forecasts are on the wrong questions, don’t address the decision you’re making, and arrive too slowly, then their value is neutered.

It’s pretty obvious how AI can help alleviate capacity and speed issues. One can envision a system that helps author questions quickly and immediately starts producing probabilistic forecasts. It’s so easy to imagine that new problems are immediately obvious: if you’re using AI to generate and forecast hundreds or thousands of questions, the volume will almost immediately overwhelm anyone tracking and consuming the forecasts. So there is still a “so what” and “what do I do with this?” problem to solve (problems we’re actively working to solve; more to come soon!).

Why We’re Excited About AI Forecasting

Our graveyard of past projects proves that perfect forecasts don’t matter if an organization can’t use them effectively. The real promise of AI forecasting isn’t superhuman accuracy (even if it might get there one day), but its ability to rapidly generate, maintain, and evolve a portfolio of decision-relevant signals at scale. Our hope is that those signals will finally meet the needs of high-stakes decision making and reduce uncertainty about future ground truth.

Sadly, we must sometimes cannibalize our own business and realize what we have been working on could soon be a relic. Human forecasters will certainly have a role in the future, not only in providing forecasts on key questions, but also in guiding, refining, and curating AI-generated signals. Hybrid human/AI forecasting systems, paired with a “translation layer” that renders raw forecasts into the formats humans naturally consume (scenarios, narratives, and contextual summaries), can finally break through the adoption ceiling. They make it possible for every meaningful decision, large or small, to carry a probabilistic foundation. That’s the world we’ve been trying to build for over a decade, and we predict AI forecasting is what will finally make it achievable.


Ben Roesch, Cultivate Labs Co-Founder, CTO
