Introducing Cultivate's AI Forecaster

For two decades, our mission at Cultivate Labs, and Inkling Markets before that, focused on a practical problem: helping people make better judgments about uncertain futures by leveraging the diverse perspective, scale, and wisdom of an employee “crowd” or public community of interest.

That work has centered solely on people: finding ways to aggregate their judgment as accurately as possible and to provide context for the decision the forecast informs. Today, we’re announcing a major new capability in ARC and across our forecasting platform generally: AI-Driven Forecasting.

Using AI allows us to generate high-quality probabilistic forecasts almost instantly, update them reliably over time, and integrate them directly into the same analytic and decision workflows our users already rely on. We’re not out to replace human judgment, but to make meaningful signals available earlier and more continuously, especially in fast-moving or resource-constrained organizations where regular, robust human engagement just isn’t possible.

What We’ve Built

Our AI Forecaster is actually an aggregation of forecasts from many frontier AI models. In essence, it’s a “wisdom of the AI crowd” approach where we:

  • Pose a forecasting question
  • Use our research agent to curate relevant background information about that forecast question
  • Instruct a group of frontier models from OpenAI, Anthropic, Google, DeepSeek, and others to use that background information and their own knowledge to work through a series of tasks, ultimately producing a probabilistic judgment and rationale.

Our AI Forecaster runs in ARC, and can also be invited into a Cultivate forecasting site to participate alongside human users. Having quietly made the AI Forecaster available to ARC users over the past few months, we can already see the practical effect: instead of waiting days or weeks for a signal while human forecasters reach critical mass on a particular question, analysts can begin with an initial “take” almost immediately, dramatically speeding up their ability to inform their perspective in a structured way.

Here’s a quick tour of how AI Forecasting has been implemented in ARC:

  1. First, we ask a question and ideally provide background information, resolution criteria, and an end date for our question (soon we’ll automate this too!). We save the question and have the option to start collecting AI forecasts and to invite people if we also want human input.


  2. Our research agent begins collecting source information for the models to leverage, and we send that content and instructions to each model.



  3. Each model sends back its forecast and rationale according to the structure we’ve asked it to follow. Using the aggregation methods we’ve long used in our other forecasting work, we produce a crowd forecast for the AI models, one for any participating humans, and a hybrid aggregation that combines AI and humans.



  4. This process is triggered once a week until the resolution date, and the forecast trend is tracked over time.
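As a rough illustration of step 3, the sketch below combines individual probabilities into an AI crowd forecast, a human crowd forecast, and a hybrid forecast. The aggregation methods ARC actually uses are not described in this post, so the median here is only an assumed stand-in, and the function name, participant names, and sample probabilities are all hypothetical.

```python
from statistics import median


def aggregate(probabilities):
    """Combine individual probabilities (each between 0 and 1) into one crowd forecast.

    Assumption: a simple median. The post does not specify the real method.
    """
    if not probabilities:
        raise ValueError("need at least one forecast to aggregate")
    if any(p < 0.0 or p > 1.0 for p in probabilities):
        raise ValueError("probabilities must be between 0 and 1")
    return median(probabilities)


# Hypothetical forecasts for one question (probability that it resolves "yes").
ai_forecasts = {"model_a": 0.10, "model_b": 0.20, "model_c": 0.15}
human_forecasts = {"analyst_1": 0.30, "analyst_2": 0.25}

ai_crowd = aggregate(list(ai_forecasts.values()))
human_crowd = aggregate(list(human_forecasts.values()))

# Assumption: the hybrid pools every participant equally, AI and human alike.
hybrid_crowd = aggregate(list(ai_forecasts.values()) + list(human_forecasts.values()))

print(f"AI crowd: {ai_crowd:.3f}")
print(f"Human crowd: {human_crowd:.3f}")
print(f"Hybrid crowd: {hybrid_crowd:.3f}")
```

Pooling all participants equally is only one way to build a hybrid. Another reasonable choice would be to combine the two crowd forecasts, so that a large group of models does not swamp a small group of people.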


Here is an example of a question I pulled from Metaculus and just launched. You can see the initial forecasts the Cultivate AI Forecaster generated (for comparison, at the time of this post the Metaculus crowd was at 70% likelihood):

Before May 1, 2026, will the United States offer to purchase Greenland from Denmark?

I've set these forecasts to update once per week.

Early Results

We will share comprehensive quantitative benchmarks and results separately, but even at this early stage, AI forecasting in ARC is achieving promising results across a wide range of questions. On one of our public sites, it is currently beating both the general crowd and a “pro” cohort that includes many Good Judgment Superforecasters.

Why We Took This Approach

There are already organizations pursuing AI forecasting by training their own proprietary models. We acknowledge that approach can be effective in certain contexts, particularly when the data environment is stable, well-defined, and tightly controlled. But we consciously chose a different path.

Instead, ARC’s AI forecasting capability is designed to leverage the enormous and ongoing investment being made across the entire landscape of frontier models. Rather than betting on a single architecture, we allow multiple leading models to contribute probabilistic judgments to the same forecast problem. This lets us take advantage of the fact that different models have different strengths, and that those strengths continue to shift as the frontier evolves.

Because ARC treats models as participants rather than as a single system, we can benchmark their performance over time, learn which models perform best on which kinds of questions, and weight them accordingly. In practice, this means the system gets better not by retraining a single model, but by learning how to combine and rely on the right models for the right problems.
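To make the idea of benchmarking and weighting concrete, here is a minimal sketch that scores each model on resolved questions with the Brier score (mean squared error between forecast and outcome, where lower is better) and then weights current forecasts by inverse Brier score. This post does not say which scoring rule or weighting scheme ARC uses, so both are assumptions, as are the function names, model names, and numbers.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probabilities and 0/1 outcomes (lower is better)."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and the same length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def performance_weights(history, outcomes, eps=1e-6):
    """Turn each model's track record into a weight; all weights sum to 1.

    Assumption: weight is proportional to 1 / Brier score. eps avoids
    division by zero for a model with a perfect record.
    """
    raw = {name: 1.0 / (brier_score(ps, outcomes) + eps) for name, ps in history.items()}
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}


def weighted_forecast(current, weights):
    """Weighted average of the models' current probabilities."""
    return sum(weights[name] * p for name, p in current.items())


# Hypothetical track record on three resolved questions (1 = happened, 0 = did not).
outcomes = [1, 0, 1]
history = {
    "model_a": [0.9, 0.2, 0.8],  # forecasts close to the outcomes
    "model_b": [0.6, 0.5, 0.5],  # forecasts that hedge near 50%
}

weights = performance_weights(history, outcomes)

# Hypothetical forecasts on a new, unresolved question.
current = {"model_a": 0.70, "model_b": 0.40}
combined = weighted_forecast(current, weights)

print(weights)
print(f"Weighted forecast: {combined:.3f}")
```

To learn which models perform best on which kinds of questions, the same calculation could be run separately per question category, with a separate set of weights for each.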

This approach also allows us to pursue diversity of perspective in a very concrete way. Just as we have always emphasized diversity in our human forecasting crowds, we can now do the same with AI. In a recent conversation with a potential client in the Middle East, for example, they explicitly raised concerns about relying solely on “Western” models for forecasting. Because ARC is model-agnostic by design, we can address that concern directly by incorporating a broader mix of models and viewpoints, rather than assuming a single default perspective.

This choice builds directly on what we’ve learned over more than a decade of running forecasting systems. Aggregated judgment across diverse forecasters consistently outperforms even very strong individual forecasters. We believe that insight applies just as much to AI systems as it does to humans, particularly in a world where models are improving rapidly and no single architecture dominates across all question types for long.

Finally, this design ensures we are not dependent on AI alone. Because AI forecasters and human forecasters operate within the same framework in both ARC and our forecasting platform, it is straightforward to create hybrid forecasts where humans and AI contribute side by side, learn from one another, and serve as checks on each other’s blind spots. Over time, that interaction is likely to be just as valuable as raw accuracy gains on any single class of questions.

Designed for Enterprise Organizations

Our design choice also reflects the realities of our clients. Many of the organizations we work with (intelligence and research agencies, think tanks, and large enterprises in banking, pharma, and energy) operate under real constraints. They often cannot centralize sensitive data for training purposes, and they are understandably cautious about handing over large internal datasets to external systems.

At the same time, many already have internal or proprietary models they trust and want to continue using. ARC’s approach makes it possible to leverage frontier models as they improve, incorporate internal models when appropriate, and do so without retraining the platform itself or requiring clients to surrender control over sensitive data. From an enterprise perspective, this is both less intrusive and more realistic than approaches that assume all forecasting capability must live inside a single trained model.

The Role of Humans Going Forward

We continue to believe that humans play a critical role in forecasting, especially for longer time horizons, complex geopolitical or organizational dynamics, and questions where framing and interpretation matter as much as or more than raw probabilities.

AI forecasting clearly excels at speed, applying the same methodology consistently, producing an informed “first take,” and avoiding certain biases humans may succumb to. At the same time, humans excel at understanding novelty, second-order effects, and whether a question is even framed correctly!

In practice, we expect many teams to use AI forecasts as an always-on, credible signal, while layering human judgment on top when stakes are high or when we already know that a question’s complexity or lack of reference information demands human intervention. Disagreement between AI and humans will be part of the value proposition: it adds rigor to the forecast and should be treated as useful information rather than as failure.

What’s Next

This release is a starting point for us. We will continue to invest in ideas for making AI Forecasting even more useful, such as improving our research agent’s ability to leverage base rate data, and having models red team each other and share that feedback so they work more like a collaborative team.

If you are already using ARC, AI Forecasting is available now. If you are running or want to run a crowdforecasting site, we can turn on AI Forecasting on your site to forecast alongside your invited participants. Either way, we would be happy to show you how it all works in practice and discuss where it fits into your existing analytic and decision-making processes.