The Unintended Consequences of Running Internal Forecasting Tournaments
By Adam Siegel
In 2006, I went through the second-ever funding round of Y Combinator and
started a company called Inkling. Inkling built a prediction market platform
for use inside companies and we worked with some big names like Cisco,
Microsoft, and Procter & Gamble. Back then we focused almost purely on
getting to the most accurate consensus forecast and cared much less about
individual accuracy. We proved time and again that people making trades to
reflect their beliefs, in aggregate, were more accurate than most individual
predictors. More critically, these aggregates were also more accurate than
traditional methods of forecasting being used at our corporate clients.
Here’s a research paper that describes some of Inkling’s work with Ford
Motor Company, for example.
Fast forward to late 2015, when Philip Tetlock and Dan Gardner published the book Superforecasting: The Art and Science of Prediction. It is based on the results that the winning research team, The Good Judgment Project, achieved in a forecasting tournament sponsored by the U.S. Intelligence Community called Aggregative Contingent Estimation, or ACE. Among its findings, The Good Judgment Project found that recording people’s forecasts and then separating out the top 1-2% most accurate forecasters produced more accurate results than using the crowd at large. The project dubbed this top 1-2% “Superforecasters.” No other team participating in ACE hit upon the right combination of segregating these forecasters, applying specific aggregation algorithms, and introducing stimuli to generate the best possible forecasts. In fact, the Superforecasters originally identified in this research project, which ended over 5 years ago, have gone on to work with Good Judgment Inc. (the commercial entity that monetized the research approach) and to this day do a fantastic job making forecasts about geopolitical events, the pandemic, and other topics.
Not surprisingly, many people we talk to about running internal crowd forecasting projects reference the Superforecasting book and have visions of using a competitive process to cultivate their own group of Superforecasters to rely upon. I understand why this is so appealing. Having a group that has proven it can accurately forecast influential events and metrics about your business is quite valuable and potentially creates a market advantage to exploit.
Within a research context, running a tournament makes perfect sense. But in a corporate context, we’ve found tournaments, with their focus on individual performance and competition, to be problematic. Our clients approach us seeking not just accurate forecasts, but a change in their culture. They want to introduce more transparency, increase collaboration across organizational silos, grow awareness of critical topics, encourage people to think more holistically and rationally about their industry and the factors that influence it, and highlight alternative points of view that can draw a valuable contrast to internal conventional wisdom. If you spoke to our clients, many would say these benefits outweigh the accurate forecasts the methodology can provide.
Conventional wisdom says competition is a powerful incentive, but it has often failed in a corporate context. Here’s what we’ve experienced:
- We’ve known for a while that leaderboards can be both a positive and
negative incentive. Stating that your goal is to find Superforecasters can
motivate some people, but it also dissuades large numbers of people who are
afraid of failing. Let’s say in the first few months of forecasting you’re in
the top 10 for forecasting accuracy. But then you have a poor result, and you
tumble down the rankings. Many people will simply drop out at this point,
having lost their status and not feeling like working their way back.
Similarly, if after regularly forecasting you’re not at the top of the
leaderboard, you might decide to lessen your activity or stop altogether.
“Why bother, if I’ll never make it to the top?”
- Competition will often encourage a “bettor’s mentality” in forecasting
behavior. Trying to get a good forecasting score, people will often go
“all-in” with extreme forecasts of >90% or <10% likelihood 6+ months
before an answer is known! On an objectively difficult forecasting question,
research has shown your best approach is to start with a moderate
forecast based on any external priors you can collect, and update your
forecast incrementally as new information becomes available. Instead,
competition breeds highly irrational behavior in many people.
- Knowing they are in a competition, some people are afraid to have their judgments recorded. Especially in financial services, which are already so cutthroat, or in intelligence services, where the entire job is to make forecasts, what incentive is there to be measured and potentially penalized in some way for performing poorly? So they just don’t participate at all.
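To make the cost of the “all-in” behavior described above concrete, here is a minimal sketch in Python. It rests on assumptions that are not part of our platform description: accuracy is measured with a Brier score (the squared error between a probability forecast and the 0/1 outcome) averaged over every day a question is open, the question stays open for 180 days, and the incremental forecaster moves in a straight line from 50% toward the eventual outcome. The function names and all numbers are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: the scoring rule (time-averaged Brier score),
# the 180-day horizon, and the forecast paths are assumptions, not the
# scoring method of any particular platform.

def brier_score(forecasts, outcome):
    """Average squared error between daily forecasts and the 0/1 outcome.

    Lower is better: 0.0 is perfect, 1.0 is maximally wrong.
    """
    return sum((p - outcome) ** 2 for p in forecasts) / len(forecasts)


def all_in(days, p):
    """Hold the same extreme forecast on every day the question is open."""
    return [p] * days


def incremental(days, start, end):
    """Move linearly from a moderate starting forecast to a final forecast."""
    step = (end - start) / (days - 1)
    return [start + step * day for day in range(days)]


DAYS = 180  # roughly six months, an assumed question lifetime

# The "all-in" forecaster says 95% from day one and never updates.
all_in_if_yes = brier_score(all_in(DAYS, 0.95), outcome=1)
all_in_if_no = brier_score(all_in(DAYS, 0.95), outcome=0)

# The incremental forecaster starts at 50% and drifts toward the outcome
# as information arrives (toward 90% if it happens, toward 10% if not).
incr_if_yes = brier_score(incremental(DAYS, 0.5, 0.9), outcome=1)
incr_if_no = brier_score(incremental(DAYS, 0.5, 0.1), outcome=0)

print(f"all-in,      event happens:     {all_in_if_yes:.4f}")
print(f"all-in,      event does not:    {all_in_if_no:.4f}")
print(f"incremental, event happens:     {incr_if_yes:.4f}")
print(f"incremental, event does not:    {incr_if_no:.4f}")
```

Under these assumptions the all-in forecaster scores slightly better when the bet pays off, but far worse when it does not, while the incremental forecaster’s score is the same in both cases. That asymmetry is the bettor’s gamble: a small gain in rank when right, bought with a large loss when wrong.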
One could argue that this is all perfectly appropriate. You’re weeding out the weak, the forecasters who won’t put in the time to stay at the top of their game. But there are other critical roles that people play in our platform, regardless of their forecasting prowess. They offer up valuable counterarguments. They provide new information. They upvote useful commentary. They suggest forecasting questions. They suggest resolutions and ask for clarifications to questions. They become more aware of strategic issues. They learn to be more rational, which impacts their decision making in other areas. By simply participating, they’re potentially providing the organization with an even bigger benefit than accurate forecasts: a collective learning experience.
Ultimately, if you’re thinking about starting a crowdsourced forecasting program, you must be clear on your objectives. Is this a take-no-prisoners “tournament” to find the best individual forecasters? Is it a collaborative effort to get to the best consensus forecast? Or is its objective something else entirely? Your objectives don’t have to be at odds with each other, or mutually exclusive, but if you aren’t careful, you can end up turning off the lifeblood of the program by pitting people against one another needlessly.
To stay updated on what we’re doing, follow us on Twitter @cultivatelabs.