Prediction Markets give hope to reproducibility crisis in scientific experiments

By Adam Siegel on December 17, 2019

Anna Dreber Almenberg is the Johan Björkman professor of economics at the Stockholm School of Economics. She is also a Wallenberg Scholar, a member of the Young Academy of Sweden and a member of the Royal Swedish Academy of Engineering Sciences (IVA).

We’ve been working with Anna and her colleagues for the last few years. She and her team have run prediction markets to forecast the outcomes of scientific experiments in an effort to address the “reproducibility crisis” in scientific research. I talked to Anna about her perspective on the crisis and why she thinks prediction markets are a potentially powerful tool for combating the problem.

Here is that conversation.

Adam: Let’s start with the basics. In your words, what is the reproducibility crisis and why is it a “crisis?”

ADA: We should not expect all published research results to replicate. When we redo the [same] studies with new and larger samples, it will turn out that some original results were probably false positives or false negatives. However, during the last few years, several large replication efforts have shown that the share of results that fail to replicate is a lot higher than we would expect and want, for example in psychology and economics. That’s the crisis – but it has led to a number of improvements, so I’m positive about the future.

Adam: How did you become interested in this problem and what led you to prediction markets as a potential solution?

ADA: I had worked on candidate genes, trying to link dopamine receptor genes [in the brain] to economic risk taking. This field is plagued by false positive results. I realized that most likely [my own research could be prone to] such results. At the time, I was working on gender differences in economic preferences, and my colleagues and I became interested in taking the setup [from the Carney et al. 2010 paper on power posing in Psychological Science] and extending it to other behaviors. We did a larger study (200 participants instead of Carney’s original 42) with some modifications to the setup, and basically found nothing. We (Ranehill et al.) published this in Psychological Science in 2015.

Simultaneously, I had been interested in prediction markets for a few years, both from reading Robin Hanson’s “Could gambling save science?” paper and from the work of my husband, Johan Almenberg, an economist, and our close friend and collaborator, Thomas Pfeiffer, a computational biologist, who together did a study testing the use of prediction markets in scientific settings (Almenberg et al. 2009). We were talking about how interesting it would be to see whether prediction markets could be used to predict research results from replications. Then in 2012, I read about Brian Nosek and the big replication project in psychology in Science. We contacted Brian and the huge replication team, and they said yes to adding prediction markets. From then on, this is a topic I’ve worked on a lot and still find super interesting.

Adam: When did you first realize that you were on to something?

ADA: Probably [as soon as we] performed the first set of markets in 2012, on a subset of psychology replications.

Adam: Have you been able to go back and see how accurate the prediction markets have been against actual outcomes? Can you share any measurable results?

ADA: In several papers (Dreber et al. 2015, Camerer et al. 2016, Camerer et al. 2018, Forsell et al. 2019), we have found that prediction markets [where the forecasters are] researchers in the field typically perform well in predicting research results – not perfectly, but better than chance. They also typically perform slightly better than a survey of the same researchers. We are now working on the pooled data from these projects.

For the 104 replications for which we have predictions and outcomes – in the simplest analysis – we say that a market price [or probability] above 50 indicates that the market thinks the study will replicate, and a price [or probability] below 50 indicates that it will not (where “replicate” is defined as the replication study finding an effect in the same direction as the original study with p<0.05 in a two-sided test). With this rule, we find that the market predicts correctly for 76 out of 104 replications (about 73%). We are discussing ways to improve this too – there are some aspects of original studies that the researchers [who are forecasting] in the markets could use more to improve their [accuracy].
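[Editor’s illustration: the “simplest analysis” Anna describes can be sketched in a few lines of Python. This is a minimal sketch, not the researchers’ actual code. The function names and the five example prices and outcomes are hypothetical and made up for illustration; only the 76-out-of-104 figure comes from the interview. The interview does not say how a price of exactly 50 is treated, so the sketch assumes it counts as “will not replicate.”]

```python
def market_predicts_replication(price, threshold=50.0):
    """Apply the threshold rule: a price above 50 means 'will replicate'.

    Assumption: a price of exactly 50 is treated as 'will not replicate';
    the interview does not specify this edge case.
    """
    return price > threshold


def count_correct(prices, outcomes, threshold=50.0):
    """Return (correct, total) for market predictions against outcomes."""
    correct = sum(
        market_predicts_replication(price, threshold) == replicated
        for price, replicated in zip(prices, outcomes)
    )
    return correct, len(prices)


# Hypothetical example data (NOT from the actual studies):
# final market prices on a 0-100 scale, and whether each study replicated.
example_prices = [82.0, 35.0, 61.0, 48.0, 90.0]
example_outcomes = [True, False, False, False, True]

correct, total = count_correct(example_prices, example_outcomes)
print(f"Hypothetical markets: {correct} of {total} correct")

# The figure reported in the interview: 76 correct out of 104 replications.
reported_accuracy = 76 / 104
print(f"Reported accuracy: {reported_accuracy:.1%}")
```

[A 50-50 guess would be right about half the time on a balanced set of studies, which is the baseline Anna has in mind when she says the markets do better than chance.]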

Adam: Let’s look forward now: what’s next for you personally on this research front and what future do you see for “replicability markets?” If they become widespread, how will this change scientific research as it’s practiced today and the gateways that control the publishing of this research?

ADA: Good question! Not sure – I’m continuing with my collaborators on a bunch of related projects but also going in new directions – and the replication crisis has definitely led to many journals and researchers making changes in practices. It would be fantastic if we could test whether prediction markets could be added to the review process – not replacing the 3-5 reviewers who carefully read the paper, but as an addition, and see how much that improves the review process.
