Academic Stats Prediction Markets

In a column, Andrew Gelman and Eric Loken note that academia has a problem:

Unfortunately, statistics—and the scientific process more generally—often seems to be used more as a way of laundering uncertainty, processing data until researchers and consumers of research can feel safe acting as if various scientific hypotheses are unquestionably true.

They consider prediction markets as a solution, but largely reject them for reasons both bad and not so bad. I’ll respond here to their article in unusual detail. First the bad:

Would prediction markets (or something like them) help? It’s hard to imagine them working out in practice. Indeed, the housing crisis was magnified by rampant speculation in derivatives that led to a multiplier effect.

Yes, speculative market estimates were mistaken there, as were most other sources, and mistaken estimates caused bad decisions. But speculative markets were the first credible source to correct the mistake, and no other stable source had consistently more accurate estimates. Why should the most accurate source be blamed for mistakes made by all sources?

Allowing people to bet on the failure of other people’s experiments just invites corruption, and the last thing social psychologists want to worry about is a point-shaving scandal.

What about letting researchers who compete for grants, jobs, and publications write critical referee reports and publish criticism? Doesn’t that invite corruption too? If you are going to forbid all conflicts of interest because they invite corruption, you won’t have much left that you will allow. Surely you need to argue that bet incentives are more corrupting than other incentives.

And there are already serious ways to bet on some areas of science. Hedge funds, for instance, can short the stock of biotech companies moving into phase II and phase III trials if they suspect earlier results were overstated and the next stages of research are thus underpowered.

So by your previous argument, don’t you want to forbid such things because they invite corruption? You can’t have it both ways; either bets are good so you want more, or bets are bad so you want less, or you must distinguish the good from the bad somehow.

More importantly, though, we believe that what many researchers in social science in particular are more likely to defend is a general research hypothesis, rather than the specific empirical findings. On one hand, researchers are already betting—not just money (in the form of research funding) but also their scientific reputations—on the validity of their research.

No, the whole problem here that we’d like to solve is that scientific reputations are not tied very strongly to research validity. Folks often gain enviable reputations from publishing lots of misleading research.

On the other hand, published claims are vague enough that all sorts of things can be considered as valid confirmations of a theory (just as it was said of Freudian psychology and Marxian economics that they can predict nothing but explain everything).

Now we have a not-so-bad reason to avoid prediction markets: people are often unclear about what they mean, and they often don’t really want to be clear. And honestly, many of their patrons don’t want them to be clear either. We might create a prediction market on whether what they meant will ever be clear. But they won’t want to pay for it, and others paying for it might just be mean.

And scientists who express great confidence in a given research area can get a bit more cautious when it comes to the specifics.

Yeah, that’s the problem with being clear; you might end up being clearly wrong.

For example, our previous ethics column, “Is It Possible to Be an Ethicist Without Being Mean to People,” considered the case of a controversial study, published in a top journal in psychology, claiming women at peak fertility were three times more likely to wear red or pink shirts, compared to women at other times during their menstrual cycles. After reading our published statistical criticism of this study in Slate, the researchers did not back down; instead, they gave reasons for why they believed their results (Tracy and Beall, 2013). But we do not think that they or others really believe the claimed effect of a factor of 3. For example, in an email exchange with a psychologist who criticized our criticisms, one of us repeatedly asked whether he believed women during their period of peak fertility are really three times more likely to wear red or pink shirts, and he repeatedly declined to answer this question.

What we think is happening here is that the authors of this study and their supporters separate the general scientific hypothesis (in this case, a certain sort of connection between fertility and behavior) from the specific claims made based on the data. We expect that, if forced to lay down their money, they would bet that, in a replication study, women in the specified days in their cycle would be less than three times more likely to wear red or pink, compared to women in other days of the cycle. Indeed, we would not be surprised if they would bet that the ratio would be less than two, or even less than 1.5. But we think they would still defend their hypothesis by saying, first, that all they care about is the existence of an effect and not its magnitude, and, second, that if this particular finding does not replicate, the non-replication could be explained by a sensitivity to experimental conditions.

Those authors might well be right that an expected replication ratio of 1.5 does indeed support their key hypothesis of the existence of a substantial effect with a certain sign. This doesn’t seem a reason not to bet on what that replication ratio would be, conditional on a replication being tried. One could also bet on long term consensus opinion on this general hypothesis; not all bets have to be about specifics. One could even bet on whether such a long term consensus opinion will ever form.

In addition, betting cannot be applied easily to policy studies that cannot readily be replicated. For example, a recent longitudinal analysis of an early childhood intervention in Jamaica reported an effect of 42% in earnings (Gertler et al., 2013). The estimate was based on a randomized trial, but we suspect the effect size was being overestimated for the usual reason that selection on statistical significance induces a positive bias in the magnitude of any comparison, and the reported estimate represents just one possible comparison that could have been performed on these data (Gelman, 2013a). So, if the study could be redone under the same conditions, we would bet the observed difference would be less than 42%. And under new conditions (larger-scale, modern-day interventions in other countries), we would expect to see further attenuation and bet that effects would be even lower, if measured in a controlled study using pre-chosen criteria. Given the difficulty in setting up such a study, though, any such bet would be close to meaningless. Similarly, there might be no easy way of evaluating the sorts of estimates that appear from time to time in the newspapers based on large public-health studies.

Bets on new studies might be “meaningless” for evaluating old studies – but we should care more about evaluating policy than about evaluating studies. If some large studies will be conducted in the future, prediction market forecasts of their results could be very useful to inform policy, even more useful than the actual estimates those studies will find.
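
The selection effect that Gelman and Loken invoke in the passage quoted above (selection on statistical significance inflating reported magnitudes) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true effect, standard error, number of studies, and the function name `simulate_significance_filter` are all hypothetical and are not taken from the Jamaica study or any other real data.

```python
import random
import statistics


def simulate_significance_filter(true_effect=0.10, std_error=0.10,
                                 n_studies=20000, seed=0):
    """Simulate many noisy studies of one true effect and compare the
    average estimate overall with the average among 'significant' results.

    All parameter values are hypothetical; none come from a real study.
    """
    rng = random.Random(seed)
    # Each study's estimate is the true effect plus normal sampling noise.
    estimates = [rng.gauss(true_effect, std_error) for _ in range(n_studies)]
    # Keep only estimates that pass a two-sided test at the 5% level.
    significant = [e for e in estimates if abs(e) / std_error > 1.96]
    return {
        "mean_all": statistics.mean(estimates),
        "mean_abs_significant": statistics.mean([abs(e) for e in significant]),
        "share_significant": len(significant) / n_studies,
    }


result = simulate_significance_filter()
print(result)
```

Every estimate that clears the 1.96 standard-error threshold has a magnitude of at least 0.196 here, so the average significant estimate necessarily exceeds the assumed true effect of 0.10, while the average over all studies stays close to it. Someone betting on a replication would, on this logic, bet below the published figure.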

That said, scientific prediction markets could be a step forward, just because it would facilitate clear predictive statements about replications. If a researcher believes in his or her theory, even while not willing to bet his or her published quantitative finding would reappear in a replication, that’s fine, but it would be good to see such a statement openly made. We don’t know that such bets would work well in practice—the biggest challenge would seem to be defining clearly enough the protocol for any replications—but we find it helpful to think in this framework, in that it forces us to consider, not just what is in a particular past data set, but also what might be happening in the general population.

On that, we can mostly agree.

  • Most critics of a prediction-market culture miss (what I think is) the main point: prediction markets encourage information hoarding.

    Robin’s apologetics for speculative markets, holding that the self-correction (by means of economic recession) itself proves their prescience, ignores that centralist solutions could be more efficient by preventing the disaster—rather than second guessing the market. For the market itself created the imbalances—misjudging, on its own terms.

    • Ben Southwood

      Prediction markets do precisely the opposite—discourage information hoarding. If you keep your information to yourself, i.e. do not bet on the prediction market, then you forgo money you could otherwise make. Conversely, by quantifying and formalising your info and adding it to the common pool—you can make a profit.

      • VV

        The bets may be public, but the information used to make them generally wouldn’t, and the speculators would have no incentive to release it.

        Prediction markets, like all speculative markets, favour insiders.

      • Ben Southwood

        The important information is the market price after everyone’s had a say; it effectively aggregates all those individual says and turns them into something useful. The underlying “background information” used is of much lower importance. See and the Fundamental Theorems of Welfare Economics.

      • The underlying background information is of much lower importance.

        Much more would be required to reach this conclusion than Pareto Equilibrium and Hayek.

      • VV

        If the markets aren’t substantially efficient then the prices aren’t accurate estimations of the underlying probabilities, and markets can’t be substantially efficient, otherwise nobody would be able to consistently make money on them.

        In the real world, financial markets operate far from efficiency conditions, and it is in fact possible to make money using very simple and predictable investment strategies such as index tracking. Market equilibrium theorems are largely irrelevant.

      • oldoddjobs

        Wow, why doesn’t everyone just use this simple index tracking?

      • VV

        @oldoddjobs because people think they can do better, even if they generally can’t.

      • IMASBA

        “markets can’t be substantially efficient, otherwise nobody would be able to consistently make money on them.”

        The number of players and transactions is large, so from the central limit theorem you’d expect there to be people who make a living from speculation, even in a perfectly efficient market where everything is due to chance. Having said that, there are factors that push markets away from optimal efficiency, such as trading firms with advanced trading algorithms and access to indexes from foreign markets milliseconds before everyone else (milliseconds are all the trading algorithms need), or speculators with pockets deep enough to manipulate currency exchange rates.

      • VV

        @IMASBA but if speculators were successful in financial markets only due to chance, you wouldn’t expect index funds to do well on average.

      • IMASBA

        Isn’t the whole purpose of an index fund to be representative of the exchange? You expect them to do well but not better than a blind gambler, in fact you’d expect them to perform almost exactly like a blind gambler (maybe a little bit better because they are established firms with savings so they can weather some storms). The average blind gambler makes a profit of 8% per year on a long term basis (this 8% is the result of technological progress, societal progress in poor countries and the increasing share of the global economy that’s occupied by big firms due to economies of scale and lobbying) so skill isn’t necessary to do well. It’s like playing craps with a pot that continually grows 8%: on average every player can leave after a number of rounds and feel like they made money playing craps, while in reality they could’ve watched tv all the time and divided the pot and still end up with the same profit.

      • VV

        In an efficient market, with no information asymmetry and other ideal assumptions, you expect nobody to have a non-zero average return.
        If you buy a stock, it will usually pay you dividends, but if the stock was correctly priced, as the efficiency assumption entails, then all these dividends, appropriately discounted, were already accounted for in the price you paid.

        Under weaker, more realistic assumptions, such as information asymmetry, different utility functions, and technical limits, you could expect some investment strategies to be consistently profitable, but only by making use of insider information, complex prediction techniques or exploiting the technical limits (e.g. by doing high frequency trading).
        But index tracking is very simple and predictable. Every other speculator can outpredict index funds, hence, if the markets were substantially efficient, you shouldn’t expect index funds to make money.

      • IMASBA

        “In an efficient market, with no information asymmetry and other ideal assumptions, you expect nobody to have a non-zero average return.”

        Actually you do because there are always unforeseen events (such as natural disasters or changes in fashion) that even a perfectly efficient market cannot predict, but even if that were not true you’d still expect some players to get lucky over finite sequences of transactions and it just so happens that human lifespans are finite.

        “If you buy a stock, it will usually pay you dividends, but if the stock was correctly priced, as the efficiency assumption entails, then all these dividends, appropriately discounted, were already accounted for in the price you paid.”

        And you’d still make an 8% annual gross profit.

        Whether or not insiders or experts can beat the market depends on what it is that market is selling.

      • Some info has direct channels to get paid for revelation, while other info is only rewarded indirectly, via it helping to infer other info that is directly rewarded. Folks have incentives to reveal the info that is directly rewarded, but not the info that is not directly rewarded. This is a reason to want more kinds of info to be directly rewarded. Which is a reason to want more prediction markets on more topics, as well as to want more other kinds of info rewards on all info topics.

      • Folks have incentives to reveal the info that is directly rewarded, but not the info that is not directly rewarded.

        The relevant point is that folks have disincentives to reveal information when they can profit from secrecy.

        This is a reason to want more kinds of info to be directly rewarded

        Not necessarily, since rewarding some uses of information (like betting on it) punishes (some) information sharing.

        I think you must acknowledge at least the theoretical possibility that, on balance, prediction markets will decrease the sharing of vital information more than augmenting it.

      • brendan_r

        Stephen, you’re wrong. Once an investor shorts a stock, he publicizes his thesis relentlessly. Sure, while he’s accumulating his short position he keeps his thesis quiet. But once short, he promotes to speed up the price adjustment process.

        Net result: far more info sharing.

        Why would it work different in other contexts?

      • Not necessarily more information sharing on balance, even after he shorts: what is withheld is adverse information. (Also, general understandings that help you predict in the future may be withheld, unless their value in inducing conviction in the present case is great.)

        But your point is valuable in foreseeing the cultural consequences of prediction markets. These would include not just information hoarding but increased promotion activity as opposed to disinterested discussion.

        And (relatively) disinterested discussion is a fragile public good.

      • brendan_r

        Stephen, stocks which can’t be bet against (unshortable and no puts available) often become astronomically overvalued and badly misunderstood.

        “But your point is valuable in foreseeing the cultural consequences of prediction markets. These would include not just information hoarding but increased promotion activity as opposed to disinterested discussion.”

        The talk surrounding unshortable stocks is anything but enlightened, disinterested, discussion. It is the most gullible, naive BS out there. [1] The absence of Shorts doesn’t shut up the Longs; it emboldens them to spew nonsense.

        [1] Because when bad info is silenced, overoptimism results, which filters out non-loons. See Afro Studies for an analog.

  • Juan Mario Inca

    Really interesting Robin, thanks!

    I think that you would be really interested in some of the most cutting-edge research that I have come across explaining crowds, and prediction markets.

    And you may also enjoy this blog about the same too:

    Powerful stuff, no?

  • Andrew Gelman


    Let me just clarify one small thing near the end of the discussion.

    We wrote, “Given the difficulty in setting up such a study, though, any such bet would be close to meaningless” to which you replied, “Bets on new studies might be “meaningless” for evaluating old studies – but we should care more about evaluating policy than about evaluating studies.”

    I agree that policy is what is important. When we wrote about the difficulty in setting up a study, our point was not that the goal was to evaluate studies; our point was just that it would be very difficult to set up a bet and get agreement on all the conditions.

    And now let me emphasize our agreement:

    I agree that, at the very least, whatever the practical difficulty of setting up real prediction markets, it’s a good idea for people to be talking about such bets, in that it would move people toward more quantitative claims so that, instead of someone saying, “I stand by study X,” they would say, “Yes, I stand by the estimate that in the general population the difference between these two groups is truly Y, and I think an independent replication would have a 50% chance of producing a result as high as Y.” My experience seems to be that (a) defenders of published studies often duck away from defending their quantitative claims, and (b) when they do defend a quantitative claim, they’re generally secure in their knowledge that there will be enough wiggle room regarding any replication that they can continue to stand proudly behind their original result even after an attempted replication that completely fails (see, for example, here). In this sort of case I’d prefer if the researchers would recognize the concept of a prediction market and, if they don’t really think their results will replicate in the anticipated way, just flat-out say so (and, for example, defend their research on the grounds that it doesn’t need to replicate, that they’re on a voyage of discovery in which every new study yields results that are a pleasant surprise, etc.).

    • I agree that it can be complex to define bets on undefined future studies. But once a study is specified clearly, it should be straightforward to define bets on the possible results of that study.

  • Joe Teicher

    I have 2 predictions about prediction markets.

    1. Prediction Markets will never become popular

    2. If they do become popular, finance/trading experts will be net winners and subject-matter “experts” will be net losers.

    Where can I make big bets on these predictions?


    “But speculative markets were the first credible source to correct the mistake [the housing crisis]”

    Isn’t that like saying a rock corrects the mistake of being in the air when it falls down due to gravity? The markets took notice of the housing crisis because a lot of people couldn’t pay their mortgage anymore; they didn’t predict it, and in fact they allowed it to get to the point of a crisis. The correction came from a link with the physical world. Prediction markets have no such link, except when they are decided at the end, but you want them to be useful before that, otherwise they’re not deserving of the name “prediction” market. That doesn’t mean prediction markets can never work, but it does mean you can’t just compare them to markets with links to the physical world that will therefore react to real world events even if all the players were computer programs or trained monkeys (just like a rock doesn’t have to know about gravity to fall down).

    • The speculative markets that first told of the financial crisis were *not* physically forced to their prices. Speculators set those prices because they foresaw future problems.

      • VV

        After they had been trading grossly overpriced mortgage-backed securities for years.

      • IMASBA

        Really, they “foresaw” the problems before the problems came about? Did they act before the collapse of Lehman Brothers?

        FYI, I’m just saying there’s a difference between markets where most people have a stake in the underlying goods or services and markets that are purely speculative. I actually believe in the case of the housing crisis the detachment would have served a prediction market well: a prediction market in 2005 based around the sustainability over say 5 years of the housing market would probably have leaned towards predicting trouble within those 5 years, whereas the actual housing market suffers from self-fulfilling prophecies and people trying to cash in right before the next bubble bursts. But there are also cases where detachment is not a good thing. I guess shortselling mortgages on a 5 year term is already a bit like a prediction market, it would be interesting to look at such markets (which you’ve probably done in much greater detail than I have). Oh, and you should really try to get prediction markets going in the UK (and/or Macau) if you want to empirically test them.

      • A few investors foresaw the problems as early as 2005, as described in the book The Big Short.

  • I invite the reader to try and identify a single instance in which a “deep structural parameter” has been estimated in a way that has affected the profession’s beliefs about the nature of preferences or production technologies or to identify a meaningful hypothesis about economic behavior that has fallen into disrepute because of a formal statistical test.

    The Scientific Illusion in Empirical Macroeconomics, Lawrence H. Summers, The Scandinavian Journal of Economics, Vol. 93, No. 2, Proceedings of a Conference on New Approaches to Empirical Macroeconomics (Jun., 1991), pp. 129-148

  • Ronfar

    I have a question about prediction markets and information sharing.

    Let’s say that there’s a 6-sided die with a bias in favor of one of the sides. The bias is currently unknown, but there’s a prediction market on what the die will roll on a specific future occasion. Other people have gotten a chance to roll the die, and the current market estimate of the probabilities of rolling each number is:

    1: 5%
    2: 5%
    3: 5%
    4: 75%
    5: 5%
    6: 5%

    I get a chance to roll the die once, and it comes up 6. This is new, relevant information that only I have. How should I bet so as to communicate this information?

    • You would raise the price on #6 up to say 5.1%. Just how much you move it up depends on just how many dice rolls everyone else has seen before.

      • Eliezer Yudkowsky

        Pointless quibble: Depending on how the original probabilities were arrived at, like “We previously saw five 4s and none of any other number, so we figured it wasn’t a fair die”, you might want to bet quite a lot differently from the previous distribution after seeing a single 6. Less pointlessly: Ronfar, if you have some math background, check out the Dirichlet prior and the Good-Turing estimator for some general principles on adjusting probabilities on this kind of sampling problem.

      • Ronfar

        What if everyone else has been keeping the number of die rolls they’ve seen, and their outcomes, secret? I thought all you needed was the market price!

      • No one claims that a small number of market prices reveals all possible related info.

      • But some here have claimed that market prices reveal the important information.
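
The exchange above about the biased die can be made concrete with a small sketch along the lines of the Dirichlet prior mentioned in the thread. The assumption, which goes beyond anything the commenters state, is that the market prices are read as the mean of a Dirichlet distribution whose pseudo-counts sum to `equivalent_rolls`, a stand-in for how many rolls the other traders have collectively seen. The function name and the values tried are hypothetical.

```python
def update_after_roll(market_probs, equivalent_rolls, observed_face):
    """Posterior-mean probabilities after observing one new roll.

    Assumption: the market prices are the mean of a Dirichlet distribution
    whose pseudo-counts sum to `equivalent_rolls` (a hypothetical quantity
    standing in for the evidence already priced in).
    """
    # Convert prices into pseudo-counts, then add the single new observation.
    counts = {face: p * equivalent_rolls for face, p in market_probs.items()}
    counts[observed_face] += 1
    total = equivalent_rolls + 1
    return {face: count / total for face, count in counts.items()}


market = {1: 0.05, 2: 0.05, 3: 0.05, 4: 0.75, 5: 0.05, 6: 0.05}

# Try several hypothetical amounts of prior evidence behind the prices.
for n in (5, 50, 949):
    posterior = update_after_roll(market, n, observed_face=6)
    print(n, round(posterior[6], 4), round(posterior[4], 4))
```

Under this reading, moving the price of a 6 from 5% to 5.1% corresponds to prices backed by roughly 949 earlier rolls, whereas prices backed by only five rolls would call for a much larger move. That is why the size of the bet depends on information the price alone does not carry, which is the point Ronfar raises.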


  • Dante

    Reading this in the newsfeed reader was impossible, since the quotes are done with p tags (whose padded-left style gets stripped) instead of the more reasonable blockquote tags.

    I’d bet this will be fixed in the mid to long term.
