The latest New Yorker:
All sorts of well-established, multiply confirmed findings have started to look increasingly uncertain. … This phenomenon … is occurring across a wide range of fields, from psychology to ecology. … The most likely explanation for the decline is … regression to the mean. … Biologist Michael Jennions argues that the decline effect is largely a product of publication bias. Biologist Richard Palmer suspects that an equally significant issue is the selective reporting of results. … The disturbing implication … is that a lot of extraordinary scientific data is nothing but noise. (more)
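To see how publication bias plus regression to the mean can produce such a decline, consider a toy simulation (a minimal sketch; the true effect, noise level, and significance filter below are my illustrative assumptions, not numbers from the article). Many labs estimate a modest real effect, only the studies that clear p < .05 get published, and unbiased replications then drift back toward the truth:

```python
# Toy decline-effect simulation: a publication filter inflates first estimates,
# and replications regress back toward the true effect.  All numbers illustrative.
import numpy as np

rng = np.random.default_rng(0)
true_effect, se, n_labs = 0.2, 0.15, 10_000

initial = rng.normal(true_effect, se, n_labs)               # first-study estimates
published = initial[initial / se > 1.96]                    # keep only "significant" results
replications = rng.normal(true_effect, se, published.size)  # same design, no filter

print(f"true effect:             {true_effect:.2f}")
print(f"mean published estimate: {published.mean():.2f}")   # inflated by selection
print(f"mean replication:        {replications.mean():.2f}") # regresses to the mean
```

Nothing mysterious is happening here: the filter keeps only the luckiest draws, so the published record starts out too high, and later unfiltered studies look like a puzzling decline.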
Academics are trustees of one of our greatest resources – the accumulated abstract knowledge of our ancestors. Academics appear to spend most of their time trying to add to that knowledge, and such effort is mostly empirical – seeking new interesting data. Alas, for the purpose of intellectual progress, most of that effort is wasted. And one of the main wastes is academics being too gullible about their own and their allies’ findings, and too skeptical about rivals’ findings.
Academics can easily coordinate to be skeptical of the findings of non-academics and low-prestige academics. Beyond that, each academic has an incentive to be gullible about his own findings, and his colleagues, journals, institutions, etc. share in that incentive as they gain status by association with him. The main contrary incentive is a fear that others will at some point dislike a finding’s conclusions, methods, or conflicts with other findings.
Academics in an area can often coordinate to declare their conclusions reasonable, methods sound, and conflicts minimal. If they do this, the main anti-gullibility incentives are outsiders’ current or future complaints. And if an academic area is prestigious and unified enough, it can resist and retaliate against complaints from academics in other fields, the way medicine now easily resists complaints from economics. Conflicts with future evidence can be dismissed by saying they did their best using the standards of the time.
It is not clear that these problems hurt academics’ overall reputation, or that academics care much to coordinate to protect it. But if academics wanted to limit the gullibility of academics in other fields, their main tool would be simple clear social norms, like those now encouraging public written archives, randomized trials, controlled experiments, math-expressed theories, and statistically-significant estimates.
Such norms remain insufficient, as great inefficiency persists. How can we do better? The article above concludes by suggesting:
We like to pretend that our experiments define the truth for us. But … when the experiments are done, we still have to choose what to believe.
True, but of little use. The article’s only other suggestion:
Schooler says “Every researcher should have to spell out, in advance, how many subjects they’re going to use, and what exactly they’re testing, and what constitutes a sufficient level of proof.”
Alas, this still allows much publication bias, and one just cannot anticipate all reasonable ways to learn from data before it is collected. Arnold Kling suggests:
An imperfect but workable fix would be to standardize on a lower significance level. I think that for most ordinary research, the significance level ought to be set at .001.
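For intuition on why a stricter threshold helps, and what it costs, here is a rough back-of-the-envelope calculation (my assumptions, not Kling’s: 10% of tested hypotheses are true, and studies are powered at roughly 50% under the .05 standard):

```python
# Share of "significant" findings that are false positives, at two thresholds.
# Base rate of true hypotheses and typical effect size are assumed for illustration.
from statistics import NormalDist

N = NormalDist()

def false_finding_share(alpha, prior_true=0.10, effect_z=1.64):
    """One-sided test: fraction of significant results that are false, plus power."""
    z_crit = N.inv_cdf(1 - alpha)
    power = 1 - N.cdf(z_crit - effect_z)        # chance a real effect clears the bar
    true_hits = prior_true * power
    false_hits = (1 - prior_true) * alpha
    return false_hits / (true_hits + false_hits), power

for alpha in (0.05, 0.001):
    share, power = false_finding_share(alpha)
    print(f"alpha={alpha:<6}  false-finding share={share:.0%}  power={power:.0%}")
```

Under these assumptions the false-finding share falls from roughly half to about a tenth, but power falls from about 50% to under 10%, so many true findings get screened out too.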
I agree this would reduce excess gullibility, though at the expense of increasing excess skepticism. My proposal naturally involves prediction markets:
When possible, a paper whose main contribution is “interesting” empirical estimates should give a description of a much better (i.e., larger, later) study that, if funded, would offer more accurate estimates. There should be funding to cover a small (say 0.001) chance of actually doing that better study, and to subsidize conditional betting markets on its results, open to a large referee community with access to the paper for a minimum period (say a week). A paper should not gain prestigious publication mainly on the basis of “interesting” estimates if current market forecasts of the better study’s estimates do not support the paper’s estimates.
Theory papers containing proofs might similarly offer bets on whether errors will be found in them, and might also offer conditional bets on whether more interesting and general results could be proven, if sufficient resources were put to the task.
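To make the betting-market half of this proposal concrete, here is a toy sketch of one possible mechanism (my own illustrative construction, not a specification from the proposal): a logarithmic market scoring rule over whether the larger follow-up study would confirm the paper’s estimate, with every trade refunded if that study is never funded.

```python
# Toy conditional prediction market using a logarithmic market scoring rule (LMSR).
# Outcome 0 = "follow-up study confirms the paper's estimate", outcome 1 = it does not.
# All names and parameters here are illustrative assumptions.
import math

class ConditionalLMSR:
    """Trades settle only if the follow-up study is actually run; otherwise they are refunded."""

    def __init__(self, b=100.0):
        self.b = b                 # liquidity (subsidy) parameter
        self.q = [0.0, 0.0]        # outstanding shares per outcome
        self.ledger = []           # (trader, outcome, shares, cost), kept for refunds

    def _cost(self, q):
        return self.b * math.log(sum(math.exp(x / self.b) for x in q))

    def price(self, outcome):
        """Current market-implied probability of the given outcome."""
        z = [math.exp(x / self.b) for x in self.q]
        return z[outcome] / sum(z)

    def buy(self, trader, outcome, shares):
        """Buy shares in an outcome; returns the cost charged by the market maker."""
        new_q = list(self.q)
        new_q[outcome] += shares
        cost = self._cost(new_q) - self._cost(self.q)
        self.q = new_q
        self.ledger.append((trader, outcome, shares, cost))
        return cost

    def settle(self, study_was_run, confirmed=None):
        """Pay $1 per winning share if the study ran; otherwise refund all stakes."""
        payouts = {}
        for trader, outcome, shares, cost in self.ledger:
            if not study_was_run:
                payouts[trader] = payouts.get(trader, 0.0) + cost
            else:
                winner = 0 if confirmed else 1
                payouts[trader] = payouts.get(trader, 0.0) + (shares if outcome == winner else 0.0)
        return payouts

market = ConditionalLMSR(b=50)
market.buy("referee_A", 0, 30)   # bets the bigger study would confirm the estimate
market.buy("referee_B", 1, 60)   # bets it would not
print(f"market-implied chance of confirmation: {market.price(0):.2f}")
print(market.settle(study_was_run=False))   # study never funded: stakes returned
```

In this sketch the market maker’s bounded worst-case loss (at most b·ln 2) plays the role of the subsidy the proposal mentions, while the small chance of actually running the better study is what gives the contract something to settle on.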
More quotes from that New Yorker article:
The study turned [Schooler] into an academic star. … It has been cited more than four hundred times. … [But] it was proving difficult to replicate. … his colleagues assured him that such things happened all the time. … “I really should stop talking about this. But I can’t.” That’s because he is convinced he’s stumbled on a serious problem, one that afflicts many of the most exciting new ideas in psychology. …
Jennions admits that his findings are troubling, but expresses a reluctance to talk about them publicly. “This is a very sensitive issue for scientists,” he says. … In recent years, publication bias has mostly been seen as a problem for clinical trials … But it’s becoming increasingly clear that publication bias also produces major distortions in fields without large corporate incentives, such as psychology and ecology. …
“Once I realized that selective reporting is everywhere in science, I got quite depressed,” Palmer told me. … “I had no idea how widespread it is.” … “Some – perhaps many – cherished generalities are at best exaggerated … and at worst a collective illusion.” … John Ioannidis … says … “We waste a lot of money treating millions of patients and doing lots of follow-up studies on other themes based on results that are misleading.”
You don't want to publish papers that are most likely to be true. You want to publish papers that change your Bayesian priors the most. This system would screen out all novel ideas.
Should we have a prediction market in how much the prediction market idea can be successfully extended in the myriad ways you propose, and then fund the development of those markets accordingly?
Of course it makes sense that you continue to develop the idea by flogging out new and different applications of it. And part of the value of doing that is that it gets others, including myself, thinking: what are the limits, and how can we determine them?
So I ask you: how good at predicting corporate values is the stock market? Maybe this has already been studied and reported?
I can certainly speak qualitatively of the stock market’s failures. It did not predict the total explosion of mortgage-backed securities and the effect of that on numerous banks and other large tradable companies. It did not predict the over-valuation of, and over-investment in, internet companies in the 1990s.
I'd love to start seeing some education from you on the limits and failures of prediction markets intermixed with the amazing stream of abstruse proposals for their use.