The reason "Why Most Published Research Findings Are False" is the most downloaded technical paper is the same reason WHO magazine sells more copies with a photo of B. Spears drugged and semi-naked on the cover than with, say, a photo of a moth. It is sensational, nothing more.

It is the very system criticized in this 'blog' that generates the new papers and meta-analyses that contradict previous findings. That is the strength of the scientific process. It is possible to get closer to an 'objective truth' without ever actually getting there.

Aristotle's thoughts on the motion of objects were better than nothing at explaining the world; they were superseded by Newton's, and then by Einstein's. It is certain that all three models are 'wrong', but that hardly reduces their merit.

The famous quote "If I have seen further it is by standing on the shoulders of giants." should really read "If I have seen further it is by standing on the corpses of incorrect theories."


There is actually a very standard question in statistics, generally referred to as the Law of Very Large Numbers: given a finite database, if you try enough hypotheses, a pointless one will eventually come out looking relevant. That is basically the same idea.
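To put a rough number on that, here is a minimal sketch. It assumes the tests are independent, that every null hypothesis is actually true, and that the usual 0.05 threshold is used; the function name is made up for illustration.

```python
def prob_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of `num_tests` independent tests of TRUE null
    hypotheses comes out 'significant' at level `alpha` purely by chance."""
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    # Each test avoids a false positive with probability (1 - alpha);
    # all of them avoid it with probability (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for k in (1, 5, 20, 100):
        print(f"{k:>3} hypotheses tried -> "
              f"{prob_at_least_one_false_positive(k):.3f}")
```

Under those assumptions the chance of a spurious hit climbs quickly with the number of ideas tried against the same data. Real hypotheses are rarely independent, so treat this as an illustration rather than an estimate.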

The usual solution is to make only sensible assumptions, which is easier said in a reasonable science with little history than in medical science, with centuries of documented research and billions at stake. The only other alternative is to actually reproduce the experiment, not to claim it is well documented enough to be reproduced. Some scientists do not quote a result unless it has been reproduced; I recommend that this information be linked to papers.

If retraction is too harsh a step for a minority opinion, then let's add a tag to it instead. The good news is that it would draw attention to things that might be hidden behind overlooked experimental designs or assumptions. It would also help measure the actual rate of false positives: the 1 in 20 mentioned earlier is only the most common significance threshold, not the actual rate. A lower threshold would not make sense, since it has more to do with the accuracy of measurements than with the science behind them.
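The gap between the threshold and the actual rate can be sketched as follows. The share of significant findings that are false depends on the threshold, on how often the tested hypotheses are true, and on statistical power. The prior odds and power values below are invented inputs for illustration, not measurements, and the function name is hypothetical.

```python
def share_of_significant_findings_that_are_true(
    prior_odds: float, power: float, alpha: float = 0.05
) -> float:
    """Fraction of 'significant' results that reflect a real effect.

    prior_odds: ratio of true to false hypotheses among those tested (assumed).
    power:      chance a real effect reaches significance (assumed).
    alpha:      significance threshold, i.e. chance a null effect does.
    """
    true_positives = power * prior_odds   # per one false hypothesis tested
    false_positives = alpha               # per one false hypothesis tested
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Illustrative scenarios only: (true:false odds, power).
    for odds, power in [(1.0, 0.8), (0.1, 0.8), (0.1, 0.2)]:
        ppv = share_of_significant_findings_that_are_true(odds, power)
        print(f"odds={odds}, power={power}: "
              f"{1 - ppv:.0%} of significant findings would be false")
```

With the same 0.05 threshold, the false share moves a great deal as the assumed odds and power change, which is why a tag that tracks later contradictions would measure something the threshold alone cannot.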


"Retraction" of an article is an unusual step. If a study merely gives the "wrong" answer (say, because it is among the 1 in 20 studies that gives statistically significant results by chance), it isn't retracted. For a paper to be retracted means that there was something so bad about the paper that it should never be looked at again. Fraud would call for a retraction, as would a severely flawed experimental design that does not actually measure what it claims to be measuring. Therefore, the number of retracted studies should be much lower than the number of studies that are contradicted by further studies.


I doubt just lowering the acceptable p to 0.01 would do it. Consider an actual experiment that "proved" ESP at about p=0.0001. Flawed of course, because IIUC it used a predictable random number generator, but it got a nice low p value.

I like Robin's idea in theory, but for their own protection against being basically robbed, articles that included betting offers would have to specify in detail what counts as a replication.

That poses two problems:

<ul><li>The articulation problem. It's difficult to articulate all the assumptions one has used, against the possibility of a robber finding a loophole in the conditions.</li><li>The "fine print" problem, where the details of the replication conditions remove any real possibility of claiming the reward.</li></ul>


Imagine articles ended with betting odds, odds at which the authors offer to bet that, conditional on a replication attempt, such an attempt will vindicate the article's conclusion. Of course this offer would only be valid for a limited time and quantity. And each conclusion might have different odds.
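As a rough sketch of what such odds would reveal: if an author stakes some amount against a challenger's stake, the offer only breaks even when the author's confidence in replication reaches a certain level. The stake sizes and function names below are hypothetical, and the sketch ignores risk aversion and the limits on time and quantity.

```python
def break_even_probability(author_stake: float, challenger_stake: float) -> float:
    """Replication probability at which the author's bet has zero expected profit.

    The author wins `challenger_stake` if the replication vindicates the
    conclusion and loses `author_stake` if it does not.
    """
    return author_stake / (author_stake + challenger_stake)


def author_expected_profit(
    p_replicates: float, author_stake: float, challenger_stake: float
) -> float:
    """Author's expected profit given a believed replication probability."""
    return p_replicates * challenger_stake - (1.0 - p_replicates) * author_stake


if __name__ == "__main__":
    # Hypothetical offer: the author puts up 9 units against a challenger's 1.
    print(break_even_probability(9, 1))          # confidence the offer implies
    print(author_expected_profit(0.95, 9, 1))    # positive if the author is surer
    print(author_expected_profit(0.80, 9, 1))    # negative if the author is less sure
```

An author offering long odds is in effect publishing a lower bound on their own confidence, which is the information a reader would want alongside the p-value.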


Apparently, not many Overcoming Bias readers are still surprised.

So what kind of politically realistic solution would work here? The first thing that comes to mind is lowering "statistical significance" to p<0.01. Is there any other implementable solution that stands a decent chance of going through?
