30 Comments

Richard, naive evidence-based medicine does show signs of overemphasizing statistical knowledge, partly in overreaction to the use of poor causal models.

Rigorously acquired statistical knowledge is generally better than causal models which exist only in the mind of a doctor who has little incentive to avoid biases such as overconfidence. But "generally better" doesn't mean "always better", and the manner in which causal models are chosen can be improved.

Our knowledge about the effects of hitting the ground at a particular velocity hasn't been rigorously tested. But for almost all medical treatments, the effects are much weaker, and the weaker an effect is, the more rigorously we need to examine the evidence. I doubt that your causal model of how high-velocity impacts cause injury is particularly well thought out. I think your confidence in that causal model is based largely on the absence of anyone who doubts the predictions you have so far made with it.

I doubt that it's important to read any particular book before reading Causality, but it does require a good deal of comfort with basic statistical theory and with the way math is taught in college-level textbooks.

Judea Pearl makes the distinction between statistical knowledge and causal knowledge and argues that causal knowledge is much more useful. In fact, the main purpose of statistical knowledge in Pearl's way of looking at the world is to help a person or an intelligent agent acquire causal knowledge.

The evidence-based-medicine movement shows signs of overemphasizing statistical knowledge.

The parachute example is a good illustration of the superiority of causal knowledge because the reader knows that falling out of an airplane causes injury through the velocity with which the person hits the ground. In the vocabulary developed by Pearl and his colleagues, the velocity at which the person hits the ground screens off the effect that falling out of the airplane has on the injury.
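To make "screens off" concrete, here is a toy simulation (my own construction, not an example from Pearl; all the probabilities are invented): injury is generated from impact velocity alone, so once you condition on velocity, knowing how the fall started tells you nothing more about injury.

```python
import random

random.seed(0)

rows = []
for _ in range(200_000):
    cause = random.choice(["fell_from_plane", "tripped_on_stairs"])
    # The cause only influences how fast you are going at impact...
    p_high_velocity = 0.95 if cause == "fell_from_plane" else 0.02
    velocity = "high" if random.random() < p_high_velocity else "low"
    # ...and injury depends on velocity alone, not on the cause directly.
    p_injury = 0.98 if velocity == "high" else 0.05
    rows.append({"cause": cause, "velocity": velocity,
                 "injured": random.random() < p_injury})

def injury_rate(cause=None, velocity=None):
    """Empirical P(injured) among rows matching the given cause/velocity."""
    sel = [r for r in rows
           if (cause is None or r["cause"] == cause)
           and (velocity is None or r["velocity"] == velocity)]
    return sum(r["injured"] for r in sel) / len(sel)

# Marginally, the cause of the fall predicts injury very strongly:
print(injury_rate(cause="fell_from_plane"), injury_rate(cause="tripped_on_stairs"))

# But within each velocity stratum the cause adds (almost) nothing;
# velocity has "screened off" the cause:
for v in ("high", "low"):
    print(v, injury_rate(cause="fell_from_plane", velocity=v),
             injury_rate(cause="tripped_on_stairs", velocity=v))
```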

Eliezer has written that the reader should read Pearl's Probabilistic Reasoning in Intelligent Systems before attempting Pearl's Causality book, and has advised students of AI to read Tom Mitchell's Machine Learning before reading Probabilistic Reasoning in Intelligent Systems.

Well, it doesn't really matter if the first parachute is poorly defined. The issue is whether or not people interested in some form of "parachute-like device" tested it out or simply claimed it worked. We need not even go to the early versions. The question would be just as pertinent if it were asked whether modern-day parachute manufacturers test their designs empirically to see whether they reduce terminal velocity to levels that don't injure humans by trauma, or whether they simply put their products on the market and claim they work because they redirect the Qi energy of falling into a safe landing according to the principles of acculanding, or because the parachute canopy harnesses the homeopathic curative properties of atmospheric humidity.

AL, it's unclear what constituted the first parachute. There was one initially tested on a dog in 1785. There were prior parachute-like devices tested by people jumping from buildings.

Robin, it's unclear what observable outcomes from tin foil hats we should be testing for. For sufficiently broad meanings of "tin foil hat", there is advice near the end of this post to "Wrap your rooms in aluminum foil" for protection against EMP, which ought to be studied.

As Lewis Thomas wrote, shortly before the RAND study, the most effective medical interventions are the cheapest and are based on theoretical understanding, not experiment.

Did the inventor of the parachute not take some kind of evidence-based approach in determining whether or not the invention would work?

Maybe I was wrong. Maybe this group just sleeps late on Sundays.

Smith and Pell are flaunting their ignorance and stupidity. A randomised, controlled trial needs a control group: some participants don't get the treatment. And the assignment to the treatment group or the control group had better depend on something irrelevant such as the toss of a coin, or we will be in trouble when we try to work out cause and effect.

The authors further imagine that the treatment group and the control group have to be similar in size, that is, the randomisation has to be fifty-fifty. This is completely wrong.

If you wanted to set up a randomised trial of parachute effectiveness, you could arrange that 99.9999% of the jumpers were assigned to the treatment group and 0.0001% were assigned to the control group. After 10 million jumps, you would have 10 splattered controls and a statistically significant result.
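For anyone who wants to check that arithmetic, here is a minimal sketch (scipy assumed available; the ~300 parachute-arm deaths are a figure I invented purely so the table isn't degenerate, and only the 10 dead controls come from the scenario above):

```python
from scipy.stats import fisher_exact

# The commenter's scenario: ~10 million jumps, only 10 of them assigned to the
# control (no-parachute) group, and all 10 of those jumpers die. The ~300
# deaths in the parachute arm are a made-up figure for illustration.
treatment_n, treatment_deaths = 9_999_990, 300
control_n, control_deaths = 10, 10

table = [[treatment_deaths, treatment_n - treatment_deaths],
         [control_deaths,   control_n - control_deaths]]
_, p = fisher_exact(table)
print(p)  # astronomically small; extreme imbalance doesn't block significance
```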

The biggest difficulty in arranging this is in distinguishing the experiment from what sports parachutists do anyway. When you jump out of the plane and pull the ripcord it is like rolling a die with a 1 in 250 chance of your main chute failing. Whoops! You have become a candidate. You cut away and try the reserve. Reserves are more reliable. Very roughly one time in 4000 you find yourself assigned to the control group. Good luck. American sports parachutists are basically repeating this randomised controlled trial every four years.
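Spelling out the step between those two figures (this is my reading of the quoted numbers, not something the commenter states):

```python
# If the main fails about 1 jump in 250, and you end up with no working chute
# about 1 jump in 4000, then the reserve must also fail on roughly 1 in 16 of
# those cutaways.
p_main_failure = 1 / 250
p_no_working_chute = 1 / 4000
p_reserve_fails_after_cutaway = p_no_working_chute / p_main_failure
print(p_reserve_fails_after_cutaway)  # 0.0625, i.e. about 1 in 16
```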

OK, no-one is writing this up in the BMJ, so it doesn't have the red ribbons and gold braid of a randomised controlled trial, but it does have the substance. There is a control group and assignment is randomised. An RCT purist can put his hand on his heart and say: "My belief in parachutes is evidence based."

There is another problem with the article. The authors are annoyed at being asked to provide evidence in the form of RCT results for treatments in which they have faith. So they write a sarcastic article. The key ingredient is a treatment which is accepted even though there is no RCT testifying to its effectiveness. Notice the role of controversy in their article.

If a controversial treatment, whose efficacy is doubted, is given a free pass, and allowed into the repertoire of approved treatments without an RCT, the authors can say "what is sauce for the goose is sauce for the gander. Our favourite treatment should also be exempted from the RCT requirement even though it too is controversial."

Notice the direction of the implied comparison. If the treatments that are exempted from the requirement for RCT evidence are more controversial than their favourite treatment, then their argument is correspondingly stronger. They can say "you shouldn't be bugging us for evidence, you should be focusing on those over there, whose anecdotes are thinner and less plausible than ours."

Keep your eye on the direction of the comparison. If they start talking about less controversial treatments, their argument is correspondingly weaker. They might object to demands for RCT evidence to support their favourite treatment, but there is something quite natural about the EBM Nazis coming for the less convincing treatments first.

If they were doing a straight presentation this would be obvious. You show up the hypocrisy of the EBM Nazis by drawing attention to the free passes they give to treatments that might be no good, and they are all the more embarrassed if the treatment is especially weak. Unfortunately the authors try to be sarcastic. So they turn things around. Their logic skills are not up to the job and they get the logic of the comparison the wrong way round. Yes, going after a treatment whose efficacy no-one doubts is good comedy, but the more certain the treatment they ridicule, the weaker their argument.

The analogy, while humorous, is not accurate, because it doesn't take into account the magnitude of the effect. Medical therapies can have subtle benefits AND risks, which only randomized trials with large sample sizes can uncover.

The probability of surviving a jump with a working parachute is greater than 99%. The probability of surviving without a parachute is less than 1%. For an effect of that magnitude, you would only need a sample size of six or seven to achieve statistical rigor, while I would personally be satisfied after watching three people with parachutes and three with vehicle (a parachute that fails to deploy) jump out of airplanes. If you've ever watched TV, you have enough data.
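A quick sanity check of the "six or seven" figure using Fisher's exact test (scipy assumed available; the counts are hypothetical, with everyone in the parachute arm surviving and everyone in the vehicle arm dying):

```python
from scipy.stats import fisher_exact

table_3_vs_3 = [[3, 0],   # parachute:    survived, died
                [0, 3]]   # no parachute: survived, died
_, p6 = fisher_exact(table_3_vs_3, alternative="greater")
print(p6)  # 0.05 one-sided with a total sample of six

table_4_vs_3 = [[4, 0],
                [0, 3]]
_, p7 = fisher_exact(table_4_vs_3, alternative="greater")
print(p7)  # about 0.029 one-sided with a total sample of seven
```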

Even worse, parachutes encourage dangerous behavior (jumping out of airplanes). They should be banned immediately.

I find it odd that, with so many responses to the article, no one addressed the issue of how the intended use of a trial outcome affects the choice of methodology.

The dialogue so far assumes that the question is therapeutic: Should a doctor recommend one therapy over another?

The reality is that another important use of randomized trials is for regulation and insurance. Randomized trials are required for a drug to be legal, and for a drug or procedure to be paid for by insurance. So the proper analogy is not to the question of whether it is sensible to use a parachute before jumping out of an airplane in the absence of randomized trials. The more relevant analogies are:

1) Is it proper for insurance companies to help pay for the cost of parachutes? (For this analogy to work, we must assume that it is known before takeoff that the plane will crash, and that it must take off anyway.) For the answer to be "yes", we should require more proof of efficacy than for the therapeutic question.

2) Is it proper for the government to forbid the use of parachutes in the absence of randomized trials? For the answer to be "yes", we should require more proof of inefficacy than for the therapeutic question.

The discussion of the parachute question has focused on the question of effectiveness, without any mention of the regulation which (if this were a serious medical issue) would be the main outcome of the discussion.

Likewise, many debates about controversial drugs have a clear answer if you realize that the question being asked is not just "is this drug effective?" but also "should use of this drug be legal?" The debate takes place framed as if it were a debate about therapy; the results are used as if it had been a debate about regulation; and the result is too much regulation.

The theory that injury is causally related to the speed at impact is pretty robust; the theory that an apple a day keeps the doctor away, less so.

Well, humor aside, there are two major things you can say about the value of parachutes:

1) The basic physical and mechanical argument for how damage takes place without parachutes, and how parachutes prevent it, is much more straightforward than the mode of action of most pharmaceuticals, or even of surgical interventions. Sure, there are a lot of things for which you could claim "It's as straightforward as jumping without a parachute", but very few of them are actually that straightforward after the complexity of human biochemistry is taken into account.

2) A controlled study would be very expensive, so we have no choice but to engage in strange, dangerous, high-minded modes of reasoning that use less evidence.

I share your appreciation for the dry humor displayed by many educated Brits, and the fascination that David J. Balan expressed about them yesterday. But as to the actual subject matter of the two posts (evidence-based medicine, and the positive margin of medical intervention in two disorders found in Britain), my evidence-based observation is this: Balan's post generated only 2 comments, one of them mine, a seemingly pertinent question that Robin Hanson ducked; and this post has generated one comment, a snark of no value, plus this one (perhaps also of no value). While I'm not going to take the time to give an accurate count, it is within the ballpark to say that Hanson's series on the harmful aggregate effect of the US medical system generated hundreds of comments. I happen to accept Hanson's analysis for the group as a whole (why shouldn't I? It's math, not philosophy). Yet I can see there is a strong bias at work here, away from addressing both sides of the coin. Could it be that this blog is not actually what it seems, but rather this? It's not meant to be factual, but still ...

There also seem to be no published randomized trials of tin-foil hats. Given the extremely low cost of such a trial, what does this say about tin-foil hat skeptics?
