A new article at Behavioral and Brain Sciences reviews attempts to explain the following puzzle. People do badly at questions worded this way:

> The probability of breast cancer is 1% for a woman at age forty who participates in routine screening. If a woman has breast cancer, the probability is 80% that she will get a positive mammography. If a woman does not have breast cancer, the probability is 9.6% that she will also get a positive mammography. A woman in this age group had a positive mammography in a routine screening. What is the probability that she actually has breast cancer? __%
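For reference, the calculation the question asks for is a direct application of Bayes' theorem. A sketch, writing $C$ for "has breast cancer" and $+$ for "positive mammography" (these symbols are shorthand introduced here, not taken from the article):

$$
P(C \mid +) = \frac{P(+ \mid C)\,P(C)}{P(+ \mid C)\,P(C) + P(+ \mid \neg C)\,P(\neg C)} = \frac{0.80 \times 0.01}{0.80 \times 0.01 + 0.096 \times 0.99} = \frac{0.008}{0.10304} \approx 0.078
$$

So the answer is a little under 8%, which is far lower than the 80% figure in the problem statement might suggest.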

## Think Frequencies, Not Probabilities

Self-plug: Anyone who has trouble teaching Bayes's Theorem to high school students can send them to An Intuitive Explanation of Bayesian Reasoning, which I designed after reading all the pessimistic papers about how hard it is to get subjects to retain Bayes's Theorem for two weeks. Includes neat Java applets.

In teaching AP statistics in high school, I have found that many students have an easier time doing a conditional probability or Bayes' Theorem problem if they put everything in terms of a total frequency of 100.

I think that this has something to do with preferring concrete to abstract, but I'm not sure.

Is the issue really visualization of probabilities, or the fact that conditional probabilities are tricky to interpret because they deemphasize the a priori probabilities? To me, both descriptions are confusing. Why not just work with the probabilities of sample outcomes, scaled by an appropriately large group? For this example, why not say: "In a thousand tests, we expect that 8 women will test positive and have cancer, 95 women will test positive but not have cancer, 2 women will test negative but actually have cancer, and about 895 women will test negative and not have cancer."

This says that a woman who tests positive can look at the facts and deduce that she is among a group of about 103 who tested positive, and that about 8 of those women will be found to have cancer, which suggests the right answer of about 8/103, or roughly 7.8%.
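The counting argument above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `natural_frequencies` and its parameter names are hypothetical, and the inputs are the figures from the mammography question (1% base rate, 80% hit rate, 9.6% false positive rate) scaled to a group of 1000.

```python
def natural_frequencies(population, base_rate, sensitivity, false_positive_rate):
    """Split a population into the four test outcomes (expected counts)."""
    sick = population * base_rate
    healthy = population - sick
    true_pos = sick * sensitivity               # have cancer, test positive
    false_neg = sick - true_pos                 # have cancer, test negative
    false_pos = healthy * false_positive_rate   # no cancer, test positive
    true_neg = healthy - false_pos              # no cancer, test negative
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "false_pos": false_pos,
        "true_neg": true_neg,
    }


# Figures from the question, scaled to 1000 women (the group size is a choice).
counts = natural_frequencies(1000, 0.01, 0.80, 0.096)

# Among everyone who tests positive, what fraction actually has cancer?
positives = counts["true_pos"] + counts["false_pos"]
posterior = counts["true_pos"] / positives

for name, value in counts.items():
    print(f"{name}: about {round(value)}")
print(f"share of positives with cancer: {posterior:.3f}")
```

Rounding the expected counts gives the 8, 95, 2 and 895 quoted above. A group of 1000 is convenient here because a group of 100 would leave less than one woman in the "has cancer, tests positive" cell.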

> prefer to reason in terms of frequencies

Do you really mean that as an imperative? I suspect this is not a matter of bad reasoning, but of a lack of reasoning. Thus I think that someone willing to put in enough effort to rephrase the problem was going to solve it anyhow.

Thus, I think a more helpful heuristic would be to make other people encounter frequencies, rather than probabilities. But this is a pretty contrived example: surely it would be better to give the doctor false positive rates than to make it easier for the doctor to compute them.

This is why visualising risks of different options using dartboards and roulette wheels likely works so well, while pie charts (more based on frequency) do badly. See Hoffman JR, Wilkes MS, Day FC, Bell DS, Higa JK (2006) The Roulette Wheel: An Aid to Informed Decision Making. PLoS Med 3(6): e137 doi:10.1371/journal.pmed.0030137

Still, people tend to estimate areas as the real area to the power of 0.7, http://en.wikipedia.org/wik... and this might still give an overestimation of small areas compared to large ones.
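To get a rough feel for the size of that effect, here is a minimal sketch. It assumes perceived area is simply the true area raised to the power 0.7 (the exponent quoted above), and that a viewer compares one slice against the rest of the chart; both the normalisation into shares and the name `perceived_share` are illustrative choices, not something taken from the cited paper.

```python
# Assumption: perceived area = true area ** 0.7, using the exponent quoted above.
EXPONENT = 0.7


def perceived_share(true_share, exponent=EXPONENT):
    """Perceived share of a slice when it is compared against the rest of the chart."""
    small = true_share ** exponent
    rest = (1.0 - true_share) ** exponent
    return small / (small + rest)


for share in (0.01, 0.08, 0.25, 0.50):
    print(f"true share {share:.2f} -> perceived share {perceived_share(share):.3f}")
```

Under this assumption any slice below one half looks larger than it is, and the distortion is proportionally greatest for the smallest slices, which is the overestimation described above.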