Category Archives: Statistics

Why so little model checking done in statistics?

One thing that bugs me is that there seems to be so little model checking done in statistics.  Data-based model checking is a powerful tool for overcoming bias, and it’s frustrating to see this tool used so rarely.  As I wrote in this referee report,

I’d like to see some graphs of the raw data, along with replicated datasets from the model. The paper admirably connects the underlying problem to the statistical model; however, the Bayesian approach requires a lot of modeling assumptions, and I’d be a lot more convinced if I could (a) see some of the data and (b) see that the fitted model would produce simulations that look somewhat like the actual data. Otherwise we’re taking it all on faith.

But, why, if this is such a good idea, do people not do it?
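As a rough illustration of the kind of check the referee report asks for, here is a minimal sketch that simulates replicated datasets from a fitted model and compares them with the data. Everything in it is an assumption made for illustration: the data are made-up exponential draws, the model is a normal distribution fitted by plugging in the sample mean and standard deviation (a simplification of drawing from a full Bayesian posterior), and the function name replicate_check is hypothetical.

```python
import random
import statistics


def replicate_check(data, stat, n_rep=1000, seed=0):
    """Compare stat(data) with stat on datasets simulated from a fitted normal model."""
    rng = random.Random(seed)
    # Plug-in fit: a shortcut standing in for a full posterior distribution.
    mu = statistics.fmean(data)
    sigma = statistics.stdev(data)
    observed = stat(data)
    # Each replicated dataset has the same size as the real one.
    reps = [stat([rng.gauss(mu, sigma) for _ in data]) for _ in range(n_rep)]
    # Share of replicates whose statistic is at least as large as the observed one.
    share = sum(r >= observed for r in reps) / n_rep
    return observed, share


# Hypothetical skewed, strictly non-negative data that a normal model fits poorly.
data_rng = random.Random(1)
data = [data_rng.expovariate(1.0) for _ in range(200)]

observed_min, share_at_least = replicate_check(data, min)
print(observed_min, share_at_least)
```

If almost no replicated dataset has a minimum as large as the observed one, the fitted model is producing simulations that do not look like the data in that respect, which is the signal the check is meant to surface. In practice a plot of the raw data next to a few replicates would usually be more informative than a single summary number.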


Will bond yields go up, or down, or remain the same? If you’re a TV pundit and your job is to explain the outcome after the fact, then there’s no reason to worry. No matter which of the three possibilities comes true, you’ll be able to explain why the outcome perfectly fits your pet market theory. There’s no reason to think of these three possibilities as somehow opposed to one another, as exclusive, because you’ll get full marks for punditry no matter which outcome occurs.

But wait! Suppose you’re a novice TV pundit, and you aren’t experienced enough to make up plausible explanations on the spot. You need to prepare remarks in advance for tomorrow’s broadcast, and you have limited time to prepare. In this case, it would be helpful to know which outcome will actually occur – whether bond yields will go up, down, or remain the same – because then you would only need to prepare one set of excuses.

Alas, no one can possibly foresee the future. What are you to do? You certainly can’t use "probabilities". We all know from school that "probabilities" are little numbers that appear next to a word problem, and there aren’t any little numbers here. Worse, you feel uncertain. You don’t remember feeling uncertain while you were manipulating the little numbers in word problems. College classes teaching math are nice clean places; therefore math itself can’t apply to life situations that aren’t nice and clean.  You wouldn’t want to inappropriately transfer thinking skills from one context to another.  Clearly, this is not a matter for "probabilities".


How should unproven findings be publicized?

A year or so ago I heard about a couple of papers by Satoshi Kanazawa on "Engineers have more sons, nurses have more daughters" and "Beautiful parents have more daughters."  The titles surprised me, because in my acquaintance with such data, I’d seen very little evidence of sex ratios at birth varying much at all, certainly not by 26% as was claimed in one of these papers.  I looked into it and indeed it turned out that the findings could be explained as statistical artifacts–the key errors were, in one of the studies, controlling for intermediate outcomes and, in the other study, reporting only one of multiple potential hypothesis tests.  At the time, I felt that a key weakness of the research was that it did not include collaboration with statisticians, experimental psychologists, or others who are aware of these issues.


Statistical inefficiency = bias, or, Increasing efficiency will reduce bias (on average), or, There is no bias-variance tradeoff

Statisticians often talk about a bias-variance tradeoff, comparing a simple unbiased estimator (for example, a difference in differences) to something more efficient but possibly biased (for example, a regression).  There’s commonly the attitude that the unbiased estimate is a better or safer choice.  My only point here is that, by using a less efficient estimate, we are generally choosing to estimate fewer parameters (for example, estimating an average incumbency effect over a 40-year period rather than estimating a separate effect for each year or each decade).  Or estimating an overall effect of a treatment rather than separate estimates for men and women.  If we do this (make the seemingly conservative choice not to estimate interactions), we are implicitly estimating these interactions at zero, which is not unbiased at all!

I’m not saying that there are any easy answers to this; for example, see here for one of my struggles with interactions in an applied problem. In this case (estimating the effect of incentives in sample surveys), we were particularly interested in certain interactions even though they could not be estimated precisely from data.


Priors as Mathematical Objects

Followup to:  "Inductive Bias"

What exactly is a "prior", as a mathematical object?  Suppose you’re looking at an urn filled with red and white balls.  When you draw the very first ball, you haven’t yet had a chance to gather much evidence, so you start out with a rather vague and fuzzy expectation of what might happen – you might say "fifty/fifty, even odds" for the chance of getting a red or white ball.  But you’re ready to revise that estimate for future balls as soon as you’ve drawn a few samples.  So then this initial probability estimate, 0.5, is not, repeat not, a "prior".
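One way to see why the number 0.5 alone cannot be the prior is to compare two hypothetical states of belief that both say "fifty/fifty" for the first ball but react differently to evidence. The sketch below is an illustration under stated assumptions, not the post's own construction: draws are treated as independent with a fixed red fraction (as if drawing with replacement), the fraction is restricted to a grid of eleven values, and the names open_minded and fixed are made up.

```python
from fractions import Fraction


def predictive_red(prior):
    """Probability that the next ball is red, averaging over the red fraction."""
    return sum(p * theta for theta, p in prior.items())


def update(prior, drew_red):
    """Bayes update of beliefs about the red fraction after one draw."""
    unnorm = {
        theta: p * (theta if drew_red else 1 - theta)
        for theta, p in prior.items()
    }
    total = sum(unnorm.values())
    return {theta: w / total for theta, w in unnorm.items()}


# Assumed grid of possible red fractions: 0, 1/10, ..., 1.
grid = [Fraction(k, 10) for k in range(11)]

# Belief A: every red fraction on the grid is equally plausible.
open_minded = {theta: Fraction(1, 11) for theta in grid}
# Belief B: certain that exactly half the balls are red.
fixed = {Fraction(1, 2): Fraction(1)}

first_a = predictive_red(open_minded)
first_b = predictive_red(fixed)

# Observe three red balls in a row under each belief.
for _ in range(3):
    open_minded = update(open_minded, drew_red=True)
    fixed = update(fixed, drew_red=True)

later_a = predictive_red(open_minded)
later_b = predictive_red(fixed)
print(first_a, first_b, later_a, later_b)
```

Both beliefs assign probability 1/2 to red on the first draw, yet after three red balls the first expects red much more strongly while the second has not moved. The object that distinguishes them is the whole distribution over red fractions, not the initial 0.5.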

An introduction to Bayes’s Rule for confused students might refer to the population frequency of breast cancer as the "prior probability of breast cancer", and the revised probability after a mammography as the "posterior probability". But in the scriptures of Deep Bayesianism, such as Probability Theory: The Logic of Science, one finds a quite different concept – that of prior information, which includes e.g. our beliefs about the sensitivity and specificity of mammography exams. Our belief about the population frequency of breast cancer is only one small element of our prior information.


“Inductive Bias”

(Part two in a series on "statistical bias", "inductive bias", and "cognitive bias".)

Suppose that you see a swan for the first time, and it is white.  It does not follow logically that the next swan you see must be white, but white seems like a better guess than any other color.  A machine learning algorithm of the more rigid sort, if it sees a single white swan, may thereafter predict that any swan seen will be white.  But this, of course, does not follow logically – though AIs of this sort are often misnamed "logical".  For a purely logical reasoner to label the next swan white as a deductive conclusion, it would need an additional assumption:  "All swans are the same color."  This is a wonderful assumption to make if all swans are, in reality, the same color; otherwise, not so good.  Tom Mitchell’s Machine Learning defines the inductive bias of a machine learning algorithm as the assumptions that must be added to the observed data to transform the algorithm’s outputs into logical deductions.

A more general view of inductive bias would identify it with a Bayesian’s prior over sequences of observations…


The Error of Crowds

I’ve always been annoyed at the notion that the bias-variance decomposition tells us something about modesty or Philosophical Majoritarianism.  For example, Scott Page rearranges the equation to get what he calls the Diversity Prediction Theorem:

Collective Error = Average Individual Error – Prediction Diversity

I think I’ve finally come up with a nice, mathematical way to drive a stake through the heart of that concept and bury it beneath a crossroads at midnight, though I fully expect that it shall someday rise again and shamble forth to eat the brains of the living.
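For readers who have not seen the identity quoted above, the following sketch checks it numerically. It assumes the usual squared-error reading of the three terms (collective error as the squared error of the average prediction, individual error as each predictor's squared error, diversity as the spread of predictions around their average); the predictions and the true value are invented numbers.

```python
import math
import statistics


def diversity_terms(predictions, truth):
    """Return (collective error, average individual error, prediction diversity)."""
    crowd = statistics.fmean(predictions)
    # Squared error of the crowd's average prediction.
    collective_error = (crowd - truth) ** 2
    # Mean squared error of the individual predictions.
    average_individual_error = statistics.fmean(
        (p - truth) ** 2 for p in predictions
    )
    # Mean squared distance of each prediction from the crowd average.
    prediction_diversity = statistics.fmean(
        (p - crowd) ** 2 for p in predictions
    )
    return collective_error, average_individual_error, prediction_diversity


# Hypothetical guesses and true value.
collective, individual, diversity = diversity_terms([48, 52, 61, 39], truth=55)
print(collective, individual, diversity)
```

The identity holds for any set of numbers, because it is the same algebra as the bias-variance decomposition. That is also why, on its own, it carries no information about whether the individual predictions are any good.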


Useful Statistical Biases

Friday’s post on statistical bias and the bias-variance decomposition discussed how the expected squared error of an estimator equals the squared directional error (the bias) of the estimator plus the variance of the estimator.  All else being equal, bias is bad – you want to get rid of it.  But all else is not always equal.  Sometimes, by accepting a small amount of bias in your estimator, you can eliminate a large amount of variance.  This is known as the "bias-variance tradeoff".
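A small worked case may make the tradeoff concrete. The sketch below assumes one particular biased estimator, the sample mean multiplied by a shrinkage factor c, and uses made-up values for the true mean, the noise variance, and the sample size; the function name is hypothetical.

```python
def mse_of_scaled_mean(c, mu, sigma2, n):
    """Exact expected squared error of the estimator c * (sample mean)."""
    bias = (c - 1) * mu              # directional error of the estimator
    variance = c ** 2 * sigma2 / n   # spread around the estimator's own mean
    return bias ** 2 + variance


# Hypothetical setting: true mean 1, noise variance 4, four observations.
mu, sigma2, n = 1.0, 4.0, 4

unbiased = mse_of_scaled_mean(1.0, mu, sigma2, n)  # plain sample mean
shrunk = mse_of_scaled_mean(0.5, mu, sigma2, n)    # biased toward zero

# The same shrinkage when the true mean is far from zero.
shrunk_far = mse_of_scaled_mean(0.5, 5.0, sigma2, n)
print(unbiased, shrunk, shrunk_far)
```

With these numbers the shrunken estimator has half the expected squared error of the unbiased one, but the same shrinkage does far worse when the true mean is 5. Whether the trade pays off depends on a quantity you do not know in advance, which is why all else is not always equal.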


“Statistical Bias”

(Part one in a series on "statistical bias", "inductive bias", and "cognitive bias".)

"Bias" as used in the field of statistics refers to directional error in an estimator.  Statistical bias is error you cannot correct by repeating the experiment many times and averaging together the results.

The famous bias-variance decomposition states that the expected squared error is equal to the squared directional error, or bias, plus the squared random error, or variance.  The law of large numbers says that you can reduce variance, not bias, by repeating the experiment many times and averaging the results.
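The claim that averaging removes variance but not bias can be checked with a short simulation. This is a sketch under invented assumptions: a hypothetical instrument that reads 0.5 too high on average, with normally distributed noise; none of the numbers or function names come from the post.

```python
import math
import random
import statistics


def averaged_estimates(n_measurements, n_experiments=2000, truth=10.0,
                       bias=0.5, noise_sd=2.0, seed=0):
    """Repeat an experiment many times; each estimate averages n noisy readings."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_experiments):
        readings = [truth + bias + rng.gauss(0.0, noise_sd)
                    for _ in range(n_measurements)]
        estimates.append(statistics.fmean(readings))
    return estimates


def summarize(estimates, truth=10.0):
    """Return (directional error, variance, mean squared error) of the estimates."""
    directional_error = statistics.fmean(estimates) - truth
    variance = statistics.pvariance(estimates)
    mse = statistics.fmean((e - truth) ** 2 for e in estimates)
    return directional_error, variance, mse


d1, v1, mse1 = summarize(averaged_estimates(1))
d100, v100, mse100 = summarize(averaged_estimates(100))
print(d1, v1, mse1)
print(d100, v100, mse100)
```

Averaging 100 readings instead of one shrinks the variance by roughly a factor of 100, while the directional error stays near 0.5 in both cases. In each row the mean squared error equals the squared directional error plus the variance, up to floating-point rounding.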
