Rationality Requires Common Priors
Late in November 2006 I started this blog, and a month later on Christmas Eve I reported briefly on the official publication (after 8 rejections) of my paper Uncommon Priors Require Origin Disputes. That was twelve years ago, and now Google Scholar tells me that this paper has 17 cites, which is about 0.4% of my 3933 total cites, which I’d say greatly underestimates its value.
Recently I had the good fortune to be invited to speak at the Rutgers Seminar on Foundations of Probability, and I took that opportunity to raise awareness about my old paper. Only about ten folks attended (a famous philosopher spoke nearby at the same time), but this video was taken:
In the video my slides are at times dim, but they can be seen clearly here. Let me now try to explain why my topic is important, and what my result is.
In economics, the most common formal model of a rational agent, by far, is that of a Bayesian. This standard model is also very common in business, political science, statistics, computer science, and many other fields. As there is actually a family of related models, we can use this space to argue about what it means to be “rational”. People argue over various particular proposed “rationality constraints” which limit this space of possibilities to varying degrees.
In economics, the standard model starts with a large (finite) state space, wherein each state resolves all relevant uncertainty; every interesting question is completely answered once you know which state is the true state. Each agent in this model has a prior function which assigns a probability to each state in this space. For any given time and situation an agent’s info can be expressed as a set; at any state, each agent has an info set of states where they know that the true state is somewhere within that set, but don’t know where within that set. Any small piece of info is also expressible as a set; to combine info, you intersect sets.
Given a state space, prior, and info, an agent’s expectation or belief is given by a weighted average, using their prior and conditioned on their info set. That is, all variations in agent beliefs across time or situation are to be explained by variations in their info. We usually assume that info is cumulative, so that each agent knows everything that they have ever known in the past. In order to predict actions, in addition to beliefs, the most common approach is to assume agents maximize expected utility, where each agent has another function that assigns a numerical utility value to each possible state.
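As a minimal sketch of this setup, the Python below builds a tiny state space, a prior, and two pieces of info as sets, then computes beliefs as prior-weighted averages conditioned on an info set. The state names, probabilities, values of X, and the two actions are all made up for illustration; only the structure (prior, info sets, intersection, expected utility) comes from the description above.

```python
from fractions import Fraction

# Hypothetical four-state space; each state settles every question of interest.
states = ["s1", "s2", "s3", "s4"]

# Illustrative prior: one probability per state, summing to 1.
prior = {"s1": Fraction(1, 2), "s2": Fraction(1, 4),
         "s3": Fraction(1, 8), "s4": Fraction(1, 8)}

# X is a quantity whose value is fixed once the state is known.
X = {"s1": 10, "s2": 20, "s3": 30, "s4": 40}

# Illustrative utilities: one number per (action, state) pair.
utility = {
    "stay": {"s1": 0, "s2": 0, "s3": 0, "s4": 0},
    "bet": {"s1": -1, "s2": 1, "s3": 1, "s4": -1},
}


def belief(prior, info, value):
    """Expectation of `value`, weighted by the prior and conditioned on `info`."""
    mass = sum(prior[s] for s in info)
    return sum(prior[s] * value[s] for s in info) / mass


def best_action(prior, info, utility):
    """Action with the highest expected utility given `info`."""
    return max(utility, key=lambda a: belief(prior, info, utility[a]))


# Two pieces of info, each a set of states; combining them is set intersection.
info_a = {"s1", "s2", "s3"}
info_b = {"s2", "s3", "s4"}
combined = info_a & info_b

estimate_before = belief(prior, info_a, X)    # Fraction(110, 7)
estimate_after = belief(prior, combined, X)   # Fraction(70, 3)
action_before = best_action(prior, info_a, utility)    # "stay"
action_after = best_action(prior, combined, utility)   # "bet"
```

The prior never changes here; the estimate and the chosen action change only because the info set shrinks.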
Some people study ways to relax these assumptions, such as by using a set of priors instead of a single prior, by seeking computationally feasible approximations, or by allowing agents to forget info they once knew. Other people focus on adding stronger assumptions. For example, when a situation has a natural likelihood function giving the chances of particular outcomes assuming particular parameter settings, we usually assume that each agent’s prior agrees with this likelihood. Some people offer arguments for why particular priors are natural for particular situations. And models also usually assume that differing agents have the same prior.
One key rationality question is when it is reasonable to disagree with other people. Most intellectuals see disagreement as rational, and are surprised to learn that theory often says otherwise. This issue turns crucially on the common prior assumption. Given uncommon priors, it is easy to disagree, but given common priors it is hard to escape the conclusion that it is irrational to knowingly disagree, in the following sense of “foresee to disagree.” Assume you are now estimating some number X, and also now estimating some other person’s future estimate of X, an estimate that they will make at some future time. There is a difference now between these two numbers, and you will now clearly tell that other person the sign of this difference. They will then take this sign into account when making their future estimate.
In this situation, for standard Bayesians, this sign must equal zero; you can’t both warn them that you expect their estimate will be too high relative to your estimate, and then also still expect them to remain too high. They will instead listen to your warning and correct enough based on it. This sort of result holds nearly exactly for many slight weakenings of the standard rationality assumptions, but not if we assume big prior differences. And we have seen clearly in the lab, and in real life, that humans can in fact often “foresee to disagree” in this sense.
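A toy numerical check can make this concrete. The sketch below uses a simplifying assumption that is stronger than the setup described above: the other person's future info is taken to refine my current info, so that they will know everything I know now. All states, numbers, and the alternative prior are hypothetical. With a common prior, my forecast of their future estimate equals my own estimate; with a different prior on their side, a foreseen difference appears.

```python
from fractions import Fraction

# Hypothetical prior over four states, and a quantity X to estimate.
my_prior = {"a": Fraction(1, 4), "b": Fraction(1, 4),
            "c": Fraction(1, 4), "d": Fraction(1, 4)}
X = {"a": 0, "b": 4, "c": 8, "d": 20}


def expect(value, info, prior):
    """Expectation of `value` under `prior`, conditioned on the info set `info`."""
    mass = sum(prior[s] for s in info)
    return sum(prior[s] * value[s] for s in info) / mass


# My info now: the true state is somewhere in {a, b, c}.
my_info = {"a", "b", "c"}

# Assumption: their future info refines mine. In each state they will
# know which of these cells contains the true state.
their_future_partition = [{"a"}, {"b", "c"}, {"d"}]


def foreseen_difference(their_prior):
    """My forecast of their future estimate of X, minus my own estimate now."""
    their_estimate = {s: expect(X, cell, their_prior)
                      for cell in their_future_partition for s in cell}
    my_forecast = expect(their_estimate, my_info, my_prior)
    return my_forecast - expect(X, my_info, my_prior)


# Common prior: nothing to foresee.
common_case = foreseen_difference(my_prior)

# Uncommon prior (illustrative): they weight state c more than I do.
other_prior = {"a": Fraction(1, 4), "b": Fraction(1, 8),
               "c": Fraction(3, 8), "d": Fraction(1, 4)}
uncommon_case = foreseen_difference(other_prior)
```

Here `common_case` is exactly zero, while `uncommon_case` is positive: I expect them to end up above my estimate, and with different priors nothing forces that gap to close.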
Humans do foresee to disagree, while Bayesians with common priors do not. So are humans rational or irrational here? To answer that question, we must study the arguments for and against common priors. Not just arguments that particular aspects of priors should be common, or that they should be common in certain simple situations. No, here we need arguments that entire prior functions should or should not be the same. And you can look long and hard without finding much on this topic.
Some people simply declare that differing beliefs should only result from differing information, but others are not persuaded by this. Some people note that as expected utility is a sum over products of probability and utility, one can arbitrarily rescale each probability and utility together holding constant that product, and get all the same decisions. So one can assume common priors without loss of generality, as long as one is free enough to change utility functions. But of course this makes uncommon priors also without loss of generality. And we are often clear that we mean different things by probabilities and utilities, and thus are not free to vary them arbitrarily. If it means something different to say that an event is unlikely than it means to say that that event’s outcome differences are less important to you, then probabilities mean something different from utilities.
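The rescaling move can be illustrated with a short sketch. The agent, states, actions, and numbers below are hypothetical. The code swaps the agent's own prior for a different "common" prior and compensates by rescaling utility state by state, so that each probability-utility product, and hence every expected utility, is unchanged. It assumes the new prior gives nonzero probability to every state.

```python
from fractions import Fraction

states = ["rain", "sun"]
actions = ["umbrella", "no_umbrella"]

# Hypothetical agent with its own prior and utility function.
own_prior = {"rain": Fraction(3, 4), "sun": Fraction(1, 4)}
own_utility = {
    "umbrella": {"rain": 5, "sun": 2},
    "no_umbrella": {"rain": 0, "sun": 6},
}

# A different prior that we would like to declare "common".
common_prior = {"rain": Fraction(1, 2), "sun": Fraction(1, 2)}


def expected_utility(prior, utility, action):
    """Sum over states of probability times utility for one action."""
    return sum(prior[s] * utility[action][s] for s in states)


# Rescale utility in each state so that probability * utility is unchanged.
rescaled_utility = {
    a: {s: own_utility[a][s] * own_prior[s] / common_prior[s] for s in states}
    for a in actions
}

original = {a: expected_utility(own_prior, own_utility, a) for a in actions}
rescaled = {a: expected_utility(common_prior, rescaled_utility, a)
            for a in actions}
best_original = max(original, key=original.get)
best_rescaled = max(rescaled, key=rescaled.get)
```

Both descriptions give identical expected utilities and the same best action, which is why the move is available in either direction and cannot by itself settle whether priors should be common.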
And so finally we get to my paper, Uncommon Priors Require Origin Disputes, which is one of the few papers I have ever seen to give a concrete argument on common priors. Most everyone who hears it seems persuaded, yet it is rarely mentioned when people summarize what we know about rationality in Bayesian frameworks. If you read the rest of this post, at least you will know.
My argument is pretty simple, though I needed a clever construction to let me say it formally. If the beliefs of a person are described in part by a prior, then that prior must have come from somewhere. My key idea is to use beliefs about the origins of priors to constrain rational priors. For example, if you knew that a few minutes ago someone stuck a probe into your brain and randomly changed your prior, you would probably want to reverse that change. So not all causal origins of priors seem equally rational.
However, there’s one big obstacle to reasoning about prior origins. The natural way to talk about origins is to make and use some sort of probability distribution over different possible priors, origin features, and other events. But in every standard Bayesian model, the priors of all agents are common knowledge. That is, priors are all the same in all possible states, so no one can have any degree of uncertainty about them, or about what anyone else knows about them. Everyone is always completely sure about who has what priors.
To evade this obstacle, I chose to embed a standard model within a larger standard model. So there is a model and a pre-model. While the ordinary model has ordinary states and priors, the pre-model has pre-states and pre-priors. It is in the pre-model that we can reason about the causal origins of the priors of the model.
The pre-states of the pre-model are simply pairs of an ordinary state and an ordinary prior assignment, that says which agents get which priors. So a pre-prior is a probability distribution over the set of all combinations of possible states in the ordinary model, and possible prior assignments for that ordinary model. Each agent would initially know nothing about anything, including about ordinary states or who will get which prior. Their pre-prior would summarize their beliefs in this state of ignorance. Then at some point all agents would have learned about which prior they and the other agents will be using. From this point forward, agent info sets are entirely within an ordinary model, where their prior is common knowledge and gives them ordinary beliefs about ordinary states. So from this point on, an ordinary model is sufficient to describe everyone’s beliefs.
The key pre-rationality constraint that I propose is to have pre-priors agree with priors when they can condition on the same info. So if we condition an agent’s pre-prior on the assignment of who gets which priors, and then ask for the probability of some ordinary event, we should get the same answer as when we simply ask their prior for the probability of that ordinary event. And merely inspecting the form of this simple key equation is enough to draw my key conclusion: Within any single pre-prior that satisfies the pre-rationality condition, all ordinary events are conditionally independent of other agents’ priors, given that agent’s prior.
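Written out, the condition and the step to conditional independence might look as follows. The notation is an assumption chosen for this sketch and may differ from the paper's: \tilde{p}_i is agent i's pre-prior, (p_1, \dots, p_n) is a prior assignment, p_{-i} collects the priors of everyone other than i, E is any ordinary event, and the set of possible assignments is taken to be finite so that the sum is well defined.

```latex
% Assumed notation: \tilde{p}_i = pre-prior of agent i; p_{-i} = priors of all agents but i.
\begin{align*}
\tilde{p}_i(E \mid p_1, \dots, p_n) &= p_i(E)
  && \text{(pre-rationality)} \\
\tilde{p}_i(E \mid p_i)
  &= \sum_{p_{-i}} \tilde{p}_i(p_{-i} \mid p_i)\, \tilde{p}_i(E \mid p_i, p_{-i}) \\
  &= \sum_{p_{-i}} \tilde{p}_i(p_{-i} \mid p_i)\, p_i(E) \\
  &= p_i(E) \;=\; \tilde{p}_i(E \mid p_i, p_{-i}).
\end{align*}
```

The right side of the pre-rationality condition mentions only p_i, so once p_i is given, further conditioning on p_{-i} cannot change the probability of E.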
So, within a pre-prior, an agent believes that ordinary events and their own prior are informative about each other; priors are different when events are different, and in the sensible way. But also within this pre-prior, each agent believes that the priors of other agents are not otherwise informative about ordinary events. The priors of other agents can only predict ordinary events by predicting the prior of this agent; absent that connection, ordinary events and other priors do not predict each other.
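A small numerical sketch can illustrate this pattern. Everything below is hypothetical: two ordinary states, two candidate priors labeled "optimist" and "pessimist", and arbitrary correlated weights over who gets which prior. The pre-prior for agent 1 is built to satisfy pre-rationality by construction, and the code then checks what the other agent's prior does and does not predict.

```python
from fractions import Fraction
from itertools import product

states = ["w1", "w2"]

# Hypothetical candidate priors over the ordinary states, keyed by label.
candidates = {
    "optimist": {"w1": Fraction(3, 4), "w2": Fraction(1, 4)},
    "pessimist": {"w1": Fraction(1, 4), "w2": Fraction(3, 4)},
}
labels = list(candidates)

# Illustrative weights over prior assignments (mine, theirs), correlated on purpose.
assignment_weight = {
    ("optimist", "optimist"): Fraction(4, 10),
    ("optimist", "pessimist"): Fraction(1, 10),
    ("pessimist", "optimist"): Fraction(2, 10),
    ("pessimist", "pessimist"): Fraction(3, 10),
}

# Agent 1's pre-prior over pre-states (state, my prior, their prior).
# Pre-rationality holds by construction: given the assignment, the state
# is distributed according to my own prior.
pre_prior = {
    (w, mine, theirs): assignment_weight[(mine, theirs)] * candidates[mine][w]
    for w, mine, theirs in product(states, labels, labels)
}


def prob(pred):
    """Pre-prior probability of the set of pre-states satisfying `pred`."""
    return sum(p for pre_state, p in pre_prior.items() if pred(*pre_state))


def prob_w1_given(given):
    """Pre-prior probability that the state is w1, conditioned on `given`."""
    joint = prob(lambda w, m, t: w == "w1" and given(w, m, t))
    return joint / prob(given)


given_both = {(m0, t0): prob_w1_given(lambda w, m, t: m == m0 and t == t0)
              for m0, t0 in product(labels, labels)}
given_mine = {m0: prob_w1_given(lambda w, m, t: m == m0) for m0 in labels}
given_theirs = {t0: prob_w1_given(lambda w, m, t: t == t0) for t0 in labels}
```

In this example `given_both[(m, t)]` equals `given_mine[m]` for every assignment, so the other agent's prior adds nothing once mine is known. Yet `given_theirs` differs across labels (7/12 versus 3/8), because their prior is correlated with mine and so predicts events only through that correlation.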
I summarize this as believing that “my prior had special origins.” My prior was created via a process that caused it to correlate with other events in the world, but the priors of other agents were not created in this way. And of course this belief that you were made special is hard to square with many common beliefs about the causal origins of priors. This belief is not consistent with your prior being encoded in your genes via the usual processes of genetic inheritance and variation. It is similarly not consistent with many common theories of cultural inheritance and variation.
The obvious and easy way to not believe that your prior resulted from a special unusual origin process is to have common priors. And so this pre-rationality constraint can be seen as usually favoring common priors. I thus have a concrete argument that Bayesians should have common priors, an argument based on the reasonable rationality consideration that not all causal origins of priors are equally rational. If priors should be consistent with plausible beliefs about their causal origins, then priors must typically be common.