Why Not Impossible Worlds?

Physicists, statisticians, computer scientists, economists, and many philosophers rely on the following standard ("Bayesian") approach to analyzing and modeling information:

  1. Identify a set of "possible worlds," i.e., self-consistent sets of answers to all relevant questions.
  2. Express the information in any situation as clues that can exclude some worlds from consideration.
  3. Assign a "reasonable" probability distribution over all these worlds.
  4. Calculate any desired expected value in any information situation by averaging over non-excluded worlds.
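The four steps above can be sketched in a few lines of Python; the worlds, the clue, the uniform prior, and the quantity being estimated are all invented here purely for illustration:

```python
from fractions import Fraction

# Step 1: possible worlds as self-consistent sets of answers to the
# relevant questions (here, just two yes/no questions).
worlds = [
    {"rain": True,  "traffic": True},
    {"rain": True,  "traffic": False},
    {"rain": False, "traffic": True},
    {"rain": False, "traffic": False},
]

# Step 2: information as a clue that excludes some worlds -- say we
# learn that there is traffic.
def consistent(w):
    return w["traffic"]

# Step 3: a "reasonable" prior over all worlds (uniform, for simplicity).
prior = {i: Fraction(1, len(worlds)) for i in range(len(worlds))}

# Step 4: any expected value, by averaging over the non-excluded worlds.
def expected(value, clue):
    kept = [i for i, w in enumerate(worlds) if clue(w)]
    total = sum(prior[i] for i in kept)                    # renormalize
    return sum(prior[i] * value(worlds[i]) for i in kept) / total

p_rain = expected(lambda w: 1 if w["rain"] else 0, consistent)
print(p_rain)  # 1/2: under this prior, traffic alone tells us nothing about rain
```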

This is a normative ideal, not a practical exact procedure.  That is, we try to correct for any "bias," or systematic deviation between what a complete analysis of this sort would give and what we actually believe.

This approach has been applied to many kinds of "possibilities."  In computer science, worlds describe different possible states of a computer.  In physics, worlds describe different possible arrangements of particles in space.  Centered possible worlds can describe uncertainty about where you are in a physical world.  In scientific inference, one considers worlds with different physical laws.  In game theory, one considers any outcome that any player thinks is possible, or thinks that other players think is possible, and so on.

What if we used "impossible worlds," i.e., not necessarily self-consistent sets of answers to relevant questions?   The idea would be to analyze and model situations where we are prone to errors and other limitations when we reason about logic and "a priori" truths, i.e., claims which would be either true or false in all ordinary possible worlds.  (E.g., "All bachelors are unmarried.")  In such situations, our information includes not only clues about what atoms are where, but also clues about what sets of answers are consistent with each other.   

A lone agent with ideal reasoning abilities would find no useful clues about a priori truths; while he could calculate expected values, his beliefs about such things would never change with time or context. The beliefs of real social creatures, however, do change with time and context, and reasonably so.   Learning about arguments for or against claims, and about the opinions of various people on such claims, provides us with relevant reasons for changing our beliefs.   
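A minimal sketch of such reasonable belief change, with all numbers invented for illustration: treat "claim C is a theorem" as an uncertain proposition and update on noisy expert testimony by Bayes' rule.

```python
# Two "worlds": one where claim C is a theorem, one (impossible) where
# it is not.  A fallible reasoner starts uncertain and updates on
# expert endorsements, which are informative but imperfect clues.
def update(prior, likelihood_if_true, likelihood_if_false):
    """One step of Bayes' rule on a binary hypothesis."""
    num = prior * likelihood_if_true
    return num / (num + (1 - prior) * likelihood_if_false)

p = 0.5  # initial credence that C is a theorem
# Hypothetical reliability: experts endorse true theorems 90% of the
# time, and mistakenly endorse false ones 20% of the time.
for endorsement in [True, True, False]:
    if endorsement:
        p = update(p, 0.9, 0.2)
    else:
        p = update(p, 0.1, 0.8)
print(round(p, 3))  # credence after two endorsements and one dissent
```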

Through the use of impossible worlds, our standard approach to information seems capable of usefully describing such imperfect logic situations.  And I see no reason not to use them this way.   Thus I conclude that standard "agreeing to disagree" results apply to disagreements about a priori truths. 

  • Bill

    Is this an example? I want to see if I understand.

    Let’s think about the 1000th digit of pi. I would describe the numbers 0-9 as its “possible” values. However, in fact, nine of those values are impossible; by reason alone (i.e. a priori), we could find the true value, and any world where pi had a different value is an impossible world.

    If I had to take a bet right now, I would assign a 10% chance to each value, and bet according to these probabilities. I therefore would use probabilities to represent my lack of knowledge of this a priori truth.
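    Bill's example can be made concrete: the "a priori" answer is mechanically computable. Here is a sketch using Machin's formula with plain integer arithmetic (the formula and method are standard; the guard-digit count is a judgment call):

```python
def arctan_inv(x, scale):
    """Integer approximation of arctan(1/x) * scale, via the Taylor series."""
    total, term, k = 0, scale // x, 0
    while term:
        t = term // (2 * k + 1)
        total = total - t if k % 2 else total + t
        term //= x * x
        k += 1
    return total

def pi_decimals(n):
    """Digits of pi as a string: '3' followed by n digits after the point."""
    guard = 10                      # extra digits to absorb truncation error
    scale = 10 ** (n + guard)
    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_scaled = 16 * arctan_inv(5, scale) - 4 * arctan_inv(239, scale)
    return str(pi_scaled // 10 ** guard)

digits = pi_decimals(1000)
print(digits[1000])  # the 1000th digit after the decimal point
```

    So reason alone does settle the bet; the 10% probabilities represent only our ignorance before (or instead of) doing the computation.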

  • Paul Gowder

    This is very interesting — but I tend to still think step #3 is the real kicker, and want to resurrect my arbitrariness objection. Here’s why. An a posteriori possibility can have a specific probability assigned to it by a Bayesian based on prior observed similar events: if 80% of the French waiters Bill the Bayesian has previously observed are rude, Bill can reasonably assign .8 to the probability of a specific French waiter’s being rude.

    However, there are big problems with doing this in the context of a priori possibilities. By definition, a priori truths aren’t known by observation. This means that there will be no pattern of prior observations about similar a priori truths on which you can base your reasonable probability estimate about the a priori possibility at issue. The best you could do would be to use fairly dubious approximations based on things like level of conviction. (For example, if I’ve heard a seemingly deductive argument for P, I could in theory go back and consider the percentage of times I’ve heard similarly convincing arguments for P’, P”, etc., where P’n has turned out to be false. But that’s pretty weak as a basis for probability assignment, not least because by problematizing our evaluation of deductive arguments, we’ve thrown our only way to determinately believe today in the truth or falsity of the various prior P’ns, and thus thrown out our only way of getting a prior percentage to justify our present probability estimate.)

    The other option would be what Bill’s comment just suggested: just assign equal probabilities where unknown. I think this is dubious for two reasons. First, and maybe this is some vestigial frequentist part of me talking, I don’t see why there’s any more reason to assign, say, .5 to one of two wholly unknown probabilities than there is to assign .1 to it. Second, once you admit of things like impossible possible worlds, it seems like you have an infinite number of [impossible] possible worlds with unknown probability. Following the “assign an equal probability to each option” rule would require assigning each a probability of zero, or as close to zero as you’d like (the limit of 1/x as x approaches infinity, no?).

  • I hope this question doesn’t take over the thread, but, Robin, do you think moral agreement should apply between a human and any other possible mind?

  • Bill, yes, that is a fine example, though I’m not sure the base ten digits are equally distributed in pi.

    Paul, I fear you do not understand what a prior is; the things you call priors, that you get from previous observations, are not priors. They are posteriors on previous data. Priors are counterfactual beliefs in a situation of minimum possible information. Every prior in any context is arbitrary in the sense that bothers you.

  • Eliezer, assuming we are talking about morals as usually understood, and not as preferences or expressions, then my guess is yes. But I admit that what I’ve managed to show formally is more limited. And it could be that what all minds should agree on is complete uncertainty about moral truths.

  • Neel Krishnaswami

    “Ideal reasoning capabilities” seems to very nearly assume the conclusion to me. Even if we assume we have an agent who can compute any computable fact, it doesn’t follow that her beliefs would never change. That’s because it takes time to compute things, and as she learns new mathematical truths her beliefs about the world will change. (In fact, it is precisely this intuition that classical mathematicians use to model intuitionistic mathematics. You give a Kripke semantics in which worlds are collections of mathematical deductions. One world precedes another if its facts entail the facts in the other. The intuition is that each world represents the knowledge of an ideal mathematician, and as time passes the mathematician deduces more and more truths.)

    A related point is that even if you do assume perfect reasoning abilities with classical (or intuitionistic) logic, you can’t work with inconsistent sets of facts. That’s because perfect reasoning means that an agent knows the deductive closure of whatever she knows, and if you know inconsistent facts (eg, A and not-A) you can immediately deduce everything. So inconsistency is trivial in the presence of perfect reasoning.
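    The explosion point can be stated formally; in Lean, for instance, any proposition whatsoever follows from a contradiction:

```lean
-- Ex falso quodlibet: from A and ¬A, any B follows, so a deductively
-- closed inconsistent theory contains every sentence.
example (A B : Prop) (h : A) (hn : ¬A) : B :=
  absurd h hn
```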

    One possible out is to restrict the kinds of deduction you are allowed to perform, so that you can have nontrivial inconsistencies. This is the approach taken by paraconsistent logicians, though not, in my opinion, successfully.

  • Neel, I think you have misread me. The “ideal reasoning” descriptor was presented as an extreme *contrast* to my proposal, and was intended to include no computation time limitations.

  • Neel Krishnaswami

    Thanks for the correction! I think I misread you because I was looking for a story you wanted to tell about the relationship between impossible worlds. Presumably, you can move from one world to another by learning new things or deducing new things. Do you have any ideas about modelling that — what are permissible and impermissible moves?

  • Paul Gowder

    Robin: I’m sure you have a much deeper understanding of Bayesian reasoning than I do, so I’ll take your word for that — but I don’t see what good it does: what’s the justification for acting based on probabilities conditioned on that sort of belief? If I happen to hold priors associated with divine command, even though those priors are totally unreasonable, what can a Bayesian say to me to come to agreement about the propositions associated with those commands?

  • Paul, pretty much every probability is in the end conditioned on priors, which by definition are based on no data. Perhaps that will make you swear off speaking of probabilities ever again, and perhaps we should explore this issue further in future posts. It is not the issue in this post, however.

  • Here is a paper I found helpful. The first part is basically a tutorial on the possible-world formalism and shows how to use it to solve classic logic problems like the “two brothers” and the “three hats”. It’s a draft of a widely cited paper by Geanakoplos:


  • Imagine I’m thinking about some mathematical conjecture, say P!=NP. I think hard about it for years but can’t find a proof. Many smart people spend a lot of time on it but can’t find a proof either. But yet we still believe that P!=NP is highly likely. Somehow we must have a high prior for P!=NP and we must also believe that the probability of finding a proof for P!=NP is low even if P!=NP is true. But why? Where do these beliefs come from?

    I’ve previously suggested that we interpret priors for possible worlds as representing how much one cares about each possible world, which explains (or rather removes the need to explain) where priors come from. But that gambit doesn’t work for impossible possible worlds.

    Clearly the way we reason about a priori truths does have a Bayesian-like element, but the rest of it is still rather mysterious, at least to me.

  • Wei, yes, decision theory only cares about the combination of marginal value of a world and probability of a world. And I don’t see why that can’t be applied to impossible worlds as well. But I don’t think it really helps us answer the question of how to choose priors.

  • Paul Gowder

    Robin: It is an issue in this post, because it’s a question of how far you have to go. Even if every probability “in the end” is conditioned on priors which are based on no data, it matters how far “in the end” happens to be. In ordinary reasoning, we give a lot more credibility to claims that are based on a lot of actual evidence before we get to the giant assumption, as opposed to ones that go roughly “unknown prior, thus P.”

    It seems to me that part of the problem with assigning probabilities to a priori claims is just that. The reasoning goes “Assuming Q, the probability of P is X,” rather than “Here’s a bunch of observed evidence consistent with the probability of P being X, if Q is true,” which is how a posteriori claims run.

  • Neel Krishnaswami

    Doesn’t Solomonoff induction give a good (if idealized) model of how to assign priors? The procedure there is to take the space of models, and give each model that matches known observations a prior probability of 2^-K, where K is the Kolmogorov complexity of the model. This assignment of priors satisfies both Occam’s razor (it favors simpler models) and the principle of multiplicity (no model that works is given a zero probability). Then you update your beliefs on new evidence according to Bayes’s rule. It’s possible to prove that this assignment of priors will always rapidly converge to the true model.

    The Kolmogorov complexity is uncomputable, but you can look at computable learning schemes as approximations to Solomonoff induction.
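    A toy illustration of the idea, heavily simplified: real Solomonoff induction ranges over all programs and is uncomputable, whereas here "programs" are just repeating binary patterns and "complexity" is pattern length.

```python
from itertools import product

def predict_next(observed, max_len=4):
    """Weight each repeating pattern by 2^-length, keep those consistent
    with the observations, and take the weighted vote on the next symbol."""
    votes = {}
    for length in range(1, max_len + 1):
        for pat in product("01", repeat=length):
            pattern = "".join(pat)
            expansion = pattern * (len(observed) // length + 2)
            if expansion.startswith(observed):      # model matches the data
                weight = 2.0 ** -length             # simpler = more probable
                nxt = expansion[len(observed)]
                votes[nxt] = votes.get(nxt, 0.0) + weight
    return max(votes, key=votes.get)

print(predict_next("010101"))  # "0": the simplest consistent pattern "01" dominates
```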

  • Neel,
    Solomonoff induction has serious problems:
    #1 if you weigh things using 2^-k, you don’t get a probability distribution (the sum doesn’t converge). Fixing this will create an arbitrary distortion.
    #2 the underlying machine is also arbitrary.
    #3 your prior will depend on the language with which you describe the world (see “Goodman’s grue”).

    Basically, you can’t avoid bias. We can talk about this offline.

  • Paul, you seem to be recommending “withhold judgment on any issue until your beliefs don’t depend much on a prior.” At the very best, this is equivalent to some new prior. At worst, it is incoherent. You can’t escape having estimates, and you can’t escape having them depend on a prior.

  • Robin,
    You may be interested in Fitelson’s work on logical omniscience, and his treatment of “logical learning”: http://fitelson.org/amsterdam.pdf

  • Gustavo, yes, that is an example of what I consider analysis using impossible worlds.

  • tweedledee

    A) I understand 1 above as implying that each person is working with the same set of possible worlds, but I’m wondering if the no-disagreement theorem has been extended to agents with partitions that are not made up of the same basic possible-world elements (that is, a world may be in one agent’s partition, but not the other’s). This seems a requisite for substantial real-world applicability to me, because I don’t think that there is such a thing as a set of all possible worlds, be it finite or infinite. The argument is a relatively straightforward extension of some arguments that have been developed in reasoning about omniscience: The set of all possible worlds must contain all possible facts. That is, it must contain all possible sheep, ponies, rocks, etc. … and sentences. The problem is that there can be no set of all sentences. If there were, there would be a power set of all sentences of a higher cardinality, but there would also have to be a sentence about each element of the power set. Thus the set of all sentences would have a higher cardinality than itself, a contradiction. Since there can’t be a set of all sentences, and a set of all worlds would have to contain all sentences, there therefore cannot be a set of all possible worlds.

    My intuition based on this is that one can still have a realistic Bayesian model, but it must be one where agents constantly learn of other possible worlds from reflection and from other agents, and never cease adding possible worlds. So if it takes time and effort to come to agreement, it may or may not ever actually happen, because new possible worlds will constantly be added to partitions, requiring time and effort to compute posteriors in a way that cannot be anticipated beforehand.

    B) As for impossible worlds, it seems to me that each impossible world would have to contain all events, insomuch as I understand them as implying that the principle of non-contradiction does not hold. Since each sentence in an impossible world can be both true and false, I don’t see how an impossible world can fit into any kind of information-partition model. All impossible worlds, it seems to me, would have to be both in and out of every information partition, making the calculation of probabilities impossible.

  • Twe, if you follow the “impossible worlds” link I gave you will find papers explaining how impossible worlds can be sensibly distinguished; they do not each need to include all events. And I think people should consider the same set of worlds. Remember, this is an ideal standard, not a description of actual details of reasoning. Yes there are special complications when dealing with infinities, but I don’t think it is plausible to say we disagree because of infinities, and if only the world were finite we would not disagree. After all, we do not in fact know that our physical world is not finite.

  • Robin, if we don’t have a model of where priors come from, or if the model says everyone is entitled to his own personal priors, what use are the no-disagreement theorems that assume common priors?

    Tweedledee, it’s not clear to me that the set of all possible worlds must contain the set of all sentences. But your idea of constantly adding new possible worlds is an interesting one, and perhaps can help solve the problem I posed at http://groups-beta.google.com/group/everything-list/browse_frm/thread/c7442c13ff1396ec
    I invite you to continue the discussion there.

  • Robin,

    On the one hand, you said:
    “I think people _should_ consider the same set of worlds. ”
    This seems to imply that each individual should consider the same set of worlds across time.

    On the other hand, you also said:
    “The beliefs of real social creatures [about a priori truths], however, do change with time and context, and _reasonably so_.” (emphasis mine, in both quotes)
    i.e. people should update their beliefs about a priori truths.

    Together, these two imply that people should update their probabilities over the set of worlds, but they should *never* rule out a world. This reminds me of Cheeseman’s position in the debate with Halpern: http://www.cs.cornell.edu/home/halpern/papers/cheeseman.pdf

    Do you actually subscribe to Cheeseman’s position, or does my interpretation of your statements mix up “should”s that really correspond to different modalities? (just like “possible” can mean “physically possible”, “logically possible”, “epistemically possible”, etc)

  • Wei, I don’t think I claimed we can’t have an answer to how to choose priors; just that we can’t escape the question. I don’t think everyone is entitled to choose their own priors. I’m just trying to keep each of these posts’ conversations limited to a particular topic.

    Gustavo, as a practical matter, given how error prone our reasoning is, I don’t see how we could ever completely rule out any ordinary claim (of ordinary complexity). I worked with Cheeseman for five years, by the way.

  • If one has a state space containing impossible worlds, it becomes completely unreasonable to assume a common prior. Certainly, we cannot call a person irrational for having no impossible worlds with positive probability in her state space.

  • Michael, a person who realizes that they can make errors in logic does seem irrational for having no impossible worlds in their state space. If one happens to know that a world is impossible, one should consider that to be information, not a prior.

  • “Michael, a person who realizes that they can make errors in logic does seem irrational for having no impossible worlds in their state space.”

    In such a world one would also have to consider unawareness, which is incompatible with standard state space models.

    Is everyone born with a possibility correspondence that makes one consider that all sets of reals are Lebesgue measurable? I don’t think one can model knowledge of abstract entities the way one models a posteriori knowledge. You would basically run into the Benacerrafian problems haunting platonism: Are the natural numbers as sets the von Neumann or the Zermelo natural numbers?