Physicists, statisticians, computer scientists, economists, and many philosophers rely on the following standard ("Bayesian") approach to analyzing and modeling information: Identify a set of "possible worlds," i.e., self-consistent sets of answers to all relevant questions.
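As a toy illustration of this setup (the worlds, questions, and numbers below are invented for the example, not part of any particular formalism), worlds can be finite assignments of answers, a prior a probability mass function over them, and learning just Bayesian conditioning:

```python
from fractions import Fraction

# Toy possible-worlds model: each world is a self-consistent set of
# answers to the relevant questions; a prior spreads probability mass
# over the worlds; learning an answer means conditioning on it.
worlds = [
    {"rain": True,  "sprinkler": True},
    {"rain": True,  "sprinkler": False},
    {"rain": False, "sprinkler": True},
    {"rain": False, "sprinkler": False},
]

prior = {i: Fraction(1, 4) for i in range(len(worlds))}  # uniform prior

def condition(belief, question, answer):
    """Keep the worlds consistent with the observed answer and
    renormalize the surviving probability mass."""
    kept = {i: p for i, p in belief.items() if worlds[i][question] == answer}
    total = sum(kept.values())
    return {i: p / total for i, p in kept.items()}

posterior = condition(prior, "rain", True)
p_sprinkler = sum(p for i, p in posterior.items() if worlds[i]["sprinkler"])
# p_sprinkler == Fraction(1, 2)
```

Note that a world dropped by conditioning keeps zero mass forever, which is one reason the thread below worries about worlds one wrongly treats as impossible.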


"Michael, a person who realizes that they can make errors in logic does seem irrational for having no impossible worlds in their state space."

In such a world one would also have to consider unawareness, which is incompatible with standard state space models.

Is everyone born with a possibility correspondence that makes one consider all sets of reals to be Lebesgue measurable? I don't think one can model knowledge of abstract entities the way one models a posteriori knowledge. You would basically run into the Benacerrafian problems haunting platonism: are the natural numbers, as sets, the von Neumann or the Zermelo natural numbers?

Michael, a person who realizes that they can make errors in logic does seem irrational for having no impossible worlds in their state space. If one happens to know that a world is impossible, one should consider that to be information, not a prior.

If one has a state space containing impossible worlds, it becomes completely unreasonable to assume a common prior. Certainly, we cannot call a person irrational for having no impossible worlds with positive probability in her state space.

Wei, I don't think I claimed we can't have an answer to how to choose priors; just that we can't escape the question. I don't think everyone is entitled to choose their own priors. I'm just trying to keep each of these post conversations limited to a particular topic.

Gustavo, as a practical matter, given how error-prone our reasoning is, I don't see how we could ever completely rule out any ordinary claim (of ordinary complexity). I worked with Cheeseman for five years, by the way.

Robin,

On the one hand, you said: "I think people _should_ consider the same set of worlds." This seems to imply that each individual should consider the same set of worlds across time.

On the other hand, you also said: "The beliefs of real social creatures [about a priori truths], however, do change with time and context, and _reasonably so_." (Emphasis mine, in both quotes.) I.e., people should update their beliefs about a priori truths.

Together, these two imply that people should update their probabilities over the set of worlds, but they should *never* rule out a world. This reminds me of Cheeseman's position in the debate with Halpern: http://www.cs.cornell.edu/h...

Do you actually subscribe to Cheeseman's position, or does my interpretation of your statements mix up "should"s that really correspond to different modalities? (just like "possible" can mean "physically possible", "logically possible", "epistemically possible", etc)

Robin, if we don't have a model of where priors come from, or if the model says everyone is entitled to his own personal priors, what use are the no-disagreement theorems that assume common priors?

Tweedledee, it's not clear to me that the set of all possible worlds must contain the set of all sentences. But your idea of constantly adding new possible worlds is an interesting one, and perhaps can help solve the problem I posed at http://groups-beta.google.c... I invite you to continue the discussion there.

Twe, if you follow the "impossible worlds" link I gave, you will find papers explaining how impossible worlds can be sensibly distinguished; they do not each need to include all events. And I think people should consider the same set of worlds. Remember, this is an ideal standard, not a description of actual details of reasoning. Yes, there are special complications when dealing with infinities, but I don't think it is plausible to say we disagree because of infinities, and that if only the world were finite we would not disagree. After all, we do not in fact know that our physical world is not finite.

A) I understand 1 above as implying that each person is working with the same set of possible worlds, but I’m wondering if the no-disagreement theorem has been extended to agents with partitions that are not made up of the same basic possible-world elements (that is, a world may be in one agent’s partition but not another’s). This seems a requisite for substantial real-world applicability to me, because I don’t think that there is such a thing as a set of all possible worlds, be it finite or infinite. The argument is a relatively straightforward extension of some arguments that have been developed in reasoning about omniscience: the set of all possible worlds must contain all possible facts. That is, it must contain all possible sheep, ponies, rocks, etc. … and sentences. The problem is that there can be no set of all sentences. If there were, there would be a power set of all sentences of a higher cardinality, but there would also have to be a sentence about each element of the power set. Thus the set of all sentences would have a higher cardinality than itself, a contradiction. Since there can’t be a set of all sentences, and a set of all worlds would have to contain all sentences, there can therefore be no set of all possible worlds.
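The set-theoretic core of this argument is Cantor's theorem; a compact version, in notation of my own choosing:

```latex
% Suppose S is the set of all sentences. For each A \subseteq S there is
% a sentence s_A about A, and the map A \mapsto s_A is injective, so
|\mathcal{P}(S)| \le |S|.
% But by Cantor's diagonal argument, for any f\colon S \to \mathcal{P}(S)
% the set D below is not in the image of f, so no such f is onto:
D = \{\, x \in S : x \notin f(x) \,\}, \qquad |S| < |\mathcal{P}(S)|.
% The two inequalities contradict each other, so no set of all sentences
% (and hence no set of all possible worlds, on this argument) exists.
```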

My intuition based on this is that one can still have a realistic Bayesian model, but it must be one where agents constantly learn of other possible worlds from reflection and from other agents, and they never cease adding possible worlds. So if coming to agreement takes time and effort, it may or may not ever actually happen, because new possible worlds will constantly be added to partitions, requiring time and effort to compute posteriors in a way that cannot be anticipated beforehand.

B) As for impossible worlds, it seems to me that each impossible world would have to contain all events, inasmuch as I understand them as implying that the principle of non-contradiction does not hold. Since each sentence in an impossible world can be both true and false, I don’t see how an impossible world can fit into any kind of information-partition model. All impossible worlds, it seems to me, would have to be both in and out of every information partition, making the calculation of probabilities impossible.

Gustavo, yes, that is an example of what I consider analysis using impossible worlds.

Robin, you may be interested in Fitelson's work on logical omniscience and his treatment of "logical learning": http://fitelson.org/amsterd...

Paul, you seem to be recommending "withhold judgment on any issue until your beliefs don't depend much on a prior." At the very best, this is equivalent to some new prior. At worst, it is incoherent. You can't escape having estimates, and you can't escape having them depend on a prior.

Neel, Solomonoff induction has serious problems:
#1 If you weight things by 2^-k, you don't get a probability distribution (the sum doesn't converge). Fixing this will create an arbitrary distortion.
#2 The underlying machine is also arbitrary.
#3 Your prior will depend on the language with which you describe the world (see "Goodman's grue").

Basically, you can't avoid bias. We can talk about this offline.
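Objection #1 can be made concrete with a minimal sketch (the binary alphabet is an illustrative choice): there are 2^k strings of length k, so weighting each by 2^-k puts total mass 1 on every length, and the sum over lengths diverges.

```python
# Objection #1 made concrete: with a binary alphabet there are 2**k
# strings of length k, so a weight of 2**-k per string puts total
# mass 1.0 on each length, and the sum over lengths grows without bound.
def mass_at_length(k: int) -> float:
    return (2 ** k) * (2.0 ** -k)  # count of strings times weight each

partial_sum = sum(mass_at_length(k) for k in range(1, 101))
# partial_sum == 100.0; adding more lengths only increases it.
```

The standard repair is to restrict to a prefix-free machine, where the Kraft inequality bounds the total weight by 1; but fixing a particular machine is exactly objection #2.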

Doesn't Solomonoff induction give a good (if idealized) model of how to assign priors? The procedure there is to take the space of models and give each model that matches known observations a prior probability of 2^-K, where K is the Kolmogorov complexity of the model. This assignment of priors satisfies both Occam's razor (it favors simpler models) and the principle of multiplicity (no model that works is given a zero probability). Then you update your beliefs, when you get new evidence, according to Bayes's rule. It's possible to prove that this assignment of priors will always rapidly converge to the true model.

The Kolmogorov complexity is uncomputable, but you can look at computable learning schemes as approximations to Solomonoff induction.
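A hedged sketch of the kind of computable approximation meant here: hand-assigned description lengths stand in for the (uncomputable) Kolmogorov complexity, hypotheses that contradict the data are discarded, and the survivors are renormalized. All hypothesis names and lengths below are invented for illustration.

```python
from fractions import Fraction

# Toy approximation of Solomonoff-style priors: each hypothesis carries
# a hand-assigned description length standing in for its (uncomputable)
# Kolmogorov complexity, and gets prior weight 2^-length.
def predict_constant(n): return 1
def predict_identity(n): return n
def predict_square(n): return n * n

hypotheses = [
    ("constant-1", 3, predict_constant),
    ("identity",   4, predict_identity),
    ("square",     6, predict_square),
]

def posterior(observations):
    """Zero out hypotheses that contradict the (n, y) observations,
    weight the rest by 2^-length, and renormalize."""
    weights = {}
    for name, length, f in hypotheses:
        if all(f(n) == y for n, y in observations):
            weights[name] = Fraction(1, 2 ** length)
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Observing f(1) = 1 is consistent with all three hypotheses, and the
# simpler (shorter) models receive more of the posterior mass.
print(posterior([(1, 1)]))
# Observing f(2) = 4 as well rules out all but "square".
print(posterior([(1, 1), (2, 4)]))
```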

Robin: It is an issue in this post, because it's a question of how far you have to go. Even if every probability "in the end" is conditioned on priors which are based on no data, it matters how far "in the end" happens to be. In ordinary reasoning, we give a lot more credibility to claims that are based on a lot of actual evidence before we get to the giant assumption, as opposed to ones that go roughly "unknown prior, thus P."

It seems to me that part of the problem with assigning probabilities to a priori claims is just that. The reasoning goes "Assuming Q, the probability of P is X," rather than "Here's a bunch of observed evidence consistent with the probability of P being X, if Q is true," which is how a posteriori claims run.

Wei, yes, decision theory only cares about the combination of marginal value of a world and probability of a world. And I don't see why that can't be applied to impossible worlds as well. But I don't think it really helps us answer the question of how to choose priors.

Imagine I'm thinking about some mathematical conjecture, say P!=NP. I think hard about it for years but can't find a proof. Many smart people spend a lot of time on it but can't find a proof either. Yet we still believe that P!=NP is highly likely. Somehow we must have a high prior for P!=NP, and we must also believe that the probability of finding a proof for P!=NP is low even if P!=NP is true. But why? Where do these beliefs come from?
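The reasoning in this paragraph can at least be written as an explicit Bayes update; the numbers below are invented, and choosing them is precisely the unexplained part.

```python
# The P != NP belief as a Bayes update. All numbers are made up for
# illustration; only the structure is the point. Evidence E = "decades
# of effort, no proof found either way."
p_neq = 0.5              # prior that P != NP
p_e_given_neq = 0.9      # no proof found is unsurprising if P != NP
p_e_given_eq = 0.3       # if P == NP, an algorithm might well have been found

p_e = p_neq * p_e_given_neq + (1 - p_neq) * p_e_given_eq
posterior_neq = p_neq * p_e_given_neq / p_e
# posterior_neq comes out to 0.75 (up to float rounding): the failed
# search shifts belief toward P != NP, but only because the likelihoods
# were set asymmetrically -- the very judgment being asked about.
```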

I've previously suggested that we interpret priors for possible worlds as representing how much one cares about each possible world, which explains (or rather removes the need to explain) where priors come from. But that gambit doesn't work for impossible possible worlds.

Clearly the way we reason about a priori truths does have a Bayesian-like element, but the rest of it is still rather mysterious, at least to me.