The discussion about "agreeing to disagree" assumes ideal Bayesians, and the preferred resolution requires that the disputants be willing to spend the time to reach agreement. Prediction markets are one of the mechanisms used by imperfect Bayesians to short-circuit the long discussion and find a reasonable compromise. Markets seem to provide good estimates for outsiders to use as the updated value coming out of these disagreements. In a recent conversation with Dan Reeves, I found another reason to doubt that, which seems to play into the discussion started by Manski on what prediction market odds mean.

Greetings from the future! The argument is that if Alice's probability is p1 and Bob's is p2 and they have a bet using odds corresponding to probability p=(p1+p2)/2 (the arithmetic mean) then Alice's expected winnings are equal to Bob's. That's the definition of fairness I was using. Equal expected utility using each player's own probability estimate.

This totally sweeps under the rug the incentive compatibility question. But sometimes you can assume that each player will just truthfully reveal their true probability estimate. In that case arithmetic mean is fairest.

Hi, I have been following this thread with interest.

How would one calculate the probabilities or odds of an event as the event takes place? Flipping a coin, for example, where the first player to reach 5 successful flips wins.

The probability for either player is 50% at the beginning, but as the game goes on the real probabilities change, and therefore so do the odds.

Heads  Tails  Probability (heads/tails)
0      0      50%/50%
1      0      ?/?
2      0      ?/?
2      1      ?/?
etc.

Thanks,
Sacha
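One way to fill in a table like this is a short recursion over the game state. The sketch below assumes a fair coin, one player backing heads and the other tails, and a game that ends as soon as either side reaches 5; the name `p_heads_wins` is just illustrative.

```python
from functools import lru_cache

TARGET = 5  # flips needed to win (from the question above)

@lru_cache(maxsize=None)
def p_heads_wins(heads, tails, p=0.5):
    """Probability that heads reaches TARGET before tails does,
    given the current counts and a coin that lands heads with probability p."""
    if heads >= TARGET:
        return 1.0
    if tails >= TARGET:
        return 0.0
    # The next flip is heads with probability p, tails otherwise.
    return (p * p_heads_wins(heads + 1, tails, p)
            + (1 - p) * p_heads_wins(heads, tails + 1, p))

for state in [(0, 0), (1, 0), (2, 0), (2, 1)]:
    ph = p_heads_wins(*state)
    print(state, f"{ph:.1%} / {1 - ph:.1%}")
```

Under those assumptions the rows after 0-0 come out to roughly 64/36, 77/23 and 66/34.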

Dave Pennock wrote: "The proper update rule depends on how the Bayesian models others' beliefs. One can set up a model where the correct Bayesian update amounts to an algebraic average with others' beliefs."

When I try to apply the model in your paper to a situation where one observer has a probability of 30%, and another gives 99%, I end up using a weighting term that says that the 99% must correspond to stronger confidence or more observations. That seems to have a similar effect to using geometric means; the result is closer to the more extreme belief than the arithmetic mean would suggest.

The closing line on page 17 is: "the relative stock that an agent places on its own prior probability [...] serves to attenuate demand by a multiplicative factor." This doesn't seem a pure linear model to me, but maybe I'm not reading it correctly.

When I compute out the probabilities using twenty draws, I get a probability distribution (from 1 to 9 blue balls) of

.00001 .0043 .065 .25 .37 .25 .065 .0043 .00001

Which doesn't match either the geometric or the arithmetic mean the way I've calculated them. I'll have to appeal to a higher authority to see how this should be calculated. The good news is that the calculated probability of drawing blue is 50%, so not everything is lost.
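A possible cross-check, sketched under the same assumptions as the ten-draw calculation (10 balls, a uniform prior over the number of blue balls, draws with replacement, Alice seeing 6 blue out of 10 and Bob 4 blue out of 10): with a uniform prior and independent samples, the posterior from all twenty draws should be proportional to the product of the two individual posteriors, which is neither of the two means. The function name `posterior` is illustrative.

```python
from math import comb

def posterior(blues, draws, n_balls=10):
    """Posterior over the number of blue balls (0..n_balls),
    assuming a uniform prior and draws with replacement."""
    weights = []
    for k in range(n_balls + 1):
        f = k / n_balls  # fraction of blue balls under this hypothesis
        weights.append(comb(draws, blues) * f**blues * (1 - f)**(draws - blues))
    total = sum(weights)
    return [w / total for w in weights]

alice = posterior(6, 10)    # Alice: 6 blue in 10 draws
bob = posterior(4, 10)      # Bob: 4 blue in 10 draws
pooled = posterior(10, 20)  # all twenty draws together

# Normalized product of the two individual posteriors.
product = [a * b for a, b in zip(alice, bob)]
total = sum(product)
product = [x / total for x in product]

print([round(x, 5) for x in pooled])
print([round(x, 5) for x in product])
```

The two printed lists should agree, and the middle entries should be close to the .25 .37 .25 figures above.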

If anyone wants to look over my spreadsheet to help figure out what's going on, I'll be glad to provide copies. The email address is my last name, and the domain is mydruthers dot com.

CrimsonGirl: I said "the geometric mean of the odds". Odds are P/(1-P). So, to combine two probabilities, P and Q, you convert them to odds (p and q), take the geometric mean of the odds (r), and convert back to a probability.

p = P/(1-P), q = Q/(1-Q), r = sqrt( p * q ).

To convert r back from odds to probability you compute R = r/(1+r).

With P=.3, and Q=.8, I get p=.429, q=4, r=1.3, and R=.57.
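The same steps as a small script, in case anyone wants to try other values. The helper names (`to_odds`, `to_prob`, `geometric_mean_of_odds`) are made up for this sketch.

```python
import math

def to_odds(prob):
    """Convert a probability to odds, p = P/(1-P)."""
    return prob / (1.0 - prob)

def to_prob(odds):
    """Convert odds back to a probability, R = r/(1+r)."""
    return odds / (1.0 + odds)

def geometric_mean_of_odds(P, Q):
    """Combine two probabilities via the geometric mean of their odds."""
    r = math.sqrt(to_odds(P) * to_odds(Q))
    return to_prob(r)

R = geometric_mean_of_odds(0.3, 0.8)
arithmetic = (0.3 + 0.8) / 2
print(round(R, 2), round(arithmetic, 2))
```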

"the arithmetic mean is 55%, while the geometric mean of the odds is 57%;"

I guess I'm dumb, but why is the geometric mean 57%? The geometric mean of 30 and 80 is 49, and the Bayesian update of 30% with 80% evidence should be 63% by my calculation. I can't get 57%.

The analysis seems too simplistic to me. In order to follow Bayes' rule, Alice had to have an initial probability distribution which she would update using the new information. The estimate of 40% red assumes some kind of initial state of ignorance, which is non-Bayesian. In order to figure out what Alice's estimate is, we have to start with what she thought before she started drawing from the urn. Let's assume she was told there are 10 balls in the bag for concreteness. Alice will then update her estimate of the likelihood of each possible distribution using Bayes' law:

p(h|e) = p(e|h) * p(h) / p(e)

Alice has to start with an initial hypothesis that assigns a probability to every possibility from 0 to 10 red balls. Let's assume each possible distribution was equally likely. That means the prior has 1/11 as the probability of each number from 0 (all blue) to 10 (all red).

After drawing 6 blues and 4 reds, Alice will rule out 0 and 10, and will update the likelihood of each of the other outcomes. Her estimate of an even split will be

p(drawing 6 blues|even split) * p(even split) / p(drawing 6 blues)

The prior probability for an even split is 1/11.

The probability of drawing 6 blues if the urn is evenly split is (10 choose 6) * .5^10 = 10!/(6!(10-6)!)/1024 = 210/1024 =~ .205

The prior probability of drawing 6 blues (over all the possible distributions of balls) was .083 (if I've calculated correctly).

So Alice's posterior for 5 Blue/5 Red is .205 * (1/11) / .083 =~ .226

The full probability distribution is (sorry, I can't make a table here.)

# blue  1        2      3     4     5     6     7     8      9
prob    0.00015  0.006  0.04  0.12  0.23  0.28  0.22  0.097  0.012

(all blue and all red are impossible)

With that distribution of probabilities, Alice's estimate of the probability of drawing a blue ball is .58. By symmetry, Bob's estimate is .42. But more interesting, his probability distribution is the inverse of Alice's.

0.00015 0.006 0.04 0.12 0.23 0.28 0.22 0.097 0.012
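For anyone who wants to reproduce these figures without a spreadsheet, here is a rough sketch of the update. It assumes what the calculation above appears to assume: 10 balls, a uniform prior of 1/11 on each possible count, and draws with replacement (the binomial formula). The function name is illustrative.

```python
from math import comb

N_BALLS = 10          # balls in the urn (assumed for concreteness)
DRAWS, BLUES = 10, 6  # Alice's sample: 6 blue, 4 red

def posterior_blue_counts(blues, draws, n_balls=N_BALLS):
    """Return (posterior over number of blue balls, p(e)) under a uniform prior."""
    prior = 1.0 / (n_balls + 1)
    joint = []
    for k in range(n_balls + 1):
        f = k / n_balls  # fraction of blue balls under hypothesis k
        likelihood = comb(draws, blues) * f**blues * (1 - f)**(draws - blues)
        joint.append(likelihood * prior)  # p(e|h) * p(h)
    evidence = sum(joint)                 # p(e)
    return [j / evidence for j in joint], evidence

alice, evidence = posterior_blue_counts(BLUES, DRAWS)
# Probability that the next ball drawn is blue, averaged over the posterior.
p_blue_next = sum(k / N_BALLS * p for k, p in enumerate(alice))

print(round(evidence, 3))     # p(drawing 6 blues)
print(round(alice[5], 3))     # posterior for an even split
print(round(p_blue_next, 2))  # estimate for drawing blue
```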

When we want to combine the estimates, we have to combine the individual probabilities.

Combined using arithmetic mean:
0 0.0062 0.0514 0.13 0.199 0.226 0.199 0.13 0.051 0.00621 0

Combined using geometric mean:
0 0.0016 0.029 0.113 0.22 0.27 0.22 0.113 0.029 0.0016 0
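A sketch of how the two combinations can be computed from the rounded distribution listed above. It assumes Bob's distribution is Alice's read in reverse order and that the geometric mean is renormalized to sum to 1; because the inputs are rounded, the third decimal place may differ slightly from the figures quoted.

```python
from math import sqrt

# Alice's posterior over the number of blue balls (0..10), rounded as listed above.
alice = [0, 0.00015, 0.006, 0.04, 0.12, 0.23, 0.28, 0.22, 0.097, 0.012, 0]
bob = alice[::-1]  # assumption: Bob's distribution is the mirror image

# Arithmetic mean, entry by entry.
arith = [(a + b) / 2 for a, b in zip(alice, bob)]

# Geometric mean, entry by entry, then renormalized to sum to 1.
geo_raw = [sqrt(a * b) for a, b in zip(alice, bob)]
geo_total = sum(geo_raw)
geo = [g / geo_total for g in geo_raw]

print([round(x, 4) for x in arith])
print([round(x, 4) for x in geo])
```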

Both of these combinations pass my first cross check, which is that they're both symmetric. I'm not sure how to figure out which is better, except to compute out the same probabilities using 20 draws. It'll take me a little while, even though I have the spreadsheet started. I thought sharing these results now would give people a chance to show where I'm already wrong, or help me work out how to compare these estimates.

Note that Bethany's third case is an example where the two parties both agree with their initial estimates, but upon sharing them they both change, and the result is farther from the midpoint than either of their starting points.

One other idea on the distortion of extreme odds has been widely discussed by players of the FX game. That game uses play money and gives players initial fixed sums (which they can then grow or shrink according to how well they do). If a claim has a price of $FX 0.10, and 10 players have $FX 100 to bet on this, one player betting YES can match the bets of 9 players betting NO. In these extreme price ranges, players betting on the long shots have more "market power" than those betting the other way.

I like Soule and Reeves's demonstration. Very concise.

Here are some simple situations Daniel Reeves and I worked out. In the first, arithmetic mean is the right way to aggregate; in the second, geometric mean; and in the third, neither is.

1) Alice & Bob are trying to estimate the percentage of red balls in an urn. Alice draws 10 balls (without replacing), 4 are red, 6 are blue -- she estimates 40% red. Bob draws 10 more balls (without replacing), 6 are red, 4 are blue -- he estimates 60% red. If they combine their evidence, they in effect have 20 balls, 10 of which are red, and 10 of which are blue, so to combine their estimates, clearly the arithmetic mean is the right way to go, yielding an estimate of 50% red. (Geometric mean gives slightly less than 50%.)

2) Alice & Bob are trying to estimate the probability that all the balls in the urn are red. Alice draws a bunch of red balls and guesses that the probability is fairly high. Bob draws one ball -- a blue one, and gives a probability of 0. The right way to aggregate their probabilities here is clearly not the arithmetic mean, but rather the geometric.

3) This time let's just place 2 balls in the urn. Alice and Bob are trying to predict the probability that they are both red (prior is 25%). Alice draws a red one and replaces it. She estimates the probability at 50%. Bob draws one ball, red, and replaces it, estimating the probability of both red at 50% also. If they combine their info from the two independent trials, they ought to come up with 2/3 which neither the geometric nor arithmetic mean will give.
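The third case can be checked with exact fractions. This sketch assumes each of the 2 balls is independently red with probability 1/2 (which gives the stated 25% prior for both red) and that every draw is replaced; the `update` helper is just for illustration.

```python
from fractions import Fraction

# Prior over the number of red balls among the 2 in the urn.
prior = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def update(belief, n_balls=2):
    """Condition a belief on drawing one red ball (with replacement)."""
    joint = {k: p * Fraction(k, n_balls) for k, p in belief.items()}
    total = sum(joint.values())
    return {k: v / total for k, v in joint.items()}

one_red = update(prior)     # what Alice (or Bob) believes after one red draw
two_reds = update(one_red)  # both red draws combined

print(one_red[2], two_reds[2])  # 1/2 2/3
```

Both the arithmetic and the geometric mean of 1/2 and 1/2 are 1/2, so neither recovers the 2/3.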

Chris, I don't think that this statement in your post is necessarily true: "the way that [B]ayesians would update their odds is to use the geometric mean of their odds". The proper update rule depends on how the Bayesian models others' beliefs. One can set up a model where the correct Bayesian update amounts to an algebraic average with others' beliefs. Mike Wellman and I did this in this paper: http://dpennock.com/papers/...

(See Section 4 and Equation 10.)

Commissions would make the problem worse, but the problem arises without them. The Foresight Exchange (ideafutures.com) is play money, and there's no commission, but we see the tailing off of accuracy there as well. I think that may have been the first community to notice the problem. Of course, FX's precision on any question is only down to 1%. (TradeSports usually uses .1%.) With that precision, you can't even say anything finer than 100:1. 1000:1 and 10000:1 are both inexpressible.

Is it possible that the reason predictions are less accurate in the high and low range stems from the fact that the betting market needs to take a commission?

That is, if the minimum commission is 0.2% (which I think it is at TradeSports), even if you think the odds are 10000:1, and I think the odds are only 1000:1, there's no bet that makes both of us happy.

I might be more persuaded by data on actual betting odds chosen, or a strategic model of a game where betting odds are chosen.

That was certainly not his whole argument. We went through a couple of examples, and Dan (and Bethany Soule) convinced me that arithmetic mean produces expected payouts for each party that are equal. I'd be interested in hearing arguments that it would be straightforward to convince disputants that some other trade-off would be fair.

To give the additional detail, from each party's viewpoint, their expected payout is P(right) * payout - P(wrong) * (1 - payout). They have separate estimates to start. As an example, let's say that Ann thinks the probability is no more than 30% for some event, while Bob thinks it is at least 99.9%. The arithmetic mean of these two estimates is about 65%.

In a prediction market, Ann would put up 35 cents, and Bob 65 cents. Bob sees his expected payout as a 99.9% chance that he'll gain 35 cents, and a .1% chance of losing 65 cents. Ann expects a 70% chance of winning 65 cents, and a 30% chance of losing 35 cents. Both therefore have expected values from the bet of gaining about 35 cents. (.7 * .65) - (.3 * .35) =~ (.999 * .35) - (.001 * .65) =~ .35

The most counterintuitive aspect of this outcome is that if Bob changes his estimate from one-in-a-hundred to one-in-a-million, Bob is claiming vastly improved information about the outcome, but the arithmetic mean (and the expected values) change very little. What do you say to Bob and Ann to convince them that different odds produce a fairer bet? If they understand Bayes' rule, they may agree that the geometric mean produces a better estimate, but it would give them very unequal expected outcomes from a bet.
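To make the comparison concrete, here is a sketch of the Ann and Bob bet on a $1 contract, with each side's expected gain computed from its own probability. The function name `expected_gains` is invented for the illustration, and the second price uses the geometric mean of the odds as discussed elsewhere in this thread.

```python
import math

def expected_gains(p_low, p_high, price):
    """Expected gain per $1 contract, each side using its own probability.
    The low estimator bets NO (stakes 1 - price, wins price);
    the high estimator bets YES (stakes price, wins 1 - price)."""
    no_side = (1 - p_low) * price - p_low * (1 - price)
    yes_side = p_high * (1 - price) - (1 - p_high) * price
    return no_side, yes_side

ann, bob = 0.30, 0.999

# Price set at the arithmetic mean of the two probabilities.
arith_price = (ann + bob) / 2
ann_gain, bob_gain = expected_gains(ann, bob, arith_price)

# Price set at the geometric mean of the two odds, converted back to a probability.
r = math.sqrt((ann / (1 - ann)) * (bob / (1 - bob)))
geo_price = r / (1 + r)
ann_gain_geo, bob_gain_geo = expected_gains(ann, bob, geo_price)

print(round(ann_gain, 3), round(bob_gain, 3))          # equal at the arithmetic mean
print(round(ann_gain_geo, 3), round(bob_gain_geo, 3))  # lopsided at the geometric mean
```

At the arithmetic mean both sides expect about 35 cents; at the geometric-mean price Ann expects far more than Bob does.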

Chris, the fact that Dan Reeves wrote a program that takes a mean of two estimates is not to me enough evidence to support the claim that this is "the odds that appear fair to bettors."