Fair betting odds and Prediction Market prices

The discussion about "agreeing to disagree" assumes ideal bayesians, and the preferred resolution requires that the disputants are willing to spend the time to reach agreement.  Prediction markets are one of the mechanisms used by imperfect bayesians to short-circuit the long discussion and find a reasonable compromise.  Markets seem to provide good estimates for outsiders to use as the updated value coming out of these disagreements.  In a recent conversation with Dan Reeves, I found another reason to doubt that, which seems to play into the discussion started by Manski on what prediction market odds mean.

Dan, in his Yootles system, has a facility that supports bets between two or more people.  When two people disagree, and want to subject the disagreement to a wager, they each submit their estimate of the correct odds to the system.  The system then uses the arithmetic mean of their percentage odds as the fair odds.  Dan argues, convincingly, that the arithmetic mean gives each party the same expectation of gain, and that is what fairness requires.

On the other hand, the way that bayesians would update their odds is to use the geometric mean of their odds.  (Robin Hanson points out that this is equivalent to the arithmetic mean of the log odds.)  With estimates in the range of 10% to 90%, it doesn’t make much difference which of these you use, but when one of the parties has an extreme view of the possibility of the event, the geometric mean is sensitive to changes in a way that the arithmetic mean is not.

If Alice believes that the chances of some event are 30% (odds of 3 to 7), and Bob’s estimate is 80% (4:1), the arithmetic mean is 55%, while the geometric mean of the odds is 57%; the results are quite close.  You start to see noticeable differences when one estimate gets above 95% or below 5%.  One intuitive explanation for the difference is that the arithmetic mean is based on the percentage chances, which don’t have the resolution to change much once above 95%. Odds above 95% can still change from 1 in 20 to 1 in a hundred or one in a million, representing very significant differences in a visible way. This allows the geometric mean to be more sensitive in this range.
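
The two rules can be compared directly. The following Python sketch is only an illustration; the function names are mine and are not taken from Yootles or any other system.

```python
import math

def to_odds(p):
    """Convert a probability to odds in favor, p / (1 - p)."""
    return p / (1.0 - p)

def to_prob(odds):
    """Convert odds in favor back to a probability."""
    return odds / (1.0 + odds)

def arithmetic_pool(p1, p2):
    """Arithmetic mean of the two probabilities."""
    return (p1 + p2) / 2.0

def geometric_pool(p1, p2):
    """Geometric mean of the two odds, converted back to a probability."""
    return to_prob(math.sqrt(to_odds(p1) * to_odds(p2)))

alice, bob = 0.30, 0.80
print(f"arithmetic: {arithmetic_pool(alice, bob):.2f}")  # 0.55
print(f"geometric:  {geometric_pool(alice, bob):.2f}")   # 0.57
```

Taking the square root of the product of the odds is the same as averaging the log odds and exponentiating, which is the equivalence Robin Hanson points out.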

Another intuitive presentation is that when Bob’s estimate of the likelihood changes by a factor of 5 or 10, the bayesian combination of Bob’s and Alice’s estimates should move significantly (a factor of 2 or 3).  The arithmetic mean, used to compute the respective expected values of a bet, moves by at most a few percentage points when below 95%, and by less than a percentage point when above 99% (or below 1%).  But that’s where the most interesting changes in the individual estimates take place.
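
To make the difference in sensitivity concrete, here is a small sketch that holds Alice at 30% and pushes Bob's estimate toward certainty. The helper name `pooled` and the particular values chosen for Bob are illustrative assumptions.

```python
import math

def pooled(p1, p2):
    """Return (arithmetic mean of probabilities, geometric mean of odds as a probability)."""
    arith = (p1 + p2) / 2.0
    odds = math.sqrt((p1 / (1.0 - p1)) * (p2 / (1.0 - p2)))
    return arith, odds / (1.0 + odds)

alice = 0.30
for bob in (0.90, 0.99, 0.999, 0.999999):
    arith, geom = pooled(alice, bob)
    print(f"Bob {bob:<9} arithmetic {arith:.4f}   geometric {geom:.4f}")
```

Across these four rows the arithmetic column only moves from 0.60 to about 0.65, while the geometric column climbs from roughly 0.66 to above 0.99.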

The implications of this difference between the odds that appear fair to bettors and the expectations of bayesian observers seem to touch on a few well-known conundrums.  The commonly observed drop-off in predictiveness when prediction market odds are above 90 or below 10 could be due in part to the participants’ lack of incentive to push the odds further towards the end points.  The fact that we mostly use percentage odds may also contribute: with whole number percentages, you can’t express odds more extreme than 99:1; with tenths, you can express up to 999:1.  BetFair’s use of odds rather than percents may actually be an advantage here.  (I usually complain that I find betting odds opaque; the increased resolution at the ends of the spectrum may be worth the confusion.)
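
The resolution limit is simple arithmetic. As a sketch, with `tick` standing for the smallest price step a market allows (my name for it, not a term any exchange uses here):

```python
from fractions import Fraction

def most_extreme_odds(tick):
    """Largest odds in favor expressible when prices move in steps of `tick`."""
    top_price = 1 - tick      # highest price short of certainty
    return top_price / tick   # odds in favor at that price

for tick in (Fraction(1, 100), Fraction(1, 1000)):
    print(f"tick {float(tick):.3f}: up to {most_extreme_odds(tick)}:1")
```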

I’m not sure how to integrate this into the discussion, but this idea that the participants’ betting incentives don’t lead directly to bayesian updates may also have implications for the discussion started by Manski, and picked up by Wolfers and Zitzewitz and by Ottaviani and Sørensen.  If prediction market participants don’t have sufficient incentive to move the odds to the extremes they believe are true, then the market outcomes may have reduced fidelity in those ranges as well.

  • Chris, the fact that Dan Reeves wrote a program that takes a mean of two estimates is not to me enough evidence to support the claim that this is “the odds that appear fair to bettors.”

    • Greetings from the future! The argument is that if Alice’s probability is p1 and Bob’s is p2 and they have a bet using odds corresponding to probability p=(p1+p2)/2 (the arithmetic mean) then Alice’s expected winnings are equal to Bob’s. That’s the definition of fairness I was using. Equal expected utility using each player’s own probability estimate.

      This totally sweeps under the rug the incentive compatibility question. But sometimes you can assume that each player will just truthfully reveal their true probability estimate. In that case arithmetic mean is fairest.

  • That was certainly not his whole argument. We went through a couple of examples, and Dan (and Bethany Soule) convinced me that arithmetic mean produces expected payouts for each party that are equal. I’d be interested in hearing arguments that it would be straightforward to convince disputants that some other trade-off would be fair.

    To give the additional detail, from each party’s viewpoint, their expected payout is E(right) * payout – E(wrong) * (1 – payout). They have separate estimates to start. As an example, let’s say that Ann thinks the odds are no more than 30% for some event, while Bob thinks the odds are at least 99.9%. The arithmetic mean of these two estimates is 65%.

    In a prediction market, Ann would put up 35 cents, and Bob 65 cents. Bob sees his expected payout as a 99.9% chance that he’ll gain 35 cents, and a .1% chance of losing 65 cents. Ann expects a 70% chance of winning 65 cents, and a 30% chance of losing 35 cents. Both therefore expect to gain about 35 cents from the bet: (.7 * .65) - (.3 * .35) = .35 and (.999 * .35) - (.001 * .65) ≈ .35

    The most counterintuitive aspect of this outcome is that if Bob changes his estimate from one-in-a-hundred to one-in-a-million, Bob is claiming vastly improved information about the outcome, but the arithmetic mean (and the expected values) change very little. What do you say to Bob and Ann to convince them that different odds produce a fairer bet? If they understand Bayes’ rule, they may agree that the geometric mean produces a better estimate, but it would give them very unequal expected outcomes from a bet.
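
    The expected values for Ann (30%) and Bob (99.9%) can be checked with a short sketch. It assumes a $1 contract in which the YES side pays the price and the NO side pays one minus the price; the function names are illustrative only.

```python
import math

def ev_yes(belief, price):
    """Expected gain for the YES side: win (1 - price), lose price."""
    return belief * (1.0 - price) - (1.0 - belief) * price

def ev_no(belief, price):
    """Expected gain for the NO side: win price, lose (1 - price)."""
    return (1.0 - belief) * price - belief * (1.0 - price)

def geometric_price(p1, p2):
    """Geometric mean of the two odds, expressed as a price between 0 and 1."""
    odds = math.sqrt(p1 / (1.0 - p1) * p2 / (1.0 - p2))
    return odds / (1.0 + odds)

ann, bob = 0.30, 0.999
for label, price in (("arithmetic", (ann + bob) / 2.0),
                     ("geometric", geometric_price(ann, bob))):
    print(f"{label:10s} price {price:.3f}  "
          f"Ann expects {ev_no(ann, price):+.3f}  Bob expects {ev_yes(bob, price):+.3f}")
```

    Algebraically the YES side's expectation reduces to belief minus price and the NO side's to price minus belief, so the two are equal exactly when the price is the midpoint of the two beliefs. At the geometric-mean price Ann's expected gain is many times Bob's.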

  • I might be more persuaded by data on actual betting odds chosen, or a strategic model of a game where betting odds are chosen.

  • Phil

    Is it possible that predictions are less accurate in the high and low ranges because the betting market needs to take a commission?

    That is, if the minimum commission is 0.2% (which I think it is at TradeSports), even if you think the odds are 10000:1, and I think the odds are only 1000:1, there’s no bet that makes both of us happy.

  • Commissions would make the problem worse, but the problem arises without them. The Foresight Exchange (ideafutures.com) is play money, and there’s no commission, but we see the tailing off of accuracy there as well. I think that may have been the first community to notice the problem. Of course, FX’s precision on any question is only down to 1%. (TradeSports usually uses .1%.) With that precision, you can’t even say anything finer than 100:1; 1000:1 and 10000:1 are both inexpressible.

  • Chris, I don’t think that this statement in your post is necessarily true: “the way that [B]ayesians would update their odds is to use the geometric mean of their odds”. The proper update rule depends on how the Bayesian models others’ beliefs. One can set up a model where the correct Bayesian update amounts to an algebraic average with others’ beliefs. Mike Wellman and I did this in this paper:

    (See Section 4 and Equation 10.)

  • Here are some simple situations Daniel Reeves and I worked out. In the first, arithmetic mean is the right way to aggregate; in the second, geometric mean; and in the third, neither is.

    1) Alice & Bob are trying to estimate the percentage of red balls in
    an urn. Alice draws 10 balls (without replacing), 4 are red, 6 are
    blue — she estimates 40% red. Bob draws 10 more balls (without
    replacing), 6 are red, 4 are blue — he estimates 60% red. If they
    combine their evidence, they in effect have 20 balls, 10 of which are
    red, and 10 of which are blue, so to combine their estimates, clearly
    the arithmetic mean is the right way to go, yielding an estimate of 50%
    red. (Geometric mean gives slightly less than 50%.)

    2) Alice & Bob are trying to estimate the probability that all the
    balls in the urn are red. Alice draws a bunch of red balls and guesses
    that the probability is fairly high. Bob draws one ball — a blue one,
    and gives a probability of 0. The right way to aggregate their
    probabilities here is clearly not the arithmetic mean, but rather the
    geometric mean.

    3) This time let’s just place 2 balls in the urn. Alice and Bob are
    trying to predict the probability that they are both red (prior is 25%).
    Alice draws a red one and replaces it. She estimates the probability at 50%.
    Bob draws one ball, red, and replaces it, estimating the probability of both
    red at 50% also. If they combine their info from the two independent trials,
    they ought to come up with 2/3 which neither the geometric nor arithmetic
    mean will give.
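
    The third case can be checked by enumerating the possible urns. This is a sketch that assumes each of the two balls is independently red with probability 1/2, which is what makes the prior for "both red" equal to 25%; the function name is illustrative.

```python
from fractions import Fraction

# Prior over the number of red balls among the 2 in the urn, assuming each
# ball is independently red with probability 1/2 (so P(both red) = 1/4).
prior = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def posterior_both_red(red_draws):
    """P(both balls red) after `red_draws` red draws, each made with replacement."""
    weights = {k: p * Fraction(k, 2) ** red_draws for k, p in prior.items()}
    return weights[2] / sum(weights.values())

print(posterior_both_red(1))  # 1/2: Alice alone, or Bob alone
print(posterior_both_red(2))  # 2/3: their two draws combined
```

    Both the arithmetic and the geometric mean of 50% and 50% are 50%, so neither reaches 2/3.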

  • I like Soule and Reeves’s demonstration. Very concise.

  • Note that Bethany’s third case is an example where the two parties’ initial estimates agree, but upon sharing them they both change, and the result is farther from the midpoint than either of their starting points.

    One other idea on the distortion of extreme odds has been widely discussed by players of the FX game. That game uses play money and gives players initial fixed sums (which they can then grow or shrink according to how well they do). If a claim has a price of $FX 0.10, and 10 players have $FX 100 to bet on this, one player betting YES can match the bets of 9 players betting NO. In these extreme price ranges, players betting on the long shots have more “market power” than those betting the other way.

  • The analysis seems too simplistic to me. In order to follow Bayes’ rule, Alice had to have an initial probability distribution which she would update using the new information. The estimate of 40% red assumes some kind of initial state of ignorance, which is non-bayesian. In order to figure out what Alice’s estimate is, we have to start with what she thought before she started drawing from the urn. Let’s assume, for concreteness, that she was told there are 10 balls in the bag. Alice will update her estimates of the likelihood of each possible distribution using Bayes’ law:

    p(h|e) = p(e|h) * p(h) / p(e)

    Alice has to start with an initial hypothesis that assigns a probability to every possibility from 0 to 10 red balls. Let’s assume each possible distribution was equally likely. That means the prior has 1/11 as the probability of each number from 0 (all blue) to 10 (all red).

    After drawing 6 blues and 4 reds, Alice will rule out 0 and 10, and will update the likelihood of each of the other outcomes. Her estimate of an even split will be

    p(drawing 6 blues|even split) * p(even split) / p(drawing 6 blues)

    The prior probability for an even split is 1/11.

    The probability of drawing 6 blues if the urn is evenly split is
    (10 choose 6) * .5^10 = 10!/(6! * (10-6)!) / 1024 = 210/1024 ≈ .205

    The prior probability of drawing 6 blues (over all the possible distributions of balls) was .083 (if I’ve calculated correctly).

    So Alice’s posterior for 5 Blue/5 Red is .205 * (1/11) / .083 ≈ .226

    The full probability distribution is (sorry, I can’t make a table here.)

    # blue   1        2      3     4     5     6     7     8      9
    prob     0.00015  0.006  0.04  0.12  0.23  0.28  0.22  0.097  0.012

    (all blue and all red are impossible)
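
    For anyone who wants to reproduce these figures without a spreadsheet, here is a sketch. It assumes the same model as the calculation above: a uniform prior of 1/11 over 0 to 10 blue balls, and a binomial likelihood in which each draw is blue with probability k/10. The variable names are mine.

```python
from math import comb

N = 10                 # balls assumed to be in the urn
BLUE, DRAWS = 6, 10    # Alice's sample: 6 blue out of 10 draws

def likelihood(k):
    """P(6 blue in 10 draws | k of the N balls are blue), binomial model."""
    f = k / N
    return comb(DRAWS, BLUE) * f**BLUE * (1 - f)**(DRAWS - BLUE)

prior = 1 / (N + 1)    # uniform over k = 0..10
evidence = sum(likelihood(k) * prior for k in range(N + 1))
posterior = [likelihood(k) * prior / evidence for k in range(N + 1)]

print(f"p(evidence) = {evidence:.3f}")
for k, p in enumerate(posterior):
    print(f"{k:2d} blue: {p:.5f}")
print(f"P(next ball is blue) = {sum(k / N * p for k, p in enumerate(posterior)):.2f}")
```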

    With that distribution of probabilities, Alice’s estimate of the probability of drawing a blue ball is .58. By symmetry, Bob’s estimate is .42. More interesting, his probability distribution is the mirror image of Alice’s:

    0.012 0.097 0.22 0.28 0.23 0.12 0.04 0.006 0.00015

    When we want to combine the estimates, we have to combine the individual probabilities.

    Combined using arithmetic mean:
    0 0.0062 0.0514 0.13 0.199 0.226 0.199 0.13 0.051 0.00621 0

    Combined using Geometric mean:
    0 0.0016 0.029 0.113 0.22 0.27 0.22 0.113 0.029 0.0016 0

    Both of these combinations pass my first cross check, which is that they’re both symmetric. I’m not sure how to figure out which is better, except to compute out the same probabilities using 20 draws. It’ll take me a little while, even though I have the spreadsheet started. I thought sharing these results now would give people a chance to show where I’m already wrong, or help me work out how to compare these estimates.

  • “the arithmetic mean is 55%, while the geometric mean of the odds is 57%;”

    I guess I’m dumb, but why is the geometric mean 57%? The geometric mean of 30 and 80 is 49, and the Bayesian update of 30% with 80% evidence should be 63% by my calculation. I can’t get 57%.

  • CrimsonGirl: I said “the geometric mean of the odds”. Odds are P/(1-P). So, to combine two probabilities, P and Q, you convert them to odds (p and q), take the geometric mean (r), and convert back to a probability.

    p = P/(1-P), q = Q/(1-Q), r = sqrt( p * q ).

    To convert r back from odds to probability you compute R = r/(1+r).

    With P=.3, and Q=.8, I get p=.429, q=4, r=1.3, and R=.57.

  • When I compute out the probabilities using twenty draws, I get a probability distribution (from 1 to 9 blue balls) of

    .00001 .0043 .065 .25 .37 .25 .065 .0043 .00001

    Which doesn’t match either the geometric or the arithmetic mean the way I’ve calculated them. I’ll have to appeal to a higher authority to see how this should be calculated. The good news is that the calculated probability of drawing blue is 50%, so not everything is lost.
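
    The comparison can be rerun in a few lines. This sketch assumes the same uniform prior and binomial likelihood as the earlier calculation; the names `alice`, `bob`, `both` and `product` are illustrative.

```python
from math import sqrt

N = 10  # balls assumed in the urn; k = number of blue balls

def normalize(ws):
    """Scale a list of non-negative weights so that it sums to 1."""
    total = sum(ws)
    return [w / total for w in ws]

# Posteriors under a uniform prior and a binomial likelihood.
alice = normalize([(k / N)**6 * (1 - k / N)**4 for k in range(N + 1)])    # 6 blue, 4 red
bob = normalize([(k / N)**4 * (1 - k / N)**6 for k in range(N + 1)])      # 4 blue, 6 red
both = normalize([(k / N)**10 * (1 - k / N)**10 for k in range(N + 1)])   # all 20 draws

arithmetic = [(a + b) / 2 for a, b in zip(alice, bob)]
geometric = normalize([sqrt(a * b) for a, b in zip(alice, bob)])
product = normalize([a * b for a, b in zip(alice, bob)])

for name, dist in (("20 draws", both), ("arithmetic", arithmetic),
                   ("geometric", geometric), ("product", product)):
    print(f"{name:10s}", " ".join(f"{p:.4f}" for p in dist[1:N]))
```

    Under these assumptions, multiplying the two posteriors value by value and renormalizing reproduces the 20-draw row, because with a flat prior each posterior is proportional to its own likelihood. That holds for this particular setup and is not offered as a general pooling rule.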

    If anyone wants to look over my spreadsheet to help figure out what’s going on, I’ll be glad to provide copies. The email address is my last name, and the domain is mydruthers dot com.

  • Dave Pennock wrote: “The proper update rule depends on how the Bayesian models others’ beliefs. One can set up a model where the correct Bayesian update amounts to an algebraic average with others’ beliefs.”

    When I try to apply the model in your paper to a situation where one observer has a probability of 30%, and another gives 99%, I end up using a weighting term that says that the 99% must correspond to stronger confidence or more observations. That seems to have a similar effect to using geometric means; the result is closer to the more extreme belief than the arithmetic mean would suggest.

    The closing line on page 17 is: “the relative stock that an agent places on its own prior probability […] serves to attenuate demand by a multiplicative factor.” This doesn’t seem a pure linear model to me, but maybe I’m not reading it correctly.

  • Alex

    Hi, I have been following this thread with interest.

    How would one calculate the probabilities or odds of an event as the event takes place? Take flipping a coin, for example, where the winner is the first player to reach 5 successful flips.

    The probability for each player is 50% at the beginning, but as the game goes on the real probabilities change, and therefore so do the odds.

    Heads  Tails  Probability heads/tails
    0      0      50%/50%
    1      0      ?/?
    2      0      ?/?
    2      1      ?/?
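
    One way to fill in a table like this is to work backwards from the end of the game. The sketch below assumes a fair coin, that one player scores on heads and the other on tails, and that the first to 5 wins; the function name is illustrative.

```python
from fractions import Fraction
from functools import lru_cache

TARGET = 5  # first side to reach this many flips wins

@lru_cache(maxsize=None)
def p_heads_wins(heads, tails):
    """Probability that heads reaches TARGET before tails, given the score so far."""
    if heads == TARGET:
        return Fraction(1)
    if tails == TARGET:
        return Fraction(0)
    # Fair coin: the next flip is heads or tails with probability 1/2 each.
    return (p_heads_wins(heads + 1, tails) + p_heads_wins(heads, tails + 1)) / 2

for h, t in ((0, 0), (1, 0), (2, 0), (2, 1)):
    p = p_heads_wins(h, t)
    print(f"{h} {t}  {float(p):.1%} / {float(1 - p):.1%}")
```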

