Markets That Explain, Via Markets To Pick A Best

I recently heard someone say “A disadvantage of prediction markets is that they don’t explain their estimates.” I responded: “But why couldn’t they?” That feature may cost you more, and it hasn’t been explored much in research or development. But I can see how to do it; in this post, I’ll outline a concept.

Previously, I’ve worked on a type of combinatorial prediction market built on a Bayes-Net structure. And there are standard ways to use such a structure to “explain” the estimates of any one variable in terms of the estimates of other variables. So obviously one could just apply those methods directly to get explanations for particular estimates in Bayes-Net based prediction markets. But I suspect that many would see such explanations as inadequate.

Here I’m instead going to try to solve the explanation problem by solving a more general problem: how to cheaply get a single good thing, if you have access to many people willing to submit and evaluate distinguishable things, and you have access to at least one possibly expensive judge who can rank these things. With access to this general pick-a-best mechanism, you can just ask people to submit explanations of a market estimate, and then deliver a single best explanation that you expect to be rated highly by your judge.

In more detail, you need five things:

  1. a prize Z you can pay to whomever submits the winning item,
  2. a community of people willing to submit candidate items to be evaluated for this prize, and to post bonds in the amount B supporting their submissions,
  3. an expensive (cost J) and trustworthy “gold standard” judge who has an error-prone tendency to pick the “better” item out of two items submitted,
  4. a community of people who think that they can guess on average how the judge will rate items, with some of these people being right about this belief, and
  5. a costly (amount B) and only mildly error-prone way to decide if one submission is overly derivative of another.

With these five things, you can get a pretty good thing if you pay Z+J. The more Z you offer, the better will be your good thing. Here is the procedure. First, anyone in a large community may submit candidates c, if they post a bond B for each submission. Each candidate c is publicly posted as it becomes available.

A prediction market is open on all candidates submitted so far, with assets of the form “Pays $1 if c wins.” We somehow define prices p_c for such assets which satisfy 1 = p_Y + Σ_c p_c, where p_Y is the price of the asset “The winner is not yet submitted.” Submissions are not accepted after some deadline, and at that point I recommend the candidate c with the highest price p_c; that will be a good choice. But to make it a good choice, the procedure has to continue.
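The pricing rule is left open above (“somehow”). As one illustration only, and not necessarily what this post has in mind, here is a minimal sketch that assumes a logarithmic market scoring rule over the candidates plus the “not yet submitted” outcome. The function name `lmsr_prices`, the `shares` mapping, and the liquidity parameter `b` are all hypothetical.

```python
import math

def lmsr_prices(shares, b=100.0):
    """Return one price per outcome, summing to 1, from net shares sold.

    `shares` maps outcome name -> outstanding shares; the key "Y" stands for
    "the winner is not yet submitted". `b` is an assumed liquidity parameter.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(shares.values())
    weights = {k: math.exp((q - m) / b) for k, q in shares.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

# Hypothetical market state: two candidates and the "not yet submitted" asset.
prices = lmsr_prices({"Y": 0.0, "c1": 50.0, "c2": 120.0})
```

Under this assumed rule the prices always sum to one, and a candidate that traders have bought more heavily carries a higher price. How to split the “not yet submitted” asset when a new candidate arrives is not handled in this sketch.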

A time is chosen randomly from a final time window (such as a final day) after the deadline. We use the market prices p_c at that random time to pick a pair of candidates to show the judge. We draw twice randomly (with replacement) using the price p_c as the random chance of picking each c. The judge then picks a single tentative winning candidate w out of this pair.
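The pair-selection step can be sketched as follows. This is a minimal illustration, assuming the prices arrive as a plain mapping and that any “not yet submitted” entry (keyed `"Y"` here, a hypothetical convention) is dropped so that the remaining prices are renormalized. The function name `pick_pair` is likewise made up for this example.

```python
import random

def pick_pair(prices, rng=random):
    """Draw two candidates, with replacement, with chance proportional to price.

    `prices` maps candidate -> price. Any "Y" (not yet submitted) entry is
    dropped; weighting by the remaining prices renormalizes them implicitly.
    """
    candidates = [c for c in prices if c != "Y" and prices[c] > 0]
    weights = [prices[c] for c in candidates]
    # choices() samples with replacement, so both draws can be the same candidate.
    first, second = rng.choices(candidates, weights=weights, k=2)
    return first, second

# Hypothetical snapshot of prices at the randomly chosen time.
pair = pick_pair({"Y": 0.1, "c1": 0.2, "c2": 0.7}, rng=random.Random(0))
```

Because the draws are with replacement, the same candidate can come up twice. The procedure does not say what happens then; presumably that candidate simply becomes the tentative winner without the judge having to choose.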

Anyone who submitted a candidate before w can challenge it within a limited challenge time window, claiming that the tentative winner w is overly derivative of their earlier submission e. An amount B is then spent to judge if w is derivative of e. If w is not judged derivative, then the challenger forfeits their bond B, and w remains the tentative winner. If w is judged derivative, then the tentative winner forfeits their bond B, and the challenger becomes a new tentative winner. We need potential challengers to expect a less than B/Z chance of a mistaken judgement regarding something being derivative.
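To see where a bound like B/Z comes from, here is a rough sketch of the expected value of a spurious challenge. It assumes a risk-neutral challenger who gains the whole prize Z if the derivative judgement errs in their favor (with chance q) and forfeits the bond B otherwise; the function names are hypothetical, and the chance of further challenges is ignored.

```python
def spurious_challenge_value(q, Z, B):
    """Expected gain from challenging when w is not actually derivative.

    q: chance the derivative judgement errs in the challenger's favor.
    Z: prize taken over if the challenge succeeds.
    B: bond forfeited if the challenge fails.
    """
    return q * Z - (1 - q) * B

def break_even_error_rate(Z, B):
    """Error rate at which a spurious challenge has zero expected value."""
    return B / (B + Z)

# Hypothetical numbers: prize 1000, bond 50.
threshold = break_even_error_rate(1000, 50)
value_at_one_percent = spurious_challenge_value(0.01, 1000, 50)
```

Under these assumptions the exact break-even error rate is B/(B+Z), which is close to B/Z when the prize is much larger than the bond. Below that rate, spurious challenges lose money on average.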

Once all challenges are resolved, the tentative winner becomes the official winner, the person who submitted it is given a large prize Z, and prediction market betting assets are paid off. The end.

This process can easily be generalized in many ways. There could be more than one judge, each judge could be given more than two items to rank, the prediction markets could be subsidized, the chances of picking candidates c to show judges might be non-linear in market prices p_c, and when setting such chances prices could be averaged over a time period. If p_Y is not zero when choosing candidates to evaluate, the prices p_c could be renormalized. We might add prediction markets on whether any given challenge would be successful, and allow submissions to be withdrawn before a formal challenge is made.

Now I haven’t proven a theorem to you that this all works well, but I’m pretty sure that it does. By offering a prize for submissions, and allowing bets on which submissions will win, you need only make one expensive judgement between a pair of items, and have access to an expensive way to decide if one submission is overly derivative of another.

I suspect this mechanism may be superior to many of the systems we now use to choose winners. Many existing systems frequently invoke low quality judges, instead of less frequently invoking higher quality judges. I suspect that market estimates of high quality judgements may often be better than direct application of low quality judgements.


  • Paul Christiano

    It would be great to see more analysis of market mechanisms like this one. Note that this particular pick-a-best algorithm is quite similar to our approach for RL from human preferences (except that we don’t need ingredient #5), and in general it seems like there is a strong correspondence between market mechanisms and ML training protocols.

    I think that many open questions about AI alignment can be translated to open questions about these market mechanisms. For example, this mechanism requires access to a ground truth judge who can compare two explanations, but in the long run we want to use mechanisms of this kind to answer questions that no trusted judge can answer; there are various plausible ways to do that but they are all tricky. For example, you could make predictions about future outcomes, but this introduces new difficulties that are very closely analogous to usual AI safety problems. Andreas Stuhlmüller is also thinking about some similar issues at Ought.

    Aside from AI safety, I especially agree with:

    > Many existing systems frequently invoke low quality judges, instead of less frequently invoking higher quality judges. I suspect that market estimates of high quality judgements may often be better than direct application of low quality judgements.

    And would love to see such mechanisms tested in practice. This seems like it may be somewhat easier than testing other applications of prediction markets, since feedback doesn’t have to wait on external events and so can be much faster.

  • Frank Lantz

    Why draw twice randomly rather than pick the top two?

    • That’s not a crazy idea, but I’d worry more about incentives to manipulate the price in that case. Something to test in theory or in the lab.

  • Andrew Lohn

    Very fun and very important topic. Several reasons come to mind for wanting an explanation. At a high level, the ones I’m thinking of can be tossed into one of these two categories.
    1) To transfer understanding (not just knowledge) to people
    2) To provide comfort that an estimate is justifiable

    It seems that since this method is trying to select an answer that gets chosen by a judge, it searches out the most satisfying explanation (#2) rather than the most accurate or most informative explanation (#1).

    Some of that could surely be alleviated by not disclosing the judge, but I’m concerned that there are human biases which help determine what a judge will find satisfying which could be exploited by the market at the expense of accuracy. Thoughts?

    • Surely that problem exists just as strongly today with the explanations that humans offer to each other. So this new system wouldn’t be any worse.