6 Comments

Surely that problem exists just as strongly today with the explanations that humans offer to each other. So this new system wouldn't be any worse.

Very fun and very important topic. Several reasons come to mind for wanting an explanation. At a high level, the ones I'm thinking of can be tossed into one of these two categories:

1) To transfer understanding (not just knowledge) to people
2) To provide comfort that an estimate is justifiable

It seems that since this method is trying to select an answer that gets chosen by a judge, it searches out the most satisfying explanation (#2) rather than the most accurate or most informative explanation (#1).

Some of that could surely be alleviated by not disclosing the judge, but I'm concerned that there are human biases which help determine what a judge will find satisfying, and these could be exploited by the market at the expense of accuracy. Thoughts?
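
One hypothetical way to make that concern concrete (a toy model with invented numbers, not anything from the post): if the judge's satisfaction mixes accuracy with an exploitable bias, then a "pick whichever explanation the judge prefers" rule rewards targeting the bias.

```python
import numpy as np

# Hypothetical toy model of the concern above (numbers invented): the judge's
# satisfaction mixes true accuracy with an exploitable bias, e.g. a taste for
# confident, simple-sounding stories. An explanation tuned to the bias can then
# reliably beat a more accurate one under a "pick what the judge prefers" rule.

rng = np.random.default_rng(1)

def judge_score(accuracy, bias_appeal, bias_weight=0.6):
    # Judge satisfaction = mostly accuracy, partly appeal to the bias.
    return (1 - bias_weight) * accuracy + bias_weight * bias_appeal

# Candidate A: accurate but plain. Candidate B: less accurate, tuned to the bias.
a = {"accuracy": 0.9, "bias_appeal": 0.2}
b = {"accuracy": 0.6, "bias_appeal": 0.9}

trials = 10_000
wins_b = sum(
    judge_score(**b) + rng.normal(scale=0.1) > judge_score(**a) + rng.normal(scale=0.1)
    for _ in range(trials)
)
print(f"Bias-tuned but less accurate explanation wins {wins_b / trials:.0%} of judgments")
```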

That's not a crazy idea, but I'd worry more about incentives to manipulate the price in that case. Something to test in theory or in the lab.

Why draw twice randomly rather than pick the top two?

Yes, this could be tested quickly. Alas, far more resources are available for testing and developing ML methods than prediction market methods. Good work on that paper, btw.

It would be great to see more analysis of market mechanisms like this one. Note that this particular pick-a-best algorithm is quite similar to our approach for RL from human preferences here (https://arxiv.org/abs/1706....), except that we don't need ingredient #5, and in general it seems like there is a strong correspondence between market mechanisms and ML training protocols.
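
For intuition only, here is a minimal sketch of the pairwise-comparison idea both approaches share (my own illustration with a made-up judge, not the linked paper's implementation): a judge repeatedly picks the better of two candidates, and Bradley-Terry-style scores are fit to those judgments.

```python
import numpy as np

# Toy sketch of learning scores from pairwise judgments, Bradley-Terry style.
# A noisy judge prefers the truly better candidate most of the time; repeated
# comparisons recover the ranking. Quantities here are invented for illustration.

rng = np.random.default_rng(0)
num_candidates = 5
true_quality = rng.normal(size=num_candidates)   # hidden quality of each candidate
scores = np.zeros(num_candidates)                # learned scores
lr = 0.1

def judge(i, j):
    """Noisy judge: prefers the truly better candidate with higher probability."""
    p_i = 1.0 / (1.0 + np.exp(-(true_quality[i] - true_quality[j])))
    return (i, j) if rng.random() < p_i else (j, i)

for _ in range(2000):
    i, j = rng.choice(num_candidates, size=2, replace=False)  # draw a random pair
    winner, loser = judge(i, j)
    # Logistic-loss gradient step: raise the winner's score relative to the loser's.
    p_winner = 1.0 / (1.0 + np.exp(-(scores[winner] - scores[loser])))
    scores[winner] += lr * (1.0 - p_winner)
    scores[loser] -= lr * (1.0 - p_winner)

print("true ranking   :", np.argsort(-true_quality))
print("learned ranking:", np.argsort(-scores))
```

Roughly, a market version would replace the gradient step with traders adjusting prices on which candidate the judge will pick.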

I think that many open questions about AI alignment can be translated to open questions about these market mechanisms. For example, this mechanism requires access to a ground truth judge who can compare two explanations, but in the long run we want to use mechanisms of this kind to answer questions that no trusted judge can answer; there are various plausible ways to do that but they are all tricky. For example, you could make predictions about future outcomes, but this introduces new difficulties that are very closely analogous to usual AI safety problems. Andreas Stuhlmüller is also thinking about some similar issues at Ought (https://blog.ought.com/).

Aside from AI safety, I especially agree with:

> Many existing systems frequently invoke low quality judges, instead of less frequently invoking higher quality judges. I suspect that market estimates of high quality judgements may often be better than direct application of low quality judgements.

And I would love to see such mechanisms tested in practice. This seems like it may be somewhat easier than testing other applications of prediction markets, since feedback doesn't have to wait on external events and so can be much faster.
