Tag Archives: Prediction Markets

Markets That Explain, Via Markets To Pick A Best

I recently heard someone say “A disadvantage of prediction markets is that they don’t explain their estimates.” I responded: “But why couldn’t they?” That feature may cost you more, and it hasn’t been explored much in research or development. But I can see how to do it; in this post, I’ll outline a concept.

Previously, I’ve worked on a type of combinatorial prediction market built on a Bayes-Net structure. And there are standard ways to use such a structure to “explain” the estimates of any one variable in terms of the estimates of other variables. So obviously one could just apply those methods directly to get explanations for particular estimates in Bayes-Net based prediction markets. But I suspect that many would see such explanations as inadequate.

Here I’m instead going to try to solve the explanation problem by solving a more general problem: how to cheaply get a single good thing, if you have access to many people willing to submit and evaluate distinguishable things, and you have access to at least one possibly expensive judge who can rank these things. With access to this general pick-a-best mechanism, you can just ask people to submit explanations of a market estimate, and then deliver a single best explanation that you expect to be rated highly by your judge.

In more detail, you need five things:

  1. a prize Z you can pay to whomever submits the winning item,
  2. a community of people willing to submit candidate items to be evaluated for this prize, and to post bonds in the amount B supporting their submissions,
  3. an expensive (cost J) and trustworthy “gold standard” judge who has an error-prone tendency to pick the “better” item out of two items submitted,
  4. a community of people who think that they can guess on average how the judge will rate items, with some of these people being right about this belief, and
  5. a costly (amount B) and only mildly error-prone way to decide if one submission is overly derivative of another.

With these five things, you can get a pretty good thing if you pay Z+J. The more Z you offer, the better will be your good thing. Here is the procedure. First, anyone in a large community may submit candidates c, if they post a bond B for each submission. Each candidate c is publicly posted as it becomes available.

A prediction market is open on all candidates submitted so far, with assets of the form “Pays $1 if c wins.” We somehow define prices pc for such assets which satisfy 1 = pY + (the sum of pc over all candidates c), where pY is the price of the asset “The winner is not yet submitted.” Submissions are not accepted after some deadline, and at that point I recommend the candidate c with the highest price pc; that will be a good choice. But to make it a good choice, the procedure has to continue.

A time is chosen randomly from a final time window (such as a final day) after the deadline. We use the market prices pc at that random time to pick a pair of candidates to show the judge. We draw twice randomly (with replacement) using the price pc as the random chance of picking each c. The judge then picks a single tentative winning candidate w out of this pair.
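The pair selection step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pick_pair_for_judge`, the dictionary layout, and the example candidate names are all hypothetical, and the prices are treated as already summing to one (that is, pY is taken to be zero at the snapshot time).

```python
import random

def pick_pair_for_judge(prices, rng=None):
    """Draw two candidates, with replacement, using prices as chances.

    `prices` maps each candidate to its market price pc at the randomly
    chosen snapshot time. The name and data layout are illustrative
    assumptions, not part of any real market software.
    """
    rng = rng or random.Random()
    candidates = list(prices)
    weights = [prices[c] for c in candidates]
    # random.choices samples with replacement, so the same candidate
    # can be drawn on both draws.
    first, second = rng.choices(candidates, weights=weights, k=2)
    return first, second

# Hypothetical snapshot of prices for three submitted explanations.
snapshot = {"explanation_a": 0.6, "explanation_b": 0.3, "explanation_c": 0.1}
pair = pick_pair_for_judge(snapshot, random.Random(0))
```

With these example prices, a candidate priced at 0.6 is missed by both draws with chance 0.4 × 0.4 = 0.16, so it appears in the pair shown to the judge with chance 0.84.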

Anyone who submitted a candidate before w can challenge it within a limited challenge time window, claiming that the tentative winner w is overly derivative of their earlier submission e. An amount B is then spent to judge if w is derivative of e. If w is not judged derivative, then the challenger forfeits their bond B, and w remains the tentative winner. If w is judged derivative, then the tentative winner forfeits their bond B, and the challenger becomes a new tentative winner. We need potential challengers to expect a less than B/Z chance of a mistaken judgement regarding something being derivative.
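The bond condition can be checked with a little arithmetic. The sketch below is one simple way to model it, not a derivation from the procedure itself: it assumes a challenger whose earlier submission was not actually copied collects the prize Z if the derivative judgement errs in their favor, and forfeits the bond B otherwise. The function names are hypothetical.

```python
def frivolous_challenge_gain(prize_z, bond_b, mistake_chance):
    """Expected gain from challenging a winner that is NOT derivative.

    Assumption: with probability `mistake_chance` the derivative
    judgement errs and the challenger ends up with the prize Z;
    otherwise the challenger forfeits the bond B.
    """
    return mistake_chance * prize_z - (1 - mistake_chance) * bond_b

def break_even_mistake_chance(prize_z, bond_b):
    """Mistake chance at which a frivolous challenge has zero expected gain."""
    return bond_b / (prize_z + bond_b)

# Hypothetical numbers: a prize of 1000 and a bond of 50.
threshold = break_even_mistake_chance(1000, 50)   # 50 / 1050
gain_if_rare_mistakes = frivolous_challenge_gain(1000, 50, 0.01)
```

Under this model the break-even chance is B/(Z+B), which is close to B/Z whenever the bond is small relative to the prize. With a 1% mistake chance, the frivolous challenger in the example expects to lose 39.5, so the challenge is not worth making.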

Once all challenges are resolved, the tentative winner becomes the official winner, the person who submitted it is given a large prize Z, and prediction market betting assets are paid off. The end.

This process can easily be generalized in many ways. There could be more than one judge, each judge could be given more than two items to rank, the prediction markets could be subsidized, the chances of picking candidates c to show judges might be non-linear in market prices pc, and when setting such chances prices could be averaged over a time period. If pY is not zero when choosing candidates to evaluate, the prices pc could be renormalized. We might add prediction markets in whether any given challenge would be successful, and allow submissions to be withdrawn before a formal challenge is made.
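Two of these generalizations, renormalizing when pY is not zero and making selection chances non-linear in price, can be sketched together. The function name and the `exponent` knob are hypothetical; the sketch assumes the prices satisfy 1 = pY + (the sum of pc), as in the basic procedure.

```python
def selection_chances(prices, p_not_yet, exponent=1.0):
    """Turn market prices into chances of being shown to the judge.

    prices:    candidate -> price pc
    p_not_yet: price pY of "The winner is not yet submitted"
    exponent:  hypothetical knob; 1.0 is linear in price, while values
               above 1.0 tilt the draw toward front-runners.
    """
    if p_not_yet >= 1.0:
        raise ValueError("no probability mass on submitted candidates")
    # Dividing by (1 - pY) renormalizes so candidate prices sum to one.
    renormalized = {c: p / (1.0 - p_not_yet) for c, p in prices.items()}
    weights = {c: p ** exponent for c, p in renormalized.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

# Hypothetical prices with 20% still on "not yet submitted".
linear = selection_chances({"a": 0.6, "b": 0.2}, p_not_yet=0.2)
tilted = selection_chances({"a": 0.6, "b": 0.2}, p_not_yet=0.2, exponent=2.0)
```

In this example the linear version gives candidate "a" a 0.75 chance on each draw, while squaring the renormalized prices raises that to 0.9.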

Now I haven’t proven a theorem to you that this all works well, but I’m pretty sure that it does. By offering a prize for submissions, and allowing bets on which submissions will win, you need only make one expensive judgement between a pair of items, and have access to an expensive way to decide if one submission is overly derivative of another.

I suspect this mechanism may be superior to many of the systems we now use to choose winners. Many existing systems frequently invoke low quality judges, instead of less frequently invoking higher quality judges. I suspect that market estimates of high quality judgements may often be better than direct application of low quality judgements.


Dealism, Futarchy, and Hypocrisy

Many people analyze and discuss the policies that might be chosen by organizations such as governments, charities, clubs, and firms. We economists have a standard set of tools to help with such analysis, and in many contexts a good economist can use such tools to recommend particular policy options. However, many have criticized these economic tools as representing overly naive and simplistic theories of morality. In response I’ve said: policy conversations don’t have to be about morality. Let me explain.

A great many people presume that policy conversations are of course mainly about what actions and outcomes are morally better; which actions do we most admire and approve of ethically? If you accept this framing, and if you see human morality as complex, then it is reasonable to be wary of mathematical frameworks for policy analysis; any analysis of morality simple enough to be put into math could lead to quite misleading conclusions. One can point to many factors, given little attention by economists, but which are often considered relevant for moral analysis.

However, we don’t have to see policy conversations as being mainly about morality. We can instead look at them as being more about people trying to get what they want, and using shared advisors to help. We economists make great use of the concept of “revealed preference”; we infer what people want from what they do, and we expect people to continue to act to get what they want. Part of what people want is to be moral, and to be seen as moral. But people also want other things, and sometimes they make tradeoffs, choosing to get less morality and more of these other things. Continue reading "Dealism, Futarchy, and Hypocrisy" »


Prediction Markets Update

Prediction markets continue to offer great potential to improve society at many levels. Their greatest promise lies in helping organizations to better aggregate info to enable better key decisions. However, while such markets have consistently performed well in terms of cost, accuracy, ease of use, and user satisfaction, they have also tended to be politically disruptive – they often say things that embarrass powerful people, who get them killed. It is like putting a smart autist in the C-suite, someone who has lots of valuable info but is oblivious to the firm’s political landscape. Such an executive just wouldn’t last long, no matter how much they knew.

Like most promising innovations, prediction markets can’t realize their potential until they have been honed and evaluated in a set of increasingly substantial and challenging trials. Abstract ideas must be married to the right sort of complementary details that allow them to function in specific contexts. For prediction markets, real organizations with concrete forecasting needs related to their key decisions need to experiment with different ways to field prediction markets, in search of arrangements that minimize political disruption. (If you know of an organization willing to put up with the disruption that such experimentation creates, I know of a patron willing to consider funding such experiments.)

Alas, few such experiments have been happening. So let me tell you what has been happening instead. Continue reading "Prediction Markets Update" »


MRE Futures, To Not Starve

The Meal, Ready-to-Eat – commonly known as the MRE – is a self-contained, individual field ration in lightweight packaging bought by the United States military for its service members for use in combat or other field conditions where organized food facilities are not available. While MREs should be kept cool, they do not need to be refrigerated. .. MREs have also been distributed to civilians during natural disasters. .. Each meal provides about 1200 Calories. They .. have a minimum shelf life of three years. .. MREs must be able to withstand parachute drops from 380 metres, and non-parachute drops of 30 metres. (more)

Someday, a global crisis, or perhaps a severe regional one, may block 10-100% of the normal food supply for up to several years. This last week I attended a workshop set up by ALLFED, a group exploring new food sources for such situations. It seems that few people need to starve, even if we lose 100% of food for five years! And feeding everyone could go a long way toward keeping such a crisis from escalating into a worse catastrophic or existential risk. But for this to work, the right people, with the means and will to act, need to be aware of the right options at the right time. And early preparation, before a crisis, may go a long way toward making this feasible. How can we make this happen?

In this post I will outline a plan I worked out at this workshop, a plan intended to simultaneously achieve several related goals:

  1. Support deals for food insurance expressed in terms that ordinary people might understand and trust.
  2. Create incentives for food producers, before and during a crisis, to find good local ways to make and deliver food.
  3. Create incentives for researchers to find new food sources, develop working processes, and demonstrate their feasibility.
  4. Share information about the likelihood and severity of food crises in particular times, places, and conditions.

My idea starts with a new kind of MRE, one inspired by but not the same as the familiar military MRE. This new MRE would also be ready to eat without cooking, and also have minimum requirements for calories (after digesting), nutrients, lack of toxins, shelf life, and robustness to shocks. But, and this is key, suppliers would be free to meet these requirements using a wide range of exotic food options, including bacteria, bugs, and rats. (Or more conventional food made in unusual ways, like sugar from corn stalks or cows eating tree leaves.) It is this wide flexibility that could actually make it feasible to feed most everyone in a crisis. MREs might be graded for taste quality, perhaps assigned to three different taste quality levels by credentialed food tasters.

As an individual, you might want access to a source of MREs in a crisis. So you, or your family, firm, club, city, or nation, may want to buy or arrange for insurance which guarantees access to MREs in a crisis. A plausible insurance deal might promise access to so many MREs of a certain quality level per time period, delivered at standard periodic times to a standard location “near” you. That is, rather than deliver MREs to your door on demand, you might have to show up at a certain more central location once a week or month to pick up your next batch of MREs.

The availability of these MREs might be triggered by a publicly observable event, like a statistical average of ordinary food prices over some area exceeding a threshold. Or, more flexibly, standard MRE insurance might always give one the right to buy, at a pre-declared high price and at standard places and times, a certain number of MREs per time period. Those who fear not having enough cash to pay this pre-declared MRE price in a crisis might separately arrange for straight financial insurance, which pays cash tied either to a publicly triggered event, or to a market MRE price. Or the two approaches could be combined, so that MREs are available at a standard price during certain public events.
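To make the two kinds of deal concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names, the use of a simple mean as the “statistical average,” and the example numbers are hypothetical, and the right to buy is valued naively as the gap between the market price and the pre-declared price.

```python
from statistics import mean

def crisis_triggered(area_food_prices, threshold):
    """Public trigger: average ordinary food price exceeds a threshold."""
    return mean(area_food_prices) > threshold

def right_to_buy_value(market_price, declared_price, mres_per_period):
    """Per-period value of the right to buy MREs at a pre-declared price.

    The right is worth nothing while MREs trade below the declared
    price, and is worth the price gap on each MRE once they trade above.
    """
    return max(market_price - declared_price, 0.0) * mres_per_period

# Hypothetical numbers: declared price 10 per MRE, 30 MREs per period.
calm = right_to_buy_value(market_price=5.0, declared_price=10.0, mres_per_period=30)
crisis = right_to_buy_value(market_price=25.0, declared_price=10.0, mres_per_period=30)
```

In calm times the right is worth nothing, which is why the pre-declared price can be high; if the market price rises to 25 in a crisis, the same right is worth 450 per period.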

The organizations that offer insurance need ways to assure customers that they can actually deliver on their promises to offer MREs at the stated times, places, and prices, given relevant public events. In addition, they want to minimize the prices they pay for these supplies of MREs, and encourage suppliers to search for low cost ways to make MREs.

This is where futures markets could help. In a futures market for wheat, people promise to deliver, or to take delivery, of certain quantities of certain types of wheat at particular standard times and places. Those who want to ensure a future supply of wheat against risks of changing prices can buy these futures, and those who grow wheat can ensure a future revenue for their wheat by selling futures. Most traders in futures markets are just speculating, and so arrange to leave the market before they’d have to make or take delivery. But the threat of making or taking delivery disciplines the prices that they pay. Those who fail to make or take delivery as promised face large financial and other penalties.

Analogously, those who offer MRE insurance could use MRE futures markets to ensure an MRE supply, and convince clients that they have ensured a supply. Yes, compared to the terms of the insurance offered by insurance organizations, the futures markets may offer fewer standard times, places, quality levels, and triggering public events. (Though the lab-tested but not field-tested tech of combinatorial markets makes far more combinations feasible.) Even so, customers might find it easy to believe that, if necessary, an organization that has bought futures for a few standard times and places could actually take delivery of these futures contracts, store the MREs for short periods, and deliver them to the more numerous times and places specified in their insurance deals.

MRE futures markets could also assure firms who explore innovative ways to make MREs of a demand for their product. By selling futures to deliver MREs at the standard times and places, they might fund their research, development, and production. When it came time to actually deliver MREs, they might make side deals with local insurance organizations to avoid any extra storage and transport costs of actually transferring MREs according to the futures contract details.

To encourage innovation, and to convince everyone that the system actually works, some patron, perhaps a foundation or government, could make a habit of periodically but randomly announcing large buy orders for MRE futures at certain times and places in the near future. They actually take delivery of the MREs, and then auction them off to whomever shows up there then to taste the MREs at a big social event. In this way ordinary people can sometimes hold and taste the MREs, and we can all see that there is a system capable of producing and delivering at least modest quantities on short notice. The firms who supply these MREs will of course have to set up real processes to actually deliver them, and be paid big premiums for their efforts.

These new MREs may not meet current regulatory requirements for food, and it may not be easy to adapt them to meet such requirements. Such requirements should be relaxed in a crisis, via a new crisis regulatory regime. It would be better to set that regime up ahead of time, instead of trying to negotiate it during a crisis. Such a new regulatory regime could be tested during these periodic random big MRE orders. Regulators could test the delivered MREs and only let people eat the ones that pass their tests. Firms that had passed tests at previous events might be pre-approved for delivering MREs to future events, at least if they didn’t change their product too much. And during a real crisis, such firms could be pre-approved to rapidly increase production and delivery of their product. This offers an added incentive for firms to participate in these tests.

MRE futures markets might also help the world to coordinate expectations about which kinds of food crises might appear when under what circumstances. Special conditional futures contracts could be created, where one only promises to deliver MREs given certain world events or policies. If the event doesn’t happen, you don’t have to deliver. The relative prices of future contracts for different events and policies would reveal speculator expectations about how the chance and severity of food crises depend on such events and policies.

And that’s my big idea. Yes it will cost real resources, and I of course hope we never have to use it in a real crisis. But it seems to me far preferable to most of us starving to death. Far preferable.


Compare Institutions To Institutions, Not To Perfection

Mike Thicke of Bard College has just published a paper that concludes:

The promise of prediction markets to solve problems in assessing scientific claims is largely illusory, while they could have significant unintended consequences for the organization of scientific research and the public perception of science. It would be unwise to pursue the adoption of prediction markets on a large scale, and even small-scale markets such as the Foresight Exchange should be regarded with scepticism.

He gives three reasons:

[1.] Prediction markets for science could be uninformative or deceptive because scientific predictions are often long-term, while prediction markets perform best for short-term questions. .. [2.] Prediction markets could produce misleading predictions due to their requirement for determinable predictions. Prediction markets require questions to be operationalized in ways that can subtly distort their meaning and produce misleading results. .. [3.] Prediction markets offering significant profit opportunities could damage existing scientific institutions and funding methods.

Imagine that you want to travel to a certain island. Someone else tells you to row a boat there, but I tell you that a helicopter seems more cost effective for your purposes. So the rowboat advocate replies, “But helicopters aren’t as fast as teleportation, they take longer and cost more to go longer distances, and you need more expert pilots to fly in worse weather.” All of which is true, but not very helpful.

Similarly, I argue that with each of his reasons, Thicke compares prediction markets to some ideal of perfection, instead of to the actual current institutions they are intended to supplement. Let’s go through them one by one. On 1:

Even with rational traders who correctly assess the relevant probabilities, binary prediction markets can be expected to have a bias towards 50% predictions that is proportional to their duration. .. it has been demonstrated both empirically and theoretically .. long-term prediction markets typically have very low trading volume, which makes it unlikely that their prices react correctly to new information. .. [Hanson] envisions Wegener offering contracts ‘to be judged by some official body of geologists in a century’, but this would not have been an effective criterion given the problem of 50%-bias in long-term prediction markets. .. Prediction markets therefore would have been of little use to Wegener.

First, a predictable known distortion isn’t a problem at all for forecasts; just invert the distortion to get the accurate forecast. Second, this is much less of an issue in combinatorial markets, where all questions are broken into thousands or more tiny questions, all of which have tiny probabilities, and a global constraint ensures they all add up to one. But more fundamentally, all institutions face the same problem: all else equal, it is easier to give incentives for accurate short term predictions than for long term ones. This doesn’t show that prediction markets are worse in this case than status quo institutions. On 2:

Even if prediction markets correctly predict measured surface temperature, they might not predict actual surface temperature if the measured and actual surface temperatures diverge. .. Globally averaged surface air temperature [might be] a poor proxy for overall global temperature, and consequently prediction market prices based on surface air temperature could diverge from what they purport to predict: global warming. .. If interpreting the results of these markets requires detailed knowledge of the underlying subject, as is needed to distinguish global average surface air temperature from global average temperature, the division of cognitive labour promised by these markets will disappear. Perhaps worse, such predictions could be misinterpreted if people assume they accurately represent what they claim to.

All social institutions of science must deal with the facts that there can be complex connections between abstract theories and specific measurements, and that ignorant outsiders may misinterpret summaries. Yes prediction market summaries might mislead some, but then so can grant and article abstracts, or media commentary. No, prediction markets can’t make all such complexities go away. But this hardly means that prediction markets can’t support a division of labor. For example, in combinatorial prediction markets different people can specialize in the connections between different variables, together managing a large Bayesian network of predictions. On 3:

If scientists anticipate that trading on prediction markets could generate significant profits, either due to being subsidized .. or due to legal changes allowing significant amounts of money to be invested, they could shift their attention toward research that is amenable to prediction markets. The research most amenable to prediction markets is short-term and quantitative: the kind of research that is already encouraged by industry funding. Therefore, prediction markets could reinforce an already troubling push toward short-term, application-oriented science. Further, scientists hoping to profit from these markets could withhold salient data in anticipation of using that data to make better informed trades than their peers. .. If success in prediction markets is taken as a marker of scientific credibility, then scientists may pursue prediction-oriented research not to make direct profit, but to increase their reputation.

Again, all institutions work better on short term questions. The fact that prediction markets also work better on short term questions does not imply that using them creates more emphasis on short term topics, relative to using some other institution. Also, every institution of science must offer individuals incentives, incentives which distract them from other activities. Such incentives also imply incentives to withhold info until one can use that info to one’s maximal advantage within the system of incentives. Prediction markets shouldn’t be compared to some perfect world where everyone shares all info without qualification; such worlds don’t exist.

Thicke also mentioned:

Although Hanson suggests that prediction market judges may assign non-binary evaluations of predictions, this seems fraught with problems. .. It is difficult to see how such judgements could be made immune from charges of ideological bias or conflict of interest, as they would rely on the judgement of a single individual.

Market judges don’t have to be individuals; there could be panels of judges. And existing institutions are also often open to charges of bias and conflicts of interest.

Unfortunately many responses to reform proposals fit the above pattern: reject the reform because it isn’t as good as perfection, ignoring the fact that the status quo is nothing like perfection.


A Call To Adventure

I turn 58 soon, and I’m starting to realize that I may not live long enough to finish many of my great life projects. So I want to try to tempt younger folks to continue them. Hence this call to adventure.

One way to create meaning for your life is to join a grand project. Or start a new one. A project that is both obviously important, and that might also bring you personal glory, if you were to make a noticeable contribution to it.

Yes, most don’t seek meaning this way. But many of our favorite fictional characters do. If you are one of the few who find grand adventures irresistibly romantic, then this post is for you. I call you to adventure.

Two great adventures actually, in this post. Both seem important, and in the ballpark of doable, at least for the right sort of person.

ADVENTURE ONE: The first adventure is to remake collective decision-making via decision markets (a.k.a. futarchy). Much of the pain and loss in the world results from bad decisions by key organizations, such as firms, clubs, cities, and nations. Some of these bad decisions result because actors with the wrong mix of values hold too much power. But most result from our not aggregating info well; people who could have or did know better were not enticed enough to share what they know. Or others didn’t believe them.

We actually know of a family of simple robust mechanisms that typically do much better at aggregating info. And we have a rough idea of how organizations could use such mechanisms. We even had a large academic literature testing and elaborating these mechanisms, resulting in a big pile of designs, theorems, software, computer simulations, lab tests, and field tests. We don’t need more of these, at least for now.

What we need is concrete evolution within real organizations. Like most good abstract ideas, what this innovation most needs are efforts to work out variations that can fit well in particular existing organization contexts. That is, design and try out variations that can avoid the several practical obstacles that we know about, and help identify more such obstacles to work on.

This adventure needs intellectuals less than it needs sharp folks willing to get their hands dirty dealing with the complexities of real organizations, and with enough pull to get real organizations near them to try new and disruptive methods.

Since these mechanisms have great potential in a wide range of organizations, we first need to create versions that are seen to work reliably over a substantial time in concrete contexts where substantial value is at stake. With such a concrete track record, we can then push to get related versions tried in related contexts. Eventually such diffusion could result in better collective decision making worldwide, for many kinds of organizations and decisions.

And you might have been one of the few brave far-sighted heroes who made it happen.

ADVENTURE TWO: The second adventure is to figure out real typical human motives in typical familiar situations. You might think we humans would have figured this out long ago. But as Kevin Simler and I argue in our new book The Elephant in the Brain: Hidden Motives in Everyday Life, we seem to be quite mistaken about our basic motives in many familiar situations.

Kevin and I don’t claim that our usual stated motives aren’t part of the answer, only that they are much less than we like to think. We also don’t claim to have locked down the correct answer in all these situations. We instead offer plausible enough alternatives to suggest that the many puzzles with our usual stories are due to more than random noise. There really are systematic hidden motives behind our behaviors, motives substantially different from the ones we claim.

A good strategy for uncovering real typical human motives is to triangulate the many puzzles in our stated motives across a wide range of areas of human behavior. In each area specialists tend to think that the usual stated motive deserves to be given a strong prior, and they rarely think we’ve acquired enough “extraordinary evidence” to support the “extraordinary claims” that our usual stated motives are wrong. And if you only ever look at evidence in a narrow area, it can be hard to escape this trap.

The solution is to expect substantial correlations between our motives in different areas. Look for hidden motive explanations of behaviors that can simultaneously account for puzzles in a wide range of areas, using only a few key assumptions. By insisting on a high ratio of apparently different puzzles explained to new supporting assumptions made, you can keep yourself disciplined enough not to be fooled by randomness.

This strategy is most effective when executed over a lifetime. The more different areas that you understand well enough to see the key puzzles and usual claims, the better you can triangulate their puzzles to find common explanations. And the more areas that you have learned so far, the easier it becomes to learn new areas; areas and methods used to study them tend to have many things in common.

This adventure needs more intellectual heroes. While these heroes may focus for a time on studying particular areas, over the long run their priority is to learn and triangulate many areas. They seek simple coherent accounts that explain diverse areas of human behavior. To figure out what the hell most humans are actually up to most of the time. Which we do not actually know now. And which would enable better policy; today policy reform efforts are often wasted due to mistaken assumptions about actual motives.

Wouldn’t someone who took a lifetime to help work that out be a hero of the highest order?

Come, adventures await. For the few, the brave, the determined, the insightful. Might that be you?


Why We Mix Fact & Value Talk

For a while now I’ve been tired of the US political drama, and I’ve been hoping that others would tire of it as well. Then maybe we could talk about something else, like say, my books. So I was thinking of writing a post reminding folks about futarchy, saying that politics doesn’t have to be this way. That is, we could largely (if not entirely) separate the political processes that deal with facts and values. In this case, even when there’s a big change in which values set policy, the fact estimates that set policy could remain the same, and be very expert.

In contrast, most of our current political processes mix up facts and values. The candidates we vote for, the bills they adopt, and the rulings that agencies make, all represent bundles of opinions on both facts and values. As a result, the fact estimates implicit in policy choices are less than fully expert, as such estimates must appeal to the citizens, politicians, administrators, etc. who we choose in part for their value positions. And so, to influence the values that our systems use, we must each talk about facts as well, even when we aren’t personally very expert on those facts.

On reflection, however, I think I had it wrong. Most of those engaged by the current US political drama are enjoying it, even if they say otherwise. They get a rare chance to feel especially self-righteous, and to bond more strongly with political allies. And I think the usual mixing of facts and values actually helps them achieve these ends. Let me explain.

For the purpose of making effective decisions, on average the best mix of fact vs. value in analysis has over 90% of the attention go to facts. Yes, you need to pay some attention to values, but most of the devil is in the details, and most of the relevant details are on facts. This is true at all levels, including personal, family, firm, church, city, state, and national levels.

However, for the purpose of feeling self-righteous and bonding with allies, value talk is much more potent than fact talk. You need to believe that your values are superior to feel self-righteous, and shared values bond you with allies much more strongly than do shared facts. Yet even for this purpose, the ideal conversation isn’t more than 90% focused on values; something closer to a 50-50 mix works better.

The problem is that when we frame a debate as a pure value disagreement, we actually find it harder to feel sufficiently and obviously superior, and to dismiss the other side. We aren’t really as confident in our value positions as we pretend. We can see how observers might perceive a symmetry between us and our opponents, and label us unfair if we just try to crush the other side to achieve our values at the expense of their values.

However, by mixing enough facts into a value discussion, we can explain to ourselves and others why crushing them is really best for everyone. We can say that they just don’t understand that global warming is a real thing, or that kids really need two parents to grow up healthy. It is the other side’s failure to accept key facts that can justify to outsiders our uncompromising determination to crush them for a total win. Later on they may see we were right, and even thank us. But even if that doesn’t happen, right now we can feel justified in dismissing them.

I expect this dynamic plays out not only in national politics, but also in firm, church, and family politics. And it helps explain our widespread reluctance to adopt prediction markets, and other neutral fact estimation methods such as experiments, in relatively political contexts. We regularly want to support decisions that advance the values we share with our political allies, but we prefer the cover of seeming to be focused on estimating facts. To successfully use facts as a cover for values, we need to have enough fact issues mixed into our debates. And we need to avoid out-of-control fact estimation mechanisms that lack enough adjustment knobs to let us get the answers we want.

Surprising Popularity

This week Nature published some empirical data on a surprising-popularity consensus mechanism (a previously published mechanism, e.g., Science in 2004, with variations going by the name “Bayesian Truth Serum”). The idea is to ask people to pick from several options, and also to have each person forecast the distribution of opinion among others. The options that are picked surprisingly often, compared to what participants on average expected, are suggested as more likely true, and those who pick such options as better informed.
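
To make the mechanics concrete, here is a minimal Python sketch of the scoring step, under my own simplifying assumptions: each participant picks one option and also forecasts the share of participants picking each option, and we compare the actual shares to the plain average of those forecasts. The function name `surprisingly_popular`, the option labels, and the toy numbers are all made up for illustration; the published mechanisms include details (such as payment rules) that this sketch leaves out.

```python
def surprisingly_popular(votes, forecasts):
    """Return (winner, surprise), where surprise[o] = actual share - average forecast share.

    votes:     list of chosen options, one per participant
    forecasts: list of dicts mapping option -> forecast share of participants picking it
    """
    options = sorted(set(votes) | {o for f in forecasts for o in f})
    actual = {o: votes.count(o) / len(votes) for o in options}
    expected = {
        o: sum(f.get(o, 0.0) for f in forecasts) / len(forecasts) for o in options
    }
    surprise = {o: actual[o] - expected[o] for o in options}
    winner = max(options, key=lambda o: surprise[o])
    return winner, surprise


# Hypothetical survey: 6 of 10 people pick "big_city", 4 pick "capital_city".
votes = ["big_city"] * 6 + ["capital_city"] * 4
# Those picking "big_city" expect 80% agreement; those picking "capital_city"
# expect that 70% of people will (wrongly, in their view) pick "big_city".
forecasts = [{"big_city": 0.8, "capital_city": 0.2}] * 6 + [
    {"big_city": 0.7, "capital_city": 0.3}
] * 4

winner, surprise = surprisingly_popular(votes, forecasts)
print(winner)  # prints capital_city
```

In this toy case the minority answer wins: 40% picked it, while participants on average forecast only 24%, so it is more popular than expected.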

Compared to prediction markets, this mechanism doesn’t require that those who run the mechanism actually know the truth later. Which is indeed a big advantage. This mechanism can thus be applied to most any topic, such as the morality of abortion, the existence of God, or the location of space aliens. Also, incentives can be tied to this method, as you can pay people based on how well they predict the distribution of opinion. The big problem with this method, however, is that it requires that learning the truth be the cheapest way to coordinate opinion. Let me explain.

When you pay people for better predicting the distribution of opinion, one way they can do this prediction task is to each look for and report their best estimate of the truth. If everyone does this, and if participant errors and mistakes are pretty random, then those who do this task better will in fact have a better estimate of the distribution of opinion.

For example, imagine you are asked which city is the capital of a particular state. Imagine you are part of a low-incentive one-time survey, and you don’t have an easy way to find and communicate with other survey participants. In this case, your best strategy may well be to think about which city is actually the capital.

Of course even in this case your incentive is to report the city that most sources would say is the capital. If you (and a few others) in fact know that according to the detailed legal history another city is rightfully the capital, not the city that the usual records give, your incentive is still to go with usual records.

More generally, you want to join the largest coalition who can effectively coordinate to give the same answers. If you can directly talk with each other, then you can agree on a common answer and report that. If not, you can try to use prearranged Schelling points to figure out your common answer from the context.

If this mechanism were repeated, say daily, then a safe way to coordinate would be to report the same answer as yesterday. But since everyone can easily do this too, it doesn’t give your coalition much of a relative advantage. You only win against those who make mistakes in implementing this obvious strategy. So you might instead coordinate to change your group’s answer each day based on some commonly observed changing signal.

To encourage this mechanism to better track truth, you’d want to make it harder for participants to coordinate their answers. You might ask random people at random times to answer quickly, put them in isolated rooms where they can’t talk to others, and ask your questions in varying and unusual styles that make it hard to guess how others will frame those questions. Prefer participants with more direct personal reasons to care about telling related truth, and prefer those who used different ways to learn about a topic. Perhaps ask different people for different overlapping parts and then put the final answer together yourself from those parts. I’m not sure how far you could get with these tricks, but they seem worth a try.

Of course these tricks are nothing like the way most of us actually consult experts. We are usually eager to ask standard questions to standard experts who coordinate heavily with each other. This is plausibly because we usually care much more about getting the answers that others will also get, so that we don’t look foolish when we parrot those answers to others. That is, we care more about getting a coordinated standard answer than a truthful answer.

Thus I actually see a pretty bright future for this surprisingly-popular mechanism. I can see variations on it being used much more widely to generate standard safe answers that people can adopt with less fear of seeming strange or ignorant. But those who actually want to find true answers, even when such answers are contrarian, will need something closer to prediction markets.

Needed: Social Innovation Adaptation

This is the point during the electoral cycle when people are most willing to consider changing political systems. The nearly half of voters whose candidates just lost are now most open to changes that might have let their side win. But even in an election this acrimonious, that interest is paper thin, and blows away in the slightest breeze. Because politics isn’t about policy – what we really want is to feel part of a political tribe via talking with them about the same things. So if the rest of your tribe isn’t talking about system change, you don’t want to talk about that either.

So I want to tell or remind everyone that if you actually did care about outcomes instead of feeling part of a big tribe, large social gains wait untapped in better social institutions. In particular, very large gains await detailed field trials of institutional innovations. Let me explain.

Long ago when I was a physicist turned computer researcher who started to study economics, I noticed that it seemed far easier to design new better social institutions than to design new better computer algorithms or physical devices. This helped inspire me to switch to economics.

Once I was in a graduate program with a thesis advisor who specialized in institution/mechanism design, I seemed to see a well established path for social innovations, from vague intuitions to theoretical analysis to lab experiments to simplified field experiments to complex practice. Of course as with most innovation paths, as costs rose along the path most candidates fell by the wayside. And yes, designing social institutions was harder than it looked at first, though it still seems easier than for computers and physical devices.

But it took me a long time to learn that this path is seriously broken near the end. Organizations with real problems do in fact sometimes allow simplified field trials of institutional alternatives that social scientists have proposed, but only in a very limited range of areas. And usually they mainly just do this to affiliate with prestigious academics; most aren’t actually much interested in adopting better institutions. (Firms mostly outsource social innovation to management consultants, who don’t actually endorse much. Yes startups explore some innovations, but relatively few.)

So by now academics have accumulated a large pile of promising institution ideas, many of which have supporting theory, lab experiments, and even simplified field trials. In addition, academics have even larger literatures that measure and theorize about existing social institutions. But even after promising results from simplified field experiments, much work usually remains to adapt such new proposals to the many complex details of existing social worlds. Complex worlds can’t usefully digest abstract academic ideas without such adaptation.

And the bottom line is that we very much lack organizations willing to do that work for social innovations. Organizations do this work more often for computer or device innovations, and sometimes social innovations get smuggled in via that route. A few organizations sometimes work on social innovations directly, but mostly to affiliate with prestigious academics, so if you aren’t such an academic you mostly can’t participate.

This is the point where I’ve found myself stuck with prediction & decision markets. There has been prestige and funding to prove theorems, do lab experiments, analyze field datasets, and even do limited simplified field trials. But there is little prestige or funding for that last key step of adapting academic ideas to complex social worlds. It’s hard to apply rigorous general methods in such efforts, and so it’s hard to publish academically on them. (Even interested blockchain folks have mainly been writing general code, not working with messy organizations.)

So if you want to make clubs, firms, cities, nations, and the world more effective and efficient, a highly effective strategy is to invest in widening the neglected bottleneck of the social innovation pathway. Get your organization to work on some ideas, or pay other organizations to work on them. Yes some ideas can only be tried out at large scales, but for most there are smaller scale analogues that it makes sense to work on first. I stand ready to help organizations do this for prediction & decision markets. But alas to most organizations I lack sufficient prestige for such associations.

Big Impact Isn’t Big Data

A common heuristic for estimating the quality of something is: what has it done for me lately? For example, you could estimate the quality of a restaurant via a sum or average of how much you’ve enjoyed your meals there. Or you might weight recent visits more, since quality may change over time. Such methods are simple and robust, but they aren’t usually the best. For example, if you know of others who ate at that restaurant, their meal enjoyment is also data, data that can improve your quality estimate. Yes, those other people might have different meal priorities, and that may be a reason to give their meals less weight than your meals. But still, their data is useful.

Consider an extreme case where one meal, say your wedding reception meal, is far more important to you than the others. If you weigh your meal experiences in proportion to meal importance, your whole evaluation may depend mainly on one meal. Yes, if meals of that important type differ substantially from other meals then using this method best avoids biases from using unimportant types of meals to judge important types. But the noise in your estimate will be huge; individual restaurant meals can vary greatly for many random reasons even when the underlying quality stays the same. You just won’t know much about meal quality.

I mention all this because many seem eager to give the recent presidential election (and the recent Brexit vote) a huge weight in their estimate of the quality of various prediction sources. Sources that did poorly on those two events are judged to be poor sources overall. And yes, if these were by far the most important events to you, this strategy avoids the risk that familiar prediction sources have a different accuracy on events like this than they do on other events. Even so, this strategy mostly just puts you at the mercy of noise. If you use a small enough set of events to judge accuracy, you just aren’t going to be able to see much of a difference between sources; you will have little reason to think that those sources that did better on these few events will do much better on other future events.
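
A rough calculation illustrates how little a few events can tell you. The sketch below assumes, purely for illustration, two sources that each call independent binary events correctly with fixed probabilities of 0.75 and 0.65. These numbers and the function names are hypothetical, not estimates for any real source. It computes the exact chance that the better source scores strictly more correct calls than the worse one.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_better_source_ahead(n, p_good, p_bad):
    """Chance the better source gets strictly more of n events right than the worse one.

    Assumes events are independent and the two sources err independently.
    """
    return sum(
        binom_pmf(i, n, p_good) * binom_pmf(j, n, p_bad)
        for i in range(n + 1)
        for j in range(i)  # j < i: the worse source has fewer correct calls
    )


few = prob_better_source_ahead(2, 0.75, 0.65)
many = prob_better_source_ahead(200, 0.75, 0.65)
print(f"2 events: {few:.3f}; 200 events: {many:.3f}")
```

Under these assumptions, judging on two events puts the truly better source strictly ahead only about 37% of the time, with ties and reversals making up the rest. With 200 events it comes out ahead more than nine times in ten.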

Me, I don’t see much reason to think that familiar prediction sources have an accuracy that is very different on the most important events, relative to other events, and so I mainly trust comparisons that use a lot of data. For example, on large datasets prediction markets have shown a robustly high accuracy compared to other sources. Yes, you might find other particular sources that seem to do better in particular areas, but you have to worry about selection effects – how many similar sources did you look at to find those few winners? And if prediction market participants became convinced that these particular sources had high accuracy, they’d drive market prices to reflect those predictions.
