Prediction Markets Update

Prediction markets continue to offer great potential to improve society at many levels. Their greatest promise lies in helping organizations to better aggregate info to enable better key decisions. However, while such markets have consistently performed well in terms of cost, accuracy, ease of use, and user satisfaction, they have also tended to be politically disruptive – they often say things that embarrass powerful people, who get them killed. It is like putting a smart autist in the C-suite, someone who has lots of valuable info but is oblivious to the firm’s political landscape. Such an executive just wouldn’t last long, no matter how much they knew.

Like most promising innovations, prediction markets can’t realize their potential until they have been honed and evaluated in a set of increasingly substantial and challenging trials. Abstract ideas must be married to the right sort of complementary details that allow them to function in specific contexts. For prediction markets, real organizations with concrete forecasting needs related to their key decisions need to experiment with different ways to field prediction markets, in search of arrangements that minimize political disruption. (If you know of an organization willing to put up with the disruption that such experimentation creates, I know of a patron willing to consider funding such experiments.)

Alas, few such experiments have been happening. So let me tell you what has been happening instead.

For example, some public markets, such as PredictIt, continue to function, while others, such as InTrade, have gone away. While I wish such markets well, I’m not that optimistic about markets on broad public questions, sold to the public, relative to markets on specific organization questions, funded by those organizations. I’m skeptical that there is much public demand for betting on odd questions, or that such markets do much to promote the more promising organizational markets.

Academics and their patrons have shown continued interest, but mostly in the form of abstract efforts such as papers, theorems, and lab experiments. We continue to collect an overhang of abstractly promising mechanisms that haven’t been tried in real organizations. (Including my combinatorial prediction markets.)

A few books have reached wide popular audiences, but mainly by focusing on signs that readers can use to tell themselves that they are better forecasters than their rivals. For example, The Wisdom of Crowds lets ordinary people tell themselves that those damn self-appointed experts are over-rated, compared to us wise crowds. Superforecasting gives several other indicators that readers can collect to tell themselves they are better. While the authors of these books clearly favor creating more markets, they are aware that this isn’t the readers’ priority.

In business, the firms that a decade ago tried to sell straight prediction market software and services for key decisions have mostly either gone out of business or switched to safer related products. These related products tend to keep the appearance of markets but undercut their main incentive benefits, and they tend to stay away from topics on which key firm insiders might express opinions.

For example, “innovation markets” suggest new research or development projects for firms to pursue. But instead of betting on the consequences of starting such projects, which would take years to see, they just bet on which projects will be funded. And “prediction market research” replaces focus groups with an interface where users seem to bet on which products will be more popular. Except they are just taking a survey, not actually betting on real outcomes.

The last few years have seen great interest in “blockchain” technology and ventures. While these are often used for illegal purposes, authorities have not cracked down as much as they might, as blockchains are new and sexy and promise many non-illegal and useful applications. However, these other applications have been slow in coming. I expect that if blockchains do not soon deliver a majority of their activity as legal laudable applications, authorities will crack down. And while hardcore fans may do what it takes to continue to use them even in the face of such a crackdown, most users will cave and quit, resulting in far lower activity levels.

Many firms have recently issued “crypto coins” to support their blockchain efforts, and current market prices suggest that speculators see a substantial (even if low) chance of large future blockchain activity levels. Even if the underlying technology has promise, such market prices can still be in error in either direction, and it is my opinion that such prices are now too high.

Prediction markets are one of the most frequently mentioned blockchain applications, and I’ve advised several related ventures. As sports betting seems one of the most likely uses of such blockchain-based prediction markets, it isn’t clear that activity in this area will count as the legal laudable applications that blockchains need to survive. Even so, many are pursuing this possibility.

Overall, blockchain-based ventures tend to be heavy on algorithms and software, and light on all the other inputs needed to make a successful business venture. Many of these ventures seek to create general platforms, and hope that other more specific ventures will fill in the more specific business details.

The first firm to issue a blockchain prediction market coin was Augur, two years ago. They still haven’t delivered their software product, but they say they are close, and hopefully extra quality and reliability will result from their extra effort.

Gnosis issued their coin back in April, and they also say they are near ready to deliver their software. They also plan to do a set of experiments to test decision market variations. Both Augur and Gnosis are focused on creating general platforms, expecting others to fill in the specific betting topics and to do the marketing to attract specific customers.

Stox issued its prediction market coin last month. They say they plan to team up with existing trading websites: “Stox incentivizes other industry leaders with existing customer bases, like invest.com, to join the Stox network and drive traffic to the network.” But they have yet to announce specific teaming deals.

Enjin is issuing its coin now. Instead of supporting prediction markets, Enjin says its coin makes it easier to trade assets from games like Minecraft, even without the support of the makers of such games.

I wish all these ventures well, though I fear a blockchain price crash is coming soon. I also wish there were more focus on selling organizational rather than amateur prediction markets, and on particular business applications rather than general software platforms. But it isn’t yet too late for someone to start to focus there.

MRE Futures, To Not Starve

The Meal, Ready-to-Eat – commonly known as the MRE – is a self-contained, individual field ration in lightweight packaging bought by the United States military for its service members for use in combat or other field conditions where organized food facilities are not available. While MREs should be kept cool, they do not need to be refrigerated. .. MREs have also been distributed to civilians during natural disasters. .. Each meal provides about 1200 Calories. They .. have a minimum shelf life of three years. .. MREs must be able to withstand parachute drops from 380 metres, and non-parachute drops of 30 metres.

Someday, a global crisis, or perhaps a severe regional one, may block 10-100% of the normal food supply for up to several years. This last week I attended a workshop set up by ALLFED, a group exploring new food sources for such situations. It seems that few people need to starve, even if we lose 100% of food for five years! And feeding everyone could go a long way toward keeping such a crisis from escalating into a worse catastrophic or existential risk. But for this to work, the right people, with the means and will to act, need to be aware of the right options at the right time. And early preparation, before a crisis, may go a long way toward making this feasible. How can we make this happen?

In this post I will outline a plan I worked out at this workshop, a plan intended to simultaneously achieve several related goals:

  1. Support deals for food insurance expressed in terms that ordinary people might understand and trust.
  2. Create incentives for food producers, before and during a crisis, to find good local ways to make and deliver food.
  3. Create incentives for researchers to find new food sources, develop working processes, and demonstrate their feasibility.
  4. Share information about the likelihood and severity of food crises in particular times, places, and conditions.

My idea starts with a new kind of MRE, one inspired by but not the same as the familiar military MRE. This new MRE would also be ready to eat without cooking, and also have minimum requirements for calories (after digesting), nutrients, lack of toxins, shelf life, and robustness to shocks. But, and this is key, suppliers would be free to meet these requirements using a wide range of exotic food options, including bacteria, bugs, and rats. (Or more conventional food made in unusual ways, like sugar from corn stalks or cows eating tree leaves.) It is this wide flexibility that could actually make it feasible to feed most everyone in a crisis. MREs might be graded for taste quality, perhaps assigned to three different taste quality levels by credentialed food tasters.

As an individual, you might want access to a source of MREs in a crisis. So you, or your family, firm, club, city, or nation, may want to buy or arrange for insurance which guarantees access to MREs in a crisis. A plausible insurance deal might promise access to so many MREs of a certain quality level per time period, delivered at standard periodic times to a standard location “near” you. That is, rather than deliver MREs to your door on demand, you might have to show up at a certain more central location once a week or month to pick up your next batch of MREs.

The availability of these MREs might be triggered by a publicly observable event, like a statistical average of ordinary food prices over some area exceeding a threshold. Or, more flexibly, standard MRE insurance might always give one the right to buy, at a pre-declared high price and at standard places and times, a certain number of MREs per time period. Those who fear not having enough cash to pay this pre-declared MRE price in a crisis might separately arrange for straight financial insurance, which pays cash tied either to a publicly triggered event, or to a market MRE price. Or the two approaches could be combined, so that MREs are available at a standard price during certain public events.
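To make the combined variant concrete, here is a toy Python sketch, my own illustration rather than any existing contract design; all field names, the window length, and the threshold and prices are invented. It pairs a public-event trigger, based on a trailing average of an ordinary-food price index, with a pre-declared strike price at which the insured may buy their MREs:

```python
# Toy sketch of the combined insurance variant described above.
# All field names, thresholds, and prices are invented for illustration.

def crisis_triggered(price_index_history, threshold, window=12):
    """Public-event trigger: the trailing average of an ordinary-food
    price index over the last `window` periods exceeds a threshold."""
    recent = price_index_history[-window:]
    return sum(recent) / len(recent) > threshold

def mre_entitlement(contract, price_index_history):
    """Return (quantity, price): how many MREs the holder may claim
    this period, and the pre-declared price per MRE.

    contract is a dict with keys:
      'mres_per_period' - quantity guaranteed per pickup period
      'strike'          - pre-declared purchase price per MRE
      'threshold'       - price-index level that defines a crisis
    """
    if crisis_triggered(price_index_history, contract['threshold']):
        return contract['mres_per_period'], contract['strike']
    return 0, None  # no crisis: nothing owed under this contract
```

The appeal of such a design is that the holder knows in advance exactly which public event activates the deal and what price they will then pay, which makes the promise easy to state to ordinary people and easy to audit.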

The organizations that offer insurance need ways to assure customers that they can actually deliver on their promises to offer MREs at the stated times, places, and prices, given relevant public events. In addition, they want to minimize the prices they pay for these supplies of MREs, and encourage suppliers to search for low-cost ways to make MREs.

This is where futures markets could help. In a futures market for wheat, people promise to deliver, or to take delivery, of certain quantities of certain types of wheat at particular standard times and places. Those who want to ensure a future supply of wheat against risks of changing prices can buy these futures, and those who grow wheat can ensure a future revenue for their wheat by selling futures. Most traders in futures markets are just speculating, and so arrange to leave the market before they’d have to make or take delivery. But the threat of making or taking delivery disciplines the prices that they pay. Those who fail to make or take delivery as promised face large financial and other penalties.

Analogously, those who offer MRE insurance could use MRE futures markets to ensure an MRE supply, and convince clients that they have ensured a supply. Yes, compared to the terms of the insurance offered by insurance organizations, the futures markets may offer fewer standard times, places, quality levels, and triggering public events. (Though combinatorial market technology, tested in the lab but not yet in the field, makes far more combinations feasible.) Even so, customers might find it easy to believe that, if necessary, an organization that has bought futures for a few standard times and places could actually take delivery of these futures contracts, store the MREs for short periods, and deliver them to the more numerous times and places specified in their insurance deals.

MRE futures markets could also assure firms that explore innovative ways to make MREs of a demand for their product. By selling futures to deliver MREs at the standard times and places, they might fund their research, development, and production. When it came time to actually deliver MREs, they might make side deals with local insurance organizations to avoid any extra storage and transport costs of actually transferring MREs according to the futures contract details.

To encourage innovation, and to convince everyone that the system actually works, some patron, perhaps a foundation or government, could make a habit of periodically but randomly announcing large buy orders for MRE futures at certain times and places in the near future. They actually take delivery of the MREs, and then auction them off to whoever shows up there then to taste the MREs at a big social event. In this way ordinary people can sometimes hold and taste the MREs, and we can all see that there is a system capable of producing and delivering at least modest quantities on short notice. The firms who supply these MREs will of course have to set up real processes to actually deliver them, and be paid big premiums for their efforts.

These new MREs may not meet current regulatory requirements for food, and it may not be easy to adapt them to meet such requirements. Such requirements should be relaxed in a crisis, via a new crisis regulatory regime. It would be better to set that regime up ahead of time, instead of trying to negotiate it during a crisis. Such a new regulatory regime could be tested during these periodic random big MRE orders. Regulators could test the delivered MREs and only let people eat the ones that pass their tests. Firms that had passed tests at previous events might be pre-approved for delivering MREs to future events, at least if they didn’t change their product too much. And during a real crisis, such firms could be pre-approved to rapidly increase production and delivery of their product. This offers an added incentive for firms to participate in these tests.

MRE futures markets might also help the world to coordinate expectations about which kinds of food crises might appear when under what circumstances. Special conditional futures contracts could be created, where one only promises to deliver MREs given certain world events or policies. If the event doesn’t happen, you don’t have to deliver. The relative prices of future contracts for different events and policies would reveal speculator expectations about how the chance and severity of food crises depend on such events and policies.

And that’s my big idea. Yes it will cost real resources, and I of course hope we never have to use it in a real crisis. But it seems to me far preferable to most of us starving to death. Far preferable.

Compare Institutions To Institutions, Not To Perfection

Mike Thicke of Bard College has just published a paper that concludes:

The promise of prediction markets to solve problems in assessing scientific claims is largely illusory, while they could have significant unintended consequences for the organization of scientific research and the public perception of science. It would be unwise to pursue the adoption of prediction markets on a large scale, and even small-scale markets such as the Foresight Exchange should be regarded with scepticism.

He gives three reasons:

[1.] Prediction markets for science could be uninformative or deceptive because scientific predictions are often long-term, while prediction markets perform best for short-term questions. .. [2.] Prediction markets could produce misleading predictions due to their requirement for determinable predictions. Prediction markets require questions to be operationalized in ways that can subtly distort their meaning and produce misleading results. .. [3.] Prediction markets offering significant profit opportunities could damage existing scientific institutions and funding methods.

Imagine that you want to travel to a certain island. Someone else tells you to row a boat there, but I tell you that a helicopter seems more cost effective for your purposes. So the rowboat advocate replies, “But helicopters aren’t as fast as teleportation, they take longer and cost more to go longer distances, and you need more expert pilots to fly in worse weather.” All of which is true, but not very helpful.

Similarly, I argue that with each of his reasons, Thicke compares prediction markets to some ideal of perfection, instead of to the actual current institutions they are intended to supplement. Let’s go through them one by one. On 1:

Even with rational traders who correctly assess the relevant probabilities, binary prediction markets can be expected to have a bias towards 50% predictions that is proportional to their duration. .. it has been demonstrated both empirically and theoretically .. long-term prediction markets typically have very low trading volume, which makes it unlikely that their prices react correctly to new information. .. [Hanson] envisions Wegener offering contracts ‘to be judged by some official body of geologists in a century’, but this would not have been an effective criterion given the problem of 50%-bias in long-term prediction markets. .. Prediction markets therefore would have been of little use to Wegener.

First, a predictable known distortion isn’t a problem at all for forecasts; just invert the distortion to get the accurate forecast. Second, this is much less of an issue in combinatorial markets, where all questions are broken into thousands or more tiny questions, all of which have tiny probabilities, and a global constraint ensures they all add up to one. But more fundamentally, all institutions face the same problem that, all else equal, it is easier to give incentives for accurate short-term predictions, relative to long-term ones. This doesn’t show that prediction markets are worse in this case than status quo institutions. On 2:
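To illustrate the first point, suppose, purely hypothetically, that the long-term bias took a known linear form, shrinking true probabilities toward 50% by a fixed factor. Then anyone reading market prices could simply invert it:

```python
def biased_price(true_prob, shrink):
    """Hypothetical bias model: the observed market price equals the
    true probability shrunk toward 0.5 by a known factor in [0, 1)."""
    return (1 - shrink) * true_prob + shrink * 0.5

def debias(observed_price, shrink):
    """Invert the assumed shrinkage to recover the underlying forecast."""
    return (observed_price - shrink * 0.5) / (1 - shrink)
```

Real long-term biases need not be linear, or known this precisely; the point is only that a distortion of known form is no obstacle to extracting an accurate forecast from prices.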

Even if prediction markets correctly predict measured surface temperature, they might not predict actual surface temperature if the measured and actual surface temperatures diverge. .. Globally averaged surface air temperature [might be] a poor proxy for overall global temperature, and consequently prediction market prices based on surface air temperature could diverge from what they purport to predict: global warming. .. If interpreting the results of these markets requires detailed knowledge of the underlying subject, as is needed to distinguish global average surface air temperature from global average temperature, the division of cognitive labour promised by these markets will disappear. Perhaps worse, such predictions could be misinterpreted if people assume they accurately represent what they claim to.

All social institutions of science must deal with the facts that there can be complex connections between abstract theories and specific measurements, and that ignorant outsiders may misinterpret summaries. Yes prediction market summaries might mislead some, but then so can grant and article abstracts, or media commentary. No, prediction markets can’t make all such complexities go away. But this hardly means that prediction markets can’t support a division of labor. For example, in combinatorial prediction markets different people can specialize in the connections between different variables, together managing a large Bayesian network of predictions. On 3:

If scientists anticipate that trading on prediction markets could generate significant profits, either due to being subsidized .. or due to legal changes allowing significant amounts of money to be invested, they could shift their attention toward research that is amenable to prediction markets. The research most amenable to prediction markets is short-term and quantitative: the kind of research that is already encouraged by industry funding. Therefore, prediction markets could reinforce an already troubling push toward short-term, application-oriented science. Further, scientists hoping to profit from these markets could withhold salient data in anticipation of using that data to make better informed trades than their peers. .. If success in prediction markets is taken as a marker of scientific credibility, then scientists may pursue prediction-oriented research not to make direct profit, but to increase their reputation.

Again, all institutions work better on short term questions. The fact that prediction markets also work better on short term questions does not imply that using them creates more emphasis on short term topics, relative to using some other institution. Also, every institution of science must offer individuals incentives, incentives which distract them from other activities. Such incentives also imply incentives to withhold info until one can use that info to one’s maximal advantage within the system of incentives. Prediction markets shouldn’t be compared to some perfect world where everyone shares all info without qualification; such worlds don’t exist.

Thicke also mentioned:

Although Hanson suggests that prediction market judges may assign non-binary evaluations of predictions, this seems fraught with problems. .. It is difficult to see how such judgements could be made immune from charges of ideological bias or conflict of interest, as they would rely on the judgement of a single individual.

Market judges don’t have to be individuals; there could be panels of judges. And existing institutions are also often open to charges of bias and conflicts of interest.

Unfortunately many responses to reform proposals fit the above pattern: reject the reform because it isn’t as good as perfection, ignoring the fact that the status quo is nothing like perfection.

A Call To Adventure

I turn 58 soon, and I’m starting to realize that I may not live long enough to finish many of my great life projects. So I want to try to tempt younger folks to continue them. Hence this call to adventure.

One way to create meaning for your life is to join a grand project. Or start a new one. A project that is both obviously important, and that might also bring you personal glory, if you were to make a noticeable contribution to it.

Yes, most don’t seek meaning this way. But many of our favorite fictional characters do. If you are one of the few who find grand adventures irresistibly romantic, then this post is for you. I call you to adventure.

Two great adventures actually, in this post. Both seem important, and in the ballpark of doable, at least for the right sort of person.

ADVENTURE ONE: The first adventure is to remake collective decision-making via decision markets (a.k.a. futarchy). Much of the pain and loss in the world results from bad decisions by key organizations, such as firms, clubs, cities, and nations. Some of these bad decisions result because actors with the wrong mix of values hold too much power. But most result from our not aggregating info well; people who could have or did know better were not enticed enough to share what they know. Or others didn’t believe them.

We actually know of a family of simple robust mechanisms that typically do much better at aggregating info. And we have a rough idea of how organizations could use such mechanisms. We even have a large academic literature testing and elaborating these mechanisms, resulting in a big pile of designs, theorems, software, computer simulations, lab tests, and field tests. We don’t need more of these, at least for now.

What we need is concrete evolution within real organizations. Like most good abstract ideas, what this innovation most needs are efforts to work out variations that can fit well in particular existing organization contexts. That is, design and try out variations that can avoid the several practical obstacles that we know about, and help identify more such obstacles to work on.

This adventure needs fewer intellectuals, and more sharp folks willing to get their hands dirty dealing with the complexities of real organizations, and with enough pull to get real organizations near them to try new and disruptive methods.

Since these mechanisms have great potential in a wide range of organizations, we first need to create versions that are seen to work reliably over a substantial time in concrete contexts where substantial value is at stake. With such a concrete track record, we can then push to get related versions tried in related contexts. Eventually such diffusion could result in better collective decision making worldwide, for many kinds of organizations and decisions.

And you might have been one of the few brave far-sighted heroes who made it happen.

ADVENTURE TWO: The second adventure is to figure out real typical human motives in typical familiar situations. You might think we humans would have figured this out long ago. But as Kevin Simler and I argue in our new book The Elephant in the Brain: Hidden Motives in Everyday Life, we seem to be quite mistaken about our basic motives in many familiar situations.

Kevin and I don’t claim that our usual stated motives aren’t part of the answer, only that they are much less than we like to think. We also don’t claim to have locked down the correct answer in all these situations. We instead offer plausible enough alternatives to suggest that the many puzzles with our usual stories are due to more than random noise. There really are systematic hidden motives behind our behaviors, motives substantially different from the ones we claim.

A good strategy for uncovering real typical human motives is to triangulate the many puzzles in our stated motives across a wide range of areas of human behavior. In each area specialists tend to think that the usual stated motive deserves to be given a strong prior, and they rarely think we’ve acquired enough “extraordinary evidence” to support the “extraordinary claims” that our usual stated motives are wrong. And if you only ever look at evidence in a narrow area, it can be hard to escape this trap.

The solution is to expect substantial correlations between our motives in different areas. Look for hidden motive explanations of behaviors that can simultaneously account for puzzles in a wide range of areas, using only a few key assumptions. By insisting on a high ratio of apparently different puzzles explained to new supporting assumptions made, you can keep yourself disciplined enough not to be fooled by randomness.

This strategy is most effective when executed over a lifetime. The more different areas that you understand well enough to see the key puzzles and usual claims, the better you can triangulate their puzzles to find common explanations. And the more areas that you have learned so far, the easier it becomes to learn new areas; areas and methods used to study them tend to have many things in common.

This adventure needs more intellectual heroes. While these heroes may focus for a time on studying particular areas, over the long run their priority is to learn and triangulate many areas. They seek simple coherent accounts that explain diverse areas of human behavior. To figure out what the hell most humans are actually up to most of the time. Which we do not actually know now. And which would enable better policy; today policy reform efforts are often wasted due to mistaken assumptions about actual motives.

Wouldn’t someone who took a lifetime to help work that out be a hero of the highest order?

Come, adventures await. For the few, the brave, the determined, the insightful. Might that be you?

Why We Mix Fact & Value Talk

For a while now I’ve been tired of the US political drama, and I’ve been hoping that others would tire of it as well. Then maybe we could talk about something else, like say, my books. So I was thinking of writing a post reminding folks about futarchy, saying that politics doesn’t have to be this way. That is, we could largely (if not entirely) separate the political processes that deal with facts and values. In this case, even when there’s a big change in which values set policy, the fact estimates that set policy could remain the same, and be very expert.

In contrast, most of our current political processes mix up facts and values. The candidates we vote for, the bills they adopt, and the rulings that agencies make, all represent bundles of opinions on both facts and values. As a result, the fact estimates implicit in policy choices are less than fully expert, as such estimates must appeal to the citizens, politicians, administrators, etc. who we choose in part for their value positions. And so, to influence the values that our systems use, we must each talk about facts as well, even when we aren’t personally very expert on those facts.

On reflection, however, I think I had it wrong. Most of those engaged by the current US political drama are enjoying it, even if they say otherwise. They get a rare chance to feel especially self-righteous, and to bond more strongly with political allies. And I think the usual mixing of facts and values actually helps them achieve these ends. Let me explain.

For the purpose of making effective decisions, on average the best mix of fact vs. value in analysis has over 90% of the attention go to facts. Yes, you need to pay some attention to values, but most of the devil is in the details, and most of the relevant details are on facts. This is true at all levels, including personal, family, firm, church, city, state, and national levels.

However, for the purpose of feeling self-righteous and bonding with allies, value talk is much more potent than fact talk. You need to believe that your values are superior to feel self-righteous, and shared values bond you with allies much more strongly than do shared facts. Yet even for this purpose, the ideal conversation isn’t more than 90% focused on values; something closer to a 50-50 mix works better.

The problem is that when we frame a debate as a pure value disagreement, we actually find it harder to feel clearly superior, and to dismiss the other side. We aren’t really as confident in our value positions as we pretend. We can see how observers might perceive a symmetry between us and our opponents, and label us unfair if we just try to crush the other side to achieve our values at the expense of their values.

However, by mixing enough facts into a value discussion, we can explain to ourselves and others why crushing them is really best for everyone. We can say that they just don’t understand that global warming is a real thing, or that kids really need two parents to grow up healthy. It is the other side’s failure to accept key facts that can justify to outsiders our uncompromising determination to crush them for a total win. Later on they may see we were right, and even thank us. But even if that doesn’t happen, right now we can feel justified in dismissing them.

I expect this dynamic plays out not only in national politics, but also in firm, church, and family politics. And it helps explain our widespread reluctance to adopt prediction markets, and other neutral fact estimation methods such as experiments, in relatively political contexts. We regularly want to support decisions that advance the values we share with our political allies, but we prefer the cover of seeming to be focused on estimating facts. To successfully use facts as a cover for values, we need to have enough fact issues mixed into our debates. And we need to avoid out-of-control fact estimation mechanisms that lack enough adjustment knobs to let us get the answers we want.


Surprising Popularity

This week Nature published some empirical data on a surprising-popularity consensus mechanism (a previously published mechanism, e.g., Science in 2004, with variations going by the name “Bayesian Truth Serum”). The idea is to ask people to pick from several options, and also to have each person forecast the distribution of opinion among others. The options that are picked surprisingly often, compared to what participants on average expected, are suggested as more likely true, and those who pick such options as better informed.
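The tallying step of this mechanism is simple enough to sketch in a few lines of code. This is an illustrative sketch, not code from the paper; the vote and forecast numbers are made up, loosely following the classic example of asking whether Philadelphia is the capital of Pennsylvania (many say yes, but many also predict that most others will say yes, so "no" ends up surprisingly popular).

```python
# Sketch of the "surprisingly popular" tally. Each respondent picks one
# option and also forecasts what fraction of others will pick each option.
# Options whose actual share exceeds the average forecast share are
# "surprisingly popular" and suggested as more likely true.

def surprisingly_popular(votes, predicted_shares):
    """votes: list of chosen options, one per respondent.
    predicted_shares: list of dicts, each mapping option -> forecast share.
    Returns options sorted by (actual share - mean predicted share), descending."""
    options = sorted(set(votes))
    n = len(votes)
    actual = {o: votes.count(o) / n for o in options}
    mean_pred = {o: sum(p.get(o, 0.0) for p in predicted_shares) / len(predicted_shares)
                 for o in options}
    surprise = {o: actual[o] - mean_pred[o] for o in options}
    return sorted(options, key=lambda o: surprise[o], reverse=True)

# Made-up data: 60% answer "yes", but on average people forecast 76% "yes".
votes = ["yes"] * 6 + ["no"] * 4
preds = [{"yes": 0.8, "no": 0.2}] * 6 + [{"yes": 0.7, "no": 0.3}] * 4
print(surprisingly_popular(votes, preds)[0])  # "no" (0.40 actual vs 0.24 predicted)
```

Note that the mechanism needs no later ground truth at tally time; it compares only the answers and the forecasts of answers.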

Compared to prediction markets, this mechanism doesn’t require that those who run the mechanism actually know the truth later. Which is indeed a big advantage. This mechanism can thus be applied to most any topic, such as the morality of abortion, the existence of God, or the location of space aliens. Also, incentives can be tied to this method, as you can pay people based on how well they predict the distribution of opinion. The big problem with this method, however, is that it requires that learning the truth be the cheapest way to coordinate opinion. Let me explain.

When you pay people for better predicting the distribution of opinion, one way they can do this prediction task is to each look for and report their best estimate of the truth. If everyone does this, and if participant errors and mistakes are pretty random, then those who do this task better will in fact have a better estimate of the distribution of opinion.

For example, imagine you are asked which city is the capital of a particular state. Imagine you are part of a low-incentive one-time survey, and you don’t have an easy way to find and communicate with other survey participants. In this case, your best strategy may well be to think about which city is actually the capital.

Of course even in this case your incentive is to report the city that most sources would say is the capital. If you (and a few others) in fact know that according to the detailed legal history another city is rightfully the capital, not the city that the usual records give, your incentive is still to go with usual records.

More generally, you want to join the largest coalition who can effectively coordinate to give the same answers. If you can directly talk with each other, then you can agree on a common answer and report that. If not, you can try to use prearranged Schelling points to figure out your common answer from the context.

If this mechanism were repeated, say daily, then a safe way to coordinate would be to report the same answer as yesterday. But since everyone can easily do this too, it doesn’t give your coalition much of a relative advantage. You only win against those who make mistakes in implementing this obvious strategy. So you might instead coordinate to change your group’s answer each day based on some commonly observed changing signal.

To encourage this mechanism to better track truth, you’d want to make it harder for participants to coordinate their answers. You might ask random people at random times to answer quickly, put them in isolated rooms where they can’t talk to others, and ask your questions in varying and unusual styles that make it hard to guess how others will frame those questions. Prefer participants with more direct personal reasons to care about telling related truth, and prefer those who used different ways to learn about a topic. Perhaps ask different people for different overlapping parts and then put the final answer together yourself from those parts. I’m not sure how far you could get with these tricks, but they seem worth a try.

Of course these tricks are nothing like the way most of us actually consult experts. We are usually eager to ask standard questions to standard experts who coordinate heavily with each other. This is plausibly because we usually care much more to get the answers that others will also get, so that we don’t look foolish when we parrot those answers to others. That is, we care more about getting a coordinated standard answer than a truthful answer.

Thus I actually see a pretty bright future for this surprisingly-popular mechanism. I can see variations on it being used much more widely to generate standard safe answers that people can adopt with less fear of seeming strange or ignorant. But those who actually want to find true answers, even when such answers are contrarian, will need something closer to prediction markets.


Needed: Social Innovation Adaptation

This is the point during the electoral cycle when people are most willing to consider changing political systems. The nearly half of voters whose candidates just lost are now most open to changes that might have let their side win. But even in an election this acrimonious, that interest is paper thin, and blows away in the slightest breeze. Because politics isn’t about policy – what we really want is to feel part of a political tribe via talking with them about the same things. So if the rest of your tribe isn’t talking about system change, you don’t want to talk about that either.

So I want to tell or remind everyone that if you actually did care about outcomes instead of feeling part of a big tribe, large social gains wait untapped in better social institutions. In particular, very large gains await detailed field trials of institutional innovations. Let me explain.

Long ago when I was a physicist turned computer researcher who started to study economics, I noticed that it seemed far easier to design new better social institutions than to design new better computer algorithms or physical devices. This helped inspire me to switch to economics.

Once I was in a graduate program with a thesis advisor who specialized in institution/mechanism design, I seemed to see a well established path for social innovations, from vague intuitions to theoretical analysis to lab experiments to simplified field experiments to complex practice. Of course as with most innovation paths, as costs rose along the path most candidates fell by the wayside. And yes, designing social institutions was harder than it looked at first, though it still seems easier than for computers and physical devices.

But it took me a long time to learn that this path is seriously broken near the end. Organizations with real problems do in fact sometimes allow simplified field trials of institutional alternatives that social scientists have proposed, but only in a very limited range of areas. And usually they mainly just do this to affiliate with prestigious academics; most aren’t actually much interested in adopting better institutions. (Firms mostly outsource social innovation to management consultants, who don’t actually endorse much. Yes startups explore some innovations, but relatively few.)

So by now academics have accumulated a large pile of promising institution ideas, many of which have supporting theory, lab experiments, and even simplified field trials. In addition, academics have even larger literatures that measure and theorize about existing social institutions. But even after promising results from simplified field experiments, much work usually remains to adapt such new proposals to the many complex details of existing social worlds. Complex worlds can’t usefully digest abstract academic ideas without such adaptation.

And the bottom line is that we very much lack organizations willing to do that work for social innovations. Organizations do this work more often for computer or device innovations, and sometimes social innovations get smuggled in via that route. A few organizations sometimes work on social innovations directly, but mostly to affiliate with prestigious academics, so if you aren’t such an academic you mostly can’t participate.

This is the point where I’ve found myself stuck with prediction & decision markets. There has been prestige and funding to prove theorems, do lab experiments, analyze field datasets, and even do limited simplified field trials. But there is little prestige or funding for that last key step of adapting academic ideas to complex social worlds. It’s hard to apply rigorous general methods in such efforts, and so it’s hard to publish on them academically. (Even interested blockchain folks have mainly been writing general code, not working with messy organizations.)

So if you want to make clubs, firms, cities, nations, and the world more effective and efficient, a highly effective strategy is to invest in widening the neglected bottleneck of the social innovation pathway. Get your organization to work on some ideas, or pay other organizations to work on them. Yes some ideas can only be tried out at large scales, but for most there are smaller scale analogues that it makes sense to work on first. I stand ready to help organizations do this for prediction & decision markets. But alas to most organizations I lack sufficient prestige for such associations.


Big Impact Isn’t Big Data

A common heuristic for estimating the quality of something is: what has it done for me lately? For example, you could estimate the quality of a restaurant via a sum or average of how much you’ve enjoyed your meals there. Or you might weight recent visits more, since quality may change over time. Such methods are simple and robust, but they aren’t usually the best. For example, if you know of others who ate at that restaurant, their meal enjoyment is also data, data that can improve your quality estimate. Yes, those other people might have different meal priorities, and that may be a reason to give their meals less weight than your meals. But still, their data is useful.

Consider an extreme case where one meal, say your wedding reception meal, is far more important to you than the others. If you weigh your meal experiences in proportion to meal importance, your whole evaluation may depend mainly on one meal. Yes, if meals of that important type differ substantially from other meals then using this method best avoids biases from using unimportant types of meals to judge important types. But the noise in your estimate will be huge; individual restaurant meals can vary greatly for many random reasons even when the underlying quality stays the same. You just won’t know much about meal quality.
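A tiny simulation makes the noise point concrete. The numbers here are assumptions chosen only for illustration: a fixed true quality, independent per-meal noise, and a comparison of the "average all meals" estimator against the "weight one important meal" estimator.

```python
import random

# Toy illustration (assumed numbers): each meal's enjoyment is true quality
# plus independent noise. Weighting one "important" meal heavily mostly
# inherits that single meal's noise; averaging many meals averages it away.

random.seed(0)
TRUE_QUALITY, NOISE_SD, N_MEALS, TRIALS = 7.0, 2.0, 20, 2000

def meal():
    return random.gauss(TRUE_QUALITY, NOISE_SD)

def sq_err(estimates):
    """Mean squared error of a list of quality estimates."""
    return sum((e - TRUE_QUALITY) ** 2 for e in estimates) / len(estimates)

avg_all = [sum(meal() for _ in range(N_MEALS)) / N_MEALS for _ in range(TRIALS)]
one_meal = [meal() for _ in range(TRIALS)]  # "judge by the wedding meal only"

print(sq_err(one_meal) > sq_err(avg_all))  # True: single-meal estimate is far noisier
```

With these assumed numbers the single-meal estimator's squared error is roughly twenty times larger, matching the usual 1/n shrinkage of averaging independent noise.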

I mention all this because many seem eager to give the recent presidential election (and the recent Brexit vote) a huge weight in their estimates of the quality of various prediction sources. Sources that did poorly on those two events are judged to be poor sources overall. And yes, if these were by far the most important events to you, this strategy avoids the risk that familiar prediction sources have a different accuracy on events like this than they do on other events. Even so, this strategy mostly just puts you at the mercy of noise. If you use a small enough set of events to judge accuracy, you just aren’t going to be able to see much of a difference between sources; you will have little reason to think that those sources that did better on these few events will do much better on other future events.

Me, I don’t see much reason to think that familiar prediction sources have an accuracy that is very different on the most important events, relative to other events, and so I mainly trust comparisons that use a lot of data. For example, on large datasets prediction markets have shown a robustly high accuracy compared to other sources. Yes, you might find other particular sources that seem to do better in particular areas, but you have to worry about selection effects – how many similar sources did you look at to find those few winners? And if prediction market participants became convinced that these particular sources had high accuracy, they’d drive market prices to reflect those predictions.


Regulating Self-Driving Cars

Warning: I’m sure there’s a literature on this, which I haven’t read. This post is instead based on a conversation with some folks who have read more of it. So I’m “shooting from the hip” here, as they say.

Like planes, boats, submarines, and other vehicles, self-driving cars can be used in several modes. The automation can be turned off. It can be turned on and advisory only. It can be driving, but with the human watching carefully and ready to take over at any time. Or it can be driving with the human not watching very carefully, so that the human would take a substantial delay before being able to take over. Or the human might not be capable of taking over at all; perhaps a remote driver would stand ready to take over via teleoperation.

While we might mostly trust vehicle owners or passengers to decide when to use which modes, existing practice suggests we won’t entirely trust them. Today, after a traffic accident, we let some parties sue others for damages. This improves driver incentives to drive well. But we don’t trust this to fully correct incentives. So in addition, we regulate traffic. We don’t just suggest that you stop at a red light, keep in one lane, or stay below a speed limit. We require these things, and penalize detected violations. Similarly, we’ll probably want to regulate the choice of self-driving mode.

Consider a standard three-color traffic light. When the light is red, you are not allowed to go. When it is green you are allowed, but not required, to go; sometimes it is not safe to go even when a light is green. When the light is yellow, you are supposed to pay extra attention to a red light coming soon. We could similarly use a three color system as the basis of a three-mode system of regulating self-driving cars.

Imagine that inside each car is a very visible light, which regulators can set to be green, yellow or red. When your light is red you must drive your car yourself, even if you get advice from automation. When the light is yellow you can let the automation take over if you want, but you must watch carefully, ready to take over. When the light is green, you can usually ignore driving, such as by reading or sleeping, though you may watch or drive if you want.

(We might want a standard way to alert drivers when their color changed away from green. Of course we could imagine adding more colors, to distinguish more levels of attention and control. But a three level system seems a reasonable place to start.)

Under this system, the key regulatory choice is the choice of color. This choice could in principle be set differently for each car at each moment. But early on the color would probably be set the same for all cars and drivers of a type, in a particular geographic area at a particular time. The color might come in part from a broadcast signal, with the light perhaps defaulting to red if it can’t get a signal.

One can imagine a very bureaucratic system to set the color, with regulators sitting in a big room filled with monitors, like NASA mission control. That would probably be too conservative and fail to take local circumstances enough into account. Or one might imagine empowering fancy statistical or machine learning algorithms to make the choice. But most any algorithm would make a lot of mistakes, and the choice of algorithm might be politicized, leading to a poor choice.

Let me suggest using prediction markets for this choice. Regulators would have to choose a large set of situation buckets, such that the color must be the same for all situations in the same bucket. Then for each bucket we’d have three markets, estimating the accident rate conditional on a particular color. Assuming that drivers gain some direct benefit from paying less attention to driving, we’d set the color to green unless the expected difference between the green and yellow accident rate became high enough. Similarly for the choice between red and yellow.
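The color-setting rule for one bucket can be sketched as follows. The rate units, threshold values, and function names are all my assumptions for illustration; the market prices would supply the conditional accident-rate estimates.

```python
# Sketch of the per-bucket color rule (hypothetical thresholds and prices).
# Markets estimate the accident rate conditional on each color; we allow the
# less-attentive mode unless its expected safety cost is too high.

def choose_color(rates, green_yellow_threshold, yellow_red_threshold):
    """rates: market-estimated accident rates (say, per million vehicle-miles)
    keyed by 'green', 'yellow', 'red'. Each threshold is the maximum
    acceptable accident-rate increase from allowing less driver attention."""
    if rates["green"] - rates["yellow"] <= green_yellow_threshold:
        return "green"  # letting drivers ignore driving costs little safety
    if rates["yellow"] - rates["red"] <= yellow_red_threshold:
        return "yellow"  # supervised automation is nearly as safe as manual
    return "red"  # automation in this bucket is too risky; human must drive

# Hypothetical bucket where automation does well, so green is allowed:
print(choose_color({"green": 1.1, "yellow": 1.0, "red": 1.3}, 0.2, 0.2))  # green
```

The thresholds encode the assumed direct benefit drivers get from paying less attention; regulators (or further markets) would have to pick them.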

Work on combinatorial prediction markets suggests that it is feasible to have billions or more such buckets at a time. We might use audit lotteries and only actually estimate accident rates for some small fraction of these buckets, using bets conditional on such auditing. But even with a much smaller number of buckets, our experience with prediction markets suggests that such a system would work better than either a bureaucratic or statistical system with a similar number of buckets.

Added 1p: My assumptions were influenced by the book Our Robots, Ourselves on the history of automation.


Merkle’s Futarchy

My futarchy paper, Shall We Vote on Values But Bet on Beliefs?, made public in 2000 but officially “published” in 2013, has gotten more attention lately as some folks talk about using it to govern blockchain organizations. In particular, Ralph Merkle (co-inventor of public key cryptography) has a recent paper on using futarchy within “Decentralized Autonomous Organizations.”

I tried to design my proposal carefully to avoid many potential problems. But Merkle seems to have thrown many of my cautions to the wind. So let me explain my concerns with his variations.

First, I had conservatively left existing institutions intact for Vote on Values; we’d elect representatives to oversee the definition and measurement of a value metric. Merkle instead has each citizen each year report a number in [0,1] saying how well their life has gone that year:

Annually, all citizens are asked to rank the year just passed between 0 and 1 (inclusive). .. it is intended to provide information about one person’s state of satisfaction with the year that has just passed. .. Summed over all citizens and divided by the number of citizens, this gives us an annual numerical metric between 0 and 1 inclusive. .. An appropriately weighted sum of annual collective welfares, also extending indefinitely into the future, would then give us a “democratic collective welfare” metric. .. adopting a discount rate seems like at least a plausible heuristic. .. To treat their death: .. ask the person who died .. ask before they die. .. [this] eliminates the need to evaluate issues and candidates. The individual citizen is called upon only to determine whether the year has been good or bad for themselves. .. We’ve solved .. the need to wade through deceptive misinformation.
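The quoted metric is simple to write down. A minimal sketch follows; the specific discount rate is my assumption, since Merkle only says that adopting one "seems like at least a plausible heuristic."

```python
# Sketch of Merkle's proposed welfare metric, as quoted above.

def annual_welfare(ratings):
    """ratings: each citizen's [0, 1] score for the year just passed.
    Summed over all citizens and divided by the number of citizens."""
    return sum(ratings) / len(ratings)

def collective_welfare(yearly_ratings, discount=0.95):
    """Discounted sum of annual collective welfares, current year first.
    The discount value 0.95 is an illustrative assumption."""
    return sum((discount ** t) * annual_welfare(ratings)
               for t, ratings in enumerate(yearly_ratings))

print(annual_welfare([1.0, 0.0, 1.0, 1.0]))  # 0.75
```

Note that nothing in the metric itself constrains how citizens report, which is exactly where the strategic problem below arises.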

Yes, it could be easy to decide how your last year has gone, even if it is harder to put that on a scale from worst to best possible. But reporting that number is not your best move here! Your optimal strategy here is almost surely “bang-bang”, i.e., reporting either 0 or 1. And you’ll probably want to usually give the same consistent answer year after year. So this is basically a vote, except on “was this last year a good or a bad year?”, which in practice becomes a vote on “has my life been good or bad over the last decades.” Each voter must pick a threshold where they switch their vote from good to bad, a big binary choice that seems ripe for strong emotional distortions. That might work, but it is pretty far from what voters have done before, so a lot of voter learning is needed.

I’m much more comfortable with futarchy that uses value metrics tied to the reason an organization exists. Such as using the market price of investment to manage an investment, attendance to manage a conference, or people helped (& how much) to manage a charity.

If there are too many bills on the table at any one time for speculators to consider, many bad ones can slip through and have effects before bills to reverse them can be proposed and adopted. So I suggested starting with a high bar for bills, but allowing new bills to lower the bar. Merkle instead starts with a very low bar that could be raised, and I worry about all the crazy bills that might pass before the bar rises:

Initially, anyone can propose a bill. It can be submitted at any time. .. At any time, anyone can propose a new method of adopting a bill. It is evaluated and put into effect using the existing methods. .. Suppose we decided that it would improve the stability of the system if all bills had a mandatory minimum consideration period of three months before they could be adopted. Then we would pass a bill modifying the DAO to include this provision.

I worried that the basic betting process could bias the basic rules, so I set basic voting and process rules off limits from bet changes, and set an independent judiciary to judge if rules are followed. Merkle instead allows this basic bet process to change all the rules, and all the judges, which seems to me to risk self-supporting rule changes:

How the survey is conducted, and what instructions are provided, and the surrounding publicity and environment, will all have a great impact on the answer. .. The integrity of the annual polls would be protected only if, as a consequence, it threatened the lives or the well-being of the citizens. .. The simplest approach would be to appoint, as President, that person the prediction market said had the highest positive impact on the collective welfare if appointed as President. .. Similar methods could be adopted to appoint the members of the Supreme Court.

Finally, I said explicitly that when the value formula changes then all the previous definitions must continue to be calculated to pay off past bets. It isn’t clear to me that Merkle adopts this, or if he allows the bet process to change value definitions, which also seems to me to risk self-supporting changes:

We leave the policy with respect to new members, and to births, to our prediction market. .. difficult to see how we could justify refusing to adopt a policy that accepts some person, or a new born child, as a member, if the prediction market says the collective welfare of existing members will be improved by adopting such a policy. .. Of greater concern are changes to the Democratic Collective Welfare metric. Yet even here, if the conclusion reached by the prediction market is that some modification of the metric will better maximize the original metric, then it is difficult to make a case that such a change should be banned.

I’m happy to see the new interest in futarchy, but I’m also worried that sloppy design may cause failures that are blamed on the overall concept instead of on implementation details. As recently happened to the DAO concept.
