Alexander Berger from GiveWell interviewed me on prediction markets, and has posted his notes here. Alex and I seem to disagree about the importance of this topic:
Organizational obstacles The main barrier to wider-scale adoption of prediction markets is that most organizations are reluctant to use them. It is unclear why this is the case. Those currently in power within firms may resist prediction markets because the markets would spread previously privileged information across the company and change perceptions of what is knowable and who knows
I tried to emphasize this topic, but Alex devotes only 60 out of 1800 words to it.
In a column, Andrew Gelman and Eric Loken note that academia has a problem:
Unfortunately, statistics—and the scientific process more generally—often seems to be used more as a way of laundering uncertainty, processing data until researchers and consumers of research can feel safe acting as if various scientific hypotheses are unquestionably true.
They consider prediction markets as a solution, but largely reject them for reasons both bad and not so bad. I’ll respond here to their article in unusual detail. First the bad:
Would prediction markets (or something like them) help? It’s hard to imagine them working out in practice. Indeed, the housing crisis was magnified by rampant speculation in derivatives that led to a multiplier effect.
Yes, speculative market estimates were mistaken there, as were most other sources, and mistaken estimates caused bad decisions. But speculative markets were the first credible source to correct the mistake, and no other stable source had consistently more accurate estimates. Why should the most accurate source be blamed for mistakes made by all sources?
Allowing people to bet on the failure of other people’s experiments just invites corruption, and the last thing social psychologists want to worry about is a point-shaving scandal.
What about letting researchers who compete for grants, jobs, and publications write critical referee reports and publish criticism, doesn’t that invite corruption too? If you are going to forbid all conflicts of interest because they invite corruption, you won’t have much left you will allow. Surely you need to argue that bet incentives are more corrupting than other incentives.
It looks bad for a manager to have one of his projects fail. So to “cover his ass”, such a manager often tries to prevent any records showing that people saw failure coming. After a failure, he wants to say “this was just random bad luck; no one could have foreseen it.” His bosses up the chain of command tend to allow this, because they also want to avoid being held responsible for failures during their watch. So they also prefer the random bad luck story.
Unfortunately, this approach tends to prevent organizations from getting signals that would let them mitigate failures, such as by quitting projects earlier. For example, most startup firms don’t fail until they have spent nearly all of the cash they were given. It is rare for a startup to admit it isn’t going to work out, and give some cash back to investors. Similarly, government agencies created to achieve some purpose rarely recommend to legislatures that they be eliminated when they find that they aren’t achieving their intended purposes.
Of course bosses don’t want to be too obvious about silencing possible signals of failure. They find it hard to silence what have become standard signals, like cost accounting measures.
A great application of prediction markets is to give better and clearer warnings of upcoming failure, to enable better mitigation, such as quitting. Of course project bosses anticipate this, and oppose prediction markets on their projects, for exactly this reason. But we can still hope that prediction market warnings may someday become a standard signal, and thus hard to silence:
I hope prediction markets within firms may someday gain a status like cost accounting today. In a world where no one else did cost accounting, proposing that your firm do it would basically suggest that someone was stealing there. Which would look bad. But in a world where everyone else does cost accounting, suggesting that your firm not do it would suggest that you want to steal from it. Which also looks bad.
Similarly, in a world where few other firms use prediction markets, suggesting that your firm use them on your project suggests that your project has an unusual problem in getting people to tell the truth about it via the usual channels. Which looks bad. But in a world where most firms use prediction markets on most projects, suggesting that your project not use prediction markets would suggest you want to hide something. (more)
Long ago our primate ancestors learned to be “political.” That is, instead of just acting independently, we learned to join into coalitions for mutual advantage, and to switch coalitions for private advantage. Our human ancestors added social norms, i.e., rules enforced by feelings of outrage in broad coalitions. Foragers used norms and coalitions to manage bands of roughly thirty members, and farmers applied similar behaviors to village communities of roughly a thousand.
In ancient politics, people learned to attract allies, to judge who else was reliable as an ally, to gossip about who was allied with who, and to help allies and hurt rivals. In particular we learned to say good things about allies and bad things about rivals, such as accusing rivals of violating key social norms, and praising allies for upholding them.
Today many people consider themselves to be very “political”, and they treat this aspect of themselves as central to their identity. They spend lots of time talking about related views, associating with those who share them, and criticizing those who disagree. They often feel especially proud of how boldly and freely they do these things, relative to their ancestors and those in “backward” cultures.
Trouble is, such folks are mostly “political” about national or international politics. Their interest fades as the norms and coalitions at stake focus on smaller scales, such as regions, cities, or neighborhoods. The politics of firms, clubs, and families hardly engage them at all. Of course such people are members of local coalitions, and do sometimes voice support for enforcing related norms. So they are political there to some extent. But they are much less bold, self-righteous, and uncompromising about local politics, and don’t consider related views to be central to their identity. Such folks are eager to associate with those who sacrifice to improve world politics, but are only mildly interested in associating with those who sacrifice to improve local politics.
This focus on politics at the largest scale is both relatively safe, and relatively useless. On the one hand, your efforts to take sides and support norm enforcement at very local levels are far more likely to benefit you personally via better local outcomes. On the other hand, such efforts are far more likely to bother opposing coalitions, leaving you vulnerable to retaliation. Given these risks, and the greater praise given to those who push politics at the largest scales, it is understandable if people tend to focus on safe-scale politics, unlikely to cause them personal troubles.
Near-far theory predicts that we’d tend to focus our ideals and moral outrage and praise more on the largest social scales. But a net result of this tendency is that we seem far less effective today than were our ancestors at enforcing very-local-level social norms, and at discouraging related harms from local coalitions. We chafe at the idea of letting our nation be dominated by a king, but we easily and quietly submit to local kings in firms, clubs, and families.
Our political instincts and efforts are largely wasted, because we just are much less able to coordinate to identify and right wrongs on the largest scales. Now to some extent this is healthy. There was a lot of destructive waste when most political efforts were directed at very local politics. But many wrongs were also detected and righted. The human political instinct does serve some positive functions. After all, human bands were much larger than other primate bands, suggesting that human politics was less destructive than other primate politics.
I’ve suggested that organizations use decision markets to help advise key decisions. And to illustrate the idea, I’ve discussed the example of how it could apply to national politics. I’ve done this because people seem far more interested in reforming national politics, relative to reforming local small organizations. But honestly, I see much bigger gains overall from smaller scale applications. And small scale application is where the idea needs to start, to work out the kinks. And such trials are feasible now. If only I could get some small orgs to try. Sigh.
I posted back in ’07 on a hero of local politics:
A colleague of my wife was a nurse at a local hospital, and was assigned to see if doctors were washing their hands enough. She identified and reported the worst offender, whose patients were suffering as a result. That doctor had her fired; he still works there not washing his hands. (more)
I’d admire you much more if you acted like this, relative to your marching on Washington, soliciting door-to-door for a presidential candidate, or posting ever so many political rants on Facebook. Shouldn’t you admire such folks far more as well?
Most of us live in worlds of conversation, like books or blogs or chats, where we tend to give many others the benefit of the doubt that they are mostly talking “in good faith.” We don’t just talk to show off or to support allies and knock rivals – we hold ourselves to higher standards. But let me explain why that may often be wishful thinking.
I’ve previously suggested that coalition politics infuses a lot of human behavior. That is, we tend to use all available means to try to help “us” and hurt “them”, even if on average these games hurt us all. Coalition politics is a dirt that regularly accumulates in most any corner that is not vigorously and regularly cleaned.
This view predicts that coalition politics also infuses a lot of how writings (and speeches, etc.) are evaluated. That is, when we evaluate the writings of others, we attend to how such evaluations may help our coalitions and hurt rival coalitions. Especially for writings on subjects that have little direct relevance for how we live our lives. Like most topics in most blogs, magazines, journals, books, speeches, etc.
However, while we may find such cynicism plausible as a theory of rivals, we are reluctant to consciously embrace it as a theory of ourselves. We instead want to say that we mostly evaluate the writings of others using different criteria. And when we are part of a group that evaluates writings similarly, we want to say this is because our group shares key evaluation criteria beyond “us good, them bad.”
Now some groups can offer concrete evidence for their claims to be relatively clean of coalition politics. These are groups who declare specific “objective” standards to judge writing. That is, they use standards that are relatively easy for outsiders to check. For example, outsiders can relatively easily check groups who evaluate writings based on word count, or on correctness of spelling and grammar. Yes, a commitment to such standards may favor some groups over others, such as good spellers over bad spellers. But it can’t be adjusted very easily to shifting coalitions. Which makes it a poor tool for supporting coalition politics.
Some groups say they judge writings based on their popularity in some audience. And yes, it can be pretty easy to evaluate the popularity of writings. However, it could easily be the audience that is using coalition politics to decide what is popular. Thus using popularity to evaluate writings doesn’t at all ensure that coalition politics doesn’t dominate evaluations.
Some groups claim to evaluate written “maps” based on how well they match intended “territories”. And when it is easy for many clearly-neutral outsiders to visit a territory, it can be easy for outsiders to check that territory-matching is actually how this group evaluates maps. But the harder it is for outsiders to see territories, or to read their supposedly matching maps, and the more easily that outside critics can be credibly accused of political bias, the more easily a group could pretend to evaluate maps based on territory matches, but actually evaluate them via coalition politics. For example, anthropologists watching the private lives of the very rich might write descriptions of those lives that pander to academic presumptions about the very rich, since few academics ever see those lives directly, and the few who do can be accused of bias by association.
Some groups use objective criteria for evaluations, but don’t give those criteria enough weight to stop coalition politics from dominating evaluations. For example, economic theory journals can claim that they only publish articles containing proofs without obvious errors. And the ability of readers to seek errors may ensure that this criterion is usually satisfied. But such journals may still reject most submissions that meet this criterion, allowing coalition politics to dominate which articles are accepted. Winning coalitions may be constrained to include only members capable of constructing proofs without obvious errors, but this need not be very constraining to them.
Another approach is to only use objective evaluation criteria, but to use many such criteria and to be unclear about their relative weights. The more such criteria, the greater the chance of finding criteria to reach whatever evaluation one wants. For example, in many legal areas there is wide agreement on the relevant factors, and on which directions each factor points to in a final decision. Nevertheless, given enough relevant factors, courts may usually have enough discretion to favor either side.
For any one group and their declared criteria of evaluation, it can be hard for outsiders to judge just how much leeway that group has left for coalition politics to influence evaluations. We tend to give the benefit of the doubt to our own groups, but not to rivalrous groups. For example, pro-science anti-religion folks may presume that peer review in scientific journals is mainly used to enforce good evidence norms, but that religious leaders mainly use their discretion in interpreting scriptures to favor their allies.
If they were honest, each group would either declare objective evaluation criteria that leave little room for coalition politics, or accept that outsiders can reasonably presume that coalition politics probably dominates their evaluations. And everyone should expect that even if their group now seems an exception where other criteria dominate, it will probably not remain so for long. Because these are in fact reasonable assumptions in a world where coalition politics is a dirt that regularly and rapidly accumulates in any corner not vigorously and regularly cleaned.
Hey there reader, I really am talking about you and the worlds of writing where you live. Do you presume that your worlds are mostly dominated by politics, where different coalitions vie to support allies and knock rivals? Or do you see the groups you hang with as holding themselves to higher standards? If higher standards, are they standards that outsiders can easily check on? Or do you in practice mostly have to trust a small group of insiders to judge if standards are met? And if you have to trust insiders, how sure can you be their choices aren’t mostly driven by coalition politics?
Years ago I struggled with this issue, and wondered what evaluation criteria a group could adopt to robustly induce their writings to roughly track truth on a wide range of topics, and resist the corrupting pressures of coalition politics to say what key audiences want or expect to hear. I was delighted to find that for a wide range of topics open prediction markets offer such robust criteria. Each trade can be an “edit” of the highly-evaluated “writing” that is the current market odds on each topic. Such edits are rewarded or punished via cash for moving the consensus toward or away from the truth.
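To make the “edit” idea concrete, here is a minimal sketch, assuming payoffs follow a logarithmic scoring rule, which is one common way to reward probability updates. The text above does not specify a payoff rule, so the function name `edit_payoff`, the `scale` parameter, and the example numbers are all illustrative assumptions.

```python
import math


def edit_payoff(p_before, p_after, outcome_true, scale=100.0):
    """Cash payoff for one trade ("edit") under a logarithmic scoring rule.

    p_before / p_after: market probability that the claim is true,
    just before and just after the trade. Both are hypothetical inputs.
    outcome_true: how the claim is eventually resolved.
    scale: assumed cash multiplier (not taken from the text).
    """
    if not (0.0 < p_before < 1.0 and 0.0 < p_after < 1.0):
        raise ValueError("probabilities must be strictly between 0 and 1")
    if outcome_true:
        # Paid for probability mass moved onto the outcome that happened.
        return scale * math.log(p_after / p_before)
    return scale * math.log((1.0 - p_after) / (1.0 - p_before))


# The claim turns out true: an edit toward the truth earns cash,
# an edit away from the truth loses cash.
toward = edit_payoff(0.50, 0.70, outcome_true=True)
away = edit_payoff(0.50, 0.30, outcome_true=True)
print(f"edit toward truth: {toward:+.2f}")
print(f"edit away from truth: {away:+.2f}")
```

Under this assumed rule, a sequence of edits pays in total exactly what one direct edit from the first price to the last would pay, so each trader is rewarded only for the change they personally made to the consensus.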
I had hoped that many groups would be anxious to avoid the appearance that coalition politics may dirty their evaluations, and thus be eager to adopt new standards that can avoid such an appearance. So I hoped that many groups would want to adopt prediction markets, once they were clearly shown to be feasible and practical. Alas, that seems to not be so.
Today’s winning coalitions seem to prefer to let coalition politics continue to determine who wins in each group. This seems like how police departments would like to appear free from corruption, but not enough to actually make their internal affairs departments report to someone other than the chief of police. We are fond of tarring rival groups with the accusation that coalition politics dominates their evaluations, and we are fond of pretending that we are different. But not enough to visibly block that politics.
I hereby offer Robin Hanson (only) 2-to-1 odds on the following statement:
“There will, by 1 January 2010, exist a robotic system capable of the cleaning an ordinary house (by which I mean the same job my current cleaning service does, namely vacuum, dust, and scrub the bathroom fixtures). This system will not employ any direct copy of any individual human brain. Furthermore, the copying of a living human brain, neuron for neuron, synapse for synapse, into any synthetic computing medium, successfully operating afterwards and meeting objective criteria for the continuity of personality, consciousness, and memory, will not have been done by that date.”
Since I am not a bookie, this is a private offer for Robin only, and is only good for $100 to his $50. –JoSH
At the time I replied that my estimate for the chance of this was in the range 1/5 to 4/5, so we didn’t disagree. But looking back I think I was mistaken – I could and should have known better, and accepted this bet.
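The arithmetic behind “we didn’t disagree” can be sketched as follows, assuming the bet means JoSH stakes $100 that the statement is true against Robin’s $50 that it is false. That reading of the sides is an assumption, and the helper names `breakeven_probability` and `expected_value` are hypothetical.

```python
from fractions import Fraction


def breakeven_probability(my_stake, their_stake):
    """Chance of winning at which a bet has zero expected value.

    With win chance q, EV = q * their_stake - (1 - q) * my_stake,
    which is zero at q = my_stake / (my_stake + their_stake).
    """
    return Fraction(my_stake, my_stake + their_stake)


def expected_value(q_win, my_stake, their_stake):
    """Expected dollar gain from the bet, given win chance q_win."""
    return q_win * their_stake - (1 - q_win) * my_stake


# Robin's side (assumed): risk $50 to win $100 if the statement is false.
robin_breakeven = breakeven_probability(50, 100)
print("Robin needs P(statement false) above", robin_breakeven)

# An estimate of 1/5 to 4/5 for the statement being true leaves the same
# 1/5 to 4/5 range for it being false, which straddles the breakeven.
ev_if_false_likely = expected_value(Fraction(4, 5), 50, 100)
ev_if_false_unlikely = expected_value(Fraction(1, 5), 50, 100)
print("EV at P(false) = 4/5:", ev_if_false_likely)
print("EV at P(false) = 1/5:", ev_if_false_unlikely)
```

Because the expected value changes sign inside the stated range, that range alone does not say whether to take the offered odds.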
I’ve posted on how AI researchers with twenty years of experience tend to see slow progress over that time, which suggests continued future slow progress. Back in ’91 I’d had only seven years of AI experience, and should have thought to ask more senior researchers for their opinions. But like most younger folks, I was more interested in hanging out and chatting with other young folks. While this might sometimes be a good strategy for finding friends, mates, and same-level career allies, it can be a poor strategy for learning the truth. Today I mostly hear rapid AI progress forecasts from young folks who haven’t bothered to ask older folks, or who don’t think those old folks know much that is relevant.
I’d guess we are still at least two decades away from a situation where over half of US households use robots to do over half of the house cleaning (weighted by time saved) that people do today.
A year ago I announced that our IARPA-funded DAGGRE prediction market on world events had finally implemented my combinatorial prediction market tech (which I was prevented from showcasing nine years earlier), with a new-improved tech for efficient exact computation in near-tree-shaped networks.
Now we announce: DAGGRE is dead, and SciCast is born. Still funded by IARPA, SciCast focuses on predicting science and technology, has a cleaner interface developed by Inkling, and has been reimplemented from scratch to support ten times as many users and questions. We also now have Bruce D’Ambrosio’s firm Tuuyi on board to develop and implement even more sophisticated algorithms.
But wait, there’s more. We’ve got formal partnerships with AAAS and IEEE, have a thousand folks pre-registered to participate, and we hope to attract thousands of expert users, folks who really know their sci/tech. We’ve seeded SciCast with over a hundred questions, many contributed by top experts, and hope to soon have thousands of questions, mostly submitted by users.
Alas, we aren’t allowed to pay our participants money or prizes. But if you have sci/tech issues you want forecasted, if you want to prove your insight into the future of sci/tech, or if you want to influence the perceived consensus on sci/tech, join us at SciCast.org!
It’s perhaps no great surprise that we haven’t embraced Hanson’s “futarchy.” Our current political system resists dramatic change, and has resisted it for 237 years. More traditional modes of prediction have proved astonishingly bad, yet they continue to run our economic and political worlds, often straight into the ground. Bubbles do occur, and we can all point to examples of markets getting blindsided. But if prediction markets are on balance more accurate and unbiased, they should still be an attractive policy tool, rather than a discarded idea tainted with the odor of unseemliness. As Hanson asks, “Who wouldn’t want a more accurate source?”
Maybe most people. What motivates us to vote, opine, and prognosticate is often not the desire for efficacy or accuracy in worldly affairs—the things that prediction markets deliver—but instead the desire to send signals to each other about who we are. Humans remain intensely tribal. We choose groups to associate with, and we try hard to show everybody which groups we belong to. We don’t join the Tea Party because we have exhaustively studied and rejected monetarism, and we don’t pay extra for organic food because we have made a careful cost-benefit analysis based on research about its relative safety. We do these things because doing so says something that we want to convey to others. Nor does the accuracy of our favorite talking heads matter that much to us. More than we like accuracy, we like listening to talkers on our side, and identifying them as being on our team—the right team.
“We continue to have consistent results and evidence that markets are accurate,” Hanson says. “If the question is, ‘Do these things predict well?,’ we have an answer: They do. But that story has to be put up against the idea that people never really wanted more accurate sources.”
On this theory, the techno-libertarian enthusiasts got the technology right, and the humanity wrong. Whenever John Delaney showed up on CNBC, hawking his Intrade numbers and describing them as the most accurate and impartial around, he was also selling a future that people fundamentally weren’t interested in buying. (more)
I don’t much disagree — I raised these issues with Wood when he interviewed me. As usual, our hopes for idealistic outcomes mostly depend on finding ways to shame people into actually supporting what they pretend to support, by making the difference too obvious to ignore.
More specifically, I hope prediction markets within firms may someday gain a status like cost accounting today. In a world where no one else did cost accounting, proposing that your firm do it would basically suggest that someone was stealing there. Which would look bad. But in a world where everyone else does cost accounting, suggesting that your firm not do it would suggest that you want to steal from it. Which also looks bad.
Similarly, in a world where few other firms use prediction markets, suggesting that your firm use them on your project suggests that your project has an unusual problem in getting people to tell the truth about it via the usual channels. Which looks bad. But in a world where most firms use prediction markets on most projects, suggesting that your project not use prediction markets would suggest you want to hide something. That is, you don’t want a market to predict if your project will make its deadline because you don’t want others to see that it won’t make the deadline. Which would look bad.
Once prediction markets were a standard accepted practice within firms, it would be much easier to convince people to use them in government as well.