Tag Archives: Prediction Markets

More Academic Prestige Futures

Academia functions to (A) create and confer prestige to associated researchers, students, firms, cities, and nations, (B) preserve and teach what we know on many general abstract topics, and (C) add to what we know over the long run. (Here “know” includes topics where we are uncertain, and practices we can’t express declaratively.)

Most of us see (C) as academia’s most important social function, and many of us see lots of room for improvement there. Alas, while we have identified many plausible ways to improve this (C) function, academia has known about these for decades, and has done little. The problem seems less a lack of knowledge, and more a lack of incentives.

You might think the key is to convince the patrons who fund academia to change their funding methods, and to make funding contingent on adopting other fixes. After all, this should induce more of the (C) that we presume that patrons seek. Problem is, just like all the other parties involved, patron motives also focus more on function (A) than on (C). That is, state, firm, and philanthropic patrons of academia mainly seek to buy what academia’s other customers, e.g., students and media, also buy: (A) prestige by association with credentialed impressiveness.

Thus offering better ways to fund (C) doesn’t help much. In fact, history actually moved in the other direction. From 1600 to 1800, science was mainly funded via prizes and infrastructure support. But then prestigious scientific societies pushed to replace prizes with grants. Grants give scientists more discretion, but are worse for (C). Scientists won, however; now grants are standard, and prizes rare.

But I still see a possible route to reform here, based on the fact that academics usually deny that their prestige is arbitrary, to be respected only because others respect it. Academics instead usually justify their prestige in function (A) as a proxy for the ends of functions (B) and (C). That is, academics tend to say that your best way to promote the preservation, teaching, and increase of our abstract knowledge is to just support academics according to their current academic prestige.

Today, academic prestige of individuals is largely estimated informally by gossip, based on the perceived prestiges of particular topics, institutions, journals, funding sources, conferences, etc. And such gossip estimates the prestige of each of these other things similarly, based on the prestige of their associations. This whole process takes an enormous amount of time and energy, but even so it attends far more to getting everyone to agree on prestige estimates, than to whether those estimates are really deserved.

Academics typically say that so sacred an end as intellectual progress is so hard to predict or control that it is arrogant of people like you to think you can see how to promote such things in any other way than to just give your money to the academics designated as prestigious by this process, and let them decide what to do with it. And most of us have in fact accepted this story, as this is in fact what we mostly do.

Thus one way that we could hope to challenge the current academic equilibrium is to create better clearly-visible estimates of who or what contributes how much to these sacred ends. If academics came to accept another metric as offering more accurate estimates than what they now get from existing prestige processes, then that should pressure them into adjusting their prestige ratings to better match these new estimates. Which should then result in their assigning publications, jobs, grants etc. in ways that better promote such ends. Which should thus improve intellectual progress, perhaps by large amounts.

And as I outlined in my last post, we could actually create such new better estimates of who deserves academic prestige, via creating complex impact futures markets. Pay distant future historians (e.g., in a century or two) to judge then which of our academic projects (e.g., papers) actually better achieved adding to what we know. (Or also achieved preserving and teaching what we know.) Also create betting markets today that estimate those future judgments, and suggest to today’s academics and their customers that these are our best estimates of who and what deserve academic prestige. (Citations being lognormal suggests this system’s key assumptions are a decent approximation.)

These market prices would no doubt correlate greatly with the usual academic prestige ratings, but any substantial persistent deviations would raise a question: if, in assigning jobs, publications, grants, etc., you academics think you know better than these market prices who is most likely to deserve academic prestige, why aren’t you or your many devoted fans trading in those markets to make the profits you think you see? If such folks were in fact trading heavily, but were resisted by outsiders with contrary strong opinions, that would look better than if they weren’t even bothering to trade on their supposed superior insight.

Academics seeking higher market estimates about themselves and their projects would be tempted to trade to push up those prices, even though their private info didn’t justify such a move. Other traders would expect this, and push prices back down. These forces would create liquidity in these markets, and subsidize trading overall.

Via this approach, we might reform academia to better achieve intellectual progress. So who wants to make this happen?


Complex Impact Futures

Imagine a world of people doing various specific projects, where over the long run the net effect of all these projects is to produce some desired outcomes. These projects may interact in complex ways. To encourage people to do more and better such projects along the way, we might like a way to eventually allocate credit to these various projects for their contributions to desired outcomes.

And we might like to have good predictions of such credit estimates, available either right after project completion, so we can praise project supporters, or available before projects start, to advise on which projects to start. Such a mechanism could be applied to projects within a firm or other org re achieving that org’s goals, or to charity projects re doing various kinds of general good, or to academic projects re promoting intellectual progress. In this post, I outline a way to do all this.

First, let us assume that we have available to us “historians” who could in groups judge after the fact which of two actual projects had contributed the most to desired outcomes. (And assume a way to pay such historians to make them sufficiently honest and careful in such judgments.) These judgments might be made with noise, well after the fact, and at great expense, but are still possible. (Remember, the longer one waits to judge, the more budget one can spend on judging.)

Consider two projects that have relative strengths A and B in terms of the credit each deserves for desired outcomes. Assume further that the chance that a random group of historians will pick A over B is just A/(A+B). This linear rule is a standard assumption made for many kinds of sporting contests (e.g. chess), with contestant strengths being usually distributed log-normally. (E.g., chess “Elo rating” is proportional to a log of such a player strength estimate.)
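As a small illustration of the linear rule above, the sketch below computes the chance A/(A+B) and checks that it matches the usual Elo expected-score formula when strength is taken as 10^(rating/400). The function names and the base-10, 400-point scaling are the conventional Elo choices, assumed here for illustration; nothing in this proposal depends on that particular scaling.

```python
def win_chance(a: float, b: float) -> float:
    """Chance that a random group of historians picks strength a over strength b."""
    return a / (a + b)


def elo_to_strength(rating: float) -> float:
    # Assumed conventional Elo scaling: strength grows tenfold per 400 points.
    return 10 ** (rating / 400)


def elo_expected_score(rating_a: float, rating_b: float) -> float:
    # Standard Elo expected score for player A against player B.
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


# A project three times as strong as its rival is picked 3 times out of 4.
chance = win_chance(3.0, 1.0)

# The linear rule on strengths agrees with Elo on the log-strength ratings.
linear = win_chance(elo_to_strength(1600), elo_to_strength(1400))
elo = elo_expected_score(1600, 1400)
```

So a rating is just a log of strength, and the A/(A+B) rule is the same model written in strength units.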

Given these assumptions, project strength estimates can be obtained via a “tournament parimutuel” (a name I just made up). Let there be a pool of money associated with each project, where each trader who contributes to a pool gets the payoffs from that pool in proportion to their contributions.

If each project were randomly matched to another project, and random historian groups were assigned to judge each pair, then it would work to let the winning pool divide up the money from both pools, just as if there had been a simple parimutuel on that pair. Traders would then tend to set the relative amounts in each pool in proportion to the relative strengths of associated projects.
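To see why traders would push pool sizes toward relative strengths, here is a minimal sketch of the expected return on a dollar placed in one pool of a judged pair. It assumes the A/(A+B) judging rule from above, no fees, and a stake too small to move the pool sizes; the function names are made up for illustration.

```python
def payout_per_dollar(own_pool: float, other_pool: float) -> float:
    """Payout per dollar staked in own_pool if its project wins the pair."""
    return (own_pool + other_pool) / own_pool


def expected_return_per_dollar(strength: float, other_strength: float,
                               own_pool: float, other_pool: float) -> float:
    """Expected payout per dollar, given strengths and current pool sizes."""
    chance_win = strength / (strength + other_strength)
    return chance_win * payout_per_dollar(own_pool, other_pool)


# Pools in proportion to strengths (3:1): neither side offers a profit.
fair = expected_return_per_dollar(3.0, 1.0, own_pool=30.0, other_pool=10.0)

# Equal pools despite 3:1 strengths: the stronger project's pool is a bargain,
# and the weaker project's pool is a losing bet.
bargain = expected_return_per_dollar(3.0, 1.0, own_pool=20.0, other_pool=20.0)
loser = expected_return_per_dollar(1.0, 3.0, own_pool=20.0, other_pool=20.0)
```

Money flows toward the bargain pool until the expected return per dollar is the same on both sides, which happens exactly when pool sizes are in the ratio of strengths.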

If judging were very expensive, however, then we might not be able to afford to have historians judge every project. But in that case it could work to randomize across projects. Pick sets of projects to judge, throw away the rest, and boost the amount in each retained pool by moving money from thrown-away (now boost-zero) pools into retained pools in proportion to pool size.

All you have to do is make sure that, averaged over the ways to randomly throw away projects, each project has a unit average boost. For example, you could partition the projects, and pick each partition set with a chance proportional to its pool size. With this done right, those who invest in pools should expect the same average payout as if all projects were judged, though such payouts would now have more variance.
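The partition example can be sketched as follows. This is one way to read it, under my own assumptions: the chosen set is picked with chance equal to its share of all pool money, and all money from thrown-away pools is moved into retained pools in proportion to pool size, so every retained pool is scaled by the same factor. The project names and dollar amounts are invented.

```python
import random


def average_boost(pools: dict, partition: list) -> dict:
    """Average boost factor per project, over the random choice of which set to judge."""
    total = sum(pools.values())
    result = {}
    for group in partition:
        group_money = sum(pools[p] for p in group)
        chance_picked = group_money / total   # chance proportional to pool size
        boost_if_picked = total / group_money  # all money lands in retained pools
        for p in group:
            result[p] = chance_picked * boost_if_picked  # boost is zero otherwise
    return result


def pick_and_boost(pools: dict, partition: list, rng: random.Random) -> dict:
    """Pick one partition set to judge and return its boosted pools."""
    total = sum(pools.values())
    weights = [sum(pools[p] for p in group) for group in partition]
    index = rng.choices(range(len(partition)), weights=weights, k=1)[0]
    boost = total / weights[index]
    return {p: pools[p] * boost for p in partition[index]}


pools = {"a": 50.0, "b": 30.0, "c": 15.0, "d": 5.0}
partition = [["a", "d"], ["b", "c"]]
boosts = average_boost(pools, partition)
retained = pick_and_boost(pools, partition, random.Random(0))
```

Every project ends up with an average boost of one, and the retained pools always hold all the money that was bet.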

Within a set of projects chosen for judging, any ways to pair projects to judge should work. It would make sense to pair projects with similar strength estimates, to max the info that judging gives, but beyond that we could let judges pick, and at the last minute, pairs they think easier to judge, such as projects that are close to each other in topic spaces, or similar in methods and participants. Or pairs that they would find interesting and informative to judge.

Historians might even pick random projects to judge, and then look nearby to select comparison projects, as long as they ensured a symmetric choice habit, or corrected for asymmetries. (It can also work to allow judges to sometimes say they can’t judge, or to rank more than two projects at the same time.) It would be good if the might-be-paired network of connections between projects were fully connected across all projects.

Parimutuel pools can make sense when all pool contributions are made at roughly the same time, so that contributors have similar info. But when bets will be made over longer time durations, betting markets make more sense. Thus we’d like to have a “complex impact futures” market over the various projects for most of our long duration, and then convert such bets into parimutuel tournament holdings just before judging.

We can do that by letting anyone split cash $1 into N betting assets of the form “Pays $x_p into p pool” for each of N projects p, where x_p refers to the market price of this asset at the time when betting assets are converted to claims in a tournament parimutuel. At that time, each outstanding asset of the form “Pays $x_p into p pool” is converted into $x_p put into the parimutuel pool for project p.

This method ensures that project pool amounts have the ratios x_p. Note that 1 = Sum_{p=1}^{N} x_p, that a logarithmic market scoring rule would work fine for trading in these markets, and that via a “rest of field” asset we don’t need to know about all projects p when the market starts.
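Here is a minimal sketch of that conversion step. It assumes prices come from a logarithmic market scoring rule with a liquidity parameter b (the value 100 is arbitrary), and that every dollar split created exactly one asset per project, so each project has the same number of assets outstanding. The project names and quantities are invented for illustration.

```python
import math


def lmsr_prices(net_sold: dict, b: float = 100.0) -> dict:
    """Logarithmic market scoring rule prices; they are positive and sum to 1."""
    top = max(net_sold.values())  # subtract the max for numerical stability
    weights = {p: math.exp((q - top) / b) for p, q in net_sold.items()}
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()}


def convert_to_pools(prices: dict, dollars_split: float) -> dict:
    """Each outstanding asset for project p puts $x_p into the pool for p."""
    return {p: dollars_split * x_p for p, x_p in prices.items()}


prices = lmsr_prices({"paper_1": 40.0, "paper_2": 0.0, "rest_of_field": -10.0})
pools = convert_to_pools(prices, dollars_split=1000.0)
```

Since the prices sum to one, the pools together hold exactly the cash that was split, and pool ratios equal price ratios.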

Thus traders in our complex impact futures markets should treat prices of these assets as estimates of the relative strength of projects p in the credit judging process. They’ll want to buy projects whose relative strength seems underestimated, and sell those that seem overestimated. And so these prices right after a project is completed should give speculators’ consensus estimate on that project’s relative credit for desired outcomes. And the prices on future possible projects, conditional on the project starting, give consensus estimates of the future credit of potential projects. As promised.

Some issues remain to consider. For example, how could we allow judging of pairs, and the choice of which pairs to judge, to be spread out across time, while allowing betting markets on choices that remain open to continue as long as possible into that process? Should judgements of credit just look at a project’s actual impact on desired outcomes, or should they also consider counterfactual impact, to correct for unforeseeable randomness, or others’ misbehavior? Should historians judge impact relative to resources used or available, or just judge impact without considering costs or opportunities? Might it work better to randomly pick a particular outcome of interest, and then only judge pairs on their impact re that outcome?


Brand Truth Narrowly

McDonalds is a famous food brand. Not everyone likes what they sell, but many do, and under this brand they can reliably find a kind of food they like at a predictable price which is below their value.

Imagine you were hungry and came across a Capitalist Food joint. You’ve never heard of them before, but their pitch is that you should like them because they are capitalist, and all the best food comes from capitalists. E.g., McDonalds. Which is in fact true.

But this should not persuade you much to buy from them. Yes, if you could choose only between a generic capitalist food place and generic non-profit food place, you’d probably do better with the capitalist one. But the reason the best food comes from capitalists is that they usually develop much narrower food brands. Like McDonalds.

Imagine that you knew how to make a better yet cheaper burger. Most people who like burgers and who tried your burger a half dozen times would conclude that yours are better. But by itself knowing how to make your better burger would not let you profit from selling them. Because you’d also need to create (or merge with) an acceptable brand to go with them.

For example, if your burger was branded with disliked and low status associations, people might avoid it even if your burgers were better. Such as being associated with the Russian side of their war with Ukraine when your customers live in Europe, or with Trump when your customers live in Seattle. Or being associated with insects, such as if your burger meat was made out of them.

Now consider prediction markets. We have good reasons to think that speculative markets are a great way to generate parameter estimates and decision advice. And many good people are now trying to sell this as a truth brand, that is, as a generic way to find truth. They set up a website where such markets can exist, put in a few sample claims, invite folks to suggest more claims, and step back. Somewhat like Capitalist Food as a brand.

But the thing that I’ve long been struggling to explain to these good folks is: that is too wide a brand to work well. Few people want truth in general. Yes, decision theory says that people want truth near their decisions, and want it more the bigger their decision. But there are many kinds of truths that they positively do not want, and many more truths where their generally positive value for truth is below its cost of production.

In fact, most of the claims on most of these prediction market sites are actually of this sort: general world events, politics, and celebrity gossip topics. Topics where people care a bit about truth, all else equal, but aren’t much willing to pay to improve on the level of truth that results from the usual news, gossip, and punditry on such topics. A few people are willing to pay to gamble on these topics just for fun, and that can support a few small businesses that serve them. But that leaves the huge social potential of prediction markets unrealized.

A related failure happens when other good people see lamentably low levels of truth in public conversations, and decide that their fix is to just think honestly and carefully and tell the truth as they see it. The problem is that their audiences cannot reliably distinguish sources that are actually more accurate due to being truly honest and careful, from the many other sources that look just like those, yet merely like to tell themselves that they are being honest and careful, but are actually motivated and sloppy.

That is, these good people have failed to create a brand to distinguish their superior truth product. Most individually honest and careful people don’t live long enough or have consistent enough reliability to enable most audiences to distinguish them via personal topic-specific brands. So we mainly distinguish them via larger existing truth brands, e.g., via academic or news media brands. But to gain such brand approval, they must make the many usual compromises re honesty and truth that such brands demand.

A solution here I think is: application-specific prediction market brands. For example, a brand that specializes in estimating the chance of making project deadlines, sold to orgs that actually want to know if they will make their deadlines. Or a brand that specializes in estimating the two-year-later employee evaluation that each new hire candidate would have if hired, sold to orgs that actually want to evaluate new hires.

Such brands would invest in early trials, first to learn the many details of how to do these applications well, and then second to collect a track record proving such knowledge. And they would also do what it takes to acquire and maintain whatever prestige associations their customers demand, and to avoid disliked associations that put off customers. Which yes could be a lot more work than just putting up a betting website with a few sample questions on current events.

But this is the work that needs to happen to create narrow-enough truth brands to be useful. Don’t try to sell Capitalist Food, but instead create your version of McDonalds in the truth space. Find the particular kinds of truths whose value of use is plausibly more than its cost of production, learn how to increase value and lower costs in that particular area, and then prove your learning to potential customers via a statistically-validated track record.


Decision Market Math

Let me share a bit of math I recently figured out regarding decision markets. And let me illustrate it with Fire-The-CEO markets.

Consider two ways that we can split $1 cash into two pieces. One way is: $1 = “$1 if A” + “$1 if not A”, where A is 1 or 0 depending on if a firm CEO stays in power til the end of the current quarter. Once we know the value of A, exactly one of these two assets can be exchanged for $1; the other is worthless. The chance a of the CEO staying is revealed by trades exchanging one unit of “$1 if A” for a units of $1.

The other way to split is $1 = “$x” + “$(1-x)”, where x is a real number in [0,1], representing the stock price of that firm at quarter end, except rescaled and clipped so that x is always in [0,1]. Once we know the value of x, then one unit of “$x” can be exchanged for x units of $1, while one unit of “$(1-x)” can be exchanged for 1-x units of $1. The expected value x of the stock is revealed by trades exchanging one unit of “$x” for x units of $1.

We can combine this pair of two-way splits into a single four-way split:
$1 = “$x if A” + “$x if not A” + “$(1-x) if A” + “$(1-x) if not A”.
A simple combinatorial trading implementation would keep track of the quantities each user has of these four assets, and allow them to trade some of these assets for others, as long as none of these quantities became negative. The min of these four quantities is the cash amount that a user can walk away with at any time. And at quarter’s end, the rest turn into some amount of cash, which the user can then walk away with.
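A simple version of such an implementation might look like the sketch below. The class, method, and asset key names are my own inventions for illustration; the rules it encodes are just the ones stated above: four non-negative quantities, cash out equal to the min, and settlement at quarter’s end once A and x are known.

```python
from dataclasses import dataclass, field

# Keys for the four assets: "$x if A", "$x if not A", "$(1-x) if A", "$(1-x) if not A".
ASSETS = ("x_if_A", "x_if_notA", "1mx_if_A", "1mx_if_notA")


@dataclass
class Account:
    holdings: dict = field(default_factory=lambda: {k: 0.0 for k in ASSETS})

    def deposit(self, cash: float) -> None:
        # $1 splits into one unit of each of the four assets.
        for k in ASSETS:
            self.holdings[k] += cash

    def withdrawable_cash(self) -> float:
        # One unit of all four assets recombines into $1.
        return min(self.holdings.values())

    def trade(self, deltas: dict) -> None:
        new = {k: self.holdings[k] + deltas.get(k, 0.0) for k in ASSETS}
        if min(new.values()) < 0:
            raise ValueError("trade would make a quantity negative")
        self.holdings = new

    def settle(self, a_happened: bool, x: float) -> float:
        # Only the assets conditional on the realized A outcome pay off.
        h = self.holdings
        if a_happened:
            return x * h["x_if_A"] + (1 - x) * h["1mx_if_A"]
        return x * h["x_if_notA"] + (1 - x) * h["1mx_if_notA"]


acct = Account()
acct.deposit(10.0)
# Shift 3 units out of each "not A" asset in exchange for 2 units of each "A" asset.
acct.trade({"x_if_A": 2.0, "1mx_if_A": 2.0, "x_if_notA": -3.0, "1mx_if_notA": -3.0})
```

After this trade the user holds 12, 7, 12, 7 of the four assets, so 7 is the cash they could walk away with right now.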

To advise the firm board on whether to fire the CEO, we are interested in the value that the CEO adds to the firm value. We can define this added value as x1-x2, where
x1 = E[x|A] is revealed by trades exchanging 1 unit of “$x if A” for x1 units of “$1 if A”
x2 = E[x|not A] is revealed by trades exchanging 1 unit of “$x if not A” for x2 units of “$1 if not A”.

In principle users could trade any bundle of these four assets for any other bundle. But three kinds of trades have the special feature of supporting maximal use of user assets in the following sense: when users make trades of only that type, two of their four asset quantities will reach zero at the same time. Reaching zero sets the limit of how far a user can trade in that direction.

To see this, let us define:
d1 = change in quantity of “$x if A”,
d2 = change in quantity of “$x if not A”,
d3 = change in quantity of “$(1-x) if A”,
d4 = change in quantity of “$(1-x) if not A”.

Two of these special kinds of trades correspond to the simple A and x trades that we described above. One kind exchanges 1 unit of “$1 if A” for a units of $1, so that d1=d3, d2=d4, -a*d1=(1-a)*d2. The other kind exchanges 1 unit of “$x” for x units of $1, so that d1=d2, d3=d4, -x*d1=(1-x)*d3.

The third special trade bundles the diagonals of our 2×2 array of assets, so that d1=d4, d2=d3, -q*d1=(1-q)*d2. But what does q mean? That’s the math I worked out: q = (1-a) + (2a-1)*x + 2a(1-a)*r*x, where r = (x1-x2)/x, and x = a*x1 + (1-a)*x2. So when we have market prices a,x from the other two special markets, we can describe trade ratios q in this diagonal market in terms of the more intuitive parameter r, i.e., the percent value the CEO adds to this firm.
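As a check on that formula, the sketch below reads q as the price, in units of $1, of the diagonal bundle “$x if A” + “$(1-x) if not A”, which is my interpretation of the trade ratio above. It computes that price directly from a, x1, x2 and compares it to the expression in terms of a, x, r. The numbers are invented.

```python
def diagonal_price(a: float, x1: float, x2: float) -> float:
    """Price of "$x if A" + "$(1-x) if not A": a*E[x|A] + (1-a)*(1 - E[x|not A])."""
    return a * x1 + (1 - a) * (1 - x2)


def q_from_r(a: float, x: float, r: float) -> float:
    """The same price in terms of a, the unconditional x, and the added value r."""
    return (1 - a) + (2 * a - 1) * x + 2 * a * (1 - a) * r * x


def diagonal_trade(units: float, q: float) -> tuple:
    """Deltas (d1, d2, d3, d4) from buying `units` of the diagonal bundle at price q."""
    d1 = d4 = units * (1 - q)  # gain one of each bundle asset, pay q of all four
    d2 = d3 = -units * q
    return d1, d2, d3, d4


a, x1, x2 = 0.7, 0.6, 0.4
x = a * x1 + (1 - a) * x2   # unconditional expected stock value
r = (x1 - x2) / x           # fraction of firm value the CEO adds
q_direct = diagonal_price(a, x1, x2)
q_formula = q_from_r(a, x, r)
d1, d2, d3, d4 = diagonal_trade(5.0, q_direct)
```

Both routes give q = 0.6 here, and the resulting trade satisfies d1=d4, d2=d3, and -q*d1=(1-q)*d2.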

When you subsidize markets with many possible dimensions of trade, you don’t have to subsidize all the dimensions equally. So in this case you could subsidize the q=r type trades much more than you do the a or x type trades. This would let you take a limited subsidy budget and direct it as much as possible toward the main dimension of interest: this CEO’s added value.


New-Hire Prediction Markets

In my last post, I suggested that the most promising place to test and develop prediction markets is this: get ordinary firms to pay for mechanisms that induce their associates to advise their key decisions. I argued that what we need most is a regime of flexible trial and error, searching in the space of topics, participants, incentives, etc. for approaches that can add value here while avoiding the political disruptions that have plagued previous trials.

If you had a firm willing to participate in such a process, you’d want to be opportunistic about the topics of your initial trials. You’d ask them what are their most important decisions, and then seek topics that could inform some of those decisions cheaply, quickly, and repeatedly, to allow rapid learning from experimentation. But what if you don’t have such a firm on the hook, and instead seek a development plan to attract many firms?

In this case, instead of planning to curate a set of topics specific to your available firm, you might want to find and focus on a general class of topics likely to be especially valuable and feasible in roughly the same way at a wide range of firms. When focused on such a class, trials at any one firm should be more informative about the potential for trials at other firms.

One plausible candidate is: deadlines. A great many firms have projects with deadlines, and are uncertain on if they will meet those deadlines. They should want to know not only the chance of making the deadline, but how that chance might change if they changed the project’s resources, requirements, or management. If one drills down to smaller sub-projects, whose deadlines tend to be sooner, this can allow for many trials within short time periods. Alas, this topic is also especially disruptive, as markets here tend to block project managers’ favorite excuses for deadline failure.

Here’s my best-guess topic area: new hires. Most small firms, and small parts of big firms, hire a few new people every year, where they pay special attention to comparing each candidate to a small pool of “final round” candidates. And these choices are very important; they add up to a big fraction of total firm decision value. Furthermore, most firms also have a standard practice of periodically issuing employee evaluations that are comparable across employees. Thus one could create prediction markets estimating the N-year-later (N=2?) employee evaluation of each final candidate, conditional on their being hired, as advice about whom to hire.

Yes, having to wait two years to settle bets is a big disadvantage, slowing the rate at which trial and error can improve practice. Yes, at many firms employee evaluations are a joke, unable to bear any substantial load of criticism or attention. Yes, you might worry about work colleagues trying to sabotage the careers of new hires that they bet against. And yes, new hire candidates would have to agree to have their application evaluated by everyone in the potential pool of market participants, at least if they reach the final round.

Even so, the value here seems so large as to make it well worth trying to overcome these obstacles. Few firms can be that happy with their new hire choices, reasonably fearing they are missing out on better options. And once you had a system working for final round hire choices, it could plausibly be extended to earlier hiring decision rounds.

Yes, this is related to my proposal to use prediction markets to fire CEOs. But that’s about firing, and this is about hiring. And while each CEO choice is very valuable, there is far more total value encompassed in all the lower personnel choices.


Prediction Markets Need Trial & Error

We economists have a pretty strong consensus on a few key points: 1) innovation is the main cause of long-term economic growth, 2) social institutions are a key changeable determinant of social outcomes, and 3) inducing the collection and aggregation of info is one of the key functions of social institutions. In addition, better institutional-methods for collecting and aggregating info (ICAI) could help with the key meta-problems of making all other important choices, including the choice of our other institutions, especially institutions to promote innovation. Together all these points suggest that one of the best ways that we today could help the future is to innovate better ICAI.

After decades pondering the topic, I’ve concluded that prediction markets (and closely related techs) are our most promising candidate for a better ICAI; they are relatively simple and robust with a huge range of potential high-value applications. But, alas, they still need more tests and development before wider audiences can be convinced to adopt them.

The usual (good) advice to innovators is to develop a new tech first in the application areas where it can attract the highest total customer revenue, and also where customer value can pay for the highest unit costs. As the main direct value of ICAI is to advise decisions, we should thus seek the body of customers most willing to pay money for better decisions, and then focus, when possible, on their highest-value versions.

Compared to charities, governments, and individuals, for-profit firms are more used to paying money for things that they value, including decision advice. And the decisions of such firms encompass a large fraction, perhaps most, of the decision value in our society. This suggests that we should seek to develop and test prediction markets first in the context of typical decisions of ordinary business, slanted when possible toward their highest value decisions.

The customer who would plausibly pay the most here is the decision maker seeking related info, not those who want to lobby for particular decisions, nor those who want to brag about how accurate is their info. And they will usually prefer ways to elicit advice from their associates, instead of from distant curated panels of advisors.

We have so far seen dozens of efforts to use prediction markets to advise decisions inside ordinary firms. Typically, users are satisfied and feel included, costs are modest, and market estimates are similarly or substantially more accurate than other available estimates. Even so, experiments typically end within a few years, often due to political disruption. For example, market estimates can undermine manager excuses (e.g., “we missed the deadline due to a rare unexpected last-minute problem”), and managers dislike seeing their public estimates beaten by market estimates.

Here’s how to understand this: “Innovation matches elegant ideas to messy details.” While general thinkers can identify and hone the elegant ideas, the messy details must usually come from context-dependent trial and error. So for prediction markets, we must search in the space of detailed context-dependent ways to structure and deploy them, to find variations that cut their disruptions. First find variations that work in smaller contexts, then move up to larger trials. This seems feasible, as we’ve already done so for other potentially-politically-disruptive ICAI, such as cost-accounting, AB-tests, and focus groups.

Note that, being atheoretical and context-dependent, this needed experimentation poorly supports academic publications, making academics less interested. Nor can these experiments be enabled merely with money; they crucially need one or more organizations willing to be disrupted by many often-disruptive trials.

Ideally those who oversee this process would be flexible, willing and able as needed to change timescales, topics, participants, incentives, and who-can-see-what structures. And such trials should be done where those in the org feel sufficiently free to express their aversion to political disruption, to allow the search process to learn to avoid it. Alas, I have so far failed to persuade any organizations to host or fund such experimentation.

This is my best guess for the most socially valuable way to spend ~<$1M. Prediction markets offer enormous promise to realize vast social value, but it seems that promise will remain only potential until someone undertakes the small-scale experiments needed to find the messy details to match its elegant ideas. Will that be you?


Intellectual Prestige Futures

As there’s been an uptick of interest in prediction markets lately, in the next few posts I will give updated versions of some of my favorite prediction market project proposals. I don’t own these ideas, and I’d be happy for anyone to pursue any of them, with or without my help. And as my first reason to consider prediction markets was to reform academia, let’s start with that.

Back in 2014, I restated my prior proposals that research patrons subsidize markets, either on relatively specific results likely to be clearly resolved, such as the mass of the electron neutrino, or on simple abstract statements to be judged by a distant future consensus, conditional on such a consensus existing. Combinatorial markets connecting abstract questions to more specific ones could transfer their subsidies to the latter topics.

However, I fear that this concept tries too hard to achieve what academics and their customers say they want, intellectual progress, relative to what they more really want, namely affiliation with credentialed impressiveness. This other priority better explains the usual behaviors of academics and their main customers, namely students, journalists, and patrons. (For example, it was a bad sign when few journals showed interest in using prediction market estimates of which of their submissions were likely to replicate.) So while I still think the above proposal could work, if patrons cared enough, let me now offer a design better oriented to what everyone cares more about.

I’d say what academics and their customers want more is a way to say which academics are “good”. Today, we mostly use recent indicators of endorsement by other academics, such as publications, institutional affiliations, research funding, speaking invitations, etc. But we claim, usually sincerely, to be seeking indicators of long term useful intellectual impact. That is, we want to associate with the intellectuals about whom we have high and trustworthy shared estimates of the difference that their work will make in the long run toward valuable intellectual progress.

A simple way to do this would be to create markets in assets on individuals, where each asset pays as a function of a retrospective evaluation of that individual, an evaluation made in the distant future via detailed historical analysis. By subsidizing market makers who trade in such assets, we could today have trustworthy estimates to use when deciding which individuals among us we should consider for institutional affiliations, funding, speaking invitations, etc. (It should be easy to trade assets that merge many individuals with particular features, such as Ph.Ds from a particular school.)

Once we had a shared perception that these are in fact our best available estimates, academics would prefer them over less reliable estimates such as publications, funding, etc. As the value of an individual’s work is probably non-linear in their rank, it might make sense to have people trade assets which pay as a related non-linear function of their rank. This could properly favor someone with a low median rank but high variance in that rank over someone else with a higher median but lower variance.
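To make the variance point concrete, here is a minimal sketch. Everything in it is an illustrative assumption rather than part of the proposal: ranks are expressed as percentiles where higher is better, each individual has three equally likely future outcomes, and the non-linear payoff is a hypothetical fourth-power function.

```python
from statistics import mean, median

def linear_payoff(percentile):
    """Asset pays in direct proportion to the evaluated percentile."""
    return percentile

def convex_payoff(percentile, power=4):
    """Hypothetical non-linear payoff: top percentiles are worth far more."""
    return percentile ** power

def expected_payoff(scenarios, payoff):
    """Average payoff over equally likely future evaluation outcomes."""
    return mean(payoff(p) for p in scenarios)

# Assumed, made-up outcome scenarios for two individuals.
high_variance = [0.10, 0.50, 0.90]   # lower median, wide spread
low_variance = [0.55, 0.60, 0.65]    # higher median, narrow spread

for name, scenarios in [("high variance", high_variance),
                        ("low variance", low_variance)]:
    print(name,
          "median:", median(scenarios),
          "linear:", round(expected_payoff(scenarios, linear_payoff), 4),
          "convex:", round(expected_payoff(scenarios, convex_payoff), 4))
```

Under the linear payoff the low-variance individual is priced higher, while under the convex payoff the high-variance individual is priced higher, even though their median is lower.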

Why wait to evaluate? Yes, distant future evaluators would know our world less well. But they would know much better which lines of thought ended up being fruitful in the long run, and they’d have more advanced tech to help them study intellectual connections and lineages. Furthermore, compound interest would give us access to a lot more of their time. For example, at the 7% post-inflation average return of the S&P500 1871-2021, one dollar becomes one million dollars in 204 years. (At least if the taxman stays aside.)
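The compound interest figure can be checked with a few lines of arithmetic. This is only a sketch: it assumes a constant 7% real return compounded annually, with no taxes or fees, and the function names are made up for illustration.

```python
import math

def years_to_multiply(factor, annual_return):
    """Years of annual compounding needed to grow a sum by `factor`."""
    return math.log(factor) / math.log(1 + annual_return)

def future_value(principal, annual_return, years):
    """Value of `principal` after compounding annually for `years`."""
    return principal * (1 + annual_return) ** years

years_needed = years_to_multiply(1_000_000, 0.07)
value_after_204 = future_value(1.0, 0.07, 204)

print(f"Years for $1 to reach $1,000,000 at 7%: {years_needed:.1f}")
print(f"Value of $1 after 204 years at 7%: ${value_after_204:,.0f}")
```

The result comes out a little over 204 years, so the figure in the text is a slight rounding: after exactly 204 years one dollar has grown to just under one million.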

Furthermore, such distant evaluations might only be done on a random fraction, say one percent, of individuals, with market estimates being conditional on such a future evaluation being made. And as it is likely cheaper to evaluate people who worked on related topics, it would make sense to randomly pick large sets of related individuals to evaluate together.

Okay, but having ample resources to support evaluations by future historians isn’t enough; we also need to get clear on the evaluation criteria they are to apply. First, we might just ask them to sort a sample of intellectuals relative to each other, instead of trying to judge their overall quality on some absolute scale. Second, we might ask them to focus on an individual’s contributions to helping the world figure out what is true on important topics; being influential but pushing in the wrong directions might count against them. Third, to correct for problems caused by scholars who play organizational politics, I’d rather ask future historians to rate how influential an individual should have been, if others had been a bit more fair in choosing to whom to listen.

The proposal I’ve sketched so far is relatively simple, but I fear it looks too stark, forcing academics to admit more than they’d like that the main thing they care about is their relative ranking. Thus we might prefer to pay a mild complexity cost to focus instead on having future historians rate particular works by intellectuals, such as their journal articles or books. We could ask future historians to rate such works in such a way that the total value of each intellectual was reasonably approximated by the sum of the values of each of their works.

Under this system, intellectuals could more comfortably focus on arguing about the total future impact of each work. Derivatives could be created to predict the total value of all the works by an individual, to use when choosing between individuals. But everyone could claim that is just a side issue, not their main focus.

To pursue this project concept, a good first step would be to fund teams of historians to try to rank the works of intellectuals from several centuries ago. Compare the results of different historian teams assigned to the same task, and have teams seek evaluation methods that can be both reliable and also get at the key questions of actual (or counterfactual) impact on the progress that matters. Then figure out which kinds of historians are best suited to applying such methods, and which funding methods best induce them to do such work in a cost-effective manner.

With such methods in hand, we could with more confidence set up markets to forecast the impact of particular current intellectuals and their works. We’d probably want to start with particular academic fields, and then use success there to persuade other fields to follow their example. This seems easier the higher the prestige of the initial academic fields, and the more open they all are to using new methods.


The Accuracy of Authorities

“WHO treads a difficult line, & tends to be quite conservative in its recommendations to avoid putting out info that later proves to be incorrect. ‘You can’t be backtracking’ … because ‘then you lose complete credibility’.” (More)

There is something important to learn from this example. The best estimates of a maximally accurate source would be very frequently updated and follow a random walk, which implies a large amount of backtracking. And authoritative sources like WHO are often said to be our most accurate sources. Even so, such sources do not tend to act this way. They instead update their estimates rarely, and are especially reluctant to issue estimates that seem to backtrack. Why?
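A toy simulation illustrates why a random walk implies so much backtracking. The setup is an assumption made for illustration: a source forecasts the final total of many independent ±1 shocks that are revealed one at a time, so its best estimate at each moment is simply the running sum of the shocks seen so far.

```python
import random

def best_estimates(n_steps, seed=0):
    """Running best estimate of a final total as +/-1 shocks are revealed.

    Unrevealed shocks have expected value zero, so the best estimate
    at each step is the sum of the shocks revealed so far.
    """
    rng = random.Random(seed)
    total = 0
    path = []
    for _ in range(n_steps):
        total += rng.choice((-1, 1))
        path.append(total)
    return path

def backtrack_fraction(path):
    """Fraction of updates that reverse the direction of the prior update."""
    changes = [b - a for a, b in zip(path, path[1:])]
    pairs = list(zip(changes, changes[1:]))
    reversals = sum(1 for prev, cur in pairs if prev * cur < 0)
    return reversals / len(pairs)

path = best_estimates(10_000)
print(f"Share of updates that backtrack: {backtrack_fraction(path):.2f}")
```

In this toy model about half of all updates reverse the one before, which is the pattern that the quoted official says would cost a source its credibility.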

First, authoritative sources serve as a coordination point for the behavior of others, and it is easier to coordinate when estimates change less often. Second, authoritative sources need to signal that they have power; they influence others far more than others influence them. Both of these pressures push them toward making infrequent changes. Ideally only one change, from “we don’t know”, to “here is the answer”. But if so, why do they feel pressures to issue estimates more often than this?

First, sometimes there are big decisions that need to be made, and then authorities are called upon to issue estimates in time to help with those decisions. For example, WHO was often called upon to issue estimates to help with a rapidly changing covid epidemic.

Second, sometimes a big source of relevant info appears, and it seems obvious to all that it must be taken into account. For example, no matter how confident we were to win a battle, we should expect to get news about how that battle actually went, and update accordingly. In this case, the authority is more pressed to update its estimate, but also more forgiven for changing its estimate. So during covid, authorities were expected to update on changing case and death counts, and that didn’t count so much as “backtracking”.

Third, sometimes rivals compete for authority. And then sources might be compared regarding their accuracy track record. This would push them toward the frequently updated random walk scenario, which can degrade the appearance of authority for all such competitors. (The other two pressures to update more often may also degrade authority; e.g., WHO’s authority seems to have degraded during covid.)

Due to the first of these pressures, the need to inform decisions, authoritative sources prefer that dependent decisions be made infrequently and opaquely. Such as by central inflexible organizations, who decide by opaque political processes. E.g., masking, distancing, and vaccine policies for covid. There can thus form a natural alliance between central powers and authoritative sources.

Due to the second of these pressures, authoritative sources prefer a strong consensus on what are the big sources of info that force them to update. This pushes for making very simple, stable, and clear distinctions between “scientific” info sources, on which one must update, and “unscientific” sources, on which it is considered inappropriate for authors to update. Those latter sources must be declared not just less informative, but un-informative, and slandered in enough ways so that few who aspire to authority are tempted to rely on them.

Due to the third of these pressures, authoritative sources will work hard to prevent challengers competing on track record accuracy. Authorities will issue vague estimates that are hard to compare, prevent the collection of data that would support comparisons, and accuse challengers of crimes (e.g., moral positions) to make them seem ineligible for authority. And other kinds of powers, who prefer a single authority source they can defer to in order to avoid responsibility for their decisions, will help to suppress such competitors.

This story seems to explain why ordinary people take backtracking as a sign of inaccuracy. They have a hidden motive to follow authorities, but give accuracy as their excuse for following such sources. This forces them to see backtracking as a general sign of inaccuracy.

This all seems to be bad news for efforts to gain credibility, funding, and legal permission for alternative estimate sources, such as those based on prediction markets or forecasting competitions. This helps explain why individual org managers are reluctant to support such alternate sources, and why larger polities create barriers to them, such as via censorship, professional licensing, and financial regulation.

This all points to another risk of our increasingly integrated world community of elites. They may form central sources of authoritative estimates, which coordinate with other authorities to suppress alternate sources. Previously, world wide competition made it easier to defy and challenge such estimate authorities.

Added: As pointed out by @TheZvi, a 4th pressure on authorities to update more often is to stay consistent with other authorities. This encourages authorities to coordinate to update together at the same time, by talking first behind the scenes.

Added 11Apr: See many comments on this over at Marginal Revolution.


Can We Tame Political Minds?

Give me a firm spot on which to stand, and I shall move the earth. (Archimedes)

A democracy … can only exist until the voters discover that they can vote themselves largesse from the public treasury. (Tytler)

Politics is the mind killer. (Yudkowsky)

The world is a vast complex of interconnected subsystems. Yes, this suggests that you can influence most everything else via every little thing you do. So you might help the world by picking up some trash, saying a kind word, or rating a product on Yelp.

Even so, many are not satisfied to have some effect, they seek a max effect. For this reason, they say, they seek max personal popularity, wealth, or political power. Or they look for the most neglected people to help, like via African bed nets. Or they seek dramatic but plausibly neglected disaster scenarios to prevent, such as malicious foreigners, eco-apocalypse, or rampaging robots.

Our future is influenced by a great many things, including changes in tech, wealth, education, political power, military power, religion, art, culture, public opinion, and institutional structures. But which of these offers the strongest lever to influence that future? Note that if we propose to change one factor in order to induce changes in all the others, critics may reasonably question our ability to actually control that factor, since in the past such changes seem to have been greatly influenced by other factors.

Thus a longtime favorite topic in “serious” conversation is: where are the best social levers, i.e. factors which do sometimes change, which people like us (this varies with who is in the conversation) can somewhat influence, and where the effects of this factor on other factors seem lasting and stronger than reverse-direction effects.

When I was in tech, the consensus there saw tech as the strongest lever. I’ve heard artists make such claims about art. And I presume that priests, teachers, activists, and journalists are often told something similar about their factors.

We economists tend to see strong levers in the formal mechanisms of social institutions, which we happen to be well-placed to study. And in fact, we have seen big effects of such formal institutions in theory, the lab, and the field. Furthermore, we can imagine actually changing these mechanisms, because they tend to be stable, are sometimes changed, and can be clearly identified and concisely described. Even stronger levers are found in the higher level legal, regulatory, and political institutions that control all the other institutions.

My Ph.D. in social science at Caltech focused on such controlling institutions, via making formal game theory models, and testing them in the lab and field. This research finds that institution mechanisms and rules can have big effects on outcomes. Furthermore, we seem to see many big institutional failures in particular areas like telecom, transport, energy, education, housing, and medicine, wherein poor choices of institutions, laws, and regulations in such areas combine to induce large yet understandable waste and inefficiency. Yes, institutions do matter, a lot.

However, an odd thing happens when we consider higher level models. When we model the effects of general legal and democratic institutions containing rational agents, we usually find that such institutions work out pretty well. Common fears of concentrated interests preying on diffuse interests, or of the poor taxing the rich to death, are not usually borne out. While the real world does seem full of big institutional problems at lower levels, our general models of political processes do not robustly predict these common problems. Even when such models include voters who are quite ignorant or error prone. What are such models missing?

Bryan Caplan’s book Myth of the Rational Voter gets a bit closer to the truth with his concept of “rational irrationality”. And I was heartened to see Alex Tabarrok [AT] and Ezra Klein [EK], who have quite different political inclinations, basically agree on the key problem in their recent podcast:

[AT:] Mancur Olson thought he saw … more and more of these distributional coalitions, which are not just redistributing resources to themselves, but also slowing down… change. … used to be that we required three people to be on the hiring committee. This year, we have nine … Now, we need [more] rules. … we’ve created this more bureaucratic, kind of rule-bound, legalistic and costly structure. And that’s not a distributional coalition. That’s not lobbying. That’s sort of something we’ve imposed upon ourselves. …

[EK:] it’s not that I want to go be part of slowing down society and an annoying bureaucrat. Everybody’s a hero of their own story. So how do you think the stories people tell themselves in our country have changed for this to be true? …

[AT:] an HOA composed of kind of randos from the community telling you what your windows can look like, it’s not an obvious outcome of a successful society developing coalitions who all want to pursue their own self-interest. … naked self-interest is less important than some other things. And I’ll give you an example which supports what you’re saying. And that is, if you look at renters and the opinions of renters, and they are almost as NIMBY, Not In My Backyard, as owners, right, which is crazy.… farmers get massive redistribution in their favor. … But yet, if you go to the public … They’re, oh, no, we’ve got to protect the family farm. …

[EK:] a lot of political science … traditionally thought redistribution would be more powerful than it has proven to be … as societies get richer, they begin emphasizing what he calls post-materialist values, these moral values, these identity values, values about fairness. (More)

That is, our larger political and legal systems induce, and do not fix, many more specific institutional failures. But not so much because of failures in the structure of our political or legal institutions. Instead, the key problem seems to lie in voters’ minds. In political contexts, minds that are usually quite capable of being reasonable and pragmatic, and attending to details, instead suffer from some strange problematic mix of confused, incoherent, and destructive pride, posturing, ideology, idealism, loyalty, and principles. For want of a better phrase, let’s just call these “political minds.”

Political minds are just not well described by the usual game theory or “rational” models. But they do seem to be a good candidate for a strong social lever to move the future. Yes, political minds are probably somewhat influenced by political institutions, and by communications structures of who talks to and listens to whom. And by all the other systems in the world. Yet it seems much clearer how they influence other systems than how the other systems influence them. In particular, it is much clearer how political minds influence institution mechanisms than how those mechanisms influence political minds.

In our world today, political minds somehow induce and preserve our many more specific institutional failures. And also the accumulation of harmful veto players and added procedures discussed by [AT] and [EK]. Even so, as strong levers, these political minds remain gatekeepers of change. It seems hard to fix the problems they cause without somehow getting their buy-in. But can we tame political minds?

This is surely one of the greatest questions to be pondered by those aware enough to see just how big a problem this is. I won’t pretend to answer it here, but I can at least review six possibilities.

War – One ancient solution was variation and selection of societies, such as via war and conquest. These can directly force societies to accept truths that they might not otherwise admit. But such processes are now far weaker, and political minds fiercely oppose strengthening them. Furthermore, the relevant political minds are in many ways now integrated at a global level.

Elitism – Another ancient solution was elitism: concentrate political influence into fewer higher quality hands. Today influence is not maximally distributed; we still don’t let kids or pets vote. But the trend has definitely been in that direction. We could today limit the franchise more, or give more political weight to those who pass various quality tests. But gains there seem limited, and political minds today mostly decry such suggestions.

Train – A more modern approach is to try to better train minds in general, in the hope that will also improve minds in political contexts. And perhaps universal education has helped somewhat there, though I have doubts. It would probably help to replace geometry with statistics in high school, and to teach more economics and evolutionary biology earlier. But remember that the key problem is reasonable minds turning unreasonable when politics shows up; none of these seem to do much there.

Teach – A more commonly “practiced” approach today is just to try to slowly persuade political minds person by person and topic by topic, to see and comprehend their many particular policy mistakes. And do this faster than new mistakes accumulate. That has long been a standard “educational” approach taken by economists and policy makers. It seems especially popular because one can pretend to do this while really just playing the usual political games. Yes, there are in fact people like Alex and Ezra who do see and call attention to real institutional failures. But overall this approach doesn’t seem to be going very well. Even so, it may still be our best hope.

Privatize – A long shot approach is to try to convince political minds to not trust their own judgements as political minds, and thus to try to reduce the scope for politics to influence human affairs. That is, push to privatize and take decisions away from large politicized units, and toward more local units who face stronger selection and market pressures, and induce less politicized minds. Of course many have been trying to do exactly this for centuries. Even so, this approach might still be our best hope.

Futarchy – My proposed solution is also to try to convince political minds to not trust their own judgements, but only regarding matters of fact, and only relative to the judgements of speculative markets. Speculative market minds are in fact vastly more informed and rational than the usual political minds. And cheap small scale trials are feasible that could lead naturally to larger scale trials that could go a long way toward convincing many political minds of this key fact. It is quite possible to adopt political institutions that put speculative markets in charge of estimating matters of fact. At which point we’d only be subject to political mind failures regarding values. I have other ideas for this, but let’s tackle one problem at a time.

Politics is indeed the mind killer. But once we know that, what can we do? War could force truths, though at great expense. Elitism and training could improve minds, but only so far. Teaching and privatizing are being tried, but are progressing terribly slowly, if at all.

While it might never be possible to convince political minds to distrust themselves on facts, relative to speculative markets, this approach has hardly been tried, and seems cheap to try. So, world, why not try it?


Me on Prediction Markets

Here’s a more-polished-than-usual video by me summarizing the idea of prediction markets:
